Railo / RAILO-634

Server Side way to detect "accept-charset" OR encoding from POST (or FORM)

    Details

      Description

      I need a way to ensure postbacks (POST/FORM data) are always encoded in UTF-8.

      Within the Web Administrator I can define the default charsets for everything, and I do: all are set to UTF-8. Within my pages, I also call setEncoding("FORM", "UTF-8"), and I set my HTML meta tags and cfprocessingdirective pageEncoding="utf-8".

      Everything is set to UTF-8 in all the right places.

      So now, if I have a simple HTML form, like so:

      <form action="postPage.cfm" method="post" enctype="application/x-www-form-urlencoded; charset=UTF-8;">
      <input type="text" name="myField">
      <input type="submit" value="GO" >
      </form>

      and post some text back (including for example Chinese characters) everything encodes correctly (UTF-8).

      HERE'S THE PROBLEM:

      (NEVER TRUST THE CLIENT)

      The client (web browser) can be switched to use a different encoding very easily. All the user need do is select another one from the browser's pull-down menus (this varies depending on the browser).

      Let's say they switch their browser's encoding to Simplified Chinese (GBK), then paste in some Chinese characters. The postback will encode the string with X-GBK encoding, which garbles the data once the server decodes it as UTF-8. All the UTF-8 settings on the server side have no effect in rectifying this, as the encoding is done by the browser before the data is sent to the server.
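
      To make the mismatch concrete, here is a minimal Java sketch (Railo runs on the JVM) that percent-encodes the same two characters the way a browser would under UTF-8 and under GBK, then decodes the GBK bytes as UTF-8. The class and method names are hypothetical, and it assumes the JVM ships the GBK charset, which standard JDK builds do.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Railo.
final class EncodingMismatchDemo {

    // Percent-encode a form value as a browser would for the given charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Percent-decode a form value, interpreting the bytes in the given charset.
    static String decode(String wire, String charset) {
        try {
            return URLDecoder.decode(wire, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters U+4E2D U+6587

        String asUtf8 = encode(text, "UTF-8"); // %E4%B8%AD%E6%96%87
        String asGbk = encode(text, "GBK");    // %D6%D0%CE%C4

        System.out.println(asUtf8);
        System.out.println(asGbk);

        // A server that assumes UTF-8 cannot recover the original text
        // from the GBK bytes; malformed bytes become replacement characters.
        System.out.println(text.equals(decode(asGbk, "UTF-8"))); // false
        System.out.println(text.equals(decode(asGbk, "GBK")));   // true
    }
}
```

      The two wire forms differ byte for byte, so no server-side default can repair the value after it has been decoded with the wrong charset.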

      If I reconstruct the <FORM> tag like so:

      <form action="postPage.cfm" method="post" enctype="application/x-www-form-urlencoded; charset=UTF-8;" accept-charset="UTF-8">
      <input type="text" name="myField">
      <input type="submit" value="GO" >
      </form>

      The "accept-charset" attribute forces the browser to encode the submission as UTF-8 regardless of which character encoding the browser is using. This is good; however, I never trust the client.

      SO....

      I need a way to test one of two things (or both): 1) the charset the data was posted in, or 2) the encoding of the string itself on the server side, i.e. isUTF8Encoded(FORM.myField)
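
      One possible shape for the second option is sketched below in Java. It is not an existing Railo function: the names Utf8Check, percentDecode, and isUtf8Encoded are hypothetical. It assumes access to the raw, still percent-encoded field value from the request body, because once the value has been decoded into a string the original bytes are gone. Strict decoding is a necessary check, not a sufficient one: pure ASCII and some GBK byte pairs also happen to be valid UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Railo.
final class Utf8Check {

    // Turn a raw application/x-www-form-urlencoded value into its bytes.
    // Throws IllegalArgumentException on a truncated or non-hex escape.
    static byte[] percentDecode(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == '%') {
                if (i + 2 >= raw.length()) {
                    throw new IllegalArgumentException("Truncated escape at " + i);
                }
                int hi = Character.digit(raw.charAt(i + 1), 16);
                int lo = Character.digit(raw.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad escape at " + i);
                }
                out.write((hi << 4) | lo);
                i += 3;
            } else {
                // '+' stands for a space; other characters are expected to be ASCII.
                out.write(c == '+' ? ' ' : c);
                i++;
            }
        }
        return out.toByteArray();
    }

    // True only if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Convenience wrapper: a malformed escape counts as "not UTF-8".
    static boolean isUtf8Encoded(String rawValue) {
        try {
            return isValidUtf8(percentDecode(rawValue));
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

      With this sketch, isUtf8Encoded("%E4%B8%AD%E6%96%87") accepts the UTF-8 submission, while isUtf8Encoded("%D6%D0%CE%C4") rejects the GBK one because 0xD6 is a UTF-8 lead byte that is not followed by a continuation byte.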

      For some odd reason the "Accept-Charset" for the <FORM> tag is not included in the header... or at least, not in a dumped CGI Scope structure.
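
      For the first option, note that "accept-charset" is an attribute of the HTML form element, so it is not itself transmitted as a request header. What the server can inspect is the request's Content-Type header, which may carry a charset parameter, although browsers often omit it for URL-encoded forms. The Java sketch below pulls that parameter out of a header value; the class and method names are hypothetical, and a missing parameter is reported as null rather than guessed.

```java
import java.util.Locale;

// Hypothetical helper; not part of Railo.
final class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type header value,
    // or null when the header is missing or has no charset parameter.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value.isEmpty() ? null : value;
        }
        return null;
    }
}
```

      In a servlet container the header value would typically come from HttpServletRequest.getContentType(). A null result means the client declared nothing, in which case only a check on the bytes themselves can say whether the data is UTF-8.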

              People

              • Assignee: micstriit Michael Offner
              • Reporter: jscnet Jason n/a