UTF-8 in URIs is not mapped to the correct characters

Issue #2996884 • Assigned to Nicolas A.


May 29, 2015
This issue is public.
Reported by 0 people

Steps to reproduce


Repro Steps:

  1. Extract the attached 20448385.zip to a directory. For fidelity, use jar -xvf 20448385.zip from the command line, or 7-Zip. DO NOT use WinZip or any other pkzip-style tool: these tools are so old that they read filenames as code page 437 (cp437) and cannot use UTF-8, which mangles the filenames beyond recognition.

  2. Open page.html in a browser.

  3. Observe that in IE the third image says "windows 1252", while in other browsers it says "UTF-8". If you see neither of these, you used WinZip to extract the zip file, dammit!
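
To illustrate the warning in step 1, here is a minimal sketch of what happens when a filename stored as UTF-8 bytes is read back as cp437. It assumes the archive contains a file named çõ.png (inferred from the expected results, not checked against the attachment) and only simulates the decoding step; it does not call any real zip tool.

```python
# Assumed filename, taken from the img src in this report.
name = "çõ.png"

# Bytes as they would be stored for a UTF-8 filename.
stored = name.encode("utf-8")

# What a tool that only understands cp437 would make of those bytes.
mangled = stored.decode("cp437")

print(repr(name))
print(repr(mangled))
```

Because cp437 is a single-byte encoding, each of the two-byte UTF-8 sequences turns into two unrelated characters, so the extracted file no longer matches the name the page asks for.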

Expected Results:

In this specific case, <img src="%C3%A7%C3%B5.png"> should load çõ.png (the bytes C3 A7 C3 B5 decoded as UTF-8) instead of Ã§Ãµ.png (the same bytes decoded as windows-1252).

In general, on a page where the charset is specified as UTF-8 in the Content-Type, UTF-8 should be used to decode all %-encodings in URI references. Windows-1252 should not be used, since that encoding is not referenced anywhere in the page and is platform-specific.
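
As a rough illustration of the two interpretations, the following sketch percent-decodes the src value from this report with Python's urllib.parse.unquote, once as UTF-8 and once as windows-1252 (Python's "cp1252" codec). It only demonstrates the byte-to-character mapping and is not a model of how any browser resolves the URI.

```python
from urllib.parse import unquote

# The img src value from the repro page.
src = "%C3%A7%C3%B5.png"

# Expected: the escaped bytes C3 A7 C3 B5 are read as UTF-8.
as_utf8 = unquote(src, encoding="utf-8")

# Reported IE behavior: the same bytes are read as windows-1252.
as_cp1252 = unquote(src, encoding="cp1252")

print(repr(as_utf8))
print(repr(as_cp1252))
```

The UTF-8 result is the two-character name çõ.png, while the windows-1252 result has four characters before the extension, one per escaped byte.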

Actual Results:

Dev Channel specific:



    Comments and activity

    • Microsoft Edge Team

      Changed Assigned To to “Kamen M.”

      Changed Assigned To to “Venkat K.”

      Changed Assigned To from “Venkat K.” to “Rajat J.”

      Changed Status to “Confirmed”

      Changed Assigned To from “Rajat J.” to “David W.”

      Changed Status from “Confirmed” to “Won’t fix”

      Changed Assigned To to “David W.”

      Changed Status from “Won’t fix”

      Changed Assigned To from “David W.” to “Venkat K.”

      Changed Assigned To from “Venkat K.” to “Nicolas A.”

