"Find on this page" handles Unicode combining characters poorly

Apr 27, 2014
Steps to reproduce

Simpler reduced repro:

Go to https://jsfiddle.net/7faeufdk/ in both Edge and Chrome

Copy this directly: caractère

Paste that word into find on page in both Edge and Chrome


Expected: both find it.



Only Chrome finds it.  Edge can find “caractère” but not "caractère".  Don’t be thrown off by the text rendering in Visual Studio if you are viewing this with the native client – it renders the former with the accent over the ‘r’ which is incorrect, both words should have the accent over the 'e’.




Original repro


A page exhibiting the problem "in the wild": http://www.huffingtonpost.fr/cedric-villani/comite-de-soutien-anne-hidalgo_b_4628631.html?utm_hp_ref=france

Repro steps:

  1. Create a web page that includes a word with combining characters, e.g. “caractère” (note that “è” is a sequence of two Unicode code points: U+0065 LATIN SMALL LETTER E and U+0300 COMBINING GRAVE ACCENT).

  2. Open it in Internet Explorer (currently IE 11 on 64-bit Windows 7) and press Ctrl+F.

  3. Type “caractère” (note that “è” is one Unicode codepoint here: U+00E8 LATIN SMALL LETTER E WITH GRAVE).

  4. Note that the word is not found.
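
The mismatch in the steps above can be reproduced outside a browser. The following is a minimal Python sketch, not part of the original report; the names `page_text`, `query`, and `code_points` are hypothetical, and both strings are written with escapes so the code points are unambiguous regardless of how this text is rendered or copied.

```python
# Page text as in step 1: "e" followed by U+0300 COMBINING GRAVE ACCENT.
page_text = "caracte\u0300re"
# Search string as in step 3: the precomposed U+00E8.
query = "caract\u00e8re"


def code_points(s: str) -> list[str]:
    """Return the code points of s in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in s]


# A plain code-point comparison treats the two spellings as different text,
# which matches the "not found" behaviour described in step 4.
naive_match = query in page_text

print(code_points(page_text))
print(code_points(query))
print(naive_match)
```

The two strings look identical on screen, but the page text is one code point longer than the search string, so a search that compares code points directly cannot match them.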

Expected Results:


When searching for text on a page, canonically equivalent Unicode representations of the same text should match each other. This could be achieved, for example, by converting both the page text and the search string to Normalization Form D (NFD) before comparing, although other implementations are possible.
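
As an illustration of the normalization approach suggested here, the sketch below uses Python's standard `unicodedata.normalize` to bring both strings to form D before searching. It is an assumption about how such a fix might look, not a description of how any browser implements find on page; the function name `find_normalized` and the sample strings are hypothetical.

```python
import unicodedata


def find_normalized(haystack: str, needle: str) -> bool:
    """Report whether needle occurs in haystack once both are in NFD."""
    nfd_haystack = unicodedata.normalize("NFD", haystack)
    nfd_needle = unicodedata.normalize("NFD", needle)
    return nfd_needle in nfd_haystack


# Page text with decomposed accents, search string with a precomposed one.
decomposed_page = "Un caracte\u0300re accentue\u0301."
precomposed_query = "caract\u00e8re"

print(find_normalized(decomposed_page, precomposed_query))
```

A real implementation would also need to map match positions in the normalized text back to the original page text for highlighting, and to decide whether a bare "e" in the search string should match the base letter of a decomposed "è"; this sketch does neither.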

Actual Results:


The word is not found.

