[isabelle-dev] Accented characters

Lars Noschinski noschinl at in.tum.de
Thu Jan 13 12:33:04 CET 2011

On 13.01.2011 12:15, Gerwin Klein wrote:
> On 13/01/2011, at 8:51 PM, Larry Paulson wrote:
>> Accented characters on our website no longer display correctly on
>> Macs. I don't know precisely when this happened, but I'm sure it's
>> fairly recent. In fact, the characters don't even display correctly in
>> the HTML source. It may be a character encoding problem. Clearly, it
>> renders correctly in Google Chrome but not in Firefox or Safari.
> This issue is a bit strange. It shows up on my Mac as in your picture
> from the Cambridge server, but it looks fine with the correct characters
> from the Munich and Sydney servers.
> It may have more to do with what the server expects as encoding in the
> source file than what the browser gets to see.

The files are encoded in iso-8859-1 (latin-1), which is declared with an 
<?xml ..?> processing instruction in the file. The difference between 
the three servers is the Content-Type header sent by the server:

Cambridge: text/html; charset=utf-8
Munich: text/html
Sydney: text/html; charset=iso-8859-1

Normally, the declaration in the document should override the 
declaration sent by the server; but probably the <?xml ..?> processing 
instruction is ignored by Firefox and Safari, as the document is marked 
as html, not xml. The correct fix is to add an additional meta-element, 
specifying the charset, as outlined in the XHTML specification:

Historically, the character encoding of an HTML document is either 
specified by a web server via the charset parameter of the HTTP 
Content-Type header, or via a meta element in the document itself. In an 
XML document, the character encoding of the document is specified on the 
XML declaration (e.g., <?xml version="1.0" encoding="EUC-JP"?>). In 
order to portably present documents with specific character encodings, 
the best approach is to ensure that the web server provides the correct 
headers. If this is not possible, a document that wants to set its 
character encoding explicitly must include both the XML declaration an 
encoding declaration and a meta http-equiv statement (e.g., <meta 
http-equiv="Content-type" content="text/html; charset=EUC-JP" />). In 
XHTML-conforming user agents, the value of the encoding declaration of 
the XML declaration takes precedence.

Greetings, Lars

More information about the isabelle-dev mailing list