Preventing mojibake with HTML emails

  • 4 replies
  • 1 has this problem
  • 28 views
  • Last reply by chruss2

I noticed that many of my recipients are seeing junk characters in my emails. I have taken a look at the raw bytes and I think I know what is going on.

MIME headers and HTML head elements being sent:

Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: 8bit

<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">

The body is encoded correctly in windows-1252. Everything looks correct, so what's the problem?

Here's my theory: the recipient sees the MIME header. Because the message is declared as windows-1252, the recipient's software converts the HTML content to UTF-8, but of course the meta tag will still *say* it is windows-1252. Basically, having two headers at different layers that both specify windows-1252 causes the recipient to decode the HTML body twice.

I think what we need is some way to make the HTML 7-bit friendly: convert all non-ASCII characters to numeric character references like &#160;. That way charset conversions can be done at the MIME layer without corrupting the HTML.

Is there an option for this in Thunderbird? I poked around in Settings but couldn't find anything.
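To make the idea in the question concrete, here is a minimal Python sketch (not a Thunderbird feature). It uses Python's built-in `xmlcharrefreplace` error handler to turn every non-ASCII character into a numeric character reference, and it also shows the kind of junk that appears if UTF-8 bytes are read as windows-1252, which is what the double-decoding theory would predict. The helper names `to_ascii_html` and `misread_utf8_as_cp1252` and the sample strings are made up for illustration.

```python
def to_ascii_html(html: str) -> str:
    """Return html with every non-ASCII character written as &#NNN;."""
    # 'xmlcharrefreplace' substitutes a decimal numeric character
    # reference for each character that ASCII cannot represent.
    return html.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def misread_utf8_as_cp1252(text: str) -> str:
    """Show the junk produced when UTF-8 bytes are decoded as windows-1252."""
    # errors="replace" guards against the few byte values that
    # windows-1252 leaves undefined.
    return text.encode("utf-8").decode("cp1252", errors="replace")


sample = "<p>caf\u00e9\u00a0au lait</p>"  # contains é and a no-break space
print(to_ascii_html(sample))                   # <p>caf&#233;&#160;au lait</p>
print(misread_utf8_as_cp1252("caf\u00e9"))     # cafÃ©
```

Because the converted HTML contains only ASCII, it reads the same under windows-1252, UTF-8, or any other ASCII-compatible charset label.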

All Replies (4)

Folks using Yahoo have issues because for about a year Yahoo has been enforcing 7-bit encoding. You would think they were living in a US-centric version of the world.

Basically, your mail should be encoded as UTF-8 and those Windows encodings forgotten, as they are obsolete and their decoding varies from product to product. They are a product of the early 1990s, when Microsoft was of the view that the internet was a fad they could safely ignore.

How are you managing to get Windows encoding in there? Pasting from Word as anything but plain text is a good way, or always has been.

I was replying to an email that used windows encoding. I agree about pushing towards "Unicode everywhere", but I can't control what encoding others use, and it appears that Thunderbird keeps the same encoding when replying (but not when forwarding).

Is there an option to force use of UTF-8 when replying?

Chosen solution

Under Tools/Options/Display/Formatting, click the Advanced button where you can set the default encoding for outgoing mail, and the choice of encoding for replies (see picture).

If you're sending through a Yahoo-type server (ATT, Rogers, Verizon, etc.) and your recipients see unwanted characters, set mail.strictly_mime to true in the Config Editor (Tools/Options/Advanced/General/Config Editor) to enforce quoted-printable encoding, which is the Outlook default.
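For anyone wondering why quoted-printable helps, here is a small Python sketch using the standard-library `quopri` module. It only illustrates the encoding itself, not what Thunderbird does internally when mail.strictly_mime is set; the sample body is made up.

```python
import quopri

# A windows-1252 body containing one 8-bit byte (0xE9 for é).
body = "<p>caf\u00e9</p>".encode("cp1252")

# Quoted-printable rewrites each 8-bit byte as =XX, leaving ASCII alone.
encoded = quopri.encodestring(body)
print(encoded)  # b'<p>caf=E9</p>'

# The result is pure 7-bit, and decoding restores the original bytes.
assert all(byte < 128 for byte in encoded)
assert quopri.decodestring(encoded) == body
```

Since nothing above 0x7F goes over the wire, a relay that only tolerates 7-bit or valid UTF-8 data has nothing to rewrite.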

Thanks a lot for your help. I tested each of those options separately, and each of them solved the problem.

I am using Yahoo SMTP servers. @Matt, you also mentioned Yahoo, but I thought you meant Yahoo Mail (a web client). Thanks @sfhowes for being more specific. It did not occur to me that my ISP could suck that much!

I tested their servers myself, and they allow 8-bit content, but only if it is valid UTF-8. Any byte sequence that is not valid UTF-8 gets replaced with the UTF-8 encoding of U+FFFD (the replacement character). Therefore either option works: forcing replies to UTF-8, or forcing quoted-printable (7-bit).
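The behaviour I observed can be imitated in a few lines of Python. This is only a sketch of the effect, not Yahoo's actual implementation; `relay_as_utf8` is a hypothetical stand-in for the server.

```python
def relay_as_utf8(raw: bytes) -> bytes:
    """Imitate a relay that rewrites invalid UTF-8 sequences as U+FFFD."""
    # errors="replace" turns each undecodable sequence into U+FFFD.
    return raw.decode("utf-8", errors="replace").encode("utf-8")


cp1252_bytes = "caf\u00e9".encode("cp1252")  # b'caf\xe9', not valid UTF-8
utf8_bytes = "caf\u00e9".encode("utf-8")     # b'caf\xc3\xa9'

print(relay_as_utf8(cp1252_bytes))  # b'caf\xef\xbf\xbd' (é became U+FFFD)
print(relay_as_utf8(utf8_bytes))    # b'caf\xc3\xa9' (unchanged)
```

The windows-1252 byte is lost for good, while the UTF-8 message passes through untouched, which matches what my recipients were seeing.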

Thanks for pointing me to the options. I would never have found the Reply encoding option under Display (Fonts and Colors). What an illogical place for that option!