RFC 6532 – Internationalized Email Headers
Before RFC 6532, non-ASCII text in email headers had to be wrapped in encoded words such as =?UTF-8?B?...?=. RFC 6532 says “just use UTF-8 directly.” The From line can say 田太郎 <taro@example.jp> in plain text, and the Subject can be in any language without encoding tricks. The email looks the same on the wire as it does on screen.
Why This RFC Exists
The original email message format (RFC 5322) restricts header fields to US-ASCII characters. To include non-ASCII text in headers like Subject, From display names, or address comments, senders had to use RFC 2047 encoded words — a cumbersome scheme that Base64 or Q-encodes the text and wraps it in =?charset?encoding?...?= markers.
This system works but produces headers that are unreadable in their raw form, difficult to implement correctly, and limited in where they can appear. For example, RFC 2047 encoded words cannot be used inside the local-part of an email address.
RFC 6532 updates RFC 5322 to allow UTF-8 directly in email headers. Combined with RFC 6531 (SMTPUTF8 for the SMTP envelope), this enables fully internationalized email where addresses, display names, subjects, and other headers can all use native scripts without encoding workarounds.
How It Works
- The message is transmitted via SMTP with the SMTPUTF8 extension (RFC 6531), signaling that the message uses internationalized content.
- Header fields may contain raw UTF-8 octets anywhere that RFC 5322 allows text.
- Email addresses in headers (From, To, Cc, Reply-To) may contain UTF-8 characters in the local-part and the domain part.
- The Content-Type for the message header block is implicitly message/global (defined in RFC 6532) instead of the traditional message/rfc822.
- Receiving mail clients that support RFC 6532 display the headers natively. Clients that don't may display raw UTF-8 bytes or fail to parse the headers.
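The first step can be sketched in a few lines of Python. This is only an illustration of when the SMTPUTF8 parameter belongs on the MAIL FROM command; the function names are hypothetical, and a real client would also confirm that the server advertised SMTPUTF8 in its EHLO response before sending.

```python
def needs_smtputf8(envelope_addresses: list[str], headers: dict[str, str]) -> bool:
    """Return True if any envelope address or header value contains non-ASCII text."""
    texts = list(envelope_addresses) + list(headers.values())
    return any(not text.isascii() for text in texts)


def mail_from_command(sender: str, recipients: list[str], headers: dict[str, str]) -> str:
    """Build the MAIL FROM line, adding the SMTPUTF8 parameter only when required."""
    command = f"MAIL FROM:<{sender}>"
    if needs_smtputf8([sender, *recipients], headers):
        command += " SMTPUTF8"
    return command


# ASCII envelope, but a UTF-8 Subject: the message is still internationalized.
print(mail_from_command("taro@example.jp", ["user@example.com"], {"Subject": "会議の確認"}))
```

A message whose envelope and headers are all ASCII gets a plain MAIL FROM line and can travel over any SMTP server.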
Header Examples
Traditional RFC 2047 encoded headers vs. RFC 6532 headers:
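One way to see the difference is to serialize the same header under both rules. The sketch below uses Python's standard email package, where the SMTP policy produces RFC 2047 encoded words and the SMTPUTF8 policy writes raw UTF-8. The variable names are illustrative.

```python
import email
from email import policy
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "会議の確認"

# Traditional form: the Subject becomes an ASCII-only encoded word (=?utf-8?...?=).
encoded = msg.as_bytes(policy=policy.SMTP)

# RFC 6532 form: the Subject is written as raw UTF-8 octets.
raw = msg.as_bytes(policy=policy.SMTPUTF8)

print(encoded.decode("ascii"))
print(raw.decode("utf-8"))

# A parser that understands RFC 2047 recovers the original text from the encoded form.
decoded_subject = email.message_from_bytes(encoded, policy=policy.default)["Subject"]
```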
A complete internationalized message:
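As a sketch, such a message can be assembled with Python's standard email package and serialized with the SMTPUTF8 policy. The addresses and body text are placeholders taken from or modeled on the examples in this article.

```python
from email import policy
from email.headerregistry import Address
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = Address("田太郎", "taro", "example.jp")    # UTF-8 display name
msg["To"] = Address("", "田太郎", "example.jp")          # UTF-8 local-part
msg["Subject"] = "会議の確認"                            # UTF-8 unstructured field
msg.set_content("明日の会議は10時からです。\n")

# With policy.SMTPUTF8 the headers are emitted as raw UTF-8, not encoded words.
wire = msg.as_bytes(policy=policy.SMTPUTF8)
print(wire.decode("utf-8"))
```

The printed text is what travels over the wire, so the headers read the same in transit as they do in the mail client.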
Key Technical Details
message/global vs. message/rfc822
RFC 6532 introduces a new MIME type: message/global. It serves the same purpose as message/rfc822 (carrying an enclosed email message) but signals that the enclosed message may contain UTF-8 in its headers. When a message is transmitted with SMTPUTF8, attached and forwarded messages that contain UTF-8 headers should use message/global instead of message/rfc822:
| MIME Type | Header Encoding | Usage |
|---|---|---|
| message/rfc822 | ASCII only (RFC 2047 for non-ASCII) | Traditional email |
| message/global | UTF-8 allowed natively | Internationalized email |
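A simple way to apply the table when attaching or forwarding a message is to inspect the enclosed message's raw header block. The helper below is a hypothetical sketch, not part of any library: it picks message/global only when the headers actually contain non-ASCII octets, and rejects input that is not valid UTF-8.

```python
def enclosing_mime_type(header_block: bytes) -> str:
    """Choose the MIME type for an attached message from its raw header block."""
    if header_block.isascii():
        return "message/rfc822"
    # Raises UnicodeDecodeError if the octets are not valid UTF-8.
    header_block.decode("utf-8")
    return "message/global"


print(enclosing_mime_type(b"Subject: Meeting\r\n"))
print(enclosing_mime_type("Subject: 会議の確認\r\n".encode("utf-8")))
```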
Where UTF-8 Is Allowed
RFC 6532 permits UTF-8 in all header field positions where RFC 5322 allows text:
- Display names: From: 田太郎 <taro@example.jp>
- Address local-parts: To: <田太郎@example.jp>
- Domain names: Cc: <user@例.jp>
- Unstructured fields: Subject: 会議の確認
- Comments: From: user@example.com (山田太郎)
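To check how a parser treats these positions, the address examples can be run through Python's standard header parser. The sketch below assumes the default email policy; parse_addresses is an illustrative helper name.

```python
from email import policy


def parse_addresses(field_name: str, value: str) -> list[tuple[str, str, str]]:
    """Parse an address header into (display name, local-part, domain) tuples."""
    header = policy.default.header_factory(field_name, value)
    return [(a.display_name, a.username, a.domain) for a in header.addresses]


print(parse_addresses("From", "田太郎 <taro@example.jp>"))
print(parse_addresses("To", "<田太郎@example.jp>"))
print(parse_addresses("Cc", "<user@例.jp>"))
```

Each tuple shows which part of the address carried the UTF-8 text: the display name, the local-part, or the domain.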
DKIM Signing Considerations
DKIM signatures cover specific headers. When those headers contain raw UTF-8, the signing and verification process must handle the bytes correctly. The DKIM h= tag lists headers to sign, and the canonicalization applies to the raw UTF-8 bytes. Both sender and verifier must process the same byte sequence for signatures to match.
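The byte-level nature of this is easier to see in code. The following is a simplified sketch of the "relaxed" header canonicalization described in RFC 6376, applied to a single raw header field. It is not a DKIM implementation: real signing hashes all signed fields together with the DKIM-Signature field, and the per-field digest here is only for illustration.

```python
import hashlib
import re


def relaxed_header(raw_field: bytes) -> bytes:
    """Canonicalize one raw header field (name, colon, value, CRLF) in relaxed mode."""
    name, _, value = raw_field.partition(b":")
    value = value.replace(b"\r\n", b"")        # unfold continuation lines
    value = re.sub(rb"[ \t]+", b" ", value)    # collapse runs of SP/HTAB to one SP
    value = value.strip(b" ")                  # drop whitespace around the value
    # bytes.lower() only touches ASCII letters, so UTF-8 octets pass through unchanged.
    return name.strip(b" \t").lower() + b":" + value + b"\r\n"


def header_digest(raw_field: bytes) -> str:
    """Illustrative SHA-256 digest of a single canonicalized field."""
    return hashlib.sha256(relaxed_header(raw_field)).hexdigest()


subject = "Subject: 会議の確認\r\n".encode("utf-8")
respaced = "Subject:   会議の確認  \r\n".encode("utf-8")
altered = "Subject: 会議の確認!\r\n".encode("utf-8")

print(header_digest(subject) == header_digest(respaced))  # whitespace changes are tolerated
print(header_digest(subject) == header_digest(altered))   # any other byte change is not
```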
Backward Compatibility
RFC 6532 messages are not backward compatible with mail systems that only understand RFC 5322. A server or client that receives a message/global message but doesn't support RFC 6532 may:
- Display raw UTF-8 bytes (which most modern clients handle reasonably)
- Fail to parse address headers containing non-ASCII characters
- Break DKIM verification if the implementation doesn't handle UTF-8 headers
Common Mistakes
- Using RFC 6532 headers without SMTPUTF8 in the envelope. If the SMTP session didn't use the SMTPUTF8 parameter, the message must use RFC 2047 encoded words for non-ASCII header content. RFC 6532 headers are only valid when sent via RFC 6531.
- Mixing RFC 2047 and raw UTF-8 in the same header. Pick one approach per message. If you're sending via SMTPUTF8, use raw UTF-8. If not, use RFC 2047 throughout. Mixing them creates parsing ambiguity.
- Using message/rfc822 for internationalized attachments. When forwarding or attaching a message that contains UTF-8 headers, use message/global as the MIME type, not message/rfc822.
- Ignoring Unicode normalization. The same visual character can have multiple byte representations in Unicode. Use NFC normalization consistently for email addresses and header content to prevent matching and verification failures.
- Assuming all mail clients render UTF-8 correctly. While modern clients handle UTF-8 well, some older or embedded mail clients may not. For critical transactional email, consider whether your audience includes users on legacy systems.
- Breaking DKIM by modifying UTF-8 headers in transit. Some mail processing systems normalize or re-encode UTF-8 text. Any modification to a DKIM-signed header, even changing the Unicode normalization form, invalidates the signature.
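The normalization pitfall is easy to reproduce. The sketch below uses Python's unicodedata module to show two spellings of the same katakana character that render identically but differ on the wire, and a small helper (to_nfc, an illustrative name) that brings header text to NFC before it is stored, compared, or signed.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Normalize header or address text to Unicode NFC."""
    return unicodedata.normalize("NFC", text)


composed = "\u30ac"           # ガ as a single code point
decomposed = "\u30ab\u3099"   # カ followed by a combining voiced sound mark

print(composed == decomposed)                                  # different code points
print(composed.encode("utf-8") == decomposed.encode("utf-8"))  # different bytes
print(to_nfc(composed) == to_nfc(decomposed))                  # equal once normalized
```

Normalization belongs before signing. Applying it to a message that already carries a DKIM signature changes the signed bytes.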
Deliverability Impact
- Better display for international recipients. Native-script display names and subjects look professional and trustworthy. Encoded-word gibberish in headers can confuse recipients and trigger suspicion.
- DKIM must handle UTF-8 correctly. If your DKIM implementation signs UTF-8 headers and the verifier processes them differently (different normalization, different encoding), the signature fails. This directly impacts DMARC alignment and deliverability.
- Growing provider support. Gmail, Outlook.com, and other major providers support RFC 6532 headers. Messages with clean UTF-8 headers are displayed correctly by these providers.
- Required for fully internationalized addresses. You cannot have an internationalized email address without RFC 6532 headers. As EAI adoption grows, supporting this RFC becomes essential for reaching global users.
- Fallback strategy matters. For maximum deliverability, generate both a message/global version (for SMTPUTF8-capable servers) and a downgraded RFC 2047 version (for others). Your sending infrastructure should choose based on the receiver's capabilities.
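A minimal sketch of that choice, using Python's standard email package, might look like the following. The names serialize_for and CannotDowngrade are hypothetical. The sketch assumes the caller already knows whether the receiving server advertised SMTPUTF8, and it only handles headers; a non-ASCII body still needs a suitable transfer encoding. A downgrade is possible only when every address is ASCII, because encoded words cannot appear inside an address.

```python
from email import policy
from email.headerregistry import Address
from email.message import EmailMessage

ADDRESS_FIELDS = ("From", "Sender", "Reply-To", "To", "Cc")


class CannotDowngrade(Exception):
    """Raised when a message has non-ASCII addresses and no SMTPUTF8 path exists."""


def has_non_ascii_address(msg: EmailMessage) -> bool:
    """Return True if any address header contains a non-ASCII addr-spec."""
    for field in ADDRESS_FIELDS:
        for header in msg.get_all(field, []):
            if any(not addr.addr_spec.isascii() for addr in header.addresses):
                return True
    return False


def serialize_for(msg: EmailMessage, server_supports_smtputf8: bool) -> bytes:
    """Pick raw UTF-8 headers or RFC 2047 encoded words based on server support."""
    if server_supports_smtputf8:
        return msg.as_bytes(policy=policy.SMTPUTF8)
    if has_non_ascii_address(msg):
        raise CannotDowngrade("non-ASCII address cannot be expressed in RFC 2047")
    return msg.as_bytes(policy=policy.SMTP)


msg = EmailMessage()
msg["From"] = Address("田太郎", "taro", "example.jp")
msg["To"] = "user@example.com"
msg["Subject"] = "会議の確認"
msg.set_content("hello\n")

utf8_version = serialize_for(msg, server_supports_smtputf8=True)
downgraded_version = serialize_for(msg, server_supports_smtputf8=False)
```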