
RFC 2231: MIME Parameter Value Extensions

Current Standard · MIME (Multipurpose Internet Mail Extensions) · Tags: MIME, Character Sets, Internationalization, Attachments
ELI5: Email attachments have filenames, and filenames can contain non-English characters or be very long. Original MIME could only handle plain ASCII values in headers. RFC 2231 adds three features: support for non-ASCII characters (like Japanese or Arabic filenames), language tagging, and splitting long values across multiple lines.

Why This Exists

MIME parameters appear in headers like Content-Type and Content-Disposition. The most common use case is the filename parameter for attachments:

Content-Disposition: attachment; filename="report.pdf"

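This works fine for ASCII filenames. But what about a file named 報告書.pdf (Japanese for "report") or a filename with 200 characters? The original MIME specification (RFC 2045) had no mechanism for either case. RFC 2231 solves three problems: parameter values that contain non-ASCII characters, values that need a language tag, and values that are too long for a single header line.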

How It Works

Character Set and Language Encoding

To include non-ASCII characters, append an asterisk to the parameter name and use the format charset'language'encoded-value:

Content-Disposition: attachment;
    filename*=UTF-8''%E5%A0%B1%E5%91%8A%E6%9B%B8.pdf
              ^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              |      percent-encoded value
              charset, followed by '' (empty = no language tag)

The value %E5%A0%B1%E5%91%8A%E6%9B%B8 is the percent-encoded form of the UTF-8 bytes for "報告書". The language tag between the single quotes is optional and often left empty, but both single quotes must still be present.
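The format above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function names encode_extended_value and decode_extended_value are hypothetical, and the sketch assumes the value has already been separated from the parameter name.

```python
from urllib.parse import quote, unquote_to_bytes


def encode_extended_value(text, charset="UTF-8", language=""):
    """Build charset'language'percent-encoded-value for a name* parameter."""
    # safe="" makes quote() escape everything except letters, digits and "_.-~",
    # all of which may appear unescaped in an RFC 2231 extended value.
    encoded = quote(text, safe="", encoding=charset)
    return f"{charset}'{language}'{encoded}"


def decode_extended_value(raw):
    """Split charset'language'value and decode the percent-encoded bytes."""
    charset, language, encoded = raw.split("'", 2)
    # Assumption for this sketch: an empty charset falls back to US-ASCII.
    text = unquote_to_bytes(encoded).decode(charset or "us-ascii")
    return charset, language, text


value = encode_extended_value("報告書.pdf")
print("filename*=" + value)
print(decode_extended_value(value))
```

Splitting on the first two single quotes only (split("'", 2)) keeps any later quote characters inside the value intact.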

Continuations for Long Values

When a parameter value is too long for a single header line, split it using numbered continuations:

Content-Disposition: attachment;
    filename*0="very-long-document-name-that-exceeds-the";
    filename*1="-reasonable-line-length-limit-for-headers.pdf"

The parts are reassembled in numeric order: *0, *1, *2, etc.
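As a rough sketch of that reassembly step, the helper below joins plain (non-encoded) continuation segments. The name join_continuations is hypothetical, and the sketch assumes the header has already been split into a dict of parameter names and unquoted values.

```python
import re


def join_continuations(params, name):
    """Concatenate name*0, name*1, ... from params in numeric order."""
    numbered = re.compile(re.escape(name) + r"\*(\d+)")
    segments = {}
    for key, value in params.items():
        match = numbered.fullmatch(key)
        if match:
            # Store by integer index so that *10 sorts after *9, not after *1.
            segments[int(match.group(1))] = value
    return "".join(segments[index] for index in sorted(segments))


# Segments may arrive in any order; the index decides the final position.
params = {
    "filename*1": "-reasonable-line-length-limit-for-headers.pdf",
    "filename*0": "very-long-document-name-that-exceeds-the",
}
print(join_continuations(params, "filename"))
```

Sorting on the integer index rather than on the parameter name matters once a value has ten or more segments.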

Combined: Continuations with Character Sets

For long non-ASCII values, combine both features. Only the first segment includes the charset and language; subsequent segments are just encoded values:

Content-Disposition: attachment;
    filename*0*=UTF-8''%E3%81%93%E3%82%8C%E3%81%AF%E9%95%B7;
    filename*1*=%E3%81%84%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB;
    filename*2*=%E5%90%8D.pdf
            ^^^
            number + trailing asterisk = encoded continuation
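One way to check a header like this is to feed it to Python's standard email package, which understands RFC 2231 parameters. The snippet is a sketch that assumes the header arrives as part of a complete message header block; raw_headers is an illustrative variable, not part of any API.

```python
from email import message_from_string

# The folded header from the example, followed by the blank line that
# ends the header block.
raw_headers = (
    "Content-Disposition: attachment;\n"
    "    filename*0*=UTF-8''%E3%81%93%E3%82%8C%E3%81%AF%E9%95%B7;\n"
    "    filename*1*=%E3%81%84%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB;\n"
    "    filename*2*=%E5%90%8D.pdf\n"
    "\n"
)

message = message_from_string(raw_headers)

# get_filename() joins the numbered segments, percent-decodes them, and
# applies the charset declared in the first segment.
filename = message.get_filename()
print(filename)
```

The parser decodes the charset only after all segments are joined, which matters because a multi-byte character can be split across two continuation segments.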

Key Technical Details

Parameter Name Syntax

Form          Meaning                              Example
filename      Plain ASCII value                    filename="report.pdf"
filename*     Encoded value (charset'lang'value)   filename*=UTF-8''%E5%A0%B1.pdf
filename*0    Continuation part 0, plain ASCII     filename*0="very-long-"
filename*0*   Continuation part 0, encoded         filename*0*=UTF-8''%E5%A0%B1
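The four forms can be told apart with a single regular expression. The sketch below is illustrative only: ParamName and classify_param_name are hypothetical names, and the pattern assumes the base name itself contains no asterisk.

```python
import re
from typing import NamedTuple, Optional


class ParamName(NamedTuple):
    base: str             # e.g. "filename"
    index: Optional[int]  # continuation number, or None if not a continuation
    encoded: bool         # True when the name ends with "*"


_NAME_RE = re.compile(r"(?P<base>[^*]+)(?:\*(?P<index>\d+))?(?P<star>\*)?")


def classify_param_name(name):
    """Split a parameter name into base name, continuation index, encoded flag."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid parameter name: {name!r}")
    index = match.group("index")
    return ParamName(
        base=match.group("base"),
        index=int(index) if index is not None else None,
        encoded=match.group("star") is not None,
    )


for name in ("filename", "filename*", "filename*0", "filename*0*"):
    print(name, classify_param_name(name))
```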

Encoding Rules

Interaction with RFC 2047

RFC 2047 provides encoded-words (=?UTF-8?B?...?=) for non-ASCII text in headers. However, RFC 2047 explicitly states that encoded-words must NOT appear inside quoted strings or parameter values. RFC 2231 is the correct mechanism for MIME parameters. Despite this rule, many mail clients use RFC 2047 in filename parameters anyway, so robust parsers must handle both.
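A tolerant parser might add a fallback along these lines. This is a sketch under stated assumptions: decode_rfc2047_fallback is a hypothetical helper, it is meant to run only on a plain filename value that was not sent in RFC 2231 form, and it relies on the standard library's email.header module to decode the encoded-word.

```python
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def decode_rfc2047_fallback(value):
    """Decode a parameter value that a sender wrongly wrapped in an encoded-word."""
    if not (value.startswith("=?") and value.endswith("?=")):
        return value  # ordinary value, nothing to do
    try:
        return str(make_header(decode_header(value)))
    except (HeaderParseError, LookupError, UnicodeError):
        # Malformed encoded-word or unknown charset: keep the raw text.
        return value


# An encoded-word as some clients send it inside filename="..."
print(decode_rfc2047_fallback("=?UTF-8?B?5aCx5ZGK5pu4LnBkZg==?="))
print(decode_rfc2047_fallback("report.pdf"))
```

Returning the raw value on failure keeps the attachment usable even when the filename cannot be decoded.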

Common Mistakes

Deliverability Impact

Related RFCs