Email Address Internationalization                             K. Hurtta
(EAI)                                                     March 17, 2007
Internet-Draft
Intended status: Experimental
Expires: September 18, 2007


          Encapsulation mechanism for Internationalized Email
                   draft-hurtta-eai-encapsulation-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 18, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   Email Address Internationalization (EAI) is implemented by allowing UTF-8 characters in the SMTP envelope and in mail headers. To deliver email that uses UTF-8 in its headers through an EAI non-compliant environment, a conversion (i.e. downgrading) or encapsulation mechanism is required. Some UTF-8 email may have signed headers or signed header fields. This document describes an encapsulation mechanism for use when conversion cannot be used because of signed email headers. Encapsulation may also be used to forward EAI email through an EAI non-compliant environment in such a way that the original EAI email can be recovered.

Hurtta                 Expires September 18, 2007                [Page 1]
Internet-Draft              EAI Encapsulation                 March 2007

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Addition to internationalized email header
     3.1.  "Downgrade-Method" header field
     3.2.  "I18N-Received" header field
     3.3.  Registration of Downgrade-Method header field
     3.4.  Registration of I18N-Received header field
   4.  Encapsulation format
     4.1.  "multipart/utf8-encapsulated" media type
     4.2.  Registration of media type multipart/utf8-encapsulated
     4.3.  Registration of media type text/utf8-header
   5.  Encapsulation
     5.1.  Generic encapsulation
       5.1.1.  Encapsulation of recursive part
       5.1.2.  Encapsulation example
       5.1.3.  Multipart encapsulation example
       5.1.4.  Unknown top level type encapsulation example #1
       5.1.5.  Unknown top level type encapsulation example #2
     5.2.  Downgrading of internationalized email message
       5.2.1.  Encapsulation example
       5.2.2.  Multipart/signed encapsulation example
     5.3.  Attaching internationalized email message
       5.3.1.  Attaching example
   6.  Decoding encapsulation
     6.1.  Generic decoding
       6.1.1.  Decoding of recursive part
     6.2.  Upgrading of internationalized email message
       6.2.1.  Upgrading example
     6.3.  Retrieving attached internationalized email message
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   Internationalized email includes UTF-8 characters [RFC3629] in email headers. When internationalized email is delivered to an EAI non-compliant environment, its email header fields are converted (i.e. downgraded) to an ASCII compatible form. When the email comes back to an EAI compliant environment, it is upgraded to internationalized form by decoding the ASCII compatible encodings.

   When internationalized email is downgraded to ASCII compatible form and then upgraded to internationalized form, the result is not necessarily the original mail. For example, some header fields may originally have used an ASCII compatible form, but upgrading converts them to UTF-8 form.

   Sometimes, however, it may be required that the original internationalized email header part can be recovered. This document describes an encapsulation mechanism which allows the original internationalized email to be recovered.

   If mail headers, or some mail header fields and message parts, are cryptographically signed, the original mail may have to be recovered before the signature of the mail is checked.

   This document provides an encapsulation method which has the following properties:

   o  The encapsulation does not produce nested encodings.

   o  The content of an encapsulated mail is accessible to EAI non-compliant user agents.

   o  The encapsulation does not hide original MIME parts, although the original MIME structure may be obscured.
   o  The encapsulation provides a way to recover the original internationalized email.

   The media types "multipart/utf8-encapsulated" and "text/utf8-header" are introduced.

   This document provides markup which indicates when downgrading to an EAI non-compliant environment should be done with this encapsulation. That is done by adding a "Downgrade-Method: encapsulate" header field.

   If internationalized email has been encapsulated, a "Downgrade-Method: encapsulated" header field is used.

   Only a minimal number of header fields are generated or left in the header part of the encapsulated message. This hides signatures that are carried in header fields for the duration of the encapsulation. For example, DomainKeys Identified Mail [DKIM-Charter] uses this kind of signature. The original header fields are stored in the "text/utf8-header" MIME part.

   The "multipart/signed" media type [RFC1847] signs header fields from the MIME header. After encapsulation the signature check fails, because the MIME header has changed. That signature is hidden by replacing "multipart/signed" in the "Content-Type" header field with the "multipart/mixed" value. The original "Content-Type" header field is stored in the "text/utf8-header" MIME part.

   This encapsulation copies all header fields of the internationalized email to the "text/utf8-header" MIME part. The saved header fields from the "text/utf8-header" MIME part and the "Received" header fields from the encapsulating email are used when upgrading. Because this may cause duplication of "Received" header fields, the original "Received" header fields are renamed to "I18N-Received" during encapsulation.

2.  Terminology

   Terminology for this document is defined in [ietf-eai-framework] and [RFC2045].

3.  Addition to internationalized email header

   New header fields are introduced.

3.1.  "Downgrade-Method" header field

   The "Downgrade-Method" header field is added to the Internet Message Format [RFC2822] as specified below:

   fields            /=  downgrade-method
   downgrade-method  =   "Downgrade-Method:" downgrade-code CRLF
   downgrade-code    =   "encapsulate" / "encapsulated"
                         ; term <CRLF> is defined in [RFC2822]

   The value "encapsulate" requests that downgrading be done with the encapsulation defined in this document. The value "encapsulated" indicates that downgrading has been done with the encapsulation defined in this document.

3.2.  "I18N-Received" header field

   The "I18N-Received" header field is added to the Internet Message Format [RFC2822] as specified below:

   received  /=  "I18N-Received:" name-val-list ";" date-time CRLF
                 ; terms <name-val-list>, <date-time> and <CRLF>
                 ; are defined in [RFC2822]

   Original "Received" header fields are renamed to "I18N-Received" during encapsulation. "Received" header fields are saved under their original name in the "text/utf8-header" MIME part.

3.3.  Registration of Downgrade-Method header field

   This section provides the header field registration application (as per [RFC3864]).

   Header field name: Downgrade-Method

   Applicable protocol: mail

   Status: experimental

   Author/Change controller: Kari Hurtta <hurtta-ietf@elmme-mailer.org>

   Specification document(s): RFC XXXX

   Related information: Downgrade-Method is used for signaling encapsulation with the multipart/utf8-encapsulated media type.

3.4.  Registration of I18N-Received header field

   This section provides the header field registration application (as per [RFC3864]).

   Header field name: I18N-Received

   Applicable protocol: mail

   Status: experimental

   Author/Change controller: Kari Hurtta <hurtta-ietf@elmme-mailer.org>

   Specification document(s): RFC XXXX

   Related information: I18N-Received is used together with the multipart/utf8-encapsulated media type.

4.  Encapsulation format

   The "multipart/utf8-encapsulated" media type splits an internationalized email message [ietf-eai-utf8headers] or a MIME body part into two parts:

   o  The header part of the internationalized email message, or the header part of the MIME body part, is put into the first body part of the "multipart/utf8-encapsulated" media type. The media type of the first body part is "text/utf8-header".

   o  The body part of the internationalized email message, or the body part of the MIME body part, is put into the second body part of the "multipart/utf8-encapsulated" media type. The media type of the second body part is the same as the media type of the original internationalized email message or the original MIME body part. However, if the media type of the original internationalized email message or the original MIME body part was "multipart/signed", the media type of the second body part is "multipart/mixed". In some cases the media type is "application/octet-stream".

   NOTE: This encapsulation assumes that the "preamble" and "epilogue" areas of multipart media types include only ASCII. If these areas include UTF-8 text, that text is lost if the encapsulating "multipart/utf8-encapsulated" is converted to an ASCII compatible format (i.e. during 8BITMIME downgrading [RFC1652].)

   The loss of UTF-8 text in the "preamble" and "epilogue" areas of multipart media types could be solved by adding a third and a fourth body part to the "multipart/utf8-encapsulated" media type. However, the author believes that this would unnecessarily complicate the encapsulation format and algorithm.

   The author assumes that messages which use signing do not put UTF-8 text into the "preamble" and "epilogue" areas of multipart media types. If the message is not signed, lost "preamble" and "epilogue" areas do not cause harm.

4.1.  "multipart/utf8-encapsulated" media type

   The "multipart/utf8-encapsulated" media type can be used in three different roles. The "type" parameter is defined for the "multipart/utf8-encapsulated" media type.
   The value of the "type" parameter is defined as follows:

   type-value = "encapsulated" / "message" / "part"

   o  The value "encapsulated" is used when the "multipart/utf8-encapsulated" media type is used as the downgrading format of internationalized email. The value of "type" is set to "encapsulated" when internationalized email is downgraded because of the "encapsulate" value of the "Downgrade-Method" header field.

   o  The value "message" is used when the "multipart/utf8-encapsulated" media type serves the same purpose for which the "message/rfc822" media type is used with non-EAI content. The value of "type" is set to "message" when internationalized email is attached to or included in a message. Roughly, "multipart/utf8-encapsulated; type=message" is the equivalent of "message/rfc822", except that the format of the attachment is different.

   o  The value "part" is used when the "multipart/utf8-encapsulated" media type is used as the downgrading format of a MIME body part. The value of "type" is set to "part" when the MIME structure of an internationalized email or MIME body part is recursively downgraded and a MIME body part with a UTF-8 header is found.

4.2.  Registration of media type multipart/utf8-encapsulated

   This section provides the media type registration application (as per [RFC4288]).

   Type name: multipart

   Subtype name: utf8-encapsulated

   Required parameters:

      The "boundary" parameter is required as per RFC 2046.

      The "type" parameter is required as per RFC XXXX.

   Optional parameters:

   Encoding considerations: 8bit or binary

   Security considerations:

      This media type provides a method to encapsulate mail data. In particular, this media type provides a method to smuggle mail header fields so that mail scanners do not see them. This may cause new security threats.

      This encapsulation does not hide original MIME parts. However, the original MIME structure may be obscured. This may provide a method to smuggle MIME parts so that mail scanners do not see them. This may cause new security threats.

      This encapsulation preserves only "Received" header fields from the encapsulating message. This may hide information when the encapsulated message is upgraded to internationalized email format.

   Interoperability considerations:

      This media type provides a method to encapsulate internationalized email. The recipient of an encapsulated email must decode the encapsulation before the email is fully accessible. However, original MIME parts are not hidden from mail agents which do not know the encapsulation used by this media type.

   Published specification: RFC XXXX

   Applications that use this media type:

      Internationalized mail user agents (MUAs), mail transport agents (MTAs) and IMAP servers.

   Additional information:

      Magic number(s):

      File extension(s):

      Macintosh file type code(s):

   Person & email address to contact for further information:

      Kari Hurtta <hurtta-ietf@elmme-mailer.org>

   Intended usage: common

   Restrictions on usage:

   Author: Kari Hurtta

   Change controller: Kari Hurtta

4.3.  Registration of media type text/utf8-header

   This section provides the media type registration application (as per [RFC4288]).

   Type name: text

   Subtype name: utf8-header

   Required parameters:

      The "charset" parameter with value "UTF-8", if a UTF-8 header is in fact encapsulated.

   Optional parameters: charset

   Encoding considerations: 7bit or 8bit

      "8bit", if a UTF-8 header is in fact encapsulated.

   Security considerations:

      This media type provides a method to encapsulate mail data. In particular, this media type provides a method to smuggle mail header fields so that mail scanners do not see them. This may cause new security threats.

   Interoperability considerations:

      Mail agents which do not know this media type treat it as the text/plain media type.

   Published specification: RFC XXXX

   Applications that use this media type:

      Internationalized mail user agents (MUAs), mail transport agents (MTAs) and IMAP servers.
   Additional information:

      In some cases an ASCII header part is encapsulated instead of a UTF-8 header part.

      Magic number(s):

      File extension(s):

      Macintosh file type code(s):

   Person & email address to contact for further information:

      Kari Hurtta <hurtta-ietf@elmme-mailer.org>

   Intended usage: common

   Restrictions on usage:

   Author: Kari Hurtta

   Change controller: Kari Hurtta

5.  Encapsulation

   On encapsulation, the internationalized email message or MIME body part is split into the two MIME body parts of the "multipart/utf8-encapsulated" media type.

   There are three cases of encapsulation:

   o  When an internationalized email message is downgraded.

   o  When an internationalized email message is attached to or included in a message.

   o  When a MIME body part is encapsulated because there is UTF-8 text in its header. This happens in the recursive part of the algorithm.

   The parts common to these three encapsulations are described in "Generic encapsulation" (Section 5.1).

5.1.  Generic encapsulation

   An email message and a MIME body part both follow the same general syntax:

   o  Both have a header part and a body part.

   o  The header part is followed by the body part, and the two are separated by an empty line.

   The term "entity" refers to both an email message and a MIME body part.

   NOTE: Subtypes of "message" (i.e. media type is message/*) other than "message/rfc822", and composite types other than "multipart" or "message", are treated specially. This is done so that this algorithm is stable and its result does not change when new subtypes of "message" are registered or new composite top level types are standardised. Unknown top level types are treated the same way, because it is not possible to know whether the top level type is composite. In these cases the type is also treated as "application/octet-stream" if the body part of the original entity includes non-ASCII characters. Type information is not lost, because the whole header part of the original entity is stored in the body of the first MIME body part of the encapsulating entity. Processing as "application/octet-stream" is done because the algorithm does not know how to encapsulate the entity as a composite type. If the body part of the original entity includes only ASCII characters, there cannot be UTF-8 headers (when it is treated as a composite type).

   NOTE: This algorithm looks complex. This complexity is a result of the requirement that the so-called "nested encoding rule" is not violated. Because of this requirement, composite media types must be processed recursively.

   The special handling of unknown composite types which include non-ASCII characters as "application/octet-stream" does violate the "nested encoding rule". In the case of unknown types this is unavoidable, because it is not possible to parse the internal structure of unknown types.

   The encapsulating entity is generated from the original internationalized email message or MIME body part in the following way:

   o  A new header part for the encapsulating entity is generated.

      *  The media type of the encapsulating entity is "multipart/utf8-encapsulated".

   o  A new body part for the encapsulating entity is generated. This body part consists of two MIME body parts.

      *  The media type of the first MIME body part is "text/utf8-header".

         +  The value of the "charset" parameter is "UTF-8" if the header part of the original entity includes UTF-8 characters.

         +  NOTE: This seems strange, but in certain cases an ASCII-only header part is also encapsulated. In that case the "charset" parameter is not required.

      *  The body of the first MIME body part is the header part of the original entity (the original internationalized email message or MIME body part).

      *  It is strongly recommended that the body of the first MIME body part is base64 encoded (and, of course, that the "content-transfer-encoding" header field is updated correspondingly).

      *  Generation of the second MIME body part is described below.
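
   The generation steps above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not a reference implementation: the function names build_utf8_header_part and build_encapsulating_entity are hypothetical, the second MIME body part is assumed to have been generated already by the caller, the boundary is assumed to be ASCII and not to occur inside either body part, and header fields other than "Content-Type" on the encapsulating entity are left out.

```python
import base64

CRLF = b"\r\n"


def build_utf8_header_part(header_bytes: bytes) -> bytes:
    """Wrap the header part of the original entity in a text/utf8-header part.

    Hypothetical helper: header_bytes is the raw header part of the
    original entity, without the empty line that separates it from the body.
    """
    # "charset=UTF-8" is only needed when the header really is non-ASCII.
    is_ascii = all(byte < 0x80 for byte in header_bytes)
    content_type = b"text/utf8-header"
    if not is_ascii:
        content_type += b"; charset=UTF-8"
    # Base64 is the recommended encoding; encodebytes() wraps lines with
    # "\n", so the line ends are converted to CRLF for mail transport.
    body = base64.encodebytes(header_bytes).replace(b"\n", CRLF)
    return b"".join([
        b"Content-Type: ", content_type, CRLF,
        b"Content-Transfer-Encoding: base64", CRLF,
        CRLF,
        body,
    ])


def build_encapsulating_entity(header_bytes: bytes, second_part: bytes,
                               boundary: str, type_value: str) -> bytes:
    """Assemble a multipart/utf8-encapsulated entity from its two parts.

    Assumption: second_part is a complete MIME body part (header, empty
    line, body) produced elsewhere; this sketch does not generate it.
    """
    if type_value not in ("encapsulated", "message", "part"):
        raise ValueError("unknown type parameter value: " + type_value)
    mark = boundary.encode("ascii")  # the boundary must be ASCII-only
    return b"".join([
        b"Content-Type: multipart/utf8-encapsulated; type=",
        type_value.encode("ascii"), b'; boundary="', mark, b'"', CRLF,
        CRLF,
        b"--", mark, CRLF,
        build_utf8_header_part(header_bytes),
        CRLF, b"--", mark, CRLF,
        second_part,
        CRLF, b"--", mark, b"--", CRLF,
    ])
```

   For example, calling build_encapsulating_entity() with the UTF-8 header of a body part, an already generated second part, the boundary "12345" and the type value "part" yields an entity whose first body part carries the base64 encoded original header. Choosing base64 unconditionally keeps the sketch simple; an implementation could choose "8bit" instead where the transport allows it.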

   NOTE: The Unix mailbox format changes "From" at the beginning of a line to ">From". Therefore it is useful that "text/utf8-header" is encoded with base64 even when it includes only ASCII header fields. Actually it is more common to replace "From " with ">From ". This does not touch the "From" header field (if there is no space between "From" and ":").

   The second MIME body part of the encapsulating entity is generated in the following way:

   1.  The original entity is checked for the following cases:

       *  The media type value (type/subtype) of the original entity includes characters other than ASCII [ASCII]. This is an error condition.

       *  The media type of the original entity is a subtype of "message" (i.e. media type is message/*), it is not "message/rfc822", and the body part of the original entity includes non-ASCII characters.

       *  The top level type of the original entity is unknown, the body part of the original entity includes non-ASCII characters, and the encoding of the original entity is identity (i.e. "content-transfer-encoding" is "8bit" or "binary").

       *  The top level type of the original entity is a composite type other than "multipart" or "message", and the body part of the original entity includes non-ASCII characters.

       *  The media type of the original entity is a subtype of "multipart" (i.e. media type is multipart/*) and the "boundary" parameter is missing. This is an error condition.

       If any of these cases is found:

       *  The media type of the second MIME body part is "application/octet-stream".

       *  The body part of the second MIME body part is the body part of the original entity.

       *  The "Content-transfer-encoding" value for the second MIME body part is copied from the original entity if it includes only ASCII characters. Otherwise it is set to "7bit", "8bit" or "binary" as appropriate. A non-ASCII value is an error condition.

   2.  Otherwise, if the media type of the original entity is "multipart/signed", then:

       *  The media type of the second MIME body part is "multipart/mixed".

          +  Generation of the "boundary" parameter is described below.

       *  The body part of the second MIME body part is the "Composite encapsulated body part". Generation of this is described below.

       *  "Content-transfer-encoding" is set to "7bit", "8bit" or "binary" as appropriate.

          +  If the "Content-transfer-encoding" value of the original entity is other than "7bit", "8bit" or "binary", this is an error condition.

   3.  Otherwise the original entity is checked for the following cases:

       *  The top level type of the original entity is a type other than "multipart" or "message".

          +  This includes all discrete media types.

          +  This includes all unknown top level types.

       *  The media type of the original entity is a subtype of "message" (i.e. media type is message/*) and it is not "message/rfc822".

       If any of these cases is found:

       *  The media type of the second MIME body part is the same as the media type of the original entity.

          +  Copying of media type parameters from the original entity to the second MIME body part is described below.

       *  The body part of the second MIME body part is the body part of the original entity.

       *  The "Content-transfer-encoding" value for the second MIME body part is copied from the original entity if it includes only ASCII characters. Otherwise it is set to "7bit", "8bit" or "binary" as appropriate. A non-ASCII value is an error condition.

   4.  Otherwise, if the original entity is of a composite type ("multipart" or "message/rfc822"):

       *  The media type of the second MIME body part is the same as the media type of the original entity.

          +  Copying of media type parameters from the original entity to the second MIME body part is described below.

       *  Generation of the body part for the second MIME body part is handled specially when the media type of the original entity is composite.

          +  The body of the original entity is scanned when the entity is composite.

          +  In general this makes the processing recursive.

          +  The body part of the second MIME body part is called the "composite encapsulated body part" if the media type of the original entity is composite. Generation of this body part is described below.

       *  "Content-transfer-encoding" is set to "7bit", "8bit" or "binary" as appropriate.

          +  If the "Content-transfer-encoding" value of the original entity is other than "7bit", "8bit" or "binary", this is an error condition.

   Media type parameters are copied from the original entity to the second MIME body part in the following way:

   o  This copying is done when the media type of the second MIME body part is the same as the media type of the original entity.

   o  ASCII parameters are copied (however, see the special note about "boundary" below.)

   o  UTF-8 comments are removed.

   o  Parameters which have a UTF-8 value are encoded according to [RFC2231] when copied.

      *  If the required parameters of the media type are known, and a parameter is not required for the media type, it does not have to be copied (and encoded according to [RFC2231]).

   o  If a parameter name has UTF-8 characters, this is an error condition and the parameter is not copied.

   o  If the "boundary" parameter value of a multipart media type has UTF-8 characters, it is handled specially. This is described below.

   The "Composite encapsulated body part" is generated in the following way:

   o  If "application/octet-stream" was assigned as the media type of the second MIME body part, the body part of the original entity is the resulting "Composite encapsulated body part". This case is mentioned above.

   o  If the media type of the original entity is a subtype of "message" (i.e. media type is message/*) and it is not "message/rfc822", the body part of the original entity is the resulting "Composite encapsulated body part". This case is mentioned above.
o If media type of original entity is "message/rfc822", body part of original entity parsed (to header and body part) and is processed as described on "Encapsulation of recursive part" (Section 5.1.1). Result of processing is "Composite encapsulated body part". o If media type of original entity is subtype of "multipart" (i.e. media type is multipart/*), body part of original entity is processed as described on next chapter. Result of processing is "Composite encapsulated body part". o If top level type of original entity is other composite type than "multipart" or "message", it is treated as unknown type. This processing is described on previous chapter. For multipart types "Composite encapsulated body part" is generated as following: 1. A "boundary" parameter value from original entity is remembered. * Handling of missing "boundary" parameter is described on previous chapters. 2. A "boundary" parameter value for second MIME body part is selected. * Selected "boundary" parameter value must include only ASCII characters. Hurtta Expires September 18, 2007 [Page 14] Internet-Draft EAI Encapsulation March 2007 * In general this can be same than a "boundary" parameter value from original entity. * If a "boundary" parameter value from original entity includes UTF-8 characters, new ASCII-only value must selected. 3. The "preamble" area from body of original entity is copied to "Composite encapsulated body part". * If "preamble" area includes non-ASCII characters, this is an error condition. 4. Body parts of multipart (from body of original entity) are handled: 1. A boundary delimiter line is copied to "Composite encapsulated body part", but that way that a boundary of original entity is replaced with selected boundary of second MIME body part. 2. A body part is parsed (to header and body part) and is processed as described on "Encapsulation of recursive part" (Section 5.1.1). Result is copied to "Composite encapsulated body part". 5. 
A final boundary delimiter line is copied (from body of original entity), to "Composite encapsulated body part" but that way that boundary of original entity is replaced with selected boundary of second MIME body part. * A final final boundary delimiter line is not generated to "Composite encapsulated body part" if a final boundary delimiter line is missing on original entity. This is an error condition. 6. The "epilogue" area from body of original entity is copied to "Composite encapsulated body part". * If "epilogue" area includes non-ASCII characters, this is error condition. NOTE: The CRLF preceding the boundary delimiter line is conceptually attached to the boundary (as per [RFC2046]). That CRLF is not part of body part of multipart. If encapsulation and decoding of encapsulation process this CRLF different way, this encapsulation do not preserve all CRLFes or add extra CRLFes. NOTE: If original "Content-transfer-encoding" includes non-ASCII characters, this algorithm do not able to decode resulting encapsulation. Therefore it is recommended that internationalized email message is bounced or rejected on that error condition. 5.1.1. Encapsulation of recursive part An encapsulation of recursive part is done following way: 1. If header part of recursive part includes UTF-8 characters or if media type of recursive part is "multipart/signed" then Hurtta Expires September 18, 2007 [Page 15] Internet-Draft EAI Encapsulation March 2007 * Recursive part is considered to be "original entity" and "Generic encapsulation" (Section 5.1) is applied. * Value of parameter "type" is set to "part" for resulting "multipart/utf8-encapsulated" encapsulating entity. * Resulting encapsulating entity result is result for "Encapsulation on recursive part". 2. Otherwise if media type of recursive part is discrete, result for "Encapsulation on recursive part" is recursive part itself. 3. 
Otherwise if media type of recursive part is "message/rfc822", then * Header part of result for "Encapsulation on recursive part" result, is header part of recursive part. * Body part of recursive part is parsed (to header and body part) and is processed as described on "Encapsulation of recursive part" (Section 5.1.1). Body part of result for "Encapsulation on recursive part" is result of processing. 4. Otherwise if top level type is multipart (i.e. media type is multipart/*) and "boundary" parameter exists, handling of it is described on next chapter. * Missing "boundary" parameter on multipart types is error condition. 5. Otherwise if recursive part is ASCII only (body is ASCII, i.e. "content-transfer-encoding" is "7bit") result for "Encapsulation on recursive part" is recursive part itself. 6. Otherwise if encoding of recursive part is not identity (i.e. "content-transfer-encoding" is not "8bit" or "binary") result for "Encapsulation on recursive part" is recursive part itself. 7. Otherwise * Recursive part is considered to be "original entity" and "Generic encapsulation" (Section 5.1) is applied. * For resulting "multipart/utf8-encapsulated" encapsulating entity parameter "type" is set "part" as value. * Resulting encapsulating entity result is result for "Encapsulation on recursive part". * NOTE: This seems strange, but unknown composite media types are always encapsulated, if there is possibility that they include embedded UTF-8 headers. If top level type is multipart, result for "Encapsulation on recursive part" is generated following way: 1. Header part of result for "Encapsulation on recursive part" result is header part of recursive part. 2. A "boundary" parameter value from recursive part is remembered. * Handling of missing "boundary" parameter is described on previous chapters. 3. Body part for "Encapsulation on recursive part" result is initiated. Hurtta Expires September 18, 2007 [Page 16] Internet-Draft EAI Encapsulation March 2007 4. 
A "preamble" area from body of recursive part is copied to body part for "Encapsulation on recursive part" result. * If a "preamble" area includes non-ASCII characters, this is an error condition. 5. Body parts of multipart (from body of recursive part) are handled: 1. A boundary delimiter line is copied to body part for "Encapsulation on recursive part" result. 2. A body part is parsed (to header and body part) and is processed as described on "Encapsulation of recursive part" (Section 5.1.1). Result is copied to body part for "Encapsulation on recursive part" result. 6. A final boundary delimiter line is copied (from body of recursive part) to body part for "Encapsulation on recursive part" result. * A final final boundary delimiter line is not generated to body part for "Encapsulation on recursive part" result if final boundary delimiter line is missing on recursive part. This is an error condition. 7. An "epilogue" area from body of recursive part is copied to body part for "Encapsulation on recursive part" result. * If "epilogue" area includes non-ASCII characters, this is error condition. Hurtta Expires September 18, 2007 [Page 17] Internet-Draft EAI Encapsulation March 2007 5.1.2. 
Encapsulation example

   An encapsulation example

   Original internationalized entity:

   ==========================================
   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }
   ==========================================

   Encapsulated entity:

   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345--
   ==========================================

   The empty line at the end of the "text/utf8-header" body is not
   copied from the original header part.  It is part of the next
   boundary delimiter line of "multipart/utf8-encapsulated".

   NOTE: In this example the "text/utf8-header" part is not base64
   encoded, for clarity.  Base64 encoding is recommended.

5.1.3.
Multipart encapsulation example

   A multipart encapsulation example

   Original internationalized entity:

   ==========================================
   Content-Type: Multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345--
   ==========================================

   Encapsulated entity:

   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="67890"
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: text/utf8-header; charset=US-ASCII
   Content-Transfer-Encoding: 7bit

   Content-Type: Multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: Multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: multipart/utf8-encapsulated; type=part;
     boundary="abcde"
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --abcde--

   --12345--

   --67890--
   ==========================================

   "Generic encapsulation" always encapsulates the top level header
   part, even when it is US-ASCII only.  In general it is assumed that
   an internationalized email always has some header fields which
   require this encapsulation.

5.1.4.
Unknown top level type encapsulation example #1

   An encapsulation example for an unknown top level type with a 7-bit
   body

   Original internationalized entity:

   ==========================================
   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 7bit

   { ASCII text }
   ==========================================

   Encapsulated entity:

   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 7bit

   --12345
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 7bit

   { ASCII text }

   --12345--
   ==========================================

5.1.5.  Unknown top level type encapsulation example #2

   An encapsulation example for an unknown top level type with an 8-bit
   body

   Original internationalized entity:

   ==========================================
   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 8bit

   { non-ASCII text }
   ==========================================

   Encapsulated entity:

   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: application/octet-stream
   Content-Transfer-Encoding: 8bit

   { non-ASCII text }

   --12345--
   ==========================================

5.2.  Downgrading of internationalized email message

   When an internationalized email [ietf-eai-utf8headers] leaves an EAI
   compliant environment, a downgrade is required.
   [ietf-eai-downgrade] describes when the downgrade occurs.
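   The encapsulated entities shown in the Section 5.1 examples can be
   produced along the following lines.  This is a minimal Python
   sketch, not part of this specification.  The names split_fields and
   encapsulate_discrete, the default boundary, and the decision to
   repeat only "Content-Type" and "Content-Transfer-Encoding" on the
   second body part (as the examples do) are assumptions made for
   illustration.  The sketch covers only an entity with a known
   discrete media type: recursive handling of multipart and
   message/rfc822 (Section 5.1.1) and relabeling of unknown types as
   "application/octet-stream" are left out.  The header part is base64
   encoded, as recommended above.

```python
import base64

CRLF = b"\r\n"


def split_fields(header: bytes) -> list:
    """Split a header part into fields, keeping folded lines together."""
    fields = []
    for line in header.split(CRLF):
        if not line:
            continue
        if line[:1] in (b" ", b"\t") and fields:
            fields[-1] += CRLF + line  # continuation of a folded field
        else:
            fields.append(line)
    return fields


def encapsulate_discrete(header: bytes, body: bytes, type_param: str,
                         boundary: str = "12345") -> bytes:
    """Hypothetical helper: wrap one entity as multipart/utf8-encapsulated.

    header -- original header part (UTF-8 bytes, CRLF line ends, without
              the empty separator line); body -- original body, copied
              unchanged.  The caller must choose a boundary that does
              not occur in the content.
    """
    charset = b"US-ASCII" if header.isascii() else b"UTF-8"
    # Assumption drawn from the examples: only these two fields are
    # repeated on the second body part.
    copied = [f for f in split_fields(header)
              if f.split(b":", 1)[0].strip().lower()
              in (b"content-type", b"content-transfer-encoding")]
    delim = b"--" + boundary.encode("ascii")
    # encodebytes() emits "\n" line ends; convert them to CRLF.
    encoded = base64.encodebytes(header).replace(b"\n", CRLF)
    return b"".join([
        b"Content-Type: multipart/utf8-encapsulated; type=",
        type_param.encode("ascii"), b";", CRLF,
        b' boundary="', boundary.encode("ascii"), b'"', CRLF,
        b"Content-Transfer-Encoding: 8bit", CRLF, CRLF,
        delim, CRLF,
        b"Content-Type: text/utf8-header; charset=", charset, CRLF,
        b"Content-Transfer-Encoding: base64", CRLF, CRLF,
        encoded,
        CRLF, delim, CRLF,  # this CRLF belongs to the delimiter line
        CRLF.join(copied), CRLF, CRLF,
        body,
        CRLF, delim, b"--", CRLF,
    ])
```

   A caller would pass "part", "encapsulated" or "message" as
   type_param, depending on which of the operations in Section 5 is
   being performed.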
   This document defines the "Downgrade-Method" header field.  The
   downgrading method is selected in the following way:

   o  If the "Downgrade-Method" header field value is "encapsulate",
      downgrading of the header part (and body) of the mail is done as
      described in this section.
   o  Otherwise, all message headers (including header fields from MIME
      body parts) may need to be parsed to discover that the message is
      an internationalized email and a downgrading candidate.
      *  If a downgrading gateway is configured for tunneling operation
         for some recipients of the mail, downgrading of the header
         part (and body) of the mail for these recipients is done as
         described in this section.
      *  If the "Downgrade-Method" header field does not exist and the
         downgrading gateway is not configured for tunneling operation,
         downgrading of the header part (and body) of the mail is done
         according to [ietf-eai-downgrade].
      *  If the "Downgrade-Method" header field exists and its value is
         not "encapsulate", this specification is not used for
         downgrading of the header part (and body) of the mail.

   Downgrading of an internationalized email is done in the following
   way:

   o  The internationalized email is considered to be the "original
      entity" and "Generic encapsulation" (Section 5.1) is applied.
   o  For the resulting "multipart/utf8-encapsulated" encapsulating
      entity, the "type" parameter is set to "encapsulated".
   o  The resulting encapsulating entity is the downgraded
      internationalized email.
      *  Addition of email header fields to the downgraded
         internationalized email is described below.

   When mail is downgraded, some email header fields must be added.
   "Generic encapsulation" (Section 5.1) does not produce these email
   header fields.

   o  A "Downgrade-Method" header field with value "Encapsulated" is
      added.
   o  "Received" header fields are copied from the original
      internationalized email under a new header field name:
      "I18N-Received" is used as the name of the copied header fields.
      If the "for" clause of a "Received" header field includes
      non-ASCII characters, the clause is removed when the "Received"
      header field is copied to an "I18N-Received" header field.  If a
      "Received" header field includes non-ASCII characters elsewhere
      (outside the "for" clause), it is not copied.
   o  A "Mime-Version" header field with value "1.0" is added.
   o  The "Date" header field is copied from the original
      internationalized email if it includes only ASCII characters.
      Otherwise it is generated.
   o  A "From" header field is added.  Several different values can be
      used for the "From" header field:
      *  The "From" header field from the original internationalized
         email can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
      *  The value for the "From" header field can be taken from the
         downgraded envelope sender address.
      *  An ASCII address which refers to the downgrading gateway can
         be used.
   o  The "Subject" header field is copied from the original
      internationalized email if it includes only ASCII characters.
      Otherwise several different values can be used for the "Subject"
      header field:
      *  The algorithm from [ietf-eai-downgrade] or from [RFC2047] can
         be used.
      *  An ASCII subject which refers to the downgrading operation can
         be used.
   o  If "From" and "Subject" are taken from the original
      internationalized email and the "Message-ID" header field of the
      original internationalized email includes only ASCII characters,
      the "Message-ID" header field is copied (from the original
      internationalized email).  Otherwise it is optionally generated.
   o  Optionally, a "To" header field is added.  Several different
      values can be used for the "To" header field:
      *  The "To" header field from the original internationalized
         email can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
   o  Optionally, a "Cc" header field is added.  Several different
      values can be used for the "Cc" header field:
      *  The "Cc" header field from the original internationalized
         email can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
   o  It is important NOT to copy all ASCII header fields.  Some header
      fields may be used for signatures.  If such a signature is
      checked against the encapsulated form, it fails.  For example,
      Domain Keys Identified Mail [DKIM-Charter] uses this kind of
      signature.

   "Encapsulation on recursive part" (Section 5.1.1) mentions several
   error conditions.  Although it defines the output in those cases, a
   converting MTA is permitted to bounce (return an NDN) or reject (at
   the SMTP level) the internationalized email message.  A downgrading
   MUA can either refuse to downgrade the internationalized email
   message and give an error message to the user, or produce the
   downgraded message and give a warning message to the user.  Silent
   operation is not recommended when an error condition happens (on a
   downgrading MUA).

   NOTE: Only the "Date" and "From" header fields are required in email
   (as per [RFC2822]).  However, the "multipart/utf8-encapsulated"
   format is also usable by non-EAI compliant MUAs, assuming that they
   support MIME, especially if the original internationalized email
   message used UTF-8 characters only in the main header part and not
   in the header parts of MIME body parts.  Therefore it is useful if
   the "From", "Subject", "To" and "Cc" header fields are derived from
   the original internationalized email according to
   [ietf-eai-downgrade].  This allows reply commands to work on non-EAI
   compliant MUAs.

5.2.1.
Encapsulation example

   An encapsulation example

   Original internationalized email:

   ==========================================
   Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }
   ==========================================

   Encapsulated internationalized email:

   ==========================================
   I18N-Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: Encapsulated
   From: { downgraded address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { RFC 2047 encoded subject }
   Mime-Version: 1.0
   Content-Type: multipart/utf8-encapsulated; type=encapsulated;
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345--
   ==========================================

   NOTE: In this example the "text/utf8-header" part is not base64
   encoded, for clarity.  Base64 encoding is recommended, especially
   because the part includes a line starting with "From".

5.2.2.
Multipart/signed encapsulation example

   Multipart/signed [RFC1847] encapsulation example

   Original internationalized email:

   ==========================================
   Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: multipart/signed; protocol="application/XYZ-signature";
     micalg="ABC"; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Description: { UTF-8 description }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345
   Content-Type: application/XYZ-signature

   { signature data }

   --12345--
   ==========================================

   Encapsulated internationalized email:

   ==========================================
   I18N-Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: Encapsulated
   From: { downgraded address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { RFC 2047 encoded subject }
   Mime-Version: 1.0
   Content-Type: multipart/utf8-encapsulated; type=encapsulated;
     boundary="45678"
   Content-Transfer-Encoding: 8bit

   --45678
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: multipart/signed; protocol="application/XYZ-signature";
     micalg="ABC"; boundary=12345
   Content-Transfer-Encoding: 8bit

   --45678
   Content-Type: multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: multipart/utf8-encapsulated; type=part;
     boundary="abcde"
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Content-Description: { UTF-8 description }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --abcde--

   --12345
   Content-Type: application/XYZ-signature

   { signature data }

   --12345--

   --45678--
   ==========================================

   The media type "multipart/signed" is replaced with "multipart/mixed"
   in the encapsulated message.  This allows encapsulation of the
   signed header of the MIME body part.

   NOTE: Again, "text/utf8-header" should be base64 encoded.  This is
   not done here, for clarity.

   NOTE: Normally the various multipart/signed protocols specify that
   the body of the signed content must be quoted-printable or base64
   encoded if it includes 8-bit characters.  If it includes 8-bit
   characters, the signature is broken when the email is 8BITMIME
   downgraded [RFC1652].  Note that in general this encapsulation
   algorithm does not protect against breaking of the signature in that
   case.  In this example it may protect it, but that is only a side
   effect of the encapsulation required for the "Content-Description"
   header field.

5.3.  Attaching internationalized email message

   "Message/rfc822" can be used to attach internationalized email
   messages in EAI compliant environments if "message/rfc822" allows
   UTF-8 header fields.

   "Multipart/utf8-encapsulated" with "type" parameter value "message"
   can be used to attach internationalized email messages in EAI
   non-compliant environments.
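   The "type" parameter values used in Section 5 ("part" in Section
   5.1.1, "encapsulated" in Section 5.2 and "message" in this section)
   can be told apart by a receiver roughly as follows.  This is an
   illustrative Python sketch only: the function name
   encapsulation_role and the returned descriptions are hypothetical,
   and exact (case-sensitive) matching of the parameter value is an
   assumption, since this document does not state how values are
   compared.  The header parsing uses the standard library
   email.message.Message class.

```python
from email.message import Message

# Descriptions are illustrative; the keys are the "type" values
# defined in this document.
ROLES = {
    "encapsulated": "tunneled message",
    "message": "attached message",
    "part": "encapsulated MIME body part",
}


def encapsulation_role(content_type: str):
    """Return the role of an entity from its Content-Type value.

    Returns None when the entity is not multipart/utf8-encapsulated,
    and raises ValueError when the "type" parameter is missing or has
    a value not defined in this document.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "multipart/utf8-encapsulated":
        return None
    value = msg.get_param("type")  # None when the parameter is absent
    if value not in ROLES:
        raise ValueError("missing or unknown type parameter: %r" % (value,))
    return ROLES[value]
```

   For the attachment in the example of Section 5.3.1, whose
   Content-Type carries type=message, the sketch returns "attached
   message".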
   NOTE: When "message/rfc822" contains "Multipart/utf8-encapsulated"
   with "type" parameter value "encapsulated", this also represents an
   attached internationalized email message.  However, the author
   believes that "multipart/utf8-encapsulated" with "type" parameter
   value "message" provides a useful shorthand.

   NOTE: If an internationalized email was stored inside the
   "message/rfc822" media type and that "message/rfc822" is inside a
   MIME structure which is encapsulated, "Encapsulation on recursive
   part" (Section 5.1.1) produces a structure in which "message/rfc822"
   contains "Multipart/utf8-encapsulated" with "type" parameter value
   "part".

   A "Multipart/utf8-encapsulated" entity which represents an
   internationalized email message is produced in the following way:

   o  The internationalized email is considered to be the "original
      entity" and "Generic encapsulation" (Section 5.1) is applied.
   o  The value of the "type" parameter is set to "message" for the
      resulting "multipart/utf8-encapsulated" encapsulating entity.

5.3.1.  Attaching example

   In the following example the mail from the earlier example (Section
   5.2.1) is attached to a message which is sent outside of the EAI
   compliant environment.

   Encapsulating message:

   ==========================================
   From: someone@example.org
   To: A@CC.example.org
   Subject: { UTF-8 subject } (fwd)
   Mime-Version: 1.0
   Content-Type: multipart/mixed; boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain

   See attached message.

   --12345
   Content-Type: multipart/utf8-encapsulated; type=message;
     boundary="67890"
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Downgrade-Method: encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --67890--

   --12345--
   ==========================================

   In this example it is assumed that the MUA knows that
   A@CC.example.org does not handle UTF8SMTP messages and therefore
   encapsulates the attached message.  The recipient
   (A@CC.example.org) may need a helper application for the media type
   multipart/utf8-encapsulated, although the message is mostly readable
   without a helper.

6.  Decoding encapsulation

   There are three cases of encapsulation:

   o  When an internationalized email message is tunneled through an
      EAI non-compliant environment, the media type of the message is
      "multipart/utf8-encapsulated" with "type" parameter value
      "encapsulated".  The original message is inside that type.
   o  When an internationalized email message is an included or
      attached message, the media type "multipart/utf8-encapsulated"
      with "type" parameter value "message" represents the included or
      attached message.
   o  When a MIME body part is encapsulated, the media type
      "multipart/utf8-encapsulated" with "type" parameter value "part"
      encapsulates the original MIME body part.

   "Generic decoding" (Section 6.1) describes the common parts of
   decoding these three encapsulations.

6.1.  Generic decoding

   On decoding, an internationalized email message or a MIME body part
   is extracted from "multipart/utf8-encapsulated".  Both an email
   message and a MIME body part are referred to with the term "entity".

   On error conditions the encapsulating entity is not decoded.
   Instead, the original encapsulating entity is returned.

   The decoded internationalized entity is generated from the
   encapsulating entity (multipart/utf8-encapsulated) in the following
   way:

   o  If the media type of the encapsulating entity is not
      "multipart/utf8-encapsulated", this is an error condition.
   o  If the number of MIME body parts in the encapsulating entity is
      not two (2), this is an error condition.
   o  If the media type of the first MIME body part is not
      "text/utf8-header", this is an error condition.
   o  If the value of the "charset" parameter of the first MIME body
      part is not "UTF-8" or "US-ASCII", this is an error condition.  A
      missing "charset" parameter is treated as equivalent to
      "US-ASCII" as per [RFC2046].
   o  The body of the first MIME body part forms the header part of the
      decoded entity.  Its encoding (as given in the
      "content-transfer-encoding" header field) is decoded.
   o  The body of the second MIME body part forms the body part of the
      decoded entity.  Generation of the body part of the decoded
      entity is described below.

   The body of the decoded entity is generated in the following way:

   o  If the media type of the second MIME body part is discrete and
      the media type for the decoded entity (from the body of the first
      MIME body part) is also discrete, then
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary")
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding (as given in the "content-transfer-encoding"
            header field of the second MIME body part) is decoded.
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is the same as the encoding of the
         second MIME body part,
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
   o  Otherwise, if the top level type of the second MIME body part is
      "multipart" and the top level type for the decoded entity (from
      the body of the first MIME body part) is also "multipart", then
      *  Generation of the body part of the decoded entity is described
         below ("Decoding of multipart").
   o  Otherwise, if the media type of the second MIME body part is
      "message/rfc822" and the media type for the decoded entity (from
      the body of the first MIME body part) is also "message/rfc822",
      then
      *  The body of the second MIME body part is parsed (into header
         and body part) and processed as described in "Decoding of
         recursive part" (Section 6.1.1).  The result is copied to the
         body of the decoded entity.
   o  Otherwise, if the media type of the second MIME body part is
      "application/octet-stream", then
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary")
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding (as given in the "content-transfer-encoding"
            header field of the second MIME body part) is decoded.
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is the same as the encoding of the
         second MIME body part,
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
   o  Otherwise, if the media type of the second MIME body part is the
      same as the media type for the decoded entity (from the body of
      the first MIME body part), then
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary")
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding (as given in the "content-transfer-encoding"
            header field of the second MIME body part) is decoded.
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is the same as the encoding of the
         second MIME body part,
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
         +  NOTE: This algorithm does not handle cases where the body
            part has been re-encoded (for example from quoted-printable
            to base64).  Reversing the re-encoding is of course
            possible, but it does not necessarily give exactly the same
            representation.
      *  NOTE: This handles unknown media types.  However, unknown
         composite media types were stored as
         "application/octet-stream" if they include non-ASCII
         characters, so this handles mostly discrete media types.  It
         is possible that the generator of the encapsulation knows that
         the type is discrete, but the decoder of the encapsulation
         does not know it.
   o  Otherwise this is an error condition.

   The body of the decoded entity is generated in the following way
   when the media type is multipart (both on the second MIME body part
   and on the decoded entity):

   1.  The "boundary" parameter value from the second MIME body part is
       remembered.
       *  If the "boundary" parameter is missing, this is an error
          condition.
   2.  The "boundary" parameter value from the decoded entity (from the
       body of the first MIME body part) is remembered.  This is the
       new boundary, which is used in the generated body of the decoded
       entity.
       *  If the "boundary" parameter is missing, this is an error
          condition.
   3.  A "preamble" area from the body of the second MIME body part is
       copied to the body of the decoded entity.
       *  It is not an error condition on decoding if the "preamble"
          area includes non-ASCII characters.
   4.  Body parts of the multipart (from the body of the second MIME
       body part) are handled:
       1.  A boundary delimiter line is copied to the body of the
           decoded entity, with the boundary of the second MIME body
           part replaced by the boundary of the decoded entity.
       2.  A body part is parsed (into header and body part) and
           processed as described in "Decoding of recursive part"
           (Section 6.1.1).  The result is copied to the body of the
           decoded entity.
   5.  A final boundary delimiter line is copied to the body of the
       decoded entity, with the boundary of the second MIME body part
       replaced by the boundary of the decoded entity.
       *  A final boundary delimiter line is not generated to the
          decoded entity if the final boundary delimiter line is
          missing from the second MIME body part.  This is not an error
          condition on decoding.
   6.  An "epilogue" area from the body of the second MIME body part is
       copied to the body of the decoded entity.
       *  It is not an error condition on decoding if the "epilogue"
          area includes non-ASCII characters.

   NOTE: The CRLF preceding the boundary delimiter line is conceptually
   attached to the boundary (as per [RFC2046]).  That CRLF is not part
   of a body part of the multipart.

6.1.1.  Decoding of recursive part

   Decoding of a recursive part is done in the following way:

   1.  If the media type of the recursive part is
       "multipart/utf8-encapsulated" and the "type" parameter has the
       value "part":
       1.  The recursive part is considered to be the "encapsulating
           entity" and "Generic decoding" (Section 6.1) is applied.
       2.  The resulting decoded entity is the result for "Decoding of
           recursive part".
   2.  If the media type of the recursive part is discrete, the result
       for "Decoding of recursive part" is the recursive part itself.
   3.  Otherwise, if the media type of the recursive part is
       "message/rfc822", then
       *  The header part of the result for "Decoding of recursive
          part" is the header part of the recursive part.
       *  The body part of the recursive part is parsed (into header
          and body part) and processed as described in "Decoding of
          recursive part" (Section 6.1.1).  The body part of the result
          for "Decoding of recursive part" is the result of that
          processing.
   4.  Otherwise, if the top level type of the recursive part is
       multipart (i.e. the media type is multipart/*) and the
       "boundary" parameter exists, its handling is described below.
       *  A missing "boundary" parameter on multipart types is not an
          error condition on decoding.
   5.
       Otherwise, if the recursive part is ASCII only and the encoding
       of the recursive part is identity (i.e.
       "content-transfer-encoding" is "7bit", "8bit" or "binary"), the
       result for "Decoding of recursive part" is the recursive part
       itself.
   6.  Otherwise this is an error condition.
       *  NOTE: This means that a missing "boundary" parameter is an
          error condition for decoding if the body is not ASCII only
          (or the required encoding).
       *  NOTE: This means that an unknown composite type is an error
          condition if the body is not ASCII only (or the required
          encoding).

   If the top level type is multipart, the result for "Decoding of
   recursive part" is generated in the following way:

   1.  The header part of the result for "Decoding of recursive part"
       is the header part of the recursive part.
   2.  The "boundary" parameter value from the recursive part is
       remembered.
       *  Handling of a missing "boundary" parameter is described
          above.
   3.  The body part for the "Decoding of recursive part" result is
       initiated.
   4.  A "preamble" area from the body of the recursive part is copied
       to the body part for the "Decoding of recursive part" result.
       *  It is not an error condition on decoding if the "preamble"
          area includes non-ASCII characters.
   5.  Body parts of the multipart (from the body of the recursive
       part) are handled:
       1.  A boundary delimiter line is copied to the body part for the
           "Decoding of recursive part" result.
       2.  A body part is parsed (into header and body part) and
           processed as described in "Decoding of recursive part"
           (Section 6.1.1).  The result is copied to the body part for
           the "Decoding of recursive part" result.
   6.  A final boundary delimiter line is copied (from the body of the
       recursive part) to the body part for the "Decoding of recursive
       part" result.
       *  A final boundary delimiter line is not generated to the body
          part for the "Decoding of recursive part" result if the final
          boundary delimiter line is missing from the recursive part.
          This is not an error condition on decoding.
   7.  An "epilogue" area from the body of the recursive part is copied
       to the body part for the "Decoding of recursive part" result.
       *  It is not an error condition on decoding if the "epilogue"
          area includes non-ASCII characters.

6.2.  Upgrading of internationalized email message

   When a downgraded internationalized email enters an EAI compliant
   environment, upgrade is allowed.  [ietf-eai-downgrade] describes
   when upgrade occurs.

   This document defines the "Encapsulated" value for the
   "Downgrade-Method" header field.  The "Header-Type" header field
   defines how upgrade occurs.

   o  If the header field "Downgraded" exists, upgrading of the header
      part (and body) of the mail is done according to
      [ietf-eai-downgrade].
   o  If the header field "Downgrade-Method" exists with the value
      "Encapsulated", upgrading of the header part (and body) of the
      mail is done as described in this section.
   o  If both the "Downgraded" and "Downgrade-Method" header fields
      exist, this is an error condition and upgrading is not done.

   The encapsulating entity is not decoded on error conditions.
   Instead, the original encapsulating entity is returned.

   Upgrading of an internationalized email is done in the following
   way:

   o  If the media type of the downgraded internationalized email is
      not "multipart/utf8-encapsulated", or if the "type" parameter
      does not have the value "encapsulated", this is an error
      condition.
   o  The downgraded internationalized email is considered to be the
      "encapsulating entity" and "Generic decoding" (Section 6.1) is
      applied.
   o  The resulting decoded internationalized entity is the upgraded
      internationalized email.
   o  "Received" header fields from the downgraded internationalized
      email are prepended to the upgraded internationalized email.
      *  The upgraded internationalized email already includes all
         original header fields.  This step adds the trace header
         fields which were inserted into the mail after it was
         downgraded.  It does not re-add trace header fields which were
         added before downgrading, because they were renamed to
         "I18N-Received" in the downgraded internationalized email.

6.2.1.  Upgrading example

   An upgrading example of the mail from the earlier example (Section
   5.2.1) is used.
   The mail is assumed to have been 8BITMIME downgraded afterwards.
   This process has also added some extra header fields to the MIME
   parts.

   Downgraded internationalized email:

   ==========================================
   Received: from fw.example.org by upgrade.example.org
        with ESMTP id JAX77356; Wed, 13 Sep 2006 22:27:32 +0300
   Received: from downgrade.example.org by fw.example.org
        with ESMTP id JAX77356; Wed, 13 Sep 2006 22:27:29 +0300
   I18N-Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: Encapsulated
   From: { downgraded address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { RFC 2047 encoded subject }
   Mime-Version: 1.0
   Content-Type: multipart/utf8-encapsulated; type=encapsulated;
     boundary="12345"
   Content-Transfer-Encoding: 7bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: quoted-printable
   X-MIME-Autoconverted: from 8bit to quoted-printable by
        downgrade.example.org id JGR17356

   Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: encapsulate
   From: { q-p encoded UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { q-p encoded UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=3DUTF-8
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: quoted-printable
   X-MIME-Autoconverted: from 8bit to quoted-printable by
        downgrade.example.org id JGR17356

   { q-p encoded UTF-8 text }

   --12345--
   ==========================================

   Upgraded internationalized email:

   ==========================================
   Received: from fw.example.org by upgrade.example.org
        with ESMTP id JAX77356; Wed, 13 Sep 2006 22:27:32 +0300
   Received: from downgrade.example.org by fw.example.org
        with ESMTP id JAX77356; Wed, 13 Sep 2006 22:27:29 +0300
   Received: from {idn-encoded-name} by downgrade.example.org
        with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Downgrade-Method: encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }
   ==========================================

   Note the handling of the "Received" header fields.  They are the
   only header fields preserved from the downgraded internationalized
   email.  All other header fields are taken from the
   "text/utf8-header" MIME part.  This also means that upgrading does
   not need to add a "Header-Type" header field, because it is
   necessarily already present in the "text/utf8-header" MIME part.

   In the downgraded email there was an empty line after "UTF-8 text",
   but in the upgraded email it has disappeared because it was part of
   the multipart boundary.

6.3.  Retrieving attached internationalized email message

   "Multipart/utf8-encapsulated" with "type" parameter value "message"
   can be used to attach internationalized email messages in EAI
   non-compliant environments.

   Retrieving the internationalized email can be done in the following
   way:

   o  If the media type is "message/rfc822", then
      *  It is parsed (into header and body part).
      *  The body part is processed as described in "Upgrading of
         internationalized email message" (Section 6.2).
      *  The result is the internationalized email.
   o  If the media type is "Multipart/utf8-encapsulated" and the "type"
      parameter value is "message", then
      *  It is considered to be the "encapsulating entity" and "Generic
         decoding" (Section 6.1) is applied.
      *  The result is the internationalized email.

7.
IANA Considerations IANA is requested to register I18N-Received and Downgrade-Method header fields and multipart/utf8-encapsulated and text/utf8-header media types as given on registration applications on this document. 8. Security Considerations This "multipart/utf8-encapsulated" media type provides method to encapsulate mail data. Specially this media type provides method to smuggle mail header fields so that mail scanners do not see them. This may provide new security threats. This encapsulation do not hide original MIME parts. However original MIME structure may be obscured. This may provide method to smuggle MIME parts so that mail scanners do not see them. This may provide new security threats. This encapsulation preservers only "Received" header fields from encapsulating message. This may hide information when encapsulated message is upgraded to internationalized email format. 9. Acknowledgements Originally this encapsulation format is suggested on former IMAA mailing list discussions. Various ideas are suggested on IMA mailing list discussions. John C. Klensin was strongly encouraging author to write this documentation. 10. References Hurtta Expires September 18, 2007 [Page 39] Internet-Draft EAI Encapsulation March 2007 10.1. Normative References [ASCII] American National Standards Institute (formerly United States of America Standards Institute), "USA Code for Information Interchange", ANSI X3.4-1968, 1968. ANSI X3.4-1968 has been replaced by newer versions with slight modifications, but the 1968 version remains definitive for the Internet. [ietf-eai-framework] Klensin, J. and Y. Ko, "Overview and Framework for Internationalized Email", draft-ietf-eai-framework-05 (work in progress), February 2007. [ietf-eai-downgrade] YONEYA, Y., Ed. and K. Fujiwara, Ed., "Downgrading mechanism for Email Address Internationalization", draft-ietf-eai-downgrade-03 (work in progress), March 2007. [ietf-eai-utf8headers] Yeh, J., Ed. 
and Abel, Ed., "Internationalized Email Headers", draft-ietf-eai-utf8headers-04 (work in progress), March 2007. [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996. [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996. [RFC2047] Moore, K., "Multipurpose Internet Mail Extensions (MIME) Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996. [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April 2001. [RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations", RFC 2231, November 1997. [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 3629, November 2003. Hurtta Expires September 18, 2007 [Page 40] Internet-Draft EAI Encapsulation March 2007 10.2. Informative References [DKIM-Charter] IETF, "Domain Keys Identified Mail (dkim)", October 2006, . [RFC1847] Galvin, J., Murphy, S., Crocker, S., and N. Freed, "Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted", RFC 1847, October 1995. [RFC1652] Freed, N., Ed., Rose, M., Stefferud, E., and D. Crocker, "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July 1994. [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", RFC 4288, BCP 13, December 2005. [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration Procedures for Message Header Fields", RFC 3864, BCP 90, September 2004. Author's Address Kari Hurtta Kala-Matti 4 B 24 02230 Espoo FI Email: hurtta-ietf@elmme-mailer.org URI: http://iki.fi/keh/ Hurtta Expires September 18, 2007 [Page 41] Internet-Draft EAI Encapsulation March 2007 Full Copyright Statement Copyright (C) The IETF Trust (2007). 

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).
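
Illustrative sketch (non-normative)

   The header handling shown in the upgrading example and the
   retrieval steps of Section 6.3 can be pictured with the minimal
   Python sketch below.  It is not part of the specification.  It
   assumes header fields have already been parsed into (name, value)
   pairs.  The function names, the returned labels, and the
   case-insensitive comparison of the "type" parameter value are
   assumptions made for illustration only.

```python
def merge_upgraded_header(outer_fields, inner_fields):
    """Build the header of the upgraded message.

    outer_fields: (name, value) pairs of the downgraded (encapsulating)
    message.  inner_fields: pairs recovered from the "text/utf8-header"
    part.  Only "Received" fields survive from the outer header; every
    other field is taken from the inner header.
    """
    kept = [(n, v) for (n, v) in outer_fields if n.lower() == "received"]
    return kept + list(inner_fields)


def retrieval_route(media_type, params):
    """Pick the Section 6.3 branch for an attached entity.

    Returns a hypothetical label naming the procedure to apply, or None
    when neither branch matches.
    """
    mt = media_type.lower()  # media type names are case-insensitive
    if mt == "message/rfc822":
        return "parse-then-upgrade"  # Section 6.2 applied to the body
    # Assumption: the "type" value is compared case-insensitively.
    if (mt == "multipart/utf8-encapsulated"
            and params.get("type", "").lower() == "message"):
        return "generic-decoding"  # Section 6.1
    return None


# Shortened stand-ins for the example messages of Section 6.2.
outer = [("Received", "from fw.example.org by upgrade.example.org"),
         ("I18N-Received", "from {idn-encoded-name}"),
         ("Subject", "{ RFC 2047 encoded subject }")]
inner = [("Received", "from {idn-encoded-name}"),
         ("Subject", "{ UTF-8 subject }")]
upgraded = merge_upgraded_header(outer, inner)
```

   In this sketch the outer "I18N-Received" and "Subject" fields are
   dropped, while the outer "Received" field stays in front of the
   fields recovered from the "text/utf8-header" part, mirroring the
   order seen in the upgraded example.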