Email Address Internationalization                             K. Hurtta
(EAI)                                                   November 1, 2006
Internet-Draft
Intended status: Experimental
Expires: May 5, 2007


  Encapsulation mechanism for Email Address Internationalization (EAI)
                   draft-hurtta-eai-encapsulation-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 5, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   Email Address Internationalization (EAI) is implemented by allowing
   UTF-8 characters in the SMTP envelope and in mail headers.  To
   deliver email that uses UTF-8 in its headers through an environment
   that is not EAI compliant, either a conversion (i.e., downgrading)
   or an encapsulation mechanism is required.  Some UTF-8 email may
   carry signatures over the email header or over individual header
   fields.  This document describes an encapsulation mechanism for use
   when conversion cannot be used because the email headers are signed.
   Encapsulation may also be used to forward EAI email through an
   environment that is not EAI compliant in such a way that the
   original EAI email can be recovered.
Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Addition to internationalized email header
     3.1.  "Encapsulated" value
     3.2.  "downgrade" parameter
     3.3.  "I18N-Received" header field
     3.4.  Registration of I18N-Received header field
   4.  Encapsulation format
     4.1.  "multipart/utf8-encapsulated" media type
     4.2.  Registration of media type multipart/utf8-encapsulated
     4.3.  Registration of media type text/utf8-header
   5.  Encapsulation
     5.1.  Generic encapsulation
       5.1.1.  Encapsulation of recursive part
       5.1.2.  Encapsulation example
       5.1.3.  Multipart encapsulation example
       5.1.4.  Unknown top level type encapsulation example #1
       5.1.5.  Unknown top level type encapsulation example #2
     5.2.  Downgrading of internationalized email message
       5.2.1.  Encapsulation example
       5.2.2.  Multipart/signed encapsulation example
     5.3.  Attaching internationalized email message
       5.3.1.  Attaching example
   6.  Decoding encapsulation
     6.1.  Generic decoding
       6.1.1.  Decoding of recursive part
     6.2.  Upgrading of internationalized email message
       6.2.1.  Upgrading example
     6.3.  Retrieving attached internationalized email message
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   Internationalized email includes UTF-8 characters [RFC3629] in email
   headers.  When internationalized email is delivered to an
   environment that is not EAI compliant, its email header fields are
   converted (i.e., downgraded) to an ASCII-compatible form.  When the
   email is back in an EAI-compliant environment, it is upgraded to
   internationalized form by decoding the ASCII-compatible encodings.

   When internationalized email is downgraded to ASCII-compatible form
   and then upgraded to internationalized form, the result is not
   necessarily the original mail.  For example, some header fields may
   originally have used an ASCII-compatible form, but upgrading
   converts them to UTF-8 form.

   Sometimes, however, it may be required that the original
   internationalized email header part can be recovered.  This document
   describes an encapsulation mechanism that allows the original
   internationalized email to be recovered.

   If the mail headers, or some mail header fields and message parts,
   are cryptographically signed, the original mail may have to be
   recovered before the signature of the mail is checked.

   This document provides an encapsulation method which has the
   following properties:

   o  The encapsulation does not produce nested encodings.
   o  The content of an encapsulated mail is accessible to user agents
      that are not EAI compliant.
   o  The encapsulation does not hide the original MIME parts, although
      the original MIME structure may be obscured.
   o  The encapsulation provides a way to recover the original
      internationalized email.

   The media types "multipart/utf8-encapsulated" and "text/utf8-header"
   are introduced.

   This document provides markup which indicates when downgrading to an
   environment that is not EAI compliant should be done with this
   encapsulation.  That is done by adding the parameter
   "downgrade=encapsulate" to the "Header-Type" header field.

   If internationalized email is encapsulated, the "Header-Type" header
   field value "Encapsulated" is used instead of "Downgraded".

   Only a minimal number of header fields are generated in, or left in,
   the header part of the encapsulated message.  This hides signatures
   that are carried in header fields while the message is encapsulated.
   For example, DomainKeys Identified Mail [DKIM-Charter] uses this
   kind of signature.  The original header fields are stored in the
   "text/utf8-header" MIME part.

   The "multipart/signed" media type [RFC1847] signs header fields of
   the MIME header.  After encapsulation, verification of that
   signature fails because the MIME header has changed.  The signature
   is therefore hidden by replacing "multipart/signed" in the
   "Content-Type" header field with "multipart/mixed".  The original
   "Content-Type" header field is stored in the "text/utf8-header" MIME
   part.

   This encapsulation copies all header fields of the internationalized
   email to the "text/utf8-header" MIME part.  When upgrading, the
   saved header fields from the "text/utf8-header" MIME part and the
   "Received" header fields from the encapsulating email are used.
   Because this may cause duplication of "Received" header fields, the
   original "Received" header fields are renamed to "I18N-Received"
   during encapsulation.

2.  Terminology

   Terminology for this document is defined in [ietf-eai-framework] and
   [RFC2045].

3.  Addition to internationalized email header

   New values for the "Header-Type" header field (defined in
   [ietf-eai-utf8headers]) are introduced.

3.1.  "Encapsulated" value

   The "Header-Type" header field syntax is modified as specified
   below:

   content-code /= "Encapsulated"
        ; "content-code" is defined in [ietf-eai-utf8headers]

   The value "Encapsulated" indicates that downgrading was done with
   the encapsulation defined in this document.

3.2.  "downgrade" parameter

   The "downgrade" parameter is defined for the "Header-Type" header
   field.  The value of the "downgrade" parameter is defined as
   follows:

   downgrade-value = "convert" / "encapsulate"

   The "downgrade" parameter specifies how downgrading of the
   internationalized email should be done.

   o  The value "convert" indicates that downgrading of the
      internationalized email is done as defined in
      [ietf-eai-downgrade].
   o  The value "encapsulate" indicates that downgrading is done with
      the encapsulation defined in this document.

3.3.  "I18N-Received" header field

   The "I18N-Received" header field is added to the Internet Message
   Format [RFC2822] as specified below:

   received /= "I18N-Received:" name-val-list ";" date-time CRLF
        ; terms name-val-list, date-time and CRLF
        ; are defined in [RFC2822]

   The original "Received" header fields are renamed to "I18N-Received"
   during encapsulation.  The "Received" header fields are saved under
   their original name in the "text/utf8-header" MIME part.

3.4.  Registration of I18N-Received header field

   This section provides the header field registration application (as
   per [RFC3864]).

   Header field name:
      I18N-Received
   Applicable protocol:
      mail
   Status:
      experimental
   Author/Change controller:
      Kari Hurtta <hurtta-ietf@elmme-mailer.org>
   Specification document(s):
      RFC XXXX
   Related information:
      I18N-Received is used together with the
      multipart/utf8-encapsulated media type.

4.  Encapsulation format

   The "multipart/utf8-encapsulated" media type splits an
   internationalized email message or MIME subpart into two parts:

   o  The header part of the internationalized email message, or the
      header part of the MIME subpart, is put into the first subpart of
      the "multipart/utf8-encapsulated" media type.  The media type of
      the first subpart is "text/utf8-header".
   o  The body part of the internationalized email message, or the body
      part of the MIME subpart, is put into the second subpart of the
      "multipart/utf8-encapsulated" media type.  The media type of the
      second subpart is the same as the media type of the original
      internationalized email message or original MIME subpart.
      However, if the media type of the original internationalized
      email message or original MIME subpart was "multipart/signed",
      the media type of the second subpart is "multipart/mixed".  In
      some cases the media type is "application/octet-stream".

   NOTE: This encapsulation assumes that the "preamble" and "epilogue"
   areas of multipart media types include only ASCII.  If these areas
   include UTF-8 text, that text is lost if the encapsulating
   "multipart/utf8-encapsulated" entity is converted to an
   ASCII-compatible format (i.e., during 8BITMIME downgrading
   [RFC1652]).

   This loss of UTF-8 text in the "preamble" and "epilogue" areas of
   multipart media types could be solved by adding a third and a fourth
   subpart to the "multipart/utf8-encapsulated" media type.  However,
   the author believes that this would unnecessarily complicate the
   encapsulation format and algorithm.

   The author assumes that messages which use signing do not put UTF-8
   text in the "preamble" and "epilogue" areas of multipart media
   types.  If the message is not signed, loss of the "preamble" and
   "epilogue" areas does not cause harm.

4.1.  "multipart/utf8-encapsulated" media type

   The "multipart/utf8-encapsulated" media type can be used in three
   different roles.  The "type" parameter is defined for the
   "multipart/utf8-encapsulated" media type.
   The value of the "type" parameter is defined as follows:

   type-value = "encapsulated" / "message" / "subpart"

   o  The value "encapsulated" is used when the
      "multipart/utf8-encapsulated" media type is used as the
      downgrading format of an internationalized email.  The value of
      "type" is set to "encapsulated" when an internationalized email
      is downgraded because of the "downgrade=encapsulate" value in the
      "Header-Type" header field.
   o  The value "message" is used when the
      "multipart/utf8-encapsulated" media type is used for the same
      purpose for which the media type "message/rfc822" is used with
      non-EAI content.  The value of "type" is set to "message" when an
      internationalized email is attached to or included in a message.
      Roughly, "multipart/utf8-encapsulated; type=message" is the
      equivalent of "message/rfc822", except that the format of the
      attachment is different.
   o  The value "subpart" is used when the
      "multipart/utf8-encapsulated" media type is used as the
      downgrading format of a MIME subpart.  The value of "type" is set
      to "subpart" when the MIME structure of an internationalized
      email or MIME subpart is recursively downgraded and a MIME
      subpart with a UTF-8 header is found.

4.2.  Registration of media type multipart/utf8-encapsulated

   This section provides the media type registration application (as
   per [RFC4288]).

   Type name:
      multipart
   Subtype name:
      utf8-encapsulated
   Required parameters:
      The "boundary" parameter is required as per RFC 2046.
      The "type" parameter is required as per RFC XXXX.
   Optional parameters:
   Encoding considerations:
      8bit or binary
   Security considerations:
      This media type provides a method to encapsulate mail data.  In
      particular, this media type provides a method to smuggle mail
      header fields so that mail scanners do not see them.  This may
      introduce new security threats.

      This encapsulation does not hide the original MIME parts.
      However, the original MIME structure may be obscured.  This may
      provide a method to smuggle MIME parts so that mail scanners do
      not see them.  This may introduce new security threats.

      This encapsulation preserves only the "Received" header fields
      of the encapsulating message.  This may hide information when
      the encapsulated message is upgraded to internationalized email
      format.
   Interoperability considerations:
      This media type provides a method to encapsulate
      internationalized email.  The recipient of an encapsulated email
      must decode the encapsulation before the email is fully
      accessible.  However, the original MIME parts are not hidden
      from mail agents which do not know the encapsulation used by
      this media type.
   Published specification:
      RFC XXXX
   Applications that use this media type:
      Internationalized mail user agents (MUAs), mail transport agents
      (MTAs) and IMAP servers.
   Additional information:
      Magic number(s):
      File extension(s):
      Macintosh file type code(s):
   Person & email address to contact for further information:
      Kari Hurtta <hurtta-ietf@elmme-mailer.org>
   Intended usage:
      common
   Restrictions on usage:
   Author:
      Kari Hurtta
   Change controller:
      Kari Hurtta

4.3.  Registration of media type text/utf8-header

   This section provides the media type registration application (as
   per [RFC4288]).

   Type name:
      text
   Subtype name:
      utf8-header
   Required parameters:
      The "charset" parameter with value "UTF-8", if a UTF-8 header is
      in fact encapsulated.
   Optional parameters:
      charset
   Encoding considerations:
      7bit or 8bit
      "8bit", if a UTF-8 header is in fact encapsulated.
   Security considerations:
      This media type provides a method to encapsulate mail data.  In
      particular, this media type provides a method to smuggle mail
      header fields so that mail scanners do not see them.  This may
      introduce new security threats.
   Interoperability considerations:
      Mail agents which do not know this media type treat it as the
      text/plain media type.
   Published specification:
      RFC XXXX
   Applications that use this media type:
      Internationalized mail user agents (MUAs), mail transport agents
      (MTAs) and IMAP servers.
   Additional information:
      In some cases an ASCII header part is encapsulated instead of a
      UTF-8 header part.
      Magic number(s):
      File extension(s):
      Macintosh file type code(s):
   Person & email address to contact for further information:
      Kari Hurtta <hurtta-ietf@elmme-mailer.org>
   Intended usage:
      common
   Restrictions on usage:
   Author:
      Kari Hurtta
   Change controller:
      Kari Hurtta

5.  Encapsulation

   On encapsulation, an internationalized email message or MIME subpart
   is split into the two MIME subparts of the
   "multipart/utf8-encapsulated" media type.

   There are three cases of encapsulation:

   o  When an internationalized email message is downgraded.
   o  When an internationalized email message is attached to or
      included in a message.
   o  When a MIME subpart is encapsulated because there is UTF-8 text
      in its header.  This is the recursive part of the algorithm.

   The parts common to these three encapsulations are described in
   "Generic encapsulation" (Section 5.1).

5.1.  Generic encapsulation

   Both an email message and a MIME subpart follow the same general
   syntax:

   o  Both have a header part and a body part.
   o  The header part is followed by the body part, and these are
      separated by an empty line.

   The term "entity" refers to both an email message and a MIME
   subpart.

   NOTE: Subtypes of "message" (i.e., media type is message/*) other
   than "message/rfc822", and composite types other than "multipart" or
   "message", are treated specially.  This is done so that this
   algorithm is stable and its result does not change when new subtypes
   of "message" are registered or new composite top level types are
   standardised.  Unknown top level types are treated the same way,
   because it is not possible to know whether an unknown top level type
   is composite.  In these cases the type is also treated as
   "application/octet-stream" if the body part of the original entity
   includes non-ASCII characters.  Type information is not lost,
   because the whole header part of the original entity is stored in
   the body of the first MIME subpart of the encapsulating entity.
   Processing as "application/octet-stream" is done because the
   algorithm does not know how to encapsulate the entity as a composite
   type.  If the body part of the original entity includes only ASCII
   characters, there cannot be UTF-8 headers inside it (when it is
   treated as a composite type).

   NOTE: This algorithm looks complex.  The complexity is a result of
   the requirement that the so-called "nested encoding rule" is not
   violated.  Because of this requirement, composite media types must
   be processed recursively.

   Special handling of unknown composite types which include non-ASCII
   characters as "application/octet-stream" does violate the "nested
   encoding rule".  In the case of unknown types this is unavoidable,
   because it is not possible to parse the internal structure of
   unknown types.

   The encapsulating entity is generated from the original
   internationalized email message or MIME subpart in the following
   way:

   o  A new header part for the encapsulating entity is generated.
      *  The media type of the encapsulating entity is
         "multipart/utf8-encapsulated".
   o  A new body part for the encapsulating entity is generated.  This
      body part consists of two MIME subparts.
      *  The media type of the first MIME subpart is
         "text/utf8-header".
         +  The value of the "charset" parameter is "UTF-8" if the
            header part of the original entity includes UTF-8
            characters.
         +  NOTE: This seems strange, but in certain cases an
            ASCII-only header part is also encapsulated.  In that case
            the "charset" parameter is not required.
      *  The body part of the first MIME subpart is the header part of
         the original entity (the original internationalized email
         message or MIME subpart).
      *  It is strongly recommended that the body of the first MIME
         subpart is base64 encoded (and, of course, that the
         "Content-Transfer-Encoding" header field is updated
         correspondingly).
      *  Generation of the second MIME subpart is described below.
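
   As an illustration of the list above, the following Python sketch
   builds the first MIME subpart and wraps it, together with an
   already generated second MIME subpart, into a
   "multipart/utf8-encapsulated" entity.  The function names, the use
   of CRLF line endings, and the way the outer
   "Content-Transfer-Encoding" value is chosen are assumptions made for
   this sketch; they are not defined by this document.

```python
import base64

CRLF = b"\r\n"


def build_header_subpart(original_header: bytes) -> bytes:
    """Build the first MIME subpart ("text/utf8-header").

    original_header is the header part of the original entity, without
    the empty line that separates it from the body part.
    """
    content_type = b"Content-Type: text/utf8-header"
    if any(octet > 0x7F for octet in original_header):
        # "charset" is set to UTF-8 only when the header has UTF-8 text.
        content_type += b"; charset=UTF-8"
    # base64 is the recommended encoding; encodebytes() ends each line
    # with "\n", which is converted to CRLF here.
    body = base64.encodebytes(original_header).replace(b"\n", CRLF)
    return CRLF.join(
        [content_type, b"Content-Transfer-Encoding: base64", b"", body]
    )


def encapsulate(original_header: bytes, second_subpart: bytes,
                type_value: bytes, boundary: bytes) -> bytes:
    """Assemble a "multipart/utf8-encapsulated" entity.

    second_subpart must already be a complete MIME subpart (header
    part, empty line, body part) produced by the rules for the second
    MIME subpart.
    """
    # Assumption of this sketch: "8bit" or "7bit" is picked by scanning
    # for non-ASCII octets; "binary" content is not handled.
    encoding = b"8bit" if any(o > 0x7F for o in second_subpart) else b"7bit"
    delimiter = b"--" + boundary
    return b"".join([
        b"Content-Type: multipart/utf8-encapsulated; type=" + type_value
        + b'; boundary="' + boundary + b'"' + CRLF,
        b"Content-Transfer-Encoding: " + encoding + CRLF,
        CRLF,
        delimiter + CRLF,
        build_header_subpart(original_header),
        delimiter + CRLF,
        second_subpart,
        CRLF + delimiter + b"--" + CRLF,
    ])
```

   Because the header part is carried base64 encoded, a recipient can
   restore it octet for octet with base64 decoding, independent of any
   line-oriented rewriting applied to the message in transit.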
   NOTE: A Unix mailbox format changes "From" at the beginning of a
   line to ">From".  It is therefore useful to encode
   "text/utf8-header" with base64 even when it includes only ASCII
   header fields.  In practice it is more common to replace "From "
   with ">From ", which does not touch the "From" header field (if
   there is no space between "From" and ":").

   The second MIME subpart of the encapsulating entity is generated in
   the following way:

   1.  The original entity is checked for the following cases:
       *  The media type value (type/subtype) of the original entity
          includes characters other than ASCII [ASCII].  This is an
          error condition.
       *  The media type of the original entity is a subtype of
          "message" (i.e., media type is message/*), it is not
          "message/rfc822", and the body part of the original entity
          includes non-ASCII characters.
       *  The top level type of the original entity is unknown, the
          body part of the original entity includes non-ASCII
          characters, and the encoding of the original entity is the
          identity (i.e., "Content-Transfer-Encoding" is "8bit" or
          "binary").
       *  The top level type of the original entity is a composite type
          other than "multipart" or "message", and the body part of the
          original entity includes non-ASCII characters.
       *  The media type of the original entity is a subtype of
          "multipart" (i.e., media type is multipart/*) and the
          "boundary" parameter is missing.  This is an error condition.
       If one of these cases is found,
       *  The media type of the second MIME subpart is
          "application/octet-stream".
       *  The body part of the second MIME subpart is the body part of
          the original entity.
       *  The "Content-Transfer-Encoding" value of the second MIME
          subpart is copied from the original entity if it includes
          only ASCII characters.  Otherwise it is set to "7bit", "8bit"
          or "binary" as appropriate.  A non-ASCII value is an error
          condition.
   2.  Otherwise, if the media type of the original entity is
       "multipart/signed", then
       *  The media type of the second MIME subpart is
          "multipart/mixed".
          +  Generation of the "boundary" parameter is described below.
       *  The body part of the second MIME subpart is the "Composite
          encapsulated body part".  Its generation is described below.
       *  "Content-Transfer-Encoding" is set to "7bit", "8bit" or
          "binary" as appropriate.
          +  If the "Content-Transfer-Encoding" value of the original
             entity is other than "7bit", "8bit" or "binary", this is
             an error condition.
   3.  Otherwise, the original entity is checked for the following
       cases:
       *  The top level type of the original entity is a type other
          than "multipart" or "message".
          +  This includes all discrete media types.
          +  This includes all unknown top level types.
       *  The media type of the original entity is a subtype of
          "message" (i.e., media type is message/*) and it is not
          "message/rfc822".
       If one of these cases is found,
       *  The media type of the second MIME subpart is the same as the
          media type of the original entity.
          +  Copying of media type parameters from the original entity
             to the second MIME subpart is described below.
       *  The body part of the second MIME subpart is the body part of
          the original entity.
       *  The "Content-Transfer-Encoding" value of the second MIME
          subpart is copied from the original entity if it includes
          only ASCII characters.  Otherwise it is set to "7bit", "8bit"
          or "binary" as appropriate.  A non-ASCII value is an error
          condition.
   4.  Otherwise, if the original entity is of a composite type
       ("multipart" or "message/rfc822"),
       *  The media type of the second MIME subpart is the same as the
          media type of the original entity.
          +  Copying of media type parameters from the original entity
             to the second MIME subpart is described below.
       *  Generation of the body part of the second MIME subpart is
          handled specially when the media type of the original entity
          is composite.
          +  The body of the original entity is scanned when the entity
             is composite.
          +  In general this makes the processing recursive.
          +  The body part of the second MIME subpart is called the
             "Composite encapsulated body part" if the media type of
             the original entity is composite.  Generation of this body
             part is described below.
       *  "Content-Transfer-Encoding" is set to "7bit", "8bit" or
          "binary" as appropriate.
          +  If the "Content-Transfer-Encoding" value of the original
             entity is other than "7bit", "8bit" or "binary", this is
             an error condition.

   Media type parameters are copied from the original entity to the
   second MIME subpart in the following way:

   o  This copying is done when the media type of the second MIME
      subpart is the same as the media type of the original entity.
   o  ASCII parameters are copied (however, see the special note about
      "boundary" below).
   o  UTF-8 comments are removed.
   o  Parameters which have a UTF-8 value are encoded according to
      [RFC2231] when copied.
      *  If the required parameters of the media type are known and a
         parameter is not among them, the parameter need not be copied
         (nor encoded according to [RFC2231]).
   o  If a parameter name has UTF-8 characters, this is an error
      condition and the parameter is not copied.
   o  If the "boundary" parameter value of a multipart media type has
      UTF-8 characters, it is handled specially.  This is described
      below.

   The "Composite encapsulated body part" is generated in the following
   way:

   o  If "application/octet-stream" was assigned as the media type of
      the second MIME subpart, the body part of the original entity is
      the resulting "Composite encapsulated body part".  This case is
      described above.
   o  If the media type of the original entity is a subtype of
      "message" (i.e., media type is message/*) and it is not
      "message/rfc822", the body part of the original entity is the
      resulting "Composite encapsulated body part".  This case is
      described above.
   o  If the media type of the original entity is "message/rfc822", the
      body part of the original entity is parsed (into a header part
      and a body part) and is processed as described in "Encapsulation
      of recursive part" (Section 5.1.1).  The result of the processing
      is the "Composite encapsulated body part".
   o  If the media type of the original entity is a subtype of
      "multipart" (i.e., media type is multipart/*), the body part of
      the original entity is processed as described below.  The result
      of the processing is the "Composite encapsulated body part".
   o  If the top level type of the original entity is a composite type
      other than "multipart" or "message", it is treated as an unknown
      type.  This processing is described above.

   For multipart types the "Composite encapsulated body part" is
   generated as follows:

   1.  The "boundary" parameter value of the original entity is
       remembered.
       *  Handling of a missing "boundary" parameter is described
          above.
   2.  A "boundary" parameter value for the second MIME subpart is
       selected.
       *  The selected "boundary" parameter value must include only
          ASCII characters.
       *  In general this can be the same as the "boundary" parameter
          value of the original entity.
       *  If the "boundary" parameter value of the original entity
          includes UTF-8 characters, a new ASCII-only value must be
          selected.
   3.  The "preamble" area from the body of the original entity is
       copied to the "Composite encapsulated body part".
       *  If the "preamble" area includes non-ASCII characters, this is
          an error condition.
   4.  The subparts of the multipart (from the body of the original
       entity) are handled:
       1.  A boundary delimiter line is copied to the "Composite
           encapsulated body part", but in such a way that the boundary
           of the original entity is replaced with the selected
           boundary of the second MIME subpart.
       2.  The subpart is parsed (into a header part and a body part)
           and is processed as described in "Encapsulation of recursive
           part" (Section 5.1.1).  The result is copied to the
           "Composite encapsulated body part".
   5.  The final boundary delimiter line is copied (from the body of
       the original entity) to the "Composite encapsulated body part",
       but in such a way that the boundary of the original entity is
       replaced with the selected boundary of the second MIME subpart.
       *  A final boundary delimiter line is not generated in the
          "Composite encapsulated body part" if the final boundary
          delimiter line is missing from the original entity.  This is
          an error condition.
   6.  The "epilogue" area from the body of the original entity is
       copied to the "Composite encapsulated body part".
       *  If the "epilogue" area includes non-ASCII characters, this is
          an error condition.

   NOTE: The CRLF preceding the boundary delimiter line is conceptually
   attached to the boundary (as per [RFC2046]).  That CRLF is not part
   of the subpart of the multipart.  If encapsulation and decoding of
   the encapsulation treat this CRLF differently, the encapsulation
   does not preserve all CRLFs or adds extra CRLFs.

   NOTE: If the original "Content-Transfer-Encoding" value includes
   non-ASCII characters, the resulting encapsulation cannot be decoded
   by this algorithm.  Therefore it is recommended that the
   internationalized email message is bounced or rejected on that error
   condition.

5.1.1.  Encapsulation of recursive part

   Encapsulation of a recursive part is done in the following way:

   1.  If the header part of the recursive part includes UTF-8
       characters, or if the media type of the recursive part is
       "multipart/signed", then
       *  The recursive part is considered to be the "original entity"
          and "Generic encapsulation" (Section 5.1) is applied.
       *  The value of the parameter "type" is set to "subpart" for the
          resulting "multipart/utf8-encapsulated" encapsulating entity.
       *  The resulting encapsulating entity is the result of
          "Encapsulation of recursive part".
   2.  Otherwise, if the media type of the recursive part is discrete,
       the result of "Encapsulation of recursive part" is the recursive
       part itself.
   3.  Otherwise, if the media type of the recursive part is
       "message/rfc822", then
       *  The header part of the result of "Encapsulation of recursive
          part" is the header part of the recursive part.
       *  The body part of the recursive part is parsed (into a header
          part and a body part) and is processed as described in
          "Encapsulation of recursive part" (Section 5.1.1).  The body
          part of the result of "Encapsulation of recursive part" is
          the result of that processing.
   4.  Otherwise, if the top level type is multipart (i.e.,
       media type is multipart/*) and the "boundary" parameter exists,
       the handling is described below.
       *  A missing "boundary" parameter on multipart types is an error
          condition.
   5.  Otherwise, if the recursive part is ASCII only (the body is
       ASCII, i.e., "Content-Transfer-Encoding" is "7bit"), the result
       of "Encapsulation of recursive part" is the recursive part
       itself.
   6.  Otherwise, if the encoding of the recursive part is not the
       identity (i.e., "Content-Transfer-Encoding" is not "8bit" or
       "binary"), the result of "Encapsulation of recursive part" is
       the recursive part itself.
   7.  Otherwise
       *  The recursive part is considered to be the "original entity"
          and "Generic encapsulation" (Section 5.1) is applied.
       *  The value of the parameter "type" is set to "subpart" for the
          resulting "multipart/utf8-encapsulated" encapsulating entity.
       *  The resulting encapsulating entity is the result of
          "Encapsulation of recursive part".
       *  NOTE: This seems strange, but unknown composite media types
          are always encapsulated if there is a possibility that they
          include embedded UTF-8 headers.

   If the top level type is multipart, the result of "Encapsulation of
   recursive part" is generated in the following way:

   1.  The header part of the result is the header part of the
       recursive part.
   2.  The "boundary" parameter value of the recursive part is
       remembered.
       *  Handling of a missing "boundary" parameter is described
          above.
   3.  The body part of the result is initialized.
   4.  The "preamble" area from the body of the recursive part is
       copied to the body part of the result.
       *  If the "preamble" area includes non-ASCII characters, this is
          an error condition.
   5.  The subparts of the multipart (from the body of the recursive
       part) are handled:
       1.  A boundary delimiter line is copied to the body part of the
           result.
       2.  The subpart is parsed (into a header part and a body part)
           and is processed as described in "Encapsulation of recursive
           part" (Section 5.1.1).  The result of that processing is
           copied to the body part of the result.
   6.  The final boundary delimiter line is copied (from the body of
       the recursive part) to the body part of the result.
       *  A final boundary delimiter line is not generated in the body
          part of the result if the final boundary delimiter line is
          missing from the recursive part.  This is an error condition.
   7.  The "epilogue" area from the body of the recursive part is
       copied to the body part of the result.
       *  If the "epilogue" area includes non-ASCII characters, this is
          an error condition.

5.1.2.  Encapsulation example

   An encapsulation example

   Original internationalized entity:

   ==========================================
   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }
   ==========================================

   Encapsulated entity:

   ==========================================
   Content-Type: multipart/utf8-encapsulated;
        type={specified later}; boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }
   --12345--
   ==========================================

   The empty line at the end of the "text/utf8-header" body is not
   copied from the original encapsulated headers.  It is part of the
   next boundary line of "multipart/utf8-encapsulated".

   NOTE: In this example the "text/utf8-header" part is not base64
   encoded, for clarity.  Base64 encoding is recommended.
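
   The case analysis of "Encapsulation of recursive part" (Section
   5.1.1) can be sketched as a small Python function that only decides
   which of the seven steps applies.  The names used here, the
   representation of the media type as a plain "type/subtype" string
   without parameters, the set of discrete top level types (taken from
   RFC 2046), and the use of an exception for the missing "boundary"
   error condition are all assumptions of this sketch.

```python
from enum import Enum

# Assumption: the discrete top level types are the five of RFC 2046.
DISCRETE_TYPES = {"text", "image", "audio", "video", "application"}


class Action(Enum):
    ENCAPSULATE = "apply generic encapsulation with type=subpart"
    COPY = "result is the recursive part itself"
    RECURSE_MESSAGE = "keep header, process the message/rfc822 body"
    RECURSE_MULTIPART = "keep header, process each subpart"


def classify_recursive_part(header_has_utf8: bool, media_type: str,
                            has_boundary: bool,
                            transfer_encoding: str) -> Action:
    """Return the action that Section 5.1.1 selects for a recursive part."""
    media_type = media_type.lower()
    top_level = media_type.split("/", 1)[0]
    encoding = transfer_encoding.lower()

    # Step 1: UTF-8 header, or a signed multipart that must be hidden.
    if header_has_utf8 or media_type == "multipart/signed":
        return Action.ENCAPSULATE
    # Step 2: discrete media types are passed through unchanged.
    if top_level in DISCRETE_TYPES:
        return Action.COPY
    # Step 3: message/rfc822 keeps its header; its body is processed.
    if media_type == "message/rfc822":
        return Action.RECURSE_MESSAGE
    # Step 4: multipart/* keeps its header; each subpart is processed.
    if top_level == "multipart":
        if not has_boundary:
            raise ValueError("multipart type without boundary parameter")
        return Action.RECURSE_MULTIPART
    # Step 5: an ASCII-only body cannot contain embedded UTF-8 headers.
    if encoding == "7bit":
        return Action.COPY
    # Step 6: a non-identity encoding (e.g. base64) is left untouched.
    if encoding not in ("8bit", "binary"):
        return Action.COPY
    # Step 7: unknown, possibly composite type with an 8-bit body.
    return Action.ENCAPSULATE
```

   For instance, a part of type "X-message8/plain" with an ASCII header
   is copied unchanged when its encoding is "7bit" but is encapsulated
   when its encoding is "8bit".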
5.1.3.  Multipart encapsulation example

   A multipart encapsulation example

   Original internationalized entity:
   ==========================================
   Content-Type: Multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345--
   ==========================================

   Encapsulated entity:
   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="67890"
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: text/utf8-header; charset=US-ASCII
   Content-Transfer-Encoding: 7bit

   Content-Type: Multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: Multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: multipart/utf8-encapsulated; type=subpart;
     boundary="abcde"
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --abcde--
   --12345--
   --67890--
   ==========================================

   "Generic encapsulation" always encapsulates the top level header
   part, even when it is US-ASCII only.  In general it is assumed that
   an internationalized email always has some header fields that
   require this encapsulation.

5.1.4.
Unknown top level type encapsulation example #1

   An encapsulation example for an unknown top level type with a 7-bit
   body

   Original internationalized entity:
   ==========================================
   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 7bit

   { ASCII text }
   ==========================================

   Encapsulated entity:
   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 7bit

   --12345
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 7bit

   { ASCII text }

   --12345--
   ==========================================

5.1.5.  Unknown top level type encapsulation example #2

   An encapsulation example for an unknown top level type with an 8-bit
   body

   Original internationalized entity:
   ==========================================
   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 8bit

   { non-ASCII text }
   ==========================================

   Encapsulated entity:
   ==========================================
   Content-Type: multipart/utf8-encapsulated; type={specified later};
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Some-Header: { UTF-8 content }
   Content-Type: X-message8/plain
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: application/octet-stream
   Content-Transfer-Encoding: 8bit

   { non-ASCII text }

   --12345--
   ==========================================

5.2.  Downgrading of internationalized email message

   When an internationalized email leaves an EAI compliant environment,
   a downgrade is required.  [ietf-eai-downgrade] describes when the
   downgrade occurs.
   This document defines the "downgrade" parameter of the "Header-Type"
   header field.  The "downgrade" parameter defines how the downgrade
   occurs:

   o  If the parameter value is "convert", downgrading of the header
      part (and body) of the mail is done according to
      [ietf-eai-downgrade].
   o  If the parameter value is "encapsulate", downgrading of the header
      part (and body) of the mail is done as described in this section.

   Downgrading of an internationalized email is done in the following
   way:

   o  The internationalized email is considered to be the "original
      entity" and "Generic encapsulation" (Section 5.1) is applied.
   o  For the resulting "multipart/utf8-encapsulated" encapsulating
      entity, the "type" parameter is set to "encapsulated".
   o  The resulting encapsulating entity is the downgraded
      internationalized email.
      *  The addition of email header fields to the downgraded
         internationalized email is described below.

   When mail is downgraded, some email header fields must be added.
   "Generic encapsulation" (Section 5.1) does not produce these email
   header fields.

   o  A "Header-Type" header field with value "Encapsulated" is added.
   o  All header fields that are mentioned in the "required-fields"
      parameter of the "Header-Type" header field of the original
      internationalized email are copied as follows:
      *  A header field is copied as is if it includes only ASCII
         characters.
      *  Otherwise it is converted as specified in
         [ietf-eai-downgrade].
      *  If it is not possible to convert the header field, the
         internationalized email is bounced or rejected.
   o  "Received" header fields are copied from the original
      internationalized email under a new header field name:
      "I18N-Received" is used as the name of the copied header fields.
      If a header field includes non-ASCII characters, it is not copied.
   o  A "Mime-Version" header field with value "1.0" is added.
   o  The "Date" header field is copied from the original
      internationalized email if it includes only ASCII characters.
      Otherwise it is generated.
   o  A "From" header field is added.
      Several different values can be used for the "From" header field:
      *  The "From" header field from the original internationalized
         email can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
      *  The value can be taken from the downgraded envelope sender
         address.
      *  An ASCII address that refers to the downgrading gateway can be
         used.
   o  The "Subject" header field is copied from the original
      internationalized email if it includes only ASCII characters.
      Otherwise several different values can be used for the "Subject"
      header field:
      *  The algorithm from [ietf-eai-downgrade] or from [RFC2047] can
         be used.
      *  An ASCII subject that refers to the downgrading operation can
         be used.
   o  If "From" and "Subject" are taken from the original
      internationalized email and the "Message-ID" header field of the
      original internationalized email includes only ASCII characters,
      the "Message-ID" header field is copied (from the original
      internationalized email).  Otherwise it is optionally generated.
   o  Optionally a "To" header field is added.  Several different values
      can be used for the "To" header field:
      *  The "To" header field from the original internationalized email
         can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
   o  Optionally a "Cc" header field is added.  Several different values
      can be used for the "Cc" header field:
      *  The "Cc" header field from the original internationalized email
         can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
   o  It is important NOT to copy all ASCII header fields.  Some header
      fields may be used for signatures.  If a signature is checked
      against the encapsulated form, it fails.  For example, Domain Keys
      Identified Mail [DKIM-Charter] uses this kind of signature.

   "Encapsulation of recursive part" (Section 5.1.1) mentions several
   error conditions.
   Although that section defines the output in those cases, a converting
   MTA is permitted to bounce (return an NDN) or reject (at the SMTP
   level) the internationalized email message.  A downgrading MUA can
   refuse to downgrade the internationalized email message and give an
   error message to the user, or produce the downgraded message and give
   a warning message to the user.  Silent operation is not recommended
   when an error condition happens (on a downgrading MUA).

   NOTE: Only the "Date" and "From" header fields are required in email
   (as per [RFC2822]).  However, the "multipart/utf8-encapsulated"
   format is also usable by non-EAI compliant MUAs, assuming that they
   support MIME, especially if the original internationalized email
   message used UTF-8 characters only in the main header part and not in
   the header parts of MIME subparts.  Therefore it is useful if the
   "From", "Subject", "To" and "Cc" header fields are derived from the
   original internationalized email according to [ietf-eai-downgrade].
   This allows reply commands to work on non-EAI compliant MUAs.

   NOTE: It is possible to copy almost all ASCII header fields from the
   original internationalized email header part to the encapsulating
   header part, and to avoid signature header fields, if a
   "hidden-fields" parameter is added to the "Header-Type" field.  That
   parameter would tell which ASCII header fields are not copied.
   However, the author believes that this unnecessarily complicates the
   encapsulation format and algorithm.  The gain from this addition is
   small when the main purpose of encapsulation is to provide tunneling
   between EAI compliant environments that are separated by MTAs which
   do not support UTF8SMTP ([ietf-eai-smtpext]).

5.2.1.
Encapsulation example

   An encapsulation example

   Original internationalized email:
   ==========================================
   Received: from {idn-encoded-name} by downgrade.example.org with ESMTP
     id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: UTF8; downgrade=encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }
   ==========================================

   Encapsulated internationalized email:
   ==========================================
   I18N-Received: from {idn-encoded-name} by downgrade.example.org
     with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: Encapsulated
   From: { downgraded address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { RFC 2047 encoded subject }
   Mime-Version: 1.0
   Content-Type: multipart/utf8-encapsulated; type=encapsulated;
     boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Received: from {idn-encoded-name} by downgrade.example.org with ESMTP
     id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: UTF8; downgrade=encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345--
   ==========================================

   NOTE: In this example the "text/utf8-header" part is not base64
   encoded, for clarity.  Base64 encoding is recommended, especially
   because the part includes a line starting with "From".

5.2.2.
Multipart/signed encapsulation example

   Multipart/signed [RFC1847] encapsulation example

   Original internationalized email:
   ==========================================
   Received: from {idn-encoded-name} by downgrade.example.org with ESMTP
     id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: UTF8; downgrade=encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: multipart/signed;
     protocol="application/XYZ-signature";
     micalg="ABC"; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Description: { UTF-8 description }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --12345
   Content-Type: application/XYZ-signature

   { signature data }

   --12345--
   ==========================================

   Encapsulated internationalized email:
   ==========================================
   I18N-Received: from {idn-encoded-name} by downgrade.example.org
     with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: Encapsulated
   From: { downgraded address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { RFC 2047 encoded subject }
   Mime-Version: 1.0
   Content-Type: multipart/utf8-encapsulated; type=encapsulated;
     boundary="45678"
   Content-Transfer-Encoding: 8bit

   --45678
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Received: from {idn-encoded-name} by downgrade.example.org with ESMTP
     id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: UTF8; downgrade=encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: multipart/signed;
     protocol="application/XYZ-signature";
     micalg="ABC"; boundary=12345
   Content-Transfer-Encoding: 8bit

   --45678
   Content-Type: multipart/mixed; boundary=12345
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: multipart/utf8-encapsulated; type=subpart;
     boundary="abcde"
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Content-Description: { UTF-8 description }
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --abcde
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --abcde--
   --12345
   Content-Type: application/XYZ-signature

   { signature data }

   --12345--
   --45678--
   ==========================================

   The media type "multipart/signed" is replaced with "multipart/mixed"
   in the encapsulated message.  This allows encapsulation of a signed
   header of a MIME subpart.

   NOTE: Again, "text/utf8-header" should be base64 encoded.  That is
   not done here, for clarity.

   NOTE: Normally the various multipart/signed protocols define that the
   body of the signed content must be quoted-printable or base64 encoded
   if it includes 8-bit characters.  If it includes 8-bit characters,
   the signature is broken when the email is 8BITMIME downgraded
   [RFC1652].  Note that in general this encapsulation algorithm does
   not protect against breaking of the signature in that case.  In this
   example it may protect the signature, but only as a side effect of
   the encapsulation required for the "Content-Description" header
   field.

5.3.  Attaching internationalized email message

   "Message/rfc822" can be used to attach internationalized email
   messages in EAI compliant environments if "message/rfc822" allows
   UTF-8 header fields.

   "Multipart/utf8-encapsulated" with "type" parameter value "message"
   can be used to attach internationalized email messages in EAI
   incompliant environments.
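   The choice between the two attachment forms can be sketched as a
   small Python helper.  This is an illustration only: the function name
   attachment_content_type and its eai_compliant flag are hypothetical,
   and how a sender learns whether the receiving environment is EAI
   compliant is outside the scope of the sketch.

```python
def attachment_content_type(eai_compliant: bool, boundary: str) -> str:
    """Return a Content-Type field for an attached internationalized message."""
    if eai_compliant:
        # Assumes "message/rfc822" allows UTF-8 header fields here.
        return "Content-Type: message/rfc822"
    # EAI incompliant environment: use the encapsulating media type
    # with "type" parameter value "message".
    return ("Content-Type: multipart/utf8-encapsulated; type=message; "
            f'boundary="{boundary}"')
```

   The body of the "multipart/utf8-encapsulated" entity still has to be
   produced by "Generic encapsulation" (Section 5.1); the helper only
   selects the media type and the "type" parameter value.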
   NOTE: A "message/rfc822" entity that contains
   "multipart/utf8-encapsulated" with "type" parameter value
   "encapsulated" also represents an attached internationalized email
   message.  However, the author believes that
   "multipart/utf8-encapsulated" with "type" parameter value "message"
   provides a useful shorthand.

   NOTE: If an internationalized email was stored inside the
   "message/rfc822" media type and that "message/rfc822" is inside a
   MIME structure which is encapsulated, "Encapsulation of recursive
   part" (Section 5.1.1) produces a "message/rfc822" entity that
   contains "multipart/utf8-encapsulated" with "type" parameter value
   "subpart".

   The "multipart/utf8-encapsulated" media type that represents an
   internationalized email message is produced in the following way:

   o  The internationalized email is considered to be the "original
      entity" and "Generic encapsulation" (Section 5.1) is applied.
   o  The value of the "type" parameter is set to "message" for the
      resulting "multipart/utf8-encapsulated" encapsulating entity.

5.3.1.  Attaching example

   In the following example the mail from the earlier example
   (Section 5.2.1) is attached to a message which is sent outside of the
   EAI compliant environment.

   Encapsulating message:
   ==========================================
   From: someone@example.org
   To: A@CC.example.org
   Subject: { UTF-8 subject } (fwd)
   Mime-Version: 1.0
   Content-Type: multipart/mixed; boundary="12345"
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain

   See attached message.
   --12345
   Content-Type: multipart/utf8-encapsulated; type=message;
     boundary="67890"
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   Header-Type: UTF8; downgrade=encapsulate
   From: { UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   --67890
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

   { UTF-8 text }

   --67890--
   --12345--
   ==========================================

   In this example it is assumed that the MUA knows that
   A@CC.example.org does not handle UTF8SMTP messages and therefore
   encapsulates the message.  The recipient (A@CC.example.org) may need
   a helper application for the media type multipart/utf8-encapsulated,
   although the message is mostly readable without a helper.

6.  Decoding encapsulation

   There are three cases of encapsulation:

   o  When an internationalized email message is tunneled through an EAI
      incompliant environment, the media type of the message is
      "multipart/utf8-encapsulated" with "type" parameter value
      "encapsulated".  The original message is inside that type.
   o  When an internationalized email message is an included or attached
      message, the media type "multipart/utf8-encapsulated" with "type"
      parameter value "message" represents the included or attached
      message.
   o  When a MIME subpart is encapsulated, the media type
      "multipart/utf8-encapsulated" with "type" parameter value
      "subpart" encapsulates the original MIME subpart.

   "Generic decoding" (Section 6.1) describes the parts that are common
   to decoding these three encapsulations.

6.1.  Generic decoding

   Decoding extracts an internationalized email message or a MIME
   subpart from "multipart/utf8-encapsulated".  Both an email message
   and a MIME subpart are referred to with the term "entity".

   On error conditions the encapsulating entity is not decoded.
   Instead the original encapsulating entity is returned.

   The decoded internationalized entity is generated from the
   encapsulating entity (multipart/utf8-encapsulated) in the following
   way:

   o  If the media type of the encapsulating entity is not
      "multipart/utf8-encapsulated", this is an error condition.
   o  If the number of MIME subparts of the encapsulating entity is not
      two (2), this is an error condition.
   o  If the media type of the first MIME subpart is not
      "text/utf8-header", this is an error condition.
   o  If the value of the "charset" parameter of the first MIME subpart
      is not "UTF-8" or "US-ASCII", this is an error condition.  A
      missing "charset" parameter is treated as equivalent to "US-ASCII"
      as per [RFC2046].
   o  The body of the first MIME subpart forms the header part of the
      decoded entity.  Its encoding (as given in the
      "content-transfer-encoding" header field) is decoded.
   o  The body of the second MIME subpart forms the body part of the
      decoded entity.  Generation of the body part of the decoded entity
      is described below.

   The body of the decoded entity is generated in the following way:

   o  If the media type of the second MIME subpart and the media type of
      the decoded entity (from the body of the first MIME subpart) are
      both discrete, then
      *  If the encoding of the decoded entity (from the body of the
         first MIME subpart) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary"):
         +  The body of the second MIME subpart forms the body part of
            the decoded entity.
         +  Its encoding (as given in the "content-transfer-encoding"
            header field of the second MIME subpart) is decoded.
      *  Otherwise, if the encoding of the decoded entity (from the body
         of the first MIME subpart) is the same as the encoding of the
         second MIME subpart:
         +  The body of the second MIME subpart forms the body part of
            the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
   o  Otherwise, if the top level type of the second MIME subpart and
      the top level type of the decoded entity (from the body of the
      first MIME subpart) are both "multipart", then
      *  Generation of the body part of the decoded entity is described
         later in this section ("Decoding of multipart").
   o  Otherwise, if the media type of the second MIME subpart and the
      media type of the decoded entity (from the body of the first MIME
      subpart) are both "message/rfc822", then
      *  The body of the second MIME subpart is parsed (into header and
         body part) and is processed as described in "Decoding of
         recursive part" (Section 6.1.1).  The result is copied to the
         body of the decoded entity.
   o  Otherwise, if the media type of the second MIME subpart is
      "application/octet-stream", then
      *  If the encoding of the decoded entity (from the body of the
         first MIME subpart) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary"):
         +  The body of the second MIME subpart forms the body part of
            the decoded entity.
         +  Its encoding (as given in the "content-transfer-encoding"
            header field of the second MIME subpart) is decoded.
      *  Otherwise, if the encoding of the decoded entity (from the body
         of the first MIME subpart) is the same as the encoding of the
         second MIME subpart:
         +  The body of the second MIME subpart forms the body part of
            the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
   o  Otherwise, if the media type of the second MIME subpart is the
      same as the media type of the decoded entity (from the body of the
      first MIME subpart), then
      *  If the encoding of the decoded entity (from the body of the
         first MIME subpart) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary"):
         +  The body of the second MIME subpart forms the body part of
            the decoded entity.
         +  Its encoding (as given in the "content-transfer-encoding"
            header field of the second MIME subpart) is decoded.
      *  Otherwise, if the encoding of the decoded entity (from the body
         of the first MIME subpart) is the same as the encoding of the
         second MIME subpart:
         +  The body of the second MIME subpart forms the body part of
            the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
         +  NOTE: This algorithm does not handle cases where the body
            part is re-encoded (for example, from quoted-printable to
            base64).  Reverse re-encoding is of course possible, but it
            does not necessarily give exactly the same representation.
      *  NOTE: This handles unknown media types.  However, unknown
         composite media types were stored as
         "application/octet-stream" if they include non-ASCII
         characters, so this handles mostly discrete media types.  It is
         possible that the generator of the encapsulation knows that a
         type is discrete while the decoder of the encapsulation does
         not know it.
   o  Otherwise this is an error condition.

   The body of the decoded entity is generated in the following way when
   the media type is multipart (both for the second MIME subpart and for
   the decoded entity):

   1.  The "boundary" parameter value from the second MIME subpart is
       remembered.
       *  If the "boundary" parameter is missing, this is an error
          condition.
   2.  The "boundary" parameter value from the decoded entity (from the
       body of the first MIME subpart) is remembered.  This is the new
       boundary, which is used in the generated body of the decoded
       entity.
       *  If the "boundary" parameter is missing, this is an error
          condition.
   3.  The "preamble" area from the body of the second MIME subpart is
       copied to the body of the decoded entity.
       *  It is not an error condition on decoding if the "preamble"
          area includes non-ASCII characters.
   4.  Subparts of the multipart (from the body of the second MIME
       subpart) are handled:
       1.  A boundary delimiter line is copied to the body of the
           decoded entity, but in such a way that the boundary of the
           second MIME subpart is replaced with the boundary of the
           decoded entity.
       2.  A subpart is parsed (into header and body part) and is
           processed as described in "Decoding of recursive part"
           (Section 6.1.1).  The result is copied to the body of the
           decoded entity.
   5.  The final boundary delimiter line is copied to the body of the
       decoded entity, but in such a way that the boundary of the second
       MIME subpart is replaced with the boundary of the decoded entity.
       *  A final boundary delimiter line is not generated in the
          decoded entity if the final boundary delimiter line is missing
          from the second MIME subpart.  This is not an error condition
          on decoding.
   6.  The "epilogue" area from the body of the second MIME subpart is
       copied to the body of the decoded entity.
       *  It is not an error condition on decoding if the "epilogue"
          area includes non-ASCII characters.

   NOTE: The CRLF preceding the boundary delimiter line is conceptually
   attached to the boundary (as per [RFC2046]).  That CRLF is not part
   of the subpart of the multipart.

6.1.1.  Decoding of recursive part

   Decoding of a recursive part is done in the following way:

   1.  If the media type of the recursive part is
       "multipart/utf8-encapsulated" and the "type" parameter has the
       value "subpart":
       1.  The recursive part is considered to be the "encapsulating
           entity" and "Generic decoding" (Section 6.1) is applied.
       2.  The resulting decoded entity is the result of "Decoding of
           recursive part".
   2.  If the media type of the recursive part is discrete, the result
       of "Decoding of recursive part" is the recursive part itself.
   3.  Otherwise, if the media type of the recursive part is
       "message/rfc822", then
       *  The header part of the "Decoding of recursive part" result is
          the header part of the recursive part.
       *  The body part of the recursive part is parsed (into header and
          body part) and is processed as described in "Decoding of
          recursive part" (Section 6.1.1).  The body part of the
          "Decoding of recursive part" result is the result of that
          processing.
   4.  Otherwise, if the top level type of the recursive part is
       multipart (i.e. the media type is multipart/*) and the "boundary"
       parameter exists, its handling is described later in this
       section.
       *  A missing "boundary" parameter on multipart types is not an
          error condition on decoding.
   5.  Otherwise, if the recursive part is ASCII only and the encoding
       of the recursive part is identity (i.e.
       "content-transfer-encoding" is "7bit", "8bit" or "binary"), the
       result of "Decoding of recursive part" is the recursive part
       itself.
   6.
       Otherwise this is an error condition.
       *  NOTE: This means that a missing "boundary" parameter is an
          error condition for decoding if the body is not ASCII only (or
          does not have the required encoding).
       *  NOTE: This means that an unknown composite type is an error
          condition if the body is not ASCII only (or does not have the
          required encoding).

   If the top level type is multipart, the result of "Decoding of
   recursive part" is generated in the following way:

   1.  The header part of the "Decoding of recursive part" result is the
       header part of the recursive part.
   2.  The "boundary" parameter value from the recursive part is
       remembered.
       *  Handling of a missing "boundary" parameter is described
          earlier in this section.
   3.  The body part of the "Decoding of recursive part" result is
       initiated.
   4.  The "preamble" area from the body of the recursive part is copied
       to the body part of the "Decoding of recursive part" result.
       *  It is not an error condition on decoding if the "preamble"
          area includes non-ASCII characters.
   5.  Subparts of the multipart (from the body of the recursive part)
       are handled:
       1.  A boundary delimiter line is copied to the body part of the
           "Decoding of recursive part" result.
       2.  A subpart is parsed (into header and body part) and is
           processed as described in "Decoding of recursive part"
           (Section 6.1.1).  The outcome is copied to the body part of
           the "Decoding of recursive part" result.
   6.  The final boundary delimiter line is copied (from the body of the
       recursive part) to the body part of the "Decoding of recursive
       part" result.
       *  A final boundary delimiter line is not generated in the body
          part of the "Decoding of recursive part" result if the final
          boundary delimiter line is missing from the recursive part.
          This is not an error condition on decoding.
   7.  The "epilogue" area from the body of the recursive part is copied
       to the body part of the "Decoding of recursive part" result.
       *  It is not an error condition on decoding if the "epilogue"
          area includes non-ASCII characters.

6.2.  Upgrading of internationalized email message

   When a downgraded internationalized email enters an EAI compliant
   environment, an upgrade is allowed.
   [ietf-eai-downgrade] describes when the upgrade occurs.

   This document defines the "Encapsulated" value of the "Header-Type"
   header field.  The "Header-Type" header field defines how the upgrade
   occurs.

   o  If the header field value is "Downgraded", upgrading of the header
      part (and body) of the mail is done according to
      [ietf-eai-downgrade].
   o  If the header field value is "Encapsulated", upgrading of the
      header part (and body) of the mail is done as described in this
      section.

   The encapsulating entity is not decoded on error conditions.  Instead
   the original encapsulating entity is returned.

   Upgrading of an internationalized email is done in the following way:

   o  If the media type of the downgraded internationalized email is not
      "multipart/utf8-encapsulated", or if the "type" parameter does not
      have the value "encapsulated", this is an error condition.
   o  The downgraded internationalized email is considered to be the
      "encapsulating entity" and "Generic decoding" (Section 6.1) is
      applied.
   o  The resulting decoded internationalized entity is the upgraded
      internationalized email.
   o  "Received" header fields from the downgraded internationalized
      email are prepended to the upgraded internationalized email.
      *  The upgraded internationalized email already includes all
         original header fields.  This step adds the trace header fields
         that were inserted into the mail after it was downgraded.  It
         does not re-add trace header fields that were added before
         downgrading, because they are renamed to "I18N-Received" in the
         downgraded internationalized email.

6.2.1.  Upgrading example

   The mail from the earlier example (Section 5.2.1) is used as an
   upgrading example.  The mail is assumed to have been 8BITMIME
   downgraded afterwards.  That process also added some extra header
   fields to MIME parts.
   Downgraded internationalized email:
   ==========================================
   Received: from fw.example.org by upgrade.example.org with ESMTP
     id JAX77356; Wed, 13 Sep 2006 22:27:32 +0300
   Received: from downgrade.example.org by fw.example.org with ESMTP
     id JAX77356; Wed, 13 Sep 2006 22:27:29 +0300
   I18N-Received: from {idn-encoded-name} by downgrade.example.org
     with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: Encapsulated
   From: { downgraded address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { RFC 2047 encoded subject }
   Mime-Version: 1.0
   Content-Type: multipart/utf8-encapsulated; type=encapsulated;
     boundary="12345"
   Content-Transfer-Encoding: 7bit

   --12345
   Content-Type: text/utf8-header; charset=UTF-8
   Content-Transfer-Encoding: quoted-printable
   X-MIME-Autoconverted: from 8bit to quoted-printable by
     downgrade.example.org id JGR17356

   Received: from {idn-encoded-name} by downgrade.example.org with ESMTP
     id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300
   Header-Type: UTF8; downgrade=3Dencapsulate
   From: { q-p encoded UTF-8 address }
   To: someone@example.org
   Date: Wed, 13 Sep 2006 22:27:25 +0300
   Subject: { q-p encoded UTF-8 subject }
   X-Foobar: XvrT
   Mime-Version: 1.0
   Content-Type: Text/plain; charset=3DUTF-8
   Content-Transfer-Encoding: 8bit

   --12345
   Content-Type: Text/plain; charset=UTF-8
   Content-Transfer-Encoding: quoted-printable
   X-MIME-Autoconverted: from 8bit to quoted-printable by
     downgrade.example.org id JGR17356

   { q-p encoded UTF-8 text }

   --12345--
   ==========================================

   Upgraded internationalized email:
   ==========================================
   Received: from fw.example.org by upgrade.example.org with ESMTP
     id JAX77356; Wed, 13 Sep 2006 22:27:32 +0300
   Received: from downgrade.example.org by fw.example.org with ESMTP
     id JAX77356; Wed, 13 Sep
2006 22:27:29 +0300 Received: from {idn-encoded-name} by downgrade.example.org with ESMTP id JGR17356; Wed, 13 Sep 2006 22:27:25 +0300 Header-Type: UTF8; downgrade=encapsulate From: { UTF-8 address } To: someone@example.org Date: Wed, 13 Sep 2006 22:27:25 +0300 Subject: { UTF-8 subject } X-Foobar: XvrT Mime-Version: 1.0 Content-Type: Text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit { UTF-8 text } ========================================== Note handling of Received: header fields. That is only header field what was preserved from downgraded internationalized email. All other header fields are got from "text/utf8-header" MIME part. This also means that upgrading do not need add "Header-Type" header field, because it necessary already on "text/utf8-header" MIME part. On downgraded e-mail there was empty line after "UTF-8 text", but on upgraded email it is disappered because it was part of multipart boundary. 6.3. Retrieving attached internationalized email message "Multipart/utf8-encapsulated" with "type" parameter value "message" can be used to attach internationalized email messages on EAI incompliant environments. Retrieving internationalized email can be done following way: o If media type is "message/rfc822", then * It is parsed (to header and body part). * Body part is processed as described "Upgrading of internationalized email message" (Section 6.2) * Result is internationalized email. o If media type is "Multipart/utf8-encapsulated" and parameter "type" value is "message", then Hurtta Expires May 5, 2007 [Page 38] Internet-Draft EAI Encapsulation November 2006 * It is considered to be "encapsulating entity" and "Generic decoding" (Section 6.1) is applied. * Result is internationalized email. 7. IANA Considerations IANA is requested to register I18N-Received header field and multipart/utf8-encapsulated and text/utf8-header media types as given on registration applications on this document. 8. 
Security Considerations This "multipart/utf8-encapsulated" media type provides method to encapsulate mail data. Specially this media type provides method to smuggle mail header fields so that mail scanners do not see them. This may provide new security threats. This encapsulation do not hide original MIME parts. However original MIME structure may be obscured. This may provide method to smuggle MIME parts so that mail scanners do not see them. This may provide new security threats. This encapsulation preservers only "Received" header fields from encapsulating message. This may hide information when encapsulated message is upgraded to internationalized email format. 9. Acknowledgements Originally this encapsulation format is suggested on former IMAA mailing list discusions. John C. Klensin was strongly encouraging author to write this documentation. 10. References 10.1. Normative References [ASCII] American National Standards Institute (formerly United States of America Standards Institute), "USA Code for Information Interchange", ANSI X3.4-1968, 1968. ANSI X3.4-1968 has been replaced by newer versions with slight modifications, but the 1968 version remains Hurtta Expires May 5, 2007 [Page 39] Internet-Draft EAI Encapsulation November 2006 definitive for the Internet. [ietf-eai-framework] Klensin, J. and Y. Ko, "Overview and Framework for Internationalized Email", draft-ietf-eai-framework-02 (work in progress), August 2006. [ietf-eai-downgrade] YONEYA, Y., Ed. and K. Fujiwara, Ed., "Downgrading mechanism for Internationalized eMail Address (IMA)", draft-ietf-eai-downgrade-02 (work in progress), August 2005. [ietf-eai-utf8headers] Yeh, J., "Internationalized Email Headers", draft-ietf-eai-utf8headers-02 (work in progress), October 2006. [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996. [RFC2046] Freed, N. and N. 
Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996. [RFC2047] Moore, K., "Multipurpose Internet Mail Extensions (MIME) Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996. [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April 2001. [RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations", RFC 2231, November 1997. [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 3629, November 2003. 10.2. Informative References [DKIM-Charter] IETF, "Domain Keys Identified Mail (dkim)", October 2006, . [ietf-eai-smtpext] Yao, J., Ed. and W. Mao, Ed., "SMTP extension for Hurtta Expires May 5, 2007 [Page 40] Internet-Draft EAI Encapsulation November 2006 internationalized email address", draft-ietf-eai-smtpext-01 (work in progress), July 2006. [RFC1847] Galvin, J., Murphy, S., Crocker, S., and N. Freed, "Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted", RFC 1847, October 1995. [RFC1652] Freed, N., Ed., Rose, M., Stefferud, E., and D. Crocker, "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July 1994. [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", RFC 4288, BCP 13, December 2005. [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration Procedures for Message Header Fields", RFC 3864, BCP 90, September 2004. Author's Address Kari Hurtta Kala-Matti 4 B 24 02230 Espoo FI Email: hurtta-ietf@elmme-mailer.org URI: http://iki.fi/keh/ Hurtta Expires May 5, 2007 [Page 41] Internet-Draft EAI Encapsulation November 2006 Full Copyright Statement Copyright (C) The Internet Society (2006). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. 
   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).