Structured Email
draft-ietf-sml-structured-email-02

Document Type Active Internet-Draft (sml WG)
Author Hans-Jörg Happel
Last updated 2024-07-08
Replaces draft-happel-structured-email
RFC stream Internet Engineering Task Force (IETF)
Intended RFC status (None)
Stream WG state WG Document
IESG IESG state I-D Exists
SML                                                         H.-J. Happel
Internet-Draft                                              audriga GmbH
Intended status: Standards Track                             8 July 2024
Expires: 9 January 2025

                            Structured Email
                   draft-ietf-sml-structured-email-02

Abstract

   This document specifies how a machine-readable version of the content
   of email messages can be added to those messages.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 9 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Happel                   Expires 9 January 2025                 [Page 1]
Internet-Draft              Structured Email                   July 2024

Table of Contents

   1.  Introduction
   2.  Conventions Used in This Document
   3.  Representing structured data
     3.1.  Knowledge representation language
     3.2.  Vocabularies
   4.  Structured data in email messages
     4.1.  Placement
       4.1.1.  Full representation
       4.1.2.  Partial representation
       4.1.3.  Non-representation
     4.2.  Identifiers
       4.2.1.  Using identifiers in structured data
       4.2.2.  Using structured data identifiers in text/html
   5.  Structured data across email messages
     5.1.  Forwarding
     5.2.  Replies
     5.3.  Error replies
     5.4.  Updates
   6.  Header fields and message flags
     6.1.  Presence of structured data
     6.2.  Action processing
   7.  Examples
     7.1.  Partial representation
   8.  Security and trust
   9.  Implementation status
     9.1.  Structured Email plugin for Roundcube Webmail
   10. Security considerations
   11. Privacy considerations
   12. IANA Considerations
   13. Informative References
   Author's Address

1.  Introduction

   Information on websites and in email messages mostly addresses human
   readers.  However, various attempts have been made to make such
   information - fully or in part - machine-readable, so that tools can
   assist users in dealing with that information more efficiently.

   One widespread approach is the use of the [SchemaOrg] vocabulary,
   which can be embedded in the HTML markup of websites.  It is then
   crawled by web search engines and used to improve the quality of
   search result snippets (e.g., by displaying ratings, opening hours,
   or contact information).

   Similarly, a number of web shops, hotels, or airlines include
   Schema.org vocabulary in order receipt email messages sent to
   customers.  This information is extracted and used by some ISPs and
   open source tools ([SchemaOrgEmail]).  However, these implementations
   differ in many details.

   This document aims to specify this practice clearly and
   comprehensively, and to provide a foundation for potential future
   extensions.

2.  Conventions Used in This Document

   The terms "message" and "email message" refer to "electronic mail
   messages" or "emails" as specified in [RFC5322].  The term "Message
   User Agent" (MUA) denotes an email client application as per
   [RFC5598].

   The terms "machine-readable data" and "structured data" are used in
   contrast to "human-readable" messages and denote information
   expressed "in a structured format (..) which can be consumed by
   another program using consistent processing logic" [MachineReadable].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Representing structured data

   In order to exchange structured data, one needs to choose a formal
   language and a serialization format.  Based on this, vocabularies can
   be helpful to establish a shared understanding of structured data
   among heterogeneous senders and receivers.

3.1.  Knowledge representation language

   The Resource Description Framework ([RDF]) is a formal language for
   knowledge representation standardized by the W3C.  It is already used
   for annotating websites and emails, as it underlies [SchemaOrg].
   Among the various serializations of RDF, JSON-LD ([JSONLD]) has
   become the one most commonly used on websites ([WDCStats]).

   Hence, structured data in email messages SHOULD be expressed in the
   JSON-LD serialization of RDF.

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/1
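
   As a minimal sketch of what such JSON-LD could look like, the
   following Python snippet serializes a hypothetical order
   confirmation using Schema.org terms.  The concrete values, the
   variable names, and the helper to_jsonld are illustrative
   assumptions and are not defined by this document.

```python
import json

# Hypothetical order confirmation expressed with Schema.org terms.
# All values are illustrative placeholders, not taken from this document.
order = {
    "@context": "https://schema.org",
    "@type": "Order",
    "orderNumber": "12345",
    "seller": {"@type": "Organization", "name": "Example Shop"},
    "orderStatus": "https://schema.org/OrderProcessing",
}


def to_jsonld(node: dict) -> str:
    """Serialize a JSON-LD node as JSON text, keeping non-ASCII characters."""
    return json.dumps(node, ensure_ascii=False, indent=2)


jsonld_text = to_jsonld(order)
```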

3.2.  Vocabularies

   Using RDF/JSON-LD, users are free to express any kind of information
   in structured data.  For reuse and reference, however, it is common
   to agree upon certain core concepts/entities and properties for a
   given domain.  Those are typically collected and shared in so-called
   vocabularies.

   [SchemaOrg] is a widespread vocabulary, which was designed for
   annotating content on websites.  A small subset of its concepts is
   already used by email senders and processed by email providers.

   Users that want to add structured data to email messages SHOULD use
   concepts from [SchemaOrg], if they fit their use case.  They MAY
   however use any valid JSON-LD.

   There might also be certain vocabularies for email-specific use cases
   (such as [I-D.happel-sml-structured-vacation-notices-00]), that will
   be specifically endorsed by the IETF or by respective RFCs.

   MUAs may choose freely if and how to use structured data extracted
   from messages.  If they do not explicitly support a certain
   vocabulary, MUAs may also rely on extensions or pass data to
   outside applications, similar to the case of MIME body parts.

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/2

4.  Structured data in email messages

   This section discusses the placement of structured data within email
   messages and identifiers for referencing between structured data and
   other parts of a message.

4.1.  Placement

   This document targets structured data describing the content of an
   email message itself.  Since users may add other arbitrary structured
   data (e.g., as MIME body parts of type "application/ld+json") to an
   email message, we need to define which kinds of structured data are
   supposed to be representative of the email message content.

   For this, we distinguish the cases of full, partial, and non-
   representation.

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/3

4.1.1.  Full representation

   If structured data is intended by the sender to _fully_ describe the
   human readable content of an email message, it MUST be added as a
   multipart/alternative entity with the content type application/
   ld+json.

   The email message SHOULD in this case also contain a text/plain and a
   text/html version of the content.

   MUAs supporting this specification SHOULD prefer the application/
   ld+json representation when receiving such email messages if they are
   able to process the used vocabulary or are able to process the
   structured data otherwise.
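
   The following Python sketch illustrates one way a sender could
   assemble such a message with the standard library, and how a
   receiving MUA could pick out the application/ld+json alternative.
   The function names, addresses, and the choice to place the JSON-LD
   part last are assumptions made for illustration; this document does
   not prescribe an ordering of the alternatives.

```python
import json
from email.message import EmailMessage


def build_full_representation(plain: str, html: str, node: dict) -> EmailMessage:
    """Build a multipart/alternative message with text, HTML and JSON-LD."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"       # placeholder address
    msg["To"] = "recipient@example.com"      # placeholder address
    msg["Subject"] = "Your order"
    msg.set_content(plain)                   # text/plain
    msg.add_alternative(html, subtype="html")  # text/html
    # JSON-LD alternative; placing it last is an assumption of this sketch.
    msg.add_alternative(
        json.dumps(node).encode("utf-8"),
        maintype="application",
        subtype="ld+json",
    )
    return msg


def pick_structured_part(msg: EmailMessage):
    """Return the parsed JSON-LD alternative, or None if there is none."""
    if msg.get_content_type() != "multipart/alternative":
        return None
    for part in msg.iter_parts():
        if part.get_content_type() == "application/ld+json":
            return json.loads(part.get_content())
    return None
```

   A MUA that cannot process the vocabulary would simply fall back to
   the text/html or text/plain alternative.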

4.1.2.  Partial representation

   If structured data is intended to describe only a _subset_ of the
   human-readable content, it must be enclosed in a <script> HTML tag
   within the HTML <body> tag of the text/html body part of the email
   message (see the example in Section 7.1).

   MUAs receiving such messages may use the structured data to provide
   an enhanced user experience.
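
   As a rough sketch, a receiving MUA could locate such markup with
   Python's standard HTML parser as shown below.  The class and
   function names are hypothetical, and the set of accepted "type"
   attribute values is an assumption of this sketch.

```python
import json
from html.parser import HTMLParser

# Assumption: accept the registered media type and the short form "ld+json".
ACCEPTED_TYPES = {"application/ld+json", "ld+json"}


class JsonLdScriptParser(HTMLParser):
    """Collect the parsed content of JSON-LD <script> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").strip().lower()
            self._in_jsonld = script_type in ACCEPTED_TYPES
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several chunks; accumulate them.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.nodes.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # ignore malformed structured data in this sketch


def extract_jsonld(html: str) -> list:
    """Return all JSON-LD documents embedded in the given HTML text."""
    parser = JsonLdScriptParser()
    parser.feed(html)
    parser.close()
    return parser.nodes
```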

4.1.3.  Non-representation

   In the case of non-representation, there is no relation between
   structured data and the human readable content.

   This may be useful for special scenarios, such as embedding
   "preemptive" structured vacation notices as described in [I-D.happel-
   sml-structured-vacation-notices-00] into email messages.

   As in the case of partial representation, MUAs receiving such
   messages may take appropriate action based on the structured data
   extracted.

4.2.  Identifiers

   There are existing use cases for cross-referencing between different
   parts of a MIME message, for which [RFC2392] defines the cid: and
   mid: URI schemes.

   In a similar fashion, cross-referencing might occur between
   structured data and other message parts.

4.2.1.  Using identifiers in structured data

   Most nodes and properties in JSON-LD are identified using IRIs
   [RFC3987].  Since any [RFC2392] (cid/mid) reference forms a valid
   IRI, those references can be directly used in JSON-LD.

   There are two main cases in which cid: identifiers SHOULD be used in
   structured data.

   The first case is when structured data references binary content,
   such as images or other files, that already exists as a MIME body
   part within the same message.

   Second, if a cid: value is used in a JSON-LD @id property, the
   corresponding JSON-LD node can be considered to describe the MIME
   body part identified by that cid:. This MAY be used to denote that
   certain structured data is explicitly describing that MIME body part.
   This MUST NOT be used for the main text/plain or text/html body
   parts, though.
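
   The following Python sketch shows how a MUA might resolve cid:
   references found in JSON-LD to MIME body parts of the same message.
   All function names are hypothetical; the sketch assumes that a cid:
   URL carries the percent-encoded Content-ID without angle brackets,
   as described in [RFC2392].

```python
from email.message import EmailMessage
from urllib.parse import unquote


def index_parts_by_cid(msg: EmailMessage) -> dict:
    """Map each Content-ID (without angle brackets) to its MIME part."""
    index = {}
    for part in msg.walk():
        content_id = part.get("Content-ID")
        if content_id:
            index[str(content_id).strip().strip("<>")] = part
    return index


def find_cid_refs(node) -> list:
    """Recursively collect all string values that start with 'cid:'."""
    if isinstance(node, str):
        return [node] if node.lower().startswith("cid:") else []
    if isinstance(node, dict):
        return [ref for value in node.values() for ref in find_cid_refs(value)]
    if isinstance(node, list):
        return [ref for value in node for ref in find_cid_refs(value)]
    return []


def resolve_cid(ref: str, index: dict):
    """Return the MIME part a cid: reference points to, or None."""
    return index.get(unquote(ref[len("cid:"):]))
```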

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/4

4.2.2.  Using structured data identifiers in text/html

   In the case of "partial representation", a MUA will still primarily
   display the human readable part of a message (e.g., text/plain or
   text/html).

   It might however be helpful if the MUA is able to determine which
   parts of human readable text refer to certain structured data - e.g.,
   to offer actions based on structured data directly in the context of
   the corresponding human-readable content.

   For this purpose, the sender may add an HTML "data-id" attribute
   ([HTMLData]) to any HTML element in the text/html body, which
   references the @id property of a JSON-LD node in the structured data.
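
   A minimal sketch of how a MUA could link annotated HTML elements to
   JSON-LD nodes is given below.  The names DataIdCollector and
   link_elements_to_nodes are hypothetical, and the sketch only
   considers top-level nodes that carry an @id property.

```python
from html.parser import HTMLParser


class DataIdCollector(HTMLParser):
    """Record (tag name, data-id value) for every annotated element."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-id" and value:
                self.references.append((tag, value))


def link_elements_to_nodes(html: str, nodes: list) -> list:
    """Pair each annotated HTML element with the JSON-LD node it names."""
    collector = DataIdCollector()
    collector.feed(html)
    collector.close()
    by_id = {node["@id"]: node for node in nodes if "@id" in node}
    return [(tag, by_id[ref]) for tag, ref in collector.references if ref in by_id]
```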

   Besides referencing the corresponding JSON-LD node, a sender might
   also want to denote if the underlying data is "extensively" described
   or just mentioned in the human readable representation.  Depending on
   that, a MUA might provide different additional visualizations for the
   user.

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/5

5.  Structured data across email messages

5.1.  Forwarding

   Forwarding messages including structured data needs to be considered
   from a privacy perspective, particularly in cases of "non-
   representation", when the user has no way to determine structured
   data from the human readable part of the message.

   A MUA MUST strip non-representative structured data when forwarding
   messages.  Note that this applies only to forwarding initiated by a
   user in a MUA, not to automated forwarding set up by a user.

   Beyond that, privacy issues also apply to forwarding regular email
   messages, such that a more general solution might be specified
   outside of the specific context of structured email.

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/6

5.2.  Replies

   In order to allow responses to structured email messages, the
   [SchemaOrg] vocabulary specifies a property called "potentialAction"
   ([PotentialAction]).

   Accordingly, there can be two different ways of replying to a
   structured email: regular email replies such as supported by many
   MUAs, and particular structured email replies.

   MUAs should ensure that both types of reply can be clearly
   distinguished by end users.

   If the "target" property of an action points to a "mailto:" URI, the
   email user agent SHOULD reply with a structured email if the user
   triggers the action.
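
   As an illustration, a MUA could test whether an action targets a
   "mailto:" URI roughly as follows.  The function name is
   hypothetical, and treating "target" as either a plain string or an
   object with a "urlTemplate" property is an assumption of this
   sketch.

```python
from urllib.parse import urlsplit


def mailto_target(action: dict):
    """Return the mail address an action points to, or None.

    Assumption: "target" is either a URI string or an object whose
    "urlTemplate" property holds the URI.
    """
    target = action.get("target")
    if isinstance(target, dict):
        target = target.get("urlTemplate")
    if not isinstance(target, str):
        return None
    parts = urlsplit(target)
    if parts.scheme.lower() != "mailto" or not parts.path:
        return None
    return parts.path
```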

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/7

5.3.  Error replies

   In general, an original sender cannot assume that a structured email
   has been processed by a recipient.  Hence, there will typically be no
   response or error message returned if the receiving MUA cannot make
   sense of a structured email for whatever reason.

   This may be slightly different when sending a structured email in
   response to an initial structured email.  In this case, the original
   sender MAY want to signal an issue with a response received, such as
   if a contradicting response has already been received, or if a
   response is formally inconsistent in another way.

   In this case, a "full representation"-style error message MAY be
   returned to the sender of the erroneous response.  Example: TBD

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/8

5.4.  Updates

   In human-readable messages, human language can be used to update or
   recall information that was conveyed in prior messages.  Accordingly,
   there needs to be a machine-readable mechanism for expressing the
   update or recall of structured data.

   Structured data SHOULD be updated if a later email message
   ("superseding message") is processed that carries a Supersedes
   header field ([RFC4021]) referencing the message ID of the original
   email message.  In this case, the structured data of the original
   message should be fully revoked and replaced by the structured data
   of the superseding message (which might be empty).

   Structured data in a superseding message MUST be ignored if:

   *  Structured data from the original message is not or cannot be
      revoked
   *  In particular, if the original message has already been replied to
      by the recipient
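
   The update rules above can be sketched as follows.  The in-memory
   store, the StoredData record, and its "replied" and "revocable"
   attributes are hypothetical bookkeeping for illustration; accepting
   superseding data when the original message is unknown is likewise
   an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class StoredData:
    """Structured data a MUA keeps for one message (hypothetical record)."""
    nodes: list
    replied: bool = False    # the recipient has already replied
    revocable: bool = True   # the data can still be revoked locally


def apply_superseding(store: dict, original_id: str, new_id: str, new_nodes: list) -> bool:
    """Apply a superseding message; return False if its data is ignored."""
    original = store.get(original_id)
    if original is not None and (original.replied or not original.revocable):
        return False  # data of the superseding message MUST be ignored
    store.pop(original_id, None)                 # revoke the original data
    store[new_id] = StoredData(nodes=new_nodes)  # replacement may be empty
    return True
```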

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/9

6.  Header fields and message flags

   This section presents header fields and IMAP flags which are
   supposed to support MUAs in dealing with structured email.

6.1.  Presence of structured data

   In some use cases, MUAs might benefit from information about message
   details without having to evaluate the full message body.

   For example, the $hasAttachment IMAP flag ([HasAttachment]) was
   proposed to signal the existence of MIME attachments in a message
   which otherwise would need to be redetermined based on complex MIME
   parsing.

   The following procedures should apply to structured email.

   A sending MUA (aMUA) SHOULD add a header field Structured data if a
   message contains structured data.  The value for this field MUST
   include only one of the following values (case-insensitive):

   *  Full for full representation
   *  Partial for partial representation
   *  Other for non-representation

   The Structured data field SHOULD additionally include (case-
   insensitive, comma-separated) the value Action, if a message contains
   a "potentialAction" that a MUA might want to investigate.

   Similarly, the IMAP flags $hasStructuredData and
   $hasStructuredDataAction MAY be used, if an inbound message is found
   to contain structured data, but neither of the aforementioned header
   fields.
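
   A receiving MUA could interpret the value of the header field
   described above along the following lines.  The function name is
   hypothetical, and rejecting unknown tokens is an assumption of this
   sketch rather than a requirement of this document.

```python
REPRESENTATIONS = {"full", "partial", "other"}


def parse_structured_data_value(value: str):
    """Parse the header field value into (representation, has_action).

    Values are comma-separated and compared case-insensitively.
    Raises ValueError unless exactly one representation value is given.
    """
    tokens = [token.strip().lower() for token in value.split(",") if token.strip()]
    kinds = [token for token in tokens if token in REPRESENTATIONS]
    unknown = [t for t in tokens if t not in REPRESENTATIONS and t != "action"]
    if len(kinds) != 1 or unknown:
        raise ValueError(f"invalid structured data value: {value!r}")
    return kinds[0], "action" in tokens
```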

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/10

6.2.  Action processing

   A structured email can contain "potentialActions".  MUAs need to
   ensure that such actions are not triggered multiple times - either
   within the same MUA or across multiple concurrent MUAs.

   For this purpose, the \Answered flag ([RFC9051]) is not appropriate,
   as it has an established meaning and implementations for regular,
   manually authored responses.

   Therefore, a MUA MUST set a flag $structuredDataActionSent if a
   potentialAction has been responded to - either by the user or some
   other mechanism on behalf of the user.
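
   A minimal sketch of this guard is shown below.  The function names
   are hypothetical; "conn" is assumed to be an imaplib.IMAP4-like
   connection with a mailbox already selected, and comparing flag
   names case-insensitively is an assumption of this sketch.

```python
ACTION_SENT_FLAG = "$structuredDataActionSent"


def action_already_sent(flags) -> bool:
    """Return True if the message flags show that an action was sent."""
    return ACTION_SENT_FLAG.lower() in {flag.lower() for flag in flags}


def mark_action_sent(conn, uid: str):
    """Set the flag on the message with the given UID.

    `conn` is assumed to offer the uid() method of imaplib.IMAP4.
    """
    return conn.uid("STORE", uid, "+FLAGS", f"({ACTION_SENT_FLAG})")
```

   A MUA would call action_already_sent with freshly fetched flags
   before triggering an action, and mark_action_sent right after the
   response has been sent.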

   For discussion, see also:
   https://github.com/hhappel/draft-happel-structured-email/issues/11

7.  Examples

7.1.  Partial representation

   Placement of JSON-LD markup in a text/html body part:

   <html>
   <body>
   <script type="application/ld+json">
   ...
   </script>
   </body>
   </html>

8.  Security and trust

   Email user agents that want to support structured email should follow
   guidance to ensure trust and security standards.  These will be
   elaborated in a separate specification.

9.  Implementation status

   < RFC Editor: before publication please remove this section and the
   reference to [RFC7942] >

   This section records the status of known implementations of the
   protocol defined by this specification at the time of posting of this
   Internet-Draft, and is based on a proposal described in [RFC7942].
   The description of implementations in this section is intended to
   assist the IETF in its decision processes in progressing drafts to
   RFCs.  Please note that the listing of any individual implementation
   here does not imply endorsement by the IETF.  Furthermore, no effort
   has been spent to verify the information presented here that was
   supplied by IETF contributors.  This is not intended as, and must not
   be construed to be, a catalog of available implementations or their
   features.  Readers are advised to note that other implementations may
   exist.

   According to [RFC7942], "this will allow reviewers and working groups
   to assign due consideration to documents that have the benefit of
   running code, which may serve as evidence of valuable experimentation
   and feedback that have made the implemented protocols more mature.
   It is up to the individual working groups to use this information as
   they see fit".

9.1.  Structured Email plugin for Roundcube Webmail

   An open source plugin for the Roundcube Webmail software is being
   developed to serve as a reference implementation for this
   specification ([RC-SML]).

   Beyond that, some ISPs and open source tools provide implementations
   that are partly compliant with this specification ([SchemaOrgEmail]).

10.  Security considerations

   See Section 8, "Security and trust".

11.  Privacy considerations

   See Section 8, "Security and trust".

12.  IANA Considerations

   This document has no IANA actions at this time.

   (TBD IMAP flags)

13.  Informative References

   [HTMLData] WHATWG, "HTML Living Standard: Embedding custom non-
              visible data with the data-* attributes",
              <https://html.spec.whatwg.org/multipage/dom.html#attr-
              data-*>.

   [HasAttachment]
              IETF imapext WG mailing list, "Registering $hasAttachment
              & $hasNoAttachment",
              <https://mailarchive.ietf.org/arch/msg/imapext/
              MVE5eNHOaNIVGUvN1RKtBL8b278/>.

   [JSONLD]   W3C JSON-LD Working Group, "JSON-LD 1.1",
              <https://www.w3.org/TR/json-ld/>.

   [MachineReadable]
              NIST, "NIST IR 7511 Rev. 4",
              <https://csrc.nist.gov/glossary/term/Machine_Readable>.

   [PotentialAction]
              W3C Schema.org Community Group, "Schema.org:
              potentialAction", <https://schema.org/potentialAction>.

   [RC-SML]   audriga GmbH, "Structured Email plugin for Roundcube
              Webmail",
              <https://github.com/audriga/roundcube-structured-email/>.

   [RDF]      W3C RDF Working Group, "RDF 1.1 Concepts and Abstract
              Syntax", <https://www.w3.org/TR/rdf11-concepts/>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2392]  Levinson, E., "Content-ID and Message-ID Uniform Resource
              Locators", RFC 2392, DOI 10.17487/RFC2392, August 1998,
              <https://www.rfc-editor.org/info/rfc2392>.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, DOI 10.17487/RFC3987,
              January 2005, <https://www.rfc-editor.org/info/rfc3987>.

   [RFC4021]  Klyne, G. and J. Palme, "Registration of Mail and MIME
              Header Fields", RFC 4021, DOI 10.17487/RFC4021, March
              2005, <https://www.rfc-editor.org/info/rfc4021>.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              DOI 10.17487/RFC5322, October 2008,
              <https://www.rfc-editor.org/info/rfc5322>.

   [RFC5598]  Crocker, D., "Internet Mail Architecture", RFC 5598,
              DOI 10.17487/RFC5598, July 2009,
              <https://www.rfc-editor.org/info/rfc5598>.

   [RFC7942]  Sheffer, Y. and A. Farrel, "Improving Awareness of Running
              Code: The Implementation Status Section", BCP 205,
              RFC 7942, DOI 10.17487/RFC7942, July 2016,
              <https://www.rfc-editor.org/info/rfc7942>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC9051]  Melnikov, A., Ed. and B. Leiba, Ed., "Internet Message
              Access Protocol (IMAP) - Version 4rev2", RFC 9051,
              DOI 10.17487/RFC9051, August 2021,
              <https://www.rfc-editor.org/info/rfc9051>.

   [SchemaOrg]
              W3C Schema.org Community Group, "Schema.org",
              <https://schema.org/>.

   [SchemaOrgEmail]
              Structured Email, "Schema.org for email",
              <https://structured.email/content/related_work/frameworks/
              schema_org_for_email.html>.

   [WDCStats] Web Data Commons Project, "Web Data Commons - Microdata,
              RDFa, JSON-LD, and Microformat Data Sets",
              <http://webdatacommons.org/structureddata/#toc3>.

Author's Address

   Hans-Joerg Happel
   audriga GmbH
   Email: happel@audriga.com
   URI:   https://www.audriga.com
