<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  <!-- Required for schema validation and schema-aware editing -->
<!-- <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> -->
<!-- This third-party XSLT can be enabled for direct transformations in XML processors, including most browsers -->
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<!-- If further character entities are required then they should be added to the DOCTYPE above.
     Use of an external entity file is not recommended. -->
<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-nurpmeso-dkim-access-control-diff-changes-05"
  ipr="trust200902"
  obsoletes=""

  updates="6376"

  submissionType="IETF"
  xml:lang="en"
  version="3">
<!--
     [CHECK]
       * category should be one of std, bcp, info, exp, historic
       * ipr should be one of trust200902, noModificationTrust200902,
         noDerivativesTrust200902, pre5378Trust200902
       * updates can be an RFC number as NNNN
       * obsoletes can be an RFC number as NNNN
-->
  <front>

   <title>DKIM Access Control and Differential Changes</title>

   <seriesInfo name="Internet-Draft" value="draft-nurpmeso-dkim-access-control-diff-changes-05"/>

    <author fullname="Steffen Nurpmeso" initials="S" role="editor" surname="Nurpmeso">
      <address><email>steffen@sdaoden.eu</email></address>
    </author>

    <date year="2025" month="03" day="31"/>

    <area>General</area>
    <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>DKIM</keyword>

    <abstract><t>
      This document specifies a bundle of
      DKIM (RFC 6376)
      extensions and adjustments.
      They do not hinder the currently distributed processing
      environment that includes DKIM, ARC, DMARC, and SPF,
      and are as such backward compatible.
      Their aim, however, is to ultimately slim down the email
      environment that needs to be administered and maintained,
      by establishing mutual agreements between sender and
      receiver(s),
      verifiable through public-key cryptography,
      and to let the SMTP protocol handle decisions based solely upon that.
    </t></abstract>

  </front>
  <middle>

    <section>
      <name>Introduction</name>
      <t>
        Public-key cryptography is used for secure transactions on many
        levels, and in many protocols.
        For example, transport layer security
        TLS<xref target="RFC9325"/>
        provides encrypted data exchange.
        It is omnipresent, desired where optional,
        even enforced by standard means: newer IETF transports, like
        QUIC<xref target="RFC9369"/>,
        may even exist only in conjunction with it.
        The usual mode of operation of public-key cryptography is
        that if no trust can be established, the operation is cancelled.
        It simply does not happen.
      </t><t>
        DKIM<xref target="RFC6376"/>,
        on the other hand, defines as one of its core details that
        "signature verification failure does not force rejection".
        Yet email operators have such a pressing need to enforce
        policy that a plethora of extensive accompanying
        standards surrounding
        SMTP<xref target="RFC5321"/>
        and DKIM were developed, among them
        ARC, DMARC and SPF.
        The reality is that the complexity of email setup, and the
        administrative effort, have increased massively over the last
        decade and more,
        so much so that many small commercial and private operators have
        ceased to exist, or have turned away from providing their own
        service.
        The reality is also that a large part of those which still exist
        do not follow this so-called IETF progress out of a belief that
        it improves the situation;
        instead they wait until interoperability problems arise,
        especially with the giant email players,
        before searching for minimally invasive solutions.
        These are usually found by searching the internet,
        often by copying and pasting shared configuration snippets.
      </t><t>
        Some of the mentioned standards even introduce massive
        complications for decades-old habits and usage patterns.
        For example, many universities and other "groupings" offer
        stable member email addresses,
        and then forward email to current, "real addresses".
        This is made impossible by
        SPF<xref target="RFC7208"/>
        if it is taken at its word (RECOMMENDED), which it often is,
        depending on the software implementation or configuration.
        Non-standardized solutions,
        like the "Sender Rewriting Scheme" for the given example,
        are then developed and implemented out of the sheer necessity to
        keep a grown infrastructure in a usable state.
        Often these solutions are imperfect.
        In any case they try to circumvent a defect of an IETF standard,
        in an onion-like environment of standards that has no
        other desire,
        if one sets aside the mass of "reporting" capabilities
        that IETF standards have developed over the last years,
        than to provide reliable and trustworthy verification of the
        sender / receiver relationship and the communicated data.
      </t><t>
        On the one hand, this specification tries to provide a path to
        less complexity and to easier maintenance and administration.
        On the other hand, it tries to solve the issues which still
        exist, regardless of the sheer number of IETF standards invented
        to improve the situation.
      </t>

      <section>
        <name>Requirements Language</name>
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
          "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
          RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
          interpreted as described in BCP 14 <xref target="RFC2119"/>
          <xref target="RFC8174"/> when, and only when, they appear in
          all capitals, as shown here.</t>
      </section>
    </section>

    <section>
      <name>DKIMACDC</name>
      <t>
        The
        DKIM<xref target="RFC6376"/>
        extension Access Control and Differential Changes:
      </t><ul>
        <li>
          Places DKIM signatures in a randomly accessible, ordered sequence
          whose states correlate.

          DKIM signatures generated at the same hop
          which differ only in the algorithm used
          share a sequence number, however.
        </li><li>
          Adds reversible data difference tracking,
          and as such supports cryptographic content verification
          of (potentially) any intermediate message representation,
          back to the initial variant as sent by the originator.

          (This potentially allows user interfaces to undo, partially
          or to a configurable degree, modifications that the email system
          introduced along the message path.
          For example, for mailing-list specific mutations
          a user interface could show the original From address line
          instead of the mailing-list address introduced by DKIM/DMARC
          mitigation;
          but see below.)
        </li><li>
          Takes, on mutual agreement with receiver domains, cryptographically
          verifiable precautions to ensure that only initially addressed
          "mailbox local-part"s can be used as
          SMTP<xref target="RFC5321"/>
          <tt>RCPT TO</tt> addressees,
          as well as to ensure that the
          SMTP<xref target="RFC5321"/>
          <tt>MAIL FROM</tt>
          mailbox is identifiable;
          the latter prevents receivers from being fooled by a
          man-in-the-middle
          into sending backscatter bounces to random addresses.
        </li><li>
          DKIMACDC allows for rather cheap and easy detection (and testing)
          of the highest numbered signature,
          which can be sufficient for intermediate hops given the DKIM
          paradigm that "a single successful verification is sufficient
          for validation".
        </li><li>
          With DKIMACDC certain "detectable conditions" allow for quick
          rejection in a broken chain of trust.
        </li><li>
          DKIMACDC allows for fairly reliable collection of statistics of
          organizational trust (<xref target="RFC5863"/>, section 2.5),
          in turn improving the mentioned "detectable conditions".
        </li>
      </ul><t>
        The
        DKIM<xref target="RFC6376"/>
        extension Access Control and Differential Changes
        is announced by adding an acdc= tag to the DKIM-Signature.
        (For efficiency reasons it <bcp14>SHOULD</bcp14> be placed
        early, before tags like h=, bh= and b=, for example.)

        The tag starts with "sequence",
        a decimal number starting at 1,
        or incremented by 1 from the highest DKIMACDC sequence number
        encountered in the message;
        the maximum value is 999:
        if incrementing would result in overflow,
        the message <bcp14>MUST</bcp14> be rejected;
        sequence holes <bcp14>MUST</bcp14> also cause rejection
        (but see below);
        in both cases
        SMTP<xref target="RFC5321"/>
        reply code 550 is to be used; if
        enhanced SMTP status codes<xref target="RFC3463"/>
        are used, 5.5.4 <bcp14>MUST</bcp14> be used.
      </t><blockquote>
        <em>Informative remark:</em>
        999 is both a constraint and a very high limit,
        dependent upon which type of processing is actually involved.
        In today's DKIM use several signatures per actual hop are not
        uncommon, also in the sense that per-hop processing pipelines
        involve several processing steps that each create DKIM
        signatures.
        Since DKIMACDC is meant as a transparent upgrade path it seems
        unwise to introduce a limit that is too low.
        On the other hand a high limit creates a D(enial) O(f) S(ervice)
        attack surface, but again,
        since most often only the highest numbered signature needs to
        be verified, this seems acceptable.
      </blockquote><t>
        Flag description is normative.
        (Note the missing <tt>FWS</tt> separators around <tt>=</tt>.)
        ABNF<xref target="RFC5234"/>:
      </t><sourcecode><![CDATA[
acdc = %x61 %x63 %x64 %x63 "=" sequence ":" 1*(flag) ":" [id] ":"
sequence = 1*3DIGIT; DIGIT from RFC 5234
flag = "A" / "a" / "D" / "E" / "O" / "P" / "R" /
       "V" / "v" / "X" / "x" / "Y" / "y" / "Z" / "z"
id = *42(ALPHA / DIGIT / "+" / "-"); optional (bounce) identifier
      ]]></sourcecode><dl>
        <dt>A</dt><dd>
          Access control is active;
          DKIM-Access-Control header(s), as below, are included.

          Once set, necessarily in combination with the O flag,
          all future DKIMACDC signatures must copy it.
          (It may be removed by a signature which claims a new
          message origin by setting the O flag.)
        </dd><dt>a</dt><dd>
          Access control is not active.
        </dd><dt>D</dt><dd>
          The message was modified at this hop,
          DKIMACDC differential changes were generated,
          and are stored in a DKIM-Diff header.

          Necessarily only in combination with the O flag.
          The Y flag has to be set.
        </dd><dt>E</dt><dd>
          SMTP<xref target="RFC5321"/>
          envelope (<tt>MAIL FROM</tt>, <tt>RCPT TO</tt>) was modified.

          Necessarily only in combination with the O flag.
          The y flag has to be set.
        </dd><dt>O</dt><dd>
          This hop claims the message origin.

          This either means that the message originated at this hop,
          in which case the signature (usually, DKIM-typical) refers to
          the first address of the From header,
          and the sequence number is 1.

          It can also mean that an intermediate hop performed modifications,
          or for other reasons claims "ownership" of the message.

          For example, a mailing-list received a message, and is now
          re-distributing it to its members,
          changing the SMTP envelope accordingly
          (and setting E and y flags).
          At the time of this writing this usually comes in conjunction
          with From header munging for DMARC mitigation,
          and often more IMF modifications
          (for example addition of a list-info footer),
          which therefore results in the necessity for differential change
          production,
          and setting the D and Y flags.

          The SMTP envelope
          <tt>MAIL FROM</tt> is adjusted to refer to the domain that claims
          ownership etc.
          Any formerly present DKIM-Access-Control header was removed.

          Access control header fields are only generated for messages
          with the O flag set.
        </dd><dt>P</dt><dd>
          Postmaster mode.
          With this flag set the behaviour of DKIMACDC borders on test
          mode in that rejections (due to DKIMACDC) must not occur.
          This keeps a communication channel open in
          situations where messages would otherwise always be rejected,
          due to misconfigurations et cetera,
          and as such reflects
          SMTP<xref target="RFC5321"/>
          section 4.5.1 Minimum Implementation.

          (If, due to some failure, the maximum sequence number would be
          exceeded by such a message, the sequence increment shall not
          be performed, even if it makes the message "more invalid".
          Implementations necessarily count the number of DKIMACDC
          instances, and may impose an absolute maximum in order to
          avoid endless message wandering aka "loops" nonetheless.)

          If the sequence number is 1,
          message recipients have to be inspected.
          If the
          IMF<xref target="RFC5322"/>
          header fields To and Cc only contain a single addressee with
          the local part
          postmaster<xref target="RFC1123"/>,
          and if the same "postmaster" is addressed as a
          SMTP<xref target="RFC5321"/>
          <tt>RCPT TO</tt> recipient,
          and if no more than two <tt>RCPT TO</tt> recipients exist in
          total,
          then the P flag has to be set.

          Once set, all future DKIMACDC signatures must copy it.
          (It may be removed by a signature which claims a new
          message origin by setting the O flag.)
        </dd><dt>R</dt><dd>
          Reputation check to collect
          organizational trust (<xref target="RFC5863"/>, section 2.5)
          along the signature chain was performed.

          On top of the V flag this means that all differential changes
          have been applied,
          and all signatures along the chain have been verified,
          and the entire chain validated correctly.

          Only in signatures with sequence numbers greater than 1,
          and without the Z or z flags (in earlier signatures).
        </dd><dt>V</dt><dd>
          DKIMACDC signature verified successfully.
          This means that the signature with the highest sequence number
          has been verified correctly,
          that the sequence of DKIMACDC signatures is complete,
          and their flags make sense (in the sequence).

          In conjunction with the flag R even deeper inspection was
          performed.

          Only in signatures with sequence numbers greater than 1.
        </dd><dt>v</dt><dd>
          DKIM signature verified successfully.

          In signatures with sequence number 1,
          then missing the O flag,
          it means the message originated at a non-DKIMACDC-aware host,
          and normal DKIM processing was performed and succeeded.
          Unless DKIM processing succeeded for the DKIM signature which
          covered the message's From header address,
          the Z flag must be set, otherwise the z flag.

          In messages with higher sequence numbers it comes alongside
          the X flag: necessarily the DKIMACDC chain was broken, and the
          message changed, by an intermediate non-DKIMACDC-aware hop.
          The z flag must be set.
        </dd><dt>X</dt><dd>
          DKIMACDC verification failed;
          however, the normal DKIM signature verification was performed,
          and succeeded.

          The z flag must be set.
        </dd><dt>x</dt><dd>
          DKIM verification failed.

          In signatures with sequence number 1,
          then missing the O flag,
          it means the message originated at a non-DKIMACDC-aware host,
          and normal DKIM processing was performed and failed.
          The z flag must be set.

          In messages with higher sequence numbers it comes alongside
          the X flag: necessarily the DKIMACDC chain was broken, and the
          message changed, by an intermediate non-DKIMACDC-aware hop.
          The z flag must be set.
        </dd><dt>Y</dt><dd>
          The message has seen
          IMF<xref target="RFC5322"/>
          modifications:
          somewhere along the chain the original message data was modified.

          Once set, all future DKIMACDC signatures must copy it.
        </dd><dt>y</dt><dd>
          The message has seen
          SMTP<xref target="RFC5321"/>
          envelope modifications:
          somewhere along the chain the original envelope was modified.

          Once set, all future DKIMACDC signatures must copy it.
        </dd><dt>Z</dt><dd>
          Announces the DKIMACDC chain is incomplete.
          The message was processed by DKIMACDC unaware hops.
          However, the message verifies correctly and seems to have
          never been modified non-reversibly.

          Once set, all future DKIMACDC signatures must copy it,
          unless later downgraded to the z flag.
        </dd><dt>z</dt><dd>
          The message has seen non-reversible modifications,
          and cannot be cryptographically verified back to its origin.

          Once set, all future DKIMACDC signatures must copy it.

          If this flag is set DKIMACDC loses its decisive meaning
          and "degrades" to normal DKIM:
          no more differential data is generated,
          and messages are distributed further / accepted if just any
          DKIM(ACDC) signature verifies.
          (Software configuration <bcp14>MAY</bcp14> allow otherwise.)
        </dd><dt>id</dt><dd>
          The optional "bounce identifier" offers enough room to store
          Universally Unique IDentifiers<xref target="RFC9562"/>.

          It <bcp14>MAY</bcp14> be generated to help sending domains
          to uniquely identify messages within the DKIM t= and x= time delta,
          as well as to ensure that successively sent identical messages
          are not detected as the same.

          Receiving domains should not use this identifier due to the
          denial of service attack surface,
          regardless of collected organizational trust (see R flag).
        </dd>
      </dl><t>
        Unknown flags <bcp14>MUST</bcp14> be ignored.
        Invalid flag combinations and flag misuse <bcp14>MUST</bcp14>
        result in rejection with SMTP reply code 550; if
        enhanced status codes<xref target="RFC3463"/>
        are used, 5.5.4 <bcp14>MUST</bcp14> be used.
        (This includes the P flag upon incorrect use.)
      </t>
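      <t>
        <em>Informative example:</em>
        The following Python fragment is a minimal, non-normative sketch of
        how a verifier might split an acdc= tag value, derive the next
        sequence number, and test some of the flag rules given above.
        All names in it
        (<tt>parse_acdc</tt>, <tt>next_sequence</tt>,
        <tt>flag_violations</tt>, <tt>AcdcReject</tt>)
        are hypothetical and not part of this specification.
        It assumes that any ALPHA character is accepted syntactically so
        that unknown flags can be ignored,
        it only covers those flag dependencies which are stated explicitly
        for the D, E, R, V, X and x flags,
        and it does not model the relaxed treatment under the P flag.
      </t><sourcecode type="python"><![CDATA[
import re

KNOWN_FLAGS = set("AaDEOPRVvXxYyZz")
MAX_SEQUENCE = 999

# sequence ":" 1*(flag) ":" [id] ":" as in the ABNF above; any ALPHA is
# accepted as a flag here (an assumption) so unknown flags can be ignored.
ACDC_RE = re.compile(r"([0-9]{1,3}):([A-Za-z]+):([A-Za-z0-9+-]{0,42}):")


class AcdcReject(Exception):
    """Hypothetical error type; a caller would map it to SMTP 550."""


def parse_acdc(value):
    """Split an acdc= tag value into (sequence, known flags, id)."""
    m = ACDC_RE.fullmatch(value)
    if m is None or int(m.group(1)) < 1:
        raise AcdcReject("malformed acdc= value")
    return int(m.group(1)), set(m.group(2)) & KNOWN_FLAGS, m.group(3)


def next_sequence(seen):
    """Return the sequence number for a new signature.

    `seen` holds the sequence numbers of all DKIMACDC signatures in the
    message; signatures that differ only in algorithm share a number.
    """
    if not seen:
        return 1
    highest = max(seen)
    if set(seen) != set(range(1, highest + 1)):
        raise AcdcReject("sequence hole")
    if highest + 1 > MAX_SEQUENCE:
        raise AcdcReject("sequence overflow")
    return highest + 1


def flag_violations(sequence, flags):
    """List violations of the flag rules that the text states explicitly."""
    problems = []
    for flag, needed in (("D", "OY"), ("E", "Oy"), ("X", "z"), ("x", "z")):
        if flag in flags and not set(needed) <= flags:
            problems.append(flag + " requires " + needed)
    if sequence == 1 and flags & {"R", "V"}:
        problems.append("R and V need a sequence number greater than 1")
    return problems
]]></sourcecode>
      <t>
        With this sketch, <tt>parse_acdc("2:ODY:abc-1:")</tt> yields the
        sequence number 2, the flags O, D and Y, and the identifier
        "abc-1";
        a caller would translate <tt>AcdcReject</tt>, or a non-empty
        result of <tt>flag_violations</tt>, into the rejection described
        above.
      </t>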
    </section>

    <section>
      <name>The DKIM-Store header</name>
      <t>
        The DKIM-Store header has no meaning in the email system.
        The sole purpose of mentioning it is to announce that it
        <bcp14>MUST</bcp14> be removed when messages enter
        and leave the email system.

        It could for example be temporarily created and used by
        non-integrated mail filter (milter) software to pass
        informational data in between the "ingress" and the "egress"
        processing side.
        To guard against software bugs and possible configuration errors
        this specification enforces removal of all occurrences.

        It is suggested to encrypt data passed around in this temporary
        header with a key internal to the "local" email processing
        system in order to achieve locality.
      </t>
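      <t>
        <em>Informative example:</em>
        As a minimal, non-normative sketch,
        removal of all occurrences could look as follows when using the
        Python standard library;
        the function name <tt>strip_dkim_store</tt> is hypothetical.
        A real filter would more likely operate on the raw header block in
        order not to disturb the byte representation that DKIM signatures
        cover.
      </t><sourcecode type="python"><![CDATA[
from email import message_from_string


def strip_dkim_store(raw_message):
    """Parse a message and drop every DKIM-Store header field."""
    msg = message_from_string(raw_message)
    # Deleting by name removes all occurrences, case-insensitively,
    # and raises no error if the field is absent.
    del msg["DKIM-Store"]
    return msg
]]></sourcecode>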
    </section>

    <section>
      <name>Access Control</name>
      <t>
        DKIM replay attacks have been reported,
        where messages with valid DKIM signatures were repeatedly sent
        to receivers not initially addressed by the sender.
        That is: because the sent
        IMF<xref target="RFC5322"/>
        message does not include the Bcc header field,
        and, to be exact, because the actual
        SMTP<xref target="RFC5321"/>
        <tt>RCPT TO</tt> recipients are not included at all,
        DKIM does not cover the real set of message receivers:
        effectively any malicious party can use the validatable message
        with any possible SMTP recipient.
      </t><t>
        Whereas DKIM x= signature validity expiration tags can
        (<bcp14>MUST</bcp14> with ACDC as below) be used,
        the stamina and forgiveness of SMTP,
        owed to the necessity to deliver messages to receivers in
        various conditions,
        requires an expiration timestamp that leaves plenty of time for
        malicious players to misuse messages with valid signatures.
      </t><t>
        In addition the actual
        SMTP<xref target="RFC5321"/>
        <tt>MAIL FROM</tt>
        sender is not covered by DKIM:
        any intermediate hop can (use the validatable message and) cause
        bounces to any possible <tt>MAIL FROM</tt> (backscatter bounce).
      </t><t>
        Access control addresses replay and backscatter bounces.
        When signing as an originator (O flag set),
        all distinct domain-names found within the list of intended
        SMTP <tt>RCPT TO</tt> addressees are collected.
        Thereafter the DKIMACDC state of all found domains is queried,
        by looking up their _dkimacdc DNS entry, as below.

        For any domain that announces DKIMACDC support
        the completely prepared message,
        including the readily prepared DKIM-Signature(s), is duplicated,
        the A flag is set,
        (a) dedicated DKIM-Access-Control header(s) is/are created and
        prepended,
        and the resulting domain-specific message is sent to the logical
        recipient subset.
      </t><blockquote>
        <em>Informative remark:</em>
        Dedicated DKIM-Signatures are necessary:
        if the message is also sent to a domain which does not
        support DKIMACDC, but which forwards the message to a domain
        which does, that destination would otherwise falsely assume the
        presence of access control.

        To simplify per-receiver-domain message creation the
        DKIM-Signature header(s) can be readily prepared except for toggling
        the single flag byte a to A, and, of course, creation of the
        cryptographic signature itself.
      </blockquote><t>
        To address replay attacks by man-in-the-middle
        the DKIM x= tag <bcp14>MUST</bcp14> be used
        in order to allow receiver domains to manage a message identity cache.

        The maximum t= to x= delta <bcp14>MUST NOT</bcp14> be greater than
        864000 seconds (ten days: to reach into the next working week).
        Example delta values for tag auto-generation may be the bounce defaults
        432000 seconds (five days: used for example by
        the Mailman2 and mlmmj mailing-list managers and the postfix MTA),
        345600 seconds (four days: OpenSMTPD MTA),
        172800 seconds (two days: Exim MTA).

        DKIMACDC aware receivers must keep a cache of received message
        identities to address this kind of replay attack during delta validity.
        (The DKIM-Access-Control header's signature appears to be a natural
        cache key source, but see below.)

        In order to keep things simple,
        and the cache a write-once data structure,
        DKIMACDC senders <bcp14>MUST NOT</bcp14> generate per-receiver-domain
        messages with more than the 100 recipients that
        SMTP<xref target="RFC5321"/>
        section 4.5.3.1.8 guarantees as a minimum:
        if more recipients need to be addressed on a single domain,
        multiple messages with recipient subsets must be generated:
        this way each message is "atomic",
        and it is ensured the recipients of the SMTP envelope are all
        included in the DKIMACDC access-control signature, and vice versa.
        SMTP MTAs of domains which announce DKIMACDC <bcp14>MUST</bcp14>
        conform to
        SMTP<xref target="RFC5321"/>
        (section 4.5.3.1.8).
      </t><blockquote>
        <em>Informative remark:</em>
        Implementations <bcp14>MAY</bcp14> offer configuration options
        to specify other recipient limits.
        For example, they may offer domain whitelist settings which can be used
        to bundle domains with higher limits.
        This way the much higher limits in actual use
        (for example, the Exim MTA has a default limit of 50000)
        can be utilized.
<cref>The _dkimacdc DNS entry *could* announce a definitive limit of whatever sort!</cref>
      </blockquote><blockquote>
        <em>Informative remark:</em>
        Space constraints resulting from maintaining an identity cache
        may be addressed by timing out entries with a granularity of
        minutes or hours rather than seconds,
        by partitioning the cache through DKIM d= tag values,
        and by using the output of a message digest that is resistant to
        hash attacks
        instead of message (access-control signature) content data for keys.
        To selectively garbage collect cache entries on memory shortage,
        collected reputation (see R flag) may be used.
      </blockquote><t>
        A DKIMACDC-enabled and -announcing domain that receives a message
        with a set A flag <bcp14>MUST</bcp14> reject the message
        with SMTP reply code 550 unless
        it contains (a) DKIM-Access-Control header(s) dedicated to itself;
        if
        enhanced status codes<xref target="RFC3463"/>
        are used, 5.5.4 <bcp14>MUST</bcp14> be used.

        It <bcp14>MUST</bcp14> also reject messages which fail the
        signature, condition and flag check verification of such a header with
        SMTP reply code 550;
        the enhanced status code <bcp14>MUST</bcp14> be 5.7.7.

        Senders <bcp14>MAY</bcp14> use
        Delivery Status Notifications<xref target="RFC3461"/>
        to fine-tune the resulting behaviour.
      </t>
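      <t>
        <em>Informative example:</em>
        The sender-side partitioning, the t= to x= delta constraint and
        the receiver-side identity cache described above can be sketched
        as follows.
        This is a non-normative illustration under stated assumptions:
        all names
        (<tt>partition_recipients</tt>, <tt>expiry_ok</tt>,
        <tt>IdentityCache</tt>)
        are hypothetical,
        the DNS lookup that decides whether a domain announces DKIMACDC is
        omitted,
        and SHA-256 over the access-control signature is only one possible
        choice of cache key.
      </t><sourcecode type="python"><![CDATA[
import hashlib
from collections import defaultdict

MAX_RCPT = 100       # minimum guaranteed by RFC 5321, section 4.5.3.1.8
MAX_DELTA = 864000   # ten days: upper bound for x= minus t=


def partition_recipients(rcpts, limit=MAX_RCPT):
    """Group RCPT TO addresses by domain, at most `limit` per message."""
    by_domain = defaultdict(list)
    for addr in rcpts:
        local, _, domain = addr.rpartition("@")
        by_domain[domain.lower()].append(local)
    batches = []
    for domain, local_parts in by_domain.items():
        for i in range(0, len(local_parts), limit):
            batches.append((domain, local_parts[i:i + limit]))
    return batches


def expiry_ok(t, x):
    """True if x= lies after t= and within the permitted delta."""
    return 0 < x - t <= MAX_DELTA


class IdentityCache:
    """Write-once cache of message identities, keyed by a digest."""

    def __init__(self):
        self._expiry = {}

    def check_and_add(self, signature, x, now):
        """Return True for a new identity, False for a replay."""
        # Drop entries whose x= expiration has passed.
        for key in [k for k, exp in self._expiry.items() if exp <= now]:
            del self._expiry[key]
        key = hashlib.sha256(signature.encode("ascii")).digest()
        if key in self._expiry:
            return False
        self._expiry[key] = x
        return True
]]></sourcecode>
      <t>
        In this sketch 250 recipients on one domain result in three
        messages with 100, 100 and 50 recipients,
        and a second delivery carrying the same access-control signature
        within the validity period is reported as a replay.
      </t>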

      <section>
        <name>The DKIM-Access-Control header</name>
        <t>
          The presence of this header empowers the receiving domain to
          cryptographically verify
          that it is indeed the correct destination domain,
          and that any given
          SMTP<xref target="RFC5321"/>
          <tt>RCPT TO</tt> was indeed addressed by the message sender,
          which indeed is the one mentioned in
          <tt>MAIL FROM</tt>;
          if the header included does not contain a superset of the SMTP
          envelope list,
          the message <bcp14>MUST</bcp14> be rejected with SMTP reply
          code 550; if
          enhanced status codes<xref target="RFC3463"/>
          are used, 5.5.4 <bcp14>MUST</bcp14> be used;
          or instead 5.7.7 if signature verification failed.
        </t><t>
          This header is to be sent only as part of exclusive and
          dedicated message instances, as documented above;
          it <bcp14>MUST</bcp14> be removed by the destination domain
          as soon as possible;
          it <bcp14>MUST NOT</bcp14> be delivered by local delivery
          agents as part of the message,
          and it <bcp14>MUST NOT</bcp14> be part of a rejected message.

          Any instance of such a header that is not targeted to the
          destination domain indicates an error
          and <bcp14>MUST</bcp14> result in message rejection with SMTP
          reply code 550; if
          enhanced status codes<xref target="RFC3463"/>
          are used, 5.5.4 <bcp14>MUST</bcp14> be used.
        </t><t>
          The syntax of this header is a semicolon separated list.

          It starts with the sequence number of the DKIM-Signature
          to which it links.
          As there may be multiple DKIM signatures with the same
          sequence number, which differ only in the used algorithm,
          multiple DKIM-Access-Control header fields may be generated;
          in any case the linked signature(s) necessarily <bcp14>MUST</bcp14>
          have the O flag set.

          The sequence number is followed by the selector value of the s=
          tag of the corresponding DKIM-Signature;
          the actual algorithm can be deduced from there.

          The next field is reserved for later extension;
          it <bcp14>MUST</bcp14> be skipped over.
          (It may include the string "VERP" to indicate variable envelope
          return path addresses at some later time.)

          Thereafter follows the
          SMTP<xref target="RFC5321"/>
          <tt>MAIL FROM</tt>
          of the covered message,

          the receiver domain name which is addressed,

          followed by all SMTP
          <tt>RCPT TO</tt>
          local-parts of the receiver domain actually addressed by the message.

          The list is concluded with the cryptographic signature which
          has been generated on the DKIM "relaxed" normalized content
          of the DKIM-Access-Control header up to, and including, the
          semicolon that precedes the signature.

          <em>Warning:</em>
          SMTP<xref target="RFC5321"/>
          address local-parts permit quoted-strings.
        </t>
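        <t>
          <em>Informative example:</em>
          A receiver-side check that the SMTP envelope is covered by the
          header might be sketched as follows.
          This is a non-normative illustration:
          the function names are hypothetical,
          it assumes that every local-part occupies its own
          semicolon-separated field and that no quoted-string local-parts
          occur (see the warning above),
          and the cryptographic verification of the trailing signature is
          left out entirely.
        </t><sourcecode type="python"><![CDATA[
def parse_access_control(value):
    """Split a DKIM-Access-Control header value into its fields."""
    fields = [f.strip() for f in value.split(";")]
    # sequence, selector, reserved, MAIL FROM, domain,
    # at least one local-part, signature
    if len(fields) < 7:
        raise ValueError("too few fields")
    sequence, selector, _reserved, mail_from, domain = fields[:5]
    return {
        "sequence": int(sequence),
        "selector": selector,
        "mail_from": mail_from,
        "domain": domain.lower(),
        "local_parts": fields[5:-1],
        "signature": fields[-1],
    }


def envelope_covered(header, mail_from, rcpt_to):
    """True if MAIL FROM matches and every RCPT TO is listed."""
    if header["mail_from"] != mail_from:
        return False
    allowed = set(header["local_parts"])
    for addr in rcpt_to:
        local, _, domain = addr.rpartition("@")
        if domain.lower() != header["domain"] or local not in allowed:
            return False
    return True
]]></sourcecode>
        <t>
          A negative result of <tt>envelope_covered</tt> corresponds to the
          case in which the header does not contain a superset of the SMTP
          envelope list.
        </t>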
      </section>

      <section>
        <name>The _dkimacdc.DOMAIN DNS TXT RR</name>
        <blockquote>
          <em>Apologies.</em>
          We now come to the reason why this proposal does not work in
          today's totally distorted state of the email infrastructure,
          including IETF's very own email system
          (ok: bug tracker; not ok: other mailing-lists I know).

          The problem is that DKIMv1 signatures may be consciously broken,
          or even removed completely (or renamed; for example,
          mailman2 may rename to X-Mailman-Original-DKIM-Signature),
          along the path from the sender to the receiver domain.

          This (also) depends on the DMARC state of the sender etc.

          In any case this destroys DKIMACDC chains.

          This is why I once claimed that the DNS RR of DMARC
          is the sole use case for it I can think of:
          if, instead of _dkimacdc, we extended the DMARC RR to include
          things necessary for ACDC, so that everybody who wants ACDC has to
          actually provide a (necessarily =reject) DMARC DNS entry, then things
          would be different.

          The only other possibility would be to create a new header field,
          say, DKIM2, because the infrastructure does not know that yet.
          It could be absolutely identical to the DKIM signature, though.
        </blockquote>
        <t>
          The format of this DNS resource record mirrors the syntax of
          DKIM<xref target="RFC6376"/>
          section 3.5 on the DKIM-Signature header,
          with the exception that <tt>FWS</tt> separation is not allowed;
          supported are the tags v= and a=
          (other tags <bcp14>MUST</bcp14> be ignored),
          however, v= is optional,
          and none to multiple a= tags <bcp14>MAY</bcp14> exist.
          They indicate, in descending order, the most desirable algorithms
          for this domain,
          and that the domain prefers to receive DKIM-Access-Control
          (and DKIM-Signature, if applicable)
          header fields of the best fit algorithm.

          This can avoid unnecessary signature instances of undesired
          algorithms in case a domain normally produces signatures with
          multiple algorithms:
          it is only a hint to reduce the processing cost of senders,
          and has no meaning beyond this.
          Senders <bcp14>MUST</bcp14> be capable of following DNS CNAME chains
          when looking up this DNS RR.

<cref>It could or should or must announce a definitive recipient limit of whatever sort!</cref>
        </t>
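        <t>
          The following is a minimal, non-normative sketch of how a sender
          might evaluate such a record.
          The function names and the sample record are hypothetical;
          the sketch only assumes what is stated above:
          a semicolon-separated tag list without <tt>FWS</tt>,
          zero or more a= tags in descending order of preference,
          and all other tags being ignored.
        </t>
        <sourcecode type="python"><![CDATA[
def parse_dkimacdc_rr(txt):
    """Return the a= values of a _dkimacdc TXT RR in order of appearance."""
    algorithms = []
    for spec in txt.split(";"):
        if not spec:
            continue  # tolerate an empty element, e.g. after a trailing ";"
        name, sep, value = spec.partition("=")
        if sep and name == "a" and value:
            algorithms.append(value)
        # v= is optional; it and all other tags are ignored in this sketch
    return algorithms


def best_fit(preferred, supported):
    """Return the domain's most desirable algorithm the sender supports."""
    for algorithm in preferred:
        if algorithm in supported:
            return algorithm
    return None  # it is only a hint: the sender falls back to its own choice


# Hypothetical record content; x=y stands for any unknown tag.
preferred = parse_dkimacdc_rr("a=ed25519-sha256;a=rsa-sha256;x=y")
choice = best_fit(preferred, {"rsa-sha256"})
]]></sourcecode>
        <t>
          Because the record is only a hint, the sketch returns no algorithm
          instead of failing when there is no overlap.
        </t>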
      </section>
    </section>

    <section>
      <name>Differential Changes</name>
      <t>
        DKIM signatures were never designed to work with the existing
        mailing-list infrastructure,
        which often tags message subjects and/or appends footers
        (headers are supposed to be more of a theoretical issue).
        With the advent of some supplementary standard which worked
        around the DKIM
        "signature verification failure does not force rejection"
        paradigm,
        the resulting DKIM signature verification failures started to
        cause non-deliveries.
        Mailing-list software adapted by rewriting
        the From header in order to avoid breakage of the sender's
        signature.
        Further standards were then developed that tried to restore the
        trust lost through those modifications, which had themselves been
        introduced so that the forced signature breakage would not cause
        message delivery breakage.
      </t><t>
        This specification adds the creation of differential changes,
        which can be applied in reverse order of creation, and therefore
        be used to cryptographically verify all intermediate changes
        back to the original version as sent by the sender.

        Whenever a DKIMACDC enabled domain breaks a message signature,
        for example if a mailing-list tags the subject
        and adds a message footer,
        a corresponding DKIM-Diff header has to be created,
        accompanied by flag changes as described above.

        All existing DKIM-Diff header fields <bcp14>MUST</bcp14> be
        included in DKIMACDC enabled DKIM-Signatures.
      </t><blockquote>
        <em>Informative remark:</em>
        It follows that the "changes cause a new message"
        paradigm of today's DKIM/DMARC usage stays intact.

        It is deemed correct behaviour:
        <em>Note that a message sent to a mailing list is addressed to
        a mailing list.
        It is not addressed to the 'final' recipients.
        That additional addressing is done by the mailing list, not the
        original author.
        This is a rather stark demonstration that the intermediary has
        taken delivery and then re-posted the message.</em>
        (Dave Crocker.)

        However, DKIMACDC allows for cryptographically verifying the
        original message, and therefore can overcome the trust problem
        incurred by those "correct" changes,
        which of course break the DKIM signature of the original
        message.
      </blockquote><blockquote>
        <em>Informative remark:</em>
        Today many mailing-list instances re-encode message data for
        policy reasons, needlessly: for example from some 7-bit clean
        content-transfer-encoding to 8-bit, or anything into base64 (as
        below).

        This policy usually enlarges the differential
        changes on at least the first level (which is most often
        the only level involved; the effect also depends on the content of
        the original message).
        This negative impact can thus easily vanish upon a policy change.
      </blockquote>

      <section>
        <name>The DKIM-Diff header</name>
        <t>
          The DKIM-Diff header consists of a sequence number that
          links it with (a) DKIMACDC enabled DKIM-Signature header field(s),
          followed by a semicolon,
          and the result of the BSDiff differential algorithm, as below.

          The input to this algorithm is the
          DKIM<xref target="RFC6376"/>
          "relaxed" normalized header and body content,
          separated by an empty (normalized) line,
          alongside the equally normalized version present before
          modifications took place.
        </t><blockquote>
          <em>Informative remark:</em>
          For non-integrated systems, for example mail filters,
          the DKIM-Store header can be used to pass around the necessary
          data in between the ingress side that sees the original
          message,
          and the egress side which will dispatch the modified variant.
        </blockquote><t>
          All header fields covered by the DKIM-Signature
          <bcp14>MUST</bcp14> be included,
          as <bcp14>MUST</bcp14> be all
          MIME<xref target="RFC2045"/>
          related header fields,
          regardless of their normal inclusion in the DKIM-Signature.

          MIME related header fields <bcp14>MUST</bcp14> be regularly
          included in DKIM signatures to avoid the otherwise existing
          attack surface against the MIME structure through maliciously
          injected header fields and body content.

          All DKIMACDC-enabled DKIM-Signature header fields
          <bcp14>MUST</bcp14> be included,
          as <bcp14>MUST</bcp14> be all DKIM-Diff ones.

          The header fields <bcp14>MUST</bcp14> be sorted by name,
          comparing byte values,
          and within each resulting subgroup of equally named fields
          the instances <bcp14>MUST</bcp14> appear in the order
          defined by
          DKIM<xref target="RFC6376"/>
          section 5.4.2, Signatures Involving Multiple Instances of a Field.

          Other than that the advice of
          DKIM<xref target="RFC6376"/>,
          section 5.4.1, on recommended signature content, still applies,
          but is hereby extended with the
          Author Header<xref target="RFC9057"/>.
        </t><blockquote>
          <em>Informative remark:</em>
          Since DKIMACDC is meant to (effectively) incur only minimal
          changes on the software side, it does not change the way
          existing DKIM software verifies or creates signatures in general.

          To integrate this extension into the existing infrastructure it
          seems best to accept a small overhead in the highly compressible
          BSDiff control data,
          instead of introducing expensive prefiltering processing costs,
          for example, by grouping "old" and "new" header fields.

          Note also that in mail filters the name and the content of
          header fields pass by as distinct data arrays, for example,
          so that the necessary control structures for the sorting algorithm
          as above can be implemented more efficiently than it sounds at
          first,
          and alongside the normal processing.
        </blockquote><blockquote>
          <em>Informative remark:</em>
          When undoing a modification by applying the output of the
          algorithm as a patch,
          care should be taken despite the cryptographic verifiability:
          the result again must be a
          "relaxed" normalized header and body content,
          separated by an empty (normalized) line.
        </blockquote>
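        <t>
          As a non-normative illustration, the assembly of the algorithm
          input described in this section can be sketched as follows.
          The function names are hypothetical,
          the selection of which header fields to include is left to the
          caller,
          and the bottom-to-top order within a subgroup is this sketch's
          reading of
          DKIM<xref target="RFC6376"/>
          section 5.4.2.
          The two helper functions follow the "relaxed" canonicalization of
          sections 3.4.2 and 3.4.4 of that document.
        </t>
        <sourcecode type="python"><![CDATA[
import re


def relaxed_header(name, value):
    """One "relaxed" header line (RFC 6376, 3.4.2), without trailing CRLF."""
    value = re.sub(rb"\r\n(?=[ \t])", b"", value)        # unfold
    value = re.sub(rb"[ \t]+", b" ", value).strip(b" ")  # collapse, trim WSP
    return name.strip(b" \t").lower() + b":" + value


def relaxed_body(body):
    """The "relaxed" body (RFC 6376, 3.4.4)."""
    lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ")
             for line in body.split(b"\r\n")]
    while lines and lines[-1] == b"":
        lines.pop()  # ignore empty lines at the end of the body
    return b"".join(line + b"\r\n" for line in lines)


def diff_input(fields, body):
    """Build one side of the BSDiff input.

    fields: (name, value) byte string pairs, top to bottom as in the message.
    """
    canon = [relaxed_header(name, value) for name, value in fields]
    # Reversing first, then sorting stably by name, keeps equally named
    # fields in bottom-to-top order (assumed reading of section 5.4.2).
    canon.reverse()
    canon.sort(key=lambda line: line.split(b":", 1)[0])
    headers = b"".join(line + b"\r\n" for line in canon)
    return headers + b"\r\n" + relaxed_body(body)
]]></sourcecode>
        <t>
          The same function would be applied to the message before and after
          modification;
          the two results form the pair of inputs for the differential
          algorithm.
        </t>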
      </section>

      <section>
        <name>The BSDiff differential algorithm</name>
        <t>
          Differences are generated with the BSDiff algorithm of
          Colin Percival, which has excellent characteristics.

          No reimplementation of the algorithm was necessary due to the
          Open Source licenses used in all its different parts;
          instead, it was taken from the FreeBSD operating system source
          code, and slightly rearranged.

          There is a freely usable (BSD 2-clause, ISC and MIT licenses)
          plug-and-play ISO C99 and Perl implementation available
          (<eref target="https://github.com/sdaoden/s-bsdipa"/>),
          which includes further references on the algorithm.

          DKIMACDC uses a 32-bit adaption sufficient for email
          that almost halves memory requirements compared to 64-bit,
          and also produces smaller difference control data.

          The resulting binary difference is then
          ZLIB<xref target="RFC1950"/>
          compressed and encoded with
          BASE64<xref target="RFC4648"/>
          for inclusion in the DKIM-Diff header.
        </t>
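        <t>
          The compression and encoding step can be sketched as follows.
          This is a non-normative illustration with hypothetical function
          names;
          it assumes that the sequence number is written in decimal and
          that whitespace introduced by header folding may be dropped
          before decoding.
        </t>
        <sourcecode type="python"><![CDATA[
import base64
import zlib


def encode_dkim_diff(sequence, patch):
    """Return a DKIM-Diff header field value for a binary BSDiff patch."""
    payload = base64.b64encode(zlib.compress(patch, 9)).decode("ascii")
    return f"{sequence};{payload}"


def decode_dkim_diff(value):
    """Return (sequence number, binary patch) from a DKIM-Diff value."""
    sequence, sep, payload = value.partition(";")
    sequence = sequence.strip()
    if not sep or not (sequence.isascii() and sequence.isdigit()):
        raise ValueError("malformed DKIM-Diff value")
    # Assumption: whitespace in the payload only stems from header folding.
    payload = "".join(payload.split())
    return int(sequence), zlib.decompress(
        base64.b64decode(payload, validate=True))
]]></sourcecode>
        <t>
          A verifier would likely want to bound the decompressed size,
          for example by using an incremental decompressor with a length
          limit, before handing the patch to the BSDiff code.
        </t>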

        <section>
          <name>BSDiff adaption</name>
          <ul>
            <li>
              First of all: the string suffix sorting and difference creation
              approach of Colin Percival has been left unchanged.
            </li><li>
              The original is fixed to 64-bit file sizes
              and content representation.
              The adaption supports both 32-bit and 64-bit,
              selected at compile time.
              Using 32-bit almost halves memory requirements,
              and produces smaller patch control data.
              It is deemed sufficient for email purposes.
              (32-bit and 64-bit patches are not interchangeable.)
            </li><li>
              The "magic window of inspection" has been made configurable,
              from the fixed original value 8,
              which represents a perfect fit for compiler output.
              The adaption uses the default value 16,
              which is a very good fit for textual data.
              The value is, however, irrelevant on the patch application side.
            </li><li>
              In order to reduce memory usage during patch generation,
              the adaption uses a shared memory region for differential and
              extra data: the former is therefore stored in reversed order,
              top down.
              (This reduces memory usage by the size of the target data set.)
            </li><li>
              The adaption stores data in big endian (network; MSF;
              most significant byte first) instead of little endian (LSF;
              least significant byte first) byte order.
            </li><li>
              The original uses three separate bzip2 streams to serialize
              control, differential and extra data.
              The adaption separated patch generation from the I/O layer,
              which will therefore see the entire readily prepared patch data.
              DKIMACDC uses
              ZLIB<xref target="RFC1950"/>
              for patch compression.
            </li><li>
              The original header did not contain the size of the extra
              data, which was stored last, with its size implicitly extending
              to the end of the patch.
              The adaption includes the extra data size in the header,
              allowing more verification tests to be applied with only the
              header being readily parsed.
              This also enables the I/O layer to allocate perfectly sized
              memory with only the header data being available.
            </li><li>
              The adaption performs memory allocations through user provided
              callbacks.
            </li>
          </ul>
        </section>

        <section>
          <name>Patch content</name>
          <t>
            Overall, the patch consists of the header,
            followed by the control data.
            Thereafter the two byte streams of
            differential data (in reverse order)
            and extra data conclude the patch.
            The header and the control data consist of 32-bit signed integers,
            stored in big endian byte order (as above).
            The control data is a stream of tuples of three values each,
            the first denoting the length of differential data to copy in
            8-bit bytes, the second that of extra data.
            The last value denotes the number of 8-bit bytes to seek
            relatively in the data source after the copying has taken place:
            of all the values, only this one may be negative.
            The header consists of four values denoting
            the length of the control block in 8-bit bytes,
            the length of the difference data block,
            the length of the extra data block,
            concluded by the length of the original data source.
            The sum of the first three values must not exceed one less than
            the maximum positive 32-bit signed integer.
            It follows that control data copy instructions also do not exceed
            this value.
          </t>
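          <t>
            The layout described above can be checked with a few lines of
            code before any patch data is applied.
            The following non-normative sketch uses hypothetical names,
            operates on the already decompressed patch,
            and treats the size rule as an upper bound,
            which is an interpretation of the text above.
            It leaves the differential data in its stored (reversed) order
            and does not apply the patch.
          </t>
          <sourcecode type="python"><![CDATA[
import struct

INT32_MAX = 2**31 - 1
HEADER = struct.Struct(">4i")  # four big endian 32-bit signed integers
TUPLE = struct.Struct(">3i")   # (diff length, extra length, seek offset)


def parse_patch(patch):
    """Validate a decompressed patch and split it into its four parts."""
    if len(patch) < HEADER.size:
        raise ValueError("patch shorter than its header")
    ctrl_len, diff_len, extra_len, source_len = HEADER.unpack_from(patch, 0)
    if min(ctrl_len, diff_len, extra_len, source_len) < 0:
        raise ValueError("negative length in header")
    if ctrl_len % TUPLE.size != 0:
        raise ValueError("control block is not a whole number of tuples")
    if ctrl_len + diff_len + extra_len > INT32_MAX - 1:
        raise ValueError("patch exceeds the 32-bit size limit")
    if len(patch) != HEADER.size + ctrl_len + diff_len + extra_len:
        raise ValueError("header lengths do not match the patch size")

    ctrl_end = HEADER.size + ctrl_len
    diff_end = ctrl_end + diff_len
    controls = list(TUPLE.iter_unpack(patch[HEADER.size:ctrl_end]))
    diff_sum = extra_sum = 0
    for diff_n, extra_n, _seek in controls:
        if diff_n < 0 or extra_n < 0:
            raise ValueError("only the seek value may be negative")
        diff_sum += diff_n
        extra_sum += extra_n
    if diff_sum > diff_len or extra_sum > extra_len:
        raise ValueError("control data refers to more data than is present")
    return {
        "source_len": source_len,
        "controls": controls,
        "diff": patch[ctrl_end:diff_end],  # still in reversed order
        "extra": patch[diff_end:],
    }
]]></sourcecode>
          <t>
            All checks run with only the header and the control block
            parsed, which mirrors the motivation for storing the extra data
            length in the header.
          </t>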
        </section>
      </section>

      <section>
        <name>Rationale</name>
        <t>
          Differences are included to allow DKIM verifiers to restore
          previous message content for cryptographic verification
          purposes.

          Whereas user interfaces may (and should) use them to offer
          differential visualization (after signature verification,
          and with the usual precautions necessary for displaying content),
          empowering users to make decisions on the trustworthiness of
          those intermediate stations which actually incurred message
          modifications,
          the restored message data is not meant to result in a usable
          message by itself.

          For example, an embedded OpenPGP signature and its signed text
          would likely fail to verify because of DKIM normalization
          (dependent upon the original MIME transfer encoding).

          This was deemed acceptable because of the purpose of including
          differential changes,
          and because a visualization of the DKIM covered message should
          still be sufficient to allow users to make responsible
          decisions.

          Finally, the given example will likely verify as part of the
          complete received message, unless altered along the SMTP path:
          DKIMACDC can ideally say where
          (and exactly what, in an unbroken ACDC chain).
        </t><t>
          User interfaces could for example use traffic light semantics
          that unfold on click to traffic light semantics of all
          stations that a message passed,
          which would visualize differences on a further click.

          They could build complex reputation statistics based upon
          DKIMACDC verification and perceived user hints.
          This could be used to restrict DKIMACDC verification,
          to reduce complete-chain-verification to random samples.

          Further possibilities could arise should SMTP/DKIM/DKIMACDC
          remain as the only solution to email verification in the
          future.
          For example, mitigations may become a thing of the past.
        </t>
      </section>
    </section>

    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>This memo includes no request to IANA.</t>
    </section>

    <section anchor="Security">
      <name>Security Considerations</name>
      <t>
        Public-key cryptography is the safest approach to identification
        of counterparts and verification of data.
        This specification aims to make use of these attributes for
        the combined pair of SMTP and DKIM.
        It opens a door to reduction of email server maintenance and
        administration efforts,
        and to restoration of some email core aspects which got lost,
        or became a nuisance to use, over the last decade(s),
        like email forwarding and mailing-list usage.
        It may reduce implementation burden and complexity of the entire
        email infrastructure.
        It allows for building of
        organizational trust (<xref target="RFC5863"/>, section 2.5)
        that aids in decision making, to increase processing performance
        and decrease energy consumption.
        This effect is amplified if superfluous protocols vanish.
      </t>
    </section>

  </middle>
  <back>

    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4648.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6376.xml"/>
      </references>

      <references>
        <name>Informative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1123.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1950.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2045.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3461.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3463.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5234.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5321.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5322.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5863.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7208.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9057.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9325.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9369.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9562.xml"/>
      </references>
    </references>

    <section anchor="FurtherDKIMUpdates">
      <name>Further DKIM Updates</name>
      <ul><li>
        This specification obsoletes the simple canonicalization type;
        it <bcp14>MUST NOT</bcp14> be used by software announcing DKIMACDC.

        <em>Rationale:</em> in order to minimize processing cost in time and
        space for and of differential processing,
        being able to work on and with only one data representation is
        beneficial.
        The "extremely crude ASCII Art attacks" mentioned in
        DKIM<xref target="RFC6376"/>
        section 8.1 are considered to be a rather artificial attack vector.
      </li><li>
        This specification obsoletes the DKIM l= tag that restricts the
        number of DKIM covered bytes of the normalized message body.
        This tag <bcp14>MUST NOT</bcp14> be used by software announcing
        DKIMACDC support,
        and all the message body <bcp14>MUST</bcp14> always be used to
        create the body hash.

        <em>Rationale:</em> l= has always been insufficient to deal with
        message changes caused by mailing-lists etc.,
        and effectively introduces the security risk that message parts which
        are not covered by the signature appear as "valid content" to users
        looking at a DKIM verified message.
        The DKIMACDC differential changes offer a better approach to deal
        with message changes, while completely covered message bodies ensure
        content validity.
      </li><li>
        This specification obsoletes the DKIM z= tag that was defined
        "for diagnostic use" to copy a freely defined set of header fields
        and their values present during signature creation.
        This tag <bcp14>MUST NOT</bcp14> be used by software announcing
        DKIMACDC.

        <em>Rationale:</em> the DKIMACDC differential changes provide access
        to the same information distinct from the DKIM-Signature header.
      </li><li>
        For the q= tag this specification obsoletes the possible use of
        DKIM-Quoted-Printable for the optional x-sig-q-tag-args of
        possibly introduced future query types.

        <em>Rationale:</em> should a new type ever become standardized beside
        the dns/txt that has been part of DKIM from the very start,
        that standard can very well give meaning to a "hyphenated-word"
        proxy identifier without making use of byte values which would
        require encoding.
      </li><li>
        This specification obsoletes the DKIM key representation tag n=
        that was meant to include "notes that might be of interest to
        a human", "intended for use by administrators, not end users",
        and which "should be used sparingly".

        <em>Rationale:</em> no use case has been encountered in the DNS,
        let alone a serious one; if future non-space-constrained key
        providers other than DNS should ever exist and be used to
        distribute DKIM keys, it is likely that they support inclusion
        of strings via some method that need not be included in the DKIM
        key representation itself.
      </li><li>
        Because the above changes remove all use cases for the
        "dkim-quoted-printable" encoding defined in RFC 6376, section 2.11,
        this specification obsoletes the DKIM-Quoted-Printable encoding.
      </li><li>
        This specification obsoletes the use of <tt>FWS</tt> in tag-spec.
        First of all,
        MIME<xref target="RFC2045"/>
        introduced parameters in ABNF as
        <tt>parameter := attribute "=" value</tt>
        without <tt>FWS</tt>,
        and its presence complicates parsers and hinders parser code reuse.
        Second, its use was never encountered by the author.
        The acdc= tag ("parameter") is defined without <tt>FWS</tt> support.
      </li></ul>
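      <t>
        A signer or verifier could check a parsed DKIM-Signature against
        some of the restrictions above along the following lines.
        This is a non-normative sketch with a hypothetical function name;
        it assumes that the tag list has already been parsed into a mapping
        and that the caller has established that DKIMACDC is announced.
        It relies on
        DKIM<xref target="RFC6376"/>
        section 3.5, where c= defaults to "simple/simple" and a single
        value such as "relaxed" leaves the body algorithm at "simple".
      </t>
      <sourcecode type="python"><![CDATA[
def acdc_violations(tags):
    """Return the problems found in a parsed DKIM-Signature tag mapping."""
    problems = []
    # Without c=, or with only one algorithm named, "simple" would be
    # in effect for at least the body, so both parts must be explicit.
    if tags.get("c", "simple/simple") != "relaxed/relaxed":
        problems.append("c= must select relaxed/relaxed")
    for name in ("l", "z"):
        if name in tags:
            problems.append(name + "= must not be used")
    return problems


ok = acdc_violations({"v": "1", "c": "relaxed/relaxed"})
bad = acdc_violations({"v": "1", "c": "relaxed", "l": "120"})
]]></sourcecode>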
    </section>

    <section anchor="Acknowledgements">
      <name>Acknowledgements</name>
      <t>
        This document contains a citation of Dave Crocker.
        Thanks to, in the order of appearance,
        Jesse Thompson,
        Richard Clayton for arguments against reliance on header stacks,
        and in favour of the numbering scheme,
        and especially for noticing the partial transaction replay attack
        problem,
        Douglas Foster,
        Michael Thomas for explicit man-in-the-middle replay addressing;
        Alessandro Vesely inspired the explicitness of the E flag.
        A big fat acknowledgment is due to Murray S. Kucherawy.
        Special thanks to Klaus Schulze, Manuel Goettsching,
        both also as Ash Ra Tempel,
        Laeuten der Seele,
        Laurent Garnier,
        as well as the Sleeping Environmental Bot broadcast.
      </t>
    </section>

 </back>
</rfc>
<!-- vim:set tw=1000:s-ts-mode -->
