DKIM Access Control and Differential Changes

steffen@sdaoden.eu

General Internet Engineering Task Force DKIM This document specifies a bundle of DKIM (RFC 6376) adjustments and extensions. They do not interfere with the currently deployed processing environment that includes DKIM, ARC, DMARC and SPF, and are as such backward compatible. Their ultimate aim is however to slim down the email environment that needs to be administered and maintained, by establishing mutual agreements between sender and receiver(s) that are verifiable through public-key cryptography, and by letting the SMTP protocol base its decisions only upon these.

Introduction Public-key cryptography is used for secure transactions on many levels, and in many protocols. For example, transport layer security (TLS) is used to provide encrypted data exchange. These mechanisms are omnipresent, desired where optional, even enforced by standard means, and newer IETF transports, like QUIC, exist only in conjunction with them. The usual mode of operation of such a mechanism is that if no trust can be established through the public-key cryptography in use, then the operation is cancelled. It simply does not happen. DKIM, on the other hand, defined as one of its core details that "signature verification failure does not force rejection". Yet email operators have such a pressing need to enforce policy that a plethora of extensive accompanying standards surrounding SMTP and DKIM was developed, among which are ARC, DMARC and SPF. The reality is that the complexity of email setup, and the administrative effort, have increased massively over the last decade and more, so much so that many small commercial and private operators have ceased to exist, that is, have turned away from providing their own service. The reality is also that those which still exist do not follow so-called IETF progress out of a belief in improving the situation; instead they wait until interoperability problems arise, especially with the giant email players, before searching for minimally invasive solutions. These are usually found by searching the internet, often by copying and pasting shared configuration snippets. Moreover, some of the mentioned standards introduce massive complications for decades-old habits and usage patterns. For example, universities and other "groupings" offer their members email addresses, and then forward received email to "real addresses". This is made impossible by SPF if it is taken at its word (RECOMMENDED), which, depending upon the software implementation or configuration, it often is.
Non-standardized solutions, like the "Sender Rewriting Scheme" for the given example, are then developed and implemented out of the sheer necessity to keep a grown infrastructure in a usable state. Often these solutions are imperfect. In any case they try to circumvent a defect of an IETF standard, in an onion-like environment of standards that has no other desire, if one leaves aside all those masses of "reporting" capabilities that IETF standards developed, than to provide reliable and trustworthy verification of the sender / receiver relationship and the communicated data. What this specification tries to achieve is, on the one hand, to provide a path to less complexity and to easier maintenance and administration. On the other hand it tries to solve the issues which still exist, regardless of the sheer number of IETF standards invented to improve the situation.

DKIMACDC DKIM Access Control and Differential Changes is announced by adding an acdc= tag to DKIM-Signature. ABNF:

The DKIM-Store header field The DKIM-Store header has no meaning in the email system. The sole purpose of mentioning it is to announce that it MUST be removed when messages enter and leave the email system. It could for example be temporarily created and used by non-integrated mail filter (milter) software to pass informational data between the "ingress" and the "egress" processing side. But to guard against software bugs and possible configuration errors this specification makes it a MUST to remove any occurrence. It is suggested to encrypt and base64 encode data passed around in this temporary header, so that it can be assured that only email-system internal semantics are contained.
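INFORMATIVE NOTE: the removal requirement above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name strip_dkim_store is hypothetical, and the sketch assumes a message with CRLF line endings. It works on raw bytes on purpose, since re-serializing a parsed message could alter other header fields and thereby break existing signatures.

```python
def strip_dkim_store(raw: bytes) -> bytes:
    """Drop every DKIM-Store header field (including folded lines).

    Hypothetical helper; assumes CRLF line endings. The body and all
    other header fields are passed through byte for byte.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    kept = []
    dropping = False
    for line in head.split(b"\r\n"):
        if line[:1] in (b" ", b"\t"):
            # Continuation line: shares the fate of the field it belongs to.
            if not dropping:
                kept.append(line)
            continue
        # Header field names are compared case-insensitively.
        name = line.split(b":", 1)[0].strip().lower()
        dropping = name == b"dkim-store"
        if not dropping:
            kept.append(line)
    return b"\r\n".join(kept) + sep + body
```

A filter would call this once on "ingress", before any other processing, and once more on "egress", after its own use of the header is finished.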

Access Control DKIM replay attacks have been reported, where messages with valid DKIM signatures were repeatedly sent to receivers not initially addressed by the sender. (That is, Bcc: headers are not included in messages as sent, and SMTP RCPT TO: can be inserted at will by any "man in the middle", without a possibility for further domains to detect the condition.) Whereas x= signature expiration tags can and should be used, the stamina and forgiveness of SMTP, owed to the necessity to deliver emails to receivers in various conditions, require an expiration timestamp that leaves plenty of time for malicious players to misuse messages with valid signatures. To address this issue access control is introduced. When signing a message, any distinct domain-name found within the list of message receivers is individually queried in the DNS for whether it announces DKIMACDC support. [DISCUSSION: _dkimacdc.DOMAIN record. which content? maybe really the "default DKIM key to be expected", then MUST ADAED25519 :), but surely a version number] If any domain does, an "A" flag has to be added to the acdc= tag in the DKIM-Signature. And for each domain which does, the completely prepared message, including the readily prepared DKIM-Signature that is, is forked, a dedicated DKIM-Access-Control header is created and prepended, and the resulting domain-specific message is sent. If a DKIMACDC enabled sender sends a message with the A flag set, a DKIMACDC enabled receiver domain that announces so in the DNS MUST reject messages which do not contain a DKIM-Access-Control header dedicated to itself. [DISCUSSION: SMTP REPLY CODE 550, but which extended: RFC 7372? Text!] It MUST also reject messages which fail the signature verification of such a header.
[DISCUSSION: we need several different extended reply codes, which requires an SMTP extension to get this thing going; unfortunately basic reply codes cannot be used; it is essential that the message which comes back to the sender contains a message giving details of the different reject causes!] It MUST, however, not do so after some time after creating the corresponding DNS record. Senders MAY use Delivery Status Notifications to fine-tune the resulting behaviour.
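INFORMATIVE NOTE: the two receiver-side MUST rules above can be sketched as a small decision function. All names here (AccessVerdict, access_control_verdict and its parameters) are hypothetical. The sketch assumes that the caller has already verified the DKIM-Signature carrying the acdc= tag, has extracted its flags into a set of letters, and has located and verified any DKIM-Access-Control header dedicated to the own domain. The timing condition tied to the creation of the DNS record is not modelled. The distinct verdicts mirror the wish for distinguishable reject causes.

```python
from enum import Enum


class AccessVerdict(Enum):
    NOT_APPLICABLE = "access control does not apply"
    PASS = "dedicated header present and verified"
    REJECT_MISSING = "no DKIM-Access-Control header for this domain"
    REJECT_BAD_SIGNATURE = "DKIM-Access-Control signature failed"


def access_control_verdict(acdc_flags, receiver_announces_acdc,
                           own_header_present, own_header_verifies):
    """Map the access control rules to a verdict (hypothetical helper).

    acdc_flags: set of flag letters from a verified acdc= tag (assumed form).
    receiver_announces_acdc: True if this domain announces DKIMACDC in DNS.
    own_header_present: a header dedicated to this domain exists.
    own_header_verifies: that header's signature verified.
    """
    if "A" not in acdc_flags or not receiver_announces_acdc:
        return AccessVerdict.NOT_APPLICABLE
    if not own_header_present:
        return AccessVerdict.REJECT_MISSING
    if not own_header_verifies:
        return AccessVerdict.REJECT_BAD_SIGNATURE
    return AccessVerdict.PASS
```

Keeping the two reject causes apart in the return value leaves room for an SMTP layer to map them to different reply texts once those are defined.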

The DKIM-Access-Control header field This header is to be sent only as part of exclusive and dedicated message instances, and MUST be removed on "ingress" by the destination domain. It MUST NOT be delivered by local delivery agents as part of the message. It starts with the destination domain-name, [DISCUSSION: i hate this syntax. email addresses need to be quoted except in address headers. so what: comma-separated list, WS separated list? regular quoting as necessary?] and after a semicolon a list of all addressed local-parts follows, terminated with another semicolon. It is created just like the DKIM-Signature field in that it is readily prepared and thereafter signed; the signature is appended after the mentioned semicolon. This empowers the receiving domain to cryptographically verify not only that it is indeed the correct destination domain, but it can verify that any given SMTP RCPT TO: was indeed addressed by the message sender, wheth

Differential Changes DKIM signatures were never designed to work with the existing mailing-list infrastructure, which tags message subjects, and which often appends footers (headers are supposed to be more of a theoretical issue). With the advent of some supplementary standard which worked around the DKIM "signature verification failure does not force rejection" paradigm, the resulting DKIM signature verification failures started to cause non-delivery troubles. Mailing-list software adapted in that it started to rewrite the From header in order to avoid breakage of the sender's signature. Further standards were developed that tried to bring back the trust that was lost through those modifications, which had themselves been introduced so that the forced signature breakage would not cause message delivery breakage. This specification adds the creation of differential changes which can be used to cryptographically verify all intermediate changes back to the original version as sent by the sender. Whenever a DKIMACDC enabled sender forcefully breaks the signature of a message, for example if a mailing-list adds a message footer and a subject tag, a "D" flag has to be added to the acdc= tag in the DKIM-Signature, and a DKIM-Diff-NUMBER header has to be created and prepended to the message before a new DKIM-Signature is created for the resulting, again valid, message. All DKIM-Diff-NUMBER headers MUST be included in DKIM-Signatures. [FIXME Discussion: Richard Clayton said "there is a belief (perhaps not well-founded) that there are systems out there that reorder headers", as well as "the actual cost of numbering is very low, and it provides significant clarity and reassurance". I have never encountered the former, but the latter can of course only be agreed upon. And since DKIM-Diff- is a new header..] It is clear from the above that the "changes cause a new message" paradigm of today stays intact, at least for the foreseeable future.
The station which causes the change, for example a mailing-list, is therefore the sender as far as its DKIM-Signature is concerned, and verifying its signature lets the message succeed. As shown below user interfaces could however adapt, and could, for example, allow transparent replacements of From or Subject headers with original variants, should they and their user(s) have this desire. Integrated systems could also allow users to wave through (certain changes of) certain intermediate stations, like mailing-lists, or could otherwise always try to verify up to the sender. DKIM verification could also become restricted to only a single signature, if it is clear that older ones cannot verify (except by unrolling the changes). [DISCUSSION: work this out. it is better than that. One can SMTP reject if restoration fails!! but anyway it needs to be described in a better fashion.] How receivers deal with differential changes is otherwise not part of this specification. Until DKIMACDC has found widespread adoption, intermediate stations can break signatures without including differences that can be followed in a cryptographically verifiable way. Possibilities which arise should SMTP/DKIM/DKIMACDC remain as the only solution to email verification are not part of this specification.

INFORMATIVE NOTE: the file differences are included to allow DKIM verifiers to restore previous message content, to make it possible to cryptographically verify all message mutations back to the original message. Whereas user interfaces can (and SHOULD) make use of them by offering differential visualization, to empower users to make decisions on the trustworthiness of those intermediate stations which actually incurred message modifications, this visualization of a restored message is not meant to result in a usable message by itself. For example, if a text part contained inline OpenPGP signed text, then the DKIM normalization algorithm (whitespace reduction etc.) can render the signature invalid, that is, non-verifiable by itself. (Unless the Content-Transfer-Encoding was Base64.) It is the author's belief that this is acceptable because the purpose of including differential changes is as written, and the visualization of the DKIM covered message should still be sufficient to allow users to make responsible and correct decisions.

EXAMPLE: In the common case that a mailing-list adds a message footer, and exchanges the From: header in order to avoid the breakage of the original DKIM signature, the user can easily decide correctly by visual inspection. This is true even if the mentioned example of an inline OpenPGP signed text cannot be verified by itself, as long as the original DKIM signature of the message as sent can be verified: either the signature was broken in the original version already, or it will be verifiable in the finally received message. In any case the visualization empowers the user to give an exact hint for a particular station, best treated by the user interface only in combination with a specific message trace route. User interfaces should make differential visualization easily available, regardless of known user hints. They could for example use traffic light semantics that unfold on click to traffic light semantics of all stations that a message passed, which would visualize differences on a further click.

The DKIM-Diff-NUMBER header field [Discussion: Richard Clayton said "there is a belief (perhaps not well-founded) that there are systems out there that reorder headers", as well as "the actual cost of numbering is very low, and it provides significant clarity and reassurance". I have never encountered the former, but the latter can of course only be agreed upon. And since DKIM-Diff- is a new header.. Including the potentially large diff in the normal DKIM-Signature seems not to be a good idea.] To generate differential changes the DKIM normalized headers and the DKIM normalized message body content are stored, separated by an empty (normalized) line, and are compared to the version present before the modifications took place via the BSDIFF32 algorithm as below. For non-integrated systems like mail filters the DKIM-Store header can for example be used to pass around the necessary data. In general all headers covered by the DKIM-Signature MUST be included there, as MUST be all MIME related headers, even if not included in the DKIM-Signature (which is discouraged by itself). [Discussion: maybe one should say "all but trace headers", but at least things like subject, date, all address fields, they should really be MUST here, in particular because DKIM configurations have been seen in the wild which covered only the most minimal content] All DKIM-Diff-NUMBER headers MUST be included in DKIM-Signatures. The DKIM-Diff-NUMBER header field SHOULD be treated as though it were a trace header field as defined in Section 3.6 of IMF and hence SHOULD NOT be reordered and SHOULD be prepended to the message. [DISCUSSION: that trace thing can go if we really go numbered, because it is not necessary then and only cruft. ALSO ABNF is missing]
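INFORMATIVE NOTE: the following sketch shows one way to assemble the data that is compared, namely the normalized headers, an empty line, and the normalized body. It assumes that "DKIM normalized" refers to the "relaxed" canonicalization of RFC 6376 for both header and body, which is the only one left once "simple" is obsoleted. The function names are hypothetical, and which header fields are passed in, and in which order, is left to the caller.

```python
import re

_WSP_RUN = re.compile(rb"[ \t]+")


def relaxed_header(field: bytes) -> bytes:
    """RFC 6376 "relaxed" header canonicalization of one header field."""
    name, _, value = field.partition(b":")
    value = value.replace(b"\r\n", b"")            # unfold, drop final CRLF
    value = _WSP_RUN.sub(b" ", value).strip(b" ")  # collapse and trim WSP
    return name.strip(b" \t").lower() + b":" + value + b"\r\n"


def relaxed_body(body: bytes) -> bytes:
    """RFC 6376 "relaxed" body canonicalization."""
    lines = [_WSP_RUN.sub(b" ", ln).rstrip(b" ") for ln in body.split(b"\r\n")]
    while lines and lines[-1] == b"":
        lines.pop()                                # ignore trailing empty lines
    return b"".join(ln + b"\r\n" for ln in lines)


def diff_input(header_fields, body: bytes) -> bytes:
    """Normalized headers, an empty line, then the normalized body.

    Hypothetical helper: the result is what would be handed to the
    differential algorithm, once for the old and once for the new version.
    """
    headers = b"".join(relaxed_header(h) for h in header_fields)
    return headers + b"\r\n" + relaxed_body(body)
```

The version before and the version after a modification would each be passed through diff_input, and the two results compared.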

The BSDIFF32 differential algorithm [FIXME references] This is an adaptation of the BSDIFF algorithm of Colin Percival. The adaptation includes a reduction of runtime memory requirements through adjusted data usage, as well as of the size of the stored result as such, since file offsets became restricted to 32-bit. [DISCUSSION: for now only RFC 1950 ZLIB] It generates a compressed binary representation that needs to be base64 encoded for inclusion in the DKIM-Diff-NUMBER header. [DISCUSSION: the code is not yet production ready for some days. Add references once published] There is a freely usable (BSD 2-clause/ISC and MIT licenses) plug-and-play ISO C99 implementation available, which includes further references on the algorithm.
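INFORMATIVE NOTE: the base64 step can be sketched as below. The BSDIFF32 output itself is treated as opaque bytes, since its format is not reproduced here. The function names, the decimal rendering of NUMBER, and the folding width of 72 characters are assumptions made for illustration only, as the ABNF of the header is still missing.

```python
import base64


def diff_header(number: int, patch: bytes, width: int = 72) -> str:
    """Render opaque patch bytes as a folded DKIM-Diff-NUMBER header field.

    Hypothetical layout: name, colon, then base64 text folded onto
    continuation lines that each start with a single space.
    """
    b64 = base64.b64encode(patch).decode("ascii")
    chunks = [b64[i:i + width] for i in range(0, len(b64), width)]
    return f"DKIM-Diff-{number}:" + "".join("\r\n " + c for c in chunks) + "\r\n"


def decode_diff_value(value: str) -> bytes:
    """Recover the patch bytes from a (possibly folded) header field value."""
    # Folding whitespace is not part of the base64 alphabet and is dropped.
    return base64.b64decode("".join(value.split()))
```

A verifier would split the header field at its first colon and hand the remainder to decode_diff_value before applying the patch.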

IANA Considerations This memo includes no request to IANA.

Security Considerations Public-key cryptography is the safest approach to identification of counterparts and verification of data. This specification aims at making use of these attributes for the pair of SMTP and DKIM. It opens a door to a reduction of email server maintenance and administration, and to the restoration of some core email aspects which got lost, or became a nuisance to use, over the last decade, like email forwarding and mailing-list usage. It MAY reduce the implementation burden and complexity of the entire email infrastructure, as well as processing cost.


Further DKIM Updates This specification obsoletes the simple canonicalization type; it MUST NOT be used by software announcing DKIMACDC. Rationale: in order to minimize processing cost in time and space for and of differential processing, being able to work on and with only one data representation is beneficial. This specification obsoletes the DKIM l= tag that restricts the number of DKIM covered bytes of the normalized message body. This tag MUST NOT be used by software announcing DKIMACDC support, and the entire message body MUST always be used to create the body hash. This specification obsoletes the DKIM z= tag that was defined "for diagnostic use" to copy a freely defined set of headers and their values present during signature creation. This tag MUST NOT be used by software announcing DKIMACDC. Rationale: the DKIMACDC differential changes provide access to the same information distinct from the DKIM-Signature header. For the q= tag this specification obsoletes the possible use of DKIM-Quoted-Printable for the optional x-sig-q-tag-args of possibly introduced future query types. Rationale: should a new type ever become standardized beside the dns/txt that has been with DKIM from the very start, that standard can very well give meaning to a "hyphenated-word" proxy identifier without making use of byte values which would require encoding. This specification obsoletes the DKIM key representation tag n= that was meant to include "notes that might be of interest to a human", "intended for use by administrators, not end users", and which "should be used sparingly". Rationale: no use case has been encountered in the DNS, let alone a serious one; if future non-space-constrained key providers other than DNS should ever exist and be used to distribute DKIM keys, it is likely that they support inclusion of strings via some method that need not be included in the DKIM key representation itself.
Because the above changes remove all use cases for the "dkim-quoted-printable" encoding defined in RFC 6376, Section 2.11, this specification obsoletes the DKIM-Quoted-Printable encoding.
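INFORMATIVE NOTE: a signer or verifier could check a DKIM-Signature value against the signature-related restrictions above roughly as follows. The function names are hypothetical, and the sketch assumes that the mere presence of an acdc= tag is what marks a signature as DKIMACDC. It relies on the RFC 6376 rules that a missing c= tag means "simple/simple" and that a single named algorithm leaves the body at "simple". The key record tag n= and the q= arguments are not covered.

```python
def parse_tag_list(value: str) -> dict:
    """Split a DKIM tag=value list into a dict (later duplicates win)."""
    tags = {}
    for item in value.split(";"):
        name, eq, val = item.partition("=")
        if eq:
            tags[name.strip()] = val.strip()
    return tags


def dkimacdc_violations(signature_value: str) -> list:
    """List uses of obsoleted features in a DKIMACDC signature.

    Hypothetical helper: returns an empty list for conforming signatures
    and for signatures that carry no acdc= tag at all.
    """
    tags = parse_tag_list(signature_value)
    if "acdc" not in tags:
        return []
    problems = []
    header_alg, _, body_alg = tags.get("c", "simple/simple").partition("/")
    if header_alg != "relaxed" or (body_alg or "simple") != "relaxed":
        problems.append("c=: simple canonicalization MUST NOT be used")
    for tag in ("l", "z"):
        if tag in tags:
            problems.append(tag + "=: tag MUST NOT be used")
    return problems
```

Note that "c=relaxed" alone is reported as a violation, because it still selects simple canonicalization for the body.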

Acknowledgements Thanks to, in the order of appearance, Jesse Thompson and Richard Clayton.