<?xml version="1.0" encoding="UTF-8"?>
<rfc category="exp" consensus="true" docName="draft-liao-aipref-autoctl-core-00" ipr="trust200902" sortRefs="true" submissionType="IETF" symRefs="true" tocInclude="true" version="3" xmlns:xi="http://www.w3.org/2001/XInclude">
  <front>
    <title abbrev="aipref-autoctl">Protocol for Basic Automation Control</title>
    <seriesInfo name="Internet-Draft" value="draft-liao-aipref-autoctl-core-00"/>
    <author fullname="Liao Peiyuan">
      <organization>Condé Nast</organization>
      <address>
        <postal>
          <country>United States of America</country>
        </postal>
        <email>peiyuan_liao@condenast.com</email>
      </address>
    </author>
    <date year="2025" month="April" day="8"/>
    <area>Applications</area>
    <workgroup>AI Preferences</workgroup>
    <keyword>AI Preferences</keyword>
    <keyword>Automation Control</keyword>
    <keyword>Web Automation</keyword>
    <abstract>
      <t>
        This document specifies the core automation-preferences.txt protocol, in which a
        machine-readable file published by a server declares its automation permissions,
        with a focus on essential controls for AI and automation use cases. Unlike
        robots.txt, which governs crawling, this protocol addresses a broader range of
        automation activities while remaining simple enough for initial implementation.
        It defines the file format, policy declarations, HTTP method restrictions, and
        purpose requirements. Advanced features are addressed in a separate extension
        specification.
      </t>
    </abstract>
    <note removeInRFC="true">
      <name>About This Document</name>
      <t>The latest revision of this draft and its status information can be found at <eref target="https://datatracker.ietf.org/doc/draft-liao-aipref-autoctl-core/"/>.</t>
      <t>Discussion of this document takes place on the
      AI Preferences Working Group mailing list (<eref target="mailto:ai-control@ietf.org"/>),
      which is archived at <eref target="https://mailarchive.ietf.org/arch/browse/ai-control/"/>.
      Subscribe at <eref target="https://www.ietf.org/mailman/listinfo/ai-control/"/>.</t>
    </note>
  </front>

  <middle>
    <section anchor="introduction">
      <name>Introduction</name>
      <t>
        Web automation has outpaced existing standards such as robots.txt
        <xref target="RFC9309"/>, which covers only crawler access.
        This document introduces the core automation-preferences.txt protocol, which enables
        server operators to explicitly define basic policies governing automated interactions.
        These core policies cover fundamental aspects of automation control, including
        HTTP method restrictions and automation purpose declarations.
      </t>
      <t>
        The automation-preferences.txt protocol is designed with extensibility in mind,
        allowing future enhancements while maintaining backward compatibility. This
        specification focuses on essential controls to facilitate initial adoption
        and implementation, with more advanced features defined in a separate
        extension document.
      </t>

      <section anchor="applicability">
        <name>Applicability</name>
        <t>
          The automation-preferences.txt protocol applies to automated systems interacting with
          web servers, especially those driven by foundation models or other types of
          advanced AI models. It is designed to benefit both content owners, who can
          specify acceptable automation behaviors, and developers of automated systems,
          who can use these directives to ensure compliance.
        </t>
      </section>

      <section anchor="relationship-to-extension-specification">
        <name>Relationship to Extension Specification</name>
        <t>
          This document defines the core functionality of the automation-preferences.txt protocol.
          A separate document, "Protocol Extension for Advanced AI Automation Control," extends this 
          core specification with additional directives and capabilities.
          The extension specification builds upon this core document without modifying its
          requirements, providing a path for progressive implementation.
        </t>
        <t>
          Implementations conforming to only this core specification are considered compliant
          with the automation-preferences.txt protocol. The extension specification defines optional
          enhancements that may be implemented once the core functionality is established.
        </t>
      </section>
    </section>

    <section anchor="conventions-and-definitions">
      <name>Conventions and Definitions</name>
      <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
      NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
      "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
      described in BCP&#xa0;14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
      appear in all capitals, as shown here.</t>
      <t>
        The following terms are used in this document:
      </t>
      <ul spacing="normal">
        <li>Automation: Programmatic interactions with a web server.</li>
        <li>State-changing requests: Requests that use HTTP methods capable of altering
           server state (e.g., POST, PUT, DELETE, PATCH).</li>
        <li>Automation policy: A set of rules defining permitted and prohibited
           automated behaviors for a specific scope.</li>
        <li>Automation purpose: The declared intent or use case for which
           automation is being performed.</li>
      </ul>
      <t>
        All terminology defined in this core specification applies to the extension
        specification without redefinition. The extension specification may introduce
        additional terms for concepts not covered in this document.
      </t>
    </section>

    <section anchor="protocol-specification">
      <name>Protocol Specification</name>
      <section anchor="file-location-and-format">
        <name>File Location and Format</name>
        <t>
          The automation-preferences.txt file MUST be hosted at the root of the domain
          (that is, at the path <tt>/automation-preferences.txt</tt>), in the same manner as
          robots.txt <xref target="RFC9309"/>. The file is structured as a series of
          key-value pairs that specify automation permissions.
        </t>
        <t>
          The file MUST be served with the text/plain media type. Lines beginning with the
          hash symbol (#) are comments and MUST be ignored by parsers. Each directive
          occupies its own line and consists of a field name, followed by a colon,
          followed by a value. Multiple values MAY be separated by commas.
        </t>
        <t>
          Parsers MUST silently ignore any directives they do not recognize. This
          enables future extensions to add new capabilities without breaking compatibility
          with existing implementations.
        </t>
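        <t>
          The following non-normative sketch illustrates one way a parser could apply the
          rules above: skip comments, split each directive at the first colon, split values
          on commas, and silently drop unrecognized directives. The function name
          <tt>parse_directives</tt>, the <tt>KNOWN_DIRECTIVES</tt> set, the trimming of
          surrounding whitespace, the exact-spelling match on field names, and the skipping
          of lines without a colon are assumptions of this sketch; this document does not
          define them.
        </t>
        <sourcecode type="python"><![CDATA[
```python
# Directive names defined in this core specification.
KNOWN_DIRECTIVES = {
    "Host", "Scope", "AutomationPolicy", "AllowedMethods",
    "DisallowedMethods", "RequireAutomationPurpose",
    "AllowedPurposes", "DisallowedPurposes",
}


def parse_directives(text):
    """Return (field_name, values) pairs in file order."""
    directives = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        name, sep, value = line.partition(":")
        if not sep:
            continue  # assumption: skip lines that have no colon
        name = name.strip()
        if name not in KNOWN_DIRECTIVES:
            continue  # silently ignore unrecognized directives
        # Comma-separated values; empty items are dropped.
        values = [v.strip() for v in value.split(",") if v.strip()]
        directives.append((name, values))
    return directives
```
]]></sourcecode>
        <t>
          Because the split happens at the first colon only, a value that itself contains a
          colon is preserved intact.
        </t>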
      </section>

      <section anchor="automation-policy-declaration">
        <name>Automation Policy Declaration</name>
        <t>
          A top-level directive, <tt>AutomationPolicy</tt>, indicates the overall stance
          of the server regarding automated interactions and state-changing requests. 
          The following values are defined:
        </t>
        <ul spacing="normal">
          <li><strong>open</strong>: Automation is generally permitted with few restrictions.</li>
          <li><strong>limited</strong>: Automation is permitted with specific restrictions.</li>
          <li><strong>strict</strong>: Automation is heavily restricted or prohibited except in
             specific circumstances.</li>
        </ul>
        <t>
          If the <tt>AutomationPolicy</tt> directive is not present, clients SHOULD assume a
          default value of "limited".
        </t>
        <t>
          Example:
        </t>
        <figure>
          <artwork><![CDATA[
AutomationPolicy: limited
          ]]></artwork>
        </figure>
      </section>

      <section anchor="http-method-restrictions">
        <name>HTTP Method Restrictions</name>
        <t>
          A policy that restricts HTTP methods MUST list the allowed and disallowed methods
          explicitly, using the <tt>AllowedMethods</tt> and <tt>DisallowedMethods</tt> directives.
          Typically, GET and HEAD are permitted, while methods such as POST, PUT, DELETE, and
          PATCH are disallowed for automated processing.
        </t>
        <t>
          If no HTTP method directives are specified, clients SHOULD assume that only
          GET and HEAD methods are permitted.
        </t>
        <t>
          Example:
        </t>
        <figure>
          <artwork><![CDATA[
AllowedMethods: GET, HEAD
DisallowedMethods: POST, PUT, DELETE, PATCH
          ]]></artwork>
        </figure>
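        <t>
          As a non-normative illustration, a client could evaluate a request method as
          sketched below. The default of GET and HEAD when neither directive is present
          comes from this section. The remaining behavior is an assumption of the sketch
          (the hypothetical <tt>is_method_allowed</tt> function), since this document does
          not define it: a method listed in both directives is refused, a method absent
          from a present <tt>AllowedMethods</tt> list is refused, and when only
          <tt>DisallowedMethods</tt> is present every unlisted method is accepted.
        </t>
        <sourcecode type="python"><![CDATA[
```python
# Default from this section: only GET and HEAD when no directives exist.
DEFAULT_ALLOWED = {"GET", "HEAD"}


def is_method_allowed(method, allowed=None, disallowed=None):
    """Decide whether an automated request may use `method`.

    `allowed` and `disallowed` are the parsed value lists of
    AllowedMethods and DisallowedMethods, or None when the
    directive is absent from the applicable group.
    """
    if allowed is None and disallowed is None:
        return method in DEFAULT_ALLOWED
    if disallowed is not None and method in disallowed:
        return False  # assumption: a disallow entry wins on conflict
    if allowed is not None:
        return method in allowed  # assumption: unlisted means refused
    return True  # assumption: only DisallowedMethods was given
```
]]></sourcecode>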
      </section>

      <section anchor="purpose-declaration">
        <name>Purpose Declaration</name>
        <t>
          The protocol lets servers state whether automation clients must declare the purpose
          of their automation and which purposes are acceptable. The following directives
          are defined:
        </t>
        <ul spacing="normal">
          <li><tt>RequireAutomationPurpose</tt>: Boolean value (<tt>true</tt> or <tt>false</tt>)
             indicating whether clients must declare a purpose for automation.</li>
          <li><tt>AllowedPurposes</tt>: Comma-separated list of permitted purposes using standardized 
             vocabulary terms.</li>
          <li><tt>DisallowedPurposes</tt>: Comma-separated list of prohibited purposes.</li>
        </ul>
        <t>
          The specific vocabulary terms for automation purposes are intentionally not defined
          in this protocol specification. Instead, this protocol provides a mechanism for
          expressing allowed and disallowed purposes, which can be populated with terms from
          any widely accepted vocabulary standard. This approach ensures that the protocol
          remains flexible and can adapt to evolving vocabulary standards while maintaining
          the essential structure for purpose declarations.
        </t>
        <t>
          Example:
        </t>
        <figure>
          <artwork><![CDATA[
RequireAutomationPurpose: true
AllowedPurposes: [PLACEHOLDER_PURPOSE1], [PLACEHOLDER_PURPOSE2]
DisallowedPurposes: [PLACEHOLDER_PURPOSE3]
          ]]></artwork>
        </figure>
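        <t>
          The interaction of the three directives can be sketched as follows. This is a
          non-normative illustration under stated assumptions: the function name
          <tt>purpose_permitted</tt> is hypothetical, a purpose listed in
          <tt>DisallowedPurposes</tt> is refused even if it also appears in
          <tt>AllowedPurposes</tt>, and an absent or empty <tt>AllowedPurposes</tt> list is
          treated as "any purpose that is not disallowed". The purpose strings used with the
          sketch are placeholders, because this document does not define a vocabulary.
        </t>
        <sourcecode type="python"><![CDATA[
```python
def purpose_permitted(declared, require_purpose=False,
                      allowed=None, disallowed=None):
    """Check a declared automation purpose against a policy.

    `declared` is the client's purpose term, or None if the
    client declared nothing. `allowed` and `disallowed` are the
    parsed AllowedPurposes and DisallowedPurposes lists, or None.
    """
    if declared is None:
        # Without a declaration, only RequireAutomationPurpose matters.
        return not require_purpose
    if disallowed and declared in disallowed:
        return False  # assumption: disallow wins on conflict
    if allowed:
        return declared in allowed
    return True  # assumption: no allow-list means no further limit
```
]]></sourcecode>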
      </section>

      <section anchor="scope-and-applicability">
        <name>Scope and Applicability</name>
        <t>
          In a manner similar to robots.txt, the automation-preferences.txt file is divided into
          groups, each of which applies to a specific subset of content. Each group
          begins with one or more scope directives that define the target of the
          preferences. The following directives MAY be used within a group:
        </t>
        <ul spacing="normal">
          <li><tt>Scope</tt>: Specifies the URL pattern (e.g., <tt>/admin/</tt>) to which the group
             applies. Wildcards MAY be used to indicate variable components of the URL.</li>
          <li><tt>Host</tt>: Specifies a subdomain or host. If present, the group applies only
             to the indicated subdomain; if omitted, the group is assumed to apply to the
             entire host.</li>
        </ul>
        <t>
          Groups are processed in order of specificity. When multiple groups match a
          given request, the group with the longest matching Scope directive SHALL
          take precedence.
        </t>
        <t>
          Example:
        </t>
        <figure>
          <artwork><![CDATA[
# Group 1: Applies to the entire site
Host: example.com
Scope: /
AutomationPolicy: limited
AllowedMethods: GET, HEAD
DisallowedMethods: POST, PUT, DELETE, PATCH

# Group 2: Specific preferences for the /admin/ path
Host: example.com
Scope: /admin/
AutomationPolicy: strict
AllowedMethods: GET
DisallowedMethods: POST, PUT, DELETE, PATCH
          ]]></artwork>
        </figure>
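        <t>
          Group selection by longest matching <tt>Scope</tt> can be sketched as follows.
          This non-normative example makes several assumptions that this document does not
          settle: each group is represented as a dictionary of directive names to values,
          a <tt>Scope</tt> value is matched as a prefix of the request path, the wildcard
          character is <tt>*</tt> and matches any run of characters, a group without
          <tt>Host</tt> matches any host, a group without <tt>Scope</tt> is treated as
          <tt>/</tt>, and "longest" is measured as the length of the <tt>Scope</tt> string.
          The names <tt>scope_to_regex</tt> and <tt>select_group</tt> are hypothetical.
        </t>
        <sourcecode type="python"><![CDATA[
```python
import re


def scope_to_regex(scope):
    """Compile a Scope pattern; '*' matches any run of characters."""
    literal_parts = [re.escape(part) for part in scope.split("*")]
    return re.compile(".*".join(literal_parts))


def select_group(groups, host, path):
    """Return the matching group with the longest Scope, or None."""
    best = None
    for group in groups:
        group_host = group.get("Host")
        if group_host is not None and group_host != host:
            continue  # group is limited to a different host
        scope = group.get("Scope", "/")
        # re.match anchors at the start, giving prefix semantics.
        if scope_to_regex(scope).match(path) is None:
            continue
        if best is None or len(scope) > len(best.get("Scope", "/")):
            best = group
    return best
```
]]></sourcecode>
        <t>
          Applied to the two groups in the example above, a request for a path under
          <tt>/admin/</tt> on example.com selects Group 2, while any other path on that host
          selects Group 1.
        </t>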
      </section>
    </section>

    <section anchor="extension-mechanism">
      <name>Extension Mechanism</name>
      <t>
        The automation-preferences.txt protocol is designed for extensibility. The protocol
        defines a forward-compatible approach where implementations:
      </t>
      <ul spacing="normal">
        <li>MUST process all recognized directives according to this specification</li>
        <li>MUST silently ignore any unrecognized directives</li>
        <li>MUST NOT fail or produce errors when encountering extended directives</li>
      </ul>
      <t>
        This enables future extensions to add new capabilities while maintaining
        compatibility with implementations of this core specification. Extended
        directives may include:
      </t>
      <ul spacing="normal">
        <li>Rate limiting controls</li>
        <li>Automation technology restrictions</li>
        <li>API and XHR permissions</li>
        <li>Session requirements</li>
        <li>Asset-level annotation methods</li>
      </ul>
    </section>

    <section anchor="implementation-and-enforcement">
      <name>Implementation and Enforcement</name>
      <t>
        Servers implementing this protocol SHOULD:
      </t>
      <ul spacing="normal">
        <li>Verify incoming requests against the rules specified in automation-preferences.txt.</li>
        <li>Respond with appropriate HTTP status codes (e.g., 403 Forbidden) for
           non-compliant requests.</li>
        <li>Include HTTP headers that reference the applicable automation policy
           for transparency.</li>
      </ul>
      <t>
        Clients consuming this protocol SHOULD:
      </t>
      <ul spacing="normal">
        <li>Fetch and parse the automation-preferences.txt file before performing automated
           operations.</li>
        <li>Honor the HTTP method restrictions specified in the file.</li>
        <li>Declare their automation purpose when required.</li>
        <li>Respect the scope directives when performing operations on different paths.</li>
      </ul>
      <t>
        Implementations SHOULD cache the automation-preferences.txt file to reduce server
        load, but SHOULD NOT cache it for longer than 24 hours to ensure timely
        policy updates.
      </t>
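      <t>
        The caching recommendation can be illustrated with the following non-normative
        sketch, which refetches the file once a cached copy reaches 24 hours of age. The
        class name <tt>PolicyCache</tt>, the caller-supplied <tt>fetch</tt> function, and
        the injectable <tt>clock</tt> are assumptions of this sketch; how the file is
        actually retrieved is left to the implementation.
      </t>
      <sourcecode type="python"><![CDATA[
```python
import time

# Upper bound on cache lifetime recommended by this section.
MAX_AGE_SECONDS = 24 * 60 * 60


class PolicyCache:
    """Cache automation-preferences.txt text per origin."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable: origin -> file text
        self._clock = clock    # callable: () -> seconds
        self._entries = {}     # origin -> (fetched_at, text)

    def get(self, origin):
        now = self._clock()
        entry = self._entries.get(origin)
        if entry is not None and now - entry[0] < MAX_AGE_SECONDS:
            return entry[1]    # cached copy is still fresh
        text = self._fetch(origin)
        self._entries[origin] = (now, text)
        return text
```
]]></sourcecode>
      <t>
        Injecting the clock keeps the expiry logic testable without waiting; a deployment
        could additionally honor standard HTTP caching headers, which this sketch does not
        attempt.
      </t>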
    </section>

    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>
        The use of machine-readable automation policies introduces security
        considerations that must be addressed by implementations:
      </t>
      <ul spacing="normal">
        <li>Parsing of automation-preferences.txt MUST be performed securely to prevent
           vulnerabilities such as buffer overruns and denial-of-service attacks.</li>
        <li>Care SHOULD be taken to avoid exposing sensitive policy details that could
           be exploited by adversaries.</li>
        <li>The protocol does not provide authentication or cryptographic verification
           mechanisms for the file content. Servers SHOULD serve the file over a secure
           connection (e.g., HTTPS) to prevent tampering in transit.</li>
        <li>The protocol does not enforce client compliance; it relies on good-faith
           adherence by automation providers. Servers SHOULD implement additional
           detection and enforcement mechanisms as needed.</li>
      </ul>
    </section>

    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>This document has no IANA actions.</t>
    </section>
  </middle>

  <back>
    <references>
      <name>References</name>
      <references anchor="normative">
        <name>Normative References</name>
        <reference anchor="RFC2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author initials="S." surname="Bradner" fullname="Scott Bradner">
              <organization>Harvard University</organization>
            </author>
            <date year="1997" month="March" />
          </front>
          <seriesInfo name="BCP" value="14" />
          <seriesInfo name="RFC" value="2119" />
          <seriesInfo name="DOI" value="10.17487/RFC2119" />
        </reference>
        <reference anchor="RFC8174">
          <front>
            <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
            <author initials="B." surname="Leiba" fullname="Barry Leiba">
              <organization>Huawei Technologies</organization>
            </author>
            <date year="2017" month="May" />
          </front>
          <seriesInfo name="BCP" value="14" />
          <seriesInfo name="RFC" value="8174" />
          <seriesInfo name="DOI" value="10.17487/RFC8174" />
        </reference>
      </references>
      
      <references anchor="informative">
        <name>Informative References</name>
        <reference anchor="RFC9309">
          <front>
            <title>Robots Exclusion Protocol</title>
            <author initials="M." surname="Koster" fullname="Martijn Koster">
              <organization></organization>
            </author>
            <author initials="G." surname="Illyes" fullname="Gary Illyes">
              <organization>Google LLC</organization>
            </author>
            <author initials="H." surname="Zeller" fullname="Henner Zeller">
              <organization>Google LLC</organization>
            </author>
            <author initials="L." surname="Sassman" fullname="Lizzi Sassman">
              <organization>Google LLC</organization>
            </author>
            <date year="2022" month="September" />
          </front>
          <seriesInfo name="RFC" value="9309" />
          <seriesInfo name="DOI" value="10.17487/RFC9309" />
        </reference>
      </references>
    </references>

    <section numbered="false" anchor="sample-automation-preferences-txt-file">
      <name>Sample automation-preferences.txt File</name>
      <t>
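        The following is an example of an automation-preferences.txt file that adheres to this
        specification. The <tt>ContactEmail</tt> directive shown in it is not defined in this
        document; parsers that implement only this core specification ignore it as an
        unrecognized directive.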
      </t>
      <figure>
        <artwork><![CDATA[
# Automation preferences for example.com
# Version: 1.0
# Last updated: 2025-04-08

# Group 1: Applies to the entire site
Host: example.com
Scope: /
AutomationPolicy: limited
AllowedMethods: GET, HEAD
DisallowedMethods: POST, PUT, DELETE, PATCH
RequireAutomationPurpose: true
AllowedPurposes: [PLACEHOLDER_PURPOSE1], [PLACEHOLDER_PURPOSE2]
DisallowedPurposes: [PLACEHOLDER_PURPOSE3]
ContactEmail: automation-policy@example.com

# Group 2: Specific preferences for the /admin/ path
Host: example.com
Scope: /admin/
AutomationPolicy: strict
AllowedMethods: GET
DisallowedMethods: POST, PUT, DELETE, PATCH
AllowedPurposes: [PLACEHOLDER_PURPOSE1]
DisallowedPurposes: [PLACEHOLDER_PURPOSE2], [PLACEHOLDER_PURPOSE3]
        ]]></artwork>
      </figure>
    </section>
  </back>
</rfc>