<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.6.17 (Ruby 3.1.2) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-bormann-cbor-cddl-csv-01" category="std" consensus="true" submissionType="IETF" tocInclude="true" sortRefs="true" symRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.14.1 -->
  <front>
    <title abbrev="CDDL for CSVs">Using CDDL for CSVs</title>
    <seriesInfo name="Internet-Draft" value="draft-bormann-cbor-cddl-csv-01"/>
    <author initials="C." surname="Bormann" fullname="Carsten Bormann">
      <organization>Universität Bremen TZI</organization>
      <address>
        <postal>
          <street>Postfach 330440</street>
          <city>Bremen</city>
          <code>D-28359</code>
          <country>Germany</country>
        </postal>
        <phone>+49-421-218-63921</phone>
        <email>cabo@tzi.org</email>
      </address>
    </author>
    <date year="2022" month="August" day="29"/>
    <keyword>Internet-Draft</keyword>
    <abstract>
      <t>The Concise Data Definition Language (CDDL), standardized in RFC 8610,
is defined to provide data models for data shaped like JSON or CBOR.</t>
      <t>Another representation format that is quite popular is the CSV
(Comma-Separated Values) file as defined by RFC 4180.</t>
      <t>The present document shows how to use CDDL to provide a data model for
CSV files.</t>
    </abstract>
  </front>
  <middle>
    <section anchor="intro">
      <name>Introduction</name>
      <t>The Concise Data Definition Language (CDDL), standardized in <xref target="RFC8610"/>,
is defined to provide data models for data shaped like JSON or CBOR.</t>
      <t>Another representation format that is quite popular is the CSV file as
defined by <xref target="RFC4180"/>.</t>
      <t>The present document shows how to use CDDL to provide a data model for
CSV files.</t>
      <section anchor="terminology">
        <name>Terminology</name>
        <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
appear in all capitals, as shown here.</t>
        <t>This specification uses terminology from <xref target="RFC8610"/>.</t>
      </section>
    </section>
    <section anchor="csv-generic-data-model">
      <name>CSV generic data model</name>
      <t>The CSV format is defined in <xref target="RFC4180"/>.
The generic data model for the data in a CSV file can be described in CDDL as:</t>
      <sourcecode type="cddl"><![CDATA[
csv = [?header, *record]
header = [+header-field]
record = [+field]
header-field = text
field = text
]]></sourcecode>
      <t>Note that the elements of this data model describe the interpretation
of the data after processing and removal of lexical structure such as newlines,
commas, escape characters, and quotation marks.</t>
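      <t>As a non-normative illustration of this point, the following sketch
pairs a hypothetical two-line CSV input with the data model instance it
stands for.  The file content, the column names <tt>name</tt> and
<tt>remark</tt>, and the specialized rules are assumptions made up for this
example; they are not taken from <xref target="RFC4180"/>.</t>
      <sourcecode type="cddl"><![CDATA[
; Hypothetical CSV input (two lines, no header line):
;   alice,"said ""hi"", then left"
;   bob,
;
; Data model instance after lexical processing
; (shown in JSON notation):
;   [["alice", "said \"hi\", then left"], ["bob", ""]]
;
; A possible specialization of the generic model for this file:
csv = [*record]
record = [name: text, remark: text]
]]></sourcecode>
      <t>The enclosing quotation marks, the doubled quotation marks, and the
line breaks do not appear in the instance; the comma inside the quoted
field is part of the text value, and the empty second field of the last
line becomes the empty text string.</t>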
      <t>For the purposes of a specific application, the data-model-level
structure of each field may be described in a more elaborate way,
e.g., as a number.  CDDL currently has no way to express the
transformation from the text string in the CSV field to the number
that this text string represents at the application data model level.
The use of anything but "text" for a field therefore <bcp14>MUST</bcp14> be
accompanied by an instruction for how to perform the translation.
As a preferred choice, that instruction <bcp14>MAY</bcp14> select the JSON
representation of the data model item, if one exists.</t>
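      <t>A minimal sketch of such a specialization follows.  The parts-list
application, its rule names, and the wording of the translation instruction
are hypothetical; the instruction is given as a CDDL comment only because
CDDL has no construct for it.</t>
      <sourcecode type="cddl"><![CDATA[
; Hypothetical application: a parts list with one numeric column
csv = [*record]
record = [part-name: text, quantity: uint]

; Translation instruction (required because "quantity" is not
; "text"): the text of the field is read as the JSON representation
; of the data item, so the CSV field 42 stands for the unsigned
; integer 42.
]]></sourcecode>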
      <t>Since the CSV media type <tt>text/csv</tt> defaults to using the US-ASCII
character set (i.e., <xref target="STD80"/>; see <xref section="3" sectionFormat="of" target="RFC4180"/>), many uses of CSV will need to specify the media type
parameter <tt>charset</tt>.
(Note that CDDL can describe text information that is in UTF-8 form,
which includes US-ASCII, as that is a subset of UTF-8.
If a character encoding that is not a subset of UTF-8 is still
needed, the application will need to define rules for conversion.)</t>
      <t>The media type parameter <tt>header</tt> <bcp14>MAY</bcp14> be used to
indicate the presence or absence of a header line; if the parameter is not
given, the grammar <bcp14>MUST NOT</bcp14> be ambiguous about the presence of a
header line (i.e., the header line <bcp14>MUST</bcp14> be either mandatory or absent).</t>
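      <t>One way to read this requirement is sketched below; the rule names
<tt>csv-with-header</tt> and <tt>csv-without-header</tt> are hypothetical.
In the generic model, <tt>header</tt> and <tt>record</tt> both match any
non-empty array of text strings, so the optional <tt>?header</tt> cannot by
itself tell a header line from a first record.  A grammar intended for use
without the <tt>header</tt> parameter would therefore pick one of the two
unambiguous forms.</t>
      <sourcecode type="cddl"><![CDATA[
; Header line mandatory (corresponds to header=present)
csv-with-header = [header, *record]

; Header line absent (corresponds to header=absent)
csv-without-header = [*record]

header = [+header-field]
record = [+field]
header-field = text
field = text
]]></sourcecode>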
      <t>Note that the ABNF <xref target="STD68"/> in <xref target="RFC4180"/> does not quite handle the case that
<tt>charset</tt> is not <tt>us-ascii</tt>.
For the purposes of the present specification, the ABNF is understood
to allow all characters from the <tt>charset</tt> except %x22 and %x2C in <tt>TEXTDATA</tt>.
For the purposes of the present specification, the ABNF rule <tt>CRLF</tt> is
read as:</t>
      <sourcecode type="abnf"><![CDATA[
CRLF = [CR] LF
]]></sourcecode>
      <t>as is hinted in <xref section="3" sectionFormat="of" target="RFC4180"/>.</t>
    </section>
    <section anchor="examples">
      <name>Examples</name>
      <t>A simplified definition of a CSV form of a SID file <xref target="I-D.ietf-core-sid"/>
might look like this:</t>
      <sourcecode type="cddl"><![CDATA[
; header = absent

SID-File = [meta-record, 
            *dependency-record,
            *range-record,
            *item-record]

meta-record = ["meta",
               module-name: text,
               module-revision: empty / text,
               sid-file-revision: empty / text,
               description: empty / text]

dependency-record = ["dep",
                     module-name: text,
                     module-revision: text]

range-record = ["range",
                entry-point: uint,
                size: uint]

item-record = ["item",
               namespace: "module" / "identity" / "feature" / "data",
               identifier: yang-identifier / schema-node-path,
               ; the above probably should say which namespace
               ; goes with which identifier
               sid: uint]

yang-identifier = text .abnf ("yang-identifier" .det id-abnf)
schema-node-path = text .abnf ("schema-node-path" .det id-abnf)
id-abnf = '
  schema-node-path = QID *( "/" OQID)
  yang-identifier = ID
  QID = ID ":" ID
  OQID = ID [":" ID]
  ID = I *C
  I = "_" / %x41-5a / %x61-7a
  C = I / %x30-39 / "-" / "."
'

empty = ""

]]></sourcecode>
      <t>TODO: show the example in <xref section="A" sectionFormat="of" target="I-D.ietf-core-sid"/></t>
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>This document makes no requests of IANA.</t>
    </section>
    <section anchor="security-considerations">
      <name>Security considerations</name>
      <t>The security considerations of <xref target="RFC8610"/> and <xref target="RFC4180"/> apply.</t>
    </section>
  </middle>
  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <reference anchor="RFC8610" target="https://www.rfc-editor.org/info/rfc8610">
          <front>
            <title>Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures</title>
            <author fullname="H. Birkholz" initials="H." surname="Birkholz">
              <organization/>
            </author>
            <author fullname="C. Vigano" initials="C." surname="Vigano">
              <organization/>
            </author>
            <author fullname="C. Bormann" initials="C." surname="Bormann">
              <organization/>
            </author>
            <date month="June" year="2019"/>
            <abstract>
              <t>This document proposes a notational convention to express Concise Binary Object Representation (CBOR) data structures (RFC 7049).  Its main goal is to provide an easy and unambiguous way to express structures for protocol messages and data formats that use CBOR or JSON.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="8610"/>
          <seriesInfo name="DOI" value="10.17487/RFC8610"/>
        </reference>
        <reference anchor="RFC4180" target="https://www.rfc-editor.org/info/rfc4180">
          <front>
            <title>Common Format and MIME Type for Comma-Separated Values (CSV) Files</title>
            <author fullname="Y. Shafranovich" initials="Y." surname="Shafranovich">
              <organization/>
            </author>
            <date month="October" year="2005"/>
            <abstract>
              <t>This RFC documents the format used for Comma-Separated Values (CSV) files and registers the associated MIME type "text/csv".  This memo provides information for the Internet community.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="4180"/>
          <seriesInfo name="DOI" value="10.17487/RFC4180"/>
        </reference>
        <reference anchor="STD68" target="https://www.rfc-editor.org/info/rfc5234">
          <front>
            <title>Augmented BNF for Syntax Specifications: ABNF</title>
            <author fullname="D. Crocker" initials="D." role="editor" surname="Crocker">
              <organization/>
            </author>
            <author fullname="P. Overell" initials="P." surname="Overell">
              <organization/>
            </author>
            <date month="January" year="2008"/>
            <abstract>
              <t>Internet technical specifications often need to define a formal syntax.  Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications.  The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power.  The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges.  This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications.  [STANDARDS-TRACK]</t>
            </abstract>
          </front>
          <seriesInfo name="STD" value="68"/>
          <seriesInfo name="RFC" value="5234"/>
          <seriesInfo name="DOI" value="10.17487/RFC5234"/>
        </reference>
        <reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner">
              <organization/>
            </author>
            <date month="March" year="1997"/>
            <abstract>
              <t>In many standards track documents several words are used to signify the requirements in the specification.  These words are often capitalized. This document defines these words as they should be interpreted in IETF documents.  This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
        <reference anchor="RFC8174" target="https://www.rfc-editor.org/info/rfc8174">
          <front>
            <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
            <author fullname="B. Leiba" initials="B." surname="Leiba">
              <organization/>
            </author>
            <date month="May" year="2017"/>
            <abstract>
              <t>RFC 2119 specifies common key words that may be used in protocol  specifications.  This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the  defined special meanings.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="8174"/>
          <seriesInfo name="DOI" value="10.17487/RFC8174"/>
        </reference>
      </references>
      <references>
        <name>Informative References</name>
        <reference anchor="I-D.ietf-core-sid" target="https://www.ietf.org/archive/id/draft-ietf-core-sid-19.txt">
          <front>
            <title>YANG Schema Item iDentifier (YANG SID)</title>
            <author fullname="Michel Veillette">
              <organization>Trilliant Networks Inc.</organization>
            </author>
            <author fullname="Alexander Pelov">
              <organization>Acklio</organization>
            </author>
            <author fullname="Ivaylo Petrov">
              <organization>Google Switzerland GmbH</organization>
            </author>
            <author fullname="Carsten Bormann">
              <organization>Universität Bremen TZI</organization>
            </author>
            <author fullname="Michael Richardson">
              <organization>Sandelman Software Works</organization>
            </author>
            <date day="26" month="July" year="2022"/>
            <abstract>
              <t>   YANG Schema Item iDentifiers (YANG SID) are globally unique 63-bit
   unsigned integers used to identify YANG items, as a more compact
   method to identify YANG items that can be used for efficiency and in
   constrained environments (RFC 7228).  This document defines the
   semantics, the registration, and assignment processes of YANG SIDs
   for IETF managed YANG modules.  To enable the implementation of these
   processes, this document also defines a file format used to persist
   and publish assigned YANG SIDs.


   // The present version (-19) adds in draft text about objectives,
   // parties, and roles.  This attempts to record discussions at side
   // meetings before, at, and after IETF 113.

              </t>
            </abstract>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-ietf-core-sid-19"/>
        </reference>
        <reference anchor="STD80" target="https://www.rfc-editor.org/info/rfc20">
          <front>
            <title>ASCII format for network interchange</title>
            <author fullname="V.G. Cerf" initials="V.G." surname="Cerf">
              <organization/>
            </author>
            <date month="October" year="1969"/>
          </front>
          <seriesInfo name="STD" value="80"/>
          <seriesInfo name="RFC" value="20"/>
          <seriesInfo name="DOI" value="10.17487/RFC0020"/>
        </reference>
      </references>
    </references>
    <section numbered="false" anchor="acknowledgements">
      <name>Acknowledgements</name>
      <t>Rob Wilton, unknowingly, made me write this specification.
I hope it will be useful.</t>

</section>
  </back>

</rfc>
