<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
    There has to be one entity for each item to be referenced. 
    An alternate method (rfc include) is described in the references. -->
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic reference tags, i.e., [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="std" docName="draft-xu-lsr-isis-flooding-reduction-in-msdc-05"
     ipr="trust200902">
  <front>
    <title abbrev="">IS-IS Flooding Reduction in MSDC</title>

    <author fullname="Xiaohu Xu" initials="X." surname="Xu">
      <organization>China Mobile</organization>

      <address>
        <email>xuxiaohu_ietf@hotmail.com</email>
      </address>
    </author>

    <author fullname="Luyuan Fang" initials="L." surname="Fang">
      <organization>eBay</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>luyuanf@gmail.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Jeff Tantsura" initials="J." surname="Tantsura">
      <organization>Nvidia</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>jefftant.ietf@gmail.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Shaowen Ma" initials="S." surname="Ma">
      <organization>Google</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>shaowen@google.com</email>

        <uri/>
      </address>
    </author>


    <date day="31" month="January" year="2024"/>

    <abstract>
      <t>IS-IS is a commonly used routing protocol in MSDC (Massively Scalable
      Data Center) networks, where Clos is the most popular topology. In a Clos
      topology, each IS-IS router receives multiple copies of the same
      LSP (Link State PDU) from multiple IS-IS neighbors. Moreover, two
      IS-IS neighbors may send each other the same LSP simultaneously.
      Because each router has many neighbors, this redundant flooding of
      link-state information wastes a significant amount of router
      resources. To address this scaling problem, this document introduces
      some extensions to the IS-IS protocol that significantly reduce IS-IS
      flooding within MSDC networks and thereby improve the scalability of
      such networks.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>IS-IS is a commonly used routing protocol in MSDC (Massively Scalable
      Data Center) networks, where Clos is the most popular topology. In a Clos
      topology, each IS-IS router receives multiple copies of the same
      LSP (Link State PDU) from multiple IS-IS neighbors. Moreover, two
      IS-IS neighbors may send each other the same LSP simultaneously.
      Because each router has many neighbors, this redundant flooding of
      link-state information wastes a significant amount of router
      resources.</t>

      <t>As a result, some MSDC operators have opted for BGP as the routing
      protocol <xref target="RFC7938"/>. However, with the introduction of
      high-performance Ethernet networks, which are widely used in AI and
      high-performance computing (HPC), it has become essential to have
      visibility of the whole network topology, and even of link capacity
      and load information, for global load balancing. Therefore, for
      large-scale AI and HPC Ethernet networks, link-state routing protocols
      such as IS-IS should be reconsidered as the routing protocol, provided
      that the scaling issue described above is addressed.</t>

      <t>This document presents a solution to the scaling issue described
      above. Instead of flooding link-state information between neighboring
      IS-IS routers within the MSDC network fabric, the link-state
      information originated by each IS-IS router is gathered by centralized
      controllers, which then distribute the collected link-state
      information to all IS-IS routers within the MSDC. As illustrated in
      Figure 1, all IS-IS routers in an MSDC network fabric are connected to
      one or more centralized controllers through a dedicated Local Area
      Network (LAN) that is used solely for link-state information
      collection and distribution. For redundancy, there should be at least
      two such link-state collection and distribution LANs.</t>

      <t><figure>
          <artwork align="center"><![CDATA[           +----------+                  +----------+                     
           |Controller|                  |Controller|                     
           +----+-----+                  +-----+----+                     
                |DIS                           |Candidate DIS                       
                |                              |                          
                |                              |                          
   ---+---------+---+----------+-----------+---+---------+-LS Collection&Distribution LAN       
      |             |          |           |             |                
      |Non-DIS      |Non-DIS   |Non-DIS    |Non-DIS      |Non-DIS          
      |             |          |           |             |                
      |         +---+--+       |       +---+--+          |                
      |         |Router|       |       |Router|          |                
      |         *------*-      |      /*---/--*          |                
      |        /     \   --    |    //    /    \         |                
      |        /     \     --  |  //      /    \         |                
      |       /       \      --|//       /      \        |                
      |       /        \      /*-       /        \       |                
      |      /          \   // | --    /         \       |                
      |      /          \ //   |   --  /          \      |                
      |     /           /X     |     --           \      |                
      |     /         //  \    |     / --          \     |                
      |    /        //    \    |     /   --         \    |                
      |    /      //       \   |    /      --       \    |                
      |   /     //          \  |   /         --      \   |                
      |   /   //             \ |  /            --     \  |                
      |  /  //               \ |  /              --   \  |                
    +-+- //*                +\\+-/-+               +---\-++               
    |Router|                |Router|               |Router|               
    +------+                +------+               +------+               

                              Figure 1
]]></artwork>
        </figure></t>

      <t>In the MSDC network, the IS-IS routers do not need to exchange any
      IS-IS Protocol Data Units (PDUs) other than Hello packets among
      themselves, because a controller acts as the IS-IS Designated
      Intermediate System (DIS) for the link-state collection and
      distribution LAN. To obtain the complete topology information of the
      MSDC network, the IS-IS routers exchange link-state information only
      with the controller that has been elected as the IS-IS DIS for that
      LAN.</t>

      <t>To further reduce the flooding of multicast IS-IS PDUs over the
      link-state collection and distribution LAN, IS-IS routers do not send
      multicast IS-IS Hello packets over that LAN. Instead, they first wait
      for IS-IS Hello packets from the controller that has been elected as
      the IS-IS DIS. Once the IS-IS DIS has been discovered, the routers
      start sending IS-IS Hello packets to it as unicasts at regular
      intervals. Consequently, IS-IS routers form an adjacency only with the
      IS-IS DIS over that LAN. IS-IS routers also send all other IS-IS PDUs
      to the IS-IS DIS as unicasts, whereas the IS-IS DIS continues to send
      IS-IS PDUs as before. These changes to current IS-IS router behavior
      significantly reduce IS-IS flooding and improve the scalability of
      MSDC networks.</t>
    </section>

    <section anchor="Abbreviations_Terminology" title="Terminology">
      <t>This memo makes use of the terms defined in <xref
      target="RFC1195"/>.</t>
    </section>

    <section title="Modifications to Current IS-IS Behaviors">

      <section title="IS-IS Routers as Non-DIS">
        <t>IS-IS routers exchange Hello packets bidirectionally and then
        originate Link State PDUs (LSPs) accordingly. However, these
        self-originated LSPs do not need to be exchanged directly between
        the routers; they only need to be sent to the IS-IS DIS for the
        link-state collection and distribution LAN. Note that IS-IS routers
        should not be elected as the IS-IS DIS for that LAN (this can be
        done by setting the DIS Priority of those IS-IS routers to
        zero).</t>

        <t>To further minimize the number of multicast IS-IS PDUs transmitted
        over the link-state collection and distribution LAN, IS-IS routers
        should send IS-IS PDUs as unicasts. Specifically, IS-IS routers must
        periodically send unicast IS-IS Hello packets to the controller
        elected as the IS-IS DIS. This means that IS-IS routers do not send
        any IS-IS Hello packets over that LAN until they have identified its
        IS-IS DIS. As a result, IS-IS routers do not discover each other
        over the LAN and do not establish adjacencies with each other.
        IS-IS routers should send all other types of IS-IS PDUs to the IS-IS
        DIS as unicasts as well.</t>
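
        <t>The following non-normative sketch illustrates the Hello behavior
        described above: a router stays silent on the LAN until it has
        learned the DIS, and afterwards sends every PDU to the DIS as a
        unicast. All names (NonDisRouter, dis_address, the is_dis flag, and
        the "IIH" and "LSP" labels) are hypothetical and are not taken from
        any IS-IS implementation. How a received Hello is recognized as
        coming from the DIS, and the form of the unicast address, are
        assumptions that are abstracted away here.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class NonDisRouter:
    """Hypothetical model of a router on the collection/distribution LAN."""

    system_id: str
    dis_priority: int = 0              # zero, so the router is not elected DIS
    dis_address: Optional[str] = None  # unknown until a DIS Hello is received
    # Log of (destination, pdu_type) pairs; stands in for real transmission.
    sent: List[Tuple[str, str]] = field(default_factory=list)

    def on_hello_received(self, source: str, is_dis: bool) -> None:
        """Learn the DIS address from a Hello sent by the elected DIS."""
        if is_dis:
            self.dis_address = source

    def on_hello_timer(self) -> None:
        """Periodic Hello: stay silent until a DIS is known, then unicast."""
        if self.dis_address is not None:
            self.send("IIH")

    def send(self, pdu_type: str) -> bool:
        """Unicast a PDU to the DIS; return False if no DIS is known yet."""
        if self.dis_address is None:
            return False
        self.sent.append((self.dis_address, pdu_type))
        return True


router = NonDisRouter(system_id="0000.0000.0001")
router.on_hello_timer()                       # no DIS known: nothing is sent
router.on_hello_received("controller-1", is_dis=True)
router.on_hello_timer()                       # Hello is unicast to the DIS
router.send("LSP")                            # other PDUs are unicast as well
```
]]></artwork>
          </figure></t>

        <t>Because no router in this sketch ever transmits a Hello before it
        has heard from the DIS, and then only addresses the DIS, routers do
        not learn of each other over the LAN, which matches the absence of
        router-to-router adjacencies described above.</t>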

        <t>To prevent data traffic from being forwarded across the link-state
        collection and distribution LAN, the cost of every IS-IS router
        interface attached to that LAN must be set to the maximum value.</t>
      </section>

      <section title="Controllers as DIS">
        <t>When a controller is elected as the IS-IS DIS, it sends IS-IS
        PDUs as multicasts or unicasts as normal. In addition, it is required
        to accept and process the unicast IS-IS PDUs originated by the
        IS-IS routers. Upon receiving a new LSP from an IS-IS router, the
        DIS must immediately flood it onto the link-state collection and
        distribution LAN. This serves two purposes: 1) it implicitly
        acknowledges receipt of that LSP, and 2) it synchronizes that LSP to
        all other IS-IS routers.</t>

        <t>To reduce the frequency with which the DIS advertises Complete
        Sequence Number PDUs (CSNPs) on the link-state collection and
        distribution LAN, it is recommended that IS-IS routers send an
        explicit acknowledgement in the form of a Partial Sequence Number
        PDU (PSNP) upon receiving a new LSP from the DIS.</t>
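
        <t>The following non-normative sketch illustrates how the behaviors
        in this section could fit together: the DIS floods a new LSP received
        as a unicast, the originator treats the flooded copy as the implicit
        acknowledgement, and every other router that installs the LSP answers
        with a PSNP. The class and method names, the reduction of an LSP to
        an (origin, sequence number) pair, and the unacked field are
        assumptions made for illustration only; timers, retransmission, and
        CSNPs are omitted.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (originating system ID, sequence number): a simplification of a real LSP.
Lsp = Tuple[str, int]


@dataclass
class Router:
    """Hypothetical non-DIS router attached to the LAN."""

    system_id: str
    lsdb: Dict[str, int] = field(default_factory=dict)
    unacked: Optional[int] = None  # own sequence number awaiting the DIS flood

    def originate(self) -> Lsp:
        """Create the next version of the router's own LSP."""
        seq = self.lsdb.get(self.system_id, 0) + 1
        self.lsdb[self.system_id] = seq
        self.unacked = seq
        return (self.system_id, seq)

    def on_lsp_from_dis(self, lsp: Lsp) -> bool:
        """Process a flooded LSP; return True when a PSNP should be sent."""
        origin, seq = lsp
        if origin == self.system_id:
            if self.unacked is not None and seq >= self.unacked:
                self.unacked = None  # flooded copy is the implicit ack
            return False
        if seq <= self.lsdb.get(origin, 0):
            return False             # not new: nothing to acknowledge
        self.lsdb[origin] = seq
        return True                  # new LSP: acknowledge with a PSNP


@dataclass
class Controller:
    """Hypothetical controller acting as DIS."""

    lsdb: Dict[str, int] = field(default_factory=dict)
    psnps: List[Tuple[str, Lsp]] = field(default_factory=list)

    def on_unicast_lsp(self, lsp: Lsp, lan: List[Router]) -> bool:
        """Flood a new LSP at once; return False for an old or duplicate one."""
        origin, seq = lsp
        if seq <= self.lsdb.get(origin, 0):
            return False
        self.lsdb[origin] = seq
        for router in lan:           # stands in for one multicast on the LAN
            if router.on_lsp_from_dis(lsp):
                self.psnps.append((router.system_id, lsp))
        return True


r1, r2, r3 = Router("r1"), Router("r2"), Router("r3")
dis = Controller()
dis.on_unicast_lsp(r1.originate(), [r1, r2, r3])
```
]]></artwork>
          </figure></t>

        <t>In this sketch, a single unicast from the originator and a single
        flood from the DIS bring all routers up to date, and the PSNPs
        recorded by the DIS tell it which routers already hold the LSP.</t>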
      </section>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>The authors would like to thank Peter Lothberg and Erik Auerswald for
      their valuable comments and suggestions on this document.</t>

    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>TBD.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>TBD.</t>

    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2119'?>

      <?rfc include='reference.RFC.1195'?>

    </references>

    <references title="Informative References">

      <?rfc include='reference.RFC.4136'?>

      <?rfc include='reference.RFC.7938'?>

    </references>
  </back>
</rfc>
