<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
    There has to be one entity for each item to be referenced. 
    An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY RFC5226 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5226.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-ietf-idr-enhanced-gr-06" ipr="trust200902">
 <!-- category values: std, bcp, info, exp, and historic
    ipr values: trust200902, noModificationTrust200902, noDerivativesTrust200902,
       or pre5378Trust200902
    you can add the attributes updates="NNNN" and obsoletes="NNNN" 
    they will automatically be output with "(if approved)" -->

 <!-- ***** FRONT MATTER ***** -->

 <front>
   <!-- The abbreviated title is used in the page header - it is only necessary if the 
        full title is longer than 39 characters -->

   <title abbrev="Enhanced BGP Graceful Restart">Accelerated Routing Convergence for BGP Graceful Restart
   </title>

   <!-- add 'role="editor"' below for the editors if appropriate -->


   <author fullname="Keyur Patel" initials="K." 
           surname="Patel">
     <organization>Cisco Systems</organization>

     <address>
       <postal>
         <street>170 W. Tasman Drive</street>

         <!-- Reorder these if your country does things differently -->

         <city>San Jose</city>

         <region>CA</region>

         <code>95134</code>

         <country>USA</country>
       </postal>

       <phone></phone>

       <email>keyupate@cisco.com</email>
     </address>
   </author>

   <author fullname="Enke Chen" initials="E." 
           surname="Chen">
     <organization>Cisco Systems</organization>

     <address>
       <postal>
         <street>170 W. Tasman Drive</street>

         <!-- Reorder these if your country does things differently -->

         <city>San Jose</city>

         <region>CA</region>

         <code>95134</code>

         <country>USA</country>
       </postal>

       <phone></phone>

       <email>enkechen@cisco.com</email>
     </address>
   </author>

   <author fullname="Rex Fernando" initials="R." 
           surname="Fernando">
     <organization>Cisco Systems</organization>

     <address>
       <postal>
         <street>170 W. Tasman Drive</street>

         <!-- Reorder these if your country does things differently -->

         <city>San Jose</city>

         <region>CA</region>

         <code>95134</code>

         <country>USA</country>
       </postal>

       <phone></phone>

       <email>rex@cisco.com</email>
     </address>
   </author>

   <author fullname="John Scudder" initials="J." 
           surname="Scudder">
     <organization>Juniper Networks</organization>

     <address>
       <postal>
         <street>1133 Innovation Way</street>

         <!-- Reorder these if your country does things differently -->

         <city>Sunnyvale</city>

         <region>CA</region>

         <code>94089</code>

         <country>USA</country>
       </postal>

       <phone></phone>

       <email>jgs@juniper.net</email>
     </address>
   </author>

   <date/>

   <!-- Meta-data Declarations -->

   <area>Routing</area>

   <workgroup>Inter-Domain Routing</workgroup>

   <keyword>BGP</keyword>

   <abstract>
   	<t>
   In this document we specify extensions to BGP graceful restart in
   order to avoid unnecessary transmission of the routing information
   preserved across a session restart, thus accelerating the routing
   convergence.
   	</t>
   </abstract>
 </front>

 <middle>
   <section title="Introduction">
	<t>
   Currently the BGP graceful restart (GR) mechanism specified in
   <xref target="RFC4724"/> requires a complete re-advertisement of the routing
   information across a session restart, even though the routing
   information may have been preserved.  For example, as described in
   <xref target="RFC4724"/>, the "Receiving Speaker" temporarily maintains the routes
   received from its neighbor with the GR Capability.  In addition, the
   "Restarting Speaker" may also be able to preserve routing information
   across a BGP restart by checkpointing routing information to a
   standby or secondary facility.
	</t>
	<t>
   Clearly, routing re-convergence after a session restart would be
   faster if we could avoid unnecessary transmission of the routing
   information preserved across the restart. That is the goal of
   this document.
	</t>
	<t>
   To that end, we describe a "version number" based mechanism for
   keeping track of the routing information across a session restart.
   A new BGP message type, UPDATE-VERSION, is introduced for
   checkpointing the update version maintained for a neighbor.  We
   also introduce the Enhanced Graceful Restart Capability, and
   specify procedures for handling routing updates across a session
   restart.
	</t>

     <section title="Requirements Language">
       <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
       "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
       document are to be interpreted as described in <xref
       target="RFC2119">RFC 2119</xref>.</t>
     </section>
   </section>
   
   <section title="Version Numbers for Routing Entities">
   	<t>
   In order to avoid unnecessary transmission of the routing information
   preserved across a session restart, a BGP speaker will need to
   identify exactly "what" has been preserved by a remote speaker.
	</t>
	<t>
   The approach described here is "version number" (or "sequence
   number") based, and it consists of (a) assigning a unique,
   monotonically increasing number as the version number for each
   routing entity (e.g., route or message) when it is created or
   modified; and (b) maintaining an update version (for each neighbor)
   calculated as the maximum of the version numbers of all the routing
   entities that have been sent to the neighbor.
	</t>
	<t>
   A BGP speaker can tell whether a given routing entity has been sent
   to a neighbor by comparing the version number of the entity with the
   update version for the neighbor.  Thus by checkpointing the update
   version for a neighbor across a session restart, a BGP speaker would
   be able to identify exactly "what" has been preserved by a remote
   speaker, and also "what" remains to be sent.
	</t>
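	<t>
   The bookkeeping described above can be illustrated with a minimal
   Python sketch.  It is not part of the specification: the class name
   VersionedTable, its methods, and the sample prefixes are hypothetical,
   and a single table stands in for one AFI/SAFI number space.
	</t>
     <figure>
       <artwork><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class VersionedTable:
    """Routing entities of one AFI/SAFI, each with a version."""
    next_version: int = 1   # 0 is reserved for the "epoch"
    entities: dict = field(default_factory=dict)

    def upsert(self, key, value):
        """Create or modify an entity; stamp a fresh version."""
        version = self.next_version
        self.next_version += 1
        self.entities[key] = (version, value)
        return version

    def pending_for(self, update_version):
        """Entities newer than a neighbor's update version."""
        newer = [(ver, key, val)
                 for key, (ver, val) in self.entities.items()
                 if ver > update_version]
        return sorted(newer)


table = VersionedTable()
table.upsert("10.0.0.0/8", "path-A")     # version 1
table.upsert("192.0.2.0/24", "path-B")   # version 2
update_version = 2      # max version already sent to the neighbor
table.upsert("10.0.0.0/8", "path-C")     # modified: version 3
to_send = table.pending_for(update_version)
```
]]></artwork>
     </figure>
	<t>
   In this sketch only the entity modified after the neighbor's update
   version remains to be sent, which is the comparison the mechanism
   relies on.
	</t>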
	<t>
   In this document a version number is an 8-octet unsigned integer.
   Value 0 is used to indicate the beginning (or "epoch") of the update
   generation.  The version number is not expected to wrap.  However, in
   the unlikely scenario that it does wrap, the sender MUST maintain its
   internal consistency, and also MUST perform a route refresh <xref target="RFC2918"/>,
   <xref target="RFC7313"/> toward the receiver.
	</t>
	<t>
   The number space for the version numbers should be specific to each
   AFI/SAFI <xref target="RFC4760"/>.  Version numbers are also assigned (from the same number
   space) to other AFI/SAFI specific, non-update information (such as
   ROUTE-REFRESH <xref target="RFC2918"/>), and are included in the calculation of the
   update version for a neighbor.
	</t>
   </section>

   <section title="UPDATE-VERSION Message">
   	<t>
   The UPDATE-VERSION message is a new BGP message type with type code
   &lt;TBD&gt;.  In addition to the fixed-size BGP header <xref target="RFC4271"/>, the
   UPDATE-VERSION message contains the following fields:
	</t>

     <figure align="center">
       <artwork align="left"><![CDATA[
             +------------------------------------------------+
             | Address Family Identifier (2 octets)           |
             +------------------------------------------------+
             | Subsequent Address Family Identifier (1 octet) |
             +------------------------------------------------+
             | Message Subtype (1 octet)                      |
             +------------------------------------------------+
             | Version (8 octets)                             |
             +------------------------------------------------+
           ]]></artwork>
     </figure>

	<t>
   The "Address Family Identifier" (AFI) field and the "Subsequent
   Address Family Identifier" (SAFI) field are the same as the ones used
   in <xref target="RFC4760"/>.
	</t>
	<t>
   The "Message Subtype" field indicates whether the sender is (a)
   sending an update version (value 1), (b) acknowledging the receipt of
   an update version (value 2), or (c) requesting updates from the very
   last update version the sender has acknowledged (value 3).
	</t>
	<t>
   The Version field contains an update version associated with the
   message subtypes 1 and 2.  The value of this field is irrelevant for
   the message subtype 3.  The value of this field is opaque to the
   receiver.
	</t>
	<t>
   As detailed in the Operation section, the UPDATE-VERSION message can
   be used by a BGP speaker to carry an update version, to
   acknowledge the receipt of an update version, or to request updates
   from the very last update version acknowledged.
	</t>
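	<t>
   As an illustration of the layout above, the following Python sketch
   encodes and decodes an UPDATE-VERSION message.  The type code is still
   to be assigned, so the value 255 is a placeholder chosen only for this
   example, and the function names are hypothetical.  The 19-octet header
   layout (marker, length, type) is the one defined in
   <xref target="RFC4271"/>.
	</t>
     <figure>
       <artwork><![CDATA[
```python
import struct

BGP_HEADER_LEN = 19      # marker (16) + length (2) + type (1)
MSG_TYPE = 255           # placeholder: the real type code is TBD
BODY = struct.Struct("!HBBQ")   # AFI, SAFI, subtype, version

SUBTYPE_VERSION, SUBTYPE_ACK, SUBTYPE_REQUEST = 1, 2, 3


def encode(afi, safi, subtype, version=0):
    """Build a complete UPDATE-VERSION message with its header."""
    length = BGP_HEADER_LEN + BODY.size
    marker = b"\xff" * 16
    header = marker + struct.pack("!HB", length, MSG_TYPE)
    return header + BODY.pack(afi, safi, subtype, version)


def decode(message):
    """Return (afi, safi, subtype, version) of a full message."""
    return BODY.unpack(message[BGP_HEADER_LEN:])


msg = encode(afi=1, safi=1, subtype=SUBTYPE_VERSION, version=42)
```
]]></artwork>
     </figure>
	<t>
   With the fields above, the message body is 12 octets and the complete
   message, including the BGP header, is 31 octets.
	</t>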
   </section>

   <section title="Enhanced Graceful Restart Capability">
	<t>
   The Enhanced Graceful Restart (GR) Capability is a new BGP capability
   <xref target="RFC5492"/>.  The Capability Code for this capability is specified in
   the IANA Considerations section of this document. The Capability
   Length field of this capability is 0.
	</t>
	<t>
   By advertising the Enhanced GR Capability to a peer, a BGP speaker
   conveys to the peer that the speaker is capable of receiving and
   properly handling the UPDATE-VERSION message from the peer, as well
   as recognizing the two new bit flags defined below for the GR
   Capability.
	</t>
	<t>
   The two new bit flags for the "Flags for Address Family" field of the
   GR Capability are defined as follows:
	</t>

     <figure align="center">
       <artwork align="left"><![CDATA[
                        0 1 2 3 4 5 6 7
                       +-+-+-+-+-+-+-+-+
                       | | |R|T|       |
                       +-+-+-+-+-+-+-+-+
           ]]></artwork>
     </figure>

	<t>
   The third most significant bit (R) is defined as the "RX Routing
   State".  It indicates whether, across the previous session restart,
   the received routes of the given AFI/SAFI have been preserved up to
   the update version previously acknowledged by the speaker.  When set
   (value 1), the bit indicates that the routes have been preserved.
	</t>
	<t>
   The fourth most significant bit (T) is defined as the "TX Routing
   State", which is used to indicate whether the speaker has indeed
   preserved enough state to resume advertising routes of the given
   AFI/SAFI from the update version acknowledged by the neighbor
   previously.  When set (value 1), the bit indicates that the state has
   been preserved.
	</t>
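	<t>
   The two flags can be extracted from the one-octet "Flags for Address
   Family" field as in the Python sketch below.  The mask values follow
   from the bit positions in the figure (bit 0 being the most significant
   bit); the names and the function are hypothetical.  The sketch also
   assumes that the bits are simply reported as not set when the Enhanced
   GR Capability is absent, since they are not to be examined in that case.
	</t>
     <figure>
       <artwork><![CDATA[
```python
RX_ROUTING_STATE = 0x20   # bit 2: third most significant (R)
TX_ROUTING_STATE = 0x10   # bit 3: fourth most significant (T)


def parse_flags(flags, enhanced_gr_received):
    """Return (rx_preserved, tx_preserved) for one AFI/SAFI."""
    if not enhanced_gr_received:
        # The new bits are not examined without Enhanced GR.
        return (False, False)
    rx = bool(flags & RX_ROUTING_STATE)
    tx = bool(flags & TX_ROUTING_STATE)
    return (rx, tx)
```
]]></artwork>
     </figure>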
   </section>
   
   <section title="Operation">
	<t>
   In order for a BGP speaker to be able to resume sending routing
   information for an AFI/SAFI from the last update version that was
   previously acknowledged by a peer, the speaker MUST maintain enough
   state for all the routing information that has been sent until its
   acknowledgment is received by the speaker.  The routing information
   includes reachable / unreachable information as well as other
   AFI/SAFI specific, non-update information.  Furthermore, the route
   advertisement state needs to be maintained properly in order to
   minimize spurious route withdrawals across a session restart.
	</t>
	<t>
   An implementation SHOULD impose an upper bound on how much state it
   maintains in case a receiver ("slow peer") is not able
   to generate an acknowledgment in a timely manner.  The upper bound
   might be based on a number of factors, such as the number of pending
   unacknowledged withdrawals or, more generally, the volume of
   unacknowledged state, and a timer.  If the acknowledgment from a
   peer is not received within the specified upper bound, and the
   maintained state is compromised, then the speaker MUST clear the "TX
   Routing State" in the GR Capability to be advertised to the peer in
   the next session restart.
	</t>
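	<t>
   One possible way to bound the unacknowledged state is sketched below
   in Python.  It is only an illustration: the class TxState, the two
   limits (an entry count and an age in seconds), and the choice to
   discard all pending state once a limit is exceeded are assumptions of
   this example, not requirements of this document.
	</t>
     <figure>
       <artwork><![CDATA[
```python
class TxState:
    """Unacknowledged state kept for one peer and AFI/SAFI."""

    def __init__(self, max_entries, max_age):
        self.max_entries = max_entries  # hypothetical volume bound
        self.max_age = max_age          # hypothetical timer (s)
        self.unacked = []               # (version, sent_at) pairs
        self.tx_routing_state = True    # value for the "T" flag

    def record_sent(self, version, now):
        self.unacked.append((version, now))
        self._enforce_bounds(now)

    def record_ack(self, version):
        """Drop state covered by an acknowledged version."""
        self.unacked = [(v, t) for v, t in self.unacked
                        if v > version]

    def _enforce_bounds(self, now):
        too_many = len(self.unacked) > self.max_entries
        oldest = self.unacked[0][1] if self.unacked else now
        too_old = now - oldest > self.max_age
        if too_many or too_old:
            # State is compromised: clear it and the flag.
            self.unacked.clear()
            self.tx_routing_state = False
```
]]></artwork>
     </figure>
	<t>
   Once tx_routing_state is False in this sketch, the speaker would
   advertise the "TX Routing State" bit as clear at the next session
   restart.
	</t>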
	<t>
   A BGP speaker MAY advertise the Enhanced GR Capability to its peer if
   the speaker is capable of receiving and properly handling the
   UPDATE-VERSION message from the peer, and also recognizing the two new
   bit flags in the GR Capability.  If the GR Capability is to be sent by
   the speaker, the "RX Routing State" for an AFI/SAFI in the GR
   Capability SHOULD be set if the speaker has preserved the routing
   information from the peer up to the update version that the speaker
   acknowledged previously.  In addition, the "TX Routing State" for an
   AFI/SAFI in the GR Capability SHOULD be set if the speaker has
   preserved enough routing state to resume sending messages from the
   update version acknowledged by the peer previously.
	</t>
	<t>
   When both the GR Capability and the Enhanced GR Capability are to be
   included in an OPEN message, it is RECOMMENDED (though not required)
   that the Enhanced GR Capability be placed ahead of the GR Capability.
	</t>
	<t>
   In processing the GR Capability in an OPEN message from a peer, a BGP
   speaker MUST NOT examine the two new bit flags defined in this
   document for the GR Capability unless the Enhanced GR Capability is
   also present in the OPEN message.
	</t>
	<t>
   A BGP speaker MAY send an UPDATE-VERSION message to a peer only if
   the Enhanced GR Capability is received from the peer.
	</t>
	<t>
   Once a BGP speaker receives the Enhanced GR Capability from its peer,
   the speaker SHOULD send an UPDATE-VERSION message carrying the update
   version after sending a significant amount of routing information
   (including non-UPDATE messages) for an AFI/SAFI.  This SHALL continue
   as long as routing information is being sent.  To reduce the overhead
   caused by an excessive number of UPDATE-VERSION messages, we highly
   recommend the "batching" approach, that is, using one UPDATE-VERSION
   message to cover a number of routing updates and/or a meaningful
   duration of time.
	</t>
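	<t>
   The "batching" approach might look like the following Python sketch,
   in which an UPDATE-VERSION message is triggered either after a fixed
   number of routing updates or after a fixed interval, whichever comes
   first.  The class name and both thresholds are hypothetical; this
   document does not prescribe specific values.
	</t>
     <figure>
       <artwork><![CDATA[
```python
class VersionBatcher:
    """Decide when an UPDATE-VERSION message is worth sending."""

    def __init__(self, max_updates, max_interval):
        self.max_updates = max_updates    # hypothetical batch size
        self.max_interval = max_interval  # hypothetical seconds
        self.pending = 0
        self.last_checkpoint = 0.0

    def on_update_sent(self, version, now):
        """Return the version to checkpoint, or None to wait."""
        self.pending += 1
        due = (self.pending >= self.max_updates
               or now - self.last_checkpoint >= self.max_interval)
        if not due:
            return None
        self.pending = 0
        self.last_checkpoint = now
        return version


batcher = VersionBatcher(max_updates=3, max_interval=5.0)
results = [batcher.on_update_sent(v, now=0.1 * v)
           for v in range(1, 7)]
```
]]></artwork>
     </figure>
	<t>
   In this run six updates produce two checkpoints, for versions 3 and 6,
   rather than six UPDATE-VERSION messages.
	</t>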
	<t>
   When a BGP speaker receives an UPDATE-VERSION message carrying an
   update version, if the AFI/SAFI carried by the message does not match
   any AFI/SAFI that the speaker is willing to receive from the peer,
   the UPDATE-VERSION message SHALL be ignored.  Otherwise, the speaker
   MUST send an UPDATE-VERSION message back promptly acknowledging the
   receipt of the update version.  The UPDATE-VERSION messages carrying
   the acknowledgments MUST be sent in the same order as the received
   UPDATE-VERSION messages carrying the update versions.
	</t>
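	<t>
   The receiver-side rule above can be sketched in Python as follows.
   The function name, the tuple representation of a message, and the
   "negotiated" set are hypothetical.  The sketch assumes that an
   acknowledgment echoes the received Version value unchanged, which is
   consistent with that value being opaque to the receiver.
	</t>
     <figure>
       <artwork><![CDATA[
```python
SUBTYPE_VERSION, SUBTYPE_ACK = 1, 2


def handle_update_version(afi, safi, subtype, version,
                          negotiated):
    """Return the acknowledgment to send, or None.

    `negotiated` holds the (AFI, SAFI) pairs this speaker is
    willing to receive from the peer.
    """
    if subtype != SUBTYPE_VERSION:
        return None      # other subtypes are handled elsewhere
    if (afi, safi) not in negotiated:
        return None      # the message is ignored
    # Echo the opaque version back in an acknowledgment.
    return (afi, safi, SUBTYPE_ACK, version)


incoming = [(1, 1, 1, 10), (2, 1, 1, 7), (1, 1, 1, 20)]
acks = []
for message in incoming:
    ack = handle_update_version(*message, negotiated={(1, 1)})
    if ack is not None:
        acks.append(ack)   # preserves the order of receipt
```
]]></artwork>
     </figure>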
	<t>
   When a BGP speaker receives an UPDATE-VERSION message acknowledging
   an update version, the speaker MUST record this latest update version
   being acknowledged for future use.
    </t>
    <t>
   Consider the case that both the GR Capability and the Enhanced GR
   Capability are exchanged between Speaker A and Speaker B, and for an
   AFI/SAFI the "TX Routing State" is set in the GR Capability advertised
   by A, and the "RX Routing State" is also set in the GR Capability
   received from B.  Then
   Speaker A SHALL send routing information from the last update version
   that was previously acknowledged by Speaker B.  Note that it may be
   advantageous for Speaker B to send an UPDATE-VERSION message
   acknowledging the most recent update version immediately after the
   session is established.  Also, Speaker B MUST NOT follow the
   procedures described in <xref target="RFC4724"/> for purging stale routes.  If the
   conditions specified in this paragraph are not satisfied, then the
   procedures described in <xref target="RFC4724"/> remain unchanged.
	</t>
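	<t>
   From Speaker A's point of view, the conditions in the preceding
   paragraph reduce to a single test per AFI/SAFI, sketched below in
   Python.  The function and parameter names are hypothetical, and None
   is used here only as a marker for "follow the unchanged procedures of
   <xref target="RFC4724"/>".
	</t>
     <figure>
       <artwork><![CDATA[
```python
def resume_point(gr_exchanged, enhanced_gr_exchanged,
                 local_tx_state, peer_rx_state,
                 last_acked_version):
    """Return the update version to resume from, or None.

    None means the conditions are not all met and the
    procedures of RFC 4724 apply unchanged.
    """
    if (gr_exchanged and enhanced_gr_exchanged
            and local_tx_state and peer_rx_state):
        return last_acked_version
    return None
```
]]></artwork>
     </figure>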
	<t>
   During the lifetime of an established session, if needed, a BGP
   speaker MAY use the UPDATE-VERSION message to request updates from
   the last update version that was previously acknowledged as long as
   the speaker has received the Enhanced GR Capability from its peer.
	</t>
	<t>
   When a BGP speaker receives such a request, it SHALL try to send
   routing information from the last acknowledged update version that
   the speaker has recorded.  If the speaker is unable to do so for some
   reason (e.g., "slow peer"), then it SHOULD perform a route refresh
   using the mechanism defined in <xref target="RFC7313"/> if possible.  Otherwise, the BGP
   speaker SHOULD reset the session.
	</t>
   </section>
   
   <section title="Error Handling">
   	<t>
   This document defines a new NOTIFICATION error code:
	</t>
       <texttable>
         <ttcol align="center">Error Code</ttcol>

         <ttcol align="center">Symbolic Name</ttcol>

         <c>TBD</c>

         <c>UPDATE-VERSION Message Error</c>
       </texttable>
	<t>
   The following error subcodes are defined as well:
	</t>
       <texttable>
         <ttcol align="center">Subcode</ttcol>

         <ttcol align="center">Symbolic Name</ttcol>

         <c>1</c>
         <c>Invalid Message Length</c>
         
         <c>2</c>
         <c>Invalid Message Subtype</c>
       </texttable>
	<t>
   If a BGP speaker detects an error while processing an UPDATE-VERSION
   message, it MUST send a NOTIFICATION message with Error Code
   UPDATE-VERSION Message Error.  The Data field of the NOTIFICATION
   message MUST contain the complete UPDATE-VERSION message.
	</t>
	<t>
   If the Length field for the UPDATE-VERSION message is incorrect, then
   the error subcode is set to "Invalid Message Length".
	</t>
	<t>
   If the Message Subtype in the UPDATE-VERSION message is not one of
   the defined values, then the error subcode is set to "Invalid Message
   Subtype".
   	</t>
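	<t>
   The two checks above can be sketched in Python as follows.  The
   sketch assumes a fixed message size of 31 octets (the 19-octet BGP
   header plus the 12 octets of fields defined for the UPDATE-VERSION
   message); the names validate and build are hypothetical, and the
   message type octet is left as zero because the type code is not yet
   assigned.
	</t>
     <figure>
       <artwork><![CDATA[
```python
BGP_HEADER_LEN = 19
BODY_LEN = 12    # AFI (2) + SAFI (1) + subtype (1) + version (8)
INVALID_MESSAGE_LENGTH = 1
INVALID_MESSAGE_SUBTYPE = 2
DEFINED_SUBTYPES = {1, 2, 3}


def validate(message):
    """Return an error subcode, or None if no error is found."""
    expected = BGP_HEADER_LEN + BODY_LEN
    declared = int.from_bytes(message[16:18], "big")
    if len(message) != expected or declared != expected:
        return INVALID_MESSAGE_LENGTH
    subtype = message[BGP_HEADER_LEN + 3]
    if subtype not in DEFINED_SUBTYPES:
        return INVALID_MESSAGE_SUBTYPE
    return None


def build(subtype, length=31):
    """Hypothetical helper: craft a message for the checks."""
    header = b"\xff" * 16 + length.to_bytes(2, "big") + b"\x00"
    body = (1).to_bytes(2, "big") + bytes([1, subtype])
    return header + body + bytes(8)
```
]]></artwork>
     </figure>
	<t>
   A speaker that obtains a subcode from such a check would then place
   the complete offending message in the Data field of the NOTIFICATION
   message, as required above.
	</t>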
   </section>

   <section anchor="Acknowledgements" title="Acknowledgements">
     <t>Thanks to Jonathan Looney for valuable review and suggestions.</t>
   </section>

   <!-- Possibly a 'Contributors' section ... -->

   <section anchor="IANA" title="IANA Considerations">
     <t>
   This document introduces the Enhanced Graceful Restart Capability.
   The capability code needs to be assigned by IANA per <xref target="RFC5492"/>.
	</t>
	<t>
   This document introduces a new BGP message type, UPDATE-VERSION.  The
   type code needs to be assigned by IANA.
	</t>
	<t>
   In addition, this document defines a NOTIFICATION error code and
   several error subcodes for the UPDATE-VERSION message.  They need to
   be registered with IANA.
     </t>
   </section>

   <section anchor="Security" title="Security Considerations">
     <t>
   This extension to BGP does not change the underlying security issues
   inherent in the existing BGP <xref target="RFC4271"/> and BGP graceful restart <xref target="RFC4724"/>.
     </t>
   </section>
 </middle>

 <!--  *****BACK MATTER ***** -->

 <back>
   <!-- References split into informative and normative -->

   <!-- There are 2 ways to insert reference entries from the citation libraries:
    1. define an ENTITY at the top, and use "ampersand character"RFC2629; here (as shown)
    2. simply use a PI "less than character"?rfc include="reference.RFC.2119.xml"?> here
       (for I-Ds: include="reference.I-D.narten-iana-considerations-rfc2434bis.xml")

    Both are cited textually in the same manner: by using xref elements.
    If you use the PI option, xml2rfc will, by default, try to find included files in the same
    directory as the including file. You can also define the XML_LIBRARY environment variable
    with a value containing a set of directories to search.  These can be either in the local
    filing system or remote ones accessed by http (http://domain/dir/... ).-->

   <references title="Normative References">
     <?rfc include="reference.RFC.2119.xml"?>
     <?rfc include="reference.RFC.2918.xml"?>
     <?rfc include="reference.RFC.4271.xml"?>
     <?rfc include="reference.RFC.4724.xml"?>
     <?rfc include="reference.RFC.4760.xml"?>
     <?rfc include="reference.RFC.5492.xml"?>
     <?rfc include="reference.RFC.7313.xml"?>
   </references>
 </back>
</rfc>
