<?xml version="1.0" encoding="UTF-8"?>

<?rfc sortrefs="yes"?>
<?rfc subcompact="no"?>
<?rfc symrefs="yes"?>
<?rfc toc="yes"?>
<?rfc tocdepth="3"?>
<?rfc compact="yes"?>

<rfc
  category="std"
  docName="draft-ietf-idr-bgp-sendholdtimer-03"
  updates="4271"
  ipr="trust200902"
  submissionType="IETF"
  consensus="true">

<front>

  <title abbrev="BGP SendHoldTimer">Border Gateway Protocol 4 (BGP-4) Send Hold Timer</title>

  <author fullname="Job Snijders" initials="J." surname="Snijders">
    <organization>Fastly</organization>
    <address>
      <postal>
        <street />
        <city>Amsterdam</city>
        <code />
        <country>Netherlands</country>
      </postal>
      <email>job@fastly.com</email>
    </address>
  </author>

  <author fullname="Ben Cartwright-Cox" initials="B." surname="Cartwright-Cox">
    <organization>Port 179 Ltd</organization>
    <address>
      <postal>
        <street />
        <city>London</city>
        <code />
        <country>United Kingdom</country>
      </postal>
      <email>ben@benjojo.co.uk</email>
    </address>
  </author>

  <author fullname="Yingzhen Qu" initials="Y." surname="Qu">
    <organization>Futurewei Technologies</organization>
    <address>
      <postal>
        <street />
        <city>Santa Clara</city>
        <code />
        <country>United States</country>
      </postal>
      <email>yingzhen.ietf@gmail.com</email>
    </address>
  </author>
  <date />

  <area>Routing</area>
  <workgroup>IDR</workgroup>
  <keyword>BGP</keyword>
  <keyword>session</keyword>
  <keyword>TCP</keyword>
  <keyword>EBGP</keyword>


  <abstract>

    <t>
      This document defines the SendHoldTimer and the SendHoldTimer_Expires event for the Border Gateway Protocol (BGP) Finite State Machine (FSM).
      Implementation of the SendHoldTimer helps overcome situations where a BGP session is not terminated after the local system detects that the remote system is not processing BGP messages.
      This document specifies that, when the SendHoldTimer expires, the local system should close the BGP connection rather than rely solely on the remote system for session closure.
      This document updates RFC 4271.
    </t>

  </abstract>

  <note title="Requirements Language">
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.
      </t>
  </note>

</front>

<middle>

  <section title="Introduction">

    <t>
      This document defines the SendHoldTimer and the SendHoldTimer_Expires event for the Border Gateway Protocol (BGP) Finite State Machine (FSM) defined in Section 8 of <xref target="RFC4271" />.
    </t>

    <t>
      Failure to terminate a blocked BGP session can result in denial of service, and the subsequent failure to generate and deliver BGP UPDATE messages (including route withdrawals) to other BGP peers of the local system is detrimental to all participants of the inter-domain routing system.
      This phenomenon is thought to have contributed to IP traffic blackholing events in the global Internet routing system <xref target="bgpzombies"/>.
    </t>
    <t>
      This specification intends to improve this situation by requiring sessions to be terminated if the local system has detected that the remote system cannot possibly have received any BGP messages for the duration of the SendHoldTime.
      Through codification of the aforementioned requirement, operators will benefit from consistent behavior across different BGP implementations.
    </t>
    <t>
      BGP speakers following this specification do not exclusively rely on remote systems closing blocked connections, but will also locally close connections.
    </t>

  </section>

  <section title="Example of a problematic scenario">
    <t>
      In implementations lacking the concept of a SendHoldTimer, a malfunctioning or overwhelmed remote peer may cause data on the BGP socket in the local system to accumulate ad infinitum.
      This could result in forwarding failure and traffic loss, as the overwhelmed peer continues to utilize stale routes.
    </t>
    <t>
      An example fault state: as BGP runs over TCP <xref target="RFC9293" />, it is possible for a BGP speaker in the Established state to encounter a BGP peer that is advertising a TCP Receive Window (RCV.WND) of size zero.
      This zero window prevents the local system from sending KEEPALIVE, UPDATE, NOTIFICATION, or any other BGP messages across the network socket to the remote peer.
    </t>
    <t>
      Generally, BGP implementations have no visibility into lower-layer subsystems such as TCP or the peer's current Receive Window size, and there is no existing BGP mechanism for tearing down such a blocked session.
      Hence, BGP implementations are not able to handle this situation in a consistent fashion.
    </t>
    <t>
      This document provides a mechanism for BGP implementations to detect whether the TCP socket to a BGP peer is progressing (data is being transmitted), or persisting in a stalled state.
      In case of a stalled state, the BGP session can be restarted.
    </t>
  </section>

  <section title="SendHoldTimer - Changes to RFC 4271">
  <!-- todo:

     rfc 4271 section 4 - explain the sendholdtimer does not need to be negotiated
     rfc 4271 section 6.* - Send Hold Timer Expired Error Handling
     rfc 4271 section 8.1.1 - event
     rfc 4271 section 8.1.3 - timer
     rfc 4271 section 8.2.1.4 - fsm event numbers?
    -->

  <t>
    BGP speakers are implemented following a conceptual model "BGP Finite State Machine" (FSM), which is outlined in section 8 of <xref target="RFC4271"/>.
    This specification adds a BGP timer, SendHoldTimer, and updates the BGP FSM as follows:
  </t>

  <section title="Session Attributes">
    <t>
      The following mandatory session attributes for each connection are added to Section 8, before "The state session attribute indicates the current state of the BGP FSM":
    </t>

    <t>
      <list style="empty">
        <t>
          9) SendHoldTimer
        </t>
        <t>
          10) SendHoldTime (an initial value of 8 minutes is recommended)
        </t>
      </list>
    </t>

    <t>
      The SendHoldTime determines how long a BGP speaker stays in the Established state before the TCP connection is dropped because no BGP messages can be transmitted to its peer.
      A BGP speaker can configure the value of the SendHoldTime for each peer independently.
    </t>
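    <t>
      As a non-normative illustration, per-peer selection of the SendHoldTime could be sketched as follows.
      The names DEFAULT_SEND_HOLD_TIME, PeerConfig, and send_hold_time_for are hypothetical and are not defined by this document or by <xref target="RFC4271"/>; the only value taken from the text above is the recommended 8 minutes.
    </t>
    <figure>
      <artwork><![CDATA[
```python
# Recommended initial value from this document: 8 minutes, in seconds.
DEFAULT_SEND_HOLD_TIME = 8 * 60


class PeerConfig:
    """Hypothetical per-peer configuration record (illustration only)."""

    def __init__(self, address, send_hold_time=None):
        self.address = address
        # None means "use the default"; otherwise an override in seconds.
        self.send_hold_time = send_hold_time


def send_hold_time_for(peer):
    """Return the SendHoldTime, in seconds, that applies to this peer."""
    if peer.send_hold_time is None:
        return DEFAULT_SEND_HOLD_TIME
    return peer.send_hold_time
```
]]></artwork>
    </figure>
    <t>
      In this sketch, a peer without an explicit override uses 480 seconds, while a peer configured with a different value uses that value.
    </t>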
  </section>

  <section title="Timer Event: SendHoldTimer_Expires">
  <t>
    Another timer event is added to Section 8.1.3 of <xref target="RFC4271"/> as follows:
  </t>
  <dl newline="true">
    <dt>Event XX1: SendHoldTimer_Expires</dt>
    <dd>
      <dl>
      <dt>Definition:</dt>
      <dd>An event generated when the SendHoldTimer expires. </dd>
      <dt>Status:</dt>
      <dd>Mandatory</dd>
      </dl>
    </dd>
  </dl>
  </section>

  <section title="Changes to the FSM">
    <t>The following changes are made to Section 8.2.2 of <xref target="RFC4271"/>.</t>
    <t>In "OpenConfirm State", the handling of Event 26 is revised as follows:</t>
    <dl newline="true">
      <dt>Old Text:</dt>
      <dd>
        <t>If the local system receives a KEEPALIVE message (KeepAliveMsg
           (Event 26)), the local system:
          <list style="hanging">
            <t hangText="-">restarts the HoldTimer and</t>
            <t hangText="-">changes its state to Established.</t>
          </list></t>
      </dd>
      <dt>New Text:</dt>
      <dd>
        <t>If the local system receives a KEEPALIVE message (KeepAliveMsg
           (Event 26)), the local system:
          <list style="hanging">
            <t hangText="-">restarts the HoldTimer,</t>
            <t hangText="-">sets the SendHoldTimer to the default or configured value, and</t>
            <t hangText="-">changes its state to Established.</t>
          </list></t>
      </dd>
    </dl>

    <t>The following paragraph is added to section 8.2.2 in "Established State",
      after the paragraph which ends "unless the negotiated HoldTime value
      is zero.":</t>
      <t>
      <list style="empty">
        <t>If the SendHoldTimer_Expires event occurs (Event XX1), the local system:
          <list style="hanging">
            <t hangText="-">sends a NOTIFICATION message with the BGP Error Code "Send Hold Timer Expired",</t>
            <t hangText="-">logs an error message in the local system with the BGP Error Code "Send Hold Timer Expired",</t>
            <t hangText="-">releases all BGP resources,</t>
            <t hangText="-">sets the ConnectRetryTimer to zero,</t>
            <t hangText="-">drops the TCP connection,</t>
            <t hangText="-">increments the ConnectRetryCounter by 1,</t>
            <t hangText="-">(optionally) performs peer oscillation damping if the DampPeerOscillations attribute is set to TRUE, and</t>
            <t hangText="-">changes its state to Idle.</t>
          </list>
        </t>
        <t>Each time the local system sends a KEEPALIVE, UPDATE, and/or NOTIFICATION message, it restarts its SendHoldTimer.</t>
      </list>
    </t>
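    <t>
      The Established-state handling above could be sketched, non-normatively, as follows.
      This is a simplified model under stated assumptions, not a reference implementation: the class Session, its methods, and the recorded action tuples are hypothetical names; message_sent is assumed to be called only once a message has actually been sent; and the optional peer oscillation damping step is omitted.
    </t>
    <figure>
      <artwork><![CDATA[
```python
import time

ESTABLISHED = "Established"
IDLE = "Idle"
SEND_HOLD_TIMER_EXPIRED = 8  # Error Code for "Send Hold Timer Expired"


class Session:
    """Hypothetical, heavily simplified model of one BGP session."""

    def __init__(self, send_hold_time=480.0, clock=time.monotonic):
        self.send_hold_time = send_hold_time  # SendHoldTime, in seconds
        self.clock = clock                    # injectable for testing
        self.state = ESTABLISHED              # model starts in Established
        self.connect_retry_counter = 0
        self.connect_retry_timer = None
        self.actions = []                     # actions taken, in order
        # SendHoldTimer, represented as an absolute deadline.
        self.deadline = self.clock() + send_hold_time

    def message_sent(self):
        """Restart the SendHoldTimer after a KEEPALIVE, UPDATE, or
        NOTIFICATION message was sent (assumption: really sent)."""
        self.deadline = self.clock() + self.send_hold_time

    def poll(self):
        """Check the SendHoldTimer and handle expiry if it has fired."""
        if self.state == ESTABLISHED and self.clock() >= self.deadline:
            self.on_send_hold_timer_expires()

    def on_send_hold_timer_expires(self):
        """Follow the listed steps in order (damping step omitted)."""
        self.actions.append(("send NOTIFICATION", SEND_HOLD_TIMER_EXPIRED))
        self.actions.append(("log error", SEND_HOLD_TIMER_EXPIRED))
        self.actions.append(("release BGP resources",))
        self.connect_retry_timer = 0
        self.actions.append(("drop TCP connection",))
        self.connect_retry_counter += 1
        self.state = IDLE
```
]]></artwork>
    </figure>
    <t>
      Representing the timer as a deadline that is pushed forward on every successful send keeps the sketch independent of any particular event loop; a real implementation would typically use its existing timer facility instead of polling.
    </t>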
  </section>

  <section title="Changes to BGP Timers">
    <t>Section 10 of <xref target="RFC4271"/> summarizes BGP timers. This document adds another BGP timer: SendHoldTimer.</t>
    <t>SendHoldTime is a mandatory FSM attribute that stores the initial value for the SendHoldTimer. The suggested default value for SendHoldTime is 8 minutes. An implementation MAY make it configurable.</t>
  </section>

  </section>

  <section title="Send Hold Timer Expired Error Handling">
    <t>
      If a system does not send successive KEEPALIVE, UPDATE, and/or NOTIFICATION messages within the period specified in the SendHoldTime, then the BGP connection is closed and a log message is emitted.
    </t>
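    <t>
      For illustration only, the NOTIFICATION message carrying the "Send Hold Timer Expired" Error Code could be encoded as sketched below, using the message header layout of <xref target="RFC4271"/> (16-octet marker, 2-octet length, 1-octet type).
      The function name build_send_hold_timer_expired is hypothetical, and the Error Subcode of zero is an assumption based on the absence of subcodes defined in this document.
    </t>
    <figure>
      <artwork><![CDATA[
```python
import struct

BGP_MARKER = b"\xff" * 16      # RFC 4271 header marker: all ones
BGP_TYPE_NOTIFICATION = 3      # RFC 4271 message type for NOTIFICATION
SEND_HOLD_TIMER_EXPIRED = 8    # Error Code "Send Hold Timer Expired"


def build_send_hold_timer_expired(subcode=0, data=b""):
    """Encode a NOTIFICATION message carrying Error Code 8.

    Assumption: subcode 0 is used because this document defines
    no Error Subcodes for this Error Code.
    """
    body = struct.pack("!BB", SEND_HOLD_TIMER_EXPIRED, subcode) + data
    # Total length covers marker (16), length (2), type (1), and body.
    length = len(BGP_MARKER) + 2 + 1 + len(body)
    header = BGP_MARKER + struct.pack("!HB", length, BGP_TYPE_NOTIFICATION)
    return header + body
```
]]></artwork>
    </figure>
    <t>
      With no Data field, the resulting message is 21 octets long: the 19-octet BGP header followed by the Error Code and Error Subcode.
    </t>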
  </section>

  <section title="Operational Considerations">
    <t>
      When the local system recognizes that a remote peer has not processed any BGP messages for the duration of the SendHoldTime, it is likely that the local system will be unable to inform the remote peer, by means of a NOTIFICATION message, why the session is being closed.
      This document suggests that a NOTIFICATION message with the "Send Hold Timer Expired" error code is still sent; in addition, an error message SHOULD be logged on the local system.
    </t>
    <t>
      Other mechanisms can be used as well; for example, BGP speakers SHOULD provide this reason as part of their operational state, e.g., bgpPeerLastError in the <xref target="RFC4273">BGP MIB</xref>.
    </t>
  </section>

  <section title="Security Considerations">

    <t>
      This specification addresses the vulnerability of a BGP speaker to a potential attack whereby a BGP peer can pretend to be unable to process BGP messages and in doing so create a scenario where the local system is poisoned with stale routing information.
    </t>
    <t>
      There are three detrimental aspects to the problem of not robustly handling blocked peers:
      <list style="none">
        <t>
          Failure to send BGP messages to a peer implies the peer is operating based on stale routing information.
        </t>
        <t>
          Failure to disconnect from a blocked peer hinders the local system's ability to construct a non-stale local Routing Information Base (RIB).
        </t>
        <t>
          Failure to disconnect from a blocked peer hinders the local system's ability to inform other BGP peers with current network reachability information.
        </t>
      </list>
    </t>
    <t>
      In other respects, this specification does not change BGP's security characteristics.
    </t>
  </section>

  <section title="IANA Considerations">
    <t>
      IANA has registered code 8 for "Send Hold Timer Expired" in the "BGP Error (Notification) Codes" sub-registry under the "Border Gateway Protocol (BGP) Parameters" registry.
    </t>
  </section>

  <section title="Acknowledgements">
    <t>
      The authors would like to thank
      William McCall,
      Theo de Raadt,
      John Heasley,
      Nick Hilliard,
      Jeffrey Haas,
      Tom Petch,
      and
      Susan Hares
      for their helpful review of this document.
    </t>
  </section>

</middle>

<back>

  <references title="Normative References">
    <?rfc include="reference.RFC.2119.xml"?>
    <?rfc include="reference.RFC.4271.xml"?>
    <?rfc include="reference.RFC.4273.xml"?>
    <?rfc include="reference.RFC.8174.xml"?>

    <reference anchor="RFC9293" target="https://www.rfc-editor.org/info/rfc9293">
      <front>
        <title>Transmission Control Protocol (TCP)</title>
        <author fullname="Wesley M. Eddy" initials="W." surname="Eddy" role="editor"/>
        <date month="August" year="2022"/>
        <abstract>
          <t>This document specifies the Transmission Control Protocol (TCP). TCP is an important transport-layer protocol in the Internet protocol stack, and it has continuously evolved over decades of use and growth of the Internet. Over this time, a number of changes have been made to TCP as it was specified in RFC 793, though these have only been documented in a piecemeal fashion. This document collects and brings those changes together with the protocol specification from RFC 793. This document obsoletes RFC 793, as well as RFCs 879, 2873, 6093, 6429, 6528, and 6691 that updated parts of RFC 793. It updates RFCs 1011 and 1122, and it should be considered as a replacement for the portions of those documents dealing with TCP requirements. It also updates RFC 5961 by adding a small clarification in reset handling while in the SYN-RECEIVED state. The TCP header control bits from RFC 793 have also been updated based on RFC 3168.
          </t>
        </abstract>
      </front>
      <seriesInfo name="RFC" value="9293"/>
      <seriesInfo name="DOI" value="10.17487/RFC9293"/>
    </reference>

  </references>

  <references title="Informative References">

    <reference anchor="openbgpd" target="https://marc.info/?l=openbsd-tech&amp;m=160820754925261&amp;w=2">
      <front>
        <title>bgpd send side hold timer</title>
        <author fullname="Claudio Jeker"><organization>OpenBSD</organization></author>
        <date month="December" year="2020" />
      </front>
    </reference>

    <reference anchor="frr" target="https://github.com/FRRouting/frr/pull/11225">
      <front>
        <title>bgpd: implement SendHoldTimer</title>
        <author fullname="David Lamparter"><organization>NetDEF</organization></author>
        <date month="May" year="2022" />
      </front>
    </reference>

    <reference anchor="neo-bgp" target="https://bgp.tools/kb/bgp-support">
      <front>
        <title>What does bgp.tools support</title>
        <author fullname="Ben Cartwright-Cox"><organization>Port 179 Ltd</organization></author>
        <date month="August" year="2022" />
      </front>
    </reference>

    <reference anchor="BIRD" target="https://gitlab.nic.cz/labs/bird/-/commit/bcf2327425d4dd96f381b87501cccf943bed606e">
      <front>
        <title>BIRD Internet Routing Daemon</title>
        <author fullname="Katerina Kubecova"><organization>CZ.NIC</organization></author>
        <date month="October" year="2023" />
      </front>
    </reference>

    <reference anchor="bgpzombies" target="https://labs.ripe.net/author/romain_fontugne/bgp-zombies/">
      <front>
        <title>BGP Zombies</title>
        <author fullname="Romain Fontugne"><organization>IIJ Research Lab</organization></author>
        <date month="April" year="2019" />
      </front>
    </reference>

  </references>

  <section title="Implementation status - RFC EDITOR: REMOVE BEFORE PUBLICATION">
    <t>
      This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in RFC 7942.
      The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs.
      Please note that the listing of any individual implementation here does not imply endorsement by the IETF.
      Furthermore, no effort has been spent to verify the information presented here that was supplied by IETF contributors.
      This is not intended as, and must not be construed to be, a catalog of available implementations or their features.
      Readers are advised to note that other implementations may exist.
    </t>

    <t>
      According to RFC 7942, "this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature.
      It is up to the individual working groups to use this information as they see fit".
    </t>

    <t>
      <list style="symbols">
        <t>
          OpenBGPD <xref target="openbgpd"/>
        </t>
        <t>
          FRRouting <xref target="frr"/>
        </t>
        <t>
          neo-bgp (bgp.tools) <xref target="neo-bgp"/>
        </t>
        <t>
          BIRD <xref target="BIRD"/>
        </t>
      </list>
    </t>
    <t>
      Patches to recognize error code 8 were merged into OpenBSD's and the-tcpdump-group's tcpdump implementations.
    </t>
  </section>

</back>

</rfc>
