<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-du-detnet-layer3-low-latency-05"
     ipr="trust200902">
  <front>
    <title abbrev="L3 Low-latency Traffic">Micro-burst Decreasing in Layer3
    Network for Low-Latency Traffic</title>

    <author fullname="Zongpeng Du" initials="Z." surname="Du">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street>No.32 XuanWuMen West Street</street>

          <city>Beijing</city>

          <code>100053</code>

          <country>China</country>
        </postal>

        <email>duzongpeng@foxmail.com</email>
      </address>
    </author>

    <author fullname="Peng Liu" initials="P." surname="Liu">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street>No.32 XuanWuMen West Street</street>

          <city>Beijing</city>

          <code>100053</code>

          <country>China</country>
        </postal>

        <email>liupengyjy@chinamobile.com</email>
      </address>
    </author>

    <date month="" year=""/>

    <area>Routing Area</area>

    <workgroup>Network Working Group</workgroup>

    <keyword>DetNet, Micro-burst, Low-latency</keyword>

    <abstract>
      <t>Supporting deterministic forwarding in a large-scale network is
      difficult because the network carries a large amount of dynamic traffic
      and the traffic model becomes hard to predict after aggregation on the
      intermediate nodes. This document introduces the problem of micro-bursts
      in Layer 3 networks and analyses a method to reduce micro-bursts for
      low-latency traffic.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>The DetNet architecture <xref target="RFC8655"/> is intended to work
      in campus-wide networks and private WANs, including large-scale ISP
      network scenarios such as the 5G bearer network, as mentioned in <xref
      target="RFC8578"/>. It is essential for a large-scale ISP network to
      be able to provide low-latency service. The low-latency requirement
      exists in both L2 and L3 networks, and in both small and large
      networks.</t>

      <t>However, as discussed in <xref
      target="I-D.qiang-detnet-large-scale-detnet"/>, deploying deterministic
      services in a large-scale network brings many new challenges. A
      novel method called LDN (Large-scale Deterministic Network) is
      introduced in <xref target="I-D.qiang-detnet-large-scale-detnet"/> and
      <xref target="I-D.dang-queuing-with-multiple-cyclic-buffers"/>, which
      explore deterministic forwarding over a large-scale network.</t>

      <t>This document also explores deterministic service in a large-scale
      Layer 3 network, and analyses a method based on reducing micro-bursts,
      which can benefit the forwarding of low-latency traffic in a
      large-scale network.</t>
    </section>

    <section title="Gaps for Large-scale Layer 3 Deterministic Network">
      <t>In this document, a large-scale network is one that carries many
      dynamic flows and in which per-flow shaping on the intermediate nodes
      is impractical because their data planes are already under heavy
      forwarding load.</t>

      <t>According to <xref target="RFC8655"/>, DetNet operates at the IP
      layer and delivers service over lower-layer technologies such as MPLS
      and IEEE 802.1 Time-Sensitive Networking <xref target="TSN"/>. However,
      the TSN mechanisms were originally designed for L2 networks and cannot
      be used directly in a large-scale Layer 3 network, for several reasons.
      Some of them are described below.</t>

      <t>Some TSN mechanisms need synchronization of the network equipment,
      which is easier in a small network but hard in a large network. It
      introduces complex maintenance work across long distances that was
      not needed before.</t>

      <t>Some TSN mechanisms need per-flow state in the forwarding plane,
      which does not scale. Aggregation methods need to be considered.</t>

      <t>Some TSN mechanisms need constant and predictable traffic
      characteristics. This is harder to achieve in a large network, where
      many more flows join or leave at random and the traffic
      characteristics are more dynamic.</t>

      <t>The main aspects of the problem are simplicity and scalability. The
      former ensures that the mechanism is easy to deploy, and the latter
      ensures that the mechanism can carry a large number of deterministic
      services. The requirements of large-scale DetNet networks are being
      analysed in <xref
      target="I-D.liu-detnet-large-scale-requirements"/>.</t>
    </section>

    <section title="Micro-burst Problem in IP Forwarding">

      <t>The current IP forwarding mechanism is considered a good example of
      meeting the requirements of simplicity and scalability. However, the
      traditional IP network is based on statistical multiplexing and can
      only provide Best Effort service; it lacks mechanisms to guarantee an
      SLA.</t>

      <t>In the current IP network, a long queuing delay or some packet loss
      caused by bursts is acceptable; in deterministic forwarding, it may be
      unacceptable. The two therefore follow different design principles at
      the lower layers.</t>


      <t>The current forwarding mechanism in an IP router, which is based on
      statistical multiplexing, cannot provide deterministic service, for
      several reasons. Even when given a high priority, a critical packet can
      experience a long congestion delay or be lost in a relatively lightly
      loaded network because of micro-bursts in the network. A "critical
      packet" here is a DetNet packet that is sensitive to latency.</t>

      <t>A micro-burst is a special case of network congestion that typically
      lasts for a short period, on the order of milliseconds. In a
      micro-burst, a large amount of data is suddenly received on an
      interface, and the temporary bandwidth requirement can be tens or
      hundreds of times the average bandwidth requirement, or can even exceed
      the interface bandwidth.</t>
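
      <t>As an illustration only, the gap between the short-term and the
      average bandwidth requirement can be computed from a packet trace. The
      following Python sketch is not part of any DetNet or TSN mechanism; the
      function name, the 1 ms window, and both traces are assumptions chosen
      for the example.</t>

      <t><figure>
          <artwork><![CDATA[
```python
def windowed_rates(arrivals, window_us, duration_us):
    """Return (peak_bps, average_bps) of a packet trace.

    arrivals: iterable of (arrival_time_us, size_bits) tuples.
    """
    bins = [0] * (duration_us // window_us)
    for time_us, size_bits in arrivals:
        index = time_us // window_us
        if 0 <= index < len(bins):
            bins[index] += size_bits
    peak_bps = max(bins) * 1_000_000 / window_us
    average_bps = sum(bins) * 1_000_000 / duration_us
    return peak_bps, average_bps


# Hypothetical trace: 100 packets of 12000 bits within one second.
PACKET_BITS = 12_000
# Bursty: one packet every 100 us, all within the first 10 ms.
bursty = [(i * 100, PACKET_BITS) for i in range(100)]
# Paced: one packet every 10 ms.
paced = [(i * 10_000, PACKET_BITS) for i in range(100)]

for name, trace in (("bursty", bursty), ("paced", paced)):
    peak, avg = windowed_rates(trace, 1_000, 1_000_000)
    print(name, "peak/average ratio:", peak / avg)
```
]]></artwork>
        </figure></t>

      <t>Both assumed traces carry the same average rate of 1.2 Mbit/s, but
      at a 1 ms granularity the bursty trace peaks at 100 times the average,
      while the paced trace peaks at 10 times the average.</t>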

      <t>In most cases, the buffer on the equipment can absorb the
      micro-bursts. However, in some corner cases, micro-bursts cause a long
      delay (for example, on the order of milliseconds) or even packet
      loss.</t>
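
      <t>A rough calculation shows how a short burst turns into a
      millisecond-level delay. The Python sketch below is a simplification:
      it assumes an initially empty queue, a buffer large enough to avoid
      loss, and constant rates during the burst. The function name and the
      interface speeds are assumptions made for the example.</t>

      <t><figure>
          <artwork><![CDATA[
```python
def backlog_and_delay(in_rates_bps, out_rate_bps, burst_us):
    """Return (backlog_bits, queuing_delay_us) at the end of a burst.

    Assumes an empty queue at the start and no packet loss.
    """
    excess_bps = max(0, sum(in_rates_bps) - out_rate_bps)
    backlog_bits = excess_bps * burst_us // 1_000_000
    # Time needed to drain the backlog at the outgoing line rate.
    delay_us = backlog_bits * 1_000_000 // out_rate_bps
    return backlog_bits, delay_us


# Assumed case: two 10 Gbit/s inputs burst for 1 ms towards one
# 10 Gbit/s output.
TEN_GBPS = 10_000_000_000
backlog, delay = backlog_and_delay([TEN_GBPS, TEN_GBPS], TEN_GBPS, 1_000)
print(backlog, "bits queued,", delay, "us of queuing delay")
```
]]></artwork>
        </figure></t>

      <t>With these assumed numbers, a burst of 1 ms leaves 10,000,000 bits
      in the queue, and the last packet of the burst waits about 1000
      microseconds even though the average load may be low.</t>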

      <t>We introduce the main causes of the micro-burst in the following
      paragraphs.</t>

      <t>Firstly, IP traffic is inherently bursty at both macro and micro
      time scales; that is, it does not follow a constant traffic model even
      after aggregation.</t>

      <t>Secondly, an IP network has a flexible topology, in which the
      incoming traffic may exceed the bandwidth of the outgoing interface.
      For example, an interface with a large bandwidth may need to send
      traffic to an interface with a smaller bandwidth, or multiple flows
      from several incoming interfaces may need to share the same outgoing
      interface.</t>

      <t>Thirdly, an IP node is designed to send traffic as quickly as
      possible, and it is not aware of whether the downstream node's buffer
      can handle the traffic. For example, Figure 1 below shows the problem
      with the current IP scheduling mechanism. Before scheduling in an IP
      network, the packets are well paced; after scheduling, the packets are
      bunched together even though the total traffic rate is unchanged. When
      an IP outgoing interface receives multiple critical flows from several
      incoming interfaces, the situation becomes worse. Because each IP
      router tries to send the packets as soon as possible, micro-bursts
      will occasionally emerge at later hops.</t>


      <t><figure>
          <artwork><![CDATA[
    _     _     _     _     _     _     _     _     _     _     _
   | |   | |   | |   | |   | |   | |   | |   | |   | |   | |   | |   
---------------------------------------------------------------------
                   Before scheduling in an IP network
              
    _  _  _  _  _  _                 _  _  _  _  _
   | || || || || || |               | || || || || |   
---------------------------------------------------------------------
                   After scheduling in an IP network

   Figure 1: Change of the traffic characteristics in an IP network

]]></artwork>
        </figure></t>

    </section>

    <section title="Analysis of the Method to Decrease Micro-bursts">

      <t>This document analyses a method to carry low-latency traffic in an
      IP network, such as the 5G bearer network, by avoiding micro-bursts in
      the network as much as possible. The principle of the method is to
      forward critical traffic and Best Effort (BE) traffic separately, and
      not to distinguish between different critical flows on the forwarding
      plane of the intermediate nodes.</t>

      <t>As discussed above, the target method should be scalable and easy to
      deploy. Because the intermediate nodes are under heavy forwarding load,
      the target method should not add much complex processing to the data
      plane. Several requirements are listed as follows.</t>

      <t>The first is that the DetNet traffic should support aggregation. The
      intermediate nodes should not do per-flow processing on the data
      plane.</t>

      <t>The second is that control-plane processing and data-plane
      processing on the intermediate nodes should be decoupled. The status of
      the aggregated DetNet traffic on the control plane may change
      frequently in a large-scale network. We should not assume that the
      control plane on an intermediate node can interact with the data plane
      frequently, for example, to change a shaper parameter frequently. The
      data plane should therefore support some autonomous decision
      making.</t>
    </section>

    <section title="An Example of Method to Decrease Micro-bursts">
      <t>In this section, we describe an example method that fulfils the
      requirements listed in the previous section. It is a traffic forwarding
      method for a DetNet network that can reduce micro-bursts for the
      critical traffic. It needs the cooperation of the edge nodes and the
      forwarding/intermediate nodes in an IP network.</t>

      <section title="Working Flow of the Method">
        <t>Generally, the method contains two steps:</t>

        <t>Step 1: per-flow scheduling on the edge node. The purpose is to
        make sure that each critical flow has a constant traffic model.</t>

        <t>Step 2: per-interface scheduling on the intermediate nodes.
        Traffic is aggregated to ensure scalability, and pacing ensures that
        the packets do not bunch together. The purpose is to forward the
        critical traffic with the shape it had when leaving the edge node,
        rather than as quickly as possible. We assume that the sending rate
        of the buffer for the critical traffic is the same as or similar to
        the receiving rate (how to achieve this is out of scope of this
        document). If all works well, the buffer will be maintained at a
        proper depth.</t>

        <t>Other requirements include an RSVP-TE-like mechanism with good
        scalability, which should be used to make sure that the bandwidth of
        the interface is not exceeded.</t>

      </section>

      <section title="Process of Edge Node">
        <t>The edge node of the IP network can recognize each critical flow,
        just as in a TSN network, and then shape each flow individually. In
        fact, in TSN mechanisms, no micro-burst will emerge for critical
        traffic, and each TSN mechanism has been proven effective under
        certain conditions.</t>

        <t>This document suggests that the edge node shape the critical
        traffic by using the Credit-Based Shaper (CBS) method in <xref
        target="IEEE802.1Qav"/>, or the shaping methods in <xref
        target="IEEE802.1Qcr"/>. Generally, these shaping methods can
        generate paced traffic for each critical flow.</t>

        <t>The parameters of the shaper, such as the sending rate, can be
        configured for each flow by some means.</t>
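
        <t>The effect of per-flow shaping on the edge node can be sketched as
        follows. This Python example is a plain constant-rate pacer used as a
        stand-in; it is neither the CBS algorithm nor an Asynchronous Traffic
        Shaping algorithm. The function name, the packet size, and the
        configured rate are assumptions made for the illustration.</t>

        <t><figure>
            <artwork><![CDATA[
```python
def pace_flow(arrivals, rate_bps):
    """Return the departure time (in us) of each packet of one flow.

    arrivals: list of (arrival_time_us, size_bits), sorted by time.
    rate_bps: sending rate configured for this flow.
    """
    departures = []
    next_free_us = 0
    for arrival_us, size_bits in arrivals:
        # A packet leaves no earlier than it arrives and no earlier
        # than the end of the previous packet's time slot.
        start_us = max(arrival_us, next_free_us)
        departures.append(start_us)
        # Round the slot up so the configured rate is never exceeded.
        slot_us = (size_bits * 1_000_000 + rate_bps - 1) // rate_bps
        next_free_us = start_us + slot_us
    return departures


# Assumed input: four 12000-bit packets arriving back to back at
# t = 0, shaped to 12 Mbit/s.
burst = [(0, 12_000)] * 4
print(pace_flow(burst, 12_000_000))
```
]]></artwork>
          </figure></t>

        <t>In this assumed case, the four packets that arrived together leave
        the edge node 1000 microseconds apart, so the flow enters the network
        with a constant traffic model.</t>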

      </section>

      <section title="Process of Forwarding Node">
        <t>It is difficult for the forwarding node to recognize each critical
        flow because it is under heavy load forwarding a large number of
        packets. It is suggested that no per-flow state be maintained on the
        forwarding node. That is, on the forwarding node, the critical flows
        should be aggregated and handled together.</t>

        <t>We do not distinguish each critical flow on the forwarding node,
        but all the packets of critical flows should carry a common
        identification so that they are recognisable; the identification also
        indicates that the packet is time-sensitive. The forwarding node
        reads the identification in the critical packet and forwards the
        packet to a specific queue accordingly.</t>

        <t>This document suggests that the forwarding node deploy a specific
        queue on each outgoing interface to buffer the time-sensitive packets
        waiting to be sent. When receiving a packet of critical traffic, the
        forwarding node forwards it to the specific queue on the outgoing
        interface according to its destination address and its
        identification. The queue buffers all critical traffic that needs to
        go out through that interface, and paces it by using the shaping
        methods mentioned in the previous section.</t>
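
        <t>The classification step described above can be sketched as
        follows. This Python example is illustrative only. This document does
        not define the common identification; the sketch assumes a DSCP-like
        marking with the placeholder value 46. The class name, the packet
        representation, the prefixes, and the interface names are likewise
        hypothetical.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import ipaddress
from collections import defaultdict, deque

# Assumption: critical packets carry one shared marking. The value
# 46 is only a placeholder; this document does not specify it.
CRITICAL_MARK = 46


class ForwardingNode:
    def __init__(self, routes):
        # routes: {"prefix/len": "interface"}
        table = [
            (ipaddress.ip_network(prefix), interface)
            for prefix, interface in routes.items()
        ]
        # Longest prefix first, so the first match is the best one.
        table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
        self.routes = table
        # One critical queue and one BE queue per outgoing interface.
        self.critical = defaultdict(deque)
        self.best_effort = defaultdict(deque)

    def lookup(self, dst):
        address = ipaddress.ip_address(dst)
        for network, interface in self.routes:
            if address in network:
                return interface
        return None

    def enqueue(self, packet):
        """Queue a packet by destination and marking only."""
        interface = self.lookup(packet["dst"])
        if interface is None:
            return None  # no route: dropped in this sketch
        is_critical = packet["mark"] == CRITICAL_MARK
        queues = self.critical if is_critical else self.best_effort
        queues[interface].append(packet)
        return interface


node = ForwardingNode({
    "192.0.2.0/24": "if1",
    "198.51.100.0/24": "if2",
})
node.enqueue({"dst": "192.0.2.10", "mark": CRITICAL_MARK})
node.enqueue({"dst": "192.0.2.20", "mark": CRITICAL_MARK})
node.enqueue({"dst": "192.0.2.30", "mark": 0})
print(len(node.critical["if1"]), len(node.best_effort["if1"]))
```
]]></artwork>
          </figure></t>

        <t>The node keeps no per-flow state in this sketch: the two critical
        packets, which may belong to different flows, share the critical
        queue of the same outgoing interface.</t>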

        <t>A TSN shaping method is used here instead of the original
        forwarding method of an IP router, so that the critical traffic is
        forwarded in an orderly manner rather than as soon as possible.
        Therefore, micro-bursts can be reduced in the network.</t>

        <t>If all the forwarding nodes do their jobs properly, i.e., they
        pace the critical traffic well, few or no micro-bursts will occur for
        the critical traffic. In this way, the critical traffic will have a
        relatively low latency in the IP network, with less uncertainty
        caused by micro-bursts.</t>

        <t>As no per-flow state is maintained on the forwarding node, the
        sending rate of the shaper is hard to decide. As stated in the
        description of the working flow, the sending rate is suggested to
        follow the incoming rate of the queue. In other words, before sending
        the critical packets, the node shapes the specific queue with a
        shaper parameter computed from the measured incoming rate of the
        queue. The purpose is to maintain a proper buffer depth for the
        queue.</t>
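
        <t>One possible way to derive the shaper rate from the incoming rate
        is sketched below. It is only a hypothetical illustration, since the
        detailed realization is out of scope of this document. The sketch
        assumes a fixed measurement interval, an exponentially weighted
        moving average of the incoming rate, and a proportional correction
        towards a target buffer depth; the class name and all parameter
        values are assumptions.</t>

        <t><figure>
            <artwork><![CDATA[
```python
class AdaptivePacer:
    """Paces one aggregated queue without any per-flow state."""

    def __init__(self, interval_s, target_bits, alpha=0.5, gain=0.5):
        self.interval_s = interval_s    # measurement interval
        self.target_bits = target_bits  # desired buffer depth
        self.alpha = alpha              # averaging weight (assumed)
        self.gain = gain                # depth correction (assumed)
        self.in_rate_bps = 0.0          # estimated incoming rate
        self.send_rate_bps = 0.0        # current shaper rate
        self.depth_bits = 0.0           # current buffer depth

    def tick(self, arrived_bits):
        """Process one interval; return the bits sent in it."""
        budget = self.send_rate_bps * self.interval_s
        sent = min(self.depth_bits + arrived_bits, budget)
        self.depth_bits += arrived_bits - sent
        # Smooth the measured incoming rate of the queue.
        measured_bps = arrived_bits / self.interval_s
        delta_bps = measured_bps - self.in_rate_bps
        self.in_rate_bps += self.alpha * delta_bps
        # Follow the incoming rate, nudged towards the target depth.
        error_bits = self.depth_bits - self.target_bits
        correction_bps = self.gain * error_bits / self.interval_s
        rate_bps = self.in_rate_bps + correction_bps
        self.send_rate_bps = max(0.0, rate_bps)
        return sent


# Assumed load: 120000 bits per 1 ms interval (120 Mbit/s).
pacer = AdaptivePacer(interval_s=0.001, target_bits=240_000)
for _ in range(200):
    pacer.tick(120_000)
print(round(pacer.send_rate_bps), round(pacer.depth_bits))
```
]]></artwork>
          </figure></t>

        <t>Under this constant assumed load, the sending rate settles at the
        incoming rate and the buffer depth settles at the target, without any
        interaction with the control plane. The behaviour under real,
        varying traffic would need further study.</t>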

        <t>Although the proposed method is intended to be simpler than the
        TSN mechanisms, the forwarding/intermediate nodes still need to be
        updated. The detailed realization of the method on the intermediate
        nodes is out of scope of this document.</t>
      </section>

      <section title="Analysis of the Proposed Method">
        <t>The proposed method does not need synchronization, just like the
        asynchronous mechanisms studied in <xref target="IEEE802.1Qcr"/>.
        Furthermore, the method has a larger aggregation granularity, which
        helps it fulfil the requirements of simplicity and scalability as
        much as possible. However, in theory, its forwarding behaviour is
        less certain than that of the TSN mechanisms, which target zero
        congestion loss.</t>

        <t>We compare three mechanisms in the following paragraphs. The first
        is the priority-based light-load mechanism, i.e., the traditional
        method. The second is a TSN mechanism, such as CQF. The third is the
        proposed mechanism.</t>

        <t>In the first mechanism, we only give a high priority to the
        critical traffic, and thus the scalability of the deterministic
        system is good. However, the uncertainty on the forwarding plane may
        not meet the requirements of an industrial network, where SLA
        requirements are essential. It may only work well when a small amount
        of critical traffic exists in the network.</t>

        <t>The second mechanism is a TSN scheduling method, such as CQF. Its
        uncertainty is very low, but its scalability is limited, as discussed
        in Section 2. It should be noted that in a large deterministic
        system, the ISP normally will not guarantee the user 100 percent
        reliability, but rather a value very close to it.</t>

        <t>The proposed method has better scalability than the TSN
        mechanisms, and better reliability than the priority-based method. If
        we assume that different services need different deterministic
        levels, this method may be helpful for services that do not need a
        very high deterministic level. For example, the method can be used in
        the consumer Internet, in which the deterministic service needs a
        relatively lower deterministic level than in the industrial
        Internet.</t>

      </section>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document has no IANA actions.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>For detailed security considerations, refer to <xref
      target="I-D.ietf-detnet-bounded-latency"/> and <xref
      target="I-D.ietf-detnet-security"/>.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Thanks to J&aacute;nos Farkas, Lou Berger, and David Black for their
      valuable comments.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include="reference.RFC.8655"?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.I-D.qiang-detnet-large-scale-detnet'?>

      <?rfc include="reference.I-D.ietf-detnet-bounded-latency"?>

      <?rfc include="reference.I-D.liu-detnet-large-scale-requirements"?>

      <?rfc include="reference.I-D.ietf-detnet-security"?>

      <?rfc include="reference.RFC.8578"?>

      <?rfc include='reference.I-D.dang-queuing-with-multiple-cyclic-buffers'?>

      <reference anchor="TSN" target="https://1.ieee802.org/tsn/">
        <front>
          <title>Time-Sensitive Networking (TSN) Task Group</title>

          <author>
            <organization>IEEE 802.1</organization>
          </author>

          <date year="2012"/>
        </front>
      </reference>

      <reference anchor="IEEE802.1Qcr"
                 target="https://standards.ieee.org/standard/802_1Qcr-2020.html">
        <front>
          <title>IEEE 802.1Qcr-2020 - IEEE Standard for Local and Metropolitan
          Area Networks--Bridges and Bridged Networks - Amendment 34:
          Asynchronous Traffic Shaping</title>

          <author>
            <organization>IEEE 802.1</organization>
          </author>

          <date year="2020"/>
        </front>
      </reference>

      <reference anchor="IEEE802.1Qav"
                 target="https://standards.ieee.org/standard/802_1Qav-2009.html">
        <front>
          <title>IEEE 802.1Qav-2009 - IEEE Standard for Local and metropolitan
          area networks-- Virtual Bridged Local Area Networks Amendment 12:
          Forwarding and Queuing Enhancements for Time-Sensitive
          Streams</title>

          <author>
            <organization>IEEE 802.1</organization>
          </author>

          <date year="2009"/>
        </front>
      </reference>

    </references>
  </back>
</rfc>
