<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [ 
  <!ENTITY rfc2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
  <!ENTITY rfc3261 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3261.xml'>
  <!ENTITY rfc3263 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3263.xml'>
  <!ENTITY rfc3265 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3265.xml'>
  <!ENTITY rfc4412 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4412.xml'>
  <!ENTITY i-d.rosenberg-sipping-overload-reqs PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.rosenberg-sipping-overload-reqs.xml'>
]>
<rfc ipr='full3978' category='info' docName='draft-hilt-sipping-overload-design-00'>

<?rfc toc='yes'?>
<?rfc compact='yes'?>
<?rfc sortrefs='yes'?>

<front>
  <title abbrev='Overload Control'>Design Considerations for Session
  Initiation Protocol (SIP) Overload Control</title> 

  <author initials='V.H.' surname='Hilt (Ed.)' fullname='Volker Hilt (Ed.)'>
    <organization>Bell Labs/Alcatel-Lucent</organization>
    <address>
      <postal>
	<street>791 Holmdel-Keyport Rd</street>
	<city>Holmdel</city> <region>NJ</region>
	<code>07733</code>
	<country>USA</country>
      </postal> 
      <email>volkerh@bell-labs.com</email>
    </address>
  </author>

  <date month='July' year='2008' />
  <area>Real-time Applications and Infrastructure</area>
  <workgroup>SIPPING Working Group</workgroup>
  <keyword>SIP</keyword>
  <keyword>Overload Control</keyword>
  <abstract>
    <t>Overload occurs in Session Initiation Protocol (SIP) networks when
    SIP servers have insufficient resources to handle all SIP messages
    they receive. Even though the SIP protocol provides a limited
    overload control mechanism through its 503 (Service Unavailable)
    response code, SIP servers are still vulnerable to overload. This
    document discusses models and design considerations for a SIP
    overload control mechanism.</t> 
  </abstract>
</front>

<middle>

  <section title="Introduction">

    <t>As with any network element, a Session Initiation Protocol
    (SIP) <xref target="RFC3261" /> server can suffer from overload
    when the number of SIP messages it receives exceeds the number of
    messages it can process. Overload can pose a serious problem for a
    network of SIP servers. During periods of overload, the throughput
    of a network of SIP servers can be significantly degraded. In
    fact, overload may lead to a situation in which the throughput
    drops down to a small fraction of the original processing
    capacity. This is often called congestion collapse.</t>

    <t>Overload is said to occur if a SIP server does not have
    sufficient resources to process all incoming SIP messages. These
    resources may include CPU, memory, network bandwidth,
    input/output, or disk resources.</t>

    <t>For overload control, we only consider failure cases where SIP
    servers are unable to process all SIP requests due to resource
    constraints. There are other cases where a SIP server can
    successfully process incoming requests but has to reject them due
    to other failure conditions. For example, a PSTN gateway that runs
    out of trunk lines but still has plenty of capacity to process SIP
    messages should reject incoming INVITEs using a 488 (Not
    Acceptable Here) response <xref target="RFC4412" />. Similarly, a
    SIP registrar that has lost connectivity to its registration
    database but is still capable of processing SIP messages should
    reject REGISTER requests with a 500 (Server Internal Error) response <xref
    target="RFC3261" />. Overload control does not apply to these
    cases and SIP provides appropriate response codes for them.</t>  

    <t>The SIP protocol provides a limited mechanism for overload
    control through its 503 (Service Unavailable) response
    code. However, this mechanism cannot prevent overload of a SIP
    server and it cannot prevent congestion collapse. In fact, the use
    of the 503 (Service Unavailable) response code may cause traffic
    to oscillate and to shift between SIP servers and thereby worsen
    an overload condition. A detailed discussion of the SIP overload
    problem, the problems with the 503 (Service Unavailable) response
    code and the requirements for a SIP overload control mechanism can
    be found in <xref target="I-D.rosenberg-sipping-overload-reqs"
    />.</t>

    <t>This document discusses the models, assumptions and design
    considerations for a SIP overload control mechanism. The document
    is a product of the SIP overload control design team. </t> 

  </section>

<!--
  <section title="Terminology">

    <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
    NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
    "OPTIONAL" in this document are to be interpreted as described in
    <xref target="RFC2119">RFC 2119</xref>.</t>

  </section>
-->

  <section title="Implicit vs. Explicit Overload Control">

    <t>Two fundamental approaches to overload control exist: implicit
    and explicit overload control. </t>

    <t>A key contributor to the SIP congestion collapse <xref
    target="I-D.rosenberg-sipping-overload-reqs" /> is the 
    regenerative behavior of overload in the SIP protocol. Messages
    that get dropped by a SIP server due to overload are retransmitted
    and increase the offered load for the already overloaded
    server. This increase in load worsens the severity of the overload
    condition and, in turn, causes more messages to be dropped. The
    goal of an implicit overload control is therefore to change the
    fundamental mechanisms of the SIP protocol such that regenerative
    behavior of overload is avoided. In the ideal case, overload
    behavior of SIP would be fully non-regenerative, which would lead
    to a stable operation during overload. Even if a fully
    non-regenerative behavior for SIP is challenging to achieve,
    changes to the SIP retransmission timer mechanisms can help to
    reduce the degree of regeneration during overload. More work is
    needed to understand the impact of SIP retransmission timers on
    the regenerative overload behavior of SIP.</t>

    <t>For a SIP INVITE transaction to be successful, a minimum of
    three messages needs to be forwarded by a SIP server, and often five
    or more. If a SIP server under overload randomly discards messages
    without evaluating them, the chances that all messages belonging
    to a transaction are passed on will decrease as the load
    increases. Thus, the number of successful transactions will
    decrease even if the message throughput of a server remains up and 
    the overload behavior is fully non-regenerative. A SIP server
    might (partially) parse incoming messages to determine whether
    each is a new request or a message belonging to an existing
    transaction. However, after having spent resources on parsing
    a SIP message, discarding this message becomes expensive as the
    resources already spent are lost. The number of successful
    transactions will therefore decline with an increase in load
    as fewer and fewer resources can be spent on forwarding
    messages. The slope of the decline depends on the amount of
    resources spent to evaluate each message.</t> 
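
    <t>As an illustration only, the following Python sketch quantifies
    the effect described above under a simplifying assumption that is
    not part of this document: each message is forwarded independently
    with the same probability. The function name and parameters are
    hypothetical.</t>

    <figure title="Sketch: Transaction Success under Random Message Discard">
<artwork><![CDATA[
```python
def transaction_success_probability(p_forward: float,
                                    messages: int = 3) -> float:
    """Probability that all messages of one transaction get through.

    Assumption (for illustration): every message is forwarded
    independently with probability p_forward, so a transaction that
    needs `messages` forwarded messages succeeds with p_forward ** messages.
    """
    if not 0.0 <= p_forward <= 1.0:
        raise ValueError("p_forward must be between 0 and 1")
    if messages < 1:
        raise ValueError("a transaction needs at least one message")
    return p_forward ** messages


if __name__ == "__main__":
    # Compare the three-message minimum with a five-message transaction.
    for p in (0.9, 0.7, 0.5):
        print(p,
              transaction_success_probability(p, 3),
              transaction_success_probability(p, 5))
```
]]></artwork>
    </figure>

    <t>Under this assumption, a server that still forwards 90% of all
    messages completes only about 73% of its three-message
    transactions, and the gap widens as more messages are needed per
    transaction.</t>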

    <t>The main idea of explicit overload control is to use an
    explicit overload signal to request a reduction in the offered
    load. This enables a SIP server to adjust the offered load to a
    level at which it can perform at maximum capacity. </t>

    <t>Reducing the extent to which SIP server overload is
    regenerative and an efficient explicit overload control mechanism
    to control incoming load are two complementary approaches to
    improve SIP performance under overload.</t>

  </section>

  <section title="System Model">

    <t>The model shown in <xref target="fig:archa" /> identifies
    fundamental components of an explicit SIP overload control
    mechanism:</t>

    <t><list style='hanging'>
      <t hangText="SIP Processor:">The SIP Processor processes SIP
      messages and is the component that is protected by overload
      control.</t>

      <t hangText="Monitor:">The Monitor measures the current load of
      the SIP processor on the receiving entity. It implements the
      mechanisms needed to determine the current usage of resources
      relevant for the SIP processor and reports load samples (S) to
      the Control Function.</t>

      <t hangText="Control Function:">The Control Function implements
      the overload control algorithm. The control function uses the
      load samples (S) and determines if overload has occurred and a
      throttle (T) needs to be set to adjust the load sent to the SIP
      processor on the receiving entity. The control function on the
      receiving entity sends load feedback (F) to the sending
      entity.</t>

      <t hangText="Actuator:">The Actuator implements the algorithms
      needed to act on the throttles (T) and to adjust the amount of
      traffic forwarded to the receiving entity. For example, a
      throttle may instruct the Actuator to reduce the traffic
      destined to the receiving entity by 10%. The algorithms in the
      Actuator then determine how the traffic reduction is achieved,
      e.g., by selecting the messages that will be affected and
      determining whether they are rejected or redirected.</t>

    </list></t>

    <t>The type of feedback (F) conveyed from the receiving to the
    sending entity depends on the overload control method used
    (i.e., loss-based, rate-based or window-based overload control;
    see <xref target="sec:method" />), the overload control algorithm
    (see <xref target="sec:algorithm" />) as well as other design
    parameters. In any case, the feedback (F) enables the sending
    entity to adjust the amount of traffic forwarded to the receiving
    entity to a level that is acceptable to the receiving entity
    without causing overload.</t>

    <figure title="System Model for Overload Control" anchor="fig:archa">
<artwork><![CDATA[
       Sending                Receiving 
        Entity                  Entity
  +----------------+      +----------------+    
  |    Server A    |      |    Server B    | 
  |  +----------+  |      |  +----------+  |    -+
  |  | Control  |  |  F   |  | Control  |  |     | 
  |  | Function |<-+------+--| Function |  |     | 
  |  +----------+  |      |  +----------+  |     |
  |     T |        |      |       ^        |     | Overload 
  |       v        |      |       | S      |     | Control
  |  +----------+  |      |  +----------+  |     |
  |  | Actuator |  |      |  | Monitor  |  |     | 
  |  +----------+  |      |  +----------+  |     |
  |       |        |      |       ^        |    -+
  |       v        |      |       |        |    -+
  |  +----------+  |      |  +----------+  |     | 
<-+--|   SIP    |  |      |  |   SIP    |  |     |  SIP
--+->|Processor |--+------+->|Processor |--+->   | System
  |  +----------+  |      |  +----------+  |     | 
  +----------------+      +----------------+    -+

 ]]></artwork>
    </figure>

  </section>

  <section title="Degree of Cooperation">

    <t>A SIP request is often processed by more than one SIP
    server on its path to the destination. Thus, a design choice for
    overload control is where to place the components of overload
    control along the path of a request and, in particular, where to place the
    Monitor and Actuator. This design choice determines the degree of
    cooperation between the SIP servers on the path. Overload control
    can be implemented hop-by-hop with the Monitor on one server and
    the Actuator on its direct upstream neighbor. Overload control can
    be implemented end-to-end with Monitors on all SIP servers along
    the path of a request and one Actuator on the sender. In this
    case, Monitors have to cooperate to jointly determine the current
    resource usage on this path. Finally, overload control can be
    implemented locally on a SIP server if Monitor and Actuator reside
    on the same server. In this case, the sending entity and receiving
    entity are the same SIP server and Actuator and Monitor operate on
    the same SIP processor (although the Actuator typically operates
    on a pre-processing stage in local overload control). These three 
    configurations are shown in <xref target="fig:hbh-e2e" />.</t> 

    <figure title="Degree of Cooperation between Servers" anchor="fig:hbh-e2e">
<artwork><![CDATA[

                      +-+                    +---------+          
                      v |           +------+ |         |      
 +-+      +-+        +---+          |      | |        +---+   
 v |      v |    //=>| C |          v      | v    //=>| C |  
+---+    +---+ //    +---+       +---+    +---+ //    +---+   
| A |===>| B |                   | A |===>| B |             
+---+    +---+ \\    +---+       +---+    +---+ \\    +---+  
                 \\=>| D |                   ^    \\=>| D | 
                     +---+                   |        +---+ 
                      ^ |                    |         |     
                      +-+                    +---------+       

        (a) local                      (b) hop-by-hop

   +------(+)---------+
   |       ^          |
   |       |         +---+
   v       |     //=>| C |
+---+    +---+ //    +---+
| A |===>| B |     
+---+    +---+ \\    +---+
   ^       |     \\=>| D |
   |       |         +---+  
   |       v          |   
   +------(+)---------+ 

      (c) end-to-end

 ==> SIP request flow
 <-- Overload feedback loop

 ]]></artwork>
    </figure>

    <section title="Hop-by-Hop">

      <t>The idea of hop-by-hop overload control is to instantiate a
      separate control loop between all neighboring SIP servers that
      directly exchange traffic. I.e., the Actuator is located on the
      SIP server that is the direct upstream neighbor of the SIP
      server that has the corresponding Monitor. Each control loop
      between two servers is completely independent of the control
      loop between other servers further up- or downstream. In the
      example in <xref target="fig:hbh-e2e" />(b), three independent
      overload control loops are instantiated: A - B, B - C and B -
      D. Each loop only controls a single hop. Overload feedback
      received from a downstream neighbor is not forwarded further
      upstream. Instead, a SIP server acts on this feedback, for 
      example, by re-routing or rejecting traffic if needed. If the
      upstream neighbor of a server also becomes overloaded, it will
      report this problem to its upstream neighbors, which again take
      action based on the reported feedback. Thus, in hop-by-hop
      overload control, overload is always resolved by the direct
      upstream neighbors of the overloaded server without the need to
      involve entities that are located multiple SIP hops away.</t> 

      <t>Hop-by-hop overload control reduces the impact of overload on
      a SIP network and, in particular, can avoid congestion
      collapse. In addition, hop-by-hop overload control is simple and
      scales well to networks with many SIP entities. It does not
      require a SIP entity to aggregate a large number of overload
      status values or keep track of the overload status of SIP
      servers it is not communicating with.</t> 

    </section>

    <section title="End-to-End">

      <t>End-to-end overload control implements an overload control
      loop along the entire path of a SIP request, from UAC to UAS. An 
      end-to-end overload control mechanism consolidates overload
      information from all SIP servers on the way including all 
      proxies and the UAS and uses this information to throttle
      traffic as far upstream as possible. An end-to-end overload
      control mechanism has to be able to frequently collect the
      overload status of all servers on the potential path(s) to a
      destination and combine this data into meaningful overload
      feedback.</t>

      <t>A UA or SIP server only needs to throttle requests if it
      knows that these requests will eventually be forwarded to an
      overloaded server. For example, if D is overloaded in <xref
      target="fig:hbh-e2e" />(c), A should only throttle requests it
      forwards to B when it knows that they will be forwarded to D. It
      should not throttle requests that will eventually be forwarded
      to C, since server C is not overloaded. In many cases, it is
      difficult for A to determine which requests will be routed to C
      and D since this depends on the local routing decision made by
      B.</t>

      <t>The main problem of end-to-end overload control is its
      inherent complexity, since a UAC or SIP server needs to monitor
      all potential paths to a destination in order to determine which
      requests should be throttled and which requests may be sent. In
      addition, the routing decisions of a SIP server depend on local
      policy, which can be difficult to infer for an upstream
      neighbor. Therefore, end-to-end overload control is likely to
      only work well in simple, well-known topologies (e.g., a server
      that is known to only have one downstream neighbor) or if a
      UA/server sends many requests to the exact same destination.</t>

    </section>

    <section title="Local Overload Control"> 
      
      <t>Local overload control does not require an explicit overload
      signal between SIP entities as it is implemented locally on a
      SIP server. It can be used by a SIP server to determine, based
      on current resource usage, when to reject incoming requests
      instead of forwarding them. Local overload control can be used in
      conjunction with an explicit overload control mechanism and
      provides an additional layer of protection against overload, 
      for example, when upstream servers do not support explicit
      overload control. In general, servers should use an explicit
      mechanism, if available, to throttle upstream neighbors before
      using local overload control as a mechanism of last resort.</t>

  </section>

  </section>

  <section title="Topologies" anchor="sec:topologies">

    <t>The following topologies describe four generic SIP server 
    configurations, each of which poses specific challenges for an
    overload control mechanism.</t> 

    <t>In the "load balancer" configuration shown in <xref
    target="fig:multiple" />(a) a set of SIP servers (D, E and F)
    receives traffic from a single source A. A load balancer is a
    typical example of such a configuration. In this configuration,
    overload control needs to prevent server A (i.e., the load
    balancer) from sending too much traffic to any of its downstream
    neighbors D, E and F. If one of the downstream neighbors becomes
    overloaded, A can direct traffic to the servers that still have
    capacity. If one of the servers serves as a backup, it can be
    activated once one of the primary servers reaches overload.</t>

    <t>If A can reliably determine that D, E and F are its only
    downstream neighbors and all of them are in overload, it may
    choose to report overload upstream on behalf of D, E and
    F. However, if the set of downstream neighbors is not fixed or
    only some of them are in overload, then A should not report
    overload upstream, since A can still forward the requests destined
    for non-overloaded downstream neighbors. These requests would be
    throttled as well if A used overload control towards its
    upstream neighbors.</t>

    <t>In the "multiple sources" configuration shown in <xref
    target="fig:multiple" />(b), a SIP server D receives traffic from  
    multiple upstream sources A, B and C. Each of these sources can
    contribute a different amount of traffic, which can vary over
    time. The set of active upstream neighbors of D can change as
    servers may become inactive and previously inactive servers may
    start contributing traffic to D.</t>
    
    <t>If D becomes overloaded, it needs to generate feedback to
    reduce the amount of traffic it receives from its upstream
    neighbors. D needs to decide by how much each upstream neighbor
    should reduce traffic. This decision can require the consideration
    of the amount of traffic sent by each upstream neighbor and it may
    need to be re-adjusted as the traffic contributed by each upstream
    neighbor varies over time. </t>

<!--
    <t>An important goal for overload control is to achieve fairness
    across upstream neighbors. I.e., no upstream neighbor should be
    required to throttle more than another neighbor. In a fair system,
    each request that is routed to D has an equal chance of being
    processed, independent of the upstream neighbor it is coming
    from. A SIP server may have local policies that prefers some
    sources over others. For example, it can throttle a less preferred
    upstream neighbor more or earlier than a preferred neighbor.</t>
-->
    <t>In many configurations, SIP servers form a "mesh" as shown in <xref
    target="fig:multiple" />(c). Here, multiple upstream servers A, B
    and C forward traffic to multiple alternative servers D and
    E. This configuration is a combination of the "load balancer" and
    "multiple sources" scenario.</t> 

    <figure title="Topologies" anchor="fig:multiple">
<artwork><![CDATA[

                +---+              +---+
             /->| D |              | A |-\
            /   +---+              +---+  \
           /                               \   +---+
    +---+-/     +---+              +---+    \->|   |
    | A |------>| E |              | B |------>| D |
    +---+-\     +---+              +---+    /->|   |
           \                               /   +---+
            \   +---+              +---+  / 
             \->| F |              | C |-/ 
                +---+              +---+

    (a) load balancer             (b) multiple sources    

    +---+                          
    | A |---\                        a--\
    +---+-\  \---->+---+                 \
           \/----->| D |             b--\ \--->+---+
    +---+--/\  /-->+---+                 \---->|   |
    | B |    \/                      c-------->| D |
    +---+---\/\--->+---+                       |   |
            /\---->| E |            ...   /--->+---+
    +---+--/   /-->+---+                 /
    | C |-----/                      z--/
    +---+                         

          (c) mesh                   (d) edge proxy

 ]]></artwork>
    </figure>

    <t>Overload control that is based on reducing the number of
    messages a sender is allowed to send is not suited for servers
    that receive requests from a very large population of senders, each of
    which only infrequently sends a request. This scenario is shown in 
    <xref target="fig:multiple" />(d). An edge proxy that is connected
    to many UAs is a typical example of such a configuration.</t>

    <t>Since each UA typically only contributes a few requests, which
    are often related to the same call, it cannot decrease its message
    rate to resolve the overload. In such a configuration, a SIP
    server can resort to local overload control by rejecting a
    percentage of the requests it receives with 503 (Service Unavailable)
    responses. Since there are many upstream neighbors that contribute  
    to the overall load, sending 503 (Service Unavailable) to a
    fraction of them can gradually reduce load without entirely
    stopping all incoming traffic. Using 503 (Service Unavailable)
    towards individual sources cannot, however, prevent overload if a
    large number of users place calls at the same time.</t>

    <t><list>
        <t>Note: The requirements of the "edge proxy" topology
        differ from those of the other topologies, which may
        require a different method for overload control.</t> 
    </list></t>

  </section>

  <section title="Type of Overload Control Feedback" anchor="sec:method">

    <t>The type of feedback generated by a receiver to limit the
    amount of traffic it receives is an important aspect of the
    design. We discuss the following three different types of overload
    control feedback: rate-based, loss-based and window-based overload
    control.</t>

    <section title="Rate-based Overload Control">

      <t>The key idea of rate-based overload control is to limit the
      rate at which an upstream element is allowed to forward requests
      to the downstream neighbor. If overload occurs, a SIP server
      instructs each upstream neighbor to send at most X requests per 
      second. Each upstream neighbor can be assigned a different rate
      cap. </t>

      <t>The rate cap ensures that the number of requests received by 
      a SIP server never increases beyond the sum of all rate caps
      granted to upstream neighbors. It can protect a SIP server
      against overload even during load spikes if no new upstream
      neighbors start sending traffic. New upstream neighbors need to
      be factored into the rate caps assigned as soon as they
      appear. The current overall rate cap used by a SIP server is 
      determined by an overload control algorithm, e.g., based on
      system load.</t>

      <t>An algorithm for the sending entity to implement a rate cap
      of a given number of requests per second X is request 
      gapping. After transmitting a request to a downstream neighbor,
      a server waits for 1/X seconds before it transmits the next
      request to the same neighbor. Requests that arrive during the
      waiting period are not forwarded and are either redirected, 
      rejected or buffered.</t>
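
      <t>As an illustration only, request gapping as described above
      could be sketched in Python as follows. The class and method
      names are hypothetical, and the caller is assumed to supply the
      current time in seconds (for example, from a monotonic clock).
      The sketch only decides whether a request may be forwarded;
      whether a held-back request is redirected, rejected or buffered
      is left to the caller.</t>

      <figure title="Sketch: Request Gapping for a Rate Cap of X Requests per Second">
<artwork><![CDATA[
```python
class RequestGapper:
    """Enforce a rate cap of X requests per second towards one
    downstream neighbor by waiting 1/X seconds between requests."""

    def __init__(self, rate_cap: float) -> None:
        if rate_cap <= 0:
            raise ValueError("rate_cap must be positive")
        self.gap = 1.0 / rate_cap            # waiting period in seconds
        self.next_allowed = float("-inf")    # first request is always allowed

    def try_forward(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may be
        forwarded, and start a new waiting period if so."""
        if now >= self.next_allowed:
            self.next_allowed = now + self.gap
            return True
        # Request arrived during the waiting period: not forwarded.
        return False


if __name__ == "__main__":
    gapper = RequestGapper(rate_cap=10)      # at most 10 requests/second
    for t in (0.0, 0.05, 0.1, 0.15, 0.25):
        print(t, gapper.try_forward(t))
```
]]></artwork>
      </figure>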

      <t>A drawback of this mechanism is that it requires a SIP
      server to assign a certain rate cap to each of its upstream
      neighbors during an overload condition based on its overall
      capacity. Effectively, a server assigns a share of its capacity
      to each upstream neighbor during overload. The server needs to
      ensure that the sum of all rate caps assigned to upstream
      neighbors is not (significantly) higher than its actual
      processing capacity. This requires a SIP server to keep track of
      the set of upstream neighbors and to adjust the rate cap if a
      new upstream neighbor appears or an existing neighbor stops
      transmitting. If the cap assigned to upstream neighbors is too
      high, the server may still experience overload. However, if the
      cap is too low, the upstream neighbors will reject requests even
      though they could be processed by the server. </t> 

      <t>A SIP server can evaluate the amount of load it receives from
      each upstream neighbor and assign a rate cap that is suitable
      for this neighbor without limiting it too much. This way, the
      SIP server can reallocate to another neighbor the resources that
      one upstream neighbor leaves unused because it is sending fewer
      requests than allowed by its rate cap.</t> 
      
      <t>An alternative technique is to allocate to each upstream
      neighbor a rate cap that is a fixed proportion of some control
      variable, X, where X is initially equal to the capacity of the
      SIP server. The server then increases or decreases X until the 
      workload arrival rate matches the actual server
      capacity. Usually, this will mean that the sum of the rate caps
      sent out by the server (=X) exceeds its actual capacity, but it
      enables upstream neighbors that are not generating more than
      their fair share of the work to be effectively unrestricted. An
      advantage of this approach is that the server only has to
      measure the aggregate arrival rate, and that the calculation of
      the individual rate caps is fairly trivial. </t>
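
      <t>As an illustration only, the following Python sketch shows
      one way the control variable X could be used. The function
      names, the per-neighbor shares and the multiplicative step size
      are assumptions made for this example; this document does not
      specify how X is adjusted.</t>

      <figure title="Sketch: Rate Caps as Fixed Proportions of a Control Variable X">
<artwork><![CDATA[
```python
from typing import Dict


def rate_caps(x: float, shares: Dict[str, float]) -> Dict[str, float]:
    """Give each upstream neighbor a fixed proportion of X.

    `shares` maps a neighbor name to its proportion; the proportions
    are assumed to sum to 1 so that the caps sum to X.
    """
    return {neighbor: x * share for neighbor, share in shares.items()}


def adjust_control_variable(x: float, arrival_rate: float,
                            capacity: float, step: float = 0.1) -> float:
    """Move X towards the point where the measured aggregate arrival
    rate matches the server capacity (assumed multiplicative step)."""
    if arrival_rate > capacity:
        return x * (1.0 - step)   # too much traffic: tighten all caps
    if arrival_rate < capacity:
        return x * (1.0 + step)   # spare capacity: loosen all caps
    return x


if __name__ == "__main__":
    shares = {"A": 0.5, "B": 0.25, "C": 0.25}
    x = 100.0                     # initially equal to server capacity
    x = adjust_control_variable(x, arrival_rate=80.0, capacity=100.0)
    print(rate_caps(x, shares))
```
]]></artwork>
      </figure>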

    </section>
   
    <section title="Loss-based Overload Control">
 
      <t>A loss percentage enables a SIP server to ask an upstream
      neighbor to reduce the number of requests it would normally
      forward to this server by a percentage X. For example, a SIP
      server can ask an upstream neighbor to reduce the number of
      requests this neighbor would normally send by 10%. The upstream
      neighbor then redirects or rejects X percent of the traffic that
      is destined for this server.</t>
	
      <t>An algorithm for the sending entity to implement a loss
      percentage is to draw a random number between 1 and 100 for each
      request to be forwarded. The request is not forwarded to the
      server if the random number is less than or equal to X. </t>
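
      <t>As an illustration only, the random draw described above
      could be sketched in Python as follows. The function name is
      hypothetical, and the loss percentage X is assumed to be an
      integer between 0 and 100.</t>

      <figure title="Sketch: Applying a Loss Percentage X">
<artwork><![CDATA[
```python
import random


def should_forward(loss_percent: int, rng: random.Random) -> bool:
    """Decide whether one request is forwarded under a loss
    percentage of `loss_percent` (X)."""
    draw = rng.randint(1, 100)    # uniform integer in 1..100, inclusive
    # The request is not forwarded if the draw is less than or
    # equal to X, so on average X percent of requests are held back.
    return draw > loss_percent


if __name__ == "__main__":
    rng = random.Random()
    forwarded = sum(should_forward(10, rng) for _ in range(10000))
    print("forwarded", forwarded, "of 10000 requests")
```
]]></artwork>
      </figure>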

      <t>An advantage of loss-based overload control is that the
      receiving entity does not need to track the set of upstream
      neighbors or the request rate it receives from each upstream
      neighbor. It is sufficient to monitor the overall system
      utilization. To reduce load, a server can ask its upstream
      neighbors to lower the traffic forwarded by a certain
      percentage. The server calculates this percentage by 
      combining the loss percentage that is currently in use (i.e.,
      the loss percentage the upstream neighbors are currently using
      when forwarding traffic), the current system utilization and the
      desired system utilization. For example, if the server load
      approaches 90% and the current loss percentage is set to a 50%
      traffic reduction, then the server can decide to increase the loss
      percentage to 55% in order to get to a system utilization of 
      80%. Similarly, the server can lower the loss percentage if
      permitted by the system utilization. </t>
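
      <t>As an illustration only, the following Python sketch shows
      one possible way to combine the three values. It assumes that
      system utilization scales linearly with the fraction of traffic
      admitted, which is a simplification not made by this document;
      the function name is hypothetical. For the example above (50%
      loss, 90% utilization, 80% target), it yields roughly 55.6%,
      which is in line with the 55% mentioned in the text.</t>

      <figure title="Sketch: Computing a New Loss Percentage">
<artwork><![CDATA[
```python
def next_loss_percent(current_loss: float, utilization: float,
                      target_utilization: float) -> float:
    """Compute a new loss percentage from the loss percentage in use,
    the current utilization and the desired utilization (all in %).

    Assumption: utilization is proportional to the admitted share of
    traffic, i.e. to (100 - loss percentage).
    """
    if utilization <= 0:
        raise ValueError("utilization must be positive")
    admitted = 100.0 - current_loss
    # Scale the admitted share by the ratio of desired to current load.
    new_admitted = admitted * target_utilization / utilization
    # Clamp to a valid percentage.
    return min(100.0, max(0.0, 100.0 - new_admitted))


if __name__ == "__main__":
    print(next_loss_percent(50.0, 90.0, 80.0))
```
]]></artwork>
      </figure>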

<!--
       Loss-based overload control
      achieves fairness among incoming requests if all upstream
      neighbors are throttled by the same percentage. In this case,
      each request destined for an overloaded server has the same
      chance of being rejected by overload control.</t>
-->

      <t>The main drawback of percentage throttling is that the
      throttle percentage needs to be adjusted to the current number
      of requests received by the server. This is particularly
      important if the number of requests received fluctuates
      quickly. For example, if a SIP server sets a throttle value of
      10% at time t1 and the number of requests offered by its
      upstream neighbors increases by 20% between time t1 and t2
      (t1&lt;t2), then the server will still see an increase in
      traffic of about 8% (0.9 x 1.2 = 1.08) relative to the
      unthrottled load at time t1, even though all upstream neighbors
      have reduced traffic by 10% as told. Thus, percentage
      throttling requires an adjustment of the
      throttling percentage in response to the traffic received and
      may not always be able to prevent a server from encountering
      brief periods of overload in extreme cases. </t>

    </section>

    <section title="Window-based Overload Control" anchor="sec:window">

      <t>The key idea of window-based overload control is to allow an
      entity to transmit a certain number of messages before it needs
      to receive a confirmation for the messages in transit. Each
      sender maintains an overload window that limits the number of
      messages that can be in transit without being confirmed.</t>

      <t>Each sender maintains an unconfirmed message counter for each
      downstream neighbor it is communicating with. For each message
      sent to the downstream neighbor, the counter is increased by
      one. For each confirmation received, the counter is decreased by
      one. The sender stops transmitting messages to the downstream
      neighbor when the unconfirmed message counter has reached the
      current window size.</t>

      <t>A crucial parameter for the performance of window-based
      overload control is the window size. The window size, together
      with the round-trip time between sender and receiver, determines
      the effective message rate that can be achieved. Each sender has
      an initial window size it uses when first sending a
      request. This window size can be changed based on the feedback
      the sender receives from the receiver.  </t>
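
      <t>The relationship between window size, round-trip time and
      message rate can be approximated as follows. This Python sketch
      assumes that each confirmation arrives one round-trip time
      after the message was sent and that the sender always has
      messages waiting; the function name max_message_rate is
      hypothetical.</t>

      <figure><artwork><![CDATA[
```python
def max_message_rate(window_size, rtt_seconds):
    """Upper bound on messages per second for one sender.

    At most window_size messages can be unconfirmed at once, and
    (by assumption) each is confirmed one round-trip time later.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_size / rtt_seconds


# Hypothetical values: window of 10 messages, 50 ms round trip.
rate = max_message_rate(10, 0.05)
print(round(rate))
```
]]></artwork></figure>

      <t>A window of 10 messages and a round-trip time of 50 ms thus
      permit roughly 200 messages per second from this sender.</t>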

      <t>The sender adjusts its window size as soon as it receives the
      corresponding feedback from the receiver. If the new window size
      is smaller than the current unconfirmed message counter, the
      sender stops transmitting messages until more messages are
      confirmed and the current unconfirmed message counter is less
      than the window size.</t>
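
      <t>The sender behavior described above can be sketched in a few
      lines of Python. This is a minimal illustration and not a
      normative definition; the class WindowedSender and its method
      names are hypothetical, and one instance is assumed per
      downstream neighbor.</t>

      <figure><artwork><![CDATA[
```python
class WindowedSender:
    """Tracks unconfirmed messages toward one downstream neighbor."""

    def __init__(self, initial_window):
        self.window = initial_window
        self.unconfirmed = 0

    def can_send(self):
        # Sending is allowed only while the counter is below the window.
        return self.unconfirmed < self.window

    def on_send(self):
        if not self.can_send():
            raise RuntimeError("window full, message must wait")
        self.unconfirmed += 1

    def on_confirmation(self):
        # Each confirmation frees one slot in the window.
        if self.unconfirmed > 0:
            self.unconfirmed -= 1

    def on_window_update(self, new_window):
        # Applied immediately; may leave unconfirmed >= window.
        self.window = new_window


sender = WindowedSender(initial_window=3)
for _ in range(3):
    sender.on_send()
sender.on_window_update(2)      # receiver shrinks the window
sender.on_confirmation()        # 2 unconfirmed: still blocked
blocked = not sender.can_send()
sender.on_confirmation()        # 1 unconfirmed: sending resumes
```
]]></artwork></figure>

      <t>In this run the sender stays blocked after the window shrinks
      from 3 to 2 and only resumes once the unconfirmed message
      counter has dropped below the new window size.</t>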

      <t>A sender should not treat the reception of a 100 Trying
      response as an implicit confirmation for a message. 100 Trying
      responses are often created by a SIP server very early in 
      processing and do not indicate that a message has been successfully
      processed and cleared from the input buffer. If the downstream
      neighbor is a stateless proxy, it will not create 100 Trying
      responses at all and instead pass through 100 Trying responses
      created by the next stateful server. Also, 100 Trying responses
      are typically only created for INVITE requests. Explicit message
      confirmations via an overload feedback mechanism do not have
      these problems.</t>

      <t>The behavior and issues of window-based overload control are
      similar to rate-based overload control, in that the total
      available receiver buffer space needs to be divided among all
      upstream neighbors. However, unlike rate-based overload control,
      window-based overload control can ensure that the receiver
      buffer does not overflow under normal conditions. The
      transmission of messages by senders is effectively clocked by
      message confirmations received from the receiver. A buffer
      overflow can still occur if a large number of new upstream
      neighbors arrive at the same time, since each of them starts
      sending with its initial window size. </t>

<!--
Window-based
      overload control is also more robust against errors in the
      division of capacity among upstream neighbors than rate-based
      overload control. A window size that is too large will create a
      buffer overflow, however, senders will eventually stop
      transmitting new requests. 
-->

    </section>

    <section title="On-/Off Overload Control" anchor="sec:onoff">

      <t>On-/off overload control feedback enables a SIP server to
      turn the traffic it is receiving either on or off. The 503
      (Service Unavailable) response implements on-/off overload
      control. On-/off overload control is less effective in
      controlling load than the fine-grained control methods above. In
      fact, the above methods can realize on-/off overload control,
      e.g., by setting the allowed rate to either zero or
      unlimited.</t> 

    </section>

  </section>

  <section title="Overload Control Algorithms" anchor="sec:algorithm">

    <t>An important aspect of the design of an overload control mechanism
    is the overload control algorithm. The control algorithm
    determines when the amount of traffic a SIP server receives needs
    to be decreased and when it can be increased.</t> 

    <t>Overload control algorithms have been studied extensively,
    and many different algorithms exist. Given this variety, it seems
    reasonable to define a baseline algorithm and allow the use of
    other algorithms as long as they do not violate the protocol
    semantics. This also leaves room for the development of future
    algorithms, which may lead to better performance. </t>

  </section>

  <section title="Self-Limiting">

    <t>An important design aspect for an overload control mechanism
    is that it is self-limiting. That is, an overload control
    mechanism should stop a sender if the sender does not receive any
    feedback from the receiver. This prevents an overloaded server
    that has become unable to generate overload control feedback
    from being overwhelmed with requests.</t>

    <t>Window-based overload control is inherently self-limiting since
    a sender cannot continue without receiving confirmations. Servers
    using rate- or loss-based overload control need to be configured
    to stop transmitting if they do not receive any feedback from the
    receiver.</t> 
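
    <t>For rate- or loss-based overload control, one possible way to
    obtain self-limiting behavior is a feedback timeout at the
    sender. The Python sketch below is an assumption made for
    illustration: the class FeedbackWatchdog, its methods and the
    5 second timeout are hypothetical and are not defined by any
    overload control mechanism. Time values are passed in explicitly
    to keep the example deterministic.</t>

    <figure><artwork><![CDATA[
```python
class FeedbackWatchdog:
    """Stops a sender when receiver feedback has gone silent."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s
        # Treat the start time as the most recent feedback so the
        # sender may transmit for one timeout period initially.
        self.last_feedback = now

    def on_feedback(self, now):
        # Called whenever overload feedback arrives from the receiver.
        self.last_feedback = now

    def may_send(self, now):
        # Sending is allowed only while the feedback is recent enough.
        return (now - self.last_feedback) <= self.timeout_s


watchdog = FeedbackWatchdog(timeout_s=5.0, now=0.0)
early = watchdog.may_send(now=4.0)     # feedback still recent
silent = watchdog.may_send(now=6.0)    # timeout exceeded, stop
watchdog.on_feedback(now=6.5)          # receiver is responsive again
resumed = watchdog.may_send(now=7.0)
```
]]></artwork></figure>

    <t>The sender in this sketch stops transmitting once the timeout
    expires and resumes only after new feedback has arrived.</t>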

  </section>

  <section anchor="sec:security" title="Security Considerations">
 
    <t>[TBD.]</t> 

  </section>

  <section anchor="sec:iana" title="IANA Considerations">
  
    <t>[TBD.]</t> 

  </section>

</middle>

<back>

<section title="Contributors">

  <t>Contributors to this document are: Ahmed Abd el al (Sonus
   Networks), Mary Barnes (Nortel), Carolyn Johnson (AT&amp;T Labs), Daryl
   Malas (CableLabs), Eric Noel (AT&amp;T Labs), Tom Phelan (Sonus
   Networks), Jonathan Rosenberg (Cisco), Henning Schulzrinne
   (Columbia University), Charles Shen (Columbia University), Nick
   Stewart (British Telecommunications plc), Rich Terpstra (Level 3),
   Indra Widjaja (Bell Labs/Alcatel-Lucent). Many thanks!</t>

</section> 

<references title='Informative References'>

  &rfc3261;

  &rfc4412;

  &i-d.rosenberg-sipping-overload-reqs;

</references>

</back>

</rfc>


 

