<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [      
  <!ENTITY RFC2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
  <!ENTITY RFC2326 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2326.xml'>
  <!ENTITY RFC3261 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3261.xml'>  
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt"?> 
<rfc ipr='full3978' category='info'>
<?rfc toc='yes'?>
<?rfc compact='yes'?>
<?rfc sortrefs='yes'?>

<front>
	<title abbrev='media playback protocol requirements'>
		Media Playback Control Protocol Requirements
	</title>
	
  <author initials='S.' surname='Whitehead' fullname='Steven Whitehead'>
    <organization>Verizon Laboratories Inc.</organization>
    <address>
      <postal>
	<street>40 Sylvan Road</street>
	<city>Waltham</city> 
	<region>MA</region>
	<code>02451</code>
	<country>USA</country>
      </postal> 
      <email>steven.d.whitehead@verizon.com</email>
    </address>
  </author>
  
  <author initials='M.J.' surname='Montpetit' fullname='Marie-Jose Montpetit'>
    <organization>Motorola</organization>
    <address>
      <postal>
	<street>900 Chelmsford Street</street>
	<city>Lowell</city> 
	<code>01851</code>
	<region>MA</region>
	<country>USA</country>
      </postal> 
      <email>mmontpetit@motorola.com</email>
    </address>
  </author>
  
  <author initials='X.' surname='Marjou' fullname='Xavier Marjou'>
    <organization>France Telecom</organization>
    <address>
      <postal>
	<street>Rue Pierre Marzin</street>
	<city>Lannion</city> 
	<code>22307</code>
	<region>Brittany</region>
	<country>France</country>
      </postal> 
      <email>xavier.marjou@orange-ftgroup.com</email>
    </address>
  </author>
  
  <author initials="S." surname="Ganesan" fullname="Sam Ganesan">
			<organization>Motorola</organization>
			<address>
				<postal>
					<street>80 Central Street</street>
					<street/>
					<city>Boxborough</city>
					<region>MA</region>
					<code>01719</code>
					<country>US</country>
				</postal>
				<email>sam.ganesan@motorola.com</email>
			</address>
		</author>
	
	<author initials="J." surname="Lindquist" fullname="Jan Lindquist">
			<organization>Ericsson</organization>
			<address>
				<postal>
					<street>Tellusborgsvaegen 83-87</street>
					<street/>
					<city>Hagersten</city>
					<region>Hagersten</region>
					<code>12637</code>
					<country>Sweden</country>
				</postal>
				<email>jan.lindquist@ericsson.com</email>
			</address>
		</author>
	
	<date month='May' year='2008' />
	
	<area>Real-time Applications and Infrastructure Area</area>

  <keyword>SIP</keyword>
  <keyword>RTSP</keyword>
  <keyword>SDP</keyword>
  <keyword>VCR</keyword>
  <keyword>Trick play</keyword>
  <keyword>Trick modes</keyword>
  <keyword>Extension</keyword>
    
  <abstract>
  
    <t>The media playback control functionality controls the delivery of
       streaming media by means of commands such as pause, fast forward,
       fast rewind, slow forward, and slow rewind.  This document presents some
       of the requirements for a media control protocol that contains no
       session setup semantics.
    </t>
  </abstract>
  
</front>

<middle>

  <section title="Introduction">

  <t>This document defines the requirements for a media control protocol,
   distinct from the session control protocol, for content-on-demand
   media streams.</t>

   <t>Historically, media stream
   delivery has been controlled by RTSP, which serves as both the session setup
   protocol and the media control protocol.  RTSP <xref target="RFC2326"/> and its
   successor <xref target="RFC2326bis"/> define semantics for both session setup and
   media control, with commands such as pause and rewind.</t>

   <t>Similarly, SIP has been the protocol of choice for end-to-end session
   establishment and rendezvous <xref target="RFC3261"/>.  These functionalities, in the
   context of conversational services, have been the main raison d'etre
   and forte of SIP.</t>

   <t>There exist environments, circumstances, and use cases where it is
   desirable to use SIP for session establishment and rendezvous, while
   retaining RTSP's capabilities for media and playback control.  Such
   circumstances are increasing in number, given that user agents with
   wide-ranging capabilities are becoming much more common in deployments.
   Good integration with existing RTSP deployments is also desirable
   under these conditions.</t>

   <t><xref target="use-cases"/> describes some use cases that motivate this work.</t>

   <t><xref target="requirements"/> lays out the requirements for a media control protocol,
   ideally some form of lightweight RTSP without its session
   establishment semantics, that can be used with a session
   establishment and rendezvous protocol.</t>

  </section>


 
	<section title="Terminology">
      
	<t> In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 <xref target="RFC2119" />.</t>
  
	</section>


	<section anchor='use-cases' title="Use Case Scenarios">

   <t>The scope of scenarios for this document includes applications with
   the following characteristics: content on demand, streaming media,
   unicast media streams, live or recorded content, and ubiquitous access
   (any device, any access).</t>

   <t>While of interest, non-streaming media applications, such as
   downloaded media services, are outside the scope of this document.</t>

<section title="Characteristics">

   <t>For the purposes of this document, the term 'controlled streaming
   media application' represents a class of applications with the
   following characteristics:
  
  <list style="symbols"> 
		<t> Multiple servers can be sources of content, which appears as
      a single multiplexed stream at the client.</t> 
		<t> One or more clients can receive the content.</t>
		<t> The media stream(s) need to be delivered isochronously.  In the
      most common case, the client intends to begin rendering the media
      before delivery is complete.
    </t>
		<t> Less commonly, but equally validly, the server does not have resources
      to buffer content until the client is ready to receive it, e.g., a
      live feed.
    </t>
		<t> A session exists between source (e.g., server or peer) and
      destination (e.g., client or peer).
    </t>
		<t> The session is established, managed, and terminated through the
      use of a signaling protocol, in which control messages are
      exchanged (either directly or indirectly) between the source and
      the destination.  This is referred to as 'session signaling'.
   </t>
		<t> The application supports media stream control.  The client(s), or
      a proxy element acting on behalf of the client(s), has the ability
      to manipulate the media stream (or other aspect of the
      application) via signaling.  This is referred to as media control
      signaling (or application signaling).</t>
		</list>
   </t>
   
   </section>



<section title="Use Case Descriptions">

   <t>As IP-based broadband data services have continued to develop and
   expand, opportunities for streaming media applications have also
   proliferated and expanded beyond the traditional framework.  This
   section describes several streaming media application use case
   scenarios.  These scenarios illustrate the variety of conditions and
   environments in which streaming media applications need to operate.</t>

   <t>Use cases serve to clarify the notion of a 'streaming
   media application' and to explore the application space.  The
   objectives are to:
   <list style="symbols"> 
   <t>Clarify the frame and scope of the discussion.</t>
   <t>Illustrate some of the usage scenarios.</t>
   <t>Identify some of the key attributes that characterize these use
      cases.</t>
   </list>
  </t>
  
</section>
 


<section title="Server Control of Streaming Session">

   <t>During a streaming session, the operator may want to redirect or move
   pending sessions, for example in order to upgrade the server.
   In that case, the server will initiate a session redirect.  Another use
   case of server control is when the server decides to
   initiate a session to the user, based on settings preconfigured by the user
   (e.g., reminders).  Further, sessions that are indefinitely paused by
   the client need to be terminated and server resources reclaimed at
   the operator's discretion.  This would also be true for sessions where
   the client may have become unresponsive.</t>
   
</section>

<section title="Remote Access to Private/Firewalled Video Content">

   <t>In this use case, the user, while not on his or her home network, wants to
   access content that is stored on his or her personal or home network, e.g.,
   a pre-recorded show on a PVR device, or a monitoring camera at the
   home location that is capable of providing a live feed as well as
   recording it locally as a PVR asset.</t>

   <t>In the case of a live monitoring camera in a home network, the user
   wishes to transition from watching a live stream of the feed to being
   able to move backwards and forwards using media control commands on
   the stored content of the monitoring device.  This translates
   logically to transitioning from watching live TV programming to
   time-shifted TV or PVR-type viewing.  Being able to do so in the
   context of the same session is desirable.</t>

   <t>In the case of a home-network-based PVR, it is more than likely that
   the home gateway is not set up to be an inbound device for the various
   session setup requests that would enable the external client to traverse the
   protocol-unaware IP NAT device commonly found at the edge of home
   networks.</t>

   <t>In both these cases, the video server is behind a firewall.  In the
   first case, the transition from one mode (live feed) to another
   (COD) would entail multiple messages for staying within a session.
   Also, the client, being exterior to the firewall, needs to establish a
   TCP connection from "out" to "in".  RTSP as it currently stands does
   not deal with these adequately.  A rendezvous-capable protocol like
   SIP could provide this in addition to client identity and location.</t>

</section>

<section title="VOD Services that Require Resource or QoS Guarantees">

   <t>Consider a Video on Demand (stored video) service provided as a
   unicast session to an end user device from a server.  The user
   requests a VOD movie.  If the operator uses a network proxy to
   request and guarantee QoS for the delivery of the movie from the
   server to the client, RTSP currently does not provide a means of
   guaranteeing that all subsequent messages that form part of the RTSP
   session will go through the network proxy that manages the QoS.  In
   this particular use case, an end-to-end session setup and management
   protocol would be helpful.</t>

</section>

<section title="Intelligent Selection of Media Encoding">

   <t>A user orders content to be delivered to his or her current device. The
   content could exist in different formats (e.g., standard definition or
   high definition) or encodings (e.g., MPEG2, MPEG4, ...). In addition,
   the device can be located behind different types of access networks,
   which implies bandwidth constraints. The selection of media encoding 
   can be adjusted to accommodate these multiple characteristics.</t>

</section>


<section title="Voice/Video Mailbox">

<t>The control of the server may be extended to a voice/video mailbox 
system. Access to a mailbox system is possible using controls for 
message playback, fast forward, and rewind similar to those of content 
on demand, together with a couple of additional commands to delete or 
save messages. The mailbox may be located in 
the home or in the network accessed from the home. If the mailbox 
is at home, then it is possible to access the messages remotely.</t>

</section>


<section title="Motion Detection">

<t>Motion-detecting or pattern-matching cameras may need to call
a human when motion or a pattern is detected, and may also have a buffer,
which allows the human to issue pause/rewind commands.</t>

</section>


<section title="Video Subscriptions">
 
<t>A user subscribes on a webpage to get all new local high-school 
football games, or Voipsa blue-box podcasts.  He wants his DVR/TV 
to pop up with the option to play them immediately, or record them 
to DVR, or neither.  The high-school football game is not available 
from cable TV, so the DVR doesn't have a schedule to know when 
to get it by itself.  Of course this could be done with an 
off-line indication, such as email, but it would be nice if 
his DVR didn't need to support email.</t>

</section>

</section>		
	 


	
 

	<section anchor='requirements' title="Requirements">

		<t>This section outlines the key requirements that need to be 
		satisfied in order to have a media control protocol acting as a control stream within a multimedia session.</t>
		
		<t><list style="format REQ-%d">

		<t>The media control protocol must support commands such as play,
          pause, rewind, forward, fast rewind, fast forward, slow
          rewind, and slow forward.<vspace blankLines="1"/></t>
				
		<t>It must be possible to negotiate the media control protocol of a media stream.<vspace blankLines="1"/></t>
		
		<t>If the media control protocol does not apply to all media
          streams of a given session, it must be possible to indicate
          the specific media streams that are under the scope of the
          media control protocol.<vspace blankLines="1"/></t>
				
		<t>The media control protocol must allow for asynchronous media
          event notifications (e.g., end-of-stream).<vspace blankLines="1"/></t>
		
		<t>The protocol SHOULD work over TCP.<vspace blankLines="1"/></t>
	     
	  <t>The media stream, or media control server, to be controlled by the client may be located in the network or in the home network.<vspace blankLines="1"/></t>
	  
    <t>The media control protocol shall consider additional commands not available in RTSP to control the media in the server. Examples of such commands are deletion or saving of voice messages.<vspace blankLines="1"/></t>
		
		</list></t>
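
		<t>As an illustration only, the following Python sketch models REQ-1, REQ-3, and REQ-4: a command set, a per-stream flag marking which streams are in scope of media control, and a callback for asynchronous events. All class, field, and function names are hypothetical and are not defined by this document or by any existing protocol.</t>

		<t>
		<figure>
		<artwork><![CDATA[
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class Command(Enum):
    """The commands listed in REQ-1 (names are illustrative)."""
    PLAY = "play"
    PAUSE = "pause"
    REWIND = "rewind"
    FORWARD = "forward"
    FAST_REWIND = "fast-rewind"
    FAST_FORWARD = "fast-forward"
    SLOW_REWIND = "slow-rewind"
    SLOW_FORWARD = "slow-forward"


@dataclass
class MediaStream:
    label: str
    controlled: bool  # REQ-3: is this stream in scope of media control?
    last_command: Optional[Command] = None


@dataclass
class MediaControlSession:
    streams: Dict[str, MediaStream]
    listeners: List[Callable[[str, str], None]] = field(default_factory=list)

    def send(self, command: Command) -> List[str]:
        """Apply a command to controlled streams; return their labels."""
        affected = []
        for stream in self.streams.values():
            if stream.controlled:
                stream.last_command = command
                affected.append(stream.label)
        return affected

    def notify(self, label: str, event: str) -> None:
        """REQ-4: deliver an asynchronous event such as end-of-stream."""
        for listener in self.listeners:
            listener(label, event)


def demo():
    """Pause a two-stream session and report an end-of-stream event."""
    events = []
    session = MediaControlSession(
        streams={
            "video": MediaStream("video", controlled=True),
            "subtitles": MediaStream("subtitles", controlled=False),
        }
    )
    session.listeners.append(lambda label, event: events.append((label, event)))
    affected = session.send(Command.PAUSE)
    session.notify("video", "end-of-stream")
    return affected, events
```
]]></artwork>
		</figure>
		</t>

		<t>In this sketch, only the stream marked as controlled receives the pause command, while the event callback is invoked independently of any client request.</t>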
		
	</section>
 

	
 <section title="Considerations for session control protocol">
    
    
<t>This section raises a number of areas that need consideration while developing new standards for the use cases:</t>

<t>1. RTSP assumes that the media server is located in the network and not in the home. It shall be possible for the session protocol to establish a relationship that allows control of media resources in both the home and the network.</t>

<t>2. The session protocol shall be able to handle NAT without affecting the call flows being defined.</t>

<t>3. If SIP is assumed as the session protocol, there is a well-accepted architecture called IMS (IP Multimedia Subsystem), which is accepted by a number of standards organizations such as 3GPP, ETSI TISPAN, and ATIS. A number of services have been defined using this architecture, such as telephony, push to talk, presence, messaging, chat, and IPTV.</t>

<t>4. In conjunction with IMS, there is a resource and admission control architecture called RACS, defined in ETSI TISPAN, which helps ensure QoS for services defined over IMS and addresses reuse of network resources. Of special interest is that bandwidth reservation is addressed for unicast and multicast media streams, as is the handling of multicast addresses used for IPTV.</t>

<t>5. IMS also provides additional authentication mechanisms that offer alternatives to HTTP Digest, such as the Authentication and Key Agreement (AKA) mechanism.</t>

<t>6. The session protocol shall allow server-initiated control of streaming sessions, such as server-initiated session termination. RTSP TEARDOWN requests are sent from client to server only.</t>

</section>

 <section title="Call Flows of Use Cases">

<section title="Content on Demand Session Initiation">

<t>For content on demand, the session initiation establishes two media streams, one for the media control channel and one for the media delivery channel, between the Media Control Client and the Media Control Server. All information required to establish the two channels is conveyed in the session initiation.</t>

<t>The proxy acts as a back-to-back user agent for the session control. The proxy may open pin-holes for the media control channel and protocol. The proxy is used to protect the core network in which the Media Control Server is located.</t>

<t>
<figure>
<artwork><![CDATA[
Media Control Client            Proxy             Media Control Server
      |                           |                           |
1.    |--- session initiation --->|                           |
2.    |                           |--- session initiation --->|
3.    |                           |<-- confirm initiation ----|
4.    |<-- confirm initiation ----|                           |
5.    |<-- Media control channel established ---------------->|
      |                                                       |
6.    |=== request play media ===============================>|
7.    |<== confirm play media ================================|
      |                                                       |
8.    |<-- Media delivery channel established --------------->|
  ]]></artwork>
    </figure>
</t>

<t>The Media Control Client does not strictly need to be located in a home, nor the Media Control Server in the network. The roles may be reversed. The Client may be located in the network as a remote user attempting to access content in the home; the Server is then located in the home.</t>

<t>Use cases other than content on demand that follow these sequences may also be supported, for example voice/video mailbox.</t>

</section>
    
    <section title="Content on Demand Session Termination">

<t>When content on demand is terminated, either the Media Control Client or the Media Control Server may initiate the session termination. Any pin-holes that were previously established are closed.</t>

<t>
<figure>
<artwork><![CDATA[
Media Control Client            Proxy             Media Control Server
      |                           |                           |
1.    |--- session termination -->|                           |
2.    |                           |--- session termination -->|
3.    |                           |<-- confirm termination ---|
4.    |<-- confirm termination ---|                           |
  ]]></artwork>
    </figure>
</t>

<t>Alternatively the server may terminate the session.</t>

<t>
<figure>
<artwork><![CDATA[
Media Control Client            Proxy             Media Control Server
      |                           |                           |
1.    |                           |<--- session termination --|
2.    |<--- session termination --|                           |
3.    |--- confirm termination--->|                           |
4.    |                           |--- confirm termination -->|
  ]]></artwork>
    </figure>
</t>
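
<t>The initiation and termination sequences above can be sketched as follows. This is a minimal illustration, assuming a proxy that relays each signaling message hop by hop and tracks one pin-hole per session. The Proxy class, its methods, and the session identifier are hypothetical and are not part of any protocol referenced in this document.</t>

<t>
<figure>
<artwork><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Proxy:
    """Hypothetical back-to-back user agent relaying session signaling."""
    pinholes: Set[str] = field(default_factory=set)
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def relay(self, sender: str, receiver: str, message: str) -> None:
        # Record both hops so the sequence can be compared with the figures.
        self.log.append((sender, "proxy", message))
        self.log.append(("proxy", receiver, message))

    def initiate(self, session_id: str) -> None:
        """Client-initiated session setup through the proxy."""
        self.relay("client", "server", "session initiation")
        self.relay("server", "client", "confirm initiation")
        # Pin-hole opened for the media control channel.
        self.pinholes.add(session_id)

    def terminate(self, session_id: str, initiator: str) -> None:
        """Termination started by either 'client' or 'server'."""
        peer = "server" if initiator == "client" else "client"
        self.relay(initiator, peer, "session termination")
        self.relay(peer, initiator, "confirm termination")
        # Pin-hole closed once termination is confirmed.
        self.pinholes.discard(session_id)
```
]]></artwork>
</figure>
</t>

<t>In this sketch the same termination routine serves both figures in this section; only the initiator argument differs.</t>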


</section>

    
    
    

<section title="Activation of Trick Play of Linear TV (Session Modification)">

<t>The first 5 steps set up linear TV and a multicast media delivery channel. If the proxy is aware of the bandwidth limitations for the client, it may reserve the required bandwidth for the session. Note that parallel sessions may be established for the same client to convey multiple linear TV channels at the same time.</t>

<t>When the user pauses live TV, the same session is used to modify the media requirements from a multicast media channel to a media control channel and a unicast media channel. If the session modification is successful, live TV is paused. If it is unsuccessful, linear TV is maintained without affecting the user's viewing experience. Because the action is performed within the same session, network resources are not released when transitioning between multicast and unicast; otherwise, there is a risk that the resources would no longer be available when trying to re-establish the previous session.</t>


<t>
<figure>
<artwork><![CDATA[
Media Control Client            Proxy             Media Control Server
      |                           |                           |
1.    |--- session initiation --->|                           |
2.    |                           |--- session initiation --->|
3.    |                           |<-- confirm initiation ----|
4.    |<-- confirm initiation ----|                           |
5.    |<-- Multicast media delivery channel established ----->|
      |                                                       |
6.  User pauses live TV
7.    |--- session modification ->|                           |
8.    |                           |--- session modification ->|
9.    |                           |<-- confirm modification --|
10.   |<-- confirm modification --|                           |
11.   |<-- Media control channel established ---------------->|
12.   |=== request pause media ==============================>|
13.   |<== confirm pause media ===============================|
14.   |<-- Unicast media delivery channel established ------->|
  ]]></artwork>
    </figure>
</t>
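
<t>The fallback behavior described above can be sketched as follows. This is an illustration under stated assumptions rather than a defined procedure: the LinearTvSession class and the modify callable, which stands in for the session modification exchange through the proxy, are hypothetical names introduced for this example.</t>

<t>
<figure>
<artwork><![CDATA[
```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Delivery(Enum):
    MULTICAST = "multicast"
    UNICAST = "unicast"


@dataclass
class LinearTvSession:
    """State of one linear TV session after steps 1 to 5 of the figure."""
    delivery: Delivery = Delivery.MULTICAST
    control_channel: bool = False
    paused: bool = False

    def pause_live_tv(self, modify: Callable[[], bool]) -> bool:
        """Try to modify the session, then pause (steps 7 to 14).

        `modify` is a hypothetical callable that performs the session
        modification exchange and returns True when it is confirmed.
        """
        if not modify():
            # Modification failed: the multicast channel stays untouched,
            # so the viewing experience is not affected.
            return False
        self.control_channel = True       # step 11
        self.paused = True                # steps 12 and 13
        self.delivery = Delivery.UNICAST  # step 14
        return True
```
]]></artwork>
</figure>
</t>

<t>Because the session object is only updated after the modification is confirmed, a failed attempt leaves the multicast delivery in place.</t>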

</section>

    
    
    
<section title="Deactivation of Trick Play of Linear TV (Session Modification)">

<t>When the user switches TV channels or catches up with live TV, session modification is performed and the media requirements are changed from a media control channel and a unicast media channel to a multicast media channel. Note that if the user chooses to end viewing, session termination is performed directly with modification.</t>

<t>
<figure>
<artwork><![CDATA[

Media Control Client           Proxy              Media Control Server
      |                           |                            |
1.    |<-- Unicast media delivery channel ongoing ------------>|
      |                                                        |
2.    |--- session modification ->|                            |
3.    |                           |--- session modification -->|
4.    |                           |<-- confirm modification ---|
5.    |<-- confirm modification --|                            |
      |                                                        |
6.    |<-- Multicast media delivery channel established ------>|
  ]]></artwork>
    </figure>
</t>

</section>

    
    
  </section>  
    
	<section title="Security Considerations">
  
		<t>T.B.D.</t>

	</section>


	<section title="IANA Considerations">

		<t>This document has no actions for IANA.</t>

	</section> 


	<section title="Acknowledgements">

		<t>Thanks to Christer Holmberg from Ericsson, Priya Rajagopal from Motorola, 
		Mikhael Said from France Telecom, Dan Wing from Cisco, and Hadriel Kaplan from Acme Packet.</t>

	</section> 


  
</middle>

<back>

<references title="Normative references">

	&RFC2119;

</references>

<references title="Informative references">

  &RFC2326;

	<reference anchor='RFC2326bis'>
		<front>
			<title>Real Time Streaming Protocol 2.0 (RTSP)</title>            
			<author initials='H.' surname='Schulzrinne' fullname='H. Schulzrinne'><organization/></author>
			<author initials='A.' surname='Rao' fullname='A. Rao'><organization/></author>
			<author initials='R.' surname='Lanphier' fullname='R. Lanphier'><organization/></author>
			<author initials='M.' surname='Westerlund' fullname='M. Westerlund'><organization/></author>
			<author initials='M.' surname='Stiemerling' fullname='M. Stiemerling'><organization/></author>
			<date month='November' year='2007' />
		</front>
		<seriesInfo name='Internet-Draft' value='draft-ietf-mmusic-rfc2326bis-18' />
	</reference>
	
	&RFC3261;
  
  <!--&RFC3264;-->
      
  <!--&RFC4463;-->

  <!--&RFC4582-->
  
</references>

</back>

</rfc>


 

