<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<rfc ipr="full3978" docName="draft-ietf-avt-rfc3047-bis-08.txt" category="std" obsoletes="3047 (If approved)">
	<front>
		<title abbrev="G7221">RTP Payload Format for ITU-T Recommendation G.722.1</title>
		<author initials="P" surname="Luthi" fullname="Patrick Luthi">
			<organization>Tandberg</organization>
			<address>
				<postal>
					<street>Philip Pedersens vei 22</street>
					
					<city>1366 Lysaker</city>
					<country>Norway</country>
				</postal>
				<email>patrick.luthi@tandberg.no</email>
			</address>
		</author>
		<author initials="R" surname="Even" fullname="Roni Even">
			<organization>REL</organization>
			<address>
				<postal>
					<street>14 David Hamelech</street>
					<city>Tel Aviv</city>
					<country>Israel</country>
					<code>64953</code>
				</postal>
				<email>ron.even.tlv@gmail.com</email>
			</address>
		</author>
		<date month="August" year="2008"/>
		<workgroup>AVT</workgroup>
		<abstract>
			<t>International Telecommunication Union (ITU-T) Recommendation G.722.1 is a wide-band audio codec. This document describes the payload format for including G.722.1-generated bit streams within an RTP packet. The document also describes the syntax and semantics of the SDP parameters needed to support the G.722.1 audio codec.</t>
		</abstract>
	</front>
	<middle>
		<section title="Introduction">
			<t>ITU-T G.722.1 <xref target="ITU.G7221"/> is a low-complexity coder that compresses 50 Hz to 14 kHz audio signals into one of the following bit rates: 24 kbit/s, 32 kbit/s, or 48 kbit/s.</t>
			<t>The coder may be used for speech, music and other types of audio.</t>
			<t> Some of the applications for which this coder is suitable are:</t>
			<t>
				<list style="symbols">
					<t>Real-time communications such as videoconferencing and telephony.</t>
					<t>Streaming audio.</t>
					<t>Archival and messaging.</t>
				</list>
			</t>
			<t>ITU-T G.722.1 <xref target="ITU.G7221"/> uses 20 ms frames and a sampling rate clock of 16 kHz or 32 kHz. The encoding and decoding algorithm can change the bit rate at any 20 ms frame boundary, but no bit rate change notification is provided in-band with the bit stream.</t>
			<t>For any given bit rate, the number of bits in a frame is constant. Within this fixed frame, G.722.1 uses variable-length coding (e.g., Huffman coding) to represent most of the encoded parameters. All variable-length codes are transmitted in order from the leftmost (most significant, MSB) bit to the rightmost (least significant, LSB) bit; see <xref target="ITU.G7221"/> for more details.</t>
			<t>The ITU-T standardized bit rates for G.722.1 are 24 kbit/s or 32 kbit/s at a 16 kHz sample rate, and 24 kbit/s, 32 kbit/s, or 48 kbit/s at a 32 kHz sample rate. However, the coding algorithm itself is capable of running at any user-specified bit rate (not just 24, 32, and 48 kbit/s) while maintaining an audio bandwidth of 50 Hz to 14 kHz. This rate change is accomplished by a linear scaling of the codec operation, resulting in frames whose size in bits is 1/50 of the corresponding bit rate in bit/s.</t>
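			<t>As an informal, non-normative illustration, the relationship between bit rate and frame size can be sketched in Python. The name frame_bits is hypothetical and not part of any specification; the sketch relies only on the 20 ms frame duration and assumes the bit rate is given in bit/s.</t>
			<figure>
				<artwork><![CDATA[
```python
FRAMES_PER_SECOND = 50  # one frame every 20 ms


def frame_bits(bit_rate):
    """Return the number of bits in one 20 ms frame."""
    if bit_rate % FRAMES_PER_SECOND != 0:
        # Assumption of this sketch: a frame holds whole bits only.
        raise ValueError("bit rate must be divisible by 50")
    return bit_rate // FRAMES_PER_SECOND


for rate in (24000, 32000, 48000):
    print(rate, frame_bits(rate))
```
]]></artwork>
			</figure>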
		</section>
		<section title="Terminology">
			<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
            and "OPTIONAL" in this document are to be interpreted as described in RFC2119 <xref target="RFC2119"/> and indicate requirement levels for compliant RTP implementations.</t>
		</section>
		<section title="RTP usage for G.722.1">
			<section title="RTP G.722.1 Header Fields">
				<t>The RTP header is defined in the RTP specification <xref target="RFC3550"/>. This section defines how fields in the RTP header are used.</t>
				<t>
					<list hangIndent="1">
						<t>Payload Type (PT): The assignment of an RTP payload type for this packet format is outside the scope of this document; it is specified by the RTP profile under which this payload format is used, or signaled dynamically out-of-band (e.g., using SDP).</t>
						<t>Marker (M) bit: The M bit is set to zero. </t>
						<t>Extension (X) bit: Defined by the RTP profile used.</t>
						<t>Timestamp: A 32-bit word that corresponds to the sampling instant for the first frame in the RTP packet. The sampling frequency can be 16 kHz or 32 kHz. The RTP timestamp clock frequency of 32 kHz SHOULD be used unless only an RTP stream sampled at 16 kHz is going to be sent.</t>
					</list>
				</t>
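				<t>The following non-normative Python sketch illustrates how the timestamp might advance between packets. It assumes continuous transmission with no gaps and a timestamp clock equal to the sampling rate; the function names are hypothetical.</t>
				<figure>
					<artwork><![CDATA[
```python
FRAME_MS = 20  # G.722.1 frame duration in milliseconds


def samples_per_frame(clock_rate):
    """Timestamp ticks covered by one 20 ms frame."""
    return clock_rate * FRAME_MS // 1000


def next_timestamp(timestamp, clock_rate, frames_in_packet):
    """Timestamp of the packet that follows, modulo 2**32.

    Assumes the next packet starts right after the last frame
    of the current one (no silence gap).
    """
    step = samples_per_frame(clock_rate) * frames_in_packet
    return (timestamp + step) % 2**32
```
]]></artwork>
				</figure>
				<t>Under these assumptions, a 20 ms frame spans 320 timestamp ticks at 16 kHz and 640 ticks at 32 kHz.</t>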
			</section>
			<section title="RTP payload format for G.722.1">
				<t> The RTP payload for G.722.1 has the format shown in Figure 1. No additional header fields specific to this payload format are required.</t>
				<figure>
					<artwork><![CDATA[
					   
   0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      RTP Header                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               |
   +                 one or more frames of G.722.1                 |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   
                  Figure 1: RTP payload for G.722.1					]]></artwork>
				</figure>
				<t>Because the bit rate is not signaled in-band, a separate out-of-band method is REQUIRED to indicate the bit rate (see Section 5 for an example of signaling bit rate information using SDP). For the payload format specified here, the bit rate MUST remain constant for a particular payload type value. An application MAY switch bit rates and clock rates from packet to packet by defining different payload type values and switching between them.</t>
				<t>The use of Huffman coding means that it is not possible to identify the various encoded parameters/fields contained within the bit stream without first completely decoding the entire frame. For the purposes of packetizing the bit stream in RTP, it is only necessary to consider the sequence of bits as output by the G.722.1 encoder, and present the same sequence to the decoder.  The payload format described here maintains this sequence.</t>
				<t>When operating at 24 kbit/s, 480 bits (60 octets) are produced per frame. When operating at 32 kbit/s, 640 bits (80 octets) are produced per frame. When operating at 48 kbit/s, 960 bits (120 octets) are produced per frame.  Thus, all three bit rates allow for octet alignment without the need for padding bits.</t>
				<t>Figure 2 illustrates how the G.722.1 bit stream MUST be mapped into an octet-aligned RTP payload.</t>
				<figure>
					<artwork><![CDATA[
					   
      first bit                                          last bit
      transmitted                                     transmitted
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                         |
      + sequence of bits (480, 640 or 960) generated by the     |
      |            G.722.1 encoder for transmission             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


      |           |           |                     |           |
      |           |           |     ...             |           |
      |           |           |                     |           |


      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+-+-+-+-+
      |MSB...  LSB|MSB...  LSB|                     |MSB...  LSB|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+-+-+-+-+
        RTP         RTP                               RTP
        octet 1     octet 2                           octet
                                                   60, 80, 120

        Figure 2:  The G.722.1 encoder bit stream is split into
                   a sequence of octets (60, 80 or 120 depending on
                   the bit rate), and each octet is in turn
                   mapped into an RTP octet.					]]></artwork>
				</figure>
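				<t>The mapping in Figure 2 can be sketched in Python as follows. This is a non-normative illustration: the name pack_frame is hypothetical, and the encoder output is assumed to be available as a sequence of 0/1 values in transmission order.</t>
				<figure>
					<artwork><![CDATA[
```python
def pack_frame(bits):
    """Pack a sequence of 0/1 values into octets, MSB first."""
    if len(bits) % 8 != 0:
        raise ValueError("frame is not octet aligned")
    payload = bytearray()
    for i in range(0, len(bits), 8):
        octet = 0
        # The first transmitted bit becomes the MSB of the octet.
        for bit in bits[i:i + 8]:
            octet = (octet << 1) | bit
        payload.append(octet)
    return bytes(payload)
```
]]></artwork>
				</figure>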
				<t>When operating at non-standard rates, the payload format MUST follow the guidelines illustrated in Figure 2. It is RECOMMENDED that values in the range 16000 to 48000 bit/s be used. Non-standard rates MUST be a multiple of 400 bit/s; because each frame covers 1/50 of a second, this makes every frame a whole number of octets and avoids the need for (undefined) padding bits. For example, a bit rate of 16.4 kbit/s results in a frame of 328 bits, or 41 octets, which are mapped into RTP per Figure 2.</t>
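				<t>A minimal, non-normative Python sketch of the multiple-of-400 rule follows. The name octets_per_frame is hypothetical; the RECOMMENDED range of 16000 to 48000 is deliberately not enforced, since it is a recommendation rather than a requirement.</t>
				<figure>
					<artwork><![CDATA[
```python
def octets_per_frame(bit_rate):
    """Octets in one 20 ms frame at the given bit rate (bit/s)."""
    if bit_rate % 400 != 0:
        raise ValueError("bit rate must be a multiple of 400")
    # bit_rate / 50 bits per frame, 8 bits per octet.
    return bit_rate // 400


for rate in (24000, 32000, 48000, 16400):
    print(rate, octets_per_frame(rate))
```
]]></artwork>
				</figure>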
			</section>
			<section title="Multiple G.722.1 frames in an RTP packet">
				<t>A sender may include more than one consecutive G.722.1 frame in a single RTP packet.</t>
				<t>Senders have the following additional restrictions:</t>
				<t>
					<list style="symbols">
						<t>A sender SHOULD NOT include more G.722.1 frames in a single RTP packet than will fit in the MTU of the RTP transport protocol.</t>
						<t>All frames contained in a single RTP packet MUST be of the same length and sampled at the same clock rate. They MUST have the same bit rate (octets per frame).</t>
						<t>Frames MUST NOT be split between RTP packets.</t>
					</list>
				</t>
				<t>It is RECOMMENDED that the number of frames contained within an RTP packet be consistent with the application. For example, in a telephony application where delay is important, the fewer frames per packet, the lower the delay, whereas for a delay-insensitive streaming or messaging application, many frames per packet would be acceptable.</t>
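				<t>The trade-off between packet size and delay can be sketched in Python as follows. This is a non-normative illustration with hypothetical function names. The default overhead of 40 octets is an assumption (IPv4 header of 20 octets, UDP header of 8 octets, and a 12-octet RTP header with no CSRC list or extension) and has to be adjusted for other transports.</t>
				<figure>
					<artwork><![CDATA[
```python
def max_frames(mtu, frame_octets, overhead=40):
    """Whole frames that fit in one packet of at most mtu octets.

    overhead is an assumed header size: IPv4 (20) + UDP (8) +
    RTP fixed header (12).
    """
    return max((mtu - overhead) // frame_octets, 0)


def packetization_ms(frames):
    """Audio duration carried by a packet, in milliseconds."""
    return frames * 20
```
]]></artwork>
				</figure>
				<t>For instance, with an assumed MTU of 1500 octets and 60-octet frames (24 kbit/s), max_frames(1500, 60) yields 24 frames, which corresponds to 480 ms of audio per packet.</t>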
			</section>
			<section title="Computing the number of G.722.1 frames">
				<t>Information describing the number of frames contained in an RTP packet is not transmitted as part of the RTP payload. The only way to determine the number of G.722.1 frames is to count the total number of octets in the RTP payload and divide that count by the number of expected octets per frame. This expected octets-per-frame count is derived from the bit rate, and is therefore a function of the payload type.</t>
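				<t>The computation described above can be sketched in Python as follows. This is a non-normative illustration: the function name is hypothetical, payload_octets is assumed to be the payload length with the RTP header and any RTP padding already removed, and bit_rate is assumed to have been looked up from the payload type.</t>
				<figure>
					<artwork><![CDATA[
```python
def frames_in_payload(payload_octets, bit_rate):
    """Number of G.722.1 frames in an RTP payload."""
    frame_octets = bit_rate // 400  # 20 ms frame, 8 bits/octet
    count, leftover = divmod(payload_octets, frame_octets)
    if count == 0 or leftover != 0:
        raise ValueError("payload is not a whole number of frames")
    return count
```
]]></artwork>
				</figure>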
			</section>
		</section>
		<section title="IANA Considerations">
			<t>This document updates the G7221 media type described in RFC3047.</t>
			<section title="Media Type Registration">
				<t>This section describes the media types and names associated with this payload format. The section registers the media types, as per RFC 4288 <xref target="RFC4288"/>.</t>
				<section title="Registration of media type audio/G7221">
					<t>Media type name: audio </t>
					<t> Media subtype name: G7221 </t>
					<t> Required parameters: </t>
					<t>
						<list hangIndent="1">
							<t>bitrate: the data rate for the audio bit stream. This parameter is mandatory because the bit rate is not signaled within the G.722.1 bit stream. At the standard G.722.1 bit rates, the value MUST be either 24000 or 32000 at a 16 kHz sample rate, and 24000, 32000, or 48000 at a 32 kHz sample rate. If non-standard bit rates are used, it is RECOMMENDED that values in the range 16000 to 48000 be used. Non-standard rates MUST be a multiple of 400; this maintains octet alignment, so no (undefined) padding bits are needed in any frame.</t>
						</list>
					</t>
					<t>Optional parameters:  </t>
					<t>
						<list hangIndent="1">
							<t>rate: RTP timestamp clock rate, which is equal to the sampling rate. If the parameter is not specified, a clock rate of 16 kHz is assumed.</t>
							<t>ptime: see RFC 4566. SHOULD be a multiple of 20 msec.</t>
							<t>maxptime: see RFC 4566. SHOULD be a multiple of 20 msec.</t>
						</list>
					</t>
					<t> Encoding considerations: </t>
					<t>
						<list hangIndent="1">
							<t>This media type is framed and binary, see section 4.8 in <xref target="RFC4288"/>.
							</t>
						</list>
					</t>
					<t>Security considerations: See Section 6</t>
					<t>Interoperability considerations: </t>
					<t>
						<list hangIndent="1">
							<t>Terminals SHOULD offer a media type at a 16 kHz sample rate in order to interoperate with terminals that do not support the new 32 kHz sample rate.</t>
						</list>
					</t>
					<t> Published specification: RFC yyy [see RFCeditor notes].  </t>
					<t> Applications which use this media type: </t>
					<t>
						<list hangIndent="1">
							<t>Audio and video streaming and conferencing applications.</t>
						</list>
					</t>
					<t>Additional information: none </t>
					<t>Person and email address to contact for further information:</t>
					<t>
						<list hangIndent="1">
							<t> Roni Even: ron.even.tlv@gmail.com</t>
						</list>
					</t>
					<t>Intended usage: COMMON </t>
					<t>Restrictions on usage: </t>
					<t>
						<list hangIndent="1">
							<t>This media type depends on RTP framing, and hence is only defined for transfer via RTP <xref target="RFC3550"/>. Transport within other framing protocols is not defined at this time.</t>
						</list>
					</t>
					<t>Author: Roni Even </t>
					<t>Change controller:</t>
					<t>
						<list hangIndent="1">
							<t>IETF Audio/Video Transport working group delegated from the IESG.</t>
						</list>
					</t>
				</section>
			</section>
		</section>
		<section title="SDP Parameters ">
			<t>The media type audio/G7221 is mapped to fields in the Session Description Protocol (SDP) <xref target="RFC4566"/> as follows:</t>
			<t>
				<list style="symbols">
					<t>The media name in the "m=" line of SDP MUST be audio. </t>
					<t>The encoding name in the "a=rtpmap" line of SDP MUST be G7221 (the media subtype). </t>
					<t>The parameter "rate" goes in the "a=rtpmap" line as the clock rate parameter.</t>
					<t>Only one bitrate SHALL be defined (using the "bitrate=" parameter in the a=fmtp line) for each payload type. 
					</t>
				</list>
			</t>
			<section title="Usage with the SDP Offer Answer Model ">
				<t>
				 When offering G.722.1 over RTP using SDP in an Offer/Answer  model <xref target="RFC3264"/>  the following considerations are necessary.</t>
				<t>The combination of the clock rate in the rtpmap and the bitrate parameter defines the configuration of each payload type. Each configuration intended to be used MUST be declared.</t>
				<t>There are two sampling clock rates defined for G.722.1 in this document. RFC 3047 <xref target="RFC3047"/> supported only the 16 kHz clock rate. Therefore, a system that wants to use G.722.1 SHOULD offer a payload type with a clock rate of 16000 for backward interoperability.</t>
				<t>An example of an offer that includes both the 16000 and the 32000 clock rates is:</t>
				<t>
					<list hangIndent="10">
						<t>m=audio 49000 RTP/AVP 121 122<vspace blankLines="0"/> 
						a=rtpmap:121 G7221/16000<vspace blankLines="0"/> 
						a=fmtp:121 bitrate=24000<vspace blankLines="0"/> 
						a=rtpmap:122 G7221/32000<vspace blankLines="0"/> 
						a=fmtp:122 bitrate=48000</t>
					</list>
				</t>
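				<t>As a non-normative illustration, the media lines of such an offer could be generated with the following Python sketch. The function name build_offer and its argument layout are hypothetical; the port and payload type numbers are taken from the example above.</t>
				<figure>
					<artwork><![CDATA[
```python
def build_offer(port, configs):
    """Build SDP media lines for (payload type, clock, bit rate)."""
    pts = " ".join(str(pt) for pt, _, _ in configs)
    lines = ["m=audio %d RTP/AVP %s" % (port, pts)]
    for pt, clock, bit_rate in configs:
        # One rtpmap/fmtp pair per payload type, one bitrate each.
        lines.append("a=rtpmap:%d G7221/%d" % (pt, clock))
        lines.append("a=fmtp:%d bitrate=%d" % (pt, bit_rate))
    return lines


offer = build_offer(49000, [(121, 16000, 24000),
                            (122, 32000, 48000)])
print("\r\n".join(offer))
```
]]></artwork>
				</figure>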
			</section>
		</section>
		<section title="Security Considerations">
			<t>RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification <xref target="RFC3550"/>, and any appropriate RTP profile. This implies that confidentiality of the media streams is achieved by encryption.</t>
			<t>A potential denial-of-service threat exists for data encoding using compression techniques that have non-uniform receiver-end computational load.  The attacker can inject pathological datagrams into the stream which are complex to decode and cause the receiver to be overloaded.  However, this encoding does not exhibit any significant non-uniformity.</t>
		</section>
		<section title="Changes from RFC 3047">
			<t>This document obsoletes RFC 3047, adding support for the superwideband (14 kHz) audio defined in Annex C of the new revision of ITU-T G.722.1.</t>
			<t>Other changes: the text was updated to be in line with the current rules for RFCs and with media type registration conforming to RFC 4288.</t>
		</section>
		<section title="Acknowledgements">
			<t>The authors would like to thank Tom Taylor for his contribution to this work.</t>
		</section>
		<section title="RFC editor note">
			<t>Please replace RFC yyy giving the number assigned to this RFC.</t>
		</section>
	</middle>
	<back>
		<references title="Normative References">
			<reference anchor="ITU.G7221">
				<front>
					<title>Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss</title>
					<author>
						<organization>International Telecommunications Union</organization>
					</author>
					<date month="" year="2005"/>
				</front>
				<seriesInfo name="ITU-T" value="Recommendation G.722.1"/>
			</reference>
			<reference anchor="RFC2119">
				<front>
					<title abbrev="RFC Key Words">Key words for use in RFCs to Indicate Requirement Levels</title>
					<author initials="S." surname="Bradner" fullname="Scott Bradner">
						<organization>Harvard University</organization>
						<address>
							<postal>
								<street>1350 Mass. Ave.</street>
								<street>Cambridge</street>
								<street>MA 02138</street>
							</postal>
							<phone>+1 617 495 3864</phone>
							<email>sob@harvard.edu</email>
						</address>
					</author>
					<date month="March" year="1997"/>
					<area>General</area>
					<keyword>keyword</keyword>
					<abstract>
						<t>
   In many standards track documents several words are used to signify
   the requirements in the specification.  These words are often
   capitalized.  This document defines these words as they should be
   interpreted in IETF documents.  Authors who follow these guidelines
   should incorporate this phrase near the beginning of their document:

<list>
								<t>
      The key words &quot;MUST&quot;, &quot;MUST NOT&quot;, &quot;REQUIRED&quot;, &quot;SHALL&quot;, &quot;SHALL
      NOT&quot;, &quot;SHOULD&quot;, &quot;SHOULD NOT&quot;, &quot;RECOMMENDED&quot;,  &quot;MAY&quot;, and
      &quot;OPTIONAL&quot; in this document are to be interpreted as described in
      RFC 2119.
</t>
							</list>
						</t>
						<t>
   Note that the force of these words is modified by the requirement
   level of the document in which they are used.
</t>
					</abstract>
				</front>
				<seriesInfo name="BCP" value="14"/>
				<seriesInfo name="RFC" value="2119"/>
				<format type="HTML" octets="14486" target="http://xml.resource.org/public/rfc/html/rfc2119.html"/>
				<format type="XML" octets="5661" target="http://xml.resource.org/public/rfc/xml/rfc2119.xml"/>
			</reference>
			<reference anchor="RFC3550">
				<front>
					<title>RTP: A Transport Protocol for Real-Time Applications</title>
					<author initials="H." surname="Schulzrinne" fullname="H. Schulzrinne">
						<organization/>
					</author>
					<author initials="S." surname="Casner" fullname="S. Casner">
						<organization/>
					</author>
					<author initials="R." surname="Frederick" fullname="R. Frederick">
						<organization/>
					</author>
					<author initials="V." surname="Jacobson" fullname="V. Jacobson">
						<organization/>
					</author>
					<date month="July" year="2003"/>
				</front>
				<seriesInfo name="STD" value="64"/>
				<seriesInfo name="RFC" value="3550"/>
			</reference>
			<reference anchor="RFC3264">
				<front>
					<title>An Offer/Answer Model with Session Description Protocol (SDP)</title>
					<author initials="J." surname="Rosenberg" fullname="J. Rosenberg">
						<organization/>
					</author>
					<author initials="H." surname="Schulzrinne" fullname="H. Schulzrinne">
						<organization/>
					</author>
					<date month="June" year="2002"/>
				</front>
				<seriesInfo name="RFC" value="3264"/>
				<format type="TXT" octets="60854" target="ftp://ftp.isi.edu/in-notes/rfc3264.txt"/>
			</reference>
			<reference anchor="RFC4566">
				<front>
					<title>SDP: Session Description Protocol</title>
					<author initials="M." surname="Handley" fullname="M. Handley">
						<organization/>
					</author>
					<author initials="V." surname="Jacobson" fullname="V. Jacobson">
						<organization/>
					</author>
					<author initials="C." surname="Perkins" fullname="C. Perkins">
						<organization/>
					</author>
					<date year="2006" month="July"/>
					<abstract>
						<t>This memo defines the Session Description Protocol (SDP). SDP is intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. [STANDARDS TRACK]</t>
					</abstract>
				</front>
				<seriesInfo name="RFC" value="4566"/>
				<format type="TXT" octets="108820" target="ftp://ftp.isi.edu/in-notes/rfc4566.txt"/>
			</reference>
		</references>
		<references title="Informative References">
			<reference anchor="RFC4288">
				<front>
					<title>Media Type Specifications and Registration Procedures</title>
					<author initials="N." surname="Freed" fullname="N. Freed">
						<organization/>
					</author>
					<author initials="J." surname="Klensin" fullname="J. Klensin">
						<organization/>
					</author>
					<date year="2005" month="December"/>
				</front>
				<seriesInfo name="BCP" value="13"/>
				<seriesInfo name="RFC" value="4288"/>
				<format type="TXT" octets="52667" target="ftp://ftp.isi.edu/in-notes/rfc4288.txt"/>
			</reference>
			<reference anchor="RFC3047">
				<front>
					<title>RTP Payload Format for ITU-T Recommendation G.722.1</title>
					<author initials="P." surname="Luthi" fullname="P. Luthi">
						<organization/>
					</author>
					<date month="January" year="2001"/>
				</front>
				<seriesInfo name="RFC" value="3047"/>
			</reference>
		</references>
	</back>
</rfc>
