XCON Working Group                                         S. Srinivasan
Internet-Draft                                     Microsoft Corporation
Intended status: Standards Track                       November 13, 2006
Expires: May 17, 2007

              Media usages and SDP in the XCON data model
              draft-srinivasan-xcon-usecases-mediausage-00

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 17, 2007.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

This document describes the SDP attributes required for various media usages of the XCON defined data model. It also describes how changes to controls in media sessions may be performed using the conference control protocol and how related notifications of conference state take place.

Srinivasan                Expires May 17, 2007                  [Page 1]
Internet-Draft                 mediausage                  November 2006

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Media stream definitions
     3.1.  Available media definition
     3.2.  Per-user or per-endpoint media definitions
   4.  SDP negotiation and conferencing media usage
     4.1.  Criteria for including media label attribute in SDP
       4.1.1.  Mapping of media label (in SDP) to media id
   5.  Media scenarios
     5.1.  An example mixer model
       5.1.1.  Conference notification example
     5.2.  Common audio/video scenarios
       5.2.1.  Muting an audio stream
       5.2.2.  Pausing a video stream
     5.3.  Changing media streams
     5.4.  Changing media sources
     5.5.  Sidebar scenarios
       5.5.1.  Basic sidebar (exclusive participation)
   6.  Security Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

This document clarifies how media is negotiated in a conference, which controls are associated with media, and what those controls mean. Current documents do not specifically cover how the various media-related aspects of conferencing fit together. For example, [3] describes a mechanism to label media streams to identify them, but leaves the offer/answer model up to specific implementations.
Likewise, [7] and [2] describe mechanisms to control conferences and to notify conference state, but do not specifically discuss how the conferencing server ties specific media identifiers into signaling.

2.  Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119, and indicate requirement levels for compliant implementations.

3.  Media stream definitions

This section describes the media elements in the schema in relation to XCON. The XCON framework [1] describes how to establish and participate in a centralized conference.

3.1.  Available media definition

The available media section of the schema in [2] contains a list of media stream outputs offered by the conferencing server, each identified by a label. A conferencing client joining the conference via the focus typically subscribes to the conference event package [2]. The conference event package provides conferencing clients with details of the conference, including updates to conference state, as described in the conference framework [1]. The global streams in the conference, as defined in the conferencing event package, are identified via a label (in available media) as defined in [2].

The available media element is optional, except when more than one stream of the same type is offered by the conferencing system. A media label MUST be assigned if more than one stream of the same media type (such as audio or video) is available in the conference. This allows a conferencing client that receives more than one stream of the same type to relate each incoming stream to a specific available media entry and render it appropriately.
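
The labeling rule above can be checked mechanically. The following is a minimal sketch and not part of any XCON specification: it assumes a hypothetical list of (label, media type) pairs standing in for available media entries, with a label of None modeling a stream the server left unlabeled. The function names are illustrative only.

```python
from collections import Counter
from typing import List, Optional, Set, Tuple

# Hypothetical stand-in for an available media entry: (label, media_type).
# A label of None models a stream that was not assigned a label.
Entry = Tuple[Optional[str], str]


def types_requiring_labels(entries: List[Entry]) -> Set[str]:
    """Return the media types that are offered more than once."""
    counts = Counter(media_type for _, media_type in entries)
    return {media_type for media_type, n in counts.items() if n > 1}


def label_violations(entries: List[Entry]) -> List[str]:
    """Report unlabeled streams of a type that is offered more than once."""
    required = types_requiring_labels(entries)
    return [
        f"unlabeled {media_type} stream"
        for label, media_type in entries
        if label is None and media_type in required
    ]


# One audio stream and two video streams, all labeled: no violation.
labeled = [("10", "audio"), ("11", "video"), ("12", "video")]
# Two video streams, one of them unlabeled: the rule is broken.
unlabeled = [("10", "audio"), (None, "video"), ("12", "video")]

print(label_violations(labeled))
print(label_violations(unlabeled))
```

A single unlabeled audio stream would pass this check, which matches the statement that the available media element is optional when only one stream of a type is offered.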
Note that the label also serves one other purpose: it can be used to relate the media 'id' (as described in the subsequent sections) to the stream (m-line) in the SDP. The label is also unique within the conference-info context as defined in [2].

A label is typically created when a conference is scheduled, either via conference blueprints [1] or through some other means. A new label MAY, however, appear in the available media element (in [7]) after the conference is active, and clients may decide to render new streams as required (based on local policy).

When a conference is activated and a conferencing client receives a notification with the conference state, that notification includes the list of media labels available in the conference. The conferencing client may use this list, or may discover available media via signaling (for example, via SIP OPTIONS), to join the conference and receive media, provided it understands the context in which the specific media needs to be rendered.

3.2.  Per-user or per-endpoint media definitions

Streams sent from a specific user's endpoint device are usually negotiated via some form of signaling session. The conferencing event package schema contains media XML elements within the users/user/endpoint elements. The media XML element in [2] refers to a media stream, of which there may be more than one. Each media stream sent from the conferencing client to the conference server is identified, within the conferencing event package, via an 'id' (refer to [2] for more information).

4.  SDP negotiation and conferencing media usage

This section explains the semantics of the media label and its usage within the XCON framework. The media label in the conferencing data model or the conferencing event package maps to the media label defined within SDP in [3]. The media label is used to identify streams in the SDP offer/answer model and to associate them with the specific streams within conferencing. This section explains how this is done.

4.1.  Criteria for including media label attribute in SDP

[3] suggests that the label may appear in either the offer or the answer, where it identifies the local stream. This section describes how conferencing servers may integrate the label into the offer/answer model and associate it with the data model. All conferencing clients and servers MUST follow the offer/answer model as described in [6]. The following sections only describe the usage of the media label in the context of conferencing within the offer/answer model.

The conferencing server SHOULD include an SDP 'label' attribute (as defined in [3]) for each stream in SDP sent from the conferencing server to the conferencing client (for example, in either the offer or the answer). If there are two or more streams of the same media type (as defined in [2], Section 5.3.4, with type being one of the values registered for "media" of SDP [5]), the conferencing server MUST include the label for each stream in the SDP sent from the server. Furthermore, if present, the value of the 'label' attribute MUST match the label under the available-media XML element in the data model [7]. Also, the 'label' MUST uniquely identify the stream within a specific SDP offer or answer from the conferencing server.

4.1.1.  Mapping of media label (in SDP) to media id

As the 'id' XML attribute defined in the users/user/endpoint/media XML element in the data model [7] is not carried in SDP (or in any signaling, for that matter), the label attribute also serves to map the media 'id' defined in the data model to the media label defined in the data model. This means that a conferencing client cannot negotiate different m-lines with the same label within the same conference via separate signaling sessions.

[[ Note: Fixing this will require a new SDP attribute for conveying the media id in SDP ]]

5.  Media scenarios

5.1.  An example mixer model

      [to-mixer streams]               [from-mixer streams]
                           |----mixer----|
   userid=23 , id=34   ----|             |--- label='10'
   userid=23 , id=35   ----|             |--- label='11'
   userid=23 , id=36   ----|             |--- label='12'
   userid=24 , id=24   ----|             |
   userid=24 , id=35   ----|             |
                           |-------------|

The mixer shown above takes in a set of input streams and mixes them in some manner to produce output streams. This document does not cover how the streams are grouped or mixed; it only shows, with an example, how the media inputs and outputs tie into the conferencing data model and signaling. For further information, refer to [2].

The 'label' parameter above identifies a media output from the mixer. The streams to the mixer, from a specific user and endpoint, are identified by an 'id' in the conferencing data model. The label is unique throughout the conferencing data model. The id is unique within the endpoint media element in the data model and is generated by the conferencing server. Furthermore, each user is identified by a user identifier; refer to [4].

Consider that label '10' is the stream containing the audio mix from all audio input streams, offered to every participant; that '11' is a video mix that uses one of the layouts described in the scenarios section; and that '12' is an alternate, voice-activated mix of the video streams. Ids 34, 35, and 36 for userid 23 are the user's main audio, main video, and secondary video streams respectively. Ids 24 and 35 for userid 24 are the user's main audio and main video streams respectively.

Let us consider that the mixer mixes the incoming video streams from the participants (going to the mixer) into both the label '11' and label '12' streams; this is how the mixer is wired. Also consider that the mixer accepts at most a single input stream from the client (in any sendrecv media stream), while rejecting the rest.
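
The wiring described so far can be written down as plain data. The sketch below is illustrative only: the dictionary layout and the function name 'outputs_fed_by' are assumptions made for this example and do not come from the data model [7]. It assumes one endpoint per user and does not model the rejection of additional input streams.

```python
from typing import Dict, List, Tuple

# From-mixer (output) streams of the example mixer, keyed by label.
FROM_MIXER: Dict[str, str] = {
    "10": "audio",  # audio mix offered to every participant
    "11": "video",  # video mix using one of the layouts
    "12": "video",  # alternate, voice-activated video mix
}

# To-mixer (input) streams. The media id is only unique within an
# endpoint, so the key pairs it with the userid: id 35 occurs twice.
TO_MIXER: Dict[Tuple[int, int], str] = {
    (23, 34): "audio",  # userid 23, main audio
    (23, 35): "video",  # userid 23, main video
    (23, 36): "video",  # userid 23, secondary video
    (24, 24): "audio",  # userid 24, main audio
    (24, 35): "video",  # userid 24, main video
}


def outputs_fed_by(userid: int, media_id: int) -> List[str]:
    """Labels of the mixes that an input stream contributes to.

    Assumes the wiring of this example only: every input is mixed into
    every output of the same media type.
    """
    media_type = TO_MIXER[(userid, media_id)]
    return sorted(label for label, t in FROM_MIXER.items() if t == media_type)


print(outputs_fed_by(23, 35))  # ['11', '12']
print(outputs_fed_by(24, 24))  # ['10']
```

Keying the inputs by (userid, id) rather than by id alone reflects the scoping rule above: labels are unique across the conference, while ids are not.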
This is one specific mixer model; other mixer models may interpret the input streams differently, and this is something that the mixer describes. For more information on mixer templates and other representations within the mixer, refer to [template]. The next section covers how this specific mix appears in the offer/answer model in SDP. Note that the floor control aspects of the streams above are omitted here for brevity, as floor control is clearly defined as being optional.

5.1.1.  Conference notification example

The notification example given below corresponds to the mixer model defined above. The available-media element lists the media labels as defined. Note that the media labels '11' and '12' are defined with a status element of sendrecv. The mixer model that applies here should be defined in a [template] extension, published as a Standards Track or Informational RFC.

Using the offer/answer model described earlier, users Bob (userid=23) and Carl (userid=24) have joined the focus and negotiated media streams as shown in the notification below. It is useful to note that Bob has chosen to receive all video streams, while Carl has opted for the secondary voice-activated video stream. A conferencing system may expose Bob's input stream directly (without mixing) to the participants of the conference if it deems this necessary, since Bob has the role of presenter. It may do so, for example, by creating a new label on the fly to expose this stream to the conferencing client.

The notification below is what a presenter (Bob) may receive.

   main audio audio sendrecv
      Mute audio outputs of this stream to all participants true
   main video video sendrecv
      Pause video outputs of this stream to all participants true
   secondary video video sendrecv
      Pause video outputs of this stream to all participants true

   Bob Hoskins presenter
   Bob's Laptop connected dialed-out
      main audio audio 432424 sendrecv
         Mute-Audio True
         Mute-Audio True
      main video video 324255 sendrecv
         Pause video True
         Pause video True
      secondary video video 1324255 recvonly
         Pause video True
      full info hsjh8980vhsb78 vav738dvbs 8954jgjg8432

   Carl participant
   Carl's video phone connected dialed-in
      main audio audio 242443 sendrecv
         Mute-Audio True
         Mute-Audio True
      secondary video video 632425 sendrecv
         Pause video True
         Pause video True
      full info aachsjh8980vhsb78 ffvav738dvbs a8954jgjg8432

5.2.  Common audio/video scenarios

5.2.1.  Muting an audio stream

5.2.1.1.  Mute all participants

Muting all participants (in other words, activating the control or setting the value to 'true') in the conference typically means that for the entire duration where mute is applicable, all current and future participants of the conference are muted and will not receive any audio. Typically this control is available to presenter or moderator roles in a conferencing system. Since no audio flows to any participant, activating this control may, in turn, cause the conferencing focus to re-negotiate SDP with the various participants to stop media flowing as and when necessary. This is entirely up to local policy.
Note that doing so may cause changes in conference state (with per-endpoint media elements and controls, their respective ids, and their default states changing).

In the example mixer, the control appears under the available-media element as shown below.

   Mute audio outputs of this stream to all participants true

5.2.1.2.  Muting to-mixer stream from a specific participant

A stream sent from a participant to the mixer may be mixed in any form or manner. For example, it may appear in multiple media outputs from the mixer (though that is not the case in this specific example). Thus, activating this control would almost certainly prevent this input from appearing in any of the outputs from the mixer. Similar to the previous scenario, activating this control may result in SDP re-negotiation.

In the example mixer, the control appears under the media element for each user and endpoint. Bob's controls are shown below.

   Mute-Audio True

SDP from the conferencing server may look like this (some elements omitted):

   v=0
   c=IN IP4 131.164.74.2
   t=0 0
   m=audio 30000 RTP/AVP 0
   a=label:10

Note that even though the above SDP does not contain any information about the media id, the label provides a mapping of the specific m-line to the media section in the data model.

5.2.1.3.  Muting from-mixer stream to a specific participant

This is a control on a specific mixer stream that is sent from the mixer to the participant and negotiated via SDP. This control is mostly optional, and many conferencing systems may opt not to implement it. A client may instead stop sending the media to the output device rather than activating this control to mute the stream. Doing so leaves the mixer still sending media packets towards the participant, consuming bandwidth on the network and CPU on the mixer. Activating this control would stop media being sent back from the mixer to the participant.
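
Because the SDP in these scenarios carries only the label, a client has to read the 'label' attribute from each media description before it can look up the matching media section in the data model. A minimal sketch of that step follows. The function name 'labels_by_mline' is hypothetical, and the parser handles only the 'm=' and 'a=label:' lines used in the examples in this document, not SDP in general.

```python
from typing import List, Optional, Tuple


def labels_by_mline(sdp: str) -> List[Tuple[str, Optional[str]]]:
    """Return (media type, label) for each m-line, in order.

    The label is None when a media description has no a=label attribute.
    An a=label line seen before the first m-line is ignored, since the
    label is a media-level attribute.
    """
    result: List[Tuple[str, Optional[str]]] = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media_type = line[2:].split(" ", 1)[0]
            result.append((media_type, None))
        elif line.startswith("a=label:") and result:
            media_type, _ = result[-1]
            result[-1] = (media_type, line[len("a=label:"):])
    return result


# The SDP fragment from the muting example (some elements omitted).
SDP_FROM_SERVER = """v=0
c=IN IP4 131.164.74.2
t=0 0
m=audio 30000 RTP/AVP 0
a=label:10
"""

print(labels_by_mline(SDP_FROM_SERVER))  # [('audio', '10')]
```

The returned label, '10' here, is what the client would compare against the labels listed in the conference notification.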
Similar to the previous scenarios, activating this control may result in SDP re-negotiation.

In the example mixer, the control appears under the media element for each user and endpoint. Bob's controls are shown below.

   Mute-Audio True

SDP from the conferencing server may look like this (some elements omitted):

   v=0
   c=IN IP4 131.164.74.2
   t=0 0
   m=audio 30000 RTP/AVP 0
   a=label:10

As before, note that even though the above SDP does not contain any information about the media id, the label provides a mapping of the specific m-line to the media section in the data model.

5.2.2.  Pausing a video stream

5.2.2.1.  Pausing video to all participants

Pausing the video being sent to all participants (in other words, activating the control or setting the value to 'true') in the conference typically means that for the entire duration where pause is applicable, all current and future participants of the conference would not receive video. Typically this control is available to presenter or moderator roles in a conferencing system. Since no media flows to any participant, activating this control may, in turn, cause the conferencing focus to re-negotiate SDP with the various participants to stop media flowing as and when necessary. This is entirely up to local policy. Note that doing so may cause changes in conference state (with per-endpoint media elements and controls, their respective ids, and their default states changing).

In the example mixer, the control appears under the available-media element as shown below.

   Pause video outputs of this stream to all participants true

5.2.2.2.  Pausing to-mixer stream from a specific participant

A stream sent from a participant to the mixer may be mixed in any form or manner. For example, it may appear in multiple media outputs from the mixer (as is the case in the example).
Thus, activating this control would almost certainly prevent this input from appearing in any of the outputs from the mixer. Similar to the previous scenario, activating this control may result in SDP re-negotiation.

In the example mixer, the control appears under the media element for each user and endpoint. Bob's controls are shown below. Activating this control would result in Bob not being shown in any of the output streams.

   Pause video True

SDP from the conferencing server may look like this (some elements omitted):

   v=0
   c=IN IP4 131.164.74.2
   t=0 0
   m=video 30002 RTP/AVP 31
   a=label:11

As before, note that even though the above SDP does not contain any information about the media id, the label provides a mapping of the specific m-line to the media section in the data model.

5.2.2.3.  Pausing video from-mixer stream to a specific participant

This is a control on a specific mixer stream that is sent from the mixer to the participant and negotiated via SDP. This control is mostly optional, and many conferencing systems may opt not to implement it. A client may instead stop sending the media to the display device rather than activating this control to pause the stream. Doing so leaves the mixer still sending media packets towards the participant, consuming bandwidth on the network and CPU on the mixer. Activating this control would stop media being sent back from the mixer to the participant. Similar to the previous scenarios, activating this control may result in SDP re-negotiation.

In the example mixer, the control appears under the media element for each user and endpoint. Bob's controls are shown below.

   Pause video True

SDP from the conferencing server may look like this (some elements omitted):

   v=0
   c=IN IP4 131.164.74.2
   t=0 0
   m=video 30002 RTP/AVP 31
   a=label:11

As before, note that even though the above SDP does not contain any information about the media id, the label provides a mapping of the specific m-line to the media section in the data model.

5.3.  Changing media streams

TBD

5.4.  Changing media sources

TBD

5.5.  Sidebar scenarios

5.5.1.  Basic sidebar (exclusive participation)

TBD

6.  Security Considerations

TBD

7.  Acknowledgements

Thanks to Tim Moore and Gonzalo Camarillo for useful comments.

8.  References

8.1.  Normative References

   [1]  Barnes, M., "A Framework and Data Model for Centralized Conferencing", draft-ietf-xcon-framework-05 (work in progress), September 2006.

   [2]  Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session Initiation Protocol (SIP) Event Package for Conference State", RFC 4575, August 2006.

   [3]  Levin, O. and G. Camarillo, "The Session Description Protocol (SDP) Label Attribute", RFC 4574, August 2006.

   [4]  Boulton, C. and M. Barnes, "A User Identifier for Centralized Conferencing (XCON)", draft-boulton-xcon-userid-00 (work in progress), October 2006.

   [5]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

   [6]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

   [7]  Novo, O., Camarillo, G., and D. Morgan, "A Common Conference Information Data Model for Centralized Conferencing (XCON)", draft-ietf-xcon-common-data-model-03 (work in progress), October 2006.

   [8]  Camarillo, G., Holler, J., and H. Schulzrinne, "Grouping of Media Lines in the Session Description Protocol (SDP)", RFC 3388, December 2002.

Author's Address

   Srivatsa Srinivasan
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052, USA

   Email: srivats@microsoft.com

Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard.
Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).