Network Working Group
Internet Draft                                       Maria Napierala
Document: draft-mnapierala-mvpn-part-reqt-01.txt                AT&T
Expires: August 24 2008                             February 24 2008

Requirement for Multicast MPLS/BGP VPN Partitioning

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document describes a requirement for Multicast IP VPN solutions to allow the same multicast stream to flow simultaneously on multiple inter-PE paths without duplicates being sent to receivers. It is intended that existing and new solutions to MPLS/BGP Multicast IP VPN's will support this requirement.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [i].

Napierala Expires - August 2008 [Page 1]

Table of Contents

1. Introduction
2. Terminology
3. General Requirement
4. Multicast VPN Partitioning
   4.1 Support of Anycast Sourcing in MVPN
   4.2 Supporting Redundant P-Tunnels
5. Preserving Customer Multicast Traffic Patterns
   5.1 Support of Source-Specific Host Reports in PIM-SM
6. Support of PIM-Bidir in MVPN
   6.1 Preventing Packet Loops
   6.2 Support of Source-Only Branches
7. IANA Considerations
8. Security Considerations
9. References
10. Acknowledgments
11. Author's Addresses
12. Intellectual Property Statement
13. Copyright Notice

1. Introduction

Multicast VPN (cf. [ii]) extends MPLS/BGP VPN services (cf. [iii]) by enabling customers to run native IP multicast within their IP VPN's. Multicast VPN's enable customers to use applications that were previously expensive or difficult, if not impossible, to operate in wide-area networks (e.g., voice/video conferencing, stock quotes, large file distribution). From the VPN customer's perspective there is no change in the multicast operational model.

Multicast distribution trees are built in the service provider network to carry VPN multicast traffic. Those trees are essentially point-to-multipoint (P2MP) and multipoint-to-multipoint (MP2MP) tunnels that encapsulate IP VPN multicast packets for transport across the provider's network.

Throughout this document whenever we refer to a VPN we mean MPLS/BGP IP VPN and whenever we refer to an MVPN we mean MPLS/BGP Multicast IP VPN.

The generic requirements for IP multicast services within IP VPN's are specified in [iv].
This document defines additional multicast VPN requirements beyond those in [iv]. More precisely, it specifies the need of VPN customers to have multiple parallel paths to a Rendezvous Point or a multicast source across a service provider network. The document formulates the generic aspects of this requirement and identifies the specific issues that an MVPN solution should address. It is expected that solutions that specify procedures and protocol extensions for multicast in IP VPN's should satisfy this requirement.

2. Terminology

In this document we use the "C-" prefix when we refer to the VPN customer multicast addresses and multicast trees. We will prefix VPN customer multicast trees, sources, groups, Rendezvous Points (RP), and PIM routes with "C-", as in: C-tree, C-S, C-G, C-RP, (C-*, C-G), (C-S, C-G). We use the "P-" prefix when we refer to the provider's multicast addresses and multicast trees/tunnels.

We assume familiarity with the PIM protocol [v][vi][vii].

3. General Requirement

The MVPN solution MUST allow the same multicast traffic to flow simultaneously on multiple trees across the provider's network without duplicate packets being sent to customer receivers. The solution has to allow different downstream PE's to choose different upstream PE's towards the customer RP or customer source. As a result, a customer multicast stream would be able to flow along multiple inter-PE trees and simultaneously utilize multiple paths in a redundant topology.

The lack of support for parallel paths for multicast traffic would prevent different multicast VRF's of the same VPN from having different routing policies and choosing different paths to reach the C-RP or the C-source. As a result it would break any kind of "anycast" sourcing of a multicast stream in an IP VPN, including Anycast RP operation, by not allowing multiple RP's to send traffic in parallel to their closest receivers.
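As an illustration only, the following Python sketch shows how downstream mVRF's that each pick their own best route end up partitioned by upstream PE, with every mVRF attached to exactly one upstream PE. The PE and VRF names, the `Route` type, and the numeric `preference` field (lower is better) are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Route:
    upstream_pe: str  # hypothetical name of the PE advertising the route
    preference: int   # hypothetical routing-policy metric; lower is better


def select_upstream_pe(routes: List[Route]) -> str:
    """Return the upstream PE of the best route to the C-RP or C-source."""
    return min(routes, key=lambda r: r.preference).upstream_pe


def partition_by_upstream(mvrf_routes: Dict[str, List[Route]]) -> Dict[str, List[str]]:
    """Group downstream mVRF's by the upstream PE each one selects on its own."""
    partitions: Dict[str, List[str]] = {}
    for mvrf, routes in mvrf_routes.items():
        partitions.setdefault(select_upstream_pe(routes), []).append(mvrf)
    return partitions


# Two upstream PE's (PE1, PE2) both advertise a route to the same C-RP.
mvrf_routes = {
    "PE3:vrf-a": [Route("PE1", 10), Route("PE2", 20)],
    "PE4:vrf-a": [Route("PE1", 20), Route("PE2", 10)],
    "PE5:vrf-a": [Route("PE1", 10), Route("PE2", 30)],
}
partitions = partition_by_upstream(mvrf_routes)
```

In this example PE3 and PE5 attach to PE1 while PE4 attaches to PE2, so the same C-stream flows on two inter-PE trees and each mVRF still receives a single copy.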
The solution MUST NOT result in creating duplicates to customer receivers, except during routing transients. The amount of duplicates during routing convergence should be minimized and should be comparable to that of standard PIM operation.

When preventing duplicates to receivers, the solution MUST NOT waste the provider's network resources by discarding, at network egress, already transmitted duplicate traffic. Only a single copy of any C-multicast stream should reach any egress mVRF in a converged network. This includes PIM-SM C-streams that are flowing on either the C-shared tree or the C-shortest-path tree. An egress PE should receive a PIM-SM C-stream either from the C-RP or directly from the C-source but never from both.

There are two exceptions to the requirement of not wasting the provider's network bandwidth by discarding duplicates at network egress. One exception is when providing fast convergence for certain multicast VPN traffic on failures in access or in the provider's network. This requirement is described in section 4.2. The other exception is in case of aggregation of different C-streams onto the same multicast distribution tree (i.e., P-tunnel) in the provider's network. These two exceptions should be implemented by the MVPN solution as optional behavior, at the provider's discretion. In addition, these two exceptions apply only to traffic from anycast sources/RP's but not to PIM-SM C-streams flowing on the C-shared tree vs. the C-shortest-path tree. In other words, a PIM-SM C-stream should never be delivered to an egress PE from both the C-RP and directly from the C-source.

These requirements SHOULD be supported for all tunnel technologies and SHOULD work with all protocols used for multicast signaling among PE's. However, it is desirable that the solution is "generic", i.e., independent of the multicast tunnel technology used by the service provider.
Independence from tunneling technology will allow different transport methods to be used for different multicast applications without overloading the transport technology itself. This, in turn, will simplify provider's network maintenance operations.

Moreover, the solution MUST NOT impose any restrictions on customer's multicast routing or requirements on multicast service offering. For example, it cannot require a customer to outsource its RP functionality to the service provider or a service provider to participate in customer's RP protocol by running MSDP with the customer.

These requirements SHOULD be supported for the following PIM modes in customer domain: PIM-SM [v], PIM-SSM [vii], and PIM-Bidir [vi]. The MVPN procedures SHOULD support all Rendezvous Point solutions currently used by customers.

4. Multicast VPN Partitioning

Service providers might allow VPN sites to have specific routing intelligence, giving the customers more granular control of their routing in the VPN. Site specific routing intelligence may include, for example, route preference or denial of routes. A VPN customer may partition its sites into groups that share the same routing policy. How these routing policies are defined and how the customers control their routing may differ among service providers and it is outside the scope of this document.

An example where customers need more granular control of their routing is in the selection of the Internet Gateway. Customers might access the Internet from their branch sites utilizing a default route or a summary route originating from their Internet Gateway site. A VPN might have multiple Internet Gateways. A customer may want to control which sites can access each of the gateways based on delay, application, security policy, etc.
More generally, enterprises and firms often want to segment their VPN's by data center or hub locations in order to meet specific performance, security, functionality, or application access requirements. Such partitioning of VPN's has been possible for unicast transmission. It is a customer expectation that multicast traffic in a VPN can be subject to similar partitioning by multicast source or Rendezvous Point location. Lack of such a solution can degrade application performance and introduce latencies and variable delays that impact users and applications.

Typically, large enterprises have multiple data centers and would like multicast applications to be simultaneously sourced from each data center to serve different sets of branch locations. For example, real-time market data distribution requires timely and simultaneous delivery to users. The differences in propagation delay might introduce unacceptable timing differences in the availability of data to users, and hence, unfairness in business competition. Locating sources close to receivers and partitioning receivers by source location is necessary for such business-critical real-time data feeds.

4.1 Support of Anycast Sourcing in MVPN

IP VPN customers might inject the same route(s) at different VPN sites. These could be default or summary routes, or specific routes. Those routes could be the routes to C-RP's or C-sources and they are examples of "anycast" sourcing of multicast traffic in a VPN. The "anycast" sourcing also includes dually connected C-sources or C-RP(A)'s.

Specifically, in Anycast RP, two or more RP's are configured with the same IP address on loopback interfaces. IP routing automatically selects the topologically closest RP for each source and receiver. Anycast RP provides RP redundancy, fast RP failover, and load sharing of registering sources. Anycast RP's are often used in large enterprise networks.
Typically, large enterprises have multiple data centers where Anycast RP's and sources of multicast traffic are located. Such customers might require multicast applications to be simultaneously sourced from each data center and delivered via corresponding Anycast RP's to different sets of branch locations. The expected anycast addressing behavior is that different PE's could choose different upstream PE's as the next-hops to the customer RP or source. As we described at the beginning of section 4, such partitioning of a VPN by Rendezvous Point and source location might be needed to assure the required application performance. Hence, the support for anycast sourcing in MVPN is REQUIRED.

The MVPN solution SHOULD support Anycast C-RP in the following two ways: based on the provider's network IGP/routing cost or based on VPN customer routing. IGP cost-based next-hop selection provides PIM-like support of Anycast C-RP's, i.e., C-receivers join the closest Anycast C-RP across the provider's network. The other option is to leave it up to the VPN routing policy to partition receivers by Anycast RP location. This allows a multicast VPN customer to define its own Anycast C-RP selection, based on criteria other than closest distance.

4.2 Supporting Redundant P-Tunnels

As was stated in section 3, while preventing duplicates to receivers in a converged network, the MVPN solution should not waste the provider's network resources by discarding, at network egress, already transmitted duplicate traffic. The exception to duplicate-free operation is when providing fast convergence for certain multicast VPN traffic on failures in access or in the provider's network. This is required for applications that are not tolerant to packet loss.
To minimize multicast traffic disruption during failures, the solution should provide a capability to pre-build redundant P-tunnels when an mVRF has multiple paths to C-RP or C-source. In addition to the "primary" tunnel, a "secondary" or redundant P-tunnel could be triggered for C-Group or C-Source traffic. The primary P-tunnel should be based on the mVRF's best route to C-RP or C-source. The secondary P-tunnel should be based on the next best route (or another equal cost path) to C-RP or C-source.

The redundant P-tunnel could function in a "warm-standby" or a "hot-standby" mode. In "warm-standby" mode the redundant P-tunnel should be triggered but the ingress PE, the root of the tunnel, should not forward traffic onto it. In "hot-standby" mode the traffic should be carried on both primary and standby tunnels, and duplicates are allowed to be received at egress PE's. The egress PE's should accept the traffic from only one of the two tunnels. If the best route to C-RP/C-Source exists then the traffic should be accepted from the primary tunnel. If the best route to C-RP/C-Source is withdrawn and the secondary route exists then the traffic should be accepted from the redundant tunnel. Such a capability should be implemented as a per-tunnel configuration option available to the service provider.

5. Preserving Customer Multicast Traffic Patterns

PIM-SM has the capability for last-hop routers to switch to the shortest-path tree if the traffic rate is above a configured SPT-threshold. By switching to SPT, the optimal path is used to deliver the multicast traffic. Depending on the location of the source in relation to the RP, switching to the SPT can significantly reduce network latency. However, in networks with large numbers of senders, SPT's can increase the amount of state that must be kept in the routers.
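The threshold rule above can be sketched as follows. This is a simplified illustration, not PIM-SM itself: the function name, the kbps unit, and the use of a floating-point infinity to mean "never switch" are assumptions made for the example.

```python
import math


def use_shortest_path_tree(rate_kbps: float, spt_threshold_kbps: float) -> bool:
    """Return True when a last-hop router would deliver this source via the SPT.

    Assumption: the source's traffic is carried on the SPT only while its
    measured rate is strictly above the configured SPT-threshold; otherwise
    it stays on (or returns to) the shared tree.
    """
    return rate_kbps > spt_threshold_kbps


# Hypothetical measurements for three sources against different thresholds.
low_rate_source = use_shortest_path_tree(rate_kbps=50.0, spt_threshold_kbps=100.0)
high_rate_source = use_shortest_path_tree(rate_kbps=500.0, spt_threshold_kbps=100.0)
never_switch = use_shortest_path_tree(rate_kbps=500.0, spt_threshold_kbps=math.inf)
```

Only the high-rate source creates (S,G) state on the shortest-path tree in this sketch, which is the trade-off between latency and router state described above.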
A VPN customer might set the SPT-threshold to a value higher than zero in order to switch to SPT's only for sources that exceed a certain traffic rate. This is done in order to relieve the RP of carrying too much traffic while at the same time controlling the number of (S,G) states created in the network. When a source's traffic rate falls below the specified SPT-threshold on the last-hop CE, the traffic is switched back from the source tree to the shared tree. In fact, last-hop CE's might never switch traffic to SPT's for certain multicast groups if an SPT-threshold of "infinity" is specified for those groups. The MVPN solution should not affect this PIM-SM behavior.

In native PIM-SM mode the same multicast traffic does not necessarily flow over a single tree but it can simultaneously flow on both shared and shortest path trees, without duplicates being sent to receivers. The MVPN solution SHOULD allow egress PE's (i.e., PE's with receivers) to receive a specific PIM-SM C-stream either via the C-RP or directly from the C-source. As was stated in section 3, a PIM-SM C-stream should never be delivered to an egress PE from both the C-RP and directly from the C-source.

Moreover, the MVPN solution should not base multicast routing decisions on the provider's backbone internal infrastructure, like IP addressing of PE routers. Rather the MVPN solution MUST preserve the multicast routing policies as defined in the customer's VPN. More precisely, a multicast VRF with receivers of (C-*, C-G) or (C-S, C-G) should receive multicast traffic from its best next-hop to C-RP or C-S, respectively.
The only two exceptions to this requirement are: (1) the existence of multiple equal cost paths to customer source or customer RP, which forces the provider's network to impose a tie-breaker, and (2) the existence of a configuration knob that provides an optional multicast behavior, at the provider's discretion (for example, Anycast C-RP selection based on the provider's network routing cost).

In summary, the MVPN procedures should not alter the customer's multicast traffic patterns as defined by the customer's PIM infrastructure as well as by MVPN routing policies. More precisely, the MVPN solution MUST:

- preserve Anycast C-RP infrastructure,
- conform to customer's SPT-thresholds by
   + not triggering unexpected (C-S, C-G) states in customer's network
   + supporting C-tree "switchback" from shortest path tree to shared tree
   + supporting dually connected C-receiver sites where shared tree and shortest path tree diverge
- preserve shared trees in customer network if CE's do not switch traffic to SPT's, by not triggering unexpected (C-S, C-G) states in customer's network
- preserve multicast routing policies in customer's VPN.

5.1 Support of Source-Specific Host Reports in PIM-SM

PIM-SM [v] permits "a receiver to join a group and specify that it only wants to receive traffic for a group if that traffic comes from a particular source. If a receiver does this, and no other receiver on the LAN requires all the traffic for the group, then the DR may omit performing a (*,G) join to set up the shared tree, and instead issue a source-specific (S,G) join only."

Such a behavior of PIM-SM means that any PE can receive Join (C-S, C-G) for a sparse mode group even if no PE has ever received Join (C-*, C-G) in an MVPN. It also means that (as in PIM-SSM) source trees might be triggered even for sources that are not active.
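The quoted DR behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not the PIM-SM state machine: the function name `dr_joins` is hypothetical, each receiver report is modeled as a set of requested sources, and `None` stands for a report that asks for all traffic of the group. When any receiver wants all traffic the sketch returns only the (*,G) join and does not model any additional (S,G) joins.

```python
from typing import List, Optional, Set, Tuple


def dr_joins(group: str, reports: List[Optional[Set[str]]]) -> List[Tuple[str, str]]:
    """Return the joins a DR would issue upstream for one group on a LAN.

    reports holds one entry per receiver: a set of source addresses for a
    source-specific report, or None when the receiver wants all sources.
    """
    # Some receiver requires all the traffic for the group: join the shared tree.
    if any(report is None for report in reports):
        return [("*", group)]
    # Only source-specific reports: omit the (*,G) join, issue (S,G) joins only.
    sources: Set[str] = set()
    for report in reports:
        sources |= report
    return [(source, group) for source in sorted(sources)]


# Hypothetical addresses; the source need not be active for the join to be sent.
only_specific = dr_joins("232.0.0.1", [{"10.0.0.1"}, {"10.0.0.1", "10.0.0.2"}])
mixed = dr_joins("232.0.0.1", [{"10.0.0.1"}, None])
```

Note that nothing in the decision depends on whether the requested source is sending, which is why a PE can see Join (C-S, C-G) for an inactive C-source.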
The MVPN solution SHOULD support source-specific host requests but it SHOULD prevent useless S-PMSI creation for C-sources which are not active.

6. Support of PIM-Bidir in MVPN

Many enterprises use multicast applications that scale or even operate correctly only with PIM-Bidir [vi]. For example, financial firms use business-critical "always on" VoIP conferencing (so called "hoot-n-holler") to share market updates and trading orders. PIM-Bidir is already deployed in many of these networks and its support in MVPN context is REQUIRED. This is a change from the MVPN generic requirements document [iv], where PIM-Bidir support on PE-CE interfaces is only recommended.

6.1 Preventing Packet Loops

In PIM-Bidir, the packet forwarding rules have been improved over PIM-SM, allowing traffic to be passed up the shared tree toward the RP Address (RPA). To avoid multicast packet looping, PIM-Bidir uses a mechanism called the designated forwarder (DF) election, which establishes a loop-free tree rooted at the RPA. Use of this method ensures that only one copy of every packet will be sent to an RPA, even if there are parallel equal cost paths to the RPA. To avoid loops the DF election process enforces a consistent view of the DF on all routers on a network segment, and during periods of ambiguity or routing convergence the traffic forwarding is suspended.

The standard DF election procedure used in plain IP environments would not yield the desired results in MVPN context. This is because the DF election in MVPN would have to be based on the provider's internal IP addressing of PE routers instead of on routing policy in the customer's VPN. In MVPN context a Designated Forwarder for Bidir C-RPA is a PE attached to C-RPA.
Different mVRF's in a given MVPN might have different best next-hop PE's to C-RPA due to different routing policies or they might have temporarily different next-hop PE's to C-RPA due to routing transients.

The MVPN solution for C-Bidir MUST prevent multicast packet looping during routing convergence. The MVPN solution for C-Bidir SHOULD NOT rely on all mVRF's in a given MVPN to either have a common routing view to C-RPA or to reach a common routing view to C-RPA in time to prevent packet looping. Rather, a VPN has to be treated as a collection of sets of multicast VRF's, each set sharing the same reachability information towards C-RPA, distinct from that of the other sets. Hence, resolving C-Bidir packet loops in MVPN inevitably results in the ability to partition a VPN into disjoint sets of VRF's, each having a distinct view of the converged network.

As an option, the MVPN implementation of C-Bidir SHOULD allow specific multicast routing policy in an mVRF to be ignored, and instead make all PE's in a given MVPN choose the same next-hop PE to C-RPA. Among all candidate next-hop PE's, the single chosen upstream PE to C-RPA could be the PE with the highest IP address. This approach to C-Bidir might be desirable to customers that do not want a permanent splitting of their MVPN's into disjoint C-Bidir trees.

6.2 Support of Source-Only Branches

PIM-Bidir supports source-only branches, i.e., branches that do not lead to any receivers but that are used to forward packets traveling upstream from sources towards the RPA. In plain IP PIM-Bidir it is up to the implementation whether to maintain group state for source-only branches [vi]. The MVPN solutions MUST assure that PE's on source-only branches of a C-Bidir tree are able to send and receive inter-PE MVPN traffic. In other words, PE's on source-only branches have to be able to participate in P-tunnels triggered for C-Bidir trees.

7. IANA Considerations

None.

8. Security Considerations

To be supplied.

9. References

[i] S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[ii] E. Rosen, R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast. Work in progress.

[iii] E. Rosen, Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[iv] T. Morin, Ed., "Requirements for Multicast in Layer 3 Provider-Provisioned Virtual Private Networks (PPVPNs)", RFC 4834, April 2007.

[v] B. Fenner et al., "Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)", RFC 4601, August 2006.

[vi] M. Handley, I. Kouvelas, T. Speakman, L. Vicisano, "Bidirectional Protocol Independent Multicast (BIDIR-PIM)", RFC 5015, October 2007.

[vii] H. Holbrook, B. Cain, "Source-Specific Multicast for IP", RFC 4607, August 2006.

10. Acknowledgments

The author would like to thank Eric Rosen, Yakov Rekhter, and Ron Bonica for their comments.

11. Author's Addresses

Maria Napierala
AT&T Labs
200 Laurel Avenue, Middletown, NJ 07748
Email: mnapierala@att.com

12. Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietfipr@ietf.org.

13. Copyright Notice

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.