Network Working Group                                      M. Napierala
Internet Draft                                                     AT&T
Document: draft-mnapierala-mvpn-rev-00.txt
Expires: May 2007                                         November 2006

Multicast MPLS/BGP VPNs Revisited

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document describes changes to inter-site signaling procedures in Multicast MPLS/BGP IP VPNs, defined in [ii], that result in the simplification of inter-PE multicast traffic patterns. It specifies a mechanism to bypass shared tree to shortest path tree VPN traffic switching between PE's and to achieve congruency of multicast routing with VPN's unicast routing policy.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [i].

Napierala Expires - May 2007 [Page 1]
Multicast MPLS/BGP VPNs Revisited November 2006

Table of Contents

1. Introduction
2. MPLS/BGP MVPN Control Plane Principles
3. Congruent Multicast and Unicast Routing in Multicast MPLS/BGP VPN
   3.1 Bypassing Shared RP Trees in MVPN
4. Source Discovery and Simplification of MVPN Inter-Site Routing
   4.1 VPN Source Discovery - Option 1
   4.2 VPN Source Discovery - Option 2
   4.3 Comparison with other Source Discovery Techniques
   4.4 Applicability to draft-rosen-vpn-mcast
5. PE-to-PE Signaling in Multicast MPLS/BGP VPN
   5.1 Multicast Routing Exchange in MPLS/BGP VPN
   5.2 Multicast Tunnel and Source Announcements in Multicast MPLS/BGP VPN
6. Multicast MPLS/BGP VPN Data Plane
7. Separating MVPN Data Plane from Control Plane
8. IANA Considerations
9. Security Considerations
10. References
11. Acknowledgments
12. Author's Addresses
13. Full Copyright Statement
14. Intellectual Property

1. Introduction

Multicast VPN (cf. [ii]) extends MPLS/BGP VPN services (cf. [iii]) by enabling customers to run native IP multicast within their VPN's. From the customer perspective there is no change in their multicast operational model, and extending multicast routing between customer's edge (CE) routers and provider's edge (PE) routers is straightforward. Multicast distribution trees are built in a service provider network to carry VPN customer multicast traffic. Those trees are essentially point-to-multipoint (P2MP) or multipoint-to-multipoint (MP2MP) tunnels that encapsulate IP VPN multicast packets for transport across SP network.
Throughout this document whenever we refer to a VPN we mean MPLS/BGP IP VPN.

The existing MVPN solutions and new proposals do not take into account that different VPN sites might have different unicast routing policies. As a consequence, hosts in a VPN might receive duplicated or unwanted multicast traffic. In this document we propose a solution to this problem. We also discuss a direct consequence of this solution, which is the simplification of inter-PE multicast traffic patterns and routing in MVPN.

Several options exist for MVPN signaling, and several tunnel options exist for carrying multicast data traffic across a service provider's (SP) network. Sections 5 and 6 of this document evaluate, respectively, different MVPN control protocols and different MVPN tunneling technologies. We discuss the tradeoffs between those options and attempt to conclude which options are the most promising to service providers.

2. MPLS/BGP MVPN Control Plane Principles

There are several main principles that a multicast MPLS/BGP IP VPN control plane should meet.

The multicast routing should be CONGRUENT with unicast routing policy. This requires that the tunnel information should not be signaled before active sources are discovered. Specifically, signaling of tunnel information upon the source sending traffic natively on (S,G) is not sufficient since it might lead to a non-negligible amount of duplicate or unwanted traffic being sent to receivers.

The MVPN implementation in SP's infrastructure should be SIMPLIFIED from the normal Any-Source Multicast (ASM) procedures. The amount of customer multicast control messages carried across SP network should be minimized. Inter-site multicast traffic should not be sent over shared trees (RPT's). The simplification or reduction of inter-site MVPN routing information should not require multicast protocol changes in the customer domain.
There should be no shifts of multicast VPN traffic in the service provider network, to assure STABILITY of traffic patterns. Specifically, there should be no switching of traffic between RPT and shortest path tree (SPT) within SP network or between different tunnels in SP network, e.g., from Default MDT to Data MDT as in [iv].

There should be an OUT-OF-BAND MECHANISM that would allow constrained distribution of tunnel information, constrained distribution of information about active sources, and constrained distribution of customer multicast control messages for, at least, bypassing inter-site shared trees. Sending all multicast VPN control information out-of-band will allow for separating multicast control plane from its data plane.

3. Congruent Multicast and Unicast Routing in Multicast MPLS/BGP VPN

Current deployments [iv] as well as new proposals [ii] for MPLS/BGP Multicast VPN are not aware of the underlying unicast routing policy in a multicast-enabled VPN. Yet, it is a natural expectation that multicast routing in a layer 3 VPN is congruent with unicast routing policy. A consequence of the lack of routing congruency is that VPN sites might receive duplicated or unwanted multicast traffic. Multicast traffic from source S to group G should not be sent to an MVRF that has not "joined" the tunnel announced by a next hop towards S.

We propose a solution to this problem whose main idea is that a tunnel for multicast VPN group G is announced only if a source S of G has been discovered. Only the VRF with a receiver for a group G behind it (i.e., attached to its site) has the information about the current best next hop to a source S of G. The best next hop to the source S might be different in different VRF's of the same VPN as those VRF's might have different unicast routing policies.
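As a minimal sketch of the congruency rule above (an illustration only, not part of any specification), the following Python fragment models two VRF's of the same VPN whose unicast policies select different next-hop PE's towards the same source. The names `Vrf`, `best_next_hop`, and `should_join`, the addresses, and the use of a local-preference number as a stand-in for "unicast routing policy" are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Vrf:
    """Hypothetical model of one VRF: candidate next-hop PEs per source.

    Each candidate is (pe_address, local_pref); the higher local_pref wins.
    local_pref only stands in for whatever unicast policy the VRF applies.
    """
    name: str
    routes: dict = field(default_factory=dict)  # source -> [(pe, local_pref)]

    def best_next_hop(self, source):
        candidates = self.routes.get(source, [])
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[1])[0]

    def should_join(self, source, announcing_pe):
        """Join a tunnel for `source` only if it was announced by this
        VRF's own best next hop towards that source."""
        return self.best_next_hop(source) == announcing_pe


# Two VRFs of the same VPN with different policies towards source 10.1.1.1.
vrf_a = Vrf("site-a", {"10.1.1.1": [("192.0.2.1", 200), ("192.0.2.2", 100)]})
vrf_b = Vrf("site-b", {"10.1.1.1": [("192.0.2.1", 100), ("192.0.2.2", 200)]})

for vrf in (vrf_a, vrf_b):
    for pe in ("192.0.2.1", "192.0.2.2"):
        print(vrf.name, pe, vrf.should_join("10.1.1.1", pe))
```

In this sketch each VRF accepts exactly one of the two announcements, so a receiver site never pulls traffic from a PE that its unicast policy would not use to reach the source.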
To preserve the routing policies for multicast, the tunnel announcement has to happen after source S of G has been discovered, i.e., when a PE being the next hop to that source learns about the source existence.

An active source should be discovered in MVPN no later, or only negligibly later, than it would be in plain ASM multicast, so that the amount of traffic that is lost, duplicated, or misrouted is negligible.

Different methods for source discovery exist. A PE can itself discover the sources attached to it or it can learn about them from other PE's in MVPN. Source discovery methods differ in how quickly they discover active sources, whether source discovery information has to be exchanged between PE's in MVPN, and whether the discovery method requires exchanging control information with VPN customer's Rendezvous Points. The source discovery methods are defined in section 4 of this document.

Resolving MPLS/BGP VPN unicast/multicast routing congruency is applicable to the existing implementation [iv] as well as to the new MVPN proposal [ii]. For example, we show how to modify the Data MDT procedure in [iv] such that the Data MDT tunnel is announced upon source discovery and not upon the source sending traffic natively on (S,G), which might lead to a non-negligible amount of duplicate or unwanted traffic being sent to receivers.

3.1 Bypassing Shared RP Trees in MVPN

In native IP Any-Source Multicast (ASM) mode the only purpose of shared trees is for multicast hosts to discover the sources of multicast traffic they are interested in. Once the source is discovered in ASM, last-hop routers usually switch from shared tree to shortest path tree. However, it is not necessary to perform the RPT-to-SPT traffic switching between PE's in MVPN. In MVPN context, the ASM source discovery process can be "intercepted" by PE's in order to extract VPN source information.
The interception of the ASM discovery process should result in PE-to-PE traffic being sent only on SPT's, regardless of whether or not it was switched to SPT's in the customer domain. This avoids unnecessary shifts of traffic in SP network and, depending on the interception technique, can lead to simplification of PE-to-PE multicast routing.

The knowledge about active sources can be used to eliminate inter-PE shared tree (RPT) to shortest path tree (SPT) switchover in MVPN and even to simplify PE-to-PE inter-site multicast routing. Whether the switching of inter-site multicast traffic from RPT to SPT can be completely eliminated depends on the source discovery technique.

In order to eliminate inter-site RPT-to-SPT switchover, PIM control procedures in MVPN context need to be modified. Those modifications are straightforward. The elimination of inter-site RPT-to-SPT switchover can also simplify the inter-PE multicast routing by performing only the necessary routing exchanges. The goal should be that the modification of PE-to-PE multicast routing does not require changes to PIM protocol in the customer domain.

4. Source Discovery and Simplification of MVPN Inter-Site Routing

Different options exist on how to discover active multicast sources in MVPN. In general, an active source can be discovered when a PE with a customer RP (C-RP) behind it receives the first (*, C-G) packet from a source C-S, or when a PE with a customer source (C-S) behind it receives the first (C-S, C-G) join, either from a directly connected CE or from another PE in MVPN. Below we explore the two techniques that can be used to discover an active source of multicast traffic.

4.1 VPN Source Discovery - Option 1

One option (let's call it Option 1) is to completely avoid inter-PE RPT-to-SPT switching.
There are two scenarios for this option: the customer source (C-S, C-G) and its C-RP reach each other across the SP network, or they reach each other outside the SP network. We use two VPN sources to illustrate these two scenarios: a source C-S1 of C-G located at the same site as its C-RP, and a source C-S2 of C-G located at a site attached to a different PE than the C-RP. We assume that in case of source C-S1, the reachability between C-S1 and C-RP is outside of the SP network. In case of source C-S2, the reachability between C-S2 and C-RP is across the SP network.

A PE that leads to a VPN C-RP, upon receiving (*, C-G) PIM Join from another PE, will follow the normal ASM procedures: it will add an interface that leads to the SP's core network to (*, C-G) outgoing interface list (OIL) and it will send (*, C-G) Join towards the C-RP.

In the meantime, a VPN source (C-S1) of C-G located at a site behind this PE, or a source (C-S2) located at a site attached to a different PE, has sent to C-RP a PIM Register message with multicast data encapsulated in it. The C-RP extracts the multicast data packet from the Register message and sends it over the shared (*, C-G) tree. However, the PE with C-RP behind it should not forward any (*, C-G) traffic on the interface leading to SP's core network. This is a change from the normal ASM procedure but it does not affect PIM in customer domain.

When the PE (that leads to VPN C-RP) receives the 1st multicast packet from the source C-Si (i = 1 or 2), described above, over (*,C-G) tree, it sends (C-Si,C-G,rpt) Prune towards C-RP and announces the active source C-Si and its location (being behind this PE or another PE) to all PE's in the MVPN. This requires the PE with C-RP at its site (i.e., the incoming interface for (*,C-G) state is PE-CE interface) to "snoop" for a packet received on (*,C-G) and extract from it the source address.
This is no different from the last hop router extracting source address from the first packet it receives on shared tree in plain ASM procedure. However, instead of sending (C-S, C-G) Join (as in plain ASM), this PE sends an announcement message about the active source C-S to all other PE's in this MVPN. It also sends (C-Si,C-G,rpt) Prune towards C-RP to stop C-Si traffic.

Upon receiving a source C-S of C-G announcement message, a PE that is the next hop to source C-S will send a message containing the information about a tunnel to be used for (C-S, C-G) traffic. If the next hop to the source C-S is the same PE as the next hop to the C-RP then it is sufficient for this PE to just send the tunnel announcement message.

The PE's attached to receivers of C-G (also called egress PE's), upon receiving the source announcement for (C-Si, C-G), will convert (*, C-G) PIM Join/Prune messages received from locally attached CE's to (C-Si, C-G) PIM Join/Prune for all active sources C-Si. If the tunnel announcement for (C-Si, C-G) was also received, egress PE's will also "join" the correct tunnel. Each such egress PE will send (C-Si, C-G) join to its best next-hop to the source C-Si. It will join the tunnel for (C-Si, C-G) announced by this best next hop.

As we stated above, the PE with C-RP behind it will not forward any (*, C-G) traffic on the interface leading to SP's core network. This interface serves solely the purpose of keeping (*, C-G) OIL non-empty.

If there is more than one best next-hop to C-S, the PE will choose as the next hop the PE with the highest IP address.

There could be a scenario where a dually homed VPN site with receiver(s) chooses a different next-hop PE depending on whether a shared (*,C-G) tree or source (C-S,C-G) tree is joined. It means that shared and source trees diverge at this site. The PE which is on the
The PE which is on the Napierala Expires – May 2007 [Page 6] Multicast MPLS/BGP VPNs Revisited November 2006 shared tree will receive (C-S,C-G, rtp) Prune message from its directly connected CE. In this case, to avoid duplicate traffic from C-S, this PE will send (C-S,C-G) Prune towards interface leading to the backbone network. However, it does not need to propagate (C-S,C- G, rtp) Prune (if C-RP is at a remote PE) since C-S has be already pruned from the shared tree. Option (1) blocks all (*, C-G) data traffic between PE’s in MVPN. Hence, it completely eliminates inter-site RPT to SPT multicast traffic switching. The initial packets send by the sources will be dropped. Regardless of whether or not the traffic in customer’s network switched to SPT’s, PE-to-PE MVPN traffic is sent only on SPT’s. 4.2 VPN Source Discovery - Option 2 With another technique an active source is discovered when a PE with a customer source behind receives the first (S,G) join, either from directly connected CE or from another PE in MVPN. We will call it Option (2). When a PE with a VPN source C-S located in a site behind it receives the 1st (C-S, C-G) join from another PE or from a locally attached CE, this PE will announce the tunnel to be used for (C-S,C-G) traffic to all other PE’s in the MVPN. In this option the source discovery and tunnel announcement is one and the same technique. In other words, there is no need for a separate source discovery method as in Option (1). An egress PE with receiver(s) of C-G at its directly attached site(s) will “join” the tunnel announced for (C-S, C-G) only if the PE that sent this announcement is the best next hop to the source C-S on this egress PE. Each PE that have a C-RP behind it (i.e., the incoming interface for (*,C-G) state is PE-CE interface) will send (C-S, C- G,rpt) Prune message for all active sources of VPN group C-G. This is to shut off active source traffic of the shared tree. 
Note that if PIM is used to exchange MVPN PE-to-PE multicast routing, traffic duplicates are resolved with the Assert mechanism. But in order to assure that the source traffic is forwarded directly from the source and not from the source via RP, the source traffic has to be pruned off the shared tree.

If there is more than one best next-hop to C-S, the PE will choose as the next hop the PE with the highest IP address.

This option cannot block (*, C-G) control or initial (*,C-G) traffic between PE's in MVPN. Also, it does not eliminate RPT to SPT switching in MVPN context. A few initial packets received by the receivers could be duplicates.

Option (2) preserves the congruency of multicast routing with unicast routing policy because it allows the PE's with receivers to join the correct tunnel for (C-S, C-G) traffic. It also avoids switching of multicast traffic between MVPN tunnels (except for the few initial packets). This option relies on at least one customer last-hop router to switch C-S traffic from RPT to SPT. As long as (at least) one last-hop router switched to SPT for C-S in customer's network, all PE-to-PE VPN traffic from C-S is switched to SPT.

4.3 Comparison with other Source Discovery Techniques

Neither source discovery option described in sections 4.1 and 4.2, respectively, introduces any requirements or restrictions on the MVPN service offering. Both techniques work with any multicast topology and with any RP protocol in customer domain. The preferred option is Option (1) because it eliminates switching of inter-site multicast traffic between RPT and SPT, eliminates inter-site (S,G,rpt) Prunes and, hence, simplifies MVPN routing. It also does not rely on the traffic in customer's network being switched to an SPT. Lastly, Option (1) discovers sources no later than in plain ASM multicast.

There are other source discovery methods possible.
However, they either require changes in multicast procedures in customer network or they introduce specific requirements in how Multicast VPN service is offered. For example, a PE could participate in PIM source discovery techniques in the customer domain. This requires one or more (for redundancy) PE's to act as customer RP's or MSDP peers to customer RP. This technique completely eliminates inter-site PIM messages associated with shared trees and with RPT-to-SPT switching. However, it lays specific requirements on MVPN service offering and on the multicast design in the customer network. It either requires that the MVPN customer is willing to outsource the RP functionality to the service provider (SP) or that the customer RP establishes MSDP peering with a PE. Outsourcing the RP to the service provider might not be desirable to either the customer or the SP. Also, it might not be feasible for the customer to run MSDP and for the SP to support MSDP in MVPN context with every customer. Also, the RP assignment gets complicated if the customer is using a dynamic RP protocol (Auto-RP or BSR). For these reasons this is not a valid source discovery option.

4.4 Applicability to draft-rosen-vpn-mcast

If MVPN implementation is according to [iv], enhancing its source discovery will allow Default MDT to be used solely for inter-site multicast control traffic. No or only a negligible amount of multicast data traffic will be sent over Default MDT. No "switching" of multicast traffic from Default to Data MDT's would be required. Data MDT's would be announced upon source discovery and not upon the source sending traffic natively on (S,G) as in [iv].

Information about the active sources could be distributed among PE's in a Data MDT-like UDP message format, except that this message does not have to be periodically resent.
However, for reliable delivery, a method to assure the delivery would have to be implemented. This source discovery message could be distributed over Default MDT. The following information would be carried in Active Source Announcement messages:

- customer source and group address (there could be many [source, group] tuples for the same PE address)
- IP address of a PE which is BGP next hop to the source announced in this message.

To assure congruency of multicast routing with unicast routing policy, Data MDT procedure in [iv] has to be modified as follows:

- when a source is discovered (through one of the techniques described in sections 4.1 and 4.2), the PE router connected to the source C-S for group C-G sends MDT Join TLV;
- MDT Join TLV is forwarded (and sent periodically as described in [iv]) to all PE routers in the MVPN. Of the PE routers that received the MDT Join TLV, only those for which both of the following conditions are met will join the Data MDT group:
  -> they have interested receivers for group C-G or for (C-S,C-G), and
  -> their next hop to the source C-S is the PE announcing Data MDT Join.
- PE routers which are not connected to receivers of C-G will cache the Data MDT message in order to reduce the delay when a receiver comes up in the future or in case of a primary route failure towards a source.

No delay would be needed in the PE router connected to VPN source to start encapsulating traffic using the Data MDT group. The traffic received from the source (C-S,C-G) would be immediately sent over Data MDT once it is built, i.e., as long as there is at least one receiver for group C-G reachable across SP network. Any pre-configured conditions (like bandwidth) are no longer required to start announcing Data MDT.

5. PE-to-PE Signaling in Multicast MPLS/BGP VPN

The key functionality provided by multicast PE-PE signaling in MPLS/BGP VPN architecture consists of the following:

- exchange of VPN multicast routing information
- signaling of active sources in a VPN
- signaling of tunnel information to be used for multicast VPN traffic across the SP network.

In this section we discuss MVPN PE-to-PE signaling protocol options and the differences between them.

5.1 Multicast Routing Exchange in MPLS/BGP VPN

The following protocols could be used for exchanging MPLS/BGP VPN multicast routing among PE's: PIM [iv], Multicast LDP [v], and BGP [vi]. In this section we discuss each of these protocols.

5.1.1 PIM

Using PIM for inter-PE VPN multicast route distribution allows preserving the standard PIM procedure in the customer domain. PIM procedures are well understood and well deployed. Even though in ASM the data and control plane are not separated, in PIM-based MVPN it is possible to separate customer multicast control from multicast data, according to the procedures defined in this document. VPN multicast routing among PE's can be exchanged out-of-band using a constrained multicast distribution tree like Default-MDT in [iv].

According to the procedures described in this document, multicast VPN data traffic is always to be sent selectively to the interested egress PE's and forwarded directly from VPN sources and not via C-RP's. Hence, there is never duplicate multicast VPN data traffic received at the egress PE's. As a result, the PIM Assert mechanism is no longer needed in MVPN. However, PIM Join suppression and Prune override provide very important optimizations in multicast VPN routing.
With PIM-on-LAN procedures there is no need to track the interested receivers behind different downstream PE's since there is only one multicast tunnel interface per MVPN whose inclusion in a C-tree is handled by PIM override.

The issue with the existing MVPN implementation based on [iv] is not that it uses PIM but that it does not have a good technique for source discovery, leading to unnecessary traffic shifts in SP network, and that it does not account for different unicast routing policies within a multicast VPN.

PIM LAN procedures have to scale on PE routers to support a large number of MVPN's spanning a large number of PE's.

5.1.2 BGP

Using BGP for exchanging VPN multicast routes is a non-trivial task and it is in the very early stages of standardization. The existing proposal to carry multicast routes in BGP in [vi] does not address the RP discovery protocols (BSR, Auto-RP) and RP-to-group mapping.

When BGP is used as multicast routing protocol, PIM state machine changes would be required to interact with BGP. In general, BGP cannot fully re-create PIM, especially PIM LAN procedures for resolving duplicate traffic, because they depend on the actual traffic flows. However, as we pointed out in this document, PE-to-PE signaling in MVPN can be simplified to the point that no duplicates are ever sent to egress PE's, and, hence, a full translation of PIM in BGP is not required.

However, there are several fundamental issues with carrying multicast routes in BGP. We will explore them below.

The multicast routes (PIM Join/Prunes) in MVPN are by nature many-to-one while BGP with Route Reflectors scales with exactly the opposite communication, one-to-many or few-to-many. In multicast VPN, only the next hop PE towards the C-source/C-RP should receive multicast (C-S,C-G)/(*,C-G) route. This is different from unicast routing where the same route is to be received by all remote PE's in a VPN.
Ensuring that only the upstream multicast hops towards C-sources/C-RP's receive the multicast routes requires extensive multicast route filtering. This, ultimately, negates the Route Reflector scaling advantage.

Moreover, PIM Join/Prunes are independent from downstream routers that initiated them. They are aggregated to a single route when sent "upstream" for a given tree. In contrast, identical unicast routes may have different next hop routers to reach them. BGP keeps track of these different next hops as alternate paths for the same unicast route, at least within a route reflector cluster. If multicast routes are carried in BGP their next hops also have to be tracked within a route reflector cluster, not as alternate paths but as being on specific C-trees. This is necessary in order to keep track of interested receivers behind downstream PE's. While in unicast IP VPN's there are very few (usually two) alternate paths for a single route, in a multicast VPN there might be as many downstream PE's on a specific C-tree as there are PE's in an MVPN. Hence, with BGP as multicast routing protocol, PE's on a C-tree follow PIM point-to-point procedure rather than PIM LAN procedure. Point-to-point multicast routing procedures are not efficient in MVPN (even with the use of route reflectors) because they do not scale with MVPN's many-to-one topology. For example, if there are 100 PE's requesting the same C-tree then there will be 100 paths associated with a given multicast state. The multicast BGP path multiplier has a different meaning and is therefore of a different magnitude than it is in unicast routing. This will have a very significant impact on scale and performance of route reflectors carrying multicast routes.

With unicast IP VPN's, the only purpose of route reflectors is to scale BGP sessions on PE's.
With multicast VPN's, route reflectors also provide another function, which is the aggregation of multicast routes. With this functionality route reflectors protect PE's from multicast route churn and volume by shifting this burden onto themselves. In multicast, control plane churn is a result of data plane churn. Multicast updates (Joins/Prunes) are the result of applications joining or leaving a group (of course, not every application's join or leave results in an update at the CE-PE). As a consequence, the multicast BGP control plane acquires the dynamics of the data plane. Yet, it is important for the route reflector complex, being a centralized point of failure, to be stable and independent of the data plane. A route reflector cluster failure causes all multicast routes carried by this cluster to be withdrawn and, hence, affects forwarding in many MVPN's.

Translating multicast routing into BGP might have a negative impact on BGP because of the "soft-state" nature of PIM protocol. In PIM the reliability of routing exchanges is assured through their retransmissions. The periodic refresh of PIM Join/Prunes is not simple to eliminate or reduce and it would require PIM changes that will also affect customer's PIM domain.

In addition, multicast route change frequency is different from unicast routing. It is hard to characterize exactly what rate of PIM Join/Prune messages MPLS/BGP VPN may generate. Unicast updates are the result of routing node configuration changes or failures. A unicast route exists regardless of whether there is a currently active application using this route. Multicast updates (PIM Joins/Prunes) are the result of applications joining or leaving a multicast group. Multicast routing is essentially the creation of state in routers for currently active multicast applications.

5.1.3 MLDP

In multicast LDP (MLDP) (cf.
[v]), as in PIM, trees are built receiver driven, which allows it to scale well with dynamic multicast group membership and large receiver populations. In contrast to PIM, MLDP control messaging uses TCP and, hence, it is reliable and provides flow control.

MLDP can support P2MP trees and MP2MP trees. The first are comparable to PIM SSM trees, the second to PIM Bidir trees. Supporting trees analogous to PIM Sparse Mode poses a difficulty in MLDP. PIM-SM is a combination of two types of trees, the (*,G) tree and the (S,G) tree. To avoid duplicates, packets received via the (*,G) tree on a router that has (S,G) tree must be denied. This is complex in the MPLS network because packets are forwarded using labels and there is no multicast state in the core. To solve this problem, out-of-band signaling can be used to track (*,G) Join/Prune and (S,G,rpt) Prune messages.

As was pointed out in this document, one of the principles of MVPN implementation should be simplification of the inter-PE ASM procedure. Since MLDP already requires out-of-band tracking of shared trees to support PIM-SM, the inter-PE shared (*,G) trees should be bypassed and only SPT (S,G) trees should be built across SP network. This would result in no inter-site RPT-to-SPT traffic switching. Also, to avoid carrying the periodic refresh of PIM Join/Prunes in MLDP, all PIM messages could be carried out-of-band while MLDP would react only to new routing events (new PIM Joins or Prunes).

5.2 Multicast Tunnel and Source Announcements in Multicast MPLS/BGP VPN

A tunnel to be used for sending multicast VPN data traffic across service provider network should not be instantiated either before the source of this traffic is discovered or after the source sends traffic natively on (S,G).
The former leads to duplicate traffic sent to receivers, and the latter to a non-negligible amount of traffic being duplicated and shifted between tunnels.

Once the active sources are discovered, depending on the source discovery method, PE’s participating in an MVPN might have to announce those sources to each other using some signaling method. For each active source, the identity of the tunnel to be used for this source also has to be announced. The tunnels could be receiver- or sender-instantiated. Only receiver-initiated tunnel creation can preserve unicast routing policy without the need for a tunnel announcement technique. The signaling of tunnel information is, in general, not required with multicast LSP’s, but it is necessary if tunnel aggregation is required. The tunnel could be the existing Data-MDT construct of [iv] (with Data-MDT operation modified as described in section 4.4), or other tunnel techniques could be supported. Tunneling technologies are addressed in section 6.

The following two signaling options for Source Announcement could be considered for implementation:

- UDP - discussed in section 4.4;
- BGP - Source Active Auto-discovery route described in [vi].

Similarly, the following two signaling options for Tunnel Announcement could be considered for implementation:

- UDP - modified Data MDT announcement described in section 4.4;
- BGP - new BGP attribute called Provider Multicast Service Interface Tunnel attribute defined in [vi].

6. Multicast MPLS/BGP VPN Data Plane

There are two general categories of multicast MPLS/BGP VPN tunneling technology: MPLS-based and GRE-based.

The MPLS-based tunneling technology consists of the following options:

- Point-to-Multipoint (P2MP) LSP’s with RSVP-TE
- Multicast LDP
- Ingress replication.

The GRE-based tunneling technology consists of the following options:

- PIM-SM
- PIM-SSM
- PIM-Bidir.
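As an illustration only, the two categories and the options listed above can be captured in a small lookup table, for instance in a provisioning script that needs to know which encapsulation family a configured tunnel type belongs to. The names Encapsulation, TUNNEL_OPTIONS, and options_for are hypothetical; they are not defined by this document or by any implementation.

```python
from enum import Enum


class Encapsulation(Enum):
    """The two general tunneling categories named in this section."""
    MPLS = "MPLS-based"
    GRE = "GRE-based"


# Hypothetical lookup table: tunnel option -> category, in listing order.
TUNNEL_OPTIONS = {
    "P2MP LSP with RSVP-TE": Encapsulation.MPLS,
    "Multicast LDP": Encapsulation.MPLS,
    "Ingress replication": Encapsulation.MPLS,
    "PIM-SM": Encapsulation.GRE,
    "PIM-SSM": Encapsulation.GRE,
    "PIM-Bidir": Encapsulation.GRE,
}


def options_for(category: Encapsulation) -> list[str]:
    """Return the tunnel options that belong to one category."""
    return [name for name, cat in TUNNEL_OPTIONS.items() if cat is category]


if __name__ == "__main__":
    for category in Encapsulation:
        print(category.value, "->", ", ".join(options_for(category)))
```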
In this section we discuss and compare these technologies.

RSVP-TE P2MP LSP’s (cf. [vii]) are suitable for one-to-many applications with relatively static receivers. Currently, this is the only available option when traffic-engineered LSP’s are required for multicast. P2MP RSVP-TE LSP’s are not a good fit for layer 3 multicast VPN’s because of the dynamic, receiver-based nature of enterprise multicast applications.

Multicast LDP can support a highly dynamic receiver population. Hence, it is suitable for supporting multicast layer 3 VPN’s in an SP network. Both P2MP and MP2MP LSP’s can be used. MLDP still lacks support for traffic engineering, which is necessary for multicast applications that require strict resiliency (i.e., sub-second rerouting around network failures).

Ingress replication consists of multiple point-to-point LSP’s, one to each of the egress PE’s. Incoming packets must be replicated onto all of the P2P LSP’s at the ingress PE to accommodate multipoint communication, which places the replication burden on the ingress PE. Ingress replication is listed here for completeness; it is not recommended because of its very poor scaling characteristics.

GRE-based tunneling technologies are receiver-driven in nature, which allows them to scale well with dynamic multicast group membership and large receiver populations. GRE-based tunneling is similar to MLDP except that, in contrast to MLDP, it natively supports PIM Sparse Mode procedures.

We have argued in this document that sending inter-PE multicast VPN traffic on a shared tree is not necessary and that, in fact, it can be completely avoided. Any signaling that is still required in order not to break PIM-SM in the VPN customer domain can be sent across the SP network out-of-band. The out-of-band signaling could be over Default-MDT’s, which already provide constrained multicast VPN route distribution. A Default-MDT could be used for signaling regardless of what tunneling technology is used for multicast data traffic.
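To make the scaling concern about ingress replication raised earlier in this section concrete, the following sketch compares the number of copies of each packet that the ingress PE must transmit. It assumes, purely for illustration, that on a multicast tree the ingress PE sends one copy per branch leaving it and that replication toward the remaining egress PE’s happens inside the network. The function names and the example figures (50 egress PE’s, a 5 Mb/s stream, a single branch) are hypothetical.

```python
def ingress_replication_copies(num_egress_pes: int) -> int:
    """Copies sent by the ingress PE: one per P2P LSP, i.e. one per egress PE."""
    if num_egress_pes < 0:
        raise ValueError("number of egress PEs cannot be negative")
    return num_egress_pes


def tree_copies(branches_at_ingress: int) -> int:
    """Copies sent by the ingress PE on a multicast tree.

    Assumption: one copy per branch leaving the ingress PE; the network
    replicates further downstream.
    """
    if branches_at_ingress < 0:
        raise ValueError("number of branches cannot be negative")
    return branches_at_ingress


def ingress_load_mbps(stream_mbps: float, copies: int) -> float:
    """Bandwidth the ingress PE transmits for one stream."""
    return stream_mbps * copies


if __name__ == "__main__":
    egress_pes = 50      # hypothetical MVPN size
    stream_mbps = 5.0    # hypothetical stream rate
    replicated = ingress_load_mbps(stream_mbps, ingress_replication_copies(egress_pes))
    on_tree = ingress_load_mbps(stream_mbps, tree_copies(1))
    print(f"ingress replication: {replicated} Mb/s, tree: {on_tree} Mb/s")
```

With these assumed figures the ingress PE transmits 250 Mb/s with ingress replication and 5 Mb/s on a tree with a single branch; the gap grows linearly with the number of egress PE’s.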
PIM Bidir is a good choice for such Default-MDT’s within a single provider network since it has good scaling properties and is a good fit for carrying MVPN control traffic. It creates only one state per MVPN (with no aggregation of MDT’s). It eliminates the maintenance of source state and, hence, can scale to an arbitrary number of PE’s per MVPN. It still relies on a Rendezvous Point (although only a logical RP) but not on MSDP. In contrast to the RP in PIM SM, the RP in Bidir does not build source trees and does not need to handle the source registration process.

If GRE-based tunneling is used to carry VPN multicast data traffic in the SP network, then PIM SSM mode is the best fit. There is no need to discover “sources” in the SP core network since they are known: they are the PE routers supporting the MVPN service. To optimize the number of MVPN-driven multicast states in the SP network, tunnel aggregation techniques should be implemented [ii].

Like MLDP, IP multicast still needs support for fast recovery around network failures (known as IP multicast fast reroute), as defined in [viii] and [ix].

7. Separating MVPN Data Plane from Control Plane

Based on the procedures defined in this document, the MVPN control plane should be separated from its data plane, and inter-PE multicast VPN routing information should be sent out-of-band. Separating the multicast VPN control plane from its data plane and equipping the data plane with the ability to provide a fast first level of resiliency against failures has many advantages. It relieves the control plane of the complex requirement of rapid restoration from occasional faults while at the same time providing optimal data paths through the network. This allows the control plane to have longer recovery times (seconds to tens of seconds) and hence improves its scalability and decreases its complexity.
The data plane availability and survivability become independent of control plane failures.

8. IANA Considerations

To be supplied.

9. Security Considerations

To be supplied.

10. References

[i] S. Bradner, "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[ii] E. Rosen, R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast. Work in progress.

[iii] E. Rosen, Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[iv] E. Rosen, Y. Cai, I. Wijnands, "Multicast in MPLS/BGP IP VPNs", draft-rosen-vpn-mcast.

[v] I. Minei, I. Wijnands, et al., "Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths", draft-ietf-mpls-ldp-p2mp. Work in progress.

[vi] R. Aggarwal, et al., "BGP Encodings for Multicast in MPLS/BGP IP VPN", draft-raggarwa-l3vpn-2547bis-mcast-bgp. Work in progress.

[vii] R. Aggarwal, "Extensions to RSVP-TE for Point to Multipoint TE LSPs", draft-ietf-mpls-rsvp-te-p2mp. Work in progress.

[viii] S. Bryant, et al., "IP Fast Reroute Using Not-via Addresses", draft-bryant-shand-ipfrr-notvia-addresses. Work in progress.

[ix] S. Bryant, C. Filsfils, S. Previdi, M. Shand, "IP Fast Reroute using Tunnels", draft-bryant-ipfrr-tunnels. Work in progress.

11. Acknowledgments

To be supplied.

12. Author's Addresses

Maria Napierala
AT&T
200 Laurel Avenue, Middletown, NJ 07748
Email: mnapierala@att.com

13. Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

14. Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietfipr@ietf.org.