Network Working Group                                                
   Internet Draft                                       Maria Napierala 
   Document: draft-mnapierala-mvpn-rev-04.txt                      AT&T 
   Expires: August 24 2008                             February 24 2008 
    
    
                     Segmented Multicast MPLS/BGP VPNs  
    
    
Status of this Memo 
    
   By submitting this Internet-Draft, each author represents that any 
   applicable patent or other IPR claims of which he or she is aware 
   have been or will be disclosed, and any of which he or she becomes 
   aware will be disclosed, in accordance with Section 6 of BCP 79. 
    
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that  
   other groups may also distribute working documents as Internet-
   Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time.  It is inappropriate to use Internet-Drafts as reference 
   material or to cite them other than as "work in progress." 
    
   The list of current Internet-Drafts can be accessed at 
        http://www.ietf.org/ietf/1id-abstracts.txt 
   The list of Internet-Draft Shadow Directories can be accessed at 
        http://www.ietf.org/shadow.html. 
 
    
Abstract 
    
   This document describes inter-site signaling procedures in MPLS/BGP 
   IP VPNs that allow the same multicast stream to flow simultaneously 
   on multiple inter-PE paths without duplicates being sent to 
   receivers. Those procedures are independent of the multicast tunnel
   technology used in the service provider network as well as of the
   protocol used to exchange multicast signaling among PE's. The
   document specifies necessary information elements and their exchange 
   process for the desired MVPN operation. 
    
    
Conventions used in this document 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC-2119 [i]. 
 
 
Napierala               Expires - August 2008                [Page 1] 
                  Segmented Multicast MPLS/BGP VPNs 

    
    
    
    
Table of Contents 
   1.   Introduction................................................2 
   2.   Terminology.................................................3 
   3.   Overview of the Solution....................................4 
      3.1  Overview of Inter-PE Procedures for PIM-SM...............5 
   4.   PE-to-PE Signaling Information Elements.....................6 
   5.   Inter-Site Signaling Procedures for PIM-SM..................6 
      5.1  C-Sources Discovered by PIM Control Messages.............7 
      5.2  C-Sources Not Discovered by PIM Control Messages........12 
      5.3  Group-Only S-PMSI Auto-discovery Route..................15 
      5.4  Handling Initial Packets Sent on C-Shared Tree..........16 
      5.5  Using P2MP LSP's as P-Tunnels for C-Shared Trees........17 
   6.   Supporting C-Shared Trees..................................17 
   7.   Support of Anycast C-RP....................................18 
   8.   Inter-Site Signaling Procedures for PIM-SSM................18 
      8.1  C-Receiver Pruning......................................19 
      8.2  P-tunnel Withdrawal for C-S.............................19 
      8.3  Using P2MP LSP's as P-Tunnels for PIM-SSM C-Trees.......19 
   9.   Inter-Site Signaling Procedures for PIM-Bidir..............20 
      9.1  Preventing C-Bidir Packet Loops in MVPN.................20 
      9.2  Active Group P-tunnel Announcement in C-Bidir...........21 
      9.3  Bidir C-Group Becomes Inactive..........................23 
      9.4  P-tunnels for C-Bidir Traffic...........................23 
      9.5  DF-PE Redundancy with Fast Convergence..................24 
      9.6  Using MP2MP LSP's as P-Tunnels for C-Bidir..............24 
   10.  C-Multicast Traffic Aggregation............................25 
   11.  Supporting Source-Specific Host Reports in PIM-SM..........26 
   12.  IANA Considerations........................................27 
   13.  Security Considerations....................................27 
   14.  APPENDIX: Preserving C-Multicast Traffic Patterns in MVPN..27 
   15.  References.................................................32 
   16.  Acknowledgments............................................32 
   17.  Author's Addresses.........................................33 
   18.  Intellectual Property Statement............................33 
   19.  Copyright Notice...........................................33 
    
    
   1. Introduction 
    
   Multicast VPN (cf.[ii]) extends MPLS/BGP VPN services (cf.[iii]) by 
   enabling customers to run native IP multicast within their IP VPN's. 
   From the VPN customer's perspective there is no change in the
   multicast operational model. Multicast distribution trees are built
   in the service provider network to carry VPN multicast traffic.
   Those trees are essentially point-to-multipoint (P2MP) or
   multipoint-to-multipoint
   (MP2MP) tunnels that encapsulate IP VPN multicast packets for
   transport across provider's network. Throughout this document 
   whenever we refer to a VPN we mean MPLS/BGP IP VPN and whenever we 
   refer to an MVPN we mean MPLS/BGP Multicast IP VPN. 
    
   This document defines procedures for exchanging multicast VPN routing 
   that allow for the same multicast stream to traverse multiple inter-
   PE paths without duplicate packets being sent to egress mVRF's. As a 
   consequence, inter-PE C-multicast traffic can flow on multiple 
   tunnels and simultaneously utilize multiple paths in a redundant 
   topology. Different downstream PE's or even different multicast 
   VRF's are allowed to choose different upstream PE's to a customer RP 
   or a customer source. Only a single copy of any C-multicast stream 
   is delivered to any egress mVRF in a converged network. This 
   includes PIM-SM C-streams that are flowing on either a C-shared tree
   or a C-shortest-path tree. According to the procedures defined in
   this document, an egress PE receives a PIM-SM C-stream either from
   the C-RP or directly from the C-source but never from both.
 
   The lack of support of parallel paths for multicast traffic would
   prevent different multicast VRF's of the same VPN from having
   different routing policies and choosing different paths to reach the
   C-RP or the C-source. As a result it would break any kind of
   "anycast" sourcing of
   a multicast stream in IP VPN, including Anycast RP [iv][v] operation 
   by not allowing multiple RP's to send traffic in parallel to their 
   closest receivers.  
    
   The proposed duplicate-free operation of Multicast VPN's is 
   independent of multicast tunnel technology used by the service 
   provider as well as of the protocol used to exchange multicast 
   signaling among PE's. The proposed inter-PE multicast signaling does 
   not impose any restrictions on customer's multicast routing or 
   requirements on multicast service offering, e.g., it does not 
   require customer to outsource its RP functionality to the service 
   provider or service provider to participate in customer's RP 
   protocol by running MSDP with the customer.  
    
   The procedures defined in this document include the support of PIM-
   SM [vii], PIM-SSM [vi], and PIM-Bidir [x] based C-trees.
    
    
   2. Terminology 
    
   In this document we use the "C-" prefix when we refer to the MVPN
   customer multicast addresses and multicast trees. We will prefix
   MVPN customer multicast trees, sources, groups, Rendezvous Points,
   and PIM routes with "C-", as in: C-tree, C-S, C-G, C-RP, (C-*, C-G),
   (C-S, C-G). We use the "P-" prefix when we refer to the provider's
   multicast addresses and multicast trees/tunnels. We assume
   familiarity with PIM protocol [vii][vi][x] and the terminology used 
   in [ii].  
 
 
   3. Overview of the Solution 
    
   In order to support multiple inter-PE trees carrying the same C-
   multicast traffic without duplicate packets at the egress mVRF's, we 
   segment a multicast VPN into sets of multicast VRF's such that each 
   set has the same best route to C-S or C-RP. Each set is served by a 
   different P-tunnel to deliver C-S or C-RP traffic. Each such P-tunnel 
   is rooted at a unique PE that, for a given set of mVRF's, is the best 
   next-hop to C-RP, C-Source, or C-RP Address. This allows for the same 
   C-group or the same C-source traffic to enter provider's network at 
   multiple PE's without creating duplicates to C-receivers.  
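
   The segmentation described above can be pictured with a minimal
   sketch. It assumes each mVRF already knows its best next-hop PE
   towards a given C-S or C-RP; the function name and the input mapping
   are hypothetical and are not defined by this document.

```python
from collections import defaultdict


def segment_mvpn(best_next_hop):
    """Group mVRFs by the PE each one selects as its best next hop.

    best_next_hop maps an mVRF name to the PE it prefers towards one
    C-S or C-RP (hypothetical input; a real PE would derive this from
    the unicast routes in the VRF).
    """
    segments = defaultdict(set)
    for mvrf, upstream_pe in best_next_hop.items():
        segments[upstream_pe].add(mvrf)
    return dict(segments)


# Three mVRFs, two of which prefer PE1 and one of which prefers PE2.
example = {"mvrf-A": "PE1", "mvrf-B": "PE1", "mvrf-C": "PE2"}
segments = segment_mvpn(example)
```

   Each key of the result is the root of one P-tunnel and each value is
   the set of mVRF's served by that tunnel, so no mVRF appears under
   two roots.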
    
   In case of PIM-SM the proposed signaling procedure supports Anycast 
   C-RP's by partitioning the MVPN by C-RP location, i.e. by the 
   upstream PE attached to C-RP. In case of PIM-SM and PIM-SSM the 
   proposed procedure supports partitioning the MVPN by C-source 
   location, i.e. by upstream PE attached to C-S. This allows C-
   multicast traffic to be simultaneously sent from each C-source 
   location to a different set of C-receiver locations. In case of PIM-
   Bidir the proposed signaling procedure supports partitioning the MVPN 
   by C-RPA location, i.e. by upstream PE attached to C-RPA. In PIM-
   Bidir the partitioning of MVPN by C-RPA location avoids multicast 
   packet loops during routing convergence. 
    
   In order to trigger a P-tunnel rooted at a PE attached to C-RP/A or 
   C-source to carry their traffic, the active C-groups and C-sources 
   have to be discovered in provider's network. This is straightforward 
   when C-trees are built with PIM-SSM and PIM-Bidir. In PIM-SSM an 
   active C-S is discovered when a PE attached to C-S receives customer
   initiated (C-S, C-G) Join. In PIM-Bidir a group C-G is discovered 
   when a PE attached to C-RPA receives customer initiated (C-*, C-G) 
   Join. When a PE attached to C-S (in PIM-SSM) or C-RPA (in PIM-Bidir) 
   receives, respectively, (C-S, C-G) Join or (C-*, C-G) Join, it 
   announces a P-tunnel for C-S or C-G traffic rooted at this PE. Those 
   procedures are defined in detail in sections 8 and 9, respectively. 
    
   Supporting multiple inter-PE paths for the same C-multicast flow is 
   more complex in PIM-SM. In sparse mode, assigning the (C-S, C-G) 
   streams to an S-PMSI presupposes that there is a way of discovering 
   the C-sources. Plain PIM-SM does this by examining the data plane to 
   see who is sourcing the (*, G) traffic.  This document proposes 
   instead to discover the C-sources by using the control plane but 
   without requiring a customer to outsource its RP functionality to 
   the service provider or without relying on running MSDP with the 
   customer. In MVPN context, the state created along the SPT from C-RP 
   to C-S can be used by PE's to discover customer sources. An active C-
   source can be discovered by a PE attached to C-RP when it receives 
   (C-S, C-G) Join initiated by C-RP and destined to C-S. Receiving (C-
   S, C-G) Join on a PE attached to C-RP triggers a Source Active 
   advertisement, which, when received by a PE attached to C-S, causes 
   that PE to announce a P-tunnel for (C-S, C-G) traffic. An egress 
   mVRF (i.e., mVRF with receivers of C-G) will join only this P-tunnel 
   for (C-S, C-G) that was announced by its best next hop to C-S. 
    
    
 3.1 Overview of Inter-PE Procedures for PIM-SM  
    
   In native PIM-SM mode the same multicast traffic does not necessarily 
   flow over a single tree but it can simultaneously flow on both shared 
   and shortest path trees, without duplicates being sent to receivers. 
   According to the inter-PE signaling procedures defined in this 
   document a PIM-SM C-stream is never delivered to an egress PE from 
   both the C-RP and directly from the C-source. In order to support 
   this duplicate-free operation of PIM-SM in MVPN, the specified 
   procedures assure that if a (C-S, C-G) stream is carried in an S-
   PMSI, and for the same C-G, the (C-*,C-G) stream is carried in an S-
   PMSI, then the (C-S, C-G) traffic must not be carried in the (C-*, 
   C-G)'s S-PMSI.  
    
   In order to assign (C-S, C-G) stream to an S-PMSI the C-S has to be 
   discovered in MVPN. In order to discover PIM-SM C-sources based on 
   PIM control messages, we decompose PIM-SM C-multicast into two types 
   of topologies: (1) when the C-SPT from C-Source to C-RP is across 
   provider's network and (2) when the C-SPT from C-source to C-RP is 
   outside of provider's network. In topology (1) the C-sources can be 
   discovered based on (C-S, C-G) Joins received by PE on the interface 
   towards the C-RP's. The traffic from such C-sources is carried only 
   on (C-S, C-G) S-PMSI i.e., it is carried on C-SPT's across 
   provider's network. In topology (2) the C-sources cannot be 
   discovered from control messages and the traffic from such sources is 
   carried only on (C-*, C-G) S-PMSI, i.e., it stays on the C-shared
   tree across provider's network. In other words, the (C-*, C-G) S-PMSI
   is only used for those data packets whose source has not been learned
   from PIM control messages.
   This decomposition of PIM-SM routing is explained in detail in 
   Appendix A.  
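
   The rule that separates the two topologies can be sketched as
   follows. This is an illustration only: the function name and the
   set of discovered sources are assumptions, and the case where C-S
   and C-RP are attached to the same PE (section 5.1) is ignored.

```python
def select_s_pmsi(c_source, c_group, discovered_sources):
    """Return the S-PMSI key on which a C-packet is carried.

    discovered_sources is the set of (C-S, C-G) pairs learned from PIM
    control messages (topology 1). Traffic from any other source of
    C-G stays on the shared-tree S-PMSI (topology 2).
    """
    if (c_source, c_group) in discovered_sources:
        return (c_source, c_group)
    return ("*", c_group)


discovered = {("10.1.1.1", "239.1.1.1")}
on_spt = select_s_pmsi("10.1.1.1", "239.1.1.1", discovered)
on_rpt = select_s_pmsi("10.2.2.2", "239.1.1.1", discovered)
```

   Because the function returns exactly one key per packet, a given
   (C-S, C-G) stream is never carried on both S-PMSI's.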
 
   Moreover, the PIM-SM C-multicast signaling defined in this document
   allows for multiple entry points for C-G or C-S traffic into
   provider's network without duplicate packets being sent to egress 
   mVRF's. This is because according to the specified procedures, 
   multicast traffic from a customer source or from a customer RP is 
   never sent to a downstream multicast VRF over a tunnel that is not 
   rooted at this mVRF's best next-hop PE towards the source or the RP. 
    
   We observe further that it is not necessary to perform customer 
   initiated RPT-to-SPT switchover across provider's network. The 
   procedures defined in this document discover customer sources by
   observing the (C-S, C-G) Join messages from the C-RP. Such C-source 
   discovery mechanism does not depend on receiving SPT Joins from sites 
   attached to receivers and thus avoids customer-initiated inter-PE 
   RPT-to-SPT switchover. According to the procedure defined in this 
   document, inter-PE C-multicast traffic is being sent either only on 
   SPT's or on shared trees, regardless of whether it was or wasn't 
   switched to SPT's in customer domain. This avoids significant shifts 
   of traffic in provider's network and leads to simplification of PE-
   to-PE multicast routing. The following PIM messages are eliminated 
   between PE's: (C-S, C-G, rpt) Prunes and customer initiated (C-S, C-
   G) Joins associated with C-RPT to C-SPT switchover. The latter 
   elimination has only one exception associated with dually homed 
   receiver sites where C-RPT and C-SPT diverge (defined in section 
   5.1.1). 
    
    
   4. PE-to-PE Signaling Information Elements 
    
   The following information elements are required in support of the 
   multicast signaling procedures defined in this document: 
   - active C-source announcements 
   - P-tunnel announcements and withdrawals for (C-*, C-G) traffic 
   - P-tunnel announcements and withdrawals for (C-S, C-G) traffic. 
   When BGP is used as an auto-discovery mechanism in MVPN, a new BGP 
   NLRI (MCAST-VPN) is already defined in [viii] to handle different 
   route types in MVPN. For active C-source announcements, Source Active 
   auto-discovery route defined in [viii] can be used. The P-tunnel 
   announcements and withdrawals for (C-S, C-G) traffic can use S-PMSI 
   auto-discovery route also defined in [viii]. The S-PMSI auto-
   discovery route for P-tunnel announcements and withdrawals for (C-*, 
   C-G) traffic is defined in section 5.3 of this document.
   Optionally, there can be an additional route type defined for active 
   C-group announcements. This route type and its purpose are defined in 
   section 9.2.2 of the document. 
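
   As an aid to reading the procedures that follow, the three required
   information elements can be modeled as plain records. This sketch
   is not the BGP encoding of [viii]; the class and field names are
   hypothetical and carry only the information named in the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceActive:
    """Active C-source announcement."""
    c_source: str
    c_group: str


@dataclass(frozen=True)
class PTunnelEvent:
    """P-tunnel announcement or withdrawal rooted at root_pe."""
    root_pe: str
    c_group: str
    c_source: Optional[str] = None  # None stands for (C-*, C-G)
    withdraw: bool = False

    def key(self):
        # Key under which a receiving PE would store the tunnel.
        return (self.c_source or "*", self.c_group)


shared = PTunnelEvent(root_pe="PE2", c_group="C-G")
source = PTunnelEvent(root_pe="PE1", c_group="C-G", c_source="C-S")
```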
    
    
   5. Inter-Site Signaling Procedures for PIM-SM  
    
   An MVPN source C-S and its C-RP could communicate either across 
   provider's network or outside of provider's network. In either 
   topology, a PE attached to C-RP, upon receiving (C-*, C-G) PIM Join 
   from another PE or from a locally attached site, will send (C-*, C-G) 
   Join towards the C-RP. This PE will also announce a P-tunnel for the 
   group C-G to all PE's in a given MVPN and it will add the P-tunnel 
   interface to (C-*, C-G) outgoing interface list (olist).  
   There could be more than one PE to which the same C-RP is attached.
   This could be because the C-RP is multi-homed or because it is an
   Anycast-RP.
   Each PE that is attached to the C-RP and receives (C-*, C-G) Join 
   will announce a distinct P-tunnel for C-G. This allows for the same 
   C-G traffic to enter provider's network at multiple ingress points.
   Different PE's attached to receivers of C-G may receive C-G traffic 
   on different P-tunnels without duplicate packets sent to receivers. 
    
   An egress PE, or more precisely an mVRF of a given MVPN attached to 
   receiver(s) of C-G will "join" or participate in only that C-G tunnel 
   which was announced by mVRF's best next-hop PE to C-RP. If there is 
   more than one best next-hop PE to C-RP in the mVRF, the egress PE 
   will choose as the next-hop the PE with the highest IP address or it 
   may utilize multicast multipath load splitting algorithm when there 
   are multiple C-RP's behind the same PE's. All PE's in the given MVPN 
   will store the C-G's P-tunnel information until they receive the P-
   tunnel withdrawal message for C-G. The conditions for C-G P-tunnel 
   withdrawal are defined in section 5.1.3. 
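
   The upstream PE selection with its tie-break can be sketched as
   below. The cost metric and the function name are assumptions made
   for the example; only the "highest IP address" tie-break comes from
   the text above.

```python
import ipaddress


def choose_upstream_pe(candidates):
    """Pick the upstream PE among the next hops towards C-RP.

    candidates is a list of (pe_address, cost) tuples, where a lower
    cost is better (the metric is an assumption of this sketch).
    Ties are broken by the numerically highest IP address.
    """
    best_cost = min(cost for _, cost in candidates)
    tied = [addr for addr, cost in candidates if cost == best_cost]
    # Compare as addresses, not as strings: "10.0.0.11" > "10.0.0.2".
    return max(tied, key=ipaddress.ip_address)


upstream = choose_upstream_pe(
    [("10.0.0.2", 10), ("10.0.0.11", 10), ("10.0.0.200", 20)]
)
```

   The egress mVRF would then join only the C-G P-tunnel announced by
   the selected PE and ignore tunnels announced by the others.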
    
   In the meantime, a VPN source C-S might have sent a PIM Register
   message to C-RP with multicast data encapsulated in it. The C-RP
   extracts the multicast data packet from the Register message and
   sends it to MVPN
   over the P-tunnel for group C-G. If the P-tunnel is not built yet, 
   which is very unlikely because the P-tunnel creation was triggered 
   upon receiving the first (C-*, C-G) Join, the initial data packet(s) 
   to be sent across provider's network will be dropped. We describe the 
   probability of dropping the initial C-multicast traffic in section 
   5.4. 
    
   From this point on, depending on whether the SPT from C-S to C-RP is 
   built across provider's network or outside of provider's network, the 
   inter-PE procedures differ. They are defined in sections 5.1 and 5.2, 
   respectively. 
    
    
 5.1 C-Sources Discovered by PIM Control Messages 
    
   A PE with attached C-RP site, as PE2 in Figure 1 in Appendix A, upon 
   receiving (C-S, C-G) PIM Join from CE attached to C-RP (CE2 in Figure 
   1), will create (C-S, C-G) state and will add the CE-PE interface to 
   its olist. The olist of the (C-S, C-G) entry is also populated with 
   a copy of the olist from the (C-*, C-G) entry except the P-tunnel 
   used for C-G traffic. This is to avoid duplicate traffic, i.e. the 
   same C-S traffic being sent on both shortest-path tree as well as 
   shared-tree across provider's network. The PE attached to C-RP will 
   propagate (C-S, C-G) Join toward C-S. 
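
   The olist construction on the PE attached to C-RP can be written as
   a short sketch. Interface names are hypothetical strings; the logic
   follows the paragraph above.

```python
def build_sg_olist(ce_interface, star_g_olist, c_g_ptunnel):
    """Outgoing interfaces of a new (C-S, C-G) entry on the C-RP PE.

    The entry gets the CE-PE interface on which the Join arrived plus
    a copy of the (C-*, C-G) olist, except the P-tunnel used for C-G.
    """
    olist = {ce_interface}
    olist |= {ifc for ifc in star_g_olist if ifc != c_g_ptunnel}
    return olist


sg_olist = build_sg_olist(
    ce_interface="to-CE2",
    star_g_olist={"ptunnel-C-G", "to-CE7"},
    c_g_ptunnel="ptunnel-C-G",
)
```

   Leaving the C-G P-tunnel out of the result is what keeps C-S traffic
   off the shared tree across provider's network.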
    
   (NOTE that if there is a receiver C-R of C-G at the C-RP site, it 
   might happen that the 1st (C-S, C-G) Join that arrives at the PE 
   attached to this site is from the C-R rather than from C-RP. This 
   does not change the outcome and is transparent to the proposed 
   procedure.) 
    
   When a site with C-S and a site with C-RP are attached to the same PE 
   (as C-S2 and C-RP in Figure 1), this PE, upon receiving the first C-S
   packet on (C-S, C-G) state, will start sending (C-S, C-G, rpt) Prunes 
   towards the C-RP. This is to stop receiving C-S traffic over the C-
   shared tree, i.e., to stop receiving packets de-capsulated from 
   Register messages. The traffic arriving on C-RPT tree will eventually 
   stop flowing when the Register Stop message from C-RP is received by 
   the C-S. This will result in no more (C-S, C-G, rpt) Prunes being 
   sent to the C-RP. To optimize further the traffic flow, the PE 
   attached to C-RP should use so-called "turnaround rules" to prevent 
   multicast traffic from unnecessarily reaching the C-RP if there are 
   no interested receivers behind it. 
    
   In case a site with C-S and a site with C-RP are attached to the same 
   PE, this PE will not announce a new P-tunnel for (C-S, C-G) traffic 
   and it will send the C-S traffic over already announced P-tunnel for 
   C-G. 
    
   In case the C-S is not attached to the same PE as C-RP (as C-S1 in 
   Figure 1), the PE attached to C-RP will announce the active source C-
   S of C-G to all PE's in a given MVPN. Upon receiving active source C-
   S announcement message, a PE that is the next-hop to source C-S (as 
   PE1 in Figure 1) will send a P-tunnel announcement for (C-S, C-G) 
   traffic to all PE's in the MVPN. The PE's will store the C-S P-tunnel 
   information until they receive the P-tunnel withdrawal message for 
   (C-S, C-G). A PE that does not have any interested receivers for C-G
   when it receives the (C-S, C-G) P-tunnel announcement message will
   still store this information so that it can join the P-tunnel for
   late receivers. The conditions for (C-S, C-G) P-tunnel withdrawal are
   defined in sections 5.1.3 and 5.1.4. If C-S is dually connected to 
   two different PE's, both of those PE's will announce their distinct 
   P-tunnels for C-S traffic.  
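
   The exchange in this paragraph can be modeled with a small sketch.
   The PE class and the function below are hypothetical; they only
   illustrate that every PE next to C-S answers a Source Active
   announcement with its own P-tunnel announcement and that all PE's
   store the result, with or without receivers.

```python
class PE:
    """Minimal model of a PE in one MVPN (hypothetical structure)."""

    def __init__(self, name, attached_sources=()):
        self.name = name
        self.attached_sources = set(attached_sources)
        # (C-S, C-G) -> names of PEs that announced a P-tunnel for it
        self.tunnel_info = {}


def source_active(c_s, c_g, pes):
    """Deliver a Source Active announcement for (c_s, c_g) to all PEs.

    Every PE that is a next hop to c_s announces a P-tunnel, and every
    PE stores the announcements. Returns the announcing PE names.
    """
    roots = [pe.name for pe in pes if c_s in pe.attached_sources]
    for pe in pes:
        if roots:
            pe.tunnel_info.setdefault((c_s, c_g), set()).update(roots)
    return roots


# C-S1 is dually connected to PE1 and PE4; PE2 has no source attached.
pe1, pe2, pe4 = PE("PE1", {"C-S1"}), PE("PE2"), PE("PE4", {"C-S1"})
roots = source_active("C-S1", "C-G", [pe1, pe2, pe4])
```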
    
   The PE's attached to receivers of C-G, upon receiving the P-tunnel 
   announcement for (C-S, C-G) traffic, will initiate (C-S, C-G) Joins 
   based on (C-*, C-G) PIM Joins received from locally attached CE's. 
   Each such egress PE will send (C-S, C-G) Join to the best next-hop PE 
   towards C-S in an mVRF of the specified MVPN. The egress PE will also 
   connect to the P-tunnel announced by the best next-hop PE to C-S in 
   the mVRF. Egress PE's will continue participating in the C-shared 
   tree to receive traffic from all other C-sources sending to C-G.  
    
   If there is more than one best next-hop to C-S in the mVRF (i.e., 
   there are multiple equal cost paths), the egress PE will choose as 
   the next-hop the PE with the highest IP address. A PE might utilize a
   multicast multipath load splitting algorithm if there are multiple C-
   sources behind the same PE's. All PE's have to use the same load
   splitting algorithm in order to choose the same upstream PE for the 
   same C-S.   
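
   This document does not specify the load splitting algorithm. The
   sketch below assumes a simple hash of the C-source address, purely
   to illustrate the requirement that every PE derives the same
   upstream PE for the same C-S; the function name is hypothetical.

```python
import hashlib


def split_upstream(c_source, candidate_pes):
    """Deterministically map a C-source to one of the tied PEs.

    The candidates are sorted first so that the outcome does not
    depend on the order in which a PE learned its next hops.
    """
    ordered = sorted(candidate_pes)
    digest = hashlib.sha256(c_source.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(ordered)
    return ordered[index]


# Two egress PEs that learned the same candidates in different order.
choice_a = split_upstream("10.1.1.1", ["PE1", "PE2"])
choice_b = split_upstream("10.1.1.1", ["PE2", "PE1"])
```

   Any deterministic function of C-S and the candidate set would serve
   equally well, as long as all PE's in the MVPN run the same one.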
    
   The P-tunnel announced for (C-S, C-G) traffic is also joined by the
   PE attached to C-RP that has (C-S, C-G) state with the interface 
   towards C-RP in its olist (as PE2 in Figure 1). This is in order for 
   C-RP to receive C-S traffic natively on the C-SPT. When the first C-S 
   packet arrives over C-S P-tunnel at the PE attached to C-RP (PE2 in 
   Figure 1), this PE will start sending (C-S, C-G, rpt) Prunes towards 
   the C-RP. This is in order to stop receiving C-S traffic over the C-
   shared tree, i.e., to stop receiving packets de-capsulated from 
   Register messages. The traffic arriving on C-RPT tree will eventually 
   stop flowing when the Register Stop message, sent by C-RP, is 
   received by the C-S and no more (C-S, C-G, rpt) Prunes will be sent 
   to the C-RP. To optimize further the traffic flow, the PE attached 
   to C-RP should use so-called "turnaround rules" to prevent multicast 
   traffic from unnecessarily reaching the C-RP if there are no 
   interested receivers behind it. 
       
   Upon receiving packets directly from a source C-S, customer last-hop 
   routers might switch to SPT and send (C-S, C-G) Joins towards the C-
   S. When the SPT between C-RP and C-S is built across provider's 
   network, regardless of whether C-RP and C-S are attached to the same PE
   or different PE's, egress PE's do not need to propagate the (C-S, C-
   G) Join towards C-S. More precisely, when C-RP and C-S are attached 
   to different PE's, egress PE does not need to propagate (C-S, C-G) 
   Join received from locally attached CE because in this scenario 
   egress PE's have already switched to SPT when P-tunnel for C-S was 
   announced. When C-RP and C-S are attached to the same ingress PE, 
   egress PE does not need to propagate (C-S, C-G) Join received from 
   locally attached CE because in this scenario the ingress PE has 
   already joined the source C-S and pruned C-S traffic from the C-
   shared tree. 
    
    
     5.1.1 Dually Connected C-Receivers 
    
   In this section we describe a scenario where a dually homed VPN site 
   with receiver(s) chooses a different next-hop PE depending on whether 
   a shared (C-*, C-G) tree or source (C-S, C-G) tree is joined. This 
   means that shared and source trees diverge at this site. 
     
                              C-S         C-RP
                               |           |
                              CE1          CE2
                              / \           |
                             /   \          |
                           PE1   PE2       PE3
                           |     |         |
                           Provider's  Network
                                |         |
                               PE4       PE5
                              ^  \      /   ^
                    (C-*,C-G) |   \    /    | (C-S,C-G)
                       Join   |     CE3     |    Join
                                     |
                                     |
                                    C-R
    
                   Figure 3: Dually connected C-Receiver 
    
   Figure 3 depicts an example of such scenario. Customer receiver C-R 
   is dually connected to provider's network via PE4 and PE5. Let's 
   assume that C-RPT and C-SPT diverge at CE3 and that PE4 is on C-RPT 
   and PE5 is on C-SPT for (C-S, C-G). Let's also assume that PE1 is the 
   best next-hop PE to C-S on PE4 and that PE2 is the best next-hop PE 
   to C-S on PE5.   
    
   When a dually connected VPN receiver site switches from shared to 
   shortest path tree, the egress PE on C-SPT (PE5 in Figure 3) will 
   receive (C-S, C-G) Join from this site, while it never received (C-*, 
   C-G) Join from it before. The egress PE will create (C-S, C-G) state, 
   if it does not exist yet, and will add the interface on which it 
   received (C-S, C-G) Join to its olist. If there is already (C-*, C-G) 
   state in the same multicast VRF, the olist of (C-*, C-G) entry is 
   copied into the olist of new (C-S, C-G) entry. This is a standard PIM 
   procedure to allow C-S traffic to flow to (C-*, C-G) receivers. If C-
   S and C-RP are not attached to the same PE and if the egress PE 
   received a P-tunnel announcement for (C-S, C-G) traffic from the best 
   next-hop PE to C-S in the specified mVRF (the latter condition 
   guarantees that C-RP and C-S communicate across provider's network), 
   the egress PE will propagate (C-S, C-G) Join towards C-S. This is to 
   cover the case when C-S is dually connected and the egress PE on C-
   RPT (as PE4 in Figure 3) chooses a different upstream PE to C-S than 
   the egress PE on C-SPT (as PE5 in Figure 3).  
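
   The two conditions under which the egress PE on C-SPT propagates
   the (C-S, C-G) Join can be stated as a predicate. The parameter
   names are assumptions made for this sketch.

```python
def should_propagate_sg_join(cs_and_crp_same_pe, announcers,
                             best_next_hop_pe):
    """Decide whether the egress PE on C-SPT sends the Join upstream.

    announcers is the set of PEs from which a (C-S, C-G) P-tunnel
    announcement was received; best_next_hop_pe is the best next hop
    to C-S in the mVRF in question.
    """
    return (not cs_and_crp_same_pe) and best_next_hop_pe in announcers


# PE5 in Figure 3: its best next hop to C-S is PE2, which announced.
propagate = should_propagate_sg_join(False, {"PE1", "PE2"}, "PE2")
```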
    
   The egress PE on C-SPT will join the P-tunnel for either C-G or C-S 
   of C-G if it was not joined yet. A PE always joins the most specific 
   P-tunnel that was announced for (C-S, C-G) traffic, i.e., it will 
   only join a P-tunnel that was announced for the C-G if there was no 
   P-tunnel announcement for the C-S of the C-G.  
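
   The "most specific P-tunnel" rule amounts to a two-step lookup. In
   this sketch the table layout and tunnel identifiers are
   hypothetical.

```python
def most_specific_ptunnel(announced, c_source, c_group):
    """Return the P-tunnel to join for (c_source, c_group) traffic.

    announced maps (C-S, C-G) or ("*", C-G) keys to a tunnel id.
    The (C-S, C-G) tunnel wins; the C-G tunnel is used only if no
    tunnel was announced for the C-S of the C-G.
    """
    if (c_source, c_group) in announced:
        return announced[(c_source, c_group)]
    return announced.get(("*", c_group))


announced = {("*", "C-G"): "tunnel-G", ("C-S1", "C-G"): "tunnel-S1"}
for_s1 = most_specific_ptunnel(announced, "C-S1", "C-G")
for_s2 = most_specific_ptunnel(announced, "C-S2", "C-G")
```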
    
   Once a multicast packet is received on the C-SPT at a dually 
   connected site, the PE which is on the C-RPT will receive a (C-S, C-G,
   rpt) Prune message from that site to prune C-S traffic off the C-
   shared tree. The PE on the C-RPT (as PE4 in Figure 3) does not need
   to propagate the (C-S, C-G, rpt) Prune message to C-RP, regardless of
   whether C-RP and C-S are attached to the same or different PE's. This
   is because C-S has been already pruned off the C-shared tree. The PE 
   on the C-RPT might also stop joining the P-tunnel for (C-S, C-G) if 
   there are no other receivers for (C-S, C-G) attached to it (i.e., if 
   C-S traffic was pruned off on all (C-*, C-G) outgoing interfaces).  
    
    
 
 
     5.1.2 C-Shared Tree Switchback 
    
   If a site attached to egress PE switches back from C-SPT to C-RPT 
   because C-S traffic rate fell below the SPT-threshold, the PE on C-
   RPT will receive (C-*, C-G) Join to rejoin the shared tree. Since 
   this (C-*, C-G) Join is sent without a (C-S, C-G, rpt) Prune it will 
   cause the (C-S, C-G) Prune state along C-RPT to be deleted, which in 
   turn will permit (C-S, C-G) traffic to begin flowing down the C-RPT 
   again. If the egress PE stopped participating in the P-tunnel for C-
   S it has to rejoin this tunnel to receive the C-S traffic. 
    
   When a customer site switches back from C-SPT to C-RPT, the PE on the 
   C-SPT attached to this site will receive (C-S, C-G) Prune message. In 
   general, the egress PE does not need to propagate the (C-S, C-G)
   Prune message to a PE attached to C-S, even if C-S and C-RP are not 
   attached to the same PE. This is because in this scenario, inter-PE 
   C-trees are always SPT's. However, there is one exception, namely 
   when SPT and RPT diverge at a dually connected site, as described in 
   section 5.1.1. In this scenario, given that C-S and C-RP are 
   attached to different PE's, when the egress PE receives (C-S, C-G) 
   Prune message it will remove the interface on which it received the 
   Prune from the olist for (C-S, C-G).  If the olist for (C-S, C-G) is 
   empty, the egress PE on C-SPT will send (C-S, C-G) Prune message up 
   the C-SPT. It will also stop joining the P-tunnel for (C-S, C-G) 
   traffic. This is to cover the case when C-S is dually connected and 
   the egress PE on C-SPT (as PE5 in Figure 3) chooses a different 
   upstream PE to C-S than the egress PE on C-RPT (as PE4 in Figure 3).  
    
    
     5.1.3 C-Receiver Pruning and P-tunnel Withdrawal  
    
   An egress PE will send (C-*, C-G) Prune message towards C-RP when the 
   olist for (C-*, C-G) in an mVRF of a given MVPN becomes empty. The C-
   RP could be locally attached to this PE or it can be attached to a 
   different PE. In the latter case, the mVRF with empty olist for (C-*, 
   C-G) will stop joining C-G P-tunnel announced by its best next-hop to 
   C-RP. The egress PE will keep the C-G P-tunnel information in case it 
   receives a new (C-*, C-G) Join from a locally attached site. This PE 
   will also send (C-S, C-G) Prunes for all C-sources for which it 
   triggered SPT's in the specified mVRF. The mVRF will also stop 
   participating in P-tunnels announced for those C-sources but the P-
   tunnel information will be kept on the egress PE until it receives C-
   S tunnel withdrawals.  
    
   The state (C-*, C-G) is removed from a PE, or more specifically from 
   an mVRF, attached to C-RP when its olist for (C-*, C-G) becomes 
   empty. This means that the P-tunnel for C-tree rooted at this PE is 
   no longer needed. Upon (C-*, C-G) state removal the PE attached to 
   C-RP will send the P-tunnel withdrawal message for C-G. It will also 
   stop joining the P-tunnels for (C-S, C-G) that it previously joined 
   and it will remove their P-tunnel information.  
    
   Upon receiving a C-G tunnel withdrawal message, all PE's in a given 
   MVPN will remove the C-G tunnel information. Every egress PE that 
   previously joined this C-G tunnel in any of its mVRF's will also 
   remove information about any P-tunnel for C-S of C-G associated with 
   those mVRF's. 
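
   The withdrawal handling on an egress PE can be illustrated with a 
   minimal sketch. The MvrfTunnelState class, its field names and the 
   string tunnel identifiers are assumptions made for illustration 
   only; they are not defined by this document. 

```python
from dataclasses import dataclass, field


@dataclass
class MvrfTunnelState:
    """Hypothetical per-mVRF P-tunnel bookkeeping (illustration only)."""
    # C-G -> P-tunnel id announced for the group
    group_tunnels: dict = field(default_factory=dict)
    # (C-S, C-G) -> P-tunnel id announced for the source
    source_tunnels: dict = field(default_factory=dict)
    # P-tunnel ids this mVRF currently participates in
    joined: set = field(default_factory=set)


def on_group_tunnel_withdrawal(mvrf: MvrfTunnelState, c_g: str) -> None:
    """Process a C-G tunnel withdrawal message in one mVRF."""
    # Every PE removes the C-G tunnel information.
    tunnel = mvrf.group_tunnels.pop(c_g, None)
    had_joined = tunnel in mvrf.joined
    mvrf.joined.discard(tunnel)
    if had_joined:
        # An mVRF that had joined the C-G tunnel also removes the
        # P-tunnel information for every C-S of this C-G.
        for key in [k for k in mvrf.source_tunnels if k[1] == c_g]:
            mvrf.joined.discard(mvrf.source_tunnels.pop(key))
```

   In this sketch an mVRF that never joined the C-G tunnel only drops 
   the stored C-G tunnel information and leaves any C-S entries alone. 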
    
    
     5.1.4 C-Source Becomes Inactive 
    
   The state (C-S, C-G) expires or is removed on a PE attached to C-S 
   when C-S stops sending traffic and/or the state (C-S, C-G) was pruned 
   by the PE because there were no receivers for this traffic (the 
   latter condition was described in section 5.1.3).  
    
   When (C-S, C-G) state expires on the PE attached to C-S because C-S 
   becomes inactive, this PE will send a P-tunnel withdrawal message for 
   (C-S, C-G) to all PE's in a given MVPN. Upon receiving C-S P-tunnel 
   withdrawal message, PE's attached to receivers of C-G (including the 
   PE attached to C-RP) will stop joining this P-tunnel and will remove 
   this P-tunnel information. After C-S stops sending traffic, the (C-S, 
   C-G) state will also expire on PE's attached to receivers of (C-S, C-
   G).  
   Upon receiving the C-S P-tunnel withdrawal message, the PE attached 
   to C-RP of C-G will, if applicable, stop sending periodic (C-S, C-G, 
   rpt) Prune messages towards the C-RP. 
    
    
 5.2 C-Sources Not Discovered by PIM Control Messages 
    
   Even if from provider's network perspective C-S and C-RP are 
   reachable via different PE's or via different interfaces on the same 
   PE, the SPT between the C-S and the C-RP could be engineered by a 
   customer to be outside of provider's network. See Figure 1a in 
   Appendix A. When the SPT from C-S to C-RP is built outside of 
   provider's network, the C-S cannot be discovered via control 
   messages. In this scenario, the C-S traffic will be carried over C-
   shared tree between PE's. Moreover, the inter-PE signaling is 
   simplified by not switching to C-SPT's at the egress PE's at all. 
   Hence, C-trees will be the shared trees from egress PE's to C-RP's, 
   regardless of whether customer last-hop routers switched to SPT's.  
    
   (NOTE that if there is a receiver C-R of C-G at the C-RP site and the 
   source-tree from C-R to C-S is across provider's network while the 
   source-tree from C-RP to C-S is engineered to be outside provider's 
   network, then the PE attached to this site will receive (C-S, C-G) 
   Join. In this scenario the C-S will be discovered and announced in 
   MVPN, following the procedure defined in section 5.1.) 
 
 
    
   Traffic from all C-sources that can't be discovered in MVPN is kept 
   on the same P-tunnel, regardless of whether it is flowing on the 
   shared tree or a source tree in the customer network. This is the 
   P-tunnel that was announced for the group C-G by the PE attached to 
   C-RP. In fact, this 
   procedure allows for further aggregation of traffic without 
   generating duplicates. Namely, the traffic for all C-G's for which 
   the C-RP is the active RP could be aggregated onto the same P-tunnel. 
   Such aggregation may cause loss of bandwidth optimality by delivering 
   traffic to PE's that don't need it, but it will not generate 
   duplicate traffic to C-receivers. 
    
   Upon receiving packets directly from source C-S, customer last-hop 
   routers might switch to SPT's and send (C-S, C-G) Joins. However, the 
   egress PE that received a (C-S, C-G) Join from a locally attached CE 
   will not propagate it to C-S and the egress PE will not switch to 
   C-SPT's. This includes the topologies where the PE attached to C-S is 
   either the same as or different from the PE attached to C-RP. In 
   addition, when C-RP and C-S are attached to the same PE, there is no 
   switching to C-SPT's regardless of whether C-RP and C-S are behind 
   the same or different CE's. 
    
    
     5.2.1 Dually Connected C-Receivers 
    
   There is one scenario that needs to be separately addressed, namely a 
   dually homed VPN receiver site with shared and source trees 
   diverging. 
    
                                C-S   C-RP 
                                  \    / 
                                   \  / 
                                    R-1  
                                     |           
                                    CE1            
                                   /   \            
                                  /     \             
                                 PE1    PE2        
                                  |      | 
                                  |      |           
                             Provider's Network 
                                |         | 
                                |         | 
                               PE3       PE4 
                              ^  \       /  ^ 
                   (C-*,C-G)  |   \     /   | (C-S,C-G) 
                     Join     |     CE2     |   Join 
                                     | 
                                    C-R 
    
 
 
                   Figure 4: Dually connected C-Receiver 
    
   Figure 4 depicts an example of such a scenario. Customer receiver C-R 
   is dually connected to provider's network via PE3 and PE4. Let's 
   assume that C-RPT and C-SPT diverge at CE2 and that PE3 is on C-RPT 
   and PE4 is on C-SPT for (C-S, C-G). Let's also assume that PE1 is the 
   best next-hop PE to C-RP on PE3 and that PE2 is the best next-hop PE 
   to C-RP on PE4. 
    
   When such a dually connected site switches from shared to shortest 
   path tree, the egress PE on C-SPT (PE4 in Figure 4) will receive a 
   (C-S, C-G) Join message from this site. The egress PE on C-SPT will 
   create (C-S, C-G) state in the relevant mVRF, if it does not exist 
   yet, and it will add the site's interface to the (C-S, C-G) olist. 
   If there is already (C-*, C-G) state in the same multicast VRF, the 
   olist of the (C-*, C-G) entry is copied into the olist of the new 
   (C-S, C-G) entry. This is a standard PIM procedure to allow C-S 
   traffic to flow to (C-*, C-G) receivers. The egress PE on C-SPT will 
   not propagate the (C-S, C-G) Join towards C-S because there is no 
   C-RPT to C-SPT switching across provider's network. The egress PE on 
   C-SPT will convert (C-S, C-G) Joins to (C-*, C-G) Joins and will 
   send them to its upstream PE towards the C-RP. This is necessary 
   because the best next-hop to C-RP on the egress PE on C-SPT (as PE4 
   in Figure 4) might be different than the best next-hop to C-RP on 
   the egress PE on C-RPT (as PE3 in Figure 4). The egress PE will join 
   the P-tunnel announced for C-G by the best next-hop PE to C-RP in 
   the relevant mVRF, if it did not join it yet. 
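
   The Join handling on the egress PE on C-SPT can be sketched as 
   follows. This is an illustration under stated assumptions: the 
   mVRF state is modeled as a plain dictionary from a state key to an 
   olist (a set of interface names), with "*" standing for C-*; the 
   function name and this representation are hypothetical. 

```python
def on_source_join(mvrf_state: dict, c_s: str, c_g: str, iface: str) -> tuple:
    """Process a (C-S, C-G) Join received from a locally attached site.

    mvrf_state maps ("*", c_g) or (c_s, c_g) to an olist (a set of
    interface names). Returns the Join to send to the upstream PE.
    """
    key = (c_s, c_g)
    if key not in mvrf_state:
        # A new (C-S, C-G) entry inherits the (C-*, C-G) olist, if any.
        mvrf_state[key] = set(mvrf_state.get(("*", c_g), set()))
    mvrf_state[key].add(iface)
    # There is no C-RPT to C-SPT switching across the provider's
    # network, so the Join is converted to (C-*, C-G) and sent towards
    # the C-RP instead of being propagated towards C-S.
    return ("*", c_g)
```

   The returned (C-*, C-G) Join is what the PE would send to its own 
   best next-hop PE towards the C-RP; joining the C-G P-tunnel announced 
   by that PE is left out of the sketch. 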
    
   Once multicast traffic is received on the C-SPT at the dually 
   connected site, the PE which is on the C-RPT will start receiving 
   (C-S, C-G, rpt) Prune messages to prune C-S traffic off the C-shared 
   tree. The egress PE will not propagate the (C-S, C-G, rpt) Prune 
   towards C-RP because the C-RPT will not be switched to C-SPT across 
   provider's network. 
    
    
     5.2.2 C-Shared Tree Switchback 
    
   If a site attached to an egress PE switches back from C-SPT to C-RPT 
   because C-S traffic rate fell below the SPT-threshold, the PE on C-
   RPT will receive (C-*, C-G) Join from a customer site to rejoin the 
   shared tree. Since (C-*, C-G) Join will be sent without a (C-S, C-G, 
   rpt) Prune this will cause the (C-S, C-G) Prune state along C-RPT to 
   be deleted, which will permit (C-S, C-G) traffic to begin flowing 
   down the C-RPT again.  
   In case a receiver site is dually connected and it receives the C-S 
   traffic on C-RPT, it will send (C-S, C-G) Prune message to the PE on 
   C-SPT. The PE on C-SPT will prune the interface on which it received 
   (C-S, C-G) Prune message off the C-SPT. If its olist for (C-S, C-G) 
   is empty and there is no (C-*, C-G) state or olist for (C-*, C-G) 
   becomes empty, the egress PE on C-SPT will stop sending (C-*, C-G) 
   Joins towards C-RP and it will also stop joining the P-tunnel for C-G 
   traffic. This stops unneeded traffic from being sent to the egress 
   PE. 
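
   The prune decision on the PE on C-SPT can be sketched as follows. 
   As an assumption for illustration, mVRF state is a dictionary from 
   a state key to an olist (a set of interface names), with "*" 
   standing for C-*; the function name is hypothetical. 

```python
def on_source_prune(mvrf_state: dict, c_s: str, c_g: str, iface: str) -> bool:
    """Process a (C-S, C-G) Prune received from a locally attached site.

    Returns True when the PE should stop sending (C-*, C-G) Joins
    towards C-RP and stop joining the P-tunnel for C-G.
    """
    # Prune the receiving interface off the C-SPT.
    src_olist = mvrf_state.get((c_s, c_g), set())
    src_olist.discard(iface)
    # A missing (C-*, C-G) entry is treated like an empty olist.
    shared_olist = mvrf_state.get(("*", c_g), set())
    return not src_olist and not shared_olist
```

   A True result corresponds to the case where nothing on this PE still 
   needs C-G traffic, so remaining on the C-G P-tunnel would only 
   attract unneeded traffic. 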
    
    
     5.2.3 C-Receiver Pruning and P-tunnel Withdrawal 
    
   An egress PE will send (C-*, C-G) Prune message towards C-RP when the 
   olist for (C-*, C-G) becomes empty in an mVRF. The C-RP could be 
   locally attached to this PE or it can be attached to a different PE. 
   The mVRF on egress PE with empty (C-*, C-G) olist will stop 
   participating in the P-tunnel for C-G that it previously joined. 
    
   The state (C-*, C-G) is removed on the PE attached to C-RP when its 
   olist for (C-*, C-G) becomes empty. This means that the C-G tunnel 
   rooted at this PE is no longer needed. Upon (C-*, C-G) state removal 
   the PE attached to C-RP will send the P-tunnel withdrawal message 
   for C-G to all PE's in a given MVPN. Upon receiving the C-G tunnel 
   withdrawal message, all PE's in the MVPN will remove the C-G tunnel 
   information.  
    
    
 5.3 Group-Only S-PMSI Auto-discovery Route 
    
   When BGP is used as the auto-discovery mechanism in MVPN, the BGP 
   NLRI (MCAST-VPN) already defined in [viii] handles the different 
   route types in MVPN. To support the procedures defined in sections 
   5.1 and 5.2, the MCAST-VPN NLRI definition has to be extended to 
   include a new Route Type called Group-Only S-PMSI auto-discovery 
   route. The Group-Only S-PMSI auto-discovery route is an announcement 
   of an active VPN C-group and the P-tunnel to be used for its 
   traffic. The P-tunnel information is carried in a BGP attribute 
   called PMSI P-tunnel attribute already defined in [viii]. 
    
   Group-Only S-PMSI auto-discovery route type will be assigned Route 
   Type value of 6 of the MCAST-VPN NLRI and will consist of the 
   following: 
    
    
                   +-----------------------------------+ 
                   |      RD   (8 octets)              | 
                   +-----------------------------------+ 
                   |  Multicast Group Length (1 octet) | 
                   +-----------------------------------+ 
                   |  Multicast Group   (Variable)     | 
                   +-----------------------------------+ 
                   |   Originating Router's IP Addr    | 
                   +-----------------------------------+ 
    
   The RD is encoded as described in [iii]. 
 
 
    
   The Multicast Group field contains the C-G address or C-Generic LSP 
   Identifier Value. If the Multicast Group field contains an IPv4 
   address or a C-Generic LSP Identifier Value, then the value of the 
   Multicast Group Length field is 32. If the Multicast Group field 
   contains an IPv6 address, then the value of the Multicast Group 
   Length field is 128. 
    
   The Originating Router's IP Address field MUST be set to the IP 
   address that the PE places in the Global Administrator field of the 
   VRF Route Import extended community of the VPN-IP routes advertised 
   by the PE.  
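
   As an illustration, the route-specific fields shown above could be 
   serialized as in the following sketch. It is not a normative 
   encoding: the function name is hypothetical, the RD is taken as 8 
   opaque octets, the C-Generic LSP Identifier case is not handled, and 
   the enclosing MCAST-VPN NLRI header defined in [viii] is left out. 

```python
import ipaddress
import struct


def encode_group_only_s_pmsi(rd: bytes, group: str, originator: str) -> bytes:
    """Serialize RD, Multicast Group Length, Multicast Group and
    Originating Router's IP Address, in that order."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    grp = ipaddress.ip_address(group)
    orig = ipaddress.ip_address(originator)
    # The length field counts bits: 32 for IPv4, 128 for IPv6.
    group_len_bits = len(grp.packed) * 8
    return rd + struct.pack("!B", group_len_bits) + grp.packed + orig.packed
```

   For an IPv4 C-G and an IPv4 originating router the result is 17 
   octets long (8 + 1 + 4 + 4). 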
    
    
 5.4 Handling Initial Packets Sent on C-Shared Tree 
    
   According to the procedures described in sections 5.1 and 5.2, the 
   initial C-G multicast packets sent over the C-shared tree could be 
   dropped by the PE attached to C-RP until a P-tunnel for C-G traffic 
   is built. Since the C-G tunnel is announced when the first (C-*, C-G) 
   PIM Join is received by the PE attached to C-RP of C-G, this P-tunnel 
   should be built in time to carry the initial C-S packets. In PIM-SM 
   there are two scenarios to consider: (A) source registers first 
   before there are any interested receivers, or (B) receivers join the 
   group first, waiting for traffic on the shared tree. We will analyze 
   these two scenarios based on inter-PE PIM-SM procedures defined in 
   this document.  
    
   In scenario (A), whether in plain PIM or in MVPN context, the initial 
   source packets are discarded because there are no receivers on shared 
   tree. According to PIM-SM procedure when there are no receivers on 
   the shared tree, the C-RP sends (C-S, C-G) "Register-Stop" message 
   to the 1st-hop router to stop sending Register messages. The 
   Register process will restart in 3 minutes (at the earliest, 
   depending on whether C-S is still active). If in the meantime the 
   C-receivers join the group C-G, there is plenty of time for the C-G 
   P-tunnel to be announced and created. 
    
   In scenario (B), there exists a short window of time during which the 
   initial C-source packets could be dropped, namely when the first 
   active C-S registers with C-RP immediately after the first C-receiver 
   joined the C-G, not giving enough time for C-G P-tunnel to be built. 
   This is the only scenario under which there could be packet discards 
   in MVPN while there are no similar drops in plain PIM-SM multicast. 
   However, even in plain PIM-SM there could be packet drops especially 
   with bursty sources since only a bounded amount of traffic can be 
   encapsulated in PIM Register or MSDP SA messages.  
    
    

 
 
 5.5 Using P2MP LSP's as P-Tunnels for C-Shared Trees 
    
   If P-tunnels are built with receiver-driven P2MP MPLS LSP's [ix], the 
   P-tunnel for C-G can be algorithmically and uniquely chosen by the 
   egress PE's. An egress PE selects the "root" PE of the P-tunnel, 
   which is its best next-hop PE towards C-RP, and builds the P-tunnel 
   towards this root PE. Different PE's may choose different upstream 
   (i.e., root) PE's to reach C-RP in the same MVPN. This might happen 
   if C-RP is dually connected or if Anycast C-RP is used. When the 
   address of the root PE is used in the tunnel identification 
   algorithm, a distinct P2MP LSP per root can be built. Hence, multiple 
   P-tunnels can be simultaneously used to carry the same C-G traffic 
   without creating duplicates at the C-receivers. The P2MP LSP is 
   triggered by the egress PE when (C-*, C-G) Join is received from a 
   locally attached receiver.  
    
   This technique allows for further aggregation of traffic without 
   generating duplicates. Instead of one P2MP LSP per root PE per C-G, 
   one P2MP LSP per root could be used for all C-groups for which the C-
   RP is the active RP. In this case, C-group address has to be ignored 
   in the P2MP LSP identifier; instead the C-RP address should be used. 
   Such aggregation may cause loss of bandwidth optimality but it will 
   not generate duplicate traffic to C-receivers. 
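
   The per-root tunnel identification, with and without aggregation, 
   can be sketched as follows. The LspKey type and the function name 
   are hypothetical, and the key is an abstract tuple, not the P2MP LSP 
   identifier format of [ix]. 

```python
from typing import NamedTuple


class LspKey(NamedTuple):
    """Hypothetical P2MP LSP identifier (illustration only)."""
    root_pe: str  # address of the root (upstream) PE
    tree_id: str  # C-G address, or C-RP address when aggregating


def shared_tree_lsp_key(root_pe: str, c_g: str, c_rp: str,
                        aggregate: bool = False) -> LspKey:
    """Derive the identifier an egress PE would use for a C-shared tree."""
    # With aggregation the C-group address is ignored and the C-RP
    # address is used instead, giving one LSP per root PE per C-RP.
    return LspKey(root_pe, c_rp if aggregate else c_g)
```

   Egress PE's that select the same root PE derive the same key and 
   therefore end up on the same LSP, while egress PE's that select 
   different roots derive distinct keys. 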
    
   In the most typical MVPN network topology, a data center or a hub 
   location is where one-to-many multicast applications are being 
   sourced. Typically, customer's Rendezvous Points are also located at 
   the data centers/hubs. In this topology there is no advantage to 
   switch from shared to source trees since multicast VPN traffic is 
   already on the shortest path in provider's network. Moreover, it is 
   beneficial to MVPN customer to stay on shared trees because no 
   unnecessary multicast states are created. If it is known that a 
   C-tree never switches to SPT, then a P2MP LSP with inbound signaling 
   is sufficient to support such C-trees. 
    
    
   6. Supporting C-Shared Trees 
    
   The last hop customer routers might never switch traffic to SPT's for 
   certain multicast C-groups if SPT-threshold of "infinity" is 
   specified for those groups. The procedures defined in section 5 of 
   this document preserve C-shared trees, regardless of whether a path 
   between C-RP and C-S is outside or across provider's network. 
    
   The procedures defined in section 5.2 of this document preserve C-
   shared trees in case a path between C-RP and C-S is outside of 
   provider's network. This is in order to preserve the multicast states 
   and traffic patterns in MVPN customer network. According to 
   procedures in section 5.1, inter-PE traffic is automatically switched 
   to source trees for those C-sources whose path to C-RP is across 
   provider's network. However, in this scenario it is transparent to 
   the VPN customer whether multicast traffic is sent on shared or 
   source trees across provider's network. In other words, from customer 
   network perspective multicast traffic is still on shared trees. 
    
    
   7. Support of Anycast C-RP 
    
   The expected Anycast C-RP behavior is that different egress PE's 
   could choose different upstream PE's as the next-hops to the C-RP. 
   Support of multiple upstream PE's for Anycast C-RP is required. 
    
   There are two ways to support Anycast C-RP: based on provider's 
   network IGP cost or based on VPN customer routing. If there are 
   multiple next-hops to a static C-RP installed in an mVRF, the 
   closest PE, based on provider's network IGP cost, should be chosen 
   as the best next-hop to C-RP; the highest PE IP address is used only 
   as a tie breaker. IGP cost-based next-hop selection provides 
   PIM-like support of Anycast C-RP's, i.e., C-receivers join the 
   closest Anycast C-RP across provider's network. 
   Another option is to always use the highest IP address as a tie 
   breaker for RPF neighbor selection and leave it to MVPN routing 
   policy to reach different Anycast-RP's. This allows the MVPN customer 
   to define its own Anycast C-RP selection, based on a criterion other 
   than the closest distance. 
    
   Both Anycast C-RP options described above should be supported by the 
   MVPN implementation. 
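
   The two selection options can be illustrated with a minimal sketch. 
   The function name and the input, a dictionary from candidate 
   next-hop PE address to provider IGP cost, are assumptions; in the 
   second option the candidates are taken to be whatever the MVPN 
   routing policy has left in the mVRF. 

```python
import ipaddress


def select_upstream_pe(candidates: dict, use_igp_cost: bool = True) -> str:
    """Pick the best next-hop PE towards an Anycast C-RP.

    candidates maps a PE IP address (str) to its provider IGP cost.
    """
    def rank(pe: str) -> tuple:
        addr = int(ipaddress.ip_address(pe))
        cost = candidates[pe] if use_igp_cost else 0
        # Lowest IGP cost first; the highest address breaks ties.
        return (cost, -addr)

    return min(candidates, key=rank)
```

   With use_igp_cost set to False the IGP cost is ignored and the 
   choice reduces to the highest IP address among the candidates. 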
    
    
   8. Inter-Site Signaling Procedures for PIM-SSM 
    
   With PIM-SSM an active C-source is discovered when a PE attached to 
   C-source receives the first (C-S, C-G) Join, either from directly 
   connected CE or from another PE in MVPN. 
    
   When a PE attached to C-S receives the first (C-S, C-G) Join from 
   another PE, this PE will announce the P-tunnel to be used for (C-S, 
   C-G) traffic to all other PE's in the MVPN. In PIM-SSM the source 
   discovery and P-tunnel announcement is one and the same message. The 
   PE's will store the C-S P-tunnel information until they receive the 
   P-tunnel withdrawal message for (C-S, C-G). A PE that does not have 
   any interested receivers for (C-S, C-G) when it receives the P-tunnel 
   announcement message will store this information so it can join 
   this P-tunnel for late (C-S, C-G) receivers. The conditions for (C-S, 
   C-G) P-tunnel withdrawal are defined in section 8.2. Each PE attached 
   to C-S, when it receives (C-S, C-G) Join, will announce its distinct 
   P-tunnel for (C-S, C-G) traffic.  
    

 
 
   An egress PE, or more precisely an egress mVRF with receiver(s) of 
   (C-S, C-G) will "join" the P-tunnel announced for (C-S, C-G) only if 
   the PE that sent this announcement is the best next-hop to C-S in 
   this mVRF. If there is more than one best next-hop to C-S in the 
   mVRF, the PE will choose as the next hop the PE with the highest IP 
   address, or the PE may utilize a multicast multipath load-splitting 
   algorithm.  
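
   The join decision for an egress mVRF can be sketched as follows, 
   using the highest-IP-address tie breaker (the load-splitting 
   alternative is not shown). The function and parameter names are 
   hypothetical. 

```python
import ipaddress


def should_join_ssm_tunnel(announcing_pe: str, best_next_hops: list,
                           has_receivers: bool) -> bool:
    """Decide whether an mVRF joins the P-tunnel announced for (C-S, C-G).

    best_next_hops lists the PE addresses that are equally good
    next-hops to C-S in this mVRF.
    """
    if not has_receivers or not best_next_hops:
        return False
    # Among equally good next-hops, the highest IP address wins.
    chosen = max(best_next_hops, key=lambda pe: int(ipaddress.ip_address(pe)))
    return announcing_pe == chosen
```

   An mVRF for which the function returns False still stores the 
   announced P-tunnel information, as described above, so that it can 
   join later. 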
    
   PIM-SSM allows the source to continuously send traffic even if there 
   are no receivers for this traffic. (The drawback of this behavior is 
   waste of sender resources and the first-hop router/link bandwidth). 
   If the C-S is already active when the (C-S, C-G) Join reaches the C-
   router attached to C-S, the C-S traffic starts immediately flowing on 
   the C-source tree towards the PE. If the P-tunnel for (C-S, C-G) has 
   not yet been built up to the PE attached to C-S, a few initial packets 
   arriving from C-S will be dropped. It is rather unlikely that there 
   are PIM-SSM applications where sender can be active without receivers 
   and yet any initial packet drop cannot be tolerated.  
    
    
 8.1 C-Receiver Pruning 
    
   An egress PE will send (C-S, C-G) Prune message towards C-S when its 
   olist for (C-S, C-G) in an mVRF becomes empty. The egress PE will 
   also remove the (C-S, C-G) state from the mVRF. Upon (C-S, C-G) state 
   removal the mVRF will stop joining the P-tunnel announced for (C-S, 
   C-G) traffic. 
    
    
 8.2 P-tunnel Withdrawal for C-S 
    
   The state (C-S, C-G) will be removed by the PE attached to C-S after 
   the olist for (C-S, C-G) becomes empty. Upon (C-S, C-G) state 
   removal, the PE attached to C-S will send a P-tunnel withdrawal 
   message for (C-S, C-G). 
   The egress PE's in a given MVPN, upon receiving (C-S, C-G) P-tunnel 
   withdrawal message, will remove the P-tunnel information.  
    
    
 8.3 Using P2MP LSP's as P-Tunnels for PIM-SSM C-Trees 
    
   If P-tunnels are built with receiver-driven P2MP MPLS LSP's [ix], the 
   P-tunnel for (C-S, C-G) can be algorithmically and uniquely chosen by 
   the egress PE's. An egress PE selects the "root" PE of the P-tunnel, 
   which is the best next-hop PE towards C-S in mVRF, and builds the 
   P2MP LSP towards this root PE. Different PE's and different mVRF's 
   may choose different upstream PE's to reach C-S in the same MVPN. If 
   the address of the root PE is used in the LSP identification 
   algorithm, a distinct P2MP LSP per root is built. Hence, there could 
   be multiple entry points for C-S traffic into provider's network 
   without duplicates at the C-receivers. The LSP is triggered by the 
   egress PE when (C-S, C-G) Join is received from a locally attached 
   receiver. The advantage of using P2MP LSP's for PIM-SSM C-trees is 
   that no out-of-band signaling is required. However, without out-of-
   band signaling the aggregation of P2MP LSP's is not possible because 
   it could result in duplicate traffic being sent to the customer. 
    
    
   9. Inter-Site Signaling Procedures for PIM-Bidir 
    
   Some multicast applications use a many-to-many model where each 
   participant is the receiver as well as the sender. Using PIM-SM for 
   such applications results in increased memory and protocol overhead.  
   Bi-directional PIM [x] eliminates both Register message encapsulation 
   and source-specific states by allowing packets to be natively 
   forwarded from a source to the Rendezvous Point using shared tree 
   state only. This ensures that only (*,G) entries will appear in 
   multicast forwarding tables and that the path taken by packets 
   flowing from the source and/or receiver to the Rendezvous Point 
   Address (RPA) and vice versa will be the same. Membership to a Bidir 
   group is signaled via explicit (*, G) join messages. Traffic from 
   sources is unconditionally sent up the shared tree toward the RPA and 
   passed down the tree toward the receivers on each branch of the tree. 
   This is in contrast with PIM-SM where traffic flows are 
   unidirectional. 
    
   The olist of a (*, G) entry for Bidir group G includes all the 
   interfaces on which (*, G) Joins were received. If a router is 
   located on a sender-only branch, a Bidir implementation might also 
   create (*, G) state but the olist will not include any interfaces. 
   Traffic in a Bidir group is always forwarded to the RPA of that 
   group. If no receivers are along the way to the RPA, the traffic will 
   be dropped off only at the RPA. Traffic will be forwarded to the RPA 
   even if there are no receivers at all.  
    
    
 9.1  Preventing C-Bidir Packet Loops in MVPN 
    
   IP Bi-directional PIM chooses a single Designated Forwarder (DF) for 
   upstream packets (away from the source) on every network segment and 
   point-to-point link. The DF procedure selects one router as the DF 
   for every RPA of bidirectional groups. The DF is responsible for 
   forwarding multicast packets upstream to the RPA as well as sending 
   (*,G) Join/Prune messages towards the RPA. To avoid packet loops, 
   the DF election procedure eliminates parallel downstream paths from 
   any RPA. It enforces a consistent view of the DF on all routers on a 
   network segment, and during periods of ambiguity or routing 
   convergence the traffic forwarding is suspended. To avoid loops, 
   customized routing in downstream routers does not affect the choice 
   of DF. In Bidir the path from a source/receiver to the DF is always 
   the best metric unicast path.  
 
 
    
   In MVPN context a Designated Forwarder for Bidir C-RPA is a PE 
   attached to C-RPA. Different mVRF's in a given MVPN might have 
   different next-hop PE's to C-RPA due to different routing policies or 
   they might have temporarily different next-hop PE's to C-RPA due to 
   routing transients. The MVPN solution for C-Bidir cannot rely on all 
   mVRF's in a given MVPN to either have common routing view to C-RPA or 
   to reach a common routing view to C-RPA in time to prevent packet 
   looping. Rather, a VPN has to be treated as a collection of sets of 
   multicast VRF's, where all mVRF's in a set share the same 
   reachability towards C-RPA, distinct from that of other sets. 
   Resolving C-Bidir packet loops in MVPN inevitably results in the 
   ability to partition an MVPN into disjoint sets of mVRF's, served by 
   disjoint P-tunnels. Each such set would 
   have a distinct view of converged network, i.e., it would have the 
   same upstream PE as the best next-hop towards the C-RPA. If there is 
   more than one best next-hop PE to C-RPA in a set, the tie breaker 
   will be the upstream PE with the highest IP address.  
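
   The partitioning can be illustrated with a minimal sketch that 
   groups mVRF's by their selected upstream PE. The function name and 
   the input, a dictionary from mVRF name to its list of equally good 
   next-hop PE addresses to C-RPA, are assumptions for illustration. 

```python
import ipaddress
from collections import defaultdict


def partition_by_upstream_pe(next_hops_per_mvrf: dict) -> dict:
    """Group mVRF's into disjoint sets keyed by their upstream PE."""
    partitions = defaultdict(set)
    for mvrf, next_hops in next_hops_per_mvrf.items():
        # Tie breaker among equally good next-hops: highest IP address.
        upstream = max(next_hops, key=lambda pe: int(ipaddress.ip_address(pe)))
        partitions[upstream].add(mvrf)
    return dict(partitions)
```

   Each resulting set corresponds to one view of the converged network 
   and would be served by its own P-tunnel rooted at the key PE. 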
    
   As an option, the MVPN implementation of C-Bidir should allow 
   specific multicast routing policy in an mVRF to be ignored, and 
   instead make all PE's in a given MVPN choose the same next-hop PE to 
   C-RPA. Among all candidate next-hop PE's, the single chosen upstream 
   PE to C-RPA could be the PE with the highest IP address. This 
   approach to C-Bidir might be desirable to customers that do not want 
   a permanent splitting of their MVPN's into disjoint C-Bidir trees. 
    
   Note that the unicast routing policy in a VPN cannot influence VPN 
   multicast routing from a multi-homed site. It is in the nature of 
   Bidir that the path from a source/receiver site towards the C-RPA is 
   always the best metric unicast path and that choice is made locally 
   at the VPN site.  
    
    
 9.2  Active Group P-tunnel Announcement in C-Bidir 
    
   The (C-*, C-G) state is first created on a PE attached to C-RPA 
   (i.e., on a DF-PE) by a (C-*, C-G) Join from a locally connected or 
   remote C-receiver. Once (C-*, C-G) state is created a DF-PE announces 
   a P-tunnel for active group C-G to all PE's in a given MVPN. If BGP 
   is used as P-tunnel announcement delivery mechanism, the P-tunnel for 
   the active C-Bidir group is announced via the Group-Only S-PMSI auto-
   discovery route, defined in section 5.3. A PE that does not have (C-
   *, C-G) state when it receives a C-G P-tunnel announcement message 
   will store this information so it can join the P-tunnel for late 
   group members. 
    
   This procedure allows for further aggregation of C-Bidir traffic 
   without causing traffic loops. Instead of generating one P-tunnel per 
   C-G, one P-tunnel per DF-PE could be used for all C-groups for which 
   the C-RPA is the active RP. Such aggregation may cause loss of 
   bandwidth optimality by delivering the traffic to PE's that don't 
   need it but it will not generate loops in MVPN. 
    
   If C-S traffic starts unconditionally flowing from a VPN site towards 
   a PE before a single (C-*, C-G) Join was received from any VPN site, 
   this traffic will be dropped at the PE. This is because no inter-PE 
   P-tunnel has been built yet for C-G traffic. Since there are no 
   receivers yet for this traffic dropping it optimizes the inter-PE 
   behavior of C-Bidir. No C-G traffic is unnecessarily sent across MVPN 
   until there is at least a single receiver for C-G. This approach 
   also has positive security implications for service providers 
   because it prevents a coordinated attack of unconditional traffic 
   from C-Bidir sources with no receivers for this traffic. 
    
    
      9.2.1 Supporting Source-Only C-Branches 
    
   PIM-Bidir supports source-only branches, i.e., branches that do not 
   lead to any receivers, but that are used to forward packets traveling 
   upstream from sources towards the RPA. In plain IP PIM-Bidir it is up 
   to the implementation whether to maintain group state for source-only 
   branches [x]. However, the procedures defined in this document 
   require that in MVPN context PE's on C-source-only branches maintain 
   (C-*, C-G) state. The existence of this state indicates that a PE is 
   on C-Bidir tree and has to join a P-tunnel used for its traffic. If 
   (C-*, C-G) state was not maintained for source-only sites, a PE would 
   not know whether or not it is on C-G's Bidir tree. The consequence of 
   this would be that in order to deliver source-only site traffic 
   across provider's network, all PE's in a given MVPN would have to 
   join the P-tunnel announced for C-G.  
    
    
     9.2.2 Active C-Group Announcement in C-Bidir 
    
   Announcing a P-tunnel for C-Bidir traffic only when at least one
   receiver already exists for this traffic might delay the delivery of
   traffic from C-Bidir sources to newly joining receivers. Namely, when
   one or more C-Bidir sources start unconditionally sending traffic to
   a group C-G with no active membership and receivers subsequently join
   C-G, the inter-PE P-tunnel first has to be announced and built before
   the source traffic can be delivered to the receivers. This can be
   easily remedied by announcing an active C-Bidir group upon receiving
   unconditional source traffic with no active membership.
    
   A PE, upon receiving unconditional source traffic for C-G with empty
   membership (i.e., the PE's olist for (C-*, C-G) is empty), will
   announce the active group C-G to its DF-PE. If the olist for (C-*,
   C-G) is non-empty on this PE or this PE has already received a
   P-tunnel announcement for C-G, the PE will not announce that C-G is
   active because this fact is already known in the MVPN.
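
   The announcement rule above can be sketched as a small predicate.
   This is only an illustration; the class and field names below are
   hypothetical and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class BidirGroupState:
    """Per-(C-*, C-G) view on a PE (hypothetical structure)."""
    # Downstream interfaces with members of C-G.
    olist: set = field(default_factory=set)
    # True once a P-tunnel announcement for C-G has been received.
    tunnel_announced: bool = False


def should_announce_active(state: BidirGroupState) -> bool:
    """Decide whether source traffic for C-G should trigger an
    active-group announcement towards the DF-PE."""
    if state.olist:
        # Members exist, so the group is already known in the MVPN.
        return False
    if state.tunnel_announced:
        # A P-tunnel for C-G was already announced.
        return False
    return True
```

   A PE would evaluate this predicate when the first unconditional
   packet for C-G arrives from a VPN site.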
    
   When BGP is used as the delivery mechanism, a new route type has to 
   be defined for active C-group announcements. A new route type, a 
   Group Active auto-discovery route, is defined as follows: 
    
                +-----------------------------------+ 
                |      RD (8 octets)                | 
                +-----------------------------------+ 
                | Multicast Group Length (1 octet)  | 
                +-----------------------------------+ 
                | Multicast Group   (Variable)      | 
                +-----------------------------------+ 
    
   The RD is encoded as described in [iii]. The Multicast Group field 
   contains the C-G address or C-Generic LSP Identifier Value. If the 
   Multicast Group field contains an IPv4 address or a C-Generic LSP 
   Identifier Value, then the value of the Multicast Group Length field 
   is 32. If the Multicast Group field contains an IPv6 address, then 
   the value of the Multicast Group Length field is 128. 
   The new Group Active auto-discovery route type will be assigned a
   Route Type value of 7 in the MCAST-VPN NLRI defined in [viii].
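
   As an illustration of the layout above, the following sketch packs
   the three fields for an IPv4 or IPv6 C-G address. The function name
   is hypothetical. The enclosing MCAST-VPN NLRI header of [viii] and
   the C-Generic LSP Identifier Value case are deliberately left out,
   and the RD is treated as an opaque 8-octet string.

```python
import ipaddress

GROUP_ACTIVE_ROUTE_TYPE = 7  # Route Type value stated in the text


def encode_group_active(rd: bytes, group: str) -> bytes:
    """Return RD + Multicast Group Length + Multicast Group.

    The length octet carries the group length in bits:
    32 for an IPv4 address, 128 for an IPv6 address.
    """
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    packed = ipaddress.ip_address(group).packed  # 4 or 16 octets
    return rd + bytes([len(packed) * 8]) + packed
```

   For example, an 8-octet RD followed by the group 239.1.1.1 yields 13
   octets, with the ninth octet set to 32.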
    
   If BGP is used as the P-tunnel announcement delivery mechanism, once
   the DF-PE receives a Group Active auto-discovery route for C-G, it
   will announce the P-tunnel to be used for C-G via a Group-Only S-PMSI
   auto-discovery route, defined in section 5.3.
    
   The procedure defined in this section should not be the default
   behavior for handling C-Bidir traffic, but it should be implemented
   as an option that can be turned on or off per C-G in the provider's
   network.
   
   
 9.3  Bidir C-Group Becomes Inactive 
    
   The state (C-*, C-G) is removed on a DF-PE after its olist becomes 
   empty. Upon (C-*, C-G) state removal DF-PE will send the P-tunnel 
   withdrawal message for C-G. This is the P-tunnel the DF-PE announced 
   on active C-G discovery. PE's attached to the participants of C-G, 
   upon receiving C-G P-tunnel withdrawal message, will remove the P-
   tunnel information.  
      
    
 9.4  P-tunnels for C-Bidir Traffic 
    
   The procedure defined in this document requires that C-Bidir traffic 
   is carried over MP2MP P-tunnels across provider's network, which can 
   be built with PIM-Bidir or with MP2MP LSP's. This is because in this 
   procedure only one P-tunnel is announced by and rooted at a DF-PE for
   a C-Bidir group. In fact, using MP2MP P-tunnels in provider's network 
   is the only scalable approach to C-Bidir.  
    
   During routing convergence or when different routing policies for C-
   Bidir are supported, PE's in a given MVPN might choose different 
   upstream PE's as the best next-hops to C-RPA. Each PE attached to C-
   RPA announces a distinct MP2MP P-tunnel. At any given time, a PE in 
   the MVPN joins only one P-tunnel that was announced by its chosen DF-
   PE. Once the MVPN converges, each set of mVRF's with the same 
   multicast routing policy will have a single DF-PE for a C-RPA. When 
   an option for ignoring specific multicast VRF routing policies is 
   turned on, all PE's in the MVPN will choose the same next-hop PE to 
   C-RPA. A PE that joined a P-tunnel announced by a "transient" DF-PE 
   has to join the P-tunnel announced by the converged DF-PE, and stop 
   sending and accepting traffic on the tunnel announced by the 
   transient DF-PE. 
                          
    
 9.5  DF-PE Redundancy with Fast Convergence 
    
   To speed up C-Bidir convergence, certain optimizations could be added
   to C-Bidir support. When the Bidir C-RPA is redundantly connected, a
   PIM join could be sent to all PE's connected to the C-RPA site, not
   only to the DF-PE with the best route or with the highest IP address.
   Each such candidate DF-PE would announce its own P-tunnel for C-G
   traffic. All those P-tunnels could be joined by the PE's on the
   C-Bidir tree, but each such PE will send and/or receive C-G traffic
   only over the P-tunnel announced by its current best DF-PE (or the
   one with the highest IP address if they are equal cost) for C-G. This
   procedure introduces a notion of primary and backup P-tunnels. A
   P-tunnel announced by the currently active DF-PE is a primary
   P-tunnel. P-tunnels announced by non-active candidate DF-PE's are
   backup P-tunnels. When the failure of the current DF-PE is detected,
   all mVRF's with participants in C-G whose primary tunnel was the one
   announced by the failed DF-PE will stop sending/receiving C-G traffic
   over the primary P-tunnel and will start sending/receiving traffic
   over the backup P-tunnel. Since this alternate P-tunnel already
   exists, the data loss is minimized. This is a trade-off between fast
   convergence and increased backbone bandwidth usage.
    
   The procedure defined in this section should be implemented as an
   option available to the service provider.
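
   The primary/backup selection described in this section can be
   sketched as follows. The names are hypothetical, and "cost" stands
   for whatever metric the mVRF uses to rank routes towards the C-RPA.
   The tie-break on the highest IP address follows the text above.

```python
import ipaddress
from typing import NamedTuple


class Candidate(NamedTuple):
    pe_address: str  # address of a candidate DF-PE
    cost: int        # route cost towards the C-RPA (lower is better)
    tunnel_id: str   # P-tunnel announced by this PE (opaque here)


def rank(candidates):
    """Lowest cost first; the highest IP address breaks ties."""
    return sorted(
        candidates,
        key=lambda c: (c.cost, -int(ipaddress.ip_address(c.pe_address))),
    )


def select_tunnels(candidates, failed=frozenset()):
    """Return (primary, backups) among candidates that have not failed."""
    alive = rank(c for c in candidates if c.pe_address not in failed)
    if not alive:
        return None, []
    return alive[0], alive[1:]
```

   Because the backup P-tunnels are already joined, a failure of the
   current DF-PE only requires re-running the selection with that PE in
   the failed set; no new tunnel has to be built.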
    
    
 9.6  Using MP2MP LSP's as P-Tunnels for C-Bidir 
    
   C-Bidir signaling procedure defined so far is based on P-tunnel 
   announcements by DF-PE's. Announcing the MP2MP tunnel by a DF-PE 
   allows for P-tunnel aggregation based on congruency of multicast
   flows. If C-Bidir were to be supported without aggregation or with an
   aggregation not based on congruency of flows, then a different
   solution for C-Bidir is possible. Instead of announcing MP2MP tunnels
   by the DF-PE's, such tunnels could be algorithmically derived based 
   on C-group and DF-PE addresses. This is possible when P-tunnels are 
   MP2MP LSP's [ix]. This is the same technique as described in sections 
   5.5 and 8.3 except that MP2MP rather than P2MP LSP's are being used. 
   Egress mVRF selects the "root" PE of the P-tunnel, which is its best 
   next-hop PE towards C-RPA, and builds the MP2MP LSP towards this 
   root/DF-PE. Different PE's or mVRF's may choose different upstream 
   PE's to reach C-RPA in the same MVPN. Since the address of the root 
   PE is also used in the MP2MP LSP identification algorithm, a distinct 
   MP2MP LSP per root is built. At any given time, a PE sends and 
   receives C-G traffic only on one MP2MP LSP that is rooted at the DF-
   PE chosen by this PE/mVRF. Hence, multiple MP2MP LSP's can 
   simultaneously carry the same C-RPA traffic without duplication and 
   looping of packets.  
    
   This technique allows for further aggregation of traffic without 
   causing traffic loops. Instead of generating one MP2MP LSP per C-G, 
   one MP2MP LSP per DF-PE could be used for all C-groups for which the 
   C-RPA is an active RP. In this case, the C-group address should not
   be used when generating the MP2MP LSP identifier; the C-RPA address
   should be used instead. Such aggregation may cause loss of bandwidth
   optimality but it will not generate loops in MVPN.
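
   The idea of deriving the MP2MP LSP identity locally can be
   illustrated with the sketch below. It is not the algorithm of
   sections 5.5 and 8.3; the pairing of a root address with an opaque
   value and all names are assumptions made for illustration only.

```python
import ipaddress
from typing import NamedTuple


class LspId(NamedTuple):
    root: bytes    # address of the root/DF-PE chosen by this mVRF
    opaque: bytes  # value that every PE can compute on its own


def derive_lsp_id(root_pe: str, c_group: str, c_rpa: str,
                  aggregate: bool = False) -> LspId:
    """Derive an LSP identity without any announcement from the DF-PE.

    Without aggregation the C-group address is the key; with per-C-RPA
    aggregation the C-RPA address is used instead.
    """
    key = c_rpa if aggregate else c_group
    return LspId(
        root=ipaddress.ip_address(root_pe).packed,
        opaque=ipaddress.ip_address(key).packed,
    )
```

   Two mVRF's that pick different root PE's obtain different
   identities, so a distinct MP2MP LSP per root results. With
   aggregate=True, all C-groups served by the same C-RPA map to one LSP
   per root.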
    
    
   10. C-Multicast Traffic Aggregation 
    
   The basic technique for providing scalability is to aggregate a 
   number of customer multicast flows onto a single multicast 
   distribution tree (P-tunnel) through the P routers.  The inter-PE 
   multicast procedures defined in this document support, by definition, 
   the following aggregation of C-multicast flows into a single P-tunnel 
   per root PE: 
   - traffic from all PIM-SM C-sources discovered in an MVPN that attach 
   to the same PE as their C-RP (root PE is each PE attached to the C-
   RP) 
   - traffic from all undiscovered PIM-SM C-sources in an MVPN (root PE 
   is each PE attached to one or more C-RP's) 
   - all PIM-Bidir traffic in an MVPN (root PE is each PE attached to 
   the C-RPA). 
   Such aggregation may cause loss of bandwidth optimality by delivering 
   the traffic to PE's that don't need it but it will not deliver 
   duplicates to egress PE's.  
    
   The aggregation of PIM-SM traffic from C-sources that are discovered
   by the procedures defined in this document and that are attached to a
   different PE than their C-RP requires "explicit tracking" of receiver
   mVRF's. Explicit tracking means that the
   transmitting PE has to know which mVRF's need to receive which 
   multicast streams. To assure that no duplicates are sent to 
   receivers, a root PE can only aggregate traffic from those C-sources 
   (attached to it) such that exactly the same mVRF's want to receive 
   this C-source traffic from the root PE. Since the set of receiver 
   mVRF's can dynamically change (e.g., a new mVRF can be added and 
   "break" the congruency of existing aggregation), the aggregation of 
   C-source traffic might need to be dynamically adjusted. 
   However, if the identity of the transmitting PE is known and is
   supported by the forwarding plane, the egress mVRF can discard those
   packets that came from the "wrong" PE, i.e., a PE that is not the
   mVRF's best next-hop to the source of those packets. The ingress PE
   information is provided by all P2MP tunnel encapsulation techniques
   defined in [ii] or it can be provided by the so-called "PE label" in
   the case of MP2MP LSPs [ii]. Knowing the identity of the root PE
   relaxes the requirement for perfect congruency of receivers for the
   discovered C-sources; however, it requires the support of upstream
   assigned PE labels.
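
   The congruency requirement for explicit tracking can be illustrated
   by grouping flows on their exact receiver set. The sketch below is
   an assumption about how a root PE might organize this; the function
   name and input shape are hypothetical.

```python
from collections import defaultdict


def aggregate_by_receivers(receivers_by_flow):
    """Group (C-S, C-G) flows whose receiver mVRF sets are identical.

    receivers_by_flow maps (C-S, C-G) to an iterable of receiver mVRF
    names. Each returned group may share one P-tunnel without sending
    duplicates or unwanted traffic to any mVRF.
    """
    buckets = defaultdict(list)
    for flow, receivers in receivers_by_flow.items():
        buckets[frozenset(receivers)].append(flow)
    return {rx: sorted(flows) for rx, flows in buckets.items()}
```

   When a new mVRF joins only one of the flows in a group, re-running
   the grouping moves that flow into a different group, which is the
   dynamic adjustment mentioned above.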
    
   Aggregating C-multicast traffic belonging to different MVPN's
   requires that the MVPN implementation support the upstream assigned
   demultiplexing label, defined in [ii]. The demultiplexing label
   allows the egress PE's to determine the MVPN to which the packet
   belongs. With such aggregation, in order to avoid duplicates to
   receivers, the PE label identifying the transmitting PE also has to
   be used.
    
    
11. Supporting Source-Specific Host Reports in PIM-SM 
    
   PIM-SM [vii] permits "a receiver to join a group and specify that it 
   only wants to receive traffic for a group if that traffic comes from 
   a particular source. If a receiver does this, and no other receiver 
   on the LAN requires all the traffic for the group, then the DR may 
   omit performing a (*,G) join to set up the shared tree, and instead 
   issue a source-specific (S,G) join only." 
    
   Such behavior of end systems in PIM-SM means that any PE can receive
   a (C-S, C-G) Join for a sparse mode group even if no PE has ever
   received a (C-*, C-G) Join. It also means that (as in PIM-SSM) source
   trees might be triggered even for sources that are not active. In the
   MVPN we want to prevent useless S-PMSI creation for inactive
   C-sources operating in sparse groups. The procedures for this case
   are specified below:
    
   - If a PE, which is not attached to C-RP, receives a (C-S, C-G) Join
   without a previous (C-*, C-G) Join on the same interface, and the PE
   previously received a P-tunnel announcement for (C-S, C-G) traffic
   or a P-tunnel announcement for C-G traffic, it will treat the (C-S,
   C-G) Join as if it were a join initiated as a result of C-RPT to
   C-SPT switching. This procedure has already been specified in
   section 5.1.1 of this document.
    
   - If a PE receives a (C-S, C-G) Join without a previous (C-*, C-G)
   Join on the same interface and the PE has no P-tunnel information
   for C-S or C-G traffic, it will treat the (C-S, C-G) Join as if it
   were a (C-*, C-G) Join, provided that the interface in question does
   not have the C-RP for C-G behind it. The procedure for handling
   (C-*, C-G) Joins is already specified in section 5 of this document.
   This scenario implies that there was no previous (C-*, C-G) Join in
   the entire MVPN and that the first join in sparse group C-G that is
   received by a PE in this MVPN is a source-specific join.
    
   - If a PE, which is not attached to C-RP, receives a (C-S, C-G) Join 
   on an interface on which it previously received (C-*, C-G) Join, the 
   PE ignores the (C-S, C-G) Join as already specified by the 
   procedures defined in section 5.1 of this document. 
    
   - If a PE receives a (C-S, C-G) Join on an interface which is the 
   PE's next hop to the C-RP, it will announce source C-S to all PE's in 
   an MVPN. This is according to procedures already defined in this 
   document. If there is a C-receiver behind the same interface as the 
   C-RP, it might be the case that the (C-S, C-G) Join was requested by 
   the C-receiver and not by the C-RP (more precisely, the (C-S, C-G) 
   Join was sent based on receiver's source specific report). If this C-
   receiver has never requested to join (C-*, C-G) then there is no 
   guarantee that the source C-S is active and transmitting packets. 
   Hence, before the PE attached to C-S announces an S-PMSI for C-S, it
   has to make sure that C-S is active. A PE attached to a site with
   C-S, upon receiving the C-S announcement message from the PE attached
   to the C-RP, will not immediately announce the S-PMSI for (C-S, C-G)
   traffic. It will announce it only when the first packet is received
   on the (C-S, C-G) state, which is indicated by setting the (C-S, C-G)
   "SPTbit". This assures that the S-PMSI for (C-S, C-G) traffic is
   announced only if C-S is transmitting.
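
   The four cases listed above can be summarized as a decision
   function. This is a sketch only: the names are hypothetical, the
   order in which the conditions are tested is an assumption, and
   combinations that the list does not describe are reported as such
   rather than guessed.

```python
from enum import Enum, auto


class Action(Enum):
    ANNOUNCE_SOURCE = auto()       # announce C-S to all PE's in the MVPN
    IGNORE = auto()                # (C-*, C-G) Join already seen here
    TREAT_AS_SPT_SWITCH = auto()   # handle as C-RPT to C-SPT switching
    TREAT_AS_STAR_G_JOIN = auto()  # handle as a (C-*, C-G) Join
    NOT_COVERED = auto()           # combination the list does not describe


def classify_sg_join(rp_behind_interface: bool,
                     pe_attached_to_rp: bool,
                     prior_star_g_join: bool,
                     has_tunnel_info: bool) -> Action:
    """Classify a (C-S, C-G) Join received by a PE on one interface."""
    if rp_behind_interface:
        # Join arrived on the PE's next hop to the C-RP.
        return Action.ANNOUNCE_SOURCE
    if prior_star_g_join:
        # A (C-*, C-G) Join was seen earlier on the same interface.
        return Action.NOT_COVERED if pe_attached_to_rp else Action.IGNORE
    if has_tunnel_info:
        # A P-tunnel for (C-S, C-G) or for C-G was already announced.
        if pe_attached_to_rp:
            return Action.NOT_COVERED
        return Action.TREAT_AS_SPT_SWITCH
    # No earlier (C-*, C-G) Join and no P-tunnel information.
    return Action.TREAT_AS_STAR_G_JOIN
```

   The SPTbit check that gates the S-PMSI announcement at the PE
   attached to C-S is a separate step and is not modeled here.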
    
    
   12. IANA Considerations
    
   To be supplied. 
    
    
   13. Security Considerations
    
   To be supplied. 
    
    
   14. APPENDIX: Preserving C-Multicast Traffic Patterns in MVPN
    

 
 
   This Appendix describes the routing topologies of PIM-SM C-multicast 
   from the provider's network view. It provides detailed analysis of 
   multicast routing scenarios to show that the mechanisms defined in 
   this document work correctly and do not trigger unexpected multicast 
   states in customer's network. 
    
   We decompose PIM-SM C-multicast into two scenarios where: (1) the 
   shortest path tree (SPT) between C-S and C-RP is via service provider 
   network, and (2) the shortest path tree (SPT) between C-S and C-RP is 
   outside of service provider network.  
    
                            C-S1          C-RP   C-S2 
                              |             |     / 
                             CE1           CE2  CE2' 
                              |             |   / 
                              |             |  / 
                             PE1           PE2 
                               \           /      
                             Provider's Network 
                                     | 
                                    PE3 
                                     | 
                                     | 
                                    CE3 
                                     | 
                                    C-R 
    
    Figure 1: Scenario (1) - Path between C-Si and C-RP via provider's 
                                  network 
                                      
   In scenario (1), shown in Figure 1, we assume that C-RP communicates 
   with source C-S, e.g., C-S1 and C-S2, over provider's network. As a 
   consequence, the SPT from C-S to C-RP is built across the network. 
   The (C-S, C-G) state is created within the site with C-S by a PIM 
   Join issued by C-RP towards the C-S. Hence, switching at the egress 
   PE's to SPT will not introduce new multicast states or change 
   multicast traffic patterns within the site with C-S (or any other VPN 
   site). In this scenario, immediate switching to SPT's at the egress 
   PE's is transparent to the customer. As a consequence, in scenario 
   (1), PIM-SM C-trees can be by default automatically triggered as 
   SPT's by all egress PE's with no inter-PE RPT-to-SPT switchover 
   initiated by C-routers. Regardless of whether or not the traffic in 
   customer's network switched to SPT's, inter-PE MVPN traffic is sent 
   only on SPT's. 
   Note that if there is a receiver C-R of C-G at the C-RP site, it
   might happen that the first (C-S, C-G) Join that arrives at the PE
   attached to this site is from the C-R rather than from C-RP. This
   does not change the C-multicast traffic flows described above.
    

 
 
   Even if from provider's network perspective C-S and C-RP are 
   reachable via different PE's (as C-S1 and C-RP in Figure 1a) or via 
   different interfaces on the same PE (as C-S2 and C-RP in Figure 1a), 
   a better multicast path between the C-S and the C-RP could be 
   engineered by a customer to be outside of provider's network.  
    
      
                            C-S1 ======= C-RP ==== C-S2 
                              |             |      / 
                             CE1           CE2  CE2' 
                              |             |   /  
                              |             |  /  
                             PE1            PE2 
                               \            /      
                             Provider's Network 
                                     | 
                                    PE3 
                                     | 
                                     | 
                                    CE3 
                                     | 
                                    C-R 
    
   Figure 1a: Scenario (1a) - Path between C-S and C-RP engineered to be 
                       outside of provider's network 
    
   Figure 1a depicts this scenario. From provider's network perspective 
   CE1 is reachable via PE1 and C-RP is reachable via PE2. Hence, from 
   provider's perspective the reachability between C-S and C-RP is via 
   provider's network. Yet, the SPT between C-S and C-RP has been 
   engineered by VPN customer to be outside of provider's network (which 
   is depicted by a double line between C-S1 and C-RP in Figure 1a). 
   Similarly, from provider's network perspective CE2 and C-RP are both 
   reachable via PE2. Hence, from provider's perspective the path 
   between C-S2 and C-RP is via provider's PE router. Yet, the best path 
   between C-S and C-RP has been engineered by VPN customer to be 
   outside of provider's network (which is depicted by a double line 
   between C-RP and C-S2 in Figure 1a). Handling such topologies would 
   complicate inter-PE C-multicast routing because it requires full C-
   RPT to C-SPT switching between PE's. Such scenarios are unusual and 
   could be a result of unintentional or incomplete route advertisement 
   by the customer. To avoid full RPT-to-SPT switching, in the scenarios 
   depicted in Figure 1a, the C-S traffic will be kept on inter-PE C-
   shared trees.  
    
   Note that if there is a receiver C-R of C-G at the C-RP site and the 
   source-tree from C-R to C-S is across provider's network while the 
   source-tree from C-RP to C-S is engineered to be outside provider's 
   network, then the PE attached to this site will receive (C-S, C-G)
   Join. In this scenario the C-S will be discovered and announced in 
   MVPN, following the procedure described under scenario (1). 
    
   In scenario (2) customer source and customer RP are located at the
   same site. In this scenario, the optimal path from C-S to C-RP might
   not overlap with the optimal path from CE towards C-RP. Figure 2
   depicts an example of such a scenario. In this topology, if PE3
   unconditionally switches to C-SPT, (C-S, C-G) state is created on CE1
   which would not otherwise be created. If, in the customer network,
   switching from RPT to SPT is based on a non-zero SPT-threshold, then
   traffic from a specific source C-S might never be switched to SPT if
   the C-S rate does not reach the configured threshold. Hence, under
   scenario (2), to preserve PIM-SM multicast states in the customer
   network, C-RPT to C-SPT switching cannot be initiated by the
   provider's network.
    
                                C-S   C-RP 
                                  \    / 
                                   \  / 
                                    R-1  
                                     |           
                                    CE1            
                                     |            
                                     |             
                                    PE1            
                                     |  
                                     |               
                             Provider's Network 
                                     | 
                                     | 
                                    PE3 
                                     | 
                                     | 
                                    CE3 
                                     | 
                                    C-R 
    
       Figure 2: Scenario (2) - Path between C-S and C-RP outside of 
                            provider's network 
    
   In scenario (2) there is no advantage in switching inter-PE traffic
   from C-RPT to C-SPT. Moreover, it is beneficial to the customer not
   to switch to SPT's at all because customer's multicast traffic is
   already on the shortest path across provider's network. In addition,
   in scenario (2), if customer initiates switching to SPT for C-S 
   traffic at a remote site (e.g., CE3 in Figure 2), this would not 
   change the C-S traffic pattern within the site with C-S. This is 
   because at this site the path from C-RP to C-S intersects with the 
   path from provider's network towards C-S. Hence, staying on inter-PE 
   shared tree for C-S will not change the C-S traffic pattern even if 
   customer switched to SPT for C-S at a remote site.  
 
 
   Based on these observations, the C-G traffic from any source C-S
   that is located at the same site as C-RP will be kept on the inter-PE
   C-shared tree, regardless of whether or not the customer network
   initiated the switching to SPT's.
    
   There could be a scenario, with C-S and C-RP located at the same
   site, where RPT-to-SPT switchover is initiated by the customer to
   relieve the C-RP of carrying too much traffic. An example of such a
   scenario is depicted in Figure 2a. In Figure 2a it is assumed that
   the best path from source C-S to C-RP is directly via CR1 only and
   not via CE1. When a remote CE3 switches to SPT, C-S traffic does not
   need to flow through the C-RP. However, this requires (C-S, C-G)
   state to be created on CE1. In scenario (2a) the path from CE1 to
   C-RP does not intersect with the SPT from C-RP to C-S. Hence, when
   staying on the shared tree, the C-S traffic cannot be "picked off" as
   it flows along the SPT to the C-RP. In Figure 2a, if the best path
   from C-RP to C-S were via CE1, the benefit of switching to SPT would
   be eliminated because the C-S traffic would not flow via C-RP while
   on the shared tree. Another benefit is that (C-S, C-G) state would
   not be created on CE1.
      
                                   CR1---C-S 
                                  /  | 
                               C-RP  |  
                                  \  |     
                                   \ | 
                                    CE1             
                                     |             
                                    PE1            
                                     |                  
                             Provider's Network 
                                     | 
                                    PE3 
                                     | 
                                    CE3 
                                     | 
                                    C-R 
    
      Figure 2a: Scenario (2a) - Path between C-S and C-RP outside of 
                            provider's network 
    
   It is beneficial to a VPN customer to assure that the best path from
   the C-RP to C-S (when they are located at the same site) intersects
   with the path from the provider's network towards C-S. Such a
   topology gains all the benefits of staying on the shared tree because
   C-S traffic can be "picked off" and sent towards the provider's
   network as it flows along the SPT to the C-RP. We assume that staying
   on C-shared trees in topologies exemplified by Figure 2a has a
   minimal impact on the customer or that this impact can be easily
   eliminated by a straightforward routing or topology adjustment in the
   customer network. In addition, such an adjustment is beneficial to
   the customer because it results in fewer multicast states on customer
   routers.
    
    
   15. References
    
                     
      [i] Bradner, S., "The Internet Standards Process -- Revision 3", 
      BCP 9, RFC 2026, October 1996. 
       
      [ii] E. Rosen, R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", 
      draft-ietf-l3vpn-2547bis-mcast. Work in progress. 
       
      [iii] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
      Networks (VPNs)", RFC 4364, February 2006.
       
      [iv] Kim, D., Meyer, D., Kilmer, H., and D. Farinacci, "Anycast 
      Rendevous Point (RP) mechanism using Protocol Independent 
      Multicast (PIM) and Multicast Source Discovery Protocol (MSDP)", 
      RFC 3446, January 2003. 
       
      [v] Farinacci, D. and Y. Cai, "Anycast-RP Using Protocol 
      Independent Multicast (PIM)", RFC 4610, August 2006. 
       
      [vi] H. Holbrook, B. Cain, "Source-Specific Multicast for IP", 
      RFC 4607, August 2006. 
       
      [vii] B. Fenner et al., "Protocol Independent Multicast - Sparse 
      Mode (PIM-SM): Protocol Specification (Revised)", RFC 4601, 
      August 2006. 
       
      [viii] R.Aggarwal, E.Rosen, et al., "BGP Encoding for Multicast 
      in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-bgp. Work in 
      progress. 
       
      [ix] I. Minei, I. Wijnands, et al., "Label Distribution Protocol
      Extensions for Point-to-Multipoint and Multipoint-to-Multipoint 
      Label Switched Paths", draft-ietf-mpls-ldp-p2mp. Work in 
      progress. 
       
      [x] M. Handley, I. Kouvelas, T. Speakman, L. Vicisano, "Bi-
      directional Protocol Independent Multicast (Bidir-PIM)", draft-
      ietf-pim-bidir-09. Work in progress. 
       
    
    
    
   16. Acknowledgments 
    

 
 
   The author thanks Yakov Rekhter, Eric Rosen, Bill Fenner, Toerless 
   Eckert, Ice Wijnands, and Lee Breslau for their comments and 
   insights. 
    
    
   17. Author's Addresses 
    
   Maria Napierala 
   AT&T Labs 
   200 Laurel Avenue, Middletown, NJ 07748 
   Email: mnapierala@att.com 
    
    
18. Intellectual Property Statement 
    
   The IETF takes no position regarding the validity or scope of any 
   Intellectual Property Rights or other rights that might be claimed to 
   pertain to the implementation or use of the technology described in 
   this document or the extent to which any license under such rights 
   might or might not be available; nor does it represent that it has 
   made any independent effort to identify any such rights.  Information 
   on the procedures with respect to rights in RFC documents can be 
   found in BCP 78 and BCP 79. 
    
   Copies of IPR disclosures made to the IETF Secretariat and any 
   assurances of licenses to be made available, or the result of an 
   attempt made to obtain a general license or permission for the use of 
   such proprietary rights by implementers or users of this 
   specification can be obtained from the IETF on-line IPR repository at 
   http://www.ietf.org/ipr. 
    
   The IETF invites any interested party to bring to its attention any 
   copyrights, patents or patent applications, or other proprietary 
   rights that may cover technology that may be required to implement 
   this standard.  Please address the information to the IETF at ietf-
   ipr@ietf.org. 
    
    
19. Copyright Notice 
    
   Copyright (C) The IETF Trust (2008). 
    
   This document is subject to the rights, licenses and restrictions 
   contained in BCP 78, and except as set forth therein, the authors 
   retain all their rights. 
    
   This document and the information contained herein are provided on an 
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.