oBGP: an Overlay for a Scalable iBGP Control Plane



Iuniana Oprescu 1,5, Mickaël Meulle 1, Steve Uhlig 2, Cristel Pelsser 3, Olaf Maennel 4, and Philippe Owezarski 5

1 Orange Labs, 38-40, rue du Général Leclerc Issy-les-Moulineaux Cedex 9, France, {mihaela.oprescu,
2 Deutsche Telekom Laboratories & Technische Universität Berlin Ernst-Reuter-Platz 7, Berlin, Germany
3 Internet Initiative Japan, Jinbo-cho Mitsui Bldg., Kanda Jinbo-cho, Chiyoda-ku, Tokyo , Japan
4 University of Loughborough, Department of Computer Science, Haslegrave Bldg., Loughborough, LE3TU, United Kingdom
5 Université de Toulouse; UPS, INSA, INP, ISAE; LAAS; 7, Avenue du colonel Roche, F Toulouse, France

Abstract. The Internet is organized as a collection of networks called Autonomous Systems (ASes). The Border Gateway Protocol (BGP) is the glue that connects these administrative domains. Communication is thus possible between users worldwide, and each network is responsible for sharing reachability information with its peers through BGP. Protocol extensions are periodically added because the intended use and design of BGP no longer fit the current demands. Scalability concerns make the required iBGP full mesh difficult to achieve in today's large networks, and network operators therefore resort to confederations or Route Reflectors (RRs) to achieve full connectivity. These two options come with flaws of their own, such as persistent routing oscillations, deflections and forwarding loops. In this paper we propose a new architecture for the redistribution of external routes inside an AS.
Instead of relying on the usual statically configured set of iBGP sessions, we propose to use an overlay of routing instances that are collectively responsible for (i) the exchange of routes with other ASes, (ii) the storage of internal and external routes, (iii) the storage of the entire routing policy configuration of the AS and (iv) the computation and redistribution of the best routes towards Internet destinations to each router of the AS.

Keywords: routing, BGP, architecture, management

1 Introduction

The Border Gateway Protocol (BGP) is the glue that enables the computation of end-to-end paths in the Internet. BGP allows networks, called Autonomous Systems (ASes), to exchange their routing information and to independently implement customized routing policies. The Internet has reached a size of more than ASes and roughly blocks of IP addresses. BGP, and internal BGP in particular, have been widely studied, and many extensions and improvements have been proposed to deal with matters like network convergence [2][3] or route diversity [5][6]. On the other hand, general design and architectural issues in iBGP have not, in our opinion, been sufficiently addressed.

In this paper, we propose a new solution for iBGP routing within an AS using a distributed overlay of routing software. The solution we elaborate is meant as a viable framework for replacing the current iBGP, and we introduce a setting for scalable and flexible routing. By gathering the routing information in a platform, we aim to offer easier management of protocols and policies at the network level.

2 BGP Routing in a Nutshell

The Border Gateway Protocol [7] is in fact two protocols: internal BGP (iBGP) for handling messages inside an AS and external BGP (eBGP) for exchanging reachability information with other ASes in the Internet. This clear distinction makes it possible for an ISP to deploy a new iBGP in its network without any impact on neighboring ASes.
In the following we concentrate on some aspects of the iBGP mechanism. Within a BGP router, the decision process takes into account the interactions with every neighbor. Roughly speaking, if n is the number of prefixes advertised in the Internet, an iBGP Routing Information Base (RIB) will contain about n × m routes in the worst case, where m is the number of neighbors sending their full BGP table, as seen in Fig. 1. The best path to a destination is selected, installed in the Forwarding Information Base (FIB) and actually used to perform packet forwarding. The router also advertises its best path for a given prefix to the adjacent BGP peers.

Fig. 1. The selection of the best route among the received routes

BGP requires that entries be kept for each reachable network: this constraint leads to large routing tables. Many routes cannot be aggregated, and even if the optimizations in vendor code reduce the amount of memory needed, they do not fix the problem. The natural growth of the Internet, its increasing connectivity and the tweaking of routing entries for traffic engineering purposes have inflated the size of BGP routing tables by a factor of more than 3 within the last decade [1]. The current trend of the routing table indicates continuous growth of the Internet, and we expect future evolution to be similar, especially after the migration to the apparently inexhaustible IPv6 space.

Inside an AS, a single administrative entity manages all routers and distributes a consistent routing policy configuration. The goal of iBGP is to redistribute routing data inside the AS in accordance with the routing policy configured in each BGP router. iBGP routing originally required a full mesh between the routers within a single AS to guarantee that each router is able to learn the best external route for forwarding IP packets.
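To give a feel for the orders of magnitude involved, the following minimal Python sketch evaluates the worst-case RIB size n × m stated above and the number of sessions in a full mesh, which for r routers is the standard count r(r - 1)/2. The function names and the numeric inputs are illustrative assumptions, not measurements from this paper.

```python
def worst_case_rib_routes(num_prefixes: int, num_neighbors: int) -> int:
    """Worst-case RIB size: every neighbor sends a route for every prefix (n * m)."""
    return num_prefixes * num_neighbors


def full_mesh_sessions(num_routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other router."""
    return num_routers * (num_routers - 1) // 2


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the growth rates.
    print("RIB routes:", worst_case_rib_routes(num_prefixes=300_000, num_neighbors=4))
    for routers in (10, 100, 1000):
        print(routers, "routers ->", full_mesh_sessions(routers), "sessions")
```

Multiplying the number of routers by ten multiplies the session count by roughly one hundred, which is the quadratic growth that motivates the alternatives discussed next.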
The full mesh configuration can quickly turn into a scalability problem, since the number of sessions grows with the square of the number of participants. There are two alternatives for avoiding the processing overhead induced by a full mesh: confederations and route reflectors (RRs), illustrated in Fig. 2. Confederations are sub-ASes meant to divide a large network into more manageable areas. A route reflector is a router that takes the role of a central point with which a subset of the other routers peer. Both designs are prone to unpredictable effects such as persistent routing oscillations and forwarding loops affecting network convergence, sub-optimal routing due to network opacity, or non-deterministic decisions influenced by the state of the network at the arrival time of the advertisements. We further detail these drawbacks in 2.1.

Fig. 2. ASes exemplifying a full mesh of iBGP sessions, confederations, route reflection

2.1 Plagues in Current iBGP

In specific architectures using route reflection, routing suffers from a series of problems induced by the inherent design of iBGP. The drawbacks of current iBGP are summarized below:

- scalability is quantified by the number of protocol messages exchanged over time, the number of established sessions and especially by the size of the routing table. We estimate that the growth of the routing table can be handled by means of the partitioning model that we present in this paper: achieve scalability through division of the control plane by placing subsets of prefixes in different locations and then performing the computation of the BGP decision process in a distributed manner.

- network opacity occurs in architectures where route reflection schemes are used for propagating routes. Incomplete knowledge of the set of routes advertised by neighbor ASes leads to inconsistencies and issues such as routing oscillations and deflections that can cause forwarding loops.
Extensive studies [2][3][4][10] give conditions and methods for defining correct iBGP configurations that avoid anomalies and achieve full mesh optimality.

- poor route diversity is a direct consequence of network opacity. The fundamental design of BGP route redistribution demands that each peer advertise only its best route. Diffusing a single route limits the available choices, and there is a noticeable loss of route diversity when comparing border routers to internal routers [5]. The graph in Fig. 3 presents the diversity of neighbor ASes and BGP next-hops for the received prefixes on 5 random routers. The data reveals that there is a large diversity in the received routes, but this diversity is severely reduced by the BGP mechanism for selecting the best route. Redundancy in case of failure is highly desirable, and a secondary path could also be used for extra features such as load balancing or multipath routing. Protocol extensions introduce new capabilities for adding paths to BGP in [6], but there is no knowledge of the possible impact on current architectures.

Fig. 3. Prefixes and routes on 5 random routers of a large ISP (two panels: CDF of prefixes against the number of BGP next-hops and against the number of neighbor ASes; curves: selected best routes, received RIB-in routes)

- management and troubleshooting are often complex and challenging: inconsistency of the routing policies, path exploration interacting with flap dampening [8] and difficulties in achieving network-wide traffic engineering are some of the issues encountered by network operators. A full view of the external routes and knowledge of the Interior Gateway Protocol (IGP) topology by one entity would make these processes easier. In oBGP, the interaction with the entire network is done through the overlay, and the concentration of the BGP decision process on the nodes increases control over the network behavior.

We should, however, separate theory from practice.
Some of the presented issues are commonly avoided with engineering tricks and configuration tweaking. Network operators work around these shortcomings by enforcing specific RR placement and building convenient topologies that behave correctly. Ideally, these aspects would be handled in an automated manner, and this paper proposes an approach for better control over the network.

2.2 Previous Work

New routing paradigms like AIR [9], iBGPv2 [10] or even PCE [11] propose different approaches for handling routing within an AS. LISP [12] tackles routing as a general problem and proposes a solution for global Internet routing. In [13] Jing Fu presents a centralized control scheme for IGPs with faster routing convergence than link-state routing protocols. His results show that it is possible to conceive a routing platform reaching performance comparable to native routing.

The need for separating routing from the routers is also emphasised by N. Feamster et al. in [15]. The presented work is a design overview of a Routing Control Platform (RCP) that aims to select routes on behalf of the routers while maintaining backward compatibility. M. Caesar et al. later offer an implementation of the RCP concept. The prototype described in [16] has three modules: the IGP Viewer, which collects topology information; the BGP engine, which learns the BGP routes, performs the decision algorithm and then communicates the best paths to the routers; and the Route Control Server, which processes messages received from the other two modules and makes it possible to store one single copy of each BGP route, keep track of the routers to which each route has been assigned and maintain an order of preference of the egress point for each router [17].
We extend this work by going a step further in reaching scalability: in our approach, the prefix table is split, making parallel computation of routes possible, while in the RCP solution all the BGP information is concentrated in one point, even if there are multiple replicas of it. Our hybrid solution integrates the division of the routing table within a centralized routing platform.

Other projects advocate the idea of downsizing the routing table: ViAggre (Virtual Aggregation) is a configuration-only method for shrinking the size of the routing table in the Internet default-free zone. It proposes a dirty slate technique for distributing routing within an ISP network so that routers maintain only a part of the global routing table. One of the negative impacts of ViAggre [18] is a stretch imposed on traffic, diverting it from the native shortest path. Another drawback is the difficulty of the configuration. This same approach is advanced in [19], and X. Zhang et al. elaborate similar work in [20], but CRIO seems to bring more benefit to VPN routing.

The work of S. Uhlig et al. [21][22] emphasizes the fact that network operators need a smarter way to do route reflection. In [23][24] C. Pelsser et al. aim to build distributed route servers. We go beyond these proposals by providing scalability through the distribution of the control plane in iBGP routing.

3 oBGP: a Scalable Overlay for iBGP Routing

In today's IP networks, routing is highly distributed: each router in the AS makes its own decisions. We propose to separate the selection of paths (routing plane) from the actual forwarding of traffic (data plane) and to place them on distinct equipment. Offloading the control plane from the routers can be seen as a remedy to the explosion of the routing table size. When rethinking the current design, we place all the knowledge of routing data into a separate iBGP routing plane handled by an overlay of routing processes that do not forward traffic.
We propose to implement BGP routing engines called oBGP. The oBGP nodes act as the border routers of the domain and connect to the external peers through multi-hop eBGP sessions. This approach allows the overlay to receive all the routes from the neighboring ASes and aggregate the announced routes to achieve a unified complete view. oBGP routing software is intended to be executed by additional servers running on commodity hardware. The logical overlay is composed of routing processes (or nodes) that are jointly responsible for:

- collecting, splitting and storing the complete set of routes received from eBGP and the internally originated routes,
- storing the routing policies and configurations of all the routers in the AS,
- computing BGP best paths for each router,
- redistributing the computed paths to the client routers.

One of the main concerns of an iBGP architecture is its ability to scale: it must support the growing routing table and handle protocol messages over time. To achieve scalability, we design an oBGP solution where the routing information is divided into several sub-planes. In this approach, distinct subsets of overlay nodes each handle only a fraction of the entire set of prefixes in the routing table.

3.1 Overview

This section follows a route advertisement through the oBGP overlay, from its arrival in the AS to the installation of the best path in the RIB. Fig. 4 shows the chronological steps of a route announced to the oBGP overlay. The oBGP overlay acts as a border router, and the neighboring ASes connect to an oBGP node through multi-hop eBGP sessions. When a route towards a destination (e.g. the prefix /24) is advertised in the Internet, it reaches a first oBGP node that determines the sub-plane in charge of the prefix. The oBGP node then forwards the information to the nodes handling the correct sub-plane.
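The per-router computation listed among the overlay's responsibilities can be illustrated with a deliberately simplified Python sketch. It assumes a reduced decision process (highest local preference, then shortest AS path, then lowest IGP cost from the client router to the egress point, then next-hop name as a final tie-breaker); the actual BGP decision process has more steps. The `Route` class, the router names, the AS numbers and the IGP costs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str                 # egress point (hypothetical router name)
    local_pref: int
    as_path: Tuple[int, ...]


def best_route_for(router: str, routes: List[Route],
                   igp_cost: Dict[str, Dict[str, int]]) -> Optional[Route]:
    """Pick the route a client router would prefer given a full view of the routes."""
    costs = igp_cost.get(router, {})
    # Only next-hops reachable from this router in the IGP are eligible.
    candidates = [r for r in routes if r.next_hop in costs]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-r.local_pref, len(r.as_path), costs[r.next_hop], r.next_hop),
    )


# Hypothetical data: two equally good external routes, two client routers.
ROUTES = [
    Route("203.0.113.0/24", "E1", 100, (64500, 64510)),
    Route("203.0.113.0/24", "E2", 100, (64501, 64510)),
]
IGP_COST = {"R1": {"E1": 5, "E2": 20}, "R2": {"E1": 30, "E2": 10}}

if __name__ == "__main__":
    for client in ("R1", "R2"):
        print(client, "->", best_route_for(client, ROUTES, IGP_COST).next_hop)
```

With these inputs the two client routers receive different best paths for the same prefix, because the computation is customized with each router's own IGP distance to the egress points.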
After running the BGP decision process and applying the corresponding configuration and IGP topology constraints, the nodes output a best path. The overlay distributes the best path to the client routers connected through sessions, and they can immediately install it in their RIBs. Upon reception of the best route, the native routing mechanism takes its course and installs the path to the prefix in the FIB for actual packet forwarding.

Fig. 4. The steps followed by an advertisement in the overlay network

The oBGP nodes need to be aware of the actual mapping of the reachable IP space within the overlay. To ensure resiliency and avoid a single point of failure, a sub-plane is replicated on several oBGP nodes. Coordination between the copies of sub-planes is accomplished through an exchange of meta-data across the oBGP overlay. The following paragraphs describe the sub-plane concept.

3.2 Distributed Storage

A router learns routes toward a given prefix from its neighbors, and in the general case routers of the same AS do not learn exactly the same set of routes, nor the same number of routes. The full visibility of BGP routes received from external ASes can be likened to the union of queries on all border routers of an AS. Aggregating the routes received on every border router is equivalent to the global view of the advertised Internet as seen by the domain. oBGP keeps this external view intact by indexing it directly in the overlay according to a mapping mechanism. The oBGP nodes act as the collection of border routers of the AS and establish eBGP multi-hop sessions with neighbor ASes.

Storage of prefixes is distributed across the overlay, and the nodes divide the computational load of the control plane among themselves. We define several chunks of the reachable address space that are allocated on distinct nodes. These large IP spaces are called routing sub-planes.
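As a minimal sketch of the dispatch step, the following Python fragment maps an incoming prefix to the sub-plane whose address block contains it and returns the replicas to which the advertisement would be forwarded. The split into four /2 blocks, the sub-plane names and the node names are assumptions made for illustration; the paper does not prescribe this data structure.

```python
import ipaddress
from itertools import combinations

# Hypothetical layout: four sub-planes, each replicated on two overlay nodes.
SUB_PLANES = {
    "sub-plane-1": (ipaddress.ip_network("0.0.0.0/2"), ["node-a", "node-b"]),
    "sub-plane-2": (ipaddress.ip_network("64.0.0.0/2"), ["node-b", "node-c"]),
    "sub-plane-3": (ipaddress.ip_network("128.0.0.0/2"), ["node-c", "node-d"]),
    "sub-plane-4": (ipaddress.ip_network("192.0.0.0/2"), ["node-d", "node-a"]),
}


def is_coherent(planes) -> bool:
    """True when no two sub-planes cover overlapping address space."""
    blocks = [block for block, _ in planes.values()]
    return not any(a.overlaps(b) for a, b in combinations(blocks, 2))


def dispatch(prefix: str, planes=SUB_PLANES):
    """Return (sub-plane name, replica nodes) in charge of the given prefix."""
    net = ipaddress.ip_network(prefix)
    for name, (block, replicas) in planes.items():
        if net.subnet_of(block):
            return name, replicas
    raise LookupError(f"no sub-plane covers {prefix}")


if __name__ == "__main__":
    print(is_coherent(SUB_PLANES))
    print(dispatch("198.51.100.0/24"))
```

A linear scan is enough for a handful of sub-planes; a deployment with many smaller containers would more plausibly use a longest-prefix-match structure or a distributed hash table.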
The overlay is in charge of keeping a coherent state in which no pair of sub-planes has overlapping prefixes and the sub-planes are stored on different nodes. A structure similar to a distributed hash table can be used for managing the sub-planes. The oBGP nodes guarantee the boundaries of the sub-plane, but another aspect to take into account is the replication of the information on the nodes covering the same sub-plane.

Index of Virtual Prefixes: The mapping of the sub-planes on the oBGP nodes takes into account the split factor n = 4 and attempts to evenly allocate each chunk of total/n prefixes to a sub-plane. This strategy turns out to be very coarse grained, and thus we introduce smaller containers for the IP space called Virtual Prefixes, as in [20].

Fig. 5. The routing table is split between the n = 4 nodes of the overlay

Table 1 shows an example of a possible configuration of the sub-planes: the reachable IP space is divided into n = 4 sub-planes and each sub-plane covers the equivalent of a /2 prefix (corresponding roughly to 2^30 possible hosts). To better control the load incurred by the oBGP nodes handling the sub-planes, the network operator may choose to define several virtual prefixes, as is the case for sub-plane 1, which contains 2 virtual prefixes. The virtual prefixes may be swapped between the oBGP nodes in order to achieve a balanced load on the sub-planes. The data in columns 3 and 4 (see footnote 6) shows that the density of prefixes advertised in the Internet can be almost uniformly distributed across the previously defined sub-plane space. If the distribution varies in time, we deem it necessary to use a dynamic algorithm.

Table 1. Sub-planes containing virtual prefixes (columns: sub-plane ID, virtual prefixes, # of prefixes, % of total; the cell values are missing from this copy)

We envision as future work the development of an on-line procedure that allocates smaller virtual prefixes to the oBGP nodes to obtain a fine-grained arrangement.
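One possible way to balance virtual prefixes across sub-planes is sketched below in Python. It uses a greedy heuristic (largest virtual prefix first, always assigned to the currently least-loaded sub-plane); this heuristic is an assumption introduced here for illustration and is not the allocation procedure of the paper. The virtual prefixes and their prefix counts are hypothetical.

```python
import heapq
from typing import Dict, List


def balance(virtual_prefixes: Dict[str, int], n: int) -> List[List[str]]:
    """Greedily assign virtual prefixes to n sub-planes, heaviest first."""
    heap = [(0, i) for i in range(n)]          # (current load, sub-plane index)
    heapq.heapify(heap)
    assignment: List[List[str]] = [[] for _ in range(n)]
    for vp, count in sorted(virtual_prefixes.items(), key=lambda kv: -kv[1]):
        load, i = heapq.heappop(heap)          # least-loaded sub-plane so far
        assignment[i].append(vp)
        heapq.heappush(heap, (load + count, i))
    return assignment


# Hypothetical number of advertised prefixes falling inside each virtual prefix.
EXAMPLE = {
    "0.0.0.0/3": 10, "32.0.0.0/3": 5, "64.0.0.0/3": 40, "96.0.0.0/3": 30,
    "128.0.0.0/3": 35, "160.0.0.0/3": 25, "192.0.0.0/3": 60, "224.0.0.0/3": 0,
}

if __name__ == "__main__":
    for index, vps in enumerate(balance(EXAMPLE, n=4), start=1):
        print(f"sub-plane {index}:", vps, sum(EXAMPLE[vp] for vp in vps))
```

Smaller virtual prefixes give the heuristic more freedom to even out the load, which is the motivation for the finer-grained arrangement mentioned above.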
6 Dataset of November 2010, based on a total of prefixes

It is also possible to enforce a rule allowing popular prefixes to be cached based on a statistical computation of their frequency of occurrence (i.e. cache the popular prefixes, which are more stable, and swap the less popular prefixes more often).

3.3 Selection and Propagation of BGP Routes

The main purpose of offloading the control plane into an overlay is to achieve scalability of the routing table, but the separation of the decision process from the actual forwarding of traffic has several other benefits, such as complete visibility of the routes advertised to the AS. The oBGP nodes gather information through eBGP and at the same time they are part of the IGP topology, which allows them to be aware of the metrics toward the next-hop. This feature is important because the customized computation of the best BGP route for a given prefix and a particular router takes into account the full view of the BGP routes and the interior cost of reaching the next-hop. The optimal routes are those the client router would choose if it had a full view. Complete knowledge of both topologies allows the routing engines to make a correct selection and avoid situations like routing loops. Having a fed