The Internet and networks work because of the ability of network operators to master complexity.
Configuring a network requires knowing how its control protocols work and how to configure them on a per-vendor basis, as there is no standardized configuration interface.
Managing Internet Exchange Points (IXPs) is no different.
IXP operator teams are composed of highly skilled engineers who are able to deploy and support operations, including maintaining layer 2 switching fabric devices, the fundamental service offered by an IXP.
Most configuration is done manually, which is time-consuming, error-prone, and expensive. Add to this that scaling and managing a switching fabric is difficult due to limitations of the equipment being developed, particularly equipment that couples the control and data planes, and you begin to see the complex nature of IXPs.
Common issues that IXP operators need to manage
A significant issue that IXP operators continually manage is location discovery traffic (broadcast traffic), which is crucial for the switching fabric to run.
Broadcast traffic needs to be handled in a very specific manner because it can consume critical resources on all the connected operators’ devices. In some circumstances, it can even cause the entire IXP operation to stop. Some solutions and network architectures exist, but they are not perfect.
One solution that many IXPs are using is Ethernet bridging in combination with the MAC learning algorithm, which maintains at least a single broadcast domain to enable the IXP members’ routers to connect with each other. By MAC learning algorithm, I mean the family of IEEE Ethernet MAC Bridges standards, which covers bridging and spanning trees.
IXP Ethernet bridging is based on transparent bridges, so called because they are invisible to network hosts. When a transparent bridge is turned on, it learns where hosts are located by examining the source MAC address of incoming frames. For example, if a bridge sees a frame arrive on port X from host A, the bridge assumes that host A can be reached through the connection to port X.
Through this learning process, a transparent bridge builds a table, which it uses for forwarding: if the destination MAC address of a frame matches an entry, the frame is sent out of the corresponding port; if there is no match, the frame is flooded to all ports except the inbound port. Broadcasts and multicasts are flooded in the same way.
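The learning and flooding behaviour described above can be sketched in a few lines of Python. This is a toy model for illustration only, not code from any real switch: the `LearningBridge` class, its method names, and the example MAC addresses (taken from the documentation range) are all assumptions made for this sketch.

```python
from typing import Dict, List


def is_group_address(mac: str) -> bool:
    """True for broadcast and multicast MACs (lowest bit of the first octet is set)."""
    return bool(int(mac.split(":")[0], 16) & 1)


class LearningBridge:
    """Toy transparent bridge: learns source MACs and floods unknown destinations."""

    def __init__(self, ports: List[int]) -> None:
        self.ports = list(ports)
        # MAC address -> port on which that address was last seen as a source
        self.mac_table: Dict[str, int] = {}

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Process one frame and return the ports it is sent out of."""
        # Learning: the source address tells the bridge where that host lives.
        self.mac_table[src_mac.lower()] = in_port

        dst = dst_mac.lower()
        if not is_group_address(dst) and dst in self.mac_table:
            out_port = self.mac_table[dst]
            # A frame whose destination sits behind the inbound port is dropped.
            return [] if out_port == in_port else [out_port]

        # Unknown unicast, broadcast, and multicast: flood to every other port.
        return [port for port in self.ports if port != in_port]


# Hypothetical hosts A, B, and C on a three-port bridge.
HOST_A = "00:00:5e:00:53:0a"
HOST_B = "00:00:5e:00:53:0b"
HOST_C = "00:00:5e:00:53:0c"

bridge = LearningBridge(ports=[1, 2, 3])
print(bridge.receive(1, HOST_A, HOST_B))               # B is unknown, so the frame is flooded
print(bridge.receive(2, HOST_B, HOST_A))               # A was learned on port 1
print(bridge.receive(3, HOST_C, "ff:ff:ff:ff:ff:ff"))  # broadcasts are always flooded
```

Note that every flooded frame reaches every member port, which is why broadcast traffic weighs on all connected routers at once.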
Other identified IXP switching fabric issues that IXP operators need to consider include:
- Broadcast traffic can overload router CPUs or even bring down the entire IXP
- Loop-free solutions, which exist to avoid broadcast storms, are not perfect
- Undesired packets are hard to filter out
- Monitoring is too limited or too complicated and expensive to operate (SNMP or sFlow, NetFlow or IPFIX)
Using SDN to improve functionality
Software Defined Networking (SDN) has changed the game somewhat by decoupling the control plane from the data plane and, with OpenFlow, offering a programmable abstraction layer for the forwarding plane.
Using the concept of OpenFlow, an architecture called Umbrella was developed. Umbrella is a new SDN-enabled exchange fabric that is reliable, easy to manage and, most importantly, fully programmable.
The Umbrella SDN network design gives a stronger separation of control and data plane functionality, leading to enhanced scalability, reliability, and manageability. A significant benefit of Umbrella is that it removes all broadcast traffic. Integrating Umbrella in today’s IXP architecture is straightforward as we maintain a strict separation between layer 2 and layer 3 functions.
Umbrella works as follows: first, the operator adds the IXP members’ configuration (that is, the mapping of MAC addresses, IP addresses, and ports) through an API or a web interface. The controller uses this configuration to calculate the internal IXP paths and install the corresponding flows in all the IXP switches. Then, for any packet entering the fabric, the ingress OpenFlow switch encodes the path towards the egress in the destination MAC field, thus transforming broadcasts (for example, ARP traffic) into unicasts. Core switches then use the encoded path to forward the packets accordingly. Finally, the egress edge switch replaces the encoded MAC address with the real one.
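To make the path-encoding idea more concrete, here is a minimal sketch in Python. It assumes one particular layout, which this post does not spell out: each byte of the destination MAC holds the output port for one hop, read left to right, with unused bytes set to zero. The `Member` record, the function names, and the example addresses are hypothetical, and the sketch only models the view from a single ingress switch.

```python
from typing import Dict, List, NamedTuple

MAC_LEN = 6  # number of bytes in an Ethernet address


class Member(NamedTuple):
    real_mac: str    # MAC address of the member's router
    path: List[int]  # output port at each hop, from the first hop to the egress


def encode_path(path: List[int]) -> str:
    """Pack a list of output ports into a MAC-formatted string (assumed layout)."""
    if len(path) > MAC_LEN:
        raise ValueError("path has more hops than a MAC address has bytes")
    if any(not 0 <= port <= 255 for port in path):
        raise ValueError("each output port must fit in one byte")
    padded = list(path) + [0] * (MAC_LEN - len(path))
    return ":".join(f"{byte:02x}" for byte in padded)


def output_port(encoded_mac: str, hop: int) -> int:
    """What a switch at position `hop` would read from the encoded address."""
    return int(encoded_mac.split(":")[hop], 16)


def ingress_rewrite(members: Dict[str, Member], target_ip: str) -> str:
    """Ingress edge: an ARP request for target_ip gets a unicast, path-encoded destination."""
    return encode_path(members[target_ip].path)


def egress_restore(members: Dict[str, Member], encoded_mac: str) -> str:
    """Egress edge: swap the encoded address back for the member's real MAC."""
    for member in members.values():
        if encode_path(member.path) == encoded_mac:
            return member.real_mac
    raise KeyError(encoded_mac)


# Hypothetical member reachable via port 3 on the first hop, then port 7.
members = {"203.0.113.10": Member(real_mac="00:00:5e:00:53:0a", path=[3, 7])}

encoded = ingress_rewrite(members, "203.0.113.10")
print(encoded)
print(output_port(encoded, 0), output_port(encoded, 1))
print(egress_restore(members, encoded))
```

Because the forwarding decision is read straight from the address, a switch in the middle of the path needs no per-member state in this model; only the edge needs to know the member mapping.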
The main benefits of Umbrella
- No more broadcast traffic, and perfect edge filtering
- Can run even if the control plane is down
- Works even without OpenFlow switches in the core
- Fine-grained monitoring with OpenFlow counters in the data plane
- Highly scalable for more PoPs and IXP members
- Open to future innovative application-oriented services
Proof of concept: Umbrella at Toulouse IXP
Toulouse SDN IX (TouSIX) is a project I’ve been working on with Toulouse IXP to demonstrate the practicality of the Umbrella approach in a real-world scenario.
It is the first SDN-enabled European IXP, and all TouIX members can manage it using the TouSIX-Manager web application. Members can see the amount of traffic they exchange with each other, broken down per protocol. TouSIX-Manager has been developed with the Django framework and interacts with the OpenFlow switches through the Ryu REST API.
You can find all the code at https://github.com/tousix/tousix-manager.
TouSIX has been running for more than two years and has proven to improve stability, manageability, and monitoring, thus demonstrating the practical applicability and benefits of such a solution. As a result, TouSIX now fully leverages OpenFlow for its day-to-day operations.
The Umbrella architecture is in the process of being fully incorporated into the OpenFlow controller FAUCET, which will provide an ideal solution for small to medium IXPs. FAUCET Umbrella will soon be integrated into IXP-Manager.
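As a rough illustration of how per-member, per-protocol traffic figures can be derived from flow counters, here is a minimal Python sketch. It is not taken from TouSIX-Manager and does not reproduce the Ryu REST API's response schema: the record fields (`src_member`, `dst_member`, `protocol`, `byte_count`) and the sample values are assumptions made for this example, and fetching the counters from the controller is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (source member, destination member, protocol) -> total bytes
TrafficKey = Tuple[str, str, str]


def traffic_matrix(flows: Iterable[dict]) -> Dict[TrafficKey, int]:
    """Sum byte counters over all flow records that share the same key."""
    totals: Dict[TrafficKey, int] = defaultdict(int)
    for flow in flows:
        key = (flow["src_member"], flow["dst_member"], flow["protocol"])
        totals[key] += flow["byte_count"]
    return dict(totals)


# Hypothetical counter records, one per installed flow.
sample_flows = [
    {"src_member": "AS64496", "dst_member": "AS64497", "protocol": "IPv4", "byte_count": 1500},
    {"src_member": "AS64496", "dst_member": "AS64497", "protocol": "IPv4", "byte_count": 500},
    {"src_member": "AS64497", "dst_member": "AS64496", "protocol": "IPv6", "byte_count": 800},
]

for (src, dst, protocol), total in sorted(traffic_matrix(sample_flows).items()):
    print(f"{src} -> {dst} [{protocol}]: {total} bytes")
```

Since the counters live in the flow entries themselves, this kind of accounting needs no separate sampling or export protocol in the data path.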
I want to thank Claudes Combes, Hugues Brunel and Laurent Guerby (TouIX/TouSIX), who provided us with the perfect environment for testing this research architecture, as well as all the members of the ENDEAVOUR project.
Marc Bruyere is completing a Post-Doc at the University of Tokyo. He is extending his thesis research on IXP architecture and metropolitan switch fabric to include 4G/5G Internet mobile traffic for operational simplification using Open Source, SDN, and NFV.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC.