Network management is hard. One of the many reasons is having to grapple with low-level, unintuitive identifiers such as IP addresses. Consider this example: ‘Today, huge traffic came from 220.127.116.11’. This statement is not very useful to anybody. Instead, what if the IP addresses were automatically translated to higher-level names like ‘YouTube’? Immediately, the statement becomes much easier to understand. The same principle applies to specifying network policies. For example, it would be much easier and more intuitive to write policies like: ‘Bypass the firewall for traffic from YouTube.’
While expressing traffic statistics and network policies using high-level names is appealing, data collection and policy enforcement ultimately take place at the level of individual packets traversing network devices. These network devices need an effective way to associate each IP packet with a high-level name, directly in the data plane. Fortunately, modern switches and network interface cards are increasingly programmable, using higher-level languages like P4. Programmable data planes with Protocol Independent Switch Architecture, or PISA, make it possible to map IP addresses to names as the packets fly by. This offers an unprecedented opportunity to collect statistics and enforce policies by high-level names directly in the data plane.
Ideally, these mappings of IP addresses to names would be installed in the data plane ahead of time. One approach is to make a controller perform DNS lookups for a list of domain names we want to track, find the right IP address for each domain name, and then install the mapping in the data plane.
Unfortunately, a single IP address might be associated with many domain names. To get a sense of how often this happens, we analysed a week of DNS response messages captured on the Princeton University campus network. Around 25% of the 288,834 observed server IP addresses were associated with multiple domain names. Some IP addresses, mostly for web hosting platforms or Content Distribution Networks (CDNs), were associated with hundreds or even thousands of domain names. Thus, this approach could result in wildly incorrect mappings.
Meta4: Enforcing network policies with domain names in the data plane
Instead, we must find the right domain name dynamically. Meta4 enables network operators to express and enforce network policies with domain names in the data plane. Meta4 compiles and runs in a real data plane like Intel’s Tofino programmable switch.
Join a DNS response message with subsequent data packets
Instead of relying on a domain name lookup for an IP address, what if we track the actual DNS messages in the network? A DNS response message includes the domain name queried by the client, the server IP address, and the client IP address (as the destination address of the packet). Let’s say we save this information in memory.
Upon receiving the DNS response packet, the client can send data packets to the server using the server IP address, and vice versa. As shown in Figure 1, it is possible to use the source and destination IP address pair and ‘join’ them with the client IP and server IP address pair in the memory, which was populated by the previous DNS response packet. As a result, this data packet can be tagged with the correct domain name!
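The join described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Meta4 P4 code: the table, the function names, and the example addresses are all hypothetical, and a real data plane would use fixed-size register arrays rather than an unbounded dictionary.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory table: (client IP, server IP) -> domain name.
dns_table: Dict[Tuple[str, str], str] = {}


def on_dns_response(client_ip: str, server_ip: str, domain: str) -> None:
    """Record the domain that this client resolved to this server IP."""
    dns_table[(client_ip, server_ip)] = domain


def tag_data_packet(src_ip: str, dst_ip: str) -> Optional[str]:
    """Return the domain name for a data packet, or None if there is no match."""
    # Client-to-server direction: source is the client, destination the server.
    domain = dns_table.get((src_ip, dst_ip))
    if domain is None:
        # Server-to-client direction: the roles are swapped.
        domain = dns_table.get((dst_ip, src_ip))
    return domain


# Example addresses are from documentation ranges and purely illustrative.
on_dns_response("10.0.0.5", "203.0.113.7", "youtube.com")
print(tag_data_packet("10.0.0.5", "203.0.113.7"))  # client -> server
print(tag_data_packet("203.0.113.7", "10.0.0.5"))  # server -> client
print(tag_data_packet("10.0.0.9", "203.0.113.7"))  # another client: no entry
```

Keying on the client and server pair, rather than on the server IP alone, is what lets two clients reach different domains hosted on the same server IP without their packets being mislabelled.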
Implementing the dynamic mapping logic as a module in the data plane
The above logic was proposed in previous work called NetAssay. However, it was never intended to run directly in a data plane. A separate controller with a general-purpose CPU had to run this logic and then install the mapping in the data plane.
Meta4 overcomes this limitation. Meta4 implements the logic directly in the data plane, represented as the ‘Domain — IP mapper’ component in Figure 2. Using Meta4, network operators can first specify a list of domain names they want to track, even using wildcards. Now, using the DNS response packets, Meta4 populates its mapper automatically and tags data packets with the right domain name as packets fly by.
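As a rough illustration of tracking domains with wildcards, the sketch below matches a queried name against an operator-supplied list. The pattern syntax (shell-style `*`), the list contents, and the function name are assumptions made for this example; the exact wildcard semantics Meta4 supports are not described here.

```python
from fnmatch import fnmatchcase
from typing import List, Optional

# Hypothetical operator-supplied list of tracked domains.
TRACKED: List[str] = ["*.youtube.com", "zoom.us", "*.zoom.us"]


def match_tracked(domain: str, patterns: List[str] = TRACKED) -> Optional[str]:
    """Return the first tracked pattern that the domain matches, else None."""
    # DNS names are case-insensitive and may carry a trailing root dot.
    name = domain.lower().rstrip(".")
    for pattern in patterns:
        if fnmatchcase(name, pattern):
            return pattern
    return None


print(match_tracked("www.youtube.com"))
print(match_tracked("Zoom.US."))
print(match_tracked("example.org"))
```

Note that in this sketch `*.youtube.com` does not match the bare name `youtube.com`, which is why the list above carries both `zoom.us` and `*.zoom.us`.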
The default Meta4 application is to count the number of packets and bytes for each tracked domain name, depicted by the orange box in Figure 2. Consequently, a traffic volume report will be returned to the network operator by default. However, this part is easily swappable. As the Meta4 P4 program fits into the ingress pipeline of a PISA-based data plane, it is easy to attach another custom P4 program that performs interesting actions using the domain name information tagged to each packet. Indeed, we implemented two different applications, DNS tunnelling detection and IoT device fingerprinting, using the Meta4 framework as the basis.
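The default counting application can be pictured as a pair of counters per tracked domain. The following is a minimal software sketch under that assumption; the names are hypothetical, and the switch itself would keep these counters in hardware rather than in a Python dictionary.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple

# domain -> [packet count, byte count]
counters: DefaultDict[str, List[int]] = defaultdict(lambda: [0, 0])


def count_packet(domain: Optional[str], packet_len: int) -> None:
    """Update the counters for a packet that was tagged with a domain."""
    if domain is None:
        return  # untagged packets are not counted
    counters[domain][0] += 1
    counters[domain][1] += packet_len


def report() -> List[Tuple[str, int, int]]:
    """Return (domain, packets, bytes) rows, largest byte count first."""
    rows = [(d, c[0], c[1]) for d, c in counters.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


count_packet("youtube.com", 1500)
count_packet("youtube.com", 1500)
count_packet("zoom.us", 200)
count_packet(None, 999)
print(report())
```

Because the counting step only depends on the tag, it could be replaced by any other action keyed on the domain name, which is the sense in which this part is swappable.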
However, there are challenges when trying to make this logic fit and run in a data plane. PISA-based switch hardware imposes several restrictions that make implementing Meta4 difficult.
Dealing with domain name parsing
PISA switch parsers in real hardware are designed to parse fixed-length header fields, which makes dealing with variable-length domain names difficult. Luckily, each label in a domain name is preceded by a length octet, which the switch can parse. As shown in Figure 3, we leverage this length prefix and use a combination of different parser states to cover all the bits in each label.
Currently, Meta4 can parse up to four labels, and up to 15 bytes or characters for each label. This configuration was able to cover most of the DNS response packets on our campus. Of course, it is possible to use a different configuration for each network.
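To make the length-prefixed format concrete, here is a small software sketch that walks a DNS name label by label and gives up when the name exceeds the limits mentioned above (four labels, 15 bytes each). It only mirrors the idea; the function name is hypothetical, and a hardware parser would express this as a chain of parser states rather than a loop. It also assumes an uncompressed name, as is usual in the question section.

```python
from typing import List, Optional

MAX_LABELS = 4      # limits taken from the configuration described above
MAX_LABEL_LEN = 15


def parse_qname(data: bytes, offset: int = 0) -> Optional[List[str]]:
    """Parse an uncompressed DNS name; return None if it exceeds the limits."""
    labels: List[str] = []
    while True:
        if offset >= len(data):
            return None  # ran out of bytes before the terminating zero octet
        length = data[offset]
        offset += 1
        if length == 0:
            return labels  # a zero-length octet terminates the name
        # Compression pointers (0xC0 and above) also fail this length check.
        if length > MAX_LABEL_LEN or len(labels) == MAX_LABELS:
            return None
        if offset + length > len(data):
            return None  # label runs past the end of the buffer
        label = data[offset:offset + length]
        labels.append(label.decode("ascii", errors="replace"))
        offset += length


print(parse_qname(b"\x03www\x07youtube\x03com\x00"))
```

Names that do not fit simply go unparsed in this sketch, which corresponds to the trade-off of choosing a configuration that covers most, but not all, DNS responses on a given network.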
Dealing with limited memory in the data plane
Meta4 uses register arrays to store the domain name information for a given client-server IP pair. Later, this is used to tag an incoming data packet that has a matching source and destination IP pair with the domain name. However, PISA switches have limited memory, so it’s important to manage this limited memory well.
First, we use a multi-stage register data structure for storing the client IP, server IP, and domain information. Using two or more stages, each with a different salt when hashing, generally results in fewer hash collisions than using a single large hash table. This data structure is shown in Figure 4.
Secondly, kicking out stale entries that have not been recently used in the hash table frees up some space. To keep track of fresh and stale entries, we add a timestamp value when storing entries in the hash table. Whenever a DNS or data packet matches on an entry, Meta4 updates the timestamp value. When a hash collision occurs, Meta4 replaces the existing entry with the one for the current packet if the existing entry's timestamp is too old.
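The multi-stage idea can be sketched as several small arrays, each indexed by a differently salted hash of the same key. The class below is an illustrative model only: the class and method names, the use of CRC32, and the use of the stage number as the salt are assumptions for this example, not details of the Meta4 implementation.

```python
import zlib
from typing import List, Optional, Tuple

Key = Tuple[str, str]    # (client IP, server IP)
Entry = Tuple[Key, str]  # (key, domain)


class MultiStageTable:
    def __init__(self, num_stages: int = 2, slots_per_stage: int = 1024) -> None:
        self.slots = slots_per_stage
        self.stages: List[List[Optional[Entry]]] = [
            [None] * slots_per_stage for _ in range(num_stages)
        ]

    def _index(self, key: Key, stage: int) -> int:
        # The stage number acts as the salt, so each stage hashes differently.
        raw = f"{stage}|{key[0]}|{key[1]}".encode()
        return zlib.crc32(raw) % self.slots

    def insert(self, key: Key, domain: str) -> bool:
        """Store the entry in the first stage with a free or matching slot."""
        for stage, array in enumerate(self.stages):
            i = self._index(key, stage)
            entry = array[i]
            if entry is None or entry[0] == key:
                array[i] = (key, domain)
                return True
        return False  # the key collided in every stage

    def lookup(self, key: Key) -> Optional[str]:
        for stage, array in enumerate(self.stages):
            entry = array[self._index(key, stage)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None


table = MultiStageTable(num_stages=2, slots_per_stage=1024)
table.insert(("10.0.0.5", "203.0.113.7"), "youtube.com")
print(table.lookup(("10.0.0.5", "203.0.113.7")))
```

A key that collides in the first stage gets a second chance at an unrelated slot in the next stage, which is why this layout tends to lose fewer entries than one table of the same total size.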
Choosing a good timeout value is important. If the timeout value is too big, we will not kick out stale entries quickly enough. If it is too small, we will prematurely kick out entries that might have some ongoing traffic. Against the real campus trace from Princeton University, around 100 seconds gave us a good balance.
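The timestamp-based replacement policy can be modelled as follows. This is a single-stage sketch for brevity, with hypothetical names and an explicit `now` argument so that the behaviour is easy to follow; the 100-second constant is the value reported above for the Princeton trace, not a universal recommendation.

```python
import zlib
from typing import List, Optional, Tuple

TIMEOUT_SECONDS = 100.0

Key = Tuple[str, str]  # (client IP, server IP)
# Each slot holds (key, domain, last_seen) or None when empty.
Slot = Optional[Tuple[Key, str, float]]


class TimedTable:
    def __init__(self, size: int = 1024) -> None:
        self.slots: List[Slot] = [None] * size

    def _index(self, key: Key) -> int:
        return zlib.crc32(f"{key[0]}|{key[1]}".encode()) % len(self.slots)

    def insert(self, key: Key, domain: str, now: float) -> bool:
        """Store an entry, evicting a colliding occupant only if it is stale."""
        i = self._index(key)
        slot = self.slots[i]
        if slot is None or slot[0] == key or now - slot[2] > TIMEOUT_SECONDS:
            self.slots[i] = (key, domain, now)
            return True
        return False  # the occupant is still fresh, so it is kept

    def lookup(self, key: Key, now: float) -> Optional[str]:
        """Return the domain for a matching entry and refresh its timestamp."""
        i = self._index(key)
        slot = self.slots[i]
        if slot is not None and slot[0] == key:
            self.slots[i] = (key, slot[1], now)
            return slot[1]
        return None
```

With a one-slot table, an entry inserted at time 0 and matched again at time 90 survives a colliding insert at time 150, but is replaced by one at time 200, since more than 100 seconds have then passed since it was last used.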
Running Meta4 against real campus traffic
For evaluation, we compiled and loaded Meta4 in Intel’s Tofino programmable switch. We evaluated Meta4 using Princeton’s P4Campus infrastructure, which allows us to deploy and run our P4 apps against real traffic from our campus network. All packet traces were inspected and sanitized by a network operator to remove all personal data before being accessed by researchers.
Meta4 was able to identify and output traffic statistics on popular domain names visited by our campus network clients. This includes popular gaming platform websites such as Steam and globally popular websites like Facebook, Instagram, and YouTube. Voice and video conference call platforms like Skype/Microsoft Teams and Zoom were also near the top of our list.
Using Meta4, we also implemented and tested a DNS tunnelling detection system as well as an IoT device fingerprinting application; both use domain name information to perform their tasks. We envision many other applications could make use of domain-based monitoring in the data plane. For instance, to improve network performance, a network operator could allow traffic from certain trusted domains to bypass processing by an intrusion detection system (IDS). Similarly, a network operator could create firewall rules based on a domain name to redirect or drop packets associated with certain untrusted domains.
Hyojoon (Joon) Kim is an Associate Research Scholar in the Computer Science department at Princeton University. His current research focuses on building better network systems, applications, and tools.