Improving packet capture performance: processing

By Bill Stearns on 9 Nov 2020

Category: Tech matters




In my last post, I examined how the common packet capture tools tcpdump, Zeek and Wireshark can read packets from network interfaces while avoiding excessive packet drops. Today’s post looks at how to reduce the packet processing these tools do.

Turn off DNS resolution

Packets have both a source and destination IP address. Since these aren’t particularly human-friendly, some tools place a DNS query to turn the IP address into a hostname for display. Here’s how tcpdump looks when I’m not looking up hostnames:

tcpdump -i en4 -qtnp -c 1 'host www.whitehouse.gov'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en4, link-type EN10MB (Ethernet), capture size 262144 bytes
IP 10.0.0.41 > 104.64.214.98: ICMP echo request, id 11503, seq 0, length 64
1 packet captured
82 packets received by filter
0 packets dropped by kernel

Here’s the same ping packet, but with hostnames resolved:

tcpdump -i en4 -qtap -c 1 'host www.whitehouse.gov'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en4, link-type EN10MB (Ethernet), capture size 262144 bytes
IP ME41 > a104-64-214-98.deploy.static.akamaitechnologies.com: ICMP echo request, id 11503, seq 0, length 64
1 packet captured
51 packets received by filter
0 packets dropped by kernel

What this doesn’t show is the time it takes: each lookup can take anywhere from milliseconds to multiple seconds. Yes, caching results makes it faster, but this can still be the biggest unnecessary waste of time in a packet sniffer. Not only did we slow down the capture in a way that’s almost guaranteed to drop packets, but we also got back a result that’s not useful; ‘a104-64-214-98.deploy.static.akamaitechnologies.com’ doesn’t even tell us that we were trying to ping www.whitehouse.gov.

In short, turn off DNS lookups!

tcpdump

On your tcpdump/windump command line, add ‘-n’ to disable DNS lookups. It will still display the raw IP addresses but won’t try to turn them into hostnames.
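If hostnames are still wanted for a report, one option is to leave lookups off during the capture and collect each address once afterwards, so any resolution happens later and only once per address. The following is a minimal sketch of that idea, not something tcpdump provides: ‘unique_ips’ is a hypothetical helper name, it only handles IPv4, and the pattern does not check that each octet is 255 or less.

```shell
#!/bin/sh
# Sketch: list each IPv4 address seen in `tcpdump -n` text output exactly once,
# so any name lookups can happen after the capture, one per address.
# unique_ips is a hypothetical helper name, not part of tcpdump.
unique_ips() {
    # Reads tcpdump text lines on stdin; prints sorted, de-duplicated dotted quads.
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
}

# Example (commented out because it needs capture privileges):
#   tcpdump -i en4 -qtnp -c 100 | unique_ips
```

The resulting list is usually far shorter than the packet stream, so resolving it afterwards costs a handful of lookups rather than one per packet.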

Zeek

Zeek does not look up hostnames; it displays IP addresses by default.

Wireshark

Under View/Name Resolution, disable DNS lookups by unchecking ‘Resolve Network Addresses’:

Figure 1 — The Wireshark menu.

You can also disable this at capture time: go to Capture, Options, click the Options tab in that window, and uncheck ‘Resolve network names’ there:

Figure 2 — The Wireshark capture interface options.

Turn off port resolution

If you’re already disabling DNS lookups, it’s probably worth disabling port number to port name lookups as well. These aren’t anywhere near as costly as DNS lookups, which can take milliseconds to multiple seconds; they simply look up port numbers in /etc/services and retrieve the corresponding names. That said, they still take time, and occasionally they’re wrong: port 5432 as a server port is postgresql, but if it’s a randomly chosen client port in an ssh connection, the output lines for that connection appear to say that postgresql is talking to ssh.
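To make the mislabeling concrete, here is a rough sketch of what such a lookup amounts to. ‘port_name’ is a hypothetical helper, not the code any of these tools actually use, and it assumes a services-format file with the name in the first field and ‘port/protocol’ in the second.

```shell
#!/bin/sh
# Sketch of a port-number-to-name lookup in a services-format file.
# port_name is a hypothetical helper; the file defaults to /etc/services.
port_name() {
    port=$1; proto=$2; file=${3:-/etc/services}
    # Field 1 is the service name, field 2 is "port/protocol"; skip comment lines.
    awk -v key="$port/$proto" '$1 !~ /^#/ && $2 == key { print $1; exit }' "$file"
}

# Example: port_name 5432 tcp
```

The lookup only ever sees the number. It has no idea which end of the connection is the server, which is why a client that happens to pick port 5432 gets labelled postgresql.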

tcpdump

To disable port number lookups as well, replace

-n

with

-nn

Both DNS and port number lookups will be disabled.

Zeek

Zeek does not do external port number lookups. Any port names come out of the Dynamic Protocol Detection module.

Wireshark

Under View/Name Resolution, uncheck ‘Resolve Transport Addresses’ (see the example in the Wireshark section above).

You can also disable this at capture time: go to Capture, Options, click the Options tab in that window, and uncheck ‘Resolve transport names’ there (see the example in the Wireshark section above).

Removing unneeded processing

No matter what sniffer you use, there’s a certain amount of CPU time needed to process each packet. The goal here is to minimize that average processing time per packet.

Here are the general categories of processing a sniffer might do. Entries at the top are generally lightweight. As you move further down the list and include more of these, the amount of processing and memory usage goes up dramatically:

  • Breaking up the packet into fields of interest (not optional; may be significant processing time)
  • IP address
  • Protocol and port (including flags)
  • Keeping track of TCP connection and UDP conversation state (including ICMP errors)
  • DNS lookups
  • Reassembling the TCP connections
  • Inspecting the packet payload (actual content being transmitted)
  • Extracting user content out of the payload (such as downloaded files)
  • Decrypting payload content

Take a look at the output provided by your sniffer. What parts do you need? If you’re only interested in IP address and port information, there’s no need to enable extra modules that reassemble TCP conversations, check for TLS certificate errors, and save each downloaded file to disk.

Consult the documentation for your particular tool. What command line or configuration file options can you use to disable processing whose output you don’t need?

tcpdump

In addition to the ‘-n’ and ‘-nn’ options we mentioned earlier to disable DNS lookups or DNS and port lookups, respectively, there are a few more you can use to minimize the work done. Try these one at a time to see what information is removed with each change.

First, remove any ‘-v’ command-line options. These turn on verbose processing that you may not want. If you have more than one, remove one at a time.

Once the ‘-v’ options are gone, add ‘-q’ to print less about the actual protocol, and ‘-t’ to disable timestamps.

Zeek

Zeek uses a plugin architecture, allowing the user to enable or disable processing tasks by uncommenting or commenting lines, respectively, in local.zeek. Here’s a sample list of some of the heavier processing tasks that you might consider commenting out in that file, by making sure each one you don’t need starts with a ‘#’:

#@load frameworks/software/vulnerable
#@load frameworks/software/version-changes
#@load protocols/ftp/software
#@load protocols/conn/known-hosts
#@load protocols/conn/known-services
#@load protocols/ssh/detect-bruteforcing
#@load protocols/http/detect-sqli
#@load frameworks/files/hash-all-files
#@load frameworks/files/detect-MHR

Any other plugins in the ‘/files/’ directory are probably good candidates too. Obviously, it’s just fine if any of the above are not included in your local.zeek.
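If there are many lines to change, the commenting-out can be scripted. This is only a sketch under a few assumptions: ‘disable_zeek_loads’ is a hypothetical helper, the location of local.zeek differs between installations so it is passed in as an argument, and only lines consisting of exactly ‘@load’ plus the script name are matched.

```shell
#!/bin/sh
# Sketch: comment out selected @load lines in local.zeek.
# disable_zeek_loads is a hypothetical helper; pass the path to your
# local.zeek first, then the script names you want disabled.
disable_zeek_loads() {
    file=$1; shift
    cp "$file" "$file.orig"            # keep one untouched backup
    for script in "$@"; do
        # Prefix an uncommented "@load <script>" line with '#'.
        # Lines that already start with '#' do not match and are left alone.
        sed -E "s|^([[:space:]]*)@load[[:space:]]+${script}[[:space:]]*\$|\1#@load ${script}|" \
            "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    done
}

# Example (path is a placeholder):
#   disable_zeek_loads /path/to/local.zeek frameworks/files/hash-all-files
```

Comparing the file with its ‘.orig’ copy afterwards shows exactly which lines changed, and makes it easy to revert.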

Wireshark

First, make sure you’ve followed the instructions for Wireshark in the other sections of this blog series; they’re the settings that largely apply to the task of capturing packets.

Wireshark has a number of additional processing tasks that can be performed that cover most of the above list. That said, the heavy processing ones are done after the capture is complete, so there are no additional steps to disable during capture beyond the ones you’ve already done.

Multiprocessing

Multiprocessing refers to using more than one physical processor core to handle the required work. That work includes: 1) The kernel collecting packets from a network interface and publishing them to any running sniffers; and 2) The sniffers processing those packets.

Step 1 is straightforward on Linux; the operating system kernel is already written to work on multiple processors. The quick task of pulling the packet off the network interface and the slightly longer tasks of deciding where to send it (routing) and whether to block it (firewalling) are already independent steps that can run on any available processor. This is why adding more and faster processor cores to a Linux system under heavy packet load is a good idea even when the sniffing tasks are light.

Unfortunately, there’s no single answer to “Can I run my sniffer on multiple processors simultaneously?” The answer depends on how the tool was written. The simplest way to program is to write a tool that only runs on a single processor, and many tools never go past that point. The more complex way to write a multiprocessing program includes starting multiple processes, having one or more listen on network interfaces or pcap files, having them queue up the captured packets, having other processes retrieve them from the queues for processing, and having one or more output programs for displaying or saving results. This is a significant amount of added complexity – I speak from experience.

Side note: there’s a related technology called multithreading. With this, a program is broken up into small components (like ‘read a packet’, ‘summarize TCP connections’, ‘save packet to disk’, and ‘report statistics’) that run as threads inside a single process rather than as separate processes. Whether those threads can spread across cores depends on the language and runtime: in some (CPython, with its global interpreter lock, is a well-known example), only one thread executes program code at a time, so the sniffer is effectively limited to a single processor core. A multithreaded sniffer built that way can make full use of that one core but not the others, which offers very limited growth potential. For this reason we won’t cover it here.

tcpdump

tcpdump is written to run on a single processor. To spread the work out to multiple tcpdump workers on a multiprocessor system, give each one a unique filter for the traffic they’ll inspect. For example, to watch the traffic coming into and out of an HTTP server, we could start one copy of tcpdump that looks at http traffic:

tcpdump -i eth0 -qtnp -w tcp_80.pcap 'tcp port 80'

And another copy that looks at https traffic:

tcpdump -i eth0 -qtnp -w tcp_443.pcap 'tcp port 443'

Then a third that looks at all other TCP traffic:

tcpdump -i eth0 -qtnp -w tcp_other.pcap 'tcp and (not port 80) and (not port 443)'

And finally a fourth that looks at all non-tcp traffic:

tcpdump -i eth0 -qtnp -w not_tcp.pcap 'not tcp'

While this is a somewhat manual process to set up, it actually is quite efficient at spreading out the load over the available processors if you have enough independent copies of tcpdump.
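The four commands above can be collected into one small launcher. The sketch below assumes the same interface and file names as the examples. ‘start_capture’, ‘start_all’, ‘IFACE’ and ‘DRY_RUN’ are names made up for illustration. By default it only prints the commands so the filters can be checked for overlaps or gaps before anything is started.

```shell
#!/bin/sh
# Sketch: start the four tcpdump copies described above in the background.
# IFACE and DRY_RUN are assumptions for illustration; DRY_RUN=1 (the default
# here) only prints the commands so the filters can be reviewed first.
IFACE=${IFACE:-eth0}
DRY_RUN=${DRY_RUN:-1}

start_capture() {
    # $1 = output pcap file, $2 = capture filter for this worker
    if [ "$DRY_RUN" = 1 ]; then
        printf "tcpdump -i %s -qtnp -w %s '%s' &\n" "$IFACE" "$1" "$2"
    else
        tcpdump -i "$IFACE" -qtnp -w "$1" "$2" &
    fi
}

start_all() {
    start_capture tcp_80.pcap    'tcp port 80'
    start_capture tcp_443.pcap   'tcp port 443'
    start_capture tcp_other.pcap 'tcp and (not port 80) and (not port 443)'
    start_capture not_tcp.pcap   'not tcp'
    # When really capturing, wait until every background tcpdump has exited.
    [ "$DRY_RUN" = 1 ] || wait
}

# Call start_all to print the commands; set DRY_RUN=0 first to run them.
```

Because each copy is a separate process, the operating system is free to schedule them on different cores. The filters need to be mutually exclusive and together cover all traffic, otherwise packets are saved twice or not at all.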

Zeek

Zeek can use multiple processors. The file that defines Zeek’s use of multiple processors is node.cfg. In this file, add a section (or edit the existing section) for each interface on which you want to sniff:

[worker-WorkerNumber]
type=worker
host=127.0.0.1
interface=InterfaceName
lb_method=pf_ring
lb_procs=CoresPerInterface

You should have one block for each sniffable interface, whose name goes in line 4 of the block (such as ‘interface=eth12’). ‘WorkerNumber’ should start at 1 and go up by 1 so each block has a unique label. Finally, ‘CoresPerInterface’ should be a fixed number and the same value for each interface: (NumberOfProcessors – 2) / NumberOfSniffedInterfaces. If this value is less than 1, set it to 1.
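As a worked example of that formula, the sketch below computes the value and prints one worker block per interface. ‘cores_per_interface’ and ‘print_workers’ are hypothetical helper names. Shell arithmetic is integer-only, so the result is rounded down, which is an assumption on my part since the formula doesn’t say how to round. At least one interface must be given.

```shell
#!/bin/sh
# cores_per_interface is a hypothetical helper implementing
# (NumberOfProcessors - 2) / NumberOfSniffedInterfaces, with a minimum of 1.
# Integer division rounds down (an assumption; the formula does not say).
cores_per_interface() {
    procs=$1; ifaces=$2
    cores=$(( (procs - 2) / ifaces ))
    [ "$cores" -ge 1 ] || cores=1
    echo "$cores"
}

# Print one node.cfg worker block per interface name given after the
# processor count, using the block layout shown above.
print_workers() {
    procs=$1; shift
    cores=$(cores_per_interface "$procs" "$#")
    n=1
    for iface in "$@"; do
        printf '[worker-%s]\ntype=worker\nhost=127.0.0.1\ninterface=%s\nlb_method=pf_ring\nlb_procs=%s\n\n' \
            "$n" "$iface" "$cores"
        n=$((n + 1))
    done
}

# Example: print_workers 16 eth0 eth1
```

With 16 processors and two interfaces this gives (16 – 2) / 2 = 7 for each block. On Linux, the processor count can usually be taken from ‘nproc’ instead of being typed in by hand.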

Wireshark

Wireshark can only use a single processor.

This blog post is the second in a series of three entries and was adapted from a post on the Active Countermeasures blog.

Bill Stearns has been working in the network security community for decades, contributing open source software and training.


The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
