Cache me outside: A new look at DNS cache probing

By Arian Akhavan Niaki on 22 Jun 2021

Category: Tech matters


Many connections on the Internet rely on the DNS protocol to resolve a domain name into a set of IP addresses. For performance reasons, DNS resolvers typically have a cache of recently resolved domain names that is shared among all the resolver’s users. However, this shared state exposes a side channel by which a user of a resolver can figure out if another user has issued a query for a specific domain name.

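This side channel can be exploited by a process called DNS cache snooping, or probing, which involves sending DNS queries with the ‘recursion desired’ (RD) flag set to zero (as specified by RFC 1035) and observing the Time To Live (TTL) in the DNS response, or the lack of a response. A resolver that respects RD = 0 answers only from its cache, so an answer shows that the name was recently resolved, and the remaining TTL shows how long ago.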
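To make the RD flag concrete, here is a minimal Python sketch that builds a raw DNS query with the flag cleared, following the header layout in RFC 1035. The function names, the query ID, and the use of example.com are illustrative assumptions; this is not code from our tooling.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels (RFC 1035, section 3.1)."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"  # the root label terminates the name

def build_query(name: str, query_id: int, recursion_desired: bool) -> bytes:
    """Build a DNS query for an A record in class IN."""
    # RD is the 0x0100 bit of the 16-bit flags word; every other flag stays 0.
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question

# A cache probe: same question as a normal lookup, but with RD = 0.
probe = build_query("example.com", query_id=0x1234, recursion_desired=False)
```

Sending `probe` over UDP to port 53 of a resolver and reading the TTL from any answer is all a single probe involves; the rest of this post is about finding resolvers that will accept it.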

Figure 1 — An example scenario of DNS cache probing (snooping).

Previously, to perform DNS cache probing, researchers assumed direct access to DNS servers or used open DNS resolvers. Nowadays, however, most DNS resolvers do not respond to queries from outside their network. In this post, we will discuss our approach, which leverages DNS forwarders in ISP networks to perform DNS cache probing.

DNS forwarders and how to locate them

Consumer NAT/gateway devices include a DNS forwarder so they can provide a DHCP lease (which requires specifying the DNS resolver’s IP address) to clients before the gateway itself obtains a DHCP lease. These DNS forwarders are intended to respond only to DNS queries from within the LAN. However, many are improperly firewalled and will also forward external DNS requests to the ISP’s recursive resolver. Figure 2 depicts the process of locating DNS forwarders, which is described below.

Figure 2 — The high-level process of locating DNS forwarder and resolver pairs.

Step 1: Scanning the Internet’s IPv4 space for DNS resolvers

We began by extracting the results of a DNS scan from the Censys dataset every day. These scans send a recursive DNS query for a scan domain to the entire IPv4 address space. The nameserver of this scan domain will always return the fixed IP address of the scan domain and the source IP from which the nameserver received the DNS query.

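Next, we filtered out IPs from the Censys results that did not respond correctly. An IP responded correctly if it answered Censys’ DNS question with exactly two answers, one of which is the control (the fixed IP address of the scan domain); the other is the source IP from which the scan domain’s nameserver received the query, that is, the resolver’s address.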

Finally, we excluded shared DNS services, such as Google’s 8.8.8.8, by including only those IPs that are in an Autonomous System (AS) categorized by CAIDA’s AS Classification dataset as ‘Access/Transit’, and that responded with a resolver address that is also in an Access/Transit AS. We excluded shared DNS services because their users may be globally distributed, making location inference challenging.
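The Step 1 filter can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: `CONTROL_IP` stands in for the scan domain's fixed address, `ScanResult` is a hypothetical record of one scan response, and `as_type_of` is a hypothetical lookup that maps an IP address to its CAIDA AS class.

```python
from dataclasses import dataclass

CONTROL_IP = "192.0.2.1"  # hypothetical fixed address returned for the scan domain

@dataclass(frozen=True)
class ScanResult:
    forwarder_ip: str   # the address that answered the scan
    answers: tuple      # IP addresses found in the answer section

def resolver_ip(result, control_ip=CONTROL_IP):
    """Return the resolver address if the response is well formed, else None."""
    # Exactly two answers, exactly one of which is the control address.
    if len(result.answers) != 2 or result.answers.count(control_ip) != 1:
        return None
    return next(ip for ip in result.answers if ip != control_ip)

def keep(result, as_type_of, control_ip=CONTROL_IP):
    """Keep a forwarder only if it and its resolver both sit in Access/Transit ASes."""
    resolver = resolver_ip(result, control_ip)
    if resolver is None:
        return False
    return (as_type_of(result.forwarder_ip) == "Access/Transit"
            and as_type_of(resolver) == "Access/Transit")
```

With a toy lookup table, a forwarder that relays to 8.8.8.8 is dropped because the resolver's AS is not Access/Transit, while one that relays to an ISP resolver is kept.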

Step 2: Determining suitable resolvers for cache probing

In determining suitable resolvers for cache probing, we were only interested in DNS forwarders that forward to resolvers that respect the RD = 0 flag, that is, resolvers that do not recursively resolve every request sent to them. To find them, we performed our own scanning to filter the list of IPs from the previous step. We set up our own scan domain with a nameserver configured identically to the Censys scan domain, and kept only the IPs that met all three of the following conditions.

  1. IPs that respond to a DNS query with RD = 0 to unique subdomains of our scan domain with 0 answers.
  2. IPs that respond to a recursive DNS query to unique subdomains of our scan domain with a resolver address (which does not belong to shared DNS services) and an approximately full TTL value.
  3. IPs that respond to at least one RD = 0 request for google.com with an IP in Google’s AS.
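The three conditions above can be combined into a single predicate. The sketch below is an illustration under stated assumptions: `ForwarderChecks` is a hypothetical summary of what one forwarder returned, shared DNS services are represented as a plain set of addresses, and `ttl_slack` is an assumed tolerance for what counts as an "approximately full" TTL.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ForwarderChecks:
    rd0_unique_answer_count: int          # answers to an RD = 0 query for a fresh subdomain
    recursive_resolver_ip: Optional[str]  # resolver address returned by our nameserver
    recursive_ttl: Optional[int]          # TTL seen on the recursive response
    rd0_google_in_google_as: bool         # an RD = 0 google.com query returned a Google IP

def is_well_behaved(c, authoritative_ttl, shared_resolvers, ttl_slack=5):
    """Apply the three Step 2 conditions to one forwarder."""
    # 1. A name nobody has asked for must not be resolved when RD = 0.
    respects_rd0 = c.rd0_unique_answer_count == 0
    # 2. A recursive query must reach a non-shared resolver and come back
    #    with a TTL close to the full authoritative value.
    forwards = (
        c.recursive_resolver_ip is not None
        and c.recursive_resolver_ip not in shared_resolvers
        and c.recursive_ttl is not None
        and authoritative_ttl - ttl_slack <= c.recursive_ttl <= authoritative_ttl
    )
    # 3. A popular cached name must still be answered when RD = 0.
    return respects_rd0 and forwards and c.rd0_google_in_google_as
```

The first and third conditions work as a pair: one shows the resolver stays silent for uncached names, the other shows it does answer RD = 0 queries from its cache.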

We considered DNS forwarders that met the criteria set out in Step 1 and Step 2 to be ‘well behaved’. Table 1 shows how many forwarders passed each phase of our filtering process on seven consecutive days in October 2020.

Forwarders filtered       5 Oct    6 Oct    7 Oct    8 Oct    9 Oct   10 Oct   11 Oct
Filtered Censys scan    811,914  814,863  817,935  823,345  790,313  793,807  811,783
RD = 0 check            468,882  450,421  434,773  426,936  461,981  444,785  426,350
Forward check           311,140  295,560  282,458  277,183  307,889  293,075  276,150
Google check            246,710  233,441  223,014  218,417  244,032  230,042  216,049

Table 1 — Number of DNS forwarders passing each stage of our filtering process, 5-11 October 2020.
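To read Table 1 as a funnel, the short Python sketch below computes how many forwarders survive each stage, using the 5 October column. The helper name `retention` is ours and purely illustrative.

```python
# Counts for 5 October 2020, copied from Table 1.
stages = [
    ("Filtered Censys scan", 811_914),
    ("RD = 0 check", 468_882),
    ("Forward check", 311_140),
    ("Google check", 246_710),
]

def retention(stages):
    """Yield (stage, share of the previous stage, share of the first stage)."""
    first = stages[0][1]
    previous = first
    for name, count in stages:
        yield name, count / previous, count / first
        previous = count

for name, step_share, overall_share in retention(stages):
    print(f"{name:<22} {step_share:6.1%} of previous, {overall_share:6.1%} overall")
```

On that day, roughly 30% of the forwarders in the filtered Censys scan passed all three checks.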

We repeated the measurements to better understand the behaviour of resolvers that a forwarder may use for DNS resolution. Table 2 presents the breakdown of DNS forwarders and resolvers we have access to across continents. Our dataset allowed us to access at least three DNS backends in 84% of the economies (188) and at least two ASNs in 74% of the economies (188).

Continent            Africa     Asia   Europe  North America  South America  Oceania/Australia
All Forwarders       66,626  531,867  392,148        263,730        120,505             14,988
After filtering       7,890   63,411   87,826        137,341         17,337              4,883
Resolvers               419    2,609    7,545          5,671          2,238                475
Resolver Economies       42       40       48             32             12                 14
Resolver ASes           152      550    2,347          1,095            624                137

Table 2 — Number of DNS forwarders and the number of economies and ASes on each continent where we have access to DNS resolvers (aggregated over a week).

Probing DNS forwarders

We developed our DNS cache probing tool, dmap, which sends a DNS query packet for each domain name to each active forwarder, every interval seconds. Meanwhile, dmap listens for DNS responses and discards those that contain error codes, no answers, or no answers for the exact domain name in the question. The TTL values in the remaining responses allow us to infer the date and time when the domain name was added to the DNS cache: subtracting the response TTL from the record’s authoritative TTL (measured by a direct query to the domain’s authoritative nameserver) gives the age of the cache entry, and subtracting that age from the time of the response gives the time of insertion.
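The timestamp inference is simple arithmetic, sketched below in Python. This is not dmap's code: the `Response` record and both function names are hypothetical, and the guard against a response TTL larger than the authoritative TTL is our assumption about how to treat resolvers that rewrite TTLs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

@dataclass
class Response:
    received_at: datetime
    rcode: int        # 0 means no error
    question: str     # the domain name that was asked for
    answers: list     # (owner name, remaining TTL in seconds) pairs

def remaining_ttl(resp) -> Optional[int]:
    """Return the TTL of the answer matching the question, or None to discard."""
    if resp.rcode != 0:
        return None  # error code: discard
    wanted = resp.question.rstrip(".").lower()
    for name, ttl in resp.answers:
        if name.rstrip(".").lower() == wanted:
            return ttl
    return None  # no answers, or none for the exact name: discard

def inferred_cache_time(resp, authoritative_ttl) -> Optional[datetime]:
    """Estimate when the record entered the resolver's cache."""
    ttl = remaining_ttl(resp)
    # Assumption: a TTL above the authoritative value cannot be dated.
    if ttl is None or ttl > authoritative_ttl:
        return None
    age = timedelta(seconds=authoritative_ttl - ttl)
    return resp.received_at - age
```

For example, a response received at 12:00:00 with 240 seconds remaining on a record whose authoritative TTL is 300 seconds implies the record was cached at 11:59:00.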

During our DNS cache probing, we continually validate that the forwarders still respect the RD = 0 flag and do not forward to shared DNS services. At any given time, dmap tries to have two active forwarders for each resolver. If a forwarder goes offline, or is detected misbehaving, dmap removes it from the active forwarders list and, for each resolver associated with this forwarder, activates an additional forwarder from its list that talks to the same resolver. To ensure that new forwarders are discovered in a timely fashion, we reprocess the latest Censys scan results and reload them into dmap every day.
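The bookkeeping described above might look like the following Python sketch. The class `ForwarderPool` and its methods are hypothetical names chosen for illustration, and the ordering of standby forwarders is an assumption; dmap's actual data structures are not described in this post.

```python
class ForwarderPool:
    """Keep up to `target` active forwarders per resolver."""

    def __init__(self, candidates, target=2):
        # `candidates` maps a resolver IP to an ordered list of forwarder IPs.
        self.target = target
        self.standby = {r: list(fs) for r, fs in candidates.items()}
        self.active = {r: [] for r in candidates}
        for resolver in candidates:
            self._top_up(resolver)

    def _top_up(self, resolver):
        # Promote standby forwarders until the target is met or none remain.
        while len(self.active[resolver]) < self.target and self.standby[resolver]:
            self.active[resolver].append(self.standby[resolver].pop(0))

    def retire(self, forwarder):
        """Drop an offline or misbehaving forwarder and activate replacements."""
        for resolver, forwarders in self.active.items():
            if forwarder in forwarders:
                forwarders.remove(forwarder)
                self._top_up(resolver)
        # Make sure the retired forwarder is never promoted later.
        for forwarders in self.standby.values():
            if forwarder in forwarders:
                forwarders.remove(forwarder)
```

Keeping more than one active forwarder per resolver means a single gateway going offline does not interrupt the view of that resolver's cache.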

Figure 3 —The validation process of dmap using RIPE Atlas nodes and our scan domain.

Validation

As shown in Figure 3, we performed a two-part ground truth experiment using 1,000 RIPE Atlas nodes across 106 economies. The nodes sent recursive DNS queries to a single subdomain of our scan domain once per hour with random start times for 72 hours. At the same time, dmap probed for the same subdomain across 16,000 forwarders for a period of 26 hours. During our experiment, only 1,473 unique forwarders returned an answer (that is, they contacted a resolver that had received a query for our scan subdomain). These forwarders used 1,247 unique resolvers in 64 economies.

We cross-checked the timestamps inferred from our DNS cache probing results against two sources of ground truth: timestamps from our DNS servers’ logs, which show when a resolver contacted our nameserver, and timestamps from RIPE Atlas measurement logs. In both cases, we found that our inferred timestamps were accurate to within five seconds for 97% of the resolvers.
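A comparison of this kind can be sketched in Python as follows. The function names are hypothetical, timestamps are assumed to be seconds since the epoch, and counting a resolver as accurate only when all of its inferred timestamps fall within the tolerance is our simplifying assumption rather than the paper's exact definition.

```python
from bisect import bisect_left

def nearest_gap(inferred, truth_sorted):
    """Distance in seconds from `inferred` to the closest ground-truth timestamp."""
    i = bisect_left(truth_sorted, inferred)
    # Only the neighbours on either side of the insertion point can be closest.
    neighbours = truth_sorted[max(i - 1, 0):i + 1]
    return min(abs(inferred - t) for t in neighbours)

def share_within(inferred_by_resolver, truth_by_resolver, tolerance=5.0):
    """Share of resolvers whose inferred timestamps all fall within `tolerance` seconds."""
    accurate = 0
    for resolver, inferred in inferred_by_resolver.items():
        truth = sorted(truth_by_resolver.get(resolver, []))
        if truth and all(nearest_gap(t, truth) <= tolerance for t in inferred):
            accurate += 1
    return accurate / len(inferred_by_resolver) if inferred_by_resolver else 0.0
```

Here `truth_by_resolver` would hold either the nameserver log timestamps or the RIPE Atlas timestamps, so the same function serves both halves of the cross-check.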

In summary, we developed a system that uses DNS forwarders to perform DNS cache probing, then applied our technique to localize website filtering appliances sold by Netsweeper and to track the global proliferation of stalkerware. As shown in the paper, we were able to discover devices in ASNs that OONI and Censys had failed to detect, and we observed a regionality effect in the usage of stalkerware apps across the world. For example, apps in the Russian language are more prevalent in Russia and Ukraine.

For more technical detail on our study, the full paper and our presentation at the 2021 Passive and Active Measurement Conference (PAM 2021) are available online.

Arian Akhavan Niaki is a PhD candidate at the University of Massachusetts Amherst. His research interests broadly lie in the areas of Internet measurement and computer networking.

This is joint work with Bill Marczak, Sahand Farhoodi, Andrew McGregor, Phillipa Gill, and Nicholas Weaver.


The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
