The Domain Name System (DNS) has mapped human-readable domain names to numeric IP addresses on the Internet since 1984. It serves almost every request for Internet services such as web browsing, email exchange, video streaming, and cloud computing.
Given the important role of the DNS, organizations and network service providers often apply fewer restrictions to it than to other protocols such as HTTP and SSH (for example, by allowing DNS traffic by default). Such loose controls over the DNS not only benefit legitimate users but also give malicious actors a stealthy way to communicate with and control their distributed malware for cybercrime.
In this post, I want to cover three typical types of DNS misuse by malware and the current countermeasures that are being used to detect them.
Malware Command-and-Control (C&C) communications over DNS
A networked device can be infected by malware and turned into a bot in many ways, such as through phishing emails or malicious executable programs. Once infected, the device will try to establish connections with its remote master (the malicious actor) to update its status, fetch further malicious scripts, or launch attacks as instructed by its master.
However, connecting to the remote master is not straightforward, especially for malware-infected devices that are distributed across different networks, some of which may have strict security enforcement. If the malware script hard-codes a static IP address or a long-lived DNS name for the remote master server, defenders can extract it and add it to blocklists.
To bypass such mitigation, malicious actors may assign very short-lived domain names to the master server and frequently shift to new ones so that static blocklisting becomes less effective. In practice, such short-lived DNS names are pseudo-randomly generated by a Domain Generation Algorithm (DGA) defined by the malicious actor. Bot devices infected by the malware perform a sequence of DNS lookups for domain names generated by the same algorithm until they reach the domain name in use and obtain the current IP address of the master server.
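To make the lookup sequence concrete, here is a minimal sketch of a toy, date-seeded DGA and the loop a bot would run over its output. The seed string, the `.example` suffix, the hash-to-letters mapping, and the stub resolver are all assumptions chosen for illustration; real DGA families differ widely.

```python
import hashlib
from datetime import date
from typing import Callable, Optional

def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".example") -> list:
    """Derive `count` pseudo-random names from a shared seed and a date (illustrative only)."""
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters a-p so the label is alphabetic.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains

def first_resolving(domains: list, resolve: Callable[[str], Optional[str]]):
    """Try each candidate in order; every miss corresponds to an NXDOMAIN response."""
    for name in domains:
        address = resolve(name)
        if address is not None:
            return name, address
    return None

# Both sides compute the same list; only one name is registered at a time.
candidates = toy_dga("hypothetical-seed", date(2024, 1, 1))
registered = {candidates[3]: "192.0.2.10"}  # stub resolver data, documentation address
print(first_resolving(candidates, registered.get))
```

Because the list is a deterministic function of the seed and the date, a defender who recovers the algorithm can precompute the same candidates, which is one reason DGA families keep changing.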
C&C communications between a malware-infected device and its master server may also be embedded in DNS query names (for example, prefixes) and response contents (for example, TXT resource records) to reduce their chance of being dropped by security middleboxes.
Data exfiltration over DNS
After the establishment of connections with the remote master server, according to the received instructions, a malware-infected device may steal critical files stored on the device (or fetched from other hosts/servers using the credentials of the infected device) and then exfiltrate them to malicious actors sitting outside the network.
To bypass possible on-path security enforcement when exfiltrating sensitive files, malware may split the data into small snippets and encode each into the prefix (the leftmost labels) of a DNS query name. For example, an exfiltration query name may look like [data_snippet_1].fakedomain.com. Security middleboxes such as firewalls may treat those exfiltration packets as legitimate DNS queries and let them reach their destination.
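To show what a monitoring point would actually see on the wire, the following sketch splits a byte string into snippets and places each one in the prefix of a query name. Base32 encoding and the 30-byte snippet size are assumptions made for illustration; the 63-octet label limit comes from the DNS specification.

```python
import base64

MAX_LABEL_LEN = 63  # a single DNS label may not exceed 63 octets

def to_query_names(data: bytes, domain: str = "fakedomain.com", chunk_size: int = 30) -> list:
    """Encode `data` as a list of query names, one snippet per name (illustrative only)."""
    names = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Base32 uses only letters and digits, so the result is a valid label.
        label = base64.b32encode(chunk).decode().rstrip("=").lower()
        if len(label) > MAX_LABEL_LEN:
            raise ValueError("snippet too large for one DNS label")
        names.append(f"{label}.{domain}")
    return names

for name in to_query_names(b"example payload that is longer than one snippet"):
    print(name)
```

Each printed name is a syntactically valid query, which is why inspection has to look at the shape of the prefix rather than at protocol conformance.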
DNS attacks by malware-infected botnets
As instructed by its master, a botnet device may also launch DNS-based reconnaissance scans and denial-of-service (DoS) attacks to congest or paralyse critical (DNS) infrastructure on the Internet. One well-known example is the massive DDoS attack in 2016 that targeted Dyn, a large DNS service provider, and made many popular Internet services unavailable.
These DNS attacks are diverse in their strategies and traffic patterns, and include periodic scans, focused scans, low-rate scans, query floods, and reflective floods (see this paper for more details).
Current countermeasures via DNS packet inspection
The security community has proposed various solutions to detect the above-mentioned malware activities exploiting the DNS, most of which focus on identifying the difference between legitimate and malicious domain names in the transmitted DNS packets.
Here are three suspicious domain name patterns that could be easily identified by inspecting DNS packets.
Randomized domain names
The first pattern relates to C&C communications. Recall that a malware-infected host typically generates many DNS queries for DGA domains until it reaches the one in use. It is therefore not surprising that the host receives a high volume of error responses (with the NXDOMAIN response code) indicating that the queried DGA domains do not exist.
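One simple way to surface this signal is to compute, per client, the share of responses that came back as NXDOMAIN. The sketch below assumes the responses are already available as (client IP, response code) pairs; the input shape and the minimum-sample cut-off are illustrative assumptions, not values from any particular tool.

```python
from collections import Counter

def nxdomain_ratios(responses, min_responses: int = 10) -> dict:
    """Return {client: NXDOMAIN share} for clients with at least `min_responses` responses."""
    totals, failures = Counter(), Counter()
    for client, rcode in responses:
        totals[client] += 1
        if rcode == "NXDOMAIN":
            failures[client] += 1
    return {
        client: failures[client] / totals[client]
        for client in totals
        if totals[client] >= min_responses  # skip clients with too few samples
    }

# Hypothetical observations: one noisy host and one ordinary host.
observed = (
    [("10.0.0.5", "NXDOMAIN")] * 18
    + [("10.0.0.5", "NOERROR")] * 2
    + [("10.0.0.7", "NOERROR")] * 20
)
for client, ratio in nxdomain_ratios(observed).items():
    print(client, round(ratio, 2))
```

A high ratio alone is not proof of infection, since misconfigured software can also produce many failed lookups, so it is best treated as one feature among several.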
Moreover, those DGA domain names often contain highly randomized labels directly under the top-level domain (for example, lzeaeac in lzeaeac.ru). In comparison, legitimate domain names (even mistyped ones) usually follow meaningful patterns that humans can understand (for example, google.com).
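A common first approximation of "randomness" is the Shannon entropy of the characters in the label directly left of the top-level domain (lzeaeac in lzeaeac.ru). The sketch below is only that: the label extraction is naive and ignores multi-part public suffixes such as co.uk, and entropy on short labels is a weak signal that practical detectors combine with other features.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Entropy in bits per character of the character distribution in `text`."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def second_level_label(domain: str) -> str:
    """Naive extraction: the label just left of the last one (assumes a one-label suffix)."""
    labels = domain.lower().rstrip(".").split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]

for domain in ("lzeaeac.ru", "google.com"):
    label = second_level_label(domain)
    print(domain, label, round(shannon_entropy(label), 3))
```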
Long and arbitrary prefixes
The second suspicious pattern relates to DNS exfiltration. As discussed, sensitive data is likely to be encoded into the prefixes of query names, which therefore exhibit long and arbitrary combinations of lowercase letters, uppercase letters, and digits. You can find a visual comparison in Table 1.
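These properties can be turned into a few numeric features per query name. The sketch below treats everything left of the last two labels as the prefix, which is an assumption that breaks for multi-part suffixes; the feature names are also illustrative choices.

```python
def prefix_features(query_name: str, registered_labels: int = 2) -> dict:
    """Describe the prefix of a query name (everything left of the last N labels)."""
    labels = query_name.rstrip(".").split(".")
    prefix_labels = labels[:-registered_labels] if len(labels) > registered_labels else []
    prefix = ".".join(prefix_labels)
    chars = "".join(prefix_labels)  # characters without the separating dots
    digits = sum(ch.isdigit() for ch in chars)
    return {
        "length": len(prefix),
        "longest_label": max((len(label) for label in prefix_labels), default=0),
        "digit_ratio": digits / len(chars) if chars else 0.0,
        "mixed_case": any(ch.islower() for ch in chars) and any(ch.isupper() for ch in chars),
    }

print(prefix_features("www.example.com"))
print(prefix_features("Ab3dE9fGh1Jk2LmN0pQ.fakedomain.com"))
```

Long prefixes with a high digit ratio or mixed case stand out against ordinary host names such as www or mail, although some legitimate services (for example, content delivery networks) also use machine-generated prefixes.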
Irrelevant and repetitive domain names
Third, DNS packets in scans and DDoS attacks may carry domain names irrelevant to the services provided by the victim organization, for example, as simple as ‘.com’, since they are likely to be crafted by malware scripts using a predefined domain list.
Furthermore, my colleagues and I (supervised by Professor Vijay Sivaraman and Dr Hassan Habibi Gharakheili) at the University of New South Wales have empirically observed in two enterprise networks that most malware-infected devices simply use repetitive domain names to launch attacks. See our paper, A Survey on DNS Encryption: Current Development, Malware Misuse, and Inference Techniques.
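Both properties are straightforward to check once query names are collected per source. In the sketch below, the set of zones served by the organization and the sample queries are hypothetical inputs, and "repetition" is measured simply as the share of queries carrying the single most frequent name.

```python
from collections import Counter

def is_relevant(query_name: str, served_zones: set) -> bool:
    """True if the name equals or falls under one of the organization's zones."""
    name = query_name.lower().rstrip(".")
    return any(name == zone or name.endswith("." + zone) for zone in served_zones)

def repetition_share(query_names: list) -> float:
    """Fraction of queries that carry the most frequently queried name."""
    if not query_names:
        return 0.0
    counts = Counter(name.lower().rstrip(".") for name in query_names)
    return counts.most_common(1)[0][1] / len(query_names)

served = {"example.org"}                       # hypothetical victim zones
incoming = ["com"] * 9 + ["www.example.org"]   # hypothetical queries from one source
irrelevant = [q for q in incoming if not is_relevant(q, served)]
print(len(irrelevant), repetition_share(incoming))
```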
A new challenge introduced by DNS encryption
As we can see, many methods for detecting malware DNS activities require the extraction of plaintext domain names in DNS packets. However, the current popular trend of DNS encryption (via protocols such as DoT, DoH, and DoQ) makes DNS payloads uninterpretable by security middleboxes in-network.
To handle this challenge, the research community is starting to explore statistical features that can be computed from unencrypted metadata, such as the temporal patterns of DNS packets from a suspicious host, packet entropy, and the distribution of DNS packet sizes.
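As a rough sketch of what such features might look like, the function below summarizes one host's encrypted DNS packets using only timestamps and sizes, which remain visible without decrypting the payload. The input format and the particular statistics are assumptions for illustration and are not taken from a specific published method.

```python
import statistics

def metadata_features(packets: list) -> dict:
    """Summarize (timestamp_seconds, size_bytes) pairs, assumed sorted by time."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "mean_gap": statistics.fmean(gaps) if gaps else 0.0,
        "gap_stdev": statistics.pstdev(gaps) if gaps else 0.0,   # near 0 suggests periodic traffic
        "mean_size": statistics.fmean(sizes) if sizes else 0.0,
        "size_stdev": statistics.pstdev(sizes) if sizes else 0.0,
    }

# Hypothetical host sending one fixed-size packet every second.
print(metadata_features([(0.0, 100), (1.0, 100), (2.0, 100), (3.0, 100)]))
```

Features like these would typically feed a classifier or anomaly detector rather than a fixed threshold, since benign applications can also be periodic.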
Minzhao Lyu is currently a postdoctoral research associate at the University of New South Wales, Australia.