The Forum of Incident Response and Security Teams (FIRST) holds an annual conference to promote coordination and cooperation among global Computer Security Incident Response Teams (CSIRTs). This year’s conference ran from 26 June to 1 July in Dublin, Ireland. Andrew Cormack visited #FIRSTCON22, and these are his notes on various topics relating to incident response.
Effective threat hunting
Threat hunting is perhaps the least mechanical of security activities. According to Joe Slowik’s presentation, the whole point is to find things that made it past our automated defences. But that doesn’t mean it should rely entirely on human intuition. Our hunting will be much more effective if we think first about which threats it will be most beneficial to find and how we are most likely to find them.
Thoughtful threat hunting requires an understanding of likely adversaries, telemetry and data sources, and the ability to search and query them. Rather than randomly searching for signs of intrusion, threat hunting provides most benefit if it concentrates on the kinds of threats that would cause most harm to the particular organization. Thinking about how those actors are likely to operate, and what their goals might be, should guide us to the services and systems they are most likely to use. Then we can consider what traces they might leave, and what records we might need to find them. If those don’t exist, then we can fill the gaps either by increasing activity logging in specific areas (but not so far that we overload ourselves) or by considering alternative sources that already exist.
For example, a frequent blind spot, mentioned in a number of different talks, is network activity within the organization. Perimeter systems such as firewalls should give good visibility of ingress and egress traffic, but multi-stage threats such as ransomware are more easily detected by their unusual lateral movement between organizational systems. For organizations that identify email fraud as a significant risk, however, email headers are more likely to be the relevant source.
Even with a focus on specific threats and data sources, threat hunters are likely to have a ‘needle in the haystack’ challenge — data sources are too big for humans alone to analyse. So, we need tools to explore individual data sources and, in particular, patterns (or their suspicious absence) across sources. Flexible, exploratory tools are likely to be harder to use effectively than single-purpose searches, so threat hunters need more time to plan and develop their skills. Again, focusing on particular threats can guide this learning to where it will most benefit the organization.
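To make the lateral-movement example concrete, here is a toy hunting query in Python. The flow-record format, the baseline, and the fan-out threshold are all illustrative assumptions, not any product’s schema; the point is that a focused question (‘which internal hosts are suddenly talking to many more internal peers than usual?’) is easy to automate once the right data source exists.

```python
# Toy hunt: flag internal hosts contacting far more internal peers than
# their baseline suggests. Record format and threshold are assumptions
# for illustration only.
from collections import defaultdict

def internal_fanout(flows, baseline=None, threshold=3):
    """flows: iterable of (src_ip, dst_ip) tuples, already filtered to
    internal-to-internal traffic. baseline: dict of src_ip -> typical
    peer count (assumed known from quieter periods); defaults to 1."""
    peers = defaultdict(set)
    for src, dst in flows:
        peers[src].add(dst)
    baseline = baseline or {}
    return sorted(
        src for src, dsts in peers.items()
        if len(dsts) > threshold * baseline.get(src, 1)
    )

flows = [
    ("10.0.0.5", "10.0.0.9"),
    ("10.0.0.5", "10.0.0.10"),
    ("10.0.0.5", "10.0.0.11"),
    ("10.0.0.5", "10.0.0.12"),
    ("10.0.0.7", "10.0.0.9"),
]
print(internal_fanout(flows))  # only 10.0.0.5 exceeds its baseline
```

A real deployment would read flow logs from network sensors rather than a list, but the shape of the question is the same.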
Finally, when a threat is discovered, we should ‘codify the success’. Having discovered the signs of a successful intrusion, try to update the rules that it bypassed to make the same technique less likely to succeed in the future. Repeated hunting for the same threat is frustrating for the hunter and a waste of precious resources for the organization.
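‘Codifying the success’ can be as simple as turning the artefact found during a hunt into a machine-checkable rule. The sketch below uses a simplified, Sigma-like dictionary format of my own invention (the field names and the `updater_svc` task name are hypothetical), just to show the idea of a finding becoming a repeatable detection:

```python
# Hypothetical sketch of 'codify the success': convert an indicator
# discovered during a hunt into a reusable detection rule. The rule
# format is a simplified, Sigma-like dict, not any product's schema.
def codify_finding(title, log_source, field, value):
    return {
        "title": title,
        "logsource": log_source,
        "detection": {"selection": {field: value}, "condition": "selection"},
    }

def matches(rule, event):
    sel = rule["detection"]["selection"]
    return all(event.get(k) == v for k, v in sel.items())

rule = codify_finding(
    "Suspicious scheduled task name seen in hunt",
    "windows_task_scheduler",
    "task_name",
    "updater_svc",   # hypothetical artefact found during the hunt
)
print(matches(rule, {"task_name": "updater_svc"}))  # True
```

Once the rule is in the automated pipeline, the hunters can move on to the next question rather than re-finding the same intrusion.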
Incident response in the cloud
My first reaction to Mehmet Surmeli’s presentation on Incident Response in the Cloud was ‘here we go again’. So much seemed awfully familiar from my early days of on-premises incident investigations more than twenty years ago — incomplete logs, tools not designed for security, opaque corners of the target infrastructure, even the dreaded ‘didn’t we tell you that…?’ call from the victim organization.
But the response and lessons learned were different, and more positive. Maybe the next cloud incident can be different…
It turns out that, although they are often turned off by default, cloud platforms do have logging facilities, and it often requires just a couple of clicks to enable them. Bear in mind, however, that logs kept within the cloud container may be lost when the load scales up or down. Instead, it’s better to use the cloud service to build your own (virtual) logging infrastructure, gathering logs from transient virtual machines into a persistent central storage location where you can use cloud facilities to process and explore them. Twenty years ago, we knew we ought to have separate infrastructure for gathering, storing and processing logs — cloud systems might actually make that feasible for most organizations to implement.
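The ‘persistent central store for transient machines’ idea can be sketched in a few lines. In the toy below a dictionary stands in for a cloud object store and the forwarding agents; the key layout and record format are assumptions. What matters is that logs keyed by instance remain searchable after the instances themselves have been scaled away:

```python
# Minimal sketch of a central, persistent log store: transient instances
# push records (tagged with an instance id) to storage that outlives
# them. A dict stands in for the provider's object store; a real
# deployment would use log-forwarding agents and a bucket.
import json
import time

class CentralLogStore:
    def __init__(self):
        self._objects = {}   # stand-in for a persistent bucket

    def put(self, instance_id, records):
        key = f"logs/{instance_id}/{int(time.time())}.json"
        self._objects[key] = json.dumps(records)
        return key

    def search(self, predicate):
        # Explore every instance's logs in one place, even after the
        # instances themselves are gone.
        for key, blob in self._objects.items():
            for rec in json.loads(blob):
                if predicate(rec):
                    yield key, rec

store = CentralLogStore()
store.put("vm-1", [{"event": "login", "user": "alice"}])
store.put("vm-2", [{"event": "login", "user": "mallory"}])
hits = list(store.search(lambda r: r["user"] == "mallory"))
```

The same store then becomes the natural place to run the cloud provider’s own analysis tools, which is the economic point made below.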
Keeping incident response within the cloud also aligns with the cloud’s technical and economic models, by avoiding limits or costs on exporting large volumes of data and instead using cloud facilities for their intended purpose of analysing large datasets. As with local incident response, things will be much easier if you prepare tools in advance and use separate accounts and access controls to move data to secure places where intruders can’t follow. As with compromised physical machines, don’t investigate on a system the bad actor can access. Once you’ve established an incident response toolkit on each (major) platform your organization uses, you can quickly bring new activities within its scope and add new tools as you find them useful. Once you have a working incident response infrastructure and toolkit, consider how you might use cloud tools for real-time monitoring. It should be possible to investigate what intruders are doing as they do it.
Some key principles:
- Get logs out of their default locations: Cloud dashboards and tools are not designed for incident response.
- Default logging is not enough: Use the cloud to build the logging infrastructure you need.
- Tag and map your assets: Don’t make the incident response team reverse engineer what your cloud deployment is supposed to look like.
- Establish dedicated incident responder accounts: Give them sufficient privileges to monitor production systems, but no more.
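The ‘tag and map your assets’ principle is cheap to act on. The sketch below builds an at-a-glance service map from resource tags; the tag names (`service`, `owner`) and resource shape are assumptions for illustration, but any untagged asset immediately stands out rather than having to be reverse engineered mid-incident:

```python
# Sketch of 'tag and map your assets': group resources by a 'service'
# tag so responders see the intended deployment at a glance. Tag names
# and the resource format are illustrative assumptions.
from collections import defaultdict

def asset_map(resources):
    """resources: list of dicts with an 'id' and a 'tags' dict."""
    by_service = defaultdict(list)
    for r in resources:
        by_service[r["tags"].get("service", "UNTAGGED")].append(r["id"])
    return dict(by_service)

resources = [
    {"id": "i-001", "tags": {"service": "billing", "owner": "team-a"}},
    {"id": "i-002", "tags": {"service": "billing", "owner": "team-a"}},
    {"id": "i-003", "tags": {}},   # untagged assets surface immediately
]
print(asset_map(resources))
```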
How to phish, and how to stop it
Wout Debaenst’s talk described the preparatory steps an adversary must take before conducting a targeted phishing campaign, and the opportunities each of these presents for defenders to detect and prevent the attack before it happens. The talk was supposed to be accompanied by live demos, but these were sufficiently realistic that the hosting provider blocked them the night before the presentation!
- Create the domain name(s) to be used to create (misplaced) confidence in the phishing emails. I was familiar with the term ‘typosquatting’ for domains that transpose two letters or use look-alike characters, but ‘combosquatting’ (adding something plausible like ‘shop’ to a genuine domain name) and ‘doppelganger’ (where the malicious domain is created by removing punctuation from the real one) were new and useful descriptions. Tools exist, including the free DNStwister, to check for domains that may be suspiciously close to your own. Another technique is to register something that looks like a genuine platform service, then add the target company as a subdomain. Here certificate transparency reports can be a useful source to check for your company name appearing in unexpected places, but wildcard certificates mean that individual subdomains will not be reported.
- Prepare the infrastructure that will be used to send emails, host webpages and so on. This involves considerable effort, so phishers will often reuse the same infrastructure across multiple attacks. This creates an opportunity for detection — if your company name appears in association with an IP address, whois information or even website images and templates that have previously been reported in relation to phishing attacks, then it’s likely that bad things are being prepared for you. Tools such as VirusTotal and Brandefense map these known associations. Intruders can evade these by setting up new infrastructure for each attack or mingle their activities with genuine traffic by using a CDN, but this increases the cost of the attack, hopefully beyond its likely benefit.
- Send phishing emails. This provides an additional source of information whose reputation can be checked by the receiving organization. Email content is often checked, but there are opportunities also to check the originating domain, IP address, mailserver and other information. These defender checks complement those at the previous stage, because freshly created domains may themselves look suspicious.
- Execute malicious code. Although phishing can be conducted using only plaintext, executable components are more common and can often be blocked by disabling unnecessary features, such as macros or filetypes, in endpoint clients and devices.
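A toy version of the domain-monitoring idea from the first stage can be written in a few lines: generate lookalike candidates for your own domain in the three styles described above and watch registrations (or certificate transparency logs) for them. A tool like DNStwister does this far more thoroughly; the candidate words and the hyphenated combosquat form here are assumptions for illustration:

```python
# Toy lookalike-domain generator: typosquats (transposed letters),
# combosquats (appended keyword) and doppelgangers (punctuation
# removed). Real tools such as DNStwister cover many more variants.
def squat_candidates(domain):
    name, _, tld = domain.rpartition(".")
    out = set()
    # typosquatting: transpose each pair of adjacent characters
    for i in range(len(name) - 1):
        out.add(f"{name[:i]}{name[i + 1]}{name[i]}{name[i + 2:]}.{tld}")
    # combosquatting: bolt a plausible word onto the real name
    for word in ("shop", "login", "secure"):
        out.add(f"{name}-{word}.{tld}")
    # doppelganger: drop internal punctuation from the real name
    if "-" in name:
        out.add(name.replace("-", "") + f".{tld}")
    out.discard(domain)   # the genuine domain is not a candidate
    return sorted(out)

print(squat_candidates("my-bank.example"))
```

Feeding such a candidate list into a daily DNS or certificate-transparency check turns the attacker’s first preparatory step into a defender’s early warning.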
None of these techniques can prevent phishing by a sufficiently determined attacker, but they increase the cost of a successful attack, both in terms of required preparation and risk of discovery. For many organizations, that should put off sufficient threat actors to significantly reduce the risk.
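The receiving-side header checks mentioned at the email stage can also be sketched. The two heuristics below (a From/Return-Path domain mismatch, and a ‘recently registered domain’ list) are illustrative assumptions: a real gateway would combine many more signals, and the domain-age feed is stubbed out where a WHOIS/RDAP lookup would sit:

```python
# Hedged sketch of receiving-side header checks: flag messages whose
# From: domain differs from the Return-Path domain, or whose sending
# domain is very new. The 'young domains' set stands in for a real
# WHOIS/RDAP domain-age lookup.
from email.parser import Parser
from email.utils import parseaddr

ASSUMED_YOUNG_DOMAINS = {"my-bank-login.example"}   # hypothetical feed

def header_findings(raw_message):
    msg = Parser().parsestr(raw_message)
    findings = []
    from_dom = parseaddr(msg.get("From", ""))[1].rpartition("@")[2]
    rp_dom = parseaddr(msg.get("Return-Path", ""))[1].rpartition("@")[2]
    if from_dom and rp_dom and from_dom != rp_dom:
        findings.append("from/return-path domain mismatch")
    if from_dom in ASSUMED_YOUNG_DOMAINS:
        findings.append("sender domain registered very recently")
    return findings

raw = (
    "Return-Path: <billing@bulk-sender.example>\n"
    "From: Support <help@my-bank-login.example>\n"
    "Subject: Verify your account\n\n"
    "..."
)
print(header_findings(raw))
```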
Ransomware: An emotional experience
Tony Kirtley’s talk explored how the Kubler-Ross model of grieving can help understand the emotional effects of a ransomware attack, both to avoid negative consequences and, where possible, to use natural emotions to support positive responses:
Denial: In a ransomware attack, denial should be short-lived, as the nature of the problem will quickly become clear and undeniable. However, there is a danger that individuals at this stage will take unplanned actions, such as changing passwords or rebuilding systems, that are at best a waste of time (while the bad actor still has access to the system) and at worst may destroy information needed for recovery. A related danger is misplaced trust, or mistrust, in systems, data, or people whose reliability isn’t yet known.
Anger: Depending how it is directed, anger can be either destructive — if channelled into finding someone to blame — or constructive — if used to bond and inspire those involved in recovery. ‘We are all in this lousy situation together, let’s combine our energy to get out of it’ can be positive, but needs care, because…
Depression: Individuals may naturally believe the situation is their fault, even if there was no way their actions could have changed the course of events. Leaders must provide constant reassurance, otherwise a feeling of hopelessness can easily spread through the organization.
Bargaining: Here the risk is of being too successful in the previous stages, leading individuals to over-commit to the recovery process. Ransomware incidents take a long time to repair — anything from two weeks to four months was suggested — which is too long for anyone to work in ‘emergency’ mode. The impact of burnout is amplified because not only is the individual’s effort lost, but so is their detailed knowledge and understanding of the affected system. Here, external support can help by taking on the ‘material’ recovery actions, allowing local staff to focus their knowledge, skills and efforts on the locally unique aspects.
Acceptance: This is essential to plan and perform the recovery process. Leaders need to establish and enforce a tempo that will sustain the required level of work without risking burnout, plan a recovery process, and ensure it is trusted by the whole organization. Earlier emotions may recur, in particular anger and depression, so everyone must ensure the shared, no-blame approach is maintained. Here external support can help emotionally as well as practically; people who are less directly engaged are better placed to manage their own emotions and can spread confidence (‘we’ve done this before, with a successful outcome’) among those who are going through a thoroughly unpleasant experience for the first time.
Tony suggested a sixth stage, not in the original Kubler-Ross model:
Meaning: Sometimes referred to as ‘never let a good crisis go to waste’. Once an organization has successfully recovered from an incident, it should always review what lessons can be learned and implement measures that make a repetition less likely. This still needs care to manage emotions. A successful review will identify improvements to processes, systems, and guidance; one that descends into blaming is unlikely to help the organizational situation and may even make it worse.
Next in this series, I’ll look at topics relating to #FIRSTCON22’s theme ‘strength together’.
Andrew Cormack is Chief Regulatory Advisor at Jisc, and is responsible for keeping an eye out for places where our ideas, services and products might raise regulatory issues. Andrew ran the JANET-CERT and EuroCERT Incident Response Teams.
This post is adapted from posts at Jisc Blog.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.