NETRESEC Network Security Blog - Tag : Threat Hunting

rss Google News

Hunting for C2 Traffic

In this video I look for C2 traffic by doing something I call Rinse-Repeat Threat Hunting, which is a method for removing "normal" traffic in order to look closer at what isn't normal.

The video was recorded in a Windows Sandbox in order to avoid accidentally infecting my Windows PC with malware.

The PCAP files analyzed in the video are:

Thank you for sharing these capture files Brad!

IOC List

  • QBot source: 23.29.125.210
  • QBot md5: 2b55988c0d236edd5ea1a631ccd37b76
  • QBot sha1: 033a22c3bb2b0dd1677973e1ae6280e5466e771c
  • QBot sha256: 2d68755335776e3de28fcd1757b7dcc07688b31c37205ce2324d92c2f419c6f0
  • Qbot proxy protocol server: 23.111.114.52:65400
  • QBot C2: 45.46.53.140:2222
  • QBot C2 JA3: 51c64c77e60f3980eea90869b68c58a8
  • QBot C2 JA3S : 7c02dbae662670040c7af9bd15fb7e2f
  • QBot X.509 domain: thdoot.info
  • QBot X.509 thumbprint: 5a8ee4be30bd5da709385940a1a6e386e66c20b6
  • IcedID BackConnect server: 78.31.67.7:443
  • IcedID BackConnect server: 91.238.50.80:8080

References and Links

Update 2022-10-13

Part two of this analysis has been published: IcedID BackConnect Protocol

Posted by Erik Hjelmvik on Friday, 30 September 2022 12:37:00 (UTC/GMT)

Tags: #Threat Hunting#PCAP#CapLoader#NetworkMiner#NetworkMiner Professional#Video#51c64c77e60f3980eea90869b68c58a8#IcedID#TA578

Share: Facebook   Twitter   Reddit   Hacker News Short URL: https://netresec.com/?b=2296553


Domain Whitelist Benchmark: Alexa vs Umbrella

Alexa vs Umbrella

In November last year Alexa admitted in a tweet that they had stopped releasing their CSV file with the one million most popular domains.

Yes, the top 1m sites file has been retired

Members of the Internet measurement and infosec research communities were outraged, surprised and disappointed since this domain list had become the de-facto tool for evaluating the popularity of a domain. As a result of this Cisco Umbrella (previously OpenDNS) released a free top 1 million list of their own in December the same year. However, by then Alexa had already announced that their “top-1m.csv” file was back up again.

The file is back for now. We'll post an update before it changes again.

The Alexa list was unavailable for just about a week but this was enough for many researchers, developers and security professionals to make the move to alternative lists, such as the one from Umbrella. This move was perhaps fueled by Alexa saying that the “file is back for now”, which hints that they might decide to remove it again later on.

We’ve been leveraging the Alexa list for quite some time in NetworkMiner and CapLoader in order to do DNS whitelisting, for example when doing threat hunting with Rinse-Repeat. But we haven’t made the move from Alexa to Umbrella, at least not yet.


Malware Domains in the Top 1 Million Lists

Threat hunting expert Veronica Valeros recently pointed out that there are a great deal of malicious domains in the Alexa top one million list.

Researchers using Alexa top 1M as legit, you may want to think twice about that. You'd be surprised how many malicious domains end there.

I often recommend analysts to use the Alexa list as a whitelist to remove “normal” web surfing from their PCAP dataset when doing threat hunting or network forensics. And, as previously mentioned, both NetworkMiner and CapLoader make use of the Alexa list in order to simplify domain whitelisting. I therefore decided to evaluate just how many malicious domains there are in the Alexa and Umbrella lists.

hpHosts EMD (by Malwarebytes)

Alexa Umbrella
Whitelisted malicious domains: 1365 1458
Percent of malicious domains whitelisted: 0.89% 0.95%

Malware Domain Blocklist

Alexa Umbrella
Whitelisted malicious domains: 84 63
Percent of malicious domains whitelisted: 0.46% 0.34%

CyberCrime Tracker

Alexa Umbrella
Whitelisted malicious domains: 15 10
Percent of malicious domains whitelisted: 0.19% 0.13%

The results presented above indicate that Alexa and Umbrella both contain roughly the same number of malicious domains. The percentages also reveal that using Alexa or Umbrella as a whitelist, i.e. ignore all traffic to the top one million domains, might result in ignoring up to 1% of the traffic going to malicious domains. I guess this is an acceptable number of false negatives since techniques like Rinse-Repeat Intrusion Detection isn’t intended to replace traditional intrusion detection systems, instead it is meant to be use as a complement in order to hunt down the intrusions that your IDS failed to detect. Working on a reduced dataset containing 99% of the malicious traffic is an acceptable price to pay for having removed all the “normal” traffic going to the one million most popular domains.


Sub Domains

One significat difference between the two lists is that the Umbrella list contains subdomains (such as www.google.com, safebrowsing.google.com and accounts.google.com) while the Alexa list only contains main domains (like “google.com”). In fact, the Umbrella list contains over 1800 subdomains for google.com alone! This means that the Umbrella list in practice contains fewer main domains compared to the one million main domains in the Alexa list. We estimate that roughly half of the domains in the Umbrella list are redundant if you only are interested in main domains. However, having sub domains can be an asset if you need to match the full domain name rather than just the main domain name.


Data Sources used to Compile the Lists

The Alexa Extension for Firefox
Image: The Alexa Extension for Firefox

The two lists are compiled in different ways, which can be important to be aware of depending on what type of traffic you are analyzing. Alexa primarily receives web browsing data from users who have installed one of Alexa’s many browser extensions (such as the Alexa browser toolbar shown above). They also gather additional data from users visiting web sites that include Alexa’s tracker script.

Cisco Umbrella, on the other hand, compile their data from “the actual world-wide usage of domains by Umbrella global network users”. We’re guessing this means building statistics from DNS queries sent through the OpenDNS service that was recently acquired by Cisco.

This means that the Alexa list might be better suited if you are only analyzing HTTP traffic from web browsers, while the Umbrella list probably is the best choice if you are analyzing non-HTTP traffic or HTTP traffic that isn’t generated by browsers (for example HTTP API communication).


Other Quirks

As noted by Greg Ferro, the Umbrella list contains test domains like “www.example.com”. These domains are not present in the Alexa list.

We have also noticed that the Umbrella list contains several domains with non-authorized gTLDs, such as “.home”, “.mail” and “.corp”. The Alexa list, on the other hand, only seem to contain real domain names.


Resources and Raw Data

Both the Alexa and Cisco Umbrella top one million lists are CSV files named “top-1m.csv”. The CSV files can be downloaded from these URL’s:

The analysis results presented in this blog post are based on top-1m.csv files downloaded from Alexa and Umbrella on March 31, 2017. The malware domain lists were also downloaded from the three respective sources on that same day.

We have decided to share the “false negatives” (malware domains that were present in the Alexa and Umbrella lists) for transparency. You can download the lists with all false negatives from here:
https://www.netresec.com/files/alexa-umbrella-malware-domains_170331.zip


Hands-on Practice and Training

If you wanna learn more about how a list of common domains can be used to hunt down intrusions in your network, then please register for one of our network forensic trainings. The next training will be a pre-conference training at 44CON in London.

Posted by Erik Hjelmvik on Monday, 03 April 2017 14:47:00 (UTC/GMT)

Tags: #Alexa#Umbrella#domain#Threat Hunting#DNS#malware

Share: Facebook   Twitter   Reddit   Hacker News Short URL: https://netresec.com/?b=1743fae


Rinse-Repeat Intrusion Detection

I am a long time skeptic when it comes to blacklists and other forms of signature based detection mechanisms. The information security industry has also declared the signature based anti-virus approach dead several times during the past 10 years. Yet, we still rely on anti-virus signatures, IDS rules, IP blacklists, malware domain lists, YARA rules etc. to detect malware infections and other forms of intrusions in our networks. This outdated approach puts a high administrative burden on IT and security operations today, since we need to keep all our signature databases up to date, both when it comes to end point AV signatures as well as IDS rules and other signature based detection methods and threat feeds. Many organizations probably spend more time and money on updating all these blacklists and signature databases than actually investigating the security alerts these detection systems generate. What can I say; the world is truly upside down...

Shower image by Nevit Dilmen Image: Shower by Nevit Dilmen.

I would therefore like to use this blog post to briefly describe an effective blacklist-free approach for detecting malware and intrusions just by analyzing network traffic. My approach relies on a combination of whitelisting and common sense anomaly detection (i.e. not the academic statistical anomaly detection algorithms that never seem to work in reality). I also encourage CERT/CSIRT/SOC/SecOps units to practice Sun Tzu's old ”know yourself”, or rather ”know your systems and networks” approach.

Know your enemy and know yourself and you can fight a hundred battles without disaster.
- Sun Tzu in The Art of War
Art of War in Bamboo by vlasta2
Image: Art of War in Bamboo by vlasta2

My method doesn't rely on any dark magic, it is actually just a simple Rinse-Repeat approach built on the following steps:

  1. Look at network traffic
  2. Define what's normal (whitelist)
  3. Remove that
  4. GOTO 1.

After looping through these steps a few times you'll be left with some odd network traffic, which will have a high ratio of maliciousness. The key here is, of course, to know what traffic to classify as ”normal”. This is where ”know your systems and networks” comes in.


What Traffic is Normal?

I recently realized that Mike Poor seems to be thinking along the same lines, when I read his foreword to Chris Sanders' and Jason Smith's Applied NSM:

The next time you are at your console, review some logs. You might think... "I don't know what to look for". Start with what you know, understand, and don't care about. Discard those. Everything else is of interest.

Applied NSM

Following Mike's advice we might, for example, define“normal” traffic as:

  • HTTP(S) traffic to popular web servers on the Internet on standard ports (TCP 80 and 443).
  • SMB traffic between client networks and file servers.
  • DNS queries from clients to your name server on UDP 53, where the servers successfully answers with an A, AAAA, CNAME, MX, NS or SOA record.
  • ...any other traffic which is normal in your organization.

Whitelisting IP ranges belonging to Google, Facebook, Microsoft and Akamai as ”popular web servers” will reduce the dataset a great deal, but that's far from enough. One approach we use is to perform DNS whitelisting by classifying all servers with a domain name listed in Alexa's Top 1 Million list as ”popular”.

You might argue that such a method just replaces the old blacklist-updating-problem with a new whitelist-updating-problem. Well yes, you are right to some extent, but the good part is that the whitelist changes very little over time compared to a blacklist. So you don't need to update very often. Another great benefit is that the whitelist/rinse-repeat approach also enables detection of 0-day exploits and C2 traffic of unknown malware, since we aren't looking for known badness – just odd traffic.


Threat Hunting with Rinse-Repeat

Mike Poor isn't the only well merited incident handler who seems to have adopted a strategy similar to the Rinse-Repeat method; Richard Bejtlich (former US Air Force CERT and GE CIRT member) reveal some valuable insight in his book “The Practice of Network Security Monitoring”:

I often use Argus with Racluster to quickly search a large collection of session data via the command line, especially for unexpected entries. Rather than searching for specific data, I tell Argus what to omit, and then I review what’s left.

In his book Richard also mentions that he uses a similar methodology when going on “hunting trips” (i.e. actively looking for intrusions without having received an IDS alert):

Sometimes I hunt for traffic by telling Wireshark what to ignore so that I can examine what’s left behind. I start with a simple filter, review the results, add another filter, review the results, and so on until I’m left with a small amount of traffic to analyze.

The Practice of NSM

I personally find Rinse-Repeat Intrusion Detection ideal for threat hunting, especially in situations where you are provided with a big PCAP dataset to answer the classic question “Have we been hacked?”. However, unfortunately the “blacklist mentality” is so conditioned among incident responders that they often choose to crunch these datasets through blacklists and signature databases in order to then review thousands of alerts, which are full of false positives. In most situations such approaches are just a huge waste of time and computing power, and I'm hoping to see a change in the incident responders' mindsets in the future.

I teach this “rinse-repeat” threat hunting method in our Network Forensics Training. In this class students get hands-on experience with a dataset of 3.5 GB / 40.000 flows, which is then reduced to just a fraction through a few iterations in the rinse-repeat loop. The remaining part of the PCAP dataset has a very high ratio of hacking attacks as well as command-and-control traffic from RAT's, backdoors and botnets.


UPDATE 2015-10-07

We have now published a blog post detailing how to use dynamic protocol detection to identify services running on non-standard ports. This is a good example on how to put the Rinse-Repeat methodology into practice.

Posted by Erik Hjelmvik on Monday, 17 August 2015 08:45:00 (UTC/GMT)

Tags: #Rinse-Repeat#PCAP#NSM#PCAP#Intrusion Detection#IDS#network forensics

Share: Facebook   Twitter   Reddit   Hacker News Short URL: https://netresec.com/?b=1582D1D

Mastodon

NETRESEC on Mastodon: @netresec@infosec.exchange

twitter

NETRESEC on Twitter: @netresec