NETRESEC Network Security Blog - Tag : domain


Domain Whitelist Benchmark: Alexa vs Umbrella

Alexa vs Umbrella

In November last year Alexa admitted in a tweet that they had stopped releasing their CSV file with the one million most popular domains.

Yes, the top 1m sites file has been retired

Members of the Internet measurement and infosec research communities were outraged, surprised and disappointed since this domain list had become the de-facto tool for evaluating the popularity of a domain. As a result of this Cisco Umbrella (previously OpenDNS) released a free top 1 million list of their own in December the same year. However, by then Alexa had already announced that their “top-1m.csv” file was back up again.

The file is back for now. We'll post an update before it changes again.

The Alexa list was unavailable for just about a week but this was enough for many researchers, developers and security professionals to make the move to alternative lists, such as the one from Umbrella. This move was perhaps fueled by Alexa saying that the “file is back for now”, which hints that they might decide to remove it again later on.

We’ve been leveraging the Alexa list for quite some time in NetworkMiner and CapLoader in order to do DNS whitelisting, for example when doing threat hunting with Rinse-Repeat. But we haven’t made the move from Alexa to Umbrella, at least not yet.


Malware Domains in the Top 1 Million Lists

Threat hunting expert Veronica Valeros recently pointed out that there are a great deal of malicious domains in the Alexa top one million list.

Researchers using Alexa top 1M as legit, you may want to think twice about that. You'd be surprised how many malicious domains end there.

I often recommend analysts to use the Alexa list as a whitelist to remove “normal” web surfing from their PCAP dataset when doing threat hunting or network forensics. And, as previously mentioned, both NetworkMiner and CapLoader make use of the Alexa list in order to simplify domain whitelisting. I therefore decided to evaluate just how many malicious domains there are in the Alexa and Umbrella lists.

hpHosts EMD (by Malwarebytes)

Alexa Umbrella
Whitelisted malicious domains: 1365 1458
Percent of malicious domains whitelisted: 0.89% 0.95%

Malware Domain Blocklist

Alexa Umbrella
Whitelisted malicious domains: 84 63
Percent of malicious domains whitelisted: 0.46% 0.34%

CyberCrime Tracker

Alexa Umbrella
Whitelisted malicious domains: 15 10
Percent of malicious domains whitelisted: 0.19% 0.13%

The results presented above indicate that Alexa and Umbrella both contain roughly the same number of malicious domains. The percentages also reveal that using Alexa or Umbrella as a whitelist, i.e. ignore all traffic to the top one million domains, might result in ignoring up to 1% of the traffic going to malicious domains. I guess this is an acceptable number of false negatives since techniques like Rinse-Repeat Intrusion Detection isn’t intended to replace traditional intrusion detection systems, instead it is meant to be use as a complement in order to hunt down the intrusions that your IDS failed to detect. Working on a reduced dataset containing 99% of the malicious traffic is an acceptable price to pay for having removed all the “normal” traffic going to the one million most popular domains.


Sub Domains

One significat difference between the two lists is that the Umbrella list contains subdomains (such as www.google.com, safebrowsing.google.com and accounts.google.com) while the Alexa list only contains main domains (like “google.com”). In fact, the Umbrella list contains over 1800 subdomains for google.com alone! This means that the Umbrella list in practice contains fewer main domains compared to the one million main domains in the Alexa list. We estimate that roughly half of the domains in the Umbrella list are redundant if you only are interested in main domains. However, having sub domains can be an asset if you need to match the full domain name rather than just the main domain name.


Data Sources used to Compile the Lists

The Alexa Extension for Firefox
Image: The Alexa Extension for Firefox

The two lists are compiled in different ways, which can be important to be aware of depending on what type of traffic you are analyzing. Alexa primarily receives web browsing data from users who have installed one of Alexa’s many browser extensions (such as the Alexa browser toolbar shown above). They also gather additional data from users visiting web sites that include Alexa’s tracker script.

Cisco Umbrella, on the other hand, compile their data from “the actual world-wide usage of domains by Umbrella global network users”. We’re guessing this means building statistics from DNS queries sent through the OpenDNS service that was recently acquired by Cisco.

This means that the Alexa list might be better suited if you are only analyzing HTTP traffic from web browsers, while the Umbrella list probably is the best choice if you are analyzing non-HTTP traffic or HTTP traffic that isn’t generated by browsers (for example HTTP API communication).


Other Quirks

As noted by Greg Ferro, the Umbrella list contains test domains like “www.example.com”. These domains are not present in the Alexa list.

We have also noticed that the Umbrella list contains several domains with non-authorized gTLDs, such as “.home”, “.mail” and “.corp”. The Alexa list, on the other hand, only seem to contain real domain names.


Resources and Raw Data

Both the Alexa and Cisco Umbrella top one million lists are CSV files named “top-1m.csv”. The CSV files can be downloaded from these URL’s:

The analysis results presented in this blog post are based on top-1m.csv files downloaded from Alexa and Umbrella on March 31, 2017. The malware domain lists were also downloaded from the three respective sources on that same day.

We have decided to share the “false negatives” (malware domains that were present in the Alexa and Umbrella lists) for transparency. You can download the lists with all false negatives from here:
https://www.netresec.com/files/alexa-umbrella-malware-domains_170331.zip


Hands-on Practice and Training

If you wanna learn more about how a list of common domains can be used to hunt down intrusions in your network, then please register for one of our network forensic trainings. The next training will be a pre-conference training at 44CON in London.

Posted by Erik Hjelmvik on Monday, 03 April 2017 14:47:00 (UTC/GMT)

Tags: #Alexa #Umbrella #domain #DNS #malware

More... Share  |  Facebook   Twitter   Reddit   Hacker News Short URL: https://netresec.com/?b=1743fae


DNS whitelisting in NetworkMiner

Stack of Papers by Jenni C

One of the new features in NetworkMiner Professional 1.5 is the ability to check if domain names in DNS requests/responses are “normal” or malicious ones. This lookup is performed offline using a local copy of Alexa's top 1 million domain name list.

We got the idea for this feature via Jarno Niemelä's great presentation titled “Making Life Difficult for Malware”. Despite working for F-Secure Jarno presents several smart ideas for avoiding malware infections without having to install an AV-product.

One of Jarno's slides contains the following suggestions:

Block Traffic To Sites Your Users Don’t Go To

Block subdomain hosting TLDs
  • co.cc, co.tv, ce.ms, rr.nu, cu.cc, cz.cc, vv.cc, cw.cm, cx.cc, etc
Block domains that provide dynamic DNS
  • *dyndns*, *no-ip*, 8866.org, thescx.info, 3322.org, sock8.com
Block file sharing sites, some malware use them
  • fileleave.com, dropbox.com, rapidshare.com, megafiles.com
For strict policy, allow DNS resolving only to Alexa top 1M[1]
  • Tip: Instead of null routing domains set up landing page
  • Either with a link that allows domain or IT ticket

Preventing users from visiting sites outside of the top 1 million websites (according to Alexa) sounds a bit harsh. In fact, we at Netresec just recently made it into the top 1M list (the current rank for netresec.com is 726 922). There are also many good and legit sites that are not yet on this list. Our idea is, however, to give analysts a heads up on queried DNS names that are not on the top 1M list by displaying this information in NetworkMiner's DNS tab.

NetworkMiner Professional 1.5 with DNS tab showing Alexa result for office.windowupdate.com

The screenshot above contains a lookup for the domain “office.windowupdate.com” (note the missing “s” in “windows”). This domain name was previously used by the C2 protocol Lurk (see Command Five's report “Command and Control in the Fifth Domain” for more details). The “Alexa Top 1M” column in NetworkMiner's DNS tab indicates whether or not the domain name is a well known domain. The malicious “office.windowupdate.com” is marked with “No”, while the proper “www.update.microsoft.com” is indeed on the list. It is, however, important to note that only the second-level domain is checked by NetworkMiner; i.e. in this case “windowupdate.com” and “microsoft.com”.

The DNS whitelisting technique can also come in handy when dealing with malware that employs domain generation algorithms (DGAs) (see the Damballa blog for additional info regarding DGAs). It is probably safe to say that these auto-generated domains should never show up in the Alexa Top 1M list.

Posted by Erik Hjelmvik on Wednesday, 02 October 2013 22:30:00 (UTC/GMT)

Tags: #Netresec #DNS #domain #DGA #malware #Alexa

More... Share  |  Facebook   Twitter   Reddit   Hacker News Short URL: https://netresec.com/?b=13A66EB


Extracting DNS queries

There was recently a question on the Wireshark users mailing list about “how to get the query name from a dns request packet with tshark”. This is a problem that many network analysts run into, so I decided to write a blog post instead of just replying to the mailing list.

Note: the pcap file used in this blog post is from the DFRWS 2009 Challenge.

Who queried for a particular domain?

Tshark can easily be used in order to determine who queried for a particular domain, such as google.com, by using the following command:

tshark -r nssal-capture-1.pcap -T fields -e ip.src -e dns.qry.name -R "dns.flags.response eq 0 and dns.qry.name contains google.com"
137.30.123.78 google.com
137.30.123.78 www.google.com
137.30.123.78 id.google.com
137.30.123.78 images.google.com
137.30.123.78 tbn2.google.com
137.30.123.78 tbn0.google.com
137.30.123.78 tbn2.google.com
137.30.123.78 tbn1.google.com
137.30.123.78 tbn3.google.com
137.30.123.78 tbn3.google.com

List all queries

A list of ALL queries can be built with the same command, but without filtering on a particular domain:

tshark -r nssal-capture-1.pcap -T fields -e ip.src -e dns.qry.name -R "dns.flags.response eq 0"
137.30.123.78 fp.ps3.us.playstation.com
137.30.123.78 cmt.us.playstation.com
137.30.123.78 google.com
137.30.123.78 www.google.com
137.30.123.78 www.mardigrasday.com
137.30.123.78 pagead2.googlesyndication.com
137.30.123.78 googleads.g.doubleclick.net
137.30.123.78 www.google-analytics.com
137.30.123.78 mardigrasday.makesparties.com
137.30.123.78 images.scanalert.com
137.30.123.78 a248.e.akamai.net
137.30.123.78 ssl-hints.netflame.cc
...

DNS lists in NetworkMiner

There is a DNS tab in NetworkMiner, which displays a nice list of all DNS queries and responses in a pcap file. Loading the same nssal-capture-1.pcap into NetworkMiner generates the following list:


DNS tab with nssal-capture-1.pcap loaded

NetworkMiner Professional also has the ability to export this data to a CSV file. The command line tool NetworkMinerCLI can also generate such a CSV file without a GUI, which is perfect if you wanna integrate it in a customized script.

Posted by Erik Hjelmvik on Sunday, 17 June 2012 17:45:00 (UTC/GMT)

Tags: #domain #tshark #pcap

More... Share  |  Facebook   Twitter   Reddit   Hacker News Short URL: https://netresec.com/?b=126C5CB

twitter

NETRESEC on Twitter

Follow @netresec on twitter:
» twitter.com/netresec


book

Recommended Books

» The Practice of Network Security Monitoring, Richard Bejtlich (2013)

» Applied Network Security Monitoring, Chris Sanders and Jason Smith (2013)

» Network Forensics, Sherri Davidoff and Jonathan Ham (2012)

» The Tao of Network Security Monitoring, Richard Bejtlich (2004)

» Practical Packet Analysis, Chris Sanders (2017)

» Windows Forensic Analysis, Harlan Carvey (2009)

» TCP/IP Illustrated, Volume 1, Kevin Fall and Richard Stevens (2011)

» Industrial Network Security, Eric D. Knapp and Joel Langill (2014)