Brian: I think it’s very interesting, and probably not very widely known, that the Verizon RISK team provides a very large set of data breach data for anyone to analyze (the VERIS Community Database). Perhaps one reason it’s not widely known is that people don’t know how to analyze it and look for their own findings. What kinds of things could people learn by performing their own analysis?
Jay: The VERIS Community Database (VCDB) is one of our research projects. Since early 2013, we have been keeping an eye on the headlines, and when a breach is publicly reported we record all the information we can find about the incident. As of right now we have over 2,500 incidents recorded in the VERIS format. VERIS is the “Vocabulary for Event Recording and Incident Sharing” and has over 150 different data points that we try to collect. For most of the public incidents, we are able to record between 10 and 30 data points.
We’ve decided to offer all of the data openly and publicly. Anyone can download the data (in JSON or CSV format) and do their own analysis. With so many data points collected, we couldn’t possibly look at all the relationships, nor ask all the questions. What are the attacker’s motives when web servers are involved? What tactics are used against payment card data versus personal information? By doing your own analysis, you can find the answers to your own specific questions, relevant to your specific industry and organizational size.
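As a rough illustration of the first question Jay raises (attacker motives when web servers are involved), here is a minimal Python sketch. It assumes each incident is a dictionary loaded from one of the VCDB JSON files, and that motives and assets sit under field names like `actor.<type>.motive` and `asset.assets[].variety`. Those field names, the `"S - Web application"` label, and the helper `motives_for_asset` are assumptions for illustration; check them against the current VERIS schema before relying on the results.

```python
from collections import Counter


def motives_for_asset(incidents, asset_variety):
    """Count actor motives across incidents that involve a given asset variety.

    `incidents` is an iterable of dicts shaped like VERIS records
    (field names here are assumed, not confirmed by the interview).
    """
    counts = Counter()
    for incident in incidents:
        assets = incident.get("asset", {}).get("assets", [])
        # Skip incidents that do not involve the asset we care about.
        if not any(a.get("variety") == asset_variety for a in assets):
            continue
        # Actors are grouped by type (external, internal, partner, ...).
        for actor in incident.get("actor", {}).values():
            if isinstance(actor, dict):
                counts.update(actor.get("motive", []))
    return counts


# Tiny made-up sample, only to show the shape of the input.
sample = [
    {"actor": {"external": {"motive": ["Financial"]}},
     "asset": {"assets": [{"variety": "S - Web application"}]}},
    {"actor": {"external": {"motive": ["Ideology"]}},
     "asset": {"assets": [{"variety": "S - Web application"}]}},
    {"actor": {"internal": {"motive": ["Financial"]}},
     "asset": {"assets": [{"variety": "U - Laptop"}]}},
]

web_motives = motives_for_asset(sample, "S - Web application")
```

With the real data, `incidents` would be built by calling `json.load` on each downloaded incident file; the same counting approach works for any pair of fields you want to compare.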
Brian: Security practitioners are often called upon to work on tactical issues, and it can be hard to convince their managers that it’s important to analyze data as well. How do you recommend that security teams incorporate these principles of data analysis, and how does that translate into job responsibilities?
Jay: Everyone in information security is trying so hard to stop the next security incident that sometimes we lose sight of the bigger picture. Not only do you want to stop the next incident, but you should also want to learn from it, which provides important lessons for stopping the one after that. Leadership and practitioners alike need to keep an eye on the long game and focus on making sure that we have better information next year than we do this year, and that’s what data analysis is for.
That means job responsibilities need to shift as well. Don’t be fooled into thinking that your security expertise will make you an expert with security data. Data has its own language, and at the root of that language is statistics. The good news is that with just a little bit of work, basic data analysis skills are relatively easy to pick up and should help you avoid some common pitfalls and mistakes. That is exactly the type of person we wrote our book for: the security practitioner who is motivated to learn basic skills around programming and statistics.
Brian: Do you have any stories of what organizations have learned from analyzing their own data (either pre-breach or post-breach)? Of course, the names can be changed to protect the innocent.
Jay: Even though I just brought up statistics (it’s an important skill to develop), you should be able to learn a lot about your environment by just counting and comparing. Here’s an exercise everyone should try: get a log file from your network egress point. It doesn’t have to be big (no Hadoop necessary). Compare it against an open source such as AlienVault’s IP reputation database and see how many IP addresses appear in both files; you may be surprised at what you find. Next, identify all the countries and networks your traffic is reaching (with IP geolocation and ASN lookups). Then break down which internal assets are talking to external hosts. Are your printers communicating with foreign countries that you aren’t doing business with? Are your phones reaching out to an old business partner? If your organization hasn’t looked at this before, it’s often a good eye-opener.
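The “counting and comparing” exercise Jay describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the egress log is assumed to be CSV with `src_ip` and `dst_ip` columns, and the reputation file is assumed to have one IP address at the start of each line, optionally followed by `#`-separated fields. Real firewall logs and reputation feeds vary, so the parsing would need adjusting. The function names are hypothetical.

```python
import csv
import ipaddress


def load_reputation(lines):
    """Collect IP addresses from a reputation list (assumed: IP first on each line)."""
    bad = set()
    for line in lines:
        token = line.split("#", 1)[0].strip()
        if not token:
            continue  # blank line or comment
        try:
            bad.add(ipaddress.ip_address(token))
        except ValueError:
            continue  # not an IP address; ignore
    return bad


def flagged_connections(log_lines, bad_ips, src_field="src_ip", dst_field="dst_ip"):
    """Map each flagged destination IP to the set of internal hosts that contacted it."""
    hits = {}
    for row in csv.DictReader(log_lines):
        try:
            dst = ipaddress.ip_address(row[dst_field].strip())
        except (KeyError, ValueError, AttributeError):
            continue  # missing or malformed destination
        if dst in bad_ips:
            src = (row.get(src_field) or "").strip()
            hits.setdefault(dst, set()).add(src)
    return hits


# Made-up data using documentation-reserved address ranges.
reputation = ["# sample list", "203.0.113.7#4#2#Malicious Host", "198.51.100.9"]
egress_log = [
    "src_ip,dst_ip",
    "10.0.0.5,203.0.113.7",
    "10.0.0.6,192.0.2.1",
    "10.0.0.8,203.0.113.7",
]

bad_ips = load_reputation(reputation)
hits = flagged_connections(egress_log, bad_ips)
for dst, sources in hits.items():
    print(dst, "contacted by", sorted(sources))
```

For real files, pass open file objects instead of the lists, since both functions only need an iterable of lines. Grouping the hits by internal source also gives a starting point for the follow-up question of which assets are talking to which external hosts.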
If you are attending next week's RSA conference in San Francisco, check out Jay’s talk “From Data to Wisdom: Big Lessons in Small Data” on February 25. And be sure to join us next month at Ignite 2014, where we'll be covering several of the topics I discussed with Jay as part of our marquee tracks and sessions.