
Using machine learning to bridge the gaps in microsegmentation

By John O'Neil — Jan 25, 2018

Microsegmentation has become more widely adopted in the cloud and the data center because of its enormous promise. Breaking the network into smaller and smaller fragments (often as small as the individual workloads themselves) brings significant security and performance benefits. According to some estimates, as much as 76% of the traffic in a network is east-west, so allowing only the application communication necessary for business operations can interrupt an attack’s lateral progression.

The problem, though, is that getting to a fully deployed and realized microsegmented environment is challenging. It requires fine-grained knowledge of the network’s application communication patterns and application topologies, followed by an inventory of which communications should be allowed and which shouldn’t. That inventory leads to a large number of rules and policies, and large rule sets are difficult for humans to understand, manage, and modify. Finally, the rules remain address-centric, which means the translation from application policies to address-based rules exists only in people’s heads.

Gaining the advantages of microsegmentation requires a lot of work, and the work doesn’t stop, because networks keep changing.

Is there some way to complete microsegmentation projects faster and more easily? To make them more accurate, and more resilient in the face of changing business and security challenges?

Turns out, there is.

A New Approach

We want to make application-centric, rather than address-centric, rules. Starting with collected data about which applications are communicating, we use machine learning to analyze the data and create a nearly optimal set of automatically generated rules. This moves much of the complexity from humans to computers, which, after all, are much better at sorting through large amounts of information.
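To make the inputs and outputs concrete, here is a minimal Python sketch of observed flows between applications and the application-centric rules generalized from them. The field names are illustrative assumptions for this post, not Edgewise’s actual schema:

    from dataclasses import dataclass

    # Hypothetical data shapes for this sketch; the field names are
    # illustrative assumptions, not an actual product schema.

    @dataclass(frozen=True)
    class Flow:
        src_app: str   # application observed initiating traffic, e.g. "web-frontend"
        dst_app: str   # application receiving it, e.g. "orders-db"
        port: int      # destination port
        proto: str     # "tcp" or "udp"

    @dataclass(frozen=True)
    class Rule:
        src_app: str         # the application (not the address) allowed to talk
        dst_app: str
        ports: frozenset     # destination ports the rule allows
        proto: str

    def covers(rule: Rule, flow: Flow) -> bool:
        """True if this rule allows the observed flow."""
        return (rule.src_app == flow.src_app
                and rule.dst_app == flow.dst_app
                and rule.proto == flow.proto
                and flow.port in rule.ports)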

The machine learning isn’t used here to find malware; rather, it’s used to establish the state of the network: what is talking to what. We’re developing techniques to identify possibly suspicious activity from applications on a network; for now, the learned rules and our UI make unexpected applications easy to identify and deal with.
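Under the sketch above, for example, an unexpected application is simply one whose traffic no learned rule covers. This hypothetical helper reuses the Flow and Rule types defined earlier:

    def unexpected_flows(rules, flows):
        """Flows not allowed by any learned rule: candidates for human review."""
        return [f for f in flows if not any(covers(r, f) for r in rules)]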

Once the rules are created, people can apply human insight to protect their network. Because the rules are readable, and phrased explicitly in terms of protecting applications, they can be deployed application by application; a network security expert can use knowledge and judgment to deploy (or edit) the rules in an optimal way. A human is usually better at deciding which applications should be protected first, so we make it easy to find the relevant rules for the most important applications and to use those rules to lock them down.

Using Machine Learning

Starting with all the network traffic, we want to create a set of rules with the following goals:

  • As few rules as possible.
  • The simplest rules possible, without superfluous information.
  • Specific rules rather than overly general ones.
  • Human-readable rules.
  • Rules with the broadest coverage possible.

These goals often conflict with one another: the most specific rules, for example, rarely have the broadest coverage. Getting to an optimal rule set means balancing these priorities, and these constraints, along with the constraints imposed by the data itself, rule out most standard machine learning techniques.
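One way to balance these goals, continuing the hypothetical Flow and Rule sketch from above, is to fold them into a single score for a candidate rule set. The weights below are made-up constants for illustration; a real objective would be tuned empirically:

    def score(rules, flows):
        """Higher is better: reward coverage, penalize size and over-generality."""
        covered = sum(1 for f in flows if any(covers(r, f) for r in rules))
        coverage = covered / len(flows)        # broadest coverage possible
        size = len(rules)                      # as few rules as possible
        generality = sum(len(r.ports) for r in rules)  # prefer specific rules
        # Illustrative weights only; the real balance would be tuned.
        return 1.0 * coverage - 0.01 * size - 0.001 * generality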

We ended up performing a stochastic search through the space of candidate rule sets, maximizing an objective based on the constraints above. If this sounds a little hard to picture, it’s not just you. Let’s try an analogy.

Lost in New York

Imagine you’re standing on a street corner in Manhattan with a latitude-longitude GPS. You know where you want to go, because you have the lat-long coordinates of your destination, but you don’t know which direction to start walking to get there. So you note where you are, walk a few blocks in one particular direction, and look at your GPS again. If you’re closer to your destination than you were, you stay at your new location and do it again. If you’re farther away, you backtrack to where you started and pick a different distance and direction. Sometimes, even if you’re a bit farther away, you keep the new location anyway, since maybe it’s a shortcut (or a way to get around Central Park!). As time goes on, you get closer to your destination, until nothing you do brings you any closer. You’ve arrived, more or less.

Of course, Manhattan is two-dimensional, except for the tall buildings. When we’re searching through the space of possible rule sets, there are a lot more “directions” to investigate. That’s why we leave it to the algorithms.
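In code, the walk above is a stochastic local search: propose a small random change to the rule set, keep it when the score improves, and occasionally keep a slightly worse one to find the way around Central Park. This is a minimal sketch under the same assumptions as the earlier snippets; mutate() is a hypothetical local move, and the simulated-annealing-style acceptance rule is one common choice, not necessarily the one Edgewise uses:

    import math
    import random

    def mutate(rules):
        """Hypothetical local move: drop a random rule or widen one rule's ports."""
        rules = list(rules)
        if rules and random.random() < 0.5:
            rules.pop(random.randrange(len(rules)))      # remove a rule
        elif rules:
            i = random.randrange(len(rules))
            r = rules[i]
            rules[i] = Rule(r.src_app, r.dst_app,
                            r.ports | {random.randint(1, 65535)}, r.proto)
        return rules

    def stochastic_search(initial_rules, flows, steps=10_000, temperature=0.1):
        current, current_score = initial_rules, score(initial_rules, flows)
        for step in range(steps):
            candidate = mutate(current)                  # walk a few blocks
            candidate_score = score(candidate, flows)
            delta = candidate_score - current_score
            t = max(temperature * (1 - step / steps), 1e-9)  # cool over time
            # Keep improvements; sometimes keep a worse set to escape dead ends.
            if delta >= 0 or random.random() < math.exp(delta / t):
                current, current_score = candidate, candidate_score
        return current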

In my next blog post, I’ll explore how Edgewise does this from the application perspective (rather than with addresses as the focal point), and how that shift makes microsegmentation achievable significantly faster than other approaches, while providing greater security.



Written by John O'Neil

John O’Neil is the Data Scientist at Edgewise Networks. He writes and designs software for data analysis and analytics, search engines, natural language processing, and machine learning. He has a PhD in linguistics from Harvard University, is the author of more than twenty papers in computer science, linguistics, and associated fields, and has given talks at numerous professional and academic conferences.