By John Harrison, Director, Cybersecurity Center of Excellence, Criterion Systems
It is easy to be skeptical about Artificial Intelligence (AI). It has been promised (threatened?) for years, and while it is already showing up in our everyday lives – largely through companies like Amazon and Facebook that use it to customize user experiences and make their platforms more convenient – it has also been hijacked as a marketing buzzword and is frequently misused. However, as a cybersecurity professional, I believe it will help solve some of our greatest challenges, today and into the future.
Given the confusion surrounding AI, I think it would be prudent to quickly define what it is. AI is a broad practice and concept that includes capabilities such as natural language processing, image and object recognition, and pattern recognition through neural network models that attempt to mimic the cognitive functions of the brain. The term Machine Learning (ML) is frequently used interchangeably with AI, although there are distinct differences. ML algorithms learn from the data they are given. Deep learning, a subset of ML, has shown a lot of promise in the cybersecurity realm. Major differences between ML and AI include:
- ML aims to increase accuracy, often described by confidence intervals, whereas AI aims to achieve a goal and is less focused on accuracy.
- ML learns from data generated by tasks and actions, whereas AI uses computer programs to make decisions or apply logic, possibly using ML outputs as inputs to an AI program.
- ML focuses on acquiring knowledge or skills by learning from many observations over time, optimizing its own model to improve accuracy, whereas AI's goal is to mimic human responses and decision-making.
Which brings me to the question: What are we cybersecurity professionals and organizations looking to get out of AI and ML? The answer depends on what you are attempting to accomplish.
Augmentation/Automation of Cybersecurity Processes
To date, the most successful use of AI and ML in cybersecurity has been malware detection. By supplying machines with samples of both benign and malicious executable code, we have taught them to distinguish normal operations from abnormal ones. This is how many next-generation anti-virus tools work: they are constantly learning, building unique graphs of how applications interact with systems, how users interact with applications, and how applications and users interact with data and with other users and computers on the network.
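To make this concrete, here is a minimal sketch of the supervised approach just described. Everything in it is an illustrative assumption rather than any vendor's actual method: the byte-histogram features, the placeholder sample paths, and the choice of a random forest classifier.

```python
# Minimal sketch of supervised malware classification, assuming you already
# have labeled benign and malicious executables on disk. Byte-histogram
# features and a random forest are illustrative choices only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def byte_histogram(path: str) -> np.ndarray:
    """Represent an executable as the normalized frequency of each byte value."""
    data = np.fromfile(path, dtype=np.uint8)
    counts = np.bincount(data, minlength=256).astype(float)
    return counts / max(counts.sum(), 1.0)

# Placeholders for your labeled corpus.
benign_paths = ["samples/benign/a.exe", "samples/benign/b.exe"]
malicious_paths = ["samples/malware/c.exe", "samples/malware/d.exe"]

X = np.array([byte_histogram(p) for p in benign_paths + malicious_paths])
y = np.array([0] * len(benign_paths) + [1] * len(malicious_paths))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
model = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))
```

In practice, the tools do far more than a static byte histogram, but the shape of the problem – labeled good and bad samples feeding a classifier – is the same.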
What we need now is to teach systems to augment and/or automate a variety of cybersecurity processes to achieve better outcomes, such as saving time and money by using algorithms and models to perform much of the initial triage that analysts must do manually today. Additionally, many low-severity and informational alerts in Security Operations Centers (SOCs) currently go unattended due to a shortage of time and personnel. Using AI and ML to apply initial triage and see whether any of these alerts are related to one another is low-hanging fruit and a great step forward.
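One simple first pass at that triage is to group low-level alerts that share a source within a short window, so related activity surfaces as a single candidate incident. In this sketch, the field names (src_host, severity, rule, ts), the window size, and the escalation threshold are all assumptions about your alert schema, not any particular SIEM's format:

```python
# Illustrative first-pass triage: cluster low/informational alerts that
# share a source host within a 15-minute window, then escalate any host
# whose cluster crosses a threshold.
from collections import defaultdict
from datetime import datetime, timedelta

alerts = [
    {"ts": datetime(2023, 5, 1, 9, 0), "src_host": "ws-114", "severity": "low",  "rule": "rare_port"},
    {"ts": datetime(2023, 5, 1, 9, 4), "src_host": "ws-114", "severity": "low",  "rule": "new_process"},
    {"ts": datetime(2023, 5, 1, 9, 7), "src_host": "ws-114", "severity": "info", "rule": "dns_anomaly"},
]

WINDOW = timedelta(minutes=15)
groups = defaultdict(list)  # host -> list of alert clusters

for a in sorted(alerts, key=lambda a: a["ts"]):
    clusters = groups[a["src_host"]]
    # Start a new cluster if none exists or the gap exceeds the window.
    if not clusters or a["ts"] - clusters[-1][-1]["ts"] > WINDOW:
        clusters.append([])
    clusters[-1].append(a)

for host, clusters in groups.items():
    for cluster in clusters:
        if len(cluster) >= 3:
            print(f"escalate: {len(cluster)} related low-level alerts on {host}")
```

A real system would learn these relationships rather than hard-code them, but even this much catches sequences that individual alert review misses.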
Consider this: Many attackers today attempt to evade our defensive systems. If they can stay under the radar – hiding among the thousands of low-level alerts that go unexamined – it is less likely they will get caught, and the organization might never know it was compromised. As with the malware example above, the first use that comes to mind for most of us is enhancing our detection and prevention abilities. But have you considered using AI and ML to augment response actions such as containment, ticket creation, and user engagement to triage and/or validate a suspicious action? Many of these activities are within the realm of possibility for AI and ML today, and they offer true benefits, such as improving Service Level Agreements, reducing the time spent on each alert, and improving the Mean Time To Recovery (MTTR).
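As a rough illustration of what augmented response could look like, here is a sketch where every function name (isolate_host, open_ticket, ask_user_to_verify) is a hypothetical stub standing in for your EDR, ticketing, and messaging integrations – none of these are real vendor APIs:

```python
# Hypothetical augmented-response sketch: stubs stand in for real
# containment, ticketing, and user-engagement integrations.
def isolate_host(host: str) -> None:
    print(f"[containment] network-isolating {host}")

def open_ticket(summary: str) -> str:
    print(f"[ticketing] opened ticket: {summary}")
    return "TICKET-0001"

def ask_user_to_verify(user: str, action: str) -> None:
    print(f"[user engagement] asking {user} to confirm: {action}")

def respond(alert: dict) -> None:
    """Route an alert to proportionate response actions based on severity."""
    open_ticket(f"{alert['rule']} on {alert['host']}")
    if alert["severity"] == "high":
        isolate_host(alert["host"])   # contain first, investigate after
    else:
        ask_user_to_verify(alert["user"], alert["rule"])

respond({"rule": "impossible_travel", "host": "ws-114",
         "user": "jdoe", "severity": "low"})
```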
Though AI and ML remain in the early stages of adoption and expansion, there are many business challenges and use cases the cybersecurity community is eager to address in the very near future, including the massive digital transformation underway and the never-ending stream of alerts coming in for triage at SOCs today.
Implications of a Continuous Exponential Growth in Data
According to the International Data Corporation, data will grow 61% to 175 zettabytes (ZB) by 2025, which is equivalent to the data stored on 250 billion DVDs, according to the University of California, Berkeley. The majority of this data will reside in cloud and data center environments. Other interesting insights from the same reports: 90 ZB of data will be created by Internet of Things (IoT) devices, nearly 30% of all data will be consumed in real time, and almost half (49%) of data will be stored in public clouds – all by 2025. Furthermore, the growth in data, especially real-time data, correlates with the number of devices connected to private and public networks, which shows no sign of slowing down anytime soon. Next-generation cyber warriors will be dealing with real-time data volumes never seen before, making it difficult to spot, assess, and act on future attacks. Attackers will likely attempt to disguise themselves and blend into the noise, using AI and ML techniques to mask their malicious intent. The challenge for cybersecurity professionals and organizations is how to harness automation and build next-generation SOCs using AI and ML, which will be crucial for keeping up with the volume and velocity of cyber-related data created by more users and more machines.
The Alerting Nightmare Many SOCs Face
Millions of daily alerts: this is a normal day for a SOC manager, and it raises several challenges:
- Eliminating false positives to focus effort on prioritizing “real” alerts based on severity and probability.
- Reviewing all alerts may be impossible.
- Many SOCs will ignore some alerts because they are considered low-level or have fired off too many false positives. Remember, however, that 10-15 low-level alerts, when combined and considered in sequence, could equal a high-severity alert translating into a full compromise.
- It is common for many SOCs to fall into the alert-fatigue trap and fail to consider how adversaries operate. Like military operatives, they always attempt to fly below the radar, focusing on exploiting the weaknesses they believe SOCs are least likely to give careful attention.
What are some solutions to these challenges? Writing correlation/behavioral rules can help, but this has its own limitations, and such rules can be easy to evade if not written correctly. Writing behavioral rules is also complex and requires unique skills – which are in short supply within the cyber community right now. A better approach is to build profiles for users, workstations, servers, networking devices, etc., and use ML to flag anomalies and classify behavioral patterns. This approach scales more easily and can address two major problems: the cybersecurity talent shortage, and attacks that attempt to evade detection systems by hiding in the noise, as described above.
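Here is a minimal sketch of that profile-based approach: fit a model per entity (user, workstation, server) on its historical behavior and score new activity against it. The three features – login hour, megabytes uploaded, distinct hosts touched – and the use of scikit-learn's IsolationForest are illustrative stand-ins for whatever your telemetry actually provides.

```python
# Profile-based anomaly detection sketch: one model per entity, trained on
# that entity's historical behavior, flags drift from the baseline.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic history for one user: [login_hour, MB_uploaded, distinct_hosts].
history = np.column_stack([
    rng.normal(9, 1, 500),    # typically logs in around 09:00
    rng.normal(20, 5, 500),   # typically uploads ~20 MB/day
    rng.normal(3, 1, 500),    # typically touches ~3 hosts/day
])

profile = IsolationForest(contamination=0.01, random_state=0).fit(history)

today = np.array([[3.0, 900.0, 40.0]])  # 03:00 login, 900 MB out, 40 hosts
if profile.predict(today)[0] == -1:     # -1 means outlier
    print("anomaly: behavior drifted from this user's baseline,",
          "score:", profile.score_samples(today)[0])
```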
Using AI and ML together, anomalies can be generated and then passed through a series of AI models to determine their probability and severity, as well as whether any specific event crosses a threshold that should trigger an alarm. For example, an anomaly is triggered by a user behavioral pattern that has drifted from its normal operation. It is then analyzed by machines to determine whether that anomaly has occurred before, at what frequency, and whether it can be predicted with a reasonable level of probability that the event is actually abnormal. If so, it may be passed to an AI bot that texts the user or messages an alternate email address to ask whether they prompted the action. If they did not, an alarm is triggered and the incident response process begins. This is just one of many examples of how AI and ML can help with alert fatigue and with scaling limited talent, while also enabling responses in seconds rather than minutes or hours.
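A compressed sketch of that chain follows. The threshold, the novelty discount for recurring anomalies, and the text_user/raise_alarm stubs are all assumptions standing in for real models and integrations:

```python
# Second-stage triage of an anomaly emitted by the ML layer: weigh how
# often it has recurred, then either verify with the user or raise an alarm.
from collections import Counter

seen_anomalies: Counter = Counter()   # how often each anomaly type recurs

def text_user(user: str, action: str) -> bool:
    # Stand-in for an AI bot reaching the user out-of-band (SMS/alt email).
    print(f"[bot] {user}, did you just {action}? (y/n)")
    return False                      # assume "no" for the demo

def raise_alarm(user: str, action: str) -> None:
    print(f"[alarm] unverified anomaly for {user}: {action} -> start IR")

def handle_anomaly(user: str, action: str, ml_score: float) -> None:
    seen_anomalies[(user, action)] += 1
    # Anomalies seen many times before get discounted as probable noise.
    novelty = 1.0 / seen_anomalies[(user, action)]
    if ml_score * novelty < 0.5:
        return                        # below threshold: log and move on
    if not text_user(user, action):   # user did not confirm the action
        raise_alarm(user, action)

handle_anomaly("jdoe", "upload 900 MB to an unknown host", ml_score=0.9)
```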
How Organizations Should Start Using AI and ML for Cybersecurity
First, organizations should define what they want to teach a system in order to augment and/or automate response actions. In my opinion, AI and ML are at a point where they can be of more help in augmenting than in automating. (The day when machines can fully think like humans may come, but operationally, that is many years away.) Having worked in several SOCs and managed my own Managed Security Service Provider, I have seen analysts and incident responders performing the same tasks day in and day out – tasks that could be augmented by AI. The AI-bot example above is one way this could work. In another example, Thomas Caldwell of Webroot gave a great demo of his AI bot using Amazon's Alexa device in an RSA Conference session named "Evolution of AI Bots for Real-time Adaptive Security."
Here's one note of caution: using AI/ML for response actions is complicated. It requires the right skills and data to be analyzed in order to create feature sets that are viable candidates for AI and ML modeling and that are specific to how your organization operates. If a vendor offers you a solution that simply runs ML algorithms over your data and calls that AI or behavioral detection, turn and run as fast as you can! The great news is that today, a good portion of the data created by machines and users is in standardized formats and languages, making it easier for the cyber community to share and build on others' AI and ML feature sets. Keep in mind, however, that just as defenders are using AI and ML, adversaries are as well. To get started, I recommend following some best practices that include, but are not limited to:
- Spend the time to create well-defined outcomes and measurable targets for the challenges you need to solve in security. What tasks/problems do you want a machine to learn, and what tolerance do you have for false positives?
- Consider whether simpler tools could achieve a more efficient and effective outcome than an AI or ML model. For example, running models to learn where applications normally execute from can help detect abnormal operations, but it might be easier to standardize and baseline where all applications are allowed to run and trigger an alert if one executes from a file path outside the baseline (see the first sketch after this list). Application whitelisting is an even simpler solution to this problem.
- Pair data scientists with cybersecurity experts to brainstorm features derived from your existing cybersecurity data that address a relevant problem/outcome. The data being generated on the network is not necessarily what gets fed into the model. For instance, financial traders using algorithms to hedge stock portfolios tend to feed in ratios computed from lower-level data such as price fluctuations and trade volumes (see the second sketch after this list).
- Experiment with various combinations of features, algorithms, and classifiers to find the model that best fits the desired outcome and achieves the measurable target you established beforehand (see the third sketch after this list). If the model doesn't predict an event at the same rate as a human or better, what value does it provide? Trial and error is the best approach, and interestingly, machines can learn from these trial experiments as well, helping to automate feature selection, though that capability is still early in its maturity.
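First sketch, for the baseline alternative in the second bullet: record the directories that applications normally run from, then alert on anything executing outside that allow-list. The example uses the psutil library for process enumeration; the baseline paths are Windows examples, not a recommendation.

```python
# Simple baseline alternative to an ML model: alert on any process
# executing outside a known-good set of directories.
import os
import psutil

ALLOWED_DIRS = {r"C:\Program Files", r"C:\Program Files (x86)", r"C:\Windows"}

def outside_baseline(exe_path: str) -> bool:
    return not any(os.path.normcase(exe_path).startswith(os.path.normcase(d))
                   for d in ALLOWED_DIRS)

for proc in psutil.process_iter(attrs=["pid", "name", "exe"]):
    exe = proc.info.get("exe")
    if exe and outside_baseline(exe):
        print(f"alert: {proc.info['name']} (pid {proc.info['pid']}) "
              f"running from unexpected path {exe}")
```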
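Second sketch, for the feature-derivation bullet: raw connection logs become per-host ratios, mirroring how traders feed ratios rather than raw ticks into their models. The column names are assumptions about your log schema, and the two ratios are illustrative signals only.

```python
# Derive model-ready ratios from raw per-host log records.
import pandas as pd

raw = pd.DataFrame({
    "host":          ["ws-114", "ws-114", "db-01", "db-01"],
    "bytes_in":      [1200, 900, 50000, 48000],
    "bytes_out":     [90000, 87000, 1000, 1200],
    "failed_logins": [0, 1, 6, 9],
    "total_logins":  [10, 12, 10, 11],
})

agg = raw.groupby("host").sum(numeric_only=True)
# Exfiltration-style signal: how lopsided is outbound traffic?
agg["out_in_ratio"] = agg["bytes_out"] / agg["bytes_in"].clip(lower=1)
# Brute-force signal: what share of logins fail?
agg["fail_ratio"] = agg["failed_logins"] / agg["total_logins"].clip(lower=1)
print(agg[["out_in_ratio", "fail_ratio"]])
```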
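Third sketch, for the experimentation bullet: compare several candidate classifiers against a pre-set measurable target using cross-validation. The data here is synthetic and deliberately imbalanced, like real alert streams; in practice X and y come from your engineered feature sets, and the F1 target is whatever you committed to beforehand.

```python
# Trial-and-error loop: score candidate models against a fixed target.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95],
                           random_state=0)   # 95% benign, like real alerts

TARGET_F1 = 0.60   # the measurable target established beforehand

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gboost": GradientBoostingClassifier(random_state=0),
}

for name, model in candidates.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    verdict = "meets target" if f1 >= TARGET_F1 else "below target"
    print(f"{name}: mean F1 = {f1:.3f} ({verdict})")
```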
I'm excited, as many cybersecurity professionals are, at how far AI and ML have come and where these capabilities are heading. They are a great way to augment operations today, freeing resources to solve greater challenges and helping mature SOCs around the world. AI and ML will not solve all of our problems in the near future, but if they help solve even one or two major challenges, I think many SOC managers and leaders will find that immensely valuable in the important mission of protecting critical assets from all types of malicious cyber actors.
About the Author
John Harrison is the Director of Criterion Systems’ Cybersecurity Center of Excellence. With more than 15 years of experience in the security industry, he helps design cybersecurity programs to protect government customers. He is a combat service-disabled veteran who served eight years in the US Marine Corps as an intelligence operator and foreign military combat trainer. Following his military career, he spent several years in the Intelligence Community. He has a bachelor’s degree in criminal law, an MBA from Georgetown, and is a certified ethical hacker and incident handler from EC-Council and SANS GIAC, respectively.