AIwareness
Started in August 2021
Targeted phishing on employees is a persistent and concerning threat. Detecting it accurately is therefore a valuable challenge to solve, since it enables organizations to stop adversaries early in the kill chain.
According to FireEye, spear phishing was the most used attack for initial access by APTs in 2010, and it still was in 2020. Over the past two years the PCSI partners have developed several machine-learning technologies for detecting targeted phishing on employees. The main challenge in this research was validating the performance of these detection technologies, because manually confirming whether an email is malicious is time-consuming and usually involves both the targeted employee and a security expert.
The employee knows best whether an email is contextually ‘normal’, a computer is best at detecting technical abnormalities, and the security analyst is the right person to draw the final conclusion on whether an email is malicious or benign.
In this project we propose to develop a system in which the employee (the target), the SOC analyst, and the detection capabilities all come together in a positive feedback loop. It will help the employee to be resilient against phishing at the right time, it will decrease the workload for SOC analysts, and the feedback from both the employee and the SOC analyst will help improve the performance of the detection algorithm. We believe this idea is unique in that it proposes to use machine-learning-based detection technologies, the employee’s contextual knowledge, and the SOC analyst’s expertise, all in synergy.
Expected gains of the AIWARENESS solution at the end of the project:
- Improved employee resilience against phishing, supported by machine learning based detection capabilities that can alert a user at the right time.
- Continuous reduction of workload for SOC analysts when handling phishing alerts.
- Continuous improvement of detection algorithms, utilizing all incoming feedback from employees and SOC analysts.
Why do we want to work on this idea within the PCSI?
Since targeted phishing on employees is a persistent and hard-to-solve challenge, we believe that cooperation between the PCSI partners can bring us one step further. The fact that spear phishing has been the most popular method for initial access by APTs over the past 10 years, and still is, also shows that no market product has fully solved this challenge.
Conclusions at the end of the Explore phase
In the first phase of this project we explored whether the proposed solution adds value for the PCSI partners alongside the anti-phishing solutions currently in use. Multiple PCSI organizations are interested in the outcomes of this research, and at least one partner is willing to explore implementation of a prototype.
In order to develop the proposed solution we plan on utilizing three concepts in one combined solution. These concepts are:
- Active learning: a machine learning technique that can help to efficiently label data, and therefore efficiently improve an existing model. We can potentially utilize the employee as a so-called oracle for labelling emails as contextually suspicious or not, and we can utilize the SOC analyst as an oracle for labelling emails as malicious or benign.
- Explainable AI: a technique that enables a machine learning model to explain how it assesses a certain data sample. In our case this would mean that we can explain why a model assesses a certain email as malicious or benign.
- Empowering the user: a research topic focusing on presenting technical information understandably to a non-technical user.
But how will we combine these techniques? Active learning needs an oracle (a human expert) to label data, which in our use case would be the employee, who has the most contextual knowledge about their own email inbox. However, an employee generally lacks the necessary technical cyber security knowledge, so we will provide the employee with information from the technical assessment performed by a computer, for which we will use explainable AI. Generally speaking, explainable AI still produces information that is technical in nature. Therefore we need the ideas from research on presenting technical information understandably in order to empower the employee.
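To make the combination of the three concepts concrete, the following is a minimal sketch and not the project's implementation. It assumes a simple linear scoring model and uses uncertainty sampling as one common active learning strategy. The feature names, weights, and plain-language phrases are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical features and weights; a real detector would learn these from data.
WEIGHTS = {"sender_never_seen": 1.8, "link_domain_mismatch": 2.2, "urgent_wording": 0.9}
BIAS = -2.5

# Hypothetical plain-language phrasing of each technical feature
# (the "empowering the user" step).
PLAIN_TEXT = {
    "sender_never_seen": "You have not received mail from this sender before.",
    "link_domain_mismatch": "A link points to a different site than its text suggests.",
    "urgent_wording": "The message pressures you to act quickly.",
}


def score(features):
    """Return the model's estimated probability that an email is malicious."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def most_uncertain(emails):
    """Uncertainty sampling: pick the email whose score is closest to 0.5."""
    return min(emails, key=lambda e: abs(score(e["features"]) - 0.5))


def explain(features, top_n=2):
    """Return plain-language reasons for the features that raise the score most."""
    contributions = {
        name: WEIGHTS[name] * value
        for name, value in features.items()
        if WEIGHTS[name] * value > 0
    }
    ranked = sorted(contributions, key=contributions.get, reverse=True)
    return [PLAIN_TEXT[name] for name in ranked[:top_n]]


emails = [
    {"id": "a", "features": {"sender_never_seen": 0, "link_domain_mismatch": 0, "urgent_wording": 0}},
    {"id": "b", "features": {"sender_never_seen": 1, "link_domain_mismatch": 0, "urgent_wording": 1}},
    {"id": "c", "features": {"sender_never_seen": 1, "link_domain_mismatch": 1, "urgent_wording": 1}},
]

# The email the model is least sure about is the one worth asking the employee about.
query = most_uncertain(emails)
reasons = explain(query["features"])
print(f"Ask the employee about email {query['id']}:")
for reason in reasons:
    print(" -", reason)
```

In this sketch the model is confident about emails "a" and "c", so only email "b" is put to the employee, together with the reasons the model found it suspicious. The employee's answer would then be stored as a label for retraining.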
Conclusions at the end of the PoC phase
In the PoC phase of this project we focussed on technical feasibility. We took an existing detection tool, CERTITUDE (which was partly developed in a predecessor of the PCSI, the SRP), and extended it with both active learning and explainable AI functionality. Furthermore, we took the first exploratory steps towards implementation of a pilot at De Volksbank, one of the involved PCSI core partners. At the end of the PoC phase we demonstrated, by means of a PoC demo, that the proposed solution is technically feasible.
Pilot phase activities
In the Pilot phase we will bring the developed PoC one step closer to the end users (employees and probably also the SOC analysts). We will execute three activities in parallel:
- data science experiments to identify how the interplay between exploration (training) and exploitation (operational use) of our solution should be designed
- human factors experiments to determine when and how we will interact with the employee
- functional design of how the tool can be implemented at De Volksbank
In the extended Pilot phase we will develop a Minimum Viable Product in which we combine the outcomes of the above three activities.
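As an illustration of the exploration/exploitation interplay named in the first activity, the sketch below shows one possible policy, an epsilon-greedy rule. It is an assumption for illustration only; the actual design is what the data science experiments are meant to determine. The function name `decide_action` and the parameters `epsilon` and `alert_threshold` are hypothetical.

```python
import random


def decide_action(score, rng, epsilon=0.1, alert_threshold=0.8):
    """Return 'ask_feedback', 'alert', or 'deliver' for one email.

    epsilon and alert_threshold are hypothetical tuning knobs.
    """
    if rng.random() < epsilon:
        # Exploration: ask the employee for a label to train on.
        return "ask_feedback"
    if score >= alert_threshold:
        # Exploitation: act on what the current model already believes.
        return "alert"
    return "deliver"


rng = random.Random(0)  # fixed seed so the run is repeatable
scores = [rng.random() for _ in range(1000)]  # synthetic scores, not real data
actions = [decide_action(s, rng) for s in scores]
counts = {a: actions.count(a) for a in ("ask_feedback", "alert", "deliver")}
print(counts)
```

A higher `epsilon` yields more labels for training but interrupts employees more often, which is exactly the trade-off the human factors experiments would also need to weigh.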
This project is part of the trend