Measuring the Efficacy of Black-Box Attacks against Autonomous Reinforcement-Learning-based Cybersecurity Tools FY21

Domingo Colon | 21-FS-039

Project Overview

Reinforcement learning-based approaches to automating network security toolchains have been widely studied and deployed within operational environments. Far less understood is the potential attack surface that defenders inherit when deploying these cutting-edge artificial intelligence (AI) solutions. Addressing these unknowns is critical, as a growing number of commercial and government organizations have begun to investigate the use of reinforcement learning-based approaches to help manage the scale and complexity of the current cybersecurity threat landscape. In this feasibility study, we executed an experimental demonstration that highlighted the potential disruptive capacity of an ensemble of openly published adversarial machine learning techniques, evaluating their impact on a novel reinforcement learning-based network mapping technology. In a series of scripted trials, the robustness of the reinforcement learning-based recommendation engine within the network mapping tool was evaluated by measuring the impact of the attacks on the quality of its network mapping recommendations. Each trial subjected the test system to a wide range of adversarial machine learning attack vectors.
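
To make the structure of such a trial concrete, the following is a minimal, purely illustrative sketch in Python: a toy, bandit-style Q-learning recommender is trained once on clean feedback and once under a simple reward-poisoning attack, and a top-k recommendation-quality metric is compared across the two runs. The environment, reward values, attack form, and quality metric are assumptions chosen only for illustration; they are not the network mapping tool or the attack ensemble evaluated in this study.

```python
# Purely illustrative sketch: a toy reward-poisoning experiment against a
# tabular (bandit-style) Q-learning "recommendation engine". The environment,
# reward values, attack form, and quality metric are hypothetical stand-ins,
# not the network mapping tool or attack suite evaluated in this study.
import random

N_HOSTS = 10            # hypothetical flat network of candidate scan targets
HIGH_VALUE = {2, 5, 7}  # targets a correct recommendation engine should prefer

def true_reward(target):
    """Ground-truth feedback: high-value hosts are worth scanning."""
    return 1.0 if target in HIGH_VALUE else 0.1

def poisoned_reward(target, flip_prob=0.5):
    """Reward-manipulation attack: invert the feedback signal at random."""
    r = true_reward(target)
    return (1.1 - r) if random.random() < flip_prob else r

def train(reward_fn, episodes=5000, eps=0.1, lr=0.1):
    """Learn a Q value per target with epsilon-greedy exploration."""
    q = [0.0] * N_HOSTS
    for _ in range(episodes):
        if random.random() < eps:
            a = random.randrange(N_HOSTS)
        else:
            a = max(range(N_HOSTS), key=q.__getitem__)
        q[a] += lr * (reward_fn(a) - q[a])
    return q

def quality(q, k=3):
    """Fraction of the top-k recommendations that are truly high value."""
    top = sorted(range(N_HOSTS), key=q.__getitem__, reverse=True)[:k]
    return len(set(top) & HIGH_VALUE) / k

random.seed(0)
print("clean recommendation quality   :", quality(train(true_reward)))
print("poisoned recommendation quality:", quality(train(poisoned_reward)))
```

Because the poisoned feedback carries little information about true target value, the learned ranking (and therefore the quality metric) degrades relative to the clean run; the scripted trials in this study followed the same clean-versus-attacked comparison pattern, albeit against the full tool and attack suite.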

Our experimental results highlight that carefully crafted data poisoning and reward manipulation attacks can reduce the efficacy of the ML models that underlie autonomous network security tools. In our trials, the adversarial machine learning attacks reduced system efficacy by as much as 50%. Furthermore, well-executed adversarial examples allowed the attacker to direct the target selection of the automated network mapping engine, steering network mapping activity away from the legitimate network space and toward low-value, pre-placed honeypot systems. We believe this study helps fill a gap in the existing literature by providing insights into the robustness of reinforcement learning-based solutions when confronted with the operational realities of cyberspace. The results also highlight, for the security operations community, important deployment considerations that must be addressed at the outset of any operations planning process.
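
The honeypot-steering outcome can be illustrated in the same toy setting (reusing N_HOSTS, HIGH_VALUE, true_reward, and train from the sketch above). Here a hypothetical targeted reward manipulation makes a single decoy appear lucrative while genuinely high-value hosts appear worthless, so the engine's top recommendation shifts to the decoy; the decoy index and the form of the manipulation are assumptions for illustration only.

```python
# Continuation of the sketch above (reuses N_HOSTS, HIGH_VALUE, true_reward,
# and train). The decoy index and the form of the manipulation are
# hypothetical, chosen only to illustrate targeted steering.
HONEYPOT = 9  # hypothetical pre-placed, low-value decoy

def steering_reward(target):
    """Attacker-shaped feedback: the decoy looks lucrative, while genuinely
    high-value hosts look worthless."""
    if target == HONEYPOT:
        return 1.0
    if target in HIGH_VALUE:
        return 0.0
    return true_reward(target)

q_steered = train(steering_reward)
ranking = sorted(range(N_HOSTS), key=q_steered.__getitem__, reverse=True)
print("top recommended target:", ranking[0])
print("decoy outranks every legitimate high-value host:",
      ranking.index(HONEYPOT) < min(ranking.index(t) for t in HIGH_VALUE))
```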

Mission Impact

The results presented in this feasibility study have enhanced fundamental understanding of the robustness of several important classes of machine learning models and their susceptibility to coercion by adversaries. The experience gained from these experiments will allow Lawrence Livermore National Laboratory (LLNL) researchers to better understand the threats directed against AI/ML-based resources and to help secure a growing base of mission-essential LLNL and DOE software applications that depend on ML technologies. This study has provided the key insights needed to begin designing and implementing the next generation of state-of-the-art, cyber-focused machine learning models and algorithms.

These findings will also form the basis for potential future research efforts and programs, including a more extensive LDRD ER effort focused on advancing the state of the art in developing classes of models that are more robust to the threats posed by key cyber threat actors.

Finally, the results of this study are of immediate interest to LLNL's current government sponsor base. The security of AI/ML resources is a nascent field, with many opportunities for LLNL researchers to serve as trusted partners in helping secure critical DOE and DoD AI/ML assets.