DENAS: Deep Neuroevolution at Scale
Brian Van Essen | 21-ERD-026
Project Overview
The goal of the Deep Neuroevolution at Scale (DENAS) project was to develop new algorithms for exploring the space of viable neural network architectures for Scientific Machine Learning (SciML) applications, with a specific focus on using evolutionary approaches for the neural architecture search (NAS) algorithm. Specifically, it aimed to reduce the time required for scientists to find and optimize neural network architectures for SciML applications that involve unusual data sets, modalities, or learning objectives. The current state-of-the-art neural network architectures for natural language processing (NLP) and vision tasks have been honed through manual scientific discovery and discourse over many years. SciML applications typically involve bespoke instruments and analysis tasks that are very different from standard NLP and vision applications. As such, for each of these SciML applications, researchers can either bootstrap their model development from existing neural network architectures that were optimized for other tasks (e.g., NLP or vision) or start from a clean slate. Neural architecture search and neuroevolutionary algorithms offer the promise of rapidly optimizing neural network architectures for each SciML application, provided sufficient computing power is available.
One of the key innovations of this project was the close coupling of iterative training methods with the NAS algorithm, enabling frequent evaluation of architectural options and the subsequent expansion and culling of the set of architectures being explored. This tight integration allowed for more efficient use of training compute time, as unsuccessful candidate architectures could be quickly evaluated and discarded. While the project produced some interesting results, the fundamental challenge for NAS algorithms remains outperforming the highly optimized, hand-tuned architectures developed by the community. On multiple standard vision benchmarks, the NAS-designed networks were unable to exceed the performance of the hand-tuned baseline. However, for many SciML applications where there is no large community effort to find the best architecture for a given data set or application, the reduced time to solution from a NAS algorithm can provide a huge benefit.
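The coupling of iterative training with expansion and culling can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the DENAS or XNAS implementation: the architecture encoding (a tuple of hidden-layer widths), the names Candidate, potential, train_increment, mutate, and evolve, and the simulated scoring function are all hypothetical stand-ins for real partial training and validation.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    widths: tuple       # hypothetical encoding: hidden-layer widths
    score: float = 0.0  # proxy for validation quality (higher is better)
    steps: int = 0      # training budget spent so far


def potential(widths):
    """Toy stand-in for the best score an architecture could reach.

    Assumption for illustration only: the optimum is three layers of width 64.
    """
    distance = abs(len(widths) - 3) + sum(abs(w - 64) for w in widths) / 64.0
    return 1.0 / (1.0 + distance)


def train_increment(cand, steps=10):
    """Stand-in for a short burst of training: move halfway to the potential."""
    cand.score += 0.5 * (potential(cand.widths) - cand.score)
    cand.steps += steps


def mutate(widths, rng):
    """Return a perturbed copy of an architecture (widen, narrow, add, or drop)."""
    widths = list(widths)
    op = rng.choice(["widen", "narrow", "add", "drop"])
    i = rng.randrange(len(widths))
    if op == "widen":
        widths[i] *= 2
    elif op == "narrow":
        widths[i] = max(1, widths[i] // 2)
    elif op == "add":
        widths.insert(i, widths[i])
    elif op == "drop" and len(widths) > 1:
        del widths[i]
    return tuple(widths)


def evolve(pop_size=8, generations=12, seed=0):
    rng = random.Random(seed)
    population = [Candidate(widths=(rng.choice([8, 16, 32, 128]),))
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Iterative training: every candidate gets only a small budget per round.
        for cand in population:
            train_increment(cand)
        # Culling: keep the better half, discard the rest early.
        population.sort(key=lambda c: c.score, reverse=True)
        survivors = population[: pop_size // 2]
        best_history.append(survivors[0].score)
        # Expansion: refill the population with mutated copies of survivors.
        children = [Candidate(widths=mutate(rng.choice(survivors).widths, rng))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return population, best_history


if __name__ == "__main__":
    population, history = evolve()
    best = max(population, key=lambda c: c.score)
    print(best.widths, round(best.score, 3), best.steps)
```

In this sketch, no candidate is trained to completion before being ranked, so weak architectures consume only a few training increments before they are discarded. Newly created children start untrained and therefore compete at a disadvantage against longer-trained survivors; a real system would need some policy for that, which this toy does not attempt to model.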
Mission Impact
As Artificial Intelligence (AI) and associated sub-domains, such as Deep Learning (DL) and Scientific Machine Learning (SciML), become invaluable tools within both the scientific process and the national security mission space, there will be a rapid proliferation of new applications and data sets that demand optimized neural network architectures. The traditional approach of manually tuning the hyperparameters of neural network architectures is too slow to enable the rapid adoption of AI within LLNL's mission spaces. HPC-enabled NAS algorithms will be a key approach to rapidly optimizing these neural networks for each application domain and will accelerate the deployment of AI methods across LLNL's programs.
Publications, Presentations, and Patents
Explainable Neural Architecture Search (XNAS) source code. https://github.com/LLNL/XNAS.