BASELINE • November 2020

An ML Newsletter from Novetta

One year ago we launched the Baseline to provide insight into machine learning topics that we found interesting. During that time 19 different authors, including 11 interns, have covered a broad range of topics relevant to the Defense and Intelligence Communities.

With this anniversary, we reflect on important innovations of the past year and anticipate things to come in 2021.

2020 Highlights

Hugging Face Transformers

While we often cite the 2018 releases of ULMFiT and BERT as important moments for natural language processing (NLP), Hugging Face Transformers has accelerated the adoption of BERT and its many variants by making it easier to find and implement models as they are released. The Hugging Face model hub contains over 1000 user-submitted models that can be applied to over 100 languages for common NLP tasks. Motivated by these developments, we created an open-source library, AdaptNLP, to further simplify the fine-tuning and deployment of these models.

Model Explainability

Machine learning models have at times been described as black boxes due to the lack of insight into their internal functions. Organizations including Google, Uber, Facebook, and OpenAI have released tools to help users better understand model behavior, discover hidden biases, and identify where models might fail. Google’s What-If Tool, Uber’s Manifold, and PyTorch’s Captum allow users to visually explore model results to help them understand how predictions shift in response to changes in data attributes. OpenAI developed Lucid to allow users to peer into neural networks and get a better understanding of what is happening at each network layer.
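
The idea of watching a prediction shift when a data attribute changes can be illustrated with a minimal sketch. It does not use the API of the What-If Tool, Manifold, or Captum; the toy model, its weights, and the feature names "income" and "debt" are all hypothetical stand-ins chosen for illustration.

```python
import math


def toy_model(features):
    """Hypothetical logistic scorer standing in for a trained model."""
    score = 0.8 * features["income"] - 1.2 * features["debt"] + 0.1
    return 1.0 / (1.0 + math.exp(-score))


def what_if(model, example, attribute, new_value):
    """Return (original prediction, prediction after changing one attribute)."""
    changed = dict(example)  # copy so the original example is not modified
    changed[attribute] = new_value
    return model(example), model(changed)


# Hypothetical example: ask what happens if "debt" were higher.
example = {"income": 1.0, "debt": 0.5}
before, after = what_if(toy_model, example, "debt", 1.5)
shift = after - before  # negative here: higher debt lowers the score
```

Repeating this comparison across many examples and attributes is, roughly, what the visual tools make interactive.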

Ethics in Artificial Intelligence and Machine Learning

As with any technology, machine learning requires practitioners to understand the implications of the capabilities they are developing. While the promise of machine learning is that it can increase efficiency through automation, the industry needs to ensure that in doing so we are not furthering personal or societal biases. Institutional controls, such as the Department of Defense’s Five Principles of AI Ethics, are important efforts in this direction. At the international level, Novetta SMEs are supporting development of International Organization for Standardization (ISO) technical reports on Bias in AI systems and AI-aided decision making.

What We’re Excited About in 2021

Advances in Audio

While 2018 was the year of NLP, we think 2021 could be the year of machine learning breakthroughs in audio. In addition to open-source automatic speech recognition toolboxes such as wav2letter, we are seeing a growing number of end-to-end open-source speech recognition toolkits. SpeechBrain, a single, flexible toolkit designed to perform a variety of audio ML tasks, is projected to be released in 2021. The Common Voice audio dataset by Mozilla continues to grow and includes hours of labeled audio data for a variety of languages. The increase in available data, together with the rapid development of open-source toolkits and frameworks, will lower the barrier for ML practitioners to develop state-of-the-art models for audio tasks.

Dealing with Small Amounts of Labeled Data

One of the most tedious (and potentially expensive) parts of machine learning projects is acquiring and labeling sufficient training data. Developments in the open-source community are likely to alleviate this challenge. This past summer Novetta interns investigated and implemented data augmentation for NLP, active learning, and semi-supervised learning approaches. Each approach significantly decreased the amount of labeled data needed to train performant models. Looking ahead to 2021, we are excited for developments that enable machine learning practitioners to create realistic, synthetically generated training data suitable for computer vision and structured data use cases.
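
To show how active learning can reduce labeling effort, here is a minimal sketch of uncertainty sampling, one common strategy. The newsletter does not say which strategy the interns used, so this is an assumption, and the document ids and predicted probabilities below are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` unlabeled examples the model is least sure about."""
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [example_id for example_id, _ in ranked[:budget]]


# Hypothetical model outputs on an unlabeled pool: (id, class probabilities).
unlabeled_predictions = [
    ("doc-1", [0.98, 0.02]),
    ("doc-2", [0.55, 0.45]),
    ("doc-3", [0.80, 0.20]),
    ("doc-4", [0.50, 0.50]),
]

# Only the most ambiguous examples are sent to a human annotator.
to_label = select_for_labeling(unlabeled_predictions, budget=2)
```

In a full loop, the newly labeled examples would be added to the training set, the model retrained, and the selection repeated until the labeling budget is spent.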

More Complex Tasking

In an attempt to simplify the task for deep learning models, most training setups associate a single label with each training image. While intended to improve performance, this oversimplification may lead to undesirable predictions when dealing with complex images. For example, to build a model that classifies dogs, all images with dogs will be labeled as “dog.” However, if the images also contain items such as a ball or a park, the model will be penalized for suggesting “ball” or “park” even though those predictions are contextually accurate. We look forward to the development of deep learning models that can tackle complex tasking and understand higher-level concepts. Tasks that once were structured around single ground truth labels will soon be replaced with multi-label annotations (“dog”, “park”, “ball”) or more complex labels (“dog is playing with a ball in the park”).
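
The penalty described above can be made concrete with a small sketch. It compares a single-label loss (softmax cross-entropy) with a multi-label loss (per-label binary cross-entropy) for an image that contains a dog, a ball, and a park. These two loss functions are a common pairing chosen here for illustration, and the logits are hypothetical values rather than outputs of any real model.

```python
import math

LABELS = ["dog", "ball", "park"]


def softmax(logits):
    """Convert logits to probabilities that sum to 1 across labels."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_loss(logits, target_index):
    """Softmax cross-entropy: exactly one label is treated as correct."""
    return -math.log(softmax(logits)[target_index])


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multi_label_loss(logits, targets):
    """Mean binary cross-entropy: each label is scored independently."""
    total = 0.0
    for logit, target in zip(logits, targets):
        p = sigmoid(logit)
        total += -(target * math.log(p) + (1 - target) * math.log(1 - p))
    return total / len(logits)


# Hypothetical logits from two models looking at the same image.
dog_only = [4.0, -4.0, -4.0]   # confident in "dog", rejects "ball" and "park"
all_three = [4.0, 4.0, 4.0]    # confident in all three labels

dog_index = LABELS.index("dog")
single_dog_only = single_label_loss(dog_only, dog_index)
single_all_three = single_label_loss(all_three, dog_index)
multi_dog_only = multi_label_loss(dog_only, [1, 1, 1])
multi_all_three = multi_label_loss(all_three, [1, 1, 1])
```

With the single “dog” label, the model that also recognizes the ball and the park receives the higher loss, because softmax forces the labels to compete. With multi-label targets the ordering reverses, and the contextually accurate prediction is rewarded.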

This research was performed under the Novetta Machine Learning Center of Excellence.


Special thanks for your work on Baseline over the past year to Matt Teschke, Brian Sacash, Shauna Revay, Andrew Chang, Brandon Dubbs, Michelle Jou, Jiong Huang, Sophie Sackstein, Mady Fredriksz, Vikas Shankarathota, Xena Grant, Zach Mueller, Amber Chin, Annie Ghrist, Carlos Martinez, Christian Jung, Jack Buttimer, Emma Cooper, and Joseph Terrigno!


Authors:

Michelle Jou
Shauna Revay, PhD
Matt Teschke