BASELINE • September 2021

An ML Newsletter from Novetta

Welcome to the September 2021 BASELINE, Novetta’s Machine Learning Newsletter, where we share thoughts on important advances in machine learning technologies. This month we cover the following topics:

  • Generative Spoken Language Model (GSLM), an NLP model that doesn’t use text
  • Optimizing Transformers with Hugging Face Optimum
  • A new framework to ease the use of machine learning methods for combinatorial optimization problems
  • Twitter’s bounty program for spotting AI bias

Language Models Trained on Speech

Natural language processing (NLP) has become nearly synonymous with text processing, thanks to the success of text-based language models at text generation and at transferring to a variety of downstream tasks. Recently, researchers at Facebook AI released the Generative Spoken Language Model (GSLM), a textless NLP model trained only on raw, unlabeled audio signals. The model first converts recurring sounds in the audio into discrete “units”. It then predicts the next unit based on the units it has already seen, and finally converts the predicted units back into speech. Previously, using audio data to train a language model required an automatic speech recognition (ASR) model to first convert the audio into text, a step that has traditionally been error-prone. Textless NLP models have additional potential benefits: they may capture and encode emotion, intonation, and sarcasm from the audio, leading to richer features, and they can extend NLP tasks to less common languages for which text datasets are difficult to find. The researchers hope to show that their model can serve as an effective pretrained model for tasks such as spoken summarization or sentiment analysis. They have released their paper and code to the open source community.
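
To make the three steps described above concrete (audio to units, next-unit prediction, units back to audio), here is a toy sketch in Python. It is not the researchers' code: the fixed codebook, the one-number "frames", and the bigram counter are all illustrative assumptions standing in for a learned speech encoder, a neural unit language model, and a speech synthesizer.

```python
from collections import Counter, defaultdict

# Hypothetical codebook: each "unit" ID maps to one representative feature value.
# A real system would learn such units from audio rather than hard-code them.
CODEBOOK = [0.0, 1.0, 2.0, 3.0]


def encode(frames):
    """Map each frame feature to the ID of the nearest codebook entry."""
    return [
        min(range(len(CODEBOOK)), key=lambda u: abs(CODEBOOK[u] - x))
        for x in frames
    ]


def train_bigram(units):
    """Count which unit follows which (a stand-in for a neural language model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(units, units[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, unit):
    """Return the most frequent successor of `unit`, or None if it was never seen."""
    if unit not in counts:
        return None
    return counts[unit].most_common(1)[0][0]


def decode(units):
    """Map unit IDs back to feature values (a stand-in for speech synthesis)."""
    return [CODEBOOK[u] for u in units]


# Made-up one-dimensional "audio features" that roughly repeat a pattern.
frames = [0.1, 0.9, 2.1, 2.9, 0.2, 1.1, 1.9, 3.2, 0.0, 1.0]
units = encode(frames)
model = train_bigram(units)
next_unit = predict_next(model, units[-1])
print(units, next_unit, decode([next_unit]))
```

No text appears anywhere in this pipeline: the language model sees only unit IDs, which is the property that lets the approach work for languages with little written data.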

Optimizing Transformers

Transformer models have been extremely successful at improving accuracy on a variety of tasks in NLP, computer vision, and speech. However, that performance comes at a cost: the models are generally large, and running them efficiently in production can be complicated and expensive. Hugging Face has taken another step on its journey to “democratize ML” by releasing Hugging Face Optimum, an optimization toolkit for Transformers. The toolkit helps ML practitioners put Transformer models into production at scale without needing highly specialized optimization knowledge. Optimizing models for production involves both software and hardware tweaks, and Optimum continues to build out optimized solutions for various hardware systems. Hugging Face released an example that works with Intel Xeon CPUs, and hopes that optimized models for a wide variety of hardware and software setups will be released on the Hugging Face Model Hub in the future. This has the potential to allow organizations of all sizes and resources to create efficient, production-ready Transformer models.

Machine Learning for Combinatorial Optimization

Combinatorial optimization is an active field of research that aims to find efficient, generic algorithms for searching finite decision spaces. Examples include finding the shortest path in a graph and scheduling jobs to make the most efficient use of resources. Recent research has shown that reinforcement learning can be combined with constraint programming paradigms to find solutions to these problems. Unfortunately, general constraint programming frameworks are not well suited to incorporating machine learning methods, which leads to inefficiencies. In a recent paper, the authors introduced SeaPearl, a flexible framework that lets machine learning algorithms and reinforcement learning routines guide the exploration of a tree of partial solutions to constrained optimization problems. The framework is generic, and initial experiments demonstrate performance competitive with common manually designed heuristics. By lowering the barrier to incorporating machine learning, SeaPearl should speed the adoption of learning-based methods and help users solve optimization problems more efficiently across a range of use cases.
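
As a rough illustration of what exploring a tree of partial solutions means, the Python sketch below runs a backtracking search on a small graph-coloring problem. It does not use SeaPearl or its API; the graph, the function names, and the `order_values` hook are all hypothetical. The hook is where a hand-written heuristic sits today and where a learned policy could, in principle, be substituted.

```python
def solve(variables, domains, consistent, order_values, assignment=None):
    """Depth-first search over partial assignments, backtracking on dead ends."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)  # every variable has a value: a full solution
    var = next(v for v in variables if v not in assignment)
    candidates = [val for val in domains[var] if consistent(assignment, var, val)]
    # The heuristic decides which branch of the search tree to try first.
    for val in order_values(assignment, var, candidates):
        assignment[var] = val
        result = solve(variables, domains, consistent, order_values, assignment)
        if result is not None:
            return result
        del assignment[var]  # undo the choice and try the next branch
    return None


# Hypothetical toy problem: color a 4-node graph so that neighbors differ.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
NEIGHBORS = {}
for u, v in EDGES:
    NEIGHBORS.setdefault(u, set()).add(v)
    NEIGHBORS.setdefault(v, set()).add(u)


def no_conflict(assignment, var, val):
    """A value is allowed if no already-colored neighbor uses it."""
    return all(assignment.get(n) != val for n in NEIGHBORS[var])


def smallest_first(assignment, var, candidates):
    """Hand-written value-ordering heuristic; a learned policy could replace it."""
    return sorted(candidates)


variables = ["A", "B", "C", "D"]
domains = {v: [0, 1, 2] for v in variables}
solution = solve(variables, domains, no_conflict, smallest_first)
print(solution)
```

Keeping the ordering heuristic as a plain function argument is the design point: the search loop stays the same whether the branching decisions come from a fixed rule or from a trained model.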

AI Ethics Feature
Twitter’s Ethical Bug Bounty

Last year, Twitter found that its photo cropping algorithm, which is used to show the most important parts of an image, favored White people over Black people. Twitter then offered a “bounty challenge” in which participants were asked to assess the code behind the cropping algorithm and identify potential algorithmic harms. Bounty programs, which are already common in security vulnerability research, motivate independent analysts to establish best practices and to identify and mitigate vulnerabilities in order to protect the public. Twitter hopes to cultivate the same culture for AI ethics, on the theory that making its code public will surface more potential issues than the company could find on its own. In August 2021, Twitter announced an independent researcher, Bogdan Kulynych, as the winner of the competition. He showed that the algorithm favors faces that look slim and young, with skin that is lighter or warmer in tone, and that this bias could lead to the exclusion of minority populations and the perpetuation of stereotypical beauty standards. Crowdsourcing of this kind will hopefully further efforts to find and mitigate bias in the many AI algorithms that can affect groups of people.

This research was performed under the Novetta Machine Learning Center of Excellence.

Brenton Avril
Carlos Martinez
Shauna Revay, PhD