An ML Newsletter from Novetta

Welcome to the August 2020 BASELINE, Novetta’s Machine Learning Newsletter, where we share thoughts on important advances in machine learning technologies. This month we cover the following topics:

  • A faster implementation for object detection
  • Better explainability tools for text models
  • Advances in language models and related tasks

PP-YOLO: An Effective and Efficient Object Detection Implementation

PP-YOLO is a recently released object detector that advances the state of the art in object detection, achieving faster, more accurate results on the COCO dataset than predecessors such as EfficientDet and YOLOv4. Striking a balance between accuracy and inference speed, PP-YOLO consists of a “recipe of existing tricks” that augments the widely used YOLOv3 model released in March 2018. These “tricks,” inspired by a variety of works in the field and analyzed in an ablation study by the model’s authors, include modifications to model architecture, alternative training strategies, and approaches that enable quicker decisions at inference.

Figure 1. Comparison of the proposed PP-YOLO and other state-of-the-art object detectors. PP-YOLO runs faster than YOLOv4 and improves mAP from 43.5% to 45.2%.

X. Long et al., “PP-YOLO: An Effective and Efficient Implementation of Object Detector,” arXiv:2007.12099 [cs], Aug. 2020, Accessed: Aug. 24, 2020. [Online]. Available:

Improved Interpretability for Text-based Models

Since 2018, natural language processing (NLP) has seen significant advances in model performance on tasks such as sentiment analysis. One weakness of these models is a lack of interpretability: it can be difficult for users to know why the model made a given prediction. This has implications not only for analysts, who often need to know why a particular classification was made, but also for the data scientists who need to identify the points at which a model is failing. The Language Interpretability Tool (LIT), an NLP toolkit open-sourced by Google Research, helps to address these issues. It is framework agnostic and offers a browser-based user interface for exploring the behavior of NLP models. The toolkit provides local explanations through salience maps, aggregate analysis on selections of data, and counterfactual generation. These tools help ML practitioners probe a model for biases, compare two models side by side, and test a model on new data created in the GUI. The practitioner can see which words in the text are most important to a classification and adjust the model accordingly. LIT is the most promising and easy-to-use interpretability tool we have seen for text models.
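
As a rough illustration of what a salience map conveys, the sketch below scores each token by how much a classifier's output changes when that token is removed (a leave-one-out approach). The toy lexicon, `toy_sentiment_score`, and `leave_one_out_salience` are hypothetical stand-ins for a trained model; this is not LIT's API, and LIT's own salience methods may work differently.

```python
import math

# Hypothetical toy lexicon standing in for a trained sentiment model.
TOY_WEIGHTS = {"great": 2.0, "good": 1.0, "bad": -1.0, "terrible": -2.0}


def toy_sentiment_score(tokens):
    """Return a pseudo-probability of positive sentiment for a token list."""
    logit = sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-logit))


def leave_one_out_salience(tokens, score_fn=toy_sentiment_score):
    """Score each token by the change in model output when it is removed."""
    base = score_fn(tokens)
    salience = []
    for i, token in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        salience.append((token, base - score_fn(reduced)))
    return salience


if __name__ == "__main__":
    # Tokens with larger absolute scores mattered more to the prediction.
    for token, weight in leave_one_out_salience("the food was great".split()):
        print(f"{token:>8s} {weight:+.3f}")
```

In this toy setup, words absent from the lexicon receive a salience of zero, while "great" carries the whole prediction.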


Many data scientists will be the first to admit that they do not have the best software development practices. Data scientists commonly use Jupyter notebooks for experimentation, development, and interactive exploration. The leading version control software, Git, does not handle these notebooks well: a notebook is stored as JSON with code, outputs, and metadata mixed together, so diffs are noisy and merges are error-prone.

Elyra, a set of AI-centric extensions for JupyterLab, addresses this in its 1.0.0 release by offering integrated support for Git repositories, giving data scientists better version control over their notebooks and enabling pipelines to be replicated more easily. The release also introduces a code snippets extension that stores blocks of code for later reuse. Elyra’s toolkit additionally enables data scientists to transform their experimental Jupyter notebooks into a fully functional machine learning pipeline, with the complicated process of building a pipeline from scratch abstracted away behind a visual editor.

Ever-Bigger Language Models: GPT-3

The internet is buzzing about GPT-3, OpenAI’s latest breakthrough. This state-of-the-art language model has 175 billion parameters, over 10 times more than Microsoft’s Turing NLG, which was the largest language model at the time of its release in February 2020. In addition to strong performance on many NLP datasets, GPT-3 has been used to create a bot that writes letters on behalf of nature, an AI-written blog that trended on Hacker News, and a program that can automate website development.

These examples demonstrate the power of GPT-3, but the model has also raised alarm in the AI community over possible adverse uses. The immediate risks are mitigated, since developers must apply to gain access to the model via an API, but it is likely only a matter of time before bad actors develop comparable capabilities of their own.

Defending BERT Against Adversarial Attacks

BERT is a popular pre-trained NLP model used for tasks including text classification and entity extraction. As BERT has gained in popularity, we have seen growing interest in fooling the model, an approach known as an “adversarial attack.” A popular method is TextFooler, which attacks BERT by altering the words that have the most significance to the prediction. This causes the BERT model to change its prediction, in some cases driving accuracy to almost zero. To help defend against such attacks, students at UC Berkeley developed FireBERT, a set of three classifiers hardened against TextFooler. FireBERT uses data augmentation to create synthetic data for each original example, where each synthetic example is a slight variation of the original. FireBERT’s classifier is co-tuned on the training data and the synthetic data, which teaches the model that there are multiple ways to express the same thing. While co-tuning slightly decreases accuracy on the original dataset, accuracy on adversarial data was significantly better than that of the original BERT model.
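
The augmentation idea can be sketched as follows: each labeled example is expanded with synthetic variants in which some words are swapped for near-synonyms, and the originals and variants are combined into one training set. The synonym table `TOY_SYNONYMS` and the helpers `augment` and `build_cotuning_set` are hypothetical names chosen for illustration; FireBERT's actual procedure for generating synthetic examples may differ.

```python
import random

# Hypothetical synonym table; a real system would draw candidates from
# word embeddings or a thesaurus rather than a hand-written dictionary.
TOY_SYNONYMS = {
    "good": ["great", "fine"],
    "movie": ["film"],
    "bad": ["poor", "awful"],
}


def augment(text, n_variants=3, swap_prob=0.5, seed=0):
    """Create n_variants copies of text, each with some words randomly
    swapped for synonyms (a copy may come out unchanged)."""
    rng = random.Random(seed)
    words = text.split()
    variants = []
    for _ in range(n_variants):
        new_words = [
            rng.choice(TOY_SYNONYMS[w])
            if w in TOY_SYNONYMS and rng.random() < swap_prob
            else w
            for w in words
        ]
        variants.append(" ".join(new_words))
    return variants


def build_cotuning_set(examples, n_variants=3):
    """Pair every original (text, label) with variants that keep its label."""
    combined = []
    for text, label in examples:
        combined.append((text, label))
        combined.extend((v, label) for v in augment(text, n_variants))
    return combined
```

A classifier fine-tuned on the output of `build_cotuning_set` sees several phrasings per label, which is the intuition behind training on original and synthetic data together.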

DeText: A Deep Text Ranking Framework with BERT

DeText, a deep NLP framework capable of performing tasks such as document ranking, text classification, and sequence completion, has been open-sourced by LinkedIn researchers. The framework enables users to select and alter elements of their models, such as swapping BERT in the embedding layer of a ranking model for a CNN or LSTM. DeText tackles the problem of applying computationally expensive NLP techniques efficiently enough to improve document ranking in production search systems. DeText-BERT, a ranking model built using DeText, pre-computes document embeddings using the BERT language model before deployment to reduce latency. This approach improved upon prior search ranking systems at LinkedIn.
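
The latency benefit of pre-computing document embeddings can be illustrated with a minimal sketch: the expensive encoder runs over every document once, offline, and each incoming query needs only one encoder call plus cheap dot products. Here `toy_embed` is a hypothetical hashed bag-of-words stand-in for BERT, and `PrecomputedRanker` is an illustrative class, not part of the DeText API.

```python
import math
import zlib

DIM = 64  # illustrative embedding size


def toy_embed(text):
    """Hypothetical stand-in for an expensive encoder such as BERT:
    a normalized hashed bag-of-words vector."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class PrecomputedRanker:
    """Embeds documents once up front; only the query is embedded per request."""

    def __init__(self, documents, embed_fn=toy_embed):
        self.documents = list(documents)
        self.embed_fn = embed_fn
        # Offline step: the expensive encoder runs over every document here.
        self.doc_vectors = [embed_fn(d) for d in self.documents]

    def rank(self, query):
        """Return (document, score) pairs sorted by dot-product similarity."""
        q = self.embed_fn(query)
        scores = [sum(a * b for a, b in zip(q, v)) for v in self.doc_vectors]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return [(self.documents[i], scores[i]) for i in order]
```

Under these assumptions, the per-query cost is one encoder call regardless of how many documents are indexed; the trade-off is that document vectors must be refreshed whenever documents or the encoder change.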

This research was performed under the Novetta Machine Learning Center of Excellence.


Jack Buttimer
Emma Cooper
Joseph Terrigno