
AutoText: An End-to-End AutoAI Framework for Text

Arunima Chaudhary, Alayt Issak∗, Kiran Kate, Yannis Katsis, Abel Valente, Dakuo Wang, Alexandre Evfimievski, Sairam Gurajada, Ban Kawas, Cristiano Malossi, Lucian Popa, Tejaswini Pedapati, Horst Samulowitz, Martin Wistuba, Yunyao Li

IBM Research AI

{arunima.chaudhary, yannis.katsis, dakuo.wang, sairam.gurajada, martin.wistuba}@ibm.com, [email protected], [email protected], [email protected], {kakate, evfimi, bkawas, lpopa, tejaswinip, samulowitz, yunyaoli}@us.ibm.com

∗Work done at IBM Research; now at the College of Wooster.
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Building models for natural language processing (NLP) tasks remains a daunting task for many, requiring significant technical expertise, effort, and resources. In this demonstration, we present AutoText, an end-to-end AutoAI framework for text, to lower the barrier of entry in building NLP models. AutoText combines state-of-the-art AutoAI optimization techniques and learning algorithms for NLP tasks into a single extensible framework. Through its simple, yet powerful UI, non-AI experts (e.g., domain experts) can quickly generate performant NLP models, with support to both control (e.g., via specifying constraints) and understand the learned models.

[Figure 1: AutoText Architecture. Frontend: initiate an experiment on labeled text data and inspect the resulting high-quality pipelines. Backend: search specification (declarative pipeline composition, neural architecture search), operator library (data preprocessors, featurizers, transformers/estimators), optimizers, and pipeline selection & model tuning.]

Introduction

Recent AI advances have led to a rising desire of businesses and organizations to apply AI to solve their problems (Mao et al. 2019). However, developing performant AI models remains a tedious, iterative process that includes data preparation, feature engineering, algorithm/model architecture selection, hyperparameter tuning, training, model evaluation, and comparison to other models. To lower the barrier of entry in AI, the research community has looked into automating this process with AutoML (Hutter, Kotthoff, and Vanschoren 2019) or AutoAI (Wang et al. 2019, 2020).

We present AutoText, an extensible end-to-end AutoAI framework, to democratize the development of Natural Language Processing (NLP) models. Its design considerations include: (a) Usability: Non-experts can create NLP models (referred to as pipelines) through a GUI, while expert users can fully control the process via a Notebook UI (this demo focuses on AutoText's GUI-based interaction). (b) Extensibility: AutoText employs an extensible architecture, where both the components of the pipelines and the pipeline discovery (known as optimization) algorithms are modeled as exchangeable modules. (c) Comprehensiveness: AutoText offers a comprehensive approach, combining various learning algorithms (incl. classical and deep learning approaches) and optimization approaches (incl. algorithm selection, hyperparameter tuning, and neural architecture search) into a single framework. (d) High Performance: AutoText is able to learn performant models that, in many cases, are comparable to or outperform state-of-the-art models for the corresponding tasks/datasets. A video demonstration of AutoText is available at https://youtu.be/tujB0BrYlBw.

Related Work. Despite the increasing popularity of AutoAI systems, we are not aware of any with the comprehensive support for text offered by AutoText. Google AutoML Natural Language (https://cloud.google.com/natural-language/automl) does not allow users to select model types, nor does it give insights into the produced model or into its tuning and hyperparameter optimization (Weidele et al. 2020). Amazon SageMaker Autopilot (https://aws.amazon.com/sagemaker/autopilot/) offers AutoML capabilities that can be applied to text, but does not provide intuitive visualizations of pipelines and their performance; it also lacks more advanced capabilities for neural architecture search (NAS). Other systems, such as Microsoft AzureML-BERT (https://azure.microsoft.com/en-us/services/machine-learning), focus on high-performance model building with BERT pretraining and fine-tuning, but do not offer simpler but faster models. H2O (https://www.h2o.ai/products/h2o-automl) is perhaps the closest AutoML framework to ours; however, it supports a limited range of learning algorithms (e.g., RF, GBM, GLM for classical ML and only MLP for deep learning), a small pipeline search space (only ensembles), and limited features for NLP (Word2Vec). Finally, while NAS is becoming an essential part of AutoML (He, Zhao, and Chu 2019), few systems focus on NLP; moreover, none combine NAS with the exploration of alternative pipelines that may include classical ML algorithms.

The AutoText Framework

AutoText comprises a frontend for easy interaction and a backend for automated model building. We now describe each of them in detail.

Frontend

End users can initiate the building of NLP models by uploading a labeled textual dataset and selecting the target column. AutoText's backend handles the rest of the process, including identifying the task type and a suitable evaluation metric, splitting the dataset into train/test, and discovering pipelines. (AutoText currently supports binary and multi-class classification and will be extended in the future with additional NLP tasks, such as entity extraction, relationship extraction, and others.) The best performing pipelines are then shown to the user to inspect. This includes both a graphical representation of each pipeline illustrating its components and a tabular leaderboard of pipelines sorted by performance.

Customizations. AutoText allows users to customize through its UI many aspects of the framework, including the types of learning algorithms explored, the number of pipelines returned, as well as constraints, such as a time budget for pipeline discovery. More advanced users can also use its Notebook interface to impose additional controls (e.g., override the default search space for individual operators).

Backend

The backend of AutoText includes: (1) Operator library: a set of operators that form the basic components of typical NLP pipelines; (2) Search specification: the definition of the pipeline search space; (3) Optimizers: a set of methods that perform joint algorithm search and hyperparameter optimization in the defined search space; and (4) Pipeline selection and model tuning: the main component, which picks the appropriate optimizer and coordinates the pipeline discovery process.

Operator Library. AutoText employs an extensible operator library, with support for three main operator classes:

• Data Preprocessors – operators that clean and/or transform data to prepare it for later processing. These include tokenization, lemmatization, part-of-speech (POS) tagging, normalization, simplification, spell correction, etc.

• Featurizers – non-trainable (or pretrained) operators that compute features for use by downstream operators. These include both classical features, such as tf-idf, and deep learning-based features, such as embeddings generated by GloVe, FastText, BERT, RoBERTa, and others.

• Transformers/Estimators – trainable operators that transform data to a new feature space or make final predictions. These include classical ML operators (TfIdfTransformer, SVM, XGBoost, etc.) and deep learning-based operators (BiLSTM, CNN, BERT, XLNet, DistilBERT, MLP, etc.).

The library can be easily extended with new operators to further increase the coverage of supported techniques.
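To illustrate this extension point, the following is a minimal, hypothetical sketch (not AutoText's actual operator API; the class name and the averaged-embedding logic are assumptions for illustration) of a new featurizer packaged as a scikit-learn-compatible transformer, the usual contract for plugging custom operators into AutoAI pipelines of this kind.

```python
# Hypothetical example: a simple averaged-word-embedding featurizer wrapped as a
# scikit-learn-compatible transformer. AutoText's real operator interface may differ;
# this only illustrates the "extensible operator library" idea.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class AvgEmbeddingFeaturizer(BaseEstimator, TransformerMixin):
    """Non-trainable featurizer: maps each text to the mean of its token embeddings."""

    def __init__(self, embeddings, dim=50):
        self.embeddings = embeddings  # dict: token -> 1-D numpy array of length `dim`
        self.dim = dim

    def fit(self, X, y=None):
        # Nothing to learn: the embeddings are pretrained.
        return self

    def transform(self, X):
        rows = []
        for text in X:
            vecs = [self.embeddings[tok] for tok in text.lower().split()
                    if tok in self.embeddings]
            rows.append(np.mean(vecs, axis=0) if vecs else np.zeros(self.dim))
        return np.vstack(rows)
```

Because it follows the standard fit/transform contract, such an operator can be chained with downstream transformers and estimators in the same way as the built-in featurizers.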
Search Specification. AutoText supports two ways to define the pipeline search space: (1) Declarative pipeline composition: leveraging the declarative LALE framework (Baudart et al. 2020), AutoText developers can write high-level specifications of the search space for each pipeline operator, as well as operator composition rules. (2) Neural architecture search (NAS) enables automatic pipeline composition for deep learning models.
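As a rough illustration of this declarative style, the sketch below composes a small search space with LALE's combinators. This is an assumption-laden example rather than AutoText's actual specification: it assumes the TfidfVectorizer, LogisticRegression, and RandomForestClassifier wrappers from lale.lib.sklearn are available and uses toy data in place of a user-uploaded dataset.

```python
# A minimal LALE-style sketch of a declarative search space (illustrative only).
from lale.lib.sklearn import TfidfVectorizer, LogisticRegression, RandomForestClassifier
from lale.lib.lale import Hyperopt

# ">>" chains operators into a pipeline; "|" leaves a choice for the optimizer to resolve.
planned = TfidfVectorizer >> (LogisticRegression | RandomForestClassifier)

# Toy labeled text data standing in for the user's uploaded dataset.
train_X = ["great movie", "loved the plot", "what a delight",
           "terrible acting", "boring and slow", "utterly disappointing"]
train_y = [1, 1, 1, 0, 0, 0]

# Joint algorithm selection and hyperparameter tuning over the declared space.
trained = planned.auto_configure(train_X, train_y, optimizer=Hyperopt,
                                 cv=3, max_evals=10)
```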
Optimizers. AutoText combines into a single framework multiple optimization techniques presented in the literature, including combined algorithm selection and hyperparameter tuning (CASH) (Thornton et al. 2013) and NAS (Wistuba, Rawat, and Pedapati 2019). This is done by integrating multiple optimizers, each suited to different scenarios, and selecting between them based on heuristics. These include Hyperopt (Bergstra et al. 2015), Hyperband (Li et al. 2017), ADMM (a CASH optimizer supporting black-box constraints, such as prediction latency or memory footprint) (Liu et al. 2020), TAPAS (Istrate et al. 2019), and NeuRex (Wistuba 2018).

Pipeline Selection and Model Tuning. Given a labeled dataset and optional user options, AutoText employs the appropriate optimizer to iteratively explore different pipeline configurations. During this process, the optimizer considers the search specification, potential user constraints, and the performance of previously explored configurations to determine the next pipeline to explore. This process continues until the user-specified constraints are satisfied or until a predetermined number of iterations is reached. The resulting pipelines are then shown on the frontend for the user to inspect.
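To make the CASH-style search loop concrete, here is a small, hypothetical sketch built directly on the Hyperopt library (it is not AutoText's internal optimizer interface, and the 20 Newsgroups data and the two candidate estimators are illustrative choices): the search space jointly encodes which estimator to use and its hyperparameters, and each iteration picks the next configuration based on the scores of previously evaluated ones.

```python
# Hypothetical CASH sketch with Hyperopt: the search space jointly encodes the choice
# of estimator and its hyperparameters; fmin iteratively proposes the next configuration
# based on the performance of previously evaluated ones.
from hyperopt import fmin, tpe, hp, Trials
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])

space = hp.choice("estimator", [
    {"name": "logreg", "C": hp.loguniform("lr_C", -4, 4)},
    {"name": "linearsvc", "C": hp.loguniform("svc_C", -4, 4)},
])

def objective(cfg):
    clf = (LogisticRegression(C=cfg["C"], max_iter=1000)
           if cfg["name"] == "logreg" else LinearSVC(C=cfg["C"]))
    pipe = make_pipeline(TfidfVectorizer(), clf)
    # Hyperopt minimizes the objective, so return negative cross-validated accuracy.
    return -cross_val_score(pipe, data.data, data.target, cv=3).mean()

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=20, trials=trials)
print(best)
```

In AutoText, user constraints such as a time budget or prediction latency additionally restrict which configurations are explored, as described above.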
Demonstration

We will demonstrate AutoText on a variety of datasets, including standard benchmarks, such as MR (Pang and Lee 2005) and MPQA (Wiebe, Wilson, and Cardie 2005), and additional data, such as the US Consumer Financial Complaints dataset (https://www.kaggle.com/cfpb/us-consumer-finance-complaints). During the demo, the audience will be able to gain the following insights: (a) experience how AutoText allows the quick generation of AI models; (b) understand common components of NLP models and find out which model architectures work best for particular tasks, by inspecting the discovered pipelines; and (c) explore trade-offs made by different estimators (such as the training time/performance trade-off of classical ML vs. deep learning algorithms (Li et al. 2020)) by customizing the experiment to only consider specific algorithms. Finally, the audience will also become familiar with AutoText's architecture and understand how different AutoAI techniques presented in the literature have been incorporated into a unified end-to-end AutoAI framework capable of generating performant NLP models.

References

Baudart, G.; Hirzel, M.; Kate, K.; Ram, P.; and Shinnar, A. 2020. Lale: Consistent Automated Machine Learning. In KDD Workshop on Automation in Machine Learning (AutoML@KDD). URL https://arxiv.org/abs/2007.01977.

Bergstra, J.; Komer, B.; Eliasmith, C.; Yamins, D.; and Cox, D. D. 2015. Hyperopt: A Python Library for Model Selection and Hyperparameter Optimization. Computational Science & Discovery 8(1). URL http://dx.doi.org/10.1088/1749-4699/8/1/014008.

He, X.; Zhao, K.; and Chu, X. 2019. AutoML: A Survey of the State-of-the-Art. ArXiv abs/1908.00709.

Hutter, F.; Kotthoff, L.; and Vanschoren, J. 2019. Automated Machine Learning: Methods, Systems, Challenges. Springer Nature.

Istrate, R.; Scheidegger, F.; Mariani, G.; Nikolopoulos, D. S.; Bekas, C.; and Malossi, A. C. I. 2019. TAPAS: Train-less Accuracy Predictor for Architecture Search. In AAAI.

Li, J.; Li, Y.; Wang, X.; and Tan, W.-C. 2020. Deep or Simple Models for Semantic Tagging? It Depends on your Data. Proceedings of the VLDB Endowment 13(11).

Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; and Talwalkar, A. 2017. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. The Journal of Machine Learning Research 18(1): 6765–6816.

Liu, S.; Ram, P.; Vijaykeerthy, D.; Bouneffouf, D.; Bramble, G.; Samulowitz, H.; Wang, D.; Conn, A.; and Gray, A. G. 2020. An ADMM Based Framework for AutoML Pipeline Configuration. In Conference on Artificial Intelligence (AAAI), 4892–4899. URL https://aaai.org/ojs/index.php/AAAI/article/view/5926.

Mao, Y.; Wang, D.; Muller, M.; Varshney, K. R.; Baldini, I.; Dugan, C.; and Mojsilović, A. 2019. How Data Scientists Work Together With Domain Experts in Scientific Collaborations: To Find The Right Answer Or To Ask The Right Question? Proceedings of the ACM on Human-Computer Interaction 3(GROUP): 1–23.

Pang, B.; and Lee, L. 2005. Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), 115–124. Ann Arbor, Michigan: Association for Computational Linguistics. doi:10.3115/1219840.1219855. URL https://www.aclweb.org/anthology/P05-1015.

Thornton, C.; Hutter, F.; Hoos, H. H.; and Leyton-Brown, K. 2013. Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 847–855.

Wang, D.; Weisz, J. D.; Muller, M.; Ram, P.; Geyer, W.; Dugan, C.; Tausczik, Y.; Samulowitz, H.; and Gray, A. 2019. Human-AI Collaboration in Data Science: Exploring Data Scientists' Perceptions of Automated AI. Proceedings of the ACM on Human-Computer Interaction 3(CSCW): 1–24.

Wang, D.; Ram, P.; Weidele, D. K. I.; Liu, S.; Muller, M.; Weisz, J. D.; Valente, A.; Chaudhary, A.; Torres, D.; Samulowitz, H.; et al. 2020. AutoAI: Automating the End-to-End AI Lifecycle with Humans-in-the-Loop. In Proceedings of the 25th International Conference on Intelligent User Interfaces Companion, 77–78.

Weidele, D. K. I.; Weisz, J. D.; Oduor, E.; Muller, M.; Andres, J.; Gray, A.; and Wang, D. 2020. AutoAIViz: Opening the Blackbox of Automated Artificial Intelligence with Conditional Parallel Coordinates. In Proceedings of the 25th International Conference on Intelligent User Interfaces, 308–312.

Wiebe, J.; Wilson, T.; and Cardie, C. 2005. Annotating Expressions of Opinions and Emotions in Language. Language Resources and Evaluation 39(2-3): 165–210.

Wistuba, M. 2018. Deep Learning Architecture Search by Neuro-Cell-Based Evolution with Function-Preserving Mutations. In ECML/PKDD.

Wistuba, M.; Rawat, A.; and Pedapati, T. 2019. A Survey on Neural Architecture Search. CoRR abs/1905.01392. URL http://arxiv.org/abs/1905.01392.