The 2nd workshop on
Efficient Natural Language and Speech Processing (ENLSP)

Friday Dec. 2nd 2022, New Orleans
In-person (Ballroom C) and Virtual



The second edition of the Efficient Natural Language and Speech Processing (ENLSP-II) workshop focuses on fundamental and challenging problems in making natural language and speech processing (especially pre-trained models) more efficient in terms of Data, Model, Training, and Inference. The workshop program offers an interactive platform for bringing together experts and emerging talent from academia and industry through invited talks, a panel discussion, paper submissions and reviews, interactive poster sessions, oral presentations, and a mentorship program. This will be a unique opportunity to address the efficiency issues of current models, build connections, exchange ideas, brainstorm solutions, and foster future collaborations. The topics of this workshop are of interest to people working on general machine learning, deep learning, optimization, theory, and NLP & speech applications.


Overview

Pre-training a general model with self-supervised learning on a huge amount of data and then fine-tuning it on a specific task has become the standard paradigm for solving many natural language and speech processing tasks. This paradigm has produced many types of pre-trained models (e.g., encoder-only such as BERT, decoder-only such as GPT, encoder-decoder such as T5) at a very diverse range of scales (from millions to more than 500 billion parameters) for different tasks.

It has become common practice in the literature to increase the number of parameters of these pre-trained models to improve their performance or their zero/few-shot abilities. Despite the great success of these models, it is evident that most of them are heavily over-parameterized and their efficiency is questionable. Training or deploying them on devices, or even on cloud services, with limited memory and computational power can be very expensive and challenging. For example, Megatron-Turing with 530B parameters has shown state-of-the-art results on many NLP tasks, but at the cost of training on 560 DGX A100 nodes (more than 4,000 NVIDIA A100 GPUs) with more than 300B tokens of data. Moreover, delivering such huge models as a service to different clients requires a separate copy of the model for each task. Even fine-tuning an entire large model on a small labeled dataset can lead to overfitting. Therefore, it is of vital importance to invest in the future of pre-trained models by enhancing their efficiency in terms of data, modeling, training, and inference, from the different perspectives highlighted in this workshop.
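
To make the scale concrete, here is a rough back-of-the-envelope estimate in Python (a sketch only, assuming fp16 weights at 2 bytes per parameter and the commonly cited ~16 bytes per parameter for weights, gradients, and Adam states in mixed-precision training; the exact Megatron-Turing configuration may differ):

    # Rough memory estimate for a 530B-parameter model.
    # Assumptions (not the exact Megatron-Turing setup): fp16 weights
    # (2 bytes/parameter) and ~16 bytes/parameter for the full training
    # state (weights + gradients + Adam optimizer moments).
    PARAMS = 530e9
    A100_MEMORY_GB = 80  # 80 GB variant of the NVIDIA A100

    weights_gb = PARAMS * 2 / 1e9        # ~1,060 GB just for fp16 weights
    train_state_gb = PARAMS * 16 / 1e9   # ~8,480 GB for the training state

    print(f"fp16 weights alone:        ~{weights_gb:,.0f} GB")
    print(f"weights + optimizer state: ~{train_state_gb:,.0f} GB")
    print(f"minimum A100s (weights):   ~{weights_gb / A100_MEMORY_GB:.0f}")
    print(f"minimum A100s (training):  ~{train_state_gb / A100_MEMORY_GB:.0f}")

Even before accounting for activations, batch size, and parallelism overhead, the training state alone spans on the order of a hundred 80 GB A100s, which is one reason training runs of this scale use hundreds of multi-GPU nodes.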


Call for Papers

We would like to share some fundamental challenges in improving the efficiency of pre-trained models and encourage the NeurIPS community to submit their solutions, ideas, and ongoing work concerning data, model, training, and inference efficiency for NLP and speech processing. The scope of this workshop includes, but is not limited to, the following topics:

Efficient Pre-Training Pre-training is a very expensive process; even a small modification to a model's configuration requires redoing pre-training.

  • Accelerating the pre-training process
  • Continual/Life-long pre-training and adapting pre-trained models to a new domain
  • Efficient initialization and hyper-parameter tuning (HPT)
  • Better self-supervised pre-training objectives (a minimal example follows this list)
  • Multi-domain pre-training
  • Data vs. Scale of pre-trained models
  • Pre-training Multimodal (e.g., text–speech) models
  • New efficient architectures for pre-trained models
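
To make the self-supervised objective item above concrete, here is a minimal masked-language-modeling sketch in PyTorch (an illustrative example only; the model interface, masking rate, and special-token handling are simplified assumptions rather than the recipe of any particular pre-trained model):

    import torch
    import torch.nn.functional as F

    def mlm_loss(model, input_ids, mask_token_id, vocab_size, mask_prob=0.15):
        # `model` is assumed to map token ids of shape (batch, seq_len)
        # to logits of shape (batch, seq_len, vocab_size).
        labels = input_ids.clone()
        mask = torch.rand(input_ids.shape, device=input_ids.device) < mask_prob
        labels[~mask] = -100                 # ignore unmasked positions in the loss
        corrupted = input_ids.clone()
        corrupted[mask] = mask_token_id      # replace chosen tokens with [MASK]
        logits = model(corrupted)
        return F.cross_entropy(logits.reshape(-1, vocab_size),
                               labels.reshape(-1), ignore_index=-100)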

Efficient Fine-tuning Fine-tuning large pre-trained models on downstream tasks can be challenging because these models are heavily over-parameterized (a minimal code sketch follows this list).

  • Parameter-efficient tuning solutions to tune only a portion of the entire network (e.g. adapters)
  • Efficient prompt-based fine-tuning
  • Accelerating the fine-tuning process (e.g., optimizer design and layer skipping)
  • Efficient federated learning for NLP: reducing communication costs, handling heterogeneous data and heterogeneous models
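
As a concrete illustration of the parameter-efficient tuning idea referenced above, here is a minimal adapter-style sketch in PyTorch (an illustrative example: the bottleneck size, placement, and the `freeze_backbone` helper are placeholders, not a specific published method):

    import torch.nn as nn

    class Adapter(nn.Module):
        """Small bottleneck module added to an otherwise frozen network."""
        def __init__(self, hidden_size, bottleneck=64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck)
            self.act = nn.ReLU()
            self.up = nn.Linear(bottleneck, hidden_size)

        def forward(self, x):
            # Residual connection: the frozen representation passes through
            # unchanged and the adapter learns a small task-specific update.
            return x + self.up(self.act(self.down(x)))

    def freeze_backbone(model):
        """Freeze all pre-trained weights so only adapter parameters train."""
        for p in model.parameters():
            p.requires_grad = False

Only the adapter parameters, typically a small fraction of the full model, receive gradient updates, so serving many downstream tasks requires storing one small adapter per task instead of a full copy of the model.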

Data Efficiency Pre-trained models rely on a huge amount of unlabeled data, which makes training very sample-inefficient.

  • Sample efficient training, training with less data, few-shot and zero-shot learning
  • Sample efficient data-augmentation, identifying which training samples should be augmented
  • Data compression, data distillation
  • Data selection, how to improve the quality of pre-training data

Inference Efficiency How can we reduce the inference time or memory footprint of a trained model for a particular task? (A minimal knowledge-distillation sketch follows this list.)

  • Neural model compression techniques such as quantization, pruning, layer decomposition and knowledge distillation (KD) for NLP and Speech
  • Impact of different compression techniques on the inductive biases learned by the original models
  • Combined compression techniques for more efficient NLP and speech models
  • Improving efficiency of KD by removing the teacher
  • Extreme model compression (high compression ratio) for very large pre-trained language models
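
As one concrete example from the list above, here is a minimal knowledge-distillation loss sketch in PyTorch (a generic temperature-scaled formulation for illustration; the temperature and mixing weight are placeholder values, not the recipe of any specific compression paper):

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        # Soft term: match the teacher's softened distribution. The T^2 factor
        # keeps its gradient scale comparable to the hard-label term.
        soft = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * (temperature ** 2)
        # Hard term: standard cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard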

(Special Track) Efficient Graph Learning for NLP

  • Automatically transforming natural language into graph-structured data
  • Representation learning on multi-relational or heterogeneous graphs
  • Learning the mapping between complex data structures, like Graph2Seq, Graph2Tree, Graph2Graph
  • Graph learning with pre-trained language models

Other Efficient Applications Pre-trained models are used in many NLP tasks where efficiency is a concern.

  • Efficient Dense Retrieval
  • Large language model as a service
  • Training models on device
  • Incorporating external knowledge into pre-trained models
  • Unifying different pre-training models


Submission Instructions

You are invited to submit your papers through our CMT submission portal. All submissions must be anonymized for double-blind review. We expect each paper to be reviewed by at least three reviewers. The content of the paper (excluding references and supplementary materials) must not exceed 4 pages and must strictly follow the NeurIPS template style (which can be found here).

Authors can separately submit up to 100 MB of supplementary materials and are highly encouraged to submit their code for reproducibility purposes. According to the NeurIPS workshop guidelines, submission of already published papers is discouraged, but papers available on arXiv or currently under review elsewhere may be submitted. Moreover, work presented at the main NeurIPS conference should not appear in a workshop. Please make sure to indicate the complete list of conflicts of interest for all authors of your paper. To encourage higher-quality submissions, our sponsors are offering a Best Paper and a Best Poster Award to outstanding original oral and poster presentations (upon nomination by the reviewers). We will also give one outstanding paper certificate for our special track on efficient graph learning for NLP. Bear in mind that our workshop is not archival, but the accepted papers will be hosted on the workshop website.


Important Dates:

  • Submission Deadline: September 25, 2022 AOE
  • Acceptance Notification: October 20, 2022 AOE
  • Camera-Ready Submission: November 1, 2022 AOE
  • Workshop Date: Friday December 2, 2022 (in-person and virtual)

Confirmed Speakers

  • Dr. Tara Sainath (Google)
  • Prof. Graham Neubig (Carnegie Mellon University)
  • Prof. Jimmy Lin (University of Waterloo)
  • Prof. Song Han (MIT)
  • Prof. Danqi Chen (Princeton University)
  • Prof. You Yang (National University of Singapore)
  • Dr. Lu Hou (Huawei Noah's Ark Lab)
  • Prof. Bang Liu (University of Montreal / MILA)
  • Prof. Siva Reddy (McGill & MILA)
  • Tim Dettmers (University of Washington)
  • Prof. Kenneth Heafield (University of Edinburgh)
  • Prof. Anna Huang (MILA / Google)

Industrial Panelists

  • Mohammad Norouzi (Google Brain)
  • Vikrant Singh Tomar (Fluent.AI)
  • Rahul Gupta (Amazon Alexa)
  • Boxing Chen
  • Marjan Ghazvininejad (Meta)
  • Yu Cheng (Microsoft)
  • Jiahao Sun (RBC)

Schedule (New Orleans Time Zone)

07:30AM - 07:50AM Breakfast
07:50AM - 08:00AM Opening Speech
08:00AM - 08:30AM (KeyNote Talk) Fine-grained Interactive Vision Language Pre-training
Lu Hou
08:30AM - 09:05AM (KeyNote Talk) Efficiency Tradeoffs in the Design of Neural Search Systems
Jimmy Lin
09:05AM - 09:35AM (KeyNote Talk) Latest Advances in End-to-End Speech Recognition
Tara Sainath
09:35AM - 09:45AM Collective Knowledge Graph Completion with Mutual Knowledge Distillation
  • Weihang Zhang
  • Ovidiu Serban
  • Jiahao Sun
  • Yike Guo
09:45AM - 09:56AM Attribute Controlled Dialogue Prompting
  • Runcheng Liu
  • Ahmad Rashid
  • Ivan Kobyzev
  • Mehdi Rezagholizadeh
  • Pascal Poupart
09:56AM - 10:05AM Fast DistilBERT on CPUs
  • Haihao Shen
  • Ofir Zafrir
  • Bo Dong
  • Hengyu Meng
  • Xinyu Ye
  • Zhe Wang
  • Yi Ding
  • Hanwen Chang
  • Guy Boudoukh
  • Moshe Wasserblat
10:00AM - 10:30AM Morning Break and Poster Session 1
10:30AM - 11:05AM (KeyNote Talk) SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Song Han
11:05AM - 11:35AM (KeyNote Talk) Building Language Models Based on Retrieval
Danqi Chen
11:35AM - 12:05PM (KeyNote Talk) Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
You Yang
12:05PM - 12:15PM Efficient Few-Shot Learning Without Prompts
  • Oren Pereg
  • Daniel Korat
  • Moshe Wasserblat
  • Lewis Tunstall
  • Unso Eun Seo Jo
  • Luke Bates
  • Nils Reimers
12:15PM - 12:25PM PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation
  • Jingyu Zhang
  • Jim Glass
  • Tianxing He
12:25PM - 12:35PM PromptDA: Label-guided Data Augmentation for Prompt-based Few Shot Learners
  • Canyu Chen
  • Kai Shu
12:30PM - 01:30PM Lunch Break and Virtual Poster Session
01:30PM - 02:00PM (KeyNote Talk) Efficiently Identify Event Causality with Knowledge and Analogy
Bang Liu
02:00PM - 02:50PM Interactive Industrial Panel
  • Boxing Chen
  • Jiahao Sun
  • Vikrant Singh Tomar
  • Marjan Ghazvininejad
  • Yu Cheng
  • Mohammad Norouzi
  • Rahul Gupta
02:50PM - 02:59PM Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement
  • Heitor R Guimarães
  • Arthur S Pimentel
  • Anderson R. Avila
  • Mehdi Rezagholizadeh
  • Tiago H Falk
02:59PM - 03:05PM Gradient Knowledge Distillation for Pre-trained Language Models
  • Lean Wang
  • Lei Li
  • Xu Sun
03:00PM - 03:30PM Break and Poster Session 2
03:30PM - 04:05PM (KeyNote Talk) Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
Graham Neubig
04:05PM - 04:35PM (KeyNote Talk) Do we still need inductive biases after Transformer language models?
Siva Reddy
04:35PM - 05:05PM (KeyNote Talk) 8-bit Methods for Efficient Deep Learning
Tim Dettmers
05:05PM - 05:35PM (KeyNote Talk) Efficient Controllable Generative Models for Music and Performance Synthesis
Anna Huang
05:35PM - 05:45PM Best Paper and Poster Award & Closing

Organizers

  • Mehdi Rezagholizadeh (Huawei Noah's Ark Lab)
  • Peyman Passban (BenchSci)
  • Yue Dong (University of California)
  • Lili Mou (University of Alberta)
  • Pascal Poupart (University of Waterloo)
  • Ali Ghodsi (University of Waterloo)
  • Qun Liu (Huawei Noah's Ark Lab)

Volunteers

  • Khalil Bibi (Huawei Noah's Ark Lab)
  • Soheila Samiee (BASF)



Technical Committee

  • Kevin Duh (Johns Hopkins University)
  • Boxing Chen
  • Vahid Partovi Nia (Huawei Noah’s Ark Lab)
  • Bang Liu (University of Montreal (UdM))
  • Hamidreza Mahyar (McMaster University)
  • Wenhu Chen (University of Waterloo)
  • Mehdi Rezagholizadeh (Huawei Noah’s Ark Lab)
  • Yingxue Zhang (Huawei Noah's Ark Lab)
  • Yue Dong (University of California)
  • Lili Mou (University of Alberta)
  • Peyman Passban (BenchSci)
  • Ivan Kobyzev (Huawei Noah’s Ark Lab)
  • Aref Jafari (University of Waterloo)
  • Ahmad Rashid (Huawei Noah’s Ark Lab)
  • Vasileios Lioutas (University of British Columbia (UBC))
  • Anderson R. Avila (Huawei Noah’s Ark Lab)
  • Malik H. Altakrori (McGill University & MILA)
  • Ali Vahdat (Thomson Reuters)
  • Prasanna Parthasarathi (McGill University & MILA)
  • Shohreh Shaghaghian (Thomson Reuters)
  • Ehsan Kamalloo (University of Alberta)
  • Ali Saheb Pasand (University of Waterloo)
  • Abbas Ghaddar (Huawei Noah’s Ark Lab)
  • Marzieh Tahaei (Huawei Noah’s Ark Lab)
  • Soheila Samiee (BASF)
  • Habib Hajimolahoseini (Huawei Noah’s Ark Lab)
  • Mohammad Salameh (Huawei Noah’s Ark Lab)
  • Mohammed Senoussaoui (INRS)
  • Flávio Ávila (Amazon)
  • Peng Lu (Huawei Noah’s Ark Lab)
  • Joao Monteiro (ServiceNow)
  • Xiaoguang Li (Huawei Noah’s Ark Lab)
  • David Alfonso Hermelo (Huawei Noah’s Ark Lab)
  • Khalil Bibi (Huawei Noah’s Ark Lab)
  • Can Liu (Amazon Alexa AI)
  • Amina Shabbeer (Amazon)
  • M. Skylar Versage (Amazon)
  • Tanya Roosta (Amazon)
  • Prashanth Rao (Royal Bank of Canada)
  • Ankur Agarwal (Huawei Noah's Ark Lab)
  • Sunyam Bagga (Huawei Noah’s Ark Lab)
  • Ovidiu Serban (Imperial College London)
  • Tony Tong (Royal Bank of Canada)
  • Jiahao Sun (Royal Bank of Canada)
  • Ryan Ong (Imperial College London)
  • Weihang Zhang (Imperial College London)
  • Manying Zhang (Institut National des Langues et Civilisations Orientales)
  • Lianlong Wu (Oxford University)
  • Mojtaba Valipour (University of Waterloo)
  • Chandra Bhagavatula (Allen Institute for AI)
  • Mahdi Biparva (Huawei Noah's Ark Lab)
  • Jinming Zhao (Monash University)
  • Khalil Slimi (ServiceNow)
  • Mohammadreza Tayaranian (Huawei Noah’s Ark Lab)
  • Alireza Ghaffari (Huawei Noah’s Ark Lab)
  • Weiyi Lu (Amazon)



Platinum Sponsor

Gold Sponsor