Accepted papers

Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning
BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning
Pre-Training a Graph Recurrent Network for Language Representation
Collective Knowledge Graph Completion with Mutual Knowledge Distillation
An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
QuaLA-MiniLM: a Quantized Length Adaptive MiniLM
Gradient Knowledge Distillation for Pre-trained Language Models
Towards Data Efficient And Robust Speech Representation Model Distillation
Efficient Few-Shot Learning Without Prompts
On Spectral and Temporal Feature Encoding Behaviour in Stacked Architectures
Few-Shot Aspect Extraction using Prompt Training
Can we get smarter than majority vote? Efficient use of individual rater’s labels for content moderation
Fast DistilBERT on CPUs
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Parameter-Efficient Finetuning of Transformers for Source Code
Graph Masking Pre-training for Graph-to-Text Generation
The Ineffectiveness of TKGE Models in Encoding Real-World Knowledge Graphs
PEST: Combining Parameter-Efficient Fine-Tuning with Self-Training and Co-Training
ContextNER: Contextual Phrase Generation at Scale
Efficient Speech Translation with Pre-trained models
DyREx: Dynamic Query Representation for Extractive Question Answering
Strategies for Applying Low Rank Decomposition to Transformer-Based Models
PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation
DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low Rank Adaptation
Pyramid Dynamic Inference: Encouraging Faster Inference via Early Exit Boosting
An efficient RNN Language Model using activity sparsity and sparse back-propagation through time
Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement
Attribute Controlled Dialogue Prompting
An Exploration of Methods for Zero-shot Transfer in Small Language Models
On the impact of the quality of pseudo-labels on the self-supervised speaker verification task
INT8 Transformers for Inference Acceleration
Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification
SymbolicGPT: A Generative Transformer Model for Symbolic Regression
Using Informative Data Subsets for Efficient Training of Large Language Models: An Initial Study
Using Selective Masking as a Bridge between Pre-training and Fine-tuning
Improved Knowledge Distillation by Utilizing Backward Pass Knowledge in Neural Networks
PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners
Topic Segmentation in the Wild: Towards Segmentation of Semi-structured & Unstructured Chats
A Theory of Unsupervised Translation for Understanding Animal Communication