Search Results for author: Sinong Wang

Found 26 papers, 11 papers with code

Phonetic and Lexical Discovery of a Canine Language using HuBERT

no code implementations • 25 Feb 2024 • Xingyuan Li, Sinong Wang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

This paper explores potential communication patterns within dog vocalizations, moving beyond traditional linguistic analysis, which relies heavily on human a priori knowledge and limited datasets to find sound units in dog vocalizations.

SPAR: Personalized Content-Based Recommendation via Long Engagement Attention

1 code implementation • 16 Feb 2024 • Chiyu Zhang, Yifei Sun, Jun Chen, Jie Lei, Muhammad Abdul-Mageed, Sinong Wang, Rong Jin, Sem Park, Ning Yao, Bo Long

Leveraging users' long engagement histories is essential for personalized content recommendations.

Language Modelling Large Language Model

Effective Long-Context Scaling of Foundation Models

2 code implementations • 27 Sep 2023 • Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths. Our ablation experiments suggest that having abundant long texts in the pretraining dataset is not the key to achieving strong performance, and we empirically verify that long-context continual pretraining is more efficient than, and similarly effective to, pretraining from scratch with long sequences.

Continual Pretraining Language Modelling

LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models

1 code implementation • 30 Aug 2023 • Chi Han, Qifan Wang, Hao Peng, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang

As a result, their performance suffers drastically on inputs longer than those encountered during training, substantially limiting their applications in real-world tasks involving long contexts such as encoding scientific articles, code repositories, or long dialogues.

2k 4k +1

Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes

no code implementations • 22 May 2023 • Kuan-Hao Huang, Liang Tan, Rui Hou, Sinong Wang, Amjad Almahairi, Ruty Rinott

Fine-tuning a large pre-trained language model for each downstream task creates a computational burden at inference time, because serving several tasks requires several forward passes.

Language Modelling

Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning

1 code implementation • CVPR 2023 • Ajinkya Tejankar, Maziar Sanjabi, Qifan Wang, Sinong Wang, Hamed Firooz, Hamed Pirsiavash, Liang Tan

It was shown that an adversary can poison a small part of the unlabeled data so that when a victim trains an SSL model on it, the final model will have a backdoor that the adversary can exploit.

Data Poisoning Self-Supervised Learning

Representation Deficiency in Masked Language Modeling

1 code implementation • 4 Feb 2023 • Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer

In this work, we offer a new perspective on the consequence of such a discrepancy: We demonstrate empirically and theoretically that MLM pretraining allocates some model dimensions exclusively for representing $\texttt{[MASK]}$ tokens, resulting in a representation deficiency for real tokens and limiting the pretrained model's expressiveness when it is adapted to downstream data without $\texttt{[MASK]}$ tokens.

Language Modelling Masked Language Modeling

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

no code implementations • 4 Nov 2022 • Yifang Chen, Karthik Sankararaman, Alessandro Lazaric, Matteo Pirotta, Dmytro Karamshuk, Qifan Wang, Karishma Mandyam, Sinong Wang, Han Fang

We design a novel algorithmic template, Weak Labeler Active Cover (WL-AC), that is able to robustly leverage the lower quality weak labelers to reduce the query complexity while retaining the desired level of accuracy.

Active Learning

BayesFormer: Transformer with Uncertainty Estimation

no code implementations • 2 Jun 2022 • Karthik Abinav Sankararaman, Sinong Wang, Han Fang

The Transformer has become ubiquitous owing to its dominant performance in various NLP and image processing tasks.

Active Learning Language Modelling +3

Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem

no code implementations • Findings (ACL) 2022 • Khalil Mrini, Shaoliang Nie, Jiatao Gu, Sinong Wang, Maziar Sanjabi, Hamed Firooz

Without the use of a knowledge base or candidate sets, our model sets a new state of the art in two benchmark datasets of entity linking: COMETA in the biomedical domain, and AIDA-CoNLL in the news domain.

Decoder Entity Linking +1

IDPG: An Instance-Dependent Prompt Generation Method

no code implementations • NAACL 2022 • Zhuofeng Wu, Sinong Wang, Jiatao Gu, Rui Hou, Yuxiao Dong, V. G. Vinod Vydiswaran, Hao Ma

Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.

Language Modelling Natural Language Understanding +2

Reducing Target Group Bias in Hate Speech Detectors

no code implementations • 7 Dec 2021 • Darsh J Shah, Sinong Wang, Han Fang, Hao Ma, Luke Zettlemoyer

The ubiquity of offensive and hateful content on online fora necessitates automatic solutions that detect such content competently across target groups.

Text Classification

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models

1 code implementation • NAACL 2022 • Qinyuan Ye, Madian Khabsa, Mike Lewis, Sinong Wang, Xiang Ren, Aaron Jaech

Distilling state-of-the-art transformer models into lightweight student models is an effective way to reduce computation cost at inference time.

Domain Generalization Privacy Preserving +4

Luna: Linear Unified Nested Attention

2 code implementations • NeurIPS 2021 • Xuezhe Ma, Xiang Kong, Sinong Wang, Chunting Zhou, Jonathan May, Hao Ma, Luke Zettlemoyer

Specifically, with the first attention function, Luna packs the input sequence into a sequence of fixed length.

Language Modelling Machine Translation +2

Entailment as Few-Shot Learner

3 code implementations • 29 Apr 2021 • Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.

Ranked #1 on Topic Classification on OS

Contrastive Learning Data Augmentation +8

On the Influence of Masking Policies in Intermediate Pre-training

no code implementations • EMNLP 2021 • Qinyuan Ye, Belinda Z. Li, Sinong Wang, Benjamin Bolte, Hao Ma, Wen-tau Yih, Xiang Ren, Madian Khabsa

Current NLP models are predominantly trained through a two-stage "pre-train then fine-tune" pipeline.

Abstractive Text Summarization Language Modelling +4

On Unifying Misinformation Detection

1 code implementation • NAACL 2021 • Nayeon Lee, Belinda Z. Li, Sinong Wang, Pascale Fung, Hao Ma, Wen-tau Yih, Madian Khabsa

In this paper, we introduce UnifiedM2, a general-purpose misinformation model that jointly models multiple domains of misinformation with a single, unified setup.

Few-Shot Learning Misinformation

Studying Strategically: Learning to Mask for Closed-book QA

no code implementations • 31 Dec 2020 • Qinyuan Ye, Belinda Z. Li, Sinong Wang, Benjamin Bolte, Hao Ma, Wen-tau Yih, Xiang Ren, Madian Khabsa

Thus, our policy packs task-relevant knowledge into the parameters of a language model.

Language Modelling Question Answering +1

CLEAR: Contrastive Learning for Sentence Representation

no code implementations • 31 Dec 2020 • Zhuofeng Wu, Sinong Wang, Jiatao Gu, Madian Khabsa, Fei Sun, Hao Ma

Pre-trained language models have proven their unique powers in capturing implicit language features.

Ranked #5 on Question Answering on Quora Question Pairs

Contrastive Learning Linguistic Acceptability +5

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

no code implementations • ACL 2020 • Sinong Wang, Madian Khabsa, Hao Ma

Pretraining NLP models with variants of Masked Language Model (MLM) objectives has recently led to significant improvements on many tasks.

Language Modelling text-classification +1

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

no code implementations • 15 Jun 2020 • Sinong Wang, Madian Khabsa, Hao Ma

Pretraining NLP models with variants of Masked Language Model (MLM) objectives has recently led to significant improvements on many tasks.

Language Modelling text-classification +1

Linformer: Self-Attention with Linear Complexity

15 code implementations • 8 Jun 2020 • Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma

Large transformer models have shown extraordinary success in achieving state-of-the-art results in many natural language processing applications.

Language Modelling

Language Models as Fact Checkers?

no code implementations • WS 2020 • Nayeon Lee, Belinda Z. Li, Sinong Wang, Wen-tau Yih, Hao Ma, Madian Khabsa

Recent work has suggested that language models (LMs) store both common-sense and factual knowledge learned from pre-training data.

Common Sense Reasoning Language Modelling +2

Blockwise Self-Attention for Long Document Understanding

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Jiezhong Qiu, Hao Ma, Omer Levy, Scott Wen-tau Yih, Sinong Wang, Jie Tang

We present BlockBERT, a lightweight and efficient BERT model for better modeling long-distance dependencies.

document understanding Language Modelling +1

UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits

no code implementations • 16 Apr 2018 • Fang Liu, Sinong Wang, Swapna Buccapatnam, Ness Shroff

We show that UCBoost($D$) enjoys $O(1)$ complexity for each arm per round as well as a regret guarantee that is $1/e$-close to that of the kl-UCB algorithm.

Decision Making

A New Alternating Direction Method for Linear Programming

no code implementations • NeurIPS 2017 • Sinong Wang, Ness Shroff

It is well known that, for a linear program (LP) with constraint matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, the Alternating Direction Method of Multipliers (ADMM) converges globally and linearly at a rate $O((\|\mathbf{A}\|_F^2+mn)\log(1/\epsilon))$.
