1 code implementation • 17 May 2023 • Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie
Companies have begun to offer Embedding as a Service (EaaS) based on these LLMs, which can benefit various natural language processing (NLP) tasks for customers.
2 code implementations • 10 Apr 2023 • Nan Yang, Tao Ge, Liang Wang, Binxing Jiao, Daxin Jiang, Linjun Yang, Rangan Majumder, Furu Wei
We propose LLMA, an LLM accelerator to losslessly speed up Large Language Model (LLM) inference with references.
no code implementations • 20 Dec 2022 • Chongyang Tao, Chang Liu, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang
Different from previous works that only rely on one positive and hard negatives as candidate passages, we create dark examples that all have moderate relevance to the query through mixing-up and masking in discrete space.
1 code implementation • 7 Dec 2022 • Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks.
Ranked #11 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (using extra training data)
no code implementations • 21 Nov 2022 • Qiushi Zhu, Long Zhou, Ziqiang Zhang, Shujie Liu, Binxing Jiao, Jie Zhang, LiRong Dai, Daxin Jiang, Jinyu Li, Furu Wei
Although speech is a simple and effective way for humans to communicate with the outside world, a more realistic speech interaction contains multimodal information, e. g., vision, text.
1 code implementation • 17 Oct 2022 • Jingwei Yi, Fangzhao Wu, Chuhan Wu, Xiaolong Huang, Binxing Jiao, Guangzhong Sun, Xing Xie
In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE, aiming to select a few sentences which can best summarize the webpage content in the context of input query.
1 code implementation • 31 Aug 2022 • Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang
In large-scale retrieval, the lexicon-weighting paradigm, learning weighted sparse representations in vocabulary space, has shown promising results with high quality and low latency.
1 code implementation • 29 Aug 2022 • Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang
The alignment is achieved by weakened knowledge distillations to enlighten the retriever via two aspects -- 1) a lexicon-augmented contrastive objective to challenge the dense encoder and 2) a pair-wise rank-consistent regularization to make dense model's behavior incline to the other.
1 code implementation • 6 Jul 2022 • Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei
It employs a simple bottleneck architecture that learns to compress the passage information into a dense vector through self-supervised pre-training.
no code implementations • 16 Jun 2022 • Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, Daxin Jiang
A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind -- learning from moderate negatives or/and serving as an auxiliary module for a retriever.
no code implementations • Findings (ACL) 2022 • Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei
As more and more pre-trained language models adopt on-cloud deployment, the privacy issues grow quickly, mainly for the exposure of plain-text user data (e. g., search history, medical record, bank account).
no code implementations • 1 Jun 2022 • Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei
The sparse Mixture-of-Experts (MoE) model is powerful for large-scale pre-training and has achieved promising results due to its model capacity.
no code implementations • 20 Aug 2021 • Chuhan Wu, Fangzhao Wu, Tao Qi, Binxing Jiao, Daxin Jiang, Yongfeng Huang, Xing Xie
We then sample token pairs based on their probability scores derived from the sketched attention matrix to generate different sparse attention index matrices for different attention heads.
no code implementations • ACL 2021 • Nan Yang, Furu Wei, Binxing Jiao, Daxing Jiang, Linjun Yang
Dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering.