In-Context Learning

503 papers with code • 0 benchmarks • 0 datasets

In-context learning (ICL) is a paradigm in which a large pre-trained language model performs a new task at inference time alone, conditioning on a few input-label demonstrations provided in its prompt and decoding the answer without any update to its parameters.

Most implemented papers

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

aadityasingh/icl-dynamics 10 Apr 2024

By clamping subsets of activations throughout training, we identify three underlying subcircuits that interact to drive induction head (IH) formation, yielding the phase change.

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

kakaobrain/kogpt EMNLP 2021

GPT-3 demonstrates the remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billions of tokens of data.

MetaICL: Learning to Learn In Context

facebookresearch/metaicl NAACL 2022

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
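
To make the meta-training idea concrete, here is a minimal sketch of one training step, assuming tasks are lists of (input, output) pairs and a Hugging Face-style causal LM and tokenizer are available; the prompt format and function names are illustrative, not MetaICL's actual code.

```python
import random
import torch

def meta_icl_step(model, tokenizer, tasks, k=4):
    """Illustrative meta-training step: sample a task, build a k-shot
    in-context prompt from its examples, and take the LM loss only on
    the final example's output (optimizer step omitted)."""
    task = random.choice(tasks)                # tasks: list of lists of (input, output) pairs
    sampled = random.sample(task, k + 1)       # k demonstrations + 1 target example
    demos, (query, answer) = sampled[:k], sampled[-1]

    prompt = "".join(f"{x}\n{y}\n\n" for x, y in demos) + f"{query}\n"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    answer_ids = tokenizer(answer, return_tensors="pt",
                           add_special_tokens=False).input_ids

    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100    # ignore loss on the prompt tokens

    loss = model(input_ids=input_ids, labels=labels).loss
    loss.backward()
    return loss.item()
```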

Learning To Retrieve Prompts for In-Context Learning

ohadrubin/epr NAACL 2022

In-context learning is a recent paradigm in natural language understanding, where a large pre-trained language model (LM) observes a test instance and a few training examples as its input, and directly decodes the output without any update to its parameters.
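
As a concrete illustration of the paradigm described in that abstract (not of the retrieval method itself), here is a minimal sketch of how a few-shot prompt is assembled and sent to a frozen model; generate_fn stands in for whatever completion interface is available and is purely hypothetical.

```python
def build_icl_prompt(demonstrations, test_input):
    """Concatenate (input, label) demonstrations and the test instance into
    one prompt; the model's parameters are never updated."""
    parts = [f"Input: {x}\nLabel: {y}" for x, y in demonstrations]
    parts.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(parts)

demos = [("the movie was wonderful", "positive"),
         ("a dull, lifeless script", "negative")]
prompt = build_icl_prompt(demos, "an instant classic")
# prediction = generate_fn(prompt, max_new_tokens=1)   # hypothetical LM call
```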

Black-Box Tuning for Language-Model-as-a-Service

txsun1997/black-box-tuning 10 Jan 2022

In this scenario, where large pre-trained models (PTMs) can only be queried through inference APIs, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.
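
The sketch below illustrates the general black-box setting: a continuous prompt vector is tuned using only scalar scores returned by the service, with no gradient access. It uses plain random search as a stand-in for the paper's CMA-ES over a low-dimensional subspace, and query_service is a hypothetical wrapper around the remote API.

```python
import numpy as np

def black_box_prompt_search(query_service, dim=50, iters=200, sigma=0.1, seed=0):
    """Derivative-free search over a continuous prompt vector.
    query_service(prompt_vec) is assumed to return a scalar loss computed
    by the remote LM service; no gradients are available."""
    rng = np.random.default_rng(seed)
    best = np.zeros(dim)
    best_loss = query_service(best)
    for _ in range(iters):
        candidate = best + sigma * rng.standard_normal(dim)
        loss = query_service(candidate)
        if loss < best_loss:              # keep the perturbation only if it helps
            best, best_loss = candidate, loss
    return best, best_loss
```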

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

alrope123/rethinking-demonstrations 25 Feb 2022

Large language models (LMs) are able to in-context learn -- perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs.

UL2: Unifying Language Learning Paradigms

google-research/google-research 10 May 2022

Our model also achieves strong results in in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

r-three/t-few 11 May 2022

ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
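
A back-of-the-envelope illustration of that cost: every ICL prediction re-encodes all of the demonstrations, so the tokens processed grow with both the number of demonstrations and the number of queries, while a fine-tuned model pays only for each query. The token counts below are assumed purely for illustration.

```python
# Illustrative token accounting (all numbers are assumptions, not measurements).
k_demos = 32                 # demonstrations per prompt
tokens_per_demo = 100        # average tokens per demonstration
tokens_per_query = 100       # average tokens per test input
n_predictions = 10_000

icl_tokens = n_predictions * (k_demos * tokens_per_demo + tokens_per_query)
finetuned_tokens = n_predictions * tokens_per_query

print(f"ICL:        {icl_tokens:,} tokens processed")        # 33,000,000
print(f"Fine-tuned: {finetuned_tokens:,} tokens processed")  # 1,000,000
```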

Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments

dki-lab/pangu 19 Dec 2022

Most existing work for grounded language understanding uses LMs to directly generate plans that can be executed in the environment to achieve the desired effects.

Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

alrope123/z-icl 19 Dec 2022

Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available.
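
A minimal sketch of the pseudo-demonstration idea, assuming access to a raw unlabeled corpus: retrieve sentences similar to the test input and pair each with a randomly chosen label name to form demonstrations. The token-overlap retriever is a naive placeholder, not the paper's actual retrieval setup.

```python
import random

def pseudo_demonstrations(test_input, corpus, label_names, k=4, seed=0):
    """Build k pseudo-demonstrations: unlabeled sentences most similar to the
    test input, each paired with a random label (no gold labels required)."""
    rng = random.Random(seed)
    query_tokens = set(test_input.lower().split())
    # Naive similarity: word overlap with the test input (placeholder retriever).
    ranked = sorted(corpus, key=lambda s: -len(query_tokens & set(s.lower().split())))
    return [(sentence, rng.choice(label_names)) for sentence in ranked[:k]]

# demos = pseudo_demonstrations("an instant classic", raw_corpus, ["positive", "negative"])
```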