Linear Warmup With Cosine Annealing is a learning rate schedule in which the learning rate is increased linearly for the first $n$ updates and then annealed according to a cosine schedule.
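The schedule above can be sketched as a simple function of the update step. This is a minimal illustrative implementation, not code from any particular library; the names `warmup_steps`, `total_steps`, `max_lr`, and `min_lr` are assumptions chosen for clarity.

```python
import math

def lr_at_step(step, warmup_steps, total_steps, max_lr, min_lr=0.0):
    """Learning rate at `step`: linear warmup, then cosine annealing.

    All parameter names are illustrative. `warmup_steps` plays the role
    of $n$ in the description above.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from 0 up to max_lr over the first n updates.
        return max_lr * step / warmup_steps
    # Cosine annealing: decay from max_lr down to min_lr over the
    # remaining (total_steps - warmup_steps) updates.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, with `warmup_steps=100` and `total_steps=1000`, the rate rises linearly to `max_lr` at step 100 and then follows the cosine curve down to `min_lr` at step 1000.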
Task | Papers | Share
---|---|---
Language Modelling | 73 | 9.62%
Large Language Model | 47 | 6.19%
Question Answering | 35 | 4.61%
Retrieval | 31 | 4.08%
In-Context Learning | 27 | 3.56%
Sentence | 23 | 3.03%
Text Generation | 23 | 3.03%
Code Generation | 22 | 2.90%
Prompt Engineering | 19 | 2.50%