Search Results for author: Zhiming

Found 2 papers, 2 papers with code

Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster

2 code implementations6 Apr 2023 Nolan Dey, Gurpreet Gosal, Zhiming, Chen, Hemant Khachane, William Marshall, Ribhu Pathria, Marvin Tom, Joel Hestness

We study recent research advances that improve large language models through efficient pre-training and scaling, and open datasets and tools.

Cannot find the paper you are looking for? You can Submit a new open access paper.