JEEBench is a challenging benchmark dataset for evaluating the problem-solving abilities of LLMs. It curates 515 pre-engineering mathematics, physics, and chemistry problems from the IIT JEE-Advanced exam. Solving these problems requires long-horizon reasoning on top of deep in-domain knowledge.
7 PAPERS • 1 BENCHMARK