Hailey Schoelkopf
Hi! I’m Hailey (she/her). I am currently a Research Scientist at EleutherAI. There, I study a variety of topics across AI, ML, and LLMs; some of my particular research interests include:
- Rigorous, reliable evaluation of LLMs and other generative models: how do we create standards for reproducible evaluation of AI models, evaluate them on complex tasks, and build a science of capability testing?
- The engineering that goes into distributed training and making it fast: I think many of the most important and most interesting open questions about the current paradigm are engineering questions.
- The science of scaling models up reliably: much of the recent progress has come from systematizing the transmutation of compute into performance. We should understand these processes better and make our existing recipes even more predictable.
I am currently a maintainer of the LM Evaluation Harness. Some notable projects I’ve worked on include pretraining the Pythia suite of language models and engineering the continued pretraining of the Llemma base models for mathematics.
news
| Date | News |
|---|---|
| Aug 29, 2024 | I was a panelist at Princeton Language and Intelligence’s Workshop on Useful and Reliable Agents, discussing our experience maintaining the LM Evaluation Harness and considerations for evaluating LM agents. |
| Jul 22, 2024 | I gave an ICML 2024 tutorial with Lintang Sutawika on “Challenges in LM Evaluation”! For ICML attendees, the recording can be found on the ICML website, and the slides are uploaded here. Thank you to all who attended! |
| Jun 22, 2024 | I gave a talk on “Lessons Learned on Effective and Reproducible Evaluations of LLMs” at Cohere For AI’s NLP community group. Thanks for having me! |
| Jun 11, 2024 | I gave a talk on “A Deep Dive on LM Evaluation” for Maven and Parlance Labs’ LLM Fine-Tuning Conference. Thanks to all who attended. Slides can be found here. |
| Jun 06, 2024 | New preprint released: “Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?” |
latest posts
| Date | Post |
|---|---|
| Aug 11, 2024 | Prefix Linear Attention Can Outspeed Causal Linear Attention |
| Jul 09, 2024 | Linear Attention Fundamentals |