Announcement_2024 08 29

I was a panelist at Princeton Language and Intelligence’s Workshop on Useful and Reliable Agents, where I discussed our experience maintaining the LM Evaluation Harness and considerations for evaluating LM agents.