AI Safety Institute approach to evaluations
The AI Safety Institute's (AISI's) approach to evaluating and testing advanced AI systems, to better understand what each new system is capable of.
Details
The AI Safety Institute (AISI) was launched at the AI Safety Summit at Bletchley Park. AISI's mission is to advance AI safety for the public interest. At Bletchley, companies agreed that governments should evaluate and test advanced AI systems, so that we can better understand what each new system is capable of.
Now that AISI has started pre-deployment testing of advanced AI systems for potentially harmful capabilities, we are setting out our approach to evaluations. The publication includes:
- An update on AISI’s progress since the AI Safety Summit, including its three core functions: evaluations, foundational AI research, and facilitating information exchange.
- AISI’s approach to evaluations, including its techniques and an initial research agenda.
- AISI’s criteria for selecting models for evaluation.
- Three case studies of demonstrations across misuse, autonomous systems, and societal impacts, which were presented at the AI Safety Summit.
- AISI’s approach beyond evaluations, including foundational AI research.