I am an Assistant Professor at the University of Cambridge and a member of Cambridge’s Computational and Biological Learning lab (CBL). My research group focuses on Deep Learning, AI Alignment, and AI Safety. I’m broadly interested in work (including in areas outside Machine Learning, e.g. AI governance) that could reduce the risk of human extinction (“x-risk”) resulting from out-of-control AI systems. Particular interests include:
- Reward modeling and reward gaming
- Aligning foundation models
- Understanding learning and generalization in deep learning and foundation models, especially via “empirical theory” approaches
- Preventing the development and deployment of socially harmful AI systems
- Elaborating and evaluating speculative concerns about more advanced future AI systems