Me in 10 seconds
I'm a researcher working on AI safety, with a focus on understanding agency: both agency in AI systems and how powerful AI will impact human agency. I'm currently pursuing an MPhil in Applied Mathematics at the University of Cape Town with Jonathan Shock, where I'm studying the application of mechanistic interpretability to deeply understand RL models. I also have a strong interest in evaluative frameworks and creative ways to measure and understand LLMs.
I am also passionate about teaching and growing the field of AI safety, and co-founded AI Safety Cape Town to contribute to that effort.
Outside of research, I draw inspiration from Buddhist philosophy, Stoicism, and Kantian ethics in thinking about how to make meaningful contributions to the field of AI safety. To recharge, I spend time playing beach volleyball, watching anime, reading epic fantasy/sci-fi, and hiking.
I welcome feedback on how I'm doing! If you'd like to share, please feel encouraged to do so using this feedback form.
Me in many seconds
link to my about page
Now Now Now
What I'm doing now
Inkhaven: 30 Days of Posts
Why Are Anglophone Countries Unhappier Than Their European Counterparts?
Attachment Theory Is Extremely Cool and Useful
Linkpost: Sanity-Checking 'Incompressible Knowledge Probes'
Posts
Whole Brain Emulation as an Anchor for AI Welfare
Lessons from my first 10 day Vipassana
Why pursue conceptions of agency for AI safety
Vipassana Meditation and Active Inference: A Framework for Understanding Suffering and its Cessation
Projects
Building and training a word embedding system
Creating a small GPT from scratch in PyTorch
Adding vision and navigation to an autonomous farm robot
Developing toy models of agency. A mechanistic interpretability project.
Talks
Papers
Investigating Factored Cognition in Large Language Models For Answering Ethically Nuanced Questions
A Security Analysis of the Linux RNG Protocol in Virtual Machines
HumanAgencyBench: Do Language Models Support Human Agency?