Narrations of Redwood Research blog posts. Redwood Research is a research nonprofit based in Berkeley. We investigate risks posed by the development of...
Subtitle: A new paper on AI control for agents. We have released a new paper, Ctrl-Z: Controlling AI...
Subtitle: Evidence from just model internals (e.g. interpretability) is unlikely to be broadly convincing. One key hope for...
Subtitle: A breakdown of how higher capabilities increase risk. It's generally agreed that as AIs get more capable,...
Subtitle: What are all the research and implementation areas helpful for control? In this post, I'll list all...
Subtitle: What methods can we use to ensure control? We often talk about ensuring control, which in the...
Buck on the 80,000 Hours podcast. My podcast with Rob Wiblin from 80,000 Hours just came out. I’m really...
Subtitle: How can we prevent AIs from intentionally underperforming on our metrics? If we naively apply RL to...
Subtitle: What are the methods and issues when failures occur diffusely over many actions? In this post, I'll...
Subtitle: What are the main threats and how should we prioritize them? We often talk about ensuring control,...
Subtitle: Developing AI employees that are safer than human ones. My goal as an AI safety researcher is...