Narrations of Redwood Research blog posts. Redwood Research is a research nonprofit based in Berkeley. We investigate risks posed by the development of...
Subtitle: Some reasons why relatively weak AIs might still be important when we have very powerful AIs. Sometimes...
Subtitle: A new analysis of the risk of AIs intentionally performing poorly. In the future, we will want...
Subtitle: Clarifying ways in which faking alignment during training is neither necessary nor sufficient for the kind of scheming...
Subtitle: My views on what's driving AI progress and where it's headed. AI progress is driven by improved...
Subtitle: Preventing research sabotage will require techniques very different from the original control paper. Misaligned AIs might engage...
Subtitle: A list of easy-to-start directions in AI control targeted at independent researchers without as much context or compute. ...
Subtitle: (There are a few). A casual reader of one of the many AI company safety frameworks might...
Subtitle: A model of the relationship between higher level goals, explicit reasoning, and learned heuristics in capable agents. ...
Subtitle: What if getting strong evidence of scheming isn't the end of your scheming problems, but merely the middle? ...
Subtitle: A new paper on AI control for agents. We have released a new paper, Ctrl-Z: Controlling AI...