pod.link/1806316198
Redwood Research Blog
Redwood Research

Narrations of Redwood Research blog posts. Redwood Research is a research nonprofit based in Berkeley. We investigate risks posed by the development of...

Listen now on

Apple Podcasts
Spotify
Overcast
Podcast Addict
Pocket Casts
Castbox
Podbean
iHeartRadio
Player FM
Podcast Republic
Castro
RSS

Episodes

“AIs at the current capability level may be important for future safety work” by Ryan Greenblatt

Subtitle: Some reasons why relatively weak AIs might still be important when we have very powerful AIs. Sometimes...

12 May 2025 · 7 minutes
“Misalignment and Strategic Underperformance: An Analysis of Sandbagging and Exploration Hacking” by Julian Stastny, Buck Shlegeris

Subtitle: A new analysis of the risk of AIs intentionally performing poorly. In the future, we will want...

08 May 2025 · 30 minutes
“Training-time schemers vs behavioral schemers” by Alex Mallen

Subtitle: Clarifying ways in which faking alignment during training is neither necessary nor sufficient for the kind of scheming...

06 May 2025 · 13 minutes
“What’s going on with AI progress and trends? (As of 5/2025)” by Ryan Greenblatt

Subtitle: My views on what's driving AI progress and where it's headed. AI progress is driven by improved...

03 May 2025 · 16 minutes
“How can we solve diffuse threats like research sabotage with AI control?” by Vivek Hebbar

Subtitle: Preventing research sabotage will require techniques very different from the original control paper. Misaligned AIs might engage...

30 Apr 2025 · 15 minutes
“7+ tractable directions in AI control” by Ryan Greenblatt

Subtitle: A list of easy-to-start directions in AI control targeted at independent researchers without as much context or compute. ...

29 Apr 2025 · 26 minutes
“Clarifying AI R&D threat models” by Josh Clymer

Subtitle: (There are a few). A casual reader of one of the many AI company safety frameworks might...

25 Apr 2025 · 9 minutes
“How training-gamers might function (and win)” by Vivek Hebbar

Subtitle: A model of the relationship between higher level goals, explicit reasoning, and learned heuristics in capable agents. ...

24 Apr 2025 · 32 minutes
“Handling schemers if shutdown is not an option” by Buck Shlegeris

Subtitle: What if getting strong evidence of scheming isn't the end of your scheming problems, but merely the middle? ...

18 Apr 2025 · 24 minutes
“Ctrl-Z: Controlling AI Agents via Resampling” by Buck Shlegeris

Subtitle: A new paper on AI control for agents. We have released a new paper, Ctrl-Z: Controlling AI...

16 Apr 2025 · 4 minutes