Listen to resources from the AI Safety Fundamentals: Alignment 201 course!
https://course.aisafetyfundamentals.com/alignment-201
Episode | Date |
---|---|
Empirical Findings Generalize Surprisingly Far | May 13, 2023 |
Worst-Case Thinking in AI Alignment | May 13, 2023 |
Two-Turn Debate Doesn’t Help Humans Answer Hard Reading Comprehension Questions | May 13, 2023 |
Least-To-Most Prompting Enables Complex Reasoning in Large Language Models | May 13, 2023 |
Low-Stakes Alignment | May 13, 2023 |
ABS: Scanning Neural Networks for Back-Doors by Artificial Brain Stimulation | May 13, 2023 |
Imitative Generalisation (AKA ‘Learning the Prior’) | May 13, 2023 |
Discovering Latent Knowledge in Language Models Without Supervision | May 13, 2023 |
Toy Models of Superposition | May 13, 2023 |
An Investigation of Model-Free Planning | May 13, 2023 |
Gradient Hacking: Definitions and Examples | May 13, 2023 |
Intro to Brain-Like-AGI Safety | May 13, 2023 |
Deep Double Descent | May 13, 2023 |
Eliciting Latent Knowledge | May 13, 2023 |
Chinchilla’s Wild Implications | May 13, 2023 |