AI Safety Fundamentals: Alignment 201

By BlueDot Impact

To listen to this podcast, please open the Podcast Republic app, available on the Google Play Store and the Apple App Store.

Image by BlueDot Impact

Category: Technology

Subscribers: 0
Reviews: 0
Episodes: 15

Description

Listen to resources from the AI Safety Fundamentals: Alignment 201 course!

https://course.aisafetyfundamentals.com/alignment-201


Episodes

Empirical Findings Generalize Surprisingly Far (May 13, 2023)
Worst-Case Thinking in AI Alignment (May 13, 2023)
Two-Turn Debate Doesn’t Help Humans Answer Hard Reading Comprehension Questions (May 13, 2023)
Least-To-Most Prompting Enables Complex Reasoning in Large Language Models (May 13, 2023)
Low-Stakes Alignment (May 13, 2023)
ABS: Scanning Neural Networks for Back-Doors by Artificial Brain Stimulation (May 13, 2023)
Imitative Generalisation (AKA ‘Learning the Prior’) (May 13, 2023)
Discovering Latent Knowledge in Language Models Without Supervision (May 13, 2023)
Toy Models of Superposition (May 13, 2023)
An Investigation of Model-Free Planning (May 13, 2023)
Gradient Hacking: Definitions and Examples (May 13, 2023)
Intro to Brain-Like-AGI Safety (May 13, 2023)
Deep double descent (May 13, 2023)
Eliciting Latent Knowledge (May 13, 2023)
Chinchilla’s wild implications (May 13, 2023)