| Episode | Date |
|---|---|
| “How I talk to those above me” by Maxwell Peterson | Mar 30, 2025 |
| “The vision of Bill Thurston” by TsviBT | Mar 29, 2025 |
| “Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle” by Czynski | Mar 29, 2025 |
| “Softmax, Emmett Shear’s new AI startup focused on ‘Organic Alignment’” by Chipmonk | Mar 28, 2025 |
| “The Pando Problem: Rethinking AI Individuality” by Jan_Kulveit | Mar 28, 2025 |
| “Gemini 2.5 is the New SoTA” by Zvi | Mar 28, 2025 |
| “AI #109: Google Fails Marketing Forever” by Zvi | Mar 28, 2025 |
| “Explaining British Naval Dominance During the Age of Sail” by Arjun Panickssery | Mar 28, 2025 |
| “Tracing the Thoughts of a Large Language Model” by Adam Jermyn | Mar 27, 2025 |
| “Third-wave AI safety needs sociopolitical thinking” by Richard_Ngo | Mar 27, 2025 |
| “Mistral Large 2 (123B) exhibits alignment faking” by Marc Carauleanu, Diogo de Lucena, Gunnar_Zarncke, Cameron Berg, Judd Rosenblatt, Mike Vaiana, AE Studio | Mar 27, 2025 |
| [Linkpost] “Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now” by Tristan Cook | Mar 27, 2025 |
| “Avoid the Counterargument Collapse” by marknm | Mar 27, 2025 |
| “Automated Researchers Can Subtly Sandbag” by gasteigerjo, Akbir Khan, Sam Bowman, Vlad Mikulik, Ethan Perez, Fabien Roger | Mar 27, 2025 |
| “Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)” by Neel Nanda, lewis smith, Senthooran Rajamanoharan, Arthur Conmy, Callum McDougall, Tom Lieberum, János Kramár, Rohin Shah | Mar 26, 2025 |
| “Eukaryote Skips Town - Why I’m leaving DC” by eukaryote | Mar 26, 2025 |
| “Conceptual Rounding Errors” by Jan_Kulveit | Mar 26, 2025 |
| “Goodhart Typology via Structure, Function, and Randomness Distributions” by JustinShovelain, Mateusz Bagiński | Mar 26, 2025 |
| [Linkpost] “Latest map of all 40 copyright suits v. AI in U.S.” by Remmelt | Mar 26, 2025 |
| “An overview of areas of control work” by ryan_greenblatt | Mar 26, 2025 |
| “Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?” by Alex Mallen, charlie_griffin, Buck Shlegeris | Mar 26, 2025 |
| “More on Various AI Action Plans” by Zvi | Mar 25, 2025 |
| “On (Not) Feeling the AGI” by Zvi | Mar 25, 2025 |
| “23andMe potentially for sale for $23M” by lemonhope | Mar 25, 2025 |
| “Notes on countermeasures for exploration hacking (aka sandbagging)” by ryan_greenblatt | Mar 25, 2025 |
| “Analyzing long agent transcripts (Docent)” by jsteinhardt | Mar 25, 2025 |
| “An overview of control measures” by ryan_greenblatt | Mar 25, 2025 |
| “Policy for LLM Writing on LessWrong” by jimrandomh | Mar 24, 2025 |
| “AI ‘Deep Research’ Tools Reviewed” by sarahconstantin | Mar 24, 2025 |
| “Recent AI model progress feels mostly like bullshit” by lc | Mar 24, 2025 |
| “Will Jesus Christ return in an election year?” by Eric Neyman | Mar 24, 2025 |
| “We need (a lot) more rogue agent honeypots” by Ozyrus | Mar 24, 2025 |
| “Selective modularity: a research agenda” by cloud, Jacob G-W | Mar 24, 2025 |
| “Solving willpower seems easier than solving aging” by Yair Halberstadt | Mar 23, 2025 |
| “A long list of concrete projects and open problems in evals” by Marius Hobbhahn | Mar 23, 2025 |
| “Reframing AI Safety as a Neverending Institutional Challenge” by scasper | Mar 23, 2025 |
| “They Took MY Job?” by Zvi | Mar 23, 2025 |
| “Do models say what they learn?” by Andy Arditi, marvinli, Joe Benton, Miles Turpin | Mar 22, 2025 |
| “Good Research Takes are Not Sufficient for Good Strategic Takes” by Neel Nanda | Mar 22, 2025 |
| “SHIFT relies on token-level features to de-bias Bias in Bios probes” by Tim Hua | Mar 22, 2025 |
| “Silly Time” by jefftk | Mar 22, 2025 |
| “How I force LLMs to generate correct code” by claudio | Mar 21, 2025 |
| “Towards a scale-free theory of intelligent agency” by Richard_Ngo | Mar 21, 2025 |
| “Intention to Treat” by Alicorn | Mar 20, 2025 |
| “Socially Graceful Degradation” by Screwtape | Mar 20, 2025 |
| “Apply to MATS 8.0!” by Ryan Kidd, K Richards | Mar 20, 2025 |
| “Equations Mean Things” by abstractapplic | Mar 20, 2025 |
| “Prioritizing threats for AI control” by ryan_greenblatt | Mar 19, 2025 |
| “Elite Coordination via the Consensus of Power” by Richard_Ngo | Mar 19, 2025 |
| [Linkpost] “METR: Measuring AI Ability to Complete Long Tasks” by Zach Stein-Perlman | Mar 19, 2025 |
| “Going Nova” by Zvi | Mar 19, 2025 |
| “Boots theory and Sybil Ramkin” by philh | Mar 19, 2025 |
| “LessOnline 2025: Early Bird Tickets On Sale” by Ben Pace | Mar 18, 2025 |
| “I changed my mind about orca intelligence” by Towards_Keeperhood | Mar 18, 2025 |
| “Go home GPT-4o, you’re drunk: emergent misalignment as lowered inhibitions” by Stuart_Armstrong, rgorman | Mar 18, 2025 |
| “OpenAI #11: America Action Plan” by Zvi | Mar 18, 2025 |
| “Feedback loops for exercise (VO2Max)” by Elizabeth | Mar 18, 2025 |
| “FrontierMath Score of o3-mini Much Lower Than Claimed” by YafahEdelman | Mar 18, 2025 |
| “Falsified draft: ‘Against Yudkowsky’s evolution analogy for AI x-risk’” by Fiora Sunshine | Mar 18, 2025 |
| “Three Types of Intelligence Explosion” by rosehadshar, Tom Davidson, wdmacaskill | Mar 18, 2025 |
| “Sentinel’s Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.” by NunoSempere | Mar 18, 2025 |
| “Notable utility-monster-like failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format” by Roland Pihlakas, Sruthi Kuriakose | Mar 17, 2025 |
| “Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations” by Nicholas Goldowsky-Dill, Mikita Balesni, Jérémy Scheurer, Marius Hobbhahn | Mar 17, 2025 |
| “Metacognition Broke My Nail-Biting Habit” by Rafka | Mar 17, 2025 |
| “How I’ve run major projects” by benkuhn | Mar 17, 2025 |
| “Any-Benefit Mindset and Any-Reason Reasoning” by silentbob | Mar 16, 2025 |
| “Announcing EXP: Experimental Summer Workshop on Collective Cognition” by Jan_Kulveit, Anna Gajdova | Mar 16, 2025 |
| “Why White-Box Redteaming Makes Me Feel Weird” by Zygi Straznickas | Mar 16, 2025 |
| “I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?” by shrimpy | Mar 16, 2025 |
| “AI for AI safety” by Joe Carlsmith | Mar 15, 2025 |
| “On MAIM and Superintelligence Strategy” by Zvi | Mar 15, 2025 |
| “Unofficial 2024 LessWrong Survey Results” by Screwtape | Mar 15, 2025 |
| “AI for Epistemics Hackathon” by Austin Chen | Mar 14, 2025 |
| “Habermas Machine” by NicholasKees | Mar 14, 2025 |
| “Interpreting Complexity” by Maxwell Adam | Mar 14, 2025 |
| “Vacuum Decay: Expert Survey Results” by JessRiedel | Mar 14, 2025 |
| “AI #107: The Misplaced Hype Machine” by Zvi | Mar 14, 2025 |
| “Reducing LLM deception at scale with self-other overlap fine-tuning” by Marc Carauleanu, Diogo de Lucena, Gunnar_Zarncke, Judd Rosenblatt, Mike Vaiana, Cameron Berg | Mar 13, 2025 |
| “Auditing language models for hidden objectives” by Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Akbir Khan, Euan Ong, Christopher Olah, Fabien Roger, Meg, Drake Thomas, Adam Jermyn, Monte M, evhub | Mar 13, 2025 |
| “Intelsat as a Model for International AGI Governance” by rosehadshar, wdmacaskill | Mar 13, 2025 |
| “Don’t over-update on FrontierMath results” by David Matolcsi | Mar 13, 2025 |
| “Anthropic, and taking ‘technical philosophy’ more seriously” by Raemon | Mar 13, 2025 |
| “The Most Forbidden Technique” by Zvi | Mar 12, 2025 |
| “Response to Scott Alexander on Imprisonment” by Zvi | Mar 12, 2025 |
| “HPMOR Anniversary Parties: Coordination, Resources, and Discussion” by Screwtape | Mar 12, 2025 |
| “Paths and waystations in AI safety” by Joe Carlsmith | Mar 12, 2025 |
| “Preparing for the Intelligence Explosion” by fin, wdmacaskill | Mar 11, 2025 |
| “Elon Musk May Be Transitioning to Bipolar Type I” by Cyborg25 | Mar 11, 2025 |
| “AI Control May Increase Existential Risk” by Jan_Kulveit | Mar 11, 2025 |
| “Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases” by Fabien Roger | Mar 11, 2025 |
| “We Have No Plan for Preventing Loss of Control in Open Models” by Andrew Dickson | Mar 11, 2025 |
| “Trojan Sky” by Richard_Ngo | Mar 11, 2025 |
| “The Manus Marketing Madness” by Zvi | Mar 11, 2025 |
| “OpenAI: Detecting misbehavior in frontier reasoning models” by Daniel Kokotajlo | Mar 11, 2025 |
| “Introducing 11 New AI Safety Organizations - Catalyze’s Winter 24/25 London Incubation Program Cohort” by Alexandra Bos | Mar 10, 2025 |
| “Everything I Know About Semantics I Learned From Music Notation” by J Bostock | Mar 10, 2025 |
| “Book Review: Affective Neuroscience” by sarahconstantin | Mar 10, 2025 |
| “Phoenix Rising” by Metacelsus | Mar 09, 2025 |
| “The machine has no mouth and it must scream” by zef | Mar 08, 2025 |
| “Childhood and Education #9: School is Hell” by Zvi | Mar 08, 2025 |
| “AI #106: Not so Fast” by Zvi | Mar 08, 2025 |
| “Lots of brief thoughts on Software Engineering” by Yair Halberstadt | Mar 07, 2025 |
| “Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource” by Bryce Robertson, Søren Elverlin | Mar 07, 2025 |
| “So how well is Claude playing Pokémon?” by Julian Bradshaw | Mar 07, 2025 |
| “What the Headlines Miss About the Latest Decision in the Musk vs. OpenAI Lawsuit” by garrison | Mar 07, 2025 |
| “We should start looking for scheming ‘in the wild’” by Marius Hobbhahn | Mar 06, 2025 |
| “The Hidden Cost of Our Lies to AI” by Nicholas Andresen | Mar 06, 2025 |
| “On Writing #1” by Zvi | Mar 06, 2025 |
| “On the Rationality of Deterring ASI” by Dan H | Mar 05, 2025 |
| “How Much Are LLMs Actually Boosting Real-World Programmer Productivity?” by Thane Ruthenis | Mar 05, 2025 |
| “On OpenAI’s Safety and Alignment Philosophy” by Zvi | Mar 05, 2025 |
| “A Bear Case: My Predictions Regarding AI Progress” by Thane Ruthenis | Mar 05, 2025 |
| “For scheming, we should first focus on detection and then on prevention” by Marius Hobbhahn | Mar 04, 2025 |
| “The Semi-Rational Wildfirefighter” by P. João | Mar 04, 2025 |
| “The Milton Friedman Model of Policy Change” by JohnofCharleston | Mar 04, 2025 |
| “Could Advanced AI Accelerate the Pace of AI Progress? Interviews with AI Researchers” by Nikola Jurkovic, jleibowich, Tom Davidson | Mar 04, 2025 |
| “What goals will AIs have? A list of hypotheses” by Daniel Kokotajlo | Mar 03, 2025 |
| “Methods for strong human germline engineering” by TsviBT | Mar 03, 2025 |
| “Statistical Challenges with Making Super IQ babies” by Jan Christian Refsgaard | Mar 03, 2025 |
| “Will LLM agents become the first takeover-capable AGIs?” by Seth Herd | Mar 03, 2025 |
| “Self-fulfilling misalignment data might be poisoning our AI models” by TurnTrout | Mar 02, 2025 |
| “Maintaining Alignment during RSI as a Feedback Control Problem” by beren | Mar 02, 2025 |
| “Open problems in emergent misalignment” by Jan Betley, Daniel Tan | Mar 01, 2025 |
| “On Emergent Misalignment” by Zvi | Feb 28, 2025 |
| “OpenAI releases ChatGPT 4.5” by Seth Herd | Feb 28, 2025 |
| “How to Corner Liars: A Miasma-Clearing Protocol” by ymeskhout | Feb 28, 2025 |
| “Weirdness Points” by lsusr | Feb 28, 2025 |
| “Why Can’t We Hypothesize After the Fact?” by David Udell | Feb 27, 2025 |
| “Fuzzing LLMs sometimes makes them reveal their secrets” by Fabien Roger | Feb 26, 2025 |
| “[PAPER] Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations” by Lucy Farnik | Feb 26, 2025 |
| “Osaka” by lsusr | Feb 26, 2025 |
| “Time to Welcome Claude 3.7” by Zvi | Feb 26, 2025 |
| “You can just wear a suit” by lsusr | Feb 26, 2025 |
| “what an efficient market feels from inside” by DMMF | Feb 26, 2025 |
| “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs” by Jan Betley, Owain_Evans | Feb 25, 2025 |
| “Grok Grok” by Zvi | Feb 25, 2025 |
| “Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?” by Jesse Richardson, Yoshua Bengio, dwk, mattmacdermott | Feb 25, 2025 |
| “Dream, Truth, & Good” by abramdemski | Feb 25, 2025 |
| “Conference Report: Threshold 2030 - Modeling AI Economic Futures” by Deric Cheng, Justin Bullock, Deger Turan, Elliot Mckernon | Feb 25, 2025 |
| “Training AI to do alignment research we don’t already know how to do” by joshc | Feb 24, 2025 |
| “Anthropic releases Claude 3.7 Sonnet with extended thinking mode” by LawrenceC | Feb 24, 2025 |
| “Forecasting Frontier Language Model Agent Capabilities” by Govind Pimpale, Axel Højmark, Jérémy Scheurer, Marius Hobbhahn | Feb 24, 2025 |
| “Evaluating ‘What 2026 Looks Like’ So Far” by Jonny Spicer | Feb 24, 2025 |
| “Export Surplusses” by lsusr | Feb 24, 2025 |
| “Judgements: Merging Prediction & Evidence” by abramdemski | Feb 24, 2025 |
| “The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research” by Arthur Conmy, Neel Nanda | Feb 24, 2025 |
| “Power Lies Trembling: a three-book review” by Richard_Ngo | Feb 23, 2025 |
| “Proselytizing” by lsusr | Feb 23, 2025 |
| “HPMOR Anniversary Guide” by Screwtape | Feb 23, 2025 |
| “Alignment can be the ‘clean energy’ of AI” by Cameron Berg, Judd Rosenblatt, AE Studio | Feb 22, 2025 |
| “ParaScope: Do Language Models Plan the Upcoming Paragraph?” by NickyP | Feb 21, 2025 |
| “The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better” by Thane Ruthenis | Feb 21, 2025 |
| “On OpenAI’s Model Spec 2.0” by Zvi | Feb 21, 2025 |
| “The first RCT for GLP-1 drugs and alcoholism isn’t what we hoped” by dynomight | Feb 21, 2025 |
| “AI #104: American State Capacity on the Brink” by Zvi | Feb 21, 2025 |
| “Timaeus in 2024” by Jesse Hoogland, Stan van Wingerden, Alexander Gietelink Oldenziel, Daniel Murfet | Feb 21, 2025 |
| “Eliezer’s Lost Alignment Articles / The Arbital Sequence” by Ruby | Feb 20, 2025 |
| “Arbital has been imported to LessWrong” by RobertM, jimrandomh, Ben Pace, Ruby | Feb 20, 2025 |
| “Go Grok Yourself” by Zvi | Feb 20, 2025 |
| “How accurate was my ‘Altered Traits’ book review?” by lsusr | Feb 19, 2025 |
| “SuperBabies podcast with Gene Smith” by Eneasz | Feb 19, 2025 |
| “How might we safely pass the buck to AI?” by joshc | Feb 19, 2025 |
| “How to Make Superbabies” by GeneSmith, kman | Feb 19, 2025 |
| “Dear AGI,” by Nathan Young | Feb 18, 2025 |
| “Do models know when they are being evaluated?” by Govind Pimpale | Feb 18, 2025 |
| “AGI Safety & Alignment @ Google DeepMind is hiring” by Rohin Shah | Feb 17, 2025 |
| “A History of the Future, 2025-2040” by L Rudolf L | Feb 17, 2025 |
| “Thermodynamic entropy = Kolmogorov complexity” by EbTech | Feb 17, 2025 |
| “Celtic Knots on Einstein Lattice” by Ben | Feb 17, 2025 |
| “Gauging Interest for a Learning-Theoretic Agenda Mentorship Programme” by Vanessa Kosoy | Feb 16, 2025 |
| “It’s been ten years. I propose HPMOR Anniversary Parties.” by Screwtape | Feb 16, 2025 |
| “Microplastics: Much Less Than You Wanted To Know” by jenn, kaleb, Brent | Feb 16, 2025 |
| “A computational no-coincidence principle” by Eric Neyman | Feb 14, 2025 |
| “A short course on AGI safety from the GDM Alignment team” by Vika, Rohin Shah | Feb 14, 2025 |
| “The Mask Comes Off: A Trio of Tales” by Zvi | Feb 14, 2025 |
| “Ambiguous out-of-distribution generalization on an algorithmic task” by Wilson Wu, Experience Machine | Feb 14, 2025 |
| “≤10-year Timelines Remain Unlikely Despite DeepSeek and o3” by Rafael Harth | Feb 14, 2025 |
| “Self-dialogue: Do behaviorist rewards make scheming AGIs?” by Steven Byrnes | Feb 14, 2025 |
| “Murder plots are infohazards” by Chris Monteiro | Feb 13, 2025 |
| “How do we solve the alignment problem?” by Joe Carlsmith | Feb 13, 2025 |
| “Extended analogy between humans, corporations, and AIs.” by Daniel Kokotajlo | Feb 13, 2025 |
| “Virtue signaling, and the ‘humans-are-wonderful’ bias, as a trust exercise” by lc | Feb 13, 2025 |
| “My model of what is going on with LLMs” by Cole Wyeth | Feb 13, 2025 |
| “Skepticism towards claims about the views of powerful institutions” by tlevin | Feb 13, 2025 |
| “Why you maybe should lift weights, and How to.” by samusasuke | Feb 13, 2025 |
| “Not all capabilities will be created equal: focus on strategically superhuman agents” by benwr | Feb 13, 2025 |
| “The Paris AI Anti-Safety Summit” by Zvi | Feb 12, 2025 |
| “Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs” by Matrice Jacobine | Feb 12, 2025 |
| “Proof idea: SLT to AIT” by Lucius Bushnaq | Feb 11, 2025 |
| “On Deliberative Alignment” by Zvi | Feb 11, 2025 |
| “The News is Never Neglected” by lsusr | Feb 11, 2025 |
| “Nonpartisan AI safety” by Yair Halberstadt | Feb 11, 2025 |
| “Knocking Down My AI Optimist Strawman” by tailcalled | Feb 11, 2025 |
| “Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion?” by garrison | Feb 11, 2025 |
| “Levels of Friction” by Zvi | Feb 10, 2025 |
| “Reasons-based choice and cluelessness” by JesseClifton | Feb 10, 2025 |
| “On the Meta and DeepMind Safety Frameworks” by Zvi | Feb 09, 2025 |
| “Gary Marcus now saying AI can’t do things it can already do” by Benjamin_Todd | Feb 09, 2025 |
| “Two hemispheres - I do not think it means what you think it means” by Viliam | Feb 09, 2025 |
| “‘Think it Faster’ worksheet” by Raemon | Feb 09, 2025 |
| “A Problem to Solve Before Building a Deception Detector” by Eleni Angelou, lewis smith | Feb 08, 2025 |
| “Wild Animal Suffering Is The Worst Thing In The World” by omnizoid | Feb 08, 2025 |
| “Research directions Open Phil wants to fund in technical AI safety” by jake_mendel, maxnadeau, Peter Favaloro | Feb 08, 2025 |
| “Racing Towards Fusion and AI” by Jeffrey Heninger | Feb 08, 2025 |
| “So You Want To Make Marginal Progress...” by johnswentworth | Feb 08, 2025 |
| “How AI Takeover Might Happen in 2 Years” by joshc | Feb 07, 2025 |
| “Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas” by jake_mendel, maxnadeau, Peter Favaloro | Feb 06, 2025 |
| “MATS Applications + Research Directions I’m Currently Excited About” by Neel Nanda | Feb 06, 2025 |
| “Detecting Strategic Deception Using Linear Probes” by Nicholas Goldowsky-Dill, bilalchughtai, StefanHex, Marius Hobbhahn | Feb 06, 2025 |
| “Voting Results for the 2023 Review” by Raemon | Feb 06, 2025 |
| “The Risk of Gradual Disempowerment from AI” by Zvi | Feb 06, 2025 |
| “C’mon guys, Deliberate Practice is Real” by Raemon | Feb 06, 2025 |
| “Wired on: ’DOGE personnel with admin access to” by Raemon | Feb 05, 2025 |
| “Subjective Naturalism in Decision Theory: Savage vs. Jeffrey–Bolker” by Daniel Herrmann, Aydin Mohseni, ben_levinstein | Feb 05, 2025 |
| “Language Models Use Trigonometry to Do Addition” by Subhash Kantamneni | Feb 05, 2025 |
| “Reviewing LessWrong: Screwtape’s Basic Answer” by Screwtape | Feb 05, 2025 |
| “We’re in Deep Research” by Zvi | Feb 05, 2025 |
| “Meta: Frontier AI Framework” by Zach Stein-Perlman | Feb 05, 2025 |
| “Anti-Slop Interventions?” by abramdemski | Feb 05, 2025 |
| “Tear Down the Burren” by jefftk | Feb 04, 2025 |
| “o3-mini Early Days” by Zvi | Feb 04, 2025 |
| “OpenAI releases deep research agent” by Seth Herd | Feb 03, 2025 |
| “Pick two: concise, comprehensive, or clear rules” by Screwtape | Feb 03, 2025 |
| “2024 was the year of the big battery, and what that means for solar power” by transhumanist_atom_understander | Feb 03, 2025 |
| “Alderaan” by lsusr | Feb 02, 2025 |
| “The Simplest Good” by Jesse Hoogland | Feb 02, 2025 |
| “Gradual Disempowerment, Shell Games and Flinches” by Jan_Kulveit | Feb 02, 2025 |
| “Falsehoods you might believe about people who are at a rationalist meetup” by Screwtape | Feb 02, 2025 |
| “Some articles in ‘International Security’ that I enjoyed” by Buck | Feb 01, 2025 |
| “DeepSeek: Don’t Panic” by Zvi | Feb 01, 2025 |
| “The Failed Strategy of Artificial Intelligence Doomers” by Ben Pace | Jan 31, 2025 |
| “Will alignment-faking Claude accept a deal to reveal its misalignment?” by ryan_greenblatt | Jan 31, 2025 |
| “Catastrophe through Chaos” by Marius Hobbhahn | Jan 31, 2025 |
| “In response to critiques of Guaranteed Safe AI” by Nora_Ammann | Jan 31, 2025 |
| “AI #101: The Shallow End” by Zvi | Jan 31, 2025 |
| “What’s Behind the SynBio Bust?” by sarahconstantin | Jan 31, 2025 |
| “Steering Gemini with BiDPO” by TurnTrout | Jan 31, 2025 |
| “Thread for Sense-Making on Recent Murders and How to Sanely Respond” by Ben Pace | Jan 31, 2025 |
| “You should read Hobbes, Locke, Hume, and Mill via EarlyModernTexts.com” by Arjun Panickssery | Jan 31, 2025 |
| “Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development” by Jan_Kulveit, Raymond D, Nora_Ammann, Deger Turan, David Scott Krueger (formerly: capybaralet), David Duvenaud | Jan 30, 2025 |
| “A sketch of an AI control safety case” by Tomek Korbak, joshc, Benjamin Hilton, Buck, Geoffrey Irving | Jan 30, 2025 |
| “Anthropic CEO calls for RSI” by Andrea_Miotti | Jan 30, 2025 |
| “Planning for Extreme AI Risks” by joshc | Jan 29, 2025 |
| “Dario Amodei: On DeepSeek and Export Controls” by Zach Stein-Perlman | Jan 29, 2025 |
| “Operator” by Zvi | Jan 29, 2025 |
| “Open Problems in Mechanistic Interpretability” by Lee Sharkey, bilalchughtai | Jan 29, 2025 |
| “Fake thinking and real thinking” by Joe Carlsmith | Jan 29, 2025 |
| “DeepSeek Panic at the App Store” by Zvi | Jan 29, 2025 |
| “The Game Board has been Flipped: Now is a good time to rethink what you’re doing” by Alex Lintz | Jan 29, 2025 |