Audio narrations of LessWrong posts.
Twitter thread here. tl;dr When prompted, current models can sandbag ML experiments and research decisions without being detected by...
Audio note: this article contains 31 uses of latex notation, so the narration may be difficult to follow....
Epistemic status: Reasonably confident in the basic mechanism. Have you noticed that you keep encountering the same ideas over...
I’ve spent the past 7 years living in the DC area. I moved out there from the Pacific Northwest...
Audio note: this article contains 127 uses of latex notation, so the narration may be difficult to follow....
This is a link post. Download the latest PDF with links to court dockets here. ...
In this post, I'll list all the areas of control research (and implementation) that seem promising to me. This...
We recently released Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?, a major update to...
Last week I covered Anthropic's relatively strong submission, and OpenAI's toxic submission. This week I cover several other submissions,...
Ben Thompson interviewed Sam Altman recently about building a consumer tech company, and about the history of OpenAI. Mostly it...