Self-study

These curricula and reading lists enable you to dive deeper into AI safety through independent learning.

AI Alignment Forum: Curated Sequences

List of sequences curated by the AI Alignment Forum team, featuring work from Richard Ngo, Paul Christiano, and others.

Category

Technical Alignment

Created by

Various

Agent Foundations for Superintelligence-Robust Alignment

Guide to the school of thought which expects that solving alignment in a way that scales to superintelligence will require well-specified proposals guided by strong theoretical understanding.

Category

Technical Alignment

Created by

Alignment Ecosystem Development (AED)

DeepMind AGI Safety Short Course

75-minute course covering alignment problems expected as AI capabilities advance. Includes recorded talks, exercises, and an accompanying workbook.

Category

Introductory

Created by

Google DeepMind

AI Safety Atlas

Textbook systematically mapping the landscape of AI safety knowledge. Ideas are laid out in a carefully structured narrative where concepts build naturally on previous ones.

Category

Technical Alignment, Governance

Created by

Centre pour la Sécurité de l'IA (CeSIA)

AI Safety, Ethics and Society (AISES)

Covers a wide range of risks, including loss of control, misalignment, and malicious use, while leveraging concepts and frameworks from existing research fields to analyze AI safety.

Category

Introductory

Created by

Center for AI Safety (CAIS)

Intro to Transformative AI

Intensive 5-unit course helping people fast-track their understanding of AI’s impact on society and take their first steps toward contributing to AI safety.

Category

Introductory

Created by

BlueDot Impact

CS 2881: AI Safety

Graduate-level course at Harvard University examining technical challenges in AI safety, including adversarial robustness, jailbreaks, interpretability, and potential catastrophic capabilities.

Category

Technical Alignment

Created by

Boaz Barak

Deep Dive

Designed as a 201 AI policy course, helping you find your footing in the vast AI governance landscape by engaging with its various dimensions.

Category

Governance

Created by

European Network for AI Safety (ENAIS) and AI Safety Hungary

Language Models and Intelligent Agentic Systems

Series of 16 lectures explaining how language model systems are built, in order to understand and predict their behaviour.

Category

Technical Alignment

Created by

Meridian and C2D3

Future of AI

Very short, engaging introduction to AI and its potential impacts. Good entry point to share with laypeople curious about AI.

Category

Introductory

Created by

BlueDot Impact

AGI Strategy

The decisions being made today determine whether AI liberates humanity, destabilises it, or worse. This course aims to prepare participants to be part of those decisions.

Category

Strategy

Created by

BlueDot Impact

Introduction to ML Safety

PhD-level survey course on technical ML safety. Covers various technical topics to reduce existential risks from AI, including robustness, monitoring, control, and systemic safety.

Category

Technical Alignment

Created by

Center for AI Safety (CAIS)

Introduction to AI Safety

What is safe AI, and how do we make it? This Stanford University course explores that question, focusing on the technical challenges of creating reliable, ethical, and aligned AI systems.

Category

Introductory, Technical Alignment

Created by

Stanford University

Alignment Research Engineer Accelerator (ARENA)

Facilitates upskilling in machine learning engineering for the purpose of contributing directly to AI alignment in technical roles.

Category

Technical Alignment

Created by

ARENA

Introduction to Cooperative AI

A primer on cooperative AI, a research field focused on improving the cooperative intelligence of advanced AI for the benefit of all.

Category

Introductory

Created by

Cooperative AI Foundation (CAIF)

MATS AI Safety Strategy Curriculum

As part of their program, MATS runs a series of discussion groups focused on topics relevant to prioritizing research in AI safety. This is the curriculum.

Category

Technical Alignment, Governance

Created by

ML Alignment & Theory Scholars (MATS)

Victoria Krakovna: AI Alignment Resources

Intended to help people get up to speed on the main ideas in AI alignment, i.e. ensuring advanced AI systems do what we want them to do.

Category

Technical Alignment

Created by

Victoria Krakovna

CHAI: Annotated Bibliography of Recommended Materials

Reading list discussing ideas CHAI believes warrant further attention and research for the purpose of developing human-compatible AI.

Category

Technical Alignment

Created by

Center for Human-Compatible AI (CHAI)

Levelling Up in AI Safety Research Engineering

Level-based guide for independently upskilling in AI safety research engineering, offering concrete objectives, milestones, and resources to help anyone go from zero to hero.

Category

Technical Alignment

Created by

Stanford AI Alignment (SAIA)

Learning-Theoretic Agenda Reading List

Vanessa Kosoy's math-heavy reading list for people interested in studying the learning-theoretic agenda.

Category

Technical Alignment

Created by

Association for Long-Term Existence and Resilience (ALTER)

Arkose: AI Safety Papers

Papers, blog posts, talks, newsletters, and guides relevant to improving the safety of advanced AI systems in order to reduce large-scale risks.

Category

Technical Alignment, Governance

Created by

Arkose

Reading What We Can

Collection of books and articles for a 20-day reading challenge. Covers AI safety basics, ML engineering, ML upskilling, and AI safety-relevant sci-fi.

Category

Introductory

Created by

Apart Research

Existential Risks Introductory Course (ERIC)

Each section covers a different topic related to existential risks. Toby Ord's book 'The Precipice' forms the majority of the core reading material.

Category

Introductory

Created by

Cambridge Long View Initiative (CLVI)

Economics of Transformative AI

Course designed for economists who want to develop their understanding of transformative AI and its economic impacts.

Category

Strategy

Created by

BlueDot Impact

Key Phenomena in AI Risk

Introduces AI risk factors arising from misdirected optimization or consequentialist cognition, while aiming to remain largely agnostic about solution paradigms.

Category

Technical Alignment

Created by

Principles of Intelligence