Lots of Links
How to use this page: This lots-of-links page is intended to be used as a reference when you want to look up something specific.
AI Safety is a new and fast-growing research field. This means that everything is a bit of a mess. If you are new to the field and feel overwhelmed or don't know where to start, we recommend contacting AI Safety Quest for guidance.
Alternatively, just have a look at the highlighted links and ignore the rest.
Page maintenance: This page is sporadically maintained by Linda Linsefors. Please reach out if you want to help keep it up to date, or if you have other comments or suggestions.
linda.linsefors@gmail.com
Contents
Fellowships, Internships, Training Programs, Study Guides, etc
AI Safety Training <- A calendar showing all upcoming programs and other events.
List of some university courses on X-risk | Pablo's miscellany
Alignment Research Engineer Accelerator (ARENA)
At least some of the content can be found on their website for self-study
AI Safety Camp (AISC) <- Online, part-time research program.
Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS)
Swiss Existential Risk Initiative (CHERI), Research Fellowship
Center for Human-Compatible AI (CHAI), Research Fellowship, Collaborations, Internships
Center for the Governance of AI (GovAI), Research Fellows, Summer and Winter Fellowships
Legal Priorities Project, Summer Research Fellowship in Law & AI
ML for Alignment Bootcamp (MLAB) <- Not currently running, but you can sign up for news on future iterations or request access to their curriculum.
Top US policy master's programs <- Not AI Safety specific.
Self-study
Study Guide by John Wentworth
Reading Guide for the Global Politics of Artificial Intelligence
Introduction to AI Safety, Ethics, and Society textbook + lectures by Dan Hendrycks
"Key Phenomena in AI Risk" - Reading Curriculum Principal Curator: TJ (particlemania)
MIRI’s Research Guide (despite the name, it is actually more of a study guide)
OpenAI’s Spinning Up in Deep RL (teaches you AI rather than AI safety, but still useful)
List of AI Safety Technical Courses, Reading Lists, and Curriculums from Nonlinear
News and Community
List of communities
AI Safety Communities and University groups by Alignment Ecosystem Development
Newsletters
AI Safety Newsletter by Center for AI Safety
AI Safety Opportunities Newsletter by AI Safety Fundamentals
AI Safety in China by Concordia AI (安远AI)
Community Blogs
AI Safety Discussion - "This group is primarily for people who have experience in AI/ML and/or are familiar with AI safety. We encourage beginners to join the AI Safety Discussion (Open) group."
AI Safety Core by JJ Balisanyuka-Smith
lots more... I'm not on Twitter, so I need help with this section.
Other
AI Safety Google Group <- Used by various people for various announcements
AI Alignment Slack group <- Discussions, networking, etc.
AI Existential Safety Community from FLI
Some local groups
AI Safety Hub Oxford (I think?)
See also AI Safety Communities and University groups for more.
Career Advice and Job Search
General Career Advice
FAQ: Advice for AI alignment researchers by Rohin Shah
Beneficial AI Research Career Advice by Adam Gleave
Technical AI Safety Careers by AI Safety Fundamentals
Levelling Up in AI Safety Research Engineering by Gabriel Mukobi
Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety) by Andrew Critch
PhD Advice
Should you do a PhD? by Linda
Leveraging Academia and Deliberate Grad School by Andrew Critch
A Survival Guide to a PhD by Andrej Karpathy
There are more non-public resources for finding an AI Safety (or AI Safety friendly) PhD position. Contact Linda for more info.
How to Write an Email to a Potential Ph.D. Advisor/Professor
Jobs, etc
80k's Job Board WARNING: This job board includes listings of jobs at AI capabilities labs such as OpenAI. Please don't apply for these jobs. Even so-called safety roles at these labs should not be assumed to be good places to work for someone who cares about AI safety.
Many job openings are posted in #opportunities in the AI Alignment Slack
Career Coaching
Arkose <- Career coaching for machine learning professionals interested in contributing to technical AI safety work.
Free Coaching for Early/Mid-Career Software Engineers with Jeffrey Yun
Coaching, Mentoring, Mental Health Support, etc
Research Maps and Reviews
Research Agendas
Technical AI safety
MIRI: Agent Foundations for Aligning Machine Intelligence with Human Interests (2017) and Alignment for Advanced Machine Learning Systems research agendas
CLR: Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda (+ includes some questions related to AI governance)
Paul Christiano’s research agenda summary (and FAQ and talk) (2018)
Synthesising a human's preferences into a utility function (example use and talk), Stuart Armstrong (2019)
The Learning-Theoretic AI Alignment Research Agenda, Vanessa Kosoy (2018)
Research Priorities for Robust and Beneficial Artificial Intelligence, Stuart Russell, Daniel Dewey, Max Tegmark (2016)
Concrete Problems in AI Safety, Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané (2016)
AGI Safety Literature Review, Tom Everitt, Gary Lea, Marcus Hutter (2018)
AI Services as a Research Paradigm, Vojta Kovarik (2020)
Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems, Sandhya Saisubramanian, Shlomo Zilberstein, Ece Kamar (2020)
AI Research Considerations for Human Existential Safety (ARCHES), Andrew Critch, David Krueger (2020)
How do we become confident in the safety of a machine learning system? by Evan Hubinger
Research Agenda: Using Neuroscience to Achieve Safe and Beneficial AGI by Steve Byrnes
Unsolved Problems in ML Safety by Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt
Foundational Challenges in Assuring Alignment and Safety of Large Language Models, Usman Anwar, et al.
AI governance
AI Impacts: promising research projects and possible empirical investigations
Governance of AI program at FHI: Allan Dafoe's AI governance research agenda
Center for a New American Security: Artificial Intelligence and Global Security Initiative Research Agenda
FLI: A survey of research questions for robust and beneficial AI (+ some aspects also fall into technical AI safety)
Luke Muehlhauser’s list of research questions to improve our strategic picture of superintelligence (2014)
Books, papers, podcasts, videos
(Non-exhaustive list of AI Safety material)
Books
Introduction to AI Safety, Ethics, and Society (textbook) by Dan Hendrycks, 2024
Taming the Machine: Ethically Harness the Power of AI by Nell Watson, 2024
Uncontrollable: The Threat of Artificial Superintelligence and the Race to Save the World by Darren McKee, 2023
The Alignment Problem by Brian Christian, 2020
Human Compatible by Stuart Russell, 2019
Reframing Superintelligence by Eric Drexler, 2019
The AI Does Not Hate You: Superintelligence, Rationality and the Race to Save the World by Tom Chivers, 2019
Artificial Intelligence Safety and Security by Roman Yampolskiy, 2018
Superintelligence: Paths, Dangers, Strategies by Nick Bostrom, 2014
Some other reading
Victoria Krakovna's AI safety resources (contains a list of motivational resources and key papers for some AI Safety subfields)
Pragmatic AI Safety by Thomas Woodside
X-Risk Analysis for AI Research by Dan Hendrycks, Mantas Mazeika
Podcasts
Alignment Newsletter Podcast (Robert Miles reads the Alignment Newsletter)
80k's Podcast (Effective Altruism podcast with some AI Safety episodes)
Quinn’s Technical AI Safety Podcast
The Nonlinear Library <- a repository of text-to-speech content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs
YouTube
(Most of these channels are a mix of AI Safety content and other content)
Robert Miles discusses AI on Computerphile and on Robert Miles's own YouTube channel
SlateStarCodex Meetups (recorded talks)
Other Videos
Aisafety.video <- A list of very many videos
CEA Artificial Intelligence playlist - YouTube
AI Safety Research Groups
(Many of these groups do a combination of AI Safety and other X-risks.)
There are many academic and independent researchers interested in AI Safety who are not covered by this list. We are not going to list specific individuals publicly, so please contact us if you want to find more AI Safety researchers.
Technical AI safety
Center for Human-Compatible Artificial Intelligence (CHAI), University of California, Berkeley
Future of Humanity Institute (FHI), University of Oxford
Center on Long-Term Risk (CLR), London
Ought, San Francisco
Redwood Research, Berkeley
AISafety.com - A Startup for Aligning Narrowly Superhuman Models
Organisations working on both technical safety and capabilities
(These orgs employ people who are doing valuable technical AI safety research, which is good. But they also employ people doing AI capabilities research, which is bad since it speeds up AI development and reduces the time we have left to solve AI safety.)
AI governance
The Center for the Study of Existential Risk (CSER), University of Cambridge
Future of Humanity Institute (FHI), University of Oxford
Global Catastrophic Risk Institute (GCRI), various locations
Median Group, Berkeley
Center for Security and Emerging Technology (CSET), Washington
AI companies which also do some safety work, both technical and governance
Forecast and strategy
Future of Humanity Institute (FHI), University of Oxford
Convergence Analysis, moving around
Leverhulme Center for the Future of Intelligence (CFI), University of Cambridge
Outreach and advocacy
Existential Risk Observatory, collects and spreads information about existential risks
Future of Life Institute (FLI), outreach, podcast and conferences
Funding
Grants
The Center on Long-Term Risk Fund (CLR Fund) <- S-risk focused
Survival and Flourishing Fund (SFF) <- awards and facilitates grants to existing charities.
Survival and Flourishing (SAF) <- not currently active
Career development and transition funding | Open Philanthropy
Open Philanthropy Undergraduate Scholarship | Open Philanthropy
AI Safety: Neuro/Security/Cryptography/Multipolar Approaches | Foresight Institute
Fundraising platforms
Housing
CEEALAR / EA Hotel is a group house in Blackpool, UK, which provides free food and housing for people working on Effective Altruist projects (including AI Safety) for up to two years
Other lists
Other
AI Safety Support (AISS) <- That's us!
Lightcone Infrastructure, building tech, infrastructure, and community
Berkeley Existential Risk Initiative (BERI), support for academic researchers