Lots of Links
How to use this page: This lots-of-links page is intended to be used as a reference when you want to look up something specific.
AI Safety is a new and fast-growing research field. This means that everything is a bit of a mess. If you are new to the field and feel overwhelmed or don't know where to start, we recommend contacting AI Safety Quest for guidance.
Alternatively, just have a look at the highlighted links and ignore the rest.
Page maintenance: This page is sporadically maintained by Linda Linsefors. Please reach out if you want to help keep it up to date, or if you have other comments or suggestions.
linda.linsefors@gmail.com
Contents
Fellowships, Internships, Training Programs, Study Guides, etc
AI Safety Training <- A calendar showing all upcoming programs and other events.
List of some university courses on X-risk | Pablo's miscellany
Alignment Research Engineer Accelerator (ARENA)
At least some of the content can be found on their website for self-study
AI Safety Camp (AISC) <- Online, part-time research program.
Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS)
Swiss Existential Risk Initiative (CHERI), Research Fellowship
Center for Human-Compatible AI (CHAI), Research Fellowship, Collaborations, Internships
Center for the Governance of AI (GovAI), Research Fellows, Summer and Winter Fellowships
Legal Priorities Project, Summer Research Fellowship in Law & AI
ML for Alignment Bootcamp (MLAB) <- Not currently running, but you can sign up for news on future iterations or request access to their curriculum.
Top US policy master's programs <- Not AI Safety specific.
Self-study
Study Guide by John Wentworth
Reading Guide for the Global Politics of Artificial Intelligence
Introduction to AI Safety, Ethics, and Society textbook + lectures by Dan Hendrycks
"Key Phenomena in AI Risk" - Reading Curriculum Principal Curator: TJ (particlemania)
MIRI’s Research Guide (despite the name, it is actually more of a study guide)
OpenAI’s Spinning Up in Deep RL (teaches you AI rather than AI safety, but still useful)
List of AI Safety Technical Courses, Reading Lists, and Curriculums from Nonlinear
News and Community
List of communities
AI Safety Communities and University groups by Alignment Ecosystem Development
Newsletters
AI Safety Newsletter by Center for AI Safety
AI Safety Opportunities Newsletter by AI Safety Fundamentals
AI Safety in China by Concordia AI (安远AI)
Community Blogs
AI Safety Discussion - "This group is primarily for people who have experience in AI/ML and/or are familiar with AI safety. We encourage beginners to join the AI Safety Discussion (Open) group."
AI Safety Core by JJ Balisanyuka-Smith
lots more... I'm not on Twitter, so I need help with this section.
Other
AI Safety Google Group <- Used by various people for various announcements
AI Alignment Slack group <- Discussions, networking, etc.
AI Existential Safety Community from FLI
Some local groups
AI Safety Hub Oxford (I think?)
See also AI Safety Communities and University groups for more.
Career Advice and Job Search
General Career Advice
FAQ: Advice for AI alignment researchers by Rohin Shah
Beneficial AI Research Career Advice by Adam Gleave
Technical AI Safety Careers by AI Safety Fundamentals
Levelling Up in AI Safety Research Engineering by Gabriel Mukobi
Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety) by Andrew Critch
PhD Advice
Should you do a PhD? by Linda
Leveraging Academia and Deliberate Grad School by Andrew Critch
A Survival Guide to a PhD by Andrej Karpathy
There are more non-public resources for finding an AI Safety (or AI Safety friendly) PhD position. Contact Linda for more info.
How to Write an Email to a Potential Ph.D. Advisor/Professor
Jobs, etc
80k's Job Board WARNING: This job board includes listings of jobs at AI capabilities labs such as OpenAI. Please don't apply for these jobs. Even so-called safety roles at these labs should not be assumed to be good places to work for someone who cares about AI safety.
Many job openings are posted in #opportunities in the AI Alignment Slack
Career Coaching
Arkose <- Career coaching for machine learning professionals interested in contributing to technical AI safety work.
Free Coaching for Early/Mid-Career Software Engineers with Jeffrey Yun
Coaching, Mentoring, Mental Health Support, etc
Research Maps and Reviews
Research Agendas
Technical AI safety
MIRI: Agent Foundations for Aligning Machine Intelligence with Human Interests (2017) and Alignment for Advanced Machine Learning Systems research agendas
CLR: Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda (+ includes some questions related to AI governance)
Paul Christiano’s research agenda summary (and FAQ and talk) (2018)
Synthesising a human's preferences into a utility function (example use and talk), Stuart Armstrong (2019)
The Learning-Theoretic AI Alignment Research Agenda, Vanessa Kosoy (2018)
Research Priorities for Robust and Beneficial Artificial Intelligence, Stuart Russell, Daniel Dewey, Max Tegmark (2016)
Concrete Problems in AI Safety, Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané (2016)
AGI Safety Literature Review, Tom Everitt, Gary Lea, Marcus Hutter (2018)
AI Services as a Research Paradigm, Vojta Kovarik (2020)
Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems, Sandhya Saisubramanian, Shlomo Zilberstein, Ece Kamar (2020)
AI Research Considerations for Human Existential Safety (ARCHES), Andrew Critch, David Krueger (2020)
How do we become confident in the safety of a machine learning system? by Evan Hubinger
Research Agenda: Using Neuroscience to Achieve Safe and Beneficial AGI by Steve Byrnes
Unsolved Problems in ML Safety by Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt
Foundational Challenges in Assuring Alignment and Safety of Large Language Models, Usman Anwar, et al.
AI governance
AI Impacts: promising research projects and possible empirical investigations
Governance of AI program at FHI: Allan Dafoe's AI governance research agenda
Center for a New American Security: Artificial Intelligence and Global Security Initiative Research Agenda
FLI: A survey of research questions for robust and beneficial AI (+ some aspects also fall into technical AI safety)
Luke Muehlhauser’s list of research questions to improve our strategic picture of superintelligence (2014)
Books, papers, podcasts, videos
(Non-exhaustive list of AI Safety material)
Books
Introduction to AI Safety, Ethics, and Society (textbook) by Dan Hendrycks, 2024
Taming the Machine: Ethically Harness the Power of AI by Nell Watson, 2024
Uncontrollable: The Threat of Artificial Superintelligence and the Race to Save the World by Darren McKee, 2023
The Alignment Problem by Brian Christian, 2020
Human Compatible by Stuart Russell, 2019
Reframing Superintelligence by Eric Drexler, 2019
The AI Does Not Hate You: Superintelligence, Rationality and the Race to Save the World by Tom Chivers, 2019
Artificial Intelligence Safety and Security by Roman Yampolskiy, 2018
Superintelligence: Paths, Dangers, Strategies by Nick Bostrom, 2014
Some other reading
Victoria Krakovna's AI safety resources (contains a list of motivational resources and key papers for some AI Safety subfields)
Pragmatic AI Safety by Thomas Woodside
X-Risk Analysis for AI Research by Dan Hendrycks, Mantas Mazeika
Podcasts
Alignment Newsletter Podcast (Robert Miles reads the Alignment Newsletter)
80k's Podcast (Effective Altruism podcast with some AI Safety episodes)
Quinn’s Technical AI Safety Podcast
The Nonlinear Library <- a repository of text-to-speech content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs
YouTube
(Most of these channels are a mix of AI Safety content and other content)
Robert Miles discusses AI on Computerphile and on Robert Miles's own YouTube channel
SlateStarCodex Meetups (recorded talks)
Other Videos
Aisafety.video <- A list of very many videos
CEA Artificial Intelligence playlist - YouTube
AI Safety Research Groups
(Many of these groups do a combination of AI Safety and other X-risks.)
There are many academic and independent researchers interested in AI Safety who are not covered by this list. We are not going to list specific individuals publicly, so please contact us if you want to find more AI Safety researchers.
Technical AI safety
Center for Human-Compatible Artificial Intelligence (CHAI), University of California, Berkeley
Future of Humanity Institute (FHI), University of Oxford
Center on Long-Term Risk (CLR), London
Ought, San Francisco
Redwood Research, Berkeley
AISafety.com - A Startup for Aligning Narrowly Superhuman Models
Organisations working on both technical safety and capabilities
(These orgs employ people who are doing valuable technical AI safety research, which is good. But they also employ people doing AI capabilities research, which is bad since it speeds up AI development and reduces the time we have left to solve AI safety.)
AI governance
The Center for the Study of Existential Risk (CSER), University of Cambridge
Future of Humanity Institute (FHI), University of Oxford
Global Catastrophic Risk Institute (GCRI), various locations
Median Group, Berkeley
Center for Security and Emerging Technology (CSET), Washington
AI companies which also do some safety work, both technical and governance
Forecast and strategy
Future of Humanity Institute (FHI), University of Oxford
Convergence Analysis, moving around
Leverhulme Center for the Future of Intelligence (CFI), University of Cambridge
Outreach and advocacy
Existential Risk Observatory, collects and spreads information about existential risks
Future of Life Institute (FLI), outreach, podcast and conferences
Funding
Grants
The Center on Long-Term Risk Fund (CLR Fund) <- S-risk focused
Survival and Flourishing Fund (SFF) <- awards and facilitates grants to existing charities.
Survival and Flourishing (SAF) <- not currently active
Career development and transition funding | Open Philanthropy
Open Philanthropy Undergraduate Scholarship | Open Philanthropy
AI Safety: Neuro/Security/Cryptography/Multipolar Approaches | Foresight Institute
Fundraising platforms
Housing
CEEALAR / EA Hotel is a group house in Blackpool, UK, which provides free food and housing for people working on Effective Altruist projects (including AI Safety) for up to two years
Other lists
Other
AI Safety Support (AISS) <- That's us!
Lightcone Infrastructure, building tech, infrastructure, and community
Berkeley Existential Risk Initiative (BERI), support for academic researchers