Definition
Also known as: AI alignment, AI risk, responsible AI
The field of research and practice focused on ensuring that artificial intelligence systems behave as intended and do not cause unintended harm. It encompasses alignment (ensuring AI goals match human values), robustness (maintaining reliability under unexpected conditions), interpretability (understanding how AI systems make decisions), and governance (institutional frameworks for responsible development and deployment).
THE LONG VIEW Glossary