In The Space
Research and white papers
There are hundreds of smart people focused on issues related to AI trust and safety. The following is a sample of the great work being done.
Model Development
General
Liquid Foundation Models: Our First Series of Generative AI Models: Could Liquid Foundation Models (LFMs) replace Transformer-based models?
Building Socio-culturally Inclusive Stereotype Resources with Community Engagement
All that Agrees Is Not Gold: Evaluating Ground Truth Labels and Dialogue Content for Safety
A Framework to Assess (Dis)agreement Among Diverse Rater Groups
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts
An Insider’s Guide to Designing and Operationalizing a Responsible AI Governance Framework
The History and Risks of Reinforcement Learning and Human Feedback