Study Finds Simple Method Can Improve Safety for Language Models

Researchers with the Northwestern Center for Advancing Safety of Machine Intelligence (CASMI) have found a simple, effective way to identify unknown and anomalous data for language models, a method that improves their safety and reliability.

The findings will be presented on July 12 at the Association for Computational Linguistics (ACL) conference in a paper titled “Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection.” Its authors, all affiliated with the University of Wisconsin-Madison, are Rheeya Uppaal, a PhD student in computer science; Junjie Hu, assistant professor of biostatistics & medical informatics and of computer science; and Sharon Li, assistant professor of computer science and principal investigator of the CASMI-funded project, “Understanding and Reducing Safety Risks of Learning with Large Pre-Trained Models.”
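To make the idea concrete, the sketch below shows one common form of distance-based out-of-domain detection using embeddings from a pre-trained language model with no fine-tuning, in the spirit of the paper's title. The model name, pooling strategy, example texts, and scoring details here are illustrative assumptions, not details taken from the paper or this story.

```python
# A minimal sketch, assuming a Hugging Face encoder and Mahalanobis-style
# scoring; these choices are illustrative, not the paper's exact recipe.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "roberta-base"  # assumed; any pre-trained encoder could stand in
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).eval()

def embed(texts):
    """Mean-pool the final hidden states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, tokens, dim)
    mask = batch["attention_mask"].unsqueeze(-1)        # (batch, tokens, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)       # (batch, dim)
    return pooled.numpy()

# Fit a Gaussian to embeddings of known, in-domain texts (toy examples).
in_domain = [
    "the movie was wonderful",
    "a dull, forgettable film",
    "great acting and a sharp script",
    "the plot made no sense",
]
X = embed(in_domain)
mu = X.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))  # pinv: tiny sample is degenerate

def ood_score(text):
    """Mahalanobis distance from the in-domain Gaussian; higher = more anomalous."""
    d = embed([text])[0] - mu
    return float(np.sqrt(d @ cov_inv @ d))

for text in [
    "an uneven but charming comedy",                    # in-domain
    "the quarterly bond yield rose 40 basis points",    # out-of-domain
]:
    print(f"{ood_score(text):8.2f}  {text}")
```

In a setup like this, texts whose embeddings sit far from the in-domain distribution receive high scores and can be flagged as unknown or anomalous before the model acts on them; the threshold for flagging would be calibrated on held-out data.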

Read the original news story on the CASMI website: https://casmi.northwestern.edu/news/articles/2023/study-finds-simple-method-can-improve-safety-for-language-models.html
