Recognizing Sound in Machine Learning

Professor Bryan Pardo understands the role sound plays in daily life and why artificial intelligence tools that perceive sound are more important than ever before.

Professor Bryan Pardo heads the Interactive Audio Lab at Northwestern University, where he and his colleagues combine machine learning, signal processing, and human-computer interaction to build tools for understanding and manipulating sound.

"I get to work with world-class students and researchers at places like Google, Mitsubishi, Sony, Adobe, and Descript on cutting-edge tech that is transformative for how we find, label, separate, make, and modify sounds," Pardo said.

Pardo is a professor in Northwestern Engineering's Computer Science Department, and he routinely teaches students in the Master of Science in Artificial Intelligence (MSAI) program.

Pardo taught an overview course on machine learning this past year, as well as a course on machine perception of music and audio. As machine learning and artificial intelligence become more prevalent in society, tools that help people understand and manipulate sound will only grow in importance.

"Speech is our primary form of communication, more so than text," Pardo said. "Music is one of our most important art forms. Videos, despite the name, are as much about sound as they are about image." 

Pardo is also the co-director of Northwestern's Center for Human-Computer Interaction + Design, where he helps bring together researchers and practitioners from across the University to study, design, and develop the future of human and computer interaction at home, work, and play.

There are already countless examples of human-computer interaction (HCI) that rely on artificial intelligence and machine learning and affect the average person every day: digital voice assistants that respond to spoken requests, search engines that recognize images and find similar ones, and machine translation of text between languages. As those tools and our understanding of them continue to develop, the number of such examples will grow.

Pardo will explore some of those possibilities in the spring when he teaches Digital Music Instrument Design (also known as Digital Luthier), a course that studies HCI through the lens of artistic creation.

While Pardo is excited about the possibilities that come with machine learning, he also recognizes that it carries inherent risks that need to be addressed.

"Ever increasing deep model sizes with ever increasing needs for data and computational resources are a problem we'll need to deal with," Pardo said. "The use of deep models that cannot explain why they make the choices they make will also become increasingly problematic as they are put in more mission-critical positions, such as self-driving cars, medical diagnosis, and loan approval."

McCormick News Article