Answering How to Collect Data for Machine Learning

Researchers showed how robots can purposefully collect data to learn about the surrounding environment


People frequently hear how artificial intelligence (AI) and machine learning (ML) technologies are going to change the world. What they hear less often is that most of what these systems learn comes from data provided by the general public, and how much data the scientific community actually has.

Recent work from Northwestern Engineering researchers could provide some answers.

In “Mechanical Intelligence for Learning Embodied Sensor-Object Relationships,” published July 15 in Nature Communications, the McCormick School of Engineering’s Todd Murphey and former PhD student Ahalya Prabhakar demonstrated how robots can collect data in environments they may not have seen before, encountering novel objects as they do so. This work shows how robots can respond to that novelty by purposefully collecting data to learn about the environment, rather than passively collecting data as they go about other tasks.

To reach this conclusion, the researchers developed an algorithmic method for collecting data for learning, built on the idea that how a robot moves determines what it learns. The experiments showed that this method works across robot sensor types, indicating that sensor-specific methods are not necessary and that new types of sensors that do not depend on human-like sensation can be used for perception.

“Right now, the prevailing view in machine learning is that the data we already accumulate will be sufficient for everything we want to learn. And it is true that we want data — lots of it. We'd like it to be unbiased, and we'd like it whenever we encounter something novel,” Murphey said. “Robots are not just going to need to collect data for their own effectiveness, they are also going to collect data for all these data centers that we use for machine learning in general.  

“In the long run, anything data-intensive is going to rely on autonomy to gather more data as the world changes and as learning requirements become more nuanced.”

Murphey is a professor of mechanical engineering at the McCormick School of Engineering and a member of the Center for Robotics and Biosystems. Prabhakar, the first author of the paper and a former member of Murphey’s lab, is now a postdoctoral researcher at the Swiss Federal Institute of Technology Lausanne in Switzerland.

Murphey’s group has a history of working on active learning in robotics, trying to understand what robots do mechanically in order to learn better. Murphey said the idea of embodied intelligence is not new, but his group has made strides toward automating embodied intelligence so robots can learn about the world on their own.

The work, Murphey said, will continue.

“We are currently in the midst of experiments using these ideas,” Murphey said, “and are partnering with industry collaborators to apply these ideas in real-time hardware using high-performance computing.”