World's Largest Materials Database Now Open

McCormick team’s open quantum materials database offers unlimited access to analyses of nearly 300,000 compounds

A network graph, called a “minimum spanning tree,” showing the 7,410 predicted stable compounds from the Open Quantum Materials Database. Since this image was completed, the number of predicted compounds has increased.

Christopher Wolverton was frustrated. His search for potential materials for new, stronger structural alloys and advanced battery electrodes was becoming far too time-consuming and difficult because most materials databases denied him access.

“I just wanted to download some data, but I couldn’t,” says Wolverton, McCormick professor of materials science and engineering. “So my students and I decided to build our own materials database.”

Two years and several false starts later, Wolverton’s group at McCormick has created the largest materials database in the world. The Open Quantum Materials Database (OQMD) launched in November 2013 and has been growing since. It is entirely open to the public and can be downloaded online.

Taking guesswork out of new materials design

When researchers like Wolverton want to create better batteries, solar cells, and medical devices, they often look for answers in new materials. Materials with optimal properties can improve existing technologies and spark ideas for new ones. But finding materials that have just the right properties can take many years of trial and error.

“Suppose you want to find a material that would make a good solar cell, but you don’t have a design strategy,” Wolverton says. “You would have to explore in the dark.”

The OQMD takes some of the guesswork out of designing new materials. Its purpose is to identify candidate materials for specific applications by screening them for various properties before they are tested in the lab. This dramatically accelerates the search, narrowing the pool of candidate materials to a handful that require further experimentation.

“The calculations are faster and easier with less cost than conducting experiments,” Wolverton says. “And it’s all on computers, so users can explore things—like toxic elements and radioactive elements—that they probably wouldn’t want to do in their labs.”

How it works

The OQMD allows users to search for materials by composition, create phase diagrams, determine ground state compositions, and visualize crystal structures. Wolverton says his group has also implemented machine-learning models, trained on the database, that can learn chemistry and predict the possible existence of new compounds that have not yet been synthesized.

“Using sophisticated data mining, we could turn materials science into a big data problem,” he says. “We could use algorithms to make recommendations for materials the same way Netflix recommends movies you might like.”

The team used Northwestern’s high-performance computer cluster, Quest, to construct most of the database. So far, it contains analyses of 285,780 compounds and continues to grow.
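
To make "determining ground state compositions" more concrete, here is a minimal sketch of one common way to do it for a two-element system: take the lower convex hull of formation energy plotted against composition. This is an illustration only, not the OQMD's own code. The function name `ground_states`, the phase names, and all energy values are hypothetical.

```python
def cross(o, a, b):
    """2D cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def ground_states(phases):
    """Return the names of phases on the lower convex hull, ordered by composition.

    phases maps a name to (x_B, formation_energy), where x_B is the fraction
    of element B (0 to 1) and formation_energy is in eV per atom.
    """
    # Keep only the lowest-energy phase at each composition.
    lowest = {}
    for name, (x, energy) in phases.items():
        if x not in lowest or energy < lowest[x][1]:
            lowest[x] = (x, energy, name)

    # Andrew's monotone chain, lower hull only: walk left to right and drop
    # any point that ends up on or above the line joining its neighbors.
    hull = []
    for point in sorted(lowest.values()):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], point) <= 0:
            hull.pop()
        hull.append(point)
    return [name for _, _, name in hull]


# Hypothetical A-B system; these energies are made up for illustration.
PHASES = {
    "A": (0.00, 0.00),     # pure element, zero by definition
    "A3B": (0.25, -0.10),
    "AB": (0.50, -0.40),
    "AB3": (0.75, -0.30),
    "B": (1.00, 0.00),     # pure element, zero by definition
}

stable = ground_states(PHASES)
print(stable)
```

In this made-up example, A3B is left off the list because a mixture of A and AB at the same overall composition would have lower energy. Phases on the hull are the predicted ground states, and the same construction underlies a zero-temperature phase diagram.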

Keeping it open keeps adding value

Since the OQMD launched, other institutions have started to make their databases public, but many remain closed. Wolverton says closed databases can only be used the way their creators intended. Keeping the OQMD open lets more people use it, add their own compounds, and grow its potential.

“People will use the database in ways that we couldn’t possibly imagine right now,” he says. “People will improve it, change it, and use it in different ways. They will search for applications of materials that my group isn’t interested in, and that’s great. They will get value out of it that we never would have.”

Even though Wolverton cannot predict how others might use the database, one thing is certain: it will remain open to the public. “Our philosophy from the very beginning was that our database should be open,” he says. “No one should have to repeat this work ever again.”