ENGINEERING NEWS

Amazon Review Analysis Wins MSiA Hackathon

The Master of Science in Analytics program partnered with Teradata Aster for the fourth annual competition

Members of the winning MSiA teams, Teradata representatives, and Hackathon judges.
Students spent eight hours analyzing large datasets to uncover unique insights.

There are millions of reviews for products on Amazon.com. What makes one more helpful than another? Master of Science in Analytics (MSiA) students Lauren Yu and Jessica Chan dove deep into the data to find out.

Winners of MSiA’s fourth annual Hackathon on Tuesday, May 2, Yu and Chan analyzed 500,000 Amazon reviews to pinpoint the characteristics that best predicted a review’s “helpfulness.” The team built a series of analytical models to break down a subset of the data consisting of all reviews that had at least five “helpful” notifications. They then developed a “helpfulness ratio” that allowed them to compare the role of variables like word count and readability.
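The article does not spell out how the team computed its “helpfulness ratio.” As a rough illustration only, the sketch below assumes the ratio is the share of a review's votes that were marked helpful, and it applies the five-vote cutoff described above. The `Review` class, its field names, and the sample reviews are hypothetical and are not taken from the team's work or from the Teradata Aster platform.

```python
from dataclasses import dataclass

# Cutoff taken from the article: only reviews with at least five
# "helpful" notifications were kept in the subset.
MIN_HELPFUL_VOTES = 5


@dataclass
class Review:
    """Hypothetical record; the field names are assumptions."""
    text: str
    helpful_votes: int
    total_votes: int


def eligible(reviews):
    """Keep only reviews that meet the five-helpful-vote cutoff."""
    return [r for r in reviews if r.helpful_votes >= MIN_HELPFUL_VOTES]


def helpfulness_ratio(review):
    """Assumed definition: helpful votes divided by total votes."""
    if review.total_votes == 0:
        return 0.0
    return review.helpful_votes / review.total_votes


def word_count(review):
    """One of the variables the article names: review length in words."""
    return len(review.text.split())


# Made-up sample reviews, for illustration only.
sample = [
    Review("Great blender, much quieter than my old one.", 8, 10),
    Review("Bad", 1, 4),
]

for r in eligible(sample):
    print(word_count(r), round(helpfulness_ratio(r), 2))
```

With a ratio like this in hand, variables such as word count or readability could be compared against it across the filtered subset, which is the kind of comparison the article describes.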

Among the most notable indicators, the team found that longer Amazon reviews with proper punctuation earned higher helpfulness ratings. They also learned that comparative language, including phrases that compare the product with similar alternatives, improved how a review was received by other Amazon users. The team believes its model could classify the helpfulness of future reviews with 85 percent accuracy.

“Computing helpfulness more easily could allow Amazon to improve its review ranking system by placing more predictively helpful reviews higher,” Chan said.

Yu and Chan's team was one of 22 to present at the Hackathon, a collaboration between Northwestern Engineering’s MSiA program and leading big data analytics company Teradata. The day-long competition challenged students to use the Teradata Aster Discovery Platform, an analytic engine developed to help companies solve complex business problems through data, to analyze a data set without restrictions in hopes of making unique discoveries. In addition to Amazon product reviews, teams could build their analyses around Dallas Police Department crime data, Twitter data, NFL and MLB statistics, and healthcare data.

“The teams showed wonderful patience and creativity,” said Roger Fried, senior data scientist within Teradata Aster’s Advanced Strategy Team. “They took all of the wild challenges experienced throughout the day and boiled it down to some very good presentations.”

In addition to Fried, the event was co-led by Diego Klabjan, professor of industrial engineering and management sciences and director of the MSiA program, along with Teradata team members Greg Bethardy, Choudur Lakshminarayan, Adam London, and Mary Gros. The Teradata team met with students the day prior to the Hackathon to train them on the analytics tools and techniques found within the Teradata Aster platform. After a few hours of learning, students could grasp the technology and were ready to unleash their creativity on the data.

After nearly eight hours of applying multi-genre techniques such as pattern, predictive, graph, and text analysis, the student teams discussed their findings with a panel of judges that included Teradata members, previous Hackathon winners, and returning judge Szabolcs Paldy, vice president of digital marketing at Discover Financial Services. Teams were assessed on their analytic approach, creativity, and final presentation.

As winners of the Hackathon, Yu and Chan will be sponsored by Teradata to attend its PARTNERS Conference this October in Anaheim, California.

Finishing in second place, students Bryce Codell and Ethan Liu conducted a sentiment analysis of 200,000 Amazon product reviews. Recognizing the difficulty of extracting meaning from a subjective five-star review scale, the duo used predictive probability techniques to classify, with greater granularity, how positive or negative each review was. They believe their work could offer merchants greater insight into what they are doing well and what they can improve.

“The MSiA program is unique in the way it incorporates team activities around business use cases,” said Greg Bethardy, data scientist and big data solution architect at Teradata. “It is also the first program we’ve worked with that is housed within an engineering school, which is exciting for us. Its close academic ties to fields like artificial intelligence and deep learning align with many of the trends we see in the world of data science.”

Northwestern’s MSiA program teaches students skills that drive business success in today’s hyper-competitive, data-driven world. Students learn to identify patterns and trends, derive optimized recommendations evaluated through simulations, interpret and gain insight from vast quantities of structured and unstructured data, and communicate findings in practical, useful terms that help drive business management.