Sentiment Analysis of the 2016 Presidential Election

Paola Castillo

When RCC staff member Dr. Prasad Maddumage decided to take a brief sojourn from his usual research area in astrophysics, he discovered an interesting way to dabble in machine learning. Something incredibly different from anything he had done in the past …

Twitter.

Using sentiment analysis, a form of machine learning also known as opinion mining, Prasad examined the emotional undertone of thousands of public tweets to gain insight into how the general public felt about the 2016 presidential election. He analyzed the tweets to gather information about people’s ideas and opinions during the last ten days leading up to the election.

To handle the enormous amount of data, he employed state-of-the-art machine learning technologies on the HPC. Prasad first gathered a subset of tweets and categorized them by hand, deciding whether the sentiment was positive, negative or neutral towards each candidate. He then used this labeled data to train a model using the scikit-learn module in Python. The ultimate goal of Prasad’s work was to find which candidate was preferred by each Twitter user.
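
For readers curious what training on hand-labeled tweets might look like, here is a minimal sketch in scikit-learn. It is not Prasad's actual code: the article does not say which features or classifier he used, so the TF-IDF features and logistic regression below are assumptions, and the sample tweets and the classify helper are invented purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical hand-labeled tweets (invented for illustration, not from the study).
labeled_tweets = [
    ("Great speech tonight, I love this candidate", "positive"),
    ("So proud to vote for this candidate", "positive"),
    ("Best debate performance I have ever seen", "positive"),
    ("Terrible answer, I cannot stand this candidate", "negative"),
    ("What an awful and dishonest campaign", "negative"),
    ("Worst debate performance I have ever seen", "negative"),
    ("The candidate will hold a rally on Tuesday", "neutral"),
    ("Polls open at seven in the morning", "neutral"),
    ("The debate starts at nine on channel four", "neutral"),
]

texts = [text for text, _ in labeled_tweets]
labels = [label for _, label in labeled_tweets]

# Assumed model: TF-IDF word features feeding a logistic regression classifier.
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)


def classify(tweets):
    """Return the predicted sentiment label for each tweet in the list."""
    return list(model.predict(tweets))


predictions = classify(["I love this candidate", "Polls open on Tuesday"])
print(predictions)
```

A real run would train on far more labeled tweets and would need a separate sentiment label for each candidate before any per-user preference could be inferred.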

Although this kind of research has been performed in the past, Prasad hopes this methodology can be applied more broadly in the future.

According to Prasad, “Given the size of the data, this kind of analysis requires a lot of memory and storage. Without a resource like the HPC, it would be prohibitively expensive to apply machine learning algorithms to anything with more than trivial amounts of data.”

The analysis predicted all but six states correctly.