Predicting the outcomes of Indian Elections using Google Trends

Indian elections are like a festival that celebrates the birth of democracy in the country. India is one of the largest democratic nations in the world; as of 2019, it had nearly 900 million voters. Elections give these people the right to express their political opinion and choose who they deem suitable to represent them. Most media houses and several independent agencies try to forecast election results using various methods, usually just ahead of the upcoming elections.

The most common methods used to make these predictions are opinion polls and exit polls. India has one of the youngest populations globally, with the average age being just 29. The overwhelming majority of Indian voters are young and well acquainted with the internet. Studies have shown that young people value web pages as one of their most important sources of information. Hence, it is fair to say that the internet plays a vital role in shaping their political opinions and, to some extent, helps them decide who to vote for. For most people in India and around the world, the gateway to the internet is Google. It is like a modern digital encyclopedia that can answer all their queries. The majority of the population trusts the information Google provides and acts on it.

Google Trends is a tool that helps us understand people's behavior based on what they search for and interact with on the internet. It provides keyword-related data, including a search volume index and geographical information about search engine users. To make comparisons between terms easier, it normalizes search data to the time and location of a query: each data point is divided by the total searches of the geography and time range it represents, which yields relative rather than absolute popularity. Given how heavily people use Google, analyzing Google Trends data can help predict future human behavior.

Much research has been done in this domain, especially in developed countries with high internet penetration and a two-party system. A study by Spyros E. Polykalas and his team on the German elections investigates whether election results can be predicted by analyzing the behavior of potential voters before the election date. It proposes a step-by-step algorithm and makes a normalized prediction, which simply means that it predicts the percentage of votes for one party in relation to the other. American Behavioral Scientist also published an article by Camilo Prado-Roman and two co-authors that proposed a free method to anticipate the winner of a presidential election. To demonstrate the method's predictive capacity, the authors studied two countries, the United States of America and Canada. The study took into account the past four elections in the United States and the past five in Canada and analyzed which candidate had the most Google searches in the months leading up to polling day. It made only a binary prediction about which candidate would win.
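To make these ideas concrete, here is a minimal Python sketch of the normalization described above and of the two kinds of prediction made in the earlier studies. It is an illustration only, not the code used by Google or by those authors: the function names and all numbers are made up, the 0 to 100 scaling reflects how Google Trends reports its index, and the interest values passed to the prediction functions are assumed to be on a common scale already.

```python
def relative_popularity(term_counts, total_counts):
    """Divide each data point by the total searches for the same place and
    period, then scale so that the highest point becomes 100.

    Both arguments are hypothetical raw counts; Google Trends does not
    expose them, so this only mimics the described normalization.
    """
    ratios = [term / total for term, total in zip(term_counts, total_counts)]
    peak = max(ratios)
    return [round(100 * r / peak) for r in ratios]


def binary_prediction(interest_by_candidate):
    """Return the candidate with the highest average search interest."""
    averages = {
        name: sum(values) / len(values)
        for name, values in interest_by_candidate.items()
    }
    return max(averages, key=averages.get)


def normalized_prediction(interest_by_party):
    """Express each party's search interest as a percentage of the
    combined interest of all parties passed in."""
    totals = {party: sum(values) for party, values in interest_by_party.items()}
    combined = sum(totals.values())
    return {party: 100 * total / combined for party, total in totals.items()}


# Made-up weekly interest values for two hypothetical parties.
example_interest = {"Party A": [30, 30], "Party B": [10, 10]}
print(relative_popularity([10, 20, 40], [1000, 1000, 2000]))
print(binary_prediction(example_interest))
print(normalized_prediction(example_interest))
```

The binary prediction only names a winner, while the normalized prediction gives each party's share relative to the others; neither says anything about absolute vote share.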

Polls and data analysis in politics and election campaigns are nothing new. The critical role of data in campaigns has not changed; what has changed is how this information is obtained and the greater certainty that comes from drawing data from different sources. So far, the studies conducted in this domain have been limited to the West. To our knowledge, nobody has explored the Indian political scenario, which is far more complex because of all the different factors in play. Unlike most countries studied so far, India does not have a two-party system. It is full of diversity. Different parties run the state governments and the national government. Ground-level political influence plays a significant role in determining which political party will come into power.

In our study of the Indian elections, we built on the work of earlier researchers and explored the unexplored territory of the Indian political scenario. The algorithm proposed in our study predicts the Indian Lok Sabha and Vidhan Sabha elections using Google Trends. To be more specific, given a simple set of keywords for a political party, it performs statistical analysis on Google Trends data for those keywords to predict and explain the outcome of the elections. The predictive model was developed after many rounds of trial and error. We wanted to test whether a straightforward predictive model based only on keywords for the political party is an indicator of the actual electoral outcome. To develop the model, we considered only the parties with a significant vote share and ignored those with a low vote share, so as to make the most of the predictive capability of Google Trends data. For the Lok Sabha elections, we treated the national contest as a two-party race, since only the Bharatiya Janata Party (BJP) and the Indian National Congress (INC) showed promise in terms of vote share. Similarly, for the Vidhan Sabha elections, only parties with a significant vote share were taken into consideration. The number of parties varied for each state that was analyzed.
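The setup described above (a keyword set per party, with low vote share parties filtered out) could be sketched in Python roughly as follows. This is a hedged illustration and not the study's model: the statistical analysis itself is not spelled out here, so the sketch uses a plain average of search interest as a stand-in and stops at per-party scores. The party names, keywords, the 10 percent cut-off, and all numbers are hypothetical.

```python
from statistics import mean


def select_parties(previous_vote_share, min_share=10.0):
    """Keep only parties whose earlier vote share reaches the cut-off.

    The 10 percent default is an assumption made for this example.
    """
    return [
        party
        for party, share in previous_vote_share.items()
        if share >= min_share
    ]


def party_score(keywords, interest):
    """Average Google Trends interest across all keywords of one party.

    A simple stand-in for the statistical analysis, which is not
    described in detail in the text.
    """
    return mean([mean(interest[keyword]) for keyword in keywords])


def score_parties(party_keywords, interest, parties):
    """Score each selected party and rank them, highest score first."""
    scores = {p: party_score(party_keywords[p], interest) for p in parties}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return scores, ranking


# Hypothetical inputs: earlier vote shares, keyword sets, interest series.
previous_vote_share = {"Party A": 35.0, "Party B": 25.0, "Party C": 3.0}
party_keywords = {
    "Party A": ["party a", "leader a"],
    "Party B": ["party b"],
    "Party C": ["party c"],
}
interest = {
    "party a": [40, 60],
    "leader a": [20, 40],
    "party b": [30, 30],
    "party c": [5, 5],
}

parties = select_parties(previous_vote_share)
scores, ranking = score_parties(party_keywords, interest, parties)
print(parties, scores, ranking)
```

In this made-up example, Party C is dropped by the vote share filter, which reduces the contest to two parties in the same way the national elections were reduced to BJP and INC.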

The proposed model is one of a kind: its predictions can be compared directly with the actual results and require no normalization of the actual electoral outcome. To our knowledge, the Indian elections have never before been analyzed through the lens of Google Trends, and this kind of prediction has not been made by any researcher in any country. The algorithm's results indicate that we have successfully predicted and explained the outcome of the Indian elections, at both the state and national levels, using a predictive model based solely on simple keywords. As far as we are aware, no other study has successfully predicted the vote share of a party. Apart from binary predictions, the most researchers have achieved is to predict the vote share of one party in relation to the other. No one has been able to say anything about the actual numbers. India is the biggest democracy in the world. We have achieved for a complex electoral democracy like India what researchers have not achieved for relatively simpler democracies around the world.

The results of this predictive model are evidence that we are well on our way to replacing the traditional methods used to forecast election results. However, this is not the end but just the beginning. We will explore more ways to work with this data and predict more electoral outcomes, with even better accuracy.