We have all heard of big data: those vast data sets (on anything from weather to shopping trends, traffic flows to video views, and many other topics) that float around in cyberspace and can only be managed with computer applications and, ideally, by qualified professionals in the field.

Their analysis can help us predict who will score the winning goal, when the next pandemic will appear or when the next snowfall of the century will occur… or not?

We often read that big data will change the world, which is true, but sometimes wishes are confused with reality. Despite the enormous versatility of data-driven solutions, there are also myths about this discipline that keep being repeated and have no basis.

Javier Garcia Algarra, academic director of the Engineering and Science area at the U-tad University Center, identifies (and refutes) the seven most widespread big data myths in recent times.

Lots of data = big data: Not always. We may collect a considerable volume of information, yet it may be of no help for the problem we want to solve because it is not representative.

For example, suppose we have access to millions of medical records from one population. These data will not allow us to make correct predictions about the chances of patients developing diabetes or melanoma in the Philippines or Guatemala, even if the system is very accurate for the population it was built on.

Nor can we deduce the musical tastes of those over 50 years of age by studying the playlists of high school students.
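The point about representativeness can be illustrated with a minimal sketch. The numbers below are invented purely for illustration (they are not real listening data), and the function name `estimation_error` is hypothetical. The sketch shows that multiplying the amount of data from the wrong group does not bring the estimate any closer to the group we actually care about.

```python
from statistics import mean

# Made-up values for illustration only: share of listening time spent on
# new releases (0.0 to 1.0) for two hypothetical groups of listeners.
students = [0.82, 0.77, 0.91, 0.85, 0.88]
over_50 = [0.21, 0.35, 0.18, 0.30, 0.26]

def estimation_error(sample, target):
    """Distance between the sample's average and the target group's average."""
    return abs(mean(sample) - mean(target))

# A small sample of students versus a much larger one (same kind of data,
# simply repeated 1,000 times to mimic "lots of data").
small_sample = students
large_sample = students * 1000

print(len(small_sample), estimation_error(small_sample, over_50))
print(len(large_sample), estimation_error(large_sample, over_50))
```

Both lines report essentially the same error: more data of the same unrepresentative kind leaves the estimate just as far from the over-50 group.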

We can predict any phenomenon: For this to be true, it is essential that what we are studying is not entirely random. We cannot predict which number will win the Christmas lottery draw, even if we know the results of the previous 100 years, because the draw is pure chance.

By contrast, based on an analysis of recent results, we can estimate that soccer team A will beat team B seven out of every ten times they meet. Sport is not entirely random.
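As a rough sketch of the difference between the two cases, the Python snippet below uses invented match results and a toy lottery with only ten numbers (the real draw has far more). The names `win_rate` and `lottery_hit_rate` are hypothetical helpers, not part of any library. The first function estimates how often team A wins from past matches; the second bets on the number that came up most often in the past and measures how often that bet wins in future draws.

```python
import random
from collections import Counter

# Invented results of the last ten matches between teams A and B.
recent_results = ["A", "A", "B", "A", "A", "A", "B", "A", "B", "A"]

def win_rate(results, team):
    """Fraction of past matches won by `team`."""
    return results.count(team) / len(results)

def lottery_hit_rate(n_numbers=10, history_size=100, future_draws=10_000, seed=0):
    """Bet on the most frequent past number; return how often it wins later.

    Assumes a fair toy lottery in which every number is equally likely.
    """
    rng = random.Random(seed)
    history = [rng.randrange(n_numbers) for _ in range(history_size)]
    favourite = Counter(history).most_common(1)[0][0]
    hits = sum(rng.randrange(n_numbers) == favourite for _ in range(future_draws))
    return hits / future_draws

print(win_rate(recent_results, "A"))  # 7 wins out of 10 matches
print(lottery_hit_rate())             # stays close to 1 / n_numbers
```

In the toy lottery, studying the history gives no edge: the hit rate stays near one in ten, which is what a blind guess would achieve.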

Big data can solve ‘all’ problems.  

According to many, big data was going to stop COVID-19 at the beginning of the pandemic, but why was it not able to prevent its spread? In the first weeks, we saw many predictive models that estimated the evolution of the number of infected or deceased, and almost all failed miserably. 

Fortunately, pandemics like this one are rare, so we do not have centuries of experience or data on them, which is why making predictions is risky. By contrast, the health authorities predict with great success how many flu infections there will be every year and in what week the peak will occur.

The key is that the flu is recurrent, and we have a representative historical series of data on which to base ourselves.

Elections without surprises with big data: Experience shows us that, no, we cannot accurately predict the next election’s outcome by studying what happens on social networks. This is a legend based on the successful prediction of the US presidential election results in 2012, when Barack Obama was re-elected.

However, those predictions failed in the 2016 election, when Donald Trump was the winner against all odds. The reality is that electoral behavior is challenging to model, and social networks are not a faithful representation of society but only of its most boisterous part.

Your future work depends on big data: It seems that, in the future, personnel selection processes or dismissals will be decided by algorithms using big data. However, while it is true that data-driven decision-making is a powerful tool in the hands of companies, a human being will always have the last word.

Blaming the algorithm is the 2022 version of those excuses from the 90s, “the lines are very busy” or “the computer is slow.”

My data does not matter to anyone, and there is no danger in sharing it with applications or platforms. 

All the data we generate when using the Internet remains on the network forever, and we do not know who will use it or how it will be marketed, now or in 20 years. We must teach young people to take care with the information they share.

Big data and algorithms are magical: This is the most pernicious myth. Behind this technology, there is only mathematics and computation, nothing irrational. It is developed by humans who know what they are doing and do not use spells or perform hidden ceremonies. You do not need any superhuman power to understand it, just study and dedication.

The data field includes many of the professional profiles most in demand worldwide today, the ones leading the digital revolution in the economy.
