I interviewed Rick van der Lans, one of the world's leading BI/big data gurus, about the future of big data.
“In a nutshell, these new big data systems can seriously increase the analytical capabilities of an organization. And that can lead to increased profits, lower costs, bigger market share, and so on.”
Ari: You have had a long career in using data for business. What’s the next big thing in Business Intelligence?
Rick: The next big thing is always hard to predict. I hope it will somehow be related to simplification. With the current technologies, and here I include the new big data technologies, everything has become so complex, including our architectures. Because of this complexity, it takes so long to make changes or to add new customer features. I think it’s important for our businesses that IT can develop and maintain systems faster, much faster than we can today. This means that the architectures must be simplified and that the technologies must be easier to deploy too. But I am afraid that the next big thing will be a new technology again.
Ari: Big data is a buzzword and perhaps overhyped. I think the common view in Europe is that it is definitely an interesting subject and worth getting familiar with, but that only a few companies really implement it at full scale.
Rick: I fully agree with that. Of course, there are companies that have already developed systems that we can qualify as big data systems. But the majority haven’t done anything with this concept yet, nor have they used the typical big data technologies, such as Hadoop, MongoDB and Cassandra. Some have started with tests and prototypes, so let’s say they have dipped a toe in the water. But we must admit that US companies have adopted this faster than we have.
Ari: Why do you think that is?
Rick: That’s a tough one to answer. Many analysts are trying to find an explanation for this. Maybe it’s that, on average, European companies are smaller and, therefore, their databases are smaller, so there is less need for big data systems. Maybe it’s the overall economic situation? And there are other possible reasons. But we can’t pinpoint exactly why it’s the way it is.
Ari: Many also think that, since their amount of data is so small, big data solutions are not needed. Is big data only for really big amounts of data, or could big data technologies also be useful for smaller amounts of data?
Rick: Sure, big data technologies will work for small amounts of data as well. Technologically, there is no problem with that. I would definitely recommend that companies explore what all these new technologies could mean for them, and for which types of use cases they offer a better price/performance ratio than classic SQL databases. In fact, working with this technology may even lead to new ideas on how new forms of analysis can be implemented.
Ari: Many people talk about NoSQL. Some younger programmers would rather use only Java and MongoDB or similar NoSQL tools. How do you see the role of SQL in the future?
Rick: My feeling is that SQL will stay the dominant data access language for a long time. We can already see the enormous interest in SQL-on-Hadoop engines that allow us to access data stored in Hadoop using the familiar SQL language. It also allows us to use all the popular reporting and analytical tools to play with Hadoop data.
Ari: Do you think that in the future new EDW (enterprise data warehouse) solutions will be built entirely on Hadoop, without any traditional relational database?
Rick: That’s not unthinkable, but today that would be a technological challenge. Hadoop HDFS was not designed to support many concurrent interactive users. In addition, Hadoop HDFS is not that fast when we do 10-table joins, which are not uncommon in an EDW environment. Hadoop has been designed to store massive amounts of data and to support occasional complex analytical questions, but not a typical EDW-type workload.
Ari: I think the term “big data” will eventually vanish, but technologies such as Hadoop and NoSQL will stay. What do you think the future looks like for big data?
Rick: I think you’re right. Big data is a marketing term and very poorly defined, but it has been very helpful in making companies aware of the business value of creating big data systems.
Ari: Business leaders are always interested in cutting costs and making a profit. What is in big data for them?
Rick: In a nutshell, these new big data systems can seriously increase the analytical capabilities of an organization. And that can lead to increased profits, lower costs, bigger market share, and so on.
Ari: How could a Finnish company easily get started with, for example, Hadoop or other big data solutions?
Rick: I think the first thing to do is to think backwards. First, determine from which forms of analytics an organization could benefit. Would it be useful to analyze social media data? Can the customer care level be increased? Can transport efficiency be optimized? Then, determine whether such new forms of analytics require big data, and finally, evaluate if new technology, such as Hadoop or MongoDB, is needed.
Rick will also run a seminar in Helsinki on 3 March: How to add Big Data to DW/BI solutions.