IDC just released a report that forecasts that the worldwide big data technology and services market will reach about $48.6 billion in 2019. There is a lot going on, which Ben Fu points out in his high-level outline of how Next World Capital looks at the market – especially around big data apps. Another big part of the market? Machine learning. Check out more in our third episode with Ben.
Stefan: Ben, let me switch topics a little bit here. You have phenomenal investments in the big data space, and you’re very knowledgeable. It’s very interesting to learn that you wrote about distributed databases at MIT. I didn’t know that about you.
Ben: Oh. You learned something.
Stefan: Where do you see the market today and where do you see it moving in the future?
Ben: Sure. Maybe just on a high level, and we can certainly whiteboard it, but at least the way that we view it is the market opportunity is just tremendous, just to level set. We’ve all seen that graph of, let me just start out here, the 30x data growth from 2010 to 2020. I think everyone’s seen this with this 30x growth, but we’re just right here, and that’s what makes it really exciting. We’re in 2015. We’re literally at this point where existing solutions today are hitting their breaking point. You see it today. Hadoop is going to re-platform, partially or wholly, the enterprise data warehouse. NoSQL will partially or wholly re-platform the SQL world; call it the post-relational world.
There are so many other things that are going to happen beyond this when you see that access to data, as well as the ability to manage or analyze that data, becomes more and more available. We think of it as there being many different ways to attack this market, and maybe I can just draw on here. Maybe just the high-level blocks of how we see things. On the very bottom you have the management layer, and you can certainly slice and dice this as much as you want, but this is really the data management side. This is a $30 billion market. This includes Oracle. This includes Microsoft SQL, MySQL. This also includes enterprise data warehouses. This is $30 billion. Above that, you have analytics. This is where Datameer is, and … Sorry, my handwriting’s not the best.
Stefan: Better than mine.
Ben: I went to engineering school, so I’m better with the keys.
Stefan: Same here.
Ben: And mice. This is a $17 billion market and growing, and then I think the more interesting part, which we’ll see in the coming years, is what we call the big data apps side. If I had to take a swag at this, this is $100 billion plus. The view that we have on this market is you could slice it up, and this is not going to be symmetrical here, but there’s a re-platforming as well as an expansion here.
Stefan: Right. Exactly.
Ben: Right. Because if Hadoop costs, as you know, and we’ve seen it with some customers, if Hadoop is really 1/20th up to 1/50th of the cost, why wouldn’t you use it? If DataStax Cassandra is 10 times faster than Oracle and 100 times cheaper, why wouldn’t you do it?
Stefan: Why wouldn’t you? Yeah.
Ben: This is a case where you can take some of the existing market, but you may be able to expand it even further. It just takes time, and you know it as well.
Stefan: That’s what we see. It’s usually not 100% re-platforming. We go in with just a few use cases, and then people realize how much opportunity there is, how easy it is. They realize all the benefits, and then they’re actually moving use cases and data over step by step. It’s never, “Oh, we’re replacing Oracle or Teradata today,” but it’s like, “Oh, okay, we upload something. Oh, this is easy. Let’s add more and more and more,” so we’re seeing the expansion. Yeah.
Ben: Yeah. There are certainly different areas of texture underneath that I think are really interesting. To your earlier question, you’ve really got to know the technology or the real use case of why it even matters, but it will be more valuable in certain enterprise use cases than in others. I think that machine learning is very, very interesting. We’re spending a lot more time on it. It’s one of those things where there was a lot of innovation in the 80s, then the dark days, so pretty useless from the 80s until 2009 or 10. For instance, in machine learning now we have enough data. Now, we have the right cheap, almost free, computing. And then the new algorithms, which could, not necessarily will, but could actually revolutionize how we think about natural language processing as well as images.
Stefan: Right. How do you feel about things that Google’s doing about deep learning and all the fascinating stuff?
Ben: I think between Google and Facebook, those two companies in particular are doing amazing things in machine learning. If you think about it, the evolution of search and the interface of us typing something into a machine and it knowing relevance, but nuanced relevance, as well as learning things that are less and less human based, it’s phenomenal. I think that at those two companies you’ll see a lot of innovation, and at some point, people leaving those companies to do machine learning applications.
That’s what falls under here, where you can change the outcome. You can change how you do health. You can change how you do security. You can change how you do finance because of machine learning. It may not be completely broad based. We can certainly talk about your view on this as well, but we are at a point … I feel like we’re literally at a tipping point where you will see new things happen, new units of value being created, because now we can manage and handle the data and interpret the data in a more and more machine-based way.
Stefan: Is the self-driving car part of that, for example?
Ben: Very possibly. My guess is you could probably do a rules engine behind that. I think machine learning, specifically that data science and machine learning, is probably helpful in there, but the real value, at least in self-driving cars, is, in my view, something that is already seen as doable, and Google’s already doing it today.
Stefan: Their big breakthrough was machine learning. That’s when they had the DARPA challenge and then they acquired basically the whole Stanford team. Their big breakthrough was the visual machine learning.
Ben: That’s right.
Stefan: Education, to get their car going. Where do you see those things maybe overlap a little bit, and what kind of companies are you seeing in this? Obviously Datameer is here, and we’ll go on the whole market and grow it, but-
Ben: At 17 billion dollars, right?
Stefan: What are some of the interesting companies here, and maybe in this space, that you’re seeing and that you guys are maybe tracking or invested in?
Ben: Sure. Sure. Maybe just some of the companies on the data management side, and I just think of this as there’s management but there’s also the actual architecture underneath. This includes the Hadoops of the world but also DataStax and Cassandra. DataStax is the enterprise-grade management tool for Cassandra, which is again that very, very powerful database.
On the analytics side, we’ve actually made quite a few investments, because this is probably the separation between IT and business users, at least the first division, and here, this is actually lines of business buying in. That’s at least the way that we interpret it. On the analytics side, obviously, we invested in Datameer, but we’ve also made investments in GoodData as well, so cloud-based analytics. If you believe that on-premise is going to go to the cloud, another dimension of shift, not only the big data side of things but that it’s actually cheaper, better, faster to do it in the cloud, GoodData is also on this side.
I would say that a company that we’ve spent a lot of time with, and that I think has a lot of unbounded opportunity, is actually AgilOne, so that’s one of our companies that’s in predictive marketing analytics. There, with the customer database, think B2C CRM, they’re able to not only analyze your customers but also funnel down and segment and personalize who they are, all the way down to zip code and persona, and they can do it not only online but offline as well. I think they straddle apps and analytics because they can actually be prescriptive.