Big Data & Brews: Cloudera’s Mike Olson
I’m sure most of you are familiar with Mike Olson, one of Cloudera’s co-founders and currently its chief strategy officer. I’ve known Mike for many years; we’ve been in the industry together pretty much since the beginning. Mike was kind enough to stop by our new office studio to talk about his history in tech and a bit about what he is seeing in the industry as infrastructure continues to grow.
Check out the first installment of our chat below:
Stefan: Welcome to Big Data & Brews today with Mike Olson from Cloudera. Hello.
Mike: Good to see you Stefan.
Stefan: Could you introduce yourself and the brew you brought with you?
Mike: You bet. I am, as Stefan said, Mike Olson. I’m the Chief Strategy Officer and one of the cofounders of Cloudera. The business is about six years old now; it was the summer of 2008 when a small group of us banded together to start the business, focused on delivering a big data platform built on Apache Hadoop. And of course I know you and your team from way back in the day.
Very few others were working in big data at the time. You were one of our very first partners. I think we watched the big data market and appetite for Hadoop explode, and analytic data exploration, the sort of tooling that Datameer provides, become a really critical part of that ecosystem. I’ve been with the business from the very beginning, going strong and having fun.
Stefan: Great! And what are we drinking today?
Mike: Ah. Yeah. So.
Stefan: The classic California brew huh?
Mike: I love Sierra Nevada. I actually quite enjoy craft brews as well, but maybe as much as twenty years ago I discovered Sierra Nevada, back when you could still twist the tops off.
Mike: I loved the beer and it became my go-to beer. About two years later Consumer Reports published a report that labeled this the best beer.
Stefan: Oh my gosh.
Mike: So once I got there…
Stefan: Shall we just drink out of the glass?
Mike: I feel more comfortable if we just did this. Cheers and thanks for the hospitality man.
Stefan: Nice and refreshing. So you know it’s brewed in Chico, California. Have you ever been to this area?
Mike: I’ve not been to Chico. I do have a friend of the family who is a student at Chico State and I understand people have a really good time.
Stefan: I hear about that too. So close to Chico is this little town, Nevada City, where I lived for a while. We call it the big data capital, the secret big data capital. Ken Krugler lives there, who was one of the very first Nutch users at his old company, Krugle. I lived there in 2007, maybe. Chris Wensel was doing Cascading there as well.
Mike: Yeah. Yeah. Yeah.
Stefan: The guy that did IBM Watson lives up there as well. It’s a town of 2,000 people.
Mike: What were you doing in Nevada City? Why were you living there?
Stefan: I was just living there, consulting for different kinds of companies. I was doing projects at the time. I worked for EMI Music at that time. They said, “Oh we’re a fancy music company, you can live wherever.” I’m like, OK, see ya. You know, the Sierra foothills, it’s beautiful up there.
Mike: Exactly. We like to spend a lot of time up at the Lake Tahoe area, so I know that drive very well.
Stefan: Nice, nice. You have an interesting background before you founded Cloudera. Give us a little bit of color on this. [3:06]
Mike: Ah, you bet. I’m a database guy from way, way back. I’m an engineer by training and by inclination. I did my Bachelors and Masters work at UC Berkeley.
I was there at the time that the Postgres project was sort of the core focus of Mike Stonebraker’s research, and I began working for Stonebraker in like 1988 or so on Postgres. I designed and built much of the storage infrastructure, the B-tree code and a bunch of the access methods, the R-tree spatial access method.
I got my Bachelors in ’91. That year I thought about going off and getting a job. Stonebraker talked me into hanging around and becoming a graduate student, so I entered the grad program planning to get a PhD. I spent one more year at Berkeley and got my Masters in ’92 and actually stayed beyond that.
But in 1992 Mike hired the entire Postgres research team, except for me, because I was still in the PhD program, away to start a company that became Illustra, which was really the first successful company to commercialize Postgres.
I learned in the ensuing year that I actually had no taste for research at all. I liked those people, they were great to work with. Once they had all been extracted from the University, the University wasn’t such a great place to be.
So I dropped out of Berkeley finally in 1993 and joined Illustra. That was a lot of fun. It was a great run; we got acquired by Informix in 1996. That was immediately before, for those of you who pay attention to database history, Informix restated earnings three times.
Stefan: Uh-huh. Ooo.
Mike: So, the deal when it went down was $440 million paid for Illustra. By the time the lockup ended and engineers could sell the stock that they owned, it was down to $100 million.
Mike: Well, yeah, I mean it feels kind of bad. But remember, before that I had been a graduate student.
Stefan: Yeah, so no big deal.
Mike: I’d been used to making maybe $60,000 a year. For a while on paper I was fabulously wealthy, and then I was merely pretty well off. But it was a great experience.
I stuck around Informix for about 18 months, then joined a biotech startup for a year. That was a one-year mulligan. I ran engineering for those guys, but it wasn’t a really good fit for me. Then it was back to databases.
Most significantly, what I did besides Illustra was Sleepycat. In fact, when I was at UC Berkeley, Margo Seltzer and I built Berkeley DB, the embeddable database. We, at Sleepycat, commercialized that in the late ’90s and early 2000s. The business grew with no investment from outside ever, just on revenues; a very successful, very fun company. We got to about 25 people. We got to pretty respectable revenues for $12 million all in. But in 2005-2006 it began to get challenging to continue to drive that growth without a large slug of investment capital. [6:00]
The result was, we decided to shop the business. We surprised ourselves by getting very strong inbound interest from Oracle, and wound up striking what was a great deal for the employees, the customers, and for the product, which continues to be developed at Oracle. I spent 2006-2008 at Oracle.
Stefan: That was the best time of all?
Mike: A couple things. Oracle’s stock had been flat until the Sleepycat acquisition when it began to appreciate. Eventually, if I can do anything for you guys, just let me know. Clearly lots at work there. I really enjoyed the time I spent at Oracle. Met a lot of good people. I got to see how that organization functions worldwide. And, as I said, it was a great home for Sleepycat, for Berkeley DB and for our customers.
But after a couple years I felt like I’d learned what I needed to learn there, and was frankly looking to do something smaller. So I left Oracle in January and by June it had become clear to me that Hadoop had some real opportunity in the market. So I met Christophe Bisciglia from Google, Jeff Hammerbacher from Facebook and Amr Awadallah from Yahoo, and the four of us banded together to start Cloudera that summer.
Stefan: Before we dive in deep into the Cloudera story, do you still have any connections to Michael Stonebraker? [7:28]
Mike: Mike and I are very close. We are on each other’s Christmas card list. He doesn’t write very many Christmas cards. I was one of the speakers at his 70th birthday party back at MIT a couple months back. Maybe what’s lurking behind your question is: Mike’s on record as not thinking that Hadoop is such a good idea.
Stefan: He’s now at a … what’s his new company?
Mike: Data Tamer.
Stefan: Is it like an ETL play or…
Mike: It’s data prep and ingest so if you’re going to have a big data platform you have these huge streams of data coming in. That stuff has to be preened and cleaned and organized. You can think of it as ETL but actually, I’m sure you guys see this in the customers you talk to – you guys build a beautiful tool, a beautiful product for exploring and analyzing data. In order to get it ready for the exploration and analysis, there’s a lot of kind of behind the scenes work, busting it into the right records and formatting it correctly and making sure that all the values are right.
Tamer, like Trifacta and Paxata, by the way, is building products aimed at that.
Stefan: So coming to the Cloudera story, how is it to grow, to see a company grow from four people to, how many people are there now? [8:48]
Mike: We are 675 people right now. And we’re looking around 800 by the end of the year.
Stefan: I’m sure you don’t remember all the names anymore.
Mike: I do not remember every single name in the company.
Stefan: What was the shift like, from working with people you all know to the point where it’s like, “Who’s that? Oh, he works here?”
Mike: You’ve got to have the same experience. You walk down the hall. You’re Stefan. Everyone knows who you are. I walk down the hall. I’m Mike. Everyone knows who I am. And they’ll say “Hi Stefan” and like 80% of the time you’ll say “Hi, man”. It’s a little bit uncomfortable.
The growth has been fantastic. It has been wonderful to watch the company get big. Obviously the job of building and running a company at this size is just radically different from what it was.
I stepped out of the CEO job about a year ago now, just a little bit more than a year ago, because of that. When a company is young, an entrepreneur’s job, and especially a technical founder’s, is to drive the vision and get people excited about it, and make sure that the product is making progress. [10:01]
As you get bigger, stuff like your three-year financial plan and your path to profitability and the predictability in your sales pipeline, all that stuff becomes hugely important. I don’t just wake up in the morning thinking “A staff meeting, a staff meeting, yeah!”
I felt that I wasn’t doing the job that the company needed me to do. I thought we needed to be world-class and frankly, I didn’t think I was going to be able to be world-class in that job. And it stopped being fun in an important way. You wake up in the morning thinking about customer problems and how technology can make customers’ lives better and I get excited about that stuff. All of that other stuff which is absolutely critical requires a guy whose brain is just wired differently.
So we surveyed the market. We hired a great CEO who had formerly run ArcSight, which had been acquired by HP. His name is Tom Reilly. Tom has been fantastic, a good partner to me, and we are close friends. I wake up in the morning now with a job that not only do I feel deeply competent at, but I really just love.
Stefan: Before we get into the next-generation technology, from your experience right now, with MapReduce, SQL, HBase, you guys have Flume, you have Sqoop, what’s the most critical part of the infrastructure, or the most popular part of the infrastructure, what people are asking for, and maybe at which stage of the adoption cycle? [11:30]
Mike: Yeah. MapReduce is absolutely ubiquitous, no surprise right? It was the first kid to the party. A huge investment in applications that run on MapReduce. The single biggest asset that Hadoop has is the storage layer HDFS, right? It is that very forgiving, very accommodating place to put any kind of data at all. The set of engines that’s becoming available up here is large and increasing quickly. Maybe later on we can talk about Spark and what that means. The pace of innovation at this layer is absolutely frantic.
As the Cloudera Chief Strategy Officer, but also as a long-lived data guy, I think the most interesting thing happening right now is the emergence of machine learning as a real first-class tool for enterprise data analysis. It used to be rocket science that happened in academic research labs and was really hard to wrap your head around. Now there are beginning to be algorithms that surface that stuff and make it easy to train models and then to score basically vectors, basically do predictive analytics. Good user interfaces on that are still a big opportunity. We’re going to see, we expect, huge adoption of ML techniques for analytics in the enterprise.
Stefan: Thank you very much for joining us for Big Data & Brews, and we’ll talk to you in the next segment.
Mike: That’s great.