Datameer Blog

Big Data & Brews: Jake Flomenberg of Accel Partners Talks Investment for Big Data Startups

February 25, 2014

On this week’s episode, venture capitalist Jake Flomenberg of Accel Partners shares his thoughts with me on how startups can make an impact in the big data market.


Stefan: Jake Flomenberg joins us today for Big Data & Brews from Accel.

Jake: Thanks for having me.

Stefan: Thank you. You actually brought big bottles of beer, which I like. Usually in America, you have these tiny bottles of beer, so you already won a lot of brownie points. Can you introduce yourself and your group?

Jake: Yeah. My name’s Jake Flomenberg. I’m an investor with Accel Partners. I’ve been there for about two years. Before that, I had product management responsibilities at both Splunk and Cloudera.

A little bit about the brew, Pliny the Elder is a pretty popular beer in these parts. It’s probably my favorite double IPA. I’m not a huge double IPA person, but they did a really good job balancing all that hops and alcohol with enough flavor to make it work.

Stefan: What does double IPA mean?

Jake: Double IPA. I’m not a beer expert, but I think it means double the hops and traditionally almost double the alcohol by volume content. They usually have pretty strong hoppy flavor. This is probably eight percent or more by volume.

Stefan: Holy moly. Yeah, it’s eight percent. You brought two big bottles of eight percent alcohol.

Jake: Yeah, so I have to wait before I drive after this.

Stefan: Well, we will go into a lot of details then. Looks like it’ll be a fun conversation. Let’s go a little bit more into the details. What did you do at Splunk? You were on product there? When did you join, early on or … ?

Jake: I joined Splunk in 2010 and eventually took on responsibility for the entire top half of the stack: the UI, the user experience, the search language, a lot of the alerting framework, and that last mile of the product that makes it usable by the sysadmins and IT ops professionals.

Stefan: What did you do before Splunk?

Jake: Out of college, I worked at Lockheed Martin, actually. I was in a rotational program. [00:02:00]

Stefan: Yeah, we should not talk about that, most likely. Was it top secret?

Jake: It wasn’t that top secret, but it was actually a great way to refine my software engineering chops, contribute, and have a huge impact. But at the end of the day, even as a young guy running a pretty meaningful project, you look at that project against Lockheed’s 40 billion dollar revenue that they do annually, and it’s like, “How do I make a difference?”

That’s what motivated me to go to business school and start thinking about what’s going on in the world of cloud and the world of data. It’s been a lot of fun. I spent the last few years in the world of data.

Stefan: Cool. Cheers, first.

Jake: Cheers.

Stefan: Let’s see how much we can drink of that. It’s very hoppy, and it has twice as much alcohol as I’m used to. Good. That will be fun. What did you do at Cloudera?

Jake: I joined Cloudera for a bit in the very early days. I was probably about the 13th employee there. At the time, I was the only business guy.

Stefan: That was fun.

Jake: It was a ton of fun, but a lot of the small problems, anything that wasn’t an HDFS or a MapReduce problem, I got to spend some time on. Go figure out how to stand the first website up. Go talk to the engineers about how to get data in, things that have become projects in their own right, like Flume, for instance. How to think about pricing, and all these little things, I got to help move the ball just a little bit forward. It was a tremendous amount of fun.

Stefan: Was it a big step from the technical world into the investor world? What was the biggest culture shock?

Jake: I think the product management experience prepares you reasonably well for a lot of the aspects of being an investor. You can think of investing as, at first, “Why did I go work for this company? Is it interesting? Is it an interesting market? Is it going to do well?” [00:04:00]

Obviously, it’s at a higher level of abstraction. You’re not in there in the weeds, in the day-to-day. That sort of stuff translates really nicely. Some of the more interesting learnings have been in terms of the board work, the recruiting, and even the interactions within a partnership, where you get out of a meeting and you’re like, “What just happened?” How feedback is interpreted is a little bit of a different operating system than the top-down command and control experiences at a normal company.

Stefan: Is it a lot more responsibility now as an investor, dealing with millions of dollars and taking care of people? How is that?

Jake: I try not to think about it too much. At the end of the day, I think it’s a lot of responsibility, but so is shepherding the product roadmap for companies that are now doing hundreds of millions of dollars in revenue. I think jobs are on the line either way you look at it. It’s important to do right by the companies and right by the employees, in either case.

Stefan: Cool. My understanding, you’re mostly involved at Accel with the whole data and big data space in their investment. How do you see the big picture, the big data market at this point?

Jake: Yeah. Do you mind if I draw here?

Stefan: Oh, please.

Jake: Okay. I’ll try to get this out of the way. As an investor, I have to be reasonably simplistic in how I see the world. This is just one more box than I can fit on my hand. I’ll go through each of these boxes in order.

Down here is the infrastructure. This is the storage, the cloud, everything spanning virtualization and networking. It’s part of the enterprise software and hardware market. It’s slightly orthogonal to big data. A lot of these appliances and systems are going to get built regardless of the trend that’s going on. Some of them are much more customized, and it becomes very important to understand how the systems operate, to take advantage of them further up the stack. [00:06:00]

Stefan: Is the hardware part of that? Or you put hardware-

Jake: No, no. Largely hardware, some bits of software, but at the end of the day, you can now scalably, in many cases, store ones and zeros.

To paint the layer on top, here we have DMP, data management platforms. This runs from the relational databases of old to the Hadoops and non-relational stores of today. I spent time at Cloudera. Couchbase is another company in the portfolio we’re really excited about, in terms of just production, large key-value stores.

I think we’ve witnessed over the past couple years the maturation of this market. A lot of this technology has gone from, you really need to be a unique, rare individual to know how to stand up a Hadoop cluster, to a technology that has been commercialized by companies, where you can go to Cloudera, grab CDH, and download something and get going. I think that’s going to enable a lot more exciting things at the top of the stack.

Off to the side, there’s this pretty important management and security layer. These pieces of infrastructure are just so big and scalable these days that I think they need separate tools and systems just to make sure they’re in working order and secure.

Stefan: I almost thought you’d write Oracle over here. That’s the Hadoop market, and that’s-

Jake: Sometimes I draw it where I draw a line over here, like the old world and the new world. But relational databases and Oracle are, I think, going to be here for a very long time.

The next one is three different flavors of leveraging data. The first one is leveraging data as a product. This is packaging data up in a consumable format. Sometimes- [00:08:00]

Stefan: The Gnips of the world-

Jake: Backed by a search. Whether it’s Gnip or some of these other things for inverted index around social data, to finance data stores and modern LexisNexises, et cetera, et cetera. Those are pretty interesting markets to the extent that you can have proprietary data sets or proprietary lenses into those data sets.

The next one is data tooling. This is probably where you guys fall for the most part. This is everything from your BI on top of this stack, to your ETL, to your search-based analytics, and even the advanced machine learning, where we’re finally starting to see a lot more scalable implementations on top of some of these platforms. This here is just the data line, so everything below the data line is storage.

This is the last bucket, data-driven software. That’s probably one of the buckets I’m most excited about. Obviously, we invest in all of these different places. The exciting thing about data-driven software is we fundamentally believe that every single piece of enterprise software is ripe for disruption, and can be built or rebuilt in more data-driven ways. If you …

Stefan: Is that like a Marketo then, or is that … ? What’s an example of data-driven software for you?

Jake: Yeah, so let me try to draw an analogy between the transition from client-server architecture to SaaS 1.0, and what I think the opportunity to go from a lot of first-generation SaaS to data-driven software is today. If you look at how SaaS came into existence and who the key beneficiaries are, I think it’s particularly illuminating. For a lot of these first-generation SaaS software companies, the beneficiaries were finance, who could now recognize expenditure as opex instead of capex, and IT professionals, who now didn’t have to rack and stack servers and manage this gunk.

Stefan: But they also lost their job, maybe. [00:10:00]

Jake: Maybe.

Stefan: The benefit was they could retire earlier.

Jake: Hopefully not. At the end of the day, the end users … Maybe the software was deployed a little bit more quickly, so they were happier. But the fundamental end user experience wasn’t all that changed. Maybe it went from a thick client to a thin client, but the user is still stuck in many cases entering data in some manual system so that their manager could run a report.

There was no one spending any time thinking about, “How do we make data entry easier and more automated?” No one spending any time thinking about, “How can we leverage this collective pool of data that you as a SaaS provider have from all your customers to push value back to the customer?” Whether it be through benchmarking or anything that provides value before. At the end of the day, there’s a huge risk of some of these first-generation SaaS solutions becoming data abysses.

A pretty common example is, any enterprise software company probably uses a CRM tool. Sometimes the field-level people get a lot of value out of them. Other times, they’re just entering information so that their manager can run a report. Sometimes it’s late, and they put it in at the end of the quarter so they can get their commission check.

Flip that on its head and say, “I’m going to connect to your email. I’m going to connect to your calendar. I’m going to start pulling some of this information all together. I’m going to map it up to people’s social media profiles. I’m going to do some lightweight NLP on your email, suggest follow-ups, and all these sorts of things.”

Very quickly, you’ve turned a tool that may or may not be useful for these entry-level people into something they can’t live without. Assuming you’ve done that, you now have a tool that people are in every day, the information that they’ve entered is now more accurate, and the managers now have better and quicker access to the information. It creates this virtuous cycle.

The same opportunity that exists in a space like CRM … Accel is proud to be an investor in a company called RelateIQ … the same holds true for every other enterprise software vertical. It wouldn’t have been possible before. Without all this modern architecture here, people wouldn’t be able to just build on top of it and take the next step.

I think the big opportunity to serve middle America and, really, the rest of the world is for startups to come and build data-driven software, because even though we have this democratization of this infrastructure-related software, most companies don’t know how to leverage that to meet the particular needs of their business. [00:12:00]

Stefan: Yeah. That train from a startup perspective already left the station. If you try to build another Hadoop distribution today, you might be a little late.

Jake: Might be a little late. That doesn’t mean there isn’t room for other things, whether it’s in-memory storage and [crosstalk 00:12:48].

Stefan: Yeah, sure. It’s always fun. It’s so interesting that many, many times, a lot of those companies are so engineering-driven. It’s great technology, but then like, “Hey, what’s the use case? What problem is it solving?” Yeah, it’s 20 milliseconds faster than something else, but a lot of things we see around here are like, “Oh my God, we’re like 10 milliseconds faster than everybody else.” But nobody in the customer base really cares about query speed. What matters more is time to insight, or optimizing processes, being more lean, removing waste in your overall company.

Jake: Absolutely.

Stefan: Interesting. I guess you guys invested all across. If you associate timing, you said there’s still room for improvement, but this is your most … Is that your hot topic right now?

Jake: I’d say storage is a perennial opportunity. There’s always going to be new horizons and more data to be stored. We just invested in David Flynn’s new company, the Fusion-io founder. It’s called Primary Data. I can’t say too much about it, but we’re pretty excited about it. Like I said before, I think we’ve witnessed the maturation of a lot of open source data management technologies. That doesn’t mean there’s not room around the edges. We’ve invested a lot in the data tooling space with a number of seed companies, companies like Sumo Logic that do log analytics, Trifacta, and many more. [00:14:00]

The reason why we’re so excited about data-driven software is really twofold. One, just from a timing perspective, it wasn’t ready. It wasn’t the time until now. And two, just because it’s really unbounded in market opportunity. There can only be so many winners at the data management layer, but every single enterprise software vertical has billions of dollars of potential market opportunity.

It’s really about finding the unique teams of entrepreneurs who understand the domain well enough to go after something, and have enough data expertise or chops to understand how to leverage the rest of the stack beneath them to go do that.

Stefan: I want to use the opportunity, given your sort of experience and obviously your experience as an investor. For everybody that’s now listening to you and like, “Oh, I built the next data-driven software company,” is there a blueprint? Is there a way to go after … ? Do I just find the next vertical and install Hadoop, and that’s my start? What’s the blueprint to get there?

Jake: Step one is really find a team. As I said before, I think the team needs-

Stefan: Even before you have a product idea?

Jake: I think you need to find a space … The team and the market, to me, go hand in hand. Those are probably the two most important things. It’s hard to start unless you know who you’re working with and the direction you’re pointing in.

The two critical team elements to have, again, are, one, an understanding of how to leverage some of this modern data infrastructure, and two, a deep enough understanding of a domain. It’s really hard for people to go into domains that they really haven’t spent any time in before. The people that are up to really interesting things have spent time in … whether it’s CRM, security, marketing analytics, retail analytics, et cetera. They’ve lived and breathed it a little bit, so they understand how some of this data can be brought to bear. [00:16:00]

Stefan: At least 10,000 hours.

Jake: Yeah, at least 10,000 hours.

Stefan: What I hear you saying … and again, that’s I’m sure where people are really interested now they’re listening … you need the technical expertise and then you find the domain expertise. Maybe you’re lucky enough to have domain and technical expertise in the same person, but most likely, you try to build that team.

Jake: Team, yeah. Because at the end of the day, it’s hard to spend your time doing both sides of the house. Then there’s the related issue, relative to that domain expertise: is that domain expertise in a crowded market or in an open market?

Stefan: That’s what you guys are looking for usually, I guess? That’s the filter criteria, how you … ?

Jake: What kind of market they’re in?

Stefan: Yeah.

Jake: We’re certainly aware, and it’s important that the entrepreneur be aware, because in a crowded market, typically those markets might be larger, but the bar for differentiation is going to be higher. It would be okay to wind up with a smaller piece of that market, but you have to believe you can get it and you can be differentiated in that market.

In a smaller market, it turns out there might not be that many competitors. You could be in a sub-billion dollar market, you could be the only one or one of a few, and you could just go run the tables. We wish there were more of those. It’s really easy to identify the top 10 enterprise software verticals, but entrepreneurs should consider: go build that company that tackles the 25th that no one’s going after.

Stefan: Yeah. From all the data companies you saw, what would you say is the most prominent mistake that people … You see so many. You see like 20 a day, maybe.

Jake: Not that many. Half of that, maybe.

Stefan: What’s the most common mistake people make?

Jake: One of the most common mistakes, particularly for technically oriented teams, is the way that they think about de-risking the business. Oftentimes the way they think about de-risking the business caters to their expertise rather than their areas of weakness. Oftentimes you’ll hear the story, “Well, we just really want to make sure this is going to work. We’re going to go invest in this technology. We’ll stand up the stack. We’ll build an app on top,” or ML, or whatever it happens to be. [00:18:00]

The advice I always try to give those teams out of the gate is, if you’ve been building software for a number of years, you’re probably going to get credit. People will believe you if you say you can build it. If you have 10 minutes to spend de-risking that business, those 10 minutes would be much better spent going to talk to some potential customers, thinking about the business model, and investing the time in that area where the team or expertise just might not be as deep.

I see that over and over again. You have to strike the balance at some point, obviously. Sometimes you might just be too early to start asking those questions. But that advice holds generally true.

Stefan: Not build a prototype, but rather really validate your market, ensure that you’re going in the right direction. Is that what I’m hearing?

Jake: I’m not saying don’t build a prototype, but just think about when the right time is to invest in some of those other things. It may be sooner than you think.

Stefan: Yeah. Great. Well, we will be back next week for more wonderful …

Jake: Pliny.

Stefan: Pliny, oi, okay … it’s really strong, though … with Jake from Accel. Cheers.

Jake: Thanks, Stefan.

Stefan Groschupf

Stefan Groschupf is a big data veteran and serial entrepreneur with strong roots in the open source community. He was one of the very few early contributors to Nutch, the open source project that spun out Hadoop, which, 10 years later, is considered a 20 billion dollar business. Open source technologies designed and coded by Stefan can be found running in all 20 of the Fortune 20 companies in the world, and innovative open source technologies like Kafka, Storm, Katta and Spark all rely on technology Stefan designed more than a half decade ago. In 2003, Groschupf was named one of the most innovative Germans under 30 by Stern Magazine. In 2013, Fast Company named Datameer one of the most innovative companies in the world. Stefan is currently CEO and Chairman of Datameer, the company he co-founded in 2009 after several years of architecting and implementing distributed big data analytic systems for companies like Apple, EMI Music, Hoffmann La Roche, AT&T, the European Union, and others. After two years in the market, Datameer was commercially deployed in more than 30 percent of the Fortune 20. Stefan is a frequent conference speaker and contributor to industry publications and books, holds patents, and is advising a set of startups on product, scale and operations. When not working, Stefan is backpacking, sea kayaking, kite boarding or mountain biking. He lives in San Francisco, California.