Post from Brian Smith, Regional Director of Sales
I ran the commercial sales division at Vertica Systems for several years prior to joining Datameer. At the risk of going on (and on), I wanted to share my professional experience selling big data solutions, and why I am so excited to be at Datameer.
Circa 2011 – We’re all producers and consumers of data in almost every aspect of our professional and personal lives. Analysts anticipate a 40% compounded annual growth rate in corporate data volume, with the lion’s share of that growth in unstructured data. In 2011, the name of the game in big data is Hadoop, and the interest around it has become a gold rush. Why?
The economics are compelling – Hadoop is displacing costly analytic databases and warehouses, prompting IT to rethink ADBMS purchases and sales cycles, shifting IT dollars and vendor roadmaps, and generally wreaking havoc in the traditional vendor community. We’ve gone from one or two distributions to nine in the last year! And literally every vendor in the BI/DBMS space has a Hadoop connector, the latest being the recent Oracle announcement. Everybody is on board this train – all of it based on the premise of unlimited scale and data variety at a fraction of traditional costs. Technical challenges exist, but it’s clear that there’s a sea change.
Prevailing winds – In light of large corporate BI and data warehouse investments, companies are typically using Hadoop as a staging and storage area to start… This involves parallel-loading high volumes of raw enterprise data into Hadoop for subsequent scrubbing (ETL), en route to passing smaller subsets through connectors to the existing BI stack for analysis.
There are several reasons why this approach will be short-lived:
- Business analysts in this process are by definition separated from the data they need to do their jobs – the end product they receive is a limited subset of the available data, simply because traditional BI cannot handle the full volumes.
- Time value of data – the development/IT resources required to ETL the data and move/process the subsets introduce unacceptable delays. People have jobs to do and analyses to rely on…
- Cost – at each stage of this process, costs are created and duplicated across developers, vendor licenses, support and training, only to yield a partial answer based on a fraction of the data. It’s incredibly inefficient and complex.
Conclusion? Where such obvious inefficiency exists in a business segment where literally billions of dollars are spent every year to achieve business efficiency (!), alternative solutions will rapidly emerge… That’s why I am at Datameer (“data sea” in German).
“Datameer is the first BI/Analytics platform built natively on Hadoop.”
On the surface it sounds interesting, but in practice the solution is game-changing. The Datameer Analytic Solution (DAS) connects business users directly with the entire volume and variety of their raw Hadoop data and makes it available for comprehensive analysis.
DAS does this in a way that any business user will find simple, efficient, and completely intuitive. IT will love it as well, as Datameer frees up the IT folks to focus on higher-value work rather than writing code to get data into Hadoop.
Practice makes perfect… Datameer gives business users an iterative “prototyping” capability for the data pipeline against a small sample of the data prior to running production analytics against the Hadoop cluster. (IT guy smiling…) It’s simple, practical, and directly in line with how the business needs to access, analyze and consume data – all without stepping on IT’s toes.
Under the covers, DAS generates Java/MapReduce code that runs natively on the Hadoop cluster. All current Hadoop distros are supported – we’re Switzerland when it comes to platform support for Apache, Cloudera, MapR, IBM and the rest – and it all runs in a browser on Windows, Mac and Linux.
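To make the “generated code” point concrete, here is a toy word count in plain Java that sketches the map-and-reduce pattern DAS emits on a user’s behalf. It is purely illustrative and uses no actual Hadoop APIs (a real generated job would subclass Hadoop’s Mapper and Reducer classes and run distributed across the cluster) – the point is that this is exactly the kind of boilerplate a business user should never have to write:

```java
import java.util.*;
import java.util.stream.*;

// Illustrative only: a hand-rolled "MapReduce" word count in plain Java,
// standing in for the kind of code DAS generates automatically.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for each word in an input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle + reduce phase: group pairs by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("big data big deal", "data data data");
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : input) mapped.addAll(map(line));
        System.out.println(reduce(mapped)); // {big=2, data=4, deal=1}
    }
}
```

Multiply that by every join, filter, and aggregation in a real analysis, and the value of generating it automatically is obvious.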
DAS is an open book at every stage of the data pipeline, with plug and play support at each phase – integration, analysis and visualization. So you can pick and choose, plug in your own custom analytic functions, use your visualization tool of choice, or simply use DAS end-to-end as an integrated stack. What’s not to like?
I attended a Hadoop user meeting in San Francisco a few weeks back, when I first joined Datameer. One of the moderators of the small-group discussion was a local IT/development manager responsible for supporting a 200-node Hadoop cluster. He made a very telling comment:
“The other day, one of my business guys asked me why I couldn’t just ship him a half a petabyte of .csv files?”