Eckerson Optimizing Your Data Pipeline

Eckerson Data Pipeline White Paper

Data-driven organizations need a modern data pipeline to ensure business users of all stripes have the right data at the right time to make decisions and take expedient actions. This report provides an overview of the modern data pipeline and how to implement it.

Ebook Background

About the Eckerson Data Pipeline White Paper

Multiple, complex data pipelines can quickly become chaotic under the pressures of agile development, data democratization, self-service, and analytics groups scattered across the organization. The increased difficulty of governance and uncertainty of data usage are only the beginning. From enterprise business intelligence to self-service analytics, data pipeline management should ensure that data analysis results are traceable, reproducible, and of production strength. Robust pipeline management works across a variety of platforms, from relational to Hadoop, and recognizes today’s omnidirectional data flows, where any data store may function in both source and target roles.
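As a rough illustration of what "traceable" can mean when any data store may act as both source and target, the sketch below records data flows in a small lineage graph and walks it upstream. It is a minimal sketch only: the `LineageGraph` class and the store names (`crm_db`, `warehouse`, `data_lake`, `sales_dashboard`) are hypothetical and do not come from the white paper or from any particular product.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical helper: records which data store feeds which."""

    def __init__(self):
        # Maps each target store to the set of stores that feed it directly.
        self._sources = defaultdict(set)

    def add_flow(self, source, target):
        """Register one data flow; a store may be both a source and a target."""
        self._sources[target].add(source)

    def upstream(self, store):
        """Return every store that directly or indirectly feeds `store`."""
        seen = set()
        stack = [store]
        while stack:
            current = stack.pop()
            for src in self._sources.get(current, ()):
                if src not in seen:  # guards against cycles in the flows
                    seen.add(src)
                    stack.append(src)
        return seen


# Illustrative flows (all names are assumptions made for this example).
graph = LineageGraph()
graph.add_flow("crm_db", "warehouse")
graph.add_flow("warehouse", "data_lake")
graph.add_flow("data_lake", "warehouse")  # the lake also feeds the warehouse
graph.add_flow("warehouse", "sales_dashboard")

print(sorted(graph.upstream("sales_dashboard")))
```

Because the traversal keeps a set of visited stores, flows that loop back on themselves (here, the warehouse and the data lake feeding each other) do not cause an endless walk.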

Waves of Acceleration

Infrastructure. Self-Service. Artificial Intelligence.

Data Refinement

Preparation tools to create end-to-end data pipelines.

Evolution of Technology Markets

Managed Services. Data Pipeline in a Box.

Standardize Your Platform

Adapt rapidly to changing opportunities.

THE MODERN DATA PIPELINE

Most data analytics leaders oversee a data infrastructure that was designed and built before the technology acceleration of the past decade. Most recognize they need to modernize their data analytics platforms to keep up with trends in big data, self-service, cloud computing, advanced analytics, and artificial intelligence. The question is how.

Greenfield. Traditional approaches to business intelligence (BI) and data warehousing (DW) were not designed to support large volumes of multi-structured data, advanced analytics, and other new use cases. Consequently, some organizations choose to replace relational databases, ETL tools, and SQL queries with newer technologies based on Hadoop, NoSQL databases, cloud platforms, and a host of open-source projects. These are typically companies with ample technical talent or new, greenfield environments.

Hybrid. Most organizations, however, opt to blend the best of the old and new worlds. They supplement legacy environments with new big data platforms or implement hybrid environments—usually in the cloud—that offer the flexibility, scalability, and elasticity of big data environments using traditional tools, such as relational databases, ETL tools, and SQL processing. This allows organizations to leverage their existing investments in people and technology while gaining the benefits of new techniques and technologies.

GETTING FROM DATA TO ANALYTICS

Creating a modern data pipeline that supplies a business with a steady stream of integrated, consistent data for exploration, analysis, and decision-making takes effort and time. The six key ingredients are as follows:

Assessment. Determine your current state. What is your current architecture? The assessment doesn’t need to be lengthy or highly formal, but you need to know where you are today before plotting a course to the future.

Vision. Then develop a vision for harnessing the company’s data to achieve strategic objectives. This vision consists of a strategy and plan showing how the company will grow revenues, increase customer satisfaction, reduce costs, and lower risks by better managing its data assets.

Roadmap. Next, analyze the differences between your current architecture and your vision of a modern analytics architecture. What technologies do you need to enable the modern analytics ecosystem? What management and cultural shifts are needed? Identify and prioritize the gaps, then build a roadmap.

Leadership. With a data strategy (an assessment, vision, and roadmap) in place, the company needs an experienced, respected data leader to execute the vision. Many companies appoint chief data officers (CDOs) who report to the CEO to demonstrate the importance of data to the organization.

Organization. A CDO needs to align various groups in the organization that typically work independently or at cross purposes. The CDO needs to synchronize initiatives and teams across the organization, eliminating data silos and data turf wars.

Change Management. The creation of a modern analytics pipeline requires as much emotional intelligence as data intelligence, if not more. Getting business and technical managers and users to change the way they acquire, use, and act on information is not easy, especially if they have to learn new tools and processes and adhere to new policies for governing data.
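The gap analysis described under Roadmap above can be pictured as a simple set difference followed by a sort. The following is a minimal sketch under stated assumptions: the function `build_roadmap`, the capability names, and the priority numbers are all hypothetical and serve only to illustrate the idea of comparing the current state with the vision and ordering what is missing.

```python
def build_roadmap(current, target, priority):
    """Return capabilities in `target` but not in `current`, most urgent first.

    `priority` maps a capability name to a rank (lower means more urgent).
    Capabilities without a rank are placed last, in alphabetical order.
    """
    gaps = set(target) - set(current)
    return sorted(gaps, key=lambda cap: (priority.get(cap, float("inf")), cap))


# Illustrative inputs (all names and ranks are assumptions for this example).
current_state = {"relational warehouse", "ETL tooling"}
vision = {
    "relational warehouse",
    "ETL tooling",
    "cloud platform",
    "self-service analytics",
    "data governance",
}
ranks = {"data governance": 1, "cloud platform": 2}

roadmap = build_roadmap(current_state, vision, ranks)
print(roadmap)
```

In practice the ranks would come out of the assessment and vision work rather than a hard-coded dictionary, and the management and cultural shifts the text mentions would sit alongside the technology gaps.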

