New Paradigm for Enterprise Data Management: The Data Mesh

  • Benoite Yver
  • April 2, 2020

A data mesh is a new architectural paradigm for connecting distributed data sets to enable data analytics at scale. A data mesh solves the issues presented by centralized, monolithic data lakes and data warehouses by treating domain-based data as the end-product and allowing separate business domains to host and serve their datasets in an easily consumable way. 

Why Current Data Architecture Models are Failing

As data proliferates and becomes ever more ubiquitous, data lakes and warehouses are beginning to fail at some of the key functions they were designed to facilitate: cross-silo analysis and consumption.

Smaller organizations with minimally diversified data sets may still be able to centralize their data in an enterprise data warehouse. For larger companies with a large and ever-growing number of new and legacy data sources, as well as diverse data consumers, piping all the data into a single place becomes an endless project. ETL engineers can’t keep up with the added data sources, and existing data pipelines need constant maintenance with each subtle change or update to the data sources.

Besides, the centralized architecture of lakes creates a bottleneck between the data engineers and the business domain experts, causing domain knowledge to be lost and resulting in disconnected and frustrated source teams that feel locked out of data they should rightly own, use and process. Ultimately, it’s a structure that does not scale and does not deliver on the promise of creating a data-driven organization.

The Data Mesh Solution

Just as microservices have changed how we develop software by allowing applications to be broken down into independently built and maintained services, data meshes provide granular access to and control over highly distributed data from various domains.

Functioning similarly to a service mesh, a data mesh connects siloed data by creating a self-serve data infrastructure that stitches together data held across multiple locations and organizations. It accomplishes this by using a modern platform approach and treating domain data as the primary component of its architecture. Doing so ensures that data is highly available, easily discoverable, secure, and interoperable with the applications that need access to it. Data is no longer segregated into source and consumption patterns, and decentralized teams can use whatever data they need and then “feed the mesh” with their output.

For a data mesh architecture to work, the data product owners must ensure their data is discoverable, trustworthy, self-describing, interoperable, secure, and governed by global access control. Data lakes and warehouses can still live in this architecture, but instead of being the central focus and repository, they become just another node in the mesh.
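To make those properties more concrete, here is a minimal Python sketch of how a domain team might publish a data product to a shared catalog. It is an illustration only: `DataProduct`, `MeshCatalog`, and all field names are hypothetical and do not come from any particular data mesh platform. The description and schema stand in for "discoverable" and "self-describing", and the role check stands in for global access control.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes for one dataset."""
    name: str                   # unique address of the product in the mesh
    domain: str                 # owning business domain
    owner: str                  # accountable data product owner
    description: str            # human-readable summary, used for discovery
    schema: Dict[str, str]      # column name -> type, makes it self-describing
    allowed_roles: List[str]    # roles permitted to read the product
    # Callable that fetches rows from wherever the domain hosts the data.
    reader: Callable[[], List[dict]] = field(repr=False, default=lambda: [])


class MeshCatalog:
    """Hypothetical shared catalog: one place to register, find, and read products."""

    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        if product.name in self._products:
            raise ValueError(f"{product.name} is already registered")
        self._products[product.name] = product

    def discover(self, keyword: str) -> List[str]:
        """Return names of products whose name or description mentions keyword."""
        kw = keyword.lower()
        return sorted(
            p.name for p in self._products.values()
            if kw in p.name.lower() or kw in p.description.lower()
        )

    def read(self, name: str, role: str) -> List[dict]:
        """Enforce the same access rule for every product, then delegate to the owner."""
        product = self._products[name]
        if role not in product.allowed_roles:
            raise PermissionError(f"role {role!r} may not read {name}")
        return product.reader()


# Usage: the sales domain publishes its orders; an analyst finds and reads them.
catalog = MeshCatalog()
catalog.register(DataProduct(
    name="sales.orders",
    domain="sales",
    owner="sales-data-team",
    description="Completed customer orders",
    schema={"order_id": "int", "total": "float"},
    allowed_roles=["analyst"],
    reader=lambda: [{"order_id": 1, "total": 40.0}],
))
print(catalog.discover("orders"))                      # ['sales.orders']
print(catalog.read("sales.orders", role="analyst"))    # [{'order_id': 1, 'total': 40.0}]
```

In this sketch the data stays with the owning domain (behind `reader`), while the catalog holds only metadata and the access rule. A data lake or warehouse would be registered the same way as any other product.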

Data Mesh Use Cases

A data mesh unlocks the possibilities for various consumption scenarios across an organization, including machine learning, analytics, and data-intensive applications. 

With its architecture, you can create virtual data catalogs from a variety of data sources. You can also create virtual data warehouses and lakes for analytics and machine learning training. Perhaps most importantly, you will be able to connect cloud applications to sensitive data that lives on-premises, as well as to streaming or real-time data from devices.

In addition, application developers and DevOps teams will be able to query data from various data stores without worrying about how to access each one.
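As a rough illustration of the "virtual warehouse" idea, the Python sketch below joins two domain-owned sources at query time instead of first copying them into a central store. Everything in it is an assumption made for the example: `virtual_join`, `orders_source`, and `customers_source` are hypothetical names, and the in-memory lists stand in for real data stores that each domain would host.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Source = Callable[[], Iterable[Row]]   # a domain-owned way to fetch rows


def virtual_join(left: Source, right: Source, key: str) -> Iterator[Row]:
    """Join two sources on `key` when queried; no central copy is kept."""
    # Index the right-hand source by key so each left row is matched quickly.
    index: Dict[object, List[Row]] = {}
    for row in right():
        index.setdefault(row[key], []).append(row)
    # Stream the left-hand source and emit merged rows for every match.
    for row in left():
        for match in index.get(row[key], []):
            yield {**match, **row}


# Hypothetical sources: in practice these would call each domain's own store.
def orders_source() -> List[Row]:
    return [
        {"customer_id": 1, "total": 40.0},
        {"customer_id": 2, "total": 15.5},
        {"customer_id": 3, "total": 9.0},   # no matching customer record
    ]


def customers_source() -> List[Row]:
    return [
        {"customer_id": 1, "region": "EU"},
        {"customer_id": 2, "region": "US"},
    ]


rows = list(virtual_join(orders_source, customers_source, "customer_id"))
for row in rows:
    print(row)
```

The consumer calls one function and never needs to know where each source lives. This is an inner join, so the order for customer 3 is dropped because no customer record matches it.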

Summary 

A data mesh empowers your organization to escape the analytical and consumptive confines of monolithic data architectures and connects siloed data to allow machine learning and automated analytics at scale. 

With it, your company can become truly data-driven, leaving behind the issues of lakes and warehouses and replacing them with the power of data access, control, and connectivity.
