Adopting Best Practices for Cloud Data Management
- John Morrell
- November 21, 2019
Enterprises are increasingly adopting cloud-based data management and analytics solutions because of their ease of deployment, greater efficiency and elastic scalability. Yet organizations must adopt best practices for cloud data management in order to reap the benefits and accelerate their analytics initiatives.
Transforming Data With Intelligence (TDWI) has developed a Best Practices Report on Cloud Data Management to help guide professionals through the myriad of approaches and techniques for managing data in the cloud for analytics. As part of this study, TDWI interviewed data and analytics professionals across a wide range of industries and company sizes to understand their adoption and views on cloud data management (CDM).
The key findings of the study and the resulting best practices were very revealing and offer a guide to drive adoption of data and analytics in the cloud. Here are a few of the key findings.
Perceived Benefits and Barriers
An overwhelming majority of the interviewees, 96%, believe that CDM presents an opportunity for them. They see CDM as a way to scale their data, expand their analytics programs and draw business value from new assets in the cloud. Among the respondents, 64% saw analytics as a key use case for the cloud, with 47% citing reporting and 43% data science.
TDWI also surveyed users about the benefits they were seeing or expected from CDM. The two highest responses focused around performance and flexibility, with 51% citing scalability as an advantage and 44% seeing elasticity as a major benefit. A number of interviewees also saw the ability to expand their approaches to analytics and increase their return on investment as benefits, with 35% liking the ability to perform more advanced analytics, and 32% being able to better leverage data and analytics assets.
The most common barriers to adoption of CDM are nothing new to the world of data management and analytics – security and governance. The respondents cited data privacy (40%), data governance (38%), data security (36%) and the risk of exposing sensitive data such as PII (32%) as four key barriers. Two other challenges that ranked highly include maintaining a single version of the truth (31%) and difficulty in sharing data across parties (27%).
To break down these barriers, TDWI recommends a holistic approach to data governance with CDM. This means taking an overall governance approach and strategy regardless of where the data resides – on-premises and the cloud – and applying the rules evenly based on the governance requirements for the data at hand.
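The idea of applying rules evenly regardless of where the data resides can be illustrated with a minimal sketch. It assumes that access rules are keyed only by a dataset's classification, never by its location. The names Dataset, POLICIES and can_access, along with the roles and classifications, are hypothetical and are not taken from the TDWI report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    location: str        # e.g. "on_premises" or "cloud" (illustrative values)
    classification: str  # e.g. "pii", "internal", "public" (illustrative values)


# Hypothetical policy table: allowed roles depend on classification only.
POLICIES = {
    "pii": {"privacy_officer"},
    "internal": {"privacy_officer", "analyst"},
    "public": {"privacy_officer", "analyst", "guest"},
}


def can_access(dataset: Dataset, role: str) -> bool:
    """Apply the same rule wherever the dataset lives."""
    return role in POLICIES[dataset.classification]


cloud_customers = Dataset("customers", "cloud", "pii")
onprem_customers = Dataset("customers", "on_premises", "pii")

# The decision is identical for both locations because location is never consulted.
print(can_access(cloud_customers, "analyst"), can_access(onprem_customers, "analyst"))
# prints: False False
```

Because the location field plays no part in the decision, moving a dataset between on-premises and cloud platforms does not change who may use it.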
Hybrid Data Architectures Prevail
Organizations are not moving their data to the cloud wholesale. Doing so would be costly and time-consuming and could create security issues. Instead, TDWI sees successful CDM adopters using a Hybrid Data Architecture (HDA) as a means to speed delivery and mitigate risk.
Hybrid data is broadly distributed and persisted in numerous data platforms, whether on premises, in the cloud, or both. A well-implemented HDA, and any solution that supports one, takes a holistic approach to managing this data, with a clear understanding of where the data resides and how best to access and process it. It should provide a transparent optimization layer so that users don’t need to deal with the underlying details.
Hybrid data architectures provide key benefits from the new insights and innovative business processes they can deliver. CDM adopters need to be careful, though, as the diversity of an HDA can lead to complex architectures. A key best practice here is to choose solutions that simplify the architecture and its overall maintenance. A key component of that simplification is virtualizing access, which is discussed later.
Knowledge Around Data is a Key Enabler and Unifier
When data is centralized, there is at least a solid inventory of the assets available and, provided analysts have access, they can get to the data they need. In an HDA, data assets are dispersed across the landscape, potentially creating silos and limiting the analytics community’s knowledge and understanding of the data.
TDWI calls knowledge around data the semantic array, and it includes all relevant information about an asset, not just metadata. TDWI believes “modern users want to query, browse, and search semantic descriptions of data (which leads to accessing the data), a modern semantic facility must support multiple forms of indexing.”
As part of its CDM best practices, TDWI believes “drawing a holistic ‘big picture’ is critical for HDA success,” and capturing and sharing knowledge about the various assets is a key aspect of this. This not only removes architectural complexity, but also enables the analytics community to gain faster and greater value from the assets at their disposal in the HDA.
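One way to picture a shared inventory that supports both searching and browsing is a small catalog with two indexes over the same asset descriptions. This is only a sketch: the Catalog class, its methods and the sample assets are hypothetical, and a real semantic facility would hold far richer information than a one-line description.

```python
from collections import defaultdict


class Catalog:
    """Toy asset inventory with two indexes: by description word and by platform."""

    def __init__(self):
        self.assets = {}
        self.by_word = defaultdict(set)
        self.by_platform = defaultdict(set)

    def add(self, name, platform, description):
        self.assets[name] = {"platform": platform, "description": description}
        self.by_platform[platform].add(name)
        for word in description.lower().split():
            self.by_word[word].add(name)

    def search(self, word):
        """Keyword search over descriptions, across all platforms."""
        return sorted(self.by_word.get(word.lower(), set()))

    def browse(self, platform):
        """List every asset registered for one platform."""
        return sorted(self.by_platform.get(platform, set()))


catalog = Catalog()
catalog.add("orders_dw", "on_premises", "Daily customer orders")
catalog.add("clickstream", "cloud", "Raw customer web events")

print(catalog.search("customer"))  # prints: ['clickstream', 'orders_dw']
print(catalog.browse("cloud"))     # prints: ['clickstream']
```

The keyword search finds assets on both platforms, which is the point: the inventory spans the silos even though the data itself stays where it is.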
Virtualizing Assets is a Key Best Practice
Virtualization provides an abstraction layer and underlying services to provide optimized access to assets where they reside, without moving them. For an HDA, virtualization can unify the diversity and simplify the access to distributed analytics assets.
TDWI believes virtualization is a core component of an HDA and of CDM and should be a key best practice. Its compelling benefits include:
- Virtualized assets can be accessed on demand from the source, enabling fresh data for time-sensitive business processes.
- Virtual views are often designed to be business-friendly and can simplify access and exploration.
- Virtualization reduces data replication and relocation, reducing network and storage loads.
- A virtual layer supports fast prototyping and testing and can facilitate best placement across an HDA’s heterogeneous platforms.
Finally, note that the previously mentioned semantic-driven views, combined with virtual access, can bring together the silos of the HDA without the risk, cost, and distraction of time-consuming data migration and consolidation projects. In concert, capturing knowledge about the virtual assets and providing virtual access to them create the fastest possible path to delivering new analytics to the organization.
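The virtualization idea can be sketched as a thin layer that stores only a reader for each source and fetches rows on demand, rather than copying them. This is a minimal illustration under stated assumptions: VirtualLayer, register and query are hypothetical names, and an in-memory list stands in for a real on-premises or cloud platform.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Row = Dict[str, Any]


class VirtualLayer:
    """Keeps a reader per source; no data is copied into the layer itself."""

    def __init__(self) -> None:
        self._readers: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, reader: Callable[[], Iterable[Row]]) -> None:
        self._readers[name] = reader

    def query(self, name: str,
              predicate: Callable[[Row], bool] = lambda row: True) -> Iterator[Row]:
        # Rows are read from the source at query time, so results stay fresh.
        for row in self._readers[name]():
            if predicate(row):
                yield row


# Stand-in for a source system; in practice this would be a database or object store.
orders_on_prem = [{"id": 1, "amount": 120}, {"id": 2, "amount": 40}]

layer = VirtualLayer()
layer.register("orders", lambda: orders_on_prem)

large = list(layer.query("orders", lambda r: r["amount"] > 100))

# A new row lands in the source; the next query sees it without any reload step.
orders_on_prem.append({"id": 3, "amount": 300})
fresh = list(layer.query("orders", lambda r: r["amount"] > 100))

print([r["id"] for r in large], [r["id"] for r in fresh])  # prints: [1] [1, 3]
```

The second query picks up the new row because the layer holds no replica of the data, which mirrors the freshness and reduced-replication benefits listed above.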
The TDWI Cloud Data Management Best Practices Report provides a solid blueprint for organizations to deliver on new analytics initiatives in the cloud. Companies can reap the previously mentioned benefits such as scalability, elasticity and faster delivery of new analytics to support the business. Two critical components of these best practices are shared inventory and knowledge of HDA assets to facilitate sharing, reuse and collaboration, and virtualizing access to the assets to eliminate costly and risky data movement processes.
The Datameer Spotlight Virtual Analytics Hub is a SaaS solution that allows analytics teams to find, create, collaborate on and publish trusted analytics assets in complex hybrid landscapes. Datameer Spotlight provides unified access across analytics silos, increases use of analytics assets and furthers data knowledge to build trust and rapidly answer new business questions. To learn more, visit the Datameer Spotlight website or test drive Datameer Spotlight by registering for a free trial.