The Secrets of Migrating to Cloud Analytics
- John Morrell
- February 12, 2020
With the market for cloud analytics predicted to reach $37.8 billion by 2023 (a CAGR of 22.3%), many organizations find themselves moving analytics workloads to the cloud. By doing so, they can take advantage of the easy adoption of cloud services, the instant scalability and elasticity of cloud resources, and the analytics services’ pay-as-you-go consumption model.
Migrating to Cloud Analytics
To date, migrating to cloud analytics has been fraught with challenges. Although cloud-based analytics services are easy to adopt, engineering the processes that support them has proven difficult, as outlined in this TDWI Best Practices Report: Cloud Data Management. Many large organizations have created complex 24- to 36-month roadmaps for migrating to cloud analytics.
Let’s look at some of these challenges and explore some approaches that streamline and shorten cloud analytics migration.
What Do I Get from Cloud Analytics?
Cloud analytics offers many crucial advantages over its on-premises counterparts, including:
- Instant-on and consume as you go – You can start up new analytics services in the cloud almost instantaneously and pay for only what you use. This eliminates the need for upfront investments and offers a more economical model.
- Elasticity & Scalability – Analytic workloads often vary from lightweight to heavyweight, depending on the job at hand. The cloud provides an elastic resource model that can auto-adjust to the workload needs of the moment.
- Managed services – Cloud analytics are run as a service, eliminating expensive and resource-intensive administration. It dramatically reduces the cost of operating your
- Modernization – Many cloud analytics services have newer architectures and tooling that simplify your approach and processes. This allows you to modernize your data and analytics infrastructure and gain greater flexibility.
What Are the Challenges?
Yet, as with anything new, cloud analytics and data management present their own challenges, some of which are outlined in this TDWI Best Practices Report: Cloud Data Management. Among the challenges are:
- Security – Of the people surveyed in the TDWI report, security and data privacy were the most often cited challenges. This is not a knock against cloud security, which is strong, but rather a recognition that organizations need to revamp security processes for the cloud.
- Governance – The second biggest challenge cited in the TDWI report was governance. As with security, this challenge arises because organizations need to revamp and rethink governance processes as they move to the cloud.
- Data movement – Perhaps the most significant challenge organizations face with cloud analytics is moving the data into the cloud for analysis. Existing data movement processes need to be re-engineered, and new ones need to be created for new data. Processes to move data into the cloud can also be resource-intensive.
These three factors lead to one big overall worry for CDOs and analytics leaders – risk. Security, governance, and data movement create a degree of risk that needs to be weighed and overcome to reap the benefits of cloud analytics.
Cloud Analytics Workloads
In general, there are two types of analytic workloads: reporting and ad-hoc analytics. Reporting tends to be highly repeatable, with well-known data and well-defined metrics. Ad-hoc analytics addresses new questions raised by management based on new business situations and business processes where conditions can vary greatly. For ad-hoc analytics, you don’t always know what data will help answer your question, and the analysis often needs to deliver an answer or a set of actions rather than metrics.
While both are viable in the cloud, ad-hoc analytics fits especially well with the cloud compute and resource model. Ad-hoc analytics workloads can vary widely, from small to large, depending upon the business process in play, the level of exploration needed, and the data required to support the analysis. This variability plays very nicely with the elasticity, consume-as-you-go, and self-service models of cloud analytics.
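To make the fit between variable workloads and the consume-as-you-go model concrete, here is a minimal sketch in Python. It compares a fixed cluster sized for peak demand with an elastic one billed only for the node-hours used. The hourly rate, the node counts, and the function names are all hypothetical, chosen only for illustration; they do not reflect any vendor's pricing or billing rules.

```python
# Hypothetical figures for illustration only; not real vendor pricing.
RATE_PER_NODE_HOUR = 2.0  # assumed cost of one compute node for one hour


def fixed_capacity_cost(hourly_demand, rate=RATE_PER_NODE_HOUR):
    """Cost when capacity is provisioned for the peak and runs every hour."""
    peak_nodes = max(hourly_demand)
    return peak_nodes * len(hourly_demand) * rate


def elastic_cost(hourly_demand, rate=RATE_PER_NODE_HOUR):
    """Cost when capacity follows demand and only used node-hours are billed."""
    return sum(hourly_demand) * rate


# Nodes needed in each of eight hours: mostly light, with one heavy burst.
demand = [1, 1, 2, 8, 8, 2, 1, 1]

fixed = fixed_capacity_cost(demand)
elastic = elastic_cost(demand)
print(f"fixed: {fixed:.2f}, elastic: {elastic:.2f}")
```

Under these assumptions, a flat workload costs the same in both models, and the gap widens as demand becomes burstier, which is the pattern described above for ad-hoc analytics.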
A Better Approach to Cloud Analytics
With reporting use cases, since the datasets are well known, the data engineering processes to push the data into a cloud data warehouse can be straightforward. To address the concerns mentioned earlier, establishing governance and security for this use case is also well understood. The process is even easier if you migrate existing reporting use cases (and methods) to the cloud for cost, flexibility, and modernization reasons.
Ad-hoc analytics is a very different beast. Moving all the data that could answer any potential question would be an arduous and potentially futile task for the data engineering team. It would require either (a) an overly lengthy (and potentially error-prone) requirements and delivery process, or (b) a continuous process where analysts request data and then wait while the data engineers work through a lengthy queue of requests.
The ability to find, discover, and collaborate around different datasets while leaving the data in place offers the best of both worlds – all the advantages of the cloud while mitigating data movement risks. This facilitates faster ad-hoc analytics and lets teams respond more quickly and effectively to the business as it poses new questions.
A SaaS virtual analytics hub offers all the cloud benefits, supporting the instant-on, elasticity, and modernized platform needs. Simultaneously, it mitigates the security, governance, and processing risks of moving data into the cloud. Using a SaaS virtual analytics hub allows analysts to be instantly productive for their ad-hoc analytics needs – they don’t have to wait for data engineering processes that move data to the cloud.
Once ad-hoc questions are answered, the proper datasets are discovered, and the analysis is nailed down and trusted, the data and analytic models are well known. If management then decides to turn the ad-hoc analysis into a repeatable report, data teams can migrate the required models and data into a cloud data warehouse, where a consistent, repeatable process can be performed and a precise governance model can be put in place.
Cloud analytics offers many benefits but also numerous challenges. Organizations would like to reap the benefits of migrating to cloud analytics, but through a fast, smooth process that mitigates risk.
A promising path to cloud analytics is to deploy ad-hoc analytics workloads in the cloud while leaving data in place and making it discoverable, sharable, and reusable by the analytics community. This allows the organization to take advantage of the instant-on, elastic, and modernized platform the cloud offers while eliminating the need for extensive upfront migration processes to move data into the cloud and the risk this brings.
This approach will help new ad-hoc business questions get answered faster and more easily, delivering immediate benefit and business impact. Once the questions turn into repeatable reporting use cases, the supporting data and models can be migrated into a cloud data warehouse, and the delivery process will be simpler and faster. The combination of these two will deliver even greater ROI to your analytic programs.
The Datameer Spotlight Virtual Analytics Hub is a SaaS solution allowing analytics teams to find, create, collaborate, and publish trusted analytics assets in complex hybrid landscapes. Datameer Spotlight provides unified access across analytics silos, increases the use of analytics assets, and offers a collaborative environment that furthers data knowledge to build trust and rapidly answer ad-hoc business questions.
Datameer Spotlight facilitates faster, more detailed ad-hoc cloud analytics because it helps teams access data virtually without moving it. It allows teams to discover, share, and reuse analytics assets regardless of where they reside (on-premises, in the cloud, or in a hybrid environment) and runs as an elastic SaaS service. Datameer Spotlight is open and interoperable with cloud data warehouses such as Snowflake and Redshift, as well as your favorite tools such as Tableau or Looker.