Get all the capabilities and benefits of a highly optimized distributed query engine like Starburst Data, with true self-service tools and a rich, collaborative data catalog. Datameer Spotlight lets your analytics teams discover, model, consume, and govern ANY data they require on their own, without the need for IT. The underlying distributed query engine eliminates the need for ETL while ensuring high performance and rapid response times. The result is faster, trusted insights, and immediate time to value.
Starburst Data offers Starburst Enterprise, a fully supported, production-tested, and enterprise-grade distribution of Presto. Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was initially developed at Facebook, then open-sourced. It was designed and written from the ground up for interactive analytics.
Starburst Enterprise takes Presto and improves performance and security while making it easy to deploy, connect, and manage the Presto environment. Starburst attempts to provide a single database-like interface to multiple remote data sources, making them appear as one database for consolidated analytics without consolidating or moving data.
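To make the "one database over many sources" idea concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in. It is an analogy only, not Starburst or Presto code: the two attached in-memory databases (`crm` and `billing`), their tables, and the sample rows are all hypothetical, and a real engine would reach remote systems through connectors rather than attached databases.

```python
import sqlite3


def federated_revenue_by_customer():
    """Join tables that live in two separate databases through one connection."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent in-memory database; they stand in
    # for two remote data sources (hypothetical names).
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    # A single query spans both sources; no data is copied between them first.
    rows = conn.execute(
        """
        SELECT c.name, SUM(i.amount) AS revenue
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows
```

The point of the sketch is the query shape: the analyst addresses each source by a qualified name and joins across them as if they were schemas in one database.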
The Starburst Data product has three major components:
Starburst Enterprise supports over 30 connectors to remote data sources. It offers the Presto data connectors from the open-source distribution, plus a few additional ones, including DB2, SAP Hana, Snowflake, and Delta Lake (Databricks). Starburst Data has re-engineered some of the Presto connectors for better performance.
The distributed query engine and cost-based optimizer are the heart of the Starburst Data product. Starburst Data claims it provides 100 times better price-performance than the query engine in open source Presto. It also has elastic auto-scaling using Kubernetes and cloud compute clusters to manage resource utilization and supports caching for further response time optimization.
Starburst provides standardized JDBC and ODBC interfaces, a Tableau web connector, a PowerBI DirectQuery driver, and a command-line interface on top of the platform. It also provides a REST API and includes the Apache Superset SQL query tool for users. Starburst supports a robust, standard SQL language interface.
Datameer Spotlight is a virtual data management platform with a distributed query engine and optimizer, self-service tools, and a collaborative data catalog that gives analytics teams easy access to all enterprise data assets, regardless of type or location. Spotlight flips the analytics data model on its head, eliminating both the costly ETL and data replication that analytics usually requires and the time wasted waiting for data.
Spotlight lets analysts quickly discover, create, share, and collaborate on data assets, building knowledge and trust along the way. It provides a single place where analytics teams can quickly discover all these analytics assets and understand which best solve their problem to produce actionable results promptly. It provides an environment where teams can share and reuse assets, collaborate to form new assets, and increase knowledge using familiar social media-like features and AI-augmented information about asset utilization.
Under the covers, Spotlight provides a scalable, performant virtual data query and access environment that brings together all the data analysts need without the need to ETL or replicate data. Spotlight is a SaaS-managed service that does not require IT administration and uses patent-pending optimization techniques and elastic compute architecture to maintain performance and scale.
Spotlight increases the ROI on your data, BI, and analytics investments. It works with any data source you may have – databases, data warehouses, data lakes, files, and applications – and any BI, analytics, and data science tool used. Best of all, the virtual query engine eliminates the need for ETL, allowing you to lower your data integration costs.
At its core, Starburst Enterprise is a distributed query engine that makes data from multiple data sources appear as if in one database. Spotlight is purpose-built to accelerate analytics with a combination of a highly optimized virtual data management server, a broad suite of connectivity to any data, and a collaborative catalog for easy data discovery.
Spotlight and Starburst Data have a few things in common:
Spotlight provides a high-performance distributed query engine similar to that of Starburst Enterprise, making distributed data look like a single database to users and ensuring high-performance queries on large datasets with fast response times. Beyond this, Spotlight offers several key differentiated capabilities that Starburst Enterprise does not, allowing Spotlight to facilitate faster analytics at a lower cost:
Starburst Enterprise looks and acts like a database. As such, it requires IT and data teams with expert skills to configure it, add data assets, and maintain it – similar to a data warehouse. This adds to the IT and data teams’ burden and would be just another project in their backlog.
While implementation projects for Starburst may be faster than those for a traditional data warehouse, they will have similar delays and issues around upfront requirements gathering, project planning and implementation, user acceptance testing, and requirements mismatches. This adds risk to projects, increases the overall cost of ownership, and extends the solution’s time-to-value.
Spotlight provides all the capabilities and benefits of a distributed query engine under the covers but hides and automates all the underlying data facilities with self-service tools and packaging. It removes all the administrative baggage and burden of a data engine, and therefore the need to use IT and data experts.
Spotlight eliminates the IT dependency of analytics teams to work with data, reduces analytics costs, and delivers rapid time to value. Analytics teams gain truly self-service analytics, while IT teams are freed for more worthwhile projects.
As previously mentioned, Starburst Enterprise offers the Presto data connectors, plus a few additional ones – 30 or so in all. The main objective is to facilitate analytics on your existing analytics data sources – data warehouses, data marts, and data lakes. If you need to analyze data from internal applications, SaaS applications, or cloud services, you’ll have to ETL this data into another database and configure it with Starburst.
Spotlight has over 200 connectors to a wide variety of data sources: databases, data warehouses, cloud data warehouses, analytical data sources, SaaS applications, cloud services, and more. With Spotlight, you can directly access all of this data, combine it with other sources, and use it immediately in your analytics. Spotlight’s objective is to facilitate cloud-based analytics across ANY and ALL of your data, supporting analytics of any form.
Since it acts as a database, Starburst Enterprise requires manual configuration to add data assets (tables and schemas) from remote data sources. Additional “modeling” to create views that combine data requires SQL DDL coding. Both steps need IT and data experts and are not self-service to the analysts.
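As a rough illustration of what this hand-written "modeling" step looks like, the sketch below defines a view that pre-joins two tables. It runs against Python's built-in `sqlite3` module purely as a stand-in; the table names, columns, and rows are hypothetical, and the exact DDL dialect and catalog naming in a Starburst deployment would differ.

```python
import sqlite3


def build_combined_view():
    """Create a view that combines two tables, then query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL);
        CREATE TABLE customers (customer_id INTEGER, region TEXT);
        INSERT INTO orders VALUES (1, 10, 20.0), (2, 11, 35.0), (3, 10, 5.0);
        INSERT INTO customers VALUES (10, 'EMEA'), (11, 'APAC');

        -- The hand-coded DDL: someone has to know the join keys and write this.
        CREATE VIEW sales_by_region AS
        SELECT c.region, SUM(o.total) AS total_sales
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        GROUP BY c.region;
        """
    )
    rows = conn.execute(
        "SELECT region, total_sales FROM sales_by_region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows
```

Even in this small case, the author of the view needs to know the schemas, the join keys, and SQL, which is the dependency on data experts described above.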
Spotlight has self-service visual tools that allow analysts to model data specific to their needs without requiring IT or data experts’ help. Spotlight provides a codeless, graphical approach to modeling through its intuitive point-and-click interface. Spotlight introspects and catalogs the objects from your sources, lets you search and discover the right assets for your analysis, and has AI-driven recommendations to guide the modeling process.
Starburst Enterprise does not provide a data catalog, only a database-style metadata store that contains technical metadata accessible via SQL. The IT/data team maintains the metadata store, and there are no facilities for analytics team members to add what they know about the data.
Spotlight contains a rich data catalog and semantic layer with tools that allow analytics teams and business users to add their wealth of knowledge to the catalog. Beyond the physical metadata, users can provide information about the data, including tags, descriptions, and comments. They can also certify assets, provide custom properties, and add business-level metadata. Spotlight supplements this by capturing information on where an asset is referenced, who is using it, and how often it is used.
Starburst Enterprise does not provide any means for analytics users to discover data assets that might be the right ones for their analysis. A virtual data warehouse in Starburst Enterprise is pre-designed and hard-wired for specific analytics tasks, offering no means for ad-hoc analysis. Analysts are told what data is in the system.
In Spotlight, users can browse and search for data assets of any kind and explore them to determine the best fit for their analysis. Nothing is hard-wired. If users find certain assets that work but need to add others and combine them to suit their needs, they can easily do so in the self-service toolset.
Spotlight allows users to perform faceted search across any information in the data catalog – names, descriptions, tags, custom properties, and more. Search results can also be filtered by who is using a dataset (owners and collaborators) and other usage information. Spotlight provides a detailed data preview, and users can open a dataset in their favorite BI tool from within Spotlight to explore it visually.
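For readers unfamiliar with the term, the following sketch shows the general idea of faceted search over catalog entries: a free-text match combined with filters on tags and owner, plus per-tag counts for the result set. It is a generic illustration, not Spotlight's implementation or API; the `CatalogEntry` class, the `faceted_search` function, and the sample catalog are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """A hypothetical catalog record with a few searchable properties."""
    name: str
    description: str = ""
    tags: set = field(default_factory=set)
    owner: str = ""


def faceted_search(entries, text="", tags=None, owner=None):
    """Return matching entries plus tag counts (facets) for the result set."""
    wanted = set(tags or [])
    needle = text.lower()
    hits = [
        e for e in entries
        # Free-text match on name or description (empty text matches all).
        if (needle in e.name.lower() or needle in e.description.lower())
        # Every requested tag must be present on the entry.
        and wanted <= e.tags
        # Optional filter on who owns the asset.
        and (owner is None or e.owner == owner)
    ]
    facets = Counter(tag for e in hits for tag in e.tags)
    return hits, facets


CATALOG = [
    CatalogEntry("orders_2023", "Cleaned order lines", {"sales", "certified"}, "ana"),
    CatalogEntry("web_clicks", "Raw clickstream", {"marketing"}, "raj"),
    CatalogEntry("order_returns", "Returned orders", {"sales"}, "raj"),
]
```

Calling `faceted_search(CATALOG, text="order", tags=["sales"])` narrows the catalog to the two sales-tagged order datasets, and the returned facet counts tell the user which further tags would refine the result.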
Starburst Enterprise does not offer capabilities for analytics teams to share their knowledge and collaborate on data assets. It is merely a virtual data warehouse maintained by IT and data teams.
Spotlight allows users to work together in shared workspaces to collaborate, add knowledge (tags, properties, etc.), and create additional shared assets. It also supports social media-like features around assets. The owner can add collaborators, and users can request to follow or fully collaborate on an asset. Once added, followers will receive notifications in their activities inbox. Collaborators can fully exchange comments and notifications on assets.
Data governance goes beyond security, allowing organizations to understand what their data assets and analytics are made of, their meaning, who is using them, and how they are being used. In many organizations, data governance is needed to meet regulatory requirements.
Starburst Enterprise provides no data governance features. Any data governance on top of Starburst would require manual tasks or external data governance tools – both coming at a high cost.
Spotlight contains several features to maintain governance, including:
Much of the work to configure, operate, and maintain Starburst Enterprise is done via manual configuration files and scripting. Starburst requires IT teams to run it and database tuning teams to optimize it continuously.
Spotlight is a SaaS-managed service that requires minimal operational administration, particularly for performance and scale. The minimal configuration and operational tasks needed are performed via a graphical console. Under the covers are a self-tuning distributed query optimizer and execution engine, easy caching service, and managed Spark clusters that are elastic and can auto-scale to your environment’s needs. This ensures you get the highest performance and fastest response time with low operational overhead.
Spotlight delivers all the capabilities and benefits of a highly optimized distributed query engine like Starburst Data, with true self-service tools and packaging and a rich, collaborative data catalog. The entirely self-service environment lets analytics teams determine their own destiny, reduces the need for IT and data team involvement, and eliminates risky upfront data design projects.
Spotlight virtually connects directly to over 200 different data sources, offering much broader direct access to data than Starburst Data. Spotlight also provides self-service visual modeling and a rich, easy-to-use data catalog and semantic layer, facilitating faster discovery, knowledge-sharing and collaboration, and better data governance. And Spotlight’s automated SaaS services dramatically reduce the administrative overhead that Starburst Data places on your team.
| Datameer Spotlight | Starburst Data Enterprise |
|---|---|
| Spotlight is designed to be a completely self-service data environment for analytics teams, hiding all of the underlying data engine intricacies and eliminating the need for IT and data expert involvement. | Starburst looks and acts like a database, with no self-service tools. It requires a hefty amount of work from IT teams with heavy data expertise to configure and manage. |
| With connectors to over 200 different sources, Spotlight lets teams work with ANY data without the need for ETL. | Starburst only has connectors to databases and data lakes, requiring ETL from other sources such as on-premises apps, SaaS apps, and cloud services. |
| Spotlight provides a self-service visual data modeling environment where analysts can organize the data they need. It is entirely graphical and requires no coding. | Data experts need to design objects in Starburst by importing from back-end data sources and using SQL DDL. |
| Spotlight contains a user-friendly data catalog and semantic layer with physical metadata, tagging, descriptions, comments, custom properties, business-level metadata, and usage information. | Starburst does not have a data catalog, only a physical data meta-store. |
| Spotlight allows users to quickly discover and explore assets using faceted-search across names, descriptions, tags, custom properties, and any item in the catalog. It also allows users to see who is using assets and how they are used to determine fit for their project. | Starburst has no search and discovery features. Users must know what assets are in the system or write SQL queries against the meta-store. |
| Spotlight allows users to collaborate in shared workspaces, add knowledge (tags, properties, etc.), and create additional shared assets. It also supports social media-like features around assets. | Starburst has no collaboration features. |
| To maintain good governance, Spotlight maintains multiple forms of metadata about assets, full lineage, and usage auditing. It maintains multiple system-level and user-set properties such as status, which can define an asset's state. | Starburst has no data governance features. |
| Spotlight is a SaaS managed service that requires minimal operational administration. All configuration is performed via graphical tools. Under the covers are managed Spark clusters that are elastic and can auto-scale to your environment's needs. | Starburst requires a hefty amount of manual configuration and administration, raising operational costs. |