Datameer Spotlight Versus Dremio

Go beyond data lakes!  Datameer Spotlight lets you efficiently manage, catalog, and query ANY data from across your enterprise to perform ANY form of analytics.  Your analytics community can quickly discover, model, consume, and govern data for analytics on an automated SaaS-based service that delivers faster analytics, trusted assets, and immediate time to value.

Looking for a Dremio alternative? Try Datameer Spotlight free today.

What is Dremio?

Dremio is a data lake engine that allows you to organize data dispersed in your data lake and perform much faster queries on it.  Data lakes have historically been very disorganized and suffered from poor query performance.  Dremio attempts to alleviate both pain points with a self-service semantic layer that maps the underlying data and a robust in-memory query engine.

How Does Dremio Work?

Dremio has three major components:

  • A semantic layer and repository that maps queryable data models to the raw data in the data lake and stores additional user-provided information about the data.
  • A web-based data modeling tool that allows users to see existing queryable data models and create new ones based on the raw data in the data lake.
  • An Apache Arrow-based in-memory query engine that speeds data lake queries and reduces the compute resources needed for those queries.

Data analysts and data engineers work in the toolset to create the right analytics-ready datasets based on the raw data.  At the core of each virtual dataset is a SQL query that defines its structure and the physical datasets (raw data) it draws on.
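The idea of a virtual dataset defined by a SQL query can be sketched with an ordinary SQL view.  The example below is only an illustration using Python's built-in sqlite3 module, not Dremio's own interface; the table raw_orders, the view sales_by_region, and the helper query_sales_by_region are hypothetical names made up for this sketch.

```python
import sqlite3

# Hypothetical raw data standing in for a physical dataset in a data lake.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "EMEA", 120.0), (2, "EMEA", 80.0), (3, "APAC", 50.0)],
)

# The "virtual dataset": only the SQL definition is stored, no data is copied.
conn.execute(
    """
    CREATE VIEW sales_by_region AS
    SELECT region, SUM(amount) AS total_amount
    FROM raw_orders
    GROUP BY region
    """
)


def query_sales_by_region(connection):
    """Return the view's rows as (region, total_amount), sorted by region."""
    return connection.execute(
        "SELECT region, total_amount FROM sales_by_region ORDER BY region"
    ).fetchall()


rows = query_sales_by_region(conn)
```

Because the view holds a query rather than data, any row later added to raw_orders shows up the next time the view is queried.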

Dremio can use data from two major areas: data lakes and databases.  Data lake data sources include files, cloud object stores (AWS S3, Azure ADLS, etc.), and Hadoop data stores (Hive, HBase, etc.).  It can also connect to and query from typical databases like Oracle, Teradata, Microsoft SQL Server, MySQL, Amazon Redshift, etc.

The semantic layer is the repository of datasets that analysts can query for their analytics.  Users can see metadata about the datasets and derived semantic information such as transformations and data lineage.  They can also add user-provided “Wiki-style” descriptions for datasets and spaces (collections of datasets) and tags to datasets.  There are limited search facilities allowing simple search on metadata and tags.

The query engine facilitates SQL-based queries on the data, based on the datasets in the semantic layer.  It is based on the Apache Arrow open source project.  The query engine uses data reflections (materialized views), in-memory caching, and pipelining to accelerate queries and performance.
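The trade-off behind a materialized view can be sketched in a few lines.  This is a generic illustration with Python's built-in sqlite3 module and does not reproduce how Dremio builds or refreshes reflections; the names raw_events, clicks_reflection, refresh_reflection, and clicks_per_user are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id INTEGER, clicks INTEGER)")
conn.executemany("INSERT INTO raw_events VALUES (?, ?)", [(1, 3), (1, 4), (2, 5)])

# Hypothetical definition of the expensive query to be accelerated.
AGG_SQL = "SELECT user_id, SUM(clicks) AS clicks FROM raw_events GROUP BY user_id"


def refresh_reflection(connection):
    """Rebuild the materialized copy from the raw table."""
    connection.execute("DROP TABLE IF EXISTS clicks_reflection")
    connection.execute(f"CREATE TABLE clicks_reflection AS {AGG_SQL}")


def clicks_per_user(connection):
    """Answer the query from the materialized copy instead of the raw data."""
    return connection.execute(
        "SELECT user_id, clicks FROM clicks_reflection ORDER BY user_id"
    ).fetchall()


refresh_reflection(conn)
before = clicks_per_user(conn)

# New raw data is not visible until the materialized copy is refreshed.
conn.execute("INSERT INTO raw_events VALUES (2, 10)")
stale = clicks_per_user(conn)
refresh_reflection(conn)
fresh = clicks_per_user(conn)
```

Queries against the precomputed copy avoid rescanning the raw data, but the copy goes stale until a refresh job runs, which is why refresh schedules have to be defined and managed.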

What Is Datameer Spotlight?

Datameer Spotlight is a virtual data management platform and data catalog that gives analytics teams easy access to all enterprise data assets—regardless of type or location.  Spotlight flips the analytics data model on its head, eliminating the need for costly ETL and data replication for analytics and wasted time waiting for data.

Spotlight lets analysts quickly discover, create, share, and collaborate on data assets, building knowledge and trust along the way.  It provides a single place where analytics teams can quickly discover all these analytics assets and understand which best solve their problem to produce actionable results promptly. It provides an environment where teams can share and reuse assets, collaborate to form new assets, and increase knowledge using familiar social media-like features and AI-augmented information about asset utilization.

Under the covers, Spotlight provides a scalable, performant virtual data query and access environment that brings together all the data analysts need without the need to ETL or replicate data.  Spotlight is a SaaS-managed service that does not require IT administration and uses patent-pending optimization techniques and elastic compute architecture to maintain performance and scale.

Spotlight increases the ROI on your data, BI, and analytics investments.  It works with any data source you may have – databases, data warehouses, data lakes, files, and applications – and any BI, analytics, and data science tool used.  Best of all, the virtual query engine eliminates the need for ETL, allowing you to lower your data integration costs.

Increase the ROI on your data, BI, and analytics investments with Spotlight.

Quick Comparison

At its core, Dremio is a query acceleration engine for cloud data lakes, with a self-service modeling and semantic layer.  Spotlight is purpose-built to accelerate any analytics (not just data lakes) with a highly optimized virtual data management server, a broad suite of connectivity to any data, and a collaborative catalog for easy data discovery.

Spotlight and Dremio have a few things in common:

  • Both platforms take a no-ETL, virtual-access approach to data – Dremio for data lakes and Spotlight for any data.
  • Both have a highly optimized distributed query engine that facilitates faster queries on distributed data.
  • Both have a self-service semantic layer to allow teams to create and find virtual, analytics-ready datasets effectively.

Beyond this, Spotlight offers several key differentiated capabilities versus Dremio that allow it to facilitate faster analytics of any kind:

  • Works with any data for any analytics
  • Easy visual data modeling
  • Faster discovery of data assets
  • Deeper cataloging and knowledge sharing
  • Team collaboration
  • Complete data security
  • Robust data governance
  • More automation and less administration

Speed up your analytics with Spotlight, start free today.

Any Data for Any Analytics

Dremio has a minimal set of data connectors that are extremely focused on data lakes (files, cloud object stores, and Hadoop) and databases (7 in total including Redshift, Oracle, Teradata, DB2, and others).  The main objective is to facilitate analytics on your data lake and combine it with supporting data from data marts and warehouses.

Spotlight has over 200 connectors to a wide variety of data sources: databases, data warehouses, cloud data warehouses, analytical data sources, SaaS applications, cloud services, and more.  Spotlight’s objective is to facilitate cloud-based analytics across ANY and ALL of your data, supporting analytics of any form.

Facilitate cloud-based analytics across ANY and ALL of your data with Spotlight.

Easy Visual Modeling

At the core of each dataset in the Dremio semantic layer is a SQL query.  Dremio does provide some easy menu-based operations for JOINs and basic transformations and offers visual data lineage.  But essentially, creating most virtual datasets requires SQL coding.

Spotlight has a codeless, visual approach to modeling through its intuitive point-and-click interface.  Spotlight introspects and catalogs the objects from your sources, lets you search and discover the right assets for your analysis, and has AI-driven recommendations to guide the modeling process.

No code necessary for these visual datasets. Try Spotlight free today.

Faster Discovery of Data Assets

Dremio has a catalog view that allows users to see Wiki content, tags, and fields for all the datasets.  Related datasets can be organized into spaces to make them easier to find.  Users can search for datasets based on field and table names or find all datasets with a specific tag.  To fully explore a dataset, it needs to be queried from an external tool.

Spotlight has a rich catalog and allows users to easily search across names, descriptions, tags, custom properties, and any item in the catalog.  Search results can also be filtered by who is using a dataset (owners and collaborators) and other usage information.  Spotlight provides a detailed data preview, and users can open a dataset in their favorite BI tool from within Spotlight to explore it visually.

Keep using your favorite BI tool from within Spotlight. Give it a try today.

Deeper Cataloging and Knowledge-Sharing

Dremio has an easy-to-use but limited data catalog.  For each dataset, the Dremio semantic layer contains three items: the physical metadata (including lineage and transformations), a Wiki-like description, and ad-hoc tags.

Spotlight contains a very rich data catalog and semantic layer.  Beyond the physical metadata, users can provide information about the data, including tags, descriptions, and comments.  They can also certify assets, provide custom properties, and add business-level metadata.  Spotlight supplements this by capturing information on where an asset is referenced, who is using it, and how often it is used.  Users can search for assets across technical metadata and all of the added knowledge.

Make searching for data easier with a rich data catalog. Try Spotlight free today.

Team Collaboration

With Dremio, users can share, reuse, and chain virtual datasets. Spaces can be used as an area where users can collect related datasets and perform rudimentary collaboration on a project.

Spotlight allows users to work together in shared workspaces to collaborate, add knowledge (tags, properties, etc.), and create additional shared assets. It also supports social media-like features around assets. The owner can add collaborators, and users can request to follow or fully collaborate on an asset. Once added, followers will receive notifications in their activities inbox. Collaborators can fully exchange comments and notifications on assets.

Collaborating with your team has never been so easy. Try Spotlight free today.

Complete Data Security

Dremio provides its own user-, group-, and role-based security. It can integrate with LDAP and SSO for enterprise security and access rights and supports Personal Access Tokens (PATs).  At the data level, users can have Edit rights (ability to modify a dataset), Query rights (ability to query/use a dataset), or no rights (will not see the dataset). Dremio also supports encryption on the wire.

Dremio DOES NOT validate user access rights to data objects with the originating data source.  This forces data stewards to re-implement data-level access controls in Dremio and creates potential security holes.  For example, a user with access to a physical dataset can create a virtual dataset containing the physical data and then grant access to that virtual dataset to another user who may not have access to the physical data.

Spotlight provides a deep set of security capabilities, including:

  • User-, group-, and role-based access controls
  • Integration with LDAP, Active Directory, and SSO/SAML
  • Separate controls for seeing an asset (visibility), editing it, and viewing the underlying data
  • Encryption at-rest and on-the-wire
  • Field/column masking and obfuscation

Spotlight is intentionally designed NOT to replicate the access control mechanisms already in place for the data.  Metadata visibility controls in Spotlight and data access controls are independent of each other.  The data source maintains access control to the data. The Spotlight user’s security credentials are passed down to the source at query execution time, eliminating potential conflicts and loopholes.  Even when it caches datasets, Spotlight always re-authenticates with the originating data sources before permitting access, maintaining consistent security across all data.
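The difference between the two access-control approaches described in this section can be shown with a toy model.  The sketch below is not either product's implementation; Source, VirtualDataset, and both read methods are hypothetical names that only mirror the behavior as it is described here.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical originating data source with its own access list."""
    rows: list
    allowed_users: set

    def read(self, user):
        if user not in self.allowed_users:
            raise PermissionError(f"{user} may not read the source")
        return list(self.rows)


@dataclass
class VirtualDataset:
    """Hypothetical virtual dataset layered on top of a source."""
    source: Source
    owner: str
    grants: set = field(default_factory=set)

    def read_with_layer_acl(self, user):
        # Model A: only the virtual layer's grants are checked, and the
        # source is then read under the owner's identity.
        if user != self.owner and user not in self.grants:
            raise PermissionError(f"{user} has no grant on the virtual dataset")
        return self.source.read(self.owner)

    def read_with_passthrough(self, user):
        # Model B: the requesting user's identity is passed down to the
        # source, which makes the final access decision.
        return self.source.read(user)


source = Source(rows=[("acct-1", 100)], allowed_users={"alice"})
shared = VirtualDataset(source=source, owner="alice", grants={"bob"})

# Model A: bob reads data that the source itself would not show him.
leaked = shared.read_with_layer_acl("bob")

# Model B: the source rejects bob even though the dataset was shared with him.
try:
    shared.read_with_passthrough("bob")
    passthrough_blocked = False
except PermissionError:
    passthrough_blocked = True
```

In the first model the grant on the virtual dataset is the only gate, so sharing it widens access to the physical data; in the second the source remains the single authority.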

Keep your data secure. Try Spotlight free today.

Robust Data Governance

Data governance goes beyond security, allowing organizations to understand what data assets are made up of, their meaning, and how they are being used.  In many organizations, strong data governance is needed for regulatory requirements.

Dremio provides only physical metadata and data lineage for governance.  Spotlight contains several features to maintain governance, including:

  • Multiple forms of metadata about assets, including their source technical metadata, any user-defined and business metadata, and full lineage to track where the assets came from and how new ones were formed.
  • Properties on data assets that are set and maintained to facilitate governance.  These range from system-level ones such as creation and modification dates to user-set properties such as status, which can define the state of an asset and its trust level.
  • Custom properties and standardized tags can be defined to mark assets for custom governance status.

Looking for more data governance options? Try Spotlight free today.

More Automation and Less Administration

Dremio works like a piece of data infrastructure, and as such, requires a great deal of administration to scale and maintain performance.  To gain performance via caching, data reflections and associated refresh jobs need to be defined and managed.  And Dremio's “elastic engines” are a bit of a misnomer, as the engines require a great deal of setup and maintenance, use a predefined compute resource, and do not auto-scale to the needs of queries or jobs.

Spotlight is a SaaS-managed service that requires no operational administration, particularly for performance and scale.  Under the covers are managed Spark clusters that are elastic and can auto-scale to your environment’s needs to maintain high performance and fast response time.

Spotlight is a SaaS-managed service. Try it free today.

Conclusion

Spotlight lets you simplify and scale out your data management for any form of analytics, not just cloud data lakes.  It virtually connects directly to over 200 different data sources, offering much broader access to data.  Spotlight also provides visual, code-free data modeling and a much richer data catalog and semantic layer, facilitating faster discovery, knowledge-sharing and collaboration, and better data governance.  And the auto-scaling elastic service dramatically reduces the administrative overhead that Dremio places on your team.

See Spotlight for yourself with our free trial.  Or to learn more, please visit our Datameer Spotlight microsite.

Spotlight and Dremio Comparison Table

Datameer Spotlight | Dremio Enterprise
With connectors to over 200 different sources, Spotlight lets teams work with any data for any type of analytics. | Dremio only has connectors to data lakes and databases, focusing only on data lake analytics.
Spotlight provides an entirely code-free, visual data modeling environment. | Dremio forces you to perform modeling in SQL, with a few point and click elements.
Spotlight has a rich catalog and allows users to easily search across names, descriptions, tags, custom properties, and any item in the catalog. | Dremio has a simple catalog view allowing users to browse content for datasets with a simple search based on names and tags.
Spotlight contains a rich data catalog and semantic layer with physical metadata, tagging, descriptions, comments, custom properties, business-level metadata, and usage information. | Dremio has an easy to use but limited semantic layer with physical metadata, a Wiki-like description, and ad-hoc tags.
Spotlight allows users to work together in shared workspaces to collaborate, add knowledge (tags, properties, etc.), and create additional shared assets. It also supports social media-like features around assets. | With Dremio, users can share, reuse, and chain virtual datasets. Spaces can be used as an area where users can collect related datasets and perform rudimentary collaboration on a project.
Spotlight provides complete, end-to-end enterprise security and pushes down data access controls to maintain data source security integrity. | Dremio has user- and role-based security that requires you to replicate security controls and can create potential data security holes.
To maintain good governance, Spotlight maintains multiple forms of metadata about assets, full lineage, and usage auditing. It maintains multiple system-level and user-set properties such as status, which can define an asset's state. | Dremio has limited data governance features (physical metadata and lineage).
Spotlight is a SaaS managed service that requires no operational administration, particularly for performance and scale. Under the covers are managed Spark clusters that are elastic and can auto-scale to your environment's needs. | Dremio requires a great deal of administration to define and maintain data reflections, associated refresh jobs, and "elastic engines."