Datameer Spotlight and Data Catalogs

What is a Data Catalog

A data catalog is a solution that captures, manages, and organizes information about data assets and allows users – technical and business – to find those assets and learn about them. These solutions fall into two loose categories:

  • Catalogs designed to help data architects and engineering teams capture and manage technical metadata about their data assets.
  • Catalogs designed to help analysts and business teams find data assets and build business knowledge around them.

Modern data catalogs focus on the latter – building more business-relevant information about data and putting it in the hands of business-savvy analysts so they can share knowledge about the data. Often, these catalogs also include governance features.

A modern data catalog captures three core sets of information about data assets:

  • Metadata – includes both technical metadata (where the asset resides, what type of data it contains, its fields and their types, etc.) and business metadata, which adds a business-level descriptive layer on the asset and the fields it contains.
  • Behavioral – includes implicit information (popularity, how the data asset is being used, who is using it, etc.) and human-added information (tags, approvals, levels of trust, etc.).
  • Governance – who is allowed to use the data assets, what policies are applied, and who has been using it.
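
As a rough illustration, the three sets of information above could be modeled as a simple record per data asset. This is a minimal sketch, not the schema of any particular catalog product: every class, field, and function name below (CatalogEntry, Metadata, Behavioral, Governance, record_access) is a hypothetical name chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    # Technical metadata: where the asset resides and what it contains.
    location: str
    columns: dict[str, str]  # column name -> data type
    # Business metadata: a descriptive layer on the asset and its columns.
    description: str = ""
    column_descriptions: dict[str, str] = field(default_factory=dict)


@dataclass
class Behavioral:
    # Implicit information gathered from usage.
    query_count: int = 0
    users: set[str] = field(default_factory=set)
    # Human-added information.
    tags: set[str] = field(default_factory=set)
    approved: bool = False


@dataclass
class Governance:
    allowed_roles: set[str] = field(default_factory=set)  # who may use the asset
    policies: list[str] = field(default_factory=list)     # policies applied to it
    access_log: list[str] = field(default_factory=list)   # who has been using it


@dataclass
class CatalogEntry:
    # Hypothetical record tying the three sets of information to one asset.
    name: str
    metadata: Metadata
    behavioral: Behavioral = field(default_factory=Behavioral)
    governance: Governance = field(default_factory=Governance)


def record_access(entry: CatalogEntry, user: str, role: str) -> bool:
    """Allow access only for permitted roles and update usage information."""
    if role not in entry.governance.allowed_roles:
        return False
    entry.behavioral.query_count += 1
    entry.behavioral.users.add(user)
    entry.governance.access_log.append(user)
    return True
```

In this sketch a single access check touches both the governance information (is the role allowed, who used the asset) and the behavioral information (popularity, set of users), which is why modern catalogs tend to keep all three sets together on the same asset record.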

Suppliers of data catalogs include IBM, Reltio, Unifi Software, Alation, Collibra, Informatica, and Waterline Data.

Applications

Data catalogs can be employed across a variety of business use cases within an organization.

They are typically used in three ways:

  • Technical data management support – in this application, technical metadata is captured to manage the data assets better. Data engineers, DBAs, and data architects use technical information about the assets to plan tuning, security, backups, data flows, migration, and other management strategies.
  • Analytics support – tools that capture more business-level metadata can be applied to support analytics teams and their initiatives. Business and behavioral information about data assets can be used by the analytics teams to spread knowledge about the assets to accelerate new insights and build trust in analytic results.
  • Governance – in this application, the data catalog tool is used to manage how the data assets are governed across both analytics and applications. Data stewards and security managers can see who and what is allowed access to the assets, build governance policies, and apply them centrally so that policies are even-handed across the organization.

Existing data catalog tools on the market tend to originate from one of these three applications, and they typically have stronger features to support that application.

Related Datameer Spotlight Functionality

Spotlight contains key capabilities that help analytics professionals find, create, collaborate on, publish, and use trusted analytics assets. This functionality includes:

  • A searchable inventory of analytics assets to discover and find the various assets regardless of type and location
  • The ability to examine profiles of analytic assets to determine the shape and format of the asset
  • Tagging and annotation to provide human supplemented information about the analytics assets
  • Sharing and collaboration on derivative analytics assets that are created, to facilitate knowledge sharing
  • The capture and sharing of behavioral data such as who created an asset and who is using it

Spotlight is different from a data catalog in three ways:

  • It acts as an optimized data query and virtualization layer to allow users to directly connect to and consume the various assets in their favorite BI tool
  • It allows analysts to combine certain assets to create a new asset that provides a unified view of all the data required for an analytics problem
  • It manages multiple types of analytics assets, not just data assets, including data, documents, files, BI assets, applications, and more

Spotlight is specifically designed for analytics teams to answer new, ad-hoc business questions in a faster, deeper manner while facilitating greater use and sharing of analytics assets and expanding knowledge around those assets.

The Relationship Between Datameer Spotlight and Data Catalogs

Datameer Spotlight is highly complementary to and interoperable with enterprise data catalogs. Customers can continue to reap benefits from their investment in data catalogs and the aforementioned applications they support.

Datameer Spotlight focuses on the use of analytics assets within its domain and on the analytic results. It is NOT designed to be an all-encompassing catalog and facilitator of any data asset. All the features in Datameer Spotlight are designed to help answer analytic questions faster and build knowledge and trust around the assets being used.

If a customer uses a data catalog, it remains the central single version of the truth for all data assets across the entire enterprise, regardless of use – analytics, applications, etc.

The data catalog’s core objective remains the same: documenting any data asset and facilitating better management and use of those assets.

How Datameer Spotlight Works with Data Catalogs

Datameer Spotlight can work cooperatively with enterprise data catalogs to support better management, use, and knowledge of data assets. With connectors between Datameer Spotlight and data catalogs in place, the two can exchange information in both directions:

  • Datameer Spotlight’s inventory can be populated and synced with information about existing data assets from enterprise data catalogs
  • Supplemental information about data assets captured within Datameer Spotlight can be synced with the data catalog
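
The two-way exchange described above can be pictured with a small sketch. It assumes, purely for illustration, that both the catalog and the Spotlight inventory are plain dictionaries keyed by asset name. The function names and dictionary keys are hypothetical and do not represent Datameer's or any catalog vendor's actual connector API.

```python
def sync_inventory_from_catalog(catalog: dict, inventory: dict) -> None:
    """Populate the (hypothetical) inventory with assets known to the catalog."""
    for name, entry in catalog.items():
        local = inventory.setdefault(name, {"tags": set()})
        # The catalog stays authoritative for each data asset's description.
        local["description"] = entry.get("description", "")


def push_supplemental_to_catalog(inventory: dict, catalog: dict) -> None:
    """Send tags captured in the inventory back to matching catalog entries."""
    for name, local in inventory.items():
        # Only assets the catalog already tracks receive supplemental data.
        if name in catalog:
            catalog[name].setdefault("tags", set()).update(local["tags"])


# Example round trip with made-up asset names.
catalog = {"orders": {"description": "Sales orders"}}
inventory = {}

sync_inventory_from_catalog(catalog, inventory)
inventory["orders"]["tags"].add("trusted")

push_supplemental_to_catalog(inventory, catalog)
```

The sketch keeps the catalog as the authoritative source for asset descriptions and only sends back supplemental information, which mirrors the division of roles described in this section.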

Datameer Spotlight also pushes security and governance down to the analytics asset source and complies with the rules in place there. If the data catalog is used to put governance policies in place for assets, Datameer Spotlight transparently adheres to those policies.

Benefits from Integration

A cooperative environment between Datameer Spotlight and data catalogs provides customers with many benefits:

  • A more significant return on investment in their data catalog by managing more assets from Datameer Spotlight
  • Even faster time to value from Datameer Spotlight implementations, which can immediately discover and use data assets from the data catalog
  • Consistent management and governance of new data assets in Datameer Spotlight, which fall within the overarching policies set in the data catalog