Materialized View vs Table with Dbt in Snowflake

  • How-Tos FAQs
  • March 29, 2021
Get Started Transforming Your Data in Snowflake - feature img

Materialized View vs Table with Dbt in Snowflake

Deciding to create a view or a table in Dbt is a common design decision when transforming data for use in analytics and BI platforms. Each option has almost reversed pros and cons:

 

  Table   View
Uses Snowflake storage   Yes   No
Queries latest data when accessed   No   Yes

 

When creating a table from a query, first the query is run and then Snowflake stores the result set of a query, or persists the result, on physical storage, the “hard drive” of your data warehouse.

  • When a table is queried, the database already has the result set stored and it is quickly retrieved from physical storage to make the data set available for query. If the underlying source data in the query updates, our analytics data table becomes stale and isn’t in sync with the latest data available.

A view does not store the result of a query. Instead, a view stores the query definition within the database and does not run the query upon creation of the view. This can save significant storage space, as a result set from a query is often a lot more data to store in physical storage than a query definition script.

  • When a view is queried, the database engine has to first run the query you defined, then returns that result set as a temporary table only accessible to, and for the duration of, that current query. Because a view runs the underlying query at the time of access, rather than at view creation, it will always have the latest data available in underlying tables, but it takes longer to run.

Datameer: Schedule Updates For Published Materialized Table

In Datameer, you can publish and materialize a table from your recipe to Snowflake. Scheduling a materialization from a Project allows you to keep your target Snowflake table up to date. Scheduling is possible for daily, weekly and customized frequencies and is accessible and configurable in the ‘Schedule’ tab in the Inspector.

Materialized View vs Regular View in Snowflake

A materialized view is a Snowflake feature that attempts to combine the benefits of a table and view. The best of both worlds.

A materialized view is a “view” that stores the result set on physical storage for quick retrieval AND updates when the underlying data table in the query view definition updates.

At first, this sounds great: we can achieve both high performance and low latency for our analytics and BI data.

But, unfortunately, there are significant limitations to materialized views in Snowflake.

Materialized View Limitations in Snowflake

Materialized views have a major limitation in Snowflake that prevents them from being used for most analytical queries. Materialized views can only access one table. Joins are not supported in materialized views. This is a very significant limitation. Because of this limitation, there are very limited use cases for materialized views.

Materialized View Cost Considerations

The only additional benefit a materialized view has over a table is that a materialized view will stay in sync with the underlying table it queries from. In order to achieve this, Snowflake has a cloud service that gets notified when your underlying table is updated.

Then behind the scenes, Snowflake runs your view definition query on the new data and adds the result to the existing materialized view data. Snowflake charges you for this service each time a query is run to update the materialized view.

If your underlying table updates frequently, Snowflake will be running queries to update your materialized view frequently, and that can get costly very quickly. Snowflake recommends utilizing materialized views only on an underlying table that does not update frequently. This further limits the practical usefulness of materialized views in DBT and analytics engineering in general.

Limited Use Cases for Materialized Views

There are a few limited use cases for materialized views. One of those use cases is parsing semi-structured data into structured tables.

Using Tables and Scheduling Updates

After reviewing materialized views a bit more, the best approach is to utilize regular views and tables in DBT. A view is the best option, until performance starts to suffer. A table with a scheduled run to update it on a recurring basis is the next best option.

 

And there you have it the pros and cons of materialized view vs table with Dbt in Snowflake.

Continue reading

Check out: Format Numbers With Comma Separation in Snowflake Result Set

Related Posts

Top 5 Snowflake tools for Analysts- talend

Top 5 Snowflake Tools for Analysts

  • Ndz Anthony
  • February 26, 2024