Fivetran is a cloud-based ELT data integration platform that offers a simple, reliable way to replicate and synchronize data into your cloud data warehouse (CDW). It provides an extensive array of connections focused on SaaS applications and cloud services that make it easy to create a data replication process between a source and CDW destination. It also offers pre-built, source-specific transformation models that take data from one source and make it “ready-to-query” in your CDW.
Fivetran’s emphasis is on making basic data integration simple and reliable, especially around SaaS applications and cloud services, whose APIs for extracting data can be complicated. It offers automation features that remove some of the complexity, including connections that simplify the interface to SaaS and cloud sources and automated adaptation to schema changes in both the sources and the destination CDWs.
Datameer Spectrum is a fully-featured data integration platform with wizard-driven simplicity and point-and-click transformations for fast, easy data integration by your analyst community without writing one line of code. Once integration dataflows are ready, Spectrum’s enterprise-grade operationalization, security, and governance features enable reliable, automated, and secure data pipelines to ensure a consistent data flow.
Spectrum offers a comprehensive suite for data integration. It supports data integration and pipelines across various approaches and needs, including ETL, ELT, data preparation, and data science. Its point-and-click simplicity makes it easy for analysts and data scientists, and even non-programmers, to create data integration pipelines of any level of sophistication, allowing you to make your data analytics-ready 10 to 20 times faster at a fraction of the cost.
Spectrum has extensive features to support your hybrid-cloud data landscape, is cloud-native on all three major cloud platforms (AWS, Azure, GCP), and carries with it the elasticity and cost economics you would expect in the cloud. Spectrum can bring together any data sources you have regardless of type, format, and location (cloud or on-premises).
Fivetran is a very simple, reliable service that lets you set up “connections” between your data sources – primarily SaaS applications, cloud services, and cloud databases – and your cloud data warehouse. Connections are savvy regarding the source and CDW destination, particularly understanding the source schemas and APIs to abstract this from the user.
Fivetran uses the term “connections” differently than other data integration vendors. To Fivetran, a connection is a complete pipeline between your source and cloud data warehouse, used to “replicate” data into the CDW. These connections are the “EL” in ELT (extract, load, and transform). Each connection has a built-in source schema and destination schema model, based on your data in the source.
What is loaded into your CDW is the raw data from your data source. In the process, simple transformations are performed, such as data type re-casting, to ensure the data is fully query-ready in the data warehouse. Connections can be defined to “sync” on regular intervals and only load new or updated data.
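The sync behavior described above can be illustrated with a minimal sketch. It assumes each source row carries an `updated_at` timestamp that can serve as a cursor; the function name `incremental_sync`, the field names, and the in-memory `warehouse` dictionary are hypothetical stand-ins, and this is not a description of how Fivetran implements syncs internally.

```python
from datetime import datetime


def incremental_sync(source_rows, destination, cursor):
    """Load rows changed after `cursor`; return (new_cursor, rows_loaded).

    Hypothetical helper: `destination` is a dict keyed by row id that
    stands in for a warehouse table.
    """
    new_cursor = cursor
    loaded = 0
    for row in source_rows:
        updated_at = datetime.fromisoformat(row["updated_at"])
        if cursor is not None and updated_at <= cursor:
            continue  # unchanged since the last sync, so skip it
        # Simple re-casting: strings from the source become typed values.
        destination[int(row["id"])] = {
            "amount": float(row["amount"]),
            "updated_at": updated_at,
        }
        loaded += 1
        if new_cursor is None or updated_at > new_cursor:
            new_cursor = updated_at
    return new_cursor, loaded


source = [
    {"id": "1", "amount": "10.50", "updated_at": "2024-01-01T00:00:00"},
    {"id": "2", "amount": "7.25", "updated_at": "2024-01-02T00:00:00"},
]
warehouse = {}

# The first sync has no cursor, so every row is loaded.
cursor, first_loaded = incremental_sync(source, warehouse, None)

# One row changes at the source; the next sync loads only that row.
source[0] = {"id": "1", "amount": "12.00", "updated_at": "2024-01-03T00:00:00"}
cursor, second_loaded = incremental_sync(source, warehouse, cursor)
```

Tracking a high-water mark like this is one common way to avoid reloading unchanged rows; sources without a reliable change timestamp need a different change-detection approach.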
Once the data is loaded into the CDW, Fivetran lets you do the “T” in ELT, that is, transform the data. Data can be transformed using SQL or using dbt, an open-source transformation tool. Transformations can run on a schedule or be triggered when connection syncs take place.
Fivetran and Datameer Spectrum are both “data integration” platforms, but they approach and perform data integration very differently. Both try to make data integration fast and straightforward, yet they vary significantly in the breadth of their data integration capabilities. Let’s examine how Fivetran and Spectrum compare in these six key areas:
Fivetran supports one simple data integration pattern – ELT. All of the “T” is done via pushdown queries defined in SQL or via dbt packages. The “EL” is simply data replication and synchronization to the CDW. Combining and preparing multiple data sources into a unified view, a very common data integration pattern, is complicated in Fivetran.
Spectrum supports a wide range of data integration patterns: ETL, ELT, data preparation, and data science pipelines. Dataflow processes can include combining multiple sources, simple transformations, and a full range of data preparation, including cleansing, data enrichment, pivoting, aggregation, shaping data for machine learning, and more.
Fivetran supports a simple operationalization model – each connection has scheduled synchronizations. The underlying server does a good job managing the synchronizations by understanding what data has changed, recovering from failures, keeping logs, and automating adjustments to schema changes.
Through Datameer’s ten years working with some of the largest enterprises with demanding requirements around big data, Spectrum has evolved a scalable, flexible job execution system for operationalizing jobs. It can manage large volumes of data, run reliably, is extremely easy to operate, and connects to enterprise management tools.
Fivetran offers a few key security capabilities, including encryption, column hashing and blocking, role-based access control, and SSO/SAML. Also, since Fivetran is a service, and your data goes through their service, it needs to meet other security requirements, including being SOC 2 (Type 2) certified and having documented security operations. Fivetran offers no governance capabilities.
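As a rough illustration of what column hashing and blocking mean in a replication pipeline, the sketch below drops blocked columns and replaces hashed columns with a salted SHA-256 digest before a row is loaded. The function `protect_row`, the column names, and the choice of SHA-256 with a salt are assumptions made for this example; they do not describe Fivetran’s actual algorithm or configuration.

```python
import hashlib


def protect_row(row, hashed_columns, blocked_columns, salt):
    """Return a copy of `row` with blocked columns removed and hashed columns digested.

    Hypothetical helper for illustration only.
    """
    protected = {}
    for column, value in row.items():
        if column in blocked_columns:
            continue  # blocked: the value never reaches the destination
        if column in hashed_columns:
            # Hashed: still usable as a join key, but not human-readable.
            payload = (salt + str(value)).encode("utf-8")
            protected[column] = hashlib.sha256(payload).hexdigest()
        else:
            protected[column] = value
    return protected


customer = {"id": 7, "email": "ana@example.com", "ssn": "000-00-0000", "plan": "pro"}
safe = protect_row(
    customer,
    hashed_columns={"email"},
    blocked_columns={"ssn"},
    salt="example-salt",
)
```

Because the same input and salt always produce the same digest, a hashed column can still be used to join or count distinct values in the warehouse without exposing the original data.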
Spectrum provides enterprise-class security and governance features, highly evolved by working with some of the largest banks, healthcare insurers, telecom carriers, and retailers. Encryption, data masking and anonymization, Kerberos integration, fine-grained access controls, integration with enterprise security (LDAP/AD, SSO/SAML), and more enable complete data security and privacy. Governance features such as data lineage, auditing, and data retention and archiving policies allow organizations to maintain and report on compliance with regulations such as HIPAA, GDPR, PCI, Sarbanes-Oxley, and more.
Fivetran supports an extensive suite of intelligent connections to data sources, especially SaaS applications, cloud services, and cloud databases. What Fivetran lacks is connectivity and bridging to on-premises and hybrid data sources, particularly enterprise data warehouses.
Spectrum supports a wide range of data sources, both cloud-based and enterprise, including enterprise data warehouses and data lakes – Teradata, Netezza, Hadoop, Hive. It also can use secure protocols, enterprise security controls, and encryption to access and bridge on-premises sources to the cloud securely.
Fivetran’s transformation capabilities require writing sophisticated SQL code or using dbt packages. SQL is used for custom transformations and is limited to the 100 or so transformations SQL supports. dbt packages are used to transform data from a single SaaS or cloud service source into an analytics-ready schema.
Spectrum includes a powerful yet easy-to-use data preparation capability that allows analysts and data scientists to shape data to their needs without any coding. An easy-to-use spreadsheet-style interface enables users to apply transformations in a single-click fashion and interactively shape their data. Visual data profiling and data exploration allow users to dig into the data to find interesting patterns, anomalies, or places where data needs to be cleansed.
Spectrum offers a rich suite of over 300 single-click transformation functions and packages. These functions include simple everyday transformations, a wide range of aggregations, analysis algorithms, cleansing and deduplication, and specific tasks for text mining, pivoting, and one-hot encoding, some of which perform many transformations under the covers for robust data shaping.
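For readers unfamiliar with one-hot encoding, one of the transformations named above, here is a minimal sketch of what the operation does to a categorical column. It is written in plain Python for illustration; the function `one_hot_encode` and the sample data are hypothetical and say nothing about how Spectrum implements its point-and-click version.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column as-is.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


accounts = [
    {"id": 1, "plan": "pro"},
    {"id": 2, "plan": "free"},
    {"id": 3, "plan": "pro"},
]
encoded = one_hot_encode(accounts, "plan")
```

Each distinct value of `plan` becomes its own indicator column, which is the shape many machine learning algorithms expect for categorical inputs.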
The “T” in Fivetran’s ELT is performed in your cloud data warehouse, such as Snowflake, and creates hidden CDW costs. Raw data is stored in a normalized schema in the CDW, then transformed into an analytics-ready form, such as a denormalized materialized table or a multi-dimensional aggregated table. Because both the raw and the transformed copies live in the CDW and the transformation runs there, this increases both the storage and the compute you pay for in the CDW.
Transforming from a normalized schema to a denormalized, analytics-ready one requires many-way joins, unions, and aggregations, which are extremely “expensive” compute operations in a CDW. Therefore, many smart customers make their data analytics-ready before it hits the CDW. These extra hidden costs will increase your monthly CDW bill, not your Fivetran bill. There are also the hidden costs of teams writing, debugging, and deploying SQL transformations.
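To make the in-warehouse “T” step concrete, the sketch below loads two raw, normalized tables and then runs a SQL join and aggregation to build an analytics-ready table. An in-memory SQLite database stands in for the cloud data warehouse, and the table and column names are invented for the example; in a real CDW this query would consume warehouse compute, and the raw and derived tables would both occupy storage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
conn.executescript("""
    -- Raw, normalized tables as they might land after the "EL" step.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- The "T" step: join and aggregate into a denormalized reporting table.
    CREATE TABLE revenue_by_region AS
    SELECT c.region AS region,
           COUNT(o.id) AS order_count,
           SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.region;
""")

rows = conn.execute(
    "SELECT region, order_count, revenue FROM revenue_by_region ORDER BY region"
).fetchall()
```

With two small tables the cost is negligible, but the same pattern applied to many large tables with multi-way joins is where the compute charges discussed above come from.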
Using Spectrum’s ETL model, all transformations, joins, and aggregations to make the data analytics-ready are performed in-transit within Spectrum, using its compute infrastructure, which is included in the price. This provides a transparent pricing model without the hidden back-end CDW costs.
Fivetran has done a splendid job at simplifying the process of getting data out of SaaS applications and cloud services into a data warehouse, eliminating the need to work with the complex APIs of such data sources. However, its ELT model, inability to bring together multiple data sources, lack of transformation and preparation, and hidden costs make it unsuitable for “real” data integration.
Datameer Spectrum combines similar ease of connectivity with easy-to-use yet powerful transformations and data preparation, enterprise-grade security and governance, and no hidden costs, allowing you to create “real” data integration pipelines more quickly and efficiently.
| Datameer Spectrum | Fivetran |
| --- | --- |
| Spectrum is a fully-featured data integration and pipeline platform to bridge all your data sources and create analytics-ready data in minutes. | Fivetran is a simple, reliable service for replication, synchronizing, and transforming data into a cloud data warehouse. |
| Spectrum supports a wide range of data integration capabilities, including ETL, ELT, data preparation, data engineering, and data pipelines for data science. | Fivetran supports only a simple ELT model for integration for individual data sources and cannot bring together multiple data sources into a common analytics view. |
| Spectrum offers easy to manage, scalable operationalization of data integration jobs to ensure reliable data delivery for analytics. | Fivetran offers a simple operationalization model for data replication and synchronization connections. |
| Spectrum supports a deep suite of security and governance features, evolved through work with large enterprises in highly regulated industries. | Fivetran provides cloud-based security and managed security operations for the running service. |
| Spectrum effortlessly bridges your on-premises and hybrid data sources to the cloud with secure, scalable access and protocols. | Fivetran lacks connectivity to and the ability to integrate on-premises data sources into the cloud. |
| Spectrum includes a rich array of over 300 point-and-click transformation functions, all usable without writing any code, and full data preparation for any analytics use. | Fivetran supports SQL coding for transformations or the use of single-source dbt packages. |
| Spectrum performs all data transformations and integrations in-flight, using its own compute engine, and lands only analytics-ready data, eliminating hidden CDW costs or credit burning. | Fivetran forces you to store duplicate copies of data and perform all transformation and integration in the target CDW ringing up extra CDW costs or credit burns. |