Datameer Spectrum is an ETL++ tool and platform that is even easier to use than Matillion for fast data pipeline orchestration. It offers a much richer transformation library and feature set, supporting many more use cases than Matillion without the hidden cloud data warehouse costs.
Matillion is one of the younger, cloud-based ETL solutions on the market. The platform covers all three aspects of data integration: extraction, transformation, and loading. The graphical user interface allows users to orchestrate and run data pipelines without coding. The platform is flexible, offering an extensive array of cloud connectors, and so is its cloud-based pricing model.
Matillion consists of three components: the underlying platform, a graphical data orchestration tool, and a management tool. The three combine to enable the definition and operation of ETL data pipelines.
The graphical orchestration tool allows users to string together “components” into a data flow. Components can be data source connections and extractions, data staging and definition, flow control, transformations, messaging, and loading into destinations. Matillion offers 105 connectors and 75 components, of which 25 are for transformations.
The management tool allows admins to set up users and security, run jobs, configure the system, and perform other administrative tasks. For security, Matillion offers user- and role-based security and access controls, LDAP integration, and Single Sign-On (SSO) integration.
The platform is the execution part of the system. It is essential to understand that Matillion does NOT have a storage and execution engine. All data processed in a data flow is stored in its intermediate form in your cloud data warehouse tables. All data management and transformation operations are pushed down into the cloud data warehouse. The scalability and performance of Matillion are dependent upon that of your cloud data warehouse.
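The pushdown pattern described above can be sketched in a few lines. This is a minimal illustration, not Matillion's implementation: `sqlite3` stands in for the cloud data warehouse, and the table names `stg_orders` and `orders_by_customer` are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database plays the role of the CDW.
warehouse = sqlite3.connect(":memory:")

# 1. Load: extracted rows land in a staging table inside the warehouse.
warehouse.execute("CREATE TABLE stg_orders (customer TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO stg_orders VALUES (?, ?)",
    [("acme", 120.0), ("acme", 80.0), ("globex", 50.0)],
)

# 2. Transform: the tool only issues SQL. The warehouse does the compute
#    and stores the result as another table.
warehouse.execute(
    """
    CREATE TABLE orders_by_customer AS
    SELECT customer, SUM(amount) AS total
    FROM stg_orders
    GROUP BY customer
    """
)

totals = dict(warehouse.execute("SELECT customer, total FROM orders_by_customer"))
tables = sorted(
    name
    for (name,) in warehouse.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
print(totals)
print(tables)
```

Both the staging table and the transformed table end up living in the warehouse, which is why storage and compute for the pipeline scale with the warehouse rather than with the orchestration tool.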
Datameer Spectrum is a fully-featured ETL++ data integration platform with a broad range of capabilities for extracting, exploring, integrating, preparing, delivering, and governing data for scalable, secure data pipelines. Spectrum supports analyst self-service data preparation and data engineering use cases, enabling a single hub for all data preparation across an enterprise. Pipelines can span across various approaches and needs, including ETL, ELT, data preparation, and data science.
The Spectrum user experience offers wizard-driven simplicity and visual transformations for fast, easy data preparation by your analyst community without writing one line of code. Simultaneously, the deep suite of functions and governance features enables data engineering pipelines of any level of sophistication. The result is a cooperative environment where analysts and data engineers can get your data analytics-ready 10 to 20 times faster at a fraction of the cost.
Once integration dataflows are ready, Spectrum’s enterprise-grade operationalization, security, and governance features enable reliable, automated, and secure data pipelines to ensure a consistent data flow. Spectrum has extensive features to support your hybrid-cloud data landscape. It is cloud-native on all three major cloud platforms (AWS, Azure, GCP) and carries the elasticity and cost economics you would expect in the cloud. Spectrum can bring together any data sources you have regardless of type, format, and location (cloud or on-premises).
At a high level, both Datameer Spectrum and Matillion are ETL data integration tools and platforms. But several areas differentiate Spectrum from Matillion, including:
Let’s explore each of these in more detail.
Matillion’s data orchestration tool makes it harder to put together data flows than the company lets on. This is because:
This combination of factors can lead to very complex data flows with many components.
Spectrum’s spreadsheet-style UI is the antithesis of complex data flow-style ones. The UI makes it very easy to put together all the transformation operations needed, and the interactive nature of it allows a user to see the impact of each operation. Data connection and extraction is an automated, wizard-driven process, as is data loading into the destination. All of this combines to make data pipeline creation far faster and easier than Matillion’s data flow UI.
Matillion offers a very limited set of 75 components, of which only 25 are dedicated to data transformation. Matillion can perform simple data manipulation, joins, aggregation, filtering, and conversions. Also, some functions, such as JSON extraction, work only with specific CDWs (Snowflake). Real data engineering requires a far more sophisticated set of functions.
Spectrum supports a deep library of close to 300 functions, each applicable graphically without coding, to tame even the most complex data for more sophisticated data engineering tasks. And Spectrum provides this in the same easy-to-use spreadsheet-style UI, making it just as easy to define sophisticated data pipelines on complex data.
Matillion’s data preparation is performed with the limited set of 25 transformation components in their library. There are no capabilities for basic areas such as data cleansing and de-duplication. Also, because of the complex data flow style UI, Matillion is definitely not a self-service data preparation tool for analysts.
With Spectrum, data preparation is a central part of the platform and a critical piece of the user experience. The easy, spreadsheet-style user interface allows any analyst to create data pipelines rapidly regardless of technical skills. For more sophisticated data pipelines, Spectrum provides over 300 functions, all of which can be applied graphically. The user experience is also interactive, allowing users to immediately see the impact as they apply functions or make changes to speed the process.
Spectrum facilitates self-service data preparation for both analytics and data science, and robust data engineered pipelines. It provides a shared hub where analysts, data scientists, and data engineers can collaborate and a central repository for regulatory compliance.
Matillion offers connectors that can work with more traditional data sources such as Oracle, Teradata, Netezza, and others that are typically on-premises. But buyer beware if you try to use Matillion to create pipelines that integrate data from your on-premises sources. Matillion offers none of the secure tunneling and encryption capabilities that a typical enterprise would expect to protect its data. It also lacks structured data retention policies.
Spectrum uses secure protocols, enterprise security controls, and encryption to access and bridge on-premises sources to the cloud securely, and it has robust data retention policies that ensure extra copies of data are removed. Spectrum also lets you seamlessly burst or migrate pipeline workloads into the cloud.
Matillion offers only minimal security capabilities with your basic user and role-based controls, LDAP integration, and Single Sign-On (SSO). It does not provide encryption or obfuscation and has no data governance features.
Spectrum provides enterprise-class security and governance features, highly evolved by working with some of the largest banks, healthcare insurers, telecom carriers, and retailers. Encryption, data masking and anonymization, Kerberos integration, fine-grained access controls, integration with enterprise security (LDAP/AD, SSO/SAML), and more enable complete data security and privacy. Governance features such as data lineage, auditing, and data retention and archiving policies allow organizations to maintain and report compliance with regulations such as HIPAA, GDPR, PCI, Sarbanes-Oxley, and more.
Matillion is only suitable for data integration of cloud and SaaS data sources into a cloud data warehouse. Matillion does not offer sophisticated functions for real data engineering, algorithmic or encoding functions for data science, or robust security to bridge a hybrid cloud.
Spectrum is a single integrated platform and toolset that offers all the capabilities for many different use cases, providing a versatile, unified data integration hub:
With Matillion, your data destination is always a cloud data warehouse or data lake (Databricks). In fact, Matillion relies on the CDW for its storage and processing. This also creates an extra hop for BI tools to use the data and requires extra compute and storage costs in the destination, making end-to-end operationalization with BI processes difficult.
Spectrum supports the ability to directly send data to BI platform servers in their native format, including Tableau, PowerBI, Qlik, Looker, and ThoughtSpot. For operationalized BI use cases, this simplifies the end-to-end process, lowers your costs, and speeds up data delivery.
As mentioned, Matillion relies on the cloud data warehouse for all its processing power. Matillion itself is a single cloud compute server instance only routing jobs. And, there is no intelligence as to how to break down jobs optimally for execution. This can severely limit Matillion’s scalability and performance and can force jobs to take a back seat if other database workloads take priority.
Spectrum operates its own elastic Spark-based compute cluster under the covers to give jobs the scale and performance they need automatically. The patented Smart Execution™ technology is similar to a database optimizer but for data pipelines, intelligently breaking down jobs into smaller components and executing these in a parallelized, optimal way to ensure fast performance.
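The general idea of breaking a job into smaller pieces and running them in parallel can be illustrated with a toy example. This is only a sketch of the generic split-and-combine pattern using Python's standard library. It is not Smart Execution's actual algorithm, and the function names `split`, `partial_sum`, and `run_job` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(rows, parts):
    """Break one job's input into roughly equal partitions."""
    return [rows[i::parts] for i in range(parts)]


def partial_sum(partition):
    """The unit of work executed independently on each partition."""
    return sum(partition)


def run_job(rows, parts=4):
    """Run the partial sums in parallel, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(partial_sum, split(rows, parts)))
    return sum(partials)


total = run_job(list(range(1, 101)))
print(total)
```

A real optimizer would also decide how many partitions to use and where each piece should run. Here the partition count is simply a parameter.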
As mentioned, Matillion relies on your cloud data warehouse, such as Snowflake, for its processing power. This creates hidden CDW costs above and beyond the Matillion costs. Intermediate data is pushed into and stored in the CDW, and then transformations are pushed down and executed in the CDW. This will increase:
Data transformation often requires many-way joins, unions, and aggregations, which are extremely “expensive” compute operations in a CDW. Also, each component is a fine-grained operation, causing Matillion to go back to the CDW for execution continually. These extra hidden costs will increase your monthly CDW bill, not your Matillion bill.
With Spectrum’s ETL model, all transformations, joins, and aggregations needed to make the data analytics-ready are performed in transit within the Spectrum Spark-based elastic compute cluster, which is included in the price. Only the results are sent to your CDW. This provides a transparent pricing model without the hidden back-end CDW costs.
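The difference between the two models can be made concrete by counting what the warehouse is asked to do in each case. The sketch below is a toy model under stated assumptions: `MeteredWarehouse` is a hypothetical stand-in for a CDW, and the counters are illustrative proxies for storage and compute usage, not real billing figures from either product.

```python
from collections import defaultdict


def sum_by_customer(rows):
    """The transformation itself: total amount per customer."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return sorted(totals.items())


class MeteredWarehouse:
    """Hypothetical CDW stand-in that counts the work it is asked to do."""

    def __init__(self):
        self.tables = {}
        self.rows_written = 0
        self.transform_queries = 0

    def write(self, name, rows):
        self.tables[name] = list(rows)
        self.rows_written += len(self.tables[name])

    def aggregate(self, source, target):
        # A transformation executed inside the warehouse (billed compute).
        self.transform_queries += 1
        self.write(target, sum_by_customer(self.tables[source]))


raw = [("acme", 120.0), ("acme", 80.0), ("globex", 50.0)]

# Pushdown (ELT): stage the raw rows, then transform inside the warehouse.
elt = MeteredWarehouse()
elt.write("stg_orders", raw)
elt.aggregate("stg_orders", "orders_by_customer")

# In transit (ETL): transform outside the warehouse, send only the result.
etl = MeteredWarehouse()
etl.write("orders_by_customer", sum_by_customer(raw))

print(elt.rows_written, elt.transform_queries)
print(etl.rows_written, etl.transform_queries)
```

Both runs produce the same `orders_by_customer` table. In this toy model the pushdown run writes the staged rows plus the result and executes one in-warehouse transformation, while the in-transit run writes only the result.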
Matillion is very much a one-trick pony for integrating cloud and SaaS data sources into a cloud data warehouse, and it is not that good at that task. The user experience makes it unsuitable for self-service data pipelines, and even data-savvy users will find it difficult and complicated to create data pipelines successfully. Its restricted set of transformation functions and data preparation capabilities and its lack of security and governance severely limit where and how you can use Matillion.
Datameer Spectrum simplifies how you manage your data for analytics and makes it faster and easier to create data pipelines ranging from simple to complex, covering many more use cases. Spectrum offers an even easier user experience than Matillion and blends much greater sophistication and data preparation within that easy user experience. Spectrum offers something for everyone – data engineers, analysts, and data scientists – and provides a shared hub for this entire community to collaborate with no hidden cloud costs.
|Datameer Spectrum|Matillion|
|---|---|
|Spectrum's interactive, spreadsheet-style UI makes it fast and easy to apply operations on the data and create data pipelines.|Matillion offers a complex data flow UI for data pipeline orchestration, making it difficult and complicated to create data pipelines.|
|Datameer offers a deep library of 300+ functions within the same easy-to-use UI, allowing data engineers to quickly craft data pipelines with more sophistication.|Matillion offers only a small set of components (75) with limited transformation capabilities making it impossible to create more sophisticated data engineering pipelines.|
|With Spectrum, data preparation is a central part of the platform, and the easy, spreadsheet-style user interface allows any analyst to create data pipelines rapidly regardless of technical skills.|Because of its limited set of components and transformation capabilities, and complex UI, Matillion is not suitable for data preparation.|
|Spectrum uses secure protocols, enterprise security controls, and encryption to bridge on-premises sources within your hybrid cloud environment.|Matillion offers no enterprise-grade security capabilities making it risky to use in bridging on-premises data in a hybrid cloud.|
|Spectrum provides enterprise-class security and governance features including encryption, data masking and anonymization, Kerberos integration, fine-grained access controls, integration with enterprise security (LDAP/AD, SSO/SAML) to enable complete data security and privacy.|Matillion offers only minimal security capabilities and lacks enterprise security capabilities like encryption and obfuscation, and has no data governance features.|
|Spectrum offers all the capabilities for many different use cases, providing a versatile, unified data integration hub for data integration, self-service data preparation, data engineering, data science, and hybrid-cloud.|Matillion's sweet-spot is for data integration of cloud and SaaS data sources into cloud data warehouse and does not support data engineering, data science, and hybrid cloud use cases.|
|Spectrum supports the ability to directly send data to all the popular BI platforms in their native format, simplifying end-to-end processes, lowering your costs, speeding data delivery.|Matillion offers no integration with BI tools, forcing an extra hop for BI users' data delivery and making it challenging to create operationalized BI data pipelines.|
|Spectrum has an automated, scalable Spark-based elastic compute cluster and patented Smart Execution™ optimizer to maximize scalability and performance.|Matillion relies on a CDW for processing and does not have a robust optimizer, giving it performance and scalability issues.|
|Spectrum supplies its own Spark-based elastic compute clusters included in the price to eliminate hidden CDW costs.|Matillion relies on a CDW for its processing generating extra hidden costs on top of the Matillion costs.|