How to impute a value for missing data in Alteryx

  • How-Tos FAQs
  • March 3, 2021
Get Started Transforming Your Data in Snowflake - feature img

It is common to find null or missing values within the databases during the ETL process. Null or missing values are cells containing no values. These can dirty calculations and output files. Let’s figure out how to impute a specific value to handle missing data in Alteryx.

We can use Alteryx to handle missing data.

There are three possible ways, depending on the data type of the field:

  • If the field has a numeric data type, you can use an Imputation tool  or a  Formula tool  for basic replacements and the ‘ Append Fields   technique for more complex ones;
  • If the field has a non-numeric data type, you can use a Formula tool  for basic replacements and the ‘ Append Fields ’ technique for more complex ones.

The starting field should appear as shown below.

How to impute a specific value to handle missing data in Alteryx 1

Imputation tool

Using an  Imputation tool  is the simplest way to replace null values in a numeric field with descriptive statistics (4). It also helps keep track of replaced null values (5).

1. Connect an  Imputation tool  directly to your workflow

How to impute a specific value to handle missing data in Alteryx 2

 

2. In the first section of the configuration panel, select all the fields where you want to replace nulls

3. In the second section, choose ‘Null’

How to impute a specific value to handle missing data in Alteryx 5

4. In the third section, choose which numeric replacement you want to use between  average median mode user-specified value

How to impute a specific value to handle missing data in Alteryx 5

5. In the fourth section, you can add a field that  indicates which rows have been replaced  and if to  create a new field with replacements  instead of modifying the original one

How to impute a specific value to handle missing data in Alteryx

Formula tool

Using a Formula tool is the simplest way to replace nulls in any field with  a specific value  (2) or  a row calculation  (3).

1. Connect a  Formula tool  directly to your workflow

How to impute a specific value to handle missing data in Alteryx

2. Use an if-statement as shown in example 1 to replace nulls with a specific value

3. Use an if-statement as shown in example 2 to replace nulls with a row calculation

Append fields

Using the  Append Fields technique  is helpful to replace null values with vertical calculation or with data coming from a different point of the workflow.

1. Connect your workflow to the Target (T) anchor of an  Append Fields tool

2. Connect the information you want to use as a replacement to the Source (S) anchor of the Append Fields tool (it has to be a dataset made of one row; otherwise, each row of the T-anchor will be duplicated for each row of the S-anchor)

3. Use a formula tool to make the replacement

4. Use the select tool to keep only valuable fields

How to impute a specific value to handle missing data in Alteryx

 

The ending dataset should appear as shown below.

How to impute a specific value to handle missing data in Alteryx

We have seen three different techniques on how to impute value for missing data in Alteryx.

1. Imputation tool  to replace nulls with average, median, mode, or user-specified value in numeric fields

2. Formula tool  to replace nulls with a specific value or row calculation in any fields

3. Append Fields  technique to replace nulls with vertical calculation or information coming from a different source

 

Continue reading

Check out: How to Parse fields using RegEx in Alteryx

Related Posts

Announcing the Datameer + Snowflake Partnership

  • Press Release
  • January 25, 2022
banner-data-security-alteryx-blog

Can You Trust Your Data In Alteryx?

  • John Morrell
  • March 5, 2018