Using the PMML Plug-in


PMML (Predictive Modeling Markup Language) provides a way for analytic applications to describe and exchange predictive models produced by data mining and machine learning algorithms.

The Zementis PMML plug-in for Datameer allows users to upload trained predictive models created in tools like R, SAS, KNIME or SPSS (to name a few) as new functions in Datameer workbooks. The plug-in is sold and maintained by Zementis separately.

For sophisticated algorithms, the usual approach is to set up an export job in Datameer that exports a sample of the data for use in an external tool during the modeling step. Such tools can often export a model in the PMML format, which can then be imported back into Datameer for the prediction/scoring step.

Learn more about predictive modeling in Datameer.

Workflow

  • Export a sample of the data from Datameer using an export job.
  • Train the model on the data with a third-party tool (e.g., R, SAS, Python, or SPSS).
  • Export the model to PMML when supported by the tool.
  • Import the model into Datameer using the PMML plug-in (the model then shows up as a function).
  • Apply the model (the function) to run the scoring, predictions, or classifications. 

Download

The plug-in is written and owned by Zementis.

Click here to learn more about how to obtain the PMML plug-in for Datameer.

Installing and Configuring the Plug-in

  1. Copy the zementis.license file into the <Datameer Home> directory.
  2. Use the directions under Managing Plug-ins and Extension Points to install the plug-in.
  3. Under Plug-ins, click the cog icon under actions to configure the PMML plug-in.
  4. Upload the PMML model. In this example, the name of the PMML model is "IRISNAIVEBAYESMODEL".
    (This is the name of the PMML function in the Formula Builder.)

All PMML models uploaded are listed under PMML Functions at the bottom of the plug-in configuration settings.
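
Before uploading a model, it can help to check which model name and input fields the PMML file declares. The following is a minimal sketch in Python using only the standard library. The embedded PMML is a heavily truncated, illustrative skeleton rather than a complete model, and the field names (sepal_length and so on) and the helper describe_pmml are hypothetical. Only the model name IRISNAIVEBAYESMODEL comes from the example above. The sketch shows where this information sits in a PMML file; it does not describe how the plug-in itself reads the file.

```python
import xml.etree.ElementTree as ET

# Illustrative, truncated PMML skeleton (NOT a complete, scorable model).
# Field names are hypothetical; the model name is the one used in this page.
PMML_TEXT = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <Header description="Illustrative skeleton"/>
  <DataDictionary numberOfFields="5">
    <DataField name="sepal_length" optype="continuous" dataType="double"/>
    <DataField name="sepal_width" optype="continuous" dataType="double"/>
    <DataField name="petal_length" optype="continuous" dataType="double"/>
    <DataField name="petal_width" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string"/>
  </DataDictionary>
  <NaiveBayesModel modelName="IRISNAIVEBAYESMODEL" functionName="classification">
    <MiningSchema>
      <MiningField name="sepal_length"/>
      <MiningField name="sepal_width"/>
      <MiningField name="petal_length"/>
      <MiningField name="petal_width"/>
      <MiningField name="species" usageType="target"/>
    </MiningSchema>
  </NaiveBayesModel>
</PMML>"""


def describe_pmml(pmml_text):
    """Return one summary dict per model element found in a PMML document."""
    root = ET.fromstring(pmml_text)
    # The XML namespace depends on the PMML version, so read it off the root tag.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    models = []
    for element in root:
        schema = element.find(f"{ns}MiningSchema")
        if schema is None:
            continue  # Header, DataDictionary, and similar non-model elements
        inputs, targets = [], []
        for field in schema.findall(f"{ns}MiningField"):
            # usageType defaults to "active" (an input field) when omitted.
            usage = field.get("usageType", "active")
            if usage in ("target", "predicted"):
                targets.append(field.get("name"))
            elif usage == "active":
                inputs.append(field.get("name"))
        models.append({
            "name": element.get("modelName"),
            "type": element.tag[len(ns):],
            "inputs": inputs,
            "targets": targets,
        })
    return models


for model in describe_pmml(PMML_TEXT):
    print(model["name"], model["type"], model["inputs"], model["targets"])
```

Reading the namespace from the root tag, rather than hard-coding it, lets the same sketch handle files exported under different PMML versions.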

Using the PMML Plug-in with Datameer

  1. Upload the PMML model created in a third-party tool in the plug-in configuration settings.
  2. Upload the data to be scored into Datameer using an import job, file upload, or data link.
  3. Open a Datameer workbook with that data.
  4. On a worksheet, use the Formula Builder to select the PMML model as the function.
  5. Apply the PMML model to the data in Datameer.
  6. Run the workbook.

Example using a PMML Model Function

In this example, you have created a PMML model trained on the dimensions of flowers. The model covers three types of flowers and captures the range of possible dimensions for each type.

You upload the model in the configuration section of the PMML plug-in. 

You then upload the data you want to analyze into Datameer using a file upload and open that data in a workbook.

In this example, there are four columns that hold the dimensions of a flower. The analysis to perform is to classify the type of flower in each row based on those dimensions.

In a new column, the Formula Builder is used to select the PMML model function and enter the columns to analyze.

There is an optional field that can accept a Boolean argument. Type true to display more details about the classification.     

The results for this example classify the type of flower based on each row's flower dimensions.

When set to true, the Boolean argument adds the probabilities for each possible class.
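
To illustrate what a predicted class and per-class probabilities look like, here is a minimal sketch of Gaussian naive Bayes scoring in Python. It is not the plug-in's implementation and does not reproduce Datameer's output format. The class names, the per-class means and standard deviations, and the function score with its detailed flag are all hypothetical, chosen only to mirror the four-column, three-class example above.

```python
import math

# Hypothetical parameters: class -> (prior, [(mean, std) for each of 4 columns]).
# These numbers are made up for illustration, not taken from a trained model.
CLASSES = {
    "setosa": (1 / 3, [(5.0, 0.35), (3.4, 0.38), (1.5, 0.17), (0.25, 0.11)]),
    "versicolor": (1 / 3, [(5.9, 0.52), (2.8, 0.31), (4.3, 0.47), (1.3, 0.20)]),
    "virginica": (1 / 3, [(6.6, 0.64), (3.0, 0.32), (5.6, 0.55), (2.0, 0.27)]),
}


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def score(row, detailed=False):
    """Classify one row; with detailed=True also return class probabilities."""
    log_scores = {}
    for label, (prior, params) in CLASSES.items():
        # Naive Bayes: log prior plus the sum of per-column log likelihoods.
        log_scores[label] = math.log(prior) + sum(
            gaussian_log_pdf(x, mean, std) for x, (mean, std) in zip(row, params)
        )
    best = max(log_scores, key=log_scores.get)
    if not detailed:
        return best
    # Normalize in log space (log-sum-exp) to avoid numeric underflow.
    top = log_scores[best]
    total = sum(math.exp(value - top) for value in log_scores.values())
    probabilities = {
        label: math.exp(value - top) / total for label, value in log_scores.items()
    }
    return best, probabilities


row = [5.1, 3.5, 1.4, 0.2]  # four flower dimensions for one row
print(score(row))
print(score(row, detailed=True))
```

The detailed flag plays the role of the optional Boolean argument: without it, only the class label is returned; with it, the label is accompanied by one probability per class, and those probabilities sum to 1.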