
Are you more of a command line aficionado than a user interface promoter? Wish you could navigate Datameer artifacts using the command line? With this guide, you will be able to do a variety of things with Datameer artifacts without the Datameer user interface (UI).

 


 

Sample data download

Download: The Flight Delays app from the Datameer App Market.

 

Find your REST call (GET)

In this example, you will start with the GET command. This command will allow you to retrieve the configurations for a particular Datameer artifact. In order to run a REST API command via command line, you will need your username and password for Datameer, as well as the URL for Datameer.

 

Open up whichever command line application you prefer. Here is the command you will be using:

curl -u <username>:<password> -X GET 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'


Replace the placeholders with the username and password for your Datameer instance, the Datameer URL, and the job configuration ID. You can find the job configuration ID in the Datameer UI: go to your Datameer instance, select the Airports import job, and look in the Information Browser to the right of the artifacts, where the ID is listed.

 

A completed REST call with GET will look something like this:

curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23'
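To avoid retyping the connection details for every call, the pieces of the command can be kept in shell variables. The following is a minimal sketch rather than part of the original guide: the function names import_job_url and get_import_job and the DM_* variable names are hypothetical, and the values are the sample ones from the completed call above.

```shell
# Hypothetical defaults; replace them with the details of your own Datameer instance.
DM_HOST="localhost"
DM_PORT="8080"
DM_USER="admin"
DM_PASS="admin"

# Build the REST URL for one import job from its job configuration ID ($1).
import_job_url() {
  printf 'http://%s:%s/rest/import-job/%s\n' "$DM_HOST" "$DM_PORT" "$1"
}

# Print the configuration of the import job with the given ID ($1).
# -f makes curl exit with a non-zero status on an HTTP error, such as a wrong ID.
get_import_job() {
  curl -f -u "$DM_USER:$DM_PASS" -X GET "$(import_job_url "$1")"
}
```

With these definitions loaded, running get_import_job 23 issues the same request as the completed call above.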

 

Understand the return for the GET command

Once you run the command, your return should look like this:

{

  "version": "4.0.2",

  "className": "datameer.dap.common.entity.DataSourceConfigurationImpl",

  "file": {

    "uuid": "df4fe482-d07f-4f53-aa04-b58aac43f594",

    "path": "/Users/admin/Applications/Flight Delays/Resources/Airports.imp",

    "description": "",

    "name": "Airports"

  },

  "pullType": "MANUALLY",

  "minKeepCount": 1,

  "properties": {

    "TextFileFormat": [

      "TEXT"

    ],

    "fileNameTimeRange_mode": [

      "OFF"

    ],

    "fileNameTimeRange_startDate": [

      ""

    ],

    "filter.minAge": [

      ""

    ],

    "filter.maxAge": [

      ""

    ],

    "characterEncoding": [

      "UTF-8"

    ],

    "recordSampleSize": [

      "1000"

    ],

    "escapeCharacter": [

      ""

    ],

    "detectColumnDefinition": [

      "SELECT_PARSE_AUTO"

    ],

    "collectAdditionalFields": [

      "false"

    ],

    "quoteCharacter": [

      "\""

    ],

    "delimiter": [

      ","

    ],

    "csv.max-lines-per-record": [

      "1"

    ],

    "external.store": [

      "false"

    ],

    "filter.page.does.split.creation": [

      "false"

    ],

    "fileType": [

      "CSV"

    ],

    "GenericConfigurationImpl.temp-file-store": [

      "1dad24d5-96c2-4af1-8460-b206f8df3cd2"

    ],

    "incrementalMode": [

      "false"

    ],

    "histogram.generation": [

      "false"

    ],

    "file": [

      "flightdelays/ICAOAirports.csv.zip"

    ],

    "strictQuotes": [

      "false"

    ]

  },

  "hadoopProperties": "",

  "dataStore": {

    "path": "/Users/admin/Applications/Flight Delays/Resources/Examples in S3.dst",

    "uuid": "a61e955c-576d-47b5-b50a-8554403eddbb"

  },

  "errorHandlingMode": "DROP_RECORD",

  "maxLogErrors": 1000,

  "maxPreviewRecords": 5000,

  "notificationAddresses": "",

  "notificationSuccessAddresses": "",

  "fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "id",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 348,

      "pattern": "",

      "acceptEmpty": true,

      "name": "ident",

      "origin": "1",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 349,

      "pattern": "",

      "acceptEmpty": true,

      "name": "type",

      "origin": "2",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 350,

      "pattern": "",

      "acceptEmpty": true,

      "name": "name",

      "origin": "3",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 351,

      "pattern": "",

      "acceptEmpty": true,

      "name": "latitude_deg",

      "origin": "4",

      "valueType": "{\"type\":\"FLOAT\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 352,

      "pattern": "",

      "acceptEmpty": true,

      "name": "longitude_deg",

      "origin": "5",

      "valueType": "{\"type\":\"FLOAT\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 353,

      "pattern": "",

      "acceptEmpty": true,

      "name": "elevation_ft",

      "origin": "6",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 354,

      "pattern": "",

      "acceptEmpty": true,

      "name": "continent",

      "origin": "7",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 355,

      "pattern": "",

      "acceptEmpty": true,

      "name": "iso_country",

      "origin": "8",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 356,

      "pattern": "",

      "acceptEmpty": true,

      "name": "iso_region",

      "origin": "9",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 357,

      "pattern": "",

      "acceptEmpty": true,

      "name": "municipality",

      "origin": "10",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 358,

      "pattern": "",

      "acceptEmpty": true,

      "name": "scheduled_service",

      "origin": "11",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 359,

      "pattern": "",

      "acceptEmpty": true,

      "name": "gps_code",

      "origin": "12",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 360,

      "pattern": "",

      "acceptEmpty": true,

      "name": "iata_code",

      "origin": "13",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 361,

      "pattern": "",

      "acceptEmpty": true,

      "name": "local_code",

      "origin": "14",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 362,

      "pattern": "",

      "acceptEmpty": true,

      "name": "home_link",

      "origin": "15",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 363,

      "pattern": "",

      "acceptEmpty": true,

      "name": "wikipedia_link",

      "origin": "16",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 364,

      "pattern": "",

      "acceptEmpty": true,

      "name": "keywords",

      "origin": "17",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 365,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasFileName",

      "origin": "fileInfo.fileName",

      "valueType": "{\"type\":\"STRING\"}",

      "include": false,

      "version": 3

    },

    {

      "id": 366,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasFilePath",

      "origin": "fileInfo.filePath",

      "valueType": "{\"type\":\"STRING\"}",

      "include": false,

      "version": 3

    },

    {

      "id": 367,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasLastModified",

      "origin": "fileInfo.lastModified",

      "valueType": "{\"type\":\"DATE\"}",

      "include": false,

      "version": 3

    }

  ]

}

 

This is a return of the configurations that have been set up to import the data into Datameer. If you look closely, you will recognize some things that you see in the import wizard within the Datameer UI:

"version": "4.0.2",

  "className": "datameer.dap.common.entity.DataSourceConfigurationImpl",

  "file": {

    "uuid": "df4fe482-d07f-4f53-aa04-b58aac43f594",

    "path": "/Users/admin/Applications/Flight Delays/Resources/Airports.imp",

    "description": "",

    "name": "Airports"

 

The first part gives you general information about your Datameer instance, the path of the artifact in Datameer, any description and the name of the artifact. 

The next portion (below) gives you more details on the configuration of the import, such as the file format, whether a histogram is generated for this job, CSV settings, record sample size, partitioning, custom properties, and email notification settings. In the Datameer UI, you would see the same type of configuration by right-clicking the artifact and selecting “Configure”.

"pullType": "MANUALLY",

  "minKeepCount": 1,

  "properties": {

    "TextFileFormat": [

      "TEXT"

    ],

    "fileNameTimeRange_mode": [

      "OFF"

    ],

    "fileNameTimeRange_startDate": [

      ""

    ],

    "filter.minAge": [

      ""

    ],

    "filter.maxAge": [

      ""

    ],

    "characterEncoding": [

      "UTF-8"

    ],

    "recordSampleSize": [

      "1000"

    ],

    "escapeCharacter": [

      ""

    ],

    "detectColumnDefinition": [

      "SELECT_PARSE_AUTO"

    ],

    "collectAdditionalFields": [

      "false"

    ],

    "quoteCharacter": [

      "\""

    ],

    "delimiter": [

      ","

    ],

    "csv.max-lines-per-record": [

      "1"

    ],

    "external.store": [

      "false"

    ],

    "filter.page.does.split.creation": [

      "false"

    ],

    "fileType": [

      "CSV"

    ],

    "GenericConfigurationImpl.temp-file-store": [

      "1dad24d5-96c2-4af1-8460-b206f8df3cd2"

    ],

    "incrementalMode": [

      "false"

    ],

    "histogram.generation": [

      "false"

    ],

    "file": [

      "flightdelays/ICAOAirports.csv.zip"

    ],

    "strictQuotes": [

      "false"

    ]

  },

  "hadoopProperties": "",

  "dataStore": {

    "path": "/Users/admin/Applications/Flight Delays/Resources/Examples in S3.dst",

    "uuid": "a61e955c-576d-47b5-b50a-8554403eddbb"

  },

  "errorHandlingMode": "DROP_RECORD",

  "maxLogErrors": 1000,

  "maxPreviewRecords": 5000,

  "notificationAddresses": "",

  "notificationSuccessAddresses": "",

 

The final part of the return shows the columns and column configurations for the import job:

"fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "id",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 348,

      "pattern": "",

      "acceptEmpty": true,

      "name": "ident",

      "origin": "1",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 349,

      "pattern": "",

      "acceptEmpty": true,

      "name": "type",

      "origin": "2",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

.......................................   

    {

      "id": 367,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasLastModified",

      "origin": "fileInfo.lastModified",

      "valueType": "{\"type\":\"DATE\"}",

      "include": false,

      "version": 3

    }

  ]

}
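To get a quick overview of the columns without reading the whole return, the "name" values under "fields" can be filtered out on the command line. This is a small sketch under one assumption: the return is pretty-printed with one property per line, as in the listing above. The function name list_field_names is hypothetical.

```shell
# Print the "name" of every column listed under "fields".
# Reads pretty-printed JSON (one property per line) from standard input.
# The address range starts at the "fields" line, so the artifact name in the
# earlier "file" section is not printed.
list_field_names() {
  sed -n '/"fields"/,$ s/.*"name": "\([^"]*\)".*/\1/p'
}
```

For example, piping the GET call through it, as in curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23' | list_field_names, prints one column name per line.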

 

Download the return

Now that you understand what the return contains, download it to a file so that you can make changes to it.

The command is almost identical to the first GET command, except that it redirects the output to a file:

 

curl -u <username>:<password> -X GET 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>' > Airports.json


This command saves the file in the current working directory of your command line unless you specify another location. For example, you can first use the following command to navigate to your downloads folder:

cd /Users/<username>/Downloads  


After you run this command in the terminal, run the GET command and the file is saved to this folder. If you do not want to navigate to the folder first, specify the full path of the output file in the GET command instead:

curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23' > /Users/username/Downloads/Airports.json
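A redirect with > writes whatever the server returns, including an error message, into the file. As a hedged alternative, curl's own -o option can write the file while -f turns an HTTP error into a non-zero exit status. The function name save_import_job and the variables DM_USER and DM_PASS are hypothetical.

```shell
# Save the configuration of an import job to a file.
# $1 = full REST URL of the import job, $2 = path of the file to write.
# DM_USER and DM_PASS are hypothetical variables holding your Datameer credentials.
save_import_job() {
  # -f: fail on HTTP errors; -sS: hide the progress meter but still show errors.
  curl -f -sS -u "$DM_USER:$DM_PASS" -X GET -o "$2" "$1" || return 1
  # The -s test is true only if the file exists and is not empty.
  [ -s "$2" ] && echo "Saved $2"
}
```

A call such as save_import_job 'http://localhost:8080/rest/import-job/23' /Users/username/Downloads/Airports.json then corresponds to the redirect shown above.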

 

Edit the JSON file

Now that you have a physical file to edit, you can make changes. You can do this in two ways:

  1. Through a text editor such as TextWrangler, Smultron, or Sublime.
  2. From command line.

If you would like to make these changes in a text editor, simply open the downloaded file in that application.

If you would like to edit the file from command line, you will run the following command if you are already in the same directory as the file:

vi Airports.json


If you are not in the same directory, your command will look something like this:

vi /Users/username/Downloads/Airports.json

 

Now you will change a few things. If you are using vi on the command line, press “i” to enter insert mode and begin editing.

First, change the record sample size from 1,000 to 5,000. This increases the sample set that the workbook uses. To do this, find the following configuration:

 

Before change

"recordSampleSize": [

  "1000"

 ], 


Change the 1000 to 5000.

After change

"recordSampleSize": [

  "5000" 

], 
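The same change can be scripted instead of made by hand. The sketch below uses sed and assumes the pretty-printed layout shown above, with the value on its own line below "recordSampleSize". The function name set_record_sample_size is hypothetical.

```shell
# Print the JSON from the file $1 with the record sample size set to $2.
# The substitution is limited to the lines from "recordSampleSize" up to the
# next line containing "]" (written as the bracket expression []]), so other
# quoted numbers in the file stay untouched.
set_record_sample_size() {
  sed '/"recordSampleSize"/,/[]]/ s/"[0-9][0-9]*"/"'"$2"'"/' "$1"
}
```

The function writes to standard output rather than editing in place, because the in-place option of sed differs between platforms. One way to apply it is set_record_sample_size Airports.json 5000 > Airports.new.json followed by mv Airports.new.json Airports.json.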

 

Now, change the names of some of the columns. Find the “fields” section of the JSON. Make sure you are in edit mode, and change the value of the “name” property for the columns you want to rename.

You can change these names to whatever you wish, as long as they start with an alphanumeric character and contain no spaces (e.g., use “ID_Tags” instead of “ID Tags”).

In the example below, the name of the column changes from "id" to "Identifier".

Before changes

"fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "id",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

After changes

"fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "Identifier",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },
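A rename like this one can also be scripted, together with a check of the naming rule stated above. This is a sketch under the same pretty-printed-layout assumption; the function names is_valid_column_name and rename_column are hypothetical, and the sed expression assumes that neither name contains characters that are special to sed, such as / or &.

```shell
# Succeed only if $1 starts with an alphanumeric character and contains no spaces.
is_valid_column_name() {
  case "$1" in
    *" "*) return 1 ;;          # contains a space
    [A-Za-z0-9]*) return 0 ;;   # starts with a letter or digit
    *) return 1 ;;              # empty, or starts with another character
  esac
}

# Print the JSON from the file $1 with the column named $2 renamed to $3.
# Only lines from "fields" onward are changed, so the artifact name in the
# "file" section is left alone.
rename_column() {
  is_valid_column_name "$3" || { echo "invalid column name: $3" >&2; return 1; }
  sed '/"fields"/,$ s/"name": "'"$2"'"/"name": "'"$3"'"/' "$1"
}
```

For example, rename_column Airports.json id Identifier > Airports.new.json writes a copy of the file with the first column renamed.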


Make sure you save changes to the JSON.

If you are in vi on the command line, press Esc, then type :x and press Return to save and exit.

You can use normal saving options if you are in a text editor. 

 

Post the changes to Datameer

Now that you have made some changes, you want to send them back to Datameer. Since you are updating an import job that already exists, use the PUT command:

curl -u <username>:<password> -X PUT -d @<job-payload>.json 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'


This command identifies the edited file with @<job-payload>.json and tells Datameer which import job to update with 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'.

If you are in the same directory as the file you want to PUT, you only need the file name:

curl -u admin:admin -X PUT -d @Airports.json 'http://localhost:8080/rest/import-job/23'


If you are not in the directory where your file resides, make sure you put in the directory path: 

curl -u admin:admin -X PUT -d @/Users/<username>/Downloads/Airports.json 'http://localhost:8080/rest/import-job/23'


Once you run this command, you will get a response that tells you whether the PUT succeeds or not. It should look like this:

{
"status": "success"
}


This means you have successfully updated the import configuration outside of the Datameer UI!
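When the PUT is part of a script, the response can be checked automatically instead of by eye. The following is a minimal sketch that looks for the "status": "success" text shown above. The function name put_import_job and the variables DM_USER and DM_PASS are hypothetical, and the shape of a failure response is not covered by this guide, so anything other than the success text is simply treated as a failure.

```shell
# Send an edited configuration back to Datameer and report whether it was accepted.
# $1 = path of the edited JSON file, $2 = full REST URL of the import job.
# DM_USER and DM_PASS are hypothetical variables holding your Datameer credentials.
put_import_job() {
  response=$(curl -sS -u "$DM_USER:$DM_PASS" -X PUT -d @"$1" "$2") || return 1
  echo "$response"
  # Succeed only if the response contains "status": "success".
  echo "$response" | grep -q '"status": *"success"'
}
```

A call such as put_import_job Airports.json 'http://localhost:8080/rest/import-job/23' && echo "Import job updated" prints the confirmation only when Datameer reports success.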

Now, go into Datameer, right-click the Airports import artifact, and select “Show Details”. You should now see your configuration changes under the Configuration Details tab.


 For more information on REST API and Datameer, see: Accessing Datameer Using the REST API

 

 
