Open up whichever command line application you prefer. Here is the command you will be using:

curl -u <username>:<password> -X GET 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'


You must make sure to fill in your actual username and password for your Datameer instance along with the Datameer URL and job configuration ID. Your job configuration ID can be found in your Datameer UI. Go to your Datameer instance and select the Airports import job and view the Information Browser to the right of the artifacts. You will find the ID here:

 

A completed REST call with GET will look something like this:

curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23'
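
For readers who would rather script the call, the same GET request can be sketched in Python with only the standard library. This is a rough equivalent of the curl command, not an official Datameer client: the function names `build_get_request` and `send` are made up for this illustration, and the host, port, job ID, and credentials are the example values used above.

```python
import base64
import urllib.request


def build_get_request(base_url, job_id, username, password):
    """Build (but do not send) a GET request for one import job configuration."""
    url = f"{base_url}/rest/import-job/{job_id}"
    # curl -u sends HTTP Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def send(request):
    """Send the request and return the response body as text.

    This needs a reachable Datameer instance; urllib raises an
    HTTPError for 4xx/5xx answers and a URLError if the host is down.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_get_request("http://localhost:8080", 23, "admin", "admin"))` should return the same JSON text that the curl command prints.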

 

Understand the return for the GET command

Once you run the command, your return should look like this:

{

  "version": "4.0.2",

  "className": "datameer.dap.common.entity.DataSourceConfigurationImpl",

  "file": {

    "uuid": "df4fe482-d07f-4f53-aa04-b58aac43f594",

    "path": "/Users/admin/Applications/Flight Delays/Resources/Airports.imp",

    "description": "",

    "name": "Airports"

  },

  "pullType": "MANUALLY",

  "minKeepCount": 1,

  "properties": {

    "TextFileFormat": [

      "TEXT"

    ],

    "fileNameTimeRange_mode": [

      "OFF"

    ],

    "fileNameTimeRange_startDate": [

      ""

    ],

    "filter.minAge": [

      ""

    ],

    "filter.maxAge": [

      ""

    ],

    "characterEncoding": [

      "UTF-8"

    ],

    "recordSampleSize": [

      "1000"

    ],

    "escapeCharacter": [

      ""

    ],

    "detectColumnDefinition": [

      "SELECT_PARSE_AUTO"

    ],

    "collectAdditionalFields": [

      "false"

    ],

    "quoteCharacter": [

      "\""

    ],

    "delimiter": [

      ","

    ],

    "csv.max-lines-per-record": [

      "1"

    ],

    "external.store": [

      "false"

    ],

    "filter.page.does.split.creation": [

      "false"

    ],

    "fileType": [

      "CSV"

    ],

    "GenericConfigurationImpl.temp-file-store": [

      "1dad24d5-96c2-4af1-8460-b206f8df3cd2"

    ],

    "incrementalMode": [

      "false"

    ],

    "histogram.generation": [

      "false"

    ],

    "file": [

      "flightdelays/ICAOAirports.csv.zip"

    ],

    "strictQuotes": [

      "false"

    ]

  },

  "hadoopProperties": "",

  "dataStore": {

    "path": "/Users/admin/Applications/Flight Delays/Resources/Examples in S3.dst",

    "uuid": "a61e955c-576d-47b5-b50a-8554403eddbb"

  },

  "errorHandlingMode": "DROP_RECORD",

  "maxLogErrors": 1000,

  "maxPreviewRecords": 5000,

  "notificationAddresses": "",

  "notificationSuccessAddresses": "",

  "fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "id",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 348,

      "pattern": "",

      "acceptEmpty": true,

      "name": "ident",

      "origin": "1",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 349,

      "pattern": "",

      "acceptEmpty": true,

      "name": "type",

      "origin": "2",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 350,

      "pattern": "",

      "acceptEmpty": true,

      "name": "name",

      "origin": "3",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 351,

      "pattern": "",

      "acceptEmpty": true,

      "name": "latitude_deg",

      "origin": "4",

      "valueType": "{\"type\":\"FLOAT\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 352,

      "pattern": "",

      "acceptEmpty": true,

      "name": "longitude_deg",

      "origin": "5",

      "valueType": "{\"type\":\"FLOAT\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 353,

      "pattern": "",

      "acceptEmpty": true,

      "name": "elevation_ft",

      "origin": "6",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 354,

      "pattern": "",

      "acceptEmpty": true,

      "name": "continent",

      "origin": "7",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 355,

      "pattern": "",

      "acceptEmpty": true,

      "name": "iso_country",

      "origin": "8",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 356,

      "pattern": "",

      "acceptEmpty": true,

      "name": "iso_region",

      "origin": "9",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 357,

      "pattern": "",

      "acceptEmpty": true,

      "name": "municipality",

      "origin": "10",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 358,

      "pattern": "",

      "acceptEmpty": true,

      "name": "scheduled_service",

      "origin": "11",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 359,

      "pattern": "",

      "acceptEmpty": true,

      "name": "gps_code",

      "origin": "12",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 360,

      "pattern": "",

      "acceptEmpty": true,

      "name": "iata_code",

      "origin": "13",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 361,

      "pattern": "",

      "acceptEmpty": true,

      "name": "local_code",

      "origin": "14",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 362,

      "pattern": "",

      "acceptEmpty": true,

      "name": "home_link",

      "origin": "15",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 363,

      "pattern": "",

      "acceptEmpty": true,

      "name": "wikipedia_link",

      "origin": "16",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 364,

      "pattern": "",

      "acceptEmpty": true,

      "name": "keywords",

      "origin": "17",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 365,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasFileName",

      "origin": "fileInfo.fileName",

      "valueType": "{\"type\":\"STRING\"}",

      "include": false,

      "version": 3

    },

    {

      "id": 366,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasFilePath",

      "origin": "fileInfo.filePath",

      "valueType": "{\"type\":\"STRING\"}",

      "include": false,

      "version": 3

    },

    {

      "id": 367,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasLastModified",

      "origin": "fileInfo.lastModified",

      "valueType": "{\"type\":\"DATE\"}",

      "include": false,

      "version": 3

    }

  ]

}

 

This is a return of the configurations that have been set up to import the data into Datameer. If you look closely, you will recognize some things that you see in the import wizard within the Datameer UI:

"version": "4.0.2",

  "className": "datameer.dap.common.entity.DataSourceConfigurationImpl",

  "file": {

    "uuid": "df4fe482-d07f-4f53-aa04-b58aac43f594",

    "path": "/Users/admin/Applications/Flight Delays/Resources/Airports.imp",

    "description": "",

    "name": "Airports"

 

The first part gives you general information about your Datameer instance, the path of the artifact in Datameer, any description and the name of the artifact. 

The next portion (below) gives you more details on the configuration of the import, such as the file format, whether a histogram is generated for this job, CSV settings, record sample size, partitioning, custom Hadoop properties, and email notification settings. In the Datameer UI, you would see the same type of configuration by right-clicking the artifact and selecting “Configure”.

"pullType": "MANUALLY",

  "minKeepCount": 1,

  "properties": {

    "TextFileFormat": [

      "TEXT"

    ],

    "fileNameTimeRange_mode": [

      "OFF"

    ],

    "fileNameTimeRange_startDate": [

      ""

    ],

    "filter.minAge": [

      ""

    ],

    "filter.maxAge": [

      ""

    ],

    "characterEncoding": [

      "UTF-8"

    ],

    "recordSampleSize": [

      "1000"

    ],

    "escapeCharacter": [

      ""

    ],

    "detectColumnDefinition": [

      "SELECT_PARSE_AUTO"

    ],

    "collectAdditionalFields": [

      "false"

    ],

    "quoteCharacter": [

      "\""

    ],

    "delimiter": [

      ","

    ],

    "csv.max-lines-per-record": [

      "1"

    ],

    "external.store": [

      "false"

    ],

    "filter.page.does.split.creation": [

      "false"

    ],

    "fileType": [

      "CSV"

    ],

    "GenericConfigurationImpl.temp-file-store": [

      "1dad24d5-96c2-4af1-8460-b206f8df3cd2"

    ],

    "incrementalMode": [

      "false"

    ],

    "histogram.generation": [

      "false"

    ],

    "file": [

      "flightdelays/ICAOAirports.csv.zip"

    ],

    "strictQuotes": [

      "false"

    ]

  },

  "hadoopProperties": "",

  "dataStore": {

    "path": "/Users/admin/Applications/Flight Delays/Resources/Examples in S3.dst",

    "uuid": "a61e955c-576d-47b5-b50a-8554403eddbb"

  },

  "errorHandlingMode": "DROP_RECORD",

  "maxLogErrors": 1000,

  "maxPreviewRecords": 5000,

  "notificationAddresses": "",

  "notificationSuccessAddresses": "",

 

The final part of the return shows the columns and column configurations for the import job:

"fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "id",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 348,

      "pattern": "",

      "acceptEmpty": true,

      "name": "ident",

      "origin": "1",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

    {

      "id": 349,

      "pattern": "",

      "acceptEmpty": true,

      "name": "type",

      "origin": "2",

      "valueType": "{\"type\":\"STRING\"}",

      "include": true,

      "version": 3

    },

.......................................   

    {

      "id": 367,

      "pattern": "",

      "acceptEmpty": false,

      "name": "dasLastModified",

      "origin": "fileInfo.lastModified",

      "valueType": "{\"type\":\"DATE\"}",

      "include": false,

      "version": 3

    }

  ]

}
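
The structure described above can also be inspected programmatically. The sketch below assumes the return has already been parsed into a Python dictionary (for example with `json.loads`). The helper names `describe_fields` and `unwrap_properties` are invented for this example. Note that each "valueType" is a JSON document stored inside a string, so it needs a second parse, and that every entry under "properties" is a list of strings.

```python
import json


def describe_fields(config):
    """Return (name, type, included) for every column of the import job."""
    summary = []
    for field in config["fields"]:
        # "valueType" holds escaped JSON such as {"type":"INTEGER"},
        # so it has to be decoded separately from the outer document.
        value_type = json.loads(field["valueType"])["type"]
        summary.append((field["name"], value_type, field["include"]))
    return summary


def unwrap_properties(config):
    """Map each property name to its first value ("" if the list is empty)."""
    return {
        key: (values[0] if values else "")
        for key, values in config["properties"].items()
    }
```

For the Airports return shown above, `describe_fields` would list entries such as `("id", "INTEGER", True)` and `("dasLastModified", "DATE", False)`.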

 

Download the return

Now that you understand what the return contains, download it with a slightly different GET command so that you can make changes to it.

The command looks very similar to the first GET command, except that you redirect the output to a file:

 

curl -u <username>:<password> -X GET 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>' > Airports.json


When you run this command, the file is saved in the current working directory of your command line unless you specify a different target directory. For example, you can use the following command to navigate to your downloads folder first:

cd /Users/<username>/Downloads  


After you press Enter in the terminal, run the GET command and the file is saved to this folder. If you would rather not change directories, specify the full target path in the GET command instead:

curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23' > /Users/username/Downloads/Airports.json
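
If you fetch the configuration from a script rather than with curl, the redirect to a file can be replaced by a small helper. The following is a sketch only: `save_config` is a hypothetical name, and it assumes the response body is already available as text. Parsing the text before writing it catches the case where the server answered with something other than JSON, such as an error page.

```python
import json
from pathlib import Path


def save_config(response_text, target):
    """Check that the response is valid JSON, then write it to target."""
    # json.loads raises ValueError if the text is not JSON.
    config = json.loads(response_text)
    path = Path(target).expanduser()
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path
```

For example, `save_config(text, "~/Downloads/Airports.json")` writes the file to the downloads folder of the current user.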

 

Edit the JSON file

Now that you have a physical file to edit, you can make changes. You can do this in two different ways:

  1. Through a text editor such as TextWrangler, Smultron, or Sublime.
  2. From command line.

If you would like to make these changes in a text editor, simply open the downloaded file in that application.

If you would like to edit the file from command line, you will run the following command if you are already in the same directory as the file:

vi Airports.json


If you are not in the same directory, your command will look something like this:

vi /Users/username/Downloads/Airports.json

 

Now you will change a few things. If you are using vi on the command line, press i to enter insert mode.

First, change the record sample size from 1,000 to 5,000. This increases the sample set that the workbook uses. To do this, find the following configuration:

 

Before change

"recordSampleSize": [

  "1000"

 ], 


Change the 1000 to 5000.

After change

"recordSampleSize": [

  "5000" 

], 
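
The same edit can be made without opening an editor. The sketch below is one possible approach, with `set_property` as an invented helper name. It keeps the shape used throughout the return, where every property value is a list containing a string, even when the value is a number.

```python
import json


def set_property(config, key, value):
    """Set one entry under "properties", keeping the list-of-strings shape."""
    if key not in config["properties"]:
        raise KeyError(f"unknown property: {key}")
    # Values are stored as strings inside a list, e.g. ["5000"], not 5000.
    config["properties"][key] = [str(value)]
    return config


def edit_file(path, key, value):
    """Load the downloaded JSON file, change one property, and save it again."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    set_property(config, key, value)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```

With the file from the previous step, `edit_file("Airports.json", "recordSampleSize", 5000)` makes the change shown above.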

 

Next, rename some of the columns. Find the “fields” section of the JSON, make sure you are in edit mode, and change the value of “name” for the columns you want to rename.

You can change these names to whatever you wish as long as they start with an alphanumeric character and contain no spaces (for example, use “ID_Tags” instead of “ID Tags”).

In the example below, change the name of the column from "id" to "Identifier" 

Before change

"fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "id",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },

After change

"fields": [

    {

      "id": 347,

      "pattern": "",

      "acceptEmpty": true,

      "name": "Identifier",

      "origin": "0",

      "valueType": "{\"type\":\"INTEGER\"}",

      "include": true,

      "version": 3

    },
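
A scripted version of the rename might look like the sketch below. The name `rename_field` is hypothetical, and the check only encodes the naming rule stated in this guide (the name starts with an alphanumeric character and contains no spaces). Datameer may enforce further rules that are not covered here.

```python
def rename_field(config, old_name, new_name):
    """Rename one column in the "fields" list of an import job configuration."""
    # Rule from this guide: start with an alphanumeric character, no spaces.
    if not new_name or not new_name[0].isalnum() or " " in new_name:
        raise ValueError(f"invalid column name: {new_name!r}")
    for field in config["fields"]:
        if field["name"] == old_name:
            field["name"] = new_name
            return config
    raise KeyError(f"no column named {old_name!r}")
```

For the example above, `rename_field(config, "id", "Identifier")` changes only the "name" value and leaves the numeric "id" of the field untouched.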


Make sure you save changes to the JSON.

If you are using vi on the command line, press Esc, then type :x and press Return.

You can use normal saving options if you are in a text editor. 

 

Post the changes to Datameer

Now that you have made some changes, you want to input these changes into Datameer. Since you are updating a file that already exists, use the PUT command: 

curl -u <username>:<password> -X PUT -d @<job-payload>.json 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'


This command identifies the edited file with @<job-payload>.json and tells Datameer which job configuration to update with 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'.

If you are in the same directory as the file you want to upload, the file name alone is enough:

curl -u admin:admin -X PUT -d @Airports.json 'http://localhost:8080/rest/import-job/23'


If you are not in the directory where your file resides, make sure you put in the directory path: 

curl -u admin:admin -X PUT -d @/Users/<username>/Downloads/Airports.json 'http://localhost:8080/rest/import-job/23'
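
The PUT call can be sketched in Python in the same way as the GET call. As before, `build_put_request` and `send` are illustrative names, not part of any Datameer library, and the URL and credentials are the example values from this guide. The sketch deliberately sets no Content-Type header: like curl -d, urllib then falls back to application/x-www-form-urlencoded when sending. This guide does not state whether Datameer expects a specific content type.

```python
import base64
import urllib.request
from pathlib import Path


def build_put_request(base_url, job_id, username, password, payload_path):
    """Build (but do not send) a PUT request that uploads the edited JSON file."""
    payload = Path(payload_path).expanduser().read_bytes()
    url = f"{base_url}/rest/import-job/{job_id}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    # The file content is the request body, as with curl -d @file.
    request = urllib.request.Request(url, data=payload, method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    return request


def send(request):
    """Send the request to a running Datameer instance and return the body."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

One small difference from curl: -d strips line breaks from the file before sending, while this sketch sends the bytes unchanged. For JSON the two bodies are equivalent.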


Once you run this command, you will get a response that tells you whether the PUT succeeds or not. It should look like this:

{
"status": "success"
}

This means you have successfully updated the import configuration outside of the Datameer UI!
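
In a script, the response can be checked instead of read by eye. The helper below is a small sketch (`put_succeeded` is an invented name) and relies only on the response shape shown in this guide. What a failed PUT returns is not documented here, so anything other than a "status" of "success" is treated as a failure.

```python
import json


def put_succeeded(response_text):
    """Return True only if the response is JSON with "status": "success"."""
    try:
        body = json.loads(response_text)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page.
        return False
    return isinstance(body, dict) and body.get("status") == "success"
```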

Now, go into Datameer, right-click the Airports import artifact, and select “Show Details”. You should see your configuration changes under the Configuration Details tab.


 For more information on REST API and Datameer, see: Accessing Datameer Using the REST API

 

 
