Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. Select Export Data from the + (plus) drop down menu located in the top left corner of the screen or right-click in the navigation bar are the left side of the screen and select Create New > Export Data.
  2. Select a workbook by clicking Select then click Next.
  3. Select which sheet to export, then click Next. At the bottom of the page you see a preview of the a sheet's data.
  4. Select which connection the data should be exported to, either an existing connection or create a new one. Select to export the data as a CSV file, an AVRO file, a Parquet file, or Tableau (TDSX) file. Then click Next

  5. Enter the information on the Data Details tab which varies depending on the type of file you are creating. Click Next.
    1. Under the File settings, enter the file name and file path. Variable placeholders can be used instead of static values. Exports with the same file name will be overwritten.

      Under Advanced Settings, enter the maximum file size and how consecutive files should be numbered. All export jobs add a task number to the end of the filenames (before the extension) generated by the export job.
    2. There are additional settings if the file is being exported as a CSV. Select whether column headers should be contained in the first row, and how delimiter, quote, and escape characters should be defined.
      If you are exporting a float value to a .CSV, an additional trailing zero (.0) is added following the original value. Leading zeros (0.) are removed if the integer part of the float value is larger than zero.

      If you export to a database such as MySQL, you must select a table and select  either Use Existing Table to overwrite the database table or Create new if it doesn't exist to add the table. If you use an existing table with a large amount of previous data, the overall export might be slow as it first needs to delete the old data. However, if you create a new table, delete statements aren't used and performance might be improved. There is a hard limit of 1,000 entries.

      Exporting with SFTP

      An additional requirement for exporting with SFTP is that the user creating the export job must have permissions to create a folder (mkdir) on the SFTP server. This permission is required even when the folder on the SFTP server already exists.

      Anchor
      export_database
      export_database
      Exporting to a database

      When exporting to a database, the database connector is associated with a user name. This database user must have write and/or create permissions on the database itself in order to complete the export job.

      When exporting data to DB2 or to an Oracle connection, there are limitations on the number of characters contained in string data fields.

      • for DB2, the number of characters is limited to 2000.
      • for Oracle, the number of characters is limited to 4000.

      Attempting to export data fields containing a larger number of characters results in dropped records. 

      Record replacement policy

      If the Use existing option is selected Datameer doesn't create a primary key column with name "ID". (i.e., The workbook must be aware of the ID column. All columns from the destination table (can't be null) must be mapped to a column of the workbook.)

      Append Mode

      • The records append to the existing table.

      Overwrite Mode

      • All records from the existing table are deleted.
      • New records are written to the destination table.
      Panel

      Tips:

      • Always select Use existing for tables created outside of Datameer (e.g., Netezza)
      • Never select Use existing for tables created via Datameer.

      If the Create new if not exist option is selected Datameer creates an additional column with name "ID" (primary key). When the database supports auto increment, the database computes the value of this column.

      Example: Netezza doesn't support auto increment. In this case Datameer uses a sequence database object to generate the ID. Specific for Netezza: the name of the sequence used is <tablename>_seq.

      Append Mode

      • The table is created if it doesn't already exists, otherwise the existing table is re-used without changing any schema.
      • The records are appended to the existing table.

      Overwrite Mode

      • Any previous tmp table is first dropped.
      • A tmp table is created.
      • The records are written to the tmp table.
      • The entire destination table is dropped if one exists.
      • The tmp table is renamed to the destination table.
      • The tmp table is dropped.
      Panel

      Tip:

      • Always select Create new if not exist for tables created via Datameer.
      • Never select Create new if not exist for tables created outside of Datameer (e.g., Netezza)
      Note
      iconfalse
      titleCommon problems:
      • Export a workbook into a table via Create new if not exists.
        • A new table is created with an additional primary key column with name "ID".
      • Export a second workbook with Use existing into the same table.
        • This fails because the ID is not filled. (See description above)

      Advanced settings

      • Setting the Maximum number of concurrent data base connections gives control of database parallelism. The export job can't launch more than the set number of parallel tasks. Each task opens one database connection.
      • Setting the Maximum records per transaction exports approximately this number of records within a single database transaction.
      • Setting the Rows per batch establishs a limit for the number of rows inserted for each transaction within a task. Tuning this value can influence speed and avoid potential timeouts. Very large batches can overwhelm some databases and limiting the number of rows returned at a time can reduce strain.

      Note

      Anchor
      netezza_cs
      netezza_cs
      As of Datameer v6.1

      Netezza export jobs have the option of storing table and column identifiers as case-insensitive (regular identifiers) or case-sensitive (delimited identifiers). Prior to Datameer v6.2, all identifiers were stored as case-sensitive. As of Datameer v6.2, the default setting for identifiers in Netezza export jobs is set as case-insensitive. Previous Netezza export jobs case sensitivity settings can be change within the export job wizard. The schema for the database is always case-sensitive. Changing this setting might create a new table in Netezza.

  6. Select which fields to include. Clear any fields that should be excluded. Select whether empty fields can be accepted. 

    Panel

    Tips:

    • If there is a date field included in the data, you can change the parse pattern format to be used during export. The default date parse pattern for an export job is [yyyy-MM-dd HH:mm:ss].
    • Big integers, big decimals, and dates in Datameer are treated differently in Datameer than  in Hive and use a larger range of values. Because of this they are written as strings into a Hive table on export. For a full list of how mapping works, see Data Field Types in Datameer .
    • When exporting to Teradata, a row-length limit is automatically set to 64 kilobytes by Teradata. Additionally, by default, every string column is set to a limit of 8019 bytes, so when exports with more than 8 string columns reach this limit they show an error message. To fix this, manually create the table in Teradata and then set Datameer to append instead of replace the table.


    By default, the column mapping of the export job is based on the columns of the current worksheet being exported. When editing and/or overwriting an export job, it is possible for the column mapping to be based on the the previous export job mapping instead of the current worksheet column mapping. This can be configured in the default.properties file.

    Code Block
    languagec#
    titledefault.properties
    # Enables the workbook columns to synchronize with export wizard mapping.
    # If enabled, the export wizard displays the workbook columns on the mapping step.
    # Otherwise, the saved mapping from the database is used on the export wizard's mapping step.
    export.file.inherit-workbook-column-names.enabled=true
    export.db.inherit-workbook-column-names.enabled=true
    export.other.inherit-workbook-column-names.enabled=true
  7. Select how to handle invalid records, and click Next.

  8. Select the schedule for exporting the data. Select Manually for a non-recurrent export or you can select either After the workbook is calculated or On a Schedule. If you select On a Schedule, indicate the time settings or use a custom cron pattern. Under Advanced Settings, you can enter custom properties as key/value pairs. Then click Next.
  9. Enter a description if desired, click Save As. Give the file a name, then click Save. You can also enter an email address to receive any error messages.

...

  1. Create an new export job or right-click and select Open to open the export job configuration wizard.
  2. Under the Schedule settings, select Manually, After workbook is caculated,  or Scheduled.
  3. When Scheduled is selected, specificy a time for the export job to run.
  4. Click Save to save your changes.

...

Schedules created with non complex cron patterns are converted automatically in the inspector. Select Schedule Type and Custom from the drop down menu to view or edit the schedule cron pattern. 

Editing Export Job Settings

...