Set Up Datameer on an HDFS Transparent Encryption-enabled Cluster

Configuring an HDFS Transparent Encryption-enabled Cluster with Datameer

Use the following steps to connect Datameer to an HDFS transparent encryption-enabled cluster and point the Datameer private folder to an encrypted folder:

Set the Datameer Private Folder field to the encrypted HDFS folder. Select HDFS Transparent Encryption.

Specify the KMS (Key Management Server) URI or KMS HA (High Availability) URI. 

The KMS URI is located in your Hadoop cluster's core-site.xml file. It is the value of the property hadoop.security.key.provider.path.
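As a rough illustration of the lookup described above, the following Python sketch reads the property from a core-site.xml document and drops the leading kms:// scheme. The sample XML, the host name kms1.example.com, and the helper names read_hadoop_property and kms_field_value are assumptions made for this example; they are not part of Datameer or Hadoop. Stripping kms:// is also an assumption, based on the fact that the URI examples in this section are written without that prefix.

```python
import xml.etree.ElementTree as ET

# Illustrative core-site.xml content; the host name is a placeholder.
CORE_SITE = """<configuration>
  <property>
    <name>hadoop.security.key.provider.path</name>
    <value>kms://http@kms1.example.com:16000/kms</value>
  </property>
</configuration>"""


def read_hadoop_property(xml_text, name):
    """Return the <value> of the named Hadoop property, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def kms_field_value(provider_path):
    """Drop the leading kms:// scheme, leaving the http@host:port/kms form."""
    prefix = "kms://"
    if provider_path.startswith(prefix):
        return provider_path[len(prefix):]
    return provider_path


provider_path = read_hadoop_property(CORE_SITE, "hadoop.security.key.provider.path")
print(kms_field_value(provider_path))  # http@kms1.example.com:16000/kms
```

On a real cluster, the XML text would be read from the core-site.xml file in the Hadoop configuration directory instead of the inline sample.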

Entering a KMS URI such as http@<kms1>:16000/kms in this field automatically adds the following custom properties to all jobs (the example shows an HA URI with two hosts, kms1 and kms2):

  hadoop.security.key.provider.path=<kms://http@kms1;kms2:16000/kms>
  dfs.encryption.key.provider.uri=<kms://http@kms1;kms2:16000/kms>

No Datameer service restart is required for this change.

In Datameer versions 6.0.1, 6.0.2, and 6.0.3, an additional step is required: add the custom property tez.dag.recovery.enabled=false

For a Key Management Server (KMS) in High Availability (HA) mode, provide both hosts in the URI, separated by a semicolon: http@<kms1>;<kms2>:16000/kms

Configuring Datameer for KMS in HA mode requires a Datameer service restart.

If the KMS is configured for TLS/SSL, use https as the protocol: https@<kms1>:16000/kms or https@<kms1>;<kms2>:16000/kms.
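The URI variants above (single host, HA, and TLS/SSL) follow one pattern, which can be sketched in Python as shown here. The function name build_kms_uri is hypothetical, and the default port 16000 and the /kms path are simply taken from the examples in this section; your cluster may use different values.

```python
def build_kms_uri(hosts, port=16000, tls=False):
    """Build the value for the KMS URI field from one or more KMS hosts."""
    if not hosts:
        raise ValueError("at least one KMS host is required")
    scheme = "https" if tls else "http"
    # HA mode: all hosts are joined with semicolons and share one port.
    return f"{scheme}@{';'.join(hosts)}:{port}/kms"


print(build_kms_uri(["kms1"]))                    # http@kms1:16000/kms
print(build_kms_uri(["kms1", "kms2"]))            # http@kms1;kms2:16000/kms
print(build_kms_uri(["kms1", "kms2"], tls=True))  # https@kms1;kms2:16000/kms
```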

Requirements

On a server with Kerberos security enabled, make sure your KMS kms-site.xml file contains the following properties:

  <property>
    <name>hadoop.kms.proxyuser.datameer.groups</name>
    <value>dasuser</value>
  </property>
  <property>
    <name>hadoop.kms.proxyuser.datameer.hosts</name>
    <value>*</value>
  </property>

These properties allow the system user running Datameer to impersonate users in secure impersonation mode.
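To check a kms-site.xml file for these two properties, a small Python script along the following lines may help. It is only a sketch: the function name missing_proxyuser_properties is hypothetical, the inline sample mirrors the properties listed above wrapped in a configuration root element, and the user name datameer is taken from the property names in that listing.

```python
import xml.etree.ElementTree as ET

# Sample kms-site.xml mirroring the properties listed above.
KMS_SITE = """<configuration>
  <property>
    <name>hadoop.kms.proxyuser.datameer.groups</name>
    <value>dasuser</value>
  </property>
  <property>
    <name>hadoop.kms.proxyuser.datameer.hosts</name>
    <value>*</value>
  </property>
</configuration>"""


def missing_proxyuser_properties(xml_text, user="datameer"):
    """Return the proxyuser property names that are absent or empty."""
    root = ET.fromstring(xml_text)
    found = {
        prop.findtext("name"): (prop.findtext("value") or "").strip()
        for prop in root.iter("property")
    }
    required = [
        f"hadoop.kms.proxyuser.{user}.groups",
        f"hadoop.kms.proxyuser.{user}.hosts",
    ]
    return [name for name in required if not found.get(name)]


print(missing_proxyuser_properties(KMS_SITE))  # []
```

An empty list means both properties are present for the given user. Any names returned are the ones that still need to be added to kms-site.xml.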