...

Additionally, a property to turn off sampling has been added:


das.sampling.lookahead.maxdepth=<numerical value>

 

This property disables Smart Sampling, so sampling is based on a random sample instead.

You can also allocate more memory to this job with the following property:


das.join.map-side.memory.max-file-size=<numerical value>


This property allows the workbook to use more memory than other jobs on the machine. The numerical value entered is in bytes.
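Because the value is given in bytes, it can help to see a concrete number. The following is only a sketch: the 1 GiB figure (1024 × 1024 × 1024 = 1073741824 bytes) is an illustrative assumption, not a recommended setting, and the comment line is there for explanation only and may need to be removed before pasting into the Custom Hadoop Properties field.

```properties
# Illustrative value only (assumption): 1 GiB expressed in bytes
das.join.map-side.memory.max-file-size=1073741824
```

Choose a value that fits the memory actually available on your machine.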

 

Select the compression

In the same Custom Hadoop Properties field, you can also select your compression type for the workbook. Here, you can choose the compression that will best optimize your workbook.

What is the best compression, you ask? Well, that depends on your workbook! For example, if your workbook takes a toll on your CPU, you may want to choose Snappy compression because it focuses on speed, not maximum compression.

Once you have selected a compression type, add its configuration to the field.

 

In this case, we have added the default just to give you an idea of what this looks like:


mapred.map.output.compression.codec=org.apache.hadoop.io.compress.DefaultCodec (Defines the compression codec of the output of Map)

and

mapred.output.compression.codec=org.apache.hadoop.io.compress.DefaultCodec (Defines the compression codec for the final output of a Map-Reduce job)
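For comparison, a Snappy variant of the same two properties might look like the sketch below. It assumes that the standard Hadoop class `org.apache.hadoop.io.compress.SnappyCodec` is available and that the native Snappy libraries are installed on your cluster. The comment lines are explanatory and may need to be removed before pasting into the field.

```properties
# Assumption: Snappy native libraries are installed on the cluster
# Compression codec for the map output
mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
# Compression codec for the final output of the Map-Reduce job
mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
```

Snappy trades compression ratio for speed, which is why it tends to suit CPU-heavy workbooks.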


 

You can find the different compression configurations in our Frequently Asked Questions.

 

Now you are finished with your job optimization.

...

