How-To: Schedule Hadoop Jobs with Custom Cron Patterns in Datameer

One of the best things about creating an analysis in Datameer is that you only have to create it once. After that, you can schedule it to run against your entire dataset in HDFS whenever you want. If you’re getting new data in every Friday at 3:00, then running an updated analysis every Friday at 3:05 makes total sense.

As with the rest of the product, scheduling has an “easy” option and an “advanced” option. Custom cron patterns are the advanced option here.

To schedule with custom cron patterns, you must first understand what the combination of * and numbers in the pattern means.

There are 5 pieces to a cron pattern, in this order: Minute, Hour, Day of Month, Month, and Day of Week.

The minute will be anything from 0-59. Hour will be anything from 0-23. Day of Month will allow 1-31. Month will accept 1-12 or names (first 3 letters of the month name). Day of Week is Sunday through Saturday, 0-7 (0 or 7 represents Sunday).

You can also use the first three letters of the day name in Day of Week. The * represents first-last, or all.
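
The field rules above can be sketched in a few lines of Python. This is only an illustration of the ranges described here, not Datameer's own parser: the names `FIELDS`, `expand_field`, and `parse_pattern` are made up for this example, and it only understands *, single values, and simple ranges such as 1-5.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]

# (field name, lowest value, highest value, optional three-letter names)
FIELDS = [
    ("minute", 0, 59, None),
    ("hour", 0, 23, None),
    ("day_of_month", 1, 31, None),
    ("month", 1, 12, MONTHS),
    ("day_of_week", 0, 7, DAYS),
]


def _to_number(token, names, lowest):
    """Turn '5' or a three-letter name such as 'fri' into an integer."""
    if names and token.lower() in names:
        return names.index(token.lower()) + lowest
    return int(token)  # raises ValueError for anything unrecognized


def expand_field(text, lowest, highest, names=None):
    """Return the set of integers selected by one field of the pattern."""
    if text == "*":  # * means first-last, i.e. every allowed value
        return set(range(lowest, highest + 1))
    start, _, end = text.partition("-")
    first = _to_number(start, names, lowest)
    last = _to_number(end, names, lowest) if end else first
    if not (lowest <= first <= last <= highest):
        raise ValueError(f"{text!r} is outside {lowest}-{highest}")
    return set(range(first, last + 1))


def parse_pattern(pattern):
    """Split a 5-field pattern and expand each field into a set of values."""
    parts = pattern.split()
    if len(parts) != len(FIELDS):
        raise ValueError("a cron pattern needs exactly 5 fields")
    result = {}
    for part, (name, lowest, highest, names) in zip(parts, FIELDS):
        values = expand_field(part, lowest, highest, names)
        if name == "day_of_week":
            values = {v % 7 for v in values}  # 7 and 0 both mean Sunday
        result[name] = values
    return result
```

Calling `parse_pattern("5 16 * * 1-5")` gives one set of allowed values per field, and a pattern with an out-of-range value such as a minute of 60 raises a `ValueError`.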

Custom Cron Pattern

Choose your schedule

Now that you have an understanding of how a cron pattern is broken down, you need to decide on a schedule.

For example, if you would like to run a job at 4:05 PM, Monday through Friday, every week of every month, the pattern is:

5 16 * * 1-5

The 5 represents the 5th minute of the hour, and the 16 represents 4 PM. The two * characters select all days of the month and all months. The 1-5 selects Monday through Friday.
You can also write the Day of Week part with three-letter day names instead of numbers.
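
To see how a pattern such as 5 16 * * 1-5 plays out against real timestamps, here is a rough Python sketch that checks whether a given moment would trigger the schedule. The helper names `field_matches` and `matches` are hypothetical, the sketch handles numeric fields only (*, a single number, or a range), and it assumes all five fields must match, which is a simplification of how some cron implementations combine Day of Month and Day of Week.

```python
from datetime import datetime


def field_matches(field, value):
    """True if value is selected by a field written as '*', 'N', or 'N-M'."""
    if field == "*":
        return True
    start, _, end = field.partition("-")
    return int(start) <= value <= int(end or start)


def matches(pattern, moment):
    """Check a datetime against a 5-field, numeric-only cron pattern."""
    minute, hour, day, month, weekday = pattern.split()
    # isoweekday() is Monday=1 .. Sunday=7; modulo 7 gives cron's Sunday=0.
    cron_weekday = moment.isoweekday() % 7
    weekday_ok = field_matches(weekday, cron_weekday) or (
        cron_weekday == 0 and field_matches(weekday, 7)  # 7 also means Sunday
    )
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and weekday_ok
    )


# 5 January 2024 was a Friday, 6 January 2024 a Saturday.
friday_run = matches("5 16 * * 1-5", datetime(2024, 1, 5, 16, 5))
saturday_run = matches("5 16 * * 1-5", datetime(2024, 1, 6, 16, 5))
```

Here `friday_run` is `True` and `saturday_run` is `False`, which matches the Monday through Friday schedule described above.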

These are just the basics. Here’s a quick screen capture to show you what it actually looks like in Datameer. For a full tutorial and even more advanced instructions, visit our documentation on custom cron patterns.

