Using the Git Versioning Plug-in

The Git versioning plug-in is only available through the Advanced Governance package.

The Git versioning plug-in tracks workbook events for you and allows you to roll back to previous versions of a workbook. Every workbook artifact has a UUID, which uniquely identifies it and is used with the plug-in to restore previous versions of a workbook.

To avoid conflicts while new commits are being stored, do not modify the Git repository while Datameer is running.


Prerequisites

To use this plug-in, you need to install Git on the same machine as the Datameer server. The Datameer service writes to the Git repository constantly, so a low response time is necessary. As a result, the repository folder must be set up locally on the Datameer server and cannot be remote.

Datameer highly recommends that a MySQL application database be in production before the Git plug-in is installed. If the internal HSQL database is being used with Datameer when the Git plug-in is installed, a later HSQL to MySQL migration is not possible due to a discrepancy between the application databases' schemas.
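
A quick pre-flight check along the following lines can confirm the Git prerequisite before installation. This is a minimal sketch: the function name check_git is illustrative and is not part of Datameer.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of Datameer): report whether the
# git executable is on the PATH of the current user.
check_git() {
  if command -v git >/dev/null 2>&1; then
    echo "git found: $(git --version)"
  else
    echo "git not found: install Git on the Datameer server first" >&2
    return 1
  fi
}
```

Run check_git as the user who runs the Datameer service, since that user's PATH is the one that matters.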

Installing the Plug-in

  1. Before you install the plug-in, you might want to configure the folder where plug-in configurations are stored. To check the current path setting, open the file system for Datameer and navigate to the conf/default.properties file. Search for the property that defines the folder where plug-in configurations are stored.

    Example
    <datameer-install-path>/conf/default.properties
    
    # Defines the folder where to store plugin configurations
    system.property.plugin.configs.dir=<folder name> 

    It is important to note where plug-in configurations are stored if you are upgrading Datameer in the future, since you need to copy the contents of the previous plug-in configurations folder into the file system of the upgraded version of Datameer.

  2. Create a new folder to store the Git repository on the Datameer server. Make sure the user who is running Datameer has read and write capabilities for this folder.

  3. In the command line, go to your Git folder and run the command git init. If it works correctly, you receive a response that the Git repository was initialized.

  4. Use the directions under Managing Plug-ins and Extension Points to install the plug-in. On the Plug-ins page, click the cog symbol next to Git Versioning Plug-in to configure the settings.
  5. The following example shows steps 2 and 3, creating a Git repository in Datameer's installation folder:

    Example
    $ mkdir /opt/datameer/current/versioning
    $ cd /opt/datameer/current/versioning
    $ git init
    Initialized empty Git repository in /opt/datameer/current/versioning/.git/
  6. Create a first backup of the Git repository. In the future, back up this repository frequently.

    Example
    cp conf/plugins/plugin-versioning-git.json versioning
    # archive the folder itself (not versioning/*) so that hidden entries such as .git/ are included
    tar -zcvf "$(date +%Y_%m_%d).versioning.tar.gz" versioning
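
The backup in step 6 can be wrapped in a small helper so that it is easy to repeat on a schedule. The following is a sketch under stated assumptions: the function name backup_repo and its arguments are hypothetical and are not part of Datameer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: archive a repository folder, including hidden entries
# such as .git/, into a date-stamped tarball and print the archive path.
# Usage: backup_repo <parent-dir> <repo-folder-name> <output-dir>
backup_repo() {
  local parent=$1 name=$2 outdir=$3
  local archive
  archive="$outdir/$(date +%Y_%m_%d).$name.tar.gz"
  # -C switches to the parent first so the archive holds relative paths;
  # naming the folder itself (not folder/*) keeps dotfiles in the archive.
  tar -czf "$archive" -C "$parent" "$name" || return 1
  echo "$archive"
}
```

For the layout used in this section, the call would be backup_repo /opt/datameer/current versioning followed by a backup folder of your choice.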

Configuring the Plug-in

  1. On the Plug-ins page, click the cog symbol next to Git Versioning Plug-in to configure the settings.
  2. In the Local Path field, enter the path to the folder created on the Datameer server.
  3. Select the Request Snapshots of File Browser Artifacts option.

Understanding the Repository Structure

The repository is structured in two ways: directories that you see in Datameer, and directories that contain files and folders by UUID. It helps to understand the folder structure in order to find the workbooks and commits you need.

Take the following example:

Example
drwxr-xr-x  52   !files-by-uuid/
-rw-r--r--   1   !folder.json
drwxr-xr-x  21   !folders-by-uuid/
drwxr-xr-x  13   .git/
drwxr-xr-x   5   .system/
drwxr-xr-x   4   Analytics/
drwxr-xr-x   8   Data/
drwxr-xr-x   4   Images/
drwxr-xr-x   4   Users/
drwxr-xr-x   4   Visualization/

In this example: 

  • The !files-by-uuid/ directory stores symbolic links to the metadata of all workbooks in Datameer, identified by their UUID.
  • The !folder.json file shows the metadata of the root folder of the File Browser, including permissions and UUID.
  • !folders-by-uuid/ is a directory that stores symbolic links to the metadata of all folders in Datameer, identified by their UUID.
  • .git/ is a directory that represents the local Git repository.
  • .system/ is a hidden directory in Datameer's File Browser with internal artifacts.
  • The directories from Analytics/ on represent the folders that the user has set up in the File Browser.
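
Because the UUID directories hold symbolic links, it can help to resolve a link to the real file behind it. The following is a minimal sketch, assuming that links in !files-by-uuid/ are named <UUID>.json. The function name resolve_workbook is hypothetical and is not part of Datameer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the real path behind a UUID link.
# Assumes links are named <UUID>.json inside '!files-by-uuid/'.
# Usage: resolve_workbook <repo-dir> <uuid>
resolve_workbook() {
  local repo=$1 uuid=$2
  # Single quotes around the directory name keep an interactive shell
  # from treating the exclamation mark as history expansion.
  local link="$repo"/'!files-by-uuid'/"$uuid.json"
  if [ ! -L "$link" ]; then
    echo "no link for UUID $uuid" >&2
    return 1
  fi
  # readlink -f follows the link and prints the canonical target path
  readlink -f "$link"
}
```

The printed path points at the real file, which is the one to inspect when you want the full history of an artifact.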

Rolling Back a Workbook

  1. To roll back changes while Datameer is running, first clone the configured Git repository using the git clone command. Working in the clone avoids conflicts while Datameer writes to the original repository.

    Example
    # assuming the configured repository is located under /opt/datameer/current/versioning
    $ cd /opt/datameer/current
    $ git clone versioning/ versioning-clone
    Cloning into 'versioning-clone'...
    done.
  2. Sync the cloned repository with the original repository using git checkout and git pull, so that both contain the same commits.

    Example
    # to sync the clone with the origin once (repeat as needed)
    $ cd /opt/datameer/current/versioning-clone
    $ git checkout master
    $ git pull
  3. Use the git log command to see all commits over time for a specific workbook.

    Example
    # assuming the repository is located under /opt/datameer/current/versioning-clone
    $ cd /opt/datameer/current/versioning-clone
    $ git log --oneline --follow Analytics/Workbooks/Workbook1.wbk
    9499a56 Renamed Column on Workbook "1<UUID>"
    6f131f1 Created Formula on Workbook "1<UUID>"
    4fb067a Renamed Column on Workbook "1<UUID>"
    fe11c97 Renamed Sheet on Workbook "1<UUID>"
    5d9847a Copied Sheet on Workbook "1<UUID>"
    a9d8234 Deleted Formula Sheet on Workbook "1<UUID>"
    0ca3031 Updated File "1<UUID>"
    e667f5c Snapshot of Workbook "1<UUID>"

    Make sure to run git log on the real workbook file in its directory, not on a symbolic link. The history of a symbolic link shows only the creation of the link itself.

  4. To see what actually was changed within the workbook, look at a specific commit.

    Example
    $ git show <commit>

    This command gives output similar to the following, which shows who changed the formula and that the change converted the type from a float to an integer:

    Example
    [datameer@<host> versioning-clone]$ git show <commit>
    commit <commit>
    Author: qa <qa@datameer.com>
    Date:   <timestamp>
    
        Created Formula on Workbook "<UUID>"
    
    diff --git a/Analytics/Workbooks/<workbookName>.wbk b/Analytics/Workbooks/<workbookName>.wbk
    index <ID>
    --- a/Analytics/Workbooks/<workbookName>.wbk
    +++ b/Analytics/Workbooks/<workbookName>.wbk
    @@ -43,7 +43,7 @@
               "position": 0
             },
             {
    -          "formula": "\u003dRANDBETWEEN(1;10)",
    +          "formula": "\u003dINT(RANDBETWEEN(1;10))",
               "id": "1",
               "name": "Int",
               "position": 1
    @@ -85,7 +85,7 @@
           "/Analytics/Workbooks",
           "<UUID>"
         ],
    -    "modifiedAt": "<timestamp>",
    +    "modifiedAt": "<timestamp>",
         "name": "<workbookName>",
         "permission": {
           "groups": {
    ... 
  5. To restore a specific version of the workbook, use the git checkout command with the commit ID to check out the related state of the repository.

    Example
    $ git checkout <commit>
  6. If you want to review the workbook to make sure you have the right version, you can print the metadata of the workbook using the cat command.

    Example
    # print the contents of a workbook
    $ cat \!files-by-uuid/1<UUID>.json
    
    {
      "_version": "5.11.18",
      ...
    }
  7. Restore the workbook by calling the workbook rollback REST API.

    Example
    # roll back a workbook by using cURL
    $ curl -u admin:admin -X PUT -H "Content-Type:application/json" -d @\!files-by-uuid/<uuid>.json 'http://localhost:8080/api/workbooks/rollback'
  8. Reset the repository to the latest state using the git checkout command.

    Example
    $ git checkout master
    Switched to branch 'master'
  9. Re-sync the cloned repository with the original using the git checkout and git pull commands, to make sure both are up-to-date.

    Example
    # to sync the clone with the origin
    $ cd /opt/datameer/current/versioning-clone
    $ git checkout master
    $ git pull
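
Steps 5 through 8 can be combined so that the clone always returns to the latest state, even when the REST call fails. The following is a sketch, not a definitive implementation. The function name rollback_workbook, the DATAMEER_URL, DATAMEER_USER, and DATAMEER_PASSWORD environment variables, and the assumption that the clone's branch is called master are all illustrative and do not come from Datameer.

```shell
#!/usr/bin/env bash
# Sketch only: check out a commit, send one workbook JSON file to the
# rollback endpoint, and switch back to master afterwards.
# Assumptions: the function name, the DATAMEER_* variables, and the
# branch name master are illustrative.
# Usage: rollback_workbook <clone-dir> <commit> <json-path-inside-clone>
rollback_workbook() {
  local clone=$1 commit=$2 json=$3
  local url="${DATAMEER_URL:-http://localhost:8080}/api/workbooks/rollback"
  (
    cd "$clone" || exit 1
    git checkout --quiet "$commit" || exit 1
    # Return the clone to master when the subshell ends, even if curl fails
    trap 'git checkout --quiet master' EXIT
    [ -f "$json" ] || { echo "missing file: $json" >&2; exit 1; }
    curl --fail --silent --show-error \
      -u "$DATAMEER_USER:$DATAMEER_PASSWORD" \
      -X PUT -H "Content-Type:application/json" \
      -d @"$json" "$url"
  )
}
```

The subshell keeps the directory change and the trap local to the function, so the calling shell stays in its own directory. Run this only against the clone, never against the repository that Datameer writes to.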

Other REST API Tools

You can also create or overwrite a workbook based on the JSON representation in a Git repository used by the Git versioning plug-in.

Installing the Git Versioning Plug-in After Upgrade

After upgrading Datameer, you might need to re-install the plug-in.

  1. Double-check that the Datameer service is stopped.
  2. Create a folder to store the Git repository on the Datameer server. Make sure the user who is running Datameer has read and write capabilities for the new folder.

    [datameer@<host> current]$ mkdir /opt/datameer/current/versioning
  3. Unpack the Git repository backup.

    tar zxvf <date>.versioning.tar.gz
  4. Move the configuration file to the correct folder.

    [datameer@<host> current]$ mv versioning/plugin-versioning-git.json conf/plugins
  5. Install the downloaded Git versioning plug-in into the plug-ins folder. This ensures that the plug-in is loaded automatically when the Datameer service starts.

    [datameer@<host> current]$ mv plugin-versioning-git-<version>.zip plugins/

     The plug-in is now installed, and it is safe to start the Datameer service.

Backing Up to a Remote Git Repository

While the Git repository being used for the plug-in must reside on the same machine as the Datameer server, it is possible to back up this local repository to a remote repository outside of Datameer.

To push to a central (remote) Git repository, it is recommended to set up a cron job that syncs with that repository frequently. Make sure that only the local Git repository provides commits, which avoids merge conflicts and other inconsistent states.

In this setup, the local Git repository on the Datameer server has both read and write permissions, and a single remote Git repository is read-only and receives commits only from the local repository.
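
A one-way sync of this kind could be sketched as follows. The function name push_versioning_backup, the master branch name, the choice to push by URL, and the crontab schedule and script path are all assumptions for illustration and do not come from Datameer.

```shell
#!/usr/bin/env bash
# Sketch only: one-way push from the local plug-in repository to a backup.
# Assumptions: the function name and the branch name master are illustrative.
# Usage: push_versioning_backup <local-repo-dir> <remote-url>
push_versioning_backup() {
  local repo=$1 remote=$2
  # The push only sends local commits outward; it never pulls or merges
  # anything into the local repository.
  git -C "$repo" push --quiet "$remote" master
}

# Hypothetical crontab entry that runs a wrapper script every 15 minutes:
# */15 * * * * /path/to/push-versioning-backup.sh
```

Because commits flow in one direction only, the remote repository never has changes that the local repository lacks, which is what keeps merge conflicts from arising.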