In Datameer, you can create a workbook to get to new insights with your data. Inside the workbook, you can add additional data sources, change the column and sheet names, collapse columns, apply formulas, view workbook details, and more.
You can also copy workbooks, import worksheets from other workbooks, duplicate workbooks and worksheets, and exchange datasources.
To create a workbook:
When using data from a sheet in another workbook or an import job, the data from that source sheet is copied to the new sheet if the new sheet is marked as kept. In previous versions, the source sheet was referenced instead, which led to workbook data not being deleted by housekeeping.
The workbook is populated with a sample of data from that data source. To use data in Datameer, that data must be part of a data source. If you have a source of data you want to use for analysis, such as an Excel spreadsheet, you need to create a data source using that data, or have a system administrator create one for you.
Data displayed in Datameer workbooks is a sampling of the full data from the source. When creating or editing an import job, file upload, or data link, you set the number of sample records to populate a workbook from the specific data source being used.
A data source added to a workbook displays the number of sample records configured for that source. When creating calculations in the workbook, those calculations are being applied only to the sample data until the workbook has run. After running a workbook, the calculations made in the workbook are applied to the full data and can be viewed or downloaded by right-clicking on the workbook and selecting Browse All Data.
The next time the workbook is opened after running, while the calculations have been applied to the full data, you still view the sample size configured from the data source.
Status notifications have been added to the bottom of workbook worksheets to show if the worksheet is displaying information based on sample or full data and if the worksheet is filtered.
If the worksheet's calculations are based only on sample data, the notification is orange. It tells you how many sample rows are being displayed in the worksheet and that you can run the workbook to display the sheet calculations based on the full data from the data source.
If the worksheet is displaying no information, this could be that none of the original sample data fits the criteria of the calculations. Running the workbook takes into account all current worksheet calculations and might bring in new sample data to populate the worksheet if data from the data source meet the calculation criteria.
If the worksheet's calculations are based on the full data, the notification is green. It tells you how many sample rows are being displayed in the worksheet out of how many total records from the data source. If all the records of the source are being used, the status bar displays the total record count.
Opening Existing Workbooks
To open an existing workbook:
A workbook can be opened by a user with permissions to view the workbook by entering a URL into the browser.
There are two valid URLs that can open a workbook, using the ID or the path.
Open a workbook to a specific worksheet:
Datameer gives you the option of saving previous versions of a workbook when it is run more than once. Learn more about optimizing your workbooks and data retention.
To open a prior version of a workbook:
To add additional data sources to an existing workbook:
For each data source you select, a tab is added to the workbook, containing a sample of data from that data source. You can add multiple data sources at the same time by holding the SHIFT button.
You can exchange data sources to run a workbook on a different set of similar data, which is faster than recreating the workbook.
For example, you can run a workbook on data for each month by exchanging the data source to point to the data for the next month. Each datasource must have the same structure. Further analysis applies only to the new data after the exchange. This option is only available when the current data sheet uses a data source imported by adding data. The option isn't available for sheets created using filters, joins, sorts, external data sheets, or formulas. You must also have write permissions for this workbook to use this option.
To exchange data sources in an existing workbook:
Errors might occur when exchanging data if column names have changed. On a joined sheet, original references to columns remain even if that column name doesn't exist in the new exchanged data. If this happens, replace the joined sheet reference with the updated column name from the exchanged data.
You can set up sheet names, resize columns, collapse and show columns, rename worksheets, import worksheets from other workbooks, and more.
There are multiple ways to navigate between sheets within a workbook:
There are eight different worksheet types in Datameer. Only the formula (Fx) worksheet type analyzes data using the formula bar or with the function wizard.
To change the name of a sheet:
To delete a sheet:
You can duplicate workbook sheets. New sheets are named with the next available number appended to the sheet name.
To duplicate sheets in a workbook:
In addition to the raw data of a sheet with formulas, joins, sorts, or clustering, you can also copy the underlying logic and actions of a sheet into a new sheet.
To reorder sheets within a workbook:
To reorder columns in a sheet:
The graph overlay feature provides a graphic display of the currently selected sheet and any ingoing and outgoing sheets.
Importing sheets allows you to chain workbooks together.
To import sheets from one workbook into another workbook:
When a user creates new columns using the Formula Builder, Datameer creates default column names based on the column name in the formula, and the formula type.
Column names in workbook sheets that are editable can be renamed. The status line just above the list of tabs indicates whether a particular sheet is editable.
Column names can only contain capital or lower-case characters, numbers, and underscores. A column name cannot begin with a number. Note that column names are case sensitive! For example, column names foo, FOO, FoO and fOO will be unique columns within the same worksheet.
All column names, in all workbook sheet types, are case sensitive. This is a behavior change with version 7.5.
Column names have a 255 character limit.
To edit a column name:
To resize columns:
You can also double-click a column name to fully expand the column.
To hide (collapse) a column, right-click the column name and select Hide Column.
Double-click the hidden column to bring it back into view.
A toggle columns toolbar button displays all columns of the current worksheet. You can use this menu to show or hide columns as a batch process.
To insert a new column:
Select Insert Column. A new column is added to the left of the chosen column.
Each workbook can contain a maximum of 702 columns.
To remove a column:
To access the cell context menu, right-click on a cell within a workbook.
The cell context menu includes three options:
Datameer 7.5 includes the option to split workbook string and list type column content by a selected delimiter. Several optional additional controls are available. Resulting column names are the original column name with a sequential number appended. For list columns, commas are the only delimiter and there will be one column per comma up to the column limit.
The split column operation can be updated as many times as you like until you have desired split configuration.
To split a column:
The following optional controls are available:
5. Optionally, enter a Column prefix that be prepended to the new column names.
6. Once you have entered your desired parameters, click Create Sheet.
7. A new SplitString or SplitList sheet opens containing the split columns.
If desired, you can use the Split tab in the Inspector to enter new split criteria. Click Update Sheet to apply changes.
Datameer 7.5 includes the option to perform ordinal, 1-hot or binned encoding on column data, which assigns a unique numeric value to each categorical or continuous value. Once applied, encoding can be updated as needed until the desired results are achieved. Using Datameer to perform encoding provides a consistent view of prepared data, which is especially helpful for teams working together on model building and testing activities.
To encode column data:
Once encoding is initiated you can use the Encoding tab of the Inspector to add to or change encoding criteria. Click Update to apply changes.
To format the contents of a column:
The worksheet highlights rows by each group series when a GROUPBY function is used.
Double-click a cell to select it, and then right-click it and select Copy Cell Content to copy its contents to the browser's clipboard. This functionality doesn't work with the Safari browser.
The inspector gives you important information about the data of each column in your workbook. Use this feature for quick insights into the data that each of your columns contain.
As of Datameer 7.2
There are two tabs at the top of the workbook inspector. The inspector is separated at the worksheet level and column level.
The worksheet level tab has a description box where text can be entered about each worksheet in the workbook.
The column level tab has a description box where text can be entered about each column of a worksheet. Below the description box, the column information is as described above.
To open or close the workbook column inspector:
The deduplication feature eliminates duplicate/redundant data from your worksheet columns. This procedure can be performed across all columns of a worksheet or for specifically selected columns.
To deduplicate a column(s) in a worksheet:
A deduplicated sheet can be updated by clicking on the Deduplicate tab on the page inspector. Make a change to any option(s) and click Update Sheet.
If you would like to revert an action that you performed in a workbook, you can use the Undo and Redo buttons available in the toolbar. By default, you can undo or redo the last two actions performed on a workbook that modify the workbook state in the database. These actions don't include changes such as font style or column size, as the workbook's state remains the same. You can also access these options through the Edit menu or using keyboard shortcuts (Cmd+Shift+Z for undo, Cmd+Shift+Z for redo). If the cursor is in a text field, using the keyboard shortcut reverts only the most recent changes in the text field, not the whole workbook.
Administrators can adjust the settings for these buttons in
system.properties. There, admins can edit the amount of time the history of a workbook is kept, the maximum number of actions a a user can revert, and how many workbooks can retain a history at the same time.
To make sure this functionality isn't causing performance issues, you can check various statistics about the memory usage and workbook states in
<Datameer>/logs/workbook_undo.log. The log is set to record the following values separated by comma:
All undo related actions are collected, but for memory performance, the system prints the values to the log once per minute.
Formulas provide the ability to analyze your data in powerful ways.
The formula editor is located at the top of a worksheet within a workbook. Select a column on a worksheet and apply a formula with supported Datameer functions and operators.
Formulas can be added manually or using Datameer's Formula Builder. An auto-complete feature gives you the ability to visualize and complete formulas efficiently.
As of Datameer 7.2
The formula editor has multi-line functionality. Add more than one line when writing a formula so that it is easier to both read and write long and/or nested formulas.
Add additional lines by pressing Enter on your keyboard while in the editor. Submit your formulas by clicking the checkbox on the right of the editor or using the keyboard shortcut ⌘-Enter (mac) and Ctrl-Enter (win).
The multi-line editor can be expanded or collapsed by clicking and holding the edge of the formula bar and dragging your mouse.
A formula can only reference columns from it's own worksheet and one other worksheet in the workbook. Referencing columns from different worksheets is possible after a join sheet has first been created.
|Syntax for referencing a worksheet and column|
|Worksheet and column||#apache_log!remoteUser|
Datameer's supported functions.
Enter a function name into the formula editor (case insensitive) with variables separated by a semicolon within parentheses.
The Greater Than or Equal To function is being used comparing the "status" column from the "apache_log" worksheet
From the formula editor, you can enter functions and operators within functions.
|IF(EQUALS(#apache_log!status;400);"Bad Request";"other")||The IF function is being used with the Equals function as one of the variables.|
You can create formulas on worksheets that are editable. When you click the data area of a column that has a formula associated with it, the formula displays above the workbook in the Fx bar. You can also create a new sheet and create formulas that reference fields from other sheets. Click the Fx button on the formula line to display the formula builder. (As of Datameer 7.2, the formula builder is located in the worksheet inspector.)
To create a formula using the Formula Builder:
The resulting function displays next to the fx icon. To learn more, see using the Formula Builder.
To use operators in a formula:
To use regular expressions in a formula:
See Using Regular Expressions to see some examples. Datameer offers an API to extend the built-in formulas. See the Developer's Guide to learn more.
To edit formulas:
Variable values can be used within the worksheet formula bar as well as other areas such as the filter worksheet feature.
Call a variable value using the following syntax:
Variable values can be set by an administrator.
Production mode can be applied to a workbook to accelerate performance by skipping the calculation of metrics used for columnmetric, flip sheet displays and generation of data samples.
To launch production mode for a workbook:
Production mode is effective the next time the workbook is executed.
You can save workbooks, view or change workbook settings, link a workbook to multiple data sources, calculate a workbook, and more.
When you save a workbook, you also specify the workbook settings. If you don't know which settings to use, you can keep the default settings and you can change them later or the system administrator can set them. See Configuring Workbook Settings to learn more.
To save workbooks:
If workbook settings need to be updated after saving, right-click on the workbook in the file browser and select Configure .
From the File menu, select Save As, or click the Save Workbook As icon on the toolbar.To save a copy of a workbook:
You can create multiple workbooks that reference the same data set:
From the workbook you can calculate the workbook using the entire data set. Depending on the volume of data involved, this process might take awhile.
After making changes to an existing workbook, the next time the workbook is run, the changes are applied in the workbook calculation. Historical data before the change to the workbook isn't updated.
To view full results:
To go to a specific record while in the workbook:
You can view information about the workbooks that you have open.
Workbook information gives general information about the current workbook, including:
Sheet summary information gives advanced information on individual pages within the current workbook including the following:
|Sheet type||Sheet type||Description||Job path||Connection||Last executed||Kept||Partitioned||Formulas used||Filter source||Filter connector||Filter arguments||Join type||Join pairs||Sort source||Sort arguments||Sources of union|
To view workbook details:
You can set up the schedule of when a job runs when you create a workbook or you can change the schedule settings later. See Configuring Workbook Settings to learn more about scheduling.
If you use other systems or BI tools, you can connect and consume results generated by Datameer workbooks so they can be leveraged for other processes or reporting mechanisms using the integration link.
The default limit for number of rows to download is 100,000, as this functionality is intended for small aggregated data sets. To adjust the record download limit, change the
Here is an example how it can be used in Power BI:
Only users with admin rights can set or change permissions.
To view sharing permissions and security settings for a workbook: