Here are some terms that have a specific meaning to Datameer.
One of the users registered in Datameer with unrestricted access who is responsible for managing the system, i.e., configuring and monitoring the system, adding more users, and assigning users to roles and groups.
Aggregate functions combine and then operate on all the values in a group, i.e., the function returns one value for each group.
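As an illustration of "one value for each group", here is a minimal Python sketch. It is not Datameer code: the function name `group_sum`, the choice of a sum as the aggregate, and the sample rows are assumptions made for this example only.

```python
from collections import defaultdict


def group_sum(rows):
    """Sum the values of each group and return one total per group key."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


# Hypothetical sample data: (group key, value) pairs.
sales_rows = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]
sales_totals = group_sum(sales_rows)
print(sales_totals)
```

Four input rows in two groups produce exactly two output values, one per group.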
AMI (Amazon Machine Image)
One of the users in Datameer with restricted access who can configure data sources, analyze data, and create infographics and reports.
Source-code-based specifications used to interact with or add functionality to a program. Examples in Datameer include custom functions, parsing schemes for import and export jobs, and custom plug-ins.
The system used to authenticate users in Datameer. In addition to its default internal user management, Datameer ships with plug-ins for LDAP/Active Directory. It is also possible to create custom plug-ins for authentication purposes.
One of the primitive data types used in Datameer. These are also known as high-precision float values.
One of the primitive data types used in Datameer. These are also known as unlimited integer values.
blank (blank cell)
A blank cell can contain an empty string value, a string with only whitespace, or a null value.
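The three cases of a blank cell can be expressed as a single check. The sketch below is a Python illustration only, assuming that a null value is modeled as `None`; the helper name `is_blank` is hypothetical and is not a Datameer function.

```python
def is_blank(value):
    """Return True for None, an empty string, or a whitespace-only string."""
    if value is None:  # assumption: a null value is modeled as None
        return True
    return isinstance(value, str) and value.strip() == ""


# Hypothetical cell values covering the three blank cases and two non-blank ones.
cells = [None, "", "   ", "text", 0]
blank_flags = [is_blank(cell) for cell in cells]
print(blank_flags)
```

Note that the number 0 is not blank in this sketch, because it is a value rather than a missing or empty one.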
One of the primitive data types used in Datameer. Based on Boolean algebra, these are either TRUE or FALSE.
A unique ID for each job that does not change when the job is run again. Once a job has been given a configuration ID, it always holds that number.
Where the data is stored, such as a database, a file (for example, via an Amazon Web Services S3 connection), or Hive.
Static values, e.g., a fixed number or string, used as function arguments, not to be confused with placeholders.
A new ID is created each time a job runs which produces new data.
A data link lets you feed data into a workbook without using an import job. Data links are not imported into HDFS, but are streamed into Datameer on demand.
A collection of data in either tabular or non-tabular form. Data can be structured, semi-structured, or unstructured. In Datameer, data sets are the source of data, e.g., databases, server error logs, or Twitter feeds.
One of the data types used in Datameer. These are dates in a form recognized by Datameer, rather than recognized as strings.
EC2 (Amazon Elastic Compute Cloud)
Amazon Elastic Compute Cloud is a scalable web service offered by Amazon Web Services for computing data remotely.
EMR (Amazon Elastic MapReduce)
empty (empty string)
An empty string is a string data type value with a length of zero. A cell with an empty string appears blank.
This is a job which exports the results of a workbook to an external resource, e.g., a file or a database, that can be used independently of Datameer. Adaptors for several remote systems are included out of the box, and others can be added with plug-ins.
A complete formula including defined functions and required arguments. An expression can contain multiple (nested) formulas.
Field parameters including data field type, name, and acceptance of null values for a given data set.
A font whose letters and characters each occupy the same amount of horizontal space.
One of the primitive data types used in Datameer. These are 64-bit float values (also called doubles).
A formula is created by a data analyst and is similar to macros in other programs. It consists of a function and its required arguments.
Group series functions operate row-wise within a group, i.e., the function is applied to every row and therefore returns a value for every argument in the group.
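To show how a row-wise function within a group differs from an aggregate, the following Python sketch computes a running sum per group. The running sum, the function name `group_running_sum`, and the sample rows are illustrative assumptions and do not correspond to a specific Datameer function.

```python
from collections import defaultdict


def group_running_sum(rows):
    """Return one output row per input row, carrying a per-group running sum."""
    running = defaultdict(int)
    output = []
    for key, value in rows:
        running[key] += value
        output.append((key, value, running[key]))
    return output


# Hypothetical sample data: (group key, value) pairs.
series_rows = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]
series_result = group_running_sum(series_rows)
for row in series_result:
    print(row)
```

The output has as many rows as the input, which is the distinguishing property described above.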
This is the primary storage system used by Hadoop applications. It is used either in a cluster or as a stand-alone distributed file system.
Infographics is a visualization tool that consolidates, aggregates, and arranges measurements and metrics (measurements compared to a goal) in the form of charts, graphs, reports, and sometimes scorecards on a single screen.
One of the primitive data types used in Datameer. These are 64-bit integer values (also called longs).
Measures the dissimilarity between sample sets. It is complementary to the Jaccard coefficient and is obtained by subtracting the Jaccard coefficient from 1 or, equivalently, by dividing the difference between the sizes of the union and the intersection of two sets by the size of the union.
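The second formulation (union size minus intersection size, divided by union size) translates directly into code. This is a minimal Python sketch; the function name `jaccard_distance` is chosen for this example, and returning 0.0 for two empty sets is an assumption, since the definition above leaves that case open.

```python
def jaccard_distance(a, b):
    """Compute (|A ∪ B| - |A ∩ B|) / |A ∪ B| for two collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty sets are treated as identical (distance 0).
        return 0.0
    intersection = set_a & set_b
    return (len(union) - len(intersection)) / len(union)


# Union {1, 2, 3, 4} has size 4, intersection {2, 3} has size 2.
distance = jaccard_distance({1, 2, 3}, {2, 3, 4})
print(distance)  # 0.5
```

Identical sets give a distance of 0, and sets with no common elements give a distance of 1.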
JDBC (Java Database Connectivity)
This is a Java-specific API defining how a database may be accessed.
JDK (Java Development Kit)
This is a collection of programming tools which can be used to design products with the Java programming language.
A new ID is created each time a job runs whether it produces new data or not.
A format for transmitting data from a server to a web application through a network using a pre-defined schema, while at the same time being easy to read.
A data structure that uses a hash function to map identified keys to corresponding values. (See JSON Object below.)
An unordered collection of key:value pairs with the ':' character separating the key and the value, comma-separated and enclosed in curly braces; the keys must be strings and should be distinct from each other.
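For a concrete picture of such an object, the sketch below parses one with Python's standard `json` module. The keys and values in the sample text (`name`, `id`, `tags`) are made up for illustration and are not taken from Datameer.

```python
import json

# Hypothetical JSON object: string keys, ':' separators, comma-separated pairs, curly braces.
json_text = '{"name": "import-job", "id": 42, "tags": ["daily", "s3"]}'

# json.loads turns a JSON object into a Python dict (a key-to-value map).
job = json.loads(json_text)

print(job["name"])
print(sorted(job.keys()))
```

Because the collection is unordered, values are looked up by key rather than by position.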
This is a general word referring to the configuration and executions needed to complete analyses in Datameer, e.g., import jobs, export jobs or workbook jobs. In Datameer every job configuration is numbered consecutively and independently of job executions. Datameer job executions usually correspond to one or more MapReduce jobs.
The settings necessary to execute a job in Datameer. Job configurations include, e.g., the file path, character encoding, and schedule details for an import or export job, and the sheet names, formulas, and connections for a workbook. Every job configuration is numbered consecutively with a unique identifying number, independently of the corresponding job executions.
These are the individual operations performed in Datameer according to a job configuration. Every job execution is numbered consecutively with a unique identifying number, independently of the corresponding job configurations.
The strategy used when combining two data sets, based on a given key.
An authentication protocol that provides mutual authentication and single sign-on capabilities.
In Datameer, multiple values can be combined into a list. Lists are a series of values of a single data type.
MapReduce is a framework for processing data over a distributed file system. A 'map' step first splits the task into sub-tasks, and the 'reduce' step combines the results of the 'map' tasks into one result.
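The map and reduce steps can be sketched with a word count, a common teaching example. The Python below runs in a single process and in memory, so it only imitates the data flow; it does not use Hadoop, and all function names are assumptions made for this illustration.

```python
from collections import defaultdict


def map_step(line):
    """Map: turn one input line into (word, 1) pairs, independently of other lines."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the intermediate pairs by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_step(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_step(line)]
    return dict(reduce_step(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["a b a", "b c"])
print(counts)
```

In a real cluster, the map calls would run on different nodes, and the grouping step would move data between them before the reduce step.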
My Datameer is a web portal where you can log in and manage your Datameer account. Here you can renew a subscription, manage data limits, download updates, submit feature requests, submit support tickets, and more.
null values (<null>)
Null values (sometimes represented as ω) show that no information is attached to a specific record, or that the specified information was not found within a specified connection. A cell with a null value appears blank.
A category of database software providing an interface which users can use to quickly and interactively examine their data and results of processes in various dimensions.
These are special symbols which are used similarly to functions.
As Datameer is an analytics tool with a web interface, pages are information resources that can be seen using a web browser. In Datameer all components are embedded in pages, e.g., a workbook, data link configuration, or administrator controls.
Partitioning divides similar data into individually stored, often hierarchical parts. Typically, these parts represent periods of time, e.g., months, days, or hours. The data is typically divided for ease of management and for performance reasons.
A placeholder is a symbol that is replaced by a dynamically changing value, e.g., %day% for the current day or %user% for the current user. Placeholders are also known as wildcards or free variables.
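The substitution idea can be sketched in a few lines of Python. The `%name%` syntax follows the examples in the definition, but the function `fill_placeholders`, the sample path, and the rule of leaving unknown placeholders untouched are assumptions for this example; this is not how Datameer resolves placeholders internally.

```python
import re


def fill_placeholders(template, values):
    """Replace each %name% token with values[name]; leave unknown names unchanged."""
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return re.sub(r"%(\w+)%", substitute, template)


# Hypothetical template and values.
path = fill_placeholders("/logs/%user%/%day%.csv", {"user": "alice", "day": "15"})
print(path)  # /logs/alice/15.csv
```

The same template yields a different result whenever the supplied values change, which is what distinguishes a placeholder from a constant.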
An SDK shipped with Datameer to create custom plug-ins.
The total number of significant digits which can be included in a big decimal number.
A sequence of characters that can be used to specify and recognize desired strings in a flexible and concise way.
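As a small example of such a pattern, the Python sketch below uses the standard `re` module to recognize strings shaped like an ISO date. The pattern and the sample strings are chosen for illustration; regular expression syntax differs slightly between engines, so this may not carry over to other tools unchanged.

```python
import re

# Four digits, a hyphen, two digits, a hyphen, two digits, and nothing else.
date_pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Hypothetical candidate strings.
candidates = ["2024-01-15", "15/01/2024", "2024-1-5"]
matches = [bool(date_pattern.match(text)) for text in candidates]
print(matches)
```

The pattern checks only the shape of the string, so a value such as 2024-99-99 would also match.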
REST-style architecture consists of clients and servers where clients initiate requests to servers, and servers process those requests and return appropriate responses.
This is a scalable web storage service offered by Amazon Web Services used to store data remotely.
The number of significant digits behind the decimal point in a big decimal number.
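Precision and scale can be read off a decimal value with Python's standard `decimal` module, which is used here only as a stand-in for a big decimal type. The helper name `precision_and_scale` is hypothetical, and the sketch assumes finite values written in plain notation, without an exponent.

```python
from decimal import Decimal


def precision_and_scale(text):
    """Return (total stored digits, digits behind the decimal point)."""
    _sign, digits, exponent = Decimal(text).as_tuple()
    precision = len(digits)
    scale = max(-exponent, 0)  # assumption: finite value, so exponent is an int
    return precision, scale


print(precision_and_scale("123.45"))  # (5, 2)
print(precision_and_scale("42"))      # (2, 0)
```

For 123.45, the five digits 1, 2, 3, 4, 5 give the precision, and the two digits after the point give the scale.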
SDK (Software Development Kit)
A collection of development tools for creating applications for a software package.
A broad topic best described as information security, including the use of Datameer-specific credentials or LDAP/Active Directory when connecting to Datameer or using secure impersonation when connecting Datameer to a database. Another tool used for implementing security is setting permissions for individual pages.
A form of structured data that doesn't conform with the formal tables or data models of relational databases.
A page or tab in a workbook. In Datameer, there are different types of sheets, e.g., data sheets, formula sheets, join sheets, and union sheets.
A set of tables comprised of a single central fact table surrounded by normalized dimensional hierarchies.
One of the primitive data types used in Datameer. All data that is not a Boolean value, a big decimal, a big integer, a date, a float value, or an integer is considered a string. Strings can contain any type of (unix) character and are used to represent text, URLs, and date patterns.
A star schema is a set of tables comprised of a single, central fact table surrounded by de-normalized dimensions.
Any document, file, image, report, form, etc. that has no defined, standard structure that would enable convenient storage in automated processing devices.
The group that a user is assigned to, e.g., sales department or research and development.
The role a user is assigned to, e.g., administrator or analyst.
An infographic tool to present data. Examples include graphs, pie charts, and maps.
The spreadsheet-like view used for analyses of data.