
Datameer Application Server

Recommended Hardware (production environment with database on same server as Datameer):

Minimum:
  • 1U Server
  • 2 Quad Core CPUs
  • 16+ GB RAM
  • 2 x 1 TB hard drives (recommended available disk space: 250 GB)
  • RAID 0 (striping)
  • RAID 1 (mirroring)
  • Dual 1GbE network
  • Redundant power
  • Failover requires a standby server with the same configuration
Recommended:
  • 1U Server
  • 2 Octa Core CPUs
  • 16+ GB RAM
  • 2 x 1 TB hard drives (recommended available disk space: 250 GB)
  • RAID 0 (striping)
  • RAID 1 (mirroring)
  • Dual 10GbE network
  • Redundant power
  • Failover requires a standby server with the same configuration
Required Software:
  • Unix-based operating system (see Supported Operating Systems for more information)
  • Oracle Java 1.7 (32-bit or 64-bit); the latest bug-fix release is recommended

    With the IBM BigInsights Hadoop distribution, running IBM JDK 1.7 is recommended.

    OpenJDK will NOT work because of unimplemented compression and encryption mechanisms (i.e., no Kerberos).

  • Installed software: SSH, VI, and MySQL 5.5 or 5.6 (the MySQL server and client executables must be available via the shell search path)

    MySQL Database

    Datameer strongly recommends using a MySQL database instead of HSQL. The Datameer service depends on the MySQL database, which receives constant writes for workbooks, permission changes, job execution, and scheduling, among other things. For Datameer to function properly, the database response time should be between ten and twenty milliseconds.

  • Optional: SMTP server (for email notification)
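The software prerequisites above can be checked before installation. The following is a minimal sketch, not an official Datameer tool. The executable names in `REQUIRED_EXECUTABLES` (`mysql` for the client, `mysqld` for the server) are assumptions and may differ on your system, and `looks_like_openjdk` is only a heuristic applied to text captured from `java -version`, which prints to standard error.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
REQUIRED_EXECUTABLES = ["ssh", "vi", "mysql", "mysqld"]


def missing_executables(names, path=None):
    """Return the names that cannot be found on the shell search path."""
    return [name for name in names if shutil.which(name, path=path) is None]


def looks_like_openjdk(version_output):
    """Heuristic: OpenJDK builds identify themselves in `java -version` output."""
    return "openjdk" in version_output.lower()
```

Calling `missing_executables(REQUIRED_EXECUTABLES)` returns an empty list when everything is found on the current `PATH`; any names it returns still need to be installed or added to the search path.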

Datameer Database Server

The Datameer database should be hosted on the same machine as the Datameer application server. It can be located on a hosted database only if the response time for a full write to the database is less than 20 milliseconds. If that response time can't be guaranteed during database maintenance, gracefully shut down the Datameer service before maintenance begins.
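One way to verify the 20 millisecond requirement is to time a series of writes from the application server. The sketch below is illustrative only: `write_fn` is a hypothetical callable that you supply, assumed to perform one committed write against the database using whichever MySQL driver you have installed.

```python
import time

MAX_WRITE_LATENCY_MS = 20.0  # limit for a full write, as stated above


def measure_latency_ms(write_fn, samples=20):
    """Call write_fn repeatedly and return each call's duration in milliseconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        write_fn()  # hypothetical: one committed write to the database
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def within_limit(durations_ms, limit_ms=MAX_WRITE_LATENCY_MS):
    """Return True if the slowest observed write stayed under the limit."""
    return max(durations_ms) < limit_ms
```

Judging by the slowest sample rather than the average is a deliberately strict choice, since occasional slow writes are what the maintenance warning above is concerned with.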

Hosted databases must use MySQL 5.5 or higher. The recommended database size is at least 5 GB.

Hadoop Cluster

  • One of the supported Hadoop distributions installed (Datameer installs Hadoop if it is not already available)
  • Gigabit switches (10 GigE interconnected)
  • Sufficient power supply & cooling
     

The vendor of the Hadoop distribution being used should be able to provide Hadoop cluster design and sizing recommendations.


Hadoop Master (NameNode and JobTracker)

Hardware:
  • 1U Server
  • 2 Quad Core CPUs
  • 32+ GB RAM
  • 2 x 1 TB hard drives
  • RAID 0 (striping)
  • RAID 1 (mirroring)
  • Dual 1GbE network
  • Redundant power
  • Failover requires a standby server with the same configuration
Software:
  • Unix-based system (e.g., Ubuntu Linux 10.04)
  • Java 1.7 (Oracle recommended)
  • Set JAVA_HOME to the root of your Java installation
  • Installed software: VI, SSH, SSHD, rsync, and SCP
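The JAVA_HOME requirement can be checked on each node with a short script. This is a minimal sketch; it assumes the usual JDK layout in which the `java` executable lives under `bin/` inside the installation root, and the function name `java_home_problems` is made up for illustration.

```python
import os


def java_home_problems(env=None):
    """Return a list of problems found with JAVA_HOME (empty if none)."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return ["JAVA_HOME is not set"]
    # Assumption: the installation root contains bin/java.
    java_bin = os.path.join(java_home, "bin", "java")
    if not os.path.isfile(java_bin):
        return ["no java executable at " + java_bin]
    return []
```

Calling `java_home_problems()` with no arguments inspects the current environment; passing a dictionary makes it easy to test a planned configuration before applying it.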
     

Two Hadoop master nodes are required for high availability (HA) testing.

 

Hadoop Slave (Data Node)

Datameer recommends a minimum of 3 slave/data nodes in addition to the Hadoop master.

Hardware:
  • 1U Server
  • 2 Quad Core CPUs
  • 16 GB RAM (2 GB per Core)
  • 4x1 TB SAS JBOD
  • 1GbE network

Hard drive / data storage note:

A major feature of Hadoop is data redundancy, which offers multiple benefits including availability, fast run times, and easy scalability.

Because data may be stored multiple times on your hard drives, account for this when sizing your storage.
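As a rough illustration of the effect on sizing, the sketch below divides raw disk capacity by the replication factor. It assumes a replication factor of 3 (the usual HDFS default; check your cluster's setting) and ignores space needed for the operating system and temporary job data, so treat the result as an upper bound.

```python
def usable_capacity_tb(nodes, drives_per_node, drive_tb, replication=3):
    """Capacity left for unique data when every block is stored `replication` times."""
    raw_tb = nodes * drives_per_node * drive_tb
    return raw_tb / replication


# Three data nodes with 4 x 1 TB drives each: 12 TB raw, 4 TB of unique data.
print(usable_capacity_tb(nodes=3, drives_per_node=4, drive_tb=1.0))  # 4.0
```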

 

Software:
  • Unix-based system (e.g., Ubuntu Linux 10.04)
  • Java 1.7 (Oracle recommended)
  • Set JAVA_HOME to the root of your Java installation
  • Installed software: VI, SSH, SSHD, rsync, and SCP

 

AWS (Elastic MapReduce) Deployments:

  • EC2 Access Key
  • EC2 Secret Access Key
  • EC2 Private Key Name
  • EC2 Private Key File
  • One empty S3 bucket