On-premise Installation

Prerequisites

Supported Operating Systems

  • CentOS - Versions 6.6+, 7.3
  • Red Hat Enterprise Linux - Versions 6.6+, 7.3
  • Ubuntu - Version 16.04 (supported for HDInsight only)
  • Debian - 8.1 (supported for DataProc only)
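
As a pre-flight check, the list above can be encoded in a small shell function. This is only a sketch: the name iw_os_supported is hypothetical and not part of the Infoworks installer, and it assumes that "6.6+" means 6.6 or any later 6.x minor release. It does not enforce the HDInsight-only and DataProc-only restrictions.

```shell
# Hypothetical helper (not shipped with the installer).
# Usage: iw_os_supported <os_id> <version>
# Returns 0 when the pair matches the supported list, non-zero otherwise.
# Assumption: "6.6+" is read as 6.6 or a later 6.x minor release.
iw_os_supported() {
  _iw_id="$1"
  _iw_version="$2"
  # Split "major.minor[.patch]" into major and minor; a bare major has no minor.
  case "$_iw_version" in
    *.*)
      _iw_major=${_iw_version%%.*}
      _iw_rest=${_iw_version#*.}
      _iw_minor=${_iw_rest%%.*}
      ;;
    *)
      _iw_major=$_iw_version
      _iw_minor=
      ;;
  esac
  case "$_iw_id" in
    centos|rhel)
      { [ "$_iw_major" = 6 ] && [ "$_iw_minor" -ge 6 ] 2>/dev/null; } ||
        { [ "$_iw_major" = 7 ] && [ "$_iw_minor" = 3 ]; }
      ;;
    ubuntu) [ "$_iw_version" = "16.04" ] ;;
    debian) [ "$_iw_version" = "8.1" ] ;;
    *) return 1 ;;
  esac
}
```

For example, iw_os_supported centos 7.3 succeeds, while iw_os_supported centos 7.4 does not. The caller supplies the OS id and version, for instance from /etc/os-release or /etc/redhat-release.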

Supported Hadoop Distributions

  • HDP - Versions 2.5.5, 2.6.4, 3.x
  • MAPR - Version 6.0.1
  • Cloudera - Version 5.13
  • Azure - HDI 3.6
  • GCP - 1.2 (Unsecured), 1.3 (Secured) Dataproc
  • EMR - Version 5.17.0

Installation Procedure

The installation logs are available in <path_to_Infoworks_home>/iw-installer/logs/installer.log.

Perform the following:

Download and Extract Installer

  • Download the installer tarball: wget <link-to-download>
  • Extract the installer: tar -xf deploy_<version_number>.tar.gz
  • Navigate to installer directory: cd iw-installer
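
The extraction step can be wrapped so that a missing tarball or an unexpected archive layout is caught before configuration starts. The function below is a minimal sketch; the name iw_extract_installer is hypothetical, and the only layout it assumes is the iw-installer directory named in the steps above.

```shell
# Hypothetical helper (not shipped with the installer).
# Usage: iw_extract_installer <path-to-tarball>
# Extracts into the current directory and confirms iw-installer exists.
iw_extract_installer() {
  _iw_tarball="$1"
  if [ ! -f "$_iw_tarball" ]; then
    echo "tarball not found: $_iw_tarball" >&2
    return 1
  fi
  # tar detects the gzip compression on its own when reading.
  tar -xf "$_iw_tarball" || return 1
  if [ ! -d iw-installer ]; then
    echo "iw-installer directory missing after extraction" >&2
    return 1
  fi
  echo "extracted to $(pwd)/iw-installer"
}
```

After downloading the tarball with wget, a call such as iw_extract_installer deploy_<version_number>.tar.gz && cd iw-installer covers the remaining two steps.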

Configure Installation

  • Run the following command: ./configure_install.sh

Enter the details for each prompt:

  • Hadoop distro name and installation path (if not auto-detected)
  • Infoworks user
  • Infoworks user group
  • Infoworks installation path
  • Infoworks HDFS home (path of home folder for Infoworks artifacts)
  • Hive schema for Infoworks sample data
  • IP address for accessing Infoworks UI (when in doubt use the FQDN of the Infoworks host)
  • HiveServer2 thrift server hostname
  • Hive user name
  • Hive user password

If Hadoop distro is Cloudera (CDH):

  • Impala hostname
  • Impala port number
  • Impala user name
  • Impala password
  • Is Impala Kerberized?

If Impala is Kerberized:

  • Kerberos Realm
  • Kerberos host FQDN

If Hadoop distro is GCP:

  • Managed Mongo URL
  • Are Infoworks directories already extracted in IW_HOME?

Run Installation

  • Install Infoworks: ./install.sh -v <version_number>

NOTE: On machines without a certificate setup, the certificate check can be disabled with the --certificate-check parameter: ./install.sh -v <version_number> --certificate-check <true/false>. The default value is true. Setting it to false makes the installer perform insecure requests and is not recommended.

NOTE: Replace <version_number> according to the Hadoop distribution and operating system:

  • HDP on CentOS/RHEL 6: 2.9.0-hdp-rhel6
  • HDP on CentOS/RHEL 7: 2.9.0-hdp-rhel7
  • MapR or Cloudera on CentOS/RHEL 6: 2.9.0-rhel6
  • MapR or Cloudera on CentOS/RHEL 7: 2.9.0-rhel7
  • Azure: 2.9.0-azure
  • GCP: 2.9.0-gcp
  • EMR: 2.9.0-emr
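
The mapping in this note can be sketched as a small shell function that prints the value to pass to ./install.sh -v. The function name iw_version_number and its lowercase argument keywords (hdp, mapr, cloudera, azure, gcp, emr, rhel6, rhel7) are hypothetical conventions chosen for this example; only the printed version strings come from the note.

```shell
# Hypothetical helper (not shipped with the installer).
# Usage: iw_version_number <distro> [<os>]
#   distro: hdp | mapr | cloudera | azure | gcp | emr
#   os:     rhel6 | rhel7 (CentOS uses the same value); ignored for azure, gcp, emr
iw_version_number() {
  _iw_distro="$1"
  _iw_os="$2"
  case "$_iw_distro" in
    hdp)
      case "$_iw_os" in
        rhel6|rhel7) echo "2.9.0-hdp-$_iw_os" ;;
        *) return 1 ;;
      esac
      ;;
    mapr|cloudera)
      case "$_iw_os" in
        rhel6|rhel7) echo "2.9.0-$_iw_os" ;;
        *) return 1 ;;
      esac
      ;;
    azure|gcp|emr) echo "2.9.0-$_iw_distro" ;;
    *) return 1 ;;
  esac
}
```

With this in place, the install step could be written as ./install.sh -v "$(iw_version_number hdp rhel7)". An unknown distribution or OS makes the function return a non-zero status and print nothing.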

Post Installation

If the target machine is Kerberos enabled, perform the following post-installation steps:

  • Go to <IW_HOME>/conf/conf.properties
  • Edit the Kerberos security settings as follows (ensure these settings are uncommented):
  • Restart the Infoworks services.

NOTE: Kerberos tickets are renewed before each Infoworks DataFoundry job runs. The Infoworks DataFoundry platform supports a single Kerberos principal per Kerberized cluster. All Infoworks DataFoundry jobs therefore run under the same Kerberos principal, which must have access to all the artifacts in Hive, Spark, and HDFS.
