On-premise Installation
Prerequisites
Supported Operating Systems
- CentOS - Versions 6.6+, 7.3
- Red Hat Enterprise Linux - Versions 6.6+, 7.3
- Ubuntu - Version 16.04 (supported for HDInsight only)
- Debian - Version 8.1 (supported for Dataproc only)
 
Supported Hadoop Distributions
- HDP - Versions 2.5.5, 2.6.4, 3.x
- MapR - Version 6.0.1
- Cloudera - Version 5.13
- Azure - HDI 3.6
- GCP - Dataproc 1.2 (unsecured), 1.3 (secured)
- EMR - Version 5.17.0
 
Installation Procedure
The installation logs are available in <path_to_Infoworks_home>/iw-installer/logs/installer.log.
Perform the following:
Download and Extract Installer
- Download the installer tarball: wget <link-to-download>
- Extract the installer: tar -xf deploy_<version_number>.tar.gz
- Navigate to the installer directory: cd iw-installer
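The steps above can be sketched as one sequence. The version string and download link below are placeholders, not the actual values for your cluster:

```shell
# Example sequence for the download/extract steps. VERSION is illustrative --
# use the version string that matches your distro and the real download link.
VERSION="2.9.0-rhel7"
TARBALL="deploy_${VERSION}.tar.gz"
# wget "<link-to-download>"   # download the installer tarball
# tar -xf "${TARBALL}"        # extract; creates ./iw-installer
# cd iw-installer
```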
Configure Installation
- Run the following command: ./configure_install.sh
Enter the details for each prompt:
- Hadoop distro name and installation path (if not auto-detected)
- Infoworks user
- Infoworks user group
- Infoworks installation path
- Infoworks HDFS home (path of the home folder for Infoworks artifacts)
- Hive schema for Infoworks sample data
- IP address for accessing the Infoworks UI (when in doubt, use the FQDN of the Infoworks host)
- HiveServer2 thrift server hostname
- Hive user name
- Hive user password
 
If Hadoop distro is Cloudera (CDH):
- Impala hostname
- Impala port number
- Impala user name
- Impala password
- Is Impala Kerberized?
 
If Impala is Kerberized:
- Kerberos realm
- Kerberos host FQDN
 
If Hadoop distro is GCP:
- Managed Mongo URL
- Are Infoworks directories already extracted in IW_HOME?
 
Run Installation
- Install Infoworks: ./install.sh -v <version_number>
NOTE: For machines without certificates set up, you can pass the --certificate-check parameter as false, using the following syntax: ./install.sh -v <version_number> --certificate-check <true/false>. The default value is true. Setting it to false makes insecure request calls and is not a recommended setup.
NOTE:
- For HDP on CentOS/RHEL 6, replace <version_number> with 2.9.0-hdp-rhel6
- For HDP on CentOS/RHEL 7, replace <version_number> with 2.9.0-hdp-rhel7
- For MapR or Cloudera on CentOS/RHEL 6, replace <version_number> with 2.9.0-rhel6
- For MapR or Cloudera on CentOS/RHEL 7, replace <version_number> with 2.9.0-rhel7
- For Azure, replace <version_number> with 2.9.0-azure
- For GCP, replace <version_number> with 2.9.0-gcp
- For EMR, replace <version_number> with 2.9.0-emr
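The version-string rules in the NOTE above can be captured in a small helper function. This is purely illustrative (the function is not part of the installer):

```shell
# Illustrative helper mapping distro + OS family to the 2.9.0 installer
# version string listed in the NOTE above; not shipped with the installer.
iw_version() {
  local distro="$1" os="$2"    # os is "rhel6" or "rhel7"; ignored for cloud distros
  case "${distro}" in
    hdp)           echo "2.9.0-hdp-${os}" ;;
    mapr|cloudera) echo "2.9.0-${os}" ;;
    azure)         echo "2.9.0-azure" ;;
    gcp)           echo "2.9.0-gcp" ;;
    emr)           echo "2.9.0-emr" ;;
    *)             return 1 ;;
  esac
}
# Usage: ./install.sh -v "$(iw_version mapr rhel7)"
```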
Post Installation
If the target machine is Kerberos enabled, perform the following post-installation steps:
- Go to <IW_HOME>/conf/conf.properties
- Edit the Kerberos security settings in this file (ensure the settings are uncommented).
- Restart the Infoworks services.
 
NOTE: Kerberos tickets are renewed before each Infoworks DataFoundry job runs. The Infoworks DataFoundry platform supports a single Kerberos principal per Kerberized cluster; all Infoworks DataFoundry jobs therefore run as the same principal, which must have access to all the artifacts in Hive, Spark, and HDFS.