The following nodes are supported along with other transformations in Data Transformation:
The advanced analytics nodes are used for predictive modelling, the process of building a model that predicts an outcome from input data. When the outcome is categorical, the task is called classification; when the outcome is numerical, it is called regression. Descriptive modelling, or clustering, assigns observations to clusters so that observations in the same cluster are similar to one another.
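To make the three task types concrete, the following is a minimal sketch in plain Python, using only the standard library. It is not how the analytics nodes are implemented; the data, the function names (`fit_line`, `nearest_centroid`, `two_means`), and the choice of algorithms are all assumptions made for illustration.

```python
from statistics import mean

# Toy data, invented purely for illustration.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]           # input feature
scores = [52.0, 54.0, 56.0, 58.0, 60.0]     # numerical outcome
passed = ["no", "no", "no", "yes", "yes"]   # categorical outcome


def fit_line(xs, ys):
    """Regression: least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def nearest_centroid(xs, labels, x_new):
    """Classification: return the label whose class mean is closest to x_new."""
    centroids = {
        label: mean([x for x, l in zip(xs, labels) if l == label])
        for label in set(labels)
    }
    return min(centroids, key=lambda label: abs(centroids[label] - x_new))


def two_means(xs, iterations=10):
    """Clustering: 1-D k-means with k=2, seeded at the minimum and maximum."""
    centers = [min(xs), max(xs)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


slope, intercept = fit_line(hours, scores)
print("regression predicts a number:", slope * 6.0 + intercept)
print("classification predicts a category:", nearest_centroid(hours, passed, 4.2))
print("clustering groups unlabeled points:", two_means([1.0, 1.2, 0.8, 9.0, 9.5, 8.7]))
```

Regression and classification both learn from a known outcome column, whereas the clustering function receives only the observations and groups them by similarity.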
H2O is an open-source, in-memory, distributed machine learning and predictive analytics platform, designed to be fast and scalable, that allows you to build machine learning models on big data.
Infoworks supports H2O as a machine learning engine for all the analytics nodes.
Use the following procedure to switch the machine learning engine from Spark ML to H2O.