Infoworks 6.1.3
Prepare Data

Retrieving Distinct Values

The Distinct node lets you retrieve the distinct values or rows in a table.

NOTE Duplicate records in a data set must be removed before data mining.

For example, in a marketing database, the same individual may appear multiple times with different address or company information. You can use the Distinct node to find or remove duplicate records in your data set. The Distinct node starts with zero rows and columns; its output is defined by the distinct columns you add to it.
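
To make the idea of duplicate removal concrete, here is a minimal Python sketch of what retrieving distinct rows means. It is not how Infoworks implements the Distinct node; the sample records, the field names, and the `distinct_rows` helper are all hypothetical and used only for illustration.

```python
# Hypothetical marketing records; field names are illustrative only.
records = [
    {"name": "Ana Ruiz", "company": "Acme"},
    {"name": "Ana Ruiz", "company": "Acme"},    # exact duplicate of the first row
    {"name": "Ana Ruiz", "company": "Globex"},  # same person, different company
    {"name": "Li Wei", "company": "Initech"},
]


def distinct_rows(rows):
    """Return rows with exact duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for row in rows:
        # Build a hashable key from the row's column/value pairs.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


unique = distinct_rows(records)
```

Only the exact duplicate is dropped here. The two "Ana Ruiz" rows with different companies both survive, because a row counts as a duplicate only when every compared column matches.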

To retrieve only the distinct rows in a table, follow these steps:

  • Drag and drop the Distinct node from the Transformations section to the pipeline editor page.
  • Connect the source node to the Distinct node.
  • Double-click the Distinct node. The properties page is displayed.
  • Click Add Distinct, select the Input Column, and enter the Distinct Column Name. Alternatively, select Use name of selected input column for distinct column to reuse the input column's name. Then click Save.

The Distinct Properties page is displayed. You can verify the resulting columns and rows in the Schema and Data sections.
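
The effect of choosing an input column and a distinct column name can be sketched in Python as follows. This is an illustration under assumptions, not the product's implementation: it assumes the output contains only the selected columns under their new names, and the `distinct_columns` helper, the `column_map` parameter, and the sample data are hypothetical.

```python
def distinct_columns(rows, column_map):
    """Project rows onto the chosen input columns, rename them, and deduplicate.

    column_map maps each input column name to its distinct column name
    (hypothetical parameter, mirroring the Input Column / Distinct Column
    Name pair described in the steps).
    """
    seen = set()
    result = []
    for row in rows:
        # Keep only the selected columns, under their output names.
        projected = {out: row[inp] for inp, out in column_map.items()}
        key = tuple(projected.items())
        if key not in seen:
            seen.add(key)
            result.append(projected)
    return result


# Hypothetical sample data.
rows = [
    {"name": "Ana Ruiz", "city": "Madrid"},
    {"name": "Ana Ruiz", "city": "Lisbon"},
    {"name": "Li Wei", "city": "Madrid"},
]

cities = distinct_columns(rows, {"city": "distinct_city"})
```

In this sketch, `cities` holds one row per distinct value of `city`, which is the kind of result you would expect to check in the Schema and Data sections.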

NOTE For details on derivations, see Derivations.

Post Processing Configuration

For details, see Configurations - Post Processing.