Infoworks allows you to run a script that creates multiple pipelines with the same structure, in bulk.
Run the following command to perform bulk pipeline creation:
python pipeline_create.py -s <input_sql> -c <input_csv> -t <TOKEN> -o <output_csv>
Where,
<input_sql> is the path of the SQL template based on which the new pipelines will be created,
<input_csv> is the path of the CSV file that includes the specifics of the pipelines to be created,
<TOKEN> is the user authentication token obtained from the User Settings page, and
<output_csv> is the output CSV file generated once the script is run.
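For example, assuming a template file and an input CSV like the samples below (the file names and token variable here are illustrative, not prescribed by the script), the invocation would look like:
python pipeline_create.py -s /path/to/template.sql -c /path/to/pipelines.csv -t $IW_TOKEN -o /path/to/output.csv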
Sample Query
select * from {table1} UNION select * from {table2}
Where,
{table1}, {table2}...{tableN} are aliases for the actual tables given in the table_names column of the input CSV file.
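As a minimal Python sketch of this aliasing mechanism (this is not the actual Infoworks script, and the input file name pipelines.csv is an assumption), each {tableN} placeholder can be resolved from the table_names column like this:
import csv

# The SQL template shown above; {table1} and {table2} are positional aliases.
sql_template = "select * from {table1} UNION select * from {table2}"

with open("pipelines.csv", newline="") as f:  # hypothetical input CSV (sample below)
    for row in csv.DictReader(f):
        # table_names holds a comma-separated list; the Nth entry maps to {tableN}.
        tables = [t.strip() for t in row["table_names"].split(",")]
        mapping = {f"table{i + 1}": name for i, name in enumerate(tables)}
        print(row["pipeline_name"], "->", sql_template.format(**mapping))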
Sample CSV Input
domain_name,env_compute_template_name,env_storage_name,pipeline_name,source_name,table_names,sync_type,scd_type,target_schema,target_table,target_path,storage_format,target_natural_keys,target_partition_keys
ImportDomain,test,storage_dbfs,pipeline_test1,SalesDB_AP,"orders,order_details,products,categories",APPEND,SCD_1,dev_testing,big_ticket_sales1,/iw/pipelines/dev_testing/big_ticket_sales1,DELTA,category_name,shipcity
ImportDomain,test,storage_dbfs,pipeline_test2,SalesDB_AP,"orders,order_details,products,categories",MERGE,SCD_1,dev_testing,big_ticket_sales2,/iw/pipelines/dev_testing/big_ticket_sales2,DELTA,"category_name,shipcity",
ImportDomain,test,storage_dbfs,pipeline_test3,SalesDB_AP,"orders,order_details,products,categories",OVERWRITE,SCD_1,dev_testing,big_ticket_sales3,/iw/pipelines/dev_testing/big_ticket_sales3,DELTA,"category_name,shipcity",
The CSV file must contain the following columns, as shown in the header row of the sample above: domain_name, env_compute_template_name, env_storage_name, pipeline_name, source_name, table_names, sync_type, scd_type, target_schema, target_table, target_path, storage_format, target_natural_keys, and target_partition_keys.
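If you generate the input CSV programmatically, Python's csv module quotes multi-value fields such as table_names automatically. A sketch, using the first row of the sample above (the output file name is an assumption):
import csv

# Required input columns, matching the sample header above.
COLUMNS = [
    "domain_name", "env_compute_template_name", "env_storage_name",
    "pipeline_name", "source_name", "table_names", "sync_type", "scd_type",
    "target_schema", "target_table", "target_path", "storage_format",
    "target_natural_keys", "target_partition_keys",
]

# Values taken from the first sample row.
row = dict(zip(COLUMNS, [
    "ImportDomain", "test", "storage_dbfs", "pipeline_test1", "SalesDB_AP",
    "orders,order_details,products,categories", "APPEND", "SCD_1",
    "dev_testing", "big_ticket_sales1",
    "/iw/pipelines/dev_testing/big_ticket_sales1", "DELTA",
    "category_name", "shipcity",
]))

with open("pipelines.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerow(row)  # table_names is quoted automatically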
The output CSV file includes the following columns: