5 Tips For Better DataStage Design #11

by Unknown on March 23, 2016 in #5, Configuration, dataset, Datastage, design, designer, develop, Development, disk, Jobs, parallelism, resources, scratch, tips, tricks

When writing intermediate results that will only be shared between parallel jobs, always write to persistent data sets (using Data Set stages). You should ensure that the data is partitioned, and that the partitions, and sort order, are retained at every stage. Avoid format conversion or serial I/O.
Data Set stages should be used to create restart points in the event that a job or sequence needs to be rerun. But, because data sets are platform and configuration specific, they should not be used for long-term backup and recovery of source data.

Depending on available system resources, it might be possible to optimize overall processing time at run time by allowing smaller jobs to run concurrently. However, care must be taken to plan for scenarios when source files arrive later than expected, or need to be reprocessed in the event of a failure.
Parallel configuration files allow the degree of parallelism and resources used by parallel jobs to be set dynamically at run time. Multiple configuration files should be used to optimize overall throughput and to match job characteristics to available hardware resources in development, test, and production modes.
The proper configuration of scratch and resource disks and the underlying file system and physical hardware architecture can significantly affect overall job performance.

Like the below page to get update
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

About Unknown
I am a Data Consultant at a Canadian financial firm. My keen interests varies from Data Analytics, ML, Kubernetes, NLP to ETL. I love to blog and travel in my spare time. If you’d like to get in touch, feel free to say hello through any of the social links.

DataGenX - Atul's Scratchpad

Breaking

Wednesday, March 23, 2016

5 Tips For Better DataStage Design #11

No comments:

Post a Comment

-

Follow Us

Search This Blog

Blog Archive

Disclaimer