Modify stage, one of the most un-used stage in DataStage but very useful in terms of performance tuning. It is advisable to Developers not to use transformer stage to just Trimming or NULL handling but if and only if in the case when they are aware and comfortable with the syntax and derivations supported by modify stage as there is no drop down or right click options to help us with functions/synatx.
The definition of Modify Stage as IBM documented -
"The Modify stage alters the record schema of its input data set. The modified data set is then output. You
can drop or keep columns from the schema, or change the type of a column.
The Modify stage is a processing stage. It can have a single input link and a single output link."
The operations offered by Modify Stage is -
1. Dropping of Columns
2. Keeping Columns
3. Create Duplicate Columns
4. Rename Columns
5. Change Null-ability of Columns
6. Change Data Type of Columns
Stage is supporting only 1 input stage and 1 output stage.
All these operations are easily done in other stages such as copy, transformer etc. But why Modify stage is required or can say, we should use this?
Answer of this Datastage problem is simple - Performance Tuning of jobs
** Why not to use Transformer -
Cause, Whenever we call the transformer functions, data processed to and through C++ code (transformer implementation) which cause the performance letancy(delay). This delay is negligible for less no of records than higher no. So, Prefer the Modify stage when no of records are high to process.
Keep looking for this place as we are going to learn lot of tips on Modify stage.
Get this Article as PDF - http://bit.ly/2fdroHR and http://bit.ly/2fdrDCr
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/
No comments:
Post a Comment