1. Write the database query so that it fetches only the required rows. Do not select columns that are not required.
2. For parallel jobs, a Sequential File stage should not be read using Same partitioning.
3. For large volumes of data, the Sequential File stage is not a good choice. It should also not be used for intermediate storage between jobs, because it degrades job performance.
4. Keep the number of lookups in a job design to a minimum. The Join stage is a good alternative to the Lookup stage, especially when the reference data is large.
5. For parallel jobs, use the proper partitioning method for better job performance and an accurate flow of data.
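
The point about partitioning and accurate data flow can be illustrated outside DataStage. The following is only a minimal Python sketch, not DataStage code: the function names, the `cust_id` column, and the sample rows are all made up for illustration. It compares a key-based (hash) split with a round-robin split and shows that only the key-based split keeps every row for a given key in one partition, which is what key-dependent stages such as joins and aggregations generally rely on.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, n_parts):
    """Send every row with the same key value to the same partition."""
    parts = defaultdict(list)
    for row in rows:
        # crc32 gives a stable hash, so the result is repeatable across runs
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % n_parts
        parts[idx].append(row)
    return parts


def round_robin_partition(rows, n_parts):
    """Deal rows out evenly, ignoring their key values."""
    parts = defaultdict(list)
    for i, row in enumerate(rows):
        parts[i % n_parts].append(row)
    return parts


def partitions_per_key(parts, key):
    """Map each key value to the set of partitions that contain it."""
    seen = defaultdict(set)
    for idx, rows in parts.items():
        for row in rows:
            seen[row[key]].add(idx)
    return dict(seen)


# Hypothetical sample data
orders = [
    {"cust_id": 101, "amount": 20},
    {"cust_id": 102, "amount": 35},
    {"cust_id": 101, "amount": 15},
    {"cust_id": 103, "amount": 50},
    {"cust_id": 102, "amount": 10},
    {"cust_id": 101, "amount": 5},
]

hashed = partitions_per_key(hash_partition(orders, "cust_id", 2), "cust_id")
spread = partitions_per_key(round_robin_partition(orders, 2), "cust_id")

print("hash:       ", hashed)
print("round robin:", spread)
```

With the hash split, each `cust_id` appears in exactly one partition. With the round-robin split, the rows for customer 101 end up in both partitions, so a per-partition join or aggregation on `cust_id` would see only part of that customer's data.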