Best allocation of Partitions in DataStage for storage area
| S. No. | Scenario | Volume of Data | Best way of Partition | Nodes in Configuration File |
|---|---|---|---|---|
| 1 | DB2 EEE extraction in serial | Low | - | 1 |
| 2 | DB2 EEE extraction in parallel | High | Node number = current node (key) | 64 (depends on how many nodes are allocated) |
| 3 | Partition or repartition in the stages of DataStage | Any | Modulus (requires a single key, and it must be an integer); Hash (any number of keys, of any data type) | 8 (depends on how many nodes are allocated for the job) |
| 4 | Writing into DB2 | Any | DB2 | - |
| 5 | Writing into Dataset | Any | Same | 1, 2, 4, 8, 16, 32, 64, etc. (follows the partitioning of the incoming records) |
| 6 | Writing into Sequential File | Low | - | 1 |
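
To make the difference between Modulus and Hash partitioning concrete, here is a minimal Python sketch of how each method could assign a record to a partition. It is an illustration only: the function names are hypothetical, and the CRC32-based hash is an assumption chosen for the demo, not the hash function DataStage uses internally.

```python
import zlib


def modulus_partition(key, num_partitions):
    """Modulus: needs exactly one integer key; partition = key mod partition count."""
    if not isinstance(key, int):
        raise TypeError("Modulus partitioning needs a single integer key")
    return key % num_partitions


def hash_partition(keys, num_partitions):
    """Hash: accepts any number of keys of any type.

    Assumption for illustration: the keys are serialized and run through
    CRC32. The real engine uses its own hash function.
    """
    encoded = "|".join(repr(k) for k in keys).encode("utf-8")
    return zlib.crc32(encoded) % num_partitions


def count_per_partition(int_keys, num_partitions):
    """Count how many integer keys land in each partition under Modulus."""
    counts = [0] * num_partitions
    for key in int_keys:
        counts[modulus_partition(key, num_partitions)] += 1
    return counts


if __name__ == "__main__":
    # Sequential integer keys spread evenly over 8 partitions with Modulus.
    print(count_per_partition(range(16), 8))
    # A composite key of mixed types can only go through Hash.
    print(hash_partition(("ACME", 42), 8))
```

In both cases records with the same key value always land in the same partition, which is the property the key-based stages rely on.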
Best allocation of Partitions in DataStage for each stage
| S. No. | Stage | Best way of Partition | Important points |
|---|---|---|---|
| 1 | Join | Left and right links: Hash or Modulus | All input links should be sorted on the joining key and partitioned on the higher-order key(s). |
| 2 | Lookup | Main link: Hash or Same; Reference link: Entire | Neither link needs to be sorted. |
| 3 | Merge | Master and update links: Hash or Modulus | All input links should be sorted on the merging key and partitioned on the higher-order key(s). Pre-sorting makes the merge "lightweight" on memory. |
| 4 | Remove Duplicates, Aggregator | Hash or Modulus | Performs better if the input link is already sorted on the key. |
| 5 | Sort | Hash or Modulus | Sorting happens after partitioning. |
| 6 | Transformer, Funnel, Copy, Filter | Same | None |
| 7 | Change Capture | Left and right links: Hash or Modulus | Both input links should be sorted on the key and partitioned on the higher-order key(s). |
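
The reason Join, Merge and Change Capture need both links partitioned on the same key and then sorted can be shown with a small Python simulation. This is a sketch under stated assumptions, not DataStage code: rows are plain dictionaries, the key is an integer so Modulus partitioning applies, and all function names are hypothetical.

```python
from collections import defaultdict


def partition_by_key(rows, key, num_partitions):
    """Modulus-partition rows so that equal keys share a partition."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key] % num_partitions].append(row)
    return parts


def merge_join(left, right, key):
    """Inner join of two lists that are already sorted on `key`."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**r, **left[i]})
                i += 1
            j = j_end
    return out


def partitioned_join(left, right, key, num_partitions):
    """Partition both links on the key, sort within each partition, then join."""
    lparts = partition_by_key(left, key, num_partitions)
    rparts = partition_by_key(right, key, num_partitions)
    result = []
    for p in range(num_partitions):
        # Sorting happens after partitioning, one partition at a time.
        l_sorted = sorted(lparts[p], key=lambda r: r[key])
        r_sorted = sorted(rparts[p], key=lambda r: r[key])
        result.extend(merge_join(l_sorted, r_sorted, key))
    return result


if __name__ == "__main__":
    left = [{"id": 3, "a": "z"}, {"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
    right = [{"id": 4, "b": "r"}, {"id": 2, "b": "p"}, {"id": 3, "b": "q"}]
    for row in partitioned_join(left, right, "id", 2):
        print(row)
```

Because matching keys are guaranteed to sit in the same partition, each partition can be joined independently and no match is lost. A Lookup avoids this requirement by using Entire on the reference link, which gives every partition a full copy of the reference data.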