The audit service provides simple measures of the ETL process: the number of records input and output, the number of rejected records, the number of records loaded, the starting and ending row counts in the target tables, new rows in the source tables or files, inserted rows, and updated rows.
These measures are captured during the batch load process and stored in the audit tables for load audit checks and reporting. Each ETL process should produce these key statistics by calling the audit services at the beginning and end of a load process. All audit tables are stored in the staging repository. Exception reports can then be produced so that business users and production support personnel can act on them.
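As a minimal sketch of calling an audit service at the beginning and end of a load, the following uses SQLite from the Python standard library. The `etl_audit` table, its column names, and the `audit_start` / `audit_end` helpers are illustrative assumptions, not part of any particular ETL tool.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical audit table; the column names are assumptions for illustration.
AUDIT_DDL = """
CREATE TABLE IF NOT EXISTS etl_audit (
    audit_id          INTEGER PRIMARY KEY AUTOINCREMENT,
    job_name          TEXT NOT NULL,
    started_at        TEXT NOT NULL,
    ended_at          TEXT,
    target_rows_start INTEGER,
    target_rows_end   INTEGER,
    records_input     INTEGER,
    records_rejected  INTEGER,
    rows_inserted     INTEGER,
    rows_updated      INTEGER
)
"""


def count_rows(conn, table):
    # The table name is assumed to come from trusted job configuration.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def audit_start(conn, job_name, target_table):
    """Record the starting row count of the target table; return the audit id."""
    conn.execute(AUDIT_DDL)
    cur = conn.execute(
        "INSERT INTO etl_audit (job_name, started_at, target_rows_start) "
        "VALUES (?, ?, ?)",
        (job_name, datetime.now(timezone.utc).isoformat(),
         count_rows(conn, target_table)),
    )
    conn.commit()
    return cur.lastrowid


def audit_end(conn, audit_id, target_table, records_input,
              records_rejected, rows_inserted, rows_updated):
    """Record the ending row count and the load statistics for one run."""
    conn.execute(
        "UPDATE etl_audit SET ended_at = ?, target_rows_end = ?, "
        "records_input = ?, records_rejected = ?, "
        "rows_inserted = ?, rows_updated = ? WHERE audit_id = ?",
        (datetime.now(timezone.utc).isoformat(),
         count_rows(conn, target_table),
         records_input, records_rejected, rows_inserted, rows_updated,
         audit_id),
    )
    conn.commit()
```

A job would call `audit_start` before loading and `audit_end` after loading, passing the counts it collected while processing. In a real deployment the audit table would live in the staging repository rather than in the same database as the target.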
Ref # | Step Description |
1 | Represents all the source systems from which data is extracted. Audit services collect all the source audit data from the source systems during the ETL process. |
2 | Represents the ETL process, which extracts data from the source systems and loads it into the target systems. Audit data is mostly not captured at the ETL process level. |
3 | Represents the target systems where the data is loaded once the ETL process completes. Audit services collect all the target audit data from the target systems at the end of the ETL process. |
4 | Audit tables are internal tables into which all the audit data for the source and target systems is loaded. An entry is made in these tables every time an ETL batch process runs. The tables reside in the staging area, and audit reports can be generated from them. |
5 | Once the audit data is captured, audit check routines can be used to detect discrepancies in the source or target systems. This is an automated process that runs at the completion of each ETL job, based on predefined rules. |
6 | The audit rules mentioned in the previous step can be configured to fail the job and stop the batch process; if no rule fails, the batch process continues. |
7 | Once the audit check routines are complete for a job, an audit flag is set to indicate whether the audit check succeeded or failed. |
8 | If the audit check routine does not indicate a failure, the next batch process continues to run. |
9 | If the audit check routine indicates a failure, then the batch process is aborted. |
10 | When the batch process is aborted, control passes to the alert system, which takes appropriate action by sending email notifications and pager messages. |
11 | Audit reports can be generated, which are useful for auditing both the ETL process and the source systems. |
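
Steps 5 to 10 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `AuditStats` fields, the two example rules, and the `alert` callback are hypothetical, and the second rule assumes that no rows are deleted from the target during the load.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class AuditStats:
    records_input: int
    records_rejected: int
    rows_inserted: int
    rows_updated: int
    target_rows_start: int
    target_rows_end: int


# A rule returns an error message when it fails, or None when it passes.
Rule = Callable[[AuditStats], Optional[str]]


def input_is_accounted_for(s: AuditStats) -> Optional[str]:
    # Every input record should be inserted, updated, or rejected.
    accounted = s.rows_inserted + s.rows_updated + s.records_rejected
    if s.records_input != accounted:
        return f"input {s.records_input} != loaded + rejected {accounted}"
    return None


def target_growth_matches_inserts(s: AuditStats) -> Optional[str]:
    # Assumes the load never deletes rows from the target table.
    growth = s.target_rows_end - s.target_rows_start
    if growth != s.rows_inserted:
        return f"target grew by {growth}, expected {s.rows_inserted}"
    return None


DEFAULT_RULES: List[Rule] = [input_is_accounted_for,
                             target_growth_matches_inserts]


def run_audit_checks(stats: AuditStats, rules: List[Rule]) -> List[str]:
    """Step 5: apply the predefined rules and collect failure messages."""
    return [msg for rule in rules if (msg := rule(stats)) is not None]


def run_batch(jobs: List[Tuple[str, Callable[[], AuditStats]]],
              rules: List[Rule],
              alert: Callable[[str, List[str]], None]) -> List[Tuple[str, str]]:
    """Run jobs in order; abort the batch at the first audit failure."""
    results = []
    for name, job in jobs:
        stats = job()                                      # run the ETL job
        failures = run_audit_checks(stats, rules)
        flag = "FAILURE" if failures else "SUCCESS"        # step 7
        results.append((name, flag))
        if failures:                                       # steps 9 and 10
            alert(name, failures)
            break
    return results
```

Here `alert` stands in for whatever notification mechanism is in use (email, pager); passing it in as a callback keeps the check logic independent of the alerting channel.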
Continued in the next part: Audit Strategy in ETL #2