USA: 1-623-232-1062 info@novaturetech.com

Data Analytics – ETL Testing

ETL stands for Extract Transform and Load. Typical ETL systems and the Verification and validations that happen across the ETL Process is demonstrated below.

OLTP VS OLAP
  • The focus of OLTP application testing is on software code while OLAP application testing is directed at the validation of the correctness of data
  • The volume of data involved in OLAP application testing is typically very large when compared to volume of data involved in the testing of OLTP applications
  • Data integration projects present different set of challenges for testing of full and incremental loads
  • Performance testing of data integration projects presents different set of challenges including the need for large volumes of test data when compared to OLTP applications
  • The number of use cases for OLTP applications are finite while the test scenarios for regression and performance testing of OLAP applications can be virtually unlimited.
Testing Approach
Testing Approach for DW Database
  • Data Validity
    Are the values within the acceptable subset of an encoded list?Is the data as per business rules?
  • Data Consistency
    Are there any unexpected duplicates?
  • Referential Integrity
    Are there any orphan records or missing foreign keys?
Testing Approach for ETL Transformation
  • Data Completeness
    Are the counts matching between source and the target?
  • Data Integrity
    Is the target data consistent with the source data?Is the Fact to dimension table foreign key mapped in the ETL appropriately?
  • Data Transformation
    Test Lookup transformationTest Aggregate transformationTest Expression transformationRegression testing of transformations
Testing Approach for Reports
  • Unit Testing
    Verify Dashboard page. Verify that the report layout, prompts, titles, download and filter display meet the designCompare the SQL Query generated by the report before and after making RPD changes to the RPDAggregate the detail reports output and compare it with the summary report data. Check the links to detail report from charts, data and table headings in the summary reportFor each dimension folder, pick all attributes and run reports. The objective is to check if any attribute is not mapped properly. Also check the dimension hierarchy. Regression Testing Security TestingCheck the User and Access privileges
Testing the Conceptual Schema

Two main types of test on the data warehouse conceptual schema in the scope of functional testing. The first, we call fact test, verifies that the workload preliminary expressed by users during requirement analysis is actually supported by the conceptual schema. The second type of test a conformity test, because it is aimed at assessing how well conformed hierarchies have been designed.

Testing the Logical Schema

Testing the logical schema before it is implemented and before ETL design can dramatically reduce the impact of errors due to bad logical design. An effective approach to functional testing consists in verifying that a sample of queries in the preliminary workload can correctly be formulated in SQL on the logical schema. We call this the star test.

Testing the ETL Procedures

A functional test of ETL is aimed at checking that ETL procedures correctly extract, clean, transform, and load data into the data mart. The best approach here is to set up unit tests and integration tests. Unit tests are white-box test that each developer carries out on the units developed. Integration test allows the correctness of data flows in ETL procedures to be checked.

Testing the Database

Database testing is mainly aimed at checking the database performances using either standard (performance test) or heavy (stress test) workloads.

Testing the Front-End

Functional testing of the analysis front-ends must necessarily involve a very large number of end-users, who generally are so familiar with application domains that they can detect even the slightest abnormality in data. Nevertheless, wrong results in OLAP analyses may be difficult to recognize. They can be caused not only by faulty ETL procedures, but even by incorrect data aggregations or selections in front-end tools.

© 2024 Novature Tech Pvt Ltd. All Rights Reserved.