ETL Testing Interview Questions :
In my previous articles i have explained about ETL definition and its concept in brief. In this article i would like to explain the ETL Testing Interview Questions with Answers in detail. ETL stands for Extract Transform and Load. The database developer always perform ETL task to extract the data from source system ,Transform it and load it in to target. The new concept is came in to the picture -ETL testing. ETL Tester is responsible to check all the data is coming properly in the target system and target system is working as per demand.
I will explain different ETL Testing Interview Questions and answers in next section :
Question 1 : What is ETL? (100% asked ETL Testing Interview Questions)
As the name suggested ETL stands for Extract Transform and Load. Just like the name applies ETL tool Extracts data from the source. Transforms the data while in transit and then it loads the data in to Specified database.
Following are two different definitions of ETL :
Definition 1 :
ETL is nothing but Extract, Transform and Loading of the data from multiple heterogeneous data sources to a single or multiple sources.
Definition 2 :
ETL is nothing but Extraction, Transformation and Loading of the data.
Question 2 : Explain Concept of Extraction, transformation and Loading.
Take data from an external source and move it to the warehouse pre-processor database.
Transform data task allows point-to-point generating, modifying and transforming data.
Load data task adds records to a database table in a warehouse.
Question 3 : What is difference between Manual Testing and ETL Testing?(90% asked ETL Testing Interview Questions)
1.The main difference between manual testing and ETL testing is manual testing is related to the functionality of the program and ETL testing is related to the databases and its count.
2.ETL is the automated testing process where you don’t need any technical knowledge other than the software. Also, ETL testing is extremely faster, systematic and assure top results as needed by the businesses.
3.Manual testing is highly time-consuming where you need technical knowledge to write the test cases and the scripts. It is slow, needs efforts, and highly prone to errors.
Question 4 : Explain Need Of ETL Testing.(100% asked ETL Testing Interview Questions)
Now a days we are migrating the tons of systems from old technology to new technology. At the time of migration activities user also needs to migrate the data as well from old DBMS to latest DBMS.So there is huge need to test that data is correct from target side.The following are some bullet points where i have explained the necessity of ETL Testing :
- To keep a check on the Data which are being transferred from one system (Old system) to the other (New system).
- To keep a track on the efficiency and speed of the process.
- To be well acquainted with the ETL process before it gets implemented into your business and production.
Question 5 : Where user can use ETL concepts. Give some examples.
- Before ETL tool user needs to write a long code for data transformation to data loading
- ETL makes the life simple and one tool will manage all the scenarios of transformation and loading of the data
- There are following examples where we are using the ETL :
Example 1 : Data warehousing :
The ETL is used in data warehousing concepts. User needs to fetch the data from multiple heterogeneous systems and loads it in data warehouse database.ETL Concept is mainly used here to extract the data from source,transform the data and load it in to target systems.
Example 2: Data Migrations
The data migrations are difficult efforts if you use the PL SQL or T-SQL development to do. If you want to migrate the data using simple way use different ETL tools.
Example 3 : Mergers and Aquisitions
Now a days lot of companies are merging in to different MNCs. To move the data from one company to another company the ETL concepts is been used.
Question 6 : Explain how ETL is used in third party data management.(100% asked ETL Testing Interview Questions)
The big organizations always gives different application development to different kind of vendors.Means not a single vendor is managing everything. Lets take example of Telecommunication project where billing is managed by one company and CRM is managed by other company.If CRM company needs some data from the company who is managing the Billing. That company will receive a data feed from the other company. To load the data from the feed ETL process is used.
Question 7 : Explain how ETL is used in Data warehousing?
The most common example of ETL is ETL is used in Data warehousing.User needs to fetch the historical data as well as current data for developing data warehouse. The Data warehouse data is nothing but combination of historical data as well as transactional data. Its data sources might be different.User needs to fetch the data from multiple heterogeneous systems and load it in to single target system which is also called as data warehouse.
As The ETL definition suggests that ETL is nothing but Extract, Transform and loading of the data; This process needs to be used in data warehousing widely. The simple example of this is managing sales data in shopping mall. If user wants the historical data as well as current data in the shopping mall first step is always user needs to follow the ETL process. Then that data will be used for reporting purpose.
Question 8 : Explain difference between ETL and BI tools?
An ETL tool is used to extract data from different data sources, transform the data, and load it into a DW system. In contrast, a BI tool is used to generate interactive and adhoc reports for end-users, dashboard for senior management, data visualizations for monthly, quarterly, and annual board meetings.
Most common ETL tools include − SAP BO Data Services (BODS), Informatica, Microsoft – SSIS, Oracle Data Integrator ODI, Talend Open Studio, Clover ETL Open source, etc.
Most common BI tools include − SAP Business Objects, SAP Lumira, IBM Cognos, JasperSoft, Microsoft BI Platform, Tableau, Oracle Business Intelligence Enterprise Edition, etc.
Question 9 : What is difference between ETL Testing and Database Testing?(80% asked ETL Testing Interview Questions)
Following are different difference points between ETL testing and database testing :
|ETL Testing||DB Testing|
|Business Intelligence reporting||Goal is to integrate data|
|Business flow environment based on earlier data||Applicable to business flow systems|
|Informatica, Cognos and QuerySurge can be used||QTP and Selenium tools for automation|
|Analysing data may have potential impact||Architectural implementation involves high impact.|
|Dimensional model||Entity relationship model|
|Analytics are processed||Transactions are processed|
|Denormalized data is used||Data used is normalized|
Question 10 :What are different characteristics of Data Warehouse?(100% asked ETL Testing Interview Questions)
- Data warehouse is a database which is separate from operational database which stores historical information also.
- Data warehouse database contains transactional as well as analytical data.
- Data warehouse helps higher management to take stratagic as well as tactical decisions using historical or current data.
- Data warehouse helps consolidated historical data analysis.
- Data warehouse helps business user to see the current trends to run the business.
- Data warehouse is used for reporting and data analysis purpose.
Question 11 : What are different types of Data warehouse Systems?
- Data Mart
- Online Analytical Processing (OLAP)
- Online Transactional Processing
- Predictive Analysis
Question 12 : Which are different steps of ETL testing process?(100% asked ETL Testing Interview Questions)
Following are different steps included in ETL Testing :
Step 1 :Analyzing the requirement:
Understanding the business structure and their particular requirement.
Step 2 :Validation and Test Estimation:
An estimation of time and expertise required to carry on with the procedure.
Step 3 : Test Planning and Designing the testing environment:
Based on the inputs from the estimation, an ETL environment is planned and worked out.
Step 4 :Test Data preparation and Execution :
Data for the test is prepared and executed as per the requirement.
Step 5 : Summary Report:
Upon the completion of the test run, a brief summary report is prepared for improvising and concluding.
These are different steps in ETL testing process.
Question 13 : How ETL is used in Data migration projects. Explain with example.(60% asked ETL Testing Interview Questions)
ETL tools are widely used in data migration projects. If the organization is managing the data in oracle 10 g previously and now organization wants to go for SQL server cloud database then there is need to migrate the data from Source to Target.To do this kind of migration the ETL tools are very useful. If user wants to write the code of ETL it is very time consuming process. To make this simple the ETL tools are very useful in which the coding is simple as compare to PL SQL or T-SQL code.So ETL process i very useful in Data migration projects.
Question 14 : Explain multiple steps to choose ETL tool .(90% asked ETL Testing Interview Questions)
The choosing the ETL tool is very difficult thing. You need to consider lot of factors while choosing the correct ETL tool according to the project.Choosing the ETL tool for specific project is very stratagic move even you need it for a small project.Make sure that ETL tool migrations are no small efforts.In this section i would like to give you some bullet points to consider while choosing your ETL tool.
1.Data Connectivity :
ETL tool should be communicate with any source of data no matter where it comes from.This is very critical.
Moving and changing a data requires some serious processing power. So you need to check the performance factors.
3.Transformation Flexibility :
Matching,Merging and changing the data is very critical.ETL data should provide these and many transformation packages which allow modifications to the data in transformation phase with simple drag and drop.
4.Data Quality :
Your data is not clean. The only way to leverage your data when your data is consistent and clean.
5.Flexible data Acquisition options :
Once the ETL is ready you need to check that ETL will work on previous data as well as new coming data.
6.Commited ETL Vendor :
You are playing with the organization data while doing the ETL process.So Choose vendor who is very well known in the industry and whose support is really great.
Question 16 : Name some important ETL bugs.(70% asked ETL Testing Interview Questions)
There are following popular ETL Bugs :
3.ECP related bugs
4.load condition bugs
5.The User-Interface bugs.
Question 17: What is Operational data Source
- ODS stands for Operational Data Store.
- ODS Comes between staging area & Data Warehouse. The data is ODS will be at the low level of granularity.
- Once data was populated in ODS aggregated data will be loaded into EDW through ODS.
Question 18 : Explain imporatance of mapping sheet.Who is responsible to create mapping sheet?
The ETL mapping sheet contains all necessary information from the source file and stores the details in rows and columns. This mapping sheet helps experts in writing SQL queries to speed up the testing process.
The Database designer is responsible to create the mapping sheet.
Question 19 : What is fact and what are its types.(100% asked ETL Testing Interview Questions)
It is a central component of a multi-dimensional model which contains the measures to be analyzed. Facts are related to dimensions.
Types of facts are:
Additive: A measure can participate arithmetic calculations using all or any dimensions.
Ex: Sales profit
Semi additive: A measure can participate arithmetic calculations using some dimensions.
Ex: Sales amount
Non Additive:A measure can’t participate arithmetic calculations using dimensions.
Question 20 : Explain Data extraction phase in ETL with its type.(90% asked ETL Testing Interview Questions)
The data extraction is nothing but extracting the data from multiple heterogeneous sources with using ETL tools.
There are 2 Types of Data Extraction
1.Full Extraction : All the data from source systems or operational systems gets extracted to staging area. (Initial Load)
2.Partial Extraction : Sometimes we get notification from the source system to update specific date. It is called as Delta load.
Source System Performance : The Extraction strategies should not affect source system performance.
Question 21 : What is Dimensions? Explain with example.
Dimension table is table which describes the business entities of an enterprise which describes the objects in a fact table.Dimension table has primary key which uniquely identifies each dimension row.Dimension table is sometimes called as lookup or reference table.The primary key of dimension table is used to associate relationship between fact table which contains foreign key.Dimension tables are normally in de-normalized form because these tables are only used to analyse the data and not used to execute transactions.
The fields in a dimension table is used to complete following 3 important requirement :
- Query Constrainting
- Grouping /Filtering
- Report labeling
Following are different examples of dimensions:
Question 22 : Explain about data transformation in ETL.
- Data Extracted from source system is in to Raw format. We need to transform it before loading in to target server.
- Data has to be cleaned, mapped and transformed
- There are following steps in transformation:
1.Selection :Select data to load in target
2.Matching : Match the data with target system
3.Data Transforming: We need to change data as per target table structures.
Question 23 : What are different examples of data transformation in ETL.
There are following examples of data transformation:
1.Standardizing data : Data is fetched from multiple sources so it needs to be standardized as per the target system.
2.Character set conversion : Need to transform the character sets as per the target systems. (Firstname and last name example).
3.Calculated and derived values: In source system there is first val and second val and in target we need the calculation of first val and second val.
4.Data Conversion in different formats : If in source system date in in DDMMYY format and in target the date is in DDMONYYYY format then this transformation needs to be done at transformation phase.
Question 24 : Explain partitioning in ETL?
The transactions are always needed to be divided for the better performance. The same processes are known as Partitioning. It simply makes sure that the server can directly access the sources through multiple connections.
Question 25 : What is data loading? Explain its types.
Data loading phase loads the prepared data from staging tables to main tables.
There are following types of data loading:
- Initial load : Populating all the data tables from source system and loads it in to data warehouse table.
- Incremental Load : Applying the ongoing changes as necessary in periodic manner.
- Full Refresh : Completely erases the data from one or more tables and reload the fresh data.
Question 26 : What are different types of ETL tool?
There are following types of ETL tools:
1.Enterprise ETL tools :
MS SQL server integration services
2.Open Source ETL Tools :
These are most important interview questions for Testers.Hope you like this article on ETL Testing Interview Questions.If you find it useful or if you have suggestions with the same kindly comment it in to comment box.