Data warehouse Interview Questions :
The term Data Warehouse provides the user with the data and tool to answer the questions which are important to run the business or part of business.In short,the Data warehouse is used for reporting the specified data of any business which is very important and using which the higher management of the business will take the decisions for growth of business.Now a days Business Intelligence is demanding technology to analyse the business and its history.In this article i will try to explain 20 Most important Data warehouse Interview Questions.I have explained about SQL interview questions in other articles.These questions are helpful to BI developers as well as Data warehouse people to crack the Data warehouse interviews:
1.What is OLAP(Online Analytical Processing)?(100 % asked Data warehouse Interview Questions )
OLAP is technology used in many Business Intelligence applications which includes complex analytical calculations.OLAP is used for complex calculations,Trends Analysis,sophisticated data modeling.OLAP database is stored in multidimensional database model.OLAP system contains less number of transactions but complex calculations like aggregation- Sum,count,average,min,max e.t.c.
The Aggregated data in OLAP system must be in months,quarters,years,weeks e.t.c. The key purpose to use OLAP system is to reduce the query response time and increase the effectiveness of reporting.If these aggregated calculations are already stored in repository and if user wants fast access of data then user can use OLAP system.OLAP database stores aggregated historical data in multidimensional schema.
Real Example :
If Company head wants information of Resources salary in year 2000.
In spite of using the transactional system we will use OLAP system here where aggregated data of year 2000 for Resources is already present.
2.What is OLTP(Online Transaction Processing)?
OLTP system is known as large number of small daily transactions like insert,update and delete.Operational database is known as OLTP system.OLTP system provides fast query processing as well as it is also responsible to provide data integrity and data consistency.The actual effectiveness of OLTP is measured in number of Transactions per second.OLTP normally contains current data and data normalization is used properly in OLTP system.
Real Example :
If Company head wants transactional report of all Employees In – Out time.
As Company head wants daily report of in-out time we need to provide it using OLTP system.We need to schedule report on daily basis using OLTP system.
3.What is Data Warehouse?(100 % asked in Data warehouse Interview Questions )
Data warehousing is a system which is used for reporting purpose as well as data analysis purpose where data is coming from multiple heterogeneous sources whether it is oracle, sql server, postgres,simple excel sheet.Data warehousing is specially used for reporting historical data.Data warehousing is core component of Business Intelligence.In Data warehouse there is one central mechanism called as repository using which one can fetch the historical reports.In Data warehouse user can store current data as well as historical data on single place called as repository.Data warehouse provides generalized data and consolidated data in multidimensional view.Data warehouse provides Online Analytical tools to analyse the data and reporting purpose.
Data warehousing is subject oriented,time variant,non-volatile collection of data which is used for creation of transactional reports as well as historical reports.
4.What are dashboards?
Dashboards are nothing but the arrangement of all the reports and charts graphs on one page.It is nothing but the collection of reports of different formats which has same functionality displayed on the same page.
5.What are different characteristics of Data Warehouse?
- Data warehouse is a database which is separate from operational database which stores historical information also.
- Data warehouse database contains transactional as well as analytical data.
- Data warehouse helps higher management to take stratagic as well as tactical decisions using historical or current data.
- Data warehouse helps consolidated historical data analysis.
- Data warehouse helps business user to see the current trends to run the business.
- Data warehouse is used for reporting and data analysis purpose.
6.What are different types of Data warehouse Systems?
- Data Mart
- Online Analytical Processing (OLAP)
- Online Transactional Processing
- Predictive Analysis
7.What is difference between Data warehouse and Transactional System?(100 % asked Data warehouse Interview Questions )
|Data warehouse||Transactional System|
|Data warehouse stores complex and general form of the data.||Transactional system stores the updated daily transactions,workloads etc.|
|Data warehouse stores historical information||Transactional System Contains current data of organization|
|Data warehouse database generally have only read only access means user can only select the data||Operational or Transactional database have insert,update privileges as the data processing and updation needed.|
|Data warehousing requires data cleaning,data validation and data consolidation||Transactional system requires parallel processing of the data,concurrency control.Data consolidation is less required in transactional system as compare to OLAP database.|
8.What is Data Mart?
Data Mart is a simplest set of Data warehouse which is used to focus on single functional area of the business.We can say Data Mart is a subset of Data warehouse which is oriented to specific line of business or specific functional area of business such as marketing,finance,sales e.t.c. The data come in to Data Mart by different transactional systems,other data warehouse or external sources.
Data Mart is simply a subset of Organization’s Data warehouse.
9.What is database schema?What are its types?(100 % asked BI Interview Questions)
Schema is logical description of whole database.Database schema is a skeleton or structure of the database which represents database logically.Same like a database Data warehouse also requires to maintain database schema.Data warehouse schema includes name of database objects with its relationship maintained in diagrammatic format.Database uses relational model where data warehouse uses following types of Database Schema:
3.Fact Constellation schema which is also called as Galaxy Schema.
10.What is mean by Fact Tables?Explain with example.(100 % asked BI Interview Question)
Fact table is central table found in star schema or snowflakes schema which is surrounded by dimension tables.Fact table contains numeric values that are known as measurements.Fact table has two types of columns:
2.Foreign key of dimension tables.
The measures in a fact table are of three types :
Measures that can be added across any dimension
Measures that can not be added across any dimension
Measures that can be added across some dimensions.
Real Example :
Following is a fact table which contains all the primary keys of dimensions table and added measures for ITEM,i.e.Product sold.
|ITEM KEY||Time key||Product key||Date key||Product Sold|
The fact table contains the foreign keys,time dimensions,product dimension,customer dimension,measurement values.Following are some examples of common facts :
No of unit sold,Margin,Sales revenue and the dimension tables are customer,time and product e.t.c. which is used to analyse data.
11.What is Dimensions in BI?(100 % asked Data warehouse Interview Questions )
Dimension table is table which describes the business entities of an enterprise which describes the objects in a fact table.Dimension table has primary key which uniquely identifies each dimension row.Dimension table is sometimes called as lookup or reference table.The primary key of dimension table is used to associate relationship between fact table which contains foreign key.Dimension tables are normally in de-normalized form because these tables are only used to analyse the data and not used to execute transactions.
The fields in a dimension table is used to complete following 3 important requirement :
- Query Constrainting
- Grouping /Filtering
- Report labeling
Real Life Example :
Consider following table which contains item information.In the following table ITEM KEY is primary key which uniquelly identifies the rows in the dimension table. ITEM KEY will be present in Fact table.
|ITEM KEY||ITEM NAME||BRAND||SOLD BY||Category|
In the image i have explained which are fact and which are dimension tables. You will able to see there are four dimesions :
12.What is mean by aggregates?
Aggregate table contains aggregated data which can be calculated by using different aggregated functions like count,avg,min,max.e.t.c.Aggregated tables are most widely used tables in OLAP database.Aggregates are functions where the values of table or column are grouped together and form a single value. Following are some aggregate functions:
Using the above aggregate functions the data will be inserted in aggregate table.The aggregate tables are used for performance optimization and data is coming fast using aggregate table.Aggregations applied on database level improves the performance of the query as the query not hits directly on table it will hit on aggregate table and fetches data.
If table contains the data of year 2016 and 2017 ,User wants a actual count of records in the table monthly,Quarterly and yearly.Then We need to make aggregate table which contains count of records monthly,For Quarterly we need to create other table and push the quarterly count in that table.And we need to use that tables in the reports so that report performance will improve drastically.
13.What is Star Schema?(100 % Asked Data warehouse Interview Questions )
In Star schema there is Fact table as a center and all dimension tables surrounded with that fact table.It is called as Star schema because diagram resembles a star with points radiating from center.Star schema is used in simple data mart or data warehouse.Star schema is designed in such way that it will optimize the querying on large data sets.In Star schema multiple dimension tables joined with only one fact table in denormalized form.
OBIEE BMM(Business model) Layer always follows star schema.
14.What is snowflakes schema?(100 % Asked Data warehouse Interview Questions )
Snowflake schema is a form of dimensional modeling where dimensions are stored with multiple dimension tables.Snowflake schema is variation over star schema.The schema is diagrammed as each fact is surrounded with dimensions;and some dimensions are further related to other dimensions which are branched in snowflake pattern.In snowflake schema multiple dimension tables are organized and joined with fact table.Only difference between star and snowflake schema is dimensions are normalized in snowflake schema.Normalization splits up data in to additional tables.
Real life Example :
In Diagram i shown the snowflake schema where sales table is a fact table and all are dimensions.Store table is further normalized in to different tables name city,state and region.
15.What is mean by Granularity?(80% Asked Data warehouse Interview Questions )
Granularity in table represents the level of information stored in the table.In BI granularity is very important concept to check the table data.The granularity is high and low .High granularity data contains the data with high information or you can say it as transaction level data is high granularity data.Low granularity means data has low level information only.Fact table always have low granularity mean we need very low level data in fact table.
Following 2 points are important in defining granularity :
1.Determining the dimensions that are to be included
2.Determining location to place hierarchy of each dimension of information.
Real life Example :
Date Dimension Granularity level :
16.What is mean by data analytics?
Data analytics (DA) is the science of examining raw data with the purpose of drawing conclusions about that information. A data warehouse is often built to enable Data Analytics.
17.Explain benefits of Data warehouse?
A data warehouse helps to integrate data and store them historically so that we can analyze different aspects of business including, performance analysis, trend, prediction etc. over a given time frame and use the result of our analysis to improve the efficiency of business processes.
18.Why Data warehouse is used?
For a long time in the past and also even today, Data warehouses are built to facilitate reporting on different key business processes of an organization, known as KPI. Today we often call this whole process of reporting data from data warehouses as “Data Analytics”. Data warehouses also help to integrate data from different sources and show a single-point-of-truth values about the business measures (e.g. enabling Master Data Management).
Data warehouse can be further used for data mining which helps trend prediction, forecasts, pattern recognition etc.
19.What is mean by Dimensional modeling?
Dimensional model consists of dimension and fact tables. Fact tables store different transactional measurements and the foreign keys from dimension tables that qualifies the data. The goal of Dimensional model is not to achieve high degree of normalization but to facilitate easy and faster data retrieval.
Ralph Kimball is one of the strongest proponents of this very popular data modeling technique which is often used in many enterprise level data warehouses.
20.What is mean by additive,semi-additive and non-additive measures?
Non-additive measures are those which can not be used inside any numeric aggregation function (e.g. SUM(), AVG() etc.). One example of non-additive fact is any kind of ratio or percentage. Example, 5% profit margin, revenue to asset ratio etc. A non-numerical data can also be a non-additive measure when that data is stored in fact tables, e.g. some kind of varchar flags in the fact table.
Semi Additive Measures
Semi-additive measures are those where only a subset of aggregation function can be applied. Let’s say account balance. A sum() function on balance does not give a useful result but max() or min() balance might be useful. Consider price rate or currency rate. Sum is meaningless on rate; however, average function might be useful.
Additive measures can be used with any aggregation function like Sum(), Avg() etc. Example is Sales Quantity etc.
21.What is Business Intelligence?
The term Business Intelligence (BI) provides the user with the data and tool to answer the questions which are important to run the business or part of business.In short,the business intelligence is used for reporting the specified data of any business which is very important and using which the higher management of the business will take the decisions for growth of business.Business Intelligence (BI) is used to take following organization level decisions:
1.BI is used to determine whether the business is running as per plan
2.BI is used to identify which things are going wrong
3.BI is used to take and monitor corrective actions
4.BI is used to identify the trends.
22.What are different stages of Business Intelligence?
There are following five stages of Business Intelligence:
3.Decision making support
These are some important data warehouse interview questions.Hope you like this article.If you like this article don’t forget to comment in comments section.