Data warehousing

What is definition of data warehouse and its characteristics?

What is definition of data warehouse and its characteristics?

Data warehousing is a system which is used for reporting purpose as well as data analysis purpose where data is coming from multiple heterogeneous sources whether it is  oracle, sqlserver, PostgreSQL, simple excel sheet. Data warehousing is specially used for reporting historical data. Data warehousing is core component of Business Intelligence (BI).In Data warehouse there is one central mechanism called as repository using which one can fetch the historical reports. In Data warehouse user can store current data as well as historical data on single place called as repository. Data warehouse provides generalized data and consolidated data in multidimensional view. Data warehouse provides Online Analytical tools to analyze the data and reporting purpose. I would like to start this article with definition of data warehouse in simple words.

What you will find in this article?

1.Definition of Data warehouse

2.Characteristics of data warehouse

3.OLAP vs OLTP

4.Types of Data warehouse

What is definition of data warehouse and its characteristics?

Definition of data warehouse :

Data warehousing is subject oriented, time variant, non-volatile collection of data which is used for creation of transactional reports as well as historical reports.

Characteristics of Data warehouse:

  1. Data warehouse is a database which is separate from operational database which stores historical information also.Data warehouse
  2. Data warehouse database contains transactional as well as analytical data.
  3. Data warehouse helps higher management  to take strategic as well as tactical decisions using historical or current data.
  4. Data warehouse helps consolidated historical data analysis.
  5. Data warehouse helps business user to see the current trends to run the business.
  6. Data warehouse is used for reporting and data analysis purpose.

Data warehouse is first invented by bill Inmon in 1990

Data warehouse Vs Transactional System :

Following table shows the difference between Data warehouse and Transactional system :

Data warehouse Transactional System
Data warehouse stores complex and general form of the data. Transactional system stores the updated daily transactions, workloads etc.
Data warehouse stores historical information Transactional System Contains current data of organization
Data warehouse database generally have only read only access means user can only select the data Operational or Transactional database have insert, update privileges as the data processing and updating needed.
Data warehousing requires data cleaning, data validation and data consolidation Transactional system requires parallel processing of the data, concurrency control. Data consolidation is less required in transactional system as compare to OLAP database.

Types of Data warehouse system :

  1. Data Mart
  2. Online Analytical Processing (OLAP)
  3. Online Transactional Processing
  4. Predictive Analysis

1.Data Mart :

Data Mart is a simplest set of Data warehouse which is used to focus on single functional area of the business. We can say DatData Marta Mart is a subset of Data warehouse which is oriented to specific line of business or specific functional area of business such as marketing,finance,sales etc.. The data come in to Data Mart by different transactional systems, other data warehouse or external sources.

Data Mart is simply a subset of Organization’s Data warehouse.

2.Online Analytical Processing System (OLAP) :

OLAP is technology used in many Business Intelligence applications which includes complex analytical calculations.OLAP is used for complex calculations, Trends Analysis, sophisticated data modeling.OLAP database is stored in multidimensional database model.OLAP system contains less number of transactions but complex calculations like aggregation- Sum,count,average,min,max etc..

The Aggregated data in OLAP system must be in months,quarters,years,weeks etc.. The key purpose to use OLAP system is to reduce the query response time and increase the effectiveness of reporting.If these aggregated calculations are already stored in repository and if user wants fast access of data then user can use OLAP system.OLTP database stores aggregated historical data in multidimensional schema.

Real Example :

If Company head wants information of Resources salary in year 2000.

In spite of using the transactional system we will use OLAP system here where aggregated data of year 2000 for Resources is already present.

3.Online Transaction Processing System(OLTP) :

OLTP system is known as large number of small daily transactions like insert, update and delete. Operational database is known as OTLP system. OLTP system provides fast query processing as well as it is also responsible to provide data integrity and data consistency. The actual effectiveness of OLTP is measured in number of Transactions per second. OLTP normally contains current data and data normalization is used properly in OLTP system.

Real Example :

If Company head wants transactional report of all Employees In – Out time.

As Company head wants daily report of in-out time we need to provide it using OLTP system. We need to schedule report on daily basis using OLTP system.

OLAP Vs OLTP :

Following Table shows difference between OLAP and OLTP system:

OLAP OLTP
It stores only historical data and historical data processing is done. It involves daily processing of data
OLAP System is used by higher management like managers , analysts , executives,CEO,CFOs OLTP system used by DBAs, Database Professionals ,Programmers for applying business logic.
OLAP is used to analyze the business OLTP is used to run the business
OLAP is very huge database so lot of indexes are used for fast data processing OLTP uses less indexing as data is less here
OLAP uses Star-Schema, Snowflakes schema or Fact-Dimensions OLTP uses Entity Relations
OLAP contains historical data OLTP contains Current data
OLAP has less joins and in de-normalized form database OLTP has large no of joins and in normalized form
OLAP system has aggregated multidimensional data OLTP has not aggregated data.
OLAP system gives summarized consolidated data OLTP system gives data in detailed format
OLAP database size is 100 GB to 100 TB OLTP database size is 100 MB to 100 GB

CLICK HERE TO CHECK  OUT  COMPLEX SQL QUERIES 

Hope Everyone likes this article. If you like this article don’t forget to comment on comment section.