Database – RDBMS:
There are a number of relational databases to store data. A relational database contains normalized data stored in tables. Tables contain records and columns. RDBMS makes it easy to work with individual records. Each row contains a unique instance of data for the categories defined by the columns.
RDBMS are used in OLTP applications(e.g. ATM cards) very frequently and sometimes data warehouse may also use relational databases. Please refer to Relational data modeling for details to know how data from a source system is normalized and stored in RDBMS databases.
Popular RDBMS Databases:
|RDBMS Name||Company Name|
|IBM DB2 UDB||IBM Corporation|
|IBM Informix||IBM Corporation|
|Microsoft SQL Server||Microsoft|
What is a Data Warehouse?
According to Inmon, famous author for several data warehouse books, “A data warehouse is a subject oriented, integrated, time variant, non volatile collection of data in support of management’s decision making process”.
In order to store data, over the years, many application designers in each branch have made their individual decisions as to how an application and database should be built. So source systems will be different in naming conventions, variable measurements, encoding structures, and physical attributes of data. Consider a bank that has got several branches in several countries, has millions of customers and the lines of business of the enterprise are savings, and loans. The following example explains how the data is integrated from source systems to target systems.
Example of Source Data:
|System Name||Attribute Name||Column Name||Datatype||Values|
|Source System 1||Customer Application Date||CUSTOMER_APPLICATION_DATE||NUMERIC(8,0)||11012005|
|Source System 2||Customer Application Date||CUST_APPLICATION_DATE||DATE||11012005|
|Source System 3||Application Date||APPLICATION_DATE||DATE||01NOV2005|
In the aforementioned example, attribute name, column name, datatype and values
are entirely different from one source system to another. This inconsistency in data can
be avoided by integrating the data into a data warehouse with good standards.
Example of Target Data (Data Warehouse):
|Target System||Attribute Name||Column Name||Datatype||Values|
|Record #1||Customer Application Date||CUSTOMER_APPLICATION_DATE||DATE||01112005|
|Record #2||Customer Application Date||CUSTOMER_APPLICATION_DATE||DATE||01112005|
|Record #3||Customer Application Date||CUSTOMER_APPLICATION_DATE||DATE||01112005|
In the above example of target data, attribute names, column names, and datatypes
are consistent throughout the target system. This is how data from various source system
is integrated and accurately stored into the data warehouse.
Data Warehouse Architecture Diagram:
A Data Warehouse is a relational/multidimensional database that is designed for query and analysis rather than transaction processing. A data warehouse usually contains historical data that is derived from transaction data. It separates analysis workload from transaction workload and enables a business to consolidate data from several sources.
In addition to a relational/multidimensional database, a data warehouse environment often consists of an ETL solution, an OLAP engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
There are three types of data warehouses:
1. Enterprise Data Warehouse – An enterprise data warehouse provides a central database for decision support throughout the enterprise.
2. ODS (Operational Data Store) – This has a broad enterprise wide scope, but unlike the real Enterprise data warehouse, data is refreshed in near real time and used for routine business activity.
3. Data Mart – Datamart is a subset of data warehouse and it supports a particular region, business unit or business function.
Data warehouses and data marts are built on dimensional data modeling where
fact tables are connected with dimension tables. This is most useful for users to access
data since a database can be visualized as a cube of several dimensions. A data
warehouse provides an opportunity for slicing and dicing that cube along each of
A data mart is a subset of data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, data can be derived from an enterprise-wide data warehouse. In an independent data mart, data can be collected directly from sources.
Data Warehouse and Datamarts – Sample Diagram: