Dimensional Data Modeling

Dimensional Data Modeling comprises of one or more dimension tables and fact tables. Good examples of dimensions are location, product, time, promotion, organization etc. Dimension tables store records related to that particular dimension and no facts (measures) are stored in these tables.

For example, Product dimension table will store information about products (Product Category, Product Sub Category, Product and Product Features) and location dimension table will store information about location ( country, state, county, city, zip. A fact (measure) table contains measures (sales gross value, total units sold) and dimension columns. These dimension columns are actually foreign keys from the respective dimension tables.

Example of Dimensional Data Model:

Example of Dimensional Data Model

In the example figure above, sales fact table is connected to dimensions location, product, time and organization. It shows that data can be sliced across all dimensions and again it is possible for the data to be aggregated across multiple dimensions. “Sales Dollar” in sales fact table can be calculated across all dimensions independently or in a combined manner which is explained below.

  • Sales Dollar value for a particular product.
  • Sales Dollar value for a product in a location.
  • Sales Dollar value for a product in a year within a location.
  • Sales Dollar value for a product in a year within a location sold or serviced by an employee

In Dimensional data modeling, hierarchies for the dimensions are stored in the dimensional table itself. For example, the location dimension will have all of its hierarchies from country, state, county to city. There is no need for the individual hierarchical lookup like country lookup, state lookup, county lookup and city lookup to be shown in the model.

Dimensional Data Modeling is used for calculating summarized data. For example,
sales data could be collected on a daily basis and then be aggregated to the week level, the week data could be aggregated to the month level, and so on. The data can then be referred to as aggregate data. Aggregation is synonymous with summarization, and aggregate data is synonymous with summary data. The performance of dimensional data modeling can be significantly increased when materialized views are used. Materialized view is a pre-computed table comprising aggregated or joined data from fact and possibly dimension tables which also known as a summary or aggregate table.

 

 

Relational Data Modeling Example – Part 3

The completed relational data model is shown in the previous part and the corresponding data stored in database are shown in separate tables below.

State Lookup:

State CodeState NameDate TimeStamp
NYNew York1/1/2005 11:23:31 AM
FLFlorida1/1/2005 11:23:31 AM
CACalifornia1/1/2005 11:23:31 AM
NJNew Jersey1/1/2005 11:23:31 AM

County Lookup:

County CodeCounty NameDate TimeStamp
NYSHShelby1/1/2005 11:23:31 AM
FLJEJefferson1/1/2005 11:23:31 AM
CAMOMontgomery1/1/2005 11:23:31 AM
NJHUHudson1/1/2005 11:23:31 AM

City Lookup:

City CodeCity NameDate TimeStamp
NYSHMAManhattan1/1/2005 11:23:31 AM
FLJEPCPanama City1/1/2005 11:23:31 AM
CAMOSHSan Hose1/1/2005 11:23:31 AM
NJHUJCJersey City1/1/2005 11:23:31 AM

Employee Lookup:

Emp IdState CodeCounty CodeCity CodeManager IdEmp First NameEmp Last NameEmp Full NameDate TimeStamp
1NYNYSHNYSHMAPaulYoungPaul Young1/1/2005 11:23:31 AM
2FLFLJEFLJEPC1ChrisDavisChris Davis1/1/2005 11:23:31 AM
3CACAMOCAMOSH1LouisJohnsonLouis Johnson1/1/2005 11:23:31 AM
4NJNJHUNJHUJC1SamMathewSam Mathew1/1/2005 11:23:31 AM
5NYNYSHNYSHMA1NancyRobinsonNancy Robinson1/1/2005 11:23:31 AM
6FLFLJEFLJEPC2Sheela ShellumSheela Shellum1/1/2005 11:23:31 AM
7CACAMOCAMOSH3JeffBillJeff Bill1/1/2005 11:23:31 AM
8NJNJHUNJHUJC4JohnBurrellJohn Burrell1/1/2005 11:23:31 AM

Employer Lookup:

Employer IdEmployer NameDateTimeStamp
1001American Bank of NewYork1/1/2005 11:23:31 AM
1002American Bank of Florida1/1/2005 11:23:31 AM
1003American Bank of California1/1/2005 11:23:31 AM
1004American Bank of New Jersey1/1/2005 11:23:31 AM

Employee Employer XREF:

Employee IdEmployer IdDateTimeStamp
110011/1/2005 11:23:31 AM
210021/1/2005 11:23:31 AM
310031/1/2005 11:23:31 AM
410041/1/2005 11:23:31 AM
510011/1/2005 11:23:31 AM
610021/1/2005 11:23:31 AM
710031/1/2005 11:23:31 AM
810041/1/2005 11:23:31 AM

 

Relational Data Modeling Example – Part 2

Upon discussion with business analysts, data modeler can come up with the following conclusions regarding grouping and relationship between the data. These conclusions play a vital role in designing the data model as well as expanding for future scope.

  • Many cities can be in one county. City names will be unique across the country.
  • Many counties can be in one state. County names will be unique across the country.
  • Many states can be in USA. State names will be unique across the country USA.
  • One employee can work with many branches at same time.
  • For some employees, managers may not be there.

In order to implement the above decisions, relational data modeling is done in the following manner.

  • To achieve normalization, relevant attributes of employee, employer lookup, state lookup, county lookup and city lookup tables should be grouped and created.
  • In order to validate the data of employee table, employee table has been connected to state, county, and city lookups. Whenever state, county, city data is entered in employee table, data would be checked against respective lookup tables and correct data is stored. Hence there is no need to carry redundant data of state, county, city lookup in employee table.
  • All tables are identified by primary keys(PK). So data can be uniquely identified from tables.
  • Records can be inserted or updated directly in the respective lookup table. For example if a state name changes, then the change will be updated only in the state lookup, hence this change will not affect other tables like employee.
  • Since one employee can work in many branches at the same time, table EmployeeEmployerXREF has been created and it resolves many to many relationships.
  • Since an employee can be a manager in many occasions, column “manager identifier” has been added and becomes a foreign key to column employee identifier. The “manager identifier” column would contain the same value as of an employee identifier. Sometimes it may contain null values also. For example, Paul Young is the topmost person and doesn’t have any managers.
  • A new column DateTimeStamp has been added to all tables. This column gives the information about the date and time when the row was inserted or updated.

The completed relational data model is shown in Figure below and the corresponding data are shown in separate tables in the next page.

Relational Data Modeling Example Diagram:

Relational Data Modeling Example

 

Next: Relational Data Modeling Example – Part 3 

Relational Data Modeling Example

The sample source data shown in the table below provides the information about employees, their residential state, county, city and their employer names and manager names. It also describes employees working for an “American Bank” that has got many branches in several states. From data modeler point of view, analysis of the source data raises following questions.

  • How to group and organize the data?
  • How to avoid de-normalization since employee’s residential data like state name, county Name, city Name are repeated in most of the records.
  • What sort of relationship is between employer and employee?
  • What sort of relationship is between the employee and state, city, county?

Sample Source Data:

State NameCounty NameCity NameEmp First NameEmp Last NameEmp Full NameManager NameEmployer NameDate TimeStamp
New YorkShelbyManhattanPaulYoungPaul YoungAmerican Bank of New York1/1/2005 11:23:31 AM
FloridaJeffersonPanama CityChrisDavisChris DavisPaul YoungAmerican Bank of Florida1/1/2005 11:23:31 AM
CaliforniaMontgomerySan HoseLouisJohnsonLouis JohnsonPaul YoungAmerican Bank of California1/1/2005 11:23:31 AM
New JerseyHudsonJersey CitySamMathewSam MathewPaul YoungAmerican Bank of New Jersey1/1/2005 11:23:31 AM
New YorkShelbyManhattanNancyRobinsonNancy RobinsonPaul YoungAmerican Bank of New York1/1/2005 11:23:31 AM
FloridaJeffersonPanama CitySheelaShellumSheela ShellumChris DavisAmerican Bank of Florida1/1/2005 11:23:31 AM
CaliforniaMontgomeryShelbyJeffBillJeff BillLouis JohnsonAmerican Bank of California1/1/2005 11:23:31 AM
New JerseyHudsonJersey CityJohnBurrellJohn BurrellSam MathewAmerican Bank of New Jersey1/1/2005 11:23:31 AM

In the next page, we will discuss how to resolve these problems in order to design a good relational data model.

Next: Relational Data Modeling Example – Part 2

Relational Data Modeling Tutorial

Relational (OLTP) Data Modeling:

Relational Data Model is a data model that views the real world as entities and relationships.

Entities are concepts, real or abstract about which information is collected. Entities are associated with each other by relationship and attributes are properties of entities.

Business rules would determine the relationship between each of entities in a data model.

The goal of relational data model is to normalize (avoid redundancy)data and to present it in a good normal form.

While working with relational data modeling, a data modeler has to understand 1st normal form through 5th normal form to design a good data model.

Following are some of the questions that arise during the development of entity relationship data model. A complete business and data analysis would lead to design a good data model.

  • What will be the future scope of the data model? How to normalize the data?
  • How to group attributes in entities?
  • How to name entities, attributes, keys groups, relationships?
  • How to connect one entity to other? What sort of relationship is that?
  • How to validate the data?
  • How to normalize the data?
  • How to present reports?

Next: Relational Data Modeling Example

Physical Data Modeling Tutorial

Physical Data Modeling:

Physical data model includes all required tables, columns, relationships, database properties for the physical implementation of databases. Database performance, indexing strategy, physical storage and de-normalization are important parameters of a physical model.

Logical data model is approved by functional team and there-after development of physical data model work gets started. Once physical data model is completed, it is then forwarded to technical teams (developer, group lead, DBA) for review. The transformations from logical model to physical model include imposing database rules, implementation of referential integrity, super types and sub types etc.

In the example, the entity names have been changed to table names, changed attribute names to column names, assigned nulls and not nulls, and datatype to each column.

Example of Physical Data Model:

Example of Physical Data Mode

 

Logical Data Modeling Tutorial

Logical Data Modeling:

This is the actual implementation and extension of a conceptual data model. A Logical data model is the version of a data model that represents the business requirements (entire or part) of an organization and is developed before the physical data model.

As soon as the conceptual data model is accepted by the functional team, development of logical data model gets started. Once logical data model is completed, it is then forwarded to functional teams for review. A sound logical design should streamline the physical design process by clearly defining data structures and the relationships between them. A good data model is created by clearly thinking about the current and future business requirements.
Logical data model includes all required entities, attributes, key groups, and relationships that represent business information and define business rules.

In the example, we have identified the entity names, attribute names, and relationship. For detailed explanation, refer to relational data modeling.

Example of Logical Data Model:

In the example, we have identified the entity names, attribute names, and relationship. For detailed explanation, refer to relational data modeling.

 

Enterprise Data Modeling Tutorial

Enterprise Data Modeling:

The development of a common consistent view and understanding of data elements and their relationships across the enterprise is referred to as Enterprise Data Modeling. This type of data modeling provides access to information scattered throughout an enterprise under the control of different divisions or departments with different databases and data models.

Enterprise Data Modeling is sometimes called as global business model and the entire information about the enterprise would be captured in the form of entities.

Data Model Highlights:

When a enterprise logical data model is transformed to a physical data model, super types and sub types may not be as is. i.e. the logical and physical structure of super types and sub types may be entirely different. A data modeler has to change that according to the physical and reporting requirement.

When a enterprise logical data model is transformed to a physical data model, length of table names, column names etc may exceed the maximum number of the characters allowed by the database. So a data modeler has to manually edit that and change the physical names according to database or organization’s standards.

One of the important things to note is the standardization of the data model. Since a same attribute may be present in several entities, the attribute names and data types should be standardized and a conformed dimension should be used to connect to the same attribute present in several tables.

Standard Abbreviation document is a must so that all data structure names would be consistent across the data model.

Consider an example of a bank that contains different line of businesses like savings, credit card, investment, loans and so on.

In example, enterprise data model contains all entities, attributes, relationships, from lines of businesses savings, credit card, investment and loans.

Example of Enterprise Data Model:

 

Example of Enterprise Data Model

Conceptual Data Modeling Tutorial

Conceptual Data Modeling:

Conceptual data model includes all major entities and relationships and does not contain much detailed level of information about attributes and is often used in the INITIAL PLANNING PHASE.

Conceptual data model is created by gathering business requirements from various sources like business documents, discussion with functional teams, business analysts, smart management experts and end users who do the reporting on the database. Data modelers create conceptual data model and forward that model to functional team for their review.

Conceptual Data Model – Highlights:
  • CDM is the first step in constructing a data model in top-down approach and is a clear and accurate visual representation of the business of an organization.
  • CDM visualizes the overall structure of the database and provides high-level information about the subject areas or data structures of an organization.
  • CDM discussion starts with main subject area of an organization and then all the major entities of each subject area are discussed in detail.
  • CDM comprises of entity types and relationships. The relationships between the subject areas and the relationship between each entity in a subject area are drawn by symbolic notation (IDEF1X or IE). In a data model, cardinality represents the relationship between two entities. i.e. One to one relationship, or one to many relationship or many to many relationship between the entities.
  • CDM contains data structures that have not been implemented in the database.
  • In CDM discussion, technical as well as non-technical team projects their ideas for building a sound logical data model.

Consider an example of a bank that contains different line of businesses like savings, credit card, investment, loans and so on.

In example diagram below, conceptual data model contains major entities from savings, credit card, investment and loans. Conceptual data modeling gives an idea to the functional and technical team about how business requirements would be projected in the logical data model.

Conceptual Data Modeling – Example diagram:

 

Conceptual Data Modeling - Example diagram: