Dimensional Data Modeling

Dimensional Data Modeling comprises of one or more dimension tables and fact tables. Good examples of dimensions are location, product, time, promotion, organization etc. Dimension tables store records related to that particular dimension and no facts (measures) are stored in these tables.

For example, Product dimension table will store information about products (Product Category, Product Sub Category, Product and Product Features) and location dimension table will store information about location ( country, state, county, city, zip. A fact (measure) table contains measures (sales gross value, total units sold) and dimension columns. These dimension columns are actually foreign keys from the respective dimension tables.

Example of Dimensional Data Model:

Example of Dimensional Data Model

In the example figure above, sales fact table is connected to dimensions location, product, time and organization. It shows that data can be sliced across all dimensions and again it is possible for the data to be aggregated across multiple dimensions. “Sales Dollar” in sales fact table can be calculated across all dimensions independently or in a combined manner which is explained below.

  • Sales Dollar value for a particular product.
  • Sales Dollar value for a product in a location.
  • Sales Dollar value for a product in a year within a location.
  • Sales Dollar value for a product in a year within a location sold or serviced by an employee

In Dimensional data modeling, hierarchies for the dimensions are stored in the dimensional table itself. For example, the location dimension will have all of its hierarchies from country, state, county to city. There is no need for the individual hierarchical lookup like country lookup, state lookup, county lookup and city lookup to be shown in the model.

Dimensional Data Modeling is used for calculating summarized data. For example,
sales data could be collected on a daily basis and then be aggregated to the week level, the week data could be aggregated to the month level, and so on. The data can then be referred to as aggregate data. Aggregation is synonymous with summarization, and aggregate data is synonymous with summary data. The performance of dimensional data modeling can be significantly increased when materialized views are used. Materialized view is a pre-computed table comprising aggregated or joined data from fact and possibly dimension tables which also known as a summary or aggregate table.

 

 

Data Modeling Standards | Modeling Data

Data Modeling standardization has been in practice for many years and the following section highlight the needs and implementation of the data modeling standards.

Standardization Needs | Modeling data:

Several data modelers may work on the different subject areas of a data model and all data modelers should use the same naming convention, writing definitions and business rules.

Nowadays, business to business transactions (B2B) are quite common, and standardization helps in understanding the business in a better way. Inconsistency across column names and definition would create a chaos across the business.

For example, when a data warehouse is designed, it may get data from several source systems and each source may have its own names, data types etc. These anomalies can be eliminated if a proper standardization is maintained across the organization.

Table Names Standardization:

Giving a full name to the tables, will give an idea about data what it is about. Generally, do not abbreviate the table names; however this may differ according to organization’s standards. If the table name’s length exceeds the database standards, then try to abbreviate the table names. Some general guidelines are listed below that may be used as a prefix or suffix for the table.

Examples:

Lookup – LKP – Used for Code, Type tables by which a fact table can be directly accessed.
e.g. Credit Card Type Lookup – CREDIT_CARD_TYPE_LKP

Fact – FCT – Used for transaction tables:
e.g. Credit Card Fact – CREDIT_CARD_FCT

Cross Reference – XREF – Tables that resolves many to many relationships.
e.g. Credit Card Member XREF – CREDIT_CARD_MEMBER_XREF

History – HIST – Tables the stores history.
e.g. Credit Card Retired History – CREDIT_CARD_RETIRED_HIST

Statistics – STAT – Tables that store statistical information.
e.g. Credit Card Web Statistics – CREDIT_CARD_WEB_STAT

Column Names Standardization:

Some general guidelines are listed below that may be used as a prefix or suffix for the column.

Examples:

Key – Key System generated surrogate key.
e.g. Credit Card Key – CRDT_CARD_KEY

Identifier – ID – Character column that is used as an identifier.
e.g. Credit Card Identifier – CRDT_CARD_ID

Code – CD – Numeric or alphanumeric column that is used as an identifying attribute.
e.g. State Code – ST_CD

Description – DESC – Description for a code, identifier or a key.
e.g. State Description – ST_DESC

Indicator – IND – to denote indicator columns.
e.g. Gender Indicator – GNDR_IND

Database Parameters Standardization:

Some general guidelines are listed below that may be used for other physical parameters.

Examples:

Index – Index – IDX – for index names.
e.g. Credit Card Fact IDX01 – CRDT_CARD_FCT_IDX01

Primary Key – PK – for Primary key constraint names.
e.g. CREDIT Card Fact PK01- CRDT-CARD_FCT_PK01

Alternate Keys – AK – for Alternate key names.
e.g. Credit Card Fact AK01 – CRDT_CARD_FCT_AK01

Foreign Keys – FK – for Foreign key constraint names.
e.g. Credit Card Fact FK01 – CRDT_CARD_FCT_FK01