Comparison of Logical and Physical Data Modeling

Logical | Physical Data Modeling:

When a data modeler works with the client, his title may be a logical data modeler or a physical data modeler or combination of both. A logical data modeler  designs the data model to suit business requirements, creates and maintains the lookup data, compares the versions of data model, maintains change log, generate reports from data  model and whereas a physical data modeler has to know about the source and target databases properties.

A physical data modeler should know the technical-know-how  to create data models from existing databases and to tune the data models with referential integrity, alternate keys, indexes and how to match indexes to SQL code. It  would be good if the physical data modeler knows about replication, clustering and so on.

The differences between a logical data model and physical data model is shown below.

Logical vs Physical Data Modeling

[ultimatetables 11 /]

 

Data Modeling Subject Area, Default, Domain, Rules & Constraints

Logical and Physical Data Modeling Objects:

To become a data modeler, you need to understand the concepts of a database. Before proceeding further, please visit our topics listed under the section “Database and Data Modeling” to get a fair knowledge about the database. The following table briefs about the objects used in constructing the data model especially in domains, rules, check constraints, and subject area.

What is a Logical Data Model?

This is a business presentation of objects in a database which represents the business requirements (entire or part) of an organization. Usually object names are very descriptive and Supertypes / Subtypes, relationships between different objects are shown, which is very easy for every one to understand the business of the organization.

What is a Physical Data Model?

Physical Data Model contains most of the objects present in the database. From Developers perspective, it shows the table name, column name,  data type, null, not null, unique constraint, primary key constraints, foreign key constraints, which helps them to code.

Objects used in a Data Model


Data Model Type: Logical

Data Model Objects: Subject Area or Work Space

Explanation: 

In a data model, there is one main subject area which comprises all objects present in all subject areas and other subject areas based on their processes or business domains. Each subject area contains objects, which are relevant to that subject area and the subject area is very useful in understanding the data model and to generate reports and PRINT OUTS based on main subject areas or other subject areas. In a telecommunication data model, there may be several subject areas like Service Request, Service Order, Ticketing and Main Subject Area. In a Mortgage data model, there may be several subject areas like borrower, loan, under writing and main subject area. Usually subject areas are created on main business processes. In Telecommunication (telephone service subscription by customer), service request is a process to get the request from the customer through phone, email, fax etc. Service Order is the next process to approve the service request and provide telephone line subscription to customers. Ticketing is a process by which complaints are gathered from the customer and problems are resolved.


Data Model Type: Physical

Data Model Objects: Subject Area or Work Space

Explanation: 

It is a copy of the logical subject area but some objects like supertype and sub types objects may not be AS IS like the logical subject area.


Data Model Type: Logical

Data Model Objects: Entity

Explanation: 

It is the business presentation of a table present in a database. Example: COUNTRY


Data Model Type: Physical

Data Model Objects: Table

Explanation: 

It is comprised of rows & columns, which stores data in a database. Example: CNTRY


Data Model Type: Logical

Data Model Objects: Attribute

Explanation: 

It is the business presentation of a column present in a database. Example: Country Code, Country Name.


Data Model Type: Physical

Data Model Objects: Column

Explanation: 

It is a data item, which stores data for that particular item. Example: CNTRY_CD, CNTRY_NM.


Data Model Type: Logical

Data Model Objects: Default

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Default

Explanation: 

When user input is not present, default value attached with the particular column is inserted into that column.

Step:

  • First, you have to create a default object in the data model.
  • You have to attach the default object with the column.
  • When you generate scripts from the physical data model, automatically, default will be attached to the column.

Example 1:

You may need a situation, where system date and time should be inserted when the record is inserted. With Oracle, you can attach SYSDATE to that column.

Column name: TODAY;

Datatype: DATE;

Default Syntax : DEFAULT;

Oracle Default: SYSDATE TODAY DATE DEFAULT SYSDATE

Example 2:

You many need to know about the schema name who inserted that record.

Column name: SCHEMA_NAME;

Datatype: VARCHAR2(30);

Default Syntax: DEFAULT;

Oracle Default: USER SCHEMA_NAME VARCHAR2(30) DEFAULT USER


Data Model Type: Logical

Data Model Objects: Domain

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Domain

Explanation: 

When you create a data model, there are several columns in tables, which are code, identifier, indicators, descriptive columns, date columns, NOT NULL columns, Primary key columns etc. To make it consistent across the data model, we can use domains.

Steps:

  • First, you have to create a domain object in the data model.
  • You have to attach the domain object with the column.

Example:

For a DESCRIPTION column, you can create a domain which has the following (NOT NULL, Datatype as VARCHAR2 (200)).

You can attach this domain to all descriptive columns present in tables. So every descriptive column present in the table will have NOT NULL as the constraint and datatype as VARCHAR2 (200).


Data Model Type: Logical

Data Model Objects: Check Constraint Rule

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Check Constraint Rule

Explanation: 

Steps:

  • First, you have to create a rule object in the data model.
  • You have to attach the rule object with the column or domain.

Check Constraint Rule can be imposed on columns like:

  • Example 1: Indicator Columns: Yes or NO
  • Example 2: Gender Columns: Male or Female
  • Example 3: Marital Status: Married or Single

Data Model Type: Logical

Data Model Objects: NULL

Explanation:

There is no name for this NULL either in logical or physical data model. NULL is an option so that it allows NULL values for that column.


Data Model Type: Physical

Data Model Objects: NULL

Explanation:

Column allows NULL VALUES (Values can be empty).


Data Model Type: Logical

Data Model Objects: Not Null Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Not Null Constraint

Explanation:

Column should always contain data.


Data Model Type: Logical

Data Model Objects: Unique Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Unique Constraint

Explanation:

Non NULL Values should be different from each other.


Data Model Type: Logical

Data Model Objects: Primary Key Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Primary Key Constraint

Explanation:

Unique Constraint + Not Null Constraint.


Data Model Type: Logical

Data Model Objects: Foreign Keys

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Foreign Keys

Explanation:

This is a constraint imposed on the child table. Whatever values are present in the child table, their corresponding values should be present in the parent table. This constraint can be imposed on one column or group of columns and NULL values are allowed in child table.


Data Model Type: Logical

Data Model Objects: Relationships

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Relationships

Explanation:

Identifying, Non-Identifying, Self Relationship for Non Identifying relationship, M:N relationship.


Data Model Type: Logical

Data Model Objects: Sequence

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Sequence

Explanation:

To generate a unique number, sequence is used.


Data Model Type: Logical

Data Model Objects: Views, Synonyms

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Views, Synonyms

Explanation:

Usually development team and DBA team does that.


Data Model Type: Logical

Data Model Objects: Procedure, Function, Packages, Triggers, Materialized Views

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Procedure, Function, Packages, Triggers, Materialized Views

Explanation:

Usually developer write these and some times data modelers implement that in the new data model.


Data Model Type: Logical

Data Model Objects: Indexes and Unique Indexes

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Indexes and Unique Indexes

Explanation:

Index is used for fastest retrieval of data from the database. Whenever a primary key constraint is created on a table, index is also created.  When we use a column in the where clause, data modelers index it after getting guidance from the development team and DBA team. Unique Index is created when there is a need for unique values in that column.


Data Model Type: Logical

Data Model Objects: SuperType and SubType

Explanation:

Supertype is an entity type that has got relationship (parent to child relationship) with one or more subtypes and it contains the attributes that are common to its subtypes.

Subtypes are subgroups of the supertype entity and have unique attributes, they will be different from each subtypes.

Supertypes and subtypes are parent and child entities respectively and the primary keys of supertype and subtype are always identical.

For detailed explanation, visit our website in Supertype and Subtype.


Data Model Type: Physical

Data Model Objects: SuperType and SubType

Explanation:

Visual representation of supertypes and subtypes will not be identical in logical and physical data model. In logical data model, it explains about the business, but the same cannot be incorporated in physical data model.


 

Identifying and Non-Identifying Relationships

Analysis of Data and creating Tables & Relationships:

Usually in a data model, Tables (Entities) looks like rectangular or square type boxes, which contains columns(attributes) and these tables are connected by lines (relationships). Identifying relationship and Non-Identifying relationship are two types of relationships. Based on cardinality, relationship are broadly classified into one to one relationship, one to many relationship and many to many relationship. Occasionally, there may arise a situation where you need to create a foreign key column by connecting a relationship to primary key column of the same table, which is called as self-referential or recursive relationship.

With the following example, we will explain you about how you can learn and practice different types of relationships with Quest’s TOAD Software tool.

Primary Key:

If the attribute (column) has to be unique and should contain NON-NULL values, data modelers create primary key constraint on that particular attribute and it is called the primary key attribute of that entity.

Foreign Key:

Attributes within same entity or attributes from different entities will be connected by a connector (foreign key) to the primary key attribute.

Composite Primary key:

If more than one column is present as a primary key, then it is called as primary key.

Let us consider a software company, which is involved in implementing projects. Assume that the client has sent the following information to the software company and software company has to create logical and physical data models and implement that in a database.

Sample Data:

Employee_Projects_Sample_Data

Data Modeler analyzes data, talks to business analysts or smart management experts in the client and understands the requirements.

Following section describes business requirements.

Business Requirements:
  • One employee has only one manager and CEO doesn’t have any manager.
  • Manager Numbers are also Employee Numbers.
  • One employee can work in only one department and one or more employees can work in the same department.
  • One or more employees can have the same title.
  • One employee can work in one or more projects and one project can have one or more employees.

Following sections, how business requirements are transformed to a data model.

Analysis:

How many tables should be present in the data model?
  1. We have to create EMPLOYEE table and Employee_No should be the primary key of the table and later other stand alone columns and foreign key columns have to be added based on the relationship.
  2. Employee_No has one to many relationship with Department_No because one or more employees can work in the same department. So DEPARTMENT table should be created and Dept_No should be the primary key of the DEPARTMENT table. Dept_No has to be connected to EMPLOYEE table and Dept_No in EMPLOYEE table is a foreign key to DEPARTMENT table.
  3. Employee_No has one to many relationship with Title_Code because one or more employees can have the same title. So TITLE table should be created as a separate table and Title_Code should be the primary key of the TITLE table. Title_Code has to be connected to EMPLOYEE table and Title_Code in EMPLOYEE table is a foreign key to TITLE table.
  4. One or more employees can work in one or more projects. So a separate table PROJECT should be created first and Project_Code should be the primary key of the table. Since the relationship between EMPLOYEE and PROJECT table depicts a many-to-many relationship and this relationship cannot be resolved using these two tables alone. So you need to create one more table EMPLOYEE_PROJECT to resolve this.
  5. Employee_No has many to many relationship with Project_Code because one or more employees can work in one or more projects and one project can have one or more employees. So a separate table EMPLOYEE_PROJECT should be created which shows many to many relationship with employees and project. It should be connected to Employee table and Project Table. Here Employee_No and Project_Code are the composite primary key of this table.
  6. Since Employee_No and Manager_No are same, we should create a self referential or recursive relationship within the same EMPLOYEE table. So a separate foreign key column should be created by connecting with Employee_no and should be named as Manager_No.

Next section, explains how to create entities, attributes, primary keys, foreign keys, and how to implement these in a database by using TOAD SOFTWARE DATA MODELING TOOL.

 

Data Modeling Development Cycle

Data Modeling Development Cycle

Following are the important phases in the Data Model Development Life Cycle.

  1. Gathering Business Requirements
  2. Conceptual Data Modeling
  3. Logical Data Modeling
  4. Physical Data Modeling
  5. Development of the schema or the database
  6. Maintenance of the data model as per the changes.
1. Gathering Business Requirements – First Phase:

Data Modelers have to interact with business analysts to get the functional requirements and with end users to find out the reporting needs.

2. Conceptual Data Modeling (CDM) – Second Phase:

This data model includes all major entities, relationships and it will not contain much detail
about attributes and is often used in the INITIAL PLANNING PHASE. Please refer the diagram below and follow the link to learn more about Conceptual Data Modeling Tutorial.

Conceptual Data Modeling - Example diagram:

3. Logical Data Modeling (LDM) – Third Phase:

This is the actual implementation of a conceptual model in a logical data model. A logical data model is the version of the model that represents all of the business requirements of an organization. Please refer the diagram below and follow the link to learn more about Logical Data Modeling Tutorial.

In the example, we have identified the entity names, attribute names, and relationship. For detailed explanation, refer to relational data modeling.

4. Physical Data Modeling (PDM) – Fourth Phase:

This is a complete model that includes all required tables, columns, relationship, database properties for the physical implementation of the database. Please refer the diagram below and follow the link to learn more about Physical Data Modeling Tutorial.

Example of Physical Data Mode

5. Database – Fifth Phase:

DBAs instruct the data modeling tool to create SQL code from physical data model. Then the SQL code is executed in server to create databases.

 

Supertype and SubType

SuperType and SubType in Data Modeling:

At times, few entities in a data model may share some common properties (attributes) within themselves apart from having one or more distinct attributes. Based on the attributes, these entities are categorized as Supertype and Subtype entities.

Supertype is an entity type that has got relationship (parent to child relationship) with one or more subtypes and it contains attributes that are common to its subtypes.

Subtypes are subgroups of the supertype entity and have unique attributes, but they will be different from each subtype.

Supertypes and Subtypes are parent and child entities respectively and the primary keys of supertype and subtype are always identical.

E.g. People, Bank Account, Insurance, Asset, Liability, Credit Card.

When designing a data model for PEOPLE, you can have a supertype entity of PEOPLE and its subtype entities can be vendor, customer, and employee. People entity will have attributes like Name, Address, and Telephone number, which are common to its subtypes and you can design entities employee, vendor, and consumer with their own unique attributes. Based on this scenario, employee entity can be further classified under different subtype entities like HR employee, IT employee etc. Here employee will be the superset for the entities HR Employee and IT employee, but again it is a subtype for the PEOPLE entity.

A person can open a savings account or a certificate deposit (fixed deposit) in a bank. These accounts have attributes like account number, account opening date, account expiry date, principal amount, maturity amount, account balance, interest rate, checks issued, pre-cancellation fee etc. While designing a data model, you can create supertype parent entity as “Account” and subtype entities as Savings Account and Certificate Deposit. Account entity will store attributes like account number, interest rate that are common to savings account and certificate deposit entity. Savings account entity will have attributes like account balance and checks issued. While fixed deposit entity will have attributes like account opening, account expiry date, principal amount, maturity amount, pre-cancellation fee etc. When you design a logical data model in this manner, it provides more meaning to the business and the attributes are not cluttered in one table.

Supertype & Subtype Example:

One good example for explaining this SuperType & SubType is describing the Tax Terms related to Employees. Here Employee is the SuperType or the Parent Entity whereas the two Child Entities “FULL TIME EMPLOYEE W2” & “HOURLY EMPLOYEE 1099” are the SubTypes.

More Examples:

Insurance entity can act as supertype and entities like health insurance, life insurance, auto insurance, liability insurance, malpractice insurance etc., can be subtype entities.

Asset entity can be the supertype and entities like bank, real estate, auto, retirement fund, stocks, and life insurance etc., can be subtype entities.

Liability entity may be the supertype and entities like real estate, auto loan, alimony, other debts etc., may be subtype entities.

Credit card entity may be the supertype and entities like balance transfer cards, cash back credit cards, business credit card, student credit card, secured credit card etc., may be the subtype entities.

 

Reach US!!!

 

  • We provide online training in advanced OLTP Data Modeling and DIMENSIONAL Data Modeling.
  • We also teach the data structures with Data Analytics Software “R”.
  • We provide online Data Modeling Project Support when you get struck with projects that you are involved.
  • We can analyse your Business Requirements, understand and suggest solutions to create OLTP Data Models and Dimensional Data models.
  • We provide Data Modeling Interview Preparation Sessions with a lot of Data Modeling Interview Questions/Answers, which will help you to clear any interview.

 

If you are interested, please reach us at Training@LearnDataModeling.com or 91-9080157239

 

Dimensional Data Modeling

Dimensional Data Modeling comprises of one or more dimension tables and fact tables. Good examples of dimensions are location, product, time, promotion, organization etc. Dimension tables store records related to that particular dimension and no facts (measures) are stored in these tables.

For example, Product dimension table will store information about products (Product Category, Product Sub Category, Product and Product Features) and location dimension table will store information about location ( country, state, county, city, zip. A fact (measure) table contains measures (sales gross value, total units sold) and dimension columns. These dimension columns are actually foreign keys from the respective dimension tables.

Example of Dimensional Data Model:

Example of Dimensional Data Model

In the example figure above, sales fact table is connected to dimensions location, product, time and organization. It shows that data can be sliced across all dimensions and again it is possible for the data to be aggregated across multiple dimensions. “Sales Dollar” in sales fact table can be calculated across all dimensions independently or in a combined manner which is explained below.

  • Sales Dollar value for a particular product.
  • Sales Dollar value for a product in a location.
  • Sales Dollar value for a product in a year within a location.
  • Sales Dollar value for a product in a year within a location sold or serviced by an employee

In Dimensional data modeling, hierarchies for the dimensions are stored in the dimensional table itself. For example, the location dimension will have all of its hierarchies from country, state, county to city. There is no need for the individual hierarchical lookup like country lookup, state lookup, county lookup and city lookup to be shown in the model.

Dimensional Data Modeling is used for calculating summarized data. For example,
sales data could be collected on a daily basis and then be aggregated to the week level, the week data could be aggregated to the month level, and so on. The data can then be referred to as aggregate data. Aggregation is synonymous with summarization, and aggregate data is synonymous with summary data. The performance of dimensional data modeling can be significantly increased when materialized views are used. Materialized view is a pre-computed table comprising aggregated or joined data from fact and possibly dimension tables which also known as a summary or aggregate table.

Online Dimensional Data Modeling Training | Data Warehouse Training | Data Mart Training

Course Description:

The dimensional data modeling training explains how to design Data Ware House and Data Marts from OLTP data models.

What you can learn in our Dimensional Data Modeling training?

  • Why do you need a data warehouse?
  • What is the reason for a data mart?
  • Why is star schema? How is that different from dimensional data modeling?
  • Why customers prefer dimensional data modeling or snow flake modeling?
  • What are slowly changing dimensions?

To get more information about this training program, send an email to Training@LearnDataModeling.Com 0r call us @ 91-9080157239.

 

Relational Data Modeling Example – Part 2

Upon discussion with business analysts, data modeler can come up with the following conclusions regarding grouping and relationship between the data. These conclusions play a vital role in designing the data model as well as expanding for future scope.

  • Many cities can be in one county. City names will be unique across the country.
  • Many counties can be in one state. County names will be unique across the country.
  • Many states can be in USA. State names will be unique across the country USA.
  • One employee can work with many branches at same time.
  • For some employees, managers may not be there.

In order to implement the above decisions, relational data modeling is done in the following manner.

  • To achieve normalization, relevant attributes of employee, employer lookup, state lookup, county lookup and city lookup tables should be grouped and created.
  • In order to validate the data of employee table, employee table has been connected to state, county, and city lookups. Whenever state, county, city data is entered in employee table, data would be checked against respective lookup tables and correct data is stored. Hence there is no need to carry redundant data of state, county, city lookup in employee table.
  • All tables are identified by primary keys(PK). So data can be uniquely identified from tables.
  • Records can be inserted or updated directly in the respective lookup table. For example if a state name changes, then the change will be updated only in the state lookup, hence this change will not affect other tables like employee.
  • Since one employee can work in many branches at the same time, table EmployeeEmployerXREF has been created and it resolves many to many relationships.
  • Since an employee can be a manager in many occasions, column “manager identifier” has been added and becomes a foreign key to column employee identifier. The “manager identifier” column would contain the same value as of an employee identifier. Sometimes it may contain null values also. For example, Paul Young is the topmost person and doesn’t have any managers.
  • A new column DateTimeStamp has been added to all tables. This column gives the information about the date and time when the row was inserted or updated.

The completed relational data model is shown in Figure below and the corresponding data are shown in separate tables in the next page.

Relational Data Modeling Example Diagram:

Relational Data Modeling Example

 

Next: Relational Data Modeling Example – Part 3 

Relational Data Modeling Example

The sample source data shown in the table below provides the information about employees, their residential state, county, city and their employer names and manager names. It also describes employees working for an “American Bank” that has got many branches in several states. From data modeler point of view, analysis of the source data raises following questions.

  • How to group and organize the data?
  • How to avoid de-normalization since employee’s residential data like state name, county Name, city Name are repeated in most of the records.
  • What sort of relationship is between employer and employee?
  • What sort of relationship is between the employee and state, city, county?

Sample Source Data:

[ultimatetables 4 /]

In the next page, we will discuss how to resolve these problems in order to design a good relational data model.

Next: Relational Data Modeling Example – Part 2

Relational Data Modeling Tutorial

Relational (OLTP) Data Modeling:

Relational Data Model is a data model that views the real world as entities and relationships.

Entities are concepts, real or abstract about which information is collected. Entities are associated with each other by relationship and attributes are properties of entities.

Business rules would determine the relationship between each of entities in a data model.

The goal of relational data model is to normalize (avoid redundancy)data and to present it in a good normal form.

While working with relational data modeling, a data modeler has to understand 1st normal form through 5th normal form to design a good data model.

Following are some of the questions that arise during the development of entity relationship data model. A complete business and data analysis would lead to design a good data model.

  • What will be the future scope of the data model? How to normalize the data?
  • How to group attributes in entities?
  • How to name entities, attributes, keys groups, relationships?
  • How to connect one entity to other? What sort of relationship is that?
  • How to validate the data?
  • How to normalize the data?
  • How to present reports?

Next: Relational Data Modeling Example

1 2 3 4 5 6