Data Modeling Interview Questions

General Questions:

  • Have you exported objects from the data model to different database? What is the use of it?
  • What is aggregated data and why do you need aggregated data?
  • What is the role of a logical data modeler and physical data modeler?
  • How do you gather requirements for creating data model?
  • What is the naming convention you follow?
  • What are the different notations that can be used?
  • What is Cardinality?
  • What is identifying and non identifying relationship?
  • Why do you need phrases in a data model?
  • Have you generated reports for data models? Why do you need reports from data model?
  • How you will recover if data modeling file is corrupted?
  • What is Meta data? What kind of Meta data do you capture in a data model?
  • Explain a difficult situation in data modeling and how did you resolve it?

Logical Data Modeling Questions:

  • What is logical data modeling?
  • What is conceptual data modeling? Why do you need that?
  • What is enterprise data modeling? Why do you need that?
  • What is relational data modeling? When you will use relational data modeling?
  • When you will go for relational data modeling?
  • What is OLTP?
  • What is OLAP data modeling?
  • What is dimensional data modeling?
  • What is a dimension?
  • What is a slowly changing dimension? What are the types of slowly changing dimension? Which one you will use it?
  • What is super type and sub type? Where you will use in logical or physical data model? If you generate the super type and sub type structure it in a database, what are the consequences?
  • What is Star Schema? When you use Star Schema?
  • What is Snow Flake? When you will use Snow Flake Schema?
  • What is the difference between Snow Flake and Star Schema?
  • Can you compare logical data model with data base? How?
  • How you will find out the difference between the logical data models created previous month and current month?
  • What is Data Ware House?
  • What is Data Mart?
  • What is the difference between data ware house and data mart?

Physical Data Modeling Questions:

  • What is physical data modeling?
  • What is the difference between logical and physical data modeling?
  • Can you compare physical data model with data base? How?
  • How version controlling is done in a data model?

 

Reverse Engineering a Data Model

What is Reverse Engineering a Data Model?

Reverse Engineering is a process or function to create a logical and physical data model by extracting information from an existing data source. You may either create a physical data model or logical data model or combination of both. Data Source may be oracle data base or MS SQL Server or IBM DB2 or .sql files etc.

Example: Oracle database may contain objects likes tables, views, indexes, sequences, triggers, rules, data type, owner, constraints like primary keys, foreign keys, comments etc. You can specify the objects that you are interested to create a model. If you are not interested in triggers, then there is no need for you to check that option.

All you have to know is how to connect to the data source and whether necessary privileges are allocated to you.

If you do have any problems, while connecting to the data source, please ask system administrator or database administrator to resolve the issues.

Reasons for Reverse Engineering:

  • Data base may be present but there may be no data models and hence no logical data model and physical data model print outs for which project team may not be sure about the objects(tables, index, constraints etc) and relationships between them in data base.
  • Data Models may be present but it may not be in sync with data base.
  • When you integrate different sources(some sources may not have data models) into one database to create a data warehouse, then you have to represent these sources as data models (original AS IS). During discussions within the project team, it will give more clarity about the original structure (AS IS) and future (data warehouse) data models.
  • You can give logical data models to the business community (SME, Business Analysts etc.) and physical data models to development community (developers and dbas)

Note:

You can modify/update the reverse engineered model as per your needs. Alignment of tables and relationship on the work place may not be proper and printing out the models will not be in a presentable format. So please align it.

Sources from which data model can be reverse engineered: Oracle, MS SQL, PostGreSql, DB2 UDB, MYSQL, MS ACCESS, Sybase, xml files, sql files etc.

The Reverse Engineering wizard for each data base modeling software may be different. Learn how to connect to the databases, how to check different database objects that you want to reverse engineer and see whether you can save these settings.

How to create a data model from a Database?

For example, if you would like to reverse engineer database object from Oracle Production Server. To connect to oracle, you will need user name, password, and connection string. Then you have to check/click all different database objects(tables, columns, indexes, views, synonyms etc) that your are interested to reverse. Try to find out whether you can save these settings with a “NAME” (PRODUCTION_ORACLE_SERVER). The reason is when you connect to the same database next time, then there is no need for you to check all those options once again and you can use the saved information with the name “PRODUCTION_ORACLE_SERVER”.

How to create a data model by reverse engineering from scripts?

In the section, we are using the DDL scripts which have been created earlier in the page DDL Scripts from Data Model. Generated.SQL is the file name that contains DDL scripts which is stored in your hard drive. Now we will explain how to reverse engineer and create data model from the above script.

Reverse Engineering – Creating a data model from DDL scripts using Toad Data Modeler:

  • Click File→Reverse Engineering.Click Next button on the bottom of the page.
  • Select Data Source: Under “Select Data Source”, you can see the options “Database RE” and “DDL Script RE”.
  • Since we have created this .sql script from “oracle 11g release 2” database, highlight “oracle 11g release 2” and click next.
  • Select Data Provider: By default, “DDL script file data provider” is chosen. Click next.
  • Connecting:Browse file from your folder and select the file “C:\Users\Data Modeling\Documents\Toad Data Modeler\GeneratedScripts\Generated.SQL”. Click next.
  • Select Data Migrator: By default, it chooses “Reverse Engineering from oracle 11g release 2 DDL script of database”. Click next.
  • What to reverse: Check the boxes against different objects like tables, sequences, triggers etc. Click next
  • Options: Check different options like check constraints, indexes, unique constraints etc. Click next
  • Alias: If you want to use the same kind of settings in future, then save your settings, click save and enter some alias name. Click next.
  • Tables: You can check the tables that are required. Click Execute.

Now you can see the created data model with all database objects, relationships, and their properties in the work-space.

 

DDL Scripts from a Data Model

How to create DDL Scripts from Data Model?

  • Open the physical data model “Data_Modeling_Relationship.txl”.
  • Click menu Model/Verify Model.
  • A new window will be opened.
  • Check the appropriate objects like entities, functions, packages, materialized views, relationship, synonyms, data types etc which are listed in the new window.
  • Click “Verify” button on the bottom of the page.
  • It shows the errors and warning in a data model. If there are no errors and warnings.
  • Click Menu/Generate DDL Script.
  • A new window will be opened. It has tabs like what to generate, detail settings, referential integrity, Select List.
  • Check the appropriate objects like directories, editions, entities, functions, java, materialized views, packages, permissions to objects, procedures, relationships, sequences, synonyms, user data types, user groups, views etc which are listed in the new window.
  • In the bottom of the window, you can see buttons like Generate, Show Code, Show Log.
    • Click the button ‘Generate’. It will save the DDL scripts in a file under a folder. You can see the location of the file in “what to generate tab”. Example: C:\Users\Data Modeling\Documents\Toad Data Modeler\GeneratedScripts\Generated.SQL
    • If you have access to the databases, then you can directly execute those scripts and create database objects in database or database schema.
  • Click Show. It will retrieve the information from that file “Generated.sql” and code is shown below.

— Create tables section —————–

— Table DEPARTMENT

CREATE TABLE “DEPARTMENT”(

“DEPARTMENT_NO” Integer NOT NULL,

“DEPARTMENT_NAME” Varchar2(30 ) NOT NULL

)

/
— Add keys for table DEPARTMENT

ALTER TABLE “DEPARTMENT” ADD CONSTRAINT “Department_PK” PRIMARY KEY (“DEPARTMENT_NO”)

/

 

— Table TITLE

CREATE TABLE “TITLE”(

“TITLE_CODE” Integer NOT NULL,

“TITLE_DESCRIPTION” Varchar2(30 ) NOT NULL

)

/

 

— Add keys for table TITLE

ALTER TABLE “TITLE” ADD CONSTRAINT “TITLE_PK” PRIMARY KEY (“TITLE_CODE”)

/

 

— Table PROJECT

CREATE TABLE “PROJECT”(

“PROJECT_CODE” Integer NOT NULL,

“SOFTWARE_USED” Varchar2(30 ) NOT NULL

)

/

 

— Add keys for table PROJECT

ALTER TABLE “PROJECT” ADD CONSTRAINT “PROJECT_PK” PRIMARY KEY (“PROJECT_CODE”)

/

— Table EMPLOYEE

CREATE TABLE “EMPLOYEE”(

“EMPLOYEE_NO” Integer NOT NULL,

“DEPARTMENT_NO” Integer NOT NULL,

“TITLE_CODE” Integer NOT NULL,

“EMPLOYEE_NAME” Varchar2(30 ) NOT NULL,

“MANAGER_NO” Integer

)

/

 

— Add keys for table EMPLOYEE

ALTER TABLE “EMPLOYEE” ADD CONSTRAINT “EMPLOYEE_PK” PRIMARY KEY (“EMPLOYEE_NO”)

/

— Table EMPLOYEE_PROJECT

CREATE TABLE “EMPLOYEE_PROJECT”(

“EMPLOYEE_NO” Integer NOT NULL,

“PROJECT_CODE” Integer NOT NULL

)

/

 

— Add keys for table EMPLOYEE_PROJECT

ALTER TABLE “EMPLOYEE_PROJECT” ADD CONSTRAINT “EMPLOYEE_PROJECT_FK” PRIMARY KEY (“PROJECT_CODE”,”EMPLOYEE_NO”)

/

 

— Create relationships section ————————————————-

ALTER TABLE “EMPLOYEE” ADD CONSTRAINT “EMPLOYEE_FK01” FOREIGN KEY (“DEPARTMENT_NO”) REFERENCES “DEPARTMENT” (“DEPARTMENT_NO”)

/

 

ALTER TABLE “EMPLOYEE” ADD CONSTRAINT “EMPLOYEE_FKO2” FOREIGN KEY (“TITLE_CODE”) REFERENCES “TITLE” (“TITLE_CODE”)

/

 

ALTER TABLE “EMPLOYEE_PROJECT” ADD CONSTRAINT “EMPLOYEE_PROJECT_FK02” FOREIGN KEY (“PROJECT_CODE”) REFERENCES “PROJECT” (“PROJECT_CODE”)

/

 

ALTER TABLE “EMPLOYEE_PROJECT” ADD CONSTRAINT “EMPLOYEE_PROJECT_FK01” FOREIGN KEY (“EMPLOYEE_NO”) REFERENCES “EMPLOYEE” (“EMPLOYEE_NO”)

/

 

ALTER TABLE “EMPLOYEE” ADD CONSTRAINT “EMPLOYEE_FK03” FOREIGN KEY (“MANAGER_NO”) REFERENCES “EMPLOYEE” (“EMPLOYEE_NO”)

/

 

Compare Data Models and create DDL scripts

During the development cycle of data model, several changes/updates may take place over time depending upon the needs and requirement of that project. As and when required, different data model versions will be created with little or more changes between them. At one point of time, one may require to know the exact differences between different versions(older version and newer version) of data models. Knowing these will help in assuring the updates in the newer version are correct and staying intact with the needs of the requirement. Also you can apply these changes to the underlying database by creating “ALTER” scripts. In that case, data modeling tools can help us to sort out the differences between data models.

With the help of a data modeling tool you can compare:

  • between two data models
  • data model with the script
  • data model with the underlying database

Significance of comparing data models:

  • To know the differences between them Or to know what changes have been taken place in the latest version.
  • Whatever differences are there, we can create scripts and the same can be implemented on the database.

In this article, we are using Toad software to compare between two different data model versions as well as to create ALTER scripts for database update.

Note: In Toad Data Modeler Software, data models are saved with file extension of .txp.

The following table represents a data model with a file name “Domain_Rule_Default_v1.txp” in the first stage of a development of a data model.

Column NameData Type
FULL_NAMENAME_DOMAIN NN
STREET_NAMENAME_DOMAIN NN
CITY_NAMENAME_DOMAIN NN
GENDERChar(1) NN
SSNChar(9) NN
RECORD_DATEDate NN
USER_NAMENAME_DOMAIN NN

In the second stage of a development of a data model, you are requested to add another column “COUNTRY_NAME VARCHAR2 (30) NOT NULL)”.

Follow the guidelines below to insert the requested column in the data model using Toad Software.

  • Save the existing data model “Domain_Rule_Default_v1.txp” as “Domain_Rule_Default_v2.txp”.
  • Add a column “COUNTRY_NAME VARCHAR2 (30)” and assign NOT NULL to it.
  • Save the data model.

The following table represents the data model “Domain_Rule_Default_v2.txp” created in the second stage of a development of a data model.

Column NameData Type
FULL_NAMENAME_DOMAIN NN
STREET_NAMENAME_DOMAIN NN
CITY_NAMENAME_DOMAIN NN
GENDERChar(1) NN
SSNChar(9) NN
RECORD_DATEDate NN
USER_NAMENAME_DOMAIN NN
COUNTRY_NAMEVarchar2(30) NN

For comparing the data models:

  • Open TOAD Data Modeler.
  • Click menu File/Sync & Convert/Sync & Convert Wizard.
  • Action: On the new window that is appearing, click the button “Compare Models and Generate Alter Report”. Click next.
  • Select Left Side: Click the button Model File “*.txp, *.txl “. Select “Domain_Rule_Default_v2.txp”. Click next.
  • Select Right Side: Similarly click the button Model File “*.txp, *.txl “. Select “Domain_Rule_Default_v1.txp”. Click next.
  • Settings: You can select the comparison rules (Ignore text case, Ignore Model Name, Resolve Application Variables, Ignore empty Primary Keys, etc). Click next.
  • Select Object Types: You have compare all, compare all without graphics, compare settings buttons. Click compare all and. Click next.
  • Select Items: A new window appears and you can see that COUNTRY_NAME doesn’t exist in “Domain_Rule_Default_v1.txp”.

How to generate Alter Script after comparing the data models?

  • Open TOAD Data Modeler.
  • Click menu File/Sync & Convert/Sync & Convert Wizard.
  • Action: On the new window that is appearing, click the button “Generate Alter Script”. Click next.

Finally, the script generated by TOAD as follows:


ALTER TABLE “DOMAIN_RULE_DEFAULT” ADD (“COUNTRY_NAME” Varchar2(30) NOT NULL)


 

Comparison of Relational and Dimensional Data Modeling

Relational | Dimensional Data Modeling:

Relational Data Modeling is used in OLTP systems which are transaction oriented and Dimensional Data Modeling is used in OLAP systems which are analytically based. In a data warehouse environment, staging area is designed on OLTP concepts, since data has to be normalized, cleansed and profiled before loaded into a data warehouse or data mart. In OLTP environment, lookups are stored as independent tables in detail whereas these independent tables are merged as a single dimension in an OLAP environment like data warehouse.

Relational vs Dimensional Data Modeling

Relational Data ModelingDimensional Data Modeling
Data is stored in RDBMSData is stored in RDBMS or Multidimensional databases
Tables are units of storageCubes are units of storage
Data is normalized and used for OLTP. Optimized for OLTP processingData is de-normalized and used in data warehouse and data mart. Optimized for OLAP
Several tables and chains of relationships among themFew tables and fact tables are connected to dimensional tables
Volatile(several updates) and time variantNon volatile and time invariant
SQL is used to manipulate dataMDX is used to manipulate data
Detailed level of transactional dataSummary of bulky transactional data(Aggregates and Measures) used in business decisions
Normal ReportsUser friendly, interactive, drag and drop multidimensional OLAP Reports

 

Comparison of Logical and Physical Data Modeling

Logical | Physical Data Modeling:

When a data modeler works with the client, his title may be a logical data modeler or a physical data modeler or combination of both. A logical data modeler  designs the data model to suit business requirements, creates and maintains the lookup data, compares the versions of data model, maintains change log, generate reports from data  model and whereas a physical data modeler has to know about the source and target databases properties.

A physical data modeler should know the technical-know-how  to create data models from existing databases and to tune the data models with referential integrity, alternate keys, indexes and how to match indexes to SQL code. It  would be good if the physical data modeler knows about replication, clustering and so on.

The differences between a logical data model and physical data model is shown below.

Logical vs Physical Data Modeling

Logical Data ModelPhysical Data Model
Represents business information and defines business rulesRepresents the physical implementation of the model in a database.
EntityTable
AttributeColumn
Primary KeyPrimary Key Constraint
Alternate KeyUnique Constraint or Unique Index
Inversion Key EntryNon Unique Index
RuleCheck Constraint, Default Value
RelationshipForeign Key
DefinitionComment

 

Data Modeling Subject Area, Default, Domain, Rules & Constraints

Logical and Physical Data Modeling Objects:

To become a data modeler, you need to understand the concepts of a database. Before proceeding further, please visit our topics listed under the section “Database and Data Modeling” to get a fair knowledge about the database. The following table briefs about the objects used in constructing the data model especially in domains, rules, check constraints, and subject area.

What is a Logical Data Model?

This is a business presentation of objects in a database which represents the business requirements (entire or part) of an organization. Usually object names are very descriptive and Supertypes / Subtypes, relationships between different objects are shown, which is very easy for every one to understand the business of the organization.

What is a Physical Data Model?

Physical Data Model contains most of the objects present in the database. From Developers perspective, it shows the table name, column name,  data type, null, not null, unique constraint, primary key constraints, foreign key constraints, which helps them to code.

Objects used in a Data Model


Data Model Type: Logical

Data Model Objects: Subject Area or Work Space

Explanation: 

In a data model, there is one main subject area which comprises all objects present in all subject areas and other subject areas based on their processes or business domains. Each subject area contains objects, which are relevant to that subject area and the subject area is very useful in understanding the data model and to generate reports and PRINT OUTS based on main subject areas or other subject areas. In a telecommunication data model, there may be several subject areas like Service Request, Service Order, Ticketing and Main Subject Area. In a Mortgage data model, there may be several subject areas like borrower, loan, under writing and main subject area. Usually subject areas are created on main business processes. In Telecommunication (telephone service subscription by customer), service request is a process to get the request from the customer through phone, email, fax etc. Service Order is the next process to approve the service request and provide telephone line subscription to customers. Ticketing is a process by which complaints are gathered from the customer and problems are resolved.


Data Model Type: Physical

Data Model Objects: Subject Area or Work Space

Explanation: 

It is a copy of the logical subject area but some objects like supertype and sub types objects may not be AS IS like the logical subject area.


Data Model Type: Logical

Data Model Objects: Entity

Explanation: 

It is the business presentation of a table present in a database. Example: COUNTRY


Data Model Type: Physical

Data Model Objects: Table

Explanation: 

It is comprised of rows & columns, which stores data in a database. Example: CNTRY


Data Model Type: Logical

Data Model Objects: Attribute

Explanation: 

It is the business presentation of a column present in a database. Example: Country Code, Country Name.


Data Model Type: Physical

Data Model Objects: Column

Explanation: 

It is a data item, which stores data for that particular item. Example: CNTRY_CD, CNTRY_NM.


Data Model Type: Logical

Data Model Objects: Default

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Default

Explanation: 

When user input is not present, default value attached with the particular column is inserted into that column.

Step:

  • First, you have to create a default object in the data model.
  • You have to attach the default object with the column.
  • When you generate scripts from the physical data model, automatically, default will be attached to the column.

Example 1:

You may need a situation, where system date and time should be inserted when the record is inserted. With Oracle, you can attach SYSDATE to that column.

Column name: TODAY;

Datatype: DATE;

Default Syntax : DEFAULT;

Oracle Default: SYSDATE TODAY DATE DEFAULT SYSDATE

Example 2:

You many need to know about the schema name who inserted that record.

Column name: SCHEMA_NAME;

Datatype: VARCHAR2(30);

Default Syntax: DEFAULT;

Oracle Default: USER SCHEMA_NAME VARCHAR2(30) DEFAULT USER


Data Model Type: Logical

Data Model Objects: Domain

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Domain

Explanation: 

When you create a data model, there are several columns in tables, which are code, identifier, indicators, descriptive columns, date columns, NOT NULL columns, Primary key columns etc. To make it consistent across the data model, we can use domains.

Steps:

  • First, you have to create a domain object in the data model.
  • You have to attach the domain object with the column.

Example:

For a DESCRIPTION column, you can create a domain which has the following (NOT NULL, Datatype as VARCHAR2 (200)).

You can attach this domain to all descriptive columns present in tables. So every descriptive column present in the table will have NOT NULL as the constraint and datatype as VARCHAR2 (200).


Data Model Type: Logical

Data Model Objects: Check Constraint Rule

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Check Constraint Rule

Explanation: 

Steps:

  • First, you have to create a rule object in the data model.
  • You have to attach the rule object with the column or domain.

Check Constraint Rule can be imposed on columns like:

  • Example 1: Indicator Columns: Yes or NO
  • Example 2: Gender Columns: Male or Female
  • Example 3: Marital Status: Married or Single

Data Model Type: Logical

Data Model Objects: NULL

Explanation:

There is no name for this NULL either in logical or physical data model. NULL is an option so that it allows NULL values for that column.


Data Model Type: Physical

Data Model Objects: NULL

Explanation:

Column allows NULL VALUES (Values can be empty).


Data Model Type: Logical

Data Model Objects: Not Null Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Not Null Constraint

Explanation:

Column should always contain data.


Data Model Type: Logical

Data Model Objects: Unique Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Unique Constraint

Explanation:

Non NULL Values should be different from each other.


Data Model Type: Logical

Data Model Objects: Primary Key Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Primary Key Constraint

Explanation:

Unique Constraint + Not Null Constraint.


Data Model Type: Logical

Data Model Objects: Foreign Keys

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Foreign Keys

Explanation:

This is a constraint imposed on the child table. Whatever values are present in the child table, their corresponding values should be present in the parent table. This constraint can be imposed on one column or group of columns and NULL values are allowed in child table.


Data Model Type: Logical

Data Model Objects: Relationships

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Relationships

Explanation:

Identifying, Non-Identifying, Self Relationship for Non Identifying relationship, M:N relationship.


Data Model Type: Logical

Data Model Objects: Sequence

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Sequence

Explanation:

To generate a unique number, sequence is used.


Data Model Type: Logical

Data Model Objects: Views, Synonyms

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Views, Synonyms

Explanation:

Usually development team and DBA team does that.


Data Model Type: Logical

Data Model Objects: Procedure, Function, Packages, Triggers, Materialized Views

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Procedure, Function, Packages, Triggers, Materialized Views

Explanation:

Usually developer write these and some times data modelers implement that in the new data model.


Data Model Type: Logical

Data Model Objects: Indexes and Unique Indexes

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Indexes and Unique Indexes

Explanation:

Index is used for fastest retrieval of data from the database. Whenever a primary key constraint is created on a table, index is also created.  When we use a column in the where clause, data modelers index it after getting guidance from the development team and DBA team. Unique Index is created when there is a need for unique values in that column.


Data Model Type: Logical

Data Model Objects: SuperType and SubType

Explanation:

Supertype is an entity type that has got relationship (parent to child relationship) with one or more subtypes and it contains the attributes that are common to its subtypes.

Subtypes are subgroups of the supertype entity and have unique attributes, they will be different from each subtypes.

Supertypes and subtypes are parent and child entities respectively and the primary keys of supertype and subtype are always identical.

For detailed explanation, visit our website in Supertype and Subtype.


Data Model Type: Physical

Data Model Objects: SuperType and SubType

Explanation:

Visual representation of supertypes and subtypes will not be identical in logical and physical data model. In logical data model, it explains about the business, but the same cannot be incorporated in physical data model.


 

Identifying and Non-Identifying Relationships

Analysis of Data and creating Tables & Relationships:

Usually in a data model, Tables (Entities) looks like rectangular or square type boxes, which contains columns(attributes) and these tables are connected by lines (relationships). Identifying relationship and Non-Identifying relationship are two types of relationships. Based on cardinality, relationship are broadly classified into one to one relationship, one to many relationship and many to many relationship. Occasionally, there may arise a situation where you need to create a foreign key column by connecting a relationship to primary key column of the same table, which is called as self-referential or recursive relationship.

With the following example, we will explain you about how you can learn and practice different types of relationships with Quest’s TOAD Software tool.

Primary Key:

If the attribute (column) has to be unique and should contain NON-NULL values, data modelers create primary key constraint on that particular attribute and it is called the primary key attribute of that entity.

Foreign Key:

Attributes within same entity or attributes from different entities will be connected by a connector (foreign key) to the primary key attribute.

Composite Primary key:

If more than one column is present as a primary key, then it is called as primary key.

Let us consider a software company, which is involved in implementing projects. Assume that the client has sent the following information to the software company and software company has to create logical and physical data models and implement that in a database.

Sample Data:

Employee_Projects_Sample_Data

Data Modeler analyzes data, talks to business analysts or smart management experts in the client and understands the requirements.

Following section describes business requirements.

Business Requirements:
  • One employee has only one manager and CEO doesn’t have any manager.
  • Manager Numbers are also Employee Numbers.
  • One employee can work in only one department and one or more employees can work in the same department.
  • One or more employees can have the same title.
  • One employee can work in one or more projects and one project can have one or more employees.

Following sections, how business requirements are transformed to a data model.

Analysis:

How many tables should be present in the data model?
  1. We have to create EMPLOYEE table and Employee_No should be the primary key of the table and later other stand alone columns and foreign key columns have to be added based on the relationship.
  2. Employee_No has one to many relationship with Department_No because one or more employees can work in the same department. So DEPARTMENT table should be created and Dept_No should be the primary key of the DEPARTMENT table. Dept_No has to be connected to EMPLOYEE table and Dept_No in EMPLOYEE table is a foreign key to DEPARTMENT table.
  3. Employee_No has one to many relationship with Title_Code because one or more employees can have the same title. So TITLE table should be created as a separate table and Title_Code should be the primary key of the TITLE table. Title_Code has to be connected to EMPLOYEE table and Title_Code in EMPLOYEE table is a foreign key to TITLE table.
  4. One or more employees can work in one or more projects. So a separate table PROJECT should be created first and Project_Code should be the primary key of the table. Since the relationship between EMPLOYEE and PROJECT table depicts a many-to-many relationship and this relationship cannot be resolved using these two tables alone. So you need to create one more table EMPLOYEE_PROJECT to resolve this.
  5. Employee_No has many to many relationship with Project_Code because one or more employees can work in one or more projects and one project can have one or more employees. So a separate table EMPLOYEE_PROJECT should be created which shows many to many relationship with employees and project. It should be connected to Employee table and Project Table. Here Employee_No and Project_Code are the composite primary key of this table.
  6. Since Employee_No and Manager_No are same, we should create a self referential or recursive relationship within the same EMPLOYEE table. So a separate foreign key column should be created by connecting with Employee_no and should be named as Manager_No.

Next section, explains how to create entities, attributes, primary keys, foreign keys, and how to implement these in a database by using TOAD SOFTWARE DATA MODELING TOOL.

 

Data Modeling Development Cycle

1. Gathering Business Requirements – First Phase:

Data Modelers have to interact with business analysts to get the functional requirements and with end users to find out the reporting needs.

2. Conceptual Data Modeling (CDM) – Second Phase:

This data model includes all major entities, relationships and it will not contain much detail
about attributes and is often used in the INITIAL PLANNING PHASE. Please refer the diagram below and follow the link to learn more about Conceptual Data Modeling Tutorial.

Conceptual Data Modeling - Example diagram:

3. Logical Data Modeling (LDM) – Third Phase:

This is the actual implementation of a conceptual model in a logical data model. A logical data model is the version of the model that represents all of the business requirements of an organization. Please refer the diagram below and follow the link to learn more about Logical Data Modeling Tutorial.

In the example, we have identified the entity names, attribute names, and relationship. For detailed explanation, refer to relational data modeling.

4. Physical Data Modeling (PDM) – Fourth Phase:

This is a complete model that includes all required tables, columns, relationship, database properties for the physical implementation of the database. Please refer the diagram below and follow the link to learn more about Physical Data Modeling Tutorial.

Example of Physical Data Mode

5. Database – Fifth Phase:

DBAs instruct the data modeling tool to create SQL code from physical data model. Then the SQL code is executed in server to create databases.

 

Supertype and SubType

At times, few entities in a data model may share some common properties (attributes) within themselves apart from having one or more distinct attributes. Based on the attributes, these entities are categorized as Supertype and Subtype entities.

Supertype is an entity type that has got relationship (parent to child relationship) with one or more subtypes and it contains attributes that are common to its subtypes.

Subtypes are subgroups of the supertype entity and have unique attributes, but they will be different from each subtype.

Supertypes and Subtypes are parent and child entities respectively and the primary keys of supertype and subtype are always identical.

E.g. People, Bank Account, Insurance, Asset, Liability, Credit Card.

When designing a data model for PEOPLE, you can have a supertype entity of PEOPLE and its subtype entities can be vendor, customer, and employee. People entity will have attributes like Name, Address, and Telephone number, which are common to its subtypes and you can design entities employee, vendor, and consumer with their own unique attributes. Based on this scenario, employee entity can be further classified under different subtype entities like HR employee, IT employee etc. Here employee will be the superset for the entities HR Employee and IT employee, but again it is a subtype for the PEOPLE entity.

A person can open a savings account or a certificate deposit (fixed deposit) in a bank. These accounts have attributes like account number, account opening date, account expiry date, principal amount, maturity amount, account balance, interest rate, checks issued, pre-cancellation fee etc. While designing a data model, you can create supertype parent entity as “Account” and subtype entities as Savings Account and Certificate Deposit. Account entity will store attributes like account number, interest rate that are common to savings account and certificate deposit entity. Savings account entity will have attributes like account balance and checks issued. While fixed deposit entity will have attributes like account opening, account expiry date, principal amount, maturity amount, pre-cancellation fee etc. When you design a logical data model in this manner, it provides more meaning to the business and the attributes are not cluttered in one table.

More Examples:

Insurance entity can act as supertype and entities like health insurance, life insurance, auto insurance, liability insurance, malpractice insurance etc., can be subtype entities.

Asset entity can be the supertype and entities like bank, real estate, auto, retirement fund, stocks, and life insurance etc., can be subtype entities.

Liability entity may be the supertype and entities like real estate, auto loan, alimony, other debts etc., may be subtype entities.

Credit card entity may be the supertype and entities like balance transfer cards, cash back credit cards, business credit card, student credit card, secured credit card etc., may be the subtype entities.

1 2 3 4 5 6