Domain object, Default object and Check constraints

Why do we need to create Domains, Defaults & Check Constraints?

When you create domains and defaults and attach them to columns in a data model, you speed up model creation by reusing predefined datatypes for all similar columns. Metadata and data structures stay consistent across data models, which reduces conflicts. Check constraints enforce rules on columns in the data model as per your instructions.

In this section, you will see how to create domains, defaults, and check constraints in Toad Data Modeler for the sample data shown below. For the definitions and meaning of a domain, a default, and a check constraint, please refer to our earlier section Data Modeling Objects, listed under the category Data Modeling.

Sample Data:

Column Name  | Data Type
FULL_NAME    | VARCHAR2(30) NOT NULL
STREET_NAME  | VARCHAR2(30) NOT NULL
CITY_NAME    | VARCHAR2(30) NOT NULL
GENDER       | CHAR(1) NOT NULL
SSN          | CHAR(9) NOT NULL
RECORD_DATE  | DATE NOT NULL
USER_NAME    | VARCHAR2(30) NOT NULL

Sample Data Observation:

  • Columns FULL_NAME, CITY_NAME, STREET_NAME, and USER_NAME share the same datatype, VARCHAR2(30). So first create a domain with datatype VARCHAR2(30), then attach it to these four columns.
  • Column GENDER can take only two values, so create a check constraint.
  • Column SSN should contain only numeric values, so create a check constraint.
  • Column RECORD_DATE will store the system date, so create a default.
  • Column USER_NAME will store the name of the user who inserted or updated the record, so create a default.

Steps to be followed in Toad:

  • Go to Toad Data Modeler and create a table “DOMAIN_RULE_DEFAULT” as described in the section Create Data Modeling Objects.
  • Domain:
    • Creating a Domain: Go to menu “MODEL/DOMAIN”. In the new window, click Add. A domain is created by Toad with a default name. Change the caption and name to “NAME_DOMAIN” and change the datatype to VARCHAR2(30).
    • Attaching Domains: Attach the domain “NAME_DOMAIN” to FULL_NAME, CITY_NAME, STREET_NAME, and USER_NAME. While creating columns, you will see a list box for attaching domains. Select the domain “NAME_DOMAIN” and attach it to those columns.
  • Default:
    • Creating a Default for column RECORD_DATE: Go to menu “MODEL/DEFAULT”. In the new window, click Add. A default is created by Toad with a default name. In the General tab, change the caption and name to “TODAY_DATE_DEFAULT”. In the SQL tab, type “SYSDATE”.
    • Creating a Default for column USER_NAME: Create another default, RECORD_BY_USER_DEFAULT, and in the SQL tab, type “USER”.
    • Attaching Defaults to columns: Attach the default “TODAY_DATE_DEFAULT” to the column “RECORD_DATE”. While creating columns, you will see a list box for attaching defaults. Select the default “TODAY_DATE_DEFAULT” and attach it to the RECORD_DATE column. Follow the same procedure to attach the default “RECORD_BY_USER_DEFAULT” to the USER_NAME column.
  • Check Constraint:
    • Creating and attaching a Check Constraint: While creating the column GENDER, you will see a tab “Check Constraints”. Click Add. A check constraint is created by Toad with a default name. In the General tab, change the caption and name to “GENDER_CHECK”. In the SQL tab, type “GENDER IN ('M', 'F')”.
    • Do the same for the SSN column. In the SQL tab, type “SSN BETWEEN '000000001' AND '999999999'”.

This is how the table looks in a Data Model.

Column Name  | Data Type
FULL_NAME    | NAME_DOMAIN NN
STREET_NAME  | NAME_DOMAIN NN
CITY_NAME    | NAME_DOMAIN NN
GENDER       | Char(1) NN
SSN          | Char(9) NN
RECORD_DATE  | Date NN
USER_NAME    | NAME_DOMAIN NN

Note: NN means NOT NULL.

In a few columns, you see NAME_DOMAIN. The domain that you attached is displayed here instead of the data type.

DDL Script for this table:

CREATE TABLE "DOMAIN_RULE_DEFAULT" (
  "FULL_NAME" Varchar2(30) NOT NULL,
  "STREET_NAME" Varchar2(30) NOT NULL,
  "CITY_NAME" Varchar2(30) NOT NULL,
  "GENDER" Char(1) NOT NULL CONSTRAINT "GENDER_CHECK" CHECK (GENDER IN ('M', 'F')),
  "SSN" Char(9) NOT NULL CONSTRAINT "SSN_CHECK" CHECK (SSN BETWEEN '000000001' AND '999999999'),
  "RECORD_DATE" Date DEFAULT SYSDATE NOT NULL,
  "USER_NAME" Varchar2(30) DEFAULT USER NOT NULL
)
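As a quick sketch of how the two defaults behave, an INSERT that omits RECORD_DATE and USER_NAME lets Oracle fill them in (the sample values here are illustrative):

```sql
-- Sketch: the omitted columns receive their defaults (SYSDATE and USER)
INSERT INTO "DOMAIN_RULE_DEFAULT"
  ("FULL_NAME", "STREET_NAME", "CITY_NAME", "GENDER", "SSN")
VALUES
  ('JOHN SMITH', 'MAIN STREET', 'SPRINGFIELD', 'M', '123456789');
```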


 

Data Modeling Subject Area, Default, Domain, Rules & Constraints

Logical and Physical Data Modeling Objects:

To become a data modeler, you need to understand database concepts. Before proceeding further, please visit the topics listed under the section “Database and Data Modeling” to get a fair knowledge of databases. The following table gives a brief overview of the objects used in constructing a data model, especially domains, rules, check constraints, and subject areas.

What is a Logical Data Model?

This is a business presentation of the objects in a database, representing the business requirements (in whole or in part) of an organization. Object names are usually very descriptive, and supertypes/subtypes and the relationships between objects are shown, which makes it easy for everyone to understand the business of the organization.

What is a Physical Data Model?

A Physical Data Model contains most of the objects present in the database. From a developer's perspective, it shows the table names, column names, data types, NULL/NOT NULL options, unique constraints, primary key constraints, and foreign key constraints, which helps them write code.

Objects used in a Data Model


Data Model Type: Logical

Data Model Objects: Subject Area or Work Space

Explanation: 

In a data model, there is one main subject area, which comprises all objects present in all subject areas, plus other subject areas based on business processes or business domains. Each subject area contains the objects relevant to it. Subject areas are very useful for understanding the data model and for generating reports and printouts based on the main subject area or the other subject areas.

In a telecommunication data model, there may be several subject areas such as Service Request, Service Order, Ticketing, and the Main Subject Area. In a mortgage data model, there may be subject areas such as Borrower, Loan, Underwriting, and the Main Subject Area. Usually subject areas are created around main business processes. In telecommunications (telephone service subscription by a customer), Service Request is the process of receiving a request from the customer through phone, email, fax, etc. Service Order is the next process, which approves the service request and provides the telephone line subscription to the customer. Ticketing is the process by which complaints are gathered from customers and problems are resolved.


Data Model Type: Physical

Data Model Objects: Subject Area or Work Space

Explanation: 

It is a copy of the logical subject area, but some objects, such as supertypes and subtypes, may not appear exactly as they do in the logical subject area.


Data Model Type: Logical

Data Model Objects: Entity

Explanation: 

It is the business presentation of a table present in a database. Example: COUNTRY


Data Model Type: Physical

Data Model Objects: Table

Explanation: 

It comprises rows and columns, which store data in a database. Example: CNTRY


Data Model Type: Logical

Data Model Objects: Attribute

Explanation: 

It is the business presentation of a column present in a database. Example: Country Code, Country Name.


Data Model Type: Physical

Data Model Objects: Column

Explanation: 

It is a data item, which stores data for that particular item. Example: CNTRY_CD, CNTRY_NM.


Data Model Type: Logical

Data Model Objects: Default

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Default

Explanation: 

When user input is not present, the default value attached to the column is inserted into that column.

Steps:

  • First, create a default object in the data model.
  • Attach the default object to the column.
  • When you generate scripts from the physical data model, the default is automatically attached to the column.

Example 1:

You may have a situation where the system date and time should be inserted when a record is inserted. In Oracle, you can attach SYSDATE to that column.

Column name: TODAY

Datatype: DATE

Oracle default: SYSDATE

Resulting column definition: TODAY DATE DEFAULT SYSDATE

Example 2:

You may need to know the schema name of the user who inserted a record.

Column name: SCHEMA_NAME

Datatype: VARCHAR2(30)

Oracle default: USER

Resulting column definition: SCHEMA_NAME VARCHAR2(30) DEFAULT USER
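Both examples can be combined in one hedged DDL sketch (the table name here is an assumption, not from the original):

```sql
-- Hypothetical audit table using both Oracle defaults described above
CREATE TABLE AUDIT_LOG (
  TODAY       DATE         DEFAULT SYSDATE NOT NULL,  -- system date/time at insert
  SCHEMA_NAME VARCHAR2(30) DEFAULT USER    NOT NULL   -- schema that performed the insert
);
```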


Data Model Type: Logical

Data Model Objects: Domain

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Domain

Explanation: 

When you create a data model, tables contain many kinds of columns: codes, identifiers, indicators, descriptive columns, date columns, NOT NULL columns, primary key columns, etc. To keep them consistent across the data model, we can use domains.

Steps:

  • First, create a domain object in the data model.
  • Attach the domain object to the column.

Example:

For a DESCRIPTION column, you can create a domain defined as NOT NULL with datatype VARCHAR2(200).

You can attach this domain to all descriptive columns present in tables. Every descriptive column will then have the NOT NULL constraint and datatype VARCHAR2(200).
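A domain is a modeling-tool construct rather than a database object; in the generated Oracle DDL it simply expands to its datatype and constraints on every attached column. A sketch with hypothetical table and column names:

```sql
-- Every column attached to the DESCRIPTION domain expands to the same definition
CREATE TABLE PRODUCT (
  PRODUCT_DESC  VARCHAR2(200) NOT NULL,  -- from the DESCRIPTION domain
  CATEGORY_DESC VARCHAR2(200) NOT NULL   -- same domain, same expansion
);
```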


Data Model Type: Logical

Data Model Objects: Check Constraint Rule

Explanation: 

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Check Constraint Rule

Explanation: 

Steps:

  • First, create a rule object in the data model.
  • Attach the rule object to the column or domain.

A check constraint rule can be imposed on columns such as:

  • Example 1: Indicator columns: Yes or No
  • Example 2: Gender columns: Male or Female
  • Example 3: Marital status: Married or Single
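The three example rules above might generate DDL along these lines (the table and column names are assumed for illustration):

```sql
-- Sketch: check constraint rules attached to indicator-style columns
ALTER TABLE CUSTOMER ADD CONSTRAINT ACTIVE_IND_CHECK
  CHECK (ACTIVE_IND IN ('Y', 'N'));
ALTER TABLE CUSTOMER ADD CONSTRAINT GENDER_CHECK
  CHECK (GENDER IN ('M', 'F'));
ALTER TABLE CUSTOMER ADD CONSTRAINT MARITAL_STATUS_CHECK
  CHECK (MARITAL_STATUS_CD IN ('M', 'S'));
```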

Data Model Type: Logical

Data Model Objects: NULL

Explanation:

NULL has the same meaning in the logical and the physical data model. NULL is an option that allows NULL values in the column.


Data Model Type: Physical

Data Model Objects: NULL

Explanation:

Column allows NULL VALUES (Values can be empty).


Data Model Type: Logical

Data Model Objects: Not Null Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Not Null Constraint

Explanation:

Column should always contain data.


Data Model Type: Logical

Data Model Objects: Unique Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Unique Constraint

Explanation:

Non-NULL values must be different from each other.


Data Model Type: Logical

Data Model Objects: Primary Key Constraint

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Primary Key Constraint

Explanation:

Unique Constraint + Not Null Constraint.
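In DDL terms, the equivalence can be sketched like this (hypothetical table and column):

```sql
-- A primary key behaves like UNIQUE plus NOT NULL on the key column(s)
ALTER TABLE CUSTOMER ADD CONSTRAINT CUSTOMER_PK PRIMARY KEY (CUST_ID);
-- Roughly equivalent to:
--   ALTER TABLE CUSTOMER MODIFY (CUST_ID NOT NULL);
--   ALTER TABLE CUSTOMER ADD CONSTRAINT CUSTOMER_AK01 UNIQUE (CUST_ID);
```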


Data Model Type: Logical

Data Model Objects: Foreign Keys

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Foreign Keys

Explanation:

This is a constraint imposed on the child table: whatever values are present in the child table's foreign key column(s), corresponding values must be present in the parent table. The constraint can be imposed on one column or a group of columns, and NULL values are allowed in the child table.
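A minimal sketch, assuming CUSTOMER is the parent table and ORDERS the child:

```sql
-- Child rows must reference an existing parent; a NULL CUST_ID is still allowed
ALTER TABLE ORDERS ADD CONSTRAINT ORDERS_CUSTOMER_FK
  FOREIGN KEY (CUST_ID) REFERENCES CUSTOMER (CUST_ID);
```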


Data Model Type: Logical

Data Model Objects: Relationships

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Relationships

Explanation:

Identifying relationships, non-identifying relationships, self-relationships (a form of non-identifying relationship), and M:N relationships.


Data Model Type: Logical

Data Model Objects: Sequence

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Sequence

Explanation:

A sequence is used to generate unique numbers.
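A hedged Oracle sketch (the sequence, table, and column names are assumptions):

```sql
-- Sequence supplying a surrogate key value for each insert
CREATE SEQUENCE CUSTOMER_SEQ START WITH 1 INCREMENT BY 1;

INSERT INTO CUSTOMER (CUST_KEY, CUST_NM)
VALUES (CUSTOMER_SEQ.NEXTVAL, 'ALICE');
```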


Data Model Type: Logical

Data Model Objects: Views, Synonyms

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Views, Synonyms

Explanation:

Usually the development team and the DBA team create these.


Data Model Type: Logical

Data Model Objects: Procedure, Function, Packages, Triggers, Materialized Views

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Procedure, Function, Packages, Triggers, Materialized Views

Explanation:

Usually developers write these, and sometimes data modelers incorporate them in the new data model.


Data Model Type: Logical

Data Model Objects: Indexes and Unique Indexes

Explanation:

Same as Physical data model. Only the name changes.


Data Model Type: Physical

Data Model Objects: Indexes and Unique Indexes

Explanation:

An index is used for faster retrieval of data from the database. Whenever a primary key constraint is created on a table, an index is also created. When a column is used in WHERE clauses, data modelers index it after getting guidance from the development team and the DBA team. A unique index is created when the values in a column must be unique.
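Using the naming standards described later in this document, the two index types might be created as follows (table and column names are illustrative):

```sql
-- Plain index to speed up WHERE-clause lookups on an identifier column
CREATE INDEX CRDT_CARD_FCT_IDX01 ON CRDT_CARD_FCT (CRDT_CARD_ID);

-- Unique index where the column's values must not repeat
CREATE UNIQUE INDEX CRDT_CARD_FCT_AK01 ON CRDT_CARD_FCT (CRDT_CARD_NBR);
```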


Data Model Type: Logical

Data Model Objects: SuperType and SubType

Explanation:

A supertype is an entity type that has a (parent-to-child) relationship with one or more subtypes and contains the attributes that are common to its subtypes.

Subtypes are subgroups of the supertype entity. Each subtype has unique attributes that distinguish it from the other subtypes.

Supertypes and subtypes are parent and child entities respectively, and the primary keys of the supertype and its subtypes are always identical.

For a detailed explanation, refer to our section Supertype and Subtype.


Data Model Type: Physical

Data Model Objects: SuperType and SubType

Explanation:

The visual representation of supertypes and subtypes is not identical in the logical and physical data models. The logical data model explains the business, but the same presentation cannot always be carried over into the physical data model.
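One common way to carry a supertype/subtype structure into the physical model is to give each subtype table the supertype's primary key; a sketch with hypothetical entities:

```sql
-- Supertype holds the common attributes
CREATE TABLE PARTY (
  PARTY_ID NUMBER        PRIMARY KEY,
  PARTY_NM VARCHAR2(100) NOT NULL
);

-- Each subtype shares the supertype's primary key and adds its own attributes
CREATE TABLE PERSON (
  PARTY_ID NUMBER PRIMARY KEY REFERENCES PARTY (PARTY_ID),
  BIRTH_DT DATE
);

CREATE TABLE ORGANIZATION (
  PARTY_ID NUMBER PRIMARY KEY REFERENCES PARTY (PARTY_ID),
  TAX_ID   VARCHAR2(20)
);
```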


 

Data Modeling Development Cycle

1. Gathering Business Requirements – First Phase:

Data Modelers have to interact with business analysts to get the functional requirements and with end users to find out the reporting needs.

2. Conceptual Data Modeling (CDM) – Second Phase:

This data model includes all major entities and relationships but does not contain much detail about attributes; it is often used in the initial planning phase. Please refer to the diagram below and follow the link to learn more in the Conceptual Data Modeling Tutorial.

Conceptual Data Modeling - Example diagram:

3. Logical Data Modeling (LDM) – Third Phase:

This is the implementation of the conceptual model as a logical data model. A logical data model is the version of the model that represents all of the business requirements of an organization. Please refer to the diagram below and follow the link to learn more in the Logical Data Modeling Tutorial.

In the example, we have identified the entity names, attribute names, and relationships. For a detailed explanation, refer to relational data modeling.

4. Physical Data Modeling (PDM) – Fourth Phase:

This is a complete model that includes all required tables, columns, relationships, and database properties for the physical implementation of the database. Please refer to the diagram below and follow the link to learn more in the Physical Data Modeling Tutorial.

Example of a Physical Data Model

5. Database – Fifth Phase:

DBAs instruct the data modeling tool to generate SQL code from the physical data model. The SQL code is then executed on the server to create the database.

 

What is Data Model Versioning?

Data Model Versioning | Concepts | Advantages | Uses:

Data Model Versioning is the process of assigning unique version names or unique version numbers to different stages of a data model. Data modeling tools may have this versioning facility built into the software, and versions of data models may be stored in their repository.

The following information just gives an idea about how to do versioning manually.

Example:

a) <Project Name>_<mmddyyyy>; Banking_01012010

b) <Project Name>_<Version Number>; Banking_v1

During the development cycle of the data model, SMEs (Subject Matter Experts) or business analysts will request the data modeling team to create a new subject area for a new line of business or to modify an existing subject area.

In the initial stages of development, whenever a new subject area is added or changes are made to the data model, the changes are immediately sent to the “project team” by email. Data models are stored on a shared network where the “project team” has privileges to view the data model and “data modelers” have privileges to update it.

Practical Example:

To start with, a bank may have “savings account” as its line of business. Later it may add a different line of business, “credit card”. Initially the data model will have only one subject area, “Savings Account”. When credit cards are added to the bank's business, SMEs or business analysts will analyze the change and send a new requirement to the data modeling team to add entities to the logical data model. They may also send a few changes (add attribute/delete attribute) to the existing “Savings Account” subject area. To keep track of these changes, we need versioning of the data model.

Versioning Example:

Assume that this data model work will be completed within 6 months starting from Jan 2010 and ending in June 2010.

In the shared network allocated for the project team, create a folder called “Data Modeling”. Under data modeling, create sub folders like “Jan 2010”, “Feb 2010”, “Mar 2010”, “Apr 2010”, “May 2010” and “June 2010”. The logic behind this is data model updates done in that particular month are stored under that month folder.

For data models, start with version V1 in January and move to V2 in February. For further changes within a particular month, suffix “V1” or “V2” with .1, .2, etc.

  • Date: 1st January 2010: A new requirement to create the “Savings Account” data model is given by SMEs or business analysts. Assume the project name is “Banking”. Create a data model named “Banking_v1”, add the necessary entities, and save it under the “Jan 2010” folder.
  • Date: 25th January 2010: A few changes are sent by SMEs for the “Savings Account” subject area. Save the existing data model as “Banking_v1.1”, apply the changes, and store it under the “Jan 2010” folder. Now you have two versions of the data model.
  • Date: 25th February 2010: A new requirement about “Credit Card” is sent by SMEs. Save the latest model “Banking_v1.1” as “Banking_v2” and apply the changes. Now you have three versions of the data model. Store it under the “Feb 2010” folder.
Advantages:
  • Data model changes can be tracked, and weekly or monthly changes can be sent to the project team by email.
  • The data model can be compared with the database and brought in sync with it.
  • Changes can easily be rolled back (removing the changes). If SMEs or business analysts are not sure, these rollbacks happen quite often.
  • Reports can be generated from the data model and sent to the “documentation team”.
  • Clarity within the project team.
  • Sometimes the project team may be interested in a particular version of the data model. It's easier to send that particular version.
Interview Question:
  • How do you implement data model versioning?

Data Model Repository

What is a Data Model Repository? 

A data modeling repository is a storage area where the metadata of a data model is stored. The data stored differs by software perspective, organization perspective, and usage perspective. The repository can be stored anywhere, either in a database or locally on any system.

Example: an ETL repository and a data modeling repository are different from the software/usage perspective. In a data modeling repository, metadata related to data modeling is stored; in an ETL repository, metadata related to ETL (Extraction, Transformation, and Loading) is stored. An organization stores only the metadata it is interested in.

From the data modeling perspective, data models and the relevant metadata are stored in the repository.

When several data modelers in an organization have to access the same data models concurrently, the organization buys a repository; otherwise, the metadata is stored on a shared network.

When data modeling software is bought with a repository tool, system administrators install the repository and share the username/password with the “Data Model Repository Administrator”, who has super privileges.

The administrator creates usernames and allocates privileges on data models for business analysts, SMEs, data modelers, application developers (development/reporting), DBAs, business users, managers, etc.

Examples of Privileges Allocated:
  • Creating and updating logical data models. Based on the needs, privileges are allocated on all, a few, or one of the data models present in the organization.
  • Creating and updating physical data models, with privileges allocated the same way.
  • Creating and updating both logical and physical data models, with privileges allocated the same way.
  • Viewing the logical data model, the physical data model, or both.
  • Creating and updating a particular database object (tables, views, indexes, etc.).

All you have to know is how to log in and log out, what privileges are allocated to you, the different menus present in the repository, and how to work within it.

Uses of Repository:

  • Helps data modelers work on the same data model consistently and collaboratively and merge all work activities into the same data model.
  • Creating different versions of the data model to keep track of changes.
  • Generating reports from the repository.
  • Applying security to the data model.
  • Backup and recovery of the data models.

 

Data Modeling Reports

From data modeling tools, reports can easily be generated for technical and business needs. Reports generated from the logical data model are called business reports, and reports generated from the physical data model are called technical reports. Most data modeling tools provide default reports such as subject area reports, entity reports, attribute reports, table reports, column reports, indexing reports, relationship reports, etc. The advantage of these reports is that everybody, technical or non-technical, can understand what is going on within the organization.

Other than the default reports provided by data modeling tools, a data modeler can also create customized reports as per the needs of the organization. For example, if an expert asks for both the logical and physical reports of a particular subject area in one file (e.g., .xls), the reports can easily be merged and generated accordingly. Data modeling tools provide sorting and filtering options, and the reports can be exported to file formats like .xls, .doc, .xml, etc.

Logical Data Model Report:

Logical Data Model Report describes information about business such as the entity names, attribute names, definitions, business rules, mapping information etc.

Logical Data Model Report Example:

Logical Data Model Report Example

Physical Data Model Report:

A Physical Data Model Report describes information such as the ownership of the database, the physical characteristics of the database (in Oracle: tablespaces, extents, segments, blocks, partitions, etc.), performance tuning (processors, indexing), table names, column names, data types, relationships between tables, constraints, abbreviations, derivation rules, glossary, data dictionary, etc., and is used by the technical team.

Physical Data Model Report Example:

Physical Data Model Report Example

 

Data Modeling Standards | Modeling Data

Data modeling standardization has been in practice for many years, and the following sections highlight the need for and implementation of data modeling standards.

Standardization Needs | Modeling data:

Several data modelers may work on different subject areas of a data model, and all of them should use the same naming conventions and the same style for writing definitions and business rules.

Nowadays, business-to-business (B2B) transactions are quite common, and standardization helps in understanding the business better. Inconsistency across column names and definitions would create chaos across the business.

For example, when a data warehouse is designed, it may get data from several source systems, and each source may have its own names, data types, etc. These anomalies can be eliminated if proper standardization is maintained across the organization.

Table Names Standardization:

Giving tables full names makes it clear what data they contain. Generally, do not abbreviate table names; however, this may differ according to the organization's standards. If a table name's length exceeds the database's limit, then abbreviate it. Some general guidelines that may be used as a prefix or suffix for tables are listed below.

Examples:

Lookup – LKP – Used for code and type tables, through which a fact table can be directly accessed.
e.g. Credit Card Type Lookup – CREDIT_CARD_TYPE_LKP

Fact – FCT – Used for transaction tables.
e.g. Credit Card Fact – CREDIT_CARD_FCT

Cross Reference – XREF – Tables that resolve many-to-many relationships.
e.g. Credit Card Member XREF – CREDIT_CARD_MEMBER_XREF

History – HIST – Tables that store history.
e.g. Credit Card Retired History – CREDIT_CARD_RETIRED_HIST

Statistics – STAT – Tables that store statistical information.
e.g. Credit Card Web Statistics – CREDIT_CARD_WEB_STAT

Column Names Standardization:

Some general guidelines are listed below that may be used as a prefix or suffix for the column.

Examples:

Key – KEY – System-generated surrogate key.
e.g. Credit Card Key – CRDT_CARD_KEY

Identifier – ID – Character column that is used as an identifier.
e.g. Credit Card Identifier – CRDT_CARD_ID

Code – CD – Numeric or alphanumeric column that is used as an identifying attribute.
e.g. State Code – ST_CD

Description – DESC – Description for a code, identifier or a key.
e.g. State Description – ST_DESC

Indicator – IND – Used for indicator columns.
e.g. Gender Indicator – GNDR_IND
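Putting the table and column standards together, a lookup table might look like this (a sketch; the exact names depend on the organization's abbreviation document):

```sql
-- LKP suffix for a lookup table; CD and DESC suffixes for its columns
CREATE TABLE CREDIT_CARD_TYPE_LKP (
  CRDT_CARD_TYPE_CD   VARCHAR2(10)  NOT NULL,
  CRDT_CARD_TYPE_DESC VARCHAR2(200) NOT NULL
);
```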

Database Parameters Standardization:

Some general guidelines are listed below that may be used for other physical parameters.

Examples:

Index – IDX – For index names.
e.g. Credit Card Fact IDX01 – CRDT_CARD_FCT_IDX01

Primary Key – PK – For primary key constraint names.
e.g. Credit Card Fact PK01 – CRDT_CARD_FCT_PK01

Alternate Keys – AK – For alternate key names.
e.g. Credit Card Fact AK01 – CRDT_CARD_FCT_AK01

Foreign Keys – FK – For foreign key constraint names.
e.g. Credit Card Fact FK01 – CRDT_CARD_FCT_FK01

 

Logical Data Modeler Role | Physical Data Modeler Role

Data Modeler Role:

Data modelers fall into two major categories, logical and physical data modelers, depending on the role they play in developing logical or physical data models. Based on organizational needs, data modelers do logical data modeling, physical data modeling, or a combination of both. Nowadays, organizations prefer to hire data modelers who can do both efficiently. Logical data modelers interact with stakeholders, business analysts, subject matter experts (SMEs), and developers. Physical data modelers interact with logical data modelers and the database team.

For logical data modeling, please refer Logical Data Modeling Tutorial and for physical data modeling, please refer Physical Data Modeling Tutorial.

Logical Data Modeler Role:

Business Requirement Analysis:
  • Interact with business analysts to get the functional requirements.
  • Interact with end users to find out the reporting needs.
  • Conduct interviews and brainstorming discussions with the project team to gather additional requirements.
  • Gather accurate data through data analysis and functional analysis.
Development of data model:
  • Create a standard abbreviation document for logical, physical, and dimensional data models.
  • Create logical, physical, and dimensional data models (data warehouse data modeling).
  • Document logical, physical, and dimensional data models (data warehouse data modeling).
Reports:
  • Generate reports from data model.
Review:
  • Review the data model with functional and technical team.
Support & Maintenance:
  • Assist developers, the ETL team, the BI team, and end users in understanding the data model.
  • Maintain a change log for each data model.

Physical Data Modeler Role:

  • Create SQL code from the data model and coordinate with DBAs to create the development, testing, regression, and production databases.
  • Check that the data models and databases are in sync.
  • Add database objects (such as indexes and partitions in an Oracle database) for performance.
  • Generate reports.

 

Steps to create Data Model

Steps to create and maintain a new data model from business requirements:

These are general guidelines for creating a standard data model; in practice, a data model may not be created in exactly the sequential manner shown below. Based on the enterprise's requirements, some steps may be excluded, or additional steps included.

Sometimes a data modeler may be asked to develop a data model based on an existing database. In that situation, the data modeler has to reverse engineer the database and create the data model.

Steps to create a Logical Data Model:

  1. Get Business requirements.
  2. Analyze Business requirements.
  3. Create High Level Conceptual Data Model. Get it approved.
  4. Create a new Logical Data Model. Add the following to the logical data model.
  5. Select target database where data modeling tool creates the scripts for physical schema.
  6. Create standard abbreviation document for naming logical and physical objects according to business/data modeling standard.
  7. Create domain.
  8. Create rule.
  9. Create default.
  10. Create Entity and add definitions.
  11. Create attribute and add definitions.
  12. Assign a datatype to each attribute. If a domain is already present, attach the domain to the attribute.
  13. Add check constraint/rule or default to the columns (wherever it is required).
  14. Create primary or unique keys on attributes.
  15. Create unique or bitmap indexes on attributes.
  16. Based on the analysis, create surrogate key columns.
  17. If required, create super types and sub types.
  18. Analyze the relationships between entities and create foreign key relationships (one-to-many or many-to-many) between those entities.
  19. Create subject areas and add relevant entities to those subject areas.
  20. Align the objects in the main subject area and other subject areas.
  21. Validate the data model.
  22. Generate reports from the data model.
  23. Take a print out of the data model.
  24. Get it approved.

Steps to create a Physical Data Model:

  1. Create Physical Data Model from the existing logical data model.
  2. Add database properties to physical data model.
  3. Generate SQL scripts from the physical data model. Tick the necessary parameters in the tool, create the scripts, and forward them to the DBA (the DBA will execute those scripts in the database).
  4. Compare database and data model. Make sure everything is okay.
  5. Create a change log document for differences between the current version and previous version of the data model.

Maintenance of Data Models:

  • Maintain Logical & Physical Data Model.
  • For each release (version) of the data model, compare the present version with the previous version. Similarly, compare the data model with the database to find the differences.

 

Why to build or create a data model | Advantages of Data Modeling

Why to build or create a data model?

  • To avoid redundancy of data in an OLTP database.
  • In data warehousing, data from source systems can be transformed as per the rules and loaded into target tables.
  • In data warehousing, you can do data profiling by cleaning the data from source systems before loading it into data warehouse columns. The same column from different source systems may have different data structures and column names; in the data warehouse, we can create one column as per standards and load the data.
  • In data warehousing, the data in several columns helps in predicting the future, which is a part of data mining.
  • In a data warehouse or data mart, you can drill down the data to a certain level and get consolidated information. For example, with a location dimension you can group the data at the state, county, or city level; with a time dimension you can drill down on a yearly, quarterly, or monthly basis.
  • A new application for OLTP (Online Transaction Processing), ODS (Operational Data Store), a data warehouse, or data marts.
  • Rewriting data models of existing systems when reports need to change.
  • Incorrect data modeling in the existing systems.
  • A database that has no data model.

Advantages and Importance of Data Model:

  • The goal of a data model is to make sure that all data objects provided by the functional team are represented completely and accurately.
  • A data model is detailed enough to be used by the technical team for building the physical database.
  • The information contained in the data model is used to define relational tables, primary and foreign keys, stored procedures, and triggers, along with their business significance.
  • A data model can be used to communicate the business within and across organizations.

 
