Why build or create a data model | Advantages of Data Modeling

Why build or create a data model?

  • To avoid redundant data in an OLTP database.
  • In data warehousing, data from source systems can be transformed according to defined rules and loaded into target tables.
  • In data warehousing, you can profile and cleanse the data from source systems before loading it into data warehouse columns. For example, the same column may have a different data structure and column name in different source systems; in the data warehouse, we can create one column that follows the standards and load the data into it.
  • In data warehousing, the data in several columns helps in predicting the future, which is a part of data mining.
  • In a data warehouse or data mart, you can drill down the data to a certain level and get consolidated information. For example, with a location dimension, you can group the data at state, county, or city level. With a time dimension, you can drill down on a yearly, quarterly, or monthly basis.
  • A new application is being built for OLTP (Online Transaction Processing), an ODS (Operational Data Store), a data warehouse, or data marts.
  • Data models are being rewritten for existing systems whose reports may need to change.
  • The data modeling in the existing systems is incorrect.
  • A database has no data model.
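
The drill-down on a location dimension mentioned in the list above can be sketched with a few rows of data. This is only an illustration using Python's built-in sqlite3 module; the sales table, its columns, and the sample figures are hypothetical and not taken from any real warehouse.

```python
import sqlite3

# Hypothetical fact table with the location levels stored on each row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, county TEXT, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("CA", "Los Angeles", "Los Angeles", 100.0),
        ("CA", "Los Angeles", "Long Beach", 50.0),
        ("CA", "San Diego", "San Diego", 70.0),
        ("TX", "Travis", "Austin", 30.0),
    ],
)


def totals_by(*levels):
    """Total sales grouped by the given location levels, coarsest first."""
    cols = ", ".join(levels)  # level names come from this script, not from user input
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


state_totals = totals_by("state")             # consolidated view
county_totals = totals_by("state", "county")  # one level of drill-down
```

Each extra level passed to totals_by drills one step further down the same data, from state to county to city.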

Advantages and Importance of a Data Model:

  • The goal of a data model is to make sure that all data objects provided by the functional team are completely and accurately represented.
  • The data model is detailed enough for the technical team to use in building the physical database.
  • The information contained in the data model is used to define the business meaning of the data, relational tables, primary and foreign keys, stored procedures, and triggers.
  • The data model can be used to communicate business requirements within and across businesses.

 

Data Modeling Concepts | What is Data Modeling | Data Modeling Overview

What is a Data Model?

A Data model is a conceptual representation of data structures (tables) required for a database and is very powerful in expressing and communicating the business requirements.

Data Modeling Overview:

A data model visually represents the nature of the data, the business rules governing the data, and how the data will be organized in the database. A data model comprises two parts: the logical design and the physical design. Data models are created using either a top-down or a bottom-up approach. In the top-down approach, data models are created by understanding and analyzing the business requirements. In the bottom-up approach, data models are created from existing databases that have no data models. IDEF1X is a commonly used notation for creating data models since it is more descriptive.

Data modeling can be broadly classified into OLTP data modeling and OLAP data modeling. Data models for OLTP databases are created by normalizing the data, and data models for OLAP databases are created by denormalizing the data. A data model cannot be frozen, since it will be updated as business requirements change. A data model is like a blueprint or a map. It is not an exact replica of the database and does not contain every object or all the code present in the database, since a database holds many objects and developers may have written a large amount of code. A data model usually contains the key database objects such as tables, columns, relationships, and constraints.

A data model helps the functional and technical teams in designing the database. The functional team normally consists of one or more business analysts, business managers, subject matter experts, end users, etc., and the technical team consists of one or more programmers, DBAs, etc. Data modelers are responsible for designing the data model; they communicate with the functional team to get the business requirements and with the technical team to implement the database.

Data Modeling Concept:

The concept of data modeling can be better understood if we compare the development cycle of a data model to the construction of a house. For example, Company ABC is planning to build a guest house (database). It calls the building architect (data modeler) and presents its building requirements (business requirements). The building architect (data modeler) develops the plan (data model) and gives it to Company ABC. Finally, Company ABC calls civil engineers (DBAs) to construct the guest house (database).

Data Engineer, Data Modeler, and Data Architect are the common titles for those who are involved in data modeling. To become an efficient data modeler, you should have an overview of database objects, constraints, and normalization, and be able to understand the requirements correctly.

Before learning data modeling, spend a couple of hours understanding the basic concepts of SQL (Structured Query Language), such as the CREATE TABLE, ALTER TABLE, and SELECT commands, and how relationships between tables are defined.
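
As a quick orientation, the basic SQL commands named above can be tried from Python's built-in sqlite3 module. This is a minimal sketch: the table, the location column, and the sample row are hypothetical, and SQLite syntax can differ in detail from Oracle or other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE TABLE: define a table with typed columns and a primary key.
conn.execute(
    "CREATE TABLE department ("
    " department_number INTEGER PRIMARY KEY,"
    " department_name VARCHAR(50) NOT NULL)"
)

# ALTER TABLE: add a column to an existing table.
conn.execute("ALTER TABLE department ADD COLUMN location VARCHAR(50)")

conn.execute("INSERT INTO department VALUES (10, 'Finance', 'Chennai')")

# SELECT: read the data back.
rows = conn.execute(
    "SELECT department_number, department_name, location FROM department"
).fetchall()
```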

Steps to learn Data Modeling:

  1. Fundamentals of SQL
  2. Conceptual Data Modeling
  3. Logical Data Modeling
  4. Physical Data Modeling
  5. Dimensional Data Modeling
  6. Maintenance of the data model.


How to create a physical data model from a logical data model using Embarcadero’s Data Architect?

We are creating this physical data model with Oracle as the database, so we have selected Oracle 12c as the target database.

Open existing logical data model “02_LDM_creating_entity_version2”.

Click menu Model/Generate Physical Model. A small screen will appear. Click ok.

A new screen is opened. There are 5 steps in this process.

Only Step 1 and Step 3 are explained here.

Step 1 of 5:

Name the physical data model: pdm_creating_entity_version2

Type of physical model: relational

Target_database_platform: oracle12c.

Click next and again click next.

Step 3 of 5:

Attach the naming template “enterprise_naming_standards.nst”. Click finish.

You can now see the physical data model.

[Figure: physical data model diagram]

 

 

How To Create Naming Standards In Embarcadero’s Data Architect?

Create a .csv file named “Naming_Standards.csv” and store the following information in it. This file helps avoid errors and typos in our naming conventions and ensures that the data modeling naming standards are followed. We have eliminated vowels from the actual words so that physical object names will not exceed the maximum length allowed by the database. Later this file will be converted to .nst format (Embarcadero’s naming standards format) and attached to the physical data model that we will create in the next section. The .nst file can be attached to all data models created across the enterprise.

Note: We are not providing naming standards for the logical data model.

Step1: Creating .csv file and storing it as “Naming_Standards.csv”

Open Microsoft Excel and type the following data. Then save it in .csv format.

DEPARTMENT DPRTMNT
NUMBER NMBR
NAME NM
ADDRESS ADDRSS
IDENTIFIER IDNTFR
EMPLOYEE EMPLY
MANAGER MNGR
SALARY SLRY
DATE DT
OF OF
JOINING JNNG
PHONE PHN
NUMBER NMBR
INCENTIVE INCNTV
INDICATOR INDCTR
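
The abbreviations above appear to follow one simple rule: keep the first letter of the word and drop every vowel after it. That rule is inferred from the listed pairs rather than stated as a standard, so the sketch below is only one way to generate such a file; the function name abbreviate is hypothetical.

```python
import csv
import io

VOWELS = set("AEIOU")


def abbreviate(word):
    """Keep the first letter, then drop every vowel from the rest of the word."""
    word = word.upper()
    return word[:1] + "".join(ch for ch in word[1:] if ch not in VOWELS)


words = ["DEPARTMENT", "NUMBER", "ADDRESS", "EMPLOYEE", "DATE", "OF", "JOINING"]
pairs = [(w, abbreviate(w)) for w in words]

# Write the pairs as CSV text, one "WORD,ABBREVIATION" row per line.
buffer = io.StringIO()
csv.writer(buffer).writerows(pairs)
csv_text = buffer.getvalue()
```

Generating the pairs this way keeps the abbreviations consistent, which is the stated purpose of the file.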

Step2: Creating .nst file and storing it as “Enterprise_Naming_Standards.nst”

Open existing logical data model “02_LDM_creating_entity_version2”

Click menu Tools/Naming Standards Template Editor.

A new screen appears.

At the top of the screen there are four tabs: Name, Logical, Physical, and Mapping.

Under name tab: type the file name as enterprise_naming_standards.

Under Logical tab, you can change max length of entity, attribute, view, key, relationship. Select case as upper for entity.

Under the Physical tab, you can change the max length of entity, attribute, view, key, and relationship names. Change the lengths as per the target database standards. Select case as upper for table.

Under mapping tab, click import. Select the file “naming_standards.csv”.

In the Mapping tab, at the bottom of the screen, there are three tabs: Abbreviations, Order, and General. Click the General tab.

Since we will be placing underscore between each word of the column, type “_” in physical word separation under heading “delimiter options”.

Click save. Save the file as “enterprise_naming_standards.nst”
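
To see the combined effect of the abbreviation mapping, the upper-case setting, and the “_” delimiter, here is a rough Python sketch. It illustrates the intended result only and is not how ER/Studio implements naming templates; the function name physical_name is hypothetical and the mapping is a subset of Naming_Standards.csv.

```python
# Subset of the word-to-abbreviation pairs from Naming_Standards.csv.
ABBREVIATIONS = {
    "DATE": "DT",
    "OF": "OF",
    "JOINING": "JNNG",
    "EMPLOYEE": "EMPLY",
    "NUMBER": "NMBR",
    "PHONE": "PHN",
}


def physical_name(logical_name, delimiter="_"):
    """Upper-case each word, abbreviate it if a mapping exists, join with the delimiter."""
    words = logical_name.upper().split()
    return delimiter.join(ABBREVIATIONS.get(word, word) for word in words)


column_name = physical_name("Date of Joining")
```

Words without an entry in the mapping are passed through unchanged, so the mapping file should cover every word used in attribute names.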

 

[Figure: how to convert the .csv file to a .nst file]

 

How to add identifying, non-identifying, self-recursive relationship in ER Studio Data Architect?

Now we are going to make changes to the previously created data model, so we first save a copy of it as LDM_creating_entity_version2. Open the previous data model “LDM_creating_entity_version1” and click menu File/Save As “LDM_creating_entity_version2”.

How to add entity and attributes in ER Studio Data Architect:

Add the following attributes to the “Employee” entity. Place the cursor in the attribute section of the Employee entity, right-click, and click Edit Entity. Click the Attributes tab and add the following attributes.

  • “Employee Name” with data type as Varchar(50); NOT NULL;
  • “Incentive Indicator” with datatype as Char(10);  NOT NULL;
  • “Phone Number” with datatype as Char(12); NULL;

You can use UP or DOWN arrow keys to align the attributes.

Create another entity Address with Address Identifier as attribute, datatype as Integer and Address identifier as primary key.

Address ID should be a sequence number.

How to create a sequence number in ERStudio Data Architect:

When you add an attribute, you can see a section “Identity Property”.  Click the check box in Identity column.

Type 1 for seed and 1 for increment. Seed means the starting number of the sequence and increment means how it has to be incremented.
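
The effect of a seed of 1 and an increment of 1 can be seen in a small sketch. It uses SQLite's AUTOINCREMENT through Python's sqlite3 module as a stand-in, which starts at 1 and steps by 1 on an empty table; the street column is hypothetical, and identity syntax differs between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address ("
    " address_identifier INTEGER PRIMARY KEY AUTOINCREMENT,"
    " street TEXT)"
)

# No address_identifier is supplied; the database generates it.
for street in ("1 First St", "2 Second St", "3 Third St"):
    conn.execute("INSERT INTO address (street) VALUES (?)", (street,))

ids = [
    row[0]
    for row in conn.execute(
        "SELECT address_identifier FROM address ORDER BY address_identifier"
    )
]
```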

Create another entity Employee-Address with no attributes.  We will explain how to add attributes below.

How to add non-identifying relationship in ER studio data architect?

Department entity to Employee entity:

Many employees can work in one department, so this is a one-to-many relationship. Department Number does not become part of the Employee primary key, so we create this as a non-identifying relationship.

Click menu Insert/Relationship/Non-Identifying mandatory:

Click the cursor on Department Entity and then on Employee Entity.

Now you can see the non-identifying relationship between the Department entity and the Employee entity.

A new attribute Department Number is added to employee entity.
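
In database terms, a mandatory non-identifying relationship roughly corresponds to a NOT NULL foreign key column that is not part of the child's primary key. The sketch below shows this with Python's sqlite3 module; it is hand-written for illustration and is not DDL generated by ER/Studio.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   VARCHAR(50)
);
CREATE TABLE employee (
    employee_number   INTEGER PRIMARY KEY,
    -- The foreign key is not part of the primary key: non-identifying.
    -- NOT NULL makes the relationship mandatory.
    department_number INTEGER NOT NULL REFERENCES department (department_number)
);
""")

conn.execute("INSERT INTO department VALUES (10, 'Finance')")
conn.execute("INSERT INTO employee VALUES (1, 10)")
conn.execute("INSERT INTO employee VALUES (2, 10)")  # many employees, one department

try:
    conn.execute("INSERT INTO employee VALUES (3, 99)")  # department 99 does not exist
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```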

How to add identifying relationship in ER studio data architect?

Employee entity and Address entity: One employee can stay at many addresses, and one address can house many employees.

This is a many-to-many relationship, which is resolved through the Employee-Address entity by using identifying relationships.

Create address entity with address id as the primary key.

Click menu Insert/Relationship/Identifying

Click on the Employee entity and then on the Employee-Address entity. Then click on the Address entity and again on the Employee-Address entity.

Now you can see the identifying relationship in employee address entity.

Two new attributes, Employee Number and Address Identifier, are added to the primary key of the Employee-Address entity. Since the primary key consists of two attributes, it is called a composite primary key.
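
The resulting structure can be sketched as three tables, where the Employee-Address table carries both foreign keys inside its composite primary key. This uses Python's sqlite3 module for illustration only; the sample numbers are hypothetical and the DDL is not ER/Studio output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_number INTEGER PRIMARY KEY);
CREATE TABLE address  (address_identifier INTEGER PRIMARY KEY);
CREATE TABLE employee_address (
    -- Both foreign keys are part of the primary key: identifying relationships.
    employee_number    INTEGER NOT NULL REFERENCES employee (employee_number),
    address_identifier INTEGER NOT NULL REFERENCES address (address_identifier),
    PRIMARY KEY (employee_number, address_identifier)
);
INSERT INTO employee VALUES (1), (2);
INSERT INTO address  VALUES (100), (200);
INSERT INTO employee_address VALUES (1, 100), (1, 200), (2, 100);
""")

try:
    conn.execute("INSERT INTO employee_address VALUES (1, 100)")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

addresses_of_1 = [
    row[0]
    for row in conn.execute(
        "SELECT address_identifier FROM employee_address"
        " WHERE employee_number = 1 ORDER BY address_identifier"
    )
]
```

Employee 1 has two addresses and address 100 has two employees, while the composite primary key prevents the same pair from being stored twice.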

How to add self-recursive relationship in ER studio data architect?

To relate an employee to a manager: no attribute in the Employee entity currently references Employee Number, so we will create a new attribute, Manager Number, by using a self-recursive relationship. When an attribute in an entity references the primary key of the same entity, we have to give it a role name, because two attributes cannot have the same name in the same table.

In this example, the parent entity and child entity are same (employee entity).

Click menu Insert/Relationship/Non-Identifying mandatory

Click the cursor on employee entity twice. You can see a new screen. Type role name as manager number.

You can see a new attribute manager number in employee entity.
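
A self-recursive relationship ends up as a foreign key that points back to the same table. The sketch below uses Python's sqlite3 module and makes one assumption that differs from the mandatory option chosen above: manager_number is nullable, so that the top-level employee can exist without a manager. The names in the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number INTEGER PRIMARY KEY,"
    " employee_name VARCHAR(50) NOT NULL,"
    # Role name: manager_number refers to employee_number in the same table.
    " manager_number INTEGER REFERENCES employee (employee_number))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", None), (2, "Ben", 1), (3, "Chen", 1)],
)

# Self-join: pair each employee with the employee who manages them.
reports = conn.execute(
    "SELECT e.employee_name, m.employee_name"
    " FROM employee e JOIN employee m ON e.manager_number = m.employee_number"
    " ORDER BY e.employee_number"
).fetchall()
```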

If you want to know more about the relationship, click each relationship line and you can see different options.

[Figure: identifying and non-identifying relationships]

 

How To Create A Logical Data Model In ER Studio Data Architect?

The following example was created with Embarcadero’s ER/Studio Data Architect, Version 10.0. You can find other products at http://www.embarcadero.com/products. For more information, please visit http://www.embarcadero.com.

ER/Studio Data Architect is a powerful data modeling tool with several applications. You can create either a relational data model or a dimensional data model in ER/Studio Data Architect. Depending on the usage, you can create conceptual models, enterprise models, sub models, etc. After you finish creating the logical relational data model, you can create the physical relational data model.

How do you create a logical relational data model in ER/Studio Data Architect?

  • Open ER Studio Data Architect.
  • Click File/New.
  • From the new screen that appears, click “Draw a new data model” and select “Relational” from the drop-down box. A new screen appears. By default, you will create a logical data model.
  • You can use either the menu options or the icons on the screen; we will follow the menu approach.

How to create an entity (Table), add attribute (column), datatype, primary key in ER/Studio Data Architect?

  • Click menu Insert/Entity. Drop the entity in the working area by clicking in the working area.
  • Do the same thing to create another entity. By default, the first entity you create is named Entity1 and the second is named Entity2.
  • Change the entity name “Entity1” to “Department” as follows.
  • Right-click “Entity1” and click Edit Entity. A new screen, “Entity Editor”, opens.
  • Type the entity name as “Department”.
  • There are different tabs in the “Entity Editor” screen.
  • Click the “Attributes” tab. At the bottom of the screen you can see the Add button; click it.
  • Type the attribute name as “Department Number”.
  • Select the datatype as Integer.
  • Click the check box “Add to Primary Key”.
  • Click Add, and a new attribute “Department Number” is created.
  • Create another attribute named “Department Name” with datatype Varchar(50): select the datatype as Varchar and the width as 50.
  • A new entity “Department” with a primary key “Department Number” and a standalone attribute “Department Name” is created.

Similarly, create another entity “Employee” with attribute names:

  • “Employee Number” with datatype Integer and a primary key constraint.
  • “Salary” with datatype Numeric(10, 2) and a NOT NULL constraint, as a standalone attribute. Type the width as 10 and the scale as 2.
  • “Date of Joining” with datatype Date and a NOT NULL constraint, as a standalone attribute.
  • Click menu File/Save. Save it as “LDM_creating_entity_version1”.

[Figure: logical data model diagram]

Since we have not yet created domain names, defaults, role names, or naming standards, we have not selected those options. We will explain them later.

We will explain in the next section how to create the physical data model of the data model LDM_creating_entity_version2. Based on that, you can create the physical data model of LDM_creating_entity_version1.

[Figure: physical data model diagram]

Data Modeling Frequently Asked Interview Questions And Answers – Part 1

The following data modeling questions and answers are conceptual questions that are asked in data modeler interviews.

 1. What is data modeling?

A data model is a conceptual representation of business requirements (logical data model) or database objects (physical data model) required for a database, and is very powerful in expressing and communicating the business requirements and database objects. The approach by which data models are created is called data modeling.

2. What does data model contain?

Logical Data Model: Entity, Attributes, Super Type, Sub Type, Primary Key, Alternate Key, Inversion Key Entry, Rule, Relationship, Definition, business rule, etc

Physical Data Model: Table, Column, Primary key Constraint, Unique Constraint or Unique Index, Non Unique Index, Check Constraint, Default Value, Foreign Key, comment etc.

Please refer to http://www.learndatamodeling.com/diff_lpdm.php

3. What is a logical data model and logical data modeling?

A logical data model is the version of a data model that represents the business requirements of an entire organization or a part of it. It is the actual implementation and extension of a conceptual data model. Logical data models contain Entity, Attributes, Super Type, Sub Type, Primary Key, Alternate Key, Inversion Key Entry, Rule, Relationship, Definition, etc. The approach by which logical data models are created is called logical data modeling.

4. What is a physical data model and physical data modeling?

A physical data model includes all required tables, columns, relationships, and database properties for the physical implementation of a database. Database performance, indexing strategy, and physical storage are important parameters of a physical model. The main object in a database is a table, which consists of rows and columns. The approach by which physical data models are created is called physical data modeling.

5. What is the difference between a logical and physical data model?

Please refer to http://www.learndatamodeling.com/diff_lpdm.php

 6. What is a table (entity)?

A table stores data in the form of rows and columns. Each column has a datatype and, depending on the situation, integrity constraints are enforced on the columns.

7. What is a column (attribute)?

A column, also known as a field, is a vertical alignment of the data and contains information related to that column.

8. What is a row?

A row, also known as a tuple or record, is the horizontal alignment of the data.

9. What is ER (entity relationship) diagram or ERD?

An ER diagram is a visual representation of entities and the relationships between them. In a data model, entities (tables) look like square or rectangular boxes that contain attributes, and these entities are connected by lines (relationships).

10. What is a primary key constraint?

Primary key constraint is imposed on the column data to avoid null values and duplicate values. Primary Key=Unique + Not Null. Example: social security number, bank account number, bank routing number

11. What is a composite primary key constraint?

When more than one column is part of the primary key, it is called a composite primary key constraint.

12. What is a surrogate key?

In normal practice, a numerical attribute is made the primary key; this is called a surrogate key. A surrogate key is a substitute for natural keys. Instead of using a natural primary key or a composite primary key, data modelers create a surrogate key, which is useful for writing SQL queries, uniquely identifying a record, and achieving good performance.

13. What is a foreign key constraint?

Parent table has primary key and a foreign key constraint is imposed on a column in the child table.  The foreign key column value in the child table will always refer to primary key values in the parent table.

14. What is a composite foreign key constraint?

When a group of columns forms a foreign key, it is called a composite foreign key constraint.

15. What are the important types of Relationships in a data model?

Identifying, non-identifying, and self-recursive relationships are the important types of relationship.

16. What is identifying relationship?

Usually, in a data model, parent tables and child tables are present, and a parent table and a child table are connected by a relationship line. If the foreign key column in the child table is part of the child table's primary key, the relationship between the two tables is drawn as a thick line and is called an identifying relationship.

17. What is non-identifying relationship?

Usually, in a data model, parent tables and child tables are present, and a parent table and a child table are connected by a relationship line. If the foreign key column in the child table is not part of the child table's primary key but a standalone column, the relationship between the two tables is drawn as a dotted line and is called a non-identifying relationship.

18. What is self-recursive relationship?

When a standalone column in a table is connected to the primary key of the same table, it is called a self-recursive relationship.

19. What is cardinality?

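In a data model, cardinality describes how many rows of one table can relate to rows of another table; one-to-one, one-to-many, and many-to-many are the different types of cardinality. For a column in a database, high cardinality means that many unique values are stored in the column, and low cardinality means that few unique values are stored.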
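
Column cardinality in the second sense can be measured by counting distinct values. Here is a minimal sketch with Python's sqlite3 module and hypothetical sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number INTEGER PRIMARY KEY,"
    " incentive_indicator CHAR(1))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, "Y"), (2, "N"), (3, "Y"), (4, "N")],
)

# employee_number is unique per row (high cardinality);
# incentive_indicator holds only two distinct values (low cardinality).
high, low = conn.execute(
    "SELECT COUNT(DISTINCT employee_number), COUNT(DISTINCT incentive_indicator)"
    " FROM employee"
).fetchone()
```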

 20. What is a conceptual data model and conceptual data modeling?

A conceptual data model includes all major entities and relationships, does not contain much detailed information about attributes, and is often used in the initial planning phase. Data modelers create the conceptual data model and forward it to the functional team for review. The approach by which conceptual data models are created is called conceptual data modeling.

21. What is an enterprise data model?

An enterprise data model comprises all entities required by an enterprise. The development of a common, consistent view and understanding of data elements and their relationships across the enterprise is referred to as enterprise data modeling. For better understanding, these data models are split up into subject areas.

22. What is relational data modeling?

The visual representation of objects in a relational database (usually normalized) is called relational data modeling. A table contains rows and columns.

23. What is OLTP data modeling?

OLTP stands for Online Transaction Processing. The approach by which data models are constructed for transactions is called OLTP data modeling. Examples: online transactions, bank transactions, trading transactions.

24. What is a constraint? What are the different types of constraint?

Constraint is a rule imposed on the data.  The different types of constraints are primary key, unique, not null, foreign key, composite foreign key, check constraint etc.

25. What is a unique constraint?

A unique constraint is imposed on the column data to avoid duplicate values, but the column can still contain NULL values.

26. How many null values can be inserted in a column that has unique constraint?

In many databases, many null values can be inserted in a unique constraint column because one null value is not considered equal to another null value. This behavior depends on the database; SQL Server, for example, allows only one null value in such a column.
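
The behavior can be checked with a small experiment. The sketch below uses SQLite through Python's sqlite3 module, where several NULLs are accepted in a unique column; other databases may behave differently, and the phone numbers are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number INTEGER PRIMARY KEY,"
    " phone_number CHAR(12) UNIQUE)"
)
conn.execute("INSERT INTO employee VALUES (1, NULL)")
conn.execute("INSERT INTO employee VALUES (2, NULL)")  # a second NULL is accepted
conn.execute("INSERT INTO employee VALUES (3, '555-0100')")

try:
    conn.execute("INSERT INTO employee VALUES (4, '555-0100')")  # duplicate value
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE phone_number IS NULL"
).fetchone()[0]
```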

27. What is a check constraint?

A check constraint restricts the values allowed in a column, for example by checking that they fall within a range.
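
As an illustration, a range check on a salary column might look like the following sketch, written with Python's sqlite3 module. The range limits are hypothetical and chosen only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number INTEGER PRIMARY KEY,"
    # Hypothetical business rule: salary must lie between 0 and 1,000,000.
    " salary NUMERIC(10, 2) NOT NULL CHECK (salary BETWEEN 0 AND 1000000))"
)
conn.execute("INSERT INTO employee VALUES (1, 50000)")

try:
    conn.execute("INSERT INTO employee VALUES (2, -1)")  # outside the allowed range
    out_of_range_rejected = False
except sqlite3.IntegrityError:
    out_of_range_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```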

28. What is index?

An index is created on a column or set of columns for faster retrieval of data.

29. What is a sequence?

A sequence is a database object that generates unique numbers.

Top Data Modeling Interview Questions asked in 2018:

  • What is a Canonical Data Model?
  • What is Optionality?
  • What is Cardinality?
  • How are columnar databases different from RDBMS databases?
  • What are the deliverables of a Data Modeler?
  • How do you present the data model to the business team and the technical team?
  • How do you maintain the data model after the project implementation?
  • How do you define the business rules in the data model?
  • Can you do shared role?
  • What is Enterprise Data Architecture?
  • What is Big Data Data Modeling?
  • What kind of knowledge do you have of NoSQL databases?
  • What is the most difficult scenario you have faced in Data Modeling?
  • What is the document equivalent to RTM (Requirement Traceability Matrix) in Data Modeling?
  • Which approach did you follow: Inmon’s approach or Ralph Kimball’s approach? Why?
  • As a Data Modeler, how can you ensure that the data is available to the customer 24/7, 365 days a year, and that the data is reliable?
  • Which data modeling tools have you worked with?
  • How will you reverse engineer using a data dictionary in Excel format?
  • Can we reverse engineer by uploading an Excel file with entity name, attribute name, and datatype to create a data model?
  • Why do you have a flat file without primary keys in a Data Warehouse environment?
  • What is the difference between a domain and a datatype?
  • Under which scenario would you use a recursive relationship, and why or why not? For example: a hierarchical data set or a master-detail data set?