Use ‘Active’ Model Templates to facilitate reusable modeling metadata and to implement/enforce enterprise standards
Erwin Data Modeler is now certified to work with: Db2 z/OS v12; SQL Server 2019; PostgreSQL v9.6.15, v10.10, and v11.5; and Oracle 18c (18.3).
Erwin Data Modeler now supports the new features and capabilities offered in Db2 LUW 11.1: Storage Group object support has been added; table properties for all three types (Permanent, Global Temporary, and Nickname) have been updated; Index and XML Index support has been enhanced; tablespace properties have been updated to support Storage Space Name and Data Tag; and materialized query table properties have been enhanced.
Erwin DM Scheduler: enables users to define and schedule database reverse engineering (REDB) tasks in advance, and minimizes workstation contention by allowing modelers to plan for and run REDB tasks when resources are available.
Overall modeler productivity has been enhanced via new product functionality and workflows:
A “quick” compare option set and “speed” option
Supertype-Subtype transformation updated to retain separate logical and physical constructs for Identity and Roll-Up option selections
Deletion of a relationship offers the modeler the option of converting migrated FK attribute(s)/column(s) into child entity/table owned attribute(s)/column(s)
The Mart OPEN dialog now displays the number of entities, attributes, and relationships
Mart and mart report performance has been significantly improved
The database “Target Server” dialog can be accessed by clicking the database name displayed on the physical model status bar
Naming Standards enhanced to fully support physical-to-logical name mapping
Users defined in Azure Active Directory can now connect to the Mart from erwin Data Modeler.
Erwin Data Modeler now offers an automation option for Reverse Engineering and Forward Engineering through the Erwin API.
Erwin Data Modeler includes flexible extended notes on all object types that can be:
compared, searched, and edited using the Bulk Editor
reported via the Report Designer
managed using the erwin API
Erwin SCAPI has been extended to support the application of naming standards.
Erwin Data Modeler provides “native” support for Hadoop Hive as a database target – including AVRO files!
Licensing improvements include:
Silent install
Option to “hide” the Licensing dialog
Support for proxy connections to the License/Activation server
Optional network domain name and subnet IP-based usage restrictions
erwin Data Modeler “edition”-based licensing
The ability to “borrow” (Long Term Check Out) cloud-based concurrent licenses
Collaboratively administer the Workgroup Edition mart using our web-based Mart Administration utility:
Session (Offline/Disconnect) Management
Security (User/Role) Management
AD Authentication (via LDAP)
Permission/Authorization Management
Model Locking
Erwin Data Modeler supports a wide variety of reports from a mart, including administrative, cross-model, and model object count reports.
The erwin Data Modeler (DM) User Interface has been revamped to include the following:
A Microsoft “Office-like” ribbon-based appearance
UI themes, font sizes, and component direction settings
Tabbed and “side-by-side” diagram window arrangement options
A “Quick Access” toolbar
Erwin Data Modeler supports both online and local product documentation and help.
Erwin Data Modeler has added “native” support for Redshift as a database target.
Report Designer includes the ability to create customized report filters and properties using erwin Data Modeler’s TLX scripting language
Erwin Data Modeler supports new features and capabilities offered in SQL Server, including:
Always Encrypted columns
Dynamic Data Masking
Temporal tables
Memory-Optimized Tables
Natively compiled Stored Procedures
User enhancements and modeling improvements include:
A new option to allow modelers to enforce Relationship Nullability Rules
A new option to allow modelers to enforce Supertype-Subtype relationship rules
Oracle 12c support for 128-character TABLE and COLUMN names, and PostgreSQL support for 63-character TABLE and COLUMN names
New support for relationship cardinality ranges (for example, 1..5)
A new “Forward Engineering” wizard
The ability to tag ATTRIBUTES/COLUMNS as PII
Support for Hive VIEW partitions
Erwin Data Modeler is offered as a native 64-bit application with access to sufficient memory to complete operations on large models (A 32-bit option is still available).
What is a Data Model, and what are a Data Modeler’s Responsibilities?
What’s in it: Data models and high-level data modeler responsibilities are explained using the Erwin Data Modeling Tool, along with images of a conceptual data model, a logical data model, and a physical data model.
Video Tutorial 2:
Explain the Data Modeling Development Life Cycle.
What’s in it: This question is frequently asked by interviewers. Our video explains the step-by-step process in the Data Modeling Development Life Cycle. This life cycle is common to both RDBMS Data Modeling and Dimensional Data Modeling.
Video Tutorial 3:
What is an ER Diagram?
What’s in it: The ER Diagram, also known as an Entity Relationship Diagram, is explained with a data model created using the Erwin Data Modeling Tool. The Primary Key Constraint, Composite Primary Key Constraint, Foreign Key Constraint, and Composite Foreign Key Constraint are explained in this Data Modeling Video Tutorial.
Video Tutorial 4:
What is a Conceptual Data Model?
What’s in it: This is the first phase in designing the data model in the Data Modeling Development Life Cycle and is explained in detail with an image of an Erwin data model.
Video Tutorial 5:
What is a Logical Data Model?
What’s in it: This is the second phase in designing the data model in the Data Modeling Development Life Cycle and is explained in detail.
Video Tutorial 6:
What is the difference between a Logical Data Model and Physical Data Model?
What’s in it: The differences between a Logical Data Model and a Physical Data Model are tabulated and explained with the help of a data model created with the Erwin Tool.
Video Tutorial 7:
What is an Entity and Attribute?
What’s in it: Entity and Attribute are explained with a Diagram.
Video Tutorial 8:
What are Identifying and Non-Identifying Relationships?
What’s in it: Identifying and non-identifying relationships are explained with the help of a small data model drawn in the Erwin Data Modeling Tool. The difference between the two is also explained.
Video Tutorial 9:
What is Forward Engineering & What is Reverse Engineering?
What’s in it: Forward Engineering, Reverse Engineering, SuperType and SubType are explained.
Video Tutorial 10:
What is Data Model Meta Data & a Data Model Repository?
What’s in it: Data Model Meta Data & Data Model Repository are explained in this video tutorial.
Video Tutorial 11:
What is the process to create a Data Warehouse Schema, and what are the steps involved in designing a Data Mart?
What’s in it: This video explains important points in designing the Data Mart and Data Warehouse Schema.
Video Tutorial 12:
What is Dimensional Data Model and a Star Schema?
What’s in it: This video explains the basics of the Dimensional Data Model and its schema, with an OLAP use case as the example.
Video Tutorial 13:
What is ETL? / What is Datawarehouse? / What is DataMart?
What’s in it: Basics of ETL, Datawarehouse and Datamart are explained in this Data Modeling Video Tutorial.
Video Tutorial 14:
What is Check Constraint? / What is a Sequence? / What is an Index?
What’s in it: Check Constraints, Sequences, and Indexes are briefly explained, along with NULL values under a Unique Constraint. This topic is an important question in the Data Modeling Interview.
Video Tutorial 15:
What is OLTP Data Modeling? / What are the different types of constraints? / What is a Unique Constraint?
What’s in it: A brief explanation of OLTP data modeling and constraints. This is an important question in the Data Modeling Interview.
Video Tutorial 16:
What is a Foreign Key Constraint? / What is a composite Foreign Key constraint?
What’s in it: Foreign Key Constraint is explained in a detailed manner by taking an example from IBM.
Video Tutorial 17:
Erwin Data Modeling Interview Questions – Part 1
What’s in it:
This video covers the list of interview questions that can be asked about creating a conceptual data model, a logical data model, and a physical data model with Erwin.
Video Tutorial 18:
Dimensional Data Modeling Interview Questions – Part 1
What’s in it:
These Dimensional Data Modeling Questions are related to Data Warehouse and Data Mart. When you work as an ETL developer or BI developer, these questions will be asked in the interview.
An Entity Relationship Diagram (also called an ER Diagram, ER Model, or ERD) contains entities and attributes, and the relationships between those entities. The ER Model is a methodological approach to creating entities in normalized form to minimise redundancy. It is drawn using data modeling tools, and the following steps are taken to create the entity relationship diagram.
Understanding the business requirements
Meetings / discussions with stakeholders (BA/SME/PM etc) to find out entities and relationships
Data Modeler designs the ER diagram.
Data Modeling with ER Model:
In the above mentioned example,
1. The following entities are created by the data modeler:
Department
Gender
Employee
Salary Grade
Degree
2. Then the data modeler adds the relevant attributes to those entities.
3. Creating attributes: Example Employee
Employee ID
Department Number
Salary Grade Identifier
Gender Code
Degree Code
Employee Name
Birth Date
4. The data modeler assigns relevant datatypes for each attribute based on the type of data stored:
Example: Employee ID in the Employee entity
5. Relationships:
After creating the entities and attributes, the data modeler decides how to create the primary and foreign keys.
Creating primary key:
Example: Employee ID in Employee entity
Creating foreign key:
Example: Employee entity references Department entity, Gender entity, Salary Grade and Degree Entity
Optionality and Cardinality options:
(Source: www.Erwin.com)
In the ER diagram below, created with the Erwin Data Modeling Tool using Information Engineering (IE) notation, the symbol near the Department entity reads ZERO OR ONE, and the symbol near the Employee entity reads ZERO, ONE, OR MORE. These symbols have specific meanings, and the data modeler assigns the right symbols according to the business rules.
Similarly, the remaining entities are created in the ER diagram, and the data modeler gets approval from the business and technical teams to release the data models.
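The five steps above can also be sketched as DDL. The following is a minimal, hypothetical sketch using Python with SQLite; the table and column names follow the example entities, but the datatypes and sample values are illustrative assumptions, not taken from the original model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Steps 1-4: entities, attributes, and illustrative datatypes.
conn.execute("""
    CREATE TABLE department (
        department_number INTEGER PRIMARY KEY,  -- step 5: primary key
        department_name   TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE employee (
        employee_id       INTEGER PRIMARY KEY,  -- step 5: primary key
        employee_name     TEXT NOT NULL,
        birth_date        TEXT,                 -- SQLite stores dates as text
        department_number INTEGER NOT NULL
            REFERENCES department (department_number)  -- step 5: foreign key
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'Finance')")
conn.execute("INSERT INTO employee VALUES (1, 'Alice', '1990-04-01', 10)")

# The foreign key rejects an employee assigned to a non-existent department.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', '1988-09-12', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The foreign key is what physically enforces the relationship the modeler drew between Employee and Department.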
Data Modeling Software Tools Trial Versions:
Oracle has released “SQL Developer Data Modeler”, a free data modeling tool. If you want to become a data modeler, you can start designing data models with “SQL Developer Data Modeler”.
If you are interested to learn data modeling by using Erwin or SAP PowerDesigner or Oracle SQL Data Modeler, please reach us for more details – Training@learndatamodeling.com or 91-90801 57239.
Anyone can learn data modeling concepts by reading books, going through online material, or getting help from experts in the field. But in practice, while working with a particular data model, many challenging situations arise where one has to think through various possibilities and requirements before imposing a cardinality rule or creating a relationship. Will it be sufficient to add one additional entity, or should I create a lookup in order to satisfy the business requirement? Will the current data structures hold good for future expansions and modifications?
Many challenging situations may arise during the course of one Data Modeling Life Cycle. This page exists for that purpose: to help data modelers think ahead and decide what rule to apply in order to create the right data model.
We will come up with many difficult situations, present practical scenarios, and post them on this page on a daily or weekly basis. Data modelers, make a note of this page and use it as a reference or an assignment to improve your skills in the data modeling field.
Challenging Situations in Data Modeling from the Client End:
Few or no documents, with little information explaining the business process and business rules.
Limited knowledge and no proper explanation from SMEs or BAs.
Extensive user requirements and demands.
No proper plan in the Enterprise Architecture.
Challenging Situations in Data Modeling from the Data Modeler’s Perspective:
Not able to understand the business requirements.
Understanding the business requirements but not able to predict the entities required for this project.
Understanding the entities required but not able to place the attributes properly in the entity.
Placing the attributes in entities but not able to create identifying relationships or non-identifying relationships.
Able to create identifying and non-identifying relationships, but not sure whether to use ZeroOrOne-to-ZeroOneOrMore or One-to-ZeroOneOrMore cardinality.
How does a change in the business rules affect the data model?
Original Business Rule 1:
The present manager to whom an employee reports must be tracked.
An entity called Employee is created, and by using a self-referencing relationship with the role name Manager, the reporting structure is captured.
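A self-referencing relationship of this kind can be sketched as follows. This is a hypothetical illustration in Python with SQLite; the column name `manager_id` and the sample data are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Employee references itself: manager_id plays the role name "Manager".
conn.execute("""
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT NOT NULL,
        manager_id    INTEGER REFERENCES employee (employee_id)  -- NULL at the top
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Carol', NULL)")  # has no manager
conn.execute("INSERT INTO employee VALUES (2, 'Dave', 1)")      # reports to Carol

# Join the table to itself to list each employee with their present manager.
rows = conn.execute("""
    SELECT e.employee_name, m.employee_name
    FROM employee e
    LEFT JOIN employee m ON e.manager_id = m.employee_id
""").fetchall()
```

Note that this structure stores only the present manager; tracking previous and future managers, as the changed rule below demands, requires a history table or effective-date columns instead of a single column.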
Business Rule 1 Change:
Employee’s previous manager, present manager and future managers are to be tracked.
Original Business Rule 2:
An employee’s current residential address must be tracked.
An entity called Address is created and connected to the Employee entity.
Business Rule 2 Change:
An employee’s previous, current, and future residential addresses must be tracked.
How do these changes to Business Rules 1 and 2 affect the current data model, and as a data modeler, how would you implement them? What is the best way to do it?
Business Rule 3:
If there are three definite values for a column, and you are sure that these values will never change over time, which approach will you follow – a lookup or a check constraint?
Example: In the URLA 1003 form provided by Fannie Mae (Page 5, Section 4), one can see Purchase, Refinance, and Other as the values for the field “Loan Purpose”. Should we create a lookup for these 3 values, or would it be better to create an identifier and impose a check constraint on these 3 values?
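The check-constraint side of this trade-off can be sketched as follows. This is a hedged illustration in Python with SQLite; the table name `loan_application` and column names are assumptions, only the three “Loan Purpose” values come from the form:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The three fixed "Loan Purpose" values enforced by a check constraint,
# avoiding a separate lookup table and the extra join it would cost.
conn.execute("""
    CREATE TABLE loan_application (
        loan_id      INTEGER PRIMARY KEY,
        loan_purpose TEXT NOT NULL
            CHECK (loan_purpose IN ('Purchase', 'Refinance', 'Other'))
    )
""")
conn.execute("INSERT INTO loan_application VALUES (1, 'Purchase')")

# Any value outside the fixed set is rejected by the constraint.
try:
    conn.execute("INSERT INTO loan_application VALUES (2, 'Construction')")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True
```

A lookup table, by contrast, lets new values be added with a plain INSERT rather than an ALTER TABLE, which is why it suits value sets that may change over time.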
Business Rule 4:
How will you implement a Candidate Key – by creating it as a Primary Key or as a Unique Constraint?
Example: A Social Security Number is always a candidate key. Should it be created as the primary key of the table, or as a stand-alone column with a unique constraint imposed on it?
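One common resolution is a surrogate primary key plus a unique constraint on the candidate key, sketched below as a hypothetical Python/SQLite illustration (the column names and sample SSN are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Surrogate key as the primary key; the candidate key (SSN) stays unique
# without carrying the responsibilities of the primary key.
conn.execute("""
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,   -- surrogate primary key
        ssn           TEXT NOT NULL UNIQUE,  -- candidate key, unique constraint
        employee_name TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO employee VALUES (1, '123-45-6789', 'Alice')")

# A duplicate SSN violates the unique constraint just as a primary key would.
try:
    conn.execute("INSERT INTO employee VALUES (2, '123-45-6789', 'Bob')")
    dup_rejected = False
except sqlite3.IntegrityError:
    dup_rejected = True
```

Keeping the SSN out of the primary key also means foreign keys elsewhere do not have to carry a sensitive value.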
Reach US!!!
We provide online training in advanced OLTP Data Modeling and DIMENSIONAL Data Modeling.
We also teach the data structures with Data Analytics Software “R”.
We provide online Data Modeling project support when you get stuck with the projects you are involved in.
We can analyse your Business Requirements, understand and suggest solutions to create OLTP Data Models and Dimensional Data models.
We provide Data Modeling Interview Preparation Sessions with a lot of Data Modeling Interview Questions/Answers, which will help you to clear any interview.
If you are interested, please reach us at Training@LearnDataModeling.com or 91-9080157239
The centralized table in a star schema is called the FACT table. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key made up of all of its foreign keys.
In the example, “Sales Dollar” is a fact (measure), and it can be added across several dimensions. Fact tables store different types of measures: additive, non-additive, and semi-additive.
Measure Types:
Additive – Measures that can be added across all dimensions.
Non-Additive – Measures that cannot be added across any dimension.
Semi-Additive – Measures that can be added across some dimensions but not others.
In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called Factless Fact tables.
Steps in designing Fact Table:
Identify a business process for analysis (like sales).
Identify measures or facts (sales dollar).
Identify the dimensions for the facts (product dimension, location dimension, time dimension, organization dimension).
List the columns that describe each dimension (for example, region name and branch name for the location dimension).
Determine the lowest level of summary in the fact table (sales dollar).
Example of a Fact Table with an Additive Measure in Star Schema:
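The fact table described above – a composite primary key made up of its foreign keys, with “Sales Dollar” as an additive measure – can be sketched as follows. This is a hypothetical Python/SQLite illustration; the dimension names follow the example, but the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE product_dim  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);
    CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, location_name TEXT);

    -- Fact table: foreign keys to the dimensions plus the measure column.
    -- The primary key is the composite of all the foreign keys.
    CREATE TABLE sales_fact (
        product_key  INTEGER NOT NULL REFERENCES product_dim (product_key),
        location_key INTEGER NOT NULL REFERENCES location_dim (location_key),
        sales_dollar REAL NOT NULL,
        PRIMARY KEY (product_key, location_key)
    );

    INSERT INTO product_dim  VALUES (1, 'Product1'), (2, 'Product2');
    INSERT INTO location_dim VALUES (1, 'Chennai'),  (2, 'New York');
    INSERT INTO sales_fact   VALUES (1, 1, 100.0), (1, 2, 200.0), (2, 1, 50.0);
""")

# "Sales Dollar" is additive: it can be summed across any dimension.
total = conn.execute("SELECT SUM(sales_dollar) FROM sales_fact").fetchone()[0]
by_product = dict(conn.execute("""
    SELECT p.product_name, SUM(f.sales_dollar)
    FROM sales_fact f JOIN product_dim p ON f.product_key = p.product_key
    GROUP BY p.product_name
""").fetchall())
```

The same SUM can be regrouped by location, or by both dimensions at once, which is exactly the slicing a star schema is built for.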
A snowflake schema describes a star schema structure normalized through the use of outrigger tables, i.e., dimension table hierarchies are broken into simpler tables.
In the Star Schema example, we had 4 dimensions (location, product, time, organization) and a fact table (sales).
In the Snowflake schema example diagram shown below, there are 4 dimension tables, 4 lookup tables, and 1 fact table. The reason is that the hierarchies (category, branch, state, and month) are broken out of the dimension tables (PRODUCT, ORGANIZATION, LOCATION, and TIME respectively) and shown separately. In OLAP, this snowflake approach increases the number of joins and results in poorer performance when retrieving data. Some organizations normalize the dimension tables to save space, but since dimension tables occupy comparatively little space, the snowflake schema approach is often avoided.
In general, an organization is started to earn money by selling a product or providing a service. An organization may operate from one place or may have several branches.
When we consider the example of an organization selling products throughout the world, the four major dimensions are product, location, time, and organization. Dimension tables are explained in detail under the section Dimensions. With this example, we will try to provide a detailed explanation of the STAR SCHEMA.
What is Star Schema?
Star Schema is a relational database schema for representing multidimensional data. It is the simplest form of data warehouse schema that contains one or more dimensions and fact tables. It is called a star schema because the entity-relationship diagram between the dimension and fact tables resembles a star, with one fact table connected to multiple dimensions. The center of the star schema consists of a large fact table, which points towards the dimension tables. The advantages of a star schema are easy slicing of data, improved performance, and easy understanding of the data.
Steps in designing Star Schema:
Identify a business process for analysis (like sales).
Identify measures or facts (sales dollar).
Identify the dimensions for the facts (product dimension, location dimension, time dimension, organization dimension).
List the columns that describe each dimension (for example, region name and branch name for the location dimension).
Determine the lowest level of summary in the fact table (sales dollar).
Important aspects of Star Schema & Snow Flake Schema:
In a star schema, every dimension has a primary key.
In a star schema, a dimension table does not have any parent table.
In a snowflake schema, a dimension table has one or more parent tables.
In a star schema, the hierarchies for the dimensions are stored in the dimension table itself.
In a snowflake schema, hierarchies are broken into separate tables. These hierarchies help to drill down through the data from the topmost level to the lowermost level.
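The parent-table point above can be sketched in code: in the snowflake form, the product dimension gains a parent lookup (outrigger) table for its category level. This is a hypothetical Python/SQLite illustration; the table and column names are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Snowflake: the category level of the product hierarchy is broken out
    -- into its own lookup table, which becomes the dimension's parent table.
    CREATE TABLE category_lookup (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    CREATE TABLE product_dim (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id  INTEGER NOT NULL REFERENCES category_lookup (category_id)
    );
    INSERT INTO category_lookup VALUES (1, 'Electronics');
    INSERT INTO product_dim     VALUES (1, 'Product1', 1);
""")

# Reaching the category now costs an extra join; in a star schema the
# category_name column would sit directly in product_dim.
row = conn.execute("""
    SELECT p.product_name, c.category_name
    FROM product_dim p JOIN category_lookup c ON p.category_id = c.category_id
""").fetchone()
```

That extra join per hierarchy level is the performance cost the snowflake schema pays for its normalization.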
Glossary:
Hierarchy: A logical structure that uses ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation; for example, in a time dimension, a hierarchy might be used to aggregate data from the Month level to the Quarter level, from the Quarter level to the Year level. A hierarchy can also be used to define a navigational drill path, regardless of whether the levels in the hierarchy represent aggregated totals or not.
Level: A position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the Month, Quarter, and Year levels.
Fact Table: A table in a star schema that contains facts and is connected to the dimension tables. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key made up of all of its foreign keys.
A fact table might contain either detail level facts or facts that have been aggregated (fact tables that contain aggregated facts are often instead called summary tables). A fact table usually contains facts with the same level of aggregation.
Example of Star Schema:
In the example, the sales fact table is connected to the dimensions location, product, time, and organization. This shows that data can be sliced across all dimensions and can also be aggregated across multiple dimensions. “Sales Dollar” in the sales fact table can be calculated across all dimensions independently or in a combined manner, as illustrated below.
Sales Dollar value for a particular product.
Sales Dollar value for a product in a location.
Sales Dollar value for a product in a year within a location.
Sales Dollar value for a product in a year within a location, sold or serviced by an employee.
Dimensions that change over time are called Slowly Changing Dimensions. For instance, a product price changes over time; People change their names for some reason; Country and State names may change over time. These are a few examples of Slowly Changing Dimensions since some changes are happening to them over a period of time.
Slowly Changing Dimensions are commonly categorized into three types: Type 1, Type 2, and Type 3. The following sections deal with how to capture and handle these changes over time.
The “Product” table below contains a product named Product1, with Product ID as the primary key. In the year 2004, the price of Product1 was $150, and over time Product1’s price changes from $150 to $350. With this information, let us explain the three types of Slowly Changing Dimensions.
Product Price in 2004:
Product ID | Year | Product Price
1          | 2004 | $150
Type 1: Overwriting the old values.
In the year 2005, if the price of the product changes to $250, then the old values of the “Year” and “Product Price” columns are updated and replaced with the new values. With Type 1, there is no way to find out the old 2004 price of Product1, since the table now contains only the new price and year information.
Product:
Product ID | Year | Product Price
1          | 2005 | $250
Type 2: Creating an additional record.
With Type 2, the old values are not replaced; instead, a new row containing the new values is added to the product table. So at any point in time, the difference between the old and new values can be retrieved and easily compared. This is very useful for reporting purposes.
Product:
Product ID | Year | Product Price
1          | 2004 | $150
1          | 2005 | $250
The problem with the above data structure is that “Product ID” cannot store duplicate values for Product1, since “Product ID” is the primary key. Also, the current data structure doesn’t specify the effective date and expiry date of each Product1 row, i.e., when the change to its price happened. So it is better to change the current data structure to overcome this primary key violation.
Product:
Product ID | Year | Product Price | Effective DateTime | Expiry DateTime
1          | 2004 | $150          | 01-01-2004         | 12-31-2004
1          | 2005 | $250          | 01-01-2005         | NULL
In the changed Product table structure, “Product ID” and “Effective DateTime” form a composite primary key, so there is no violation of the primary key constraint. The addition of the new columns “Effective DateTime” and “Expiry DateTime” provides information about each row’s effective and expiry dates, which adds clarity and enhances the scope of this table. The Type 2 approach may need additional space in the database, since an additional row has to be stored for every changed record. Since dimensions are not that big in the real world, the additional space is negligible.
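The Type 2 structure described above can be sketched as follows. This is a hypothetical Python/SQLite illustration; dates are stored as plain text and the snake_case column names are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        product_id         INTEGER NOT NULL,
        product_price      REAL NOT NULL,
        effective_datetime TEXT NOT NULL,
        expiry_datetime    TEXT,  -- NULL marks the current row
        PRIMARY KEY (product_id, effective_datetime)  -- composite key, no violation
    )
""")
# 2004 row for Product1 at $150.
conn.execute("INSERT INTO product VALUES (1, 150.0, '2004-01-01', NULL)")

# Price change in 2005: expire the current row, then add the new one.
conn.execute("""
    UPDATE product SET expiry_datetime = '2004-12-31'
    WHERE product_id = 1 AND expiry_datetime IS NULL
""")
conn.execute("INSERT INTO product VALUES (1, 250.0, '2005-01-01', NULL)")

# The full history is preserved, and the current price is easy to find.
history = conn.execute(
    "SELECT COUNT(*) FROM product WHERE product_id = 1").fetchone()[0]
current = conn.execute("""
    SELECT product_price FROM product
    WHERE product_id = 1 AND expiry_datetime IS NULL
""").fetchone()[0]
```

Every later price change repeats the same expire-then-insert pair, so the table accumulates one row per price the product has ever had.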
Type 3: Creating new fields.
With Type 3, only the latest change to the values can be seen. The example below illustrates how to add new columns and keep track of the change. From it, we are able to see the current price and the previous price of the product Product1.
Product:
Product ID | Current Year | Current Price | Old Year | Old Price
1          | 2005         | $250          | 2004     | $150
The problem with the Type 3 approach is that, over the years, if the product price changes continuously, the complete history is not stored; only the latest change is kept. For example, if Product1’s price changes to $350 in 2006, we would no longer be able to see the 2004 price, since the old values would have been overwritten with the 2005 product information.
In a relational data model, for normalization purposes, the year lookup, quarter lookup, month lookup, and week lookup are not merged into a single table. In dimensional data modeling (star schema), these tables are merged into a single table called the TIME DIMENSION, for performance and for slicing data.
This dimension helps to find the sales done on a daily, weekly, monthly, and yearly basis. We can perform trend analysis by comparing this year’s sales with the previous year’s, or this week’s sales with the previous week’s.
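A merged time dimension of this kind, and the year-over-year comparison it enables, can be sketched as follows. This is a hypothetical Python/SQLite illustration; the exact column set and sample figures are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Year, quarter, month, and week merged into one TIME dimension,
    -- one row per calendar day, instead of four normalized lookups.
    CREATE TABLE time_dim (
        time_key  INTEGER PRIMARY KEY,
        full_date TEXT NOT NULL,
        year      INTEGER NOT NULL,
        quarter   INTEGER NOT NULL,
        month     INTEGER NOT NULL,
        week      INTEGER NOT NULL
    );
    CREATE TABLE sales_fact (
        time_key     INTEGER NOT NULL REFERENCES time_dim (time_key),
        sales_dollar REAL NOT NULL
    );
    INSERT INTO time_dim VALUES
        (1, '2004-06-15', 2004, 2, 6, 25),
        (2, '2005-06-15', 2005, 2, 6, 24);
    INSERT INTO sales_fact VALUES (1, 100.0), (2, 175.0);
""")

# Trend analysis: total sales per year, sliced through the time dimension.
per_year = dict(conn.execute("""
    SELECT t.year, SUM(f.sales_dollar)
    FROM sales_fact f JOIN time_dim t ON f.time_key = t.time_key
    GROUP BY t.year
""").fetchall())
```

Swapping `t.year` for `t.week` or `t.month` in the GROUP BY gives the weekly or monthly comparison with no change to the schema, which is the payoff of the merged dimension.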