Database: Sample Data
With a simple example of designing a data model with ‘Employee related information’, most of the data structures can be easily understood. So we will try to design a data model using the sample data given below and will explain the data modeler’s involvement in the database environment.
Sample Source Data:
First Name | Middle Name | Last Name | SSN | Gender | Birth Date | Salary | Dept. No | Dept. Desc |
Melinda | J | Schipper | 765-43-21 | Female | 20-APR-1973 | 5000 | ||
Kevin | A | Schulte | 123-45-67 | Male | 13-OCT-1969 | 5000 | 10 | IT |
Note: If you go through our topics provided under the Data Modelling Section, you will have a brief idea about how to design a data model.
Database: Sample Data Analysis
For the sample data provided in the above table, as a Data Modeler you have to design logical data model, physical data model and generate DDL scripts. In order to do the above tasks, you need to create the following:
- Create 3 tables GENDER_LKP, DEPARTMENT_LKP, EMPLOYEE_DTL.
- Assign correct data types to the columns.
- Create constraints like primary key, unique key, null, not null, check to the columns.
- Assign correct PHYSICAL names for tables and columns.
- Select GENDER_LKP, DEPARTMENT_LKP as parent tables and EMPLOYEE_DTL as the child table and connect them.
- Even though column ‘DTTM_STMP’ has not been present in the sample data, you have to add ‘DTTM_STMP’ to all tables to know the date and time on which a record is inserted or updated.
GENDER_LKP Analysis:
- Since ‘male’ and ‘female’ values will be repeated for several records in EMPLOYEE_DTL table, you have to design this lookup for saving disk space. Sometimes the detail table may not show the exact business requirements also.
- E.g. Since the example data contains both genders; you know that there are two genders. What would have happened if all records were male? By seeing the sample data, you might have come to a conclusion that there are only ‘males’ allowed as employees. By discussing with Business Analyst, you would come to know that both genders are allowed. For this purpose, you design a lookup table, which identifies the business requirements also.
- You have to assign NOT NULL constraint for all columns since lookup tables in general should not have null value.
- You have to create a column Gender Code, which is not present in sample data and this column should be assigned Primary Key to validate the detailed data in EMPLOYEE_DTL table.
Sample Source Data:
Entity Name | Table
Name |
Att
Name |
Col
Name |
Data
type |
Null Option | P Key | F Key |
Gender Lookup | GENDER_LKP | Gender Code | GNDR_CD | VARCHAR2(10) | NOT NULL | Yes | No |
Gender Lookup | GENDER_LKP | Gender Description | GNDR_DESC | VARCHAR2(50) | NOT NULL | No | No |
Gender Lookup | GENDER_LKP | DateTime Stamp | DTTM_STMP | DATE | NOT NULL | No | No |
Analysis of DEPARTMENT_LKP:
- Since ‘Dept No’ and ‘Dept Desc’ values will be repeated for several records in EMPLOYEE_DTL table, you have to design this lookup for saving the disk space.
- You have to assign NOT NULL constraint for all columns. Generally lookups should not have null value.
- You have to assign ‘Dept No’ as the primary key to validate the detailed data in EMPLOYEE_DTL table.
Sample Source Data:
Entity Name | Table
Name |
Att
Name |
Col
Name |
Data
type |
Null Optn | P Key | F Key |
Department Lookup | DEPART
MENT_LKP |
Depart
ment Number |
DPTMT_NUM | NUMBER(2) | NOT NULL | Yes | No |
Department Lookup | DEPART
MENT_LKP |
Depart
ment Description |
DPTMT_DESC | VARCHAR2(50) | NOT NULL | No | No |
Department Lookup | DEPART
MENT_LKP |
DateTime Stamp | DTTM_STMP | DATE | NOT NULL | No | No |
Analysis of EMPLOYEE_DTL:
- As of now, this example entity contains only sample of 2 records. In real time, this entity would store millions of records. In order to retrieve the data in a faster way, you have to create an additional column EMP_DTL_ID in EMPLOYEE_DTL and assign it as primary key.
- In our example, column EMP_DTL_ID will be populated through sequence initiated by a trigger and you can see the sequence code and trigger code in later sections.
- Even though SSN can be added as a primary key in EMPLOYEE_DTL table, you have to add a new column EMP_DTL_ID as the primary key to ensure the fastest retrieval of data.
- Tip: Wherever it is required, you have to create new columns.
Sample Source Data:
Entity Name | Table
Name |
Att
Name |
Col
Name |
Data
type |
Null Option | P Key | F Key |
Employee Detail | EMPLOYEE
_DTL |
Employee Detail Identifier | EMP_DTL_ID | NUMBER | NOT NULL | Yes | No |
Employee Detail | EMPLOYEE
_DTL |
Department Number | DPTMT_NUM | NUMBER(2) | NULL | No | Yes |
Employee Detail | EMPLOYEE
_DTL |
Gender Code | GNDR_CD | VARCHAR2(10) | NOT NULL | No | Yes |
Employee Detail | EMPLOYEE
_DTL |
First Name | FRST_NM | VARCHAR2(30) | NOT NULL | No | No |
Employee Detail | EMPLOYEE
_DTL |
Last Name | LST_NM | VARCHAR2(30) | NOT NULL | No | No |
Employee Detail | EMPLOYEE
_DTL |
Middle Name | MDLE_NM | VARCHAR2(30) | NULL | No | No |
Employee Detail | EMPLOYEE
_DTL |
SSN | SSN | VARCHAR2(11) | NOT NULL | No | No |
Employee Detail | EMPLOYEE
_DTL |
Birth Date | BRTH_DT | DATE | NOT NULL | No | No |
Employee Detail | EMPLOYEE
_DTL |
Salary Amount | SLRY_AMT | NUMBER(7,2) | NOT NULL | No | No |
Employee Detail | EMPLOYEE
_DTL |
DateTime Stamp | DTTM_STMP | DATE | NOT NULL | No | No |