Categories: BI

Role of Metadata in ETL

Role of Metadata in ETL:

When you deal with a data warehouse, various phases like Business Process Modeling, Data Modeling, ETL, Reporting etc., are inter-related with each other and they do contain their own metadata. For example in ETL, it will be very difficult for one to extract, transform and load source data into a data warehouse, if there is no metadata available for the source like where and how to get the source data.

Let us explain the role of metadata in the ETL process with the help of an example table shown below which contains information about an organisation’s employees.

Employee NameEmployee AgeEmployee SalaryEmployee Title
John Hick36$3000Informatica Specialist

In the above table, the second row, containing information like John Hick, 36, $3000, Informatica Specialist are known as Data. Whereas the first row, (i.e) table header containing headings like Employee Name, Employee Age, Employee Salary, Employee Title are called as Metadata for the above said data.

An organization may be using data modeling tools, such as Erwin, Embarcadero, Oracle designer, Sybase Power Designer etc., for developing data models. Functional and technical team should have spent much time and effort in creating the data model’s data structures (tables, columns, data types, procedures, functions, triggers etc.,). By using metadata capturing tools, these data structures can be imported into metadata repository which we call it as metadata.

For example when you work with Informatica’s Metadata Exchange, it captures the metadata present in these tools and loads into the repository. There is no need for Informatica developer to create these data structures once again since metadata(data definitions) have been already captured and stored. Similarly most of the ETL tools have that capability to capture metadata from RDBMS, files, ERP, Applications etc.

In ETL, Metadata Repository is where all the metadata information about source, target, transformations, mapping, workflows, sessions etc., are stored.
From this repository, metadata can be manipulated, queried and retrieved with the help of wizards provided by metadata capturing tools. During the ETL process, when we are mapping source and target systems, we are actually mapping their metadata.

In any organization, a useful metadata often stored in a repository can be a handy resource to know about the organization’s information systems. Assume that each department in an organization may have a different business definitions, data types, attribute names for the same attribute or they may have a single business definition for many attributes. These anomalies can be overcome by properly maintaining metadata for these attributes in the centralized repository.

Thus metadata plays a vital role in explaining about how, why, where data can be found, retrieved, stored and used efficiently in an information management system.

learndmdwbi

View Comments

Share
Published by
learndmdwbi

Recent Posts

Oracle’s Database Dictionary Views

These SQL commands are related with Oracle's data dictionary and can be used to get…

8 months ago

Oracle important DDL Statements

important DDL Statements from Oracle like Commit, Rollback, Grant, Revoke etc..

8 months ago

Oracle Database Data Definition Language(DDL Statements)- DROP Object Commands

In this section, we will try to explain about important database DROP commands that are…

8 months ago

Oracle Database Data Manipulation Language (DML) Commands

In this section, we will try to explain about important database DML commands that are…

8 months ago

Oracle Database Data Definition Language(DDL Statements) – ALTER Commands:

In this section, we will try to explain about important database ALTER commands that are…

8 months ago

Oracle DDL Statements – CREATE Commands

An Oracle database consists of DDL commands, which are useful to create, modify and drop…

8 months ago