ETL Tool – Informatica – Part 1
Informatica is an ETL tool from Informatica Corporation, a leading provider of enterprise data integration software.
The important Informatica components are:
- Power Exchange
- Power Center
- Power Center Connect
- Power Channel
- Metadata Exchange
- Power Analyzer
- Super Glue
In Informatica, all metadata about source systems, target systems and transformations is stored in the Informatica repository. Informatica’s Power Center Client and Repository Server access this repository to store and retrieve metadata.
Source and Target:
Consider a bank that has many branches throughout the world. In each branch, data may be stored in different source systems such as Oracle, SQL Server, Teradata, etc. When the bank decides to integrate data from these sources to support management decisions, it may choose one or more systems such as Oracle, SQL Server, Teradata, etc. as its data warehouse target.
Many organisations prefer Informatica for this ETL process because it is well suited to designing and building data warehouses. It can connect to several sources and targets, extract metadata from them, and transform and load the data into target systems.
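As a rough illustration of the extract, transform and load flow described above (this is plain Python, not Informatica itself), the following sketch uses in-memory SQLite databases to stand in for two branch source systems and one warehouse target. The table and column names (`accounts`, `dw_accounts`, `branch`) and the sample rows are hypothetical.

```python
import sqlite3

def make_source(rows):
    """Create an in-memory database standing in for one branch's source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (account_id INTEGER, holder TEXT, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
    return conn

# Two hypothetical branch systems with slightly inconsistent data.
sources = {
    "london": make_source([(1, " alice ", 120.50), (2, "BOB", 80.00)]),
    "chennai": make_source([(3, "carol", 310.25)]),
}

# Hypothetical warehouse target.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE dw_accounts (branch TEXT, account_id INTEGER, holder TEXT, balance REAL)"
)

def extract(conn):
    """Extract: read all rows from a source system."""
    return conn.execute("SELECT account_id, holder, balance FROM accounts").fetchall()

def transform(branch, rows):
    """Transform: tag each row with its branch and normalize the holder name."""
    return [(branch, acc_id, holder.strip().title(), balance)
            for acc_id, holder, balance in rows]

def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.executemany("INSERT INTO dw_accounts VALUES (?, ?, ?, ?)", rows)
    conn.commit()

for branch, conn in sources.items():
    load(target, transform(branch, extract(conn)))

loaded = target.execute(
    "SELECT branch, account_id, holder, balance FROM dw_accounts ORDER BY account_id"
).fetchall()
```

Keeping the three steps as separate functions mirrors the way an ETL tool separates source definitions, transformation logic and target loading.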
Guidelines to work with Informatica Power Center:
Repository: This is where all the metadata in the Informatica suite is stored. The Power Center Client and the Repository Server access this repository to retrieve, store and manage metadata.
Power Center Client: The Informatica client is used for managing users, identifying source and target system definitions, creating mappings and mapplets, creating sessions, running workflows, etc.
Repository Server: The Repository Server takes care of all the connections between the repository and the Power Center Client.
Power Center Server: The Power Center Server extracts data from the sources and then loads it into the targets.
Designer: Source Analyzer, Mapping Designer and Warehouse Designer are tools that reside within the Designer.
Source Analyzer is used for extracting metadata from source systems.
Mapping Designer is used to create mappings between sources and targets. A mapping is a pictorial representation of the flow of data from source to target.
Warehouse Designer is used for extracting metadata from target systems; alternatively, target metadata can be created in the Designer itself.
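To make the ideas of "extracting metadata" and "mapping" more concrete, here is a minimal Python sketch. It is not how the Designer works internally; it only shows the two concepts under stated assumptions. The `customers` table, the column names and the helper functions `read_table_metadata` and `apply_mapping` are all hypothetical, and SQLite stands in for a real source system.

```python
import sqlite3

# Hypothetical source system with one table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (cust_id INTEGER, full_name TEXT, city TEXT)")

def read_table_metadata(conn, table):
    """Return (column name, declared type) pairs for a table.

    PRAGMA table_info is SQLite-specific; other databases expose
    similar information through their own catalog views.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]

source_metadata = read_table_metadata(source, "customers")

# A mapping described as data: target column -> (source column, transformation).
mapping = {
    "customer_key": ("cust_id", int),
    "customer_name": ("full_name", str.strip),
    "customer_city": ("city", str.upper),
}

def apply_mapping(mapping, source_row):
    """Build one target row (a dict) from one source row (a dict)."""
    return {target_col: func(source_row[source_col])
            for target_col, (source_col, func) in mapping.items()}

target_row = apply_mapping(
    mapping, {"cust_id": "42", "full_name": " Ada Lovelace ", "city": "london"}
)
```

The point of the sketch is that a mapping is itself metadata: a description of how each target column is derived from the source, kept separate from the data that flows through it.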
Data Cleansing: PowerCenter’s data cleansing technology improves data quality by validating, correctly naming and standardizing address data. A person’s address may not be the same in all source systems because of typos, and the postal code or city name may not match the rest of the address. These errors can be corrected by a data cleansing process, and the standardized data can then be loaded into the target systems (data warehouse).
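The kind of address standardization described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PowerCenter's cleansing logic; the `CITY_FIXES` lookup, the `clean_address` function and the sample record are hypothetical.

```python
import re

# Hypothetical lookup of known misspellings to canonical city names.
CITY_FIXES = {"lodnon": "London", "londn": "London", "new yrok": "New York"}

def clean_address(record):
    """Return a standardized copy of an address record (a dict)."""
    # Collapse repeated whitespace and use consistent capitalization.
    street = re.sub(r"\s+", " ", record["street"]).strip().title()
    # Correct known typos in the city name, otherwise just normalize the case.
    city_key = record["city"].strip().lower()
    city = CITY_FIXES.get(city_key, city_key.title())
    # Keep only letters and digits in the postal code, upper-cased.
    postal_code = re.sub(r"[^A-Za-z0-9]", "", record["postal_code"]).upper()
    return {"street": street, "city": city, "postal_code": postal_code}

raw = {"street": "  10  downing   street", "city": " Lodnon ", "postal_code": "sw1a-2aa"}
cleaned = clean_address(raw)
```

A real cleansing process would typically also validate the result against reference data such as a postal address directory, which this sketch does not attempt.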
Transformation: Transformations help to transform the source data according to the requirements of the target system. Sorting, filtering, aggregation and joining are some examples of transformations. Transformations help ensure the quality of the data being loaded into the target, and they are applied during the mapping process from source to target.
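The four transformation types named above can be illustrated with plain Python data structures. The sample rows and the names `transactions` and `accounts` are hypothetical; in Informatica these steps would be transformation objects in a mapping rather than hand-written code.

```python
from collections import defaultdict

# Hypothetical source rows.
transactions = [
    {"account_id": 2, "amount": 50.0},
    {"account_id": 1, "amount": 20.0},
    {"account_id": 1, "amount": -5.0},
    {"account_id": 2, "amount": 30.0},
]
accounts = {1: "Alice", 2: "Bob"}  # account_id -> holder

# Filtering: keep only positive amounts.
deposits = [t for t in transactions if t["amount"] > 0]

# Aggregation: total deposits per account.
totals = defaultdict(float)
for t in deposits:
    totals[t["account_id"]] += t["amount"]

# Joining: attach the holder name from the accounts lookup.
joined = [{"account_id": acc_id, "holder": accounts[acc_id], "total": total}
          for acc_id, total in totals.items()]

# Sorting: order the result by account_id.
result = sorted(joined, key=lambda row: row["account_id"])
```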
Workflow Manager: A workflow helps to load the data from source to target in a sequential manner. For example, if the fact tables are loaded before the lookup tables, the target system will raise an error because the fact table rows violate the foreign key constraint. To avoid this, workflows can be created to ensure the correct flow of data from source to target.
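The foreign key problem described above, and the way an ordered workflow avoids it, can be reproduced with a small Python sketch. SQLite stands in for the target system, and the tables `dim_branch` and `fact_deposit` and the step names are hypothetical. The ordering is computed with the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later), which is an assumption of this sketch and not a description of how Workflow Manager schedules tasks.

```python
import sqlite3
from graphlib import TopologicalSorter  # Python 3.9+

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE fact_deposit ("
    "deposit_id INTEGER PRIMARY KEY, "
    "branch_id INTEGER REFERENCES dim_branch(branch_id), "
    "amount REAL)"
)

# Loading the fact row first fails, because its branch does not exist yet.
try:
    conn.execute("INSERT INTO fact_deposit VALUES (1, 10, 99.0)")
    wrong_order_failed = False
except sqlite3.IntegrityError:
    wrong_order_failed = True

# Each load step lists the steps that must finish before it.
dependencies = {"load_fact_deposit": {"load_dim_branch"}, "load_dim_branch": set()}
steps = {
    "load_dim_branch": lambda: conn.execute("INSERT INTO dim_branch VALUES (10, 'London')"),
    "load_fact_deposit": lambda: conn.execute("INSERT INTO fact_deposit VALUES (1, 10, 99.0)"),
}

# Run the steps in dependency order: lookup table first, then the fact table.
run_order = list(TopologicalSorter(dependencies).static_order())
for step in run_order:
    steps[step]()

fact_rows = conn.execute("SELECT COUNT(*) FROM fact_deposit").fetchone()[0]
```

Declaring dependencies between load steps, rather than hard-coding the order, keeps the sequence correct as more tables are added.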
Workflow Monitor: The Workflow Monitor is used to monitor and track the workflows created in each Power Center Server.
Power Center Connect: This component helps to extract data and metadata from enterprise applications and middleware such as IBM’s MQSeries, PeopleSoft, SAP, Siebel, etc., and from other third-party applications.
Power Exchange: This component helps to extract data and metadata from enterprise applications and middleware such as IBM’s MQSeries, PeopleSoft, SAP, Siebel, etc., and from other third-party applications.