Thursday, 4 July 2013

Datastage Online Training | Online Datastage Training In Hyderabad

DATASTAGE


Decision support systems are usually built on Data Warehouse (DW) infrastructures. A data warehouse architecture has two major areas: the staging area and the presentation area.

Here we present the staging area. First, the sources from which data will be systematically extracted for loading into the DW are determined. The database schema documentation of these sources is reviewed in order to design the data extraction logic; the quality of that documentation influences how difficult this design is.

Extracted data are loaded into the staging area, either as simple files or as updates to database tables. The staging area may have several stages. Extracting data from the sources, transforming them into new structures and loading them into the DW, a process known as ETL (Extract, Transform, Load), takes place in the staging area.

The extraction process requires determining the source relational tables and fields from which data shall be extracted (as mentioned above, documentation of these structures is crucial for design).

Various types of raw data processing take place in the staging area:
  •     Data standardization: transformation of data to a standard format, if needed
  •     Sorting of records
  •     Matching and merging of records of the same entity that are derived from different sources (e.g. order records of the same customer from different order handling systems), after standardization
  •     Processing of calculated facts (facts derived from detailed data, e.g. the total monetary value of an order)
  •     Management of surrogate keys, which replace operational system keys
  •     Enrichment of records with default values, if required
  •     Production of aggregate data, if needed
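
As a rough illustration of a few of these processing steps (standardization, sorting, matching records of the same customer from two systems, a calculated fact and surrogate keys), here is a minimal Python sketch. It is not DataStage code; the record layout, the field names and the helpers `standardize`, `add_calculated_fact`, `assign_surrogate_keys` and `stage` are all assumptions made up for the example.

```python
from itertools import count

# Hypothetical order records from two order handling systems (illustrative data only).
SYSTEM_A = [{"customer": " ACME Corp ", "order_id": "A-1", "qty": 2, "unit_price": 10.0}]
SYSTEM_B = [{"customer": "acme corp", "order_id": "B-7", "qty": 1, "unit_price": 5.5}]


def standardize(record):
    """Bring the customer name to one standard format (single spaces, upper case)."""
    cleaned = dict(record)
    cleaned["customer"] = " ".join(record["customer"].split()).upper()
    return cleaned


def add_calculated_fact(record):
    """Derive the total monetary value of an order from its detail fields."""
    enriched = dict(record)
    enriched["total_value"] = round(record["qty"] * record["unit_price"], 2)
    return enriched


def assign_surrogate_keys(records, key_field="customer"):
    """Map each operational (natural) key to a generated integer surrogate key."""
    next_key = count(1)
    key_map = {}
    keyed = []
    for record in records:
        natural_key = record[key_field]
        if natural_key not in key_map:
            key_map[natural_key] = next(next_key)
        keyed.append({**record, "customer_sk": key_map[natural_key]})
    return keyed, key_map


def stage(*sources):
    """Standardize, derive facts, sort, and key the records of all sources."""
    records = [add_calculated_fact(standardize(r)) for src in sources for r in src]
    records.sort(key=lambda r: (r["customer"], r["order_id"]))
    return assign_surrogate_keys(records)


staged, key_map = stage(SYSTEM_A, SYSTEM_B)
for row in staged:
    print(row["customer_sk"], row["customer"], row["order_id"], row["total_value"])
```

Because both names standardize to the same value, the two orders end up under one surrogate key, which is what makes merging records of the same entity possible after standardization.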


Data are also converted according to the technological platform used by the DW (DBMS, operating system). The ETL process is automated by software and executed periodically to update the DW. The design of the extraction process determines the frequency of data extraction, the extraction method (e.g. changes only) and technology (e.g. partial database replication), and the database instance or file into which data are initially loaded in the staging area.
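
The "changes only" extraction method can be sketched as follows. This is only one possible approach, assuming the source rows carry a last-modified timestamp; the `updated_at` column, the sample rows and the `extract_changes` helper are hypothetical and not part of any specific tool.

```python
from datetime import datetime

# Hypothetical source rows; `updated_at` is an assumed change-tracking column.
SOURCE_ROWS = [
    {"id": 1, "status": "shipped", "updated_at": datetime(2013, 7, 1, 9, 0)},
    {"id": 2, "status": "open", "updated_at": datetime(2013, 7, 3, 14, 30)},
    {"id": 3, "status": "open", "updated_at": datetime(2013, 7, 4, 8, 15)},
]


def extract_changes(rows, last_run):
    """Return rows modified after the previous run, plus the new watermark."""
    changed = [row for row in rows if row["updated_at"] > last_run]
    new_watermark = max((row["updated_at"] for row in changed), default=last_run)
    return changed, new_watermark


# First periodic run: only rows changed since the stored watermark are extracted.
watermark = datetime(2013, 7, 2, 0, 0)
changed, watermark = extract_changes(SOURCE_ROWS, watermark)
print([row["id"] for row in changed], watermark)

# A second run with no new source changes extracts nothing.
changed_again, watermark = extract_changes(SOURCE_ROWS, watermark)
print(len(changed_again))
```

Storing the watermark between runs is what lets each periodic execution pick up where the previous one stopped.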

Moreover, the volume of data to be extracted is estimated, in order to plan for computational & storage capacity. Estimation sheets known as 'volumetric sheets' are developed with the following information per source field:
  •     extraction frequency
  •     estimated volume
  •     standardization and transformation rules applied (if any)
  •     DW database field to which data will be loaded.
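
A volumetric sheet of this kind can be modelled as one record per source field. The sketch below is a hedged illustration: the class name `VolumetricEntry`, the sample fields and figures, and the `avg_bytes_per_value` attribute (added here so a storage figure can be computed) are assumptions, not a standard layout.

```python
from dataclasses import dataclass


@dataclass
class VolumetricEntry:
    source_field: str            # hypothetical source field name
    extractions_per_year: int    # extraction frequency
    rows_per_extraction: int     # estimated volume per run
    avg_bytes_per_value: int     # assumption added for the size estimate
    transformation_rule: str     # standardization/transformation rule, if any
    target_field: str            # DW database field to which data will be loaded

    def yearly_bytes(self) -> int:
        """Estimated bytes extracted per year for this source field."""
        return self.extractions_per_year * self.rows_per_extraction * self.avg_bytes_per_value


# Illustrative figures only.
sheet = [
    VolumetricEntry("orders.amount", 365, 10_000, 8, "convert to EUR", "fact_order.amount_eur"),
    VolumetricEntry("customers.name", 52, 2_000, 40, "trim, upper case", "dim_customer.name"),
]

total_bytes = sum(entry.yearly_bytes() for entry in sheet)
print(f"Estimated yearly staging volume: {total_bytes / 1_000_000:.2f} MB")
```

Summing the per-field estimates gives a first figure for storage planning; computational capacity would need further inputs that are not modelled here.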


In many cases, data quality assessment and data cleansing steps also take place in the staging area. Design and implementation of the automated ETL process often represents a major part of the effort required to develop a DW (international statistics estimate that it exceeds 70% of the total effort). The DW staging area is often implemented on a separate physical server (staging server), which adds complexity and cost. However, this approach has certain advantages.