What is Bi-directional Extract
Bi-directional extract refers to the ability of a system to extract, cleanse, and transfer data in two directions among different database types, including hierarchical, networked, and relational databases.
This functionality is extremely useful in data warehousing projects. Data warehouse extraction is the process of retrieving all sorts of data, including unstructured and poorly structured data, from various data sources. This data is then further processed and migrated to other storage locations.
A notable process in a data warehouse is the Extract, Transform and Load (ETL) process, which extracts data from several sources, transforms the data to suit business needs, and loads it into the warehouse for use by the company.
Data extracts are taken from source systems. Each source system may use a different format, depending on the department or business segment it serves. Common source formats include flat files, relational databases, and non-relational structures such as IMS, VSAM, or ISAM.
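The three ETL stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV text, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; a real extract would read a file or a database.
SOURCE_CSV = "id,name,amount\n1,alice,10.50\n2,bob,7.25\n"

def extract(text):
    """Extract: read raw rows (all values are strings) from a flat-file source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert types and normalize names to fit the target schema."""
    return [(int(r["id"]), r["name"].title(), float(r["amount"])) for r in rows]

def load(conn, records):
    """Load: write the transformed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

warehouse = sqlite3.connect(":memory:")  # stand-in for the real warehouse
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Keeping each stage in its own function means a source format or a business rule can change without touching the other two stages.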
After the data is extracted, it goes to the transformation stage. During this phase, a series of rules and functions derived from business needs, requirements, and policies is applied to it. In this stage, some values are translated and encoded; for example, 1 may be assigned to male, 2 to female, and so on. Similar data from different sources is merged to avoid redundancy, which can slow system performance.
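As a rough sketch of these two transformation steps, the Python below encodes gender values using the 1/2 coding from the example and merges records from two sources. The source names (`crm`, `billing`), the `customer_id` key, and the "first occurrence wins" merge rule are assumptions made for illustration.

```python
GENDER_CODES = {"male": 1, "female": 2}  # coding taken from the example in the text

def encode_gender(value):
    """Translate a free-text gender value into its numeric code (None if unknown)."""
    return GENDER_CODES.get(value.strip().lower())

def merge_sources(*sources):
    """Merge records from several sources, keeping one record per customer_id."""
    merged = {}
    for source in sources:
        for record in source:
            # Assumed rule: the first occurrence wins, later duplicates are dropped.
            merged.setdefault(record["customer_id"], record)
    return list(merged.values())

# Hypothetical extracts from two departments with overlapping customers.
crm = [{"customer_id": 101, "gender": "Male"}, {"customer_id": 102, "gender": "female"}]
billing = [{"customer_id": 102, "gender": "Female"}, {"customer_id": 103, "gender": " male "}]

transformed = [
    {"customer_id": r["customer_id"], "gender": encode_gender(r["gender"])}
    for r in merge_sources(crm, billing)
]
```

In practice the merge rule is itself a business decision; a warehouse might instead prefer the most recently updated record or a designated system of record.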
Data cleansing may also take place during the transformation stage. In data cleansing, the extracted data is scrutinized so that corruption and inaccuracy can be detected and removed. This process makes sure that consistency is maintained. The actual process of data cleansing involves removing typographical errors and inconsistencies, as well as comparing and validating data entries against a company's list of known entities.
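One possible way to validate entries against a list of entities is fuzzy string matching. The sketch below uses Python's standard `difflib` module; the entity names, the `cleanse` function, and the 0.8 similarity cutoff are assumptions for illustration, not a recommended setting.

```python
import difflib

KNOWN_ENTITIES = ["Acme Corp", "Globex", "Initech"]  # hypothetical reference list
_LOOKUP = {name.lower(): name for name in KNOWN_ENTITIES}

def cleanse(raw_name, cutoff=0.8):
    """Return the canonical entity name, or None if no confident match exists."""
    key = " ".join(raw_name.split()).lower()  # collapse whitespace, ignore case
    if key in _LOOKUP:
        return _LOOKUP[key]
    # Fall back to the closest known name to catch typographical errors.
    close = difflib.get_close_matches(key, list(_LOOKUP), n=1, cutoff=cutoff)
    return _LOOKUP[close[0]] if close else None

raw = ["acme  corp ", "Globx", "Unknown Ltd"]
clean = [cleanse(name) for name in raw]
rejected = [name for name, match in zip(raw, clean) if match is None]
```

Entries that land in `rejected` would typically be routed to a review queue instead of being loaded, so that bad data never reaches the warehouse silently.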
Finally, once the data extracts are cleansed and transformed, they are loaded into the data warehouse. Loading times and procedures vary by company. Some companies load data every week, others every day. Very large companies with branches spread around the globe may load hourly.
For internet service companies such as search engines, the data warehouse process may be so complex that they need to load and update their data warehouse every minute or even every second. Complex data warehouse systems usually keep an audit and history trail of data changes.
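A history trail can be as simple as a second table that records the old and new value on every change. The following is a minimal sketch using an in-memory SQLite database; the `product` and `product_history` tables and the `upsert_price` helper are hypothetical names chosen for this example.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL)")
conn.execute(
    "CREATE TABLE product_history (id INTEGER, old_price REAL, new_price REAL, changed_at TEXT)"
)

def upsert_price(conn, product_id, new_price):
    """Insert or update a price and log the change in the history table."""
    row = conn.execute("SELECT price FROM product WHERE id = ?", (product_id,)).fetchone()
    old_price = row[0] if row else None
    if old_price == new_price:
        return  # nothing changed, so there is nothing to audit
    conn.execute(
        "INSERT OR REPLACE INTO product (id, price) VALUES (?, ?)", (product_id, new_price)
    )
    conn.execute(
        "INSERT INTO product_history VALUES (?, ?, ?, ?)",
        (product_id, old_price, new_price, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()

upsert_price(conn, 1, 9.99)   # first load: old price is NULL
upsert_price(conn, 1, 12.49)  # update: recorded with the previous price
upsert_price(conn, 1, 12.49)  # repeat load with no change: not recorded
history = conn.execute(
    "SELECT id, old_price, new_price FROM product_history ORDER BY rowid"
).fetchall()
```

Skipping unchanged values keeps the trail small when the same data is reloaded frequently, which matters at minute-level or second-level load intervals.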
A bi-directional data extraction process makes updates and data loading very fast. As companies get fresher news, trends, and patterns in their line of business, they can formulate new products, marketing strategies, business policies, and rules. This gives them an advantage over other companies in a very competitive field.
But using bi-directional data extraction processes also entails more investment, especially in the acquisition of more advanced and faster IT infrastructure. With simultaneous data extraction from several sources, usually involving disparate systems, companies have to invest in technologies that can withstand intensive processing loads.
Data extracts are one of the foundations of business intelligence applications and technologies, which help companies make better business decisions for competitive advantage. Data and information extrapolated from external environmental indicators can be used for spotting trends and patterns.
Comments
karthick said:
bi-directional is a data is used to transfer the data both side
