November 2009
Business Intelligence: November 2009  
Newcomp Solutions
Business Intelligence & ETL

Business intelligence (BI) is a tech-based tool that allows decision makers process information to gain insights that drive enterprise performance. However, the path from information to knowledge is not always linear. Typically, a series of steps, including collection, organization, evaluation, and transformation precede the data analysis and reporting that provide insight.

Issues relating to data access – the so-called ETL components – have significant implications for practically every phase of a BI project. ETL stands for extraction, transformation, and loading of data. The implementation of ETL depends on factors such the type of data, number of data sources, data volume, and the preferred mode of data transportation. This article highlights ETL in the context of a BI project.

E is for Extraction

Extraction involves identifying the relevant data sets and pulling them from one or more operational and transactional source systems. The size or volume of the data is a function of business operation. It is not unusual for businesses to generate huge megabytes of data from multiple sources on a daily basis. The choice of extraction method depends on the specifications of the data source systems and the target warehouse environment.

One of the challenges of the extraction phase is making sense of possible differences in the formats of the data source systems. Examples of commonly used formats include flat files, relational databases, and non-relational databases. The variety of formats necessitates the parsing of extracted data to determine their cohesiveness and conformity with underlying structures.

The data extraction method may be full or incremental. The full extraction method means that all currently available data are pulled completely from the source systems. This method assumes that the available data set is, indeed, complete. The incremental method involves the extraction of only the portion of data that has changed since a marked historical period of event. However, in order to utilize the incremental method, the data warehouse has to incorporate change capture techniques as part of the extraction process.

Other considerations in the extraction phase include whether or not to incorporate data transformation as part of the process, whether to extract the data directly from the source systems, or extract them from a staged area outside of the original source systems. None of these considerations is trivial.

T is for Transformation

The transformation phase of ETL involves preparing the extracted data for loading to the target database or databases. Some data may require little or no transformation, while other data sets will need to be restructured, cleaned, and organized before transportation. Typical types of data transformation include:

  • Re-mapping
  • Sorting
  • Filtering
  • Cleaning
  • Aggregation
  • Pivoting
  • Recoding
  • Standardization

SQL functionality makes several of these transformations relatively routine in most BI environment.

L is for Loading

Experts consider data loading from the source systems a simple task in the ETL process, assuming a successful completion of the extraction phase. Depending on the extraction method, data loading may overwrite existing data, be cumulative, or involve a blend of both. Each requires specific constraints, triggers, and audits to ensure the preservation of the integrity of the source data.

The most direct loading route involves moving the extracted data from a source system directly to a target database in a data warehouse. Other less straightforward transportation routes involve an intermediary (pre-warehouse) staging database or an additional (post-warehouse) database.

Tools of the Trade

Although relatively easy to compartmentalize, the actual implementation of ETL varies from enterprise to enterprise. In practically all cases, ETL is resource-intensive. This explains the evolution of ETL tools, which are intended to facilitate the processes.

Practically, all the top BI brands include ETL tools. High profile examples include:

  • Microsoft’s SQL Server Integration Services (SSIS)
  • Oracle’s Warehouse Builder (OWB)
  • IBM’s Manager WebSphere DataStage
  • SAP- Business Object’s Data Integrator

Conclusion

The foregoing are highlights of the ETL processes in a BI project. Any enterprise seriously considering BI deployment needs to commit significant amount of time to shop around. White papers will prove to be helpful research resource for the quick education of current and prospective BI users. Most vendors post these reports on their websites. However, less focus on the flashy side of BI will prove beneficial in identifying what is ultimately useful for the enterprise.

To visit Rachel's blog and original URL, click here.


 PRINTER FRIENDLY VERSION
Learn More
home
Learn More
Request a Meeting
Tell a Friend
About Us
About Us
Newcomp specializes in designing and implementing Business Intelligence, Planning and Data Management. A respected and award-winning team for more than ten years, Newcomp delivers premium services with an outstanding commitment to your successful project.

Contact us at 888-892-4276 or via email at info@newcomp.com

Join Newcomp for a Cognos Express Breakfast Seminar on December 11!
Le Meridien King Edward, 37 King Street East Toronto
Join us on December 11, 2009 at Le Meridien King Edward in Toronto for Breakfast & an exciting demonstration of IBM Cognos Express. IBM Cognos Express is an all-in-one Performance Management Platform purpose-built and priced for mid-size companies, delivering all the dashboard, reporting, planning and analysis capabilities you need.
Save 10% on Year-End Training!
Make your New Year’s Resolution to improve your IBM Cognos Skills! Newcomp is pleased to announce a 10% discount on any training booked before the end of the year. Some new courses you may not be aware of include: IBM Cognos Express Xcelerator (e-Learning), Web-Based Training Collection (10 titles at 75% off their individual prices!), Essentials for IBM Cognos 8 BI (a bundle of Classroom training and Web-Based Training). There are really too many to list – please email training@newcomp.com for more information!


Powered by IMN