Roughly three decades ago, a genuine need was felt in the market for a dedicated system to cater exclusively to reporting and analytical needs. This need stemmed from the performance bottlenecks that Online Transaction Processing (OLTP) systems faced when asked to analyze large volumes of data. Transactional systems are built on top of relational databases, which are optimized by design for efficient transaction processing.
Relational databases and entity-relationship modelling support four key transaction properties, ACID (Atomicity, Consistency, Isolation, Durability), which ensure that transactional systems keep data correct and fault tolerant. These guarantees allowed transactional systems to scale up to handle very large volumes of transaction processing. Applications such as banking, ticket booking, and sales logging depended on these assurances from the underlying system to protect data security and business integrity. If the database schema of any OLTP application is analyzed closely, one notices a large number of tables and the many relationships between them, i.e. a normalized design. Normalization keeps inserts and updates extremely fast, but retrieving information across multiple tables is slow because of the large number of joins that must be executed.
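To make the join overhead concrete, here is a minimal sketch of a normalized order-entry schema, using Python's built-in sqlite3 module. The table and column names are purely illustrative, not from any particular OLTP product; the point is that even a simple reporting question touches several tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A normalized (hypothetical) order-entry schema: each entity in its own table.
cur.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customers(id));
CREATE TABLE order_items (order_id INTEGER REFERENCES orders(id),
                          product_id INTEGER REFERENCES products(id),
                          qty INTEGER);
INSERT INTO customers VALUES (1, 'Acme Ltd');
INSERT INTO products  VALUES (10, 'Widget', 2.50);
INSERT INTO orders    VALUES (100, 1);
INSERT INTO order_items VALUES (100, 10, 4);
""")

# Even "total revenue per customer" needs three joins across four tables.
cur.execute("""
SELECT c.name, SUM(oi.qty * p.price) AS revenue
FROM customers c
JOIN orders o       ON o.customer_id = c.id
JOIN order_items oi ON oi.order_id   = o.id
JOIN products p     ON p.id          = oi.product_id
GROUP BY c.name
""")
result = cur.fetchall()
print(result)  # [('Acme Ltd', 10.0)]
```

Inserting a new order touches only one or two rows, which is why this layout suits transaction processing; it is the read side that pays for the joins.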
NEED FOR OLAP
Initially, analytical applications did not require independent systems because the volumes of both ad-hoc requests and data were minimal. The majority of requirements were pre-formatted reports scheduled to refresh during off-peak hours. These two factors ensured that reports and analysis did not put undue pressure on OLTP systems. As enterprises grew in scale and adopted multiple applications, business users began facing report delays due to system capacity and data-integration problems. Organizations also had silos of data that failed to give a complete picture of the state of the business. A separate system dedicated to reporting and analytics, designed to overcome the hurdles imposed by the OLTP architecture, was adopted, and OLAP systems were born. The most common complaints that led to implementing OLAP systems were:
- Reporting on the OLTP system was slow and not user friendly.
- On-the-fly ad-hoc analysis, such as drill-downs, was cumbersome.
- Even simple report requests needed expert IT help.
- Different applications were used for different purposes (e.g. finance and inventory management), and their data had to be merged to get a single consolidated view of information.
- This silo effect produced multiple versions of the truth, so the data could not be trusted.
In order to implement a successful OLAP system, three areas had to work in tandem: data modelling, data integration, and reporting. Many more activities are performed in each area; the list below captures only a subset of core functionalities.
- A multidimensional structure of dimensions and facts.
- A single data warehouse or a multiple-data-mart approach.
- A de-normalized version of the OLTP data, built around business subject areas.
- A layered architecture to accommodate pre-staging, staging, integration, data harmonization, ODS, and data marts.
- Clear differentiation of master and transactional data, with data-load and refresh logic built accordingly.
- Integration of data from different sources, both from a technical perspective (e.g. files and RDBMS) and from an application perspective (e.g. CRM and ERP).
- Identification of deltas, slowly changing dimensions, cross references, pre-calculations, etc.
- Periodic batch data-load jobs to keep OLAP and OLTP in sync.
- Canned vs. ad-hoc reports, where canned reports are scheduled to run at off-peak hours so fresh data is ready for users.
- More flexibility in the hands of the user to perform analysis on a single, subject-area-oriented data mart.
- Visual analysis of data, which became a mainstream component of ad-hoc analysis.
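The dimension-and-fact structure in the list above can be sketched as a small star schema, again with sqlite3. The table names (dim_date, dim_product, fact_sales) and sample figures are illustrative assumptions; the point is that each dimension is one join away from the fact table, in contrast to the long join chains of a normalized OLTP schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star schema: one central fact table, small denormalized dimension tables.
cur.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales  (date_key INTEGER, product_key INTEGER,
                          qty INTEGER, revenue REAL);
INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO fact_sales VALUES
  (20240101, 1, 4, 10.0), (20240101, 2, 1, 5.0), (20240201, 1, 2, 5.0);
""")

# Drill down from category to month: exactly one join per dimension.
cur.execute("""
SELECT d.year, d.month, p.category, SUM(f.revenue)
FROM fact_sales f
JOIN dim_date d    ON d.date_key    = f.date_key
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY d.year, d.month, p.category
ORDER BY d.year, d.month
""")
rows = cur.fetchall()
print(rows)  # [(2024, 1, 'Hardware', 15.0), (2024, 2, 'Hardware', 5.0)]
```

Drilling further (to product, say) only means adding a column to the SELECT and GROUP BY, which is what makes ad-hoc slice-and-dice analysis practical on this layout.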
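The slowly changing dimensions mentioned in the list can also be sketched briefly. This is a hedged illustration of the common "Type 2" approach, where a changed attribute closes the current dimension row and appends a new versioned row instead of overwriting history; the row layout and function name are assumptions for the example, not a standard API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class DimCustomerRow:
    customer_id: int          # business key from the source system
    city: str                 # the tracked attribute
    valid_from: date
    valid_to: Optional[date]  # None means the row is still open
    is_current: bool

def apply_scd2(dim: list, customer_id: int,
               new_city: str, load_date: date) -> None:
    """Close the current row and append a new one if the attribute changed."""
    for row in dim:
        if row.customer_id == customer_id and row.is_current:
            if row.city == new_city:
                return  # no change detected in this load, nothing to do
            row.valid_to = load_date
            row.is_current = False
            break
    dim.append(DimCustomerRow(customer_id, new_city, load_date, None, True))

dim = [DimCustomerRow(1, "Chennai", date(2020, 1, 1), None, True)]
apply_scd2(dim, 1, "Bengaluru", date(2021, 6, 1))

# History is preserved: the closed Chennai row plus the current Bengaluru row.
print([(r.city, r.is_current) for r in dim])
# [('Chennai', False), ('Bengaluru', True)]
```

Because old rows are kept with their validity dates, facts loaded before the change can still join to the customer attributes that were true at the time.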
In recent years, several technologies have emerged that are reducing the gaps between OLTP and OLAP and paving the way for a consolidated system.