On Line Analytical Processing comes in two distinct flavors, which are Multi-dimensional OLAP and Relational OLAP.
MOLAP, as the name indicates, uses a star schema (or) snow flake schema as the base data model for building the data warehouse. Data is loaded into the warehouse through ETL jobs on regular basis and all reporting tasks refer to the staged data available in data warehouse (E.g. SAP Business Warehouse, SQL Server Analysis Services).
In this approach, majority of complexity lies in the ETL staging data flow(s). ETL tools have to accommodate all business rules logic as well as technical hurdles that come in the way of maintaining a consistent data warehouse. Large projects will have multiple layers incorporated into ETL data flow such as staging layer, integration layer, harmonization layer, datamart layer etc before final data is updated in the warehouse.
ROLAP on the other hand does not rely on a dedicated warehouse, but instead creates a semantic layer that will expose underlying data directly to reporting layer. The semantic layer acts as a middle man to generate SQL queries that are in turn passed on to underlying source systems. Data is fetched and directly sent to reporting layer for further processing (E.g. SAP Business Objects, Microstrategy). Biggest challenge in this approach lies in designing effective semantic layer that addresses all reporting requirements and do not lag in performance. ROLAP tools inherit dependencis based on underlying transactional system architecture and limitations that arise due to complex SQL statements. ROLAP tools do not generate efficient SQL queries for complex scenarios and impact on reporting layer performance can be visibly seen. (In certain cases, the complex SQL statement is broken down to multiple sequential steps and dataset is processed and stored in multiple temporary tables for each stage before collating the final output)
So, what is the most commonly adopted architecture? Since both architectures have their own advantages and disadvantages, a mixed approach is always adopted. ROLAP has a niche when direct operational reporting is required whereas MOLAP has edge when large volumes of data is used from analysis perspective.
All BI projects today adopt a mix of both methodologies i.e. Hybrid Architecture. It is not uncommon to find some areas such as Sales and CRM using data warehouse approach whereas other areas like Finance and Accounts use direct reporting functionality.
Performance is one key element that comes to forefront and holds sway over all design factors because it is the most critical component for BI adoption. Additionally other factors that influence hybrid approach are
- Business Complexity
- Business Rules / Calculation Logic
- Landscape Diversity
- Volume and Variety of end consumers
- Diversity of reporting requirements
Behind the scenes, there are other factors that come into play such as Human Factor (e.g. user preference for Excel integration) or Financial Considerations (e.g. Licenses and Hardware availability) or Corporate/Contractual Policy etc. Additionally, in real-time scenarios it is impossible to choose one architecture over other that covers entire breadth of all reporting scenarios.
Hybrid approach is not rigid and the architecture can be tweaked to fit the actual requirements of a project, which is the main reason for its wide spread adoption.