Outrigger dimensions are, as the name suggests, is part of dimensional data in a data warehouse. “Outriggers” are performance improvement feature that enables us to build better optimized data models. An outrigger dimension is typically employed when a dimension table grows large i.e. in terms of number of columns. The large dimension table is broken down into manageable chunks based on data relationships, analysis needs and built into the data warehouse. Outrigger scenarios can also arise with two different dimensions referencing each other along with fact data e.g. Customer and Geography are two dimensions where Customer dimension is linked with Geography based on Postal code.
One can visualize the concept in this manner. Whenever dimension table size in terms of record count in creases drastically, degenerative dimension should be implemented. A degenerative dimension moves the field from dimension table to fact table and thus keeps the dimension table at manageable size. If the dimension table grows in size horizontally, in terms of columns, outrigger dimensions can help. One has to be careful with several dependencies before employing outriggers to break a dimension table. One of the table will act as primary dimension table and the rest will be secondary. Only the primary table is linked with fact data. This approach inherently adds to snow flaking of underlying schema, which can be good and bad if not handled judiciously. As a positive side effect, when outriggers are created, it helps in expanding the scope of Conformed Dimensions within a data warehouse.
It the example of outrigger dimension i.e. Customer vs Demographics, attributes that qualify for outrigger can be logically grouped based on some criteria. Typically the following characteristics of an attribute are considered before flagging it as an outrigger.
- Data Granularity & Carnality: Fields that are high in granularity and low in carnality are good candidates.
- Slowly Changing Type: If attribute is part of SCD flagged items, it cannot be moved our of primary dimension table.
- Update Frequency: Data that gets loaded only monthly e.g. data sourced from market research firms versus master data from internal system.
- Source of the actual data: Similar to previous point, wherein the customer dimension is built based on an internal CRM system and demographics data is sourced via flat file from a vendor.
- User analytical needs: If majority of users require only customer based reports on daily basis, whereas the demographic related analysis is carried out quarterly.
Outrigger dimensions are implemented in almost all projects without realizing the same in a data model. Although snow flake schema has its minuses, it is the default choice for any data warehouse project today. Only when performance takes a hit, data models are tuned, where attributes are shuffled to create better models.
In SAP BW, attributes marked as "Navigation Attributes" of an object can be seen as outriggers. The technical implementation of this does not conform to the ideas discussed above, but conceptually it works in a similar fashion.