Data Model – Date, Time & Timestamp

videoblocks.com

It is common practice to keep Date and Time as two separate dimensions with pre-loaded data. Timestamp dimension is not a good data model design because it adds to complexity of data model maintenance by introducing an ever growing dimension table which leads to degeneration issues due to grain. Another problem arises when timezone conversions and clock switch takes place (e.g. Daylight Savings) and it is best to avoid from design and implementation perspective.

Majority of analytical scenarios would require only date and time up to second level granularity, for which a Timestamp dimension less model would suffice. There are specific use cases where measures are calculated that span multiple time stamps, for example a call center. 

Consider a case of CRM system based Contact Center management scenario where KPI’s are computed based on day to day operations with sample KPI’s as follows

  • Number of Calls: A measure of volume handled by the contact center
    • Standard Date & Time Dimension would suffice 
  • Mean Time for Resolution (MTR): A measure of time taken to resolve the issue by support personnel
    • A call could span few minutes or hours or across days. At times, multiple calls might be required to fully resolve an issue during which multiple timestamps may be recorded

One can pre-compute the KPI’s, which is a quick an easy solution to implement. When the data model is released for self service / data exploration, the lack of all these important attributes for analysis will become apparent. 

Timestamp fits all the characteristics of an attribute, but cannot be included as a dimension due to the complexities discussed previously. There are two options to accommodate such requirements into the data model. 

  • Fact Table Extension
    • Quick and dirty method where the time stamp fields are added as additional columns in fact table
    • Not a good design practice and forces the data model to be rigid and hard to modify in future
  • Factless Fact Table
    • Create a child table of the main fact table, which will act like degenerated dimension 
    • The secondary fact table holds only the timestamp data that will be used for analysis (e.g. Created, Assigned, Started, Tested, Closed)
    • In case there are multiple calls for an issue, the secondary fact table can hold those details too. Here, the data can be transposed into multiple columns 

In conclusion, timestamps cannot be handled either as a Dimension or Fact and calculations cannot be delegated to database always. Based on the business process and measures, it is imperative to build a data model that is flexible enough to hold the data, which will be used by reporting layer for in-depth analysis.