In a DW/BI implementation project, after the first round of Requirements gathering from users is complete, sketching the data model is started. Having a well-constructed data model during the early stages of the project is critical because the ETL, DBA, and Reporting teams depend on it for their deliverables.
During this phase of design, the data analyst/architect identifies Strong & Weak entities in order to build dimension tables. In the case of standardized OLTP systems such as Enterprise Resource Planning, and Customer Relationship Management from well-known vendors, this task is easier to accomplish due to the nature of common design adopted across multiple vendors. While building a data model, all user requirements must be met without many iterations heading back to the drawing board.
Cardinality and Granularity play an important role in defining and designing dimensions. These affect the outcome of the Star or Snowflake schema adopted as well as other design aspects such as Dimension Conformity, Outrigger Dimensions, and Degenerative Dimensions.
Cardinality
This refers to the number of unique items within a field in a table. For example, Employee is an entity for which Gender is a low cardinality item, whereas Tax Identification Number is a high cardinality item. Gender When looked at from the perspective of the table as a whole, adding low or high cardinality fields will impact the size, but it is the analytic need that should dictate the inclusion of an attribute. One can observe that aggregated queries are much faster while executed on low cardinality attributes.
One key functionality of cardinality is that it helps in defining relationships i.e., one-to-one, one-to-many, etc. across entities.
Granularity
Data grain refers to the amount of detail with respect to an entity. consider the Employee example seen previously, if the data model captures only Employee ID, Name, Department, Manager & Salary it is a dimension with low grain. If the dimension contains Employee ID, First Name, Middle Name, Last Name, Suffix, Salutation, Gender, Marital Status, No of Dependents, Designation, Department, Manager, Salary, or Tax ID, etc., is a high grain dimension.
In conclusion, both concepts are intertwined during implementation. Since both refer to the quantum of data, it conveys an image of interrelated concepts. In reality, both are independent concepts and play different roles while designing data models.