2.1.1 Source & Row Quality
What are the most common questions that users ask about data? How about:
  • What is the source of the data?
  • What is the quality of the data?
I think it’s both. That is probably why these two columns are in every table in PPDM, each as a foreign key back to its reference table: R_SOURCE and R_PPDM_ROW_QUALITY. Let’s look at the definitions of these two terms:
Definition of Source:
The individual or organization that has been identified as the originator of this row of information.
Definition of Row Quality:
A set of values that indicate the quality of the data in the row as defined by the source of the row.
I think that a data model in which every row can answer questions about the source and quality of its data is outstanding, and it is one of the basic tenets modeled into PPDM.
Let’s look at an example of what happens when a row of data is stored with a value in either the Source or Row_Quality column. By default, the RDBMS checks the value against the reference table (based on the foreign key definition) to ensure that the value exists there. Suppose you are trying to store a value of ‘Wes’ as the source, and the value stored in the R_SOURCE table is ‘WES’. If ‘Wes’ is used in the SQL statement like this:
[Graphic: SQL statement that uses ‘Wes’ as the source value]
When the insert is attempted, the database returns an error message showing that the value is invalid, like this:
[Graphic: error message returned by the database]
This tells the user that the value ‘Wes’ is not stored in R_SOURCE. A SQL query against R_SOURCE shows that the value is stored as ‘WES’.
[Graphic: result of the SQL query against R_SOURCE]
A simple change of ‘Wes’ to ‘WES’ allows the row to be added. The important point is that multiple variants of ‘Wes’ are never stored in the database. Any SQL query that needs to find all rows sourced by ‘WES’ only has to look for ‘WES’, not for every combination such as ‘Wes’, ‘wes’, ‘WeS’, or ‘wES’. Simple as it is, this example shows the value of using reference tables and having referential integrity in a data model.
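The ‘Wes’ versus ‘WES’ example can be tried out with a small sketch. This one uses Python's built-in sqlite3 module rather than a full PPDM installation, so treat it as an illustration only. R_SOURCE is the real reference table name, but its single SOURCE column, the WELL_DEMO table, and the UWI value are simplified stand-ins that I am assuming for the example.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so a rejected
# statement does not undo the rows inserted before it.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Simplified stand-ins for the PPDM tables (column layout is an assumption).
conn.execute("CREATE TABLE R_SOURCE (SOURCE TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE WELL_DEMO ("
    " UWI TEXT PRIMARY KEY,"
    " SOURCE TEXT REFERENCES R_SOURCE (SOURCE))"
)
conn.execute("INSERT INTO R_SOURCE (SOURCE) VALUES ('WES')")


def try_insert(uwi, source):
    """Return True if the row is accepted, False if the foreign key rejects it."""
    try:
        conn.execute(
            "INSERT INTO WELL_DEMO (UWI, SOURCE) VALUES (?, ?)", (uwi, source)
        )
        return True
    except sqlite3.IntegrityError:
        return False


rejected = try_insert("DEMO-001", "Wes")  # 'Wes' is not a row in R_SOURCE
accepted = try_insert("DEMO-001", "WES")  # exact match on the stored value

# Only one spelling can exist, so one predicate finds every row from this source.
rows = conn.execute("SELECT UWI FROM WELL_DEMO WHERE SOURCE = 'WES'").fetchall()
```

Here `try_insert` returns False for ‘Wes’ and True for ‘WES’. Whether two spellings that differ only in case count as different values depends on the database's collation settings; SQLite compares text case-sensitively by default, which matches the behavior described above.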
Note that these two tables must be populated before any other rows of data can be loaded into PPDM v3.8. They are the core of the data model, so take care when adding data to them or deleting data from them.
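The load order and the need for care when deleting can be sketched the same way. Again this is a minimal illustration using Python's sqlite3 module, not PPDM itself: the column layouts, the WELL_DEMO table, and the quality value ‘GOOD’ are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Simplified stand-ins for the two reference tables and one data table.
conn.execute("CREATE TABLE R_SOURCE (SOURCE TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE R_PPDM_ROW_QUALITY (ROW_QUALITY TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE WELL_DEMO ("
    " UWI TEXT PRIMARY KEY,"
    " SOURCE TEXT REFERENCES R_SOURCE (SOURCE),"
    " ROW_QUALITY TEXT REFERENCES R_PPDM_ROW_QUALITY (ROW_QUALITY))"
)


def run(sql, params=()):
    """Execute one statement and report whether referential integrity allowed it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


insert_well = "INSERT INTO WELL_DEMO (UWI, SOURCE, ROW_QUALITY) VALUES (?, ?, ?)"
row = ("DEMO-001", "WES", "GOOD")  # hypothetical values

before = run(insert_well, row)  # reference tables are still empty
run("INSERT INTO R_SOURCE (SOURCE) VALUES ('WES')")
run("INSERT INTO R_PPDM_ROW_QUALITY (ROW_QUALITY) VALUES ('GOOD')")
after = run(insert_well, row)  # both referenced values now exist

# Deleting a reference value that a data row still uses is blocked.
delete_in_use = run("DELETE FROM R_SOURCE WHERE SOURCE = 'WES'")
```

In this sketch the first insert is rejected, the same insert succeeds once both reference tables hold the values it points to, and the delete is rejected because a data row still refers to ‘WES’.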