Skip to main content

Development

Teradata Temporal Feature in Data Warehouse Design

If you have read books or articles written by Ralph Kimball, you should know about basic components and process to complete a dimensional modeling for enterprise data warehouse (EDW). When an architect is designing on DW, he would have to think about the slowly change dimension (SCD) approach that will best suit the business need and mainstream data structure in the transaction system. We can go over the various typical SCD type.

SCD Type 1

The new record will overwrite the existing one directly without tracking history information. This is unusual in the dimensional design for most business cases.

SCD Type 2

It will track for historical records by incorporating 2 columns of Eff_Start_Date/Eff_End_Date. When do record update, it will physically insert a new record and populate Eff_Start_Date as today. At the same time, the Eff_End_Date of old one was set to today. This mechanism ensure the active record is flagged correctly and old record is kept in warehouse.

99999

SCD Type 3

This approach will track changes using separate columns and keep history by adding additional column such as Original_Cell_Phone_Number. I personally don’t recommend this way.

SCD Type 4

This method is a bit similar to type 2 but it will have 2 tables. One is to store current state of record information and another one is keeping historical records. Once a new employee is created, the records need to be inserted to both tables. This method is widely used in the DW design work.

Teradata Temporal Feature

Teradata has been playing an important role in DB/DW market for its prebuilt solution for PB level data volume. The temporal feature is available in Teradata version 13.10 and on forward. To know about this feature, we would need to look at the new field type of PERIOD that represents span of date. It has a beginning bound (defined by the value of a beginning element) and an ending bound (defined by the value of an ending element). Beginning and ending elements can be DATE, TIME, or TIMESTAMP types, but both must be the same type.                                                                                         If we are to create a temporal feature based table:

CREATE MULTISET TABLE Employee(Surrogate_Key INTEGER,                                                                                             Employee_Num VARCHAR(50) NOT NULL,                                                                                                                                                 Employee_Name VARCHAR(200),                                                                                                                                                          Cell_Phone_Number VARCHAR(50),                                                                                                                                          Effective_Date PERIOD(DATE) NOT NULL AS Date)                                                                                                                            PRIMARY INDEX(Surrogate_Key);

If we are to insert a new record into this temporal feature enable table:

INSERT INTO Employee                                                                                                                                                                            (Surrogate_Key, Employee_Num, Employee_Name, Cell_Phone_Number, Effective_Date)                                             VALUES (100, ‘99999999999’, ‘Employee’, ‘888-888-888-888’,                                                                                                                PERIOD(DATE ‘2009-10-01’, UNTIL_CHANGED));

In the DW design model, SCD type 2 is quite popular to track and do analysis on historical information. In most cases, business dimension (employee) is only valid in a specified period and gets expired at point-in-time so Taradata PERIOD data type can exactly satisfy the need.

In Teradata database, there are 2 important concepts for bi-temporal table: Transaction-Time, Valid-Time. Transaction time and valid time are independent time dimensions. A table can have either type of column, both, or neither. If you want to learn more, recommend you to read an excellent document called “A Case Study of Temporal Data” which can be downloaded from Teradata website.

Similar to other data type such as Integer, Varchar, Char, Date etc, there are a set of DML including query language for PERIOD and it is not difficult to grasp for DW designer and developer. That’s also one of reasons why I had interest in it.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Kent Jiang

Currently I was working in Perficient China GDC located in Hangzhou as a Lead Technical Consultant. I have been with 8 years experience in IT industry across Java, CRM and BI technologies. My interested tech area includes business analytic s, project planning, MDM, quality assurance etc

More from this Author

Follow Us