tracking your data across the fourth dimension
“A temporal database is a database with built-in support for handling data involving time…”
–Wikipedia
“Tonight @JCook21 explained temporal databases and I’m sure my brain is now leaking out of my nose.”
– Jeff Carouth, https://twitter.com/jcarouth/status/496842218674470912
Databases are good at ‘now’
❖ Create
❖ Read
❖ Update
❖ Delete
❖ At any point we only see the current state of the data
Databases are good at ‘now’
❖ How many people work in each department of the company?
❖ For each product category, how many products are in stock? Where is the stock located?
❖ How many orders are currently in each fulfilment state?
The fourth dimension
❖ Show me how salaries paid have changed by department for each quarter over the last 4 years and how they’re forecast to change next year
❖ Show me how stock levels have changed over time. How much stock are we forecast to have at any point in the future?
❖ For audit purposes show me a complete history of every change to this data, what period of time each change was valid for and when we knew about any changes
Decision Time
❖ Records the time at which a decision was made
❖ Modelled as a single value
❖ Allows for granularity through the data type used
Decision Time
EmpId  Name    Hire date   Decision to hire
1      Jeremy  2014-03-03  2014-01-20
2      Anna    2015-01-02  2013-12-15
3      Yann    2013-08-20  2013-08-20
Valid Time
“In temporal databases, valid time (VT) is the time period during which a database fact is valid in the
modelled reality.”
–Wikipedia
Valid Time
❖ Modelled as a period of time between two dates
❖ Lower bound is always closed but upper bound can be open
Valid Time
EmpId  Name    Hire date   Termination date
1      Jeremy  2014-03-03  2015-01-20
2      Anna    2015-01-02  ∞
3      Yann    2013-08-20  2015-12-22
4      Colin   2015-05-01  ∞
Valid Time
EmpId  Name    Dept  Hire date   Term date   StartVT     EndVT
1      Jeremy  Dev   2014-03-03  ∞           2014-03-03  2014-07-30
1      Jeremy  QA    2014-03-03  2015-01-20  2014-07-31  2015-01-20
2      Anna    Dev   2015-01-02  ∞           2015-01-02  2015-01-30
2      Anna    Mgmt  2015-01-02  ∞           2015-01-31  ∞
3      Yann    Mgmt  2013-08-20  2015-12-22  2013-08-20  ∞
4      Colin   Dev   2015-05-01  ∞           2015-05-01  ∞
Valid-time on its own may not be enough!
Name    Type    StartVT                EndVT
Saturn  Planet  Billions of years ago  ∞
Pluto   Planet  Billions of years ago  ∞
Valid-time on its own may not be enough!
Name    Type          StartVT                EndVT
Saturn  Planet        Billions of years ago  ∞
Pluto   Dwarf planet  Billions of years ago  ∞
Valid-time on its own may not be enough!
Name    Type     StartVT                EndVT
Saturn  Planet   Billions of years ago  ∞
Pluto   Plutoid  Billions of years ago  ∞
Valid-time on its own may not be enough!
Name    Type          StartVT                EndVT
Saturn  Planet        Billions of years ago  ∞
Pluto   Planet        Billions of years ago  2006
Pluto   Dwarf planet  2006                   2008
Pluto   Plutoid       2008                   ∞
Transaction Time
“In temporal databases, transaction time (TT) is the time period during which a fact stored in the
database is considered to be true.”
–Wikipedia
Transaction Time
❖ Modelled as a period of time between two dates
❖ Lower bound is always closed but upper bound can be open
Transaction Time
Name   Type          StartVT                EndVT  StartTT  EndTT
Pluto  Planet        Billions of years ago  ∞      1930     2006
Pluto  Dwarf planet  Billions of years ago  ∞      2006     2008
Pluto  Plutoid       Billions of years ago  ∞      2008     ∞
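To make the table above concrete, here is a minimal Python sketch of a transaction time lookup: "what did the database say about Pluto in a given year?". The `Fact` tuple and `believed_type` function are hypothetical names for illustration. The sketch assumes the StartTT/EndTT pairs are closed-open (2006 belongs to the "Dwarf planet" row, not the "Planet" row) and uses `None` for ∞.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    name: str
    type: str
    start_tt: int           # year the database started recording this fact
    end_tt: Optional[int]   # year it stopped; None stands for the open bound (∞)


# Rows from the table above. Valid time is left out because it is
# identical on every row.
FACTS = [
    Fact("Pluto", "Planet", 1930, 2006),
    Fact("Pluto", "Dwarf planet", 2006, 2008),
    Fact("Pluto", "Plutoid", 2008, None),
]


def believed_type(name: str, year: int) -> Optional[str]:
    """Return the type recorded for `name` as of `year`, or None."""
    for fact in FACTS:
        started = fact.start_tt <= year
        not_ended = fact.end_tt is None or year < fact.end_tt
        if fact.name == name and started and not_ended:
            return fact.type
    return None
```

Asking for 2007 returns the "Dwarf planet" row, even though the current row says "Plutoid". Nothing was overwritten; the older belief was just closed off.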
Valid Time != Transaction Time
Name              Clothing  StartVT          EndVT  StartTT  EndTT
Father Christmas  null      A long time ago  ∞      1973     1975
Santa Claus       red       A long time ago  ∞      1975     1980
Saint Nicholas    red       270 AD           ∞      1980     1982
How many temporal aspects should you use?
❖ As many or few as your application needs!
❖ Tables that implement two aspects are bi-temporal
❖ You can implement more aspects, in which case you have multi-temporal tables
Is your head spinning?
❖ Decision time records when a decision was taken
❖ Valid Time records the period of time for which the fact is valid
❖ Transaction Time records the period of time for which the fact is considered to be true
A note on the example tables
CREATE TABLE dept (
    DNo INTEGER,
    DName VARCHAR(255)
);
CREATE TABLE emp (
    ENo INTEGER,
    EName VARCHAR(255),
    EDept INTEGER
);
Periods
❖ Table component, capturing a pair of columns defining a start and end date
❖ Not a new data type, but metadata about columns in the table
❖ Closed-open constraint
❖ Enforces that end time > start time
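As a rough illustration of the period rules above, the Python sketch below models a period as a start/end pair with closed-open semantics and rejects a period whose end is not later than its start. The `Period` class is a hypothetical helper, not anything defined by SQL:2011.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Period:
    start: date  # included in the period (closed bound)
    end: date    # excluded from the period (open bound)

    def __post_init__(self) -> None:
        # Mirrors the constraint that end time > start time.
        if not self.end > self.start:
            raise ValueError("period end must be later than its start")

    def contains(self, day: date) -> bool:
        """True if `day` falls inside the closed-open period."""
        return self.start <= day < self.end
```

With `Period(date(2014, 3, 3), date(2015, 1, 20))`, 3 March 2014 is inside the period and 20 January 2015 is not.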
Valid time
❖ Also called application time in SQL:2011
❖ Modelled as a pair of date time columns with a period
❖ Name of the columns and period is up to you
Temporal primary keys
❖ SQL:2011 allows a valid time period to be named as part of a primary key
❖ Can also enforce that the valid time periods do not overlap
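The no-overlap rule can be sketched in a few lines of Python. This is only an illustration of the check a temporal primary key implies, assuming closed-open periods; `overlapping_pairs` and the row layout are made up for the example, and `date.max` stands in for ∞.

```python
from datetime import date
from itertools import combinations
from typing import List, Tuple

# (key, period start included, period end excluded)
Row = Tuple[int, date, date]


def overlapping_pairs(rows: List[Row]) -> List[Tuple[Row, Row]]:
    """Return every pair of rows that share a key and overlap in time."""
    return [
        (a, b)
        for a, b in combinations(rows, 2)
        if a[0] == b[0] and a[1] < b[2] and b[1] < a[2]
    ]


FOREVER = date.max  # stand-in for the open upper bound

rows = [
    (2, date(2015, 1, 2), date(2015, 1, 31)),   # employee 2, first period
    (2, date(2015, 1, 31), FOREVER),            # employee 2, second period
    (4, date(2015, 5, 1), FOREVER),             # employee 4
]
```

`overlapping_pairs(rows)` is empty here because the two periods for employee 2 meet but do not overlap. A database enforcing a temporal primary key would reject an insert that made the list non-empty.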
Temporal foreign keys
❖ What happens if a parent and child table both define valid time periods?
❖ It doesn’t make sense to allow a row in a child table to reference a row in a parent table where the valid time does not overlap
❖ SQL:2011 allows valid time periods to be part of foreign key constraints
Temporal foreign keys
ALTER TABLE dept ADD (
    DStart DATE,
    DEnd DATE,
    PERIOD FOR DPeriod (DStart, DEnd)
);
ALTER TABLE emp
    ADD FOREIGN KEY (EDept, PERIOD EPeriod)
    REFERENCES dept (DNo, PERIOD DPeriod);
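One way to read the temporal foreign key above: a child row is acceptable only if its period is completely covered by the periods of the matching parent rows, with no gaps. The Python sketch below checks that, assuming closed-open periods. `is_covered` is a hypothetical helper and this is my reading of the rule, not text from the standard.

```python
from datetime import date
from typing import List, Tuple

Period = Tuple[date, date]  # (start included, end excluded)


def is_covered(child: Period, parents: List[Period]) -> bool:
    """True if the parent periods, taken together, cover the child period."""
    covered_until = child[0]
    for start, end in sorted(parents):
        if start > covered_until:
            return False  # gap before this parent period begins
        covered_until = max(covered_until, end)
        if covered_until >= child[1]:
            return True
    return False
```

For example, an employee period running from January to June is covered by two department rows that meet in March, but not if the second department row only starts in April.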
Querying valid time tables
❖ Can query against valid time columns as normal - they’re just normal table columns
❖ Updates and deletes can be performed for a portion of a valid time period
Querying valid time tables
❖ SQL:2011 allows you to create periods to use in your queries and use new predicates:
❖ CONTAINS
❖ OVERLAPS
❖ EQUALS
❖ PRECEDES
❖ SUCCEEDS
❖ IMMEDIATELY SUCCEEDS and IMMEDIATELY PRECEDES
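The predicates listed above are easy to pin down as comparisons on start and end values. The Python sketch below is one reading of them for closed-open periods; the function names simply mirror the SQL keywords and are not a real library.

```python
from datetime import date
from typing import Tuple

Period = Tuple[date, date]  # (start included, end excluded)


def contains(p: Period, q: Period) -> bool:
    return p[0] <= q[0] and p[1] >= q[1]


def overlaps(p: Period, q: Period) -> bool:
    return p[0] < q[1] and q[0] < p[1]


def equals(p: Period, q: Period) -> bool:
    return p == q


def precedes(p: Period, q: Period) -> bool:
    return p[1] <= q[0]


def succeeds(p: Period, q: Period) -> bool:
    return p[0] >= q[1]


def immediately_precedes(p: Period, q: Period) -> bool:
    return p[1] == q[0]


def immediately_succeeds(p: Period, q: Period) -> bool:
    return p[0] == q[1]


january = (date(2015, 1, 1), date(2015, 2, 1))
february = (date(2015, 2, 1), date(2015, 3, 1))
quarter = (date(2015, 1, 1), date(2015, 4, 1))
```

Because the upper bound is open, `january` and `february` do not overlap: January immediately precedes February.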
Querying valid time tables
UPDATE Emp
    FOR PORTION OF EPeriod
    FROM DATE '2011-02-03' TO DATE '2011-09-10'
SET EDept = 4
WHERE ENo = 22217;
Querying valid time tables
DELETE FROM Emp
    FOR PORTION OF EPeriod
    FROM DATE '2011-02-03' TO DATE '2011-09-10'
WHERE ENo = 22217;
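What FOR PORTION OF does to a stored row is easiest to see as row splitting: the part of the row outside the requested portion is kept unchanged, and only the part inside it is updated or removed. The Python sketch below illustrates that idea under closed-open periods. `apply_for_portion`, the dictionary layout, the original department 3 and the row's 2010 to 2012 period are all invented for the example.

```python
from datetime import date
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def apply_for_portion(
    row: Row, start: date, end: date, changes: Optional[Row]
) -> List[Row]:
    """Return the rows that replace `row`.

    `changes` holds the new column values for an update; pass None to
    delete the portion instead.
    """
    lo = max(row["start"], start)
    hi = min(row["end"], end)
    if lo >= hi:
        return [row]  # the portion does not touch this row
    result = []
    if row["start"] < lo:
        result.append({**row, "end": lo})            # untouched leading part
    if changes is not None:
        result.append({**row, **changes, "start": lo, "end": hi})
    if hi < row["end"]:
        result.append({**row, "start": hi})          # untouched trailing part
    return result


original = {"eno": 22217, "dept": 3,
            "start": date(2010, 1, 1), "end": date(2012, 1, 1)}
updated = apply_for_portion(
    original, date(2011, 2, 3), date(2011, 9, 10), {"dept": 4})
deleted = apply_for_portion(
    original, date(2011, 2, 3), date(2011, 9, 10), None)
```

The update turns one row into three (department 3, then 4, then 3 again), while the delete leaves two rows with a gap between them.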
Querying valid time tables
SELECT EName, EDept
FROM Emp
WHERE ENo = 22217
    AND EPeriod CONTAINS DATE '2015-01-23';
Querying valid time tables
SELECT EName, EDept
FROM Emp
WHERE ENo = 31
    AND EPeriod OVERLAPS PERIOD (DATE '2015-01-01', DATE '2015-01-31');
Transaction time
❖ Also known as system time in SQL:2011
❖ Modelled as two DATE or TIMESTAMP columns
❖ Management of the columns for the period is handled by the database for you
Transaction time
❖ When data is updated:
❖ Transaction time end is set to current time on the existing row
❖ A new row is added with the updated data and a transaction time start of the current time
Transaction time
❖ When data is deleted:
❖ Transaction time end is set to current time in the existing row
Transaction time
❖ Because the system manages transaction time:
❖ Not possible to alter transaction time values in the past
❖ Not possible to add future dated transaction time values
❖ Referential constraints on historical data are never checked
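The update and delete behaviour described on the last few slides can be sketched as a tiny in-memory table in Python. This is an illustration of the bookkeeping only, not how any particular database implements it. `VersionedTable`, its methods and the injectable `clock` are hypothetical, and `datetime.max` stands in for the open upper bound.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional

FOREVER = datetime.max  # stand-in for an open transaction time end


class VersionedTable:
    def __init__(self, clock: Callable[[], datetime] = datetime.now) -> None:
        self._clock = clock  # callers never supply timestamps themselves
        self.rows: List[Dict[str, Any]] = []

    def _current(self, key: Any) -> Optional[Dict[str, Any]]:
        for row in self.rows:
            if row["key"] == key and row["sys_end"] == FOREVER:
                return row
        return None

    def insert(self, key: Any, value: Any) -> None:
        self.rows.append({"key": key, "value": value,
                          "sys_start": self._clock(), "sys_end": FOREVER})

    def update(self, key: Any, value: Any) -> None:
        old = self._current(key)
        if old is None:
            raise KeyError(key)
        now = self._clock()
        old["sys_end"] = now  # close the existing row
        self.rows.append({"key": key, "value": value,
                          "sys_start": now, "sys_end": FOREVER})

    def delete(self, key: Any) -> None:
        old = self._current(key)
        if old is None:
            raise KeyError(key)
        old["sys_end"] = self._clock()  # the row stays, it is only closed

    def as_of(self, key: Any, when: datetime) -> Optional[Any]:
        for row in self.rows:
            if row["key"] == key and row["sys_start"] <= when < row["sys_end"]:
                return row["value"]
        return None
```

Because only the clock writes `sys_start` and `sys_end`, there is no way for a caller to rewrite the past or to add a future-dated row, which matches the restrictions listed above.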
Transaction time
CREATE TABLE emp (
    …,
    Sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW START,
    Sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (Sys_start, Sys_end)
) WITH SYSTEM VERSIONING;
Querying transaction time tables
❖ New predicates to be used with transaction time:
❖ FOR SYSTEM_TIME AS OF
❖ FOR SYSTEM_TIME FROM ... TO ...
❖ FOR SYSTEM_TIME BETWEEN ... AND ...
❖ If none of the above is supplied, the database should return only the rows that are current at the present system time
Querying transaction time tables
SELECT ENo, EName
FROM emp FOR SYSTEM_TIME AS OF TIMESTAMP '2015-01-28 12:45:00'
WHERE ENo = 22;
Querying transaction time tables
SELECT ENo, EName
FROM emp FOR SYSTEM_TIME AS OF TIMESTAMP '2015-01-28 12:45:00'
WHERE ENo = 22
    AND EPeriod CONTAINS DATE '2014-08-27';
Grey areas/not implemented yet
❖ Evolving schema over time
❖ Support for period joins
❖ Support for period aggregates or period grouped queries
❖ Support for period normalization
❖ Support for multiple valid time periods per table
Current support
❖ Oracle 12c
❖ SQL:2011 compliant but not even nearly complete
❖ PostgreSQL
❖ 9.1 and earlier: temporal contributed package
❖ 9.2: native range data types
❖ IBM DB2 through ‘time travel query’ feature
❖ Teradata 13.10 and 14
❖ Handful of others implemented as extensions
Implementing valid time
❖ Add a pair of date time columns to your table for the valid time period.
❖ Can make these part of your primary key
Implementing valid time
❖ Things to consider:
❖ Have to check for end time > start time
❖ Have to check for overlaps in valid time periods
❖ Temporal foreign keys have to be implemented yourself
❖ Queries become potentially more complex
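As a rough sketch of what "doing it yourself" can look like, the Python example below uses SQLite (through the standard `sqlite3` module) to add a valid time period to `emp`, with a CHECK constraint for end > start and a trigger that rejects overlapping periods. The column names `EStart` and `EEnd`, the trigger name, the `'9999-12-31'` stand-in for ∞ and the department numbers are assumptions for the example. Dates are stored as ISO text so that string comparison matches date order, and the period is closed-open.

```python
import sqlite3

SCHEMA = """
CREATE TABLE emp (
    ENo    INTEGER NOT NULL,
    EName  TEXT    NOT NULL,
    EDept  INTEGER,
    EStart TEXT    NOT NULL,                       -- included in the period
    EEnd   TEXT    NOT NULL DEFAULT '9999-12-31',  -- excluded from the period
    PRIMARY KEY (ENo, EStart),
    CHECK (EEnd > EStart)
);

CREATE TRIGGER emp_no_overlap
BEFORE INSERT ON emp
BEGIN
    -- Abort the insert if another row for the same employee overlaps it.
    SELECT RAISE(ABORT, 'overlapping valid time period')
    WHERE EXISTS (
        SELECT 1 FROM emp
        WHERE ENo = NEW.ENo
          AND EStart < NEW.EEnd
          AND NEW.EStart < EEnd
    );
END;
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

The trigger only guards inserts. A real implementation would need a matching trigger for updates to `EStart` and `EEnd`, and temporal foreign keys would need their own checks, which is where the extra complexity starts to show.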
Implementing transaction time
❖ Add a column recording transaction time start to your table
❖ For each table create a backup table mirroring the columns in the main table, adding a transaction time end column too
❖ Create a trigger that fires on each update or delete to copy old values from the main table to the backup table
❖ Should add transaction time end to the backup table
❖ Should also update the transaction time start to now in the main table if the operation is an update
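The steps above can be sketched with SQLite triggers, driven here from Python's standard `sqlite3` module. Treat it as one possible shape rather than a finished design: `emp_history`, `sys_start`, `sys_end` and the trigger names are invented for the example, and timestamps are stored as text with millisecond precision.

```python
import sqlite3

SCHEMA = """
CREATE TABLE emp (
    ENo       INTEGER PRIMARY KEY,
    EName     TEXT,
    EDept     INTEGER,
    sys_start TEXT NOT NULL DEFAULT (strftime('%Y-%m-%d %H:%M:%f', 'now'))
);

-- Backup table: same columns as emp plus a transaction time end.
CREATE TABLE emp_history (
    ENo       INTEGER,
    EName     TEXT,
    EDept     INTEGER,
    sys_start TEXT NOT NULL,
    sys_end   TEXT NOT NULL
);

-- Fires only when a data column changes, so the sys_start update
-- inside the trigger body does not fire it again.
CREATE TRIGGER emp_on_update
AFTER UPDATE OF EName, EDept ON emp
BEGIN
    INSERT INTO emp_history
    VALUES (OLD.ENo, OLD.EName, OLD.EDept, OLD.sys_start,
            strftime('%Y-%m-%d %H:%M:%f', 'now'));
    UPDATE emp
    SET sys_start = strftime('%Y-%m-%d %H:%M:%f', 'now')
    WHERE ENo = NEW.ENo;
END;

CREATE TRIGGER emp_on_delete
AFTER DELETE ON emp
BEGIN
    INSERT INTO emp_history
    VALUES (OLD.ENo, OLD.EName, OLD.EDept, OLD.sys_start,
            strftime('%Y-%m-%d %H:%M:%f', 'now'));
END;
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the main and backup tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

After inserting an employee, moving them to another department and then deleting them, `emp` is empty and `emp_history` holds two rows, one per superseded version. A historical query then has to read both tables, which is part of the read versus write trade-off mentioned on the next slide.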
Implementing transaction time
❖ Things to consider:
❖ Extra complexity
❖ How long should backup data be kept for?
❖ Do you optimize for fast reads or writes?
❖ Should truncating the main table delete the data from the backup?
More information
❖ Wikipedia article on Temporal Databases
❖ Temporal features in SQL:2011 (PDF)
❖ Time and Relational Theory
Thanks for listening!
❖ Any questions?
❖ I’d love some feedback
❖ https://joind.in/talk/view/13294
❖ Contact me:
❖ @JCook21