Location Agnostic Data
Arup Nanda
Proligence
It’s 11 PM. Do you know where your data is?
Schema A
SALES
SALES_HISTORY
Schema A
SALES: sales in the last 3 months
SALES_HISTORY: sales in the last 24 months
Schema A
SALES
SALES_HISTORY
Schema B
SALES_HISTORY
Datawarehouse
Schema A
SALES
SALES_HISTORY
Schema B
SALES_HISTORY
Datawarehouse
HBase
SALES_HISTORY
Hadoop
OLTP: SALES, SALES_HISTORY
Datawarehouse: SALES_HISTORY
Hadoop: SALES_HISTORY
Cloud
Sharding
Location: North America, South America, EMEA, APAC
Customer Importance: Very Important, Important, Somewhat Important, Rest
Different Location
Different Technology
Different Geography
Challenges
• Need to know:
– The technology
• Oracle Database or Hadoop?
• SQL or MapReduce code?
– The location
– The data availability
• Is the right data in the same DB, or must it be fetched from somewhere else?
Solutions
• Put the business logic in the database
• But which database?
• Which technology?
Solution, contd.
• Put the logic in the app layer
• Independent of technology
• But is it really technology independent?
• Definitely inconsistent
• Need to reinvent the wheel
• Still need to know the location, currency, etc.
Different Location
Different Technology
Different Geography
Data As a Service
Step 1
• Gather a list of all high-level data elements
– Not tables or columns, but data elements such as reservations, sales, etc.
Step 2
• Map tables, columns, collections, etc. into these data elements
• Define and document business logic for each datastore
Logic                                   Data store
System of record, less than 3 months    OLTPDB.SchemaA.TableA.ColumnA
Less than 2 years                       DWDB.SchemaA.ColumnA
More than 2 years                       HBase
Define all other attributes as well, such as how often each datastore is synced.
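The mapping above could be kept as a small lookup structure that a service consults before it touches any datastore. The following is a minimal sketch in Python, not the author's implementation: the `DataStoreRule` type, the `resolve_datastore` function, and the 90-day and 730-day thresholds (rough stand-ins for "3 months" and "2 years") are all assumptions made for illustration. Only the target strings come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataStoreRule:
    """One row of the logic-to-datastore mapping (hypothetical type)."""
    max_age_days: Optional[int]  # exclusive upper bound; None means unbounded
    target: str                  # datastore location as written in the table


# Ordered from newest data to oldest. The day counts are assumed
# approximations of "3 months" and "2 years".
SALES_RULES: List[DataStoreRule] = [
    DataStoreRule(90, "OLTPDB.SchemaA.TableA.ColumnA"),
    DataStoreRule(730, "DWDB.SchemaA.ColumnA"),
    DataStoreRule(None, "HBase"),
]


def resolve_datastore(age_days: int,
                      rules: List[DataStoreRule] = SALES_RULES) -> str:
    """Return the datastore that holds sales data of the given age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    for rule in rules:
        # The first rule whose bound exceeds the age wins.
        if rule.max_age_days is None or age_days < rule.max_age_days:
            return rule.target
    raise LookupError("no datastore rule matches")
```

Keeping the rules as data rather than as branching code means a change in retention policy only touches the rule list, not the callers.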
Step 3
• Identify datastores that can’t tolerate latency
Example
Customer Records: unpredictable division across geographical boundaries. Replicate.
Sales Records: generally accessed within ranges; out-of-range access can be slow. Shard.
North America: Customer, Sales
EMEA: Customer, Sales
APAC: Customer, Sales
Sharded Data Strategies
• Use DB Links
create database link newyork
connect to gwuser identified by gwuser
using 'newyork';
Sharded Data, contd.
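• Create a view SALES that unions the local and remote shards
create or replace view sales as
select * from sales_local   -- local shard; the name is illustrative, since a view and a table cannot share a name in one schema
union all
select * from sales@newyork
union all
select * from sales@london;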
Sharded Data, contd.
• Create INSTEAD OF triggers
create or replace trigger sales_insert_trg   -- trigger name is illustrative
instead of insert on sales
for each row
begin
  if :new.country_code in ('US', 'CA', …) then
    insert into sales@newyork
    …
Step 4
• Write APIs to serve data elements
• Could be:
– Java Webservice
– Oracle Packages
– RESTful APIs
Services
North America: Customer, Sales
EMEA: Customer, Sales
APAC: Customer, Sales
Sales Service
Customer Service
Service
• Has a fixed “catalog” of requests it can address
• Has knowledge about location, data dispersion, technology
• Example
– Accept StoreID
– Look up the country of the store
– Is data replicated?
– If so, get local data
– If not, locate the DB for that country
– Get the data from that DB
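The lookup steps above can be sketched as a small routing function. This is an illustrative sketch only, assuming in-memory lookup tables: the store IDs, the `tokyo` database name, and the `locate_db` and `get_records` functions are hypothetical, while `newyork` and `london` echo the database links used earlier. The actual data access is passed in as a callable, so nothing here depends on a particular driver.

```python
from typing import Callable, Dict, List, Set

# Hypothetical reference data; in practice this would itself be
# replicated reference data served from a local store.
STORE_COUNTRY: Dict[int, str] = {101: "US", 202: "GB", 303: "JP"}
COUNTRY_DB: Dict[str, str] = {"US": "newyork", "GB": "london", "JP": "tokyo"}
REPLICATED: Set[str] = {"customer"}  # data elements copied to every site
LOCAL_DB = "local"

# A fetcher takes (database name, data element, store id) and returns rows.
Fetcher = Callable[[str, str, int], List[dict]]


def locate_db(element: str, store_id: int) -> str:
    """Decide which database should answer a request for this store."""
    country = STORE_COUNTRY[store_id]   # look up the country of the store
    if element in REPLICATED:           # is the data replicated?
        return LOCAL_DB                 # if so, use local data
    return COUNTRY_DB[country]          # if not, locate the DB for that country


def get_records(element: str, store_id: int, fetch: Fetcher) -> List[dict]:
    """Serve one catalog request without exposing location to the caller."""
    return fetch(locate_db(element, store_id), element, store_id)
```

A caller asks only for `get_records("sales", 202, fetch)`; whether the rows come from the local copy or a remote shard stays inside the service.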
Services
Sales Service: North America, EMEA, APAC
Services
Sales Service: North America, Europe, APAC, Africa
Services
Sales Service: North America, Europe, APAC, Africa, Hadoop
Suitability
1. Reference data: country details, store details
– Static or almost static
2. Low variability: customer preferences
3. Latency tolerated: European customers generally call the European call centers
4. Read type of access
Bridging Technologies
Service A
Helper Service
Gateways
Gateway
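One way to picture a gateway that bridges technologies is as a common interface with one adapter per datastore type. The sketch below is a hedged illustration, not a description of any specific gateway product: every class name is hypothetical, and the two backends are in-memory stand-ins (a list of rows for a relational store, a key-value map for an HBase-style store) rather than real driver calls.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SalesBackend(ABC):
    """Common interface the gateway expects from every technology."""

    @abstractmethod
    def sales_for_store(self, store_id: int) -> List[dict]:
        ...


class RowListBackend(SalesBackend):
    """Stand-in for a relational store: rows carry a store_id column."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = list(rows)

    def sales_for_store(self, store_id: int) -> List[dict]:
        return [dict(r) for r in self._rows if r["store_id"] == store_id]


class KeyValueBackend(SalesBackend):
    """Stand-in for a key-value store keyed as '<store_id>:<sale_id>'."""

    def __init__(self, cells: Dict[str, float]) -> None:
        self._cells = dict(cells)

    def sales_for_store(self, store_id: int) -> List[dict]:
        prefix = f"{store_id}:"
        return [
            {"store_id": store_id, "sale_id": int(key[len(prefix):]), "amount": value}
            for key, value in sorted(self._cells.items())
            if key.startswith(prefix)
        ]


class Gateway:
    """Presents several backends as one source, in the order given."""

    def __init__(self, backends: List[SalesBackend]) -> None:
        self._backends = list(backends)

    def sales_for_store(self, store_id: int) -> List[dict]:
        rows: List[dict] = []
        for backend in self._backends:
            rows.extend(backend.sales_for_store(store_id))
        return rows
```

Callers see rows in a single shape regardless of where they were stored; adding another technology means adding another adapter, not changing the callers.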
Summary
• Data will continue to exist in many places
• Single source of truth is next to impossible to attain
• Architect to live with diverse data, sometimes competing versions
• Decide which data is to be replicated and which is to be sharded
• Use gateways or services to connect from one database to another, allowing location and technology transparency
• In Oracle databases, DB links, views and INSTEAD OF triggers can provide location transparency
• Gateways can bridge different technologies