Sabre presentation for MySQL user conference 2004
DESCRIPTION
Low fare search ran on a cluster of 8 mainframes, using a heuristic that didn't always find a good solution. We built new algorithms and moved it all to a Linux cluster. This presentation describes the parts we put on MySQL, back when the mainstream mission-critical world hadn't even heard of MySQL. The open source precompiler let us take HP NonStop code and compile it, unchanged, to run against MySQL.
TRANSCRIPT
![Page 1: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/1.jpg)
Confidential
MySQL at Sabre
Alan Walker Sabre Labs
February 2004
![Page 2: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/2.jpg)
Agenda
• Sabre Holdings Overview
• Business drivers for MySQL & Open Source
• Shopping for fares
• Air Travel Shopping Engine (ATSE)
• Data replication strategy
• ESQL precompiler for MySQL
• Other MySQL users at Sabre
![Page 3: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/3.jpg)
Who is Sabre Holdings?
A world leader in travel commerce, retailing travel products, and providing distribution and technology solutions for the travel industry
![Page 4: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/4.jpg)
Sabre Holdings Businesses
![Page 5: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/5.jpg)
Sabre Holdings Fast Facts
• Industry leader in multiple travel channels
• Revenues of $2.06 billion in 2002
• S&P 500 company
• NYSE:TSG
• Headquarters in Dallas/Fort Worth, Texas
• 6,500 employees in 45 countries
![Page 6: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/6.jpg)
Business drivers
• Over 3 billion fare combinations for a single customer request
• Multiple airlines, flights, fare types, dates
• Prices, taxes, surcharges
![Page 7: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/7.jpg)
Business drivers
• No direct revenue for shopping queries
• Revenue for booking, but not looking (searching)
• Look-to-book ratio increasing
• Competition requires staying on the “leading edge”
• Highly reliable and scalable database
• Fast processors
• Large real memory
• Smart algorithms
• Shopping is a good fit for horizontal scale
• Pricing requires higher precision
![Page 8: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/8.jpg)
Business drivers
[Diagram: computing stack with layers Application, DB / Middleware, Operating System and Hardware, and a marked "Commodity Point"]
Hardware, operating system, database and middleware are
becoming commodities. This drives the cost down rapidly.
Open source software is a major driver of this effect.
![Page 9: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/9.jpg)
Business Solution
• Linux servers alongside HP NonStop servers to create
“hybrid” Air Travel Shopping Engine (ATSE) platform
• HP NonStop delivers high availability and reliability
– Better than or equal to legacy, but at significantly lower cost
– Best fit for critical workloads and master database
management
• Linux / MySQL delivers 64-bit memory and faster CPUs
– Lower availability and reliability than HP NonStop but at
significantly lower cost
– Best fit for CPU-intensive shopping workloads
Most cost-effective platform for the shopping workload
![Page 10: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/10.jpg)
Business drivers
• Sabre’s legacy
• World’s first commercial OLTP system in 1960
• Mainframe clusters running TPF
• Operating system customized to our needs
• True 7x24 application, with zero scheduled downtime
• Most application code in assembler
• Sabre’s future
• Higher-level languages
• Relational databases
• Internet
• Open systems
• Reduce specialized training
• Use off the shelf software
• HP NonStop with OSS is a key component (LINUX?)
![Page 11: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/11.jpg)
Shopping
• Finding cheap air fares is hard!
• With 50+ connect points to consider, and >100 fares per
leg, we need to evaluate >3 billion combinations
• Up to a million fares can change every day
• Availability changes continuously
• Solve it >100 times per second
• Other functions
• Price 250 tickets per second
• Process 1000 flight routing requests per second
![Page 12: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/12.jpg)
Pricing
• Shopping vs. Pricing
• Shopping is the problem of finding low fares
• Pricing is used to print the ticket
• Pricing has to be accurate, or we pay the difference to the
airline
• Many internet search engines still rely on mainframes to
actually print the ticket
• Pricing also requires additional functions, such as refunds,
exchanges and auditing
![Page 13: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/13.jpg)
Algorithms
• Fare-led search
• Graph-based algorithm that searches all fare
combinations across 50+ connect points
• Can generate up to a 4-segment connection
• Search space of >3 billion fare combinations
• Match or exceed any competitor in finding lowest fare
• Only loses to competitors that have access to exclusive
private fares and/or other discounts
• Search actually checks Direct Connect Availability, so that
low fare options are actually bookable
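As a rough illustration of a graph-based search bounded at four segments, the sketch below finds the cheapest total fare between two airports using at most a given number of legs. It is not Sabre's algorithm: the `Leg` structure and the `cheapestFare` function are invented for this example, and a real fare-led search would also have to handle fare rules, availability and fare combinability.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One priced leg between two airports (hypothetical data model).
struct Leg {
    std::string from;
    std::string to;
    double fare;
};

// Cheapest total fare from origin to dest using at most maxSegments legs.
// Bellman-Ford style relaxation: each round extends known paths by one leg.
// Returns a negative value when no path exists within the limit.
double cheapestFare(const std::vector<Leg>& legs, const std::string& origin,
                    const std::string& dest, int maxSegments) {
    assert(maxSegments >= 0);
    std::map<std::string, double> best;
    best[origin] = 0.0;
    for (int round = 0; round < maxSegments; ++round) {
        // Start from the previous round so shorter paths stay valid.
        std::map<std::string, double> next = best;
        for (const Leg& leg : legs) {
            auto it = best.find(leg.from);
            if (it == best.end()) continue;  // leg.from not reachable yet
            double candidate = it->second + leg.fare;
            auto cur = next.find(leg.to);
            if (cur == next.end() || candidate < cur->second) {
                next[leg.to] = candidate;
            }
        }
        best = next;
    }
    auto it = best.find(dest);
    return it == best.end() ? -1.0 : it->second;
}
```

Calling `cheapestFare(legs, "DFW", "LHR", 4)` with a list of legs would return the lowest sum of leg fares over any route of one to four segments. Reading from the previous round's table, rather than the one being built, is what keeps each round to a single extra segment.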
![Page 14: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/14.jpg)
Algorithms
• Dynamic schedules
• Connections are not generated overnight and stored
• Not limited to routes explicitly set up by airlines or other
marketing staff
• Availability Manager
• Flexible rules to access airline availability
• Current methods
– Direct Connect
– Host Availability
– Teletype (AVS)
• Can also use
– Cached DCA
– Inventory proxy
![Page 15: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/15.jpg)
ATSE Hybrid
• Air shopping for desirable itineraries
• Must search through multiple airlines, flights, fare types,
dates, adjacent airports, etc.
• Must calculate prices, taxes, surcharges
• Complexity
• Single round-trip request can have over 3 billion fare
combinations
• Search is CPU and memory intensive
• Business driver
• No direct revenue for shopping transactions
• Increasing look to book ratio
![Page 16: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/16.jpg)
ATSE Hybrid
• Combine Linux servers and HP NonStop servers
• HP NonStop delivers high availability and reliability
• Better than or equal to TPF at significantly lower cost
• Master database management
• Data replicated in real-time to Linux servers
• PNR pricing, schedules and availability
• Linux delivers 64-bit memory model and faster CPUs
• Lower availability and reliability than HP NonStop but at
significantly lower cost
• Horizontally scaled server farm with spare capacity
• Best fit for CPU-intensive shopping workloads
![Page 17: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/17.jpg)
ATSE Hybrid
[Diagram: ATSE hybrid architecture. IBM PSS and IBM MVS hosts, an HP NonStop system and a Linux server farm (24 Linux servers shown). Flow labels: Naming Service and Load Balancing; Load Information; Schedule and Availability Updates; Fare and Rule Updates; DB Image Load and Updates; E/R Logging and Billing; Availability Requests; Shopping Transactions; Air Shopping Transactions]
![Page 18: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/18.jpg)
ATSE Linux servers
• In production since July 2003
• Started with HP rp5405 servers (Unix PA-RISC)
– Migrated to Itanium in December 2003
• Using 45 HP rx5670 servers
– 4-way, 1.5 GHz, 6MB L2 cache, 32GB RAM, 4x72GB SCSI
• Software
• MySQL 4.0.15
• GNU compilers – g++ 3.2.3 and glibc 2.3.2
• TAO object request broker
• Redhat RHAS 2.1
• GoldenGate Extractor/Replicator
• Monitoring – Prognosis, CA Unicenter, scripts
![Page 19: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/19.jpg)
ATSE Software
• Extensive use of open source software
• MySQL 4.0.15
• GNU compilers – g++ 3.2.3 and glibc 2.3.2
• TAO object request broker
• Redhat Linux AS 3.0
• Third party software
• GoldenGate Extractor/Replicator
• Monitoring – Prognosis, CA Unicenter, scripts
• Internally developed applications and scripts
![Page 20: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/20.jpg)
Data replication
• HP NonStop (Tandem) is master database
• Golden Gate Software used to replicate to MySQL
– Extracts data from undo/redo logs on the NonStop server
– Performs INSERT / UPDATE / DELETE on MySQL
– Software performs catch-up / resync in case of crashes or
other failures
• Each Linux server has an identical copy of the database
– 50GB database on each server, all InnoDB
• Replication volume
• 150 tables replicated (over 300 on NonStop server)
• Can replicate 1M fare changes / hour
• Data updates on 7x24 basis
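The log-based replication described above, including catch-up after a crash, can be sketched as follows. This is only an illustration under stated assumptions: the `Change` record layout, the `Replica` type and its `lastApplied` checkpoint are hypothetical, an in-memory map stands in for a MySQL table, and nothing here reflects Golden Gate's actual formats or API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Op { Insert, Update, Delete };

// One change record as it might be extracted from the master's log
// (hypothetical layout, not Golden Gate's format).
struct Change {
    std::size_t seq;    // position in the change stream, starting at 1
    Op op;
    std::string key;    // primary key of the affected row
    std::string value;  // new row image; unused for Delete
};

// Stand-in for one Linux server's copy of a replicated table.
struct Replica {
    std::map<std::string, std::string> rows;
    std::size_t lastApplied = 0;  // checkpoint: highest seq already applied

    // Apply every change newer than the checkpoint, in stream order.
    // Re-sending older changes after a crash is harmless: they are skipped.
    void apply(const std::vector<Change>& stream) {
        for (const Change& c : stream) {
            assert(c.seq > 0);
            if (c.seq <= lastApplied) continue;  // already applied earlier
            switch (c.op) {
                case Op::Insert:
                case Op::Update:
                    rows[c.key] = c.value;
                    break;
                case Op::Delete:
                    rows.erase(c.key);
                    break;
            }
            lastApplied = c.seq;
        }
    }
};
```

Because the replica remembers the last sequence number it applied, resync after a failure amounts to replaying the stream from any earlier point. That is one plausible way to get the catch-up behavior the slide mentions.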
![Page 21: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/21.jpg)
Data replication
[Diagram: replication pipeline. HP NonStop side: SQL/MP DB, TMF Log, Extract, Queue, Data Pump. Linux IA-64 side: Receive, Queue, Updater, MySQL DB. Legend: marked components = Golden Gate Software]
![Page 22: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/22.jpg)
Data Replication
[Diagram: replication fan-out over Server-Net to 12 Linux servers. Repeated component labels: Extract, Queue, DataPump (x12), Collector, Replicator, MySQL]
![Page 23: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/23.jpg)
Results
• Reduced runtime costs (over 80% compared to legacy)
• Reduced development costs
• Increased functionality
• Decreased fare loading cycle times
• Competitive Advantage
![Page 24: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/24.jpg)
Hybrid
• Horizontal scalability
• Ability to throw inexpensive CPUs at the problem
• Tolerate failure of a single server
• How do we get there from here?
• Database and network functions remain on Himalaya
• C++ code readily ports to Linux
• Publish/subscribe metaphor for data in memory
• 64-bit addressing to avoid memory constraints
![Page 25: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/25.jpg)
Connectivity
• CORBA
• Major functions use CORBA internally
• CORBA requests to TPF for availability
• CORBA to CTS for DCA this Summer (bypass TPF)
• Asynchronous messaging via MQ Series
• XML
• Currently uses XML requests from TPF (over RPPC) for
pricing functions
• Working on direct access from Travelocity to ATSE
– Will be used for BIP
– Already working over HTTP (development systems)
– Working on security & billing for production
![Page 26: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/26.jpg)
Timeline
• 2000
• Proof Of Concept, April – August
• 5 core developers, partnership with Compaq
• 2001
• Development & training began in February
• Initial hardware delivered
• 2002
• Phase 1 in production since July
• Zero downtime since implementation
• Rapidly developing additional functionality
• Wow – this is from an ancient slide, huh?
![Page 27: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/27.jpg)
Precompiler
• Challenge
• 500K lines of C/C++, 150+
files with embedded SQL
• We did not want to rewrite
ESQL / C code by hand
• Solution
• Wrote a precompiler that
converts ESQL to inline
MySQL calls
• About 1000 lines of awk
• We are willing to share this
code with others
EXEC SQL BEGIN DECLARE SECTION;
int host_a;
double host_b;
char host_c;
EXEC SQL END DECLARE SECTION;
EXEC SQL DECLARE csr1 CURSOR FOR
SELECT a, b, c
FROM table1
WHERE x = :hostvar1;
EXEC SQL OPEN csr1;
while (rc >= 0 && rc != 100){
EXEC SQL FETCH csr1 INTO
:host_a, :host_b, :host_c;
printf("Fetch %d, %lf, %s\n",
host_a, host_b, host_c);
}
EXEC SQL CLOSE csr1;
![Page 28: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/28.jpg)
Precompiler
• How it works
• Convert C / ESQL to C++ code
• Polymorphism matches data types in the declare section
• Can ignore the declare section
EXEC SQL BEGIN DECLARE SECTION;
int host_a;
double host_b;
char host_c;
EXEC SQL END DECLARE SECTION;
// EXEC SQL BEGIN DECLARE SECTION;
int host_a;
double host_b;
char host_c;
// EXEC SQL END DECLARE SECTION;
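Since the host variable declarations are already valid C, this step only needs to comment out the two marker lines. The slides say the real precompiler was written in awk. The C++ sketch below is not that code, just one possible way to express the same transformation, and the function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Returns true when `line`, ignoring leading blanks, starts with `prefix`.
static bool startsWith(const std::string& line, const std::string& prefix) {
    assert(!prefix.empty());
    std::size_t pos = line.find_first_not_of(" \t");
    if (pos == std::string::npos) return false;  // blank line
    return line.compare(pos, prefix.size(), prefix) == 0;
}

// Comment out the DECLARE SECTION markers; pass every other line through.
std::string commentOutDeclareMarkers(const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        if (startsWith(line, "EXEC SQL BEGIN DECLARE SECTION") ||
            startsWith(line, "EXEC SQL END DECLARE SECTION")) {
            out << "// ";
        }
        out << line << '\n';
    }
    return out.str();
}
```

The declarations between the markers pass through untouched, so the C++ compiler sees ordinary variables and overload resolution can pick the right conversion for each one later.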
![Page 29: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/29.jpg)
Precompiler
EXEC SQL DECLARE csr1 CURSOR FOR
SELECT a, b, c
FROM table1
WHERE x = :hostvar1;
// EXEC SQL DECLARE csr1
static e2mysql csr1 = {
" SELECT a,b,c FROM table1 WHERE x = :hostvar1"
, NULL , 0};
Cursor declarations (SELECT statements) are converted to a static
struct. The struct has the text of the SQL, as well as statement
handles for doing prepare / execute (where applicable)
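The slides do not show the definition of `e2mysql`, so the following is a guess at its shape, inferred from the initializer above and from the `rslt` and `row` members used by the generated FETCH code. The `sql` member name, the `FakeResult` stand-in for `MYSQL_RES`, and the `bindHostVar` helper are all assumptions. How the real precompiler substitutes host variables at OPEN time is not stated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for MYSQL_RES so the sketch compiles without the MySQL headers.
struct FakeResult;

// Assumed layout: SQL text, result handle, index of the next row to fetch.
struct e2mysql {
    const char* sql;
    FakeResult* rslt;
    unsigned long row;
};

static e2mysql csr1 = {
    " SELECT a,b,c FROM table1 WHERE x = :hostvar1", NULL, 0};

// Hypothetical OPEN-time helper: replace the first :name placeholder with a
// literal. A real implementation would need to escape the value and avoid
// matching a longer name that merely starts with `name`.
std::string bindHostVar(const std::string& sql, const std::string& name,
                        const std::string& literal) {
    assert(!name.empty());
    const std::string marker = ":" + name;
    std::string out = sql;
    std::size_t pos = out.find(marker);
    if (pos != std::string::npos) out.replace(pos, marker.size(), literal);
    return out;
}
```

For example, `bindHostVar(csr1.sql, "hostvar1", "42")` yields the query text with `:hostvar1` replaced by `42`, ready to hand to the database API.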
![Page 30: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/30.jpg)
Precompiler
// EXEC SQL FETCH csr1
static int16 fetch_csr1()
{
if ( ! csr1.rslt )
return SQL_ERROR;
if ( csr1.row >= mysql_num_rows(csr1.rslt) )
return SQL_NO_DATA;
MYSQL_ROW row = mysql_fetch_row(csr1.rslt);
SQLBindColPoly(row[0], host_a, sizeof(host_a));
SQLBindColPoly(row[1], host_b, sizeof(host_b));
SQLBindColPoly(row[2], host_c, sizeof(host_c));
++csr1.row;
return SQL_SUCCESS;
}
EXEC SQL FETCH csr1 INTO :host_a, :host_b, :host_c;
The OPEN, FETCH and CLOSE statements are converted into
function calls. The precompiler generates the code for these calls
and puts it at the end of the source module.
![Page 31: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/31.jpg)
Precompiler
inline int32
SQLBindColPoly(const char* value, int32& parm, uint16 size)
{
parm = atoi(value);
return SQL_SUCCESS;
}
A lightweight wrapper around the database API lets us
use polymorphism to convert to the types specified in the
declare section. There is a wrapper function for each
simple C++ type that we handle.
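To show how overloading picks the conversion, here is a sketch of what a small family of these wrappers might look like. Only the `int32` version appears on the slide. The `double` and character-buffer overloads, the typedefs and the value of `SQL_SUCCESS` are assumptions added for illustration, and SQL NULL columns (a null `value` pointer) are not handled.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Assumed typedefs and return code; the real definitions are not on the slides.
typedef int int32;
typedef unsigned short uint16;
const int32 SQL_SUCCESS = 0;

// Integer host variable: parse the column text as an int.
inline int32 SQLBindColPoly(const char* value, int32& parm, uint16 /*size*/) {
    parm = std::atoi(value);
    return SQL_SUCCESS;
}

// Double host variable: parse the column text as a floating-point number.
inline int32 SQLBindColPoly(const char* value, double& parm, uint16 /*size*/) {
    parm = std::atof(value);
    return SQL_SUCCESS;
}

// Fixed-size character buffer: copy, truncate if needed, always NUL-terminate.
inline int32 SQLBindColPoly(const char* value, char* parm, uint16 size) {
    assert(size > 0);
    std::strncpy(parm, value, size - 1);
    parm[size - 1] = '\0';
    return SQL_SUCCESS;
}
```

The generated FETCH code can then make the same call for every column, for example `SQLBindColPoly(row[0], host_a, sizeof(host_a))`, and the compiler selects the overload from the declared type of the host variable.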
![Page 32: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/32.jpg)
Precompiler
• Notes
• Light-weight C++ wrapper to MySQL API
• The precompiler understands some SQL syntax and does
some modifications of NonStop SQL/MP statements
• We have also used our precompiler to target other DBMS
– ODBC API
– Oracle
– PostgreSQL
• Since we convert C to C++, this may be problematic for
ESQL programs that used deprecated K&R syntax
– C++ compilers are stricter than C compilers
– However, we did not have this problem with our application
![Page 33: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/33.jpg)
Other MySQL applications at Sabre
• ATSE is our largest and most mission critical
• We have other production systems that rely on MySQL
• Site59.com is the most visible
• MySQL also used for some internal databases
• More under development
• MySQL / Linux / SATA drives make cheap data marts
• Sometimes cheaper to replicate to a data mart than to
upgrade a central data warehouse
• Currently testing with a 1.5B row database
![Page 34: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/34.jpg)
Site59
• Last minute travel packages
• Acquired by Travelocity in
March 2002
• Sales volume?
• Transaction rates?
• All dynamic content generated
using PHP & MySQL
![Page 35: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/35.jpg)
Site59
[Diagram: Site59 architecture. Components: Internet; Presentation (Apache/PHP); Application Server; Frontend DB (MySQL, Linux); Backend DB (Oracle, Sun); Reservations System Gateway. Connection labels: HTTP; XML/HTTP; Replication]
Site59 implements a fairly “classic” dynamic website using MySQL.
Dynamic content is generated at about 30Mbits / second. Extensive
use is made of single and dual processor Linux machines (IA-32)
![Page 36: Sabre presentation for MySQL user conference 2004](https://reader034.vdocuments.net/reader034/viewer/2022042513/554bc5f4b4c90530298b554b/html5/thumbnails/36.jpg)
Travel Commerce Processing Chain
[Diagram labels: Fulfill, Session, Shop, Sell, Price]