
2008-04-25

GISt lunch meeting

OTB Research Institute for Housing, Urban and Mobility Studies

Writing a DBMS buyers guide

Wim de Haas

Wilko Quak

Based on presentation at FOSS4G 2007 on benchmarking

Overview

• Original idea: benchmarking
• Complications of benchmarking
• New idea: buyers guide
• What should be in this guide

Benchmark consideration: Weird Cases department

[Figure: a diagonal query against a geometry, and a flat query against a geometry]

Benchmark consideration: Hot vs Cold

Why bother with benchmarking…

Stonebraker 2007:

We define “dramatically outperform” to mean at least a factor 10 advantage […then] customers will be inclined to try the new architecture

• Where to find dramatic differences in Spatial DBMSs?

Where to expect Dramatic differences?

• Linux vs Windows (no)
• Choice of DBMS (only in specific cases)
• Choice of file system (no)
• Functionality difference (yes)
• Choice of parameters (maybe)

Problems with testing

• DBMS vendors do not want published results
• Oracle explicitly forbids publishing benchmark results
• Hardware
• Moore’s Law
• Release frequency of software
• Spatial testing cannot be done on synthetic data
• Too many parameters

Benchmark results are outdated before they are published

Solution

Don’t spend our time on producing benchmark results:

• Write a buyer’s guide: we need a classification of users.

• Let people do their own testing: tell them what and when to test, and help them with a test suite.

Classification of spatial DBMS users

Four classes:
1. Server Builders: publish spatial data via web server
2. GIS User: Load various datasets and perform complex analyses
3. Data Maintainer: Maintain one core dataset
4. Power Users: All of the above and more

Class 1: Web Server Builders

• You do not really need a DBMS for this (You use a fraction of DBMS functionality)

• Only one query counts: Find everything within BBOX
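The "find everything within BBOX" query can be sketched as follows, assuming a PostGIS backend. The table name `roads`, the column name `geom`, the SRID default, and the `%s` placeholder style (as used by psycopg2) are illustrative assumptions, not part of the slides. `&&` is the PostGIS bounding-box overlap operator and `ST_MakeEnvelope` builds the query rectangle.

```python
def bbox_query(table, geom_column, xmin, ymin, xmax, ymax, srid=4326):
    """Return (sql, params) selecting every row whose bounding box overlaps the query box."""
    # NOTE: table and geom_column are interpolated directly into the SQL text,
    # so they must be trusted identifiers (hypothetical names in this sketch).
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {geom_column} && ST_MakeEnvelope(%s, %s, %s, %s, %s)"
    )
    return sql, (xmin, ymin, xmax, ymax, srid)


# Hypothetical table and coordinates, for illustration only.
sql, params = bbox_query("roads", "geom", 4.3, 52.0, 4.4, 52.1)
print(sql)
print(params)
```

Because `&&` compares bounding boxes only, it can be answered from the spatial index alone, which is why this single query dominates the web-serving workload.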

Class 2: GIS users

• Main interest is functionality
• Spend more time on loading data
• Need a good query optimiser
• Analysis

Class 3: Dataset Maintainers

• Limited number of queries
• Transactions are an issue
• Clustering of data after updates is interesting

Class 4: Power users

• Do their own testing
• Need a platform to discuss their findings

Test suite proposal

1. Very simple performance test script with few parameters
• BBOX query
• Fixed dataset (proposal: OpenStreetMap dataset)
2. Configurable test suite
• Full suite that tests every corner of the DBMS
• For specialists only

Test 1: simple BBOX select

Write a simple script that generates a lot of rectangle queries.

Parameters:
• DBMS size
• query box size
• experiment length
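A minimal sketch of such a query generator, under a few assumptions: the function name `random_bboxes`, the square box shape, and the uniform placement are illustrative choices, and "experiment length" is interpreted here as the number of queries. DBMS size is a property of the loaded data, so it does not appear as a generator argument.

```python
import random


def random_bboxes(extent, box_size, count, seed=0):
    """Yield `count` square query boxes of side `box_size` placed uniformly inside `extent`."""
    xmin, ymin, xmax, ymax = extent
    if box_size > xmax - xmin or box_size > ymax - ymin:
        raise ValueError("box_size does not fit inside extent")
    rng = random.Random(seed)  # fixed seed so every DBMS under test sees the same queries
    for _ in range(count):
        x = rng.uniform(xmin, xmax - box_size)
        y = rng.uniform(ymin, ymax - box_size)
        yield (x, y, x + box_size, y + box_size)


# Hypothetical extent and sizes, for illustration only.
boxes = list(random_bboxes(extent=(0.0, 0.0, 100.0, 100.0), box_size=5.0, count=1000))
print(len(boxes), boxes[0])
```

Fixing the seed keeps the query stream identical across runs, so differences in timing come from the system under test rather than from the queries.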

Test 1: grow DBMS size

• Question: Does query response time depend on DBMS size or on core memory?

• Experiment: Run the same test on more and more copies of the same database.
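The shape of this experiment can be sketched with Python's built-in `sqlite3` module as a stand-in DBMS. Everything specific here is an assumption for illustration: the `pts` table, the plain B-tree index in place of a spatial index, and the random points in place of a real dataset such as OpenStreetMap. An in-memory database also never leaves core memory, so a real run would use an on-disk database that grows past the available RAM.

```python
import random
import sqlite3
import time


def load_copy(conn, points, copy_index):
    """Insert one more copy of the same point set, tagged with its copy number."""
    conn.executemany(
        "INSERT INTO pts (copy, x, y) VALUES (?, ?, ?)",
        [(copy_index, x, y) for x, y in points],
    )
    conn.commit()


def run_queries(conn, boxes):
    """Run every box query once; return (mean seconds per query, total rows found)."""
    found = 0
    start = time.perf_counter()
    for xmin, ymin, xmax, ymax in boxes:
        row = conn.execute(
            "SELECT COUNT(*) FROM pts WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
            (xmin, xmax, ymin, ymax),
        ).fetchone()
        found += row[0]
    elapsed = time.perf_counter() - start
    return elapsed / len(boxes), found


# Stand-in data: random points and random 5x5 query boxes (illustrative only).
rng = random.Random(0)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(2000)]
boxes = [(x, y, x + 5, y + 5) for x, y in
         ((rng.uniform(0, 95), rng.uniform(0, 95)) for _ in range(50))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pts (copy INTEGER, x REAL, y REAL)")
conn.execute("CREATE INDEX pts_xy ON pts (x, y)")

results = []  # (number of copies, mean seconds per query, rows found)
for copies in range(1, 5):
    load_copy(conn, points, copies)
    mean_s, found = run_queries(conn, boxes)
    results.append((copies, mean_s, found))
    print(f"{copies} copies: {mean_s * 1e6:.1f} us/query, {found} rows")
```

Since each copy holds identical data, the number of rows found grows linearly with the number of copies, which gives a quick sanity check that every run executed the same queries against the intended amount of data.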

Test 1 – result: PostGIS vs MySQL

[Chart: PostGIS and MySQL series; vertical axis 0 to 0.06, horizontal axis 0 to 3500000]

Test 2: Comprehensive Test Suite

• Create a set of killer polygons so that every line of source code will be touched by running operations.
• Test query optimizer
• Test join operator
• Must be done with skewed data

What should be in the Buyer’s guide

Performance is not an issue.

What are the issues:
• Details of functionality (topology, coordinate transforms)
• Total cost of ownership (open-source vs proprietary)
• Configuration (faster disks or faster CPU)
• Ease of use (2 days of programming == A LOT OF HARDWARE)
• Use of standards (vendor lock-in, system integration)

Can we answer these questions?

Discussion