
GPFS to Improve Multi-site Data Access

Lugano, May 30th 2013 Luc Corbeil and Stefano Claudio Gorini, CSCS

Context

•  Scientists like their data “close” to their desk
   – Simple, interactive access
   – Real-time manipulation

•  Scientists can’t have petabytes in their office (yet)
   – (or I’d like to visit your office)
   – Large data sets are centralized in fewer big storage repositories

•  What if your data is hosted hundreds of kilometers away?
   – Lower bandwidth
   – Higher latency
   – You may have to put/get data constantly, manually

2

To Solve This

•  One could use the following:
   – OpenAFS
     – Can cache data locally
     – But it is not a parallel filesystem
   – Multi-cluster, multi-site GPFS (aka DEISA; see the sketch below)
     – Very nice interface
     – Performance while accessing remote data may vary

•  GPFS 3.5 / AFM looks like a good fit on paper
   – Aiming for local-type performance with remote data
3

What is Active File Management?

4

•  AFM is a scalable, high-performance, file system caching layer integrated with the GPFS cluster file system.

•  It enables you to create associations between GPFS clusters and define the location and flow of file data, to automate the management of the data.

•  This allows you to implement a single-namespace view across sites around the world.
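Concretely, an AFM cache is a GPFS fileset with a remote target. A minimal sketch of creating a read-only cache fileset, assuming hypothetical filesystem, fileset, and export names (attribute spelling may differ between releases):

```
# On the cache cluster: create an AFM fileset backed by an NFS export at home
mmcrfileset cachefs projects_cache --inode-space new \
    -p afmTarget=nfs://home-server/gpfs/homefs/projects \
    -p afmMode=read-only
# Link it into the namespace so users see it as a normal directory
mmlinkfileset cachefs projects_cache -J /gpfs/cachefs/projects
```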

RS-NAS Pilot Project

•  Funded by SWITCH
   – Internet/WAN provider for Swiss universities

•  Problem
   – Large data sets to store
   – Can it be done efficiently using remote storage capacity?

•  End users
   – ETHZ
   – University of Zürich

•  Storage provider
   – CSCS in Lugano

5

RS-NAS Remote Project

6

[Diagram: min(latency) → CACHE; the cache is placed as close to the users as possible]

Access your Data Anywhere

7

•  Home: hosted at CSCS, the main GPFS cluster where the data is safely maintained and backed up.

•  Cache: hosted in Zürich, a minimal GPFS cluster with a disk cache, designed to satisfy short-term needs.

•  Link: the connection between the two clusters, either a dedicated link or a simple Internet connection (home-side preparation is sketched below).
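On the home side, the exported path has to be prepared for AFM and made reachable over the link. A hedged sketch, with hypothetical paths and hostnames; the NFS export options are indicative only:

```
# On the home cluster at CSCS: enable AFM support on the path to be exported
mmafmconfig enable /gpfs/homefs/projects
# Export it over NFS to the cache site
echo "/gpfs/homefs/projects cache-site.example.ch(rw,no_root_squash,fsid=101)" >> /etc/exports
exportfs -ra
```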

It solves almost all our problems

8

[Diagram: the GPFS home cluster with HSM at CSCS, serving AFM caches at other Swiss universities over the Internet]

It solves almost all our problems

9

•  Read-only: the home data is readable and writeable, but the remotely cached data cannot be modified locally.

•  Single-writer: data can only be modified in the cache; home data cannot be modified.

•  Local-updates: similar to a snapshot; modified data will never be synchronized back, and the relationship is broken.
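The three behaviours above correspond to the standard AFM fileset modes: read-only, single-writer, and local-updates. A sketch of how each mode would be selected at fileset creation time, with hypothetical fileset and target names:

```
# Cache that mirrors home but rejects local modifications
mmcrfileset cachefs ro_data --inode-space new -p afmMode=read-only     -p afmTarget=nfs://home/gpfs/homefs/ro
# Cache that owns the data; home is not modified directly
mmcrfileset cachefs sw_data --inode-space new -p afmMode=single-writer -p afmTarget=nfs://home/gpfs/homefs/sw
# Snapshot-like cache; local changes are never pushed back to home
mmcrfileset cachefs lu_data --inode-space new -p afmMode=local-updates -p afmTarget=nfs://home/gpfs/homefs/lu
```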

It Works!!

10

It Works!! – Bonnie++

11

Latency in µs (lower is better):

                     Random seeks   Seq Create   Seq Read   Seq Delete   Random Create   Random Read
GPFS Multicluster          399000        59725       9073         6606           79584          8100
AFM                         33859        13530       8241         3183           18131          1319

It Works!! – MPIIO benchmark

12

Elapsed time in seconds (lower is better); MC = GPFS multicluster:

        0.03 GB     0.3 GB      2.5 GB       20 GB
MC     0.553378   3.683378   24.088768  301.645924
AFM    0.710782   1.536846   12.676661  105.639763

Benchmark Conclusion

•  Results are nice, but not surprising
   – Cached data access performs much better than WAN access
   – Latencies are improved
   – WAN link usage is reduced
   – Still, it is nice to get the expected result

•  Cases where it pays off the most
   – Data writes that do not flood the cache
   – Reads of data that is already in the cache
   – Re-reads while the data is still in the cache
   – Data pre-fetch is possible (but sysadmin-triggered)

13

Other Potential Use Cases

With some creativity, AFM could also serve as:

– A migration tool from an arbitrary filesystem to GPFS
– Scheduled data pre-fetch, so data is ready when you need it (see the sketch below)
– An interactive gateway to remote data, with or without write access
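For the scheduled pre-fetch case, AFM can be told to populate the cache ahead of time with mmafmctl. A hedged sketch, assuming hypothetical file names; the prefetch option spelling varies between GPFS releases:

```
# Build a list of files the users will need tomorrow morning
cat > /tmp/prefetch.list <<EOF
/gpfs/cachefs/projects/run42/input.dat
/gpfs/cachefs/projects/run42/mesh.h5
EOF
# Queue them for fetching from home into the cache fileset
mmafmctl cachefs prefetch -j projects_cache --list-file /tmp/prefetch.list
```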

14

Early Experience Report

•  Pilot project means that it is not designed for production
   – Non-redundant hardware caused issues

•  The OS can misbehave
   – NFS daemon bugs, being worked on by Red Hat

•  GPFS AFM can misbehave
   – Issues were quickly resolved; no outstanding issues

•  Murphy can still pay you a visit
   – A firmware issue with the storage caused data loss

15

Conclusion

16

Pros
•  Feel at home all around the world
•  Reduced cost

Cons
•  Single bottleneck: the NFS link
•  NFS bugs

What is missing?
•  Multi-writer mode, coming in GPFS 3.5 by the end of 2013

17

A special thanks to:
•  Tacchella Davide (CSCS)
•  Kalyan Gunda (IBM)