concept: well-managed provisioning of storage space on osg sites owned by large communities, for...


Upload: joel-mclaughlin

Post on 13-Jan-2016


TRANSCRIPT

Page 1: Concept: Well-managed provisioning of storage space on OSG sites owned by large communities, for usage by other science communities in OSG. Examples –Providers:

Opportunistic Storage on OSG

• Concept: Well-managed provisioning of storage space on OSG sites owned by large communities, for usage by other science communities in OSG.

• Examples
– Providers: CMS, ATLAS.
– Consumers: D0, CDF, …, DES, SBGrid.

Page 2:

Procedure

• A provisioning site implements the model, makes space allocations when needed, and advertises the ‘token’ to the consumer VO.

• The technological model leverages:
– Space reservation functions in the SRM v2.2 spec.
– If applicable at a site, dCache filesystem internals and disk partitioning.

• Space allocation at a storage site:
– Based on a formal understanding between provider and consumer.
– Allocation made with a well-defined size and lifetime.
  • E.g., 1 TB for 1 year.
– Space expected to expire after the due lifetime.
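
As an illustration only, the allocation rules above (a fixed size, a fixed lifetime, expiry at the end of the lifetime) can be sketched as a small data model. This is not SRM or dCache code: the class `SpaceAllocation`, its fields, the token name `D0_OPPORTUNISTIC`, the start date, and the decimal definition of a terabyte are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

TB = 10**12  # bytes; assumes a decimal terabyte


@dataclass
class SpaceAllocation:
    """Hypothetical record of one opportunistic space allocation."""
    token: str            # identifier advertised to the consumer VO
    consumer_vo: str      # VO the space was allocated to
    size_bytes: int       # agreed size of the allocation
    created: datetime     # when the allocation was made
    lifetime: timedelta   # agreed lifetime of the allocation
    used_bytes: int = 0   # space already consumed

    @property
    def expires(self) -> datetime:
        # The space is expected to expire once the lifetime has elapsed.
        return self.created + self.lifetime

    def is_expired(self, now: datetime) -> bool:
        return now >= self.expires

    def can_store(self, nbytes: int, now: datetime) -> bool:
        # A write fits only if the allocation is still live and has room.
        return (not self.is_expired(now)
                and self.used_bytes + nbytes <= self.size_bytes)


# The "1 TB for 1 year" example from the slide, with an assumed start date.
alloc = SpaceAllocation(
    token="D0_OPPORTUNISTIC",  # hypothetical token name
    consumer_vo="D0",
    size_bytes=1 * TB,
    created=datetime(2009, 1, 1),
    lifetime=timedelta(days=365),
)
```

Calling `alloc.can_store(nbytes, now)` before a transfer is one way a client could avoid writing into space that has expired or is already full.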

Page 3:

Technology Areas in need of Improvement

• Token lifetime flexibility: In current implementations, altering the lifetime of a token is not possible. This can lead to unintended expiration of tokens, and loss of data in the expired space. This flexibility will be required for re-negotiation of space allocations.

• Token access control consistency: If a token identifier is widely known, there is a potential for VOs to write into their own areas by using another VO's token. Strict access control over space allocations will be required for wider usage of opportunistic storage.

• Token advertisement: Within the limitations of token access control, dynamic mechanisms to advertise tokens using the Generic Information Provider (GIP) will be useful for wider deployment of opportunistic storage.

• Pure opportunistic throttles: If a site does not physically partition its storage by separating subsets of disks, there is a risk that opportunistic load (disk I/O, CPU load, and network I/O) will overlay and interfere with the main provider’s own transfers. Overall, this is not a major problem in the short term. In the long term, however, new internal mechanisms for separating data-mover queues on a per-VO or per-token basis will be required on disks.
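
The token access control concern above can be illustrated with a minimal sketch, assuming the storage system knows both the VO that owns a token and the authenticated VO of the requester. The names `SpaceToken`, `may_write`, and `D0_SPACE` are hypothetical and do not correspond to any SRM or dCache interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceToken:
    """Hypothetical space token bound to the VO it was allocated to."""
    identifier: str  # may be widely known, so it is not treated as a secret
    owner_vo: str    # VO the space was allocated to


def may_write(token: SpaceToken, requester_vo: str) -> bool:
    # Knowing the identifier is not enough: the authenticated VO of the
    # requester must match the VO that owns the allocation.
    return requester_vo == token.owner_vo


d0_token = SpaceToken(identifier="D0_SPACE", owner_vo="D0")
```

Under this check, a request from CDF that presents D0's token identifier would be rejected, while the same request from D0 would be accepted.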

Page 4:

D0’s Needs

• D0 typically submits 60,000-100,000 jobs per week at 20-25 sites on OSG. The experiment’s workflows make multiple requests for input data in quick succession.

• In the past, due to a lack of storage local to the processing sites, D0 input/output data had to be transferred in real time over the wide area network.

• This led to high latencies, job timeouts, job failures, and very low overall efficiency.

Page 5:

D0’s Solution

• D0 started using opportunistic storage in Summer ’08.

• D0 and OSG worked together to make changes in D0’s workflow to adapt to SRM client-side usage.

• Main providers
– CMS: Tier-2s at UCSD, UNL, Purdue.
– ATLAS: MidWest Tier-2 at IU, Great Lakes Tier-2 at MSU.

• Results:
– Ready availability of space for data movement and storage.
– Increase in D0’s workflow success rate.
– Increase in D0’s efficiency of OSG wall hours utilization.
– Increase in D0 event production.