LCG/EGEE Grid Incident Response

Page 1: LCG/EGEE Grid Incident Response

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

LCG/EGEE Grid Incident Response

Ian Neilson, Grid Deployment Group, CERN

TERENA NRENS-Grids Workshop

12th May 2005, Amsterdam

Page 2: LCG/EGEE Grid Incident Response

TOC

• Background
  – Grids
  – Grid Projects
  – Grid Environment

• Incident Handling Guide
  – Requirements
  – Requests

• Operational Aspects
  – Project Environment
  – Security Coordination Team

• Planning
  – Use-case Testing
  – Service Challenges

Page 3: LCG/EGEE Grid Incident Response

Grids

“[Grids] enable the sharing, exchange, discovery, and aggregation of resources distributed across multiple administrative domains ...” - Sun Microsystems

Virtual Organisations

Middleware Job Managers

Computing Elements

Resource Brokers

Proxy Servers

Storage Resource Manager

LCG – LHC Computing Grid

EGEE – Enabling Grids for e-Science in Europe

OSG – Open Science Grid

GridPP – Grid Particle Physics

PPDG – Particle Physics Data Grid

Globus Toolkit

Virtual Data Toolkit

gLite

Page 4: LCG/EGEE Grid Incident Response

EGEE in one slide

• 70 institutions in 28 countries, federated in regional clusters

• 32 MEUR for first 2 years (plans for another 2 years)

• Deployment and reengineering project

• 50% operations & support, 25% training & appl. support, 25% reengineering

Page 5: LCG/EGEE Grid Incident Response

Computing Resources: April 2005

[Map legend: country providing resources; country anticipating joining]

• In LCG-2: 131 sites, 30 countries, >12,000 CPUs, ~5 PB storage

• Includes non-EGEE sites: 9 countries, 20 sites

Page 6: LCG/EGEE Grid Incident Response

LCG/EGEE Security environment

• The players
  – Users: personal data, roles, usage patterns, …
  – VOs: experiment data, access patterns, membership, …
  – Sites: resources, availability, accountability, …

[Diagram label: Grid]

Page 7: LCG/EGEE Grid Incident Response

The Risks

• Top risks from Security Risk Analysis
  – http://proj-lcg-security.web.cern.ch/proj-lcg-security/RiskAnalysis/risk.html
  – Launch attacks on other sites
      Large distributed farms of machines
  – Illegal or inappropriate distribution or sharing of data
      Massive distributed storage capacity
  – Disruption by exploit of security holes
      Complex, heterogeneous and dynamic environment
  – Damage caused by viruses, worms etc.
      Highly connected and novel infrastructure

Page 8: LCG/EGEE Grid Incident Response

Joint Security Policy Group

Incident Response

http://cern.ch/proj-lcg-security/documents.html

Security & Availability Policy

Usage Rules

Certification Authorities

Audit Requirements

VO Security Policy (Draft)

Application Development & Network Admin Guide

User Registration

Site Registration

Page 9: LCG/EGEE Grid Incident Response

Incident Response

• Overview
  – LCG Security Group Agreement on Incident Response
      June 2003, LCG-1
      https://edms.cern.ch/document/428035/1
  – Updated as The OSG Incident Handling and Response Guide
      Developed with JSPG
      https://edms.cern.ch/file/428035/2/OSG_incident_handling_v1.0.pdf
      “To guide the development and maintenance of a common capability for handling and response to cyber security incidents on Grids.”
  – Aims to establish common policies and processes, organizational structures, cross-organizational relationships, common communications methods, and a modicum of centrally-provided services and processes.

Grid Incident definition: “..event that poses a .. threat [to] the integrity of services, resources, infrastructure, or identities.”

Page 10: LCG/EGEE Grid Incident Response

Incident Response

• The OSG Incident Handling and Response Guide
  – What it mandates (MUST do’s)
      REPORT
      RESPOND
      PROTECT information gathered
      ANALYSE
  – What it recommends (SHOULD do’s)
      Provide monitored contact mailing lists at sites
      Public Disclosure (summary) through site Public Relations
      Use signed mails

• See also Andrew Cormack’s draft “CSIRTs and Grids” comparison.

Page 11: LCG/EGEE Grid Incident Response

Incident Response

• Reporting (MUST)
  – Provide contact information
      Individual contacts
      Monitored list (optional but HIGHLY desirable)
      Management through GOCDB (?soon)
  – Report to LOCAL site security = sites should have local plan
      Does not replace or interfere with local plans
  – Report to [email protected]
      Initial incident notification only, no chat
      Closed list
      Filtered abuse@.. & security@..
      Currently we use [email protected]
        • -egee- alias
        • Open list hence no moderated lists

Page 12: LCG/EGEE Grid Incident Response

Incident Response

• Responding (MUST)
  – Initial Classification
      Low, Medium, High classifications
  – Containment
      Assumes local containment process in place
      Attacks through the grid
        • Default action to block grid access initially
            o Authorization control MUST be provided for services
      Attacks on the grid
        • Little/no possible central control
        • Notify the attacking site (NREN CSIRTs)
        • Coordination of blocking, restoration of service
  – Notification
      [email protected]
      User, VO if identity compromise
      Management
  – Post-Incident Analysis
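
The containment default for attacks through the grid ("block grid access initially", with authorization control provided by each service) can be illustrated with a small sketch. This is not taken from the guide: the class name `AuthorizationControl`, its methods, and the example subject DN are hypothetical, and a real service would enforce this in its own authorization layer.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorizationControl:
    """Toy per-service access control keyed on certificate subject DNs (hypothetical)."""

    allowed: set[str] = field(default_factory=set)
    blocked: set[str] = field(default_factory=set)

    def block(self, subject_dn: str) -> None:
        # Containment step: deny grid access first, investigate afterwards.
        self.blocked.add(subject_dn)

    def restore(self, subject_dn: str) -> None:
        # Restoration of service once the incident has been handled.
        self.blocked.discard(subject_dn)

    def is_authorized(self, subject_dn: str) -> bool:
        # A block entry always wins over an allow entry.
        return subject_dn in self.allowed and subject_dn not in self.blocked


# Usage: block a suspect identity, then restore it after analysis.
control = AuthorizationControl(allowed={"/O=Example/CN=alice"})
control.block("/O=Example/CN=alice")
blocked_result = control.is_authorized("/O=Example/CN=alice")
control.restore("/O=Example/CN=alice")
restored_result = control.is_authorized("/O=Example/CN=alice")
```

Letting the block list take precedence keeps the initial response simple: a site can deny access without first working out how the identity was authorized.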

Page 13: LCG/EGEE Grid Incident Response

Operational Security Coordination

• Operational Security Coordination Team - OSCT

• EGEE operational channels are still being established.
• No central authority over sites

[Diagram labels: OSCT, ROC, RC, CIC/GOC, CSIRT, “External” GRID, Media/Press “PR”]

• Incident Response Planning
• Best Practice Information
• Security Monitoring
• Security Service Challenges

Page 14: LCG/EGEE Grid Incident Response

Operational issues

• Recognising and reporting
• What is a local CSIRT?
  – Scale of coverage
      24x7 site/campus network operations team
      Department Security Officer
      LCG system administrator
• Who is a security contact?
  – as above
• Contact management
• Intersection with local CSIRT procedures
  – Local quarantine and analysis
• Keeping emergency channels clear
  – Discussions, cross-postings

Page 15: LCG/EGEE Grid Incident Response

Incident Response Planning

• Response Planning Objectives
  – Provide a framework to use when something happens
  – But must be usable flexibly
  – Can be tested

• Classification Based ‘Use Cases’
  – LOW
      e.g. Local single non-privileged identity compromised, local denial of service.
  – MEDIUM
      e.g. Local privileged identity compromised, attack on grid service not threatening grid stability.
  – HIGH
      e.g. Exploitation of trust fabric, attack leading to grid instability or denial of service against all service replicas.
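
The three use-case classes can be read as a simple decision rule. The sketch below encodes only the examples given on this slide; the function `classify` and its flag names are hypothetical, and an actual classification would rest on the judgement of the responding site rather than four booleans.

```python
from enum import Enum


class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify(
    *,
    privileged_identity: bool = False,
    grid_service_attacked: bool = False,
    trust_fabric_exploited: bool = False,
    threatens_grid_stability: bool = False,
) -> Severity:
    """Map incident attributes to a class, checking the most severe case first."""
    # HIGH: trust fabric exploited, or grid instability / denial of service
    # against all service replicas.
    if trust_fabric_exploited or threatens_grid_stability:
        return Severity.HIGH
    # MEDIUM: local privileged identity compromised, or an attack on a grid
    # service that does not threaten grid stability.
    if privileged_identity or grid_service_attacked:
        return Severity.MEDIUM
    # LOW: e.g. a single local non-privileged identity, local denial of service.
    return Severity.LOW


# Usage: a compromised local root account with no wider impact.
example = classify(privileged_identity=True)
```

Checking the most severe conditions first means an incident that matches several examples is always placed in the highest applicable class.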

Page 16: LCG/EGEE Grid Incident Response

Security Service Challenges

• Objectives
  – Simulating small, well defined security incidents.
  – Learn and iterate to update procedures.
  – Formalise in updated incident response procedures.
  – Feedback to development and testing activities.

• Exercise response procedures in controlled manner
  – Non-intrusive
      Compute resource usage trace to owner
        • Run a job, can we trace it back to submission?
        • SSC1 in testing phase now.
        • Future ?SSC2
      Storage resource usage trace to owner
        • Run a job to store a file
      Disruptive
        • Disrupt a service and map the effects on the service and grid
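
The question behind the first challenge ("Run a job, can we trace it back to submission?") amounts to following identifiers backwards through each service's records until one names the submitter. The sketch below is a toy illustration under invented assumptions: the record layout, the service names, the job identifiers, and the subject DN are all hypothetical and do not reflect real LCG log formats.

```python
from typing import Optional

# Hypothetical records: each service logs the identifier it gave the job and
# the identifier under which it received the job from the previous service.
RECORDS = [
    {"service": "submission-host", "job_id": "ui-17", "upstream_id": None,
     "subject_dn": "/O=Example/CN=alice"},
    {"service": "resource-broker", "job_id": "rb-204", "upstream_id": "ui-17"},
    {"service": "computing-element", "job_id": "ce-9931", "upstream_id": "rb-204"},
]


def trace_to_owner(job_id: str, records: list[dict]) -> Optional[str]:
    """Walk upstream from job_id; return the submitter's DN, or None if the chain breaks."""
    by_id = {record["job_id"]: record for record in records}
    seen = set()  # guards against cyclic references in bad data
    current = by_id.get(job_id)
    while current is not None and current["job_id"] not in seen:
        seen.add(current["job_id"])
        if current.get("subject_dn"):
            return current["subject_dn"]
        current = by_id.get(current.get("upstream_id"))
    return None


# Usage: start from the identifier seen on the compute resource.
owner = trace_to_owner("ce-9931", RECORDS)
```

A `None` result is the interesting outcome for a challenge: it marks the point where a site's records were not sufficient to link resource usage to its owner.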

Page 17: LCG/EGEE Grid Incident Response

Summary

• Diverse and complex Grid environments

• We have
  – Basic Incident Response proposals in place
  – Basic Organisational structures in place

• We need to implement through
  – Testing and awareness through Service Challenges
  – Improving planning process in OSCT

Page 18: LCG/EGEE Grid Incident Response

Thank You

Thanks to UK PPARC for my funding in LCG