agentcities - agents and grids
February 20, 07 [email protected]
AgentCities - Agents and Grids
Prof Mark Baker
ACET, University of Reading Tel: +44 118 378 8615 E-mail: [email protected] Web: http://acet.rdg.ac.uk/~mab
Thoughts on Monitoring and Agents
Outline
• Monitoring: What is it?
• A View of Grid Monitoring.
• Ganglia Example.
• Generic Monitoring Architecture
• A Layered View.
• Monitoring Issues.
• Where do Agents fit in?
• Summary.
Monitoring: What is it?
• Monitoring is part of the process of administering and managing computer-based resources:
  – However, the term “monitoring” is rather an overloaded word.
• The term implies that we are effectively “watching” the state of some component or resource.
• This type of passive monitoring (read only) is useful in some spheres (e.g. job submission), but has limited usefulness for actually managing these computer-based resources.
• Dynamic monitoring (read/write) is more useful because we can not only watch the status of the resources, but also interact with them to control and manage them (e.g. reconfigure on the fly, change QoS settings, queue priorities…).
A View of Grid Monitoring
• The traditional view of monitoring is looking at static and dynamic computer-based resource information:
  – Static Information:
    • For example - CPU type, amount of memory, OS type…
  – Dynamic Information:
    • For example - CPU, memory, disk use.
• The information gathered can be used for all manner of tasks:
  – Basic systems monitoring (sys admin tasks),
  – General accounting,
  – Monitoring for job submission purposes (want to choose the best resource for task placement),
  – Monitoring to ensure QoS,
  – Policing SLAs,
  – Performance profiling of systems and applications (looking for bottlenecks and other problems),
  – Potentially for security reasons.
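The static/dynamic split above can be illustrated with a minimal sketch using only the Python standard library. The function names `static_info` and `dynamic_info` and the choice of metrics are assumptions made for illustration; they are not part of any monitoring system named in these slides.

```python
import os
import platform
import shutil
import time


def static_info():
    """Attributes that rarely change, so they can be read once and cached."""
    return {
        "cpu_type": platform.machine(),
        "cpu_count": os.cpu_count(),
        "os_type": platform.system(),
    }


def dynamic_info(path="."):
    """Attributes that change continuously, so they are re-sampled on every poll."""
    usage = shutil.disk_usage(path)
    # Guard against a filesystem that reports zero total size.
    used_fraction = usage.used / usage.total if usage.total else 0.0
    return {
        "timestamp": time.time(),
        "disk_total_bytes": usage.total,
        "disk_used_bytes": usage.used,
        "disk_used_fraction": used_fraction,
    }


if __name__ == "__main__":
    print(static_info())
    print(dynamic_info())
```

A consumer such as a job broker would typically read the static record once per resource and poll the dynamic record at an interval chosen to limit intrusiveness.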
Ganglia
Generic Architecture (Local)
[Diagram. Labelled components: Local Grid Resource 1, 2 … n; Agent/Sensor (one per resource); Performance Information Gathering Protocols: SNMP, WBEM…; Grid Site Resource Monitor Webserver (Servlets); Local Cache (Database); Gather Performance Statistics; Resource Warnings & Alerts; Resource and Historical Performance Data; Remote (registered) Grid Sites.]
Generic Architecture (Global)
[Diagram. Labelled components: IDC GridRM Gateway (Servlet) at Grid Site A, Grid Site B and Grid Site C; GMA Directory; Register (one arrow per gateway); Request/Response; Web Client; numbered steps 1, 2, 3.]
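The register and request/response pattern labelled in the global diagram can be sketched in-process as below. This is only an illustration under stated assumptions: the class names `Directory` and `Gateway`, the function `client_query`, and the order of steps (gateways register, the client looks up a site, then queries its gateway) are hypothetical and are not taken from GridRM or from any GMA implementation.

```python
class Directory:
    """Hypothetical stand-in for a directory that gateways register with."""

    def __init__(self):
        self._gateways = {}

    def register(self, site, gateway):
        # A gateway announces itself under its site name.
        self._gateways[site] = gateway

    def lookup(self, site):
        # Returns None when the site has not registered.
        return self._gateways.get(site)

    def sites(self):
        return sorted(self._gateways)


class Gateway:
    """Hypothetical per-site gateway answering requests about local resources."""

    def __init__(self, site, resources):
        self.site = site
        self._resources = dict(resources)

    def handle_request(self, resource):
        if resource not in self._resources:
            return {"site": self.site, "resource": resource, "error": "unknown resource"}
        return {"site": self.site, "resource": resource, "data": self._resources[resource]}


def client_query(directory, site, resource):
    """Look the site up in the directory, then send the request to its gateway."""
    gateway = directory.lookup(site)
    if gateway is None:
        return {"site": site, "error": "site not registered"}
    return gateway.handle_request(resource)


if __name__ == "__main__":
    directory = Directory()
    directory.register("A", Gateway("A", {"node1": {"cpu_load": 0.4}}))
    directory.register("B", Gateway("B", {"node7": {"cpu_load": 0.9}}))
    print(client_query(directory, "A", "node1"))
    print(client_query(directory, "C", "node1"))
```

In a real deployment each call would cross the network (the diagram shows servlets and a web client); the sketch keeps everything in one process so the control flow is easy to follow.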
Data Management Issues
• Need to produce:
  – A simple and expressive API,
  – Device drivers and manager for each Agent,
  – A means of describing the monitored data:
    • Implies an XML-based schema and an ontology.
[Diagram: layered view. Labels: Agent API; Common Agent API; SNMP Agent, NWS Agent, NetL Agent, WBEM Agent, SCM Agent, XYZ Agent; Ontologies and Schema; Resource Markup Language; API; Agent Devices; Agent Driver Manager; Driver Manager.]
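One way to picture a common agent API sitting on top of per-protocol drivers is the sketch below. Everything in it is an assumption made for illustration: `AgentDriver`, `FakeSnmpDriver`, `DriverManager`, the `protocol://host` addressing and the XML element names are hypothetical, and the fake driver returns canned values rather than speaking SNMP.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class AgentDriver(ABC):
    """Common interface that every protocol-specific driver implements."""

    protocol = ""

    @abstractmethod
    def fetch(self, host):
        """Return a dict of metric name to value for the given host."""


class FakeSnmpDriver(AgentDriver):
    """Placeholder driver with canned data; a real one would query an SNMP agent."""

    protocol = "snmp"

    def fetch(self, host):
        return {"cpu_load": "0.42", "mem_free_mb": "512"}


def to_xml(host, protocol, values):
    """Describe the gathered data in a simple, made-up XML layout."""
    root = ET.Element("resource", {"host": host, "protocol": protocol})
    for name, value in sorted(values.items()):
        ET.SubElement(root, "metric", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")


class DriverManager:
    """Selects a driver by the protocol prefix of a 'protocol://host' address."""

    def __init__(self):
        self._drivers = {}

    def register(self, driver):
        self._drivers[driver.protocol] = driver

    def query(self, url):
        protocol, _, host = url.partition("://")
        driver = self._drivers.get(protocol)
        if driver is None:
            raise LookupError("no driver registered for protocol %r" % protocol)
        return to_xml(host, protocol, driver.fetch(host))


if __name__ == "__main__":
    manager = DriverManager()
    manager.register(FakeSnmpDriver())
    print(manager.query("snmp://node1"))
```

The design choice mirrored here is that callers see one API and one data description, while protocol differences stay inside the drivers; supporting another agent type means registering one more driver.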
Some Architectural Issues
• Sensors/Agents:
  – Make everyone install custom agents, or use existing ones!
    • Potentially billions of resources that need monitoring!
• Protocols:
  – No real standards apart from SNMP.
  – XML used extensively now - GLUE often used (limited).
• Resources versus Services:
  – On-going debate.
• Scalability:
  – Need global extent; current systems are typically designed for small scale, based on cluster monitoring.
• Security:
  – Often little or no security.
  – OK for read-only systems, but…
• Intrusiveness:
  – Trade-off as usual, do not want to affect systems monitored.
Monitoring Systems
• A recent review showed that there are about twenty active Grid-based monitoring systems.
• These range from systems:
  – That are “built from scratch” - to use such a system you need to install all of their software for monitoring purposes,
  – To those that are built on existing infrastructure and standards - they gather SNMP/Ganglia data and use this for monitoring purposes.
• The latter systems are becoming increasingly popular and are widely used today.
Where do Agents fit in with Monitoring?
• Agent booklet definition:
  – “An agent is a computer system that is capable of flexible autonomous action in dynamic, unpredictable, typically multi-agent domains.”
• According to this definition we “just” throw away what we have and start again with agents!
• However, there is a raft of very practical problems…
  – Not least among these is that most of the world does not use agent-based technologies, and does not want to replace its monitoring infrastructure with something new and unproven.
Where do Agents fit in with Monitoring?
[Diagram: the layered view extended with consumers of monitoring data. Labels: Agent/Sensor API; Common Agent/Sensor API; SNMP Agent, NWS Agent, NetL Agent, WBEM Agent, SCM Agent, XYZ Agent; Ontologies and Schema; API; Agent Devices; Agent/Sensor Driver Manager; Driver Manager; Data/Information; Intelligence/Knowledge; Brokers, Schedulers, Policing; Clients; Intelligent Tools.]
Where do Agents fit in with Monitoring?
• Not practical to replace existing monitoring infrastructure with agents.
• However, there is a vast space in which agents can process the data/information gathered and use it to provide intelligence/knowledge to higher-level tools.
• Key agent features:
  – Intelligence - rule-based decision making.
  – Complex agent-to-agent interaction - to produce knowledge for more sophisticated decision making.
• Potential problems:
  – Integrating agent frameworks and the Grid, APIs, and protocols - practical aspects of wide-scale deployment!
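As a rough illustration of rule-based decision making over gathered monitoring data, the sketch below evaluates a list of condition/action rules against one sample. The `Rule` structure, the metric names, the thresholds and the action strings are all hypothetical; no agent framework is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A named condition over a monitoring sample, plus the action it suggests."""

    name: str
    condition: Callable[[Dict[str, float]], bool]
    action: str


def evaluate(rules: List[Rule], sample: Dict[str, float]) -> List[str]:
    """Return the actions of every rule whose condition holds, in rule order."""
    return [rule.action for rule in rules if rule.condition(sample)]


# Illustrative thresholds only; missing metrics default to a "healthy" value.
RULES = [
    Rule("high_cpu", lambda s: s.get("cpu_load", 0.0) > 0.9, "lower queue priority"),
    Rule("low_disk", lambda s: s.get("disk_free_fraction", 1.0) < 0.1, "alert administrator"),
]


if __name__ == "__main__":
    print(evaluate(RULES, {"cpu_load": 0.95, "disk_free_fraction": 0.5}))
    print(evaluate(RULES, {"cpu_load": 0.2, "disk_free_fraction": 0.05}))
```

The point of the sketch is the separation of concerns: the existing monitoring infrastructure supplies the sample, and the agent layer only decides what the sample means.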
Where do Agents fit in with Monitoring?
• SLA/QoS/site-policy policing.
• Intelligent brokering for a range of tasks:
  – Negotiation,
  – Bartering,
  – Arbitration,
  – Job submission,
  – Resource reservation.
• Accounting tools.
• Autonomic behaviour - help in providing self-healing capabilities of distributed systems.
• Working with Semantic Web technologies to create/provide knowledge.
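Two of the roles listed above, SLA policing and brokering for job submission, can be sketched together: first discard resources whose latest sample violates the SLA, then pick the least loaded of the rest. The function names, the representation of an SLA as per-metric maxima, and the use of `cpu_load` as the ranking metric are assumptions for illustration only.

```python
def violations(sla, sample):
    """Return the sorted names of metrics whose sampled value exceeds the SLA maximum."""
    return sorted(
        metric for metric, limit in sla.items() if sample.get(metric, 0.0) > limit
    )


def choose_resource(sla, samples):
    """Pick the SLA-compliant resource with the lowest cpu_load, or None if none comply."""
    compliant = {
        name: sample for name, sample in samples.items() if not violations(sla, sample)
    }
    if not compliant:
        return None
    return min(compliant, key=lambda name: compliant[name].get("cpu_load", 0.0))


if __name__ == "__main__":
    sla = {"cpu_load": 0.8, "queue_length": 10}
    samples = {
        "siteA": {"cpu_load": 0.95, "queue_length": 2},
        "siteB": {"cpu_load": 0.40, "queue_length": 4},
        "siteC": {"cpu_load": 0.30, "queue_length": 25},
    }
    print(violations(sla, samples["siteA"]))
    print(choose_resource(sla, samples))
```

Negotiation, bartering and arbitration would need agent-to-agent interaction that this single-process sketch does not attempt.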
Summary
• Well-established monitoring infrastructure exists for existing distributed systems - clusters, LANs, the Grid…
• Higher-level tools/services that use the gathered monitoring data are few and far between - this seems a good space where agent-based systems can work.
• Need “intelligence” to provide knowledge to consumers of Grid-based services.
• Not necessarily easy to put agent and Grid infrastructure together: there are various issues, such as security, different architectures, APIs, protocols…