computational infrastructure for policy informatics


Post on 13-Jan-2016






Computational Infrastructure for Policy Informatics. Policy Informatics in an Interdependent World Workshop, Washington DC, September 13 2007. Geoffrey Fox; Computer Science, Informatics, Physics; Pervasive Technology Laboratories; Indiana University, Bloomington IN 47401. PowerPoint PPT Presentation


  • Computational Infrastructure for Policy Informatics
    Policy Informatics in an Interdependent World Workshop
    Washington DC, September 13 2007

    Geoffrey Fox
    Computer Science, Informatics, Physics
    Pervasive Technology Laboratories
    Indiana University, Bloomington IN 47401

  • e-moreorlessanything
    • "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it." (from its inventor John Taylor, Director General of Research Councils UK, Office of Science and Technology)
    • e-Science is about developing tools and technologies that allow scientists to do faster, better or different research
    • Similarly e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world
    • This generalizes to e-moreorlessanything, including presumably e-Policyinformatics
    • A deluge of data of unprecedented and inevitable size must be managed and understood
    • People (see Web 2.0), computers, data and instruments must be linked
    • On demand assignment of experts, computers, networks and storage resources must be supported

  • Role of Cyberinfrastructure
    • Cyberinfrastructure is infrastructure that supports distributed science (e-Science): data, people, computers
    • Exploits Internet technology (Web 2.0), adding (via Grid technology) management, security, supercomputers etc.
    • It has two aspects: parallel, with low latency (microseconds) between nodes, and distributed, with highish latency (milliseconds) between nodes
    • Parallel is needed to get high performance on individual large simulations, data analysis etc.; one must decompose the problem
    • Distributed aspect integrates already distinct components; especially natural for data
    • Cyberinfrastructure is in general a distributed collection of parallel systems
    • Cyberinfrastructure is made of services (originally Web services) that are just programs or data sources packaged for distributed access

  • Structure of Cyberinfrastructure
    • Distributed software systems are being revolutionized by developments from e-commerce, e-Science and the consumer Internet. There is rapid progress in technology families termed Web services, Grids and Web 2.0
    • The emerging distributed system picture is of distributed services with advertised interfaces but opaque implementations, communicating by streams of messages over a variety of protocols
    • Complete systems are built by combining either services or predefined/pre-existing collections of services together to achieve new capabilities
    • As well as Internet/Communication revolutions (distributed systems), multicore chips will likely be hugely important (parallel systems)
    • Industry, not academia, is leading innovation in these technologies

  • Policy Informatics Infrastructure
    • The Party Line approach is clear: one creates a Cyberinfrastructure consisting of distributed services accessed by portals/gadgets/gateways/RSS feeds
    • Services include:
      • Original data
      • Transformations or filters implementing the DIKW (Data Information Knowledge Wisdom) pipeline
      • Final Decision Support step converting wisdom into action
      • Generic services such as security, profiles etc.
    • Some filters could correspond to large simulations
    • Infrastructure will be set up as a System of Systems (Grids of Grids)
      • Services and/or Grids just accept some form of DIKW and produce another form of DIKW
      • Original data has no explicit input; just output

  • [Diagram: Grids of Grids. A Portal sits above a network of Sensor Services (SS), Filter Services (FS), Other Services (OS) and MetaData (MD), with further services labelled Another Service and Another Grid, all linked by Inter-Service Messages. Stages shown: Raw Data, Data, Information, Knowledge, Wisdom, Decisions]
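The idea that every service simply accepts one form of DIKW and produces another can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Message` type, the service names (`sensor_service`, `summarize`, `detect_anomaly`, `advise`, `decision_support`), the sample readings and the 1.5x threshold rule are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Message:
    """One inter-service message: a DIKW level plus its payload (assumed layout)."""
    level: str      # "data", "information", "knowledge", "wisdom" or "decision"
    payload: Any


def sensor_service() -> Message:
    # Original data has no explicit input; it only produces output.
    return Message("data", [12, 11, 35, 12])


def summarize(msg: Message) -> Message:
    # Filter: data -> information (simple summary statistics).
    values = msg.payload
    return Message("information", {"mean": sum(values) / len(values), "max": max(values)})


def detect_anomaly(msg: Message) -> Message:
    # Filter: information -> knowledge (hypothetical rule: max well above mean).
    info = msg.payload
    return Message("knowledge", {"anomaly": info["max"] > 1.5 * info["mean"]})


def advise(msg: Message) -> Message:
    # Filter: knowledge -> wisdom (what the finding means).
    text = "readings are unusual" if msg.payload["anomaly"] else "readings are normal"
    return Message("wisdom", text)


def decision_support(msg: Message) -> Message:
    # Final Decision Support step: wisdom -> action.
    action = "investigate" if "unusual" in msg.payload else "no action"
    return Message("decision", action)


def run_pipeline(source: Callable[[], Message],
                 filters: List[Callable[[Message], Message]]) -> Message:
    """Pass the source's message through each filter service in turn."""
    msg = source()
    for service in filters:
        msg = service(msg)
    return msg


result = run_pipeline(sensor_service, [summarize, detect_anomaly, advise, decision_support])
print(result.level, "->", result.payload)
```

In a real deployment each function would be a remote service and the `Message` would travel over the network; the in-process calls here only show the shape of the pipeline.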

  • Information Management/Processing
    • Diagram describes e-Science, Military Command and Control and perhaps Policy Informatics
    • Data Information Knowledge Wisdom transformation
    • (SOAP or just RSS) messages transport information expressed in a semantically rich fashion between sources and services that enhance and transform information so that complete system provides
    • Semantic Web technologies like RDF and OWL might help us to have rich expressivity but they might be too complicated
    • We are meant to build application specific information management/transformation systems for each domain
    • Each domain has specific services/standards (for APIs and Information) and will use generic services (like R for datamining) and standards (RDF, WSDL)
    • What is PIML (Policy Informatics Markup Language)?
    • Standards made before consensus or not observant of technology progress are dubious (cf. HLA in simulation or many grid standards)
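As a rough illustration of "just RSS" messages carrying information plus a little metadata between a source and a service, the sketch below builds and reads back a small RSS 2.0 document with Python's standard `xml.etree.ElementTree`. The feed name, the `example.org` link, the use of `<category>` to carry a DIKW level, and the helper names `make_rss_item` and `read_rss_items` are all assumptions made for the example; the slides do not specify a message layout.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def make_rss_item(title: str, description: str, category: str) -> str:
    """Wrap one piece of information in a small RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Channel-level fields; all three values are hypothetical placeholders.
    ET.SubElement(channel, "title").text = "policy-feed"
    ET.SubElement(channel, "link").text = "http://example.org/feed"
    ET.SubElement(channel, "description").text = "Illustrative policy informatics feed"
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "description").text = description
    # Assumption: the category element is used to label the DIKW level.
    ET.SubElement(item, "category").text = category
    return ET.tostring(rss, encoding="unicode")


def read_rss_items(document: str) -> List[Dict[str, str]]:
    """What a receiving service might do: parse the items back into dicts."""
    root = ET.fromstring(document)
    return [{child.tag: child.text for child in item} for item in root.iter("item")]


doc = make_rss_item("Flu cases", "Weekly count: 42", "information")
print(read_rss_items(doc))
```

A SOAP envelope, or RDF inside the item, would give richer semantics at the cost of the extra complexity the slide warns about.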

  • Too much Computing?
    • Historically one has tried to increase computing capabilities by
      • Optimizing performance of codes
      • Exploiting all possible CPUs such as Graphics co-processors and idle cycles
      • Making central computers available such as NSF/DoE/DoD supercomputer networks
    • Next Crisis in technology area will be the opposite problem: commodity chips will be 32-128 way parallel in 5 years time and we currently have no idea how to use them, especially on clients
      • Only 2 releases of standard software (e.g. Office) in this time span
    • Gaming and Generalized decision support (data mining) are two obvious ways of using these cycles
      • Intel RMS analysis
    • Note even cell phones will be multicore
    • Too much data matched to Too much computing, but implications involved rather different
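To make "decompose the problem" concrete for a data-mining style task, here is a minimal data-parallel sketch: records are split into chunks, each chunk is mined independently, and the partial results are merged. The task (term counting), the function names and the sample records are assumptions chosen for brevity, not something taken from the talk.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split(records: List[str], parts: int) -> List[List[str]]:
    """Decompose the problem: deal the records out into `parts` chunks."""
    return [records[i::parts] for i in range(parts)]


def count_terms(chunk: List[str]) -> Counter:
    """Work done independently on one chunk: term frequencies."""
    counts: Counter = Counter()
    for record in chunk:
        counts.update(record.lower().split())
    return counts


def parallel_term_counts(records: List[str], workers: int = 4) -> Counter:
    chunks = split(records, workers)
    # ThreadPoolExecutor keeps the sketch portable. For CPU-bound pure-Python
    # work, ProcessPoolExecutor (same interface) is what would use several cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total: Counter = Counter()
        for partial in pool.map(count_terms, chunks):
            total.update(partial)   # reduce step: merge the partial counts
    return total


records = ["Policy data", "more policy data", "Data about data"]
print(parallel_term_counts(records, workers=2))
```

The merge step is cheap compared with the per-chunk work, which is what makes this kind of mining a plausible use for many cores.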

  • Intel's Projection (Pradeep K. Dubey)
    [Figure: RMS: Recognition, Mining, Synthesis. Questions shown: "What is …?", "Is it …?", "What if …?". Panel labels: Model; Find a model instance; Create a model instance; Model-based multimodal recognition; Real-time analytics on dynamic, unstructured, multimodal datasets; Photo-realism and physics-based animation; Model-less; Real-time streaming and transactions on static structured datasets; Very limited realism]

    [Figure: Recognition, Mining, Synthesis example. "What is a tumor?", "Is there a tumor here?", "What if the tumor progresses?" It is all about dealing efficiently with complex multimodal datasets]

  • Intel's Application Stack

  • What should we do?
    • There will be high quality parallel data mining algorithms
      • Speech Recognition, Text and multimedia search and browsers
      • New generation of desktop aides
    • What are synergies to Personal aides in an information rich world (future of PC?) and Policy Informatics?
    • What filters (data mining) does policy informatics need?
    • As computing is free, focus on identifying information/knowledge/wisdom needed (there is probably too much data but not so much wisdom in DIKW pipeline)
    • We should use supercomputer/computer services, but Information services are more important and less controversial
    • Identify standards for data and data-mining APIs
    • Set up distributed Policy Informatics Services
    • Use Web 2.0 (as it makes things easier) not current Grids (which make things harder)
    • Build a Programmable Policy Informatics Web
    • Emphasize Simplicity
    • Is Secrecy important and in fact viable?
    • Should we care just about original data or also about the whole pipeline DIKW?

  • Web 2.0 Mashups and APIs
    • … has (Sept 12 2007) 2312 Mashups and 511 Web 2.0 APIs, with GoogleMaps the most often used in Mashups
    • Mashups are called workflow in Grid arena

  • The List of Web 2.0 APIs
    • Each site has API and its features
    • Divided into broad categories
    • Only a few used a lot (49 APIs used in 10 or more mashups)
    • RSS feed of new APIs
    • Amazon S3 growing in popularity

  • Spare Slides

  • Grid Service Philosophy I
    • Services receive data in SOAP messages, manipulate it and produce transformed data as further messages
    • Knowledge is created from information by services
    • Information is created from data by services
    • Semantic Grid comes from building metadata rich systems of services
    • Meta-data is carried in SOAP messages
    • The Grid enhances Web services with semantically rich system and application specific management
    • One must exploit and work around the different approaches to meta-data (state) and their manipulation in Web Services

  • Grid Service Philosophy II
    • There are a horde of support services supplying security, collaboration, database access, user interfaces
    • The support services are either associated with system or application, where the former are WS-* and GS-* which implicitly or explicitly define many support services
    • There are generalized filter services which are applications that accept messages and produce new messages with some data derived from that in input
      • Simulations (including PDEs and reactive systems), Data-mining, Transformations, Agents, Reasoning and Decision making Tools are all termed filters here
    • Agent Systems are a special case of Grids
    • Peer-to-peer systems can be built as a Grid with particular discovery and messaging strategies

  • Grid Service Philosophy III
    • Filters can be a workflow, which means they are just collections of other simpler services
    • Grids are distributed systems that accept distributed messages and produce distributed result messages
    • A service or a workflow is a special case of a Grid
    • A collection of services on a multi-core chip is a Grid
    • Sensors or Instruments are managed by services; they may accept non SOAP control messages and produce data as messages (that are not usually SOAP)
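The claim that a workflow is just a collection of simpler services, and is itself usable wherever a service is, can be sketched with a small composite pattern. Everything named here (`Service`, `Workflow`, the `meta`/`body` message layout, the `path` metadata key and the sample services) is a hypothetical illustration, not an API from any Grid or Web service toolkit.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]   # assumed layout: {"meta": {...}, "body": ...}


class Service:
    """A named filter: accepts a message and produces a new message."""

    def __init__(self, name: str, transform: Callable[[Any], Any]):
        self.name = name
        self.transform = transform

    def handle(self, msg: Message) -> Message:
        # Metadata travels with the message; each service appends its name.
        meta = dict(msg["meta"])
        meta["path"] = list(meta.get("path", [])) + [self.name]
        return {"meta": meta, "body": self.transform(msg["body"])}


class Workflow(Service):
    """A collection of simpler services that can be used as one service."""

    def __init__(self, name: str, steps: List[Service]):
        self.name = name
        self.steps = steps

    def handle(self, msg: Message) -> Message:
        for step in self.steps:
            msg = step.handle(msg)
        return msg


clean = Service("clean", lambda xs: [x for x in xs if x is not None])
total = Service("total", sum)
double = Service("double", lambda x: 2 * x)

inner = Workflow("clean-and-total", [clean, total])
outer = Workflow("outer", [inner, double])   # a workflow nested inside a workflow

out = outer.handle({"meta": {"source": "sensor-1"}, "body": [1, None, 2, 3]})
print(out)
```

Because `Workflow` exposes the same `handle` method as `Service`, nesting works to any depth, which mirrors the Grids of Grids picture.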

  • Virtual Observatory Astronomy Grid: Integrate Experiments
    [Figure panels: Radio; Far-Infrared; Visible; Visible + X-ray; Dust Map; Galaxy Density Map]

  • Service or Web service Approach
    • One uses GML, CML etc. to define the data in a system and one uses services to capture methods or programs
    • In eScience, important services fall in three classes
      • Simulations
      • Data access, storage, federation, discovery
      • Filters for data mining and

