Peter Fox, Xinformatics 4400/6400, Week 10, April 9, 2013: Information management, workflow and...
![Page 1: 1 Peter Fox Xinformatics 4400/6400 Week 10, April 9, 2013 Information management, workflow and discovery /check-in for project definitions](https://reader036.vdocuments.net/reader036/viewer/2022070415/5697bf7c1a28abf838c844ff/html5/thumbnails/1.jpg)
Peter Fox
Xinformatics 4400/6400
Week 10, April 9, 2013
Information management, workflow and discovery
/check-in for project definitions
Review of reading
• Information Integration
  – Social issues in information discovery and sharing
  – Information integration in geo-informatics
  – http://cseweb.ucsd.edu/~goguen/projs/data.html
  – http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1839387/
• Information Life Cycle
  – MSDN Information Life Cycle
  – Information Life Cycle definition and context
  – http://www.computerworld.com/s/article/79885/The_new_buzzwords_Information_lifecycle_management
  – http://www.databasejournal.com/sqletc/article.php/3340301/Database-Archiving-A-Critical-Component-of-Information-Lifecycle-Management.htm
  – http://en.wikipedia.org/wiki/Information_Lifecycle_Management
  – http://msdn.microsoft.com/en-us/library/bb288451.aspx
• Information Visualization
  – http://mastersofmedia.hum.uva.nl/2011/04/18/the-simple-ways-of-information-visualization/comment-page-1/
  – http://www.siggraph.org/education/materials/HyperVis/domik/folien.html
  – http://www.visual-literacy.org/periodic_table/periodic_table.html
• Information model development and visualization
  – http://www.acm.org/crossroads/xrds7-3/smeva.html
• Outside the current box
  – Peter Fox and James Hendler, 2011, Changing the Equation on Scientific Data Visualization, Science, Vol. 331 no. 6018 pp. 705-708, DOI: 10.1126/science.1197654 online at http://www.sciencemag.org/content/331/6018/705.full or see: http://escience.rpi.edu/publications/visualization/fox_hendler_science2011.html
Logical Collections
• The primary goal of a management system is to abstract the physical collection into logical collections. The resulting view is a uniform, homogeneous collection.
• Note the analogy with logical models and information integration: so EARLY ON
  – Identifying naming conventions and organization
  – Aligning cataloguing and naming to facilitate search, access, use (who uses?)
  – Provision of **contextual** information
Physical Handling
• Map between physical and logical.
• Where and who does it come from?
  – Is there a transfer into a physical form?
  – Is it backed-up, archived, cached? …
  – What formats?
  – Naming conventions – do they change?
• Note analogy to physical models
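The mapping between physical and logical collections can be sketched in a few lines of Python. This is only an illustration: the file paths, the naming convention (a name prefix followed by a numeric suffix) and the helper names `logical_name` and `build_logical_view` are assumptions made for the example, not part of the lecture.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical physical locations: different disks, years and caches.
PHYSICAL_FILES = [
    "/archive/disk1/2012/volcano_obs_001.csv",
    "/archive/disk2/2013/volcano_obs_002.csv",
    "/cache/tmp/ocean_color_2013.nc",
]


def logical_name(physical_path: str) -> str:
    """Derive a logical collection name from the file name.

    Assumed naming convention: <collection>_<numeric suffix>.<ext>
    """
    stem = PurePosixPath(physical_path).stem  # e.g. "volcano_obs_001"
    parts = stem.split("_")
    # Drop trailing numeric tokens (sequence numbers, years).
    while parts and parts[-1].isdigit():
        parts.pop()
    return "_".join(parts)


def build_logical_view(paths):
    """Group physical files under their logical collection."""
    view = defaultdict(list)
    for path in paths:
        view[logical_name(path)].append(path)
    return dict(view)


LOGICAL_VIEW = build_logical_view(PHYSICAL_FILES)
for collection, members in sorted(LOGICAL_VIEW.items()):
    print(collection, "->", members)
```

A user of the logical view asks for "volcano_obs" without knowing which disk or directory holds each file. If the naming convention changes, only `logical_name` needs to change.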
Interoperability Support
Security
• Access authorization and change verification. This is the basis of trusting your information.
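Change verification is often done by recording a cryptographic digest when information is ingested and comparing it later. The slide does not name a technique, so the sketch below (SHA-256 via Python's standard `hashlib`, with made-up sample content and hypothetical helper names) is one possible approach rather than the course's method.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def has_changed(content: bytes, recorded_digest: str) -> bool:
    """True if the content no longer matches the digest recorded at ingest."""
    return fingerprint(content) != recorded_digest


# Hypothetical record, fingerprinted when it entered the collection.
original = b"station,lat,lon\nA1,42.73,-73.68\n"
recorded = fingerprint(original)

# A later copy in which one value was altered.
tampered = original.replace(b"42.73", b"24.73")

print("original changed?", has_changed(original, recorded))
print("tampered changed?", has_changed(tampered, recorded))
```

A digest only detects change. Access authorization, and deciding who may record or update the digest, are separate concerns.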
Ownership
• Who is responsible for quality and meaning?
Metadata
• Recall metadata are data about data.
• Metainformation?
Persistence
• Deployment of mechanisms to counteract technology obsolescence.
Discovery
• Ability to identify useful relations and information inside the collection
• More on this later in this class
Dissemination
• Mechanisms to make interested parties aware of changes and additions to the collections.
• Do you rely on information retrieval? The Web?
Summary of Information Management
• Creation of logical collections
• Physical handling
• Interoperability support
• Security support
• Ownership
• Metadata collection, management and access.
• Persistence
• Knowledge and information discovery
• Dissemination and publication
Note for your project writeup!
• Information management! Cover the 9 areas.
Information Workflow
• What is a workflow?
• Why would you use it?
• Key considerations for information, cf. data
• Some pointers to workflow systems
What is a workflow?
• General definition: “series of tasks performed to produce a final outcome” (taxes?)
• Information workflow – involves people but potentially want to
  – Automate jobs that a person traditionally performed manually
  – Process large volumes of information faster than one could do by hand
• NB difference from data workflows – it reaches out to encompass the user (e.g. ‘unrecorded actions’)
Background: Business Workflows
• Example: planning a trip
• Need to perform a series of tasks: book a flight, reserve a hotel room, arrange for a rental car, etc.
• Each task may depend on outcome of previous task
  – Days you reserve the hotel depend on days of the flight
  – If hotel has shuttle service, may not need to rent a car
• Prior information, experience, preferences…
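The trip example can be written down as a small dependency graph whose tasks pass their outcomes along. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+). The task functions, the dates and the shuttle flag are hypothetical values chosen for illustration.

```python
from graphlib import TopologicalSorter


def book_flight(state):
    # Hypothetical outcome: the travel days fixed by the flight.
    state["flight_days"] = ("2013-04-16", "2013-04-18")


def reserve_hotel(state):
    # Hotel days depend on the days of the flight.
    state["hotel_days"] = state["flight_days"]
    state["hotel_has_shuttle"] = True  # assumed for this example


def rent_car(state):
    # If the hotel has a shuttle service, no rental car is needed.
    state["rental_car"] = not state["hotel_has_shuttle"]


TASKS = {
    "book_flight": book_flight,
    "reserve_hotel": reserve_hotel,
    "rent_car": rent_car,
}

# Each task maps to the set of tasks whose outcome it depends on.
DEPENDS_ON = {
    "book_flight": set(),
    "reserve_hotel": {"book_flight"},
    "rent_car": {"reserve_hotel"},
}


def run_workflow():
    """Run every task after the tasks it depends on."""
    state = {}
    order = list(TopologicalSorter(DEPENDS_ON).static_order())
    for name in order:
        TASKS[name](state)
    return order, state


ORDER, STATE = run_workflow()
print(ORDER)
print(STATE)
```

Keeping the dependencies in `DEPENDS_ON`, separate from the task code, is what lets a workflow system schedule, visualize and reuse the steps.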
Tripit.com?
What about information workflows?
• Perform a set of transformations/operations on information source(s)
• Examples
  – Generating images from raw data
  – Identifying areas of interest from a large information source (e.g. word cloud)
  – Classifying a set of objects
  – Querying a web service for more information on a set of objects
  – Many others…
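As a minimal sketch of "a set of transformations on an information source", the steps below compute the term counts that a word cloud would be drawn from. The sample text, the stopword list and the function names are assumptions made for the example.

```python
import re
from collections import Counter

# Small, assumed stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "on"}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def drop_stopwords(tokens):
    """Remove common words that carry little meaning."""
    return [t for t in tokens if t not in STOPWORDS]


def count_terms(tokens):
    """Count how often each remaining term occurs."""
    return Counter(tokens)


def run_pipeline(source, steps):
    """Apply each transformation to the output of the previous one."""
    result = source
    for step in steps:
        result = step(result)
    return result


TEXT = (
    "Discovery of information depends on metadata. "
    "Metadata describes information, and information enables discovery."
)
TERM_COUNTS = run_pipeline(TEXT, [tokenize, drop_stopwords, count_terms])
print(TERM_COUNTS.most_common(3))
```

Each step is an ordinary function, so steps can be swapped, reordered or reused in another pipeline.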
More on Workflows
• Can process many information types:
  – Archives
  – Web pages
  – Streaming/real time
  – Images
  – Semiotic systems
• Robust workflows depend on formal (conceptual and logical) models of the flow of information among components
• May be simple and linear or very complex
Challenges
• Questions:
– What are some challenges for users in implementing workflows?
– What are some challenges to executing these workflows?
– What are limitations of writing a program?
• Mastering a programming language
• Visualizing workflow
• Sharing/exchanging workflow
• Formatting issues
• Locating datasets, services, or functions
Workflow Management Systems
Benefits of Workflows
• Documentation of aspects of analysis
• Visual communication of analytical steps
• Ease of testing/debugging
• Reproducibility
• Reuse of part or all of workflow in a different project
Additional Benefits
• Integration of and between multiple computing environments
• ‘Automated’ access to distributed resources via other architectural components, e.g. web services and Grid technologies
• System functionality to assist with information integration of heterogeneous components and sources
Why not just use a script?
• Script does not specify low-level task scheduling and communication
• May be platform-dependent
• Can’t be easily reused
• May not have sufficient documentation to be adapted for another purpose
Why can a GUI be useful?
• No need to learn a programming language
• Visual representation of what workflow does
• Allows you to monitor workflow execution
• Enables user interaction (though not necessarily collaboration)
• Facilitates sharing of workflows
Some workflow systems
• Kepler
• SCIRun
• Sciflo
• Triana
• Taverna
• Pegasus
• Some commercial tools:
  – Windows Workflow Foundation
  – Mac OS X Automator
• http://www.isi.edu/~gil/AAAI08TutorialSlides/5-Survey.pdf
• http://www.isi.edu/~gil/AAAI08TutorialSlides/
• See reading for this week
Discovery
• How does someone find your information?
• How would you provide discovery of
  – collections
  – files
  – ‘bits’
• How would you find ->
Discovery
• Search (Federated Search)
• Helped by
  – Folksonomies (user contributed)
  – Intelligent Agents
  – Search Engines
  – Taxonomies
• Find photos of Kim
  – Boy or girl?
Use cases
• Find a sound recording of a swallow.
• Excuse me?
Use cases
• Find a sound recording of an African Swallow
• Find a sound recording of a bird that sounds like an African Swallow
• Media types – how can you discover them?
Use cases
• Find the movie that Jeanne Tripplehorn first starred in / that was her most successful / in which she was lead actress?
• Has anyone gene sequenced a mouse?
• Find images of primary productivity in the North Atlantic
• Discovery can often involve information integration (or is it *almost always*?)
Three level ‘metadata’ solution for DATA
• Level 1: Data Registration at the Discovery Level, e.g. volcano location and activity
• Level 2: Data Registration at the Inventory Level, e.g. list of datasets, times, products
• Level 3: Data Registration at the Item Detail Level, e.g. access to individual quantities
• Diagram labels:
  – Data Discovery
  – Data Integration
  – Earth Sciences Virtual Database: a Data Warehouse where the schema heterogeneity problem is solved; schema based integration
  – Ontology based Data Integration using scientific workflows
A.K. Sinha, Virginia Tech, 2006
Three level ‘metadata’ solution?
• Level 1: Registration at the Discovery Level, e.g. find the upper level entry point to a source
• Level 2: Registration at the Inventory Level, e.g. list of datasets, using the logical organization
• Level 3: Registration at the Item Detail Level, i.e. annotation, e.g. tagging
• Diagram labels:
  – Information Discovery
  – Information Integration
  – Catalog/Index; schema based integration
  – Integration using mapping management
A.K. Sinha, Virginia Tech, 2006
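One way to picture the three registration levels is as a nested registry that a user drills into: find a source, list its datasets, then find tagged items. The structure, the source and dataset names and the tags below are all hypothetical. This is a sketch of the idea, not Sinha's design.

```python
# Level 1 (discovery): entry points to sources, keyed by source name.
# Level 2 (inventory): datasets listed under each source.
# Level 3 (item detail): items within a dataset, annotated with tags.
REGISTRY = {
    "volcano_archive": {
        "description": "Volcano location and activity",
        "datasets": {
            "eruptions_2012": {
                "items": {
                    "so2_flux": ["gas", "emissions"],
                    "plume_height": ["ash", "altitude"],
                },
            },
        },
    },
}


def discover(keyword):
    """Level 1: find sources whose description mentions the keyword."""
    return sorted(
        name
        for name, source in REGISTRY.items()
        if keyword.lower() in source["description"].lower()
    )


def inventory(source):
    """Level 2: list the datasets registered under a source."""
    return sorted(REGISTRY[source]["datasets"])


def items_tagged(source, dataset, tag):
    """Level 3: list the items in a dataset that carry a given tag."""
    items = REGISTRY[source]["datasets"][dataset]["items"]
    return sorted(name for name, tags in items.items() if tag in tags)


print(discover("volcano"))
print(inventory("volcano_archive"))
print(items_tagged("volcano_archive", "eruptions_2012", "ash"))
```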
Information discovery
• What makes discovery work?
  – Metadata
  – Logical organization
  – Attention to the fact that someone would want to discover it
  – It turns out that file types are a key enabler or inhibitor to discovery
  – Result ranking using *tuned* algorithm
• What does not work?
  – Result ranking algorithms that depend on unconventional information types (icon, index, symbol)
Federated search
• “is the simultaneous search of multiple online databases or web resources and is an emerging feature of automated, web-based library and information retrieval systems. It is also often referred to as a portal or a federated search engine.” (Wikipedia)
• Libraries have been doing this for a long time (Z39.50, ISO 23950)
• Key is consistent search metadata fields (keywords)
• E.g. Geospatial One Stop http://www.geodata.gov
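The point about consistent metadata fields can be illustrated with a toy federated search. The two in-memory catalogs, their records and the shared `keywords` field are assumptions made for the example. A real federation would query remote services, for instance over Z39.50, instead of Python lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical catalogs that agree on one metadata field: "keywords".
CATALOGS = {
    "library_a": [
        {"title": "Eruption histories", "keywords": ["volcano", "history"]},
        {"title": "Ocean color atlas", "keywords": ["ocean", "productivity"]},
    ],
    "library_b": [
        {"title": "Volcanic ash advisories", "keywords": ["volcano", "ash"]},
    ],
}


def search_one(catalog_name, keyword):
    """Search a single catalog and label each hit with its source."""
    return [
        dict(record, source=catalog_name)
        for record in CATALOGS[catalog_name]
        if keyword in record["keywords"]
    ]


def federated_search(keyword):
    """Query every catalog concurrently and merge the hits."""
    names = sorted(CATALOGS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map returns results in the same order as `names`.
        per_source = pool.map(lambda name: search_one(name, keyword), names)
        return [record for hits in per_source for record in hits]


RESULTS = federated_search("volcano")
for hit in RESULTS:
    print(hit["source"], "-", hit["title"])
```

The merge is this simple only because both catalogs use the same field name and vocabulary. Without that, a mapping step would be needed before the hits could be combined.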
Smart search
• Semantically aware search, e.g. http://noesis.itsc.uah.edu , http://eie.cos.gmu.edu (Water -> Semantic Search)
• Faceted search, e.g. mspace (http://mspace.fm), exhibit (MIT), S2S (RPI; http://aquarius.tw.rpi.edu/s2s)
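Faceted search rests on two operations: counting the values available for each facet, and narrowing the result set as the user selects values. The sketch below shows both on a handful of invented records. The facet names (`format`, `theme`) are assumptions, and none of the systems named above is claimed to work this way internally.

```python
from collections import Counter

# Hypothetical catalog records with two facets: "format" and "theme".
RECORDS = [
    {"title": "Sea surface temperature", "format": "netCDF", "theme": "ocean"},
    {"title": "Primary productivity", "format": "image", "theme": "ocean"},
    {"title": "Eruption catalog", "format": "CSV", "theme": "volcano"},
]


def facet_counts(records, facet):
    """Count how many records carry each value of a facet."""
    return Counter(record[facet] for record in records)


def apply_facets(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selected.items())
    ]


print(facet_counts(RECORDS, "theme"))
OCEAN_IMAGES = apply_facets(RECORDS, theme="ocean", format="image")
print([record["title"] for record in OCEAN_IMAGES])
```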
NOESIS
Faceted search
logd.tw.rpi.edu
Summary - discovery
• Useful to write a few discovery use cases to drive how your design is developed
• Evolution of your role in facilitating discovery and what/ how others implement access to your information
Reading for this week
• Is retrospective
Check in for Project Assignment
• Analysis of existing information system content and architecture, critique, redesign and prototype redeployment
• Or a new use case, development, etc.
What is next
• April 16 – Information Audit
• April 23 –
• April 30 –
• May 6 – final project presentations