XML, CM, and KM KMWorld 2001
Thursday November 1, 2001
Darlene Fichter, Data Library Coordinator
University of Saskatchewan Libraries
Frank Cervone, Assistant University Librarian for Information Technology
Northwestern University
Why XML?
A critical component of KM involves knowledge representation and codification
To support knowledge activities, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning
What is XML?
Structured data interchange
– A common syntax for expressing structure in data
Designed to account for “unstructured” data
– Documents
Inherently conveys meaning/structure
Content and process separate from structure
Delivered via standard text files
XML Example – Rich Site Summary
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91" encoding="ISO_8859-1">
<channel>
  <title>book news</title>
  <link>http://www.test.com</link>
  <description>Book news - headlines from around the web, refreshed every 15 minutes</description>
  <language>en-us</language>
</channel>
Headlines
<item>
  <title>'Author Unknown' by Don Foster</title>
  <link>http://www.salon.com/books/feature/2001/10/30/pbacks/index.html</link>
  <description>Salon Nov 1 2001 6:51AM</description>
</item>
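The two fragments above can be read by any XML parser once they are combined into one well-formed document. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and assuming the `<item>` sits inside `<channel>` (the description text is shortened here, and `read_headlines` is a name made up for illustration):

```python
import xml.etree.ElementTree as ET

# The slide fragments combined into a single well-formed document (an assumption:
# the slides show the channel and the item separately).
RSS = """<rss version="0.91">
  <channel>
    <title>book news</title>
    <link>http://www.test.com</link>
    <description>Book news - headlines from around the web</description>
    <language>en-us</language>
    <item>
      <title>'Author Unknown' by Don Foster</title>
      <link>http://www.salon.com/books/feature/2001/10/30/pbacks/index.html</link>
      <description>Salon Nov 1 2001 6:51AM</description>
    </item>
  </channel>
</rss>"""


def read_headlines(rss_text):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", "").strip(), item.findtext("link", "").strip())
        for item in root.iter("item")
    ]


headlines = read_headlines(RSS)
for title, link in headlines:
    print(title, "->", link)
```

The consumer only names the elements it cares about; everything else in the feed is ignored.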
XML is open
Open standards, NOT proprietary
Platform neutral, license-free, and widely supported
Influenced by a number of standards organizations
Agreement on a number of core standards in the XML family
XML strengths
Flexible
– Makes collaborative information exchange simpler
Less expensive implementation
– Lightweight software modules
Separates content from processing
Easily internationalized
– Full Unicode support
Enables complex information retrieval
XML is flexible
Very flexible – you can define your own languages, vocabulary, and metadata
Easily extended by adding additional elements (fields) and attributes
Data description can be sent with the data
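One way to picture "easily extended": a consumer written against the original vocabulary keeps working after new elements and attributes appear. A small sketch, assuming Python's `xml.etree.ElementTree`; the `<book>` vocabulary, the `format` attribute, and the `<reviewer>` element are all invented for illustration:

```python
import xml.etree.ElementTree as ET

# Original record in a made-up vocabulary.
ORIGINAL = "<book><title>Author Unknown</title></book>"

# Extended record: one new attribute and one new element (both hypothetical).
EXTENDED = (
    '<book format="paperback">'
    "<title>Author Unknown</title>"
    "<reviewer>Salon</reviewer>"
    "</book>"
)


def get_title(xml_text):
    """A consumer that only knows about <title>."""
    return ET.fromstring(xml_text).findtext("title")


title_original = get_title(ORIGINAL)
title_extended = get_title(EXTENDED)
print(title_original == title_extended)
```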
XML enables less expensive implementation
Implementation tools are modularized
– XML browser can be implemented in less than 200K
– HTML browser > 4 MB to 80 MB
Standard syntax makes processing easier and therefore less expensive
– Simple implementation of “validity checking”
Lower cost
– Allows small and medium-sized organizations to participate in data exchange initiatives
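The simplest form of checking falls out of the standard syntax: a parser either accepts a document as well-formed or rejects it. A minimal sketch, assuming Python's `xml.etree.ElementTree`; note that this module checks well-formedness only and does not validate against a DTD, and `is_well_formed` is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = is_well_formed("<item><title>ok</title></item>")
bad = is_well_formed("<item><title>missing close</item>")  # <title> never closed
print(good, bad)
```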
XML separates content from process
Doesn’t impose a particular manner for processing
Doesn’t impose constraints on how to handle information
Same data can be used in a web page or on a handheld device through simple “transformations”
– “loosely coupled”
– “future proof”
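To illustrate the idea of one source and several presentations, here is a sketch that renders the same item as an HTML link and as a short line for a small screen. It uses plain Python functions rather than XSLT, and `to_html`, `to_plain_text`, and the 20-character width are assumptions chosen for the example:

```python
import xml.etree.ElementTree as ET
from html import escape

ITEM = """<item>
  <title>'Author Unknown' by Don Foster</title>
  <link>http://www.salon.com/books/feature/2001/10/30/pbacks/index.html</link>
</item>"""


def to_html(item):
    """Render the item as an HTML anchor for a web page."""
    title = escape(item.findtext("title"))
    link = escape(item.findtext("link"), quote=True)
    return f'<a href="{link}">{title}</a>'


def to_plain_text(item, width=20):
    """Render the item as a truncated title for a small display."""
    title = item.findtext("title")
    return title if len(title) <= width else title[: width - 3] + "..."


item = ET.fromstring(ITEM)
html_view = to_html(item)
text_view = to_plain_text(item)
print(html_view)
print(text_view)
```

The XML itself says nothing about either rendering, which is what keeps the data and its consumers loosely coupled.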
XML is easily internationalized
Unicode standard supports a wide range of languages and scripts
– Latin (Western and Eastern European, non-western languages)
– Greek
– Cyrillic
– Hebrew
– Arabic
– Armenian
– Georgian
– Thai
– Lao
– Hangul (Korean)
– Ideographs (Chinese, Japanese, Korean)
– Hiragana and Katakana (Japanese)
– Cherokee
– Khmer
– Ethiopian
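A short sketch of what this means in practice: text in several scripts travels in one UTF-8 encoded document and comes back intact. It assumes Python's `xml.etree.ElementTree`; the `<greetings>` vocabulary is invented for the example:

```python
import xml.etree.ElementTree as ET

GREETINGS = """<greetings>
  <greeting lang="el">Καλημέρα</greeting>
  <greeting lang="ru">Привет</greeting>
  <greeting lang="ja">こんにちは</greeting>
  <greeting lang="ko">안녕하세요</greeting>
</greetings>"""

# Encode to UTF-8 bytes, as the document would be stored in a text file.
encoded = ('<?xml version="1.0" encoding="UTF-8"?>\n' + GREETINGS).encode("utf-8")

# Parse the bytes back and index the text by language code.
root = ET.fromstring(encoded)
by_lang = {g.get("lang"): g.text for g in root.findall("greeting")}
for lang, text in by_lang.items():
    print(lang, text)
```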
XML enables complex information retrieval
Supports encoding of metadata through both standardized and constructed tag sets
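As a sketch of metadata-driven retrieval, records tagged with subject and year can be selected by those fields instead of by full-text matching. This assumes Python's `xml.etree.ElementTree` and its limited XPath support; the `<catalog>` tag set and its records are made up for illustration:

```python
import xml.etree.ElementTree as ET

# A constructed tag set: each record carries subject and year metadata.
CATALOG = """<catalog>
  <record subject="knowledge management" year="2001">
    <title>Topic Maps Overview</title>
  </record>
  <record subject="content management" year="2000">
    <title>Syndication Basics</title>
  </record>
  <record subject="knowledge management" year="2000">
    <title>Codifying Knowledge</title>
  </record>
</catalog>"""

root = ET.fromstring(CATALOG)

# Select by one metadata field.
km_titles = [
    r.findtext("title")
    for r in root.findall("record[@subject='knowledge management']")
]

# Narrow the same query with a second field.
km_2001 = [
    r.findtext("title")
    for r in root.findall("record[@subject='knowledge management'][@year='2001']")
]
print(km_titles, km_2001)
```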
XML downsides
Space, processor, and bandwidth hog
Just a document syntax, not a full-fledged programming language
Doesn’t work for binary data
Is a regression from centralized and efficient databanks
Specifications are not complete
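The binary data point can be made concrete. A common workaround, not something the slides propose, is to encode the bytes as base64 text, which costs roughly a third more space. A sketch assuming Python's standard `base64` module, with a hypothetical `<attachment>` element:

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes(range(256)) * 3  # 768 bytes of arbitrary binary data

# Wrap the binary payload as base64 text inside a made-up element.
attachment = ET.Element("attachment", {"encoding": "base64"})
attachment.text = base64.b64encode(payload).decode("ascii")
xml_text = ET.tostring(attachment, encoding="unicode")

# Round-trip: parse the XML and decode the text back to bytes.
decoded = base64.b64decode(ET.fromstring(xml_text).text)

# Ratio of encoded characters to original bytes.
overhead = len(attachment.text) / len(payload)
print(decoded == payload, round(overhead, 2))
```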
XML and content management
CM system repositories use XML for tagging and storing information
CM systems use XML as a standard protocol for integration with other applications
XML is invisible to the information creator
– XML markup created as the information is captured
Emerging Standards For KM
XTM
OPML
RFML
FLBC
Industry specific standards:
•Legal
•Publishing
•Scientific research
XTM: Topic Maps
Topic maps are a new ISO standard for describing knowledge structures and associating them with information resources
Used to organize information into knowledge bases
“GPS” for information
http://www.topicmaps.org/xtm/index.html
“A book without an index is like a country without a map”
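The core idea, topics linked to each other and to information resources, can be sketched as a small data structure. This is plain Python, not XTM syntax, and the `Topic` and `TopicMap` classes, the topic ids, and the relation names are all assumptions made for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)  # resources about the topic


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    associations: list = field(default_factory=list)  # (topic_id, relation, topic_id)

    def add_topic(self, topic_id, name, *occurrences):
        self.topics[topic_id] = Topic(name, list(occurrences))

    def associate(self, source, relation, target):
        self.associations.append((source, relation, target))

    def related(self, topic_id):
        """Names of topics linked to topic_id, in either direction."""
        names = []
        for source, _, target in self.associations:
            if source == topic_id:
                names.append(self.topics[target].name)
            elif target == topic_id:
                names.append(self.topics[source].name)
        return names


tm = TopicMap()
tm.add_topic("xtm", "XML Topic Maps", "http://www.topicmaps.org/xtm/index.html")
tm.add_topic("km", "Knowledge management")
tm.add_topic("xml", "XML")
tm.associate("xtm", "used for", "km")
tm.associate("xtm", "expressed in", "xml")
related_to_xtm = tm.related("xtm")
print(related_to_xtm)
```

The map works like a book index: it sits apart from the resources and points into them.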
OPML
Outline Processor Markup Language
– Outline-structured information
Used for data that is easily browsed and edited
– Specifications
– Legal briefs
– Product plans
– Presentations
– Screenplays
– Directories
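An OPML document nests `<outline>` elements inside a `<body>`, with each node's label in a `text` attribute. A sketch of walking such an outline, assuming Python's `xml.etree.ElementTree`; the product-plan content and the `flatten` helper are invented for the example:

```python
import xml.etree.ElementTree as ET

OPML = """<opml version="1.0">
  <head><title>Product plan</title></head>
  <body>
    <outline text="Goals">
      <outline text="Ship beta"/>
      <outline text="Gather feedback"/>
    </outline>
    <outline text="Schedule"/>
  </body>
</opml>"""


def flatten(node, depth=0):
    """Return the outline as indented lines, depth-first."""
    lines = []
    for child in node.findall("outline"):
        lines.append("  " * depth + child.get("text", ""))
        lines.extend(flatten(child, depth + 1))
    return lines


outline_lines = flatten(ET.fromstring(OPML).find("body"))
print("\n".join(outline_lines))
```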
RFML
Relational-Functional Markup Language
Used to define relationships and functions among data elements
– Tables within relational databases
– Relational views
FLBC
Formal Language for Business Communication
– Automated communication
– Conversation management
– Dialog management
– Based on speech act theory
Formally defined message types
Broad range of message types
Defined in terms of intentions
Clear delineation between message type and content
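The separation of message type from content can be sketched as a message whose intention is a formal field and whose content is free text. This is an illustration in Python, not FLBC syntax; the `Force` values, the `Message` fields, and the routing actions are all assumptions:

```python
from dataclasses import dataclass
from enum import Enum


class Force(Enum):
    """Message types defined by the sender's intention (hypothetical set)."""
    ASSERT = "assert"
    REQUEST = "request"
    PROMISE = "promise"


@dataclass(frozen=True)
class Message:
    speaker: str
    hearer: str
    force: Force   # the formally defined message type
    content: str   # what the message is about, kept separate from the type


def route(message):
    """Choose an action from the message type alone, without reading the content."""
    actions = {
        Force.ASSERT: "record as a stated fact",
        Force.REQUEST: "open a task for the hearer",
        Force.PROMISE: "track a commitment by the speaker",
    }
    return actions[message.force]


msg = Message("buyer", "seller", Force.REQUEST, "deliver 100 units by Friday")
action = route(msg)
print(action)
```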
XML in Use
Portals
Content management & syndication
Content management: industry sector
Integration
Analytical/decision making
Search and retrieval
Visualization