xmldb m2 stanford
Post on 25-Feb-2018
-
7/25/2019 XMLDB M2 Stanford
Module 2
XML Basics (XML, Namespaces, Usage scenarios, DTDs)
-
History: SGML vs. HTML vs. XML
SGML (1960)
HTML (1990)
XML (1996)
XHTML (2000)
http://www.w3.org/TR/2006/REC-xml-20060816/
-
Why XML?
HTML is to be interpreted by browsers
  Shown on the screen to a human
Desire to separate the content from presentation
  Presentation has to please the human eye
  Content can be interpreted by machines; for machines, presentation is a handicap
Semantic markup of the data
-
Information about a book in HTML
Politics of experience by Ronald Laing, published in 1967
Item number: 320070381076
cellspacing="0" width="100%">
-
The same information in XML
Elements
Information is (1) decoupled from presentation, then (2) chopped into smaller pieces, and then (3) marked with semantic meaning
It can be processed by machines
Like HTML, only syntax, not a logical abstract data model
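The slide's own XML did not survive extraction. As a minimal Python sketch of the point above, assuming hypothetical element names (book, title, author, year, item-number), a program can pick out individual fields by name once the data carries semantic markup:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names are illustrative assumptions,
# not the ones used on the original slide.
BOOK_XML = """
<book>
  <title>Politics of experience</title>
  <author>Ronald Laing</author>
  <year>1967</year>
  <item-number>320070381076</item-number>
</book>
"""

def describe_book(xml_text):
    """Return the semantically tagged fields of a book as a dict."""
    book = ET.fromstring(xml_text)
    return {child.tag: child.text for child in book}

fields = describe_book(BOOK_XML)
print(fields["author"], fields["year"])
```

Nothing here depends on how the book would be rendered on screen, which is the separation of content from presentation the slide argues for.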
-
XML key concepts
Documents
Elements
Attributes
Namespace declarations
Text
Comments
Processing Instructions
All inherited from SGML, then HTML
-
The key concepts of XML
Documents, Elements, Attributes, Text
Nested structure
Conceptual tree
Order is important
Only characters, not integers, etc.
-
Elements
Enclosed in tags
  Begin tag and end tag (example markup not preserved)
  Element without content: the empty-element tag is a shorthand for a begin tag immediately followed by its end tag
Elements can be nested
  Wilde Wutz
Subelements can implement multisets
  ... ...
Order is important!
Documents must be well-formed
  Tags that are not properly nested or not closed are forbidden (the two forbidden examples are not preserved)
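A small sketch of the well-formedness rule, using Python's standard ElementTree parser. The element names a and b are placeholders, not the slide's lost examples:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Properly nested: tags are closed in reverse order of opening.
print(is_well_formed("<a><b>text</b></a>"))
# Overlapping tags: <b> is still open when </a> arrives.
print(is_well_formed("<a><b>text</a></b>"))
# A begin tag that is never closed.
print(is_well_formed("<a><b>text</b>"))
# Empty-element shorthand and its long form are both well-formed.
print(is_well_formed("<a/>"), is_well_formed("<a></a>"))
```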
-
Attributes
Attributes are associated with elements
  ... ...
Elements can have only attributes
Attribute names must be unique! (No multisets)
  Repeating an attribute name on one element is illegal!
What is the difference between a nested element and an attribute? Are attributes useful?
Modeling decision: should name be an attribute or a subelement of a person? What about age?
-
Text and Mixed Content
Text appears in element content
  The politics of experience
Can be mixed with other subelements
  The politics of experience
Mixed content
  Very useful for document-oriented data
  The need does not arise in data processing, which deals only with entities and relationships
People speak in sentences, not entities and relationships. XML preserves the structure of natural language while adding semantic markup that can be interpreted by machines.
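A minimal sketch of how mixed content looks to a program, assuming a hypothetical review element with a title subelement. In Python's ElementTree, the text before a subelement is stored on the parent and the text after it is stored as the subelement's tail:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for a sentence with one tagged phrase.
review = ET.fromstring(
    "<review>The book entitled <title>The politics of experience</title>"
    " is really excellent!</review>")

title = review.find("title")
print(repr(review.text))  # text before the first subelement
print(repr(title.text))   # text inside the subelement
print(repr(title.tail))   # text after the subelement

# itertext() walks the tree in document order and yields every text chunk,
# so the original sentence can be recovered unchanged.
sentence = "".join(review.itertext())
print(sentence)
```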
-
Continuous spectrum between natural language, semi-structured data, and structured data
1. Dana said that the book entitled The politics of experience is really excellent!
2. The book entitled The politics of experience is really excellent!
3. The book entitled The politics of experience is really excellent!
-
CDATA sections
Sometimes we would like to preserve the original characters, and not interpret them as markup
CDATA sections
  Not parsed as XML
  Hello,world!
  Hello, world!]]>
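The slide's markup is lost apart from the closing `]]>`. The sketch below assumes a hypothetical msg element and shows the difference in Python: the same characters are parsed as a subelement without CDATA and kept as literal text inside a CDATA section.

```python
import xml.etree.ElementTree as ET

# Without CDATA, <b> is parsed as a subelement of <msg>.
parsed = ET.fromstring("<msg><b>Hello, world!</b></msg>")

# Inside a CDATA section the same characters are kept as plain text.
literal = ET.fromstring("<msg><![CDATA[<b>Hello, world!</b>]]></msg>")

print(len(parsed), parsed.text)    # one child element, no direct text
print(len(literal), literal.text)  # no children, text contains the tag characters
```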
-
Comments, PIs, Prolog
Comment: syntax as in HTML
Processing Instructions
  Contain no data; interpretation is left to the processor
  Syntax: <?Pause 10secs?>
  Pause is the target; 10secs is the content
  xml is a reserved target, used for the prolog
Prolog
  Standalone declares whether the document depends on external markup declarations (such as an external DTD)
  Encoding is usually Unicode.
-
Whitespace declaration
Whitespace = continuous sequence of Space, Tab and Return characters
Special attribute xml:space to control use
Human-readable XML (with whitespace)
  The politics of experience
  Ronald Laing
(Efficient) machine-readable XML (no whitespace)
  The politics of experience Ronald Laing
Performance improvement: ca. factor 2.
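A rough Python sketch of the two forms, with assumed element names (book, title, author). It removes whitespace-only text nodes and compares the serialized sizes. It does not measure the factor-2 speedup quoted above, and the stripping is only safe for data-oriented documents without mixed content.

```python
import xml.etree.ElementTree as ET

# Element names are illustrative; the slide's markup did not survive.
PRETTY = """<book>
  <title>The politics of experience</title>
  <author>Ronald Laing</author>
</book>"""

def strip_ignorable_whitespace(root):
    """Drop text nodes that contain only spaces, tabs and line breaks."""
    for elem in root.iter():
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    return root

root = strip_ignorable_whitespace(ET.fromstring(PRETTY))
compact = ET.tostring(root, encoding="unicode")
print(len(PRETTY), len(compact))
```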
-
Language declaration
The quick brown fox jumps over the lazy dog.
What colour is it?
What color is it?
-
Uniform Resource Identifiers on the Web
URLs, URIs, IRIs
URL (Uniform Resource Locator): dereferenceable identifier on the Web
  The target of a URL pointer is an HTML file (virtual or materialized)
URI (Uniform Resource Identifier): general purpose key to resources on the Web
  Uniquely identifies a resource
  Target is not an HTML file, can be anything (schema, table, file, entity, object, tuple, person, physical item, etc.)
  Lifetime and scope of this key is user dependent
IRI (Internationalized Resource Identifier)
  Allows non-Latin characters (Chinese, Arabic, Japanese, etc.)
URLs, URIs, IRIs
  All strings
  Very LONG strings
-
Namespaces
Integration of data from diverse data sources
Integration of different XML vocabularies (aka namespaces)
Each vocabulary has a unique key, identified by a URI/IRI
The same local name from different vocabularies can have
  Different meaning
  Different structure associated with it
Qualified Names (QName) attach a name to its vocabulary
  For all nodes in an XML document that have names (attributes, elements, PIs)
  QName ::= triple ( URI [ prefix: ] localname )
  The binding (prefix, URI) is introduced in an element's start tag
  Later only the prefix is used, not the long URIs
  Prefix is optional: default namespaces
  Prefix and localname are separated by :
http://w3.org/TR/1999/REC-xml-names
-
Namespaces (cont)
Namespace definitions look like attributes
  Identified by xmlns:prefix or xmlns (default)
  Bind the prefix to the URI
Scope is the entire element where the namespace is declared
  Includes the element itself, its attributes and its subtrees
Example
-
Default namespaces
Default namespaces, no prefix
Apply only to element names (the declaring element and its subelements), not to attributes
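A sketch of prefix binding and default namespaces as seen by Python's ElementTree, which reports each name as {URI}localname. The URIs and element names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one default namespace and one prefixed namespace.
doc = ET.fromstring(
    '<order xmlns="http://example.org/orders"'
    '       xmlns:p="http://example.org/people" id="42">'
    '  <p:customer p:rank="gold">Dana</p:customer>'
    '  <item>book</item>'
    '</order>')

# Each name is expanded to {URI}localname; the prefix itself is gone.
for elem in doc.iter():
    print(elem.tag, elem.attrib)

customer = doc.find("{http://example.org/people}customer")
item = doc.find("{http://example.org/orders}item")
```

The unprefixed id attribute stays outside any namespace even though its element is in the default namespace. The prefixed p:rank attribute is qualified.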
-
Example: Namespaces
DQ1 defines dish for china
  Diameter, Volume, Decor, ...
DQ2 defines dish for satellites
  Diameter, Frequency
How many dishes are there?
Better ask for:
  How many dishes are there? or
  How many dishes are there?
  (in the two questions above, dishes was qualified by a namespace; the markup is not preserved)
-
Example: Namespaces
20
5
Meissner
200
20-2000MHz
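The markup of this example is lost. The Python sketch below rebuilds a plausible version from the values above and the DQ1/DQ2 vocabularies of the previous slide. The namespace URIs, the wrapper element and the lowercase element names are assumptions.

```python
import xml.etree.ElementTree as ET

# DQ1 and DQ2 come from the slides; the URIs bound to them are made up.
NS = {"DQ1": "http://example.org/china",
      "DQ2": "http://example.org/satellite"}

doc = ET.fromstring("""
<dishes xmlns:DQ1="http://example.org/china"
        xmlns:DQ2="http://example.org/satellite">
  <DQ1:dish>
    <DQ1:diameter>20</DQ1:diameter>
    <DQ1:volume>5</DQ1:volume>
    <DQ1:decor>Meissner</DQ1:decor>
  </DQ1:dish>
  <DQ2:dish>
    <DQ2:diameter>200</DQ2:diameter>
    <DQ2:frequency>20-2000MHz</DQ2:frequency>
  </DQ2:dish>
</dishes>
""")

china = doc.findall("DQ1:dish", NS)      # dishes in the china vocabulary
satellite = doc.findall("DQ2:dish", NS)  # dishes in the satellite vocabulary
unqualified = doc.findall("dish")        # no namespace: matches neither
print(len(china), len(satellite), len(unqualified))
```

Asking for dish without a namespace matches nothing, so the question has to name the vocabulary it means.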
-
Mixing Several Namespaces
20
5
Meissner
This is an unqualified element name
-
Example XML data
XHTML (browser/presentation)
RSS (blogs)
UBL (Universal Business Language)
Health Level 7 (medical data)
XBRL (financial data)
Digital photography metadata (XMP)
XMI (metadata)
XQueryX (programs)
XForms (forms)
SOAP (message envelopes)
Microsoft Office: PowerPoint in XML (documents)
-
XHTML
-
RSS, blogs
XML.com
http://xml.com/pub
XML.com features a rich mix of information and services for the XML community.
XML.com
http://www.xml.com
http://xml.com/universal/images/xml_tiny.gif
-
UBL (Universal Business Language)
Vocabulary definitions for:
ApplicationResponse, AttachedDocument, BillOfLading, Catalogue, CatalogueDeletion, CatalogueItemSpecificationUpdate, CataloguePricingUpdate, CatalogueRequest, CertificateOfOrigin, CreditNote, DebitNote, DespatchAdvice, ForwardingInstructions, FreightInvoice, Invoice, Order, OrderCancellation, OrderChange, OrderResponse, OrderResponseSimple, PackingList, Quotation, ReceiptAdvice, Reminder, RemittanceAdvice, RequestForQuotation, SelfBilledCreditNote, SelfBilledInvoice, Statement, TransportationStatus, Waybill
-
Health Level 7 (HL7)
Medical information that is being exchanged between hospitals, patients, doctors, pharmacies and insurance companies
http://en.wikipedia.org/wiki/HL7
-
XBRL (Financial information)
Goal: facilitate the exchange of business and financial performance information between companies, governments, insurance companies, banks, etc.
Mandated by law in many countries
http://en.wikipedia.org/wiki/XBRL
-
Extensible Metadata Platform (XMP)
Used in PDF, photography and photo editing applications.
Particular schemas for basic properties useful for recording the history of a resource as it passes through multiple processing steps, from being photographed, scanned, or authored as text, through photo editing steps (such as cropping or color adjustment), to assembly into a final image.
XMP allows each software program or device along the way to add its own information to a digital resource, which can then be retained in the final digital file.
http://en.wikipedia.org/wiki/Extensible_Metadata_Platform
-
Microsoft Office in XML
Office 2003 was able to import/export all documents into XML
Office 2007 models the documents NATIVELY in XML
Examples of vocabularies and schemas:
  WordprocessingML (the XML file format for Word 2003), SpreadsheetML (Excel 2003), FormTemplate XML schemas (InfoPath 2003) and DataDiagramingML (Visio 2003)
-
Forms on the Web in XML
XML Forms (XForms)
http://www.w3.org/TR/xforms/
-
Programs and queries in XML
XQuery, the XML query language, has an XML representation
Programs and queries are also DATA
Blurring the distinction between data, metadata, code
Text surviving from the XML-encoded query example: distinct, document, http://www.bn.com, descendant-or-self, author
-
SOAP and Web Services
Web Services are the favorite way of exchanging information between applications
XML exchange over HTTP, with a specific protocol (SOAP)
uuid:093a2da1-q345-739r-ba5d-pqff98fe8j7d
2001-11-29T13:20:00.000-05:00
Åke Jógvan Øyvind
-
The need for XML schemas
Unlike any other data format, XML is totally flexible: elements can be nested in arbitrary ways
We can start by writing the XML data, with no need for a priori design of a schema
  Contrast with relational databases or Java classes, where the schema comes first
However, schemas are necessary:
  Facilitate the writing of applications that process data
  Constrain the data that is correct for a certain application
  Establish a priori agreements between parties with respect to the data being exchanged
Schema: a model of the data
  Structural definitions
  Type definitions
  Defaults
-
History and role of XML Schema Languages
Several standard schema languages
  DTDs, XML Schema, RelaxNG
Schema languages were designed after XML itself, and orthogonally to it
Schemas and data are completely decoupled in XML
  Data can exist with or without schemas
  Or with multiple schemas
  Schema evolution rarely forces the data to evolve
  Schemas can be designed before the data, or extracted from the data (DataGuide, Stanford)
Makes XML the right choice for manipulating semi-structured data, rapidly evolving data, or highly customizable data
-
DTDs
Inherited from SGML
Part of the original XML 1.0 specification
Describe the grammar of the XML file
  Element declarations: how elements are allowed to nest within each other, by rules and constraints
  Attribute lists: describe which attributes are allowed on which element
  Some constraints on the value of elements and attributes
  Which element is the root of the XML file
Checking the structural constraints: DTD validation (valid vs. invalid documents)
DTDs were very useful for a while but are now largely superseded because of several major limitations
-
Declaring the structure of elements
Grammar that describes the structure of the element
  Subelements, identified by Name or #PCDATA
Combinators:
  + for at least 1
  * for 0 or more
  ? for 0 or 1
  , for concatenation
  | for choice
#PCDATA: only textual content allowed
EMPTY: the element must be empty
ANY: allows any content
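The combinators above are regular-expression operators over child element names. The Python sketch below makes that analogy concrete by turning a simple content model into a regex. It is an illustration only, not how a real validator works; the helper names and sample models are hypothetical, and #PCDATA, EMPTY and ANY are not handled.

```python
import re

def content_model_to_regex(model):
    """Turn a simple DTD content model into a regex over child names.

    Each child name becomes the group '(?:name;)' so that names cannot
    run together. ',' (concatenation) is dropped, while '|', '+', '*',
    '?' and parentheses keep their regex meaning.
    """
    pattern = re.sub(r"[A-Za-z_][\w.-]*",
                     lambda m: "(?:" + re.escape(m.group(0)) + ";)",
                     model)
    pattern = pattern.replace(",", "").replace(" ", "")
    return re.compile(pattern)

def children_match(model, child_names):
    """Check a sequence of child element names against a content model."""
    encoded = "".join(name + ";" for name in child_names)
    return content_model_to_regex(model).fullmatch(encoded) is not None

print(children_match("(title, author+, year?)", ["title", "author", "author"]))
print(children_match("(title, author+, year?)", ["author", "title"]))
print(children_match("(ingredient | step)*", ["step", "ingredient", "step"]))
```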
-
Example DTD for recipes
-
Defining the attribute lists
Structure (the opening of the declaration is not preserved):
  name CDATA #REQUIRED
  amount CDATA #IMPLIED
  unit CDATA #FIXED cup
>
CDATA means normal content
#REQUIRED and #IMPLIED state whether the attribute is mandatory or optional
Default value possible
-
Attributes (cont.)
#REQUIRED
  Document must specify a value for the attribute
#IMPLIED
  Attribute is optional, there is no default
value
  Default value, if no other value specified
#FIXED value
  Default value, if no other value specified
  If a value is specified, it must be the fixed value
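A hedged sketch of attribute defaulting, using Python's pyexpat bindings with a declaration in the internal DTD subset. The ingredient element and its attribute list are assumptions modeled on the previous slide. Expat is a non-validating parser, so it supplies declared defaults but does not enforce #REQUIRED.

```python
import xml.parsers.expat

# Hypothetical declaration: 'unit' has the default value "cup".
DOC = """<!DOCTYPE ingredient [
  <!ATTLIST ingredient
      name   CDATA #REQUIRED
      amount CDATA #IMPLIED
      unit   CDATA "cup">
]>
<ingredient name="flour"/>"""

def attributes_reported(xml_text):
    """Collect the attributes expat reports for each element name."""
    seen = {}

    def on_start(name, attrs):
        seen[name] = dict(attrs)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return seen

attrs = attributes_reported(DOC)["ingredient"]
print(attrs)
```

The document only writes name. The unit attribute comes from the declared default, and amount stays absent because #IMPLIED has no default.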
-
Major attribute types
CDATA: normal text content
ID
  Value is unique within the document
  Element has at most one attribute of this type
  No default values allowed
IDREF, IDREFS
  References to other elements within the document
  IDREFS: enumeration, whitespace as separator
-
ID and IDREF attributes
Surviving part of the attribute list declaration:
  price CDATA #IMPLIED
  index IDREFS
>
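A minimal Python sketch of what ID and IDREFS constraints mean, under stated assumptions: the catalog document is hypothetical, and the choice of id as the ID attribute and index as the IDREFS attribute is hard-coded, whereas a validating parser would read it from the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; b9 is deliberately a dangling reference.
doc = ET.fromstring(
    '<catalog>'
    '<book id="b1" price="10" index="b2 b3"/>'
    '<book id="b2" index="b1"/>'
    '<book id="b3" index="b9"/>'
    '</catalog>')

def dangling_references(root, id_attr="id", idrefs_attr="index"):
    """Return IDREFS tokens that do not match any ID in the document."""
    ids = [e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None]
    if len(ids) != len(set(ids)):
        raise ValueError("ID values must be unique within the document")
    known = set(ids)
    dangling = []
    for elem in root.iter():
        # IDREFS is a whitespace-separated list of references.
        for ref in elem.get(idrefs_attr, "").split():
            if ref not in known:
                dangling.append(ref)
    return dangling

print(dangling_references(doc))
```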
-
Attributes list example
-
Mixed content in DTDs
Mixing #PCDATA declarations with other subelements means that the content can be mixed
some text some emphasized text blah some bold text
-
Declarations of DTDs
No DTD (well-formed documents)
DTD inside the document:
DTD external, specified by URI:
DTD external, name and optional URI:
DTD inside the document + external:
-
Correctness of XML documents
Well-formed documents
  Verify the basic XML constraints, e.g. properly nested and closed tags
Valid documents
  Verify the additional DTD structural constraints
Documents that are not well-formed cannot be processed
Non-valid documents can still be processed (queried, transformed, etc.)
-
Limitations of DTDs
DTDs describe only the grammar of the XML file, not the detailed structure and/or types
This grammatical description has some obvious shortcomings:
  We cannot express that a length element must contain a non-negative number (constraints on the type of the value of an element or attribute)
  The unit element should only be allowed when amount is present (co-occurrence constraints)
  The comment element should be allowed to appear anywhere (schema flexibility)
-
Good Schema design principles
The XML schema language shall be
1. more expressive than XML DTDs
2. expressed in XML
3. self-describing
4. usable by a wide variety of applications that employ XML
5. straightforwardly usable on the Internet
6. optimized for interoperability
7. simple enough to be implemented with modest design and runtime resources
8. coordinated with relevant W3C specs
-
Recapitulation
XML as inheriting from the Web history
  SGML, HTML, XHTML, XML
XML key concepts
  Documents, elements, attributes, text
  Order, nested structure, textual information
Namespaces
XML usage scenarios
  Financial, medical, metadata, blogs, etc.
DTDs and the need for describing the structure of an XML file
Next: XML Schemas