ISO/IEC JTC 1/SC 32 N 0444
Date: 2000-02-02
REPLACES: --
ISO/IEC JTC 1/SC 32
Data Management and Interchange
Secretariat: United States of America (ANSI)
Administered by Pacific Northwest National Laboratory on behalf of ANSI
DOCUMENT TYPE Other Document (Open)
TITLE Tutorial on Metadata Registries
SOURCE Bruce Bargmeyer
PROJECT NUMBER
STATUS Presented at the SC 32 Tutorial 2000-01-24
REFERENCES
ACTION ID. FYI
REQUESTED ACTION
DUE DATE
Number of Pages 26
LANGUAGE USED English
DISTRIBUTION P & L Members
SC Chair
WG Conveners and Secretaries
Douglas Mann, Secretariat, ISO/IEC JTC 1/SC 32
Pacific Northwest National Laboratory*, 901 D Street, SW., Suite 900, Washington, DC, 20024-2115, United States of America
Telephone: +1 703 575 2114; Facsimile: +1 703 681 9180; E-mail: [email protected]
Available from the JTC 1/SC 32 Web site: http://bwonotes5.wdc.pnl.gov/SC32/JTC1SC32.nsf
*Pacific Northwest National Laboratory (PNL) administers the ISO/IEC JTC 1/SC 32 Secretariat on behalf of ANSI
1
SDC-0002-021-JE-2030
January 19, 2000
Bruce Bargmeyer
Chair, SC 32 – Data Management and Interchange
U.S. Environmental Protection Agency
Telephone: (202) 260-5306
WWW URL: http://www.nist.gov/itl/div897/staff/bargmeyer
SC 32 Tutorial on Metadata Registries
2
ISO 11179, Parts 1-6
• Part 1: Framework for the Specification and Standardization of Data Elements
• Part 2: Classification for Data Elements
• Part 3: Basic Attributes of Data Elements
• Part 4: Rules and Guidelines for the Formulation of Data Definitions
• Part 5: Naming and Identification Principles for Data Elements
• Part 6: Registration of Data Elements
• DTR 15452 Specification of Data Value Domains
3
Part 4: Data Definitions
• Verification Method Code: The code that represents the process used to verify the latitude and longitude coordinates.
• Mailing Address Post Office Box: The number of the post office box where mail for the addressee is delivered.
• International Postal Code: The code that represents a postal zone specific to a country.
4
Data Element Concept
Afghanistan, Belgium, China, Denmark, Egypt, France, Germany, …
Name: Country Identifiers | Context: | Definition: | Unique ID: 5769 | Conceptual Domain: | Maintenance Org.: | Steward: | Classification: | Registration Authority: | Others

Data Elements
ISO 3166 English Name: Afghanistan, Belgium, China, Denmark, Egypt, France, Germany, …
ISO 3166 3-Numeric Code: 004, 056, 156, 208, 818, 250, 276, …
ISO 3166 3-Alpha Code: AFG, BEL, CHN, DNK, EGY, FRA, DEU, …
Name: | Context: | Definition: | Unique ID: 4572 | Value Domain: | Maintenance Org.: | Steward: | Classification: | Registration Authority: | Others
Name: | Context: | Definition: | Unique ID: 3820 | Value Domain: | Maintenance Org.: | Steward: | Classification: | Registration Authority: | Others
Name: | Context: | Definition: | Unique ID: 1047 | Value Domain: | Maintenance Org.: | Steward: | Classification: | Registration Authority: | Others
• To reduce the costs of managing metadata, we want to enable the interchange of metadata, including terminology, between metadata registries.
• Organizations that are responsible for particular terminology and data elements can propagate changes to other 11179 metadata registries.
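The country example on this slide can be sketched as a small data structure: one data element concept (Country Identifiers, unique ID 5769) with three data elements that express the same countries in different value domains. This is only an illustration. The class names and the `translate` helper are assumptions made here, not part of ISO/IEC 11179, and because the slide does not say which unique ID belongs to which data element, the sketch leaves those IDs out.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One representation of a concept (hypothetical, simplified shape)."""
    name: str
    # Maps each value meaning (here, the country) to its permissible value.
    values: dict[str, str]


@dataclass
class DataElementConcept:
    name: str
    unique_id: int
    data_elements: dict[str, DataElement] = field(default_factory=dict)

    def add(self, element: DataElement) -> None:
        self.data_elements[element.name] = element

    def translate(self, value: str, source: str, target: str) -> str:
        """Convert a value from one data element's value domain to another's."""
        src = self.data_elements[source].values
        tgt = self.data_elements[target].values
        # Find the value meaning whose representation matches, then re-express it.
        meaning = next(m for m, v in src.items() if v == value)
        return tgt[meaning]


# Values as listed on the slide.
countries = ["Afghanistan", "Belgium", "China", "Denmark", "Egypt", "France", "Germany"]
alpha3 = ["AFG", "BEL", "CHN", "DNK", "EGY", "FRA", "DEU"]
numeric3 = ["004", "056", "156", "208", "818", "250", "276"]

concept = DataElementConcept(name="Country Identifiers", unique_id=5769)
concept.add(DataElement("ISO 3166 English Name", dict(zip(countries, countries))))
concept.add(DataElement("ISO 3166 3-Alpha Code", dict(zip(countries, alpha3))))
concept.add(DataElement("ISO 3166 3-Numeric Code", dict(zip(countries, numeric3))))

print(concept.translate("DEU", "ISO 3166 3-Alpha Code", "ISO 3166 3-Numeric Code"))
```

Because every data element is keyed by the same value meanings, a registry that receives one representation can derive the others without a pairwise mapping table for each combination.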
5
[Figure: ISO/IEC 11179 metamodel, UML class diagram. Classes, attributes, and associations recoverable from the slide are listed below; which classes each association connects is not recoverable from the text.]

Classes and attributes:
Data_Element_Concept_Relationship: <<Required>> type_description
Non_enumerated_Domain: <<Required>> description
Value_Domain_Relationship: <<Required>> type_description
Enumerated_Domain
Permissible_Value: <<Required>> item; <<Required>> begin_date; <<Conditional>> end_date
Value_Meaning: <<Required>> identifier; <<Optional>> description; <<Required>> begin_date; <<Conditional>> end_date
Conceptual_Domain: <<Optional>> administered_component_information : Administered_Component; <<Optional>> dimensionality
Value_Domain: <<Optional>> administered_component : Administered_Component; <<Optional>> name; <<Required>> datatype : Datatype; <<Optional>> maximum_character_quantity; <<Optional>> minimum_character_quantity; <<Optional>> format; <<Optional>> unit_of_quantity : Unit_of_Quantity
Example: <<Required>> item
Data_Element_Concept: <<Required>> administered_component : Administered_Component; <<Optional>> object_class : Object_Class; <<Optional>> object_class_qualifier; <<Optional>> property : Property; <<Optional>> property_qualifier
Data_Element: <<Required>> administered_component : Administered_Component; <<Required>> representation_class : Representation_Class; <<Optional>> representation_class_qualifier
Rule: <<Optional>> administered_component : Administered_Component; <<Required>> description
Source_Data_Element

Associations (role names and multiplicities as printed):
allowed_value: +member_of 2..n; +specifing 1..*
permissible_value: +contained_in 2..n; +containing 0..*
permissible_value_meaning: +represented_by 1..*; +representing 0..*
comceptual_domain_relationship: +containing 0..*; +contained_in 0..*
value_meaning_set: +containing 1..*; +contained_in 0..*
value_domain_relationship: +contained_in 0..*; +containing 0..1
specification: +representing 0..*; +specified_by 1..1
data_element_concept_relationship: +containing 0..1; +contained_in 0..*
data_element_concept_conceptual_domain_relationship: +specifing 1..1; +having 0..*
representation: +represented_with 0..*; +providing_representation_for 1..1
exemplication: +represented_by 1..*; +representing 1..*
expression: +providing_representation_to 0..*; +represented_by 1..1
derivation_input: +containing 0..*; +contained_in 1..*
derivation_output: +is_input_to 0..1; +resulting_from 1..1
derivation: +is_formula_for 1..1; +used_by 0..*

Example values shown on the diagram:
AFG, BEL, CHN, DNK, EGY, FRA, DEU, …
004, 056, 156, 208, 818, 250, 276, …
Afghanistan, Belgium, China, Denmark, Egypt, France, Germany, …
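In the diagram, each Permissible_Value carries a required begin_date and a conditional end_date. The following is a minimal sketch of how a registry might use those dates to answer "which values were in force on a given day". The class shapes are simplified from the diagram, the `valid_on` and `items_valid_on` helpers are hypothetical, and the dates and the "XXA" entry are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ValueMeaning:
    identifier: str                      # <<Required>>
    description: Optional[str] = None    # <<Optional>>


@dataclass(frozen=True)
class PermissibleValue:
    item: str                            # <<Required>>
    meaning: ValueMeaning
    begin_date: date                     # <<Required>>
    end_date: Optional[date] = None      # <<Conditional>>: set once the value is retired

    def valid_on(self, day: date) -> bool:
        """True if the value was in force on the given day (hypothetical helper)."""
        if day < self.begin_date:
            return False
        return self.end_date is None or day <= self.end_date


@dataclass
class EnumeratedDomain:
    name: str
    values: list[PermissibleValue]

    def items_valid_on(self, day: date) -> list[str]:
        return [pv.item for pv in self.values if pv.valid_on(day)]


# Dates and the "XXA" entry are made up for illustration only.
domain = EnumeratedDomain(
    name="ISO 3166 3-Alpha Code",
    values=[
        PermissibleValue("BEL", ValueMeaning("vm-1", "Belgium"), date(1990, 1, 1)),
        PermissibleValue("XXA", ValueMeaning("vm-2", "a retired entry"),
                         date(1990, 1, 1), end_date=date(1994, 12, 31)),
    ],
)

print(domain.items_valid_on(date(1992, 6, 1)))
print(domain.items_valid_on(date(2000, 1, 1)))
```

Keeping retired values with an end date, rather than deleting them, lets older data that still uses those values be interpreted correctly.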
6
Elements of Terminology
Object
Concept
Term: trout, Salmo trutta, brown trout, truite
Definition: any of several game fishes of the genus Salmo, related to the salmon...
7
Terminology
Terms           Context            Concept
trout           common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
Definition: any of several game fishes of the genus Salmo, related to the salmon...
• We capture a concept with a definition.
• We assign unique identifiers to uniquely identify concepts.
• There are potentially many terms associated with any concept, and many concepts associated with any one term.
• Each term has a context from which it is extracted.
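The four points above describe a many-to-many relationship between terms and concepts, with a context attached to each link. A minimal sketch of that structure follows. The `Terminology` class and its method names are assumptions for illustration, and only the trout example and UIN=6349 come from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    uin: int          # unique identifier for the concept
    definition: str   # the definition captures the concept


class Terminology:
    """Many-to-many link between terms and concepts; each link has a context."""

    def __init__(self) -> None:
        self._concepts: dict[int, Concept] = {}
        self._by_term: dict[str, set[tuple[int, str]]] = defaultdict(set)
        self._by_concept: dict[int, set[tuple[str, str]]] = defaultdict(set)

    def add_concept(self, concept: Concept) -> None:
        self._concepts[concept.uin] = concept

    def add_term(self, term: str, context: str, uin: int) -> None:
        if uin not in self._concepts:
            raise KeyError(f"unknown concept {uin}")
        self._by_term[term.lower()].add((uin, context))
        self._by_concept[uin].add((term, context))

    def concepts_for(self, term: str) -> list[Concept]:
        """All concepts a term may refer to, in UIN order."""
        uins = sorted({uin for uin, _ in self._by_term.get(term.lower(), set())})
        return [self._concepts[u] for u in uins]

    def terms_for(self, uin: int) -> list[tuple[str, str]]:
        """All (term, context) pairs registered for a concept."""
        return sorted(self._by_concept.get(uin, set()))


terminology = Terminology()
terminology.add_concept(Concept(
    6349, "any of several game fishes of the genus Salmo, related to the salmon..."))
terminology.add_term("trout", "common name", 6349)
terminology.add_term("Salmo trutta", "scientific name", 6349)
terminology.add_term("truite", "French name", 6349)

print(terminology.terms_for(6349))
print([c.uin for c in terminology.concepts_for("Trout")])
```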
8
Dictionary/Glossary/Keywords
Printed / On-line
Terms           Context            Concept
Brown trout     common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
• It is common to alphabetize terms and show the many senses of a term (e.g., dictionary, glossary, keyword list).
• Terms within a specific context may also be stored for a particular subject area.
• We can alphabetize terms so people can look up and see all the concepts associated with a particular term.
• Specialized fields, or subject-matter specialties, gather and store terms specific to a particular context (e.g., legal dictionaries, medical glossaries).
• Deployment technologies: printed, on-line
9
Thesaurus (and Topic Trees)
Search Example: Fish, Trout
Search Engine
Documents / Data: 1 2 3 4 5 6 7 8
Terms           Context            Concept
Brown trout     common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
Hierarchy:
fish
  trout
    Salmo trutta
    Brown Trout
Weighted topic tree:
** 1.00 fish <any>
*** 0.25 trout <or>
**** 0.10 Salmo trutta
**** 0.10 Brown trout
• Another way to structure terminology is into a thesaurus.
• A thesaurus has hierarchical relationships (NT, narrower term; BT, broader term; UF, used for). We add value by building a structure that is more informative than a dictionary.
• Thesauri can be deployed as published books or software, or used to feed search engine applications.
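One way a thesaurus can feed a search engine is query expansion: a search on a broad term also retrieves its narrower terms. The sketch below assumes the fish > trout hierarchy from the slide and stores only broader-term (BT) links, deriving narrower terms (NT) from them. The function names are hypothetical, and the weights shown on the slide are not modelled.

```python
# Broader-term (BT) links taken from the slide's fish > trout hierarchy.
BROADER = {
    "trout": "fish",
    "Salmo trutta": "trout",
    "Brown trout": "trout",
}


def narrower_terms(term: str) -> list[str]:
    """Direct NT entries, derived by inverting the BT links."""
    return sorted(t for t, bt in BROADER.items() if bt == term)


def expand(term: str) -> list[str]:
    """Return the term followed by all of its narrower terms, depth-first."""
    result = [term]
    for nt in narrower_terms(term):
        result.extend(expand(nt))
    return result


print(expand("fish"))
print(expand("trout"))
```

Storing the relationship in one direction and deriving the other keeps BT and NT consistent by construction.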
10
Data Elements
Terms           Context            Concept
Brown trout     common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
Data element concept: trout species name
Definition: The names of species of trout
Data element 1, Common name: brook trout, brown trout, cutthroat trout
Data element 2, Scientific name: Salvelinus fontinalis, Salmo trutta, Oncorhynchus clarkii
• We organize our terms into data elements.
• Terms can be used in names, definitions, and values.
• The way to access all this data is by term.
• In our metadata registry we want to be able to view this information by the term name.
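Viewing registry content by term amounts to an inverted index from each term to the places it is used. The following sketch indexes the two data elements from the slide by the words in their names, definitions, and values. The record shape and `build_term_index` are assumptions for illustration, and for simplicity the concept's definition is copied onto each data element.

```python
from collections import defaultdict

# Data elements from the slide, in a simplified, hypothetical record shape.
data_elements = [
    {
        "name": "Common name",
        "definition": "The names of species of trout",
        "values": ["brook trout", "brown trout", "cutthroat trout"],
    },
    {
        "name": "Scientific name",
        "definition": "The names of species of trout",
        "values": ["Salvelinus fontinalis", "Salmo trutta", "Oncorhynchus clarkii"],
    },
]


def build_term_index(elements):
    """Map each lower-cased word to the (element name, field) pairs that use it."""
    index = defaultdict(set)
    for element in elements:
        for word in element["name"].lower().split():
            index[word].add((element["name"], "name"))
        for word in element["definition"].lower().split():
            index[word].add((element["name"], "definition"))
        for value in element["values"]:
            for word in value.lower().split():
                index[word].add((element["name"], "value"))
    return index


index = build_term_index(data_elements)
print(sorted(index["trout"]))
print(sorted(index["salmo"]))
```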
11
Systems: STORET, Envirofacts . .
Database Design, Query
Terms           Context            Concept
Brown trout     common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
Web-enabled Forms, EDI Messages
“Message” Content
Federal Register, Regulations, Reports
Documents
Data Elements
• Data elements are organized into the design of databases, and can be used to perform queries.
• Data elements can be assembled into message content (e.g., message sets, transaction sets, information bundles, flat files) and can be deployed using EDI messages and other forms of communication.
• Terms and data elements can be organized into documents and published as reports or regulations, included in legislation, etc.
12
Intelligent Information Services (IIS): Ontology
Terms           Context            Concept
Brown trout     common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
Local Mapping
Central Mapping
Query Agent, Broker, Mediator, Resource Agent
Example:
Table: fish
Data element: common name
Data element value: brown trout
• A fourth way to structure terminology is into ontologies. Ontologies have rich structures of relationships (generalization, specialization, inheritance).
• Terms can be used to drive IISs that use query agents, wrapping agents, mediators, and similar technology to access data in geographically dispersed, disparate sources. Example: an ontology might be structured to include tables, data elements, and data element values, so that queries could be constructed using these ontological terms and resources. Queries are mapped to this ontology, which enables data to be integrated without requiring every system to use exactly the same terms.
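The mediation idea can be sketched as follows: a query is phrased in shared ontology terms, and a per-source mapping translates those terms into each source's local table and column names. Everything in the sketch (source names, local names, rows, and the `query` function) is hypothetical and stands in for the agents and mediators named on the slide.

```python
# Shared ontology terms mapped to each source's local names (all hypothetical).
LOCAL_MAPPINGS = {
    "source_a": {"table": "fish", "common name": "common_name"},
    "source_b": {"table": "species", "common name": "vernacular"},
}

# Tiny in-memory stand-ins for two disparate sources.
SOURCES = {
    "source_a": {"fish": [{"common_name": "brown trout"}, {"common_name": "brook trout"}]},
    "source_b": {"species": [{"vernacular": "brown trout"}, {"vernacular": "carp"}]},
}


def query(ontology_field: str, value: str) -> list[str]:
    """Return the names of sources holding a row where the field equals value."""
    hits = []
    for source, mapping in LOCAL_MAPPINGS.items():
        table = SOURCES[source][mapping["table"]]
        column = mapping[ontology_field]  # translate ontology term to local column
        if any(row.get(column) == value for row in table):
            hits.append(source)
    return hits


print(query("common name", "brown trout"))
print(query("common name", "carp"))
```

The caller never sees `common_name` or `vernacular`; only the mapping layer knows the local vocabulary of each source.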
13
Metadata Registry
Users
Companies
Universities
Agencies
Others
Terminology, Thesaurus, Themes
Data Standards
Ontology
Structured Metadata
Intelligent Systems
MiddleWare
InfoSleuth
14
Terminology Sources
Terms           Context            Concept
Brown trout     common name        UIN=6349
Salmo trutta    scientific name    UIN=6349
truite          French name        UIN=6349
Thesauri: General European Multilingual Environmental Thesaurus (GEMET) and Unified Medical Language System (UMLS)
Practitioners in Field
Documents
Data Elements
There are many sources of semantic content for our terminology: thesauri such as GEMET and UMLS, and other sources such as data elements, documents, and practitioners in the field.
15
Semantics Management
Ontology
Thesaurus
Data Elements
Search Engine
DBMS/EDI/Documents
IIS
Dictionary/Glossary/Keywords
a category of vertebrate, cold-blooded craniate animals with permanent gills...
• 11179 New Work Items define the overlap of 11179 and relevant standards for managing, organizing, and structuring terminology.
• We want to include modifications needed in 11179, or any additions to it.
• Users may want to access terminology directly in a metadata registry.
16
Profiles of ISO/IEC 11179
17
[Figure: ISO/IEC 11179 metamodel, UML class diagram, repeated from slide 5 with the same classes, attributes, and associations.]
18
Part 4: Definitions
Part 5: Names, IDs
Part 2: Classifications
19
Part 5: Names, IDs
Part 4: Definitions
Part 2: Classifications
20
Impacts of Data Change
Biological terms
New species, New classification
Systematic Name, Vernacular Name
Taxonomic Serial Number
ISO, ANSI, Industry, Gov’t
21
Impacts of Data Change
Chemicals
New chemical
Systematic Name, Synonym
CAS Number
ISO, ANSI, Industry, Gov’t
22
Impacts of Data Change
Country
Country changes:
CZ Czechoslovakia*
CZ Czech Republic**
LO Slovakia**
* FIPS PUB 10-3
** FIPS PUB 10-4
Change
ISO, ANSI, Industry, Gov’t
23
As changes in the world occur, metadata changes will be recorded and effectively communicated between registries.
[Figure labels: Standard Data; Change; ISO, ANSI, Industry, Gov’t; Defense Data Dictionary System; Catalog of Data Sources; Health Care Financing Administration; Intelligent Transportation System; EDR; Health Affairs; HISB]
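Recording a change once and communicating it to other registries can be sketched as a small publish/subscribe arrangement. The `Change` and `Registry` classes below are hypothetical, and the sketch says nothing about how real 11179 registries exchange updates. The example uses two entries from the country slide: CZ Czechoslovakia (FIPS PUB 10-3) and LO Slovakia (FIPS PUB 10-4).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    """A recorded metadata change (hypothetical record shape)."""
    domain: str
    action: str      # "add" or "retire"
    code: str
    meaning: str


@dataclass
class Registry:
    name: str
    domains: dict[str, dict[str, str]] = field(default_factory=dict)
    log: list[Change] = field(default_factory=list)
    subscribers: list["Registry"] = field(default_factory=list)

    def apply(self, change: Change) -> None:
        """Record the change locally, then pass it on to subscribed registries."""
        if change in self.log:
            return  # already seen; avoids loops between mutual subscribers
        values = self.domains.setdefault(change.domain, {})
        if change.action == "add":
            values[change.code] = change.meaning
        elif change.action == "retire":
            values.pop(change.code, None)
        else:
            raise ValueError(f"unknown action {change.action!r}")
        self.log.append(change)
        for registry in self.subscribers:
            registry.apply(change)


source = Registry("standards body")
downstream = Registry("agency registry")
source.subscribers.append(downstream)

source.apply(Change("country", "add", "CZ", "Czechoslovakia"))
source.apply(Change("country", "retire", "CZ", "Czechoslovakia"))
source.apply(Change("country", "add", "LO", "Slovakia"))

print(downstream.domains["country"])
print(len(downstream.log))
```

The downstream registry ends up with both the current values and a log of how they changed, so the change is recorded as well as communicated.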
24
Profiles of ISO/IEC 11179
25
Implementations
• Individual Parts of 11179
• Profiles
• Extensions