Simon Bishop‘s slides for ESUG TC on SHARE on 22-March-2011
SHARE
ESUG Teleconference on 22-Mar-2011
What am I going to cover?
• GSK’s current approach to standards and the need for change
• Our plans / ongoing work and the similarity to CDISC SHARE
• Information Model Technicalities (will probably skip)
• Making the information model real …
• SHARE content versus GSK content
• What do you have to do in order to gain maximal benefits from SHARE?
• Flexibility in practice
• Creating an eCRF
• Slide pack on BRIDG and ISO21090 [included in the slide pack but will not be covered]
GSK’S CURRENT APPROACH TO STANDARDS AND THE NEED FOR CHANGE
What is the current GSK approach?
• Current approach to standards is based on standard dataset definitions which combine terminology, rules and structure
• The standards processes are managed through a Lotus Notes database solution and are made available to teams through multiple electronic solutions (an in-house Dataset Manager tool, an in-house study specification tool, InForm libraries etc)
• Standards are available at both the global (core – all therapy areas) and therapy area levels. Some standards have been defined at indication level within the therapy area standards
• We align standard objects (CRFs, data extraction programs, statistical displays, algorithms etc) to standard dataset definitions – a general rule is one eCRF module/page per dataset
• Lots of documentation, but not integrated with the standards
• Study teams are required to apply for changes or exemptions when they need to do something different for captured data
Some of our current issues
• GSK standards, based on SDS 2.1 (the predecessor to SDTM), have limitations
– duplicate variables and datasets
– ambiguity (what is this?; how am I meant to use this?)
– different datasets employ different structures … hard to become familiar
– Data Management and Stats want different data structures in order to do their work
– little opportunity for automation
– hard to aggregate and reuse data other than the core standards (AEs, labs, vitals etc)
• Lots of problems mapping our standards to SDTM
– extra variables which don’t fit the domains
– multiple different uses for an individual variable (some subtle differences but others not so subtle)
• SDTM seen as an add-on deliverable … we don’t want to build our standards and tools around it
– not an operational standard
– doesn’t fit with our current complex toolset
– doesn’t seem to fit with ADaM or our reporting process/macros
– doesn’t do much to help with data aggregation
• Standards too tied to our toolset
– hard to automate across the study process
– painful whenever a tool is replaced
Drivers for change
• Regulatory requirements for clinical data are changing
– new FDA requirements (i.e. CDISC) on their way
– uncertainty about the future (e.g. HL7 v3)
• Need to be able to share data more easily with development partners
• We need more flexibility in using standards (from the study & project team perspective) whilst maintaining/increasing the benefits of standardisation
• Want to minimise the effort associated with transforming data to standards, or using more than one standard
• Need a less complex clinical computing environment/toolset
• Need to be able to do more work with fewer resources
• Currently replacing most of our clinical trial toolset … if we are going to change our standards, we have to change them now
GSK Long term vision
Regulatory/legal/public mandate:
• GSK is well prepared to provide regulators and others with the data they require, in the format required
• always able to respond to regulatory queries quickly
Operational efficiency:
• increase operational efficiency through the implementation of a metadata driven approach
• provide study teams with the flexibility to capture and process the data in an optimal way (study teams to have the ability to decide on structure and grouping of their data)
• variables much more clearly defined: less ambiguity, less confusion
Data Reuse:
• ability to combine and analyse data across studies, indications and more broadly with little effort
Traceability:
• ability to trace all the way back from a result in a clinical report (e.g. a mean value or a p-value) to the value that was first entered in the CRF/eCRF … with an understanding, at each step, of what data/variables were used and what algorithms were applied
OUR PLANS / ONGOING WORK AND THE SIMILARITY TO CDISC SHARE
So what are we doing?
• Long term, we want to use SHARE content
• Cannot wait for SHARE before changing our standards as we’re replacing systems now
• Developed an Information Model which all our standards will follow, together with an implementation plan for this
– standards being developed independently of our systems
– new systems built to work with / take advantage of the new standards
• Critically, our information model is based on the same industry standards as the SHARE information model
So what are we doing?
• Metadata driven approach to developing, executing and reporting clinical trials
– eProtocol tool
– metadata repository
– many systems consuming the metadata: eCRF tool, reporting tools …
• Metadata Repository
– structured based on our information model
– houses all the clinical data definitions
– houses operational metadata (information needed to create eCRFs, datasets, SDTM datasets etc)
Information Model Technicalities
Information Model Details
• The information model is a combination of three industry standards:
– the BRIDG model (a collaborative piece of work between CDISC, HL7, FDA and the US National Cancer Institute (NCI))
– the ISO21090 datatype standard (applicable across Healthcare, not just regulated clinical research) … very similar to the HL7 abstract datatypes
– the ISO11179 metadata registry standard
Simple explanation of these 3 standards
BRIDG is a standard way of representing the world of clinical research
– it doesn’t take us right down to variables, but it does take us down to meaningful objects such as “anatomic location”, “result”, “date” etc
ISO21090 datatypes are a standard way of representing particular types of data
– these take us from the BRIDG meaningful objects such as “result” to individual variables like “value”, “unit”, “code”
The link between BRIDG and ISO21090 is that all the BRIDG meaningful objects have an ISO21090 datatype
ISO11179 is a standard way of recording metadata in a metadata registry
– we want to be compliant with it, but it isn’t something that operational folk need to understand or worry about
Sources of Information
• BRIDG site: http://www.bridgmodel.org/ (we are using 3.0.3)
• ISO21090 standard: http://gforge.hl7.org/svn/hl7v3/trunk/dt/iso/index.htm (logon with username= anonymous and blank password) … the 2011 published version is on the ISO website
• Enterprise Architect is the modelling software used by BRIDG. Here is a link to a free viewer: http://www.sparxsystems.com/bin/EALite.exe
• I have included a simple to understand slide set on BRIDG and ISO21090 (15 easy slides) at the bottom of this slide pack for those who want to understand more
Making the information model real …
What does this Information Model approach give us?
• A well developed model of clinical research … there shouldn’t be anything missing
– so we model clinical data in a consistent and formalised manner
• A templated approach to the development of our standards
– we end up selecting variables from a short list rather than manually creating them
And the usual inevitable downsides?
• BRIDG model is complicated
– but this is because clinical research and clinical data are complicated
– use of a templated approach to implementation removes much of the complexity
– you do need to train people (as always)
– you need to take advantage of the capabilities to reap the biggest benefits
• ISO21090 datatype standard has been accused of being too complicated
– without tools to help you, I’m sure that is true
– but it is the complexity that allows the development of a templated approach to standards creation
– you need to train people … but mainly with regard to choices they have to make
So what does content look like?
Concepts: BRIDG based modelling of the clinical data
[Diagram: four concepts joined by BRIDG based associations between concepts (wording in blue describes things from the bottom up)]
• Haemoglobin Result is a result of Haemoglobin Test
• Haemoglobin Test is a test performed on Blood Specimen
• Blood Specimen is a result of Blood Specimen Collection
Example values shown in the diagram:
• Fasting status indicator: value = true
• Date Range: low value = 23-Apr-2010
• Accession Number Text: value = 01876288485
• Condition Code: item code = CC51, display name value = haemolysed
• Category Code: code = HAEM, display name value = Haematology
• Result: value = 151, unit = g/L
So what does content look like?
[Diagram: the same four concepts, associations and example values as on the previous slide, annotated to show three levels]
• Concept attributes from BRIDG
• ISO21090 decomposition: “pre-variable attributes”
• ISO21090 decomposition: variables (shown with example values)
What we get from the metadata
• Concepts – clear definitions of clinical information (e.g. height, systolic blood pressure, weight result)
• Associations – how the concepts connect together, rules for the use of concepts
• BRIDG attributes – meaningful attributes for a piece of clinical data (e.g. method, date, anatomic site, result) … some may have codelists
• ISO21090 decomposition: “pre-variable attributes” – various levels of clumping of variables; some may have codelists
• Variables – clear, model based, unambiguous variables
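To make these layers concrete, here is a minimal Python sketch of the haemoglobin example from the slides. It is an illustration only, not the GSK or SHARE implementation. The class names (Variable, Concept, Association), the helper trace_up and the dotted variable paths are all hypothetical, and the placement of each attribute against a concept is my reading of the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Lowest level: one named value, e.g. Result.value = 151."""
    path: str      # BRIDG attribute, then ISO21090 decomposition (assumed notation)
    value: object


@dataclass
class Concept:
    """A clinical concept together with the variables selected for it."""
    name: str
    variables: list = field(default_factory=list)


@dataclass
class Association:
    """A named link between two concepts, read from the bottom up."""
    source: Concept
    label: str
    target: Concept


# Assumed assignment of the example values to the four concepts.
collection = Concept("Blood Specimen Collection", [
    Variable("Fasting status indicator.value", True),
    Variable("Date Range.low.value", "23-Apr-2010"),
])
specimen = Concept("Blood Specimen", [
    Variable("Accession Number Text.value", "01876288485"),
    Variable("Condition Code.item.code", "CC51"),
    Variable("Condition Code.item.display name.value", "haemolysed"),
])
test = Concept("Haemoglobin Test", [
    Variable("Category Code.code", "HAEM"),
    Variable("Category Code.display name.value", "Haematology"),
])
result = Concept("Haemoglobin Result", [
    Variable("Result.value", 151),
    Variable("Result.unit", "g/L"),
])

associations = [
    Association(result, "is a result of", test),
    Association(test, "is a test performed on", specimen),
    Association(specimen, "is a result of", collection),
]


def trace_up(start, links):
    """Follow associations from a concept back towards its origin."""
    steps = []
    current = start
    while True:
        link = next((a for a in links if a.source is current), None)
        if link is None:
            return steps
        steps.append(f"{link.source.name} {link.label} {link.target.name}")
        current = link.target


for step in trace_up(result, associations):
    print(step)
```

Because the associations are data rather than dataset structure, the same walk works from any concept, which is the kind of traceability described in the long term vision.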
Steps needed to create that information?
• Choose which clinical scenario template we need (in this case, one containing specimen, lab test & lab result)
• Enter information about each concept (a name, a description, a definition …)
• Choose which of the BRIDG attributes we will need
• Choose which associations are needed
• Choose which bits of the ISO21090 decomposition we need
• Enter the name of codelists when prompted (and select the set of codes in that codelist that you want to make available for this concept)
SHARE CONTENT VERSUS GSK CONTENT
What we do expect from SHARE
• We expect SHARE to provide us with these model based definitions (the concepts, concept attributes and decomposition together with the associations between concepts and the terminology)
• We expect SHARE to provide us with the information needed to represent these definitions in the form of SDTM domains
• There will be a SHARE metadata repository
• GSK expect to import all the SHARE metadata into the GSK metadata repository
What we don’t expect from SHARE
• We do not expect SHARE to provide us with all the rules that GSK will want to apply
• We do not expect SHARE to provide us with all the operational metadata we need to create study objects (GSK datasets, GSK eCRFs)
• GSK expect to add additional metadata to the GSK repository … we want to augment the SHARE content, not change it
Choices
• Just use the SHARE variables and forget about the rest of the metadata
– you get consistent industry standard variables
– you can keep your own processes
– but you may not use the variables in such a way that you can aggregate your data with that of others
– you miss out on the additional benefits
• Use the SHARE metadata to the full and augment with additional company metadata [the GSK approach]
– you get all the benefits of using the SHARE metadata
– you get additional capability to automate downstream processes
Creating a GSK standard using SHARE content
• Rules …
– define which variables are mandatory, optional, conditional in a study specification
– define the conditionality rules e.g. either have to include variables for total daily dose/dose units or dose/dose unit and frequency
– define which variables have to be populated if used in a study
– (in fact, we may apply rules to associations, BRIDG based attributes, “pre-variable attributes” and codes as well as to variables)
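As an illustration of a conditionality rule, the dosing example could be checked against a study specification roughly as follows. This is a sketch under assumptions: the variable names and the function dose_rule_errors are hypothetical, and the two options are treated as mutually exclusive, in line with the “but not both” wording used later in this slide pack.

```python
# Hypothetical variable names for the two dosing options described in the rule.
TOTAL_DAILY_OPTION = {"total_daily_dose", "dose_unit"}
SINGLE_DOSE_OPTION = {"dose", "dose_unit", "dose_frequency"}


def dose_rule_errors(selected):
    """Return the rule violations for the variables selected in a study specification."""
    uses_total = "total_daily_dose" in selected
    uses_single = bool(selected & {"dose", "dose_frequency"})
    if uses_total and uses_single:
        return ["choose total daily dose OR dose + frequency, not both"]
    if not uses_total and not uses_single:
        return ["no dosing option selected"]
    required = TOTAL_DAILY_OPTION if uses_total else SINGLE_DOSE_OPTION
    return [f"missing variable: {name}" for name in sorted(required - selected)]


print(dose_rule_errors({"total_daily_dose", "dose_unit"}))
print(dose_rule_errors({"dose", "dose_unit"}))
```

Keeping the rule as data (two sets of variable names) rather than hard coding it into an eCRF tool is what would allow it to sit in the metadata repository alongside the SHARE content.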
Rules example: Subject Disposition
Tick this … and you MAY tick none, one or many of these
Tick this … and you MUST enter text here
If the study includes pre-specified subreasons, an “other specify” subreason MUST be included and, if ticked, MUST be populated
If the study does not include subreasons, the “specify” MUST be included and populated
We should not expect SHARE to deliver these company specific rules
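The two MUST rules above can be sketched as simple checks. The screenshot itself is not reproduced in this transcript, so this is one reading of the rules, applied to a single disposition reason. The ReasonSpec shape, the function names and the example reasons are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReasonSpec:
    """Study-level setup of one disposition reason (hypothetical shape)."""
    name: str
    subreasons: list = field(default_factory=list)  # pre-specified subreasons
    has_other_specify: bool = False  # an "other specify" subreason is offered
    has_specify: bool = False        # a free-text "specify" field is offered


def spec_errors(spec):
    """Design-time check: is the right free-text field included?"""
    if spec.subreasons and not spec.has_other_specify:
        return ['an "other specify" subreason MUST be included']
    if not spec.subreasons and not spec.has_specify:
        return ['the "specify" MUST be included']
    return []


def entry_errors(spec, other_ticked=False, text=""):
    """Data-entry check for a reason the investigator has ticked."""
    populated = bool(text.strip())
    if spec.subreasons:
        # Ticking none, one or many pre-specified subreasons is allowed;
        # only a ticked "other specify" forces text.
        if other_ticked and not populated:
            return ['"other specify" is ticked and MUST be populated']
        return []
    if not populated:
        return ['the "specify" MUST be populated']
    return []


with_sub = ReasonSpec("Reason A", ["Subreason 1", "Subreason 2"], has_other_specify=True)
no_sub = ReasonSpec("Reason B", has_specify=True)

print(entry_errors(with_sub, other_ticked=True))
print(entry_errors(no_sub, text="free text entered by the investigator"))
```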
What extra metadata would we add?
• Mappings from other standards to concepts & concept variables
– legacy data
– development partners
• Mappings from SHARE terminology to GSK terminology and vice versa (mapping codes)
– we want to use SHARE terminology as much as we can but there are always going to be cases where, for some reason, we need to deviate
Central role of concept metadata
[Diagram: Concept Definitions sit at the centre. Non-GSK metadata and GSK legacy metadata are each linked to them by “mapping to concepts”. From the Concept Definitions, content is rendered as an eCRF, rendered as SDTM and rendered in registry form.]
Central role of concept aligned data
[Diagram: Concept aligned data sits at the centre. Non-GSK data and GSK legacy data are each brought in by “map data using metadata”. From the concept aligned data, data can be represented as an SDTM dataset, represented as a registry format dataset, and used for aggregations (“Aggregate anything”).]
So what operational metadata would we add?
• Metadata needed to render the definitions in a particular form e.g. an eCRF, a GSK dataset
– length and precision for variables
– whether a coded field should be represented as a drop down box or a radio button
– and more
• A study specification
Setting Up Studies
• For each study, we will produce a fully detailed study specification
• We will be doing this using the BRIDG modelling
– key to taking full advantage of the concept metadata
• This will be done at a fully detailed level
– including which variables will be collected at which visits/timepoints
– including which set of codes are available for use at that visit/timepoint (when codelisted)
– all the inherent structure of the metadata will be utilised to the full
Setting Up Studies
• Benefits of utilising the BRIDG trial design modelling
– the study time and events are modelled using study design concepts¹ and data collection concepts which makes for a fully integrated approach
– BRIDG modelling provides metadata/data driven navigation capability, guiding study investigators through sometimes very complex study procedures
• We can use the richness of the metadata included in the study specification to help with the creation of operational objects
¹ Study design concepts include visits, timepoints, cycles, arms, epochs, treatment strategies & elements
SDTM
• We expect to get totally consistent SDTM “for free”
– concepts are associated with SDTM domains
– concept variables are generated from BRIDG and ISO21090
– we expect there to be a mapping from BRIDG attributes/ISO decomposition to SDTM variables
– we expect to standardise/eliminate the inherent SDTM wiggle-room through this process
WHAT DO YOU HAVE TO DO IN ORDER TO GAIN MAXIMAL BENEFITS FROM SHARE?
Important Actions
• Always maintain a link back from operational objects to the SHARE definitions
• Use the SHARE objects right from the design stage of a study
• Augment the SHARE metadata with company specific metadata, for example
– rules (e.g. use this object or that object but not both)
– additional metadata to permit automation of eCRF screens (somewhat tool dependent)
FLEXIBILITY IN PRACTICE
System independent standards which are not tied to specific objects (e.g. dataset)
This is the GSK standard for a dataset …
Any variation requires an exemption or a new standard
In the new standards each coloured block is a “standard” or “building block” and they can be combined in different ways to make objects (e.g. datasets).
Flexibility In Practice – Dataset Content
An AE eCRF screen may look like this …
With the new standards it can also look like this …
There will still be standard objects (e.g. datasets) to provide the benefits of standardisation but also more flexibility (fewer exemptions required)
Flexibility In Practice – Datasets
An existing GSK dataset may look like this …
With the new standards the same data can also look like this …
Or this …
Or this …
Or this …
It all comes from the same building blocks (no exemptions required)
Flexibility In Practice – Transforming Non-Standard Data
Without building blocks … 9 mappings required
[Diagram: GSK standard, Partner standard and Vendor X, each mapped to CDISC SDTM datasets, CDISC ADaM datasets and GSK Operational datasets]
With building blocks … 6 mappings required
[Diagram: the same three sources and three sets of datasets, connected through the building blocks]
In-licensed Compound … 1 new mapping
New regulatory requirement … 1 new mapping
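The mapping counts can be reproduced with a few lines of Python. This is only an arithmetic sketch: it assumes the three standards on one side are the sources and the three sets of datasets are the targets, which is consistent with the 9 and 6 quoted on the slide. The function names and the “building blocks” hub label are hypothetical, and the figure of 3 extra direct mappings for a new source is my own calculation, not a number from the slide.

```python
sources = ["GSK standard", "Partner standard", "Vendor X"]
targets = ["CDISC SDTM datasets", "CDISC ADaM datasets", "GSK Operational datasets"]


def direct_mappings(sources, targets):
    """Without building blocks: every source is mapped separately to every target."""
    return [(s, t) for s in sources for t in targets]


def building_block_mappings(sources, targets, hub="building blocks"):
    """With building blocks: each source maps in once, each target maps out once."""
    return [(s, hub) for s in sources] + [(hub, t) for t in targets]


print(len(direct_mappings(sources, targets)))          # 9
print(len(building_block_mappings(sources, targets)))  # 6

# Adding one more source, e.g. an in-licensed compound:
more_sources = sources + ["In-licensed Compound"]
print(len(building_block_mappings(more_sources, targets)) - 6)  # 1
print(len(direct_mappings(more_sources, targets)) - 9)          # 3
```

The general point is that direct mapping grows with sources × targets, while mapping through the building blocks grows with sources + targets.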
CREATING AN E-CRF
Creating a smart eCRF
• SHARE will provide metadata about clinical information
• SHARE will provide multiple levels of clumping of objects e.g.
– value and unit
– test and test result
– albumin test is done using serum specimen
• Your company will add additional metadata to create company-specific standard combinations of the SHARE content e.g.
– either total daily dose object will be used or single dose object + dose frequency object will be used (but not both)
• Your company will add additional metadata to indicate whether repeat values are allowed
– only one primary reason for discontinuation is allowed (and must be provided) but multiple sub-reasons are permitted (and it is OK not to choose any)
• Your company will add additional metadata and/or define rules to facilitate the automation of eCRF creation e.g.
– represent this codelist as a radio button if it has fewer than 6 possible values and as a drop-down if it has 6 or more possible values
• Some metadata will need to be created at a study level e.g.
– is this a collected field or a hard coded field
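The radio button versus drop-down rule is simple enough to show as code. A minimal sketch follows. The function name widget_for_codelist and the example codelists are hypothetical, and only the threshold of 6 comes from the slide.

```python
def widget_for_codelist(codes, threshold=6):
    """Pick the eCRF control for a coded field from the size of its codelist."""
    return "radio button" if len(codes) < threshold else "drop-down"


# Hypothetical codelists, for illustration only.
yes_no = ["Y", "N"]
six_codes = ["C1", "C2", "C3", "C4", "C5", "C6"]

print(widget_for_codelist(yes_no))
print(widget_for_codelist(six_codes))
```

Because the study specification records which set of codes is available at each visit/timepoint, the same codelisted field could render differently in different studies without anyone editing the eCRF by hand.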
Creating a smart eCRF
• Two component parts
– creating individual pages … need metadata to:
• differentiate between hard coded information and collected information [study level metadata]
• drive pop-ups (e.g. pregnancy test details if subject is female) [company and/or study level metadata]
• allow repeat fields (e.g. medical history) [study may deviate from company level rule]
• apply rules (e.g. get the investigator to confirm values that are outside certain limits) [company and/or study level metadata]
– navigation through the complete eCRF
• general flow
• exceptional flows e.g. if a particular event occurs, additional tests/visits necessitated [BRIDG contains functionality to record this as computable metadata]
Will not cover the following slides during the training
They are for people to view after the meeting
Two industry standards: BRIDG and ISO21090
A simple explanation of what these are and what they provide
Information Model?
• An information model is a combination of structure and nomenclature
– modelling the structure of data
– employing a set of terms to describe the objects
• A good information model will ensure that nothing is glossed over and that similar things will be described in a similar manner
GSK’s rationale for using BRIDG and ISO21090
• We developed GSK standards with no underlying information model
– these have the right content (the info we need in GSK’s clinical trials)
– but consistency of approach, avoidance of duplication and ambiguity is not as good as we would like
• In 2009 we started to develop an information model based approach to representing GSK’s clinical trial standards, in order to gain bigger benefits from standardisation
– our original intention was not to implement BRIDG, but rather to use it as a tool … to guide us
– we ran into various issues requiring solutions … some of these we addressed using our own solutions
– at year end, we came to recognise that within BRIDG lies all the functionality we need to provide solutions to all our issues
– in January 2010, we took the decision to implement BRIDG and an ISO datatype standard as we felt this is the optimal approach
• using these we can address all our issues
• and, we can develop a solution that will be at least similar to that of SHARE
• and we will be using standards employed in the healthcare world
CDISC SHARE Project
• In the early days of the SHARE project, it was agreed that SHARE would use the BRIDG model, the ISO21090 datatype standard and the ISO11179 metadata registry standard as its information model
• Although SHARE could decide to implement these differently from GSK, currently the GSK and SHARE information models are very similar
BRIDG
• An information model
• Targeted at protocol driven research
• Reasonably mature
• Key collaborators: CDISC, HL7, NCI, FDA
Key Features
• BRIDG is a model of protocol driven research
– entities (animal, person, organisation, material)
– activities (any action that can, in the context of a study, be planned, scheduled or performed e.g. a surgical procedure, a laboratory test, or the administration of a drug)
– participation or functional role of an entity in an activity
– relationships between activities (both simple and complex relationships)
Example: Tissue Samples
[Diagram of a tissue sample workflow. Activities: Tissue Specimen Collection, Freeze Specimen, Preserve Specimen, Embed Specimen in Paraffin, Cut Slide from Block, Stain Slide, Test. Specimens and outputs: Fresh Tissue Specimen, Frozen Tissue Specimen, Preserved Tissue Specimen, Paraffin Block Specimen, Slide, Stained Slide, Result.]
Here we have a diagrammatic representation of a clinical procedure, in which a specimen is collected from a subject, some processing of the specimen may occur, and then the specimen is tested and a result obtained.
We may need data about some or all of the steps in this process, as well as about the test and its result.
Same example, but indicating BRIDG classes
[Diagram: the tissue sample workflow from the previous slide, with each box marked according to its BRIDG class]
Key to BRIDG classes: Performed Specimen Collection, Performed Specimen Procedure, Biologic Specimen, Performed Observation, Performed Observation Result
How does BRIDG help us?
[Diagram: the tissue sample workflow, repeated]
• BRIDG has templates (classes) for all the different objects and activities that are needed to describe protocol driven research. We do not need to create these from scratch each time.
• In this diagram, we have shown which template is appropriate for each object or activity
• In effect, we have taken copies of BRIDG templates and made these specific to our clinical process
How does BRIDG help us?
[Diagram: the tissue sample workflow, repeated]
• One of the things that BRIDG gives us is a framework (the classes and the relationships between these classes) by which we can document the information generated through a clinical process. For example:
– a specimen collection results in specimen(s)
– a test can have more than one result, but a result can only have one test
That isn’t enough … what else?
Here is one of those templates/classes:
Tissue Specimen Collection:
Collection Method
Site from which the specimen was taken
Date on which the specimen was taken
…..
• Each BRIDG class has its own defined set of attributes (placeholders) … these give us a way of documenting the detail
• For example, the Specimen Collection class includes attributes for the collection method, the site from which the specimen was taken, the date on which the specimen was taken … plus another 14 attributes
• We choose the attributes we want to use for a given piece of information
• Sometimes we associate these attributes with a specific codelist (or even give the attribute a value)
• Intention is that there are attributes for ALL the information we might want to record
What does that give us?
Here is one of those templates/classes:
Tissue Specimen Collection:
Collection Method
Site from which the specimen was taken
Date on which the specimen was taken
…..
• Key thing here is that every time you copy a particular template, and make it specific to your situation, you choose from the same set of attributes
• In some cases, you may wish to associate a specific codelist with all your uses of a specific attribute e.g. the anatomic location attribute
• This “copy, choose and make specific” process makes it easier for both computers and humans, as this enforces a consistent approach
But this is still not enough!
Here is one of those templates/classes:
Tissue Specimen Collection:
Collection Method CD
Site from which the specimen was taken CD
Date on which the specimen was taken IVL<TS.DATETIME>
…..
• We need to know what sort of information we have: Is Collection Method text? Or is it coded? Is it in English?
• We need datatypes to answer these questions
• BRIDG uses an ISO standard (ISO21090) for the datatypes
• These datatypes are complex – not like SAS datatypes (character, numeric, date etc)
• Examples are shown in black in the diagram: CD and IVL<TS.DATETIME>
• Each datatype has a number of attributes … it is through these attributes that we get down to the variable level
So what do these datatypes do for us?
Here is one of these datatypes:
Selected attributes of the CD datatype:
nullFlavor : NullFlavor, <used if original text cannot be coded>
code : characterstring, CODE
codeSystem : characterstring,
codeSystemName : characterstring,
codeSystemVersion : characterstring,
valueSet : characterstring, CODELIST
valueSetVersion : characterstring,
displayName : ST, DECODE
originalText : ED <original text>
The CD datatype is for coded information (though CD stands for “Concept Descriptor”)
You can see that it has all the attributes you need for coded data
You can also see that some of the datatype attributes are themselves datatypes (e.g. originalText is of datatype ED). So attributes of a datatype can have attributes too … we have to “decompose” all these levels to get down to what we know as variables.
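The “decompose” step can be sketched in Python. This is a simplified illustration, not the ISO21090 definition: the CD class below carries only a few of the attributes listed above, ST and ED are each reduced to a single value attribute, and the function decompose and the dotted path notation are hypothetical. The example values are the Category Code values from the haemoglobin slide.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional


@dataclass
class ST:
    """Character string datatype, reduced to its value (simplified)."""
    value: str


@dataclass
class ED:
    """Encapsulated data datatype, reduced to a text value (simplified)."""
    value: str


@dataclass
class CD:
    """Coded datatype with a subset of the attributes shown on the slide."""
    code: Optional[str] = None
    codeSystem: Optional[str] = None
    valueSet: Optional[str] = None
    displayName: Optional[ST] = None
    originalText: Optional[ED] = None


def decompose(obj, prefix):
    """Flatten nested datatypes into variable-level paths, skipping unused attributes."""
    flat = {}
    for attribute in fields(obj):
        value = getattr(obj, attribute.name)
        if value is None:
            continue
        path = f"{prefix}.{attribute.name}"
        if is_dataclass(value):
            # The attribute is itself a datatype, so go down another level.
            flat.update(decompose(value, path))
        else:
            flat[path] = value
    return flat


category = CD(code="HAEM", displayName=ST("Haematology"))
print(decompose(category, "CategoryCode"))
```

Only the attributes that were chosen end up as variables, which mirrors the “choose which bits of the ISO21090 decomposition we need” step earlier in the pack.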
So what do these datatypes do for us?
And here is another of these datatypes:
Selected attributes of the PQ datatype:
originalText : ED.TEXT,
uncertainty : QTY,
uncertaintyType : UncertaintyType,
uncertainRange : IVL(QTY)
value : Decimal, VALUE
codingRationale : CodingRationale,
unit : characterstring, UNIT
The PQ datatype is for physical quantities (things like sodium concentration, systolic blood pressure, number of lesions)
You can see that it has the attributes you need
Why complex datatypes rather than nice simple ones?
• Because we get extra benefit!
• The datatype we use for a physical quantity (e.g. the sodium concentration in the blood) has several component parts including Value and Unit
• When we use a sodium concentration result e.g. in a SAS dataset, we always keep Value and Unit together as the value is meaningless without a unit
• Use of these ISO datatypes gives us the facility to keep these sorts of things together
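A small Python sketch shows what “keeping Value and Unit together” can look like in practice. It is an illustration only. The PQ class here has just the value and unit attributes, rejecting an empty unit is my own design choice rather than something taken from ISO21090, and the function to_variables and the _VALUE/_UNIT naming are hypothetical. The example value is the haemoglobin result from the earlier slide.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PQ:
    """Physical quantity: a value that always travels with its unit (simplified)."""
    value: Decimal
    unit: str

    def __post_init__(self):
        # Design choice for this sketch: refuse a quantity with no unit at all.
        if not self.unit:
            raise ValueError("a physical quantity needs a unit")


def to_variables(name, quantity):
    """Decompose one PQ into the pair of dataset variables it implies."""
    return {f"{name}_VALUE": quantity.value, f"{name}_UNIT": quantity.unit}


haemoglobin = PQ(Decimal("151"), "g/L")
print(to_variables("HAEMOGLOBIN", haemoglobin))
```

Because the pair is one object, code that moves or transforms a result cannot pick up the value and leave the unit behind.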