study data

Upload: chandu-reddy

Post on 23-Feb-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/24/2019 Study Data

    1/40

    STUDY DATA

    TECHNICAL CONFORMANCE GUIDE

    Technical Specifications Document

    This Document is incorporated by reference into the followingGuidance Document(s):

    Guidance for IndustryProviding Regulatory Submissions in Electronic

    Format Standardized Study Data

    For questions regarding this technical specifications document, contact CDER [email protected] CBER [email protected]

    U.S. Department of Health and Human Services

    Food and Drug Administration

    Center for Drug Evaluation and Research (CDER)

    Center for Biologics Evaluation and Research (CBER)

    March 2015

    mailto:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]
  • 7/24/2019 Study Data

    2/40

    STUDY DATATECHNICAL CONFORMANCE GUIDE

    March 2015

  • 7/24/2019 Study Data

    3/40

    Revision History

    Date Version Summary of RevisionsJanuary 2014 1.0 Initial version

    December 2014 2.0 Revisions based on the public comment period(February 2014 May 2014); and CDER/CBERinternal review May 2014 December 2014

    March 2015 2.1 Revisions based on a comments received toversion 2.0. Updates to Sections 2.2 Study DataReviewers Guide (SDRG), 2.3 Analysis DataReviewers Guide (ADRG), 3.3.1 SAS TransportFormat, 3.3.2 Dataset Size and a revision of

    Section 4.1.4.5 Data Definition File

  • 7/24/2019 Study Data

    4/40

    Contains Nonbinding Recommendations

    i

    Table of Contents

    1. Introduction ....................................................................................................... 1

    1.1 Background ............................................................................................................ 1

    1.2 Purpose ................................................................................................................... 1

    1.3 Document Revision and Control .......................................................................... 2

    1.4 Organization and Summary of the Guide ........................................................... 2

    1.5 Relationship to Other Documents ........................................................................ 3

    2. Planning and Providing Standardized Study Data ........................................ 4

    2.1 Study Data Standardization Plan ........................................................................ 4

    2.2 Study Data Reviewers Guide .............................................................................. 4

    2.3 Analysis Data Reviewers Guide .......................................................................... 5

    3. Exchange Format Electronic Submissions ................................................... 5

    3.1 Extensible Mark-up Language ............................................................................. 5

    3.2 Portable Document Format .................................................................................. 6

    3.3 File Transport Format .......................................................................................... 6

    3.3.1 SAS Transport Format ......................................................................................6

    3.3.2 Dataset Size .......................................................................................................6

    3.3.3 Dataset Column Length ....................................................................................7

    3.3.4 Variable and Dataset Descriptor Length ...........................................................7

    3.3.5 Special Characters: Variables and Datasets .....................................................7

    3.3.6 Variable and Dataset Names .............................................................................7

    3.3.7 Variable and Dataset Labels .............................................................................7

    4. Study Data Submission Format Clinical and Nonclinical .......................... 8

    4.1 Clinical Data Interchange Standards Consortium ............................................. 8

    4.1.1 Study Data Tabulation Model ...........................................................................8

    4.1.1.1 Definition ............................................................................................................. 84.1.1.2 SDTM General Considerations ........................................................................... 8

    4.1.1.3 SDTM Domain Specifications .......................................................................... 10

    4.1.2 Analysis Data Model .......................................................................................11

    4.1.2.1 Definition ........................................................................................................... 11

    4.1.2.2 General Considerations ...................................................................................... 11

  • 7/24/2019 Study Data

    5/40

    Contains Nonbinding Recommendations

    ii

    4.1.2.3 Key Efficacy and Safety Variables .................................................................... 12

    4.1.2.4 Timing Variables ............................................................................................... 12

    4.1.2.5 Core Variables ................................................................................................... 12

    4.1.2.6 Dates .................................................................................................................. 134.1.2.7 Labels................................................................................................................. 13

    4.1.2.8 Software Programs ............................................................................................ 13

    4.1.2.9 ADaM Domain Specifications........................................................................... 13

    4.1.2.9.1 Analysis Data Subject Level .......................................................................... 13

    4.1.2.9.2 Imputed Data .................................................................................................. 14

    4.1.3 Standard for Exchange of Nonclinical Data ...................................................14

    4.1.3.1 Definition ........................................................................................................... 14

    4.1.3.2 General Considerations ...................................................................................... 14

    4.1.3.3 SEND Domain Specification ............................................................................. 14

    4.1.4 General Considerations: SDTM, SEND, ADaM ............................................15

    4.1.4.1 Variables: Required, Expected, and Permissible .............................................. 15

    4.1.4.2 Dates .................................................................................................................. 16

    4.1.4.3 Naming Conventions ......................................................................................... 16

    4.1.4.4 SDTM / SEND / ADaM Versions ..................................................................... 16

    4.1.4.5 Data Definition File ........................................................................................... 16

    4.1.4.6 Annotated Case Report Form (aCRF) ............................................................... 17

    5. Therapeutic Area Standards .......................................................................... 17

    6. Terminology ..................................................................................................... 18

    6.1 General ................................................................................................................. 18

    6.1.1 Controlled Terminologies ...............................................................................18

    6.1.2 Use of Controlled Terminologies....................................................................19

    6.1.3 Maintenance of Controlled Terminologies .....................................................196.2 CDISC Controlled Terminology ........................................................................ 20

    6.3 Adverse Events .................................................................................................... 20

    6.3.1 MedDRA .........................................................................................................20

    6.3.1.1 General Considerations ...................................................................................... 20

    6.4 Medications .......................................................................................................... 21

  • 7/24/2019 Study Data

    6/40

    Contains Nonbinding Recommendations

    iii

    6.4.1 FDA Unique Ingredient Identifier ..................................................................21

    6.4.1.1 General Considerations ...................................................................................... 21

    6.4.2 WHO Drug Dictionary ....................................................................................21

    6.4.2.1 General Considerations ...................................................................................... 216.5 Pharmacologic Class ........................................................................................... 22

    6.5.1 National Drug File -- Reference Terminology................................................22

    6.5.1.1 General Considerations ...................................................................................... 22

    6.6 Indication ............................................................................................................. 22

    6.6.1 SNOMED CT..................................................................................................22

    6.6.1.1 General Considerations ...................................................................................... 22

    7. Electronic Submission Format ....................................................................... 23

    8. Data Validation and Traceability ................................................................... 26

    8.1 Definition of Data Validation ............................................................................. 26

    8.2 Study Data Validation Rules .............................................................................. 26

    8.2.1 Types of Data Validation Rules ......................................................................26

    8.2.2 Support on Data Validation Rules ..................................................................27

    8.3 Study Data Traceability ...................................................................................... 27

    8.3.1 Overview .........................................................................................................27

    8.3.2 Legacy Study Data Conversion to Standardized Study Data..........................28

    8.3.2.1 Traceability Issues with Legacy Data Conversion ............................................ 28

    8.3.2.2 Legacy Data Conversion Plan and Report ......................................................... 29

    Appendix: Data Standards and Interoperable Data Exchange ................................. 31

    Glossary .............................................................................................................................. 34

  • 7/24/2019 Study Data

    7/40

    Contains Nonbinding Recommendations

    1

    STUDY DATA

    TECHNICAL CONFORMANCE GUIDE

    This technical specifications document represents the Food and DrugAdministration's (FDA's) current thinking on this topic. It does not create or conferany rights for or on any person and does not operate to bind FDA or the public. Youcan use an alternative approach if the approach satisfies the requirements of theapplicable statutes and regulations. If you want to discuss an alternative approach,contact the FDA staff responsible for implementing this guidance. If you cannotidentify the appropriate FDA staff, send an email to [email protected] [email protected].

    1. Introduction

    1.1Background

    This Study Data Technical Conformance Guide (Guide) provides specifications,recommendations, and general considerations on how to submit standardized studydata using FDA-supported1data standards located in the Data Standards Catalog(Standards Catalog).2The Guide supplements the guidance for industry ProvidingRegulatory Submissions in Electronic Format Standardized Study Data(eStudyData). The eStudy Data guidance will implement the electronic submissionrequirements of section 745A(a) of the FD&C Act with respect to standardized studydata contained in certain investigational new drug applications (INDs), new drugapplications (NDAs); abbreviated new drug applications (ANDAs); and certainbiologics license applications (BLAs) that are submitted to the Center for DrugEvaluation and Research (CDER) or the Center for Biologics Evaluation and

    Research (CBER).3

    1.2 Purpose

    This Guide provides technical recommendations to sponsors4for the submission ofanimal and human study data and related information in a standardized electronicformat in INDs, NDAs, ANDAs, and BLAs. The Guide is intended to complementand promote interactions between sponsors and FDA review divisions. However, it isnot intended to replace the need for sponsors to communicate directly with reviewdivisions regarding implementation approaches or issues relating to data standards.To better understand why the FDA is now emphasizing the submission ofstandardized data for all studies, please refer to the Appendix.

    1For the purposes of this document, supported means the receiving Center has established processes andtechnology to support receiving, processing, reviewing, and archiving files in the specified file format.2Available athttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm.3See Providing Regulatory Submissions in Electronic Format Standardized Study Data(section II.A)available athttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm.4For the purposes of this document, the term sponsor refers to both sponsors and applicants who aresubmitting study data to the Agency.

    http://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm
  • 7/24/2019 Study Data

    8/40

    Contains Nonbinding Recommendations

    2

    Because of the inherent variability across studies and applications, it is difficult toidentify all data needed by a review division prior to a scientific regulatory review.We recommend that as early as the pre-IND meeting, sponsors should use theestablished regulatory process to discuss with the review division the key data

    necessary to support a submission, the data elements that should be included in eachdataset, and the organization of the data within the datasets.

    Some data standards may not require the use of all defined data elements to becollected in any given study. For example, the Study Data Tabulation ModelImplementation Guide (SDTMIG)5classifies variables as required, expected, orpermissible. Whatdata are collected and submitted is a decision that should be madebased on scientific reasons, regulation requirements, and discussions with the reviewdivision. However, all study-specific data necessary to evaluate the safety andefficacy of the medical product should be submitted in conformance with thestandards currently supported by FDA and listed in the Standards Catalog.

    If there is a question regarding a specific submission or a particular data standardimplementation, the sponsor should contact the review division for specificsubmission questions or the appropriate contact for data standards issues ([email protected] or [email protected]).

    This Guide supersedes all previous Study Data Specifications documents (Versions1.0 - 2.0) and CDER Study Data Common Issues Documents (Versions 1.0 -1.1).

    1.3 Document Revision and Control

    FDA intends to post updated versions of the Guide to the Study Data Standards

    Resources Web page (Standards Web page)6

    followed by Federal Registernoticesannouncing updated versions. The revision history page of the Guide will containsufficient information on the changes made by section.

    1.4Organization and Summary of the Guide

    This document is organized as follows:

    Section 1: Introduction provides information on regulatory policy and guidancebackground, purpose, and document control.

    Section 2: Planning and Providing Standardized Study Data recommends andprovides details on preparing an overall study data standardization plan, a

    study data reviewers guide and an analysis data reviewers guide.

    5Seehttp://www.cdisc.org.6The Standards Web page can be accessed athttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm.

    http://www.cdisc.org/http://www.cdisc.org/http://www.cdisc.org/http://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.cdisc.org/
  • 7/24/2019 Study Data

    9/40

    Contains Nonbinding Recommendations

    3

    Section 3: Exchange Format - Electronic Submissions presents thespecifications, considerations, and recommendations for the file formatscurrently supported by FDA.

    Section 4: Study Data Submission Format: Clinical and Nonclinical presents

    general considerations and specifications for sponsors using, for example,the following standards for the submission of study data: Study DataTabulation Model (SDTM), Analysis Data Model (ADaM), and Standardfor Exchange of Nonclinical Data (SEND).

    Section 5: Therapeutic Area Standards presents supplemental considerations andspecific recommendations when sponsors submit study data using FDA-supported therapeutic area standards (TA).

    Section 6: Terminology presents general considerations and specificrecommendations when using controlled terminologies/vocabularies forclinical trial data.

    Section 7: Electronic Submission Format provides specifications andrecommendations on submitting study data using the electronic CommonTechnical Document (eCTD) format.

    Section 8: Data Validation and Traceability provides general recommendationson conformance to standards, data validation rules, data traceabilityexpectations, and legacy data conversion.

    1.5Relationship to Other Documents

    This Guide integrates and updates information discussed previously in theStudy DataSpecifications and the CDER Common Data Standards Issues documents.7As noted

    above, this Guide supersedes all previous Study Data Specifications documents(Versions 1.0 - 2.0) and CDER Study Data Common Issues Documents (Versions 1.0-1.1). The examples of issues and concerns discussed in the Guide are intended asexamples only of common issues, and not an inclusive list of all possible issues.

    This Guide is incorporated by reference into the Guidance to Industry ProvidingRegulatory Submissions in Electronic Format: Standardized Study Data. In addition,sponsors should reference the following:

    FDA Study Data Standards Resources Web page (See section 1.1) FDA Data Standards Catalog (See section 1.1)

    FDA Portable Document Format Specifications (See section 3.2)

    7Seehttp://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/ucm248635.htm.

    http://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/ucm248635.htmhttp://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/ucm248635.htmhttp://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/ucm248635.htmhttp://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/ucm248635.htmhttp://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/ucm248635.htm
  • 7/24/2019 Study Data

    10/40

    Contains Nonbinding Recommendations

    4

    Guidance to Industry Providing Regulatory Submissions in Electronic Format:

    Submissions Under Section 745A(a) of the Federal Food, Drug, and Cosmetic

    Act8

    Guidance to Industry Providing Regulatory Submissions in ElectronicFormat: Certain Human Pharmaceutical Product Applications and Related

    Submissions Using the Electronic Common Technical DocumentSpecifications

    9

    2.

    Planning and Providing Standardized Study Data

    2.1 Study Data Standardization Plan

    For clinical and nonclinical studies, sponsors should include a plan (e.g., in the IND)describing the submission of standardized study data to FDA. The Study DataStandardization Plan (StandardizationPlan) assists FDA in identifying potential datastandardization issues early in the development program. Sponsors may also initiatediscussions at the pre-IND stage. For INDs, the Standardization Planshould be

    located in the general investigational plan. The Standardization Planshould include,but is not limited to the following:

    1. List of the planned studies2. Type of studies (e.g., phase I, II or III)3. Study designs (e.g., parallel, cross-over, open-label extension)4. Planned data standards, formats, and terminologies and their versions or a

    justification of studies that may not conform to the currently supportedstandards

    The Standardization Planshould be updated in subsequent communications with

    FDA as the development program expands and additional studies are planned.Updates to the Standardization Planshould not be communicated each time a study isstarted. The cover letter accompanying a study data submission should describe theextent to which the latest version of the Standardization Planwas executed.

    2.2 Study Data Reviewers Guide

    The preparation of a Study Data Reviewers Guide (SDRG)10,is recommended as anintegral part of a standards-compliant study data submission. The SDRGshoulddescribe any special considerations or directions that may facilitate an FDAreviewer's use of the submitted data and may help thereviewer understand therelationships between the study report and the data.11 The SDRG should include, butis not limited to the following:

    8http://www.fda.gov/downloads/Drugs/Guidances/UCM384686.pdf9 http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM333969.pdf 10A specific template for a Study Data Reviewers Guide is not specified. However, an example of a StudyData Reviewers Guide (template, completion guidelines and examples) can be found athttp://www.phusewiki.org/wiki/index.php?title=Study_Data_Reviewer's_Guide.11For submissions to CBER, sponsors and applicants should continue to provide the Data Interpretationand Validation Report (DIVR). The DIVR can be incorporated into the Study Data Reviewers Guide. The

    http://www.fda.gov/downloads/Drugs/Guidances/UCM384686.pdfhttp://www.fda.gov/downloads/Drugs/Guidances/UCM384686.pdfhttp://www.fda.gov/downloads/Drugs/Guidances/UCM384686.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM333969.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM333969.pdfhttp://www.phusewiki.org/wiki/index.php?title=Study_Data_Reviewer%27s_Guidehttp://www.phusewiki.org/wiki/index.php?title=Study_Data_Reviewer%27s_Guidehttp://www.phusewiki.org/wiki/index.php?title=Study_Data_Reviewer%27s_Guidehttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM333969.pdfhttp://www.fda.gov/downloads/Drugs/Guidances/UCM384686.pdf
  • 7/24/2019 Study Data

    11/40

    Contains Nonbinding Recommendations

    5

    1. Study protocol title, number, and version2. Study design3. Standards, formats, and terminologies and their versions4. Description of study datasets

    5.

    Data standards validation rules, versions, and issues6. Description of all sponsor decisions related to data standard implementations

    The SDRG for nonclinical and clinical studies should be placed with the study data inModule 4 and 5, respectively, of the Electronic Common Technical Document(eCTD).12

    2.3Analysis Data Reviewers Guide

    The preparation of an Analysis Data Reviewers Guide (ADRG)13is recommended asan important part of a standards-compliant analysis data submission. The ADRGprovides FDA reviewers with context for analysis datasets and terminology, receivedas part of a regulatory product submission, additional to what is presented within thedata definition file (i.e., define.xml). The ADRG also provides a summary of ADaMconformance findings. The ADRG purposefully duplicates limited information foundin other submission documentation (e.g., the protocol, statistical analysis plan,clinical study report, define.xml) in order to provide FDA reviewers with a singlepoint of orientation to the analysis datasets. It should be noted that the submission ofan ADRG does not eliminate the requirement to submit a complete and informativedefine.xml file corresponding to the analysis datasets.

    TheADRG for a clinical study should be placed with the analysis data in Module 5 ofthe Electronic Common Technical Document (eCTD).

    3. Exchange Format Electronic Submissions

    3.1 Extensible Mark-up Language

    Extensible Mark-up Language (XML), as defined by the World Wide WebConsortium (W3C), specifies a set of rules forencoding documents in a format that isboth human-readable and machine-readable.14,15 XMLs primary purpose is tofacilitate the sharing of structured data across different information systems. AnXML use case is CDISCs define.xml file. All XML files should use .xml as the file

    DIVR can be found athttp://www.fda.gov/BiologicsBloodVaccines/DevelopmentApprovalProcess/ucm209137.htm12The Study Data Reviewers Guides are separate from an overall reviewers guide that may be placed inModule 1 of the eCTD.13A specific template for an Analysis Data Reviewers Guide is not specified. However, an example of anAnalysis Data Reviewers Guide (template, completion guidelines and examples) can be found athttp://www.phusewiki.org/wiki/index.php?title=Analysis_Data_Reviewer's_Guide.14Seehttp://en.wikipedia.org/wiki/XML.15Seehttp://www.w3.org/XML/.

    http://www.fda.gov/BiologicsBloodVaccines/DevelopmentApprovalProcess/ucm209137.htmhttp://www.fda.gov/BiologicsBloodVaccines/DevelopmentApprovalProcess/ucm209137.htmhttp://www.phusewiki.org/wiki/index.php?title=Analysis_Data_Reviewer%27s_Guidehttp://www.phusewiki.org/wiki/index.php?title=Analysis_Data_Reviewer%27s_Guidehttp://en.wikipedia.org/wiki/XMLhttp://en.wikipedia.org/wiki/XMLhttp://en.wikipedia.org/wiki/XMLhttp://www.w3.org/XML/http://www.w3.org/XML/http://www.w3.org/XML/http://www.w3.org/XML/http://en.wikipedia.org/wiki/XMLhttp://www.phusewiki.org/wiki/index.php?title=Analysis_Data_Reviewer%27s_Guidehttp://www.fda.gov/BiologicsBloodVaccines/DevelopmentApprovalProcess/ucm209137.htm
  • 7/24/2019 Study Data

    12/40

  • 7/24/2019 Study Data

    13/40

    Contains Nonbinding Recommendations

    7

    datasets, to better support regulatory reviewers. The split datasets should be placed ina separate sub-directory labeled split (See section 7). Clear explanation regardinghow these datasets were split needs to be presented within the relevant datareviewers guide (i.e., SDRG or ADRG).

    3.3.3 Dataset Column Length

    The allotted length for each column containing character (text) data should be set tothe maximum length of the variable used across all datasets in the study. This willsignificantly reduce file sizes. For example, if USUBJID has a maximum length of18, the USUBJIDs column size should be set to 18, not 200.

    3.3.4 Variable and Dataset Descriptor Length

    The length of variable names, descriptive labels, and dataset labels should not exceedthe maximum permissible number of characters described below.

    Table 1: Maximum Length of Variables and Dataset Elements

    Element Maximum Length in Characters

    Variable Name 8

    Variable Descriptive Label 40

    Dataset Label 40

    3.3.5 Special Characters: Variables and Datasets

    Variable names, as well as variable and dataset labels should include American

    Standard Code for Information Interchange (ASCII) text codes only.

    3.3.6 Variable and Dataset Names

    Variable and dataset names should not contain punctuation, dashes, spaces, or othernon-alphanumeric symbols. In addition, the variable and dataset names should notcontain special characters, including:

    \ / * ,? < > | : % # + ( ) { } [ ]

    3.3.7

    Variable and Dataset LabelsVariable and dataset labels can include punctuation characters. However, specialcharacters should not be provided, such as,

    1. Unbalanced apostrophe, e.g., Parkinson's.2. Unbalanced single and double quotation marks.3. Unbalanced parentheses, braces or brackets, e.g., (, {and [.4. greater-than sign.

  • 7/24/2019 Study Data

    14/40

    Contains Nonbinding Recommendations

    8

    4. Study Data Submission Format Clinical and Nonclinical

    4.1Clinical Data Interchange Standards Consortium

    Clinical Data Interchange Standards Consortium(CDISC) is an open,multidisciplinary, neutral, nonprofit standards development organization (SDO) that

    has been working through consensus-based collaborative teams to develop global datastandards for clinical and nonclinical research.21

    Data format specifications for the tabulation datasets of clinical and nonclinicaltoxicology studies are provided by SDTM and SEND, respectively. ADaM providesthe clinical data format specifications for analysis datasets. Currently, ADaMspecifications for SEND have not been developed. As noted in Section 1.1, theStandards Catalogprovides a listing of the currently supported data standards withlinks to reference materials.

    Although the SDTM and SEND formats facilitate review of the data, they do not

    always provide the data structured in a way that supports all analyses needed forreview. Analysis files are critical for FDA to understand, on a per subject basis, howthe specific analyses contained in the study report have been created. Therefore,sponsors should supplement the SDTM with ADaM analysis datasets as describedbelow.

    There may be instances in which current implementation guides (e.g., SDTMIG,SENDIG) do not provide specific instruction as to how certain study data should berepresented. In these instances, sponsors should discuss their proposed solution withthe review division and submit supporting documentation that describes thesedecisions or solutions in the SDRGat the time of submission.

    4.1.1 Study Data Tabulation Model

    4.1.1.1 Definition

    The Study Data Tabulation Model (SDTM) defines a standard structure for humanclinical trials tabulation datasets.

    4.1.1.2 SDTM General Considerations

    It is recommended that sponsors implement the SDTM standard for representation ofclinical trial tabulation data prior to the conduct of the study. The use of case reportforms that incorporate SDTM standard data elements (e.g., Clinical Data Acquisition

    Standards Harmonization (CDASH)) allows for a simplified process for the creationof SDTM domains.

    The SDTMIG should be followed unless otherwise indicated in this Guide or in theStandards Catalog. The conformance criteria listed in the SDTMIG should not be

    21Seehttp://www.cdisc.org.

    http://www.cdisc.org/http://www.cdisc.org/http://www.cdisc.org/http://www.cdisc.org/
  • 7/24/2019 Study Data

    15/40

    Contains Nonbinding Recommendations

    9

    interpreted as the sole determinant of the adequacy of submitted data. If there isuncertainty regarding implementation, the sponsor should discuss application-specificquestions with the review division and general standards implementation questionswith the specific center resources identified elsewhere in this Guide (See section 1.2).When data imputation is utilized, sponsors should submit imputed data in an analysis

    dataset, and the relevant supporting documentation (e.g., ADRG, define.xml)explaining the imputation methods.

    Except for variables that are defined in the SDTMIG as being coded, no numericallycoded variables should typically be submitted as part of the SDTM datasets. Numericvalues generated from validated scoring instruments or questionnaires do notrepresent codes, and therefore have no relevance for this issue. There may be specialinstances when codes are preferred, hence sponsors should refer to the reviewdivision for direction, if there are any questions.

    Subject Identifier (SUBJID)

    The SUBJID is an ID of the entity (i.e., person) that participates in a trial. If the samesubject is screened more than once in a trial, then the subjects SUBJID should bedifferent.

    Unique Subject Identifier (USUBJID)Each individual subject should be assigned a single unique identifier across the entireapplication. This is in addition to the subject ID (SUBJID) used to identify subjectsin each study and its corresponding study report. An individual subject should havethe exact same unique identifier across all datasets, including between SDTM andADaM datasets. Subjects that participate in more than one study should maintain thesame USUBJID across all studies. It is important to follow this convention to enable

    pooling of a single subjects data across studies (e.g., a randomized control trial andan extension study).

    Sponsors should not add leading or trailing spaces to the USUBJID variable in anydataset. For example, applications have been previously submitted in which theUSUBJID variable for each individual subject appeared to be the same acrossdatasets; however, in certain datasets, the actual entry had leading zeros added, orzeros added elsewhere in the entry. This does not allow for machine-readablematching of individual subject data across all datasets. Improper implementation ofthe USUBJID variable is a common error with applications and often requiressponsors to re-submit their data.

    Adjudication DataThere are no existing standards or best practices for the representation of adjudicationdata as part of a standard data submission. Until standards for adjudication data aredeveloped, it is advised that sponsors discuss their proposed approach with the reviewdivision and also include details about the presence, implementation approach, andlocation of adjudication data in the SDRG.

  • 7/24/2019 Study Data

    16/40

    Contains Nonbinding Recommendations

    10

    4.1.1.3 SDTM Domain Specifications

    SUPPQUAL (Supplemental Qualifier)A SUPPQUAL dataset is a special SDTM dataset that contains non-standardvariables which cannot be represented in the existing SDTM domains. SUPPQUALshould be used only when key data cannot be represented in SDTM domains. Ingeneral, variables used to support key analyses should not be represented inSUPPQUAL. Discussion with the review division should occur if the sponsor intendsto include important variables (e.g., that support key analyses) in SUPPQUALdatasets, and reflected in the SDRG.

    DM Domain (Demographics)In the DM domain, each subject should have only one single record per study.

    Screen failures, when provided, should be included as a record in DM with the ARMfield left blank. For subjects who are randomized in treatment group but not treated,the planned arm variables (ARM and ARMCD) should be populated, but actual

    treatment arm variables (ACTARM and ACTARMCD) should be left blank.22

    DS Domain (Disposition)When there is more than one disposition event, the EPOCH variable should be usedto aid in distinguishing between them. This will allow identification of the EPOCHin which each event occurred. If a death of any type occurs, it should be the lastrecord and should include its associated EPOCH. It is expected that EPOCH variablevalues will be determined based on the trial design and thus should be defined clearlyand documented in the define.xml.

    SE Domain (Subject Elements)

    The Subject Elements domain should be included to aid in the association of subjectdata (e.g., findings, events, and interventions) with the study element in which theyoccurred.

    AE Domain (Adverse Events (AE))Currently, there is no variable in the AE domain that indicates if an AE wastreatment-emergent. The AE domain should include all adverse events that wererecorded in the subjects case report forms, regardless of whether the sponsordetermined that particular events were or were not treatment-emergent.

    The entry of a Y for the serious adverse event variable, AESER, should have the

    assessment indicated, (e.g., as a death, hospitalization, or disability/permanentdamage). Frequently, sponsors omit the assessment information, even when it hasbeen collected on the CRF. The criteria that led to the determination should be

    22Although this convention is inconsistent with the SDTMIG, FDA recommends its use so that ScreenFailure is not specified as a treatment arm.

  • 7/24/2019 Study Data

    17/40

    Contains Nonbinding Recommendations

    11

    provided. This information is critical during FDA review to support thecharacterization of serious AEs.

    Custom DomainsThe SDTMIG permits the creation of custom domains if the data do not fit into an

    existing domain. Prior to creating a custom domain, sponsors should confirm that thedata do not fit into an existing domain. If it is necessary to create custom domains,sponsors should follow the recommendations in the STDMIG. In addition, sponsorsshould present their implementation approach in the SDRG.

    LB Domain (Laboratory)The size of the LB domain dataset submitted by sponsors is often too large to process(See section 3.3.2). This issue can be addressed by splitting a large LB dataset intosmaller datasets according to LBCAT and LBSCAT, using LBCAT for initialsplitting. If the size is still too large, then use LBSCAT for further splitting. Forexample, use the dataset name lb1.xpt for chemistry, lb2.xpt for hematology, and

    lb3.xpt for urinalysis. Splitting the dataset in other ways (e.g., by subject or file size)makes the data less useable. Sponsors should submit these smaller files in addition tothe larger non-split standard LB domain file. Sponsors should submit the split files ina separate sub-directory/split that is clearly documented in addition to the non-splitstandard LB domain file in the SDTM datasets directory (See section 7).

    Trial DesignTrial Design datasets provide a way to describe the planned conduct of a clinical trialand should be included in SDTM submissions.

    4.1.2 Analysis Data Model

    4.1.2.1

    Definition

    Specifications for analysis datasets for human drug product clinical studies areprovided by the Analysis Data Model (ADaM) and its implementation by theADaMIG. Analysis datasets should be used to create and to support the results inclinical study reports, Integrated Summaries of Safety (ISS), and IntegratedSummaries of Efficacy (ISE), as well as other analyses required for a thoroughregulatory review. Analysis datasets can contain imputed data or data derived fromSDTM datasets.

    4.1.2.2 General Considerations

    Generally, ADaM analysis datasets facilitate FDA review. However, it does notalways provide data structured in a way that supports all of the analyses that shouldbe submitted for review. For example, ADaM does not support simultaneous analysisof multiple dependent variables or correlation analysis across several responsevariables. Therefore, sponsors should, as needed, supplement their ADaMdatasets after discussions with the specific review division.

  • 7/24/2019 Study Data

    18/40

    Contains Nonbinding Recommendations

    12

    One of the expected benefits of analysis datasets that conform to ADaM is that theysimplify the programming steps necessary for performing an analysis. As notedabove, ADaM datasets should be derived from the data contained in the SDTMdatasets. There are features built into the ADaM standard that promote traceabilityfrom analysis results to ADaM datasets and from ADaM datasets to SDTM. To

    ensure traceability, all SDTM variables utilized for variable derivations in ADaMshould be included in the ADaM datasets. Each analysis dataset that is submittedshould be described accordingly with adequate metadata in the define.xml file.

    4.1.2.3 Key Efficacy and Safety Variables

    Sponsors should submit ADaM datasets to support key efficacy and safety analyses.At least one dataset should be referenced in the define file as containing the primaryefficacy variables. Further, primary and secondary variables and their derivations (asapplicable) should be provided, as well as documented in the define file.

    4.1.2.4 Timing Variables

    A variable for relative day of measurement or event, along with timing variables forvisit should be included when an analysis dataset contains multiple records persubject (i.e., repeated measures data). In addition to a protocol-scheduled visitvariable, sponsors should include at least two additional timing variables: a charactervariable describing the visit (e.g., WEEK 8) and a corresponding numeric variable(e.g., 8). These two variables may represent measures of real time fromrandomization. The reason for this request is related to a common analysis (i.e., theportrayal of data over the duration of a study). These data are often presented asmeans or medians by treatment group and by time on study treatment. The timeon study treatment variable is defined by windowed visits that might be representedin ADaM datasets as AVISIT (a character variable) and its numeric analog AVISITN.AVISIT / AVISITN are adequate for sub-setting the data. However, in certaincircumstances, such as extension studies, AVISIT / AVISITN are not an adequatemeasure of real time on study treatment, and so it is not sufficient for plotting datawhere the x-axis measures real time.

    4.1.2.5 Core Variables

    Core variables, including all covariates presented in the study protocol, should belisted after the key variables (USUBJID and visit) and included in each ADaMdataset. These core variables are typically included in the Analysis Data SubjectLevel (ADSL) dataset (See section 4.1.2.9.1). Examples of key core variables

    include study/protocol number, center/site number, geographic region, country,treatment assignment information, sex, age, race, analysis population flags (e.g.,Intent-to-Treat (ITT), Full Analysis Set (FAS), Safety, Per-Protocol (PP)), and otherimportant baseline demographic variables. Note that all variables that contain codeddata should be accompanied by a variable that provides the decoded information.

    In addition, it is important to note that SDTM datasets do not have core variables(such as demographic and population variables) repeated across the different

  • 7/24/2019 Study Data

    19/40

    Contains Nonbinding Recommendations

    13

    domains. The duplication of core variables across various domains can be fulfilledthrough their inclusion in the corresponding analysis datasets. For example, theSDTM adverse event dataset does not allow for the inclusion of variables such astreatment arm, sex, age, or race. These and other variables should be included in anadverse event analysis dataset.

    4.1.2.6 Dates

    Dates should be formatted as numeric in the analysis datasets. In addition toincluding dates formatted as International Standards Organization (ISO) 8601,corresponding numeric dates should be included in the analysis datasets. The specificdate of reference used to calculate numeric dates varies by software; hence this dateof reference should be specified within define.xml. In the event of partial dates,imputation should be performed and appropriate corresponding ADaM imputationflags utilized.

    4.1.2.7 Labels

    Each dataset should be described by an internal label that is shown in the define.xmlfile. The label names of analysis datasets should be different from those of theSDTM datasets. For example, the SDTM adverse event dataset (AE) and the analysisadverse event dataset (e.g., ADAE) should not share the exact same dataset label,such as Adverse Events.

    4.1.2.8 Software Programs

    Sponsors should provide the software programs used to create ADaM datasets andcorresponding tables and figures to help reviewers to better understand how thedatasets, tables and figures were created. Any submitted programs (scripts) generatedby an analysis tool should be provided as ASCII text files or PDF files and shouldinclude sufficient documentation to allow a reviewer to understand the submittedprograms. If the submitted programs created by the analysis tool uses a file extensionother than .txt or .pdf, then the file name should include the native file extension.

    4.1.2.9 ADaM Domain Specifications

    4.1.2.9.1 Analysis Data Subject Level

    Analysis Data Subject Level (ADSL) is the subject-level analysis dataset for ADaM.All submissions containing standard data should contain an ADSL file for each study.In addition to the variables specified for ADSL in the ADaMIG such as thosepreviously listed in the core variables section (See section 4.1.2.5) above, the sponsorshould include multiple additional variables representing various important baselinesubject characteristics / covariates presented in the study protocol. Some examples ofbaseline characteristics / covariates include, but are not limited to, disease severityscores such as Acute Physiology and Chronic Health Evaluation (APACHE) scores,23baseline organ function measurements such as calculated creatinine clearance or

    23Knaus WA, Draper EA, Wagner DP, Zimmerman JE (1985). APACHE II: a severity of diseaseclassification system. Critical Care Medicine, 13 (10): 818829.29.

  • 7/24/2019 Study Data

    20/40

    Contains Nonbinding Recommendations

    14

    Forced Expiratory Volume in 1 second (FEV1); range categories for continuousvariables; and numeric date variables in non-ISO formats.

    4.1.2.9.2 Imputed Data

    When data imputation is utilized, sponsors should submit imputed data in an analysisdataset, and the relevant supporting documentation (e.g., define.xml and ADRG)explaining the imputation methods.

    4.1.3 Standard for Exchange of Nonclinical Data

    4.1.3.1 Definition

    The Standard for Exchange of Nonclinical Data (SEND) provides the organization,structure, and format of standard nonclinical (animal toxicology studies) tabulationdatasets for regulatory submission. Currently, the SEND Implementation Guide(SENDIG) supports single-dose general toxicology, repeat-dose general toxicology,and carcinogenicity studies.

    4.1.3.2

    General Considerations

    The SENDIG provides specific domain models, assumptions, conformance andbusiness rules, and examples for preparing standard tabulation datasets that are basedon the SDTM. If there is uncertainty regarding SEND implementation, the sponsorshould discuss the issue with the review division.

    The ideal time to implement SEND is prior to the conduct of the study as it is veryimportant that the results presented in the accompanying study report be traceableback to the original data collected.

    4.1.3.3

    SEND Domain SpecificationSUPPQUAL (Supplemental Qualifier)A SUPPQUAL dataset is a special SEND dataset that contains non-standard variableswhich cannot be represented in the existing SEND domains. Discussion with thereview division should occur if the sponsor intends to include important variables(i.e., that support key analyses) in SUPPQUAL datasets and this should be reflectedin the SDRG.

    Currently, SUPPQUAL should be used to capture some collected information (e.g.,pathology modifiers) until the SEND is further refined to adequately represent suchinformation.

    Microscopic Findings (MI) DomainSponsors should ensure that the transformation of findings from MIORRES toMISTRESC closely adheres to the instructions in the SENDIG. Modifiers for whichthere are variables available (e.g. MISEV, MILAT, etc.) should be placedappropriately. There should be no severities (e.g., minimal, mild, etc.) included inMISTRESC. Sponsors should use the VISITDY variable if postmortem findings

  • 7/24/2019 Study Data

    21/40

    Contains Nonbinding Recommendations

    15

    which were intended to be analyzed together were collected across multiple studydays.

    Macroscopic Findings (MA) DomainSponsors should use the VISITDY variable if postmortem findings which were

    intended to be analyzed together were collected across multiple study days.

    Custom DomainsThe SENDIG allows for the creation of custom domains if the data do not fit into anexisting domain.

    Tumor DatasetCarcinogenicity studies should include an electronic dataset of tumor findings toallow for a complete review. At this time sponsors should include a tumor.xpt filewhile following the specification in the SENDIG for its creation regardless ofwhether or not the study is in SEND format (Seewww.cdisc.org/send).

    4.1.4 General Considerations: SDTM, SEND, ADaM

    4.1.4.1 Variables: Required, Expected, and Permissible

    CDISC data standards categorize SDTM and SEND variables as being Required,Expected, and Permissible. In some instances, sponsors have interpreted Permissiblevariables as being optional and, in other cases, sponsors have excluded Expectedvariables. For the purposes of SDTM and SEND submissions, all Required,Expected, and Permissible variables that were collected, plus any variables that areneeded to compute derivations, should be submitted.24

    SDTM datasets should not contain imputed data. FDA recognizes that SDTMcontains certain operationally derived variables that have standard derivations acrossall studies (e.g., --STDY, EPOCH). If the data needed to derive these variables aremissing, then these variables cannot be derived and the values should be null. Thefollowing are examples of some of the Permissible and Expected variables in SDTMand SEND that should be included, if available:

    1. Baseline flags (e.g., last non-missing value prior to first dose) for Laboratoryresults, Vital Signs, ECG, Pharmacokinetic Concentrations, and Microbiologyresults. Currently, for SDTM, baseline flags should be submitted if the data werecollected or can be derived.

    2.

    EPOCH designators. Please follow CDISC guidance for terminology.25 Thevariable EPOCH should be included for clinical subject-level observation (e.g.,adverse events, laboratory, concomitant medications, exposure, vital signs). This

    24See CDISC SDTM Implementation Guide atwww.cdisc.orgfor additional information on variablesreferenced throughout this Guide25

    Seehttp://www.cancer.gov/cancertopics/terminologyresources/page6.

    http://www.cdisc.org/sendhttp://www.cdisc.org/sendhttp://www.cdisc.org/sendhttp://www.cdisc.org/http://www.cdisc.org/http://www.cdisc.org/http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cdisc.org/http://www.cdisc.org/send
  • 7/24/2019 Study Data

    22/40

    Contains Nonbinding Recommendations

    16

    will allow the reviewer to easily determine during which phase of the trial theobservation occurred (e.g., screening, on-therapy, follow-up), as well as the actualintervention the subject experienced during that phase.

    3. Whenever --DTC, --STDTC or --ENDTC, which have the role of timing

    variables, are included, the matching Study Day variables (--DY, --STDY, or --ENDY, respectively) should be included. For example, in most Findingsdomains, --DTC is Expected, which means that --DY should also be included. Inthe Subject Visits domain, SVSTDTC is Required and SVENDTC is Expected;therefore, both SVSTDY and SVENDY should be included.

    4.1.4.2 Dates

    Dates in SDTM and SEND domains should conform to the ISO 8601 format.Examples of how to implement dates are included in the SDTMIG and SENDIG.26

    4.1.4.3 Naming Conventions

    Naming conventions (variable name and label) and variable formats should befollowed as specified in the SDTMIG and SENDIG.

    4.1.4.4 SDTM / SEND / ADaM Versions

    When submitting clinical and nonclinical data, sponsors should not mix versionswithin a study. As noted above, the Standards Cataloglists the versions that aresupported by FDA.

    Conversions to one standardized version should be described in the SDRG, includingthe rationale for the conversion.

    4.1.4.5 Data Definition File

    The data definition file, i.e., define file, describes the metadata of the submittedelectronic datasets, and is considered arguably the most important part of theelectronic dataset submission for regulatory review. This data definition specificationfor submitted datasets defines the metadata structures that should be used to describethe datasets and variables. An insufficiently documented define file is a commondeficiency that reviewers have noted. Consequently, the sponsor needs to provideadequate detail in this file, especially for the derivations pertaining to derivedvariables. In addition, sponsors should also make certain that the code list and originfor each variable are clearly and easily accessible from the define file. The version of

    any external dictionary should be clearly stated both in the define file and, wherepossible, in the updated Trial Summary (TS) domain (SDTMIG 3.1.2 or greater;SENDIG 3.0 or greater). The internal dataset label should also clearly describe thecontents of the dataset. For example, the dataset label for an efficacy dataset might beTime to Relapse (Efficacy).

    26Seehttp://www.cdisc.org

    http://www.cdisc.org/http://www.cdisc.org/http://www.cdisc.org/http://www.cdisc.org/
  • 7/24/2019 Study Data

    23/40

    Contains Nonbinding Recommendations

    17

    Separate define files should be included for each type of electronic datasetsubmission, i.e., a separate define file for the SDTM datasets, a separate define filefor the SEND datasets, and a separate define file for the ADaM datasets. The definefile should be submitted in XML format, i.e., a properly functioning define.xml27. Inaddition to the define.xml, a printable define.pdf should be provided if the define.xml

    cannot be printed28. To confirm that a define.xml is printable within the CDER ITenvironment, it is recommended that the sponsor submit a test version [email protected] to application submission. If a define.xml version 2.0 orlater version is submitted, then a define.pdf does not need to be included in thesubmission. TheStandards Cataloglists the currently supported version(s) ofdefine.xml. Sponsors should include a reference to the style sheet as defined in thespecification and place the corresponding style sheet in the same submission folder asthe define.xml file.

    4.1.4.6 Annotated Case Report Form (aCRF)

    An AnnotatedCase Report Form (aCRF) is a PDF document that maps the data

    collection fields used to capture subject data (electronic or paper) to thecorresponding variables or discrete variable values contained within the datasets.Regardless of whether the clinical database is legacy or SDTM compliant, an aCRFshould be submitted. The aCRF should be provided as a PDF with the file nameacrf.pdf.29 The SDTM Metadata Submission Guidelines should be used foradditional information on annotated CRFs.30

    The aCRF should include treatment assignment forms, when applicable, and shouldmap each variable on the CRF to the corresponding variables in the datasets (ordatabase). The aCRF should include the variable names and coding for each CRFitem.

    When data are recorded on the CRF but are not submitted, the CRF should beannotated with the text "NOT SUBMITTED. There should be an explanation in theSDRGstating why data have not been submitted.

    5. Therapeutic Area StandardsThis section is reserved for future comments, recommendations, and preferences on

    therapeutic area data standards.

    27Seehttp://www.cdisc.org/define-xml28Detailed FDA PDF specifications are located on FDAs Electronic Common Technical Document(eCTD) Web site,http://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm153574.htm29Previously acrf.pdf was called blankcrf.pdf.30See Study Data Tabulation Model Metadata Submission Guidelines (SDTM-MSG)(http://www.cdisc.org/sdtm).

    mailto:[email protected]:[email protected]:[email protected]:[email protected]://www.cdisc.org/define-xmlhttp://www.cdisc.org/define-xmlhttp://www.cdisc.org/define-xmlhttp://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm153574.htmhttp://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm153574.htmhttp://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm153574.htmhttp://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm153574.htmhttp://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm153574.htmhttp://www.cdisc.org/define-xmlmailto:[email protected]:[email protected]
  • 7/24/2019 Study Data

    24/40

    Contains Nonbinding Recommendations

    18

    6. Terminology

    6.1General

    Common dictionaries should be used across all clinical studies and throughout thesubmission for each of the following: adverse events, concomitant medications,

    procedures, indications, study drug names, and medical history. FDA recommendsthat sponsors use, where appropriate, the terminologies supported and listed in theStandards Catalog. It is important that coding standards, if they exist, be followed(e.g., ICH MedDRA Term Selection: Points-to-Consider document). Frequently,sponsors submit data that do not conform to terminology standards, for example,misspelling of MedDRA or WHO Drug terms, lack of conformance to upper / lowercase, or the use of hyphens. All controlled terms submitted in datasets shouldconform to the exact case and spelling used by the terminology maintenanceorganization (e.g., MedDRA, CDISC controlled terminology). These conformanceissues make it difficult to use or develop automated review and analysis tools. Theuse of a dictionary that is sponsor-defined or an extension of a standard dictionary

    should be avoided if possible, but, if essential, its use should be documented in thedefine.xml file and the SDRG.

    6.1.1 Controlled Terminologies

    Controlled terminology standards are an important component of study datastandardization and are a critical component of achieving semantically interoperabledata exchange (See Appendix). Generally, controlled terminology standards specifythe key concepts that are represented as definitions, preferred terms, synonyms,codes, and code system.

    The analysis of study data is greatly facilitated by the use of controlled terms for

    clinical or scientific concepts that have standard, predefined meanings andrepresentations. Standard terminology for adverse events perhaps represents theearliest example of using standards for study data. For example, myocardialinfarctionand heart attackare synonyms, and as such should be mapped to the sameterm in a standard dictionary. This level of standardization facilitates an efficientanalysis of events that are coded to the standard term. In electronic study datasubmissions, sponsors should provide the actual verbatim terms that were collected(e.g., on the case report form), as well as the coded term.

    Controlled terminology is also useful when consistently applied across studies tofacilitate integrated analyses (that are stratified by study) and cross-study comparative

    analyses (e.g., when greater statistical power is needed to detect important safetysignals). Cross-study comparisons and pooled integrated analyses occasionallyprovide critical information for regulatory decisions, such as statistical results that

  • 7/24/2019 Study Data

    25/40

    Contains Nonbinding Recommendations

    19

    support effectiveness,31as well as important information on exposure-responserelationships32and population pharmacokinetics33.

    6.1.2 Use of Controlled Terminologies

    FDA recognizes that studies are conducted over many years, during which timeversions of a terminology may change. Generally, FDA expects sponsors to use themost current version of an FDA-supported terminology available at the time ofcoding. It is acceptable to have different studies use different versions of the samedictionary within the same application. There are some situations where it may beacceptable to use a single older version of a dictionary across multiple studies, eventhough that version may not be the most current for the later studies. The study datasubmission should describe the impact, if any, of the older version on the studyresults in the SDRG. For example, if the sponsor anticipates pooling coded dataacross multiple studies, then it may be desirable to use a single version across thosestudies to facilitate pooling. If a sponsor selects this approach, then the approach andthe justification should be documented in the Standardization Plan, or in an update to

    the plan.

    Regardless of the specific versions used for individual studies, pooled analyses ofcoded terms across multiple studies (e.g., for an integrated summary of safety) shouldbe conducted using a single version of a terminology. This will ensure a consistentand coherent comparison of clinical and scientific concepts across multiple studies.Sponsors should specify the terminologies and versions used in the study in theSDRG.

    6.1.3 Maintenance of Controlled Terminologies

    The use of supported controlled terminologies is recommended wherever available.If a sponsor identifies a concept for which no standard term exists, FDA recommendsthat the sponsor submit the concept to the appropriate terminology maintenanceorganization as early as possible to have a new term added to the standard dictionary.FDA considers this good terminology management practice. The creation of customterms (i.e., so-called extensiblecode lists) for a submission is discouraged, becausethis does not support semantically interoperable study data exchange. Furthermore,

    31See the guidance for industry Providing Clinical Evidence of Effectiveness for Human Drugs andBiological Products, available athttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072008.

    pdf. We update guidances periodically. To make sure you have the most recent version of a guidance,

    check the FDA Drugs guidance Web page athttp://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/default.htm.32See the guidance for industryExposure-Response Relationships Study Design, Data Analysis, andRegulatory Applications,

    http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072109.pdf.33See the guidance for industry Population Pharmacokinetics,available athttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072137.

    pdf.

    http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072008.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072008.pdfhttp://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/default.htmhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072109.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072109.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072109.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072137.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072137.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072137.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072137.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072137.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072109.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072109.pdfhttp://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/default.htmhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072008.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072008.pdf
  • 7/24/2019 Study Data

    26/40

    Contains Nonbinding Recommendations

    20

    the use of custom or extensible code lists should not be interpreted to mean thatsponsors may substitute their own nonstandard terms in place of existing equivalentstandardized terms. Terminology maintenance organizations generally have well-defined change control processes. Sponsors should allow sufficient time for aproposed term to be reviewed and included in the terminology, as it is desirable to

    have the term incorporated into the standard terminology before the data aresubmitted. If custom terms cannot be avoided, the submitter should clearly identifyand define them within the submission, reference them in the SDRG, and use themconsistently throughout the application.

    If a sponsor identifies an entire information domain34for which FDA has notaccepted a specific standard terminology, they may select a standard terminology touse, if one exists. FDA recommends that sponsors include this selection in theStandardization Plan(See section 2.1) or in an update to the existing plan, andreference it in the SDRG. If no controlled terminology exists, the sponsor may definecustom terms. The non-FDA supported terms (whether from a non-supported

    standard terminology or sponsor-defined custom terms) should then be usedconsistently throughout all relevant studies within the application.

    6.2CDISC Controlled Terminology

    Sponsors should use the terminologies and code lists in the CDISC ControlledTerminology, which can be found at the NCI (National Cancer Institute) EnterpriseVocabulary Services.35 For variables for which no standard terminology exists, or ifthe available terminology is insufficient, the sponsor should propose its ownterminology. The sponsor should provide this information in the define.xml file andin the SDRG.

    6.3

    Adverse Events

    6.3.1 MedDRA

    6.3.1.1 General Considerations

    MedDRA should be used for coding adverse events. The spelling and capitalizationof MedDRA terms should match the way the terms are presented in the MedDRAdictionary (e.g., spelling and case). Common errors that have been observed includethe incorrect spelling of a System Organ Class (SOC) and other MedDRA terms.

    Generally, the studies included in an application are conducted over many years andmay have used different MedDRA versions. To avoid potential confusion orincorrect results, the preparation of the adverse event dataset for the ISS shouldinclude MedDRA Preferred Terms from a single version of MedDRA. The reason foran ISS based on a single version of MedDRA is that reviewers often analyze adverse

    34By information domain, we mean a logical grouping of clinical or scientific concepts that are amenableto standardization (e.g., adverse event data, laboratory data, histopathology data, imaging data).35Seehttp://www.cancer.gov/cancertopics/terminologyresources/page6.

    http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cancer.gov/cancertopics/terminologyresources/page6http://www.cancer.gov/cancertopics/terminologyresources/page6
  • 7/24/2019 Study Data

    27/40

    Contains Nonbinding Recommendations

    21

    events across studies, including the use of Standardized MedDRA Queries.36 Inaddition, sponsors should use the MedDRA-specified hierarchy of terms. The SDTMvariables for the different hierarchy levels should represent MedDRA-specifiedprimary SOC-coded terms.

    6.4

    Medications

    6.4.1 FDA Unique Ingredient Identifier

    6.4.1.1 General Considerations

    The FDA Unique Ingredient Identifier (UNII)37should be used to identify activeingredients (specifically, active moieties) that are administered to investigationalsubjects in a study (either clinical or nonclinical). This information should beprovided in the SDTM Trial Summary (TS) domain. UNIIs should be included for allactive moieties of investigational products (TSPARM=TRT or TRTUNII), activecomparators (TSPARM=COMPTRT), and any protocol-specified backgroundtreatments (TSPARM=CURTRT).

    If a medicinal product has more than one active moiety, then multiple records in TSshould be provided, one for each active moiety. For example, if the investigationalproduct is Bactrim (a combination of sulfamethoxazole and trimethoprim), then TSwill contain two records for TSPARM=TRT: one for sulfamethoxazole and one fortrimethoprim.

    The preferred substance names and UNII codes can be found by searching FDAsSubstance Registration System, hosted by the National Library of Medicine.38 Werecognize that unapproved substances may not yet have registered UNII codes. Werecommend that sponsors obtainUNII codes for unapproved substances as early indrug development as possible, so that relevant information, such as study data, can beunambiguously linked to those substances.

    6.4.2 WHO Drug Dictionary

    6.4.2.1 General Considerations

    World Health Organization (WHO) Drug Dictionary39is a dictionary maintained andupdated by Uppsala Monitoring Centre. WHO Drug Dictionary contains uniqueproduct codes for identifying drug names and evaluating medicinal productinformation, including active ingredients and therapeutic uses.

    Typically, WHO Drug is used to code concomitant medications. --DECOD should be

    populated with the generic name from the WHO dictionary, and --CLAS populatedwith the drug class, if the utilized dictionary codes drugs to a single class. When

    36Seehttp://www.meddra.org/standardised-meddra-queries.37Seehttp://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/38The Substance Registration System can be accessed athttp://fdasis.nlm.nih.gov/srs39See http://www.who-umc.org/

    http://www.meddra.org/standardised-meddra-querieshttp://www.meddra.org/standardised-meddra-querieshttp://www.meddra.org/standardised-meddra-querieshttp://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/http://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/http://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/http://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/http://fdasis.nlm.nih.gov/srshttp://fdasis.nlm.nih.gov/srshttp://fdasis.nlm.nih.gov/srshttp://fdasis.nlm.nih.gov/srshttp://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/http://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/http://www.meddra.org/standardised-meddra-queries
  • 7/24/2019 Study Data

    28/40

    Contains Nonbinding Recommendations

    22

    using WHODRUG, generally, --CLAS would not be filled because a drug may havemultiple classes. However, one Anatomic Therapeutic Code (ATC) level 4 codecould be mapped to --CLAS and the remainder of the ATC codes could be placed inSUPPCM.

    6.5

    Pharmacologic Class

    6.5.1 National Drug File -- Reference Terminology

    6.5.1.1 General Considerations

    TheVeterans Administrations National Drug File Reference Terminology (NDF-RT)40should be used to identify the pharmacologic class(es) of all activeinvestigationalsubstances that are used in a study (either clinical or nonclinical).This information should be provided in the SDTM Trial Summary (TS) domain. Theinformation should be provided as one or more records in TS, whereTSPARM=PCLAS.

    Pharmacologic class is a complex concept that is made up of one or more componentconcepts: mechanism of action (MOA), physiologic effect (PE), and chemicalstructure (CS).41 The established pharmacologic class is generally the MOA, PE, orCS term that is considered the most scientifically valid and clinically meaningful.Sponsors should include in TS the established pharmacologic class of all activemoieties of investigational products used in a study. FDAmaintains a list ofestablished pharmacologic classes of approved moieties.42 If the establishedpharmacologic class is not available for an active moiety, then the sponsor shoulddiscuss the appropriate MOA, PE, and CS terms with the review division. Forunapproved investigational active moieties where the pharmacologic class isunknown, the PCLAS record may not be available.

    6.6

    Indication

    6.6.1 SNOMED CT

    6.6.1.1 General Considerations

    The International Health Terminology Standards Organizations (IHTSDO)Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT)43shouldbe used to identify the medical condition or problem that the investigational productin a study is intended to affect (treat, diagnose or prevent, i.e., the indication). Thisinformation should be provided in the SDTM Trial Summary (TS) domain as a recordwhere TSPARM=INDIC and TSPARM=TDIGRP. SNOMED CT was chosen to

    40

    Seehttp://mor.nlm.nih.gov/download/rxnav/NdfrtAPIs.html#41See the guidance for industry and review staff Labelingfor Human Prescription Drug and BiologicProducts Determining Established Pharmacologic Class for Use in the Highlights of Prescribing

    Information, available athttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm186607.

    pdf.42Available athttp://www.fda.gov/downloads/ForIndustry/DataStandards/StructuredProductLabeling/UCM346147.zip43

    http://www.ihtsdo.org/snomed-ct/.

    http://mor.nlm.nih.gov/download/rxnav/NdfrtAPIs.htmlhttp://mor.nlm.nih.gov/download/rxnav/NdfrtAPIs.htmlhttp://mor.nlm.nih.gov/download/rxnav/NdfrtAPIs.htmlhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm186607.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm186607.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm186607.pdfhttp://www.fda.gov/downloads/ForIndustry/DataStandards/StructuredProductLabeling/UCM346147.ziphttp://www.fda.gov/downloads/ForIndustry/DataStandards/StructuredProductLabeling/UCM346147.ziphttp://www.ihtsdo.org/snomed-ct/http://www.ihtsdo.org/snomed-ct/http://www.ihtsdo.org/snomed-ct/http://www.ihtsdo.org/snomed-ct/http://www.fda.gov/downloads/ForIndustry/DataStandards/StructuredProductLabeling/UCM346147.ziphttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm186607.pdfhttp://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm186607.pdfhttp://mor.nlm.nih.gov/download/rxnav/NdfrtAPIs.html
  • 7/24/2019 Study Data

    29/40

    Contains Nonbinding Recommendations

    23

    harmonize with Indication information in Structured Product Labeling (SPL)44. Areviewer should be able to take the indication term from product labeling and readilysearch for clinical or nonclinical studies of that indication without having to translate.

    7.

    Electronic Submission Format

    Study datasets and their supportive files shouldbeorganized into a specific filedirectory structure when submitted in the eCTD45format (see Figure 1 and Table 2below). Note that this structure is distinct from the eCTD headings and hierarchyfolder structure, and does not affect it. Submission of files within the appropriatefolders allows automated systems to detect and prepare datasets for review, andminimizes the need for manual processing.

    The define.xml and supportive style sheet should reside in the same folder as thedatasets they pertain to (e.g., for SDTM, place in tabulations\sdtm\). Do not submit

    empty file folders. Do not submit additional subfolders. If you feel that additionalfolders are needed, please consult with the appropriate center in advance forguidance.

    If you need to split a file that exceeds file size limits (See section 3.3.2), you shouldsubmit the smaller split files in the split sub-folder in addition to the larger non-splitfile in the original data folder.

    All datasets should be referenced in the eCTD XML backbone. Datasets includedwithin the eCTD should be accurately tagged within a study tagging file to ensureproper identification and organization.46 The file folder structure for study datasets

    is summarized in Figure 1. Table 2 provides the study dataset and file folderstructure and associated description.

    44Seehttp://www.fda.gov/ForIndustry/DataStandards/StructuredProductLabeling/ucm163377.htm.45Seehttp://www.ich.org/products/ctd.html.46See The eCTD Backbone File Specification for Study Tagging Files(http://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdf)for further details.

    http://www.fda.gov/ForIndustry/DataStandards/StructuredProductLabeling/ucm163377.htmhttp://www.fda.gov/ForIndustry/DataStandards/StructuredProductLabeling/ucm163377.htmhttp://www.fda.gov/ForIndustry/DataStandards/StructuredProductLabeling/ucm163377.htmhttp://www.ich.org/products/ctd.htmlhttp://www.ich.org/products/ctd.htmlhttp://www.ich.org/products/ctd.htmlhttp://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdfhttp://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdfhttp://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdfhttp://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdfhttp://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdfhttp://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM163560.pdfhttp://www.ich.org/products/ctd.htmlhttp://www.fda.gov/ForIndustry/DataStandards/StructuredProductLabeling/ucm163377.htm
  • 7/24/2019 Study Data

    30/40

    Contains Nonbinding Recommendations

    24

    Figure 1: Folder Structure for Study Datasets

  • 7/24/2019 Study Data

    31/40

    Contains Nonbinding Recommendations

    25

    Table 2: Study Dataset and File Folder Structure and Description

    Folder Name Folder Level Description/Contents

    [module] 1

    Refers to the eCTD module in which study data is being

    submitted. Name this folder m4 for nonclinical data andm5 for clinical data. Do not place files at this level.

    datasets 2

    Resides within the module folder as the top-level folder

    for study data (nonclinical or clinical) being submitted for

    the specified module (m4 or m5). Do not place files at

    this level.

    [study]3

    Name this folder with the study identifier or analysis

    type performed (e.g., study123, iss, ise). Do not place

    files at this level.

    analysis4

    Contains folders for analysis datasets and software

    programs; arrange in designated level 6 subfolders. Do

    not place files at this level.

    adam

    5

    Contains subfolders for ADaM datasets and

    corresponding software programs. Do not place files at

    this level.

    datasets6 Place ADaM datasets in this subfolder.

    split7 Place any split ADaM datasets in this subfolder.

    programs6

    Place software programs for ADaM datasets, tables and

    figures in this subfolder.

    legacy5

    Contains legacy formatted analysis datasets and

    corresponding software programs. Do not place files at

    this level.

    datasets6 Place legacy analysis datasets in this subfolder.

    split7 Place split legacy analysis datasets in this subfolder.

    programs6

    Place software programs for legacy analysis datasets,

    tables and figures in this subfolder.

    misc4

    Place miscellaneous datasets that dont qualify as

    analysis, profile, or tabulation datasets in this subfolder.

    This subfolder was formerly named listings.

    profiles4 Place patient profiles in this subfolder.

    tabulations4

    Contains subfolders for tabulation datasets. Do not

    place files at this level.

    legacy5

    Place legacy (non-standardized) tabulation datasets in

    this folder.

    split

    6Place any split legacy tabulations datasets in this

    subfolder.

    sdtm 5Place SDTM tabulation datasets in this subfolder. Should

    only be used in m5 for clinical data.

    split6 Place any split SDTM files in this subfolder.

    send5

    Place SEND tabulation datasets in this subfolder. Should

    only be used in m4 for animal data.

  • 7/24/2019 Study Data

    32/40

    Contains Nonbinding Recommendations

    26

    8.

    Data Validation and Traceability

    8.1Definition of Data Validation

    For purposes of this Guide, data validation is a process that attempts to ensure that

    submitted data are both compliant and useful. Compliantmeans the data conform tothe applicable and required data standards. Usefulmeans that the data support theintended use (i.e., regulatory review and analysis).

    Data validation is one method used to assess submission data quality. Standardizeddata do not ensure quality data, but they do make it easier to assess some aspects ofdata quality by facilitating the automation of various data checks.

    Data validation relies on a set of validation rules that are used to verify that the dataconform to a minimum set of quality standards. The data validation process canidentify data issues early in the review that may adversely affect the use of the data.

    FDA recognizes that it is impossible or impractical to define a prioriall the relevantvalidation rules for any given submission. Sometimes serious issues in the submitteddata are only evident through manual inspection of the data and may only becomeevident once the review is well under way. Often these issues are due to problems indata content (i.e., whatwas or was not submitted, or issues with the collection oforiginal source data), and not necessarily howthe data were standardized.

    8.2Study Data Validation Rules

    8.2.1 Types of Data Validation Rules

    Generally, FDA recognizes two types of validation rules:

    Conformance validation: These rules help ensure that the data conform to the datastandards. For example, a conformance validation rule for CDISC SDTM data wouldcheck that the value in the Domain column of all datasets matches the name of thedomain.

    Quality checks: These checks help to ensure the data will support meaningfulanalysis. For example, a quality check for a particular human study may require thateach value for AGE fall within a pre-specified human physiologic range.

    Once a data standard is defined, the conformance validation rules are generally static.They are not expected to change substantially unless the standard itself changes.

    However, new analysis requirements or specific studies may suggest additionalquality checks and these will be incorporated into data validation processes.

  • 7/24/2019 Study Data

    33/40

    Contains Nonbinding Recommendations

    27

    8.2.2 Support on DataValidation Rules

    The Standards Web page47provides links to the currently available validation rules,i.e. both conformance rules and quality checks.

    Sponsors should validate their study data before submission using the published

    validation rules and either correct any validation errors or explain in the SDRGwhycertain validation errors could not be corrected. The recommended pre-submissionvalidation step is intended to minimize the presence of validation errors at the time ofsubmission.

    Data validation is used by FDA to inform review staff of potential problems in usingthe data, and to assess the usefulness of the rules. If applicable, FDA may reportimportant data validation errors to the sponsor for correction.

    8.3Study Data Traceability

    8.3.1 Overview

    An important component of a regulatory review is an understanding of theprovenance of the data (i.e., traceability of the sponsors results back to the CRFdata). Traceability permits an understanding of the relationships between the analysisresults, analysis datasets, tabulation datasets, and source data. Traceability enablesthe reviewer to accomplish the following:

    Understand the construction of analysis datasets Determine the observations and algorithm(s) used to derive variables Understand how the confidence interval or the p-value was calculated in a

    particular analysis

    Based upon reviewer experience, establishing traceability is one of the mostproblematic issues associated with legacy study data converted to standardized data.If the reviewer is unable to trace study data from the data collection of subjectsparticipating in a study to the analysis of the overall study data, then the regulatoryreview of a submission may be compromised. Traceability can be assured whenstudies are prospectively designed to collect data using a standardized CRF, e.g.,CDASH.

    As noted in Section 1.1, the submission of standardized study data will be requiredaccording to the timetable specified in the eStudy Data guidance. During thetransition period to required study data standards, FDA recognizes that some study

    data (i.e., legacy data) submissions may not conform to FDA-supported study datastandards and may need to be converted.

    47The Standards Web page can be accessed athttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm.

    http://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htmhttp://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm
  • 7/24/2019 Study Data

    34/40

    Contains Nonbinding Recommendations

    28

    8.3.2 Legacy Study Data Conversion to Standardized Study Data

    Sponsors should use processes for legacy data conversion that account fortraceability. Generally, a conversion to a standard format will map every dataelement as originally collected to a corresponding data element described in astandard. Some study data conversions will be straightforward and will result in all

    data converted to a standardized format. In some instances, it may not be possible torepresent a collected data element as a standardized data element. In these cases,there should be an explanation in theSDRGas to why certain data elements could notbe fully standardized or were otherwise not included in the standardized datasubmission. The legacy data (i.e., aCRF, legacy tabulation data, and legacy analysisdata) may be needed in addition to the submissi