CDISC MDR Storyboards & Stakeholder Analysis

CDISC Meta data Repository Enrichment & integration of CDISC Data Standards toward semantic interoperability

Storyboards & Stakeholder Analysis

Prepared by the CDISC MDR Team

Notice to Readers

This document presents the stakeholder analysis carried out by the different CMDR business requirement sub-streams. The requirements identified here have been consolidated in the CMDR BRS document.

Revision History

Date | Version | Summary of Changes
Dec 09 | 1.0 | First version for distribution. Contains ONLY the validated storyboards!

Page: 1 / 43 CDISC MDR Storyboards & Stakeholder Analysis

Table of Contents

1. FOREWORD
2. INTRODUCTION
   2.1 USER COMMUNITY
   2.2 CMDR AND OTHER TOOLS
   2.3 CONTENT OF CMDR
3. STAKEHOLDER ANALYSIS
   3.1 PHARMACEUTICAL COMPANIES/ CROS
      3.1.1 Protocol/scientist
      3.1.2 Data manager/data collector – with Study Database Design-Build-Test Process
      3.1.3 Analysis Dataset Creation (single trial)
      3.1.4 Analysis/Reporting (single and multiple trial)
      3.1.5 Data “curator”/ database integrator/ data miner
      3.1.6 Application developer (clinical application developer,…)
      3.1.7 Document manager / medical writer
   3.2 REGULATORS AND HEALTH CARE AUTHORITIES
      3.2.1 FDA clinical data reviewers (“view extractors”)
      3.2.2 Other – non clinical trial – submission of structured data
   3.3 CDISC MDR SUPPORTING STAFF
      3.3.1 Data standards definition staff (across SDOs such as CDISC, HL7,..)
      3.3.2 CDISC MDR “steward”
      3.3.3 CDISC CMDR governance committee
   3.4 HEALTHCARE ORGANIZATION STAFF


1. Foreword

This document contains the stakeholder analysis for the CDISC Meta Data Repository (CMDR). It was compiled by the different sub-streams as part of the CMDR requirement gathering exercise, and is a key input for consolidating these requirements.

In its current version, the document only looks at some stakeholders, focusing on the pharma industry/CROs and data standard stewards. It still needs to be completed for some actors within the pharma industry and extended to regulators and health care actors.


2. Introduction

2.1 User community

The CMDR needs to support mainly 4 user communities:
• Pharma companies & CROs, who download the content of the CMDR and use it as their internal dictionary to collect data in a way that supports submission and data exchange across multiple organizations, while supporting internal data integration. Some smaller companies may want to use the CMDR as their internal data dictionary in an ASP mode.
• FDA, who download the content of the CMDR and use it as the dictionary from which to extract several views (SDTM and others) across products and therapeutic areas from the data they receive from the pharma companies.
• CDISC MDR supporting staff and volunteer staff defining data standards, who access the CMDR directly and use it to define and maintain standards consistently across CDISC and health care.
• Health care organizations (not a priority), who download the CMDR to map their local dictionary to the data required in clinical research.

IT vendors are also expected to be stakeholders, in the same way as any of the aforementioned organizations, depending on the software they deliver.

Throughout the enclosed stakeholder analysis:
• The storyboard is provided in black characters.
• Related requirements are provided in black, highlighted in green.

2.2 CMDR and other tools

CMDR is mainly a “standard content management tool” supporting other applications. It will be used directly by people maintaining data standards to edit the content, but mainly by other tools that will use the content of the CMDR for their own purposes. During the stakeholder analysis we envisioned the following tools:
• eProtocol generator tool. A tool used by scientists/protocol authors to generate a structured protocol based on “standard concepts”. Through the CMDR, the tool could provide concepts to scientists while linking these concepts to the implementation variables needed for downstream activities.
• Study-specific meta-data registry with contextual info on studies. This repository is specific to each company/sponsor and contains information on precisely how the standards defined in the CMDR have been applied for a specific study.
• Database design tool. This tool builds/generates the eCRF and the database design for storing information during data collection.
• Database query generator tool. This tool would translate a query based on one concept into a query that can be run on a database, using the different variables in that database that relate to the same concept.
• …..

These additional tools are not specified in detail, though several requirements on them appear across the storyboards. Only the requirements related to their linkage with the CMDR are gathered in the CMDR BRS.
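The database query generator idea can be sketched as follows. This is a minimal illustration only: the `VariableBinding` class, the `BINDINGS` registry, and the table and column names are hypothetical and stand in for whatever a study-specific meta-data registry would actually hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableBinding:
    """Where a concept is physically stored in one study database (hypothetical)."""
    table: str
    column: str


# Hypothetical study-specific registry: concept name -> physical variables.
BINDINGS = {
    "serum glucose": [
        VariableBinding("LAB_STUDY_A", "SER_GLUC_VAL"),
        VariableBinding("LAB_STUDY_B", "GLUCOSE_RESULT"),
    ],
}


def queries_for_concept(concept, bindings=BINDINGS):
    """Expand one concept-level request into one SQL query per bound variable."""
    try:
        targets = bindings[concept.strip().lower()]
    except KeyError:
        raise LookupError(f"no variables registered for concept {concept!r}") from None
    return [f"SELECT {t.column} FROM {t.table}" for t in targets]


for sql in queries_for_concept("Serum Glucose"):
    print(sql)
```

The point of the sketch is the indirection: the user asks for a concept, and the registry decides which variables in which databases answer the question.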


2.3 Content of CMDR

The stakeholder analysis is based on the assumption that the stakeholders will have direct access to the content of the CDISC MDR, without specifying how this happens technically. It is important, however, to distinguish between the content of the cross-industry CMDR and the additional content needed within each organization.

[Figure: the central CDISC MDR, with each Sponsor MDR and the FDA MDR holding a copy of the CMDR content.]

The CMDR will contain:
• the GOLD STANDARD concepts and variables, valid across the industry
• the mapping between this gold standard and the different representations in the different CDISC standards (e.g. SDTM, CDASH…), with the aim of removing these mappings over time once these standards have been fully harmonized.

What will NOT be part of the CMDR is the definition of company-specific variables and their mapping to the gold standard. This should be included in the organization-specific MDR.
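The split between cross-industry content and organization-specific content can be pictured as a layered lookup. The sketch below is an assumption about one possible arrangement, not a description of the CMDR design: the dictionaries, the `maps_to` key and the sponsor variable `GLU_RES` are invented for illustration; only `SerGlucVal` is taken from the storyboards.

```python
from collections import ChainMap

# Local copy of cross-industry CMDR content (illustrative entry only).
cmdr_copy = {
    "SerGlucVal": {"definition": "Serum glucose value", "source": "CMDR"},
}

# Company-specific variables and their mapping to the gold standard live
# in the sponsor MDR, never in the CMDR itself.
sponsor_local = {
    "GLU_RES": {
        "definition": "Sponsor lab glucose result",
        "source": "sponsor",
        "maps_to": "SerGlucVal",
    },
}

# Lookups see sponsor definitions first, then fall back to the CMDR copy.
sponsor_mdr = ChainMap(sponsor_local, cmdr_copy)


def gold_standard_for(name):
    """Return the gold-standard variable behind a sponsor or CMDR variable name."""
    entry = sponsor_mdr[name]
    return entry.get("maps_to", name)


print(gold_standard_for("GLU_RES"))
```

Because the sponsor layer sits on top of an untouched CMDR copy, refreshing the copy from a new CMDR release does not disturb the local definitions.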


3. Stakeholder Analysis

3.1 Pharmaceutical companies/ CROs

3.1.1 Protocol/scientist

High level requirement: Scientific concepts used in the protocol have clear definitions. Terms can be checked for accuracy/consistency across the protocol (see footnote 1). Implementation variables for data collection and analysis are associated with scientific concepts, allowing the author to specify data collection for the protocol precisely.

Problem to solve (+ root cause analysis): No structured/unambiguous link between scientific concepts within the protocol and variables used within downstream activities (data collection & analysis).

Root causes:
• Process: no formal process to build the link in an unambiguous way; different translation of concepts into variables by different data managers.
• People: limited resources; different knowledge & understanding between scientist & developer; additional burden on protocol authors, with benefits to other people.
• Technology: no tools to support linking (the emerging electronic protocol generator cannot work effectively with this).
• External data standards: new/changing & not available; inconsistency?

High level storyboard: Dr Joe is the principal author of the protocol for a new study. He has discussed the design of the trial with key people on the project team and is ready to write the protocol.

In the past he would have written the protocol on paper and had it reviewed and approved before passing it to the data manager responsible for data collection and (e)CRF development.

Today Dr Joe is working with a new tool, called eProtocol manager, which helps him to enter the protocol in a more formal way. The tool has a "template" for writing the protocol, and each part of the template has a degree of structure appropriate to that part. Wherever the content of the protocol refers to scientific concepts, the protocol text is linked (either automatically or manually) to a well-defined concept in the CMDR. If the CMDR has no definition for a concept needed for the protocol, a "wizard" guides Dr. Joe in creating a draft definition.

Detailed storyboards:
• 3.1.1.1. Dr. Joe selects concepts for a highly structured part of the protocol.
• 3.1.1.2. For a relatively unstructured part of the protocol, Dr Joe manually links words embedded in free text with concepts.
• 3.1.1.3. Dr Joe doesn’t find the concept he wants and must draft a new concept or update an existing concept.

1 The same should hold for all development projects and data sources outside of clinical trials (e.g. epidemiology, outcomes research, etc.). This is a similar storyboard, not developed here as the requirements should be similar from a CMDR point of view.

In those parts of the protocol that describe activities to be performed for individual subjects, the author can drill down into the concept of the activity and access information about the variables associated with that activity. Dr. Joe may choose the standard set of variables for the concept, or he may adjust the data items to be collected to meet the needs of the particular study.

Detailed storyboards:
• Dr. Joe selects lab tests from a well-organized set of possibilities.
• 3.1.1.4. Dr. Joe selects a concept, but finds that the set of data items associated with the concept is not complete enough for the needs of this protocol.
• 3.1.1.5. Dr. Joe decides how to collect medical history data using one of several different approaches, then specifies the level of granularity for collection of different kinds of medical history data based on their relevance to the disease and population under study.
• 3.1.1.7. Dr Joe decides which group of variables should be collected at which visit.

When all the checks have been made, Dr Joe approves the protocol.

Detailed storyboards:
• 3.1.1.6. Dr. Joe checks that concepts that appear in the objectives also appear in the assessments and analysis sections.

The protocol tool produces a report on draft concepts, which will be submitted to the CMDR for validation and for linking to relevant data items. (See Storyboards in section 3.3.)

The protocol tool passes information on concepts and variables to the data manager and/or eCRF developer. Where Dr. Joe has not completely specified data collection, they complete the task. (See Storyboards in section 3.1.2.)


Detailed storyboard

3.1.1.1 Dr. Joe selects concepts for a highly structured part of the protocol.

The protocol “template” includes a section for lab tests, which allows Dr. Joe to choose from among certain “default” groups of tests. Dr. Joe chooses the default group of tests for Phase 3 trials, then adds a complete differential to the WBC and a particular cancer marker. Dr Joe also requests serum glucose. The eProtocol manager checks CMDR and finds that there are many different possibilities (e.g., serum glucose alone, when the subject is fasting, random, or at specific times after particular challenges). Dr Joe decides to keep serum glucose alone (i.e., without a specified relationship to fasting or a challenge).

eProtocol manager tool
• Identification of concepts, either in a structured part or in free text.

CMDR
• Concept definition. CMDR can handle “complex concepts”, which are composed of a group of concepts.
• Concept search. When a concept is provided, CMDR
o checks whether there are related concepts (i.e. more specialized concepts) and returns these concepts as alternative candidates, e.g. if serum glucose is provided as input, CMDR returns serum glucose alone, when the subject is fasting, random, or at specific times after particular challenges.
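A complex concept built from a group of concepts can be sketched as a small recursive structure. This is an illustration under stated assumptions: the `Concept` class, the lab concept names and the grouping are invented, and the mandatory/optional qualifier anticipates the optionality requirement described in section 3.1.1.6.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept; it is 'complex' when it has members, 'simple' otherwise."""
    name: str
    # Each member is a (Concept, optionality) pair, with optionality being
    # "mandatory" or "optional" (assumed qualifier values).
    members: list = field(default_factory=list)


def leaf_concepts(concept, include_optional=False):
    """Flatten a complex concept into the names of its simple concepts."""
    if not concept.members:
        return [concept.name]
    leaves = []
    for member, optionality in concept.members:
        if optionality == "mandatory" or include_optional:
            leaves.extend(leaf_concepts(member, include_optional))
    return leaves


# Hypothetical content: a default lab group with an optional WBC differential.
differential = Concept("WBC differential", [
    (Concept("neutrophils"), "mandatory"),
    (Concept("lymphocytes"), "mandatory"),
])
phase3_labs = Concept("Phase 3 default labs", [
    (Concept("hemoglobin"), "mandatory"),
    (differential, "optional"),
])

print(leaf_concepts(phase3_labs))
print(leaf_concepts(phase3_labs, include_optional=True))
```

Because a member can itself be a complex concept, the same function handles a group of groups without special cases.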

3.1.1.2 For a relatively unstructured part of the protocol, Dr. Joe manually links words embedded in free text with concepts.

The “background” section of the protocol describes prior research in the disease setting and with the treatments for this study. Since the kind of information that appears in this part of the protocol is highly variable, the template is set up for entry of free text. When Dr. Joe saves this section of the protocol, the eProtocol manager scans the text looking for concepts in the CMDR. It proposes matches to Dr. Joe, who either accepts or rejects them. Dr. Joe notes that there is no match for a concept for which he would expect to find one. He initiates a search, finds two possible matches, and decides, on the basis of their definitions, that one of them is the concept he intended. He investigates another concept for which no match was proposed, and finds none, so he marks this concept as one that may need to be added to the CMDR.

CMDR
• Concept search (needs to be very rich to ensure that people easily find a concept; this decreases the risk of requesting/defining new ones). It is possible to retrieve a concept in different ways:
o from a higher-level complex concept, by selecting a concept
o by providing the exact concept name (perfect match)
o by providing an incomplete concept name, potentially with wildcards (e.g. ser. glu or ser*glu* for serum glucose)
o by providing a concept name with a spelling error (e.g. serim glucose for serum glucose)
o by providing related words/synonyms (e.g. low fat meal when searching for low cal meal)
o …..
• Concept search. When a concept is provided, CMDR
o provides a precise definition of the concept in language aimed at scientists…
o provides the list of related concepts (with their relations)
o provides a list of related “gold standard” variables (e.g. for serum glucose, which is a PQ (see footnote 2), CMDR would return the 3 variables SerGlucVal, SerGlucUnit and SerGlucRange that capture the needed information). The variables are provided with definition, format, coded or non-coded, list of codes when there is a related code list, …. These variables can then be used by the programmer for inclusion in the data collection program.
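The search modes listed above (exact name, wildcard, misspelling, synonym) can be approximated with the Python standard library. This is a rough sketch, not a proposed CMDR design: the concept list, the synonym table, the order in which the modes are tried and the 0.8 similarity cutoff are all assumptions.

```python
import difflib
from fnmatch import fnmatchcase

# Illustrative repository content and synonym table (assumptions).
CONCEPTS = ["serum glucose", "low cal meal", "chest x-ray"]
SYNONYMS = {"low fat meal": "low cal meal"}  # related word -> concept name


def search_concepts(query, concepts=CONCEPTS, synonyms=SYNONYMS):
    """Return candidate concept names for a query, trying each mode in turn."""
    q = query.strip().lower()
    if q in concepts:                       # exact name (perfect match)
        return [q]
    if any(ch in q for ch in "*?"):         # incomplete name with wildcards
        return [c for c in concepts if fnmatchcase(c, q)]
    if q in synonyms:                       # related word / synonym
        return [synonyms[q]]
    # Spelling errors: fall back to approximate string matching.
    return difflib.get_close_matches(q, concepts, n=3, cutoff=0.8)


print(search_concepts("ser*glu*"))
print(search_concepts("serim glucose"))
print(search_concepts("low fat meal"))
```

An empty result is the signal that the author may need to draft a new concept, which is why a rich search matters for limiting near-duplicate concepts.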

3.1.1.3 Dr. Joe doesn’t find the concept he wants and must draft a new concept or update an existing concept.

Dr Joe wants to specify that the patient should take a “low fat meal” at a certain point in the study. The eProtocol manager checks CMDR and finds no “low fat meal” but does find “low cal meal”, and the related definition is provided. Dr Joe decides that this existing concept is not good enough and decides to create a new concept: he is then asked a set of questions, based on the structure of the underlying information model, which allow CMDR to uniquely define this concept.

Note:
• We need to put in place mechanisms that avoid the proliferation of new concepts that are inconsistent/redundant. This is out of scope of this use case but should be enforced by
o interdependencies of concepts through an information model/ontology (BRIDG)
o a strong governance process with a 2-step approach (definition of “ad interim” concepts, and agreement through the CDISC MDR governance body)
• Updates/new concepts related to changes in the related controlled terminology are not handled here; they are handled in another use case (related to variable management).

CMDR
• Concept definition. When a user does not find the correct concept, CMDR allows the user to enter the definition of a new concept as a “draft” concept, for further approval as part of the governance process.
o The concept definition needs to follow a life cycle in its approval status:
- Draft
- Under review by the CMDR Governance Committee
- Approved
- Rejected, in which case a rationale needs to be provided
- Retired, in which case a rationale needs to be provided and, whenever relevant, a link to the replacing concept
o While a new concept is being specified, CMDR guides the user by requesting all relevant information through the on-line change request form (e.g. related concepts or groups of concepts, following the underlying information model or hierarchy of concepts in the data layer; description; definition of the related “gold standard” variables to collect the information related to the concept)…
o A new concept can be created
- from scratch
- or CMDR may ask the user to specify a related concept from which the new concept can be described. When a new concept is created by modifying an existing one, the link to the “father concept” is stored.
• Concept definition update. CMDR allows an existing concept definition to be updated, i.e.
o change/update the description
o add or delete related variable(s)
o update a link with other concepts
o …
If the change is
o a correction (e.g. wrong description of the semantics), it should be marked as a MAJOR change and explained clearly so that users understand the difference/changes and can take appropriate action
o a clarification (e.g. wrong choice of words for clinical experts but correct semantics), it should be marked as a MINOR change

2 PQ stands for Physical Quantity. It is an ISO/HL7 abstract data type composed of a value, a unit code and a range.

3.1.1.4 Dr. Joe selects a concept, but finds that the set of data items associated with the concept is not complete enough for the needs of this protocol.

Dr Joe decides to collect chest x-ray results. There is a large group of possible results to choose from, and he is able to find most of the assessments he wants. However, he wants to use a relatively new scoring system that is not in the CMDR. He writes in this score, which is marked by the eProtocol manager as a concept that may need to be added to the CMDR.

CMDR
• (same as before) Concept definition update. Add a variable to a concept.
• Concept definition. Add a new concept within a related group, with links to other concepts.
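A concept written in by the author, such as the new score here, would enter the approval life cycle described in 3.1.1.3. The sketch below shows one way to enforce that life cycle. The status names come from the requirement; the allowed transitions, the `ConceptRecord` class and its field names are assumptions, since the source lists the statuses but not the permitted moves between them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "Draft"
    UNDER_REVIEW = "Under review"
    APPROVED = "Approved"
    REJECTED = "Rejected"
    RETIRED = "Retired"


# Assumed transition graph (not specified in the source).
ALLOWED = {
    Status.DRAFT: {Status.UNDER_REVIEW},
    Status.UNDER_REVIEW: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.RETIRED},
    Status.REJECTED: set(),
    Status.RETIRED: set(),
}
# Per the requirement, rejection and retirement need a rationale.
NEEDS_RATIONALE = {Status.REJECTED, Status.RETIRED}


@dataclass
class ConceptRecord:
    name: str
    status: Status = Status.DRAFT
    rationale: Optional[str] = None
    replaced_by: Optional[str] = None  # link to the replacing concept, if any

    def move_to(self, new_status, rationale=None, replaced_by=None):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.status.value} -> {new_status.value} is not allowed")
        if new_status in NEEDS_RATIONALE and not rationale:
            raise ValueError(f"a rationale is required to mark a concept {new_status.value}")
        self.status = new_status
        self.rationale = rationale
        self.replaced_by = replaced_by


score = ConceptRecord("new chest x-ray score")
score.move_to(Status.UNDER_REVIEW)
score.move_to(Status.APPROVED)
print(score.status.value)
```

Keeping the transition table as data makes it easy for a governance committee to adjust the allowed moves without touching the checking logic.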

3.1.1.5 Dr. Joe decides how to collect medical history data using one of several different approaches, then specifies the level of granularity for collection of different kinds of medical history data based on their relevance to the disease and population under study.

The eProtocol manager presents several approaches for medical history: a generic approach, an approach targeted at specific conditions, and an approach which collects timing details of specific conditions (e.g., date of onset of symptoms, date of diagnosis, date of confirmatory diagnostic test, date of most recent episode, etc.). Dr. Joe chooses the generic approach for most body systems, but a more specific approach for cardiovascular disease, since the investigational treatment may have cardiovascular side effects. For the disease under study, he selects the most detailed approach, and chooses the most relevant dates (date of diagnosis, date of confirmatory diagnostic test, date of relapse).

eProtocol generator
Possibility to shape part of the protocol by
• Selecting a complex concept. While doing so, the user should be offered the different possibilities (specializations or generalizations of the original concept provided as input), e.g. when the user selects “medical history”, he should be offered the different approaches, which are “specialization” concepts of the overall medical history concept.
• Selecting a simple concept and its related variables.
• Specifying a specific value for a concept/term by linking it to a specific value/term, for instance from MedDRA.

CMDR
• Concept definition. Concepts with similar semantics but some differences (such as the different approaches for medical history) should be linked to each other under the same generic concept (e.g. medical history).
• Concept search. When a concept is retrieved, its parents (generalizations of the concept) as well as its children (specializations of the concept) should be provided at the same time.
• (same as before) Concept search. Possibility to retrieve a concept based on its name (e.g. medical history).
• Selection of a complex concept with its underlying variables.
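Returning parents and children together with a retrieved concept could look like the sketch below. The parent table and the labels given to the three medical history approaches are illustrative assumptions; a real repository would hold these links in its information model.

```python
# Hypothetical specialization links: child concept -> parent concept.
PARENT_OF = {
    "medical history (generic)": "medical history",
    "medical history (condition-specific)": "medical history",
    "medical history (timing details)": "medical history",
}


def retrieve(concept, parent_of=PARENT_OF):
    """Return a concept together with its generalizations and specializations."""
    parents = []
    current = concept
    while current in parent_of:          # walk up the generalization chain
        current = parent_of[current]
        parents.append(current)
    children = sorted(c for c, p in parent_of.items() if p == concept)
    return {"concept": concept, "parents": parents, "children": children}


print(retrieve("medical history"))
```

With this shape, a search for “medical history” immediately offers the three approaches, and a search for any one approach shows the generic concept it belongs to.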

3.1.1.6 Dr. Joe checks that concepts that appear in the objectives also appear in the assessments and analysis sections.

The eProtocol manager expects that if a concept appears in a study objective, then data related to that concept will be collected, and an analysis of that data will be described in the analysis section. In this study, the primary objective of the study involves “disease progression,” and none of the assessments described in the protocol is called “disease progression.” The eProtocol manager asks Dr. Joe to indicate which assessment(s) are used to assess disease progression. Dr. Joe chooses lesion assessments. However, he also realizes that results for a biomarker are part of the criteria for disease progression, and that he has not included that assessment in the protocol, so he adds it. The analysis section includes mention of lesion measurements and the biomarker, so the eProtocol manager considers the check complete.
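The consistency check in this storyboard amounts to a set comparison once each objective concept has been expanded into the assessments that measure it. The sketch below assumes that expansion is available as a simple composition table; the table, the function names and the concept labels are illustrative, not part of any specified tool.

```python
# Assumed composition: objective concept -> the concepts that assess it.
COMPOSITION = {"disease progression": {"lesion assessment", "biomarker"}}


def expand(concept, composition=COMPOSITION):
    """Return the simple concepts behind a (possibly complex) concept."""
    parts = composition.get(concept)
    if not parts:
        return {concept}
    leaves = set()
    for part in parts:
        leaves |= expand(part, composition)
    return leaves


def uncovered(objectives, assessments, analysis, composition=COMPOSITION):
    """Map each objective to the concepts missing from assessments or analysis."""
    covered = set(assessments) & set(analysis)
    gaps = {}
    for objective in objectives:
        missing = expand(objective, composition) - covered
        if missing:
            gaps[objective] = missing
    return gaps


# Before Dr. Joe adds the biomarker assessment, the check reports a gap.
print(uncovered(
    objectives=["disease progression"],
    assessments=["lesion assessment"],
    analysis=["lesion assessment", "biomarker"],
))
```

An empty result corresponds to the point in the storyboard where the eProtocol manager considers the check complete.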

CMDR
• Concept definition. CMDR can handle “complex concepts”, which are composed of a group of concepts (e.g. “disease progression” is composed of different concepts like “lesion assessment, biomarker x value, lab test y value”). This definition is recursive (i.e. a complex concept can be composed of complex concepts). Each concept belonging to a group has an optionality qualifier with 2 values: mandatory, optional (and potentially a 3rd value, to be confirmed: conditional).
o For instance, “visit” is a concept composed of Lab, AE, Conmed, …. In turn, Conmed is a complex concept composed of different concepts.
o For instance, in the concept “standing systolic blood pressure”, position is mandatory, while measuring device or challenge (“at rest” or “after exercise”) are optional. For optional variables, whenever relevant, we should have a default value (e.g. challenge for systolic blood pressure by default = “at rest”).
• Concept definition. A “simple concept” is linked with one or more variables that allow the concept information to be collected and stored (e.g. “lesion assessment” is linked to variables such as LESION_ASSESSMENT_VAL, LESION_ASSESSMENT_SEV, …). CMDR provides information on variables, including definition, format, coded or non-coded, list of codes when there is a related code list, whether the code list may be modified or not….
NOTE: Should we have as simple concept/variable something like LESION_ASSESSMENT_VAL, ASSESSMENT_VAL, or VAL?
o If we have more specialized concepts/variables (e.g. LESION_ASSESSMENT_VAL), there is
- a risk of proliferation of concepts which are close to each other but still different,
- less re-usability of the same simple concept/variable across different concepts,
- but more precise/unambiguous semantics.
o If we have less specialized simple concepts/variables (e.g. VAL), there is more re-usability but less clear semantics.
o Guiding principle for the ontology/organizational framework of concepts:
- There needs to be the right trade-off between unambiguous semantics and a minimal number of simple concepts/variables.
- Whenever simple concepts/variables are related to each other through specialization, they need to be linked. For instance, LESION_ASSESSMENT_VAL is a specialization of ASSESSMENT_VAL, which is a specialization of VAL.
• Concept definition/code list.
o Complex terminologies maintained by external organizations (e.g. SNOMED, MedDRA) should be managed/maintained outside of the CMDR, but linked to it.
o Simple code lists (e.g. severity, race, …) should be managed within the CMDR.
• Concept definition/code list. A simple concept, linked with a single variable, can be linked to a code list.
o Some code lists cannot be modified without changing the meaning of the concept/variable (e.g. AE severity grade). In this case the user needs to decide whether to define a new concept derived from the original one.
o Some code lists can be modified without changing the meaning of the concept/variable (e.g. list of countries).
It should be part of the concept/variable specifications to define which code lists can be modified without changing the meaning of the data and which cannot.

3.1.1.7 Dr Joe decides which group of variables should be collected at which visit

The eProtocol generator drives the generation of downstream components from upstream point-and-click selections. A basic implementation drives this from a time and events schedule.

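[Table: time and events schedule with columns Visit 1, Visit 2, Visit 3, Visit 4, Time 1, Time 2, Time 3 and rows Lab, AE, Conmed, Lung Function, …; a tick marks each concept to be collected at that visit or time point.]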

Double-clicking on Visit 1 Conmed pulls up a list of the variables associated with the Conmed concept, marked as mandatory, optional or conditional. The mandatory ones would be automatically selected; Dr Joe can choose whether or not he wants the optional ones. By choosing an optional conditional variable, Dr Joe gets (or is able to choose from) a set of associated variables. For each of the variables, there would be an indication of whether the variable is coded or not. If it is coded with a modifiable code list, Dr Joe can choose which codes he wants to have available.

Whenever there is a choice of variables or a choice of codes, and Dr Joe makes a choice, that set of choices becomes a stored choice set. The next time (for that study) Dr Joe has a choice of variables or a choice of codes, he can elect to use one of the stored choice sets, to select a subset from a stored choice set, or to select from the full set of variables/codes. Each stored choice set would have contextual info identifying it (which visit/time, which concept).

(many requirements related to the eProtocol generator)
• access to a list of concepts in CMDR that can be collected for a visit
• possibility to select a concept (complex or simple) for a specific visit
• when some concepts related to a concept are conditional, possibility to select/confirm the concepts related to a visit
• possibility to see all variables related to a concept, to select the relevant ones and to store this in a study-specific meta-data registry with contextual info
o for a concept composed of concepts: show the list of variables in a hierarchy related to the concepts
o for a simple concept: provide the variable definition
• when a variable is related to a code list, possibility to select the relevant codes for the specific study and store this in a study-specific meta-data registry with contextual info (which product, which study, visit, which concept, …)
o This practice should however be limited, to ensure data integration and re-usability (but this may require an increase in the richness/complexity of what is in the CMDR)
o This should be possible only for simple concepts/variables with a code list that can be modified without changing the meaning of the variable; not for the ones where changes in the code list change the semantics
• Out of scope: medical knowledge for medical research (e.g. group of biomarkers for a specific disease that could be proposed to the protocol authors)

CMDR Requirement
• See 3.1.1.6
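The stored choice set can be sketched as a small record kept in the study-specific registry. Everything named below is hypothetical: the `ChoiceSet` and `StudyRegistry` classes, the Conmed variable names and their optionality are invented to illustrate the behaviour, with mandatory variables selected automatically and optional ones only on request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChoiceSet:
    """A stored selection with the contextual info that identifies it."""
    study: str
    visit: str
    concept: str
    variables: tuple


def select_variables(available, chosen_optional=()):
    """Mandatory variables are always selected; optional ones only if chosen."""
    optional = {name for name, opt in available.items() if opt == "optional"}
    unknown = set(chosen_optional) - optional
    if unknown:
        raise ValueError(f"not optional variables of this concept: {sorted(unknown)}")
    return tuple(
        name for name, opt in available.items()
        if opt == "mandatory" or name in chosen_optional
    )


class StudyRegistry:
    """Minimal stand-in for a study-specific meta-data registry."""

    def __init__(self):
        self._sets = []

    def store(self, choice_set):
        self._sets.append(choice_set)

    def stored_for(self, study, concept):
        return [s for s in self._sets if s.study == study and s.concept == concept]


# Hypothetical Conmed variables with their optionality.
CONMED = {"CM_NAME": "mandatory", "CM_DOSE": "optional", "CM_INDICATION": "optional"}

registry = StudyRegistry()
picked = select_variables(CONMED, chosen_optional=["CM_DOSE"])
registry.store(ChoiceSet("STUDY-1", "Visit 1", "Conmed", picked))
print(registry.stored_for("STUDY-1", "Conmed"))
```

At Visit 2 the tool could offer the result of `stored_for` as a starting point, which is the reuse the storyboard describes.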

At some stage, Dr Joe wants to be able to switch on audit trailing and version control.

eProtocol tool Requirement
• audit trail and version control on the study-specific meta-data registry

So, now Dr Joe has selected concepts, variables and codes and slotted them to the visits. For auto-generation of an eCRF, the eProtocol generator tool would need additional information, e.g. the order of the variables on the physical pages, as well as a way of using existing data checks or specifying new ones.

CMDR Requirement
• Concept definition. A complex concept should contain the information that should be collected together (in a form). Specification of layout would be nice to have (in the form of recommendations).

Downstream, Dr Joe expects to be able to autoextract/generate the datasets.

• Concept retirement (not specified in the use case)
o It is possible to specify that a concept becomes obsolete. CMDR will nevertheless keep the concept definition and all its related variables, but will mark the concept as obsolete. Traceability to concepts that were derived from that concept is kept.

3.1.2 Data manager/data collector – with Study Database Design-Build-Test Process

High level requirement:
• Access to variables related to the scientific/clinical concept, with attributes like format, type, …, to include in the CRF/data collection module in a medically/scientifically consistent way, from data collection onward (see footnote 3)
• Access to information explaining how the variables are intended to be used (alone or with different qualifiers, way they are stored ...)

Problem to solve (+ root cause analysis):
• Growing set of variables (20.000+ for clinical trials) with inconsistency and redundancies
• No possibility to share data across companies (CROs, in/out-licensing) and with medical records
• No unambiguous link between concepts within the protocol and variables (see section 3.1.1)

Root causes:
• Process: no formal process to share variable definitions across the industry; no shared definition of what a variable is; no structured metadata; no standardized proprietary extensions; minimal re-use of content.
• People: “inventivity” of people in different contexts; not yet enough pressure for increased efficiency in data collection (changing!); mindset that EHR integration is far away or “the problem to be solved” by HC actors.
• Technology: no tools to store definitions of variables accepted across the industry (use caDSR, but UI not friendly); no tool to exchange variable definitions (possibility to use ODM?).
• External data standards: no agreement on terminology/code lists (e.g. MedDRA, SNOMED, LOINC, CDISC Vocab...); no “umbrella” organisation to decide on variable definitions across pharma and health care; CDASH ?????

3 Data collector needs to have a list of permissible values, data manager needs to use them to ensure consistency check


Storyboard 1

Detailed Storyboard

3.1.2.1 Create study specific database metadata file (empty study template; study “stub”) with unique identifier and description.

CMDR
• No requirements for CMDR here

3.1.2.2 Read the protocol and, from the Study Procedures and Timings table, define the Time and Events Schedule.
• Create Visits (“Study Events”) in Database Design Tool
• Examine Protocol Study Procedures to determine all Forms
• Create Forms in Database Design Tool
o Import existing forms from the “common to all” and “therapeutic specific” sections of the metadata repository
o Create new ones where content is not present (shell only)
• Generate Time and Events Schedule with Protocol Annotations
• Get Clinician to approve

CMDR
• Variable grouping. Predefined groups of variables may be linked together in CMDR to be used easily in data collection instruments. Lower-level groupings may be mandatory (e.g. SYSBP, SYSBP_UNIT). Higher-level groupings may be convenient (e.g. CRF Form or Domain). Although CMDR will define standard groups, users will often reorganize groups to match company or study-specific needs.
• Variable definition. CMDR variables can be incorporated into either horizontal or vertical structures, and should facilitate changes between these two forms.
• Concept definition. Each CMDR Form corresponds to a complex concept with precisely defined semantics for itself and all underlying variables.
• Concept definition. Each CMDR Form concept specifies a precise definition of the concept, with all the underlying variables and their related attributes.
• Concept definition. It is possible to define new complex concepts to be used as Forms through an on-line change request form.
• Concept search. It is possible to retrieve a Form concept in different ways, including by related therapeutic area/disease.
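The change between horizontal and vertical structures mentioned in the variable definition requirement can be sketched with plain dictionaries. This is a minimal illustration: the `TEST` and `VALUE` column names, the key fields and the sample record are assumptions chosen for the example.

```python
def to_vertical(row, key_fields, variables):
    """Turn one horizontal record into one vertical record per variable."""
    keys = {k: row[k] for k in key_fields}
    return [{**keys, "TEST": v, "VALUE": row[v]} for v in variables if v in row]


def to_horizontal(records, key_fields):
    """Regroup vertical records into one horizontal record per key combination."""
    rows = {}
    for rec in records:
        key = tuple(rec[k] for k in key_fields)
        row = rows.setdefault(key, {k: rec[k] for k in key_fields})
        row[rec["TEST"]] = rec["VALUE"]
    return list(rows.values())


wide = {"SUBJECT": "001", "VISIT": "Visit 1", "SYSBP": 120, "SYSBP_UNIT": "mmHg"}
tall = to_vertical(wide, ["SUBJECT", "VISIT"], ["SYSBP", "SYSBP_UNIT"])
for record in tall:
    print(record)
```

Because the variable name travels with each vertical record, the round trip back to the horizontal form loses nothing, which is the property the requirement asks the repository to support.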

3.1.2.3 Define new Forms
• Perform an Initial Submission (SDTM) Analysis to determine that every required SDTM variable has something in the database metadata that collects data from which it can be generated.
  o Domain designation may require consultation with FDA (e.g. repeated pregnancy: in a CDISC standard domain (SDTMIG) or a Therapeutic Specific (SDTM) domain)
  o For Therapeutic Specific domains, consult a “visualization” of existing Therapeutic domains and CDISC domains held in the metadata repository
• Form Content Analysis
  o Determine the Sections (“ItemGroups”) and structure of the form by reference to the protocol description of the data to be collected
  o Create Form Sections (“ItemGroups”)
  o Determine the Items and CodeLists
  o Create Items and CodeLists
CMDR
• Variable search. CMDR can be searched to identify all variables linked to a specific Data Domain (e.g. SDTM Standard or Therapeutic Area).
• Variable definition. Possibility to add new variables linked to a specific therapeutic domain. This includes definition/update of related code lists.
• Code list definition. CMDR needs to specify how tightly code lists are linked to specific variables. The tighter the linkage, the more unique code lists will be needed. Code lists should be flagged as to whether or not they can be reused/expanded.
• Concept definition. Possibility to define the structure of a new complex concept (e.g. a Form) as composed of lower-level concepts (e.g. Item Groups) which are linked to variables. Lower-level concepts/variables must exist before being included in a higher-level concept.
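The rule that lower-level concepts and variables must exist before they are included in a higher-level concept can be illustrated with a minimal sketch. The class name `ConceptRegistry`, its methods, and the names `SystolicBP_ItemGroup`, `VitalSigns_Form` and `BODY_POSITION` are hypothetical and chosen only for illustration; they are not part of any CMDR specification.

```python
class ConceptRegistry:
    """Toy registry (hypothetical): concepts may only reference existing components."""

    def __init__(self):
        self._variables = set()   # simple variables, e.g. SYSBP
        self._concepts = {}       # concept name -> list of component names

    def add_variable(self, name):
        self._variables.add(name)

    def add_concept(self, name, components):
        # Reject the concept if any component has not been registered yet.
        missing = [c for c in components
                   if c not in self._variables and c not in self._concepts]
        if missing:
            raise KeyError(f"components not yet defined: {missing}")
        self._concepts[name] = list(components)

    def variables_of(self, name):
        """Return the variables under a concept, in definition order."""
        if name in self._variables:
            return [name]
        leaves = []
        for component in self._concepts[name]:
            leaves.extend(self.variables_of(component))
        return leaves


registry = ConceptRegistry()
for variable in ("SYSBP", "SYSBP_UNIT", "BODY_POSITION"):
    registry.add_variable(variable)
# An Item Group is built from variables, and a Form from Item Groups and variables.
registry.add_concept("SystolicBP_ItemGroup", ["SYSBP", "SYSBP_UNIT"])
registry.add_concept("VitalSigns_Form", ["SystolicBP_ItemGroup", "BODY_POSITION"])
```

Walking a Form back down to its variables with `variables_of` is also one way a design tool could expand an imported Form into its individual items.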

3.1.2.4 Decide on Study Execution Medium
• Paper, Hybrid, EDC System 1 or 2 or 3 etc.
• Where appropriate, add proprietary target metadata to describe presentation, complex edit checks etc.
CMDR
• Metadata model. The CMDR metadata model should be conveniently extensible, e.g. to capture layout parameters that specify how Forms should appear in different eDC systems.

3.1.2.5 Review and Approve Database Design
• Create visualizations appropriate to reviewers’ roles: case report forms for clinicians, annotated case report form, tabular database specification for data managers, test specifications etc.
• Review and approve (cyclic).
• Finalize (“as specified”) metadata and begin system build.
CMDR
• No requirements for CMDR here.

3.1.2.6 Import Machine Readable Database Design into EDC/CDMS
• Automate build activity: auto generate CRFs and EDC/CDMS system.
• Manually build parts of EDC/CDMS system where appropriate.
• Export machine readable (“as built”) metadata from EDC/CDMS and begin testing.
CMDR
• Use in eDC system. Form complex concepts exported from CMDR would allow the data collection instrument to be built automatically. Completely specifying the database design would be out of scope for CMDR.

1. Testing
• Machine compare “as specified” vs. “as built” metadata.
• Perform manual user acceptance testing where appropriate.
• Repeat until approval.
• Go “live”.
CMDR
• No requirements for CMDR here.
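The machine comparison of “as specified” and “as built” metadata in the testing step can be sketched as a simple attribute-level diff. This is an illustrative sketch only: the function name `compare_metadata`, the dictionary layout, and the example items and attributes are assumptions, not a CMDR or ODM format.

```python
def compare_metadata(as_specified, as_built):
    """Return (item, attribute, specified_value, built_value) for every difference.

    Both arguments map an item name to a dict of attributes (hypothetical layout).
    An item missing on one side is reported with attribute None.
    """
    differences = []
    for item in sorted(set(as_specified) | set(as_built)):
        spec = as_specified.get(item)
        built = as_built.get(item)
        if spec is None or built is None:
            differences.append((item, None, spec, built))
            continue
        for attribute in sorted(set(spec) | set(built)):
            if spec.get(attribute) != built.get(attribute):
                differences.append(
                    (item, attribute, spec.get(attribute), built.get(attribute))
                )
    return differences


# Hypothetical example: the build used inches where centimetres were specified.
as_specified = {
    "SYSBP": {"datatype": "integer", "unit": "mmHg"},
    "HEIGHT": {"datatype": "float", "unit": "cm"},
}
as_built = {
    "SYSBP": {"datatype": "integer", "unit": "mmHg"},
    "HEIGHT": {"datatype": "float", "unit": "in"},
}
differences = compare_metadata(as_specified, as_built)
```

An empty result would correspond to the “repeat until approval” loop ending for the machine-checked part; manual user acceptance testing is not covered by such a check.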

Summary of needs (“as is process”)
• Needs to reduce the study-specific database design, review and approval process by:
  o Reusing content from the metadata repository and past studies where appropriate.
  o Utilizing a universal design tool that uses standardized metadata but supports extensions to standardized metadata to describe proprietary features of a target system.
  o Creating human readable visualizations automatically from machine readable metadata for multiple roles.
• Needs to reduce the study-specific CRF/EDC/CDMS build process by:
  o Using a design tool that can create proprietary EDC metadata and/or auto generate print ready CRFs, or
  o Choosing an EDC vendor with direct ODM import or extended ODM import
• Needs to automate the testing process by:
  o Autogenerating test specifications from machine readable metadata
  o Machine checking “as specified” vs. “as built” metadata
CMDR
• Use in eDC system. CMDR needs to download content in an open, machine readable test specification (e.g. ODM) to DB tools etc.


Storyboard 2

High level storyboard (“to be process”)

Akhil is the responsible data manager for PROVEIT, a new study. He needs to specify a data capture system, including eCRFs and data quality checks, for the study under an accelerated timeline.

In the past, Akhil would start with the protocol, and try to determine which concepts equated to CRFs similar to what had been done before and which concepts seemed to be new. He would then research new concepts to see how they could be collected in CRF form (including checking copyright and fees), and would then assemble new and existing CRFs into a package to be reviewed and revised by the study team.

Since the launch of the CMDR, however, protocols written using eProtocol Manager have become more common. In these protocols, much of this work of specifying data to be collected on the CRF has already been done – even to the extent of requesting new CMDR standards for certain items.

Akhil’s primary responsibilities, therefore, are:

Detailed Storyboards:
• 3.1.2.7 Check Data Collection for Consistency. To check whether the collected data is consistent and will integrate easily with other studies done by this study team. For example, if the team always collects height in cm, has that been consistently specified?
• 3.1.2.8 Specify Edit Checks. To specify study-specific range checks and cross-checks in the eDC system, including how and when these will be overridden (most of this detail will not exist in the CMDR).
• 3.1.2.9 Finalize CRF Flow, Layout, and Instructions. To organize the order, flow, and layout of the CRF questions. Also, to add any study-specific or additional instructions to the investigator on how to fill out the CRFs.
• 3.1.2.10 Ensure status of Collected/Non-Collected variables. To ensure that collected, prepopulated, and calculated variables have been correctly distinguished, so that prepopulated variables receive the correct values and neither prepopulated nor calculated items appear on the CRF.

Detailed storyboard

3.1.2.7 Check Data Collection for Consistency.
Akhil loads the output from eProtocol Manager into his eDC-build system. As he begins working on the study, Akhil realizes that Dr. Joe has mistakenly used variables commonly used in pediatric observational growth studies in PROVEIT (an oncology study). Akhil pulls up the CMDR and replaces the variables with the correct ones.
CMDR
• Variable search. It is possible to retrieve variables in different ways (name, ID, Question text, associated Code List, etc.).
• Variable definition. When variables are retrieved, CMDR provides the full description of the variables, including links to related concepts that give an unambiguous definition of the variables. When a variable is linked to several concepts, all concepts are provided. (Note: it is the study-specific meta-data registry with contextual info on studies which contains the details on which concept has been used for a specific variable in a specific study.)


3.1.2.8 Specify Edit Checks
As most edit checks in a study are specific to that study, Akhil must add these to the DC system. Several parameters exist in the CMDR template to hold edit checks, but these have been left blank in the library view. Akhil must now populate these according to standard rules. Fortunately, since the edits have a standard specification, he can pull from a separate eDC Library of Edit Checks to populate the necessary fields.
Study-specific meta-data registry with contextual info on studies
• Need to define study-specific edit checks per variable.
CMDR
• Variable definition. Access information about how the variables are intended to be used (e.g. alone or with different qualifiers; storage requirements). This is specified through the related concepts where the variable is used.
• Variable definition. Specify a subset of allowable responses (i.e. code list), or desired measurement units, for a given variable. Variables contain a specification of allowable responses applicable across their context of use in each specific concept. These allowable responses may still need to be refined for each study in a separate database. Examples of allowable responses based on concept/context:
  o The code list for “body position” would be different if used in the context of systolic blood pressure than for other cardiovascular tests.
  o The normal range of a lab test differs by patient population, e.g. diabetic versus normal population.
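The idea that allowable responses depend on the concept in which a variable is used, and may be narrowed again per study, can be sketched as a lookup keyed by variable and concept. Everything in this sketch is an assumption made for illustration: the table `CODE_LISTS`, the function `allowable_responses`, the concept name, and the listed values are hypothetical and are not actual CDISC controlled terminology.

```python
# Hypothetical code lists: (variable, concept) -> allowable responses.
# A concept of None holds the variable-level default.
CODE_LISTS = {
    ("BODY_POSITION", None): ["SUPINE", "SITTING", "STANDING", "PRONE"],
    ("BODY_POSITION", "SYSTOLIC_BLOOD_PRESSURE"): ["SUPINE", "SITTING", "STANDING"],
}


def allowable_responses(variable, concept=None, study_subset=None):
    """Resolve the code list for a variable in a concept, optionally refined per study."""
    # Fall back to the variable-level default when the concept has no specific list.
    values = CODE_LISTS.get((variable, concept), CODE_LISTS[(variable, None)])
    if study_subset is None:
        return list(values)
    # A study may only narrow the list, never add values outside it.
    unknown = sorted(set(study_subset) - set(values))
    if unknown:
        raise ValueError(f"not allowed for {variable} in {concept}: {unknown}")
    return [value for value in values if value in study_subset]
```

Keeping the study-level refinement as a separate argument mirrors the split described above: the context-dependent list lives in the repository, while the study-specific subset lives in a separate study database.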

3.1.2.9 Finalize CRF Flow, Layout, and Instructions.
Akhil notices that the CRF for Visit 2 runs to over 250 pages. He realizes this is because the entire concept of ECG measurements has been included. Akhil again opens the CMDR and replaces the ECG content with a single question: ‘Normal/Abnormal’.

Akhil also reviews an ‘Instructional’ view of the CRFs, in which company or team-specific instructions for data collection are tagged to collected fields. He also views several types of annotated CRFs to ensure that all fields are tagged as going to the appropriate domains/variables.

At this point, up against a deadline to complete the study build, the study team decides to add two new variables to the study, on the recommendation of the joint venture partner. Akhil searches the CMDR and realizes that both variables are non-standard. So he opens the form on the CMDR web page and submits requests for their consideration. A week later, Akhil receives the reply that one question, “Rate the patient’s satisfaction with care (high, medium, low)”, has been accepted as part of the QOL domain. The other, “How much would the subject pay to live disease-free?”, was not accepted. Akhil includes both items on the CRF at the insistence of the study team.
Database design tool
• Possibility to change optional items describing a variable (e.g. instructions, …). Note: it should not be possible to change required attributes from the CMDR (e.g. ID, label, data type, …).
Study-specific meta-data registry with contextual info on studies?
• The possibility to collect non-standard variables and process attribute information from those variables is critical, since non-standard variables occur in many studies and are often considered highly important (i.e. non-negotiable) by the study team.

CMDR
• Variable search. Possibility to search for variables with different criteria, including free text (which would search in the concept definition and in the variable label).
• Variable definition. Possibility to define a new variable by filling in the on-line “change request form”. This should be further processed by the “CDISC data standard steward”. When a new variable has been added it gets the status “Draft” and can be further accessed from the CMDR.
• Variable definition. Even if a variable has not been accepted in the CMDR, it should be kept with the “rejected” status.

3.1.2.10 Ensure status of Collected/Non-Collected variables.
Akhil runs queries to confirm that every variable with Origin=‘Collected’ DOES appear on the CRF, and that variables with an Origin other than ‘Collected’ DO NOT appear on the CRF. He also checks that the default values specified for any ‘Assigned’ variable match the study protocol specification. Finally, for any calculated variables that he is aware of, he checks that machine-readable instructions for doing the calculation are properly specified for use by downstream systems.
CMDR
• Variable search. Possibility to provide a list of variables and check specific attributes, for instance “origin”, “default values”, algorithm, …

Summary of needs (“to be process”)
• Choose the most appropriate set of standard variables whose attributes (format, type, definition) will adequately measure relevant aspects of the scientific/clinical concepts desired for a given clinical study.
  o Access information about how the variables are intended to be used (e.g. alone or with different qualifiers; storage requirements)
  o Specify a subset of allowable responses or desired measurement units for a given variable
  o When a variable does not exist in the standard, be able to construct a draft standard that can be used for the present study immediately, while at the same time minimizing new variable specification
• Construction of CRF/data collection modules which are medically and scientifically consistent across studies [I am not sure why this is needed – we will collect different data across studies and we may choose to have eCRF modules that combine different data from one study to another], from data collection onward 4
• Readily swap variables in and out of the planned study data collection system and visualize immediately the new proposed collected variables.
• Need to be able to easily check the quality of the proposed data collection system at the finest level of detail.
• Need to be able to quickly correct errors in the eDC system
• Need to get rapid turnaround on requests for new standards
• Need to be able to include non-standard items in the final product
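The Origin check that Akhil runs in 3.1.2.10 can be sketched as two set comparisons between the variable metadata and the fields present on the CRF. This is a minimal sketch under stated assumptions: the function `check_crf_fields`, the metadata layout, and the example variables are hypothetical; only the Origin values ‘Collected’ and ‘Assigned’ come from the storyboard.

```python
def check_crf_fields(variables, crf_fields):
    """Compare variable origins with the fields actually placed on the CRF.

    variables:  variable name -> {"origin": ...}  (hypothetical layout)
    crf_fields: set of variable names present on the CRF
    Returns (missing, unexpected):
      missing    - 'Collected' variables that are absent from the CRF
      unexpected - non-'Collected' variables that appear on the CRF
    """
    missing = sorted(
        name for name, meta in variables.items()
        if meta["origin"] == "Collected" and name not in crf_fields
    )
    unexpected = sorted(
        name for name, meta in variables.items()
        if meta["origin"] != "Collected" and name in crf_fields
    )
    return missing, unexpected


# Hypothetical study metadata and CRF content.
variables = {
    "SYSBP": {"origin": "Collected"},
    "HEIGHT": {"origin": "Collected"},
    "STUDYID": {"origin": "Assigned"},
    "BMI": {"origin": "Derived"},
}
crf_fields = {"SYSBP", "BMI"}
missing, unexpected = check_crf_fields(variables, crf_fields)
```

Both lists being empty corresponds to the state Akhil is checking for; the check on default values of ‘Assigned’ variables would need the protocol specification as a further input and is not shown.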

4 Data collector needs to have a list of permissible values, data manager needs to use them to ensure consistency check


3.1.3 Analysis Dataset Creation (single trial)


3.1.4 Analysis/Reporting (single and multiple trial)


3.1.5 Data “curator”/ database integrator/ data miner

High level requirement
• Be able to view, at a detailed level, information about what variables were collected in a study without having to reference the protocol, and be able to integrate easily studies which have collected the same information.

Problem to solve (+ root cause analysis)
[Cause-and-effect diagram; layout not recoverable. Cause categories: Process, People, Technology, External data standards.]
Causes and problems named in the diagram:
• No process to enforce collection of meta-data when collecting data
• No process to ensure consistency in data collection across studies
• Additional burden on data collection team with benefits to other people
• Growing mindset of the need of secondary use of data
• Protocol Team focus on ONE protocol and overlook need for data integration for submission (ISSE and ISE) and further data mining
• No consistency in the way variables are used across studies
• No information on how variables were linked together in data collection
• No tools to store relationship between variables at a conceptual level (e.g. SYSBP may be collected with site and position)
• Data collection tools do not support collection of meta-data
• CDISC SDTM has limitations and scope is only clinical safety
• Different standards (CDISC, SEND, HL7) require mapping
• No agreement on terminology/code list in R&D
• BRIDG is the conceptual model ??? linking variables across standards

High level storyboard 1

Following a merger of two companies, Jie-won has been asked to build an integrated database of all studies which have collected data on either of 2 similar ZZZ-inhibitor compounds. This database will be used to reply to regulatory questions about adverse reactions in this class of compounds, to investigate the possibility of a new indication for reducing color-blindness, and for exploratory analyses. In the past, Jie-won would have collected the protocols, datasets, and data definition documents for all studies being considered for integration and spent several weeks or months determining how the studies should be combined. Now, however, Jie-won is leveraging the fact that many of these studies conform to the CMDR to make the task easier. Jie-won needs to accomplish the following tasks:

Detailed Storyboards
• 3.1.5.1 Deciding which studies are final candidates for integration
• 3.1.5.2 Determining the mappings between similar variables
• 3.1.5.3 Prepare and integrate the analysis datasets (e.g. ADaM)
• 3.1.5.4 Programming any data selections/subgroups

Detailed storyboard 1

3.1.5.1 Deciding which studies are final candidates for integration
In order to determine which studies are suitable candidates for integration, a list of attributes would have to be created along with criteria such that, if a study meets the criteria, it would be integrated in the combined database. In the past, just coming to agreement on criteria would have taken several weeks of meetings. However, with access to the CMDR and the study-specific meta-data registry, Jie-won is able to access the Study Metadata model and propose a list of 15-20 ‘mandatory’ attributes to the study team, with a further suggestion that certain criteria be required to match exactly between studies, while other criteria need not be an exact match. As a result, the criteria for integrating studies are agreed upon after only a single meeting.

Having agreed on criteria, Jie-won would now normally spend several weeks reading each protocol and extracting necessary attributes into a spreadsheet, including study design, patient population, treatment groups, length of the study, primary objective, any unusual inclusion or exclusion criteria, and finally whether the study collected any of a list of desired safety or efficacy parameters. However, since 19 of the 24 studies under consideration have followed the CMDR format, for all these studies the metadata information is immediately downloadable from the study databases. Jie-won only has to read in detail the protocols of the 5 studies not following the CMDR format. Jie-won decides not to enter the information for the 5 studies into a spreadsheet but instead back into the study meta-databases, ensuring that any future efforts to integrate these studies will not run into the same problems.

The study team agrees that 20 of the 24 studies should be selected for inclusion in the integrated database.
Sponsor’s study-specific meta-data registry with contextual info
• Contains a description of all the variables and related concepts used in a study, with study-specific information whenever applicable; these variables and concepts are compliant with CMDR
• Possibility to access CMDR => possibility to combine concepts (and variables) into an unambiguous selection/search
• Search for studies which meet the specific selection criteria (complex concept) and produce a list of relevant studies/protocols
CMDR Requirement (contains the definition of the variables and concepts, valid ACROSS ALL STUDIES)
• Concept search. Possibility to search complex concepts (that can serve as study selection criteria) and get their definition. Note: a complex concept can be composed of lower-level concepts (and the variables), which are optional.
• Concept definition. Possibility to specify and request a new complex concept (that can serve as study selection criteria) which can be immediately used in “draft” status. The request must be sent to and managed by the “data standard steward” (see related storyboard).
• Variable definition. Possibility to specify and request a new variable and to link it to an existing concept (including mapping/transformation).
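The selection step in 3.1.5.1, where some criteria must match exactly and others need not, can be sketched as a filter over study-level metadata. This is an illustrative sketch, assuming a hypothetical function `select_studies` and hypothetical attribute names and study identifiers; it is not the Study Metadata model itself.

```python
def select_studies(studies, exact, flexible):
    """Return the ids of studies meeting all integration criteria.

    studies:  study id -> dict of study-level attributes (hypothetical layout)
    exact:    attribute -> value that must match exactly
    flexible: attribute -> predicate; the attribute must exist and satisfy it
    """
    selected = []
    for study_id, attributes in studies.items():
        exact_ok = all(attributes.get(k) == v for k, v in exact.items())
        flexible_ok = all(
            k in attributes and predicate(attributes[k])
            for k, predicate in flexible.items()
        )
        if exact_ok and flexible_ok:
            selected.append(study_id)
    return sorted(selected)


# Hypothetical study metadata downloaded from the study meta-databases.
studies = {
    "STUDY-001": {"design": "parallel", "duration_weeks": 12, "collects_sysbp": True},
    "STUDY-002": {"design": "crossover", "duration_weeks": 24, "collects_sysbp": True},
    "STUDY-003": {"design": "parallel", "duration_weeks": 4, "collects_sysbp": True},
}
exact = {"design": "parallel", "collects_sysbp": True}
flexible = {"duration_weeks": lambda weeks: weeks >= 8}
selected = select_studies(studies, exact, flexible)
```

For the studies that do not follow the CMDR format, the same filter only becomes usable once their attributes have been entered into the study meta-databases, as Jie-won chooses to do.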

3.1.5.2 Determining the mappings between similar variables
In the past, Jie-won would have collected the SAS or ORACLE datasets for each study along with their accompanying data definition documents whenever they were available (if not available, Jie-won would have had to create a data “dump” of the variable names and labels for each dataset). Jie-won would then have tried to determine which datasets contained similar information based on the names of the datasets and the names of the variables. For example, she might decide to investigate whether DEMO, DM, and SUBJINFO all contain similar demographic information; and whether PATID, PATNO, and SUBJID might all be identifiers of the patient ID. Jie-won would then have constructed another series of spreadsheets, indicating which datasets and variables for which studies seemed to contain similar information. For the studies following the CMDR format, though, this process is now much simpler.

All variables that need to be integrated, regardless of the dataset on which they have been stored and whether horizontally or vertically, have identical variable names. So Jie-won has no trouble determining whether a study collected the desired information. In several cases, where it did not appear that a study collected a particular variable, Jie-won has been able to map the desired variable “up” to its higher-level concept and then determine whether that concept had been collected in the study. This has led her to discover several alternate versions of variables that might otherwise have been difficult to discover (one reason being that variables in datasets are often sorted alphabetically, and related variables do not always have similar names). For the 5 studies which do not conform to CMDR, Jie-won again has a simpler task, with the studies which match the CMDR forming the target. Jie-won need therefore only determine the map between the non-conforming elements and that target.
Sponsor study-specific meta-data registry with contextual info
• Variable traceability. Traceability of definition: access to the definition version that was used for data collection.
CMDR Requirement.
• Variable search. When a user is not sure of the meaning of a variable, he can access the CMDR, which will provide:
  o The related concept, with a clear description of what the variable means
  o Potential synonym variables also used to store the same concept
  o The “gold standard variable” that should be used when collecting information about the concept
• Variable definition. A variable may be linked to a subset of allowable responses (i.e. code list). The allowable responses are applicable across their context of use in each specific concept. Example of allowable responses based on concept/context: the code list for “body position” would be different if used in the context of systolic blood pressure than for other cardiovascular tests.
• Variable definition. (… a concept or a variable is always linked to the same value set)
• Variable definition. A variable definition (and concept definition) is version controlled.
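The synonym and “gold standard variable” lookup described above can be sketched as a rename map from legacy variable names to the preferred one. The variable names PATID, PATNO and SUBJID come from the storyboard; treating SUBJID as the gold standard, the concept name `SUBJECT_IDENTIFIER`, and the function `build_rename_map` are assumptions made only for this illustration.

```python
# Hypothetical synonym table: concept -> (gold standard variable, known synonyms).
# Choosing SUBJID as the gold standard is an assumption for this sketch.
SYNONYMS = {
    "SUBJECT_IDENTIFIER": ("SUBJID", {"PATID", "PATNO", "SUBJID"}),
}


def build_rename_map(dataset_columns, synonyms):
    """Map each synonym column found in a dataset to its gold standard name."""
    rename = {}
    for _concept, (gold, names) in synonyms.items():
        for column in dataset_columns:
            if column in names and column != gold:
                rename[column] = gold
    return rename


# A non-conforming study that stores the patient identifier as PATNO.
legacy_columns = ["PATNO", "AGE", "SEX"]
rename_map = build_rename_map(legacy_columns, SYNONYMS)
```

With such a map, the work for the 5 non-conforming studies reduces to maintaining the synonym table, since the CMDR-conforming studies already define the target names.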

3.1.5.3 Prepare and integrate the analysis datasets (e.g. ADaM)
Normally, the next phase of the project would have involved writing an extensive program that would read each of the existing datasets, input each of the existing variables, transform the variables where necessary into the same format, transform the values where necessary into the same value set or range, and then output new datasets with new variable names. With the CMDR, though, Jie-won has several alternative means of accomplishing the task. Jie-won decides that, for this integration, the simplest approach is to access the CMDR-specified data collection files (i.e. before vertical stacking and assigning to datasets for analysis). In this format, she need not worry about what dataset a variable was eventually mapped to. In fact, it is only critical that she line up the time points between the various studies appropriately. Jie-won then applies a standard data transform to convert the CMDR-conforming collected data into a CMDR-based analysis dataset standard, for use as the final reporting database for the integration. The only programming needed is to transform related variables referring to the same concept into a single format. The 5 non-conforming studies must be mapped using more traditional, labor-intensive, ad-hoc maps.
CMDR Requirement.
• Concept definition. Standard data sets (such as SDTM, ADaM, …) are complex concepts which allow the different underlying variables to be related/grouped.
• Variable definition. Whenever relevant, the variable definition should contain a definition of the mapping/transformation to the other variables sharing the same concept/meaning (e.g. transformation from one unit to another, transformation of a date/time format into a standard one). In a first step this could be free text; in a later phase it could be executable code that can be used directly. Note:
  o In case CMDR standards have NOT been applied correctly, the problems/workload for data integration remain the same as today.
  o Complex transformations across several variables may be more difficult.
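The later-phase idea of storing a transformation as executable code, for example a conversion from one unit to another, can be sketched as a small lookup of conversion functions. The table `TRANSFORMS` and the function `convert` are hypothetical names for this sketch; the conversion factors themselves (1 in = 2.54 cm, 1 lb = 0.45359237 kg) are the standard exact definitions.

```python
# Hypothetical transformation table: (from_unit, to_unit) -> conversion function.
TRANSFORMS = {
    ("in", "cm"): lambda value: value * 2.54,
    ("cm", "in"): lambda value: value / 2.54,
    ("lb", "kg"): lambda value: value * 0.45359237,
    ("kg", "lb"): lambda value: value / 0.45359237,
}


def convert(value, from_unit, to_unit):
    """Convert a value between units using the registered transformation."""
    if from_unit == to_unit:
        return value
    try:
        transform = TRANSFORMS[(from_unit, to_unit)]
    except KeyError:
        raise ValueError(f"no transformation from {from_unit} to {to_unit}") from None
    return transform(value)


# Harmonize heights recorded in different units to centimetres.
heights = [(70.0, "in"), (180.0, "cm")]
heights_cm = [convert(value, unit, "cm") for value, unit in heights]
```

A missing entry raises an error instead of passing the value through, which matches the note above: where no transformation has been defined, the integration work remains manual.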

3.1.5.4 Programming any data selections/subgroups
A specific requirement of the integration is that data from patients taking antihypertensives 2 weeks before an observation should be excluded. In the past, Jie-won would have had to print out lists of all concomitant medications, and consult either a physician or ATC codes to program the necessary data exclusion.

With the CMDR, however, the concept of ‘antihypertensive intervention’ is well-defined, and so excluding those observations from the final integration is more straightforward.
CMDR Requirement.
• Same as described under 3.1.5.1, i.e. possibility to specify a complex concept (related to specific variables); this would then be transformed into a specific query (within SAS or Oracle) to select (or, in this case, exclude) some records.
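The exclusion rule can be sketched as a filter that drops an observation when the subject took any drug belonging to the concept during the two weeks before it. This is a sketch under stated assumptions: the storyboard mentions a query within SAS or Oracle, whereas Python is used here only for illustration; the record layouts, the drug name `DRUG_A`, and the reading of “2 weeks before” as any overlap with the 14-day window ending on the observation date are all hypothetical.

```python
from datetime import date, timedelta


def exclude_observations(observations, medications, concept_members, window_days=14):
    """Keep only observations with no concept medication in the preceding window."""
    kept = []
    for obs in observations:
        window_start = obs["date"] - timedelta(days=window_days)
        # Exposed if a medication interval overlaps [window_start, observation date].
        exposed = any(
            med["subject"] == obs["subject"]
            and med["drug"] in concept_members
            and med["start"] <= obs["date"]
            and med["end"] >= window_start
            for med in medications
        )
        if not exposed:
            kept.append(obs)
    return kept


# Hypothetical members of the 'antihypertensive intervention' concept.
ANTIHYPERTENSIVES = {"DRUG_A"}
observations = [
    {"subject": "S1", "date": date(2009, 6, 15)},
    {"subject": "S2", "date": date(2009, 6, 15)},
]
medications = [
    {"subject": "S1", "drug": "DRUG_A", "start": date(2009, 6, 5), "end": date(2009, 6, 10)},
    {"subject": "S2", "drug": "DRUG_A", "start": date(2009, 1, 1), "end": date(2009, 2, 1)},
]
kept = exclude_observations(observations, medications, ANTIHYPERTENSIVES)
```

The set of concept members is the part the CMDR would supply; the window length and the overlap rule remain analysis decisions made by the data integrator.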

High level storyboard 2

Fred has been asked to do a business critical analysis, the context of which may be broader than a single indication or compound and may involve either or both safety and efficacy data. Fred was not involved in the development or reporting of any of the studies likely to be included in the analysis. There may, or may not, be a requirement for the analysis to be consistent with previous analyses of these studies (endpoint definitions, time windows etc.), e.g. repeating the analysis for a sub-population of the full set of subjects.

The analysis requires the following detailed storyboards:
• 3.1.5.5 Identification of the studies (and, on occasions, subjects within studies) for inclusion in the analysis
• 3.1.5.6 Identification of the specific variables and assessments
• 3.1.5.7 Development of a full understanding of what was collected in those studies

Detailed storyboard 2

3.1.5.5 Identification of the studies (and, on occasions, subjects within studies) for inclusion in the analysis
• Step 1 is to decide on the subject population in which Fred is interested: maybe certain indications, or certain compounds; maybe treated subjects or untreated (placebo) subjects. Fred does this primarily on the basis of his business question, but the availability of data may influence how Fred plans to address his business question. Data availability is covered under step 2.
• Step 2 is to choose the studies; on occasions, there may be a need to identify subsets of subjects within those studies.
  o Choosing studies requires information about the studies. Some of this comes from TDM datasets, e.g. study population, study design, compounds under investigation, doses, duration, indication etc. The remainder comes from an evaluation of the data collected in the studies. Fred identifies the concepts used in each of the studies by tracking back from the variables contained in the datasets. This will require either that variables map to a single concept, or that there is metadata at the study level that points to the concept. The concepts and TDM information may be sufficient for Fred to choose the studies. But in some cases, Fred will need to review the variables themselves in order to check that the necessary information is collected.
  o Choosing subjects within studies requires information about inclusion/exclusion criteria, maybe baseline values (safety or efficacy), maybe medical history etc. A review of the concepts included in each study helps to identify which variables to review. The data held in these variables will be reviewed to identify which subjects to include in the analysis.

CMDR Requirement.
• Concept search. It is possible to retrieve a (complex) concept (that can be used as a study and/or patient population selection criterion):
  o From a higher-level complex concept, and then selecting a lower-level concept
  o By providing some free text (“business question”) that would provide a match to an existing concept
  Note: the actual criteria will be a combination of the concept/variable with a specific value (e.g. age below 20 years). CMDR provides the definition of the concept/variable (e.g. age); the data integrator provides the value (e.g. below 20 years).
• Variable definition. It is possible to retrieve the meaning of a variable by tracing back to the related simple concept, and potentially to related complex concepts.
• Concept definition. It is possible to identify all the variables related to a specific (simple or complex) concept; this can be used by a dedicated search engine to define one or more database “synonyms” queries.

3.1.5.6 Identification of the specific variables and assessments
This step is the most time consuming. From step 2, Fred knows which studies he wants to pull data from in order to answer his business question. Now he has to choose his datasets, and the variables within those datasets. A review of the concepts used in the individual datasets will facilitate the identification of the variables that will be used. Different studies may have used different datasets, but this is not a problem for Fred, as the variables are what he needs, not the datasets. Fred pulls all the variables into his own dataset(s).
CMDR Requirement.
• Concept definition. It is possible to identify all the variables related to a specific (simple or complex) concept (which could be the concept of a standard analysis dataset) and their potential synonyms across different studies; this can be used by a dedicated search engine to define one or more database “synonyms” queries.

3.1.5.7 Development of a full understanding of what was collected in those studies
In some cases, Fred may want to be consistent in approach with the original study/submission/aggregation analyses. Typically, this will mean that Fred needs access to the previously generated analysis datasets containing CMDR defined variables. An alternative, not requiring CMDR derived variables, would be ADaM datasets, but these would be less suitable as the variables would not be as standardized. Whichever is the case, these datasets, the variables and the accompanying metadata provide all the information Fred needs to check that algorithms and methodology are consistent across studies.
CMDR Requirement.
• (no requirement identified)

Key points
Note: Use of CMDR in this storyboard will not save much time if the underlying data sets are not compliant/consistent with the CMDR (i.e. all variables used in the studies must be registered in the CMDR).

Best practices/meta-data management for the Sponsor’s specific meta-data registry with context info
• Standard, programmatically accessible meta-data structures across all studies, so that the intricacies of each dataset do not have to be worked out one by one
• Data/information to be unambiguous, with all information required to interpret the collected and derived data and to determine data quality
  o For example, when looking at a systolic blood pressure measurement, it is important to know whether it was collected supine or standing, whether it was collected at rest or after exercise, whether the subject is a healthy volunteer or a hypertensive patient, whether the subject is young or old, whether the subject was on or off treatment, etc.
  o The exact circumstances of data collection (phrasing of question, instructions, methods, tools, …)
  o The parameters of the measuring device involved that influence the measurement

Best practices/meta-data management for CMDR
• A generic “collected data” format makes integration easier, before variables are stacked and assigned to SDTM or analysis-style datasets
• Datasets and variables to be well documented so there is no question which datasets and variables are needed
• Easy access from variables “up” to concepts and back down again
• Full set of terminology for all variables
• Concepts and associated grouped variables robustly defined so that they are used identically in different implementations
• Groups of variables that can be extracted from one set of datasets and used in a new configuration without changing the meaning of those variables
• A documented default for a concept that can be modified by use of optional variables. For example, the documented default for the concept “Systolic blood pressure” should be “Systolic blood pressure at rest”. If the measurement was taken “after exercise”, there would be optional variable(s) indicating this. The benefit is that the standard scenario does not require the collection of what the study person would see as unnecessary variables, e.g. “At rest (Y/N)?”. Note that different subject populations (e.g. elderly, hypertensive) would be handled via subject concept(s).
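The “documented default plus optional variables” practice can be illustrated with a minimal Python sketch. The class name `Concept`, the method `resolve` and the qualifier key `condition` are hypothetical names chosen for this illustration; they are not part of any CMDR specification.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept whose optional qualifiers carry documented default values."""
    name: str
    defaults: dict = field(default_factory=dict)  # qualifier -> documented default

    def resolve(self, collected=None):
        """Return the full context of a measurement: the documented defaults,
        overridden by whichever optional variables were actually collected."""
        collected = collected or {}
        unknown = set(collected) - set(self.defaults)
        if unknown:
            raise KeyError(f"not optional variables of {self.name!r}: {sorted(unknown)}")
        return {**self.defaults, **collected}


# Assumed example: the only documented default is "at rest", as in the text above.
sbp = Concept("Systolic blood pressure", defaults={"condition": "at rest"})
standard = sbp.resolve()                                  # nothing extra collected
exercise = sbp.resolve({"condition": "after exercise"})   # optional variable present
```

In the standard scenario no qualifier has to be collected at all, which is the benefit described above; the optional variable only appears when the measurement deviates from the default.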


3.1.6 Application developer (clinical application developer,…)


3.1.7 Document manager / medical writer (not in scope of pilot – to be developed in a later phase)


3.2 Regulators and Health Care authorities

3.2.1 FDA clinical data reviewers (“view extractors”) [5]
The CDISC MDR is a critical component for instantiating the HL7 CDISC content message in a consistent way across the industry. It should be used at the sponsor site to build the HL7 message, and at the FDA site to build the different views.

[Figure: HL7 CDISC content message exchange between SPONSOR and FDA. Both sides show SDTM, ADaM and View 1 … View n, plus a local MDR (Sponsor MDR, FDA MDR) holding a copy of CMDR content from the CDISC MDR. Other labels: HL7 CDISC Content Message definition; Clinical Data Repository; instances of message with data; JANUS.]

3.2.2 Other – non clinical trial – submission of structured data
This would include the following:
  o Structured protocol submission to FDA, WHO, EMEA… and potentially other authorities
  o Product identification submission to EMEA, FDA, …
  o ….

(not in scope of pilot – to be developed later)

[5] This section builds upon the storyboard defined for the “Study data HL7 CDISC content message”; see http://wiki.hl7.org/index.php?title=Subject_Data_Story_Boards


3.3 CDISC MDR supporting staff

3.3.1 Data standards definition staff (across SDOs such as CDISC, HL7,..)


3.3.2 CDISC MDR “steward”
High level requirement
This is for the people actually maintaining the CMDR with consistent content.
• Ensure compliant/consistent information is provided across the chain to support the aforementioned needs
• (Note: ensuring conformance to the standards is not a requirement for this role)

Problem to solve (+ root cause analysis)
Ensure quality in the CDISC MDR in an environment where there are inconsistent and sometimes conflicting definitions of concepts and variables.
Root cause branches: Process, People, Technology, External data standards
• No FORMAL process for sharing data standards content across industry
• No certification authority of cross industry standard content
• Change in mindset: data are a critical asset; data standards are not a competitive advantage and should be shared across industry
• No tools to support sharing of standard content across organisation
• No tools to support storage and sharing of standard content across organisations
• ODM could be used to support import/export within CMDR

High level storyboard
Dan and Carl are data standard curators within CMDR. As data curators, they know the content of CMDR quite well. They have been trained in the underlying information model/ontology of CMDR, but they are not domain experts and cannot take a final decision on medical/scientific content; they can issue recommendations, and the CMDR Governance Committee takes the final decisions. Today each of them received a different request to update the CMDR:
• Dan received requests in the predefined form
• Carl received an ODM file from Company ConceptCo, which seems quite complete, with variable and concept descriptions
• Carl also received an ODM file from Company VariableCo, which seems to contain only variable descriptions

Dan received forms with change requests: one to add a set of new concepts from Dr Joe (the protocol author), and another to add a set of new variables from Akhil (the eCRF developer). Dr Joe and Akhil correctly filled in the pre-defined form, available on the CMDR web-site, with interactive consistency checking. Dan can therefore process the content directly:
• 1.1.1.1 Request new concept
• 1.1.1.2 Request new variable and new variable/concept link
• 1.1.1.3 Send feedback to the requestor

To process the information received in the files, Carl works through the following steps:
• 1.1.1.4 Upload ODM file in CMDR. This first step provides a report.
• The information contained in the file must be processed differently based on the quality of the content:

  Concepts (columns: complete/correct syntactical description | incomplete/incorrect description)
    concept with a match in CMDR: 1.1.1.5 Check existing concept
    concept with no match in CMDR: 1.1.1.6 Check new concept

  Variables (columns: complete/correct syntactical description | incomplete/incorrect description)
    rows: Variables with a link to a concept; Variables with no link to a concept
    step: 1.1.1.7 Check gold standard variable

• 1.1.1.8 Send gold standard variables in ODM file to Company ConceptCo and VariableCo

Detailed storyboard
1.1.1.1 Request new concept
Dan checks the requests for new concepts from Dr Joe, the protocol author. There are several requests:
• Three simple concepts: patient ethnicity, patient skin type and IGFBP3 (a new lab test)
• A complex concept, which is a new domain for the CV therapeutic area, called “CV domain anti hypertension with renal failure”. This complex concept is composed of a group of existing concepts in the CMDR.

(Patient ethnicity: request can be linked to an existing concept.) For patient ethnicity, Dr Joe mentioned in the “rationale” for the request that he could not use the concept race, which is semantically very close to ethnicity but still different. As ethnicity is a concept with a limited set of permissible values, Dr Joe proposes to link it with an existing code list of permissible values. Dan checks the underlying information model for a concept close to ethnicity. He finds the concept “ethnicGroup”, which is a property of the object person. As a person can be an investigator or a patient, this concept should be valid for “patient ethnicity”. Dan does not change anything in the CMDR but enters his comments in the form, mentioning that the concept Dr Joe wants already exists under “person.ethnicGroup”.
CMDR requirement
• Possibility to navigate CMDR to search for a concept based on concept name or concept description, or by navigating through the ontology …..
• Need to update a change request form
• Audit trail on management of change requests

(Patient skin type: request for a new concept as a property in the information model.) For patient skin type, Dr Joe provided the rationale and the set of permissible values, as he could not find an adequate code list in CMDR. Dan checks the underlying information model for skin type and cannot find any relevant concept. He thinks that it should be a new concept, as a property of the object Person in the information model. Consequently:
• He introduces the new concept into CMDR, as a property of Person, with all needed information, including the related code list
• He marks the new concept and the related code list as “draft” so that the CMDR Governance Committee will be notified
  o Need to confirm existence of this committee
  o Need storyboard to outline functioning of this committee and related CMDR requirements
• The CMDR Governance Committee reviews the proposal from Dan and accepts or rejects it. In case of rejection, an alternative proposal needs to be provided
• He introduces the status draft in the change request form
CMDR requirement
• Possibility to navigate through existing code lists
• Possibility to enter a new concept into the ontology, as a new property of an existing object, with a status related to life cycle management
  o Draft
  o Under review by the CMDR Governance Committee
  o Approved
  o Rejected, in which case a rationale needs to be provided
  o Retired, in which case a rationale needs to be provided and, whenever relevant, a link to the replacing concept
• Version control on concept
• Possibility to enter a new code list with a status related to life cycle management
• Audit trail on updates to the CMDR
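The life cycle statuses listed above could be modelled as a small state machine. The following Python sketch is an illustration only: the storyboard names the statuses and the rationale rule, but the allowed transitions in `ALLOWED`, the class `ConceptEntry` and the concept name `Person.skinType` are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "Draft"
    UNDER_REVIEW = "Under review by the CMDR Governance Committee"
    APPROVED = "Approved"
    REJECTED = "Rejected"
    RETIRED = "Retired"


# Assumption: the storyboard lists the statuses but not the permitted moves.
ALLOWED = {
    Status.DRAFT: {Status.UNDER_REVIEW},
    Status.UNDER_REVIEW: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.RETIRED},
    Status.REJECTED: set(),
    Status.RETIRED: set(),
}
# From the requirement: rejection and retirement need a rationale.
NEEDS_RATIONALE = {Status.REJECTED, Status.RETIRED}


@dataclass
class ConceptEntry:
    name: str
    status: Status = Status.DRAFT
    replaced_by: Optional[str] = None           # link to the replacing concept, if any
    audit_trail: list = field(default_factory=list)

    def move_to(self, new_status, rationale=None, replaced_by=None):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.status.value} -> {new_status.value} is not allowed")
        if new_status in NEEDS_RATIONALE and not rationale:
            raise ValueError(f"a rationale is required for status {new_status.value}")
        # Every status change is recorded, which gives a simple audit trail.
        self.audit_trail.append((self.status, new_status, rationale))
        self.status = new_status
        if new_status is Status.RETIRED:
            self.replaced_by = replaced_by


skin_type = ConceptEntry("Person.skinType")   # hypothetical concept name
skin_type.move_to(Status.UNDER_REVIEW)
skin_type.move_to(Status.APPROVED)
```

Keeping the transition table as data rather than code means the Governance Committee's eventual rules could replace the assumed ones without touching `move_to`.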

(IGFBP3: request for a new concept as an instance of an object.) For IGFBP3, Dr Joe explained what this new test is about and in which pathology it is used. He also explained that this new test is related to IGF1, a test already existing within CMDR, but is different. He also specified that this is a numeric test; he therefore provided the measurement unit (one of the units already defined in CMDR), the possible range, as well as the precision. Dan knows that measurements are instances of an object called “clinical measurement”. He also finds the instance describing IGF1. He agrees that IGFBP3 should be a new concept, as an instance of the object “clinical measurement” which is also a specialization of the IGF1 concept. Consequently he:
• introduces the new concept into CMDR, as a new instance of “clinical measurement”, with all related properties (i.e. unit, range, precision), as well as a specialization link to IGF1
• marks the new concept and the related code list as “draft” for the CMDR Governance Committee
• introduces the status of draft in the change request form
CMDR requirement
Same as before, with a different way of registering a new concept (as a new instance of an existing object)

(New domain for CV anti hypertension with renal failure: request for a new complex concept composed of existing concepts.) Dr Joe requested the introduction of a new domain for the CV therapeutic area. This is a complex concept. Dr Joe also defined the different concepts belonging to this new complex concept. While navigating through the CMDR, Dan realizes that this new domain is very similar to the existing complex concept called “CV anti hypertension”; there is just one additional concept (creatinine measurement) in the new concept. Dan is not sure what the best approach is and so forwards the request to the CMDR Governance Committee for a decision. The committee may decide to:
• Introduce a new complex concept linked to “CV anti hypertension” as a specialization
• Make the existing concept “CV anti hypertension” obsolete and create a new related concept called “CV anti hypertension with renal failure”
• Reject/defer the request to see if there are more requests for such a concept, in order to avoid a proliferation of concepts
The committee finally opts to reject/defer, but instructs Dan to register the request so that, if other requests for the same complex domain are made, the committee can reconsider its decision at a later stage. Dan also enters his comments into the change request form.
CMDR requirement
• Possibility to navigate through complex concepts in the ontology
• Possibility to forward a request for a complex concept to the CMDR Governance Committee
• Possibility for the steward to judge whether or not a request is complete, and to reject the request and ask for additional information if the request is not complete or the steward does not understand it
• Possibility to add complex concepts (with life cycle management and version control)
• Audit trail on change requests, with the possibility to have statistics on rejected concepts
• Need to fill in the relevant part of the change request form
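Dan's observation that the requested domain differs from an existing complex concept by a single member is essentially a set comparison. A minimal Python sketch follows; the function name `compare_complex_concepts` is hypothetical, and the blood pressure members of “CV anti hypertension” are invented for illustration, since the storyboard only names the additional concept (creatinine measurement).

```python
def compare_complex_concepts(requested, existing):
    """Compare the member concepts of a requested complex concept with every
    existing complex concept, listing shared and differing members."""
    report = []
    for name, members in existing.items():
        report.append({
            "existing": name,
            "shared": sorted(requested & members),
            "only_in_request": sorted(requested - members),
            "only_in_existing": sorted(members - requested),
        })
    # Fewest differences first, so near-duplicates surface at the top.
    report.sort(key=lambda r: len(r["only_in_request"]) + len(r["only_in_existing"]))
    return report


# Illustrative membership only; not taken from the CMDR.
existing = {
    "CV anti hypertension": {"systolic blood pressure", "diastolic blood pressure"},
}
requested = existing["CV anti hypertension"] | {"creatinine measurement"}
report = compare_complex_concepts(requested, existing)
```

A report of this kind could accompany the request forwarded to the Governance Committee, making the overlap with existing complex concepts explicit before a decision is taken.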

1.1.1.2 Request new variable and new variable/concept link
Whenever Dan has checked/entered a concept, he also checks for a gold standard variable name:
• For patient ethnicity, an existing concept with an existing gold standard variable, there is nothing to change.
• For patient skin type, a new concept related to an object property, there is only one variable. Dan proposes a new gold standard variable following the agreed naming conventions for variables. The variable is registered with status draft (for further review by the CMDR Governance Committee) and with a link to the newly created concept.
• For IGFBP3, a new concept related to an instance of an object with 4 properties (value, unit, range, precision), Dan proposes 4 new gold standard variables following the agreed naming conventions. The variables are registered with status draft (for further review by the CMDR Governance Committee) and all 4 are linked to the newly created concept.
• For the CV domain, which is a complex concept, there is no need to link it to a variable; the underlying concepts are already linked to variables.
• Whenever a gold standard variable is defined, Dan also updates the relevant part of the change request form.
CMDR requirement
• Possibility to add new variables with a link to the related concept (with audit trail and version control)
• Possibility to perform validation of the new variables by a designated SME on the Governance Committee
• Need to update the variable part of the change request form


1.1.1.3 Send feedback to the requestor
When all the needed changes have been made in CMDR, Dan checks the content of the change request form to ensure he correctly entered all his comments and changes. He then sends the form back to the requestor.
CMDR requirement
• Possibility to automatically send the change request form back to the requestor via email
• Possibility for the requestor to view the status of his request on-line

1.1.1.4 Upload ODM file in CMDR
Needs more discussion:
• Will ODM files and define (e.g. SDTM) files both be uploaded?
• What extensions are needed in ODM/define to specify Variable/Simple Concept/Complex Concept attributes?
Carl uploads the ODM file into the CMDR; the first thing he receives is a report with the following information for the respective file:
• Number of concepts in ODM file

                                        complete/correct           incomplete/incorrect
                                        syntactical description    description
  concept with a match in CMDR
  concept with no match in CMDR

• Number of variables in ODM file

                                        complete/correct           incomplete/incorrect
                                        syntactical description    description
  Variables with a link to a concept
  Variables with no link to a concept

CMDR requirement
• Possibility to upload an ODM file within CMDR
  o If the ODM format is not correct, generation of an error message
  o Parsing of the file and introduction of the different components in a staging area for further processing
  o Generation of a report with the result of the parsing
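The upload requirement (error message, parsing into a staging area, parsing report) can be sketched in Python with the standard library XML parser. This is a sketch under stated assumptions: standard ODM has no attribute linking a variable to a concept (see the open question above), so the namespaced attribute `ConceptRef` and its namespace URL are hypothetical, as are the function name `upload_report` and the report keys. The sample document is trimmed to the elements used here and is not a schema-valid ODM file.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
# Hypothetical extension attribute; NOT part of the ODM standard.
CONCEPT_ATTR = "{http://example.org/cmdr-sketch}ConceptRef"


def upload_report(odm_text):
    """Parse an ODM document, stage its ItemDef entries and count the
    variables with and without a link to a concept."""
    try:
        root = ET.fromstring(odm_text)
    except ET.ParseError as exc:
        return {"error": f"ODM file is not well-formed XML: {exc}"}
    if root.tag != f"{{{ODM_NS}}}ODM":
        return {"error": f"unexpected root element {root.tag!r}"}

    report = {
        "variables_with_concept_link": 0,
        "variables_without_concept_link": 0,
        "incomplete_descriptions": 0,
        "staged": [],  # stands in for the staging area
    }
    for item in root.iter(f"{{{ODM_NS}}}ItemDef"):
        record = {
            "oid": item.get("OID"),
            "name": item.get("Name"),
            "datatype": item.get("DataType"),
            "concept": item.get(CONCEPT_ATTR),
        }
        report["staged"].append(record)
        if not all((record["oid"], record["name"], record["datatype"])):
            report["incomplete_descriptions"] += 1
        if record["concept"]:
            report["variables_with_concept_link"] += 1
        else:
            report["variables_without_concept_link"] += 1
    return report


SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3" xmlns:x="http://example.org/cmdr-sketch">
  <Study OID="S1"><MetaDataVersion OID="MDV1" Name="v1">
    <ItemDef OID="IT.SYSBP" Name="SYSBP" DataType="integer" x:ConceptRef="SystolicBloodPressure"/>
    <ItemDef OID="IT.XYZ" Name="XYZ" DataType="text"/>
  </MetaDataVersion></Study>
</ODM>"""

result = upload_report(SAMPLE)
```

Only the variable counts are derived here; counting concepts with or without a match would additionally need a lookup against the CMDR content, which this sketch leaves out.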

STOPPED HERE – 2008-12_09

1.1.1.5 Check existing concept
The file from Company ConceptCo is complete with variable and concept descriptions, and has been registered in a staging area. Carl first checks the concepts identified by the CMDR parser as existing: for each concept in the ODM file, the CMDR shows the definition provided in the file and the one already existing in CMDR. Carl can then simply tick a button, confirming that the concept is correct. For one concept, he does not agree with CMDR and marks the concept as “not matching”.
CMDR requirements
• Generation of a concept comparison report
  o With a list of the concepts, with their definitions, provided in the ODM file and the ones in CMDR
  o Confirmation on the report of match or no match of concepts, potentially with comment
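A concept comparison report of this kind could be represented as a list of rows, one per concept in the ODM file, that the steward then confirms or rejects. In the Python sketch below, matching by exact concept name is an assumption (the storyboard does not say how the CMDR parser matches), and the function names, row keys and sample definitions are all hypothetical.

```python
def comparison_report(odm_concepts, cmdr_concepts):
    """Build one row per ODM concept, pairing the definition from the file
    with the definition already held in CMDR (None if there is no match)."""
    rows = []
    for name, odm_definition in odm_concepts.items():
        rows.append({
            "concept": name,
            "odm_definition": odm_definition,
            "cmdr_definition": cmdr_concepts.get(name),
            "steward_decision": None,  # filled in by the steward
            "comment": "",
        })
    return rows


def confirm(row, matches, comment=""):
    """Record the steward's decision on a proposed match."""
    if matches and row["cmdr_definition"] is None:
        raise ValueError("cannot confirm a match for a concept that is not in CMDR")
    row["steward_decision"] = "match" if matches else "not matching"
    row["comment"] = comment
    return row


# Invented definitions, for illustration only.
cmdr = {"IGF1": "Insulin-like growth factor 1 lab test"}
odm = {"IGF1": "Growth hormone stimulation result", "IGFBP3": "New lab test"}
rows = comparison_report(odm, cmdr)
confirm(rows[0], matches=False, comment="definitions describe different tests")
```

Rows marked “not matching”, together with rows that have no CMDR definition, are the input to the next step (1.1.1.6 Check new concept).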

1.1.1.6 Check new concept
Carl then checks the concepts identified by the CMDR parser as not matching; this also includes the concept he previously identified as not matching. For each non-matching concept, CMDR has already pre-filled a change request form as far as possible. Carl checks the forms and, where needed, corrects a few elements.
• If there is not enough information, he registers the change request form as incomplete (with a request to resend a complete change request form manually or to resend a complete ODM file)
• If there is enough information, he processes the request as described in 1.1.1.1 Request new concept
When he is finished, Carl sends a complete report (complete and incomplete change request forms) to the originating company.
CMDR requirements
• Possibility to generate the content of a change request form (concept definition part) from ODM file
• Possibility to manually update the change request form
• Audit trail on change request form

1.1.1.7 Check gold standard variable
The ODM file from Company ConceptCo contains their legacy variables, already related to concepts.
• For all variables related to concepts that already exist or that Carl managed to register as draft in CMDR, Carl can proceed. For each variable in the ODM file, CMDR provides Carl with the list of gold standard variables, and potentially with the list of existing synonym variables. Carl then manually confirms that the variable in the ODM file is a synonym of the gold standard variable.
• For all other variables, related to a concept that Carl can neither identify nor register, Carl indicates that the variable cannot be linked to a CMDR concept and stores it in CMDR linked to a “DUMMY concept”; there is no gold standard variable. This is a VERY bad practice that Carl does not like, but it allows people in operations to continue their work.

On a weekly basis, Carl and Dan, as well as all the data stewards, receive the files of all the variables that have been sent and for which there is no concept/gold standard variable. They need to scan these files to see whether new concepts can be defined across the different variables proposed.

CMDR requirements
• Possibility to generate the content of a change request form (variable definition part) from ODM file, including mapping to gold standard variable (through the concept)
• Possibility to manually update the change request form
• Audit trail on change request form
• Possibility to automatically link variables without an existing concept to a DUMMY concept
• Possibility to generate an ODM file with all the variables that were in the source ODM file; this new file must include the legacy variable, the related concept and the related gold standard variable (or nothing if there is a DUMMY concept)
• Possibility to generate reports with all variables linked to a DUMMY concept
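The mapping of legacy variables to gold standard variables through the concept, with the DUMMY fallback, can be sketched as follows. The function names and the variable names (`SBP_OLD`, `SYSBP`, `MYSTERY1`) are hypothetical and serve only to show the shape of the data.

```python
DUMMY = "DUMMY"


def link_variables(legacy_to_concept, concept_to_gold):
    """Map each legacy variable to a gold standard variable through its
    concept; variables whose concept is unknown are linked to DUMMY."""
    rows = []
    for legacy, concept in legacy_to_concept.items():
        if concept in concept_to_gold:
            rows.append({"legacy": legacy, "concept": concept,
                         "gold_standard": concept_to_gold[concept]})
        else:
            # No usable concept: park the variable under the DUMMY concept,
            # with no gold standard variable.
            rows.append({"legacy": legacy, "concept": DUMMY, "gold_standard": None})
    return rows


def dummy_report(rows):
    """List the legacy variables currently linked to the DUMMY concept."""
    return [row["legacy"] for row in rows if row["concept"] == DUMMY]


links = link_variables(
    {"SBP_OLD": "Systolic blood pressure", "MYSTERY1": None},
    {"Systolic blood pressure": "SYSBP"},
)
pending = dummy_report(links)
```

The rows returned by `link_variables` carry exactly the three items the generated ODM file must contain (legacy variable, related concept, gold standard variable), and `dummy_report` gives the list the stewards would review in their weekly scan.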

1.1.1.8 Send gold standard variables in ODM file
When he is finished, Carl requests CMDR to generate a new ODM file, to inform Company ConceptCo and VariableCo, with:
• All the variables and related concepts that were in the file from the company, linked with the gold standard variable
• All the variables that could not be linked to a concept and for which there is no gold standard variable
CMDR requirements
• Possibility to generate an ODM file with all the variables that were in the source ODM file; this new file must include the legacy variable, the related concept and the related gold standard variable (or nothing if there is a DUMMY concept)
Summary of needs


3.3.3 CDISC CMDR governance committee


3.4 Healthcare organization staff (to be defined)
