Ensuring quality data for decision-making

C/Can Responsible Data Framework

Table of contents

01 Background
02 Purpose
03 Dimensions of Data Quality
03.1 Core Indicators of Data Quality
04 Quality Control Methodology
05 Governance
05.1 Human Resources
05.2 Training
05.3 Routine Data Quality Checks
05.4 Annual Review of Data Quality

RESPONSIBLE DATA FRAMEWORK · ENSURING QUALITY DATA FOR DECISION-MAKING

01 Background

As a data-driven organisation, City Cancer Challenge (C/Can) recognises the importance of having data of the highest quality on which to base decisions. This means having processes, definitions and systems of validation in place to assure data quality is integrated into our everyday work.

Complete and accurate data are essential to support effective decision making on the priority solutions for cancer care in cities to be addressed through the C/Can City Engagement Process including:

– Engagement of the “right” partners, stakeholders and experts
– Activity planning
– Design and implementation of projects to deliver priority solutions
– Monitoring and Evaluation
– Sustainability
– Identification of priority solutions (needs assessment)

02 Purpose

This guide lays the basis for a common understanding of data quality for C/Can as an organisation. It sets out the procedures and processes that need to be carried out to ensure good data quality management of all C/Can related data.

All data are subject to quality limitations such as missing values, bias, measurement error, and human errors in data entry and analysis. For this reason, specific efforts are needed to ensure high quality data and the reliability and validity of findings. C/Can is putting in place measures to assess, monitor and review data quality, so that investments can be made in continuously improving the quality of data over time.

03 Dimensions of Data Quality

C/Can assesses data quality across six core dimensions:¹

01. Completeness
Completeness is defined as the extent to which all the data required are available. Data can still be complete if optional data is missing. Measures of completeness look at the levels of data that are missing or unusable.

02. Accuracy
Accuracy is the degree to which data accurately reflects the real world situation / object or event. Where possible, data accuracy can be checked by comparing the data with an identified and trusted second source of information. An additional measure to ensure accuracy is to have the data checked/approved before moving from data collection to analysis.

03. Uniqueness
Uniqueness is a measure of the level of duplication of data within a given set or across sets. Unique information means that there is only one instance of it appearing in a database. Duplication should be avoided wherever possible, meaning that if there are duplicate copies of repeating data or multiple responses for the same question from the same institution or individual, then this should be cleaned and consolidated into a single data point.

04. Integrity
Integrity refers to maintaining information as it was inputted, and can therefore be measured by the degree of data corruption. Any unintended change to the data as a result of processing operations (e.g. hardware / software malfunctions), storage, retrieval of data or human error is a failure of data integrity. Examples include loss of a record due to human errors in data cleaning processes.

05. Validity
Validity is a data quality dimension that refers to the correctness and reasonableness of the data, including whether information conforms to a specific format. For example, data on dates should follow the same format, such as DD.MM.YY rather than MM/DD/YYYY. Data validation is a key process in ensuring data validity by checking that data inputted is acceptable in terms of the format as well as rational/acceptable.

06. Timeliness
Timeliness is the degree to which the data is up-to-date and available within an acceptable time frame, timeline and duration. The value and accuracy of data may decrease over time, therefore timeliness of data is critical. It can be measured as the time between when information is expected and when it is available for use.

¹ Data Quality Management, Mark Allen, Dalton Cervo, in Multi-Domain Master Data Management, 2015

03.1 Core Indicators of Data Quality

Benchmarks and sample sizes for data quality indicators should be determined on a case-by-case basis, but should cover the six core dimensions. The examples of sample indicators provided below relate to the C/Can City Needs Assessment Questionnaire.

01. Completeness
– Percentage of data fields with ‘Don’t know’ responses
– Percentage of incomplete data fields
– Percentage of targeted institutions/individuals that have responded to all required questions
– Number of responses (for the Needs Assessment, this should be broken down by healthcare institutions, civil society and patients)

02. Accuracy
– Percentage of institutions with data reviewed and submitted by institutional lead

03. Uniqueness
– Percentage of duplicate responses/records

04. Integrity
– Percentage of corrupted responses/records

05. Validity
– Percentage of outliers (a data value in a series of values is an outlier if it is extreme in relation to the other values in the series)

06. Timeliness
– Percentage of data received within the designated time frame
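Several of the sample indicators above are simple percentages, and the following Python sketch shows one way they could be computed. It is an illustration only: the record layout (one dictionary per response), the function names, and the field names are all hypothetical, and the 1.5 × IQR rule used to flag outliers is an assumption, since the guide does not define "extreme" numerically.

```python
from datetime import date
from statistics import quantiles

DONT_KNOW = "Don't know"  # assumed label of the 'Don't know' option


def pct(part, whole):
    """Return part as a percentage of whole (0.0 if whole is zero)."""
    return 100.0 * part / whole if whole else 0.0


def is_blank(value):
    """A field counts as incomplete when it is None or an empty string."""
    return value is None or value == ""


def completeness_indicators(records, required_fields):
    """Completeness: incomplete fields, 'Don't know' fields, fully answered records."""
    total = len(records) * len(required_fields)
    incomplete = sum(is_blank(r.get(f)) for r in records for f in required_fields)
    dont_know = sum(r.get(f) == DONT_KNOW for r in records for f in required_fields)
    # 'Don't know' is treated here as a response, not as a missing value.
    fully = sum(all(not is_blank(r.get(f)) for f in required_fields) for r in records)
    return {
        "pct_incomplete_fields": pct(incomplete, total),
        "pct_dont_know_fields": pct(dont_know, total),
        "pct_fully_answered": pct(fully, len(records)),
    }


def duplicate_pct(records, key_fields):
    """Uniqueness: share of records repeating an already-seen key."""
    seen, duplicates = set(), 0
    for r in records:
        key = tuple(r.get(f) for f in key_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return pct(duplicates, len(records))


def outlier_pct(values, k=1.5):
    """Validity: share of values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    if len(values) < 4:
        return 0.0  # too few points to estimate quartiles meaningfully
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return pct(sum(v < low or v > high for v in values), len(values))


def on_time_pct(received_dates, deadline):
    """Timeliness: share of submissions received on or before the deadline."""
    return pct(sum(d <= deadline for d in received_dates), len(received_dates))
```

Benchmarks for each percentage would still need to be set case by case, as the guide states; the sketch only produces the raw figures.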

04 Needs Assessment Quality Control Methodology

C/Can has a number of interrelated processes to support high levels of data quality:

01. Setting data standards
02. Undertaking data validation
03. Checking for and acting on missing or inconsistent data
04. User access and rights arrangements for quality control
05. Reports and dashboards
06. Ethical considerations and personal data protection

The examples below apply to the City Needs Assessment Process and newly developed questionnaire platform, but can be adapted and applied to other data collection processes throughout the C/Can city lifecycle.

01. Setting data standards
– Parameters are set for data entries for questions requiring numerical answers or answers as percentages, allowing only integers within pre-specified ranges as a response.
– Predefined answers with dropdown options, multiple choice or checkboxes are used as response options wherever possible.

02. Undertaking data validation
– Checking for the completeness of any data set and reviewing if missing information can be obtained and entered in the system.
– ‘Sense-checking’ any information produced and comparing to similar or previous datasets.
– Data quality checks are run monthly by the Senior MEL Manager, and followed up by the City Manager via the institutional leads, to improve data quality, completeness and validity on reportable and required fields to ensure submissions to C/Can are as comprehensive and correct as possible.

03. Checking for and acting on missing or inconsistent data
– Institutional leads are identified for each institution and assigned responsibility to ensure that data for the institution is complete before submitting to C/Can.
– Key questions in the questionnaire are mandatory, so submission to C/Can can only occur once all mandatory data is completed.
– Errors and inconsistencies identified should be investigated and addressed by the institutional lead.

04. User access and rights arrangements for quality control
– Each institution has a unique access code that multiple respondents can use to complete their allocated section of the questionnaire. This creates one record per institution, which ensures that there is no duplication of responses for a given institution.

05. Reports and dashboards
– A dashboard accessible to City Managers and all relevant team members provides an overview of which sections have been submitted by institutions. City Managers can use this dashboard to check on data completeness and follow up with stakeholders to complete missing data and ensure the minimum level of completeness is met.

06. Ethical considerations*
– Patient responses to the questionnaire are anonymous; patients follow a link sent to them by a civil society or institutional representative. Consequently no individual patient data, including personal data such as names, email addresses or telephone numbers, are collected.
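The entry rules described in this section (integer answers within pre-specified ranges, predefined response options, mandatory questions blocking submission, and one record per institution) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a description of the actual questionnaire platform: the `Question` type, the function names, the default 0 to 100 range, and the access code format are all hypothetical.

```python
from typing import NamedTuple


class Question(NamedTuple):
    """Hypothetical definition of one questionnaire item."""
    key: str
    mandatory: bool = False
    choices: tuple = ()   # predefined options; empty means a numeric question
    min_value: int = 0    # assumed default range, suited to percentages
    max_value: int = 100


def validate_answer(question, answer):
    """Return a list of problems for one answer; an empty list means valid."""
    if answer is None or answer == "":
        return [f"{question.key}: mandatory answer missing"] if question.mandatory else []
    if question.choices:
        if answer in question.choices:
            return []
        return [f"{question.key}: not a predefined option"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(answer, bool) or not isinstance(answer, int):
        return [f"{question.key}: must be an integer"]
    if not question.min_value <= answer <= question.max_value:
        return [f"{question.key}: outside {question.min_value}-{question.max_value}"]
    return []


def can_submit(questions, answers):
    """Submission is allowed only when no question reports a problem."""
    problems = [p for q in questions for p in validate_answer(q, answers.get(q.key))]
    return (not problems, problems)


def save_section(store, access_code, answers):
    """Merge a respondent's answers into the single record for an institution."""
    store.setdefault(access_code, {}).update(answers)
    return store[access_code]
```

Keying the store on the institution's access code is what keeps several respondents from creating several records: each call to `save_section` updates the same dictionary entry instead of appending a new one.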


* Please see C/Can’s data privacy policy and guidelines on the protection of personal and non-personal data for further detail and examples.

05 Governance

05.1 Roles and responsibilities

Ensuring that high quality data is collected and used at global and city levels is the responsibility of every C/Can team member. To ensure a consistent and rigorous approach, however, specific roles and responsibilities for data quality are assigned across the team for key activities/processes within C/Can. For example:

– Oversight of the quality of all data collected as part of the city needs assessment process is primarily the responsibility of the City Manager. Support is provided by members of the Policy and Impact team.
– Responsibility to collect and input data into C/Can’s CRM is distributed among the data set owners, and validation is the responsibility of their line managers.

05.2 Training

The Policy and Impact Team are responsible for the on-going programme of training and communication to staff on the content and processes set out in this guide and related documents.

Key local stakeholders, including City Executive Committee members and local sustainability partners, and those involved in the City Needs Assessment process in particular, should be aware of C/Can’s guidelines and standards for quality data collection, and be provided with access to this document.

05.3 Routine Data Quality Review

Routine and regular data quality reviews (DQR) are an important mechanism to collect, analyse and base decisions on the highest quality data possible.

– For specific processes such as the needs assessment, data checks are run monthly by the Senior MEL Manager, and followed up with the relevant data holder, to improve data quality, completeness and validity on reportable and required fields to ensure submissions to C/Can are as comprehensive and correct as possible.
– For the needs assessment, City Managers should plan to conduct at least one site visit to each institution inputting data during the needs assessment to conduct a data quality assessment, specifically to confirm data sources and validity on a set sample of the data being collected. The sample of the data assessed will be at the discretion of the City Manager in consultation with the Senior MEL Manager.
– For Salesforce data, line managers of data set owners are responsible for reviewing data sets regularly for completeness, accuracy and integrity.

05.4 Annual Data Quality Review

C/Can will institute annual DQRs of key data sets including at least 2.5% of the needs assessment indicators from all cities engaged in the process during that calendar year. The objective of the review will be to examine data quality across the six dimensions.

The desk review has two levels of data quality assessment:

1. an assessment of each indicator aggregated to the city level;
2. an assessment of each indicator aggregated to the global level.

The objectives of the DQR are:

1. to institutionalise a system for assessing quality of data, including routine monitoring of data,
2. to identify measures to improve quality, and develop action plans to implement such measures, and
3. to monitor the performance of data quality over time and capacity to produce good quality data.

Depending on the findings of the annual DQR, the Policy & Impact team may recommend that a Data Quality Improvement Plan should be prepared. The Data Quality Improvement Plan should seek to identify and address the root causes of data quality problems revealed by the DQR, including specific measures needed to strengthen the system and resolve the problem. The actions outlined in the plan should be specific, timebound and costed.

https://citycancerchallenge.org/