HealthCare Research Journal
OFFICIAL JOURNAL OF ZEENAT QURESHI INSTITUTES
Vol. 1, No. 1, pp. 17-28, Published August, 2020.
All Rights Reserved by HCRJ. Unauthorized reproduction of this article is prohibited.

Facilitating the Study of Relationships between COVID-19 and Cardiovascular Health Outcomes Using Cerner Real-World COVID-19 Deidentified Dataset

Adnan I Qureshi MD*, William I. Baskett BS, Wei Huang MA, Daniel Shyu MD, Danny Myers PhD, Murugesan Raju PhD, Iryna Lobanova MD, M. Fareed K. Suri MD, S. Hasan Naqvi MD, Chi-Ren Shyu

Department of Neurology, University of Missouri, MO, USA
Institute for Data Science and Informatics, University of Missouri, MO, USA
School of Medicine, University of Missouri, MO, USA
Tiger Institute for Health Innovation, Cerner Corporation, MO, USA
St. Cloud Hospital, St. Cloud, MN, USA
Department of Internal Medicine, University of Missouri, Columbia, MO, USA

*Corresponding Author: Adnan I Qureshi, MD, Department of Neurology, University of Missouri, Columbia, MO 65201. Email: [email protected].

Abstract

Background: The manifestations and associated outcomes of patients with Coronavirus disease 2019 (COVID-19) are not well studied in large studies. To understand how COVID-19 affects, and is affected by, cardiovascular conditions, larger and more diverse patient samples are necessary. A better understanding of how the disease interacts with pre-existing risk factors and COVID-19 induced complications may improve the ability of clinicians to provide informed care.

Methods: A Cerner COVID-19 deidentified dataset based on electronic medical records in the Cerner Real-World Dataset is available to researchers. Data analytics pipelines are used to select study populations and to identify risk factors, co-morbidities, and outcomes in patients with COVID-19 with various manifestations.

Settings: A total of 54 healthcare facilities across the United States for the April 2020 release.

Participants: Patients with a minimum of one emergency department (ED) visit, who were admitted for observation, or had an inpatient encounter with a diagnosis code that could be associated with COVID-19 exposure or infection, or those with a positive laboratory result for COVID-19.

Intervention: Standard care as per institution.

Main data elements: The encounters include pharmacy, clinical and microbiology laboratory, admission, and billing information. The patient population can be identified using the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) primary or secondary diagnosis codes, and procedures can be identified using Procedure Coding System (ICD-10-PCS) and Current Procedural Terminology (CPT) codes.

Conclusions: The Cerner COVID-19 deidentified dataset provides data from a large cohort of COVID-19 patients, which may increase the ability to address gaps in current knowledge regarding how the disease affects and is affected by cardiovascular conditions.

Key words: Electronic health records; Coronavirus; COVID-19; diagnosis codes; procedural codes.

INTRODUCTION

The World Health Organization (WHO) declared coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome corona virus-2 (SARS-CoV-2), a pandemic on March 11, 2020 [1]. As of July 22, 2020, a total of 14,765,256 patients had been diagnosed globally, with 612,054 deaths [2]. Most patients with confirmed COVID-19 developed fever, cough and/or dyspnea, with reported pneumonia, respiratory failure, acute respiratory distress syndrome (ARDS), cardiac injury, renal failure, and encephalitis [3]. There is a large need to further understand the clinical spectrum of COVID-19 and its associated outcomes. Several large administrative datasets such as the Nationwide Inpatient Sample and Statewide Inpatient datasets require months to compile and become available to researchers.

Cerner [4,5] made available deidentified patient data, secured and stored on Cerner HealtheDataLab™ [5] (powered by Amazon Web Services Inc.; see Figure 1), that included COVID-19-related demographics, underlying illnesses and chronic conditions, treatments, laboratory results, and clinical complications, to help track the spread and surge of the disease and the outcomes that could help drive important medical decisions.

FIGURE 1: HealtheDataLab – Advanced cloud-based AWS analytic platform for simplified electronic health record (EHR) research and dealing with a vast amount of data.

The Cerner Corporation is an American supplier of health information technology solutions, services, devices, and hardware. The COVID-19 deidentified dataset was prepared and released by Cerner Corporation through the Cerner® HealtheDataLab platform for researchers to facilitate time-sensitive analysis related to COVID-19. The data is a subset of the Cerner Real-World Data extracted from the electronic medical records of hospitals where Cerner has a data use agreement. Cerner Corporation has initially offered access to this cloud-computing environment to approximately 45 named individual investigators from health systems, including data scientists and clinicians who have planned COVID-19 hypotheses and lead research projects. After a project is approved, a Data Use Agreement (DUA) must be signed by the investigator and the organization with which the investigator is affiliated prior to accessing the data.

METHODOLOGY

The COVID-19 deidentified dataset includes data for patients who qualified for inclusion based on the following criteria:

1. Patient has a minimum of one emergency department (ED) visit, is admitted for observation, or has an inpatient encounter with a diagnosis code that could be associated with COVID-19 exposure or infection; OR
2. Patient has a minimum of one emergency department (ED) visit, is admitted for observation, or has an inpatient encounter with a positive result for a COVID-19 laboratory test.

The data elements and diagnostic codes used for inclusion are provided in supplemental Tables 1, 2.A, and 2.B. The data set includes both patients with confirmed COVID-19 infection and those in whom infection was suspected but excluded using institutional policies. Each patient has a unique identifier and each encounter has a unique identifier. The Cerner Real World Data-COVID-2020Q2apr version of the data included data from 54 contributing Cerner Real-World Data health systems that had qualifying patients. A summary of data variables is presented in supplemental Table 3.

The Real-World Data contains data of over 65 million patients from 750 healthcare facilities across the United States. The over 500 million encounters include pharmacy, clinical and microbiology laboratory, admission, and billing information from affiliated patient care locations. Cerner Corporation has established Health Insurance Portability and Accountability Act-compliant operating policies for the de-identification of the Cerner Real-World Data. The data comprises more than 100 clinical and nonclinical variables associated with hospital stays, including primary and secondary diagnoses, primary and secondary procedures, patients' admission and discharge statuses, and patient demographic information (e.g., sex, age, race/ethnicity, expected payment source, total charges, and length of stay).

FIGURE 2: HealtheDataLab – COVID-19 dataset, study type, data analytics methods and extracting clinically useful results.

FROM HYPOTHESIS TO PATIENT POPULATION SELECTION AND ANALYSIS

Figure 2 outlines the workflow for data extraction using ICD codes from Cerner HealtheDataLab™. The patient population can be identified using the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) primary or secondary diagnosis codes. ICD-10-PCS describes 69,823 procedure codes (7 characters in length, each either numeric 0-9 or one of the letters A-H, J-N, P-Z). ICD-10-CM describes 71,924 diagnosis codes (3-7 characters in length; character one is alphabetical, character two is numeric, and characters 3-7 can be alphabetical, numeric, or both).
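The character rules quoted above for ICD-10-CM and ICD-10-PCS codes can be turned into a quick format check before codes are used to select a population. The following is a minimal sketch in Python and is not part of the Cerner tooling: the function names `is_icd10cm` and `is_icd10pcs` are hypothetical, accepting an optional dot after the third character is an assumption about how codes are displayed, and the check only tests the shape of a code, not whether the code exists in the official code set.

```python
import re

# ICD-10-CM shape: 3-7 characters, first alphabetic, second numeric,
# remaining characters alphabetic or numeric. An optional dot after the
# third character is accepted (assumed display convention, e.g. E10.3291).
ICD10CM_RE = re.compile(r"^[A-Z][0-9][A-Z0-9](?:\.?[A-Z0-9]{1,4})?$")

# ICD-10-PCS shape: exactly 7 characters drawn from 0-9, A-H, J-N, P-Z
# (the letters I and O are not used).
ICD10PCS_RE = re.compile(r"^[0-9A-HJ-NP-Z]{7}$")


def is_icd10cm(code: str) -> bool:
    """Return True if the string has the shape of an ICD-10-CM code."""
    return bool(ICD10CM_RE.match(code.strip().upper()))


def is_icd10pcs(code: str) -> bool:
    """Return True if the string has the shape of an ICD-10-PCS code."""
    return bool(ICD10PCS_RE.match(code.strip().upper()))
```

A shape check of this kind is mainly useful for catching truncated or mistyped codes in a study definition; confirming that a code is valid still requires the official code tables.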

For example, there are about 113 ICD-10-CM codes for type-1 diabetes mellitus, namely E10.2 (Type 1 diabetes mellitus with kidney complications), E10.32 (Type 1 diabetes mellitus with mild non-proliferative diabetic retinopathy), E10.321 (Type 1 diabetes mellitus with mild non-proliferative diabetic retinopathy with macular edema), E10.3291 (Type 1 diabetes mellitus with mild non-proliferative diabetic retinopathy without macular edema, right eye), E10.3292 (Type 1 diabetes mellitus with mild non-proliferative diabetic retinopathy without macular edema, left eye) and so on. Many of these codes are very granular, describing the right side and left side of the body. Furthermore, to reduce the data dimensionality, we can group the codes into higher-order categories, identified by a 3-5-digit code, based on the type of study. For example, the ICD-10-CM codes can be used to identify patients with hypertension (I10, O10.0, O10.9, I16 and I67.4), diabetes mellitus (E08, E09, E10, E11 and E13), nicotine dependence (F17), hyperlipidemia (E78), atrial fibrillation (I48) and congestive heart failure (I09.81, I11.0 and I50). The secondary ICD-10-CM diagnosis codes were used to identify hospital events such as acute kidney injury (N17), hepatic failure (K72), cardiac arrest (I46), systemic inflammatory response syndrome (SIRS) (R65.1), respiratory failure (J96), pneumonia (J12-J18), urinary tract infection (N30.0, N30.9, N34.1, N34.2 and N39.0), septic shock (A41 and R65.21), deep venous thrombosis (I82), pulmonary embolism (I26), and acute myocardial infarction (I21).

The ICD-10-PCS procedure codes and Current Procedural Terminology (CPT) codes can be used to identify procedures such as intubation and mechanical ventilation. The procedure codes for intubation and mechanical ventilation are 0BJ17EZ and Z9911, respectively. The CPT codes for intubation are 31500, 94656, and 94657, while the mechanical ventilation codes are 94002 to 94005. Discharge status can be categorized into routine, home health care, short-term hospitalization and other facilities including intermediate care and skilled nursing home, or death. The discharge destination can be used as a surrogate marker to define none to mild and moderate to severe disability as previously described [6] and validated [7]. The analysis may require complex Structured Query Language (SQL) queries and Python or R-like coding within Jupyter notebooks to interface with the Spark infrastructure.

DATA QUALITY

The integrated data capture, retrieving data from electronic medical records, reduces the risk of a transcription error that can occur with manual entry. The data analysis methodology allows certain pieces of participant information to be displayed in a list or table format to limit repeating instances of the same data, and the system was enhanced to support the grouping of related values. Values captured together in the electronic health records can be populated together, such as laboratory results (components of laboratory panels), medications, adverse events, and medical history.

DATASET DESCRIPTION

Cerner's real-world COVID dataset is a relational database of electronic health records that contain data related to COVID-19 encounters. The current dataset has records from December 2019 to April 2020, along with three years of prior history (2016 to 2019) for the qualified patients' records.

The database contains seven tables: demographics, encounters, condition, medication, procedure, COVID labs, and results. The demographics table contains the patient's id along with their gender, race, ethnicity and deceased status. The encounter table contains twelve data elements, including the age at the time of the encounter, service date, hospitalization date, admission type, and discharge status. For de-identification purposes, ages are constrained to the range of 17-90. The encounter table also contains prior clinical information to help with the identification of comorbid conditions and several keys to link with other tables, including results, medication, procedure, COVID labs, and condition. The COVID labs table has nine element types, including service date, code type, lab code, lab test, result, and encounter type, which were not described for the encounter table. This table also contains keys that link with the demographic, encounter, and results tables. The condition table contains eleven element types including condition id, effective date, asserted date, code type, condition code, condition, billing details, classification of the clinical stage or process at which the condition was identified, and source encounter type. This table also contains several keys that link with the demographic and encounter tables. The medication table contains fourteen elements including medication id, start date, stop date, code type, drug code, drug, dose quantity, dose unit, route, frequency, and status. The medication table can be linked with the demographic and encounter tables. The procedure table contains nine elements including two keys to link with the demographic and encounter tables as well as seven other elements (procedure id, code type, procedure code, procedure, service start date, service end date and billing rank). There are about 30 element types in the results table including person id, encounter id, result id, code type, result code, result, service date, result type, text value, numeric value, numeric value modifier, unit of measure, date value, codified value code type, codified value code, codified value, status, interpretation, specimen types, measurements, and many others. The data type in the COVID database is either a string or an integer, except that the "as needed" element in the medication table is Boolean and some data types in the medication table are long. Details of the data element types and their descriptions can be found in the Supplement.

LIMITATIONS AND CHALLENGES

This dataset contains a subset of medical records linked to the included individuals, with diagnosis codes available up to three years prior to the COVID-19 outbreak and lab results from December 2019. This can limit possible analyses, as diagnoses that occurred outside of these time ranges and which are not re-recorded during subsequent encounters may be missing even if the individual is still affected by that condition. The dataset, which provides minimal details on the severity of clinical deficits and diagnostic study results (imaging and some types of laboratory tests), may also preclude in-depth analyses. The functional discharge outcome cannot be measured with the available data, and the closest index used is the destination of the discharge, as done in previous studies using the Nationwide Inpatient Sample data [6,8].

Discharge destination may allow differentiation of patients with different functional outcome groups with a reasonable level of accuracy.

There is a group of patients without COVID-19 in the dataset who were screened for COVID-19 and tested negative. These patients may have clinical features suggestive of respiratory tract infections, which could mean that they may have other respiratory tract infections or, in a small minority, undetected COVID-19, depending on the screening tests undertaken [9-11]. Therefore, these patients may not be entirely reflective of non-COVID-19 patients in general. The comparison between patients with and those without COVID-19 should be undertaken with the abovementioned understanding.

CONCLUSIONS

We demonstrated the use of nationwide real-world data, such as the COVID-19 dataset collected through Cerner client institutions, for the risk assessment of cardiovascular conditions using HealtheDatalab in an AWS environment. Further, we demonstrated a workflow pipeline starting from data extraction to the analytic method with special emphasis on cardiovascular diseases. More specifically, we addressed four goals in this publication as follows.

1. Describe the data elements of the database for ordinary people who do not have any database knowledge.
2. Describe the data cleansing and carpentry work, the ICD codes, and how to aggregate ICD codes into the various study variables based on the inclusion and exclusion criteria for the type of study.
3. Analyze the final dataset using machine learning methods such as Random forest and statistical analysis such as multivariate regression or Odds Ratio analysis.
4. Describe and interpret analyzed results with clinical relevance.

Overall, we demonstrated the value of the HealtheDatalab for clinical research and clinical decision support.

ACKNOWLEDGEMENT

WIB and MR are supported by the National Institutes of Health 5T32LM012410. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

REFERENCES

1. World Health Organization. Coronavirus disease 2019 (COVID-19) situation report - 51. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/. Published 2020. Accessed July 6, 2020.
2. World Health Organization. Coronavirus disease 2019 (COVID-19) situation report - 184. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/. Published 2020. Accessed July 6, 2020.
3. Guan WJ, Ni ZY, Hu Y, et al. Clinical Characteristics of Coronavirus Disease 2019 in China. The New England journal of medicine. 2020;382(18):1708-1720.
4. Cerner Corporation. Cerner Provides Access to De-Identified Patient Data for COVID-19 Research and Vaccine Development. https://www.cerner.com/newsroom/cerner-provides-access-to-de-identified-patient-data-for-covid-19-research-and-vaccine-development. Published 2020. Accessed July 6, 2020.
5. Laird-Maddox M, Mitchell SB, Hoffman M. Integrating research data capture into the workflow: real-world experience to advance innovation. Perspect Health Inf Manag. 2014;11:1e.
6. Qureshi AI, Chaudhry SA, Hassan AE, et al. Thrombolytic treatment of patients with acute ischemic stroke related to underlying arterial dissection in the United States. Arch Neurol. 2011;68(12):1536-1542.
7. Qureshi AI, Chaudhry SA, Sapkota BL, Rodriguez GJ, Suri MF. Discharge destination as a surrogate for Modified Rankin Scale defined outcomes at 3- and 12-months poststroke among stroke survivors. Arch Phys Med Rehabil. 2012;93(8):1408-1413 e1401.
8. Hassan AE, Chaudhry SA, Grigoryan M, Tekle WG, Qureshi AI. National trends in utilization and outcomes of endovascular treatment of acute ischemic stroke patients in the mechanical thrombectomy era. Stroke; a journal of cerebral circulation. 2012;43(11):3012-3017.
9. Chan JF, Yip CC, To KK, et al. Improved Molecular Diagnosis of COVID-19 by the Novel, Highly Sensitive and Specific COVID-19-RdRp/Hel Real-Time Reverse Transcription-PCR Assay Validated In Vitro and with Clinical Specimens. J Clin Microbiol. 2020;58(5).
10. Konrad R, Eberle U, Dangel A, et al. Rapid establishment of laboratory diagnostics for the novel coronavirus SARS-CoV-2 in Bavaria, Germany, February 2020. Euro Surveill. 2020;25(9).
11. Wang W, Xu Y, Gao R, et al. Detection of SARS-CoV-2 in Different Types of Clinical Specimens. JAMA. 2020;323(18).

SUPPLEMENTAL TABLE 1: Data elements in Cerner COVID-19 deidentified dataset

Table | Element Name | Element Detailed Description | Data Type
demographics | personid | The ID of the person | string
demographics | gender | The gender of the person | string
demographics | race | The race of the person | string
demographics | ethnicity | The ethnicity of the person | string
demographics | deceased | An indicator of the death of the person | integer
encounter | personid | The ID of the person associated with the encounter | string
encounter | encounterid | An ID that uniquely identifies this encounter for a person | string
encounter | servicedate | The service date and time of the encounter. In an inpatient setting, this date represents when the patient was registered | string
encounter | hospitalizationstartdate | The date and time when the patient was admitted to the hospital | string
encounter | dischargedate | The date and time when the patient was discharged | string
encounter | encountertype | The specific type of the encounter, for example, Emergency or Inpatient | string
encounter | age_at_encounter | Age at the specific encounter | integer
encounter | payer | The insurance information for the encounter, for example, health maintenance organization (HMO), preferred provider organization (PPO), Medicaid, or self-pay | string
encounter | admissiontype | The priority of the admission to a medical facility (Elective, Emergent, Routine) | string
encounter | dischargedisposition | The disposition of the patient at the time of discharge | string
encounter | cvd_dx_ind | A flag indicating that the encounter included a diagnosis code that may be associated with COVID-19 exposure or infection. (0=No; 1=Yes) | integer
encounter | pos_cvd_lab_ind | A flag indicating that the encounter included a positive result for a lab procedure that may be associated with COVID-19 testing. (0=No; 1=Yes) | integer
encounter | pos_lab_2wk_prior_ind | A flag indicating that the encounter did not have a positive result for a COVID-19 lab, but that a positive result was identified within 2 weeks prior that may be associated with COVID-19 testing. (0=No; 1=Yes) NOTE: This flag is only valued if pos_cvd_lab_ind=0. | integer
encounter | enc_cvd_lab_recs | The number of rows in the covid_labs table linked to this encounter | long
encounter | pat_cvd_lab_recs | The number of rows in the covid_labs table linked to the personid associated with this encounter | long
encounter | enc_dx_recs | The number of rows in the condition table linked to this encounter | long
encounter | hist_dx_recs | When linked by the personid associated with this encounter, the number of rows in the condition table with details on historical diagnoses (e.g., source_encounter_type='Historical'). NOTE: The condition records counted here are provided to help with the identification of comorbid conditions. Viewed in the condition table, historical diagnosis records will display encounterid values that do not appear elsewhere in the database. | long
encounter | enc_px_recs | The number of rows in the procedure table linked to this encounter | long
encounter | enc_med_recs | The number of rows in the medication table linked to this encounter | long
encounter | enc_result_recs | The number of rows in the result table linked to this encounter | long
covid_labs | personid | The ID of the person associated with the result | string
covid_labs | encounterid | An ID that uniquely identifies this encounter for a person | string
covid_labs | resultid | An ID that uniquely identifies this result for a person | string
covid_labs | servicedate | The clinically significant date and time associated with the lab result | string
covid_labs | codetype | The type of coding system used for recording the lab | string
covid_labs | labcode | The code value that identifies the lab | string
covid_labs | labtest | The display name of the lab test performed | string
covid_labs | result | The value of the lab. Possible values include Positive, Negative, Indeterminate, Not Done, or Unknown | string
covid_labs | encountertype | The specific type of encounter, for example, Inpatient or Emergency. NOTE: If an encounter has a type other than Inpatient, Emergency, Admitted for Observation or Inpatient hospice care, the lab result was obtained from an outpatient encounter that occurred within two weeks prior to a qualifying encounter. | string
condition | personid | The ID of the person associated with the condition | string
condition | encounterid | An ID that uniquely identifies this encounter for a person | string

SUPPLEMENTAL TABLE 1 (continued)

Table | Element Name | Element Detailed Description | Data Type
condition | conditionid | An ID that uniquely identifies this condition for a person | string
condition | effectivedate | The clinically significant date when the diagnosis or condition was first reported | string
condition | asserteddate | The date the condition was first acknowledged as present or recorded | string
condition | codetype | The type of coding system used for recording the condition | string
condition | conditioncode | The code value that identifies the condition, for example, an ICD-10-CM or SNOMED CT code | string
condition | condition | The display name of the condition | string
condition | billingrank | A value to identify the significance or priority of a billing diagnosis, for example, PRIMARY, SECONDARY, or _NOT_RANKED. | string
condition | classification | The clinical stage or process at which the condition was identified. Examples are: Admitting, Discharge, Final, Billing, etc. | string
condition | source_encounter_type | A value to indicate whether the condition record is associated with a qualifying encounter (value=COVID), a supplemental encounter (value=Supplemental), or included from a prior encounter to help with the identification of comorbid conditions (value=Historical). | string
medication | personid | The ID of the person associated with the medication | string
medication | encounterid | An ID that uniquely identifies this encounter for a person | string
medication | medicationid | An ID that uniquely identifies this medication for a person | string
medication | startdate | The start date and time of the medication order | string
medication | stopdate | The stop date and time of the medication order | string
medication | codetype | The type of coding system used for recording the medication | string
medication | drugcode | Identifies the medication if the medication can be represented using a single code. If the medication cannot be represented using a single code, then drugCode is not populated and the list of ingredients must be used to identify the medication | string
medication | drug | The display name of the drug | string
medication | dosequantity | The dose quantity for the medication that was ordered or prescribed | string
medication | doseunit | The codified unit of measure associated with the dose quantity, such as mL, mg, tab(s), etc. | string
medication | route | The route through which the medication is to be administered, for example, orally or intravenously | string
medication | frequency | The frequency that a dose of the medication is to be administered, for example, Daily, BID, q6hr, PRN, or As Needed | string
medication | asneeded | Indicates whether the medication is taken only when needed in a specific dosing schedule | boolean
medication | status | The current status of the medication, for example, Active, Complete, Discontinued, or On Hold | string
procedure | personid | The ID of the person associated with the procedure | string
procedure | encounterid | An ID that uniquely identifies this encounter for a person | string
procedure | procedureid | An ID that uniquely identifies this procedure for a person | string
procedure | codetype | The type of coding system used for recording the procedure | string
procedure | procedurecode | The code value that identifies the procedure, for example, an ICD-9 or CPT code | string
procedure | procedure | The display name of the procedure | string
procedure | servicestartdate | The start date of the procedure | string
procedure | serviceenddate | The end date of the procedure | string
procedure | billingrank | A value that identifies the significance or priority of a billing procedure, for example, PRIMARY, SECONDARY, or _NOT_RANKED. | string
result | personid | The ID of the person associated with the result | string
result | encounterid | An ID that uniquely identifies this encounter for a person | string
result | resultid | An ID that uniquely identifies this result for a person | string
result | codetype | The type of coding system used for recording the result | string
result | resultcode | A code that identifies the test or measurement | string
result | result | The display name of the test or measurement | string
result | servicedate | The clinically significant date and time associated with the result when an observation is made or measurement is taken, at a point in time. For laboratory results, this is the specimen collection time. For vitals, this is the time the measurement was taken | string
result | resulttype | The type of the result value, for example, NUMERIC, CODIFIED, TEXT, or DATE | string

SUPPLEMENTAL TABLE 1 (continued)

Table | Element Name | Element Detailed Description | Data Type
result | textvalue | The text value of the result. This will only be valued if the result type is TEXT. NOTE: Given the nature of free text and the possibility of PII, there will be a high degree of missingness in this field. However, commonly observed values associated with qualitative lab results have been whitelisted and will pass through. | string
result | numericvalue | The numeric value of the result. This will only be valued if the result type is NUMERIC | string
result | numericvaluemodifier | The symbol or characters that modify the numeric value. For example, <, >, or >= | string
result | unitofmeasure | The codified unit of measure associated with the result value, such as mg/dL, lbs, seconds, etc. | string
result | datevalue | The date value of the result. This will only be valued if the result type is DATE. **See service date for the date and time for the result table | string
result | codifiedvaluecodetype | The type of coding system used for recording the numeric value | string
result | codifiedvaluecode | The code value recorded for the codified value | string
result | codifiedvalue | The codified value of the result. This will only be valued if the result type is CODIFIED | string
result | status | The status of the result, for example, In Error, Modified, or Preliminary | string
result | interpretation | Indicates whether the result is high, low, critical, etc. | string
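The keys listed in Supplemental Table 1 (personid and encounterid) are what link the tables when a study population is selected. The sketch below illustrates one such selection with Python's built-in `sqlite3` module standing in for the Spark SQL environment of HealtheDataLab, which is an assumption made so the example runs anywhere. Table and column names follow Supplemental Table 1, only a few columns are included, and all rows are hypothetical toy data; `build_demo_db` and `qualifying_encounters` are illustrative names.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE demographics (personid TEXT, gender TEXT);
        CREATE TABLE encounter (
            personid TEXT, encounterid TEXT, encountertype TEXT,
            age_at_encounter INTEGER, cvd_dx_ind INTEGER, pos_cvd_lab_ind INTEGER
        );
        CREATE TABLE condition (
            personid TEXT, encounterid TEXT, conditioncode TEXT,
            source_encounter_type TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO demographics VALUES (?, ?)",
        [("p1", "Female"), ("p2", "Male"), ("p3", "Female")],
    )
    conn.executemany(
        "INSERT INTO encounter VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("p1", "e1", "Inpatient", 64, 1, 0),   # COVID-19 diagnosis code
            ("p2", "e2", "Emergency", 45, 0, 1),   # positive COVID-19 lab
            ("p3", "e3", "Inpatient", 30, 0, 0),   # neither flag set
        ],
    )
    conn.executemany(
        "INSERT INTO condition VALUES (?, ?, ?, ?)",
        [
            ("p1", "e1", "U07.1", "COVID"),
            ("p1", "h1", "I10", "Historical"),
            ("p1", "h2", "E11.9", "Historical"),
            ("p2", "e2", "J12.89", "COVID"),
        ],
    )
    return conn


# Encounters flagged by diagnosis code or positive lab, joined to
# demographics by personid, with a count of historical diagnoses per person.
QUERY = """
SELECT e.personid, e.encounterid, d.gender, e.age_at_encounter,
       (SELECT COUNT(*) FROM condition c
         WHERE c.personid = e.personid
           AND c.source_encounter_type = 'Historical') AS hist_dx
FROM encounter e
JOIN demographics d ON d.personid = e.personid
WHERE e.cvd_dx_ind = 1 OR e.pos_cvd_lab_ind = 1
ORDER BY e.personid
"""


def qualifying_encounters(conn: sqlite3.Connection) -> list:
    """Return the flagged encounters with demographics and history counts."""
    return conn.execute(QUERY).fetchall()
```

Historical conditions are counted by personid rather than encounterid because, as the hist_dx_recs description notes, historical diagnosis records carry encounterid values that do not appear elsewhere in the database.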

SUPPLEMENTAL TABLE 2A: List of qualifying diagnosis codes in Cerner COVID-19 deidentified dataset

Code System | Code | Code Description
ICD-10 | U07.2 | COVID-19, virus not identified
ICD-10-CM | A41.89 | Sepsis due to SARS-associated coronavirus
ICD-10-CM | B34.2 | Coronavirus infection, unspecified
ICD-10-CM | B97.2 | Coronavirus as the cause of diseases classified elsewhere
ICD-10-CM | B97.21 | SARS-associated coronavirus as the cause of diseases classified elsewhere
ICD-10-CM | B97.29 | Other coronavirus as the cause of diseases classified elsewhere
ICD-10-CM | J12.81 | Pneumonia due to SARS-associated coronavirus
ICD-10-CM | J12.89 | Other viral pneumonia
ICD-10-CM | U07.1 | 2019-nCoV acute respiratory disease
ICD-10-CM | Z20.828 | Contact with and (suspected) exposure to other viral communicable diseases
ICD-9-CM | 079.82 | SARS-associated coronavirus
ICD-9-CM | 480.3 | Pneumonia due to SARS-associated coronavirus
ICD-9-CM | V01.82 | Exposure to SARS
MEDCIN | 123661 | Exposure to SARS (history)
MEDCIN | 272816 | Severe acute respiratory syndrome (SARS) (diagnosis)
MEDCIN | 272883 | Severe acute respiratory syndrome (SARS) pneumonia (diagnosis)
MEDCIN | 318393 | Coronavirus infection (diagnosis)
MEDCIN | 330006 | Coronavirus as cause of disease classified elsewhere (diagnosis)
MEDCIN | 330007 | Coronavirus as cause of disease classified elsewhere SARS-associated coronavirus (diagnosis)
MEDCIN | 330009 | SARS-associated coronavirus as cause of disease classified elsewhere (diagnosis)
MEDCIN | 350309 | Middle east respiratory syndrome (mers) (diagnosis)
MEDCIN | 366307 | Human coronavirus pneumonia (diagnosis)
NCI | C128424 | Middle East Respiratory Syndrome
NCI | C85064 | Severe Acute Respiratory Syndrome
SNOMED CT | 186747009 | Coronavirus infection (disorder)
SNOMED CT | 398447004 | Severe acute respiratory syndrome (disorder)
SNOMED CT | 408688009 | Healthcare associated severe acute respiratory syndrome (disorder)
SNOMED CT | 441590008 | Pneumonia caused by Severe acute respiratory syndrome coronavirus (disorder)
SNOMED CT | 444482005 | Exposure to severe acute respiratory syndrome coronavirus (event)
SNOMED CT | 702547000 | Exposure to coronavirus infection (event)
SNOMED CT | 713084008 | Pneumonia caused by Human coronavirus (disorder)
SNOMED CT | 715882005 | Severe acute respiratory syndrome of upper respiratory tract (disorder)
SNOMED CT | 840533007 | Severe acute respiratory syndrome coronavirus 2 (organism)
SNOMED CT | 840534001 | Severe acute respiratory syndrome coronavirus 2 vaccination (procedure)
SNOMED CT | 840539006 | Disease caused by 2019-nCoV
SNOMED CT | 840544004 | Suspected disease caused by severe acute respiratory coronavirus 2 (situation)
SNOMED CT | 840546002 | Exposure to 2019 novel coronavirus (event)
SNOMED CT | 866151004 | Lymphocytopenia due to Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 866152006 | Thrombocytopenia due to Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 870577009 | At increased risk of exposure to Severe acute respiratory syndrome coronavirus 2 (finding)
SNOMED CT | 870588003 | Sepsis due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 870589006 | Acute kidney injury due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 870590002 | Acute hypoxemic respiratory failure due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 870591003 | Rhabdomyolysis due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)

SUPPLEMENTAL TABLE 2A (continued)

Code System | Code | Code Description
SNOMED CT | 871552002 | Detection of Severe acute respiratory syndrome coronavirus 2 antibody (observable entity)
SNOMED CT | 871553007 | Detection of Severe acute respiratory syndrome coronavirus 2 antigen (observable entity)
SNOMED CT | 871555000 | Detection of ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 (observable entity)
SNOMED CT | 871556004 | Detection of ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 in nasopharyngeal swab (observable entity)
SNOMED CT | 871557008 | Detection of ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 in oropharyngeal swab (observable entity)
SNOMED CT | 871558003 | Detection of ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 in sputum (observable entity)
SNOMED CT | 871559006 | Detection of ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 in bronchoalveolar lavage fluid (observable entity)
SNOMED CT | 871560001 | Detection of ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 using polymerase chain reaction (observable entity)
SNOMED CT | 871562009 | Detection of Severe acute respiratory syndrome coronavirus 2 (observable entity)
SNOMED CT | 78601000000102 | Severe acute respiratory syndrome
SNOMED CT | 83381000000100 | Severe acute respiratory syndrome
SNOMED CT | 119731000146105 | Cardiomyopathy due to disease caused by Severe acute respiratory syndrome coronavirus 2
SNOMED CT | 119741000146102 | Conjunctivitis due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 119751000146104 | Fever caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 119981000146107 | Dyspnea caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 431711000000107 | Coronavirus infection, unspecified (disorder)
SNOMED CT | 1240411000000100 | Ribonucleic acid of Severe acute respiratory syndrome coronavirus 2 (substance)
SNOMED CT | 1240461000000100 | Measurement of Severe acute respiratory syndrome coronavirus 2 antibody (observable entity)
SNOMED CT | 1240471000000100 | Measurement of Severe acute respiratory syndrome coronavirus 2 antigen (observable entity)
SNOMED CT | 1240521000000100 | Otitis media due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 1240531000000100 | Myocarditis due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 1240531000000103 | Myocarditis caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240541000000100 | Infection of upper respiratory tract caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 1240541000000107 | Upper respiratory tract infection caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240551000000100 | Pneumonia caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240551000000105 | Pneumonia caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240561000000100 | Encephalopathy due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 1240561000000108 | Encephalopathy caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240571000000100 | Gastroenteritis caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240571000000101 | Gastroenteritis caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1240581000000100 | Severe acute respiratory syndrome coronavirus 2 detected (finding)
SNOMED CT | 1240581000000104 | Wuhan 2019-nCoV (novel coronavirus) detected
SNOMED CT | 1240751000000100 | Disease caused by Wuhan 2019-nCoV (novel coronavirus)
SNOMED CT | 1861021000006100 | Coronavirus contact
SNOMED CT | 138389411000119000 | Acute bronchitis caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 189486241000119000 | Asymptomatic Severe acute respiratory syndrome coronavirus 2 infection (finding)
SNOMED CT | 292508471000119000 | History of disease caused by Severe acute respiratory syndrome coronavirus 2 (situation)
SNOMED CT | 674814021000119000 | Acute respiratory distress syndrome due to disease caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 688232241000119000 | Disease caused by Severe acute respiratory syndrome coronavirus 2 absent (situation)
SNOMED CT | 880529761000119000 | Lower respiratory infection caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
SNOMED CT | 882784691000119000 | Pneumonia caused by Severe acute respiratory syndrome coronavirus 2 (disorder)
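Qualifying diagnoses are identified by a code system together with a code, so a lookup keyed on both values is one way to flag qualifying encounters. The Python sketch below illustrates this with three pairs taken from Supplemental Table 2A. It assumes that each condition row carries encounterid, codetype, and conditioncode, and that codetype uses the same labels as the Code System column; neither assumption is confirmed by the dataset documentation, and the function name and example rows are hypothetical.

```python
from typing import Iterable, Mapping, Set, Tuple

# A small illustrative subset of (code system, code) pairs from
# Supplemental Table 2A; a real analysis would load the full list.
QUALIFYING_DIAGNOSES: Set[Tuple[str, str]] = {
    ("ICD-10-CM", "B97.29"),
    ("ICD-10-CM", "J12.81"),
    ("SNOMED CT", "840539006"),
}


def _key(codetype: str, code: str) -> Tuple[str, str]:
    """Normalize whitespace and case so minor formatting differences still match."""
    return (codetype.strip().upper(), code.strip().upper())


_QUALIFYING_KEYS = {_key(system, code) for system, code in QUALIFYING_DIAGNOSES}


def qualifying_encounter_ids(conditions: Iterable[Mapping[str, str]]) -> Set[str]:
    """Return the encounter IDs that have at least one qualifying diagnosis."""
    return {
        row["encounterid"]
        for row in conditions
        if _key(row["codetype"], row["conditioncode"]) in _QUALIFYING_KEYS
    }


# Hypothetical condition rows for illustration only.
conditions = [
    {"encounterid": "E1", "codetype": "ICD-10-CM", "conditioncode": "B97.29"},
    {"encounterid": "E2", "codetype": "ICD-10-CM", "conditioncode": "I10"},
    {"encounterid": "E3", "codetype": "snomed ct", "conditioncode": "840539006"},
]
print(sorted(qualifying_encounter_ids(conditions)))
```

Matching on the pair rather than on the code alone avoids treating the same string in two code systems as one concept.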

SUPPLEMENTAL TABLE 2B: List of qualifying COVID-19 laboratory codes in Cerner COVID-19 deidentified dataset

LOINC codes (listed in numerical order):
41003-5, 41005-0, 41009-2, 62423-9, 82161-1, 82162-9, 82163-7, 82164-5, 88610-1, 88618-4, 88626-7, 94307-6, 94308-4, 94309-2, 94311-8, 94312-6, 94314-2, 94315-9, 94316-7, 94500-6, 94503-0, 94504-8, 94505-5, 94506-3, 94507-1, 94508-9, 94509-7, 94510-5, 94511-3, 94531-1, 94533-7, 94534-5, 94547-7, 94558-4, 94559-2, 94562-6, 94563-4, 94564-2, 94565-9, 94639-2, 94644-2, 94645-9, 94646-7, 94660-8, 94661-6

Code descriptions (listed in alphabetical order):
Human coronavirus 229E RNA [Presence] in Nasopharynx by NAA with non-probe detection
Human coronavirus 229E RNA [Presence] in Unspecified specimen by NAA with probe detection
Human coronavirus 229E RNA [Presence] in Upper respiratory specimen by NAA with probe detection
Human coronavirus HKU1 RNA [Presence] in Nasopharynx by NAA with non-probe detection
Human coronavirus HKU1 RNA [Presence] in Unspecified specimen by NAA with probe detection
Human coronavirus NL63 RNA [Presence] in Nasopharynx by NAA with non-probe detection
Human coronavirus NL63 RNA [Presence] in Unspecified specimen by NAA with probe detection
Human coronavirus NL63 RNA [Presence] in Upper respiratory specimen by NAA with probe detection
Human coronavirus OC43 RNA [Presence] in Nasopharynx by NAA with non-probe detection
Human coronavirus OC43 RNA [Presence] in Unspecified specimen by NAA with probe detection
Human coronavirus OC43 RNA [Presence] in Upper respiratory specimen by NAA with probe detection
SARS coronavirus 2 Ab [Interpretation] in Serum or Plasma
SARS coronavirus 2 Ag [Presence] in Respiratory specimen by Rapid immunoassay
SARS coronavirus 2 E gene [Cycle Threshold #] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 E gene [Presence] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 IgA Ab [Presence] in Serum or Plasma by Immunoassay
SARS coronavirus 2 IgG Ab [Presence] in Serum or Plasma by Immunoassay
SARS coronavirus 2 IgG Ab [Presence] in Serum or Plasma by Rapid immunoassay
SARS coronavirus 2 IgG Ab [Units/volume] in Serum or Plasma by Immunoassay
SARS coronavirus 2 IgG and IgM panel - Serum or Plasma by Immunoassay
SARS coronavirus 2 IgG and IgM panel - Serum or Plasma by Rapid immunoassay
SARS coronavirus 2 IgG+IgM Ab [Presence] in Serum or Plasma by Immunoassay
SARS coronavirus 2 IgM Ab [Presence] in Serum or Plasma by Immunoassay
SARS coronavirus 2 IgM Ab [Presence] in Serum or Plasma by Rapid immunoassay
SARS coronavirus 2 IgM Ab [Units/volume] in Serum or Plasma by Immunoassay
SARS coronavirus 2 N gene [Cycle Threshold #] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 N gene [Cycle Threshold #] in Unspecified specimen by Nucleic acid amplification using primer-probe set N1
SARS coronavirus 2 N gene [Cycle Threshold #] in Unspecified specimen by Nucleic acid amplification using primer-probe set N2
SARS coronavirus 2 N gene [Presence] in Respiratory specimen by NAA with probe detection
SARS coronavirus 2 N gene [Presence] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 N gene [Presence] in Unspecified specimen by Nucleic acid amplification using primer-probe set N1
SARS coronavirus 2 N gene [Presence] in Unspecified specimen by Nucleic acid amplification using primer-probe set N2
SARS coronavirus 2 ORF1ab region [Cycle Threshold #] in Respiratory specimen by NAA with probe detection
SARS coronavirus 2 ORF1ab region [Cycle Threshold #] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 ORF1ab region [Presence] in Respiratory specimen by NAA with probe detection
SARS coronavirus 2 ORF1ab region [Presence] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 RdRp gene [Cycle Threshold #] in Respiratory specimen by NAA with probe detection
SARS coronavirus 2 RdRp gene [Cycle Threshold #] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 RdRp gene [Presence] in Respiratory specimen by NAA with probe detection
SARS coronavirus 2 RdRp gene [Presence] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 RNA [Presence] in Nasopharynx by NAA with non-probe detection
SARS coronavirus 2 RNA [Presence] in Respiratory specimen by NAA with probe detection
SARS coronavirus 2 RNA [Presence] in Serum or Plasma by NAA with probe detection
SARS coronavirus 2 RNA [Presence] in Unspecified specimen by NAA with probe detection
SARS coronavirus 2 RNA panel - Respiratory specimen by NAA with probe detection

SUPPLEMENTAL TABLE 2B (continued)

LOINC Code | Code Description
94564-2 | SARS coronavirus 2 IgM Ab [Presence] in Serum or Plasma by Immunoassay
94306-8 | SARS coronavirus 2 RNA panel - Unspecified specimen by NAA with probe detection
94642-6 | SARS coronavirus 2 S gene [Cycle Threshold #] in Respiratory specimen by NAA with probe detection
94643-4 | SARS coronavirus 2 S gene [Cycle Threshold #] in Unspecified specimen by NAA with probe detection
94640-0 | SARS coronavirus 2 S gene [Presence] in Respiratory specimen by NAA with probe detection
94641-8 | SARS coronavirus 2 S gene [Presence] in Unspecified specimen by NAA with probe detection
41458-1 | SARS coronavirus RNA [Presence] in Unspecified specimen by NAA with probe detection
94313-4 | SARS-like coronavirus N gene [Cycle Threshold #] in Unspecified specimen by NAA with probe detection
94310-0 | SARS-like coronavirus N gene [Presence] in Unspecified specimen by NAA with probe detection
94502-2 | SARS-related coronavirus RNA [Presence] in Respiratory specimen by NAA with probe detection
94647-5 | SARS-related coronavirus RNA [Presence] in Unspecified specimen by NAA with probe detection
94532-9 | SARS-related coronavirus+MERS coronavirus RNA [Presence] in Respiratory specimen by NAA with probe detection

SUPPLEMENTAL TABLE 3: Categories of variables

Categories
Demographics
Encounter metadata (hospital location encounter type)
Diagnoses/comorbidities
Major procedures
Medications
Immunizations
Laboratory test results
Clinical events/results
