Gender, Sex, and Sexual Orientation in Medicine: A Linguistic Analysis

A dissertation submitted to the

Graduate School

of the University of Cincinnati

in partial fulfillment of the

requirements for the degree of

Doctor of Philosophy

in the Department of Biomedical Informatics

of the College of Medicine

by

Clair Artemis Kronk

B.Sc. University of Pittsburgh

April 2017

Committee Chair: Judith Dexheimer, Ph.D.

ABSTRACT

Nine million Americans identify as LGBTQIA+ (lesbian, gay, bisexual, transgender, queer/questioning, intersex, agender/asexual, and other umbrella gender and sexual identity minorities), with an additional 187,000 expressing intersex anatomical variations.

LGBTQIA+ populations experience disproportionate amounts of discrimination and stigmatization. A 2017 survey revealed that 51% of LGBTQIA+ people said that they or an LGBTQIA+ friend had experienced violence due to their identity. LGBTQIA+ discrimination is also prevalent in healthcare: 33% of transgender individuals disclosed negative experiences related to a health care provider, and 23% described avoiding seeing a doctor when they needed to for fear of mistreatment.

Such negative experiences are often connected to language use. Linguistic stigmatization has been tied to poor patient outcomes in healthcare settings. However, current provider education on LGBTQIA+ topics is lacking, with a median of 5 hours dedicated to such subjects.

A first step to addressing health disparities is to adequately model domain-specific linguistic knowledge. In medicine, language is modelled using controlled vocabularies or ontologies. Ontologies are common, shared networks which explain information in a domain. Such systems allow for greater reuse of domain-specific knowledge, and analysis of that knowledge. Although there are hundreds of biomedical ontologies, none cover LGBTQIA+ subject areas, or the areas of gender, sex, and sexual orientation.


We created the Gender, Sex, and Sexual Orientation (GSSO) ontology and evaluated its usage in research, education, and clinical domains for accuracy, completeness, conciseness, adaptability, clarity, computational efficiency, and consistency. The GSSO includes over 10,000 entries, 14,000 mappings to other databases, more than 200 slang terms with definitions, 200 nonbinary and culturally-specific gender identities, and 190 pronouns with linked example usages. The GSSO is freely available via GitHub (https://github.com/Superraptor/GSSO) and its website (https://gsso.research.cchmc.org/) as well as via the NCBO BioPortal, EMBL-EBI OLS, and Ontobee (as part of the OBO Foundry ontologies).

In research domains, the GSSO performed on par with manually curated literature reviews and outperformed other ontologies in the space. In education, it was easily understood by clinical and non-clinical subgroups. In clinical systems, it outperformed current identification methodologies.

We also tested the system's efficacy in LGBTQIA+ language identification in free text, including research-related abstracts and clinical notes in electronic health records (EHRs).

In an LGBTQIA+-specific set of MEDLINE abstracts, the GSSO was able to tag 99.85% versus 82.62% for MeSH tagging. In a manually curated transgender bibliography, MeSH returned only 86.9% of results versus 97.7% for the GSSO. In the EHR, the GSSO outperformed ICD-based identification of transgender persons in both the MIMIC-III (100% versus 46%) and CCHMC (recall and precision of 0.74 and 0.79 versus 0.50 and 0.53) datasets.


The GSSO is an effective ontological tool which can be applied across a wide variety of domains and questions.


ACKNOWLEDGEMENTS

I would like to extend my sincere thanks to all persons who helped me over the course of this project and its many individual endeavors.

Firstly, many thanks to my primary research supervisor and thesis advisor, Dr. Judith W. Dexheimer, for providing me with the opportunity to pursue research which would likely not be accepted in many places given the current cultural and political climates. She taught me so much over the years I spent here and managed to make me smile even in the worst of circumstances. She is an absolutely incredible mentor.

I would also like to thank the other members of my committee, Dr. Giao Q. Tran and Dr. Mark H. Eckman, for their time, patience, and expertise.

Additionally, I would like to thank administrative and departmental personnel who helped me along this journey, including Dr. Jaroslaw (Jarek) Meller, Dr. Eric Hall, Jill Loch, Mary Jo Petersman, Sonya Harbin, Melissa Hogan, and especially Dr. Batsheva Guy, who facilitated my process of coming out at the University of Cincinnati (UC) and Cincinnati Children’s Hospital Medical Center (CCHMC) and helped me navigate numerous issues within the department and the university at large.

From the Homosaurus board of directors, I would like to thank K.J. Rawson, Amber Billey, Marika Cifor, Chloe Noland, Jack van der Wel, Bri M. Watson, Jay L. Colbert, and especially the late Walter “Cat” Walker, who passed away tragically in 2020, after 16 years as Head Cataloging Librarian at Loyola Marymount University and 25 years as a volunteer at the ONE National Gay & Lesbian Archives at the University of Southern California (USC) Libraries.

From the GLBT Museum & Archives, I would like to thank Isaac Fellman, Kelsi Evans, Nalini Elias, and Patricia Delara.

From the AIDS History Project and Memory Lives On: Documenting the HIV/AIDS Epidemic project group, I would like to thank Charlie Macquarie, Joanna Kang, Rebecca Tang, and Polina Ilieva.

From the board of directors at OutHistory, I would like to thank Jonathan Ned Katz and Dr. Randall Sell.

From the American Medical Informatics Association (AMIA) Mental Health Working Group (MH-WG) and the Systematized Nomenclature of Medicine (SNOMED) Mental and Behavioural Health Clinical Reference Group (MBHCRG), I would like to thank Dr. Piper Ranallo and Dr. Jessie Tenenbaum.

From the AMIA Diversity, Equity, and Inclusion Task Force (DEI-TF), I would like to thank Dr. Tiffani J. Bright, Dr. Suzanne Bakken, Oliver J. Bear Don’t Walk IV, David K. Butler, Dr. Carl E. Johnson, Dr. Kevin B. Johnson, Dr. Casey Overby Taylor, Dr. Jyotishman Pathak, Carolyn Petersen, Dr. Rubina Fatima Rizvi, Rosemary Ventura, Dr. Karen Wang, Dr. Patricia C. Dykes, and AMIA staff Karen Greenwood, Krista Martin, Lisa Gibson, and Nina Richards.

From the Health Sciences Graduate Association (HSGA) and from the Graduate Student Government (GSG), I would like to thank Kenyatta Viel, Jelena Vicic, Caroline Sackleh, Molly Broscoe, Smruti Deoghare, Hannah Russell, and Kara Finley Wolfe.


To the following individuals (organizations in parenthetical), I would like to give my utmost thanks for their kindness, friendship, compassion, and support over the last few years: Jake Tracy, Albert Carter, Wesley Parker, Avery Everhart, Laur Bereznai, Kiri Stewart, Madg Weighner, Riley Galvin, Louis Markowitz, Florence Paré, Katie Kardum, Sam Jackson, Jennifer Lynn, Mathias Vitullo, Lili LC, Troy Henson, Danielle Parker, Harris Wheeler, Madison Mumma, Jace Rubino, Alex Loss, Jessica Howey, Ariel Mary Ann, Jeremy Brenner-Levoy, Fait Poms, Hugh Ryan, Blair Perry, Madeline Barber, Corey Forman, Gabrielle Cuadra, Susan Stryker, Patti Brennan, Pieter-Jan Van Camp, Surbhi Bhatnagar, Madrid Vinarski, Rachel Golden, Kayleigh Rozwat, Sarah Burns, Christina Lancaster, Andie Vester, Sylvia Guard, Aurora Starr, Angela Larsen, Nora Anderson, and especially the late Ariel Galant, who passed away in 2020 after a lifetime of service to disenfranchised communities. She was an amazing ally and an incredible friend. She will be missed.

To my parents, my siblings, and my extended chosen family: thank you so much for all of your support over the course of this project. This document would not be possible without you. Thank you for every time you helped me move, thank you for letting me stay with you during various conferences and work-related trips, thank you for listening to all of my jargon-laden ranting, thank you for asking questions about my work, thank you for visiting me, for keeping me sane during one of the worst pandemics of the last century, during the many times political figures blamed people like me for war, for famine, for mass shootings, during the many death threats I received, through the discrimination I faced in and around the workplace and in Cincinnati generally. I would not be here today if it were not for you.


Additionally, I would like to acknowledge that the University of Cincinnati was built on land forcibly taken from a number of Indigenous Algonquian-speaking tribes, including the Delaware, Miami, and Shawnee tribes. This action and the actions of many other colonizers led directly to the deaths of over 100 million Native American and First Nations peoples. I would also like to acknowledge and condemn my specific institution’s promotion of white nationalism, direct and indirect participation in racially-motivated violence, especially against members of the Black community, lack of appropriate mental health resources for students and information regarding suicide prevention, and continued disregard for sexual assault survivors and lack of appropriate response to sexual harassment and violence, as showcased by the dismemberment of Reclaim and Title IX resources. Finally, I would like to acknowledge my transgender siblings, especially the 1,657 who were brutally murdered simply for being trans during my time writing this document.


TABLE OF CONTENTS

ABSTRACT ...... ii

ACKNOWLEDGEMENTS ...... vi

TABLE OF CONTENTS ...... x

LIST OF FIGURES ...... xiii

LIST OF TABLES...... xiv

LIST OF ABBREVIATIONS ...... xvi

PART I: INTRODUCTION ...... 2

Chapter 1: What is an Ontology? ...... 5

Chapter 2: Biomedical Ontologies ...... 7

Chapter 3: Ontology Development ...... 10

Chapter 4: The Importance of Language ...... 13

Chapter 5: Language of Sex and Gender ...... 15

Chapter 6: Sex and Gender in Medicine ...... 19

PART II: STUDY DESIGN ...... 25

Chapter 7: Building Ontologies ...... 25

Chapter 8: Data Sources and Dataset Construction ...... 27

Chapter 9: Criteria for Ontology Completion ...... 28


PART III: ONTOLOGY CONSTRUCTION ...... 30

Chapter 10: Literature Review ...... 32

Review of Existing Ontologies ...... 38

Review of Classification Systems ...... 39

Chapter 11: Entity and Attribute Construction ...... 41

PART IV: ONTOLOGY EVALUATION ...... 42

Chapter 12: Research Documentation ...... 53

MEDLINE ...... 53

AIDS History Project ...... 59

Chapter 13: Medical Documentation ...... 60

MIMIC-III ...... 63

Cincinnati Children’s Hospital ...... 66

PART V: DISCUSSION ...... 75

Chapter 14: Limitations ...... 76

Chapter 15: Future Directions ...... 79

Chapter 16: Conclusions ...... 80

REFERENCES ...... 82

APPENDICES ...... 89

Appendix A: Rule Structures ...... 89

Appendix B: Online Resources ...... 92

Appendix C: Survey Instruments ...... 94

Appendix D: Necessary Mappings ...... 97

Appendix E: Additional Statistical Calculations ...... 101


LIST OF FIGURES

Figure 1. Sample entries from version 2 (left) and version 1 (right) shown in Protégé...... 44

Figure 2. Example connectivity of the GSSO in version 1, showcasing the high connectedness in the GSSO graph and sub-graphs...... 45

Figure 3. Screenshots from initial markups for the stand-alone GSSO website with version 1 (left) and version 2 (right)...... 50

Figure 4. Transgender-related diagnostic codes in the International Classification of Diseases, specifically in ICD-9 (Panel A) and ICD-10 (Panel B). Bolded codes are typically considered unambiguous in their reference to transgender persons...... 63

Figure 5. Comparison of identification methodologies in the CCHMC dataset (left) and the Foer et al (2019) dataset (right)...... 73

Figure 6. Distribution of matching scores using GSSO algorithm for transgender identification...... 74


LIST OF TABLES

Table 1. Outline of ontology components...... 11

Table 2. Evaluation criteria derived from Raad and Cruz (68)...... 29

Table 3. Numerical comparisons between 1.0.0 and 2.0.0...... 31

Table 4. Classification systems, subject headings, and thesauri which cover LGBTQIA+ topics specifically...... 33

Table 5. Encyclopedias which cover LGBTQIA+ topics explicitly or implicitly. Number of entries are listed only if applicable to the encyclopedia’s structure and if the encyclopedia itself was available for analysis...... 35

Table 6. Other resources covering LGBTQIA+ terminologies...... 37

Table 7. External database mappings from version 1 to version 2. Only databases with more than ten mappings are shown...... 39

Table 8. Graph metrics for classes (with individuals removed) calculated for the GSSO (Gender, Sex, and Sexual Orientation Ontology), MeSH (Medical Subject Headings), SNOMED-CT (Systematized Nomenclature of Medicine, Clinical Terms), and ICD-10-CM (International Classification of Diseases, Tenth Revision) using the Ontology Metrics (OntoMetrics) application. N/A = the ontology lacked the conciseness necessary for comprehensive analysis...... 46

Table 9. Comparison of average values of other OBO Foundry ontologies to the GSSO...... 47

Table 10. Group membership and estimated response rates pre-survey distribution. Based on Sivo et al, we label a lower range of 60% of active membership and an upper range of 90% of active membership...... 51

Table 11. Term frequencies for most common 32 terms across main terminologies for the HRC subset...... 55

Table 12. Percentages of subject coverage in 1799 to 1999 dataset versus Wanta and Unger dataset, up to June 2016 (Table 1 in that publication)...... 57

Table 13. Precision, recall, and F-scores for the top ten represented subjects in MEDLINE transgender literature review...... 58

Table 14. Percentage of transgender patients shown out of total adult patients and total diagnoses in MIMIC-III...... 64

Table 15. Differences in mental health diagnoses, based on ICD-9 and GSSO search mechanisms. Score cutoff for GSSO identification was 1.0...... 66

Table 16. Transgender-related ICD-9 diagnoses in the CCHMC dataset...... 67

Table 17. Transgender-related ICD-10 diagnoses in the CCHMC dataset...... 68

Table 18. Comparison of Williams Institute and CCHMC transgender racial/ethnic statistics. Latinx/Hispanic population statistics not recorded (hence the N/A)...... 71

Table 19. Overlap between various identification methods in CCHMC data...... 72

Table 20. Precision (P), recall (R), and F-scores (F) for ICD-10 identification, free-text identification, and GSSO identification based on a score cutoff of 1.0. Foer et al (2019) comparisons shown under averages (Table 1 in that paper), wherein methodologies are the same, except “GSSO” is “Keyword”...... 75


LIST OF ABBREVIATIONS

AIDS: Acquired Immunodeficiency Syndrome
ASAB: Assigned Sex At Birth
BIDMC: Beth Israel Deaconess Medical Center
CCHMC: Cincinnati Children’s Hospital Medical Center
ChEBI: Chemical Entities of Biological Interest
DSD: Disorders of Sexual Development
DTA: Digital Transgender Archive
EMBL: European Molecular Biology Laboratory
EMBL-EBI: EMBL European Bioinformatics Institute
EMBL-EBI OLS: EMBL-EBI Ontology Lookup Service
FFS: Forward Feature Selection
GLBT: Gay, Lesbian, Bisexual, Transgender
GSSO: Gender, Sex, and Sexual Orientation ontology
HL7: Health Level Seven
HRC: Human Rights Campaign
ICD-9: International Classification of Diseases, Ninth Revision
ICD-10: International Classification of Diseases, Tenth Revision
ICU: Intensive Care Unit
IDE: Integrated Development Environment
JSON: JavaScript Object Notation
JSON-LD: JavaScript Object Notation for Linked Data
LGBTQIA+: Lesbian, Gay, Bisexual, Transgender, Queer and questioning, Intersex, Agender and asexual, and other umbrella gender and sexual minority identities
MeSH: Medical Subject Headings
MIMIC: Medical Information Mart for Intensive Care
NCBO: National Center for Biomedical Ontology
NCI: National Cancer Institute
NLP: Natural Language Processing
OBO: Open Biological and Biomedical Ontologies
OBO Foundry: Open Biological and Biomedical Ontologies Foundry
OCR: Optical Character Recognition
OMRSE: Ontology for Medically Related Social Entities
OUS: Ontology Usability Scale
OWL: Web Ontology Language
PFLAG: Formerly stood for “Parents, Families, and Friends of Lesbians and Gays”; however, as of 2014, PFLAG is no longer an acronym
RDF: Resource Description Framework
SNOMED: Systematized NOMenclature of MEDicine
SNOMED-CT: SNOMED, Clinical Terms
SUS: System Usability Scale
UC: University of Cincinnati
UC CoM: University of Cincinnati, College of Medicine
UID: Unique IDentifier
UMLS: Unified Medical Language System

PART I: INTRODUCTION

Nine million Americans identify as LGBTQIA+ (lesbian, gay, bisexual, transgender, queer/questioning, intersex, agender/asexual, and other umbrella gender and sexual identity minorities), with an additional 187,000 expressing intersex anatomical variations (1,2). LGBTQIA+ populations experience various forms of discrimination and stigmatization. A 2017 survey revealed that 57% of LGBTQIA+ people experienced slurs, 51% said they or an LGBTQIA+ friend had been sexually harassed, and 51% said they or an LGBTQIA+ friend had experienced violence due to their identity. Twenty percent experienced discrimination when applying for jobs or purchasing housing (3).

This discrimination is also prevalent in healthcare, with 17% of LGBTQIA+ people reporting that they avoided medical care (3). Thirty-three percent of transgender individuals disclosed negative experiences related to a health care provider and 23% described not seeing a doctor when they needed to for fear of mistreatment (4).

Language often plays a role in these negative experiences. Linguistic stigmatization of substance use disorders, unaddressed language barriers, various forms of verbal abuse, and perceived and actual language-based discrimination have been tied to poor patient outcomes in healthcare settings, showcasing the wide range of potential effects language can have (5–13). As writer Kyle Taylor Shaughnessy noted in The Remedy: Queer and Trans Voices on Health and Health Care: “While the constantly evolving language and concepts of gender and sexual identity… can be overwhelming at times, if we don’t keep up we lose the ability to connect and therefore to do effective work” (14). However, current provider education on LGBTQIA+ topics is ineffective and inaccessible, with a median of 5 hours of U.S. medical education dedicated to such subjects (15), compared to 77 hours of microbiology and 19.6 hours of nutrition (16,17). As one clinician explained about their experience with transgender medicine and medical education in 2012:

[D]espite trying to find ways to improve my expertise, I just didn’t know where to go

or who to talk to, or where to get the information, and I felt really bad because some

of my initial attempts to help these people—I sent them to people I wish I hadn’t

sent them to. (18)

In 2018, Kenneth Mayer, medical research director of The Fenway Institute, echoed this sentiment, noting that one of the largest difficulties facing medical professionals who provide care for LGBTQIA+ patients is a lack of cultural competency stemming from this lack of education:

The biggest challenge is that the health care system is woefully unprepared to take

appropriate care of LGBTQ people… It’s a dawning idea that needs to gain traction,

that there’s also a whole field of sexual-gender minority health that providers need

to have an understanding of. (19)

A first step to addressing LGBTQIA+ health disparities is to adequately model domain-specific knowledge. This model can then be used to promote education and understanding of LGBTQIA+ topics and foster better communication between LGBTQIA+ patients and clinicians (20). Such a system could help clarify existing, disparate data which uses disconnected language entities.


In medicine, language is modelled using controlled vocabularies or ontologies. Ontologies are common, shared networks which explain information in a given domain (21). These systems allow for greater reuse of domain-specific knowledge, analysis of that knowledge, and greater explicitness within that domain (22). There are hundreds of biomedical ontologies, but none of them cover LGBTQIA+ subject areas, or the areas of gender, sex, and sexual orientation (23).

There are no standards or base vocabularies for gender identity or sexual orientation identity collection in medical or nonmedical literature. PFLAG (https://pflag.org/) and the HRC (https://www.hrc.org/) have produced recommendations, but these have focused on media representation of LGBTQIA+ individuals rather than structuring of medical- or research-based documentation. Producing a novel ontology would allow for a new subfield of medical linguistics with a focus on LGBTQIA+ individuals, something required for a more complete understanding of the contemporary patient–clinician relationship.

We constructed and evaluated a controlled ontology for gender-, sex-, and sexual orientation-based terminology. This ontology was built using an interdisciplinary approach, incorporating terms from the biomedical sciences, psychology, sociology, and gender studies. The evaluation process was split into modeling of research-based documentation and modeling of medical documentation. The ontology served as a basis for electronic medical record data collection and systems, as well as a general reference for patients, clinicians, archivists, and researchers interested in providing more comprehensive LGBTQIA+ medical care.

The specific aims of this project were to:


Aim 1: Create an ontology which adequately covers gender, sex, and sexual

orientation in health care using an interdisciplinary approach.

Aim 2: Evaluate the accuracy, completeness, conciseness, adaptability, clarity,

computational efficiency, and consistency of the ontology.

Aim 2.1: Evaluate the GSSO in the research domain.

Aim 2.2: Evaluate the GSSO in the medical domain.

Chapter 1: What is an Ontology?

Ontology is often thought of as a philosophical discipline rather than as part of the natural and applied sciences. In philosophical discourse, ontology is described as the study of being. It is considered a subdiscipline of metaphysics which details entities, their existence or nonexistence, and, if they may be grouped, in what sense they should be.

The gap between philosophical ontology and information science ontology is made more explicit in Tom Gruber’s 1995 paper “Toward Principles for the Design of Ontologies Used for Knowledge Sharing”. In this work, Gruber defines ontologies in information and computer sciences as follows:

An ontology is a description (like a formal specification of a program) of the

concepts and relationships that can formally exist for an agent or a community of

agents. This definition is consistent with the usage of ontology as set of concept


definitions, but more general. And it is a different sense of the word than its use in

philosophy.

The application of ontology in the information sciences is known as a domain ontology.

Domain ontology is one of four basic modes1 by which a philosopher could utilize concepts to solve relevant problems. Domain ontologies ground the conceptualization as a representation of concepts and complexity into a specific model. These can be built utilizing some information object (such as abstract data types like trees and lists). For example, Linnaean taxonomy is a simple domain ontology with hierarchical relationships. A species belongs to a single genus, a genus to a single family, a family to a single order, and so on, while an order can contain many families, a family can contain many genera, etc. A linked list data type could be used to build such a system. Ontologies can also be used to create graph database systems, sometimes called knowledge graphs. Because ontologies can be represented using graph data objects, the terms ontology and knowledge graph may overlap in the information sciences.
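The Linnaean example can be sketched in a few lines of Python. This is an illustration only and not part of the GSSO: the `PARENT` mapping and the `lineage` helper are hypothetical names, and the taxa shown are an abbreviated slice of the order Carnivora. Each taxon stores a link to its single parent, so following the links upward behaves like traversing a linked list, while inverting the mapping recovers the one-to-many direction (an order containing many families, and so on).

```python
from collections import defaultdict

# Each taxon links to exactly one parent (child -> parent).
PARENT = {
    "Canis lupus": "Canis",
    "Canis latrans": "Canis",
    "Canis": "Canidae",
    "Vulpes": "Canidae",
    "Canidae": "Carnivora",
    "Felidae": "Carnivora",
}

def lineage(taxon):
    """Follow parent links upward until a taxon with no recorded parent."""
    chain = [taxon]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

# Invert the links to see that one parent can contain many children.
CHILDREN = defaultdict(list)
for child, parent in PARENT.items():
    CHILDREN[parent].append(child)

print(lineage("Canis lupus"))       # species -> genus -> family -> order
print(sorted(CHILDREN["Canidae"]))  # genera recorded under one family
```

Because every node has exactly one parent, this structure can express only hierarchical relationships, which is what distinguishes a taxonomy from the richer ontologies described later in this chapter.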

The word “ontology” is often connected to terms like vocabulary, controlled vocabulary, and taxonomy in the information sciences. A vocabulary is simply any set of terms or concepts; it can be conceptualized as a free-text box. Vocabularies are constrained by the limitations of natural language. A controlled vocabulary is then more akin to a drop-down list or even a spelling test, where some entity has artificially limited or scoped the vocabulary.

1 The other modes are upper ontology, interface ontology, and process ontology.

Controlled vocabularies (and vocabularies generally) do not have any inherent structure. They may be organized or unorganized. A vocabulary organized by its constituent properties may sometimes be called a dictionary, but this term is multifaceted in the information sciences and may also refer to a dictionary data structure, which is unorganized.

Adding structure to a controlled vocabulary in which terms are organized hierarchically produces a taxonomy. Taxonomies contain only hierarchical relationships; non-hierarchical relationships are not allowed. One way to add non-hierarchical connections without forcing a hierarchical structure is to construct a thesaurus. Thesauri have a limited number of connections which are often defined outside of the thesaurus itself, such as synonyms or antonyms.

An ontology combines all of these elements, while also being more flexible, by having a hierarchical structure with relationships that are defined within the ontology itself. Non-hierarchical relationships are possible as well and are flexible enough to include any relationships that an ontologist would want or need. However, in ontology science, these entities and relationships have specific names as components of the ontology. These are discussed further in Chapter 3.

Chapter 2: Biomedical Ontologies

Ontologies are useful in domains with complex, heterogeneous, and constantly evolving data representations. Bodenreider and Stevens noted that biomedical scientists collect natural language-based facts rather than more homogeneous mathematical formulae and that natural language is difficult to reduce to computational form. Biology and applied biology, in the form of medicine, form natural candidates for ontology building and subsequent natural-language data annotation.

“Bio-ontologies” developed in an unplanned fashion, concurrently with the rise of bioinformatic data types. Bio-ontologies special interest groups have been meeting since 1998. In 1999, the Gene Ontology was created, one of the longest-running active ontologies within the Open Biomedical Ontologies (OBO) (24).

The advent of ‘big data’ biological technologies required data to be stored in enormous biomedical databases, hence the need for structured data types. For example, in the domains we modeled, “gender identity” was given contradictory use cases and information types. MeSH suggests the term is a ‘personal attribute’ while OMRSE (Ontology for Medically Related Social Entities) considers it an ‘information content entity’. Psychologists and mathematicians consider ‘identity’ to mean something different as well. Ontologies allow us to compare these contexts and codify them. Usage notes, multiple hierarchies, definitions, and references assist users and computer platforms in using the appropriate material. This can be considered akin to the five rights of medication administration (25), but instead offering the right material to the right individual at the right time.

As the amount of data being analyzed grew, the number of distinct ontologies to describe data grew as well. This growth facilitated the creation of several organizations which host a number of ontologies alongside one another. For instance, the Open Biological and Biomedical Ontologies (OBO) Foundry was formed2, inspired by the efforts of the Gene Ontology (GO) project. The National Center for Biomedical Ontology (NCBO), one of the National Centers for Biomedical Computing (NCBC) in the United States, was also created around the same time. Both systems allowed for a unification of various evolving biomedical ontology standards, leading to an increase in interoperability between ontologies and ontology platforms.

2 Originally known simply as Open Biomedical Ontologies in 2001, the OBO Foundry was started in 2007.

Controlled vocabularies in bioinformatics and medical informatics have existed in non-computational forms for centuries, leading to disparate ontology systems in research and medical care. Even today, bioinformatics- and medical informatics-based ontologies follow significantly different paradigms due to massive differences in their traditional applicability. However, as translational research has become more prioritized, connections between the two have become more frequent (26). This has led to the extensive creation of mappings between various ontologies and controlled vocabularies and complex mechanisms for ontological matching (27,28).

Such mappings may be present in the ontologies themselves, but are more commonly autogenerated in various platforms, such as in the NCBO BioPortal, Ontobee, and the UMLS (Unified Medical Language System) Terminology Services. These automated processes increase system efficiency computationally. However, they also lead to problems which are difficult to detect in larger ontologies. Matching and mapping mechanisms suffer from the quadratic complexity of the matching problem and so probabilistic models usually take the place of more robust mapping mechanisms (29).
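To make the quadratic cost concrete, the sketch below compares every label in one vocabulary against every label in another, so two vocabularies of n and m terms require n × m comparisons. It is a deliberately naive illustration using Python's standard `difflib` module. The function name `naive_match`, the 0.8 similarity threshold, and the sample labels are assumptions made for this example; they do not describe how BioPortal, Ontobee, or the UMLS actually generate mappings.

```python
from difflib import SequenceMatcher

def naive_match(source_labels, target_labels, threshold=0.8):
    """Exhaustively compare every source label with every target label.

    Returns the candidate mappings and the number of comparisons made.
    """
    matches = []
    comparisons = 0
    for s in source_labels:
        for t in target_labels:
            comparisons += 1
            # Case-insensitive string similarity in the range [0, 1].
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if score >= threshold:
                matches.append((s, t, round(score, 2)))
    return matches, comparisons

# Hypothetical labels, chosen only to illustrate the comparison count.
source = ["gender identity", "sexual orientation", "sex assigned at birth"]
target = ["Gender Identity", "Sexual Behavior", "Sexual Orientation", "Intersex"]

matches, comparisons = naive_match(source, target)
print(comparisons)  # 3 source labels * 4 target labels = 12
for match in matches:
    print(match)
```

With ontologies containing tens of thousands of entries on each side, the same loop would need hundreds of millions of comparisons, which is one reason approximate or probabilistic matching is often preferred in practice.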

The idea of robust cross-disciplinary, interdisciplinary, or transdisciplinary ontologies remains unaddressed in information science. Because ontologies typically focus on a single ‘domain’ by definition (30), many such systems become isolated and stagnant. Many other issues still surround ontological science, including multilinguality, language neutrality, terminological ambiguity, and linguistic context. In domains which integrate multiple complex subjects, such as in the realms of gender, sex, and sexual orientation, these problems compound. However, a few interdisciplinary-based ontologies exist and have showcased their effectiveness in leading to novel insights in multiple domains (31). Single-discipline ontologies build on philosophical, cognitive science, linguistic, and logical disciplines and subdisciplines in their construction (32,33).

Chapter 3: Ontology Development

Ontologies are composed of several different, fundamental features: individuals, classes, attributes, relations, function terms, restrictions, rules, axioms, and events (Table 1). Of these features, individuals, classes, and attributes are the most crucial parts to cover a diverse set of use cases. Complex rules and restrictions can lead to problems in natural language processing applications, wherein a significant amount of documentation may be required for usage or the ontology may not translate well between ontology languages/formats.


Table 1. Outline of ontology components.

Attribute
  Meaning: A property, feature, parameter, characteristic or quality which an individual, class, or attribute has.
  Notes: In the statement “canaries are the color yellow”, “color” could be considered the attribute, with “canaries” as the subject and “yellow” as the predicate.

Axiom
  Meaning: Logical form assertions or rules which form the overall theory the ontology describes.

Class
  Meaning: A set, collection, category, or kind of thing.
  Notes: “Wine” can be considered a set with subclasses like “white wine” and “red wine”. “Chardonnay” could then be considered an instance of the class “white wine”, depending on the framing of the ontology.

Event
  Meaning: Shifting or changing of attributes or relations.

Function Term
  Meaning: More complicated structures created from relations which can serve as an individual term in an assertion.

Individual
  Meaning: A “ground level” object or thing.
  Notes: Also called “instances”, as in an instance of a class. What is or is not an instance depends on the framing and specificity of the ontology.

Relation
  Meaning: The ways in which classes, individuals, and other relations interact or relate to one another.
  Notes: A dog is a mammal. “is a” is considered the attribute in this case. A relation is a connection between these objects, usually in the form of a triple. The full statement “a dog is a mammal” has ‘dog’ as the subject and ‘mammal’ as the predicate, making ‘mammal’.

Restriction
  Meaning: Formal logical descriptions of what must be true for some assertion to be accepted.
  Notes: For instance, if an assertion is made with a ‘datetime’ format, but does not follow that format, it will be rejected.

Rule
  Meaning: Statements written as antecedent-consequent forms which can be drawn from assertions.
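Several of the components in Table 1 can be illustrated with three-part assertions (triples) held in plain Python tuples. This is a minimal sketch for orientation only. The relation names (`subclass_of`, `instance_of`, `has_color`) and the helper `classes_of` are hypothetical; they are not identifiers from the GSSO, OWL, or RDF.

```python
# Each assertion is a three-part tuple: (thing, relation, value).
TRIPLES = {
    ("white wine", "subclass_of", "wine"),       # class hierarchy
    ("red wine", "subclass_of", "wine"),
    ("Chardonnay", "instance_of", "white wine"),  # individual -> class
    ("canary", "has_color", "yellow"),            # attribute assertion
}

def classes_of(individual):
    """Return the direct class of an individual plus all of its superclasses."""
    found = []
    # Start from the classes the individual is directly an instance of.
    frontier = [o for s, p, o in TRIPLES if s == individual and p == "instance_of"]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.append(cls)
            # Climb the hierarchy through subclass_of assertions.
            frontier.extend(o for s, p, o in TRIPLES
                            if s == cls and p == "subclass_of")
    return found

print(classes_of("Chardonnay"))
```

Whether “Chardonnay” is modeled as an individual or as a further subclass is a framing decision, as the table notes; in this sketch, changing its relation from `instance_of` to `subclass_of` is all that is needed to switch between the two readings.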


These features are then codified in formal languages called ontology languages. Markup ontology languages tend to be the most common and include the Web Ontology Language (OWL), Resource Description Framework (RDF), and RDF Schema (RDFS).

Many software packages have been developed to make ontology design and development easier, with the most prevalent being Protégé. Protégé is a free, open-source ontology editor developed at the Stanford Center for Biomedical Informatics Research (34). It was originally released in November 1999 and has been continuously updated since. It has been called “the leading ontological engineering tool” (35), having more than 300,000 registered users.

It provides an easy-to-understand visual interface for editing and viewing. It also includes simplified processes to avoid complex operational mathematics encountered when adding a new annotation property, class, or individual. Additionally, Protégé works with multiple popular ontology formats like OWL (Web Ontology Language), OBO (Open Biomedical Ontologies), RDF (Resource Description Framework), and JSON-LD (JavaScript Object Notation for Linked Data).

Ontology components in Protégé can be defined as “entities”. These are objects within an ontology system (instances and classes). Instances are called individuals (classes remain classes) and attributes are referred to as properties. Properties are split into three main types: annotation properties, object properties, and data properties.

However, properties do not necessarily carry with them explicit logical reasoning. For instance, “date published” may be a datetime-specific property in a small domain where that knowledge is always known. In other cases, the information is incomplete: if only the year of publication is known, it is not possible to code just a year as a datetime object. For properties which are extensible (providing extra-logical information), for those which provide human-readable documentation (such as usage notes), and for data provenance information, it is preferable to use annotation properties.
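The year-only case can be sketched with Python's standard library. This is a minimal illustration, assuming a strict full-timestamp format for the data property; the format string and the function `classify_date_published` are hypothetical and are not part of Protégé or the GSSO.

```python
from datetime import datetime

# Assumed strict format for a "date published" data property.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def classify_date_published(value):
    """Store the value as typed data if it parses, else as a free-text annotation."""
    try:
        return ("data", datetime.strptime(value, DATETIME_FORMAT))
    except ValueError:
        # A bare year such as "2019" does not satisfy the datetime format,
        # so it is kept as a human-readable annotation instead of being rejected.
        return ("annotation", value)


print(classify_date_published("2019-06-24T00:00:00")[0])  # data
print(classify_date_published("2019"))                    # ('annotation', '2019')
```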

Chapter 4: The Importance of Language

Despite computational and discrete scientific descriptors, knowledge representation is an abstract art of language description. This introduces bias into the construction of any ontology.

While natural language processing applications may have difficulties utilizing complex rules and restrictions, codifying human-readable entries and language, especially for its usage in subject headings, requires differential treatment in ontological systems. For instance, the Medical Subject Headings (MeSH) includes a preferred label (prefLabel), a definition (definition), an identifier (ID), various alternate labels (altLabel), etc. These are then “matched” to various attributes when uploaded to the NCBO BioPortal for interoperable usage, becoming Preferred Name, Definitions, ID, and Synonyms. There are two main issues with this system: (1) technical issues, such as only the first altLabel becoming a Synonym; and (2) semantic issues, as “alternate label” in subject headings has a slightly different meaning than “synonym.”
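The technical issue described above, where only the first altLabel survives as a Synonym, can be sketched as follows. The record, its identifier, and both mapping functions are hypothetical; they only mimic the field names mentioned in the text and do not reproduce BioPortal's actual import logic.

```python
# Hypothetical subject-heading record; "EX:0001" is a placeholder, not a real MeSH ID.
record = {
    "ID": "EX:0001",
    "prefLabel": "Example heading",
    "definition": "An illustrative definition.",
    "altLabel": ["Alternate A", "Alternate B", "Alternate C"],
}


def lossy_map(rec):
    """Mimic the failure mode: only the first altLabel becomes a synonym."""
    return {
        "ID": rec["ID"],
        "Preferred Name": rec["prefLabel"],
        "Definitions": [rec["definition"]],
        "Synonyms": rec["altLabel"][:1],
    }


def full_map(rec):
    """Keep every altLabel so no alternate label is silently dropped."""
    mapped = lossy_map(rec)
    mapped["Synonyms"] = list(rec["altLabel"])
    return mapped


print(lossy_map(record)["Synonyms"])  # ['Alternate A']
print(full_map(record)["Synonyms"])   # ['Alternate A', 'Alternate B', 'Alternate C']
```

Neither function addresses the semantic issue: an alternate label copied into a Synonyms field is still labelled as a synonym, even where it is not strictly one.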

These problems are common in ontological structures and systems and underlie discussions of interoperability. It is a significant issue when considering semantic interoperability, which allows computer systems to send data to one another with the idea that the data is unambiguous and has the same meaning (36).

These difficulties complicate interoperability between medical devices, which led to the creation of international vocabularies and standards such as SNOMED-CT and HL7. All ontologies require updating as medical and scientific understanding evolves. The International Classification of Diseases (ICD) is working on the eleventh iteration and the Diagnostic and Statistical Manual of Mental Disorders (DSM) is on the fifth. However, these updates may continue at different speeds, necessitating a mapping back to their predecessors as an anchor. Certain statistics are no longer directly comparable, such as those affected by the evolution of diagnostic criteria for AIDS from the 1980s to the present. New conditions and new terminologies have no ‘backward’ mapping.

These issues contribute to making medical language subjective. Discussions of patienthood (37), the relationship to medical coding (38), and the determination of subjectivity (39) elucidate these complications. The subjectivity of language and modeling in ontologies is crucial in the biomedical sciences, wherein models are often characterized in natural language (40). This makes such data heterogeneous, which in turn makes conclusions based on traditional statistical testing difficult and often arbitrary. These decisions lead to grayer divisions in knowledge that humans and computer systems both have difficulty defining.

Subjectivity of language in the GSSO will be impossible to avoid, but it can be minimized. Updating of the ontology will be required for the system to remain useful. Mapping to both historical and contemporary contexts will be necessary to facilitate further research. It is for this reason that inclusion of references within the GSSO was prioritized.

Chapter 5: Language of Sex and Gender

Aspects of sex and gender deeply pervade language. As psychologist Susan Speer noted, there are “different, and often competing, theoretical and political assumptions about the way discourse, ideology and gender identity should be conceived and understood” (41).

This pervasion involves not only grammatical gender within language itself but broader sociolinguistic usage. This secondary level was first explored intently by linguist Robin Lakoff in her 1975 book Language and Woman’s Place, which examined the role of women in society as second-class. Lakoff’s work acknowledged usage of tag questions and question intonation.

Multiple approaches to construction of sex and gender roles in language have been proposed since Lakoff’s work, with her own approach being labelled deficit, as it defines adult male language as standard, and women’s language as deficient to some extent (42).

This approach is often compared to dominance, whereby feminine language is seen as subordinate, rather than nonstandard.

The longer history of grammatical gender is somewhat less clear. The Indo-European ancestor language included distinctions for animate and inanimate objects, which later were assigned feminine or masculine characteristics and grammatical gender based on animistic conceptions of the world (43). Other language families, such as the Austronesian, Turkic, and Uralic languages, are genderless, while a select few have more than three grammatical genders. Higher rates of gender inequality are observed in countries wherein citizens speak gendered languages (44).

In English, there is not a grammatical gender, but certain terms have been imbued with gender-specific meanings (woman, man, queen, king, he, she, etc.). Many of these terms have pairings, such as the third-person personal pronouns he and she. As early as 1542, scholars applied sexist value to these terms, with William Lily writing that “[t]he Masculine Gender is more worthy than the Feminine, and the Feminine more worthy than the Neuter”. By 1770, literary critic Robert Baker went one step further, proclaiming that he should be the “first” gender-neutral pronoun in English, channeling religious language deriving from the Christian mythology of Eve being created from the rib of Adam (and thus proclaiming that she had been derived from he).

However, Baker ignored the fact that singular, gender-neutral they had existed in English since the 14th century. It would not be until the mid-18th century that the term would be criticized formally by grammarians. Creation of neologistic pronouns (or neopronouns) which were gender-neutral followed this wave of criticism, with hundreds of suggestions being formulated in English before singular they began to be more widely accepted once more (45).

This debate occurred somewhat simultaneously in LGBTQIA+ communities and in feminist circles in the United States, Canada, and the United Kingdom. It also enveloped the tying of terms of address for women to marital status (Ms., Mrs., etc.), leading to the creation of the gender-neutral Mx.


In parallel, LGBTQIA+ communities in various parts of the world began to develop secret languages, known in linguistics as gay argots, to communicate with each other while avoiding detection in societies which oftentimes had strict laws outlawing such behaviors (sodomy and buggery laws, for instance). Many gay argots started as evolutions of slang terminology in the sex trade, combined with concepts and words derived from other minoritized communities, including racial and ethnic minorities. One such argot, Polari, led to many ‘mainstream’ LGBTQIA+ terms widely utilized in the community today, such as acdc, butch, camp, cottaging, and trade (46).

As vice committees were formed in major cities such as New York, such “secret” languages became more widely published so that the “scourge” of homosexuality (and LGBTQIA+ communities generally) could be rooted out. Documents such as the 1964 legislative report Homosexuality and Citizenship in Florida contained an extensive glossary of LGBTQIA+ slang, contributing to rapid shifts in LGBTQIA+ slang to avoid “detection” during the Lavender Scare.

Wide publication of LGBTQIA+ slang which was considered nonderogatory in its original form often led to terms being co-opted by majority communities. These terms were then redressed as LGBTQIA+ slurs (terms like fag and tranny, for instance). This forces minoritized communities to create newer terms to fill those spaces, to eliminate the term without direct replacement, or to attempt to reclaim the co-opted term or another, similar older term.

Transgender studies scholar Julia Serano calls this phenomenon the “activist language merry-go-round” (47). She goes on to label the first two aforementioned strategies “word-elimination” and “word-sabotage”. She states, “While such strategies [of word-elimination and word-sabotage] are often embarked upon with the best of intentions, they can have unforeseen negative consequences for the minority/marginalized group”.

One such consequence exists in the realm of LGBTQIA+-affirming health care. As terms expand and change rapidly, with newer terms introduced more and more quickly, it can be difficult for medical professionals to understand their patients’ needs and concerns. In HIV/AIDS- and transgender-related health care specifically, such misunderstandings can and have led to numerous patient deaths and other medical issues.

In addition, discriminatory ideologies surrounding LGBTQIA+ people often utilize slurs and specific language usages, and a lack of education about LGBTQIA+-specific health care exacerbates these issues. In 2011, nearly forty years after homosexuality had been depathologized by the DSM, an attending physician in New York called homosexuality a “primary illness” and said “[t]here is something wrong with those [gay] people” (48). Discrimination against LGBTQIA+ persons in health fields continues to lead to negative patient outcomes.

In 2007, Janice Langbehn and her children were not allowed to see their mother, Lisa Pond, for eight hours as she lay dying from an aneurysm. Medical providers would not allow the visitation because Langbehn and Pond were lesbians. A social worker told her: “I need you to know this is an anti-gay city and a [sic] anti-gay state, and you are not going to get to see her or know her condition” (49).

In 2019, a 32-year-old trans man came to a hospital reporting severe abdominal pains; medical staff considered it a non-emergency, relating the pains to his obesity and the fact that he had stopped taking his blood pressure medication. The pains were actually due to pregnancy, and the misdiagnosis ended with the man’s child dying before it could be delivered. Dr. Stroumsa later reflected on the case, writing that: “The point is not what’s happened to this particular individual but this is an example of what happens to transgender people interacting with the health care system” (50).

Another case, in 2008, involved another trans man, Jay Kallio. A transphobic surgeon hid biopsy results because they were uncomfortable with Kallio’s gender presentation and pronouns. The results were only shared with him by accident by a lab technician four years later. The first medical oncologist Kallio saw refused treatment, leading to a delay past the therapeutic window for chemotherapy. Kallio died after the cancer metastasized in 2016 (51). Before his death, he noted: “We [transgender people] are so vulnerable when we are sick… I was at the point where I was going to forgo treatment. I had greater trust in the natural course of my cancer than with my providers. No one should be treated like that when they face a potentially terminal diagnosis” (52).

Considerations of LGBTQIA+ language and education of medical providers must account for unforeseen circumstances related to that knowledge. An ontological system such as the GSSO must then leverage aspects of discrimination and ethical usage notes in its construction.

Chapter 6: Sex and Gender in Medicine

It is likely that gender and sex were among the first medically discussed paradigms, with information involving the delivery process recorded in Neolithic settlements as early as 7500 BCE (53) and archaeological evidence pointing to culturally-specific gender roles as early as 2700 BCE (54).

Evidence of gender- and sexually-diverse persons stretches back almost as far. In 2011, Czech archaeologists noted a find during a press conference wherein an apparently male skeleton was buried in a traditionally female position. Lead archaeologist Kamila Remisova Vesinova noted that they “found one very specific grave of a man lying in the position of a woman, without gender specific grave goods, neither jewelry or weapons” and archaeologist Kateřina Semrádová clarified that they believed “this is one of the earliest cases of what could be described as a ‘transsexual’ or ‘third gender grave’ in the Czech Republic”. The skeleton was dated between 2900 BCE and 2500 BCE (55). In the Americas, archaeologist Sandra E. Hollimon described a potential “two-spirit”3 skeleton from Santa Cruz Island dating between 3500 BCE and 1200 BCE (56).

These roles and categorizations, as well as the social attitudes related to them, sometimes lead to drastic differences in health outcomes. Despite worldwide life expectancy at birth for males being several years shorter than females (68 years and 4 months versus 72 years and 8 months) (57), cultural norms in many countries shorten or reverse this trend (58).

For example, gender inequality is significantly associated with gender disparities in depressive disorders (59).

3 Two-Spirit is a contemporary umbrella term for Native American and First Nations gender and sexual roles which do not fit into Eurocentric conceptualizations of ‘male’ and ‘female’. The term was coined in 1990. Historically, these persons were referred to by colonialists and anthropologists as berdache, which is now considered offensive. “Two-Spirit” is not a term accepted by all tribes, however, and has been criticized as an oversimplification of the hundreds of gender and sexual expressions and roles which exist in Native American and First Nations communities.

Knowing that gender inequality affects health outcomes, the question becomes how to accurately reflect that information in a health document. Gender- and sex-related information content entities are recorded by many institutions, including on driver’s licenses, passports, Social Security cards, school report cards, college enrollment forms, health insurance documentation, vehicular insurance documentation, birth certificates, death certificates, birth registries, death registries, etc. For the majority of people, these values are all the same: an ‘F’ or an ‘M’.

The first issue arises when considering whether an ‘F’ or ‘M’ has different disparities based on cultural context. Mortality and morbidity trends are connected to the societal norms attributed to a person who is an ‘F’ or an ‘M’. An ‘F’ in Argentina, Mexico, or Japan cannot be said to be equivalent to an ‘F’ in Indonesia, Nigeria, or Portugal, despite having similar risks based on assumed phenotypes (an ‘F’ might have a higher risk of breast cancer, regardless of the jurisdiction, whereas an ‘M’ might have a higher risk of testicular cancer).

One of the central questions in construction of the GSSO involved these data types and their classification. We had to consider whether it would be possible to model personal gender and sex attributes using a single field (such as gender identity or assigned sex at birth).

Studies of health disparities in transgender, intersex, and other gender-diverse populations indicate that a single field does not accurately capture patient data. These persons, too, have different disparities based on social factors like race, caste, or socioeconomic status. These data types were added to the GSSO for this reason, to increase applicability in multiple scenarios.


Historically, sex- and gender-diverse statuses were categorized via a pathologization model. As far back as in the records of Herodotus and Hippocrates, there are stories of the Enaree, shamans belonging to the nomadic Scythians, who lived in various areas of the Pontic steppe from the 7th century BCE until the 3rd century BCE. Herodotus mentions that the Enaree who were described as “effeminate” or “androgynous” were afflicted with a “female” sickness, following their participation in pillaging the temple of Aphrodite at Ascelon. Hippocrates takes a less divine route in explaining their “effeminacy”, hypothesizing that their continuous horseback riding (as a result of being part of a nomadic people) caused impotency, leading to adoption of feminine gender roles. The “female” disease hypothesis was rediscovered and reapplied during the European Renaissance to persons who we may consider feminine gay men or transgender women today.

In Johannes Baptista Friedrich’s Versuch Einer Literargeschichte der Pathologie und Therapie der psychischen Krankheiten (1830), he noted that “[t]he fixed delusion of being a woman is not an uncommon psychical disease, and is observed everywhere”, citing dozens of cases stretching back as far as 1790. Friedrich’s work would then be cited by Esquirol (1838, 1845) and Westphal (1870), solidifying the condition’s existence in the realm of psychiatry, rather than in the realm of sin (as suggested by many clerics at the time).

The conflation of Friedrich’s “disease” with sexual inversion4 strengthened the idea that these conditions were congenital, but could be rectified using eugenic methodologies, so that such persons could not reproduce and pass on their “perversion”. This was the reality for persons such as Ralph Werther5 (1874-1921?), who wrote two autobiographical sketches of his life as a call for sympathy:

    I beg all adults, particularly school officials, to be extraordinarily charitable and sympathetic with girl-boys and others sexually abnormal by birth who may seem to have lost their senses. Guard against doing anything that would lead the disgraced to commit suicide, which event is fairly common among these ‘stepchildren of nature’.

4 “Sexual inversion” was a psychiatric/psychological term used to describe lesbian, gay, bisexual, and transgender persons in the late 19th and early 20th centuries.

However, there simultaneously was a push from a segment of the “fairy” community which asked for voluntary surgical (instead of involuntary physical or chemical sterilization) and hormonal treatment to change their primary and secondary sexual characteristics. One of the earliest of these cases involved Karl M. Baer, who underwent gender affirming surgery in December 1906. Following this, Alan L. Hart underwent a voluntary hysterectomy in 1918. In the period from 1922 to 1931, Dora Richter underwent orchiectomy, penectomy, and vaginoplasty (in the form of grafting an artificial vagina). By the time Lili Elbe, hailed as “the first sex change” by Niels Hoyer, underwent vaginoplasty in 1931, physicians had about 30 years of experience in the realm of gender affirming surgery6.

Beginning with the very first medical vocabulary systems, LGBTQIA+ persons were classified as being diseased, following from the common discourses in the fields of psychology (particularly psychoanalysis), psychiatry, neurology, criminology, anthropology, and sociology. These discussions themselves derived from thousands of years of oppression. As early as 486 BCE, King Darius I enacted the first state sanctioned death penalty for homosexuality (60).

5 Also known as Jennie June or Earl Lind. He/him pronouns used as these were the pronouns he used for himself.

6 Many forms of gender affirming surgery have existed since Antiquity, such as the ceremonial nirvaan practiced in some hijra communities in India, Pakistan, Nepal, and Bangladesh.

By the late 19th century, Eurocentric conceptualizations of homosexuality and transgender behavior as immoral shifted to ideas that they were either acquired from other such persons or a congenital condition, being forms of ‘psychical hermaphroditism’ or sexual inversion. The idea that homosexuality was a contagion continued to be published in mainstream academic literature until the 1990s (over 20 years since homosexuality was removed from the DSM), while narratives about being transgender being a social contagion continue to be published today (61).

This history contributes to LGBTQIA+ persons avoiding medical institutions and providers today. Misgendering (using the wrong gendered references in reference to a transgender person) and deadnaming (using a transgender person’s deadname, typically their birth name) have been tied directly to depressive symptoms, suicidal ideation, and suicidal behavior (62). Both misgendering and deadnaming continue to be common in health care settings (63), alongside other types of discrimination (64,65). These experiences are not exclusive to the transgender community, but are felt throughout the LGBTQIA+ community, despite depathologization movements (66,67).

Ontologies occasionally codify these cisnormative and heteronormative assumptions (including those of pathologization) into care and research models. For instance, the NCIT defines sexual intercourse as “[t]he act of sexual procreation between a man and a woman; the man's penis is inserted into the woman's vagina and excited until orgasm and ejaculation occur” and SNOMED includes terms outdated for half a century such as “Hermaphrodite”, “Latent homosexual state”, “Sodomy”, “Surgically transgendered transsexual”, and “Tomboyishness”. Both phenomena can lead to discrimination, inaccurate literature review and research outcomes, and negative patient outcomes if certain terms are misapplied.

Under the NCIT definition of “sexual intercourse”, it is possible that a gay man with HIV could be asked if he has had sexual intercourse and he could technically respond “no”. A physician may then not order an HIV test in time; the patient could die or be irreparably hurt, and he may even continue to infect others with HIV unknowingly. Under this definition, even penile-vaginal intercourse in which orgasm does not occur, or in which one or both individuals are using some form of contraception, does not count as sexual intercourse.

Clinical and research definitions should leverage a diverse set of direct and indirect patient outcomes in their construction, rather than restrictive terminology, which is not only discriminatory, but actively hurtful to patients. An ontology must leverage past and present conditions within the medical field and outside of it to adequately model the reality LGBTQIA+ patients face.

PART II: STUDY DESIGN

Chapter 7: Building Ontologies

We followed an iterative approach to ontology construction, using a four-step process laid out by Corcho et al.: (1) marking, (2) exploring, (3) mapping, and (4) abstracting. The ontology was built manually in an RDF/XML format using the Protégé ontology designer platform (34). The iterative approach meant first identifying a set of “seed” terms with which to search for relevant literature.

We identified and utilized the 32 terms from the most recent Human Rights Campaign (HRC) glossary, as HRC is the largest LGBTQIA+ advocacy group in the United States, claiming more than 3 million members (about one-third of the estimated American LGBTQIA+ population). The results of searching with these seed terms were then used to identify common 1-, 2-, 3-, 4-, and 5-grams; the top 200 terms were considered for addition, and then used to search again within various databases. Once saturation of the 200 most common terms was met, we considered the marking and exploring parts of the Corcho et al. process complete.
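The n-gram step can be sketched with the standard library. This is an illustrative outline only: lowercasing, whitespace tokenization, and the function names are assumptions, and the text does not state which tooling or preprocessing was actually used.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token phrases from a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_terms(texts, max_n=5, k=200):
    """Count 1- to max_n-grams across texts and return the k most common."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()  # assumed tokenization: whitespace only
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return counts.most_common(k)


sample = ["gender identity and gender expression", "gender identity"]
for term, count in top_terms(sample, k=3):
    print(term, count)
```

In an iterative search, the returned terms would seed the next round of queries, and the loop would stop once the top-`k` list no longer changes between rounds.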

Parallel to this was a secondary search of tertiary literature (glossaries, dictionaries, encyclopedias, etc.) including search of phrases like “LGBTQ terminology” and “LGBTQ slang” on Google in Autumn 2019. For this, we took the first 2 pages of relevant results and added all terms; for the print tertiary literature, all entries were considered for addition.

We created full cross-reference mappings to existing LGBTQIA+-specific vocabularies, with particular focus given to the second version of the Homosaurus vocabulary originally created by IHLIA7 LGBT Heritage, and the Library of Congress’ LGBTQIA+ subject headings.

The tertiary literature search and vocabulary search also served as a literature review of LGBTQIA+ terminology in those spaces.

Next, we began the process of mapping into the main hierarchy of the ontology (“mapping up”), using the process described fully in Chapter 11. The final step, abstraction, involved a combination of relations, usage notes, and definitions being created to allow for complex data concerning entries to be elucidated in a computationally efficient manner. This began with the “mapping up” process and continued with adding specific references in literature and contextual notes.

7 No longer used as an acronym; formerly stood for the International Homo/Lesbian Information Center and Archive.

Once the ontology was built, we aimed to make it available in multiple platforms and to extend the usability of entry URIs by making them permalinks via the GSSO’s addition to the OBO Foundry. Making the system available on NCBO BioPortal and GitHub, as well as its own dedicated website, followed this same line of reasoning. PURLs of the GSSO thus followed the OBO Foundry pattern, being accessible through URIs like “http://purl.obolibrary.org/obo/GSSO_000096”.
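The PURL pattern quoted above can be built and parsed with a few lines of Python. The six-digit zero padding is inferred from the single example URI in the text and is an assumption, as are the helper names `make_purl` and `parse_purl`.

```python
import re

OBO_BASE = "http://purl.obolibrary.org/obo/"
# Prefix, underscore, numeric local identifier (pattern inferred from the example URI).
PURL_RE = re.compile(r"^http://purl\.obolibrary\.org/obo/([A-Za-z]+)_(\d+)$")


def make_purl(prefix, local_id, width=6):
    """Build a PURL from an ontology prefix and a numeric identifier."""
    return f"{OBO_BASE}{prefix}_{local_id:0{width}d}"


def parse_purl(purl):
    """Split a PURL into (prefix, zero-padded local identifier)."""
    match = PURL_RE.match(purl)
    if match is None:
        raise ValueError(f"not an OBO-style PURL: {purl!r}")
    return match.group(1), match.group(2)


print(make_purl("GSSO", 96))
print(parse_purl("http://purl.obolibrary.org/obo/GSSO_000096"))
```

Numeric identifiers of this kind stay stable when a label changes, which is one reason they are preferred over name-based URIs.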

Chapter 8: Data Sources and Dataset Construction

Multiple data sources were consulted throughout the project, including those listed in Appendix B. We also considered controlled vocabularies, thesauri, ontologies, encyclopedias, dictionaries, subject headings, and classification systems which included content related to gender, sex, and sexual orientation as part of the process of marking the subject area of the GSSO. Most of these resources were ascertained during literature review, while others were found indirectly via specific database searches between Autumn 2019 and Spring 2020.

These approaches were iterative, with bibliographies constructed from each of these resources used to find further resources and new seed terms being used to identify additional references or terms. Terms from version 1.0 of the GSSO were included in version 2.0 of the GSSO.

Sources were linked to individual entries where available to provide context, with PURLs like DOIs made available so the sources could be consistently referenced. Internet resources without PURLs were archived via the Internet Archive’s Wayback Machine (https://archive.org/web/) to ensure that they would be available moving forward.

Chapter 9: Criteria for Ontology Completion

Raad and Cruz (68) derived the following evaluation criteria for ontology completeness: (1) accuracy, (2) completeness, (3) conciseness, (4) adaptability, (5) clarity, (6) computational efficiency, and (7) consistency. These criteria are laid out with example measures in Table 2.

Table 2. Evaluation criteria derived from Raad and Cruz (68).

| Criteria | Central Question | Measure(s) |
|---|---|---|
| Accuracy | Are definitions and descriptions considered correct by various kinds of audiences? | Percentage of correct predictions; user feedback on correctness |
| Completeness | Is the domain appropriately covered? | Percentage of covered literature based on other ontology coverage; user feedback on completeness |
| Conciseness | Are there irrelevant elements? | Percentage of elements used in accuracy and completeness; feedback on conciseness |
| Adaptability | Can the ontology be used for a wide variety of purposes? | Usage in research, archival, and medical settings; usage by independent groups and projects across multiple disciplines |
| Clarity | Does the ontology effectively communicate its intentions? | User feedback on clarity |
| Computational efficiency | Can tools utilize the ontology quickly in an automated or semi-automated manner? | Benchmarked search times, load times, etc.; feedback from users on speed |
| Consistency | Does the ontology include few, if any, logical inconsistencies? | Logical tests; feedback from users on consistency |

Measures could be split into computational applicability (for usage in areas like natural language processing or text-based classification or tagging) and human-centered usability and design (for usage in educational contexts, knowledge discovery via search engines, etc.).

For human-centered measures, we constructed a three-part survey which covered demographic information related to survey users, the System Usability Scale (SUS) for analyzing the design of the website where the ontology itself was hosted, and the Ontology Usability Scale (OUS) for analyzing the content of the ontology more specifically (69,70). Both the SUS and the OUS use Likert (or Likert-like) scales for question formatting and have been independently validated for usage in their respective domains.

Specific questions regarding accuracy, completeness, conciseness, clarity, computational efficiency (load times on the website, for instance), and consistency were included in the SUS and OUS. Survey questions are displayed in Appendix C.
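For readers unfamiliar with SUS scoring, the conventional calculation can be sketched as below. This assumes the standard ten-item SUS with responses from 1 to 5 and alternating positively and negatively worded items; the text does not state how the survey responses were actually scored, and the OUS is not covered here.

```python
def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("the standard SUS has exactly ten items")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        if item_number % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


print(sus_score([3] * 10))                        # 50.0
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```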

The demographic portion of the survey was reviewed by two independent sociologists and included gender minority related questions derived from the Gender Identity in U.S. Surveillance (GenIUSS) Group recommendations and the most recent Gender Census (2019), which included responses from over 11,000 persons identifying outside of the gender binary. The survey was run from August to December 2020.

PART III: ONTOLOGY CONSTRUCTION

Version 1 of the ontology was officially published on 24 June 2019. This version contained numerous flaws, including a lack of definitions for materials and usage of English-name-based PURLs instead of more computationally efficient numerical UIDs. This led to several incremental changes until 13 September 2019, with a soft release of version 1.0.1e uploaded for compatibility to the NCBO BioPortal. However, major overhauls occurred shortly thereafter, originally becoming Version 1.1.0, but later being renamed Version 2.0.0 in accordance with a complete shift in the ontology’s structure and content.

Version 2 was officially released on 18 June 2020, with incremental updates made to align with OBO Foundry standards. Such standards include openness (the GSSO is available under the Apache 2.0 license), a common format (the GSSO is available in RDF/XML, OBO, and JSON-LD formats, among others), a URI/identifier space (available at Ontobee), a versioning platform (via GitHub), scope (identified via the construction process), textual definitions (all classes contain human-readable definitions), relations (defined via hundreds of attributes), documentation (available on the GSSO website, on the GSSO GitHub, and within the GSSO itself), documented plurality of users (via publications citing the GSSO and general interdisciplinary interest), commitment to collaboration (showcased via the case), locus of authority (via the primary author of the GSSO), naming conventions (specified via individual communities, number of citations, and recency of referencing), and maintenance (fundamentally displayed by the versioning over the past two years) (71).

These principles were defined utilized interpretations of keywords derived from RFC 2119:

Key words for use in RFCs to Indicate Requirement Levels.

Comparisons between the two major official releases (1.0.0 and 2.0.0) are shown in Table 3 below, although it is of note that version 2.0.5a is currently utilized on the NCBO BioPortal and within the GSSO GitHub.

Table 3. Numerical comparisons between 1.0.0 and 2.0.0.

Property | Version 1.0.0 | Version 2.0.0
# Entries | 6,250 | 10,060
# Classes | 6,250 | 7,121
# Instances | 0 | 2,939
# Definitions (for Classes) | 1,063 (17.0%) | 7,121 (100.0%)
# Database Cross-References | 1,416 | 14,193

In addition to those sources which were included as part of the broader review of ontology-like systems and catalogued resources in the literature review below, we also prioritized mappings from the Homosaurus, Wikipedia, and Wikidata to increase interoperability.

Chapter 10: Literature Review

Scoping literature via the covered subject areas of gender, sex, and sexual orientation was the first step in ontology construction.

LGBTQIA+-related terminology usage in a standardized format is not new, although incorporation in a formalized ontology is novel. Multiple LGBTQIA+ archives and institutions (as well as those in health and in archival work) incorporate such terminology.

Broadly speaking, these can be classed as ontologies and classification systems. While ontologies can be utilized as classification systems, classification systems are not necessarily ontologies and have a clear purpose in indexing content (in this context, these include subject headings and thesauri). A summary of these systems and their various classifications can be found in Table 4.


Table 4. Classification systems, subject headings, and thesauri which cover LGBTQIA+ topics specifically.

Name | Edition | Type | Date Created
Homosexual Subject Heading Schemes |  | Subject Headings | 1974
Homosexuality and Gay Liberation: An Expansion of the Library of Congress Classification Schedule |  | Subject Headings | 1977
Philadelphia Gay Library Classification Scheme |  | Classification System | 1979
National Gay Archives Library Classification System |  | Classification System | 1984
GDC: Gay Decimal Classification |  | Classification System | 1984
Thesaurus of Subject Headings |  | Thesaurus | 1984
Gay Studies Thesaurus |  | Thesaurus | 1985
International Gay & Lesbian Archives Classification System | Revised | Classification System | 1985
International Thesaurus of Gay and Lesbian Index Terms |  | Thesaurus | 1988
Michel/Moore Classification Scheme for Books in Lesbian/Gay Collections | Revised | Classification System | 1990
A Queer Thesaurus: An International Thesaurus of Gay and Lesbian Index Terms |  | Thesaurus | 1997
Homosaurus | Version 0 | Linked Data Vocabulary | 1997
Homosaurus | Version 1 | Linked Data Vocabulary | 2013
Homosaurus | Version 2 | Linked Data Vocabulary | 2019
Homosaurus | Version 2.1 | Linked Data Vocabulary | 2020
Lesbian Herstory Archives Subject Files |  | Subject Headings | 1997
Library of Congress Queer Subject Headings |  | Subject Headings | 2000
LGBT Life Thesaurus |  | Thesaurus | 2011
QueerLCSH |  | Subject Headings | 2020
LLACE Classification Scheme |  | Classification System | 2001
GLSO Taxonomy |  | Classification System | 2013
Out on the Shelves Classification System | Old | Classification System | 2018
Out on the Shelves Classification System | New | Classification System | 2018

In addition to these sources explicitly created for usage in controlled vocabulary systems, we also investigated all available sources which implicitly created controlled vocabularies, those being encyclopedias related to sex, gender, and sexual orientation. These are included in Table 5. There were also a series of semi-encyclopedic works consulted, such as the 1995 new expanded edition of The Complete Dictionary of Sexology⁸ and The A–Z of Gender and Sexuality: From Ace to Ze (2019).

⁸ Includes over 6,000 entries.

Table 5. Encyclopedias which cover LGBTQIA+ topics explicitly or implicitly. Number of entries are listed only if applicable to the encyclopedia’s structure and if the encyclopedia itself was available for analysis.

Title | Edition | Date Published | Number of Entries
Encyclopedia of Homosexuality |  | 1990 | 784
Cassell’s Encyclopedia of Queer Myth, Symbol, and Spirit: Gay, Lesbian, Bisexual, and Transgender Lore |  | 1997 | 1,480
Encyclopedia of Lesbian, Gay, Bisexual and Transgender History in America |  | 2004 | 544
Routledge International Encyclopedia of Queer Culture |  | 2006 | 1,131
LGBTQ America Today: An Encyclopedia |  | 2008 | 686
Encyclopedia of Contemporary LGBTQ Literature of the United States |  | 2009 | N/A
The Greenwood Encyclopedia of LGBT Issues Worldwide |  | 2009 | N/A
The SAGE Encyclopedia of LGBTQ Studies |  | 2016 | 431
Global Encyclopedia of Lesbian, Gay, Bisexual, Transgender, and Queer (LGBTQ) History |  | 2019 | 445
Encyclopedia of Lesbian Histories and Cultures |  | 2015 | N/A
Encyclopedia of Sex and Gender |  | 2003 | N/A
Encyclopedia of Sex & Gender |  | 2007 | N/A
Encyclopedia of Gender and Society |  | 2008 | 466
The Wiley Blackwell Encyclopedia of Gender and Sexuality Studies |  | 2016 | 686
The Encyclopœdia of Sexual Knowledge |  | 1934? | N/A
American Encyclopedia of Sex |  | 1935 | N/A
The Encyclopaedia of Sexual Behaviour |  | 1961 | 112
Encyclopedia of Unusual Sex Practices |  | 1992 | 655
Encyclopedia of Sex | 1st | 1994 | N/A
Encyclopedia of Sex | 2nd | 2000 | N/A
The International Encyclopedia of Sexuality |  | 2001 | N/A
The Continuum Complete International Encyclopedia of Sexuality |  | 2004 | N/A
Encyclopedia of Prostitution and Sex Work |  | 2006 | N/A
The International Encyclopedia of Human Sexuality |  | 2015 | 547
The Wiley Blackwell Encyclopedia of Gender and Sexuality Studies |  | 2016 | 686

In consideration of LGBTQ terminologies, vocabularies, and bibliographies, we consulted multiple databases via manual search paradigms in early 2019. This helped us consider both more recent slang and broader non-medical language in current usage. Some of the sources found are listed in Table 6.


Table 6. Other resources covering LGBTQIA+ terminologies.

Title | Date Modified | Number of Entries
17 Lesbian Slang Terms Every Baby Gay Needs To Learn | 2018 | 17
20 Lesbian Slang Terms You’ve Never Heard Before | 2012 | 20
Comprehensive* List of LGBTQ+ Vocabulary Definitions | 2019 | 89
Definitions (Donna Lynn Matthews) | 1999 | 9
Glossary of Gender and Transgender Terms | 2010 | 72
Glossary of LGBT Terms for Health Care Teams | 2016 | 56
Glossary of LGBTQ Terms | Unknown | 147
Glossary of Terms (Human Rights Campaign) | Unknown | 32
GLSEN Concepts and Terms | 2014 | 36
LGBT Terminology (University of Southern California) | 2018 | 177
LGBTQ Slang Everyone Should Know | 2016 | 64
LGBTQI Terminology | 2004 | 104
LGBTTIQQ2SAA+ Definitions | Unknown | 52
Other Words for the Other-gendered | 1995 | 23
PFLAG National Glossary of Terms | Unknown | 49
Queer Dictionary | Unknown | 17
Studies of LGBTQ Language: A Partial Bibliography | 2006 | N/A
The Angel’s Dictionary | 1996 | 25
The Ultimate LGBT Glossary: all your questions answered | 2017 | 181
The Words that Failed: A chronology of early nonbinary pronouns | Unknown | N/A
Transgender glossary (RationalWiki) | 2018 | 73

All source entries were cross-tabulated and added in priority order, with priority determined by the number of references to an entry across all consulted resources. For example, if “transgender” had 36 references and “bisexual” had 18, “transgender” was added first.
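The cross-tabulation described above can be sketched roughly as follows. This is a minimal illustration, assuming each consulted resource is available as a list of terms; the resource names and contents below are hypothetical placeholders rather than the actual source data, and the normalization (lowercasing, one count per resource) is our assumption.

```python
from collections import Counter


def prioritize(resources):
    """Return (term, reference_count) pairs, most-referenced terms first."""
    counts = Counter()
    for terms in resources.values():
        # Assumption: a term counts at most once per resource, case-insensitively.
        counts.update({t.strip().lower() for t in terms})
    # Descending count, then alphabetical so ties are ordered deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy resources standing in for the sources in Tables 4 to 6.
resources = {
    "glossary_a": ["Transgender", "bisexual", "queer"],
    "glossary_b": ["transgender", "Bisexual"],
    "thesaurus_c": ["transgender"],
}
ranked = prioritize(resources)
```

Here `ranked` lists “transgender” before “bisexual” because it is referenced by more of the toy resources, mirroring the ordering rule in the text.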

Review of Existing Ontologies

No current ontologies exist which specifically cover LGBTQIA+ topics. The Homosaurus comes close, being a linked data vocabulary, but it has no hierarchical structure and is decentralized and disconnected from other ontological systems.

Many ontologies attempt to cover some LGBTQIA+ spaces, but many fall significantly short or are vastly out of date, such as the Medical Subject Headings (which did not add a term for ‘intersex persons’ until 2020 or ‘transgender persons’ until 2016, before which time it was ‘transgendered persons’ from 2013 until 2015). Popular medical ontologies such as SNOMED-CT, ICD-10-CM, and the NCI Thesaurus are not much better, including aforementioned examples of out-of-date terminology like “sodomy” and “transvestism”.

Major ontologies that were mapped to included ATC, BFO, ChEBI, DO, DSM, EFO, FMA, GO, GOLD, HPO, ICD-9-CM, ICD-10-CM, MedDRA, MeSH, NCBI Taxon, NCIT, SCTID, SIO, STY, TA, TE, and Uberon. These were chosen based on their accessibility and usage within biomedical spaces. The differences in database mappings between version 1.0 and version 2.0 are shown in Table 7.

Table 7. External database mappings from version 1 to version 2. Only databases with more than ten mappings are shown.

Source | # Mappings in Version 1 (n = 1,416) | # Mappings in Version 2 (n = 14,193)
ATC | NM | 20
BFO | 19 | 0
ChEBI | 4 | 213
DO | 62 | 193
DSM | NM | 21
EFO | 35 | 0
FMA | 43 | 327
GO | 32 | 152
GOLD | 13 | 0
Homosaurus | NM | 430
HPO | 30 | 137
ICD-9-CM | 30 | 101
ICD-10-CM | 28 | 261
LCC | NM | 527
LCSH | NM | 749
MedDRA | 129 | 595
MeSH | 261 | 904
NCBI Taxon | 11 | 87
NCIT | 261 | 1,034
SCTID | 241 | 1,084
SIO | 116 | 3
STY | 16 | 1
TA | 3 | 91
TE | 30 | 38
UBERON | 22 | 169
Wikidata | NM | 2,270
Wikipedia | NM | 4,755

Review of Classification Systems

The Rainbow Round Table (RRT), part of the American Library Association (ALA), compiled a document of GLBT Controlled Vocabularies and Classification Schemes in 2007, which included controlled vocabularies, classification schemes, periodical indexes, encyclopedias, and secondary literature. However, these terms were used more in line with their definitions in the library sciences, rather than in the information sciences (discussed in Chapter 1). This bibliography was built based on work done for the round table by Dee Michel in 1990 (at that point the RRT was known as the Gay and Lesbian Task Force [GLTF]⁹). Matt Johnson was the primary author of the 2007 report, based on his earlier work Gay, Lesbian, Bisexual, and Transgender Subject Access: History and Current Practice (also published in 2007).

These documents are notable, as the RRT was the first official LGBTQIA+ professional organization in the United States, formed by members of those communities, rather than psychiatrists, psychologists, sociologists, sexologists, and/or anthropologists. This fundamentally shifted the lens away from LGBTQIA+ pathologization and medicalization in the professional sphere, and the impact this had on conceptual vocabulary was immediately obvious. The Task Force established the Stonewall Book Award in 1971, a full three years before homosexuality was removed from the sixth printing of the DSM-II in 1974.

Fortunately, Johnson made clear the availability of several of the associated vocabularies, as well as potential interest in their expanded usages by institutions like libraries and archives. In many cases, the vocabularies were developed for the explicit purpose of usage in LGBTQIA+ libraries and archives, rather than subsets of knowledge within larger libraries and archives. Many vocabularies overlapped somewhat with LGBTQIA+-related systems, including sexuality-related vocabularies and gender-related vocabularies (including feminist-related vocabularies). These are of note as well but tend to be excluded in some analyses as they are not LGBTQIA+ specific. Because our ontology aimed to be comprehensive of LGBTQIA+ topics, but explicitly included gender, sex, and sexual orientation (GSSO) data in its compilation, many of these vocabularies were consulted where possible.

⁹ The organization was originally founded as the Task Force of Gay Liberation, itself part of the ALA’s Social Responsibilities Round Table (SRRT). It has also been known as the Gay, Lesbian, and Bisexual Task Force (GLBTF, 1995) and the Gay, Lesbian, Bisexual, and Transgender Round Table (GLBTRT, 1999). It became known as the RRT in 2019.

Other classification systems which were utilized in addition to those mentioned previously included the DDC, LCC, and LCSH.

Chapter 11: Entity and Attribute Construction

Entities and individuals were primarily derived from the aforementioned ontologies, controlled vocabularies, thesauri, classification systems, encyclopedias, and subject headings, both specifically in the LGBTQIA+, sex, gender, and sexual orientation spaces, as well as those in the broader space of biomedical ontologies. They were also derived from current lists of terminology and vocabulary found through multiple databases and search engines, using seed terms drawn from a short list (namely the HRC glossary terms) as well as broader searches like “LGBTQ terminology” and “LGBTQ slang”¹⁰.

Once a potential entry for the GSSO was identified, it was classified as an entity (class) or as an individual, based on whether it represented a set of possible objects or an instance of an object. For instance, the Stonewall Riots were considered an instance of an LGBTQ-related riot because they fundamentally represent an individual riot which could not be split into smaller fundamental parts. Conversely, “LGBTQ-related riot” was considered a class because there could be multiple instances (or individuals) of riots.

¹⁰ “LGBTQ” was utilized because it is the preferred initialism in current style guides, and because most search engines map it to other abbreviations (such as “LGBT” or “LGBTQIA+”).

After an entry was classified as an individual or class, it was further classified into another class until such a superclass aligned with both owl:Thing and BFO:0000001 (entity). This classification was done with priority given to OBO ontologies, then other ontologies available via the NCBO BioPortal and the EMBL-EBI OLS, and finally Wikipedia category systems or Wikidata class relationships. If none of these could be identified clearly, a proxy was used based on source definitions (i.e., depending on where the entry originated). For instance, if a definition or statement regarding the entry fell into the form “X is a Y”, then entry X was considered to be a subclass of Y, and Y would then follow the same order of consideration as X when attempting to connect back to BFO:0000001.
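The climb toward BFO:0000001 can be illustrated with a short sketch. It is not the procedure actually used (which involved manual review); the parent maps, the toy hierarchy, and the regular expression for “X is a Y” definitions are all hypothetical assumptions introduced only to show the order of consideration.

```python
import re

ROOT = "BFO:0000001"  # entity


def find_parent(term, sources, definitions):
    """Return (parent, source_name) from the first source that knows the term."""
    for name, parent_map in sources:  # sources are listed in priority order
        if term in parent_map:
            return parent_map[term], name
    # Fallback proxy: parse a source definition of the form "X is a Y".
    match = re.match(r"^.+? is an? (.+?)\.?$", definitions.get(term, ""))
    if match:
        return match.group(1), "definition"
    return None, None


def path_to_root(term, sources, definitions, max_steps=50):
    """Climb from a term to ROOT, recording which source supplied each step."""
    path = [(term, None)]
    while term != ROOT and len(path) <= max_steps:
        parent, source = find_parent(term, sources, definitions)
        if parent is None:
            break  # unresolved: would need manual review
        path.append((parent, source))
        term = parent
    return path


# Hypothetical, heavily flattened hierarchy for illustration only.
sources = [
    ("OBO", {"social event": "process", "process": ROOT}),
    ("BioPortal/OLS", {}),
    ("Wikipedia/Wikidata", {}),
]
definitions = {"drag ball": "A drag ball is a social event."}
path = path_to_root("drag ball", sources, definitions)
```

In this toy case the first step comes from the definition proxy, and the remaining steps come from the highest-priority source that knows each parent.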

Attributes were constructed as annotation properties in parallel with priority being given to pre-existing attributes which had been built in other ontologies. Priority was given to OBO-related ontology attributes first, followed by more general OWL and DC, and then by Wikidata and Schema.org. These were primarily utilized to increase interoperability between systems.

PART IV: ONTOLOGY EVALUATION

Version 2.0 of the ontology had 7,121 classes, 2,939 individuals, and over 300 attributes (including 289 annotation properties) covering a diverse series of topics from abstinence to zygosity along with thousands of synonymous terms and mappings to other ontologies and databases. A total of 76,492 annotations were included, forming 103,979 axioms (of which 14,351 were logical axioms and 13,037 declaration axioms). The GSSO was added to the NCBO BioPortal, the central registry of Identifiers.org, to Wikidata, to Ontobee, to the EMBL-EBI OLS, and to the OBO Foundry to increase interoperability. Additionally, suggestions for improvement were obtained from multiple subject-specific libraries, archives, and vocabularies, such as from the Homosaurus and the GLBT Museum and Archives. Formal feedback was ascertained via a multi-arm survey located on the GSSO website hosted by CCHMC (http://gsso.research.cchmc.org/).

Each class in version 2.0 included a human-readable definition and was placed in a computer-readable hierarchy congruent with existing high-use biomedical ontologies (like MeSH and SNOMED-CT). Over 5,000 classes (74.0%) had no mappings to any of the over 800 ontologies represented at the NCBO BioPortal. From its inception, the GSSO filled a significant gap in ontology coverage, ranking in the top 5% of all ontologies visited on the NCBO BioPortal website as of March 2020, almost one year after its initial publication.

Figure 1. Sample entries from version 2 (left) and version 1 (right) shown in Protégé.

The average number of annotations per class increased from 2.6 to 7.4, showcasing a massive increase in the depth of data represented in addition to breadth. An example of this difference is showcased in Figure 1. In particular, attributes were given more effective labeling, sources were included in a more consistent manner, and sub-annotations were included to “clean up” entries with unclear annotations by making it clearer which annotations related to one another. An example visualization of the connectivity displayed in version 1 of the GSSO is provided in Figure 2.

Figure 2. Example connectivity of the GSSO in version 1, showcasing the high connectedness in the GSSO graph and sub-graphs.

Because ontologies can be said to act as graphs, we can analyze the GSSO in terms of depth, breadth, and tangledness, among other things (Table 8; “N/A” indicates that the ontology was too large to be analyzed by OntoMetrics). The graphical nature of the ontology is closely related to computational integrity and efficiency, as well as cognitive ergonomics (72).

Table 8. Graph metrics for classes (with individuals removed) calculated for the GSSO (Gender, Sex, and Sexual Orientation Ontology), MeSH (Medical Subject Headings), SNOMED-CT (Systematized Nomenclature of Medicine, Clinical Terms), and ICD-10-CM (International Classification of Diseases, Tenth Revision) using the Ontology Metrics (OntoMetrics) application. N/A = the ontology lacked the conciseness necessary for comprehensive analysis.

Metric | GSSO | ICD-10-CM | MeSH | SNOMED-CT
Absolute root cardinality | 1 | 4 | N/A | N/A
Absolute leaf cardinality | 5,386 | 72,279 | N/A | N/A
Absolute sibling cardinality | 9,965 | 95,210 | N/A | N/A
Absolute depth | 355,530 | 661,267 | N/A | N/A
Average depth | 11.541683 | 6.944988 | N/A | N/A
Maximal depth | 41 | 8 | N/A | N/A
Absolute breadth | 30,804 | 95,215 | N/A | N/A
Average breadth | 2.582495 | 4.152058 | N/A | N/A
Maximal breadth | 145 | 34 | N/A | N/A
Ratio of leaf fan-outness | 0.540492 | 0.759114 | N/A | N/A
Ratio of sibling fan-outness | 1.0 | 0.999947 | N/A | N/A
Tangledness | 0.136879 | 0.0 | N/A | N/A
Total number of paths | 30,804 | 95,215 | N/A | N/A
Average number of paths | 751.317073 | 11,901.875 | N/A | N/A

In terms of absolute breadth and depth, the GSSO is smaller than ICD-10-CM, leading to faster computation times when exploring the ontology graph. Tangledness is low, but unavoidable in terms of the interdisciplinary nature of the GSSO. For instance, “gender identity” exists as a subclass of “personal attribute” and “self-identity” with parallel hierarchies dependent on use case (such as a personal questionnaire or identities as handled in various subject headings). Additionally, leaf fan-outness serves as a proxy for the dispersion of the ontology graph, for which the GSSO has a slightly lower value than ICD-10-CM.
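To make the graph metrics concrete, the following sketch computes several of them on a toy class graph built from the “gender identity” example. The metric definitions used here are assumptions on our part (path length counted in classes, tangledness as the share of classes with more than one direct superclass, leaf fan-outness as leaves over classes); they appear consistent with the ratios reported in Table 8 but are not taken from the OntoMetrics source code. The graph is assumed to be acyclic.

```python
def root_to_leaf_paths(children, root):
    """Enumerate every root-to-leaf path in an acyclic class graph (iterative DFS)."""
    paths, stack = [], [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            paths.append(path)  # reached a leaf class
        for kid in kids:
            stack.append(path + [kid])
    return paths


def graph_metrics(children, root):
    paths = root_to_leaf_paths(children, root)
    nodes = {root} | set(children) | {k for kids in children.values() for k in kids}
    leaves = [n for n in nodes if not children.get(n)]
    parents = {}
    for kids in children.values():
        for kid in kids:
            parents[kid] = parents.get(kid, 0) + 1
    multi_parent = sum(1 for count in parents.values() if count > 1)
    # Assumption: the length of a path is the number of classes on it.
    absolute_depth = sum(len(p) for p in paths)
    return {
        "total_paths": len(paths),
        "absolute_depth": absolute_depth,
        "average_depth": absolute_depth / len(paths),
        "maximal_depth": max(len(p) for p in paths),
        "leaf_fan_outness": len(leaves) / len(nodes),
        "tangledness": multi_parent / len(nodes),
    }


# Toy graph: "gender identity" has two direct superclasses, as in the text.
toy = {
    "entity": ["personal attribute", "self-identity"],
    "personal attribute": ["gender identity"],
    "self-identity": ["gender identity"],
}
metrics = graph_metrics(toy, "entity")

# Consistency check against Table 8: average depth = absolute depth / total paths.
table_average_depth = 355_530 / 30_804
```

The single multiply-inherited class is what makes the toy graph “tangled”; the same division of absolute depth by total number of paths reproduces the GSSO average depth listed in Table 8.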

Table 9. Comparison of average values of other OBO Foundry ontologies to the GSSO.

Measure | Average Value in OBO (via Manouselis et al) [n = 75] | Value for GSSO
No. of Classes | 3,169.75 | 7,121.00
No. of Instances | 11,318.80 | 2,939.00
No. of Properties | 15.41 | 289.00
No. of Root Classes | 496.23 | 1.00
Number of Leaf Classes | 2,490.52 | 5,386.00
Average Population | 2.64 | 0.007727
Class Richness | 0.01 | 0.0
Inheritance Richness | 0.93 | 1.152032
Relationship Richness Metric | 0.44 | 0.0

When discussing the GSSO alongside other OBO ontologies, it has only slightly larger values in most categories of analysis, but has a much larger number of leaf classes, leading to decreased class richness and relationship richness relative to other OBO Foundry ontologies (it is above zero, but the value does not go lower than 0.01 in Ontology Metrics). The reasons for this are related to the lack of a solidified logical model within the GSSO, which is inherently connected to its interdisciplinary nature.

Despite this, the GSSO has a significantly higher inheritance richness than most OBO ontologies. This indicates that the ontology is “horizontal” rather than “vertical”, meaning that the ontology covers a wider domain than those of the theoretical average OBO Foundry ontology described by Manouselis et al (73). There is a lower level of detail in the ontology, but this value does not necessarily represent the attribute (annotation) richness, which is much higher in the GSSO (7.4) versus, for instance, ICD-10-CM (8.4 × 10⁻⁵).

In addition to OntoMetrics, we also ran the GSSO through the OOPS! (OntOlogy Pitfall Scanner!) RESTful web service (74). Only two minor “pitfalls” were detected out of a possible 41. These included missing disjointness and missing annotations. The missing disjointness is planned to be addressed in the future using a mixed-methods approach, but the issue mostly involves how instances can be assigned to classes which occasionally can make an ontology less human-readable (thus why it is unaddressed at this point). All missing annotations were addressed. For comparison, the DBpedia Ontology 3.8 had 12 pitfalls and the AKT Reference Ontology had 5 pitfalls detected.

Over 325 scholarly articles and 91 books were fully cited and sourced in the GSSO, covering a range of topics stretching over the late 19th, 20th, and early 21st centuries. References ranging from “The ‘Fairy’ and the Lady Lover” (1922) in which the slur “fag” entered mainstream literature to “Induced Lactation in a Transgender Woman” (2018) showcased the extreme differences in LGBTQIA+-related attitudes in medical care over the last hundred years.

At the time of writing, the GSSO is the only known ontology to include in-text citations and a bibliography which is part of the ontology itself. This allows for “drilling down” to observe conceptual shifts through time. Contextual search engines can prioritize entries with date related information and sources to provide better and more consistent results.

For instance, if one searches “transgender”, they could map backward to include entries with language like “transsexual” or “sexual inversion”. Likewise, they could map forward from “transsexual” to “transgender”. Both of these options are available on the GSSO website for the transgender bibliography search engine, discussed further in the MEDLINE section of Chapter 12.
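One way to picture forward and backward mapping is as a set of links from older terms to their current counterparts. The sketch below is purely illustrative: the link table and function names are hypothetical and do not describe how the GSSO website stores or resolves these relationships.

```python
# Hypothetical links from an older term to the term that superseded it.
NEWER_TERM = {
    "sexual inversion": "transgender",
    "transsexual": "transgender",
}


def map_forward(term, newer):
    """Follow links from an older term to the most current term."""
    seen = {term}
    while term in newer and newer[term] not in seen:
        term = newer[term]
        seen.add(term)
    return term


def map_backward(term, newer):
    """Collect every older term that leads, directly or indirectly, to `term`."""
    return sorted(t for t in newer if t != term and map_forward(t, newer) == term)


def expand_query(term, newer):
    """Return the current term followed by all of its historical variants."""
    current = map_forward(term, newer)
    return [current] + map_backward(current, newer)


expanded = expand_query("transsexual", NEWER_TERM)
```

A search for either a historical or a current term then expands to the same set, which is the behaviour the paragraph above describes.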

The first version of the GSSO website was created and published in June 2019. It was a flat website which ran entirely on the user side by processing the entries from a downloaded OWL flat file. It offered only a simple search engine and a single-page Angular application.

We sought informal preliminary feedback on the website from the University of Cincinnati Graduate Student Research Forum (GSRF), the American Medical Informatics Association (AMIA) 2018 Annual Symposium, and the GLBT Museum and Archives in San Francisco, California. Feedback was positive in all groups, but respondents suggested more content, more efficient back-end processing, and an API which would allow for more complicated querying and searching of the GSSO.

The second version of the GSSO website moved hosting from a personal webpage at the University of Cincinnati to a more robust server at Cincinnati Children’s Hospital Medical Center (CCHMC). The server allowed us greater ability to host larger datasets and provide multiple users efficient processing of GSSO data simultaneously.

This version included a more diverse search interface with hundreds of possible search combinations, faster searching and load times, more elaborate security protocols, an example tagging system (via the transgender bibliography search engine, created as part of work with MEDLINE discussed in Chapter 12), a plaintext tagging system, a simple API, a downloads tab, links to other accession points, access to sub-annotations and external links, and automatic database cross-referencing using both external links and variants of GSSO links from all versions of the GSSO.

The website was created using an Angular MVC (model-view-controller) interface which connected through an Apache middleware to a Flask server, communicating with various Python scripts which processed various aspects of GSSO flat files and auxiliary materials.

Figure 3. Screenshots from initial markups for the stand-alone GSSO website with version 1 (left) and version 2 (right).

From September to December 2020, we ran an ontology evaluation survey and a website usability survey (with attached demographic survey; all survey instruments are shown in Appendix C and were built using REDCap¹¹). This survey was present on the GSSO website, but we also targeted several user groups through electronic media (Table 10). Due in part to the COVID-19 epidemic and low response rate, we added additional populations in October 2020 (indicated by a “*”).

¹¹ The public survey arm is posted on the website (https://gsso.research.cchmc.org/#!/survey), with the REDCap link being available at: https://redcap.research.cchmc.org/surveys/?s=LNE7EN8T8D.

Table 10. Group membership and estimated response rates pre-survey distribution. Based on Sivo et al, we label a lower range of 60% of active membership and an upper range of 90% of active membership.

Group Name | Group Type | Primary Membership | Total Membership | Active Membership | Lower Range | Upper Range
GLBT Museum and Archives | Mailing List | LGBTQIA+ | ~200 | ~50 | 30 | 45
Homosaurus Mailing List | Mailing List | LGBTQIA+ | ~100 | ~50 | 30 | 45
American Medical Informatics Association Mental Health Working Group | Online group | Medical professionals | ~250 | ~75 | 45 | 68
/r/AskTransgender | Online group | LGBTQIA+ | ~128,000 | ~500 | 300 | 450
/r/AskLGBT | Online group | LGBTQIA+ | ~5,000 | ~75 | 45 | 68
SNOMED Mental and Behavioral Health Clinical Reference Group | Online group | Medical professionals | 36 | ~20 | 12 | 18
Trans PhD Network | Online group | LGBTQIA+ | 1,183 | ~400 | 240 | 360
2020 LD4 Conference on Linked Data in Libraries¹² | Conference workshop | Medical professionals | ~25 | ~25 | 15 | 23
Queer PhD Network | Online group | LGBTQIA+ |  |  |  |
Trans Peer Network | Online group | LGBTQIA+ |  |  |  |
LGBQI Section at the American Academy of Neurology | Online group | Medical professionals |  |  |  |

¹² Moved online as a result of the COVID-19 epidemic. Survey was distributed via the LD4 Conference Slack channel.
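The lower and upper ranges in Table 10 follow from the 60% and 90% bounds on active membership stated in the caption. A minimal sketch is given below; the function name is ours, and rounding halves upward is an assumption inferred from the table (for example, 90% of 75 is 67.5, reported as 68).

```python
def expected_responses(active_members, lower_pct=60, upper_pct=90):
    """Return (lower, upper) expected respondent counts, rounding halves up."""

    def share(pct):
        # Integer arithmetic avoids floating-point surprises at .5 boundaries.
        return (active_members * pct + 50) // 100

    return share(lower_pct), share(upper_pct)


# Active-membership estimates taken from Table 10.
ranges = {n: expected_responses(n) for n in (20, 25, 50, 75, 400, 500)}
```

For instance, an active membership of about 75 gives a range of 45 to 68, matching the corresponding rows of the table.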

The response rates were much lower than expected, potentially due to the length of the survey. However, we considered this adequate for simple analyses in relation to other published works with similar sample sizes (70,75). Thirteen people responded fully to the OUS (which specifically considered the GSSO ontology) and 12 people responded fully to the SUS (which considered the GSSO website). The unweighted¹³ average score of the OUS was 38.2 (out of 50; s_x = 6.5) and the unweighted average score of the SUS was 50.2 (out of 100; s_x = 5.2). The weighted average score of the OUS was 49.1 and the weighted average score of the SUS was 59.7.
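The two averaging schemes defined in the footnote on weighted and unweighted scores can be contrasted with a small sketch. The response matrix below is entirely hypothetical, and the treatment of unanswered items (`None` is skipped) is our assumption; the actual OUS and SUS item scoring rules are not reproduced here.

```python
def unweighted_mean(responses):
    """Total each respondent's answered items, then average those totals."""
    totals = [sum(v for v in row if v is not None) for row in responses]
    return sum(totals) / len(totals)


def weighted_mean(responses):
    """Average each question over respondents who answered it, then sum."""
    total = 0.0
    for q in range(len(responses[0])):
        answered = [row[q] for row in responses if row[q] is not None]
        if answered:
            total += sum(answered) / len(answered)
    return total


# Hypothetical responses: rows are respondents, columns are questions.
responses = [
    [4, 5, 3],
    [5, None, 4],
    [3, 4, None],
]
unweighted = unweighted_mean(responses)
weighted = weighted_mean(responses)
```

When every respondent answers every question, the two functions return the same value; they diverge only when some items are missing, as in the toy data above.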

The GSSO ontology was considered satisfactory in comparison to other ontologies evaluated using the OUS (VSTO received a 26 and SIO received a 36, for instance). However, it did not generally perform as well as the DCO (receiving scores of 38, 42, and 29) and the GCIS (receiving scores of 38 and 44) (70). The average score on the SUS was less than satisfactory (typically considered at 68).

We considered that there were sample size issues, response rate issues, and issues with the website that were unavoidable (for instance, unplanned down time). Both the ontology and the website could be improved, but the ontology is ultimately considered useful and is hosted in more consistent locations (such as Ontobee and the NCBO BioPortal).

In our evaluation of both research and medical documentation, we evaluated precision, recall, and F-score values compared to values considered to be satisfactory or excellent and to closely related identification measures in current literature where available. Equations for statistical analysis are found in Appendix E.

¹³ Unweighted scores calculated the total scores for individual respondents and those scores were then averaged. Weighted scores calculated the averages of each question, which were then added together for the total score. Both methodologies have been published in literature.

Chapter 12: Research Documentation

Research and archival information were considered facets of research documentation in our attempt to evaluate the effectiveness of the GSSO.

MEDLINE

We used the GSSO to semi-automatically curate all transgender resources published between 1790 and 1999, so as to have the widest spread of terminology¹⁴. This process identified sources using terminology included in the GSSO (automated keyword searching), as well as using manual review of sources cited within documents and additions of sources found in other curated bibliographies.

Before this, however, we considered usage of the GSSO against a free-text selection of LGBTQIA+-related articles on MEDLINE and a random selection of MEDLINE articles. The primary types of “tags” provided in MEDLINE are MeSH terms (and UIDs), chemical identifiers, and keywords. Of these, only MeSH terms form a controlled vocabulary in our context, since chemical identifiers can (and do) link out to numerous ontologies.

We used the small set of 32 LGBTQIA+-related terms from the Human Rights Campaign (HRC) as a comparison tool. These terms were first used to search through MEDLINE to create a “scoped” dataset of 14,019 entries. 13,998 (99.85%) of these entries were tagged by the GSSO, in comparison to 82.62% tagged by MeSH. We were able to search all of the 10,060 entries (including alternate labels, synonyms, and other annotation-related information) in the GSSO to tag a MEDLINE entry in 7.7 seconds on average (range 1.8 to 9.9 seconds) when running on a local machine without parallel processing.

¹⁴ This was done to make sure the GSSO could theoretically expand to newer terminologies as they arise.
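Automated keyword tagging of this kind can be sketched as a dictionary lookup over labels and synonyms. The example below is a simplified illustration and not the tagging code used in this work; the `EX:` identifiers, labels, and sample text are hypothetical.

```python
import re


def build_index(entries):
    """Map every lowercase label or synonym to its entry identifier."""
    index = {}
    for entry_id, labels in entries.items():
        for label in labels:
            index[label.lower()] = entry_id
    return index


def compile_matcher(index):
    """Build one case-insensitive pattern that matches whole labels only."""
    # Longest labels first so "gender identity" is preferred over "gender".
    labels = sorted(index, key=len, reverse=True)
    body = "|".join(re.escape(label) for label in labels)
    return re.compile(r"(?<!\w)(?:" + body + r")(?!\w)", re.IGNORECASE)


def tag(text, index, matcher):
    """Return the sorted identifiers of all entries whose labels occur in text."""
    return sorted({index[m.group(0).lower()] for m in matcher.finditer(text)})


# Hypothetical entries: identifier -> preferred label plus synonyms.
entries = {
    "EX:0001": ["transgender", "trans"],
    "EX:0002": ["gender identity"],
    "EX:0003": ["gender"],
}
index = build_index(entries)
matcher = compile_matcher(index)
tags = tag("Gender identity in transgender youth: a review", index, matcher)
```

The word-boundary checks keep “gender” from matching inside “transgender”, and ordering labels by length keeps multi-word labels from being split into their parts.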

The specificity of the GSSO toward gender, sex, and sexual orientation data was showcased by the coverage within this scoped dataset. 28.9% (2,911) of unique GSSO terms appeared as tags, in comparison to 2.5% of MeSH terms (7,022) and 8,833 keywords (no defined corpus size).

The HRC terms are shown in Table 11 alongside counts with plain-text search, MeSH tagging, keyword tagging, and GSSO tagging. Of the 17 terms considered non-ambiguous, we mapped five to MeSH to calculate precision, recall, and F-scores (lesbian, transgender, gender identity, gender dysphoria, and homophobia). Using aforementioned criteria, the result for ‘transgender’ was considered excellent (precision=0.58, recall=0.76, F-score=0.658) and three were considered satisfactory (lesbian, gender identity, and gender dysphoria) (76,77). The fifth term, ‘homophobia’, did not perform as well, with precision at 0.19, recall at 0.46, and an F-score of 0.27. It is possible that this was the result of many disparate types of homophobia which were not mapped to version 2.0 of the GSSO.
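As a quick arithmetic check, the reported F-scores are consistent with the balanced F-score, the harmonic mean of precision and recall. The sketch below assumes that definition; the equations actually used are those given in Appendix E.

```python
def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values reported in the text.
f_transgender = f_score(0.58, 0.76)
f_homophobia = f_score(0.19, 0.46)
```

Rounded to the precision used in the text, these reproduce the reported values for ‘transgender’ and ‘homophobia’.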

In the randomly selected subset of 1,217,621 MEDLINE entries, 628,979 (50.66%) were tagged with the GSSO, versus 1,058,065 (86.89%) tagged with MeSH terms, and 201,096 (16.52%) had keywords.

Table 11. Term frequencies for most common 32 terms across main terminologies for the HRC subset.

Word HRC Count MeSH Count Keyword GSSO Count Count ally 289 NM 2 304 androgynous 23 NM 2 39 asexual 1,288 NM 5 1,307 biphobia 3 NM 2 14 bisexual 2,477 2,021 183 3,240 cisgender 39 NM 8 176 closeted 3 NM 1 24 coming out 192 NM 21 307 gay 4,426 3,257 208 2,356 gender dysphoria 277 156 143 394 gender expression 24 NM 10 61 gender identity 725 1,644 164 1,078 gender non-conforming 11 NM 9 129 gender transition 10 NM 14 48 gender-expansive 0 NM 1 132 gender-fluid 0 NM 1 1 genderqueer 8 NM 8 19 homophobia 276 197 58 470 intersex 598 561 44 622 lesbian 2,195 1,891 246 3,182 LGBTQ 196 NM 58 271 living openly 0 NM 0 0 non-binary 31 NM 3 12 outing 33 NM 0 0 pansexual 3 NM 1 13 queer 369 NM 49 594 questioning 1,219 NM 12 1,301 same-gender loving 2 NM 0 2 sex assigned at birth 0 NM 0 21 sexual orientation 1,393 NM 358 2,044 transgender 2,102 2,078 529 2,431 transphobia 17 NM 10 50


In our manual review of transgender-related literature from 1790 to 1999, we examined 77 bibliographies and 11 databases, including MEDLINE. We identified 3,058 sources, of which 1,942 were journal articles. The GSSO automatically tagged 2,740 of the 3,058 sources (89.6%). Amongst the MEDLINE-specific sources (n = 1,328), MeSH tagged 1,303 (98.1%) while the GSSO tagged 1,328 (100.0%). The average number of tags per source in MEDLINE was 18.1 for the GSSO and 10.2 for MeSH. In total, 1,211 unique GSSO tags were utilized (11.0% of all GSSO terms), while 1,542 unique MeSH tags were utilized (0.6% of all MeSH terms), displaying the GSSO’s specificity in tagging LGBTQIA+-related literature.

No plaintext search mechanisms could match either of these percentages using 858 variations of transgender and transgender-related terminology. Forward feature selection (FFS) combining plaintext search terms also failed. A total of 187 terms had a precision value of 1 but recall values between 0.00075 and 0.03. The highest precision values achieved were for “transsexualism” (0.87), “males with transsexualism” (0.76), “male with transsexualism” (0.76), and “transsexualism in the male” (0.76). These terms also had the highest F-scores (0.84, 0.79, 0.79, and 0.79, respectively).

We compared the identification capability to more complex manual search mechanisms, such as that created by Wanta and Unger (78). Their search mechanism includes six MeSH terms and twelve keywords, versus only the GSSO term “transgender” being selected. In our manually constructed set of 1,328 relevant MEDLINE sources, we found that all 1,328 appeared in the Wanta and Unger review and were also appropriately tagged by the GSSO.

Additionally, we compared the manual subject tagging in Wanta and Unger’s Table 1 to the automatic secondary tagging done by the GSSO (Table 12). The differences in subject content captured between the two methods were considered non-significant (Kruskal-Wallis test at a significance level of 0.01: H = 0.14, 1 d.f., p-value = 0.71).

Table 12. Percentages of subject coverage in 1799 to 1999 dataset versus Wanta and Unger dataset, up to June 2016 (Table 1 in that publication).

| Topic (– June 2016) | Topic (1799 – 1999) | Percentage (– June 2016) | Percentage (1799 – 1999) |
| --- | --- | --- | --- |
| Surgery | Surgery | 18.3 | 23.5 |
| Endocrinology/Hormones | Endocrinology | 12.3 | 12.3 |
| Mental Health | Psychology/Psychiatry | 10.8 | 33.3 |
| Pediatrics | Pediatrics | 7.8 | 4.8 |
| HIV Care | HIV/AIDS | 5.3 | 2.2 |
| Cancer | Oncology | 1.7 | 0.8 |
| Reproduction | Fertility | 0.9 | 0.1 |
| Sexuality | Sexology/Sexual Health | 2.6 | 10.4 |
| Linguistics/Voice | Linguistics/Vocology | 2.5 | 1.0 |
| Law | Law | 2.3 | 5.7 |
| Bioethics | Ethics | 2.2 | 1.2 |
| Aging/Elderly | Gerontology/Geriatrics | 0.5 | 0.3 |
| Education | Education | 0.5 | 0.1 |
| Incarceration | Criminology | 0.5 | 0.8 |
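The Kruskal-Wallis statistic reported for this comparison can be recomputed from the two percentage columns of Table 12. The following is a minimal sketch, assuming the test was applied to those two columns as independent samples; the function name and the pure-Python implementation are illustrative and are not the software used in our analysis. With two groups the statistic has one degree of freedom, so the p-value can be obtained from the complementary error function.

```python
import math
from collections import Counter

# Subject percentages transcribed from Table 12.
wanta_unger = [18.3, 12.3, 10.8, 7.8, 5.3, 1.7, 0.9,
               2.6, 2.5, 2.3, 2.2, 0.5, 0.5, 0.5]
dataset_1799_1999 = [23.5, 12.3, 33.3, 4.8, 2.2, 0.8, 0.1,
                     10.4, 1.0, 5.7, 1.2, 0.3, 0.1, 0.8]


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two samples, with the standard tie correction.

    For two groups H has 1 degree of freedom, so the chi-square
    survival function reduces to erfc(sqrt(H / 2)).
    """
    pooled = sorted(a + b)
    n = len(pooled)

    # Tied values share the mean of the rank positions they occupy.
    first_position = {}
    for position, value in enumerate(pooled, start=1):
        first_position.setdefault(value, position)
    counts = Counter(pooled)
    rank = {v: first_position[v] + (counts[v] - 1) / 2 for v in counts}

    h = 0.0
    for group in (a, b):
        rank_sum = sum(rank[v] for v in group)
        h += rank_sum ** 2 / len(group)
    h = 12 / (n * (n + 1)) * h - 3 * (n + 1)

    # Tie correction factor.
    tie_term = sum(t ** 3 - t for t in counts.values())
    h /= 1 - tie_term / (n ** 3 - n)
    h = max(h, 0.0)  # guard against tiny negative rounding error

    p = math.erfc(math.sqrt(h / 2))
    return h, p


h_stat, p_value = kruskal_wallis_two_groups(wanta_unger, dataset_1799_1999)
print(f"H = {h_stat:.2f}, p = {p_value:.2f}")
```

Under these assumptions the sketch yields H of approximately 0.14 and a p-value of approximately 0.71, in line with the values reported above.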

Given that terms like “transsexualism” are significantly less common today, it is likely that these values (and the MeSH values) would be much lower in a more recent test dataset. The GSSO could easily expand to more recent terminology. MeSH tagged 0 articles with “Transgender Persons” (D063106) and 1,136 with “Transsexualism”, as the term “Transgender Persons” was not added to MeSH until 2016. The GSSO tagged 15 articles with “transgender” and 1,172 articles with “transsexualism”, showcasing the terminology’s adaptability when considering older and more recent terminologies.


In terms of automatic subject-based curation, automated GSSO and MeSH tagging was adequate when compared to external manual tagging. Precision results were evenly split between MeSH and the GSSO, with the GSSO performing better in all recall categories and in all F-score categories except for pediatrics. Results for subjects tested are shown in Table 13.

Table 13. Precision, recall, and F-scores for the top ten represented subjects in MEDLINE transgender literature review.

| Term | MeSH & Subject Precision | GSSO & Subject Precision | MeSH & Subject Recall | GSSO & Subject Recall | MeSH & Subject F-score | GSSO & Subject F-score |
| --- | --- | --- | --- | --- | --- | --- |
| surgery | 0.716** | 0.557* | 0.281* | 0.738** | 0.404* | 0.635** |
| psychology | 0.231 | 0.252 | 0.368* | 0.432** | 0.284* | 0.318* |
| psychiatry | 0.313* | 0.295* | 0.055 | 0.571** | 0.094 | 0.389* |
| endocrinology | 0.333* | 0.817** | 0.011 | 0.538** | 0.021 | 0.649** |
| sexology | 0.000 | 0.600** | 0.000 | 0.060 | 0.000 | 0.109 |
| law | 1.000** | 1.000** | 0.079 | 0.175 | 0.146 | 0.298* |
| pediatrics | 1.000** | 0.250 | 0.016 | 0.016 | 0.031 | 0.030 |
| sociology | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| venereology | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| epidemiology | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |

Bolded values represent higher values. * = satisfactory, ** = excellent.

Results with the GSSO were considered satisfactory for precision with surgery and psychiatry, and were excellent for endocrinology, sexology, and law. For recall, the results were excellent with surgery, psychology, psychiatry, and endocrinology. For F-scores, the GSSO’s results were excellent for surgery and endocrinology, and satisfactory for psychology, psychiatry, and law (76,77).


AIDS History Project

The AIDS History Project provided us with data of highly variable quality derived from over 40 gigabytes of documents created using OCR (optical character recognition). The OCR text ranged from fairly human-readable to seemingly randomly generated characters.

We used the GSSO on both the raw OCR text and on “corrected” text which used an NLP algorithm selected by the AIDS History Project itself. We applied 1,309 unique tags and a total of 38,342 tags to 358 documents, or an average of 107 tags per document. All 358 documents were able to be tagged, including OCR of images and handwritten documents.

The documents were of varying length, with some occupying several megabytes. Each entry included at most three Library of Congress Subject Headings tags, showcasing the depth GSSO tags were able to add. The most common tags included expected entries, like “AIDS” (219 instances), “California” (199 instances), “epidemic” (170 instances), and “university” (164 instances).

The tagging system was taught on this dataset over the course of two workshops to persons with varying levels of computing experience, showcasing the ability of the GSSO to expand into various disciplines. These experiences led to the creation of two Jupyter¹⁵ notebooks which allowed us to analyze the presence or absence of LGBTQIA+ language as a proxy for LGBTQIA+ representation in the AIDS History Project dataset.

15 Jupyter Notebook is a web-based integrated development environment (IDE), which can be used to code in a number of programming languages and markup languages. In this case, we used Python.

Using the rule-based tagging mechanism derived from our MEDLINE datasets, we found that 297 (83.0%) documents included lesbian language, 122 (34.1%) included language used primarily by gay men, 4 (1.1%) included bisexual language, and 3 (0.8%) included transgender language. The collections surveyed included many materials from the Women’s AIDS Network (WAN) Records, the Sally Hughes AIDS Research Collection, the Dritz (Selma) Papers, and the Sue Rochman Papers. In both workshops, persons with no computing experience were able to reproduce these results with only four hours of instruction.
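The presence-or-absence analysis described above can be illustrated with a short Python sketch. This is not the GSSO rule set or the workshop notebooks: the term lists, the function names, and the toy documents below are hypothetical stand-ins chosen for illustration, and the real tagging mechanism draws on far more labels and synonyms.

```python
import re

# Hypothetical term lists, for illustration only.
CATEGORY_TERMS = {
    "lesbian": ["lesbian", "lesbians"],
    "gay": ["gay men", "gay man"],
    "bisexual": ["bisexual", "bisexuals"],
    "transgender": ["transgender", "transsexual"],
}


def categories_present(text: str) -> set[str]:
    """Return the categories with at least one whole-word match in text."""
    found = set()
    for category, terms in CATEGORY_TERMS.items():
        pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(category)
    return found


def share_by_category(documents: list[str]) -> dict[str, float]:
    """Percentage of documents containing language from each category."""
    totals = {category: 0 for category in CATEGORY_TERMS}
    for doc in documents:
        for category in categories_present(doc):
            totals[category] += 1
    return {c: 100 * n / len(documents) for c, n in totals.items()}


# Toy documents standing in for OCR text.
docs = [
    "Minutes of the lesbian health caucus.",
    "Outreach to gay men and bisexual men.",
    "Quarterly budget report.",
    "Lesbians and AIDS: a resource list.",
]
print(share_by_category(docs))
```

Counting documents rather than individual matches keeps the measure comparable across documents of very different lengths, which matters for a collection in which some files occupy several megabytes.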

Chapter 13: Medical Documentation

Medical documentation of LGBTQIA+ identities and concepts is relatively sparse. Gay activism in the 1960s and 1970s led to the eventual depathologization of homosexuality and thus the removal of gay-, lesbian-, and bisexual-related terminology in both the DSM and ICD.

There is currently no widespread collection of sexual orientation or sexuality data in electronic health records (EHRs). We utilized transgender terminology primarily so that free-text could be compared to a previously identified “gold standard”, i.e., ICD coding. However, it is notable that Lynch et al utilized the GSSO in a hybrid natural language processing approach to find sexual orientation-related terms and phrases in Veterans Health Administration clinical notes from 2000 to 2019 (obtaining a positive predictive value of 85.9% for sexual orientation and a 79.9% sensitivity when compared to administrative coding for homosexuality) (79), demonstrating that the GSSO is not limited to transgender-related free-text identification.

The ICD typically follows the DSM in terms of psychological and psychiatric diagnostics but is mostly used in the United States as billing codes.

The history of transgender-related coding and pathologization in the DSM and ICD is important to understanding modern coding in clinical practice. Up until 1910, medical and psychological terminology relating to transgender identities was scant and highly variable, including such disparate (and oftentimes confusing) terms as ‘sexual inversion’ and ‘psychical hermaphroditism’.

In 1910, sexologist Magnus Hirschfeld introduced the German term Transvestit, which was translated variously into “transvestite” and “cross-dresser” in English. This was the prevalent terminology used later by Christian Hamburger, Georg K. Stürup, and E. Dahl-Iversen when writing about ’s hormonal and surgical procedures in May 1953. The first edition of the DSM [DSM-I] (1952) referred to “sexual deviation, transvestism” while the second edition [DSM-II] (1968) referred to “sexual deviation, transvestitism”.

Harry Benjamin’s The Transsexual Phenomenon, published in 1966, popularized David O. Cauldwell’s term “psychopathic transexual” (1949). However, by the 1970s, Norman Fisk and others popularized “gender dysphoria syndrome” (later shortened to “gender dysphoria”) and “gender identity disorder”.


Transphobia in the late 1960s and 1970s, including the publication of Janice Raymond’s The Transsexual Empire: The Making of the She-Male and the influence of John Money’s sex/gender differentiation, led to DSM-III (1980) splitting “sexual deviation transvestitism” into “psychosexual disorder, transsexualism” and “paraphilia, transvestism”.

However, by the fourth edition of the DSM [DSM-IV] (1994), “psychosexual disorder, transsexualism” had become “gender identity disorder”, which was split into various subtypes based on age of “onset”. By DSM-V (2013), the previous terminology was depathologized to a certain extent, with the terminology “gender dysphoria” mostly replacing “gender identity disorder”.

ICD-9 adopted what is shown in Figure 4 Panel A, mostly in line with DSM-III and DSM-IV (bolded terms are specifically transgender related). ICD-10, however, moved to a more DSM-IV-related model, shown in Figure 4 Panel B.


A. ICD-9 codes

302 Sexual and gender identity disorders
    302.3 Transvestic fetishism
    302.5 Trans-sexualism
        302.50 With unspecified sexual history
        302.51 With asexual history
        302.52 With homosexual history
        302.53 With heterosexual history
    302.6 Gender identity disorder in children
    302.8 Other specified psychosexual disorders
        302.85 Gender identity disorder in adolescents or adults

B. ICD-10 codes

F64 Gender identity disorders
    F64.0 Transsexualism
    F64.1 Dual role transvestism
    F64.2 Gender identity disorder of childhood
    F64.8 Other gender identity disorders
    F64.9 Gender identity disorder, unspecified

Figure 4. Transgender-related diagnostic codes in the International Classification of Diseases, specifically in ICD-9 (Panel A) and ICD-10 (Panel B). Bolded codes are typically considered unambiguous in their reference to transgender persons.

MIMIC-III

MIMIC-III is a single-center relational database containing 38,597 distinct adult patients and 49,785 hospital admissions at the critical care units of the Beth Israel Deaconess Medical Center (BIDMC)¹⁶ between 2001 and 2012. The database was chosen for a test-case study as it is freely available and has a very large population of deidentified ICU patients.

16 Located at 330 Brookline Avenue, Boston, Massachusetts 02215.

We first identified transgender patients utilizing ICD-9 codes (Table 14). This involved searching for all relevant ICD-9 codes and considering all ICD-9 matches to the codes in Table 14 to be true positives, after manual evaluation of the associated notes.

Table 14. Percentage of transgender patients shown out of total adult patients and total diagnoses in MIMIC-III.

| ICD-9 Code | ICD-9 Term | Diagnosable? | # Diagnoses | # Patients |
| --- | --- | --- | --- | --- |
| 302 | Sexual and gender identity disorders | No | N/A | N/A |
| 302.3 | Transvestic fetishism | Yes | 0 | 0 |
| 302.5 | Trans-sexualism | No | N/A | N/A |
| 302.50 | Trans-sexualism with unspecified sexual history | Yes | 7 | 4 |
| 302.51 | Trans-sexualism with asexual history | Yes | 0 | 0 |
| 302.52 | Trans-sexualism with homosexual history | Yes | 0 | 0 |
| 302.53 | Trans-sexualism with heterosexual history | Yes | 0 | 0 |
| 302.6 | Gender identity disorder in children | Yes | 0 | 0 |
| 302.8 | Other specified psychosexual disorders | No | 0 | 0 |
| 302.85 | Gender identity disorder in adolescents or adults | Yes | 2 | 2 |
| | Totals: | | 9 | 6 |
| | Percentage: | | | 0.016% |

Next, we implemented a free-text weighted search utilizing the GSSO (the same algorithm utilized in the MEDLINE search methodology), which provided us with a sample of 13 patients, all of whom were manually confirmed. Although this dataset was small, ICD-9 codes only correctly identified 46% of the population that the GSSO derived. Further, the GSSO search more than doubled the percentage of the total population identified, raising it from 0.016% to 0.034%. This result was obtained using only clinical notes and did not involve any other aspect of the EHR.
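The percentages in this comparison follow directly from the patient counts. A minimal Python sketch of the arithmetic is given below; the variable and function names are ours, and the coverage figure assumes that the six ICD-9-identified patients are among the 13 patients found by the GSSO search.

```python
TOTAL_ADULT_PATIENTS = 38_597  # distinct adult patients in MIMIC-III
ICD9_PATIENTS = 6              # identified via ICD-9 codes (Table 14)
GSSO_PATIENTS = 13             # identified via the GSSO free-text search


def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100 * part / whole


icd9_prevalence = percent(ICD9_PATIENTS, TOTAL_ADULT_PATIENTS)
gsso_prevalence = percent(GSSO_PATIENTS, TOTAL_ADULT_PATIENTS)
# Assumes the ICD-9 patients are a subset of the GSSO-identified patients.
icd9_coverage = percent(ICD9_PATIENTS, GSSO_PATIENTS)

print(f"ICD-9 prevalence: {icd9_prevalence:.3f}%")
print(f"GSSO prevalence:  {gsso_prevalence:.3f}%")
print(f"ICD-9 coverage of GSSO-identified patients: {icd9_coverage:.0f}%")
```

Rounded as in the text, these come to 0.016%, 0.034%, and 46%, respectively.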

Two population figures for Massachusetts were available for comparison: a 2016 estimate by the Williams Institute (0.57% of adults) and a 2011 self-identification dataset obtained from the Massachusetts Behavioral Risk Factor Surveillance Survey (0.5% of adults).

Given the number of patients identified, the gold-standard should be self-identification, followed by GSSO free-text or another free-text algorithm identification. ICD-9-focused identification is not reliable.

Comparing demographic data in the dataset, obtained from the patients table, against the note, which acted as the source of truth, illustrated that physicians utilized correct pronouns less than 40% of the time and assigned sex at birth (ASAB) was correct less than 54% of the time.

This had a significant effect on diagnostic statistics: ICD-9 codes provided 89 diagnoses, while GSSO identification yielded 163 diagnoses. Of the 89, 14 (15.7%) were mental health-related, while, out of the 163, 24 (14.7%) were related to mental health. The differences between individual diagnoses showcased by the methodology change are shown in Table 15.


Table 15. Differences in mental health diagnoses, based on ICD-9 and GSSO search mechanisms. Score cutoff for GSSO identification was 1.0.

| Diagnosis | ICD-9: % of Trans Patients (n = 6) | ICD-9: Diagnoses Per Patient | ICD-9: % of All Diagnoses | GSSO: % of Trans Patients (n = 13) | GSSO: Diagnoses Per Patient | GSSO: % of All Diagnoses |
| --- | --- | --- | --- | --- | --- | --- |
| Attention Deficit Disorder | 16.7% (1) | 0.17 | 1.1% | 23.1% (3) | 0.23 | 1.8% |
| Depression | 33.3% (2) | 0.33 | 2.2% | 30.8% (4) | 0.31 | 2.5% |
| Drug Abuse/Dependence | 50.0% (3) | 1.17 | 7.9% | 46.2% (6) | 0.77 | 6.1% |

There was a MIMIC-III column entitled “gender”, but the MIMIC-III platform and documented usage noted that this should be taken to mean “genetic sex,” despite minimal karyotyping results of patients available. If this is assumed to be sex assigned at birth (the closest external match in terminology to their descriptor), then it is correctly entered 53.8% of the time.

However, given the small sample size, these results cannot be applied to the general population without additional analysis.

Cincinnati Children’s Hospital

Given the limitations of the MIMIC-III dataset, we tested the GSSO on an additional pediatric dataset from the emergency department at Cincinnati Children’s Hospital Medical Center (CCHMC). CCHMC has a transgender health clinic, which makes potential cross-referencing for true positives easier and suggests that a larger set of transgender patients will be available.


We identified 2,003 transgender patients using ICD-9 and ICD-10 codes associated with visits from 2009 to 2020. The ICD codes represented diagnosis codes added by providers, the same as those shown in the MIMIC-III dataset. For population-level statistics, we compared these results to a set of all patients seen in the same time frame across the entire hospital with at least one encounter and at least one diagnosis.

For ICD-9 codes, we identified 494 patients (versus 551 found using free-text matches to terms drawn from the transgender bibliography).

Table 16. Transgender-related ICD-9 diagnoses in the CCHMC dataset.

| ICD-9 Code | ICD-9 Term | Diagnosable? | # Diagnoses | # Patients |
| --- | --- | --- | --- | --- |
| 302 | Sexual and gender identity disorders | No | N/A | N/A |
| 302.3 | Transvestic fetishism | Yes | 1 | 1 |
| 302.5 | Trans-sexualism | No | N/A | N/A |
| 302.50 | Trans-sexualism with unspecified sexual history | Yes | 37 | 15 |
| 302.51 | Trans-sexualism with asexual history | Yes | 0 | 0 |
| 302.52 | Trans-sexualism with homosexual history | Yes | 5 | 1 |
| 302.53 | Trans-sexualism with heterosexual history | Yes | 11 | 1 |
| 302.6 | Gender identity disorder in children | Yes | 3,038 | 452 |
| 302.8 | Other specified psychosexual disorders | No | N/A | N/A |
| 302.85 | Gender identity disorder in adolescents or adults | Yes | 551 | 89 |
| | Totals: | | 3,092 | 494 |
| | Percentage: | | | 0.07% |


For ICD-10 codes, we identified 1,766 patients (versus 1,831 found using free-text matches to terms drawn from the transgender bibliography).

Table 17. Transgender-related ICD-10 diagnoses in the CCHMC dataset.

| ICD-10 Code | ICD-10 Term | Diagnosable? | # Diagnoses | # Patients |
| --- | --- | --- | --- | --- |
| F64 | Gender identity disorders | No | N/A | N/A |
| F64.0 | Transsexualism | Yes | 16,885 | 1,559 |
| F64.1 | Dual role transvestism | Yes | 2 | 2 |
| F64.2 | Gender identity disorder of childhood | Yes | 1,207 | 221 |
| F64.8 | Other gender identity disorders | Yes | 2 | 1 |
| F64.9 | Gender identity disorder, unspecified | Yes | 1,761 | 484 |
| | Totals: | | 19,857 | 1,766 |
| | Percentage: | | | 0.32% |

Unlike with the MIMIC-III dataset, we were also able to ascertain what the medical provider originally wrote, which was then mapped to an ICD code. This revealed that many providers entered a transgender-related diagnostic note that was mapped to a non-transgender diagnosis. For instance, “Hormonal imbalance in transgender patient” was mapped to 259.9 (“Unspecified endocrine disorder”) and “Gender identity uncertainty” was mapped to F66 (“Other sexual disorders”). Other terms were outdated but were still in use when ICD-10 was implemented, such as “Sexual deviation” (which has not appeared since DSM-II in 1968), “Sexual aberration” (which has never appeared in the DSM or the ICD), and “Transvestism” (which has not appeared since DSM-III in 1980). Further, some entries in the CCHMC dataset were unnecessarily pathologized, such as physician entries for “Cross-dressing” being coded to “Transvestic fetishism”, which is a mostly theoretical construct describing cross-dressing for explicitly sexual purposes in a way which is considered to be an issue by the patient themselves.

ICD-10 introduced a limited number of patient history codes (Z00-Z99, “Factors influencing health status and contact with health services”). These were not found in the MIMIC-III dataset, but they were present in the CCHMC dataset. Only two of these codes are potentially transgender-related: Z87.890 (“Personal history of sex reassignment”) and Z79.890 (“Hormone replacement therapy”). “Z79.890” did not appear in our exploratory analyses, but several variations of the code “Z87.890” did, representing 4 patients. Unfortunately, these codes are not transgender-specific, with Z87.890 referring to any sex reassignment for any reason (such as for intersex infants) and Z79.890 referring to any hormone replacement therapy (including vaginal estrogen ring therapy and postmenopausal hormone replacement therapy). Analyses using these codes should carefully consider whether the patient is transgender before the codes are used in widespread research projects.

There were several more issues which could affect the coding process: patient privacy including modification of billing codes, problems with mapping to billing codes, external factors related to code mapping, provider knowledge, knowledge of those in coding individual diagnostics, etc. CCHMC uses a mixture of various coding methodologies, making it difficult to parse out which factors may have influenced a given code assignment.


For population-level statistics, there was a large increase in transgender-related diagnostics from ICD-9 (0.07%) to ICD-10 (0.32%). However, this is still significantly less than the 2016 Williams Institute estimate of the transgender population in Ohio, approximately 0.45% of adults. CCHMC opened a transgender health clinic in 2013. The clinic has seen more than 1,400 patients since that date and has 964 active patients as of 2020 (80).

In considering the changes from free-text analysis versus ICD codes alone, we had 551 versus 494 (with ICD-9) and 1,831 versus 1,766 (with ICD-10).

Patient demographics are shown in Table 18. The White, non-Hispanic transgender population was expected to be around 55% of the total number of transgender patients. However, the percentage was found to be much larger (86%). Note that Williams Institute comparative data (for the purposes of racial/ethnic designations) were based on the free-text matches to terms, not the matches to ICD codes. CCHMC population data was extrapolated from the Vital Statistics Fiscal Year 2019 Summary (81). Statistics for African-Americans were far below expected (0.08% versus 0.77%), potentially showcasing intersectional health disparities.


Table 18. Comparison of Williams Institute and CCHMC transgender racial/ethnic statistics. Latinx/Hispanic population statistics not recorded (hence the N/A).

| Racial/Ethnic Category | Williams Institute: Percentage of Transgender Population in Group | Williams Institute: Percentage of Adult Population in Group | CCHMC (ICD Identification): Number of Persons | CCHMC (ICD Identification): Percentage of Transgender Population in Group | CCHMC (ICD Identification): Percentage of Hospital Population in Group |
| --- | --- | --- | --- | --- | --- |
| White, non-Hispanic | 55% | 0.48% | 1,688 | 86% | 0.27% |
| African-American or Black, non-Hispanic | 16% | 0.77% | 148 | 8% | 0.08% |
| Hispanic or Latinx | 21% | 0.84% | 15 | 1% | N/A |
| Other Race or Ethnicity, non-Hispanic | 8% | 0.64% | 110 | 6% | 0.43% |
| Total | | | 2,003 | | |

To compare NLP techniques to those we used in the MIMIC-III dataset, we evaluated all 2018 and 2019 Emergency Department and Urgent Care visits. All patients with a visit during the study period were evaluated on three criteria: (1) whether they had ever received a transgender diagnosis, i.e., an appropriate ICD-10 code; (2) whether the word ‘transgender’ appeared anywhere in their progress notes; and (3) their score from the NLP algorithm which utilized the GSSO. The overlap between these groups is shown in Table 19.

Applying ICD-10-CM codes identified 130 individuals (1.5% of patients), free-text identified 71 individuals (0.83% of patients), and the GSSO identified 78 individuals using a cutoff of 1.0 (0.91% of patients). With all methodologies, 172 patients (2.0%) were identified. Our expected baselines were 0.32% (based on hospital demographics) and 0.45% (based on the Williams Institute demographics).

Table 19. Overlap between various identification methods in CCHMC data.

| | ICD-10 | Free-Text | GSSO |
| --- | --- | --- | --- |
| ICD-10 | 130 | 35 | 41 |
| Free-Text | 35 | 71 | 65 |
| GSSO | 41 | 65 | 78 |
| Total Unique: | 122 | 39 | 40 |

We evaluated a set of similar results (26 identified patients) created by Foer et al (2019), shown in Figure 1 of that paper (82). Their keywords included the terms “transvestite”, “gender identity”, and “gender reassignment”. These terms may not relate to being transgender and may not apply in pediatric settings, wherein intersex youth who are not transgender may have such reassignments performed. Additionally, they found that only 26 patients (8.0%) in a set of 324 were identifiable as transgender upon chart review. The difference in identification methodologies by closest related procedure is shown in Figure 5.


Figure 5. Comparison of identification methodologies in the CCHMC dataset (left) and the Foer et al (2019) dataset (right).

There were several limitations to this identification, including the lack of comprehensive preprocessing or deduplication of clinical notes which may have had sections copied and pasted. This is a minor concern as most emergency department visits relate to the presenting complaint, not the patient’s full medical history. We considered using programs like WordNinja, a Python package, to split up and ‘correct’ potential spelling errors, etc., but when the system was used in MIMIC-III, the processing time was significantly longer and led to more false positives due to limitations in the WordNinja training corpus.

It was difficult to select an appropriate score cutoff with the limitations presented and without extensive manual chart review. Instead, we compared recall, precision, and F-scores between methodologies, assuming that each was the ‘gold’ standard. The group with the highest F-scores in comparison to the other two methodologies would be considered the most effective. The distribution of scores is shown in Figure 6.


Figure 6. Distribution of matching scores using GSSO algorithm for transgender identification.

GSSO scores had a wider distribution than in the MIMIC-III dataset, with one individual not reaching the 1.0 threshold (having a 0.7). Most patients hit the 1.0 threshold (81%), but four hit the highest value of 2.8. One patient with a score of 2.8 did not appear in either the ICD-10-CM set or the free-text set. We are unable to create a significant cutoff without a much larger dataset, despite the GSSO’s attempt to quantify matches, although a cutoff of 1.0 does provide excellent precision, recall and F-score values. Additionally, GSSO scores were significantly better than those reported by any methodology in Foer et al (2019).
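To make the role of the score cutoff concrete, the following Python sketch shows one way a weighted free-text score could be computed and thresholded. It is purely illustrative: the phrases, the weights, and the function names are hypothetical and do not reproduce the weighting used by the GSSO algorithm; only the cutoff of 1.0 is taken from the text.

```python
import re

# Hypothetical phrase weights (not the GSSO's): phrases treated as stronger
# evidence carry a larger weight.
PHRASE_WEIGHTS = {
    "transgender": 1.0,
    "gender dysphoria": 1.0,
    "hormone therapy": 0.4,
    "preferred name": 0.3,
}
CUTOFF = 1.0  # score cutoff reported for GSSO identification


def note_score(note: str) -> float:
    """Sum the weights of the distinct phrases that occur in the note."""
    text = note.lower()
    score = 0.0
    for phrase, weight in PHRASE_WEIGHTS.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            score += weight
    return round(score, 2)


def is_flagged(note: str, cutoff: float = CUTOFF) -> bool:
    """A note is flagged when its score reaches the cutoff."""
    return note_score(note) >= cutoff


weak = "Patient is on hormone therapy and uses a preferred name."
strong = "Transgender woman with gender dysphoria on hormone therapy."
print(note_score(weak), is_flagged(weak))
print(note_score(strong), is_flagged(strong))
```

In this toy setup, a note containing only weak indicators stays below the cutoff, while a note with at least one strong indicator reaches it, which mirrors why a single patient scoring 0.7 falls outside the identified group.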


Table 20. Precision (P), recall (R), and F-scores (F) for ICD-10 identification, free-text identification, and GSSO identification based on a score cutoff of 1.0. Foer et al (2019) comparisons shown under averages (Table 1 in that paper), wherein methodologies are the same, except “GSSO” is “Keyword”.

Comparison Against Alternative Methodologies

| Identification Method | ICD-10 P | ICD-10 R | ICD-10 F | Free-Text P | Free-Text R | Free-Text F | GSSO P | GSSO R | GSSO F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ICD-10 | 1.00** | 1.00** | 1.00** | 0.27* | 0.49** | 0.35* | 0.32* | 0.53** | 0.39* |
| Free-Text | 0.27* | 0.27* | 0.35* | 1.00** | 1.00** | 1.00** | 0.92** | 0.83** | 0.87** |
| GSSO | 0.25 | 0.32* | 0.28* | 0.77** | 0.92** | 0.84** | 1.00** | 1.00** | 1.00** |
| Average: | 0.50* | 0.53** | 0.54** | 0.68** | 0.80** | 0.73** | 0.74** | 0.79** | 0.76** |
| Foer et al: | 1.00** | 0.04 | 0.08 | 0.07 | 0.08 | 0.08 | 0.00 | 0.23 | 0.01 |

* = satisfactory (precision above 0.265, recall above 0.216, and F-score above 0.198), ** = excellent (precision above 0.56, recall above 0.42, and F-score above 0.48).
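Pairwise comparisons of this kind can be derived from overlap counts such as those in Table 19. The sketch below is a minimal illustration, assuming that precision is the overlap divided by the number of patients flagged by the method under evaluation and that recall is the overlap divided by the number flagged by the method treated as the ‘gold’ standard; the names used are ours.

```python
# Counts from Table 19: patients flagged by each method and pairwise overlaps.
identified = {"ICD-10": 130, "Free-Text": 71, "GSSO": 78}
overlap = {
    frozenset({"ICD-10", "Free-Text"}): 35,
    frozenset({"ICD-10", "GSSO"}): 41,
    frozenset({"Free-Text", "GSSO"}): 65,
}


def pairwise_metrics(method: str, reference: str) -> tuple[float, float, float]:
    """Precision, recall, and F-score of `method`, with `reference` as gold."""
    true_positives = overlap[frozenset({method, reference})]
    precision = true_positives / identified[method]
    recall = true_positives / identified[reference]
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


p, r, f = pairwise_metrics("ICD-10", "Free-Text")
print(f"ICD-10 vs. free-text: P = {p:.2f}, R = {r:.2f}, F = {f:.2f}")
```

For ICD-10 identification evaluated against free-text identification, this gives a precision of about 0.27, a recall of about 0.49, and an F-score of about 0.35, matching the corresponding cells of Table 20.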

The ICD-10 diagnoses could have occurred at any patient visit and not just in relation to their emergency department or urgent care visit. This could also mean that an individual may have detransitioned and/or desisted from identifying as transgender and the code would remain. Because of this confounder, the free-text matching with the GSSO provides a more accurate comparison for what is in the note in and of itself, rather than something that happened outside the context of the emergency visit. In comparison to the other methodologies, the GSSO provided, on average, better precision and better F-scores, as well as recall values on par with free-text identification.

PART V: DISCUSSION

The GSSO was found to be accurate, complete, concise, adaptable, clear, computationally efficient and consistent across multiple scenarios and datasets. Completeness was judged with the usage of seed terms in multiple databases, manual literature review and curation of terminology, and via the OUS with opinions from both LGBTQIA+ persons and medical providers. Accuracy was evaluated with comparisons to MeSH in MEDLINE and via the OUS. Conciseness was shown via the elements showcased in the accuracy and completeness criteria, as well as the OUS. Adaptability was shown via the GSSO’s usage in medical and non-medical scenarios, by LGBTQIA+ persons generally, by archivists and librarians, and in medical research datasets, EHR datasets, and in archival records of varying textual quality. The OUS feedback and the SUS for the website contributed to assessment of clarity, computational efficiency, and consistency. The ability of the GSSO to be loaded without errors into multiple platforms such as Ontobee and the NCBO BioPortal showcased its ability to pass essential logical tests. Quick load times and fast tagging of thousands of documents further displayed the GSSO’s computational efficiency across multiple platforms, databases, browsers, websites, and programming languages.

Chapter 14: Limitations

While the GSSO demonstrated powerful results, there are some limitations to consider when designing and developing a novel ontology. The hybrid methodology used to construct the GSSO, consisting of automated and manual updates, can be considered nonstandard, and typically a more automated approach to ontology creation is favored. Because the ontology’s primary focus was education and readability, with computational methodologies being considered secondary, we considered this to be a non-issue.

Because the GSSO is the first ontology in this space, ontology alignment is difficult, and it is fundamentally too broad and medical in scope to be compared to LGBTQIA+ controlled vocabularies or subject headings. Even if the scope were similar, alignment would be difficult because no known vocabularies in that space have an ontologically derived hierarchy. A forced hierarchy may skew or bias results, making ontology alignment as a measurement mechanism unreliable. However, the manual addition of ontology cross-references significantly reduces issues with inter-ontology alignment and matching. In the future, as the ontology expands, this manual curation may become unreliable, and an NLP mechanism may be required for more exhaustive alignment.

In terms of balancing the qualitative and quantitative linguistic aspects of the ontology, it was critical to consider a multi-disciplinary perspective and weigh different aspects of knowledge in the construction of the GSSO. These knowledges originated not just from differing fields (such as gender studies, medicine, history, etc.), but also from different levels of experience (persons inside and outside of academia, for instance). However, this prioritized the perspectives of privileged individuals (those who are not heavily involved in research, medical care, or are members of a marginalized group or groups) over those which may have not appeared in our review. Lay ontologies (those created by non-academics, or those outside of biomedical fields) were not well represented (other than, perhaps, Wikipedia mappings¹⁷). We mitigated this in part by using the OUS and individual feedback from members of various groups online via the GSSO website’s “contact us” option. To increase diversity in future work, community involvement will be crucial when considering appropriate descriptions and usage notes.

17 Several studies have analyzed bias as related to the English-language Wikipedia’s construction (83,84). Additionally, most editors of the site are between 17 and 40 years of age and 84% of editors identify as male (85,86).

Despite this shortcoming, there was still some discussion of “political-correctness” in descriptions of various linguistic aspects of the ontology, especially in relationship to medical providers. In response to such concerns, archivist Jessica Tai noted in “Cultural Humility as a Framework for Anti-Oppressive Archival Description”:

From within a framework of cultural humility, archivists understand that redescription is not just about revising language but about implementing a practice of critical self-reflection, as well as recognizing and shifting power imbalances. In emphasizing co-learning through community engagement, collaboration and partnerships, cultural humility refocuses archivists to be fundamentally user-centered. A pivotal step in doing so is to normalize not knowing.¹⁸ (87)

This conceptualization of engagement in cultural humility is deeply connected to the medical professions (88,89). In general, not knowing leads to individuals experiencing more discomfort and “more annoyance with lack of information” (90). However, even short educational “bursts” have been shown to improve knowledge and behavioral intentions of providers towards LGBTQIA+ persons (91). The GSSO website aimed to be an intermediate between medical and non-medical affectations, which may have affected its perception by both groups. It was difficult to ascertain how the groups were affected due to the low response rate and unplanned downtime, but we plan on running a more specialized and incentivized survey in future work. However, we did not consider the website design our study’s central focus.

18 Italicization appears as presented in the original article.

In the future, building a GSSO lookup feature into EHR functionality may lead to higher agreement among medical providers about the GSSO’s functionality and usefulness. Short training sessions on the website could also help to close this gap.

Chapter 15: Future Directions

Construction and updating of the GSSO with input from multiple health and health-adjacent arenas allows us access to several areas of inquiry moving forward. First, identification of transgender individuals could be expanded to LGBTQIA+ individuals more broadly, continuing preliminary work we have already performed. Additionally, comparing the GSSO-based identification directly to existing NLP techniques (79) and population- and location-specific interventions could help to assure accuracy. Implementing the GSSO across institutions could help to improve the reach and evaluative capabilities of the ontology.

The GSSO can be used to evaluate social media-related health surveillance, especially regarding health behaviors. LGBTQIA+-based identification in social media data has been limited by a lack of consideration of slang terminology, which the GSSO takes into account, allowing for data to be “bucketed” easily during classification-based processes. The existence of modern LGBTQIA+ forums and Usenet archives (e.g., the Transgender Usenet Archive) would also allow for tracking of health-related behaviors and attitudes over time.

Translation of the GSSO into non-English languages has been an area of interest. Usage in non-English language situations is crucial as international standards are often developed without knowledge of translatability, or cultural conceptualizations of gender, sex, and sexual orientation concepts. This may also lead to situations wherein medical information is used to discriminate against LGBTQIA+ persons, so it is imperative that non-English language representation be considered moving forward.

Construction of a unified search interface for LGBTQIA+ literature is important to providing up-to-date perspectives from multiple communities, histories, and disciplines. During the construction of the GSSO, consulting with various groups, such as the GLBT Museum & Archives, showcased the difficulties in effective cataloguing, especially in terms of digitalization and making materials available online following the advent of the COVID-19 epidemic. Medical narratives were prioritized in many LGBTQIA+ articles and reviews, with many conceptualizations and issues being “discovered” sometimes decades after the first case.

This interface will come with necessary updates to the web interface generally, as improvement of the site’s usability is important to its continued usage and acceptability within the medical and non-medical communities.

Chapter 16: Conclusions

The Gender, Sex, and Sexual Orientation (GSSO) ontology represents the first transdisciplinary ontology, including both lay slang terms and medical terminologies. The GSSO includes thousands of individually curated terms related to the clinically relevant areas of gender, sex, and sexual orientation. Most of these terms are not present in any other controlled vocabulary system and include constantly evolving terminologies and slang. The GSSO includes the ability to drill down to individual sources for terms in addition to its inclusion of definitions for all terms.

In comparison to other biomedical ontologies, like MeSH, the GSSO is more specific and more sensitive in its inclusion of LGBTQIA+ MEDLINE entries. The GSSO’s tagging system is fast and extensible and performs better in regard to LGBTQIA+ patient identification than ICD codes. The GSSO is interoperable and available on a variety of platforms, including its own website, the NCBO BioPortal, and as part of the OBO Foundry.

Future work with the GSSO will continue to enhance patient identification NLP systems, facilitate social media public health surveillance, and educate medical providers about aspects of LGBTQIA+ health care.

The National Transgender Discrimination Survey reported that 50% of transgender patients had to teach their medical provider about transgender care and that 19% were refused care because of their gender status (4). A sixth of LGBTQIA+ adults have experienced discrimination in health care settings and a fifth avoided medical care entirely due to fear of discrimination (19). The GSSO is a first step in addressing these gaps as they relate to lack of understanding and subject-specific knowledge not currently obtained in medical education.


REFERENCES

1. Gates GJ. How many people are lesbian, gay, bisexual, and transgender? [Internet]. The Williams Institute; 2011 [cited 2019 Apr 22]. Available from: http://williamsinstitute.law.ucla.edu/wp-content/uploads/Gates-How-Many-People-LGBT-Apr-2011.pdf

2. Intersex Society of North America. How common is intersex? [Internet]. Intersex Society of North America; [cited 2019 Apr 22]. Available from: http://www.isna.org/faq/frequency

3. Discrimination in America: Experiences and Views of LGBTQ Americans [Internet]. National Public Radio; 2017 [cited 2019 Apr 22]. Available from: https://www.npr.org/documents/2017/nov/npr-discrimination-lgbtq-final.pdf

4. James SE, Herman JL, Rankin S, Keisling M, Mottet L, Anafi M. The Report of the 2015 U.S. Transgender Survey [Internet]. National Center for Transgender Equality; 2016 [cited 2019 Mar 13]. Available from: https://transequality.org/sites/default/files/docs/usts/USTS-Full-Report-Dec17.pdf

5. Li LW, Gee GC, Dong X. Association of Self-Reported Discrimination and Suicide Ideation in Older Chinese Americans. The American Journal of Geriatric Psychiatry. 2018 Jan;26(1):42–51.

6. Assari S, Moghani Lankarani M, Caldwell C. Discrimination Increases Suicidal Ideation in Black Adolescents Regardless of Ethnicity and Gender. Behavioral Sciences. 2017 Nov 6;7(4):75.

7. Ahuja A. LGBT adolescents in America: Depression, discrimination and suicide. European Psychiatry. 2016 Mar;33:S70.

8. Committee on Improving the Health, Safety, and Well-Being of Young Adults, editor. Investing in the health and well-being of young adults. Washington, D.C: The National Academies Press; 2014. 479 p.

9. Halim ML, Moy KH, Yoshikawa H. Perceived ethnic and language-based discrimination and Latina immigrant women’s health. Journal of Health Psychology. 2017 Jan;22(1):68–78.

10. Hausmann LRM, Jeong K, Bost JE, Ibrahim SA. Perceived Discrimination in Health Care and Health Status in a Racially Diverse Sample. Medical Care. 2008 Sep;46(9):905–14.

11. Sorkin DH, Ngo-Metzger Q, De Alba I. Racial/Ethnic Discrimination in Health Care: Impact on Perceived Quality of Care. Journal of General Internal Medicine. 2010 May;25(5):390–6.

12. Sulaiman A. The Impact of Language & Cultural Barriers on Patient Safety & Health Equity [Internet]. The Washington Patient Safety Coalition; [cited 2019 Mar 13]. Available from: http://www.wapatientsafety.org/impact-of-language-cultural-barriers-on-patient-safety-health-equity

13. Center for the Application of Prevention Technologies. Words Matter: How Language Choice Can Reduce Stigma [Internet]. Substance Abuse and Mental Health Services Administration; 2017 [cited 2019 Mar 13]. Available from: https://www.samhsa.gov/capt/sites/default/files/resources/sud-stigma-tool.pdf

14. Cahill S. LGBT Experiences With Health Care. Health Affairs. 2017 Apr;36(4):773–4.

15. Obedin-Maliver J, Goldsmith ES, Stewart L, White W, Tran E, Brenman S, et al. Lesbian, Gay, Bisexual, and Transgender–Related Content in Undergraduate Medical Education. JAMA [Internet]. 2011 Sep 7 [cited 2018 Nov 27];306(9). Available from: http://jama.jamanetwork.com/article.aspx?doi=10.1001/jama.2011.1255

16. Melber DJ, Teherani A, Schwartz BS. A Comprehensive Survey of Preclinical Microbiology Curricula Among US Medical Schools. Clinical Infectious Diseases. 2016 Jul 15;63(2):164–8.

17. Cuerda C, Schneider SM, Van Gossum A. Clinical nutrition education in medical schools: Results of an ESPEN survey. Clinical Nutrition. 2017 Aug;36(4):915–6.

18. Snelgrove JW, Jasudavisius AM, Rowe BW, Head EM, Bauer GR. “Completely out-at-sea” with “two-gender medicine”: A qualitative analysis of physician-side barriers to providing healthcare for transgender patients. BMC Health Services Research [Internet]. 2012 Dec [cited 2018 Nov 25];12(1). Available from: http://bmchealthservres.biomedcentral.com/articles/10.1186/1472-6963-12-110

19. Powell A. The problems with LGBTQ health care [Internet]. The Harvard Gazette; 2018 [cited 2020 Nov 12]. Available from: https://news.harvard.edu/gazette/story/2018/03/health-care-providers-need-better-understanding-of-lgbtq-patients-harvard-forum-says/

20. Chandra S, Mohammadnezhad M, Ward P. Trust and Communication in a Doctor-Patient Relationship: A Literature Review. Journal of Healthcare Communications [Internet]. 2018 [cited 2019 Mar 13];03(03). Available from: http://healthcare-communications.imedpub.com/trust-and-communication-in-a-doctorpatient-relationship-a-literature-review.php?aid=23072

21. Hoehndorf R, Schofield PN, Gkoutos GV. The role of ontologies in biological and biomedical research: a functional perspective. Briefings in Bioinformatics. 2015 Nov 1;16(6):1069–80.

22. Noy NF, McGuinness DL. Ontology Development 101: A Guide to Creating Your First Ontology [Internet]. Stanford University; Available from: https://protege.stanford.edu/publications/ontology_development/ontology101.pdf

23. Jonquet C, LePendu P, Falconer S, Coulet A, Noy NF, Musen MA, et al. NCBO Resource Index: Ontology-based search and mining of biomedical resources. Journal of Web Semantics. 2011 Sep;9(3):316–24.

24. Leonelli S, Diehl AD, Christie KR, Harris MA, Lomax J. How the gene ontology evolves. BMC Bioinformatics [Internet]. 2011 Dec [cited 2019 Apr 1];12(1). Available from: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-325

25. Grissinger M. The Five Rights: A Destination Without a Map. P & T. 2010 Oct;35(10):542.


26. Ke Q. Identifying translational science through embeddings of controlled vocabularies. Journal of the American Medical Informatics Association. 2019 Jun 1;26(6):516–23.

27. Harrow I, Balakrishnan R, Jimenez-Ruiz E, Jupp S, Lomax J, Reed J, et al. Ontology mapping for semantically enabled applications. Drug Discovery Today. 2019 Oct;24(10):2068–75.

28. Mascardi V, Locoro A, Rosso P. Automatic Ontology Matching via Upper Ontologies: A Systematic Evaluation. IEEE Trans Knowl Data Eng. 2010 May;22(5):609–23.

29. Faria D, Pesquita C, Mott I, Martins C, Couto FM, Cruz IF. Tackling the challenges of matching biomedical ontologies. J Biomed Semant. 2018 Dec;9(1):4.

30. Spyns P, De Bo J. Ontologies: a revamped cross-disciplinary buzzword or a truly promising interdisciplinary research topic? Linguistica Antverpiensia. 2004;(3):279–92.

31. Hastings J, Ceusters W, Smith B, Mulligan K. The Emotion Ontology: Enabling Interdisciplinary Research in the Affective Sciences. In: Beigl M, Christiansen H, Roth-Berghofer TR, Kofod-Petersen A, Coventry KR, Schmidtke HR, editors. Modeling and Using Context [Internet]. Berlin, Heidelberg: Springer Berlin Heidelberg; 2011 [cited 2020 Nov 9]. p. 119–23. (Lecture Notes in Computer Science; vol. 6967). Available from: http://link.springer.com/10.1007/978-3-642-24279-3_14

32. Poli R, editor. Computer applications. Dordrecht: Springer; 2010. 576 p. (Theory and applications of ontology).

33. Iliadis AJ. A black art: Ontology, data, and the Tower of Babel problem. Purdue University; 2016.

34. Musen MA. The protégé project: a look back and a look forward. AI Matters. 2015 Jun 16;1(4):4–12.

35. Gašević D, Djurić D, Devedzic V, Gašević D. Model driven engineering and ontology development. 2nd ed. Dordrecht ; New York: Springer; 2009. 378 p.

36. Michal K, Michal Š, Zdeněk B. Interoperability through ontologies. IFAC Proceedings Volumes. 2012;45(7):196–200.

37. Wilcosky TC, Waterbor JW, Choi BCK, Pak AWP. Subjective Terminology. Epidemiology. 1993 Jan;4(1):87–9.

38. Nouraei S a. R, Hudovsky A, Virk JS, Chatrath P, Sandhu GS. An audit of the nature and impact of clinical coding subjectivity variability and error in otolaryngology. Clin Otolaryngol. 2013 Dec;38(6):512–24.

39. Links AR, Callon W, Wasserman C, Walsh J, Beach MC, Boss EF. Surgeon use of medical jargon with parents in the outpatient setting. Patient Education and Counseling. 2019 Jun;102(6):1111–8.

40. Bodenreider O. Bio-ontologies: current trends and future directions. Briefings in Bioinformatics. 2006 May 23;7(3):256–74.


41. Speer SA. Gender talk: feminism, discourse and conversation analysis. London ; New York: Routledge; 2005. 236 p.

42. Jespersen O. Language: its nature, development, and origin. Place of publication not identified: Hamlin Press; 2013.

43. Foundalis HE. Evolution of Gender in Indo-European Language. Proceedings of the Twenty-fourth Annual Conference of the Cognitive Science Society. 2002 Aug;

44. Prewitt-Freilino JL, Caswell TA, Laakso EK. The Gendering of Language: A Comparison of Gender Equality in Countries with Gendered, Natural Gender, and Genderless Languages. Sex Roles. 2012 Feb;66(3–4):268–81.

45. Baron DE. What’s your pronoun? beyond he & she. First edition. New York: Liveright Publishing Corporation, a division of W. W. Norton & Company; 2020. 283 p.

46. Baker P. Fantabulosa: a dictionary of Polari and gay slang. London: Continuum; 2004. 256 p.

47. Serano J. Outspoken: a decade of transgender activism & trans feminism. Oakland, CA: Switch Hitter Press; 2016. 330 p.

48. Schuster M. On Being Gay In Medicine: After The Supreme Court Victory, Still Work To Be Done [Internet]. 2020 [cited 2020 Nov 12]. Available from: https://www.wbur.org/commonhealth/2020/06/24/on-being-gay-in-medicine-supreme-court

49. James SD. Lesbians Sue When Partners Die Alone [Internet]. 2009 [cited 2020 Nov 12]. Available from: https://abcnews.go.com/Health/story?id=7633058

50. Stroumsa D, Roberts EFS, Kinnear H, Harris LH. The Power and Limits of Classification — A 32- Year-Old Man with Abdominal Pain. N Engl J Med. 2019 May 16;380(20):1885–8.

51. Humm A. Jay Kallio, Model Activist to the End, Dead at 61 [Internet]. Gay City News; 2016 [cited 2020 Nov 12]. Available from: https://www.gaycitynews.com/jay-kallio-model-activist-to-the-end-dead-at-61/

52. James SD. Trans Man Denied Cancer Treatment; Now Feds Say It’s Illegal [Internet]. ABC News; 2012 [cited 2020 Nov 12]. Available from: https://abcnews.go.com/Health/transgender-bias-now-banned-federal-law/story?id=16949817

53. Yurdakok M. Neonatal medicine in prehistoric times in Anatolia. Journal of Clinical Neonatology. 2015;4(3):153.

54. Johnson EM. The Allure of Gay Cavemen [Internet]. Scientific American; 2012 [cited 2019 May 25]. Available from: https://blogs.scientificamerican.com/primate-diaries/the-allure-of-gay-cavemen/

55. Peralta E. Researchers Dig Up “Homosexual Or Transsexual” Caveman Near Prague [Internet]. NPR; 2011 [cited 2020 Nov 12]. Available from: https://www.npr.org/sections/thetwo-way/2011/04/08/135212785/researchers-dig-up-homosexual-or-transsexual-caveman-near-prague

56. Hollimon SE. Sex, Gender and Health Among the Chumash: An Archaeological Examination of Prehistoric Gender Roles. Proceedings of the Society for California Archaeology. 1996;9:205–8.

57. World Population Prospects, Volume 1: Comprehensive Tables [Internet]. United Nations; 2015 [cited 2019 Apr 25]. Available from: https://esa.un.org/unpd/wpp/Publications/Files/WPP2015_Volume-I_Comprehensive-Tables.pdf

58. Desjardins B. Why is life expectancy longer for women than it is for men? [Internet]. Scientific American; 2004 [cited 2019 Apr 25]. Available from: https://www.scientificamerican.com/article/why-is-life-expectancy-lo/

59. Yu S. Uncovering the hidden impacts of inequality on mental health: a global study. Transl Psychiatry. 2018 18;8(1):98.

60. Dynes WR, Johansson W. Encyclopedia of homosexuality. Volume II [Internet]. 2016 [cited 2020 Dec 1]. Available from: http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=1204190

61. Littman L. Correction: Parent reports of adolescents and young adults perceived to show signs of a rapid onset of gender dysphoria. PLoS ONE. 2019 Mar 19;14(3):e0214157.

62. Russell ST, Pollitt AM, Li G, Grossman AH. Chosen Name Use Is Linked to Reduced Depressive Symptoms, Suicidal Ideation, and Suicidal Behavior Among Transgender Youth. Journal of Adolescent Health. 2018 Oct;63(4):503–5.

63. Dolan IJ, Strauss P, Winter S, Lin A. Misgendering and experiences of stigma in health care settings for transgender people. Medical Journal of Australia. 2020 Mar;212(4):150.

64. Glick JL, Theall KP, Andrinopoulos KM, Kendall C. The Role of Discrimination in Care Postponement Among Trans-Feminine Individuals in the U.S. National Transgender Discrimination Survey. LGBT Health. 2018 Apr;5(3):171–9.

65. Ding JM, Ehrenfeld JM, Edmiston EK, Eckstrand K, Beach LB. A Model for Improving Health Care Quality for Transgender and Gender Nonconforming Patients. The Joint Commission Journal on Quality and Patient Safety. 2020 Jan;46(1):37–43.

66. Jaffe S. LGBTQ discrimination in US health care under scrutiny. The Lancet. 2020 Jun;395(10242):1961.

67. Ruben MA, Livingston NA, Berke DS, Matza AR, Shipherd JC. Lesbian, Gay, Bisexual, and Transgender Veterans’ Experiences of Discrimination in Health Care and Their Relation to Health Outcomes: A Pilot Study Examining the Moderating Role of Provider Communication. Health Equity. 2019 Sep 1;3(1):480–8.


68. Raad J, Cruz C. A Survey on Ontology Evaluation Methods. In: Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management [Internet]. Lisbon, Portugal: SCITEPRESS - Science and Technology Publications; 2015 [cited 2021 Feb 12]. p. 179–86. Available from: http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0005591001790186

69. Sauro J. Measuring Usability with the System Usability Scale (SUS) [Internet]. MeasuringU; 2011 [cited 2021 Jan 18]. Available from: https://measuringu.com/sus/

70. Ma X, Fu L, West P, Fox P. Ontology Usability Scale: Context-aware Metrics for the Effectiveness, Efficiency and Satisfaction of Ontology Uses. Data Science Journal. 17:10.

71. The OBI Consortium, Smith B, Ashburner M, Rosse C, Bard J, Bug W, et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007 Nov;25(11):1251–5.

72. Amith M, Manion F, Liang C, Harris M, Wang D, He Y, et al. Architecture and usability of OntoKeeper, an ontology evaluation tool. BMC Med Inform Decis Mak. 2019 Aug;19(S4):152.

73. Manouselis N, Sicilia MA, Rodríguez D. Exploring ontology metrics in the biomedical domain. Procedia Computer Science. 2010 May;1(1):2319–28.

74. Poveda-Villalón M, Gómez-Pérez A, Suárez-Figueroa MC. OOPS! (OntOlogy Pitfall Scanner!): An On-line Tool for Ontology Evaluation. International Journal on Semantic Web and Information Systems. 2014 Apr;10(2):7–34.

75. Nielsen J. How Many Test Users in a Usability Study? [Internet]. Nielsen Norman Group; 2012 [cited 2021 Mar 1]. Available from: https://www.nngroup.com/articles/how-many-test-users/

76. Thenmalar S, Geetha TV. Enhanced ontology-based indexing and searching. Aslib Journal of Info Mgmt. 2014 Nov 11;66(6):678–96.

77. Barros FA, Gonçalves PF, Santos TLVL. Providing Context to Web Searches: The Use of Ontologies to Enhance Search Engine’s Accuracy. J Braz Comp Soc. 1998 Nov;5(2):00–00.

78. Wanta JW, Unger CA. Review of the Transgender Literature: Where Do We Go from Here? Transgender Health. 2017 Dec;2(1):119–28.

79. Lynch KE, Alba PR, Patterson OV, Viernes B, Coronado G, DuVall SL. The Utility of Clinical Notes for Sexual Minority Health Research. American Journal of Preventive Medicine. 2020 Nov;59(5):755–63.

80. Conard L. Personal Correspondence.

81. Vital Statistics Fiscal Year 2019 Summary.

82. Foer D, Rubins DM, Almazan A, Chan K, Bates DW, Hamnvik O-PR. Challenges with Accuracy of Gender Fields in Identifying Transgender Patients in Electronic Health Records. J GEN INTERN MED. 2020 Dec;35(12):3724–5.


83. Greenstein S, Zhu F. Is Wikipedia Biased? American Economic Review. 2012 May 1;102(3):343–8.

84. Greenstein S, Zhu F. Do Experts or Crowd-Based Models Produce More Bias? Evidence from Encyclopedia Britannica and Wikipedia. MISQ. 2018 Mar 3;42(3):945–59.

85. Hill BM, Shaw A. The Wikipedia Gender Gap Revisited: Characterizing Survey Response Bias with Propensity Score Estimation. Sánchez A, editor. PLoS ONE. 2013 Jun 26;8(6):e65782.

86. Wikipedia Editors Study: Results from the Editor Survey, April 2011 [Internet]. 2011 [cited 2020 Dec 14]. Available from: https://upload.wikimedia.org/wikipedia/commons/7/76/Editor_Survey_Report_-_April_2011.pdf

87. Tai J. Cultural Humility as a Framework for Anti-Oppressive Archival Description. Journal of Critical Library and Information Studies. 2020 Oct 1;3.

88. Kahane S, Stutz E, Aliarzadeh B. Must we appear to be all-knowing?: patients’ and family physicians’ perspectives on information seeking during consultations. Can Fam Physician. 2011 Jun;57(6):e228-236.

89. Krouss M, Alshaikh J, Croft L, Morgan DJ. Improving Incident Reporting Among Physician Trainees. Journal of Patient Safety. 2019 Dec;15(4):308–10.

90. Noordewier MK, van Dijk E. Curiosity and time: from not knowing to almost knowing. Cognition and Emotion. 2017 Apr 3;31(3):411–21.

91. Singer RB, Crane B, Lemay EP, Omary S. Improving the Knowledge, Attitudes, and Behavioral Intentions of Perinatal Care Providers Toward Childbearing Individuals Identifying as LGBTQ: A Quasi-Experimental Study. J Contin Educ Nurs. 2019 Jul 1;50(7):303–12.


APPENDICES

Appendix A: Rule Structures

General

For various components of the GSSO’s analytics which concerned textual analysis or knowledge discovery, rule-based structures were necessary to implement, in order to limit noise from over-tagging of pronouns, slang terms, and shortened forms of words (like some abbreviations which could have multiple meanings).

For this reason, all tagging mechanisms eliminated stop words (such as conjunctions), pronouns, and terms shorter than three characters in length. Further, all mechanisms discussed below, whether weighted or unweighted, used match ranking mechanisms based on the different types of annotation properties attributed to strings.
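As a minimal sketch of the filtering step described above, the following Python function drops stop words, pronouns, and tokens shorter than three characters before tagging. The word lists, the tokenizing pattern, and the name candidate_tokens are illustrative assumptions; the lists actually used by the GSSO tagging mechanisms are not reproduced here.

```python
import re

# Illustrative, deliberately tiny word lists (assumptions, not the lists used for the GSSO).
STOP_WORDS = {"and", "but", "or", "the", "of", "for", "with"}
PRONOUNS = {"he", "him", "his", "she", "her", "hers", "they", "them", "theirs"}


def candidate_tokens(text, min_length=3):
    """Return lowercase tokens that survive the three filters described above."""
    # Assumed tokenizer: runs of ASCII letters, optionally joined by an apostrophe or hyphen.
    tokens = re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())
    return [
        token
        for token in tokens
        if len(token) >= min_length
        and token not in STOP_WORDS
        and token not in PRONOUNS
    ]


print(candidate_tokens("She is a trans woman and they saw the MD"))
```

In this toy sentence only "trans", "woman", and "saw" remain as candidates for tagging.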

Research Documentation

A simple, non-weighted algorithm was used for searching in MEDLINE and in the AIDS History Project dataset. This is also the algorithm utilized on the website hosted at CCHMC.

This involves searching text for the following annotations: label, alternate name, short name, has synonym, has broad synonym, has narrow synonym, has related synonym, and replaces. These were preprocessed into three JSON ‘buckets’: primary, secondary, and tertiary. The primary JSON bucket contained labels, alternate names, synonyms, exact synonyms, broad synonyms, and narrow synonyms, while the secondary contained short names, related synonyms, and replaces. The tertiary was created separately, containing instances, descendants, and superclasses.

The creation of JSON buckets was necessary to decrease load times from several minutes to a few seconds. It was implemented primarily in response to concerns from librarians, archivists, and ontologists that the tagging mechanism was too slow.

If the document matched a string in the primary JSON bucket, it was considered a match. Otherwise, the document had to match at least two strings across the secondary and tertiary JSON buckets to be considered a match, and at least one of those strings had to be from the secondary JSON bucket.
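The unweighted rule above can be sketched in Python as follows. This is a hedged illustration rather than the code running on the website: the function name is_match, the dictionary layout, the naive substring test, and the example strings in EXAMPLE_BUCKETS are all assumptions made for demonstration.

```python
# Hypothetical buckets for a single term; the strings are placeholders, not GSSO file contents.
EXAMPLE_BUCKETS = {
    "primary": ["transgender"],
    "secondary": ["transsexual"],
    "tertiary": ["trans woman", "trans man"],
}


def is_match(document, term_buckets):
    """Apply the primary / secondary / tertiary rule to one term's buckets."""
    text = document.lower()

    def hits(bucket):
        # Distinct bucket strings found in the text (simple substring test, an assumption).
        return {s for s in term_buckets.get(bucket, []) if s.lower() in text}

    # Any primary string is sufficient on its own.
    if hits("primary"):
        return True

    # Otherwise require two or more distinct strings, at least one of them secondary.
    secondary, tertiary = hits("secondary"), hits("tertiary")
    return bool(secondary) and len(secondary | tertiary) >= 2


print(is_match("Transsexual in an older note; trans woman in a later note.", EXAMPLE_BUCKETS))
```

Two tertiary strings alone do not satisfy the rule, which is why the secondary set is checked separately from the combined count.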

Medical Documentation

In MIMIC-III and CCHMC, the following weights were used for various types of terms:

Annotation Type     Weight
Label               1.00
Alternate Name      1.00
Descendants         0.90
Individuals         0.85
Derived Terms       0.80
Exact Synonyms      1.00
Broad Synonyms      0.85
Synonyms            0.90
Narrow Synonyms     0.85
Obsoleted Terms     0.70
Related Synonyms    0.60
See Also            0.50
Short Names         0.30


Once a term like ‘transgender’ (http://purl.obolibrary.org/obo/GSSO_000096) was searched, all of these connected terms would be appropriately weighted and searched as well. In addition, weights for sub-annotations were combined multiplicatively (so that a synonym of an obsoleted term would be weighted 0.90 × 0.70 = 0.63). If the annotation or sub-annotation was a link to another term, this scenario was applied one level deeper. For instance, if ‘transgender’ has the obsoleted term ‘transsexual’ and ‘transsexual’ has the related synonym ‘transsexualism’, which itself has an obsoleted term ‘sexual inversion’, then ‘sexual inversion’ would be weighted 0.70 × 0.60 × 0.70 = 0.294.

The weights of all matches in a document were then summed for the final total, so if ‘transsexualism’ (weighted 0.70 × 0.60 = 0.42) and ‘sexual inversion’ appeared in a document, the total score for a search for ‘transgender’ would be 0.294 + 0.42 = 0.714.

This concept was tested with ‘transgender’ in the MIMIC-III dataset, in which we determined a cut-off of 0.6 was appropriate to capture all patients without any false positives.
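The weighted scheme can be illustrated with a short Python sketch that reproduces the worked example for ‘transgender’. The weights come from the table above and the annotation paths from the example in the text; the function names, the whole-word matching, the choice to count each matched string once, and the use of "greater than or equal to" for the cut-off are assumptions.

```python
import re
from math import prod

# Weights from the table above, keyed by annotation type.
WEIGHTS = {
    "label": 1.00, "alternate name": 1.00, "descendant": 0.90,
    "individual": 0.85, "derived term": 0.80, "exact synonym": 1.00,
    "broad synonym": 0.85, "synonym": 0.90, "narrow synonym": 0.85,
    "obsoleted term": 0.70, "related synonym": 0.60, "see also": 0.50,
    "short name": 0.30,
}

# Annotation paths leading away from 'transgender', following the example in the text.
PATHS = {
    "transsexual": ["obsoleted term"],
    "transsexualism": ["obsoleted term", "related synonym"],
    "sexual inversion": ["obsoleted term", "related synonym", "obsoleted term"],
}

CUTOFF = 0.6  # cut-off reported for 'transgender' in MIMIC-III


def path_weight(path):
    """Multiply the weights along a chain of annotations."""
    return prod(WEIGHTS[step] for step in path)


def document_score(document, paths=PATHS):
    """Sum the weights of every connected string found as a whole word or phrase."""
    text = document.lower()
    return sum(
        path_weight(path)
        for phrase, path in paths.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", text)
    )


note = "History of transsexualism, formerly described as sexual inversion."
score = document_score(note)
print(round(score, 3), score >= CUTOFF)
```

Whole-word matching matters in this sketch: without it, ‘transsexual’ would also match inside ‘transsexualism’ and inflate the total beyond the 0.714 given in the text.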

Web Search Interface

The GSSO search mechanism on the CCHMC-hosted website primarily uses the Sørensen–Dice coefficient to compare search strings to strings in the GSSO for ranking (when returned in an ordered manner to the user). This only applies to the approximate and fuzzy searching mechanisms available on the GSSO website. The default search selections are labels, alternate names, and synonyms (which includes all synonym-type annotations). Definitions, obsoleted names, quotes, and sources are also provided as options.

The Sørensen–Dice coefficient compares string bigrams to one another, using the following formula:

$$s = \frac{2 n_t}{n_x + n_y}$$

where $n_t$ is the number of bigrams which appear in both strings, while $n_x$ is the number of bigrams in string x and $n_y$ is the number of bigrams in string y. Fundamentally, the Sørensen–Dice coefficient is similar to the Jaccard index, i.e., the Jaccard index (J) can be calculated from the Sørensen–Dice coefficient (s) as follows:

$$J = \frac{s}{2 - s}$$

Both substructures are programmed into the backend of the GSSO web interface, but only the Sørensen–Dice coefficient is currently accessible, in order to avoid overwhelming casual users.
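For illustration, the two similarity measures can be sketched in Python as below. This is not the website’s backend code: the function names are hypothetical, and the sketch assumes lowercase comparison over sets of distinct character bigrams, under which the conversion from the Sørensen–Dice coefficient to the Jaccard index holds exactly.

```python
def bigrams(string):
    """Return the set of distinct character bigrams in a lowercased string."""
    s = string.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice(x, y):
    """Sørensen–Dice coefficient: twice the shared bigrams over the total bigram count."""
    bx, by = bigrams(x), bigrams(y)
    if not bx and not by:
        return 0.0  # assumed convention for strings too short to have bigrams
    return 2 * len(bx & by) / (len(bx) + len(by))


def jaccard_from_dice(s):
    """Convert a Sørensen–Dice coefficient s into the Jaccard index J = s / (2 - s)."""
    return s / (2 - s)


s = dice("night", "nacht")
print(s, jaccard_from_dice(s))
```

Here "night" and "nacht" each have four bigrams and share only "ht", so s = 2 × 1 / (4 + 4) = 0.25 and J = 0.25 / 1.75 = 1/7, which agrees with computing the Jaccard index directly (one shared bigram out of seven distinct bigrams).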

Appendix B: Online Resources

Transgender-Specific (or LGBTQIA+-Specific) Resources

• Digital Transgender Archive (DTA) (https://www.digitaltransgenderarchive.net/)
• EBSCO LGBT Life (https://www.ebsco.com/products/research-databases/lgbt-life)
• Gender Variance Who’s Who, A (https://zagria.blogspot.com/)
• GLBT Museum & Archives (https://www.glbthistory.org/)
• glbtq Encyclopedia Project (http://www.glbtqarchive.com/)
• Homosaurus (https://homosaurus.org/)
• OutHistory (http://outhistory.org/)
• Transas City (http://transascity.org/)
• Vidensbanken om kønsidentitet (http://www.transviden.dk/)

General Resources

• AnthroSource (https://anthrosource.onlinelibrary.wiley.com/)
• DOAJ (https://doaj.org/)
• ERIC (https://eric.ed.gov/)
• Fulton Search (https://fultonsearch.org/)
• Google Scholar (https://scholar.google.com/)
• HathiTrust Digital Library (HDL) (https://www.hathitrust.org/)
• IEEE Xplore (https://ieeexplore.ieee.org/)
• Internet Archive (https://archive.org/)
• JSTOR (https://www.jstor.org/)
• Newspapers.com (https://www.newspapers.com/)
• Project Gutenberg (https://www.gutenberg.org/)
• PubMed (https://pubmed.ncbi.nlm.nih.gov/)
• ResearchGate (https://www.researchgate.net/)
• ScienceDirect (https://www.sciencedirect.com/)
• Scopus (https://www.scopus.com/)
• Taylor & Francis Online (https://www.tandfonline.com/)
• Web of Science (https://clarivate.com/products/web-of-science/)
• Wikidata (https://www.wikidata.org/)
• Wikipedia (https://www.wikipedia.org/)

Additional Appendices

• GSSO Website @ CCHMC (http://gsso.research.cchmc.org/)
• GSSO @ GitHub (https://github.com/Superraptor/GSSO)
• GSSO @ NCBO BioPortal (https://bioportal.bioontology.org/ontologies/GSSO)
• GSSO @ OntoBee (http://www.ontobee.org/ontology/GSSO)
• GSSO @ Wikidata (https://www.wikidata.org/wiki/Q97063846)
• GSSO @ Identifiers.org (https://registry.identifiers.org/registry/gsso)
• GSSO @ EMBL-EBI OLS (https://www.ebi.ac.uk/ols/ontologies/gsso)
• Review of Headings and Terms Related to Gender, Sex, and Sexual Orientation (https://rb.gy/qhlsb3)
• Transgender Bibliography @ GitHub (https://github.com/Superraptor/transgender_bibliography)

Other Documents

• HL7 Informative Document: Gender Harmony – Modeling Sex and Gender Representation, Release 1 (https://rb.gy/0vq3xx)
• Suggested Tables for Sex/Gender-Related Documentation (https://rb.gy/lopbty)

Appendix C: Survey Instruments

Demographic Survey

All survey questions also included options for “other (please specify)” and “prefer not to respond”. We excluded persons under the age of 18 via an introductory question which blocked off the remainder of the survey.

1. What is your age? (Choose one)
   a. 18 to 20
   b. 21 to 29
   c. 30 to 39
   d. 40 to 49
   e. 50 to 59
   f. 60 to 65
2. What is your race/ethnicity? (Choose all that apply)
   a. White/Caucasian
   b. Black/Of African Descent/Afro-American/Afro-European/Afro-Caribbean
   c. American Indian/First Nation/Alaska Native
   d. Asian
   e. Pacific Islander/Native Hawaiian
   f. Multiracial/Multiple Racial or Ethnic Backgrounds
3. What is your highest attained level of education? (Choose one)
   a. Some High School, Less than High School, or No Formal Education/Schooling
   b. High School Graduate or Equivalent
   c. Some College/University
   d. Associate’s and/or Bachelor’s Degree
   e. Master’s Degree
   f. Doctoral and/or Professional Degree
4. What is your gender? (Choose all that apply)
   a. Female
   b. Male
   c. Nonbinary
   d. Agender/Genderless/Gender Neutral/Neutrois
   e. Genderfluid/Genderflux
   f. Androgyne
   g. Demigender/Demigirl/Demiboy
   h. Bigender/Trigender/Polygender
   i. Intergender
   j. Questioning
5. What are your pronouns? (Choose all that apply)
   a. He/Him/His
   b. She/Her/Hers
   c. They/Them/Theirs
   d. Xe/Xem/Xers
   e. E/Em/Ers
   f. Ze/Hir/Hirs
6. How would you describe your sexual orientation? (Choose one)
   a. Heterosexual/Straight/Not Gay
   b. Homosexual/Gay
   c. Bisexual/Bisexual+/Pansexual/Polysexual/Omnisexual
   d. Asexual/Gray Asexual/Demisexual/Asexual Spectrum
   e. Questioning
7. Do you consider yourself transgender? (Choose one)
   a. Yes
   b. No
   c. Questioning
8. Do you consider yourself intersex? (Choose one)
   a. Yes
   b. No
9. Do you consider yourself neurodiverse or neuroatypical? (Choose one)
   a. Yes
   b. No
10. Do you currently or have you ever worked in a health care-related profession (e.g., physician, nurse, nurse practitioner, or related)? (Choose one)
   a. Yes
   b. No

For individuals who answered “Yes” to question 7, we asked:

• Do you feel like medical professionals you’ve encountered in the past had gaps in their knowledge regarding transgender health? (Select one)
   o Yes, most had gaps
   o Yes, some had gaps
   o Yes, a few had gaps
   o Unsure if they had gaps
   o No, most had no gaps
   o No, none had gaps

If the individual indicated there were gaps, we then asked a free-response question:

• If you’re comfortable telling us, what were some of those gaps? What knowledge would’ve made you feel more comfortable in receiving care?

For individuals who answered “Yes” to question 10, we asked:

• Do you feel like you have gaps in your knowledge of LGBTQIA+ health? (Select one)
   o Yes, most had gaps
   o Yes, some had gaps
   o Yes, a few had gaps
   o Unsure if they had gaps
   o No, most had no gaps
   o No, none had gaps

If the individual indicated they had gaps, we then asked a free-response question:

• What are some of those gaps, particularly those involving transgender health?

Ontology Usability Scale (OUS)

1. I think the documentation provides sufficient examples for me to make sure how to use the ontology.
2. The purpose of this ontology is clear.
3. I found the concepts and relations in this ontology properly described in natural language.
4. I think the relations in this ontology relate appropriate concepts.
5. I am confident I understand the conceptualization of the ontology.
6. I would imagine that most domain experts would understand this ontology very quickly.
7. I think the attributes in this ontology describe the concepts well.
8. I found the subclasses in this ontology are properly defined.
9. I found the formal specification of concepts and relations in this ontology coincides with their descriptions in natural language.
10. I do not need the support of a person experienced with this ontology to be able to use it.

System Usability Scale (SUS)

1. I think that I would like to use this website frequently.
2. I found this website unnecessarily complex.
3. I thought this website was easy to use.
4. I think that I would need assistance to be able to use this website.
5. I found the various functions in this website were well integrated.
6. I thought there was too much inconsistency in this website.
7. I would imagine that most people would learn to use this website very quickly.
8. I found this website very cumbersome/awkward to use.
9. I felt very confident using this website.
10. I needed to learn a lot of things before I could get going with this website.

Appendix D: Necessary Mappings

Mapping to MeSH

For the comparisons made in Table 8, the following mappings were made from HRC terms to GSSO terms:

• androgynous → androgynous gender expression
• gender non-conforming → gender nonconforming
• gender-expansive → gender variance
• sex assigned at birth → sex at birth [19]
• gay → gay man
• non-binary → gender nonbinary
• gender-fluid → fluctuating gender identity
• same-gender loving → same gender loving

For the mappings from HRC terms to MeSH we used:

• bisexual →
• gay → Homosexuality, Male
• intersex → Disorders of Sex Development
• lesbian → Homosexuality, Female
• transgender → Transgender Persons + Transsexualism

For keywords in MEDLINE, the only mapping necessary was:

• androgynous → androgyny

Mapping Racial Categories

To compare the CCHMC data to the categories in the Williams Institute data, we mapped the following:

19 This was later shifted to “assigned sex at birth” in the GSSO.

| Racial Category in CCHMC | Closest Mapping from Williams Institute Data | CCHMC Patient Count |
|---|---|---|
| White | White, non-Hispanic | 1,688 |
| Black or African American | African-American or Black, non-Hispanic | 148 |
| Unknown | None | 31 |
| White, Black or African American | Other Race or Ethnicity, non-Hispanic | 29 |
| Black or African American, White | Other Race or Ethnicity, non-Hispanic | 28 |
| Asian | Other Race or Ethnicity, non-Hispanic | 16 |
| Hispanic/Latino | Hispanic or Latino | 9 |
| Other | Other Race or Ethnicity, non-Hispanic | 9 |
| Asian, White | Other Race or Ethnicity, non-Hispanic | 7 |
| Not Recorded | None | 7 |
| Patient Refused | None | 3 |
| White, American Indian and Alaska Native | Other Race or Ethnicity, non-Hispanic | 3 |
| American Indian and Alaska Native, White | Other Race or Ethnicity, non-Hispanic | 2 |
| White, Asian | Other Race or Ethnicity, non-Hispanic | 2 |
| Other, White | Other Race or Ethnicity, non-Hispanic | 2 |
| Native Hawaiian and Other Pacific Islander | Other Race or Ethnicity, non-Hispanic | 2 |
| American Indian and Alaska Native | Other Race or Ethnicity, non-Hispanic | 1 |
| American Indian and Alaska Native, Black or African American | Other Race or Ethnicity, non-Hispanic | 1 |
| Asian, Black or African American | Other Race or Ethnicity, non-Hispanic | 1 |
| Black or African American, American Indian and Alaska Native, White | Other Race or Ethnicity, non-Hispanic | 1 |
| Black or African American, Asian | Other Race or Ethnicity, non-Hispanic | 1 |
| Black or African American, Hispanic/Latino | Hispanic or Latino | 1 |
| Black or African American, White, American Indian and Alaska Native | Other Race or Ethnicity, non-Hispanic | 1 |
| Hispanic/Latino, Black or African American, White | Hispanic or Latino | 1 |
| Hispanic/Latino, White | Hispanic or Latino | 1 |
| Middle Eastern | Other Race or Ethnicity, non-Hispanic | 1 |
| Native Hawaiian and Other Pacific Islander, Asian | Other Race or Ethnicity, non-Hispanic | 1 |
| Native Hawaiian and Other Pacific Islander, White | Other Race or Ethnicity, non-Hispanic | 1 |
| Native Hawaiian and Other Pacific Islander, White, Hispanic/Latino | Hispanic or Latino | 1 |
| No Parent/Primary Caregiver Present | None | 1 |
| Unknown, White | Other Race or Ethnicity, non-Hispanic | 1 |
| White, Hispanic/Latino | Hispanic or Latino | 1 |
| White, Hispanic/Latino, Asian | Hispanic or Latino | 1 |
| Total | | 2,003 |


This leads us to the following totals:

| Racial/Ethnic Group | Williams Institute Group Percentage (Transgender Population) | CCHMC Percentage, Transgender Population (ICD Identification) | CCHMC Patients (ICD Identification) |
|---|---|---|---|
| White, non-Hispanic | 55% | 86% | 1,688 |
| African-American or Black, non-Hispanic | 16% | 8% | 148 |
| Hispanic or Latino | 21% | 1% | 15 |
| Other Race or Ethnicity, non-Hispanic | 8% | 6% | 110 |
| Total | | | 1,961 |
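The category totals and CCHMC percentages can be reproduced by summing the patient counts for each mapped category and dividing by the number of patients with a usable mapping. The following is a minimal sketch in Python, not the original analysis code. The counts are transcribed from the mapping table in this appendix. The variable names, and the assumptions that rows mapped to "None" were excluded from the denominator and that percentages were rounded to the nearest whole percent, are ours.

```python
# Patient counts of every CCHMC row, grouped by the Williams Institute
# category it was mapped to (transcribed from the mapping table).
MAPPED_COUNTS = {
    "White, non-Hispanic": [1688],
    "African-American or Black, non-Hispanic": [148],
    "Hispanic or Latino": [9, 1, 1, 1, 1, 1, 1],
    "Other Race or Ethnicity, non-Hispanic": [29, 28, 16, 9, 7, 3, 2, 2, 2, 2] + [1] * 10,
    "None": [31, 7, 3, 1],  # Unknown, Not Recorded, Patient Refused, No Parent/Primary Caregiver Present
}

# Sum the rows within each mapped category.
totals = {category: sum(counts) for category, counts in MAPPED_COUNTS.items()}

# All patients, and patients with a usable mapping (assumption: "None" rows are excluded).
grand_total = sum(totals.values())
mapped_total = grand_total - totals["None"]

# Percentage of mapped patients per category, rounded to the nearest whole percent.
percentages = {
    category: round(100 * count / mapped_total)
    for category, count in totals.items()
    if category != "None"
}

for category, pct in percentages.items():
    print(f"{category}: {totals[category]} patients, {pct}%")
print(f"Mapped total: {mapped_total} of {grand_total}")
```

Under these assumptions, the sketch yields the same group counts (1,688, 148, 15, and 110) and the same mapped total (1,961) as reported above.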

The FY2019 race breakdown for these groups at CCHMC is shown here. The values are approximate, calculated on the basis that the race data was consistent in reports from 2012 to 2019:

| Racial/Ethnic Group | CCHMC Raw # (of trans persons) | CCHMC Percentage (of all trans persons) | CCHMC Theoretical Raw Based on FY2019 | CCHMC Percentages Based on FY2019 | Predicted % Value of Trans Persons in Group |
|---|---|---|---|---|---|
| White, non-Hispanic | 1,688 | 86% | 628,814 | 67.40% | 0.27% |
| African-American or Black, non-Hispanic | 148 | 8% | 176,329 | 18.90% | 0.08% |
| Hispanic or Latino | 15 | 1% | N/A | N/A | N/A |
| Other Race or Ethnicity, non-Hispanic | 110 | 6% | 25,563 | 2.74% | 0.43% |
| Totals | 2,003 | | 932,959 | | |
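The last two columns of this table appear to follow directly from the raw counts. The sketch below, in Python, assumes that "CCHMC Percentages Based on FY2019" is each group's share of the 932,959 total and that "Predicted % Value of Trans Persons in Group" is the trans patient count divided by the group's FY2019 count. Both readings are our inference from the reported numbers, not a method stated in the text, and the function names are illustrative.

```python
# (trans patient count, theoretical FY2019 group count) per group, from the table.
GROUPS = {
    "White, non-Hispanic": (1688, 628_814),
    "African-American or Black, non-Hispanic": (148, 176_329),
    "Other Race or Ethnicity, non-Hispanic": (110, 25_563),
}
FY2019_TOTAL = 932_959  # total reported in the table


def share_of_total(group_count: int, total: int) -> float:
    """Group's share of all FY2019 patients, in percent (2 decimals)."""
    return round(100 * group_count / total, 2)


def predicted_trans_percent(trans_count: int, group_count: int) -> float:
    """Trans patients as a percentage of the group's FY2019 count (2 decimals)."""
    return round(100 * trans_count / group_count, 2)


for name, (trans_count, group_count) in GROUPS.items():
    print(
        name,
        share_of_total(group_count, FY2019_TOTAL),
        predicted_trans_percent(trans_count, group_count),
    )
```

The Hispanic or Latino group is omitted because the table reports N/A for its FY2019 values.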


Appendix E: Additional Statistical Calculations

In both research and medical documentation, we utilized precision, recall, and F-score values for comparison. In the research documentation, precision was calculated as:

\[ \text{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|} \]

Recall was calculated using:

\[ \text{recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|} \]

Therefore, if an article was returned by the algorithm and was about being transgender, it was considered a relevant document, whereas the total number of documents returned by that algorithm was considered the number of retrieved documents.

In the medical documentation, which was a classification algorithm task (rather than a searching algorithm task), we calculated precision as:

\[ \text{precision} = \frac{\text{No. True Positives}}{\text{No. True Positives} + \text{No. False Positives}} \]

Recall was then defined as:

\[ \text{recall} = \frac{\text{No. True Positives}}{\text{No. True Positives} + \text{No. False Negatives}} \]

It is of note for some of our comparisons that recall in this sense is sometimes referred to as the true positive rate (TPR) or sensitivity. Precision may also be called the positive predictive value (PPV).

In both medical and research documentation, the F-score (also known as the F-measure or F1 score in the literature) was calculated as:

\[ F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]

Another way to write this formula, which was used in some cases, is:

\[ F_1 = \frac{\text{No. True Positives}}{\text{No. True Positives} + \frac{1}{2}\left(\text{No. False Positives} + \text{No. False Negatives}\right)} \]

In all cases, precision was considered to be a measure of the exactness of a classifier or search algorithm. Recall was used as a measure of completeness, and the F-score was considered a balance between precision and recall values.
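The definitions above translate directly into code. The following Python sketch is illustrative only and is not the code used in the dissertation. The function names, document identifiers, and counts are hypothetical, and no guard against empty denominators is included.

```python
import math


def retrieval_precision(relevant: set, retrieved: set) -> float:
    """Share of retrieved documents that are relevant (search task)."""
    return len(relevant & retrieved) / len(retrieved)


def retrieval_recall(relevant: set, retrieved: set) -> float:
    """Share of relevant documents that were retrieved (search task)."""
    return len(relevant & retrieved) / len(relevant)


def precision(tp: int, fp: int) -> float:
    """Positive predictive value (classification task)."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """True positive rate, also called sensitivity (classification task)."""
    return tp / (tp + fn)


def f_score(p: float, r: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def f_score_from_counts(tp: int, fp: int, fn: int) -> float:
    """Equivalent F1 form written directly in terms of counts."""
    return tp / (tp + 0.5 * (fp + fn))


# Hypothetical search example: four relevant documents, three retrieved.
relevant = {"doc1", "doc2", "doc3", "doc4"}
retrieved = {"doc2", "doc3", "doc5"}
print(retrieval_precision(relevant, retrieved), retrieval_recall(relevant, retrieved))

# Hypothetical classification counts, chosen only for illustration.
tp, fp, fn = 80, 20, 10
f_a = f_score(precision(tp, fp), recall(tp, fn))
f_b = f_score_from_counts(tp, fp, fn)
print(f_a, f_b)
assert math.isclose(f_a, f_b)  # the two F1 forms agree
```

The final assertion checks numerically that the two ways of writing the F-score give the same value.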


Some literature we compared (such as Foer et al.) reported only sensitivity and specificity. Sensitivity is a synonym for recall, but we needed to transform these values to ascertain precision (and therefore F-scores). For these, we took the paper’s estimated prevalence of 0.89% and formulated an approximate TPR using:

\[ \text{TPR} = \text{Prevalence} \times \text{Sensitivity} \]

From there, we calculated the false negative rate (FNR) as:

\[ \text{FNR} = \text{Prevalence} \times (1 - \text{Sensitivity}) \]

The true negative rate (TNR) and false positive rate (FPR) were then:

\[ \text{TNR} = (1 - \text{Prevalence}) \times \text{Specificity} \]

\[ \text{FPR} = (1 - \text{Prevalence}) \times (1 - \text{Specificity}) \]

Precision and F-score were then estimated as:

\[ \text{Precision} = \frac{\text{TPR}}{\text{TPR} + \text{FPR}} \]

\[ F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
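The transformation above can be sketched in a few lines of Python. Only the prevalence of 0.89% comes from the text. The sensitivity and specificity values below are hypothetical placeholders, not figures from Foer et al., and the function name is ours.

```python
import math


def estimate_precision_and_f(sensitivity: float, specificity: float, prevalence: float):
    """Estimate precision and F-score from sensitivity, specificity, and prevalence.

    TPR, FNR, TNR, and FPR are the prevalence-weighted quantities defined in the
    text, so the four of them sum to 1.
    """
    tpr = prevalence * sensitivity
    fnr = prevalence * (1 - sensitivity)
    tnr = (1 - prevalence) * specificity
    fpr = (1 - prevalence) * (1 - specificity)
    precision = tpr / (tpr + fpr)
    recall = sensitivity  # sensitivity is a synonym for recall
    f_score = 2 * precision * recall / (precision + recall)
    return {"TPR": tpr, "FNR": fnr, "TNR": tnr, "FPR": fpr,
            "precision": precision, "F": f_score}


PREVALENCE = 0.0089  # 0.89%, the estimated prevalence cited in the text

# Hypothetical sensitivity and specificity, for illustration only.
result = estimate_precision_and_f(sensitivity=0.90, specificity=0.99, prevalence=PREVALENCE)
for name, value in result.items():
    print(f"{name}: {value:.4f}")

# Sanity check: the four weighted rates partition the whole population.
assert math.isclose(result["TPR"] + result["FNR"] + result["TNR"] + result["FPR"], 1.0)
```

Because the prevalence is low, even a small false positive rate noticeably reduces the estimated precision, which is why specificity matters so much in this conversion.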
