Hematopathology / DATABASE FOR IMMUNOPHENOTYPING BY FLOW CYTOMETRY

A Relational Database for Diagnosis of Hematopoietic Neoplasms Using Immunophenotyping by Flow Cytometry

Andy N.D. Nguyen, MD,1 John D. Milam, MD,1 Kathy A. Johnson, PhD,2 and Eugenio I. Banez, MD3

Key Words: Hematopoietic neoplasms; Relational database; Immunophenotyping; Flow cytometry Downloaded from https://academic.oup.com/ajcp/article/113/1/95/1757626 by guest on 28 September 2021

© American Society of Clinical Pathologists. Am J Clin Pathol 2000;113:95–106

Abstract

A relational database was developed to facilitate the diagnosis of hematopoietic neoplasms using results of immunophenotyping by flow cytometry. This database runs on personal computers and uses backward-chaining search to arrive at conclusions. Results of immunologic marker studies are processed by the database to obtain a set of differential diagnoses. The current version of this database includes diagnostic immunophenotyping patterns for 33 hematopoietic neoplasms. We tested this database using 92 clinical cases from 2 tertiary care medical centers. The database ranked the actual diagnosis as 1 of the top 5 differential diagnoses in 93% of the cases tested. The user can modify the database contents to suit individual needs. This database has been posted on the World Wide Web for direct access. We propose that this user-friendly database is a potential tool for computer-assisted diagnosis of hematopoietic neoplasms.

Immunophenotyping has become one of the essential methods for proper classification of hematopoietic neoplasms. Flow cytometry used in immunophenotyping has no doubt added a new dimension to the diagnosis of leukemia and lymphoma.1,2 A wide range of monoclonal antibodies is available to recognize various hematopoietic cells based on their surface and cytoplasmic antigens.1-4 Leukemic and lymphoma cells usually cannot be detected with a single immunologic marker. Instead, the use of a monoclonal antibody panel consisting of multiple antibodies is required for supporting the provisional diagnosis based on histologic findings.1,2 Since many hematologic neoplasms demonstrate similar patterns of immunophenotyping,2-4 their diagnosis often presents a challenge to pathologists who interpret flow cytometric data. This is particularly applicable to practicing pathologists who are not subspecialized in hematopathology and to pathology residents-in-training. As the number of immunologic markers used in flow cytometry increases, a systematic approach for interpretation of marker results also is essential for consistent classification of neoplasms.5

The interpretation of immunophenotyping results by flow cytometry involves pattern recognition of different hematopoietic neoplasms that may have similar immunologic marker patterns. Each neoplasm is associated with a particular pattern characterized by the presence of certain markers and the absence of others.2-4 The numerous markers available in the flow cytometry laboratory make these patterns difficult to remember, especially for those of uncommon neoplasms.2-4 Another factor that hinders the interpretation process is the lack of consistency in marker results for any neoplasm.2-4 A certain marker may be positive (or negative) for a certain neoplasm in most of the cases. However, exceptions often are seen. For this reason, an absolute diagnostic pattern usually is not available for any neoplasm. Instead, the diagnostic approach is to seek a neoplasm that closely matches the given marker results.2 Since the immunophenotyping pattern of hematopoietic neoplasms can be described easily in terms of the presence or absence of markers contained in separate fields, a database is a logical approach to facilitate the interpretation of marker results. In addition, a decision-support system that readily displays the attributes associated with a specific neoplasm would be convenient and useful to pathologists who encounter the more uncommon diseases from time to time.

Several computer programs have been developed for the analysis of immunophenotyping results in malignant lymphoma and leukemia.6-10 In the present study, we designed and evaluated a relational database for interpretation of immunophenotyping by flow cytometry. The database is designed to help the user follow closely how the suggested diagnoses are determined. The database also permits the user to modify the contents of the database to suit individual preferences.

Materials and Methods

The first phase of our study involved the development of the database, CD Marker. In the second phase, the database was validated by using previously interpreted cases at our affiliated hospitals.

Relational Database

A database is a computer program that stores and retrieves information based on a defined relationship.11 By using a database, one can organize data according to subjects so that the data will be easy to track and verify. Data in a database are stored in tables. A table is a collection of data on a particular subject. Data in a table are presented in columns (called fields) and rows (called records). Typical examples of fields and records in a database for hematopoietic neoplasms are disease attribute (such as immunologic marker result) and type of neoplasm, respectively. In a relational database, one also can store information on how different subjects are related to facilitate compiling the associated data. This information is essentially an expression that links records in one table with records in another. Analysis of data in a database can be performed conveniently with a query.11 A query is a description of sets of records that meet certain criteria defined by the user. In a database for hematopoietic neoplasms, for example, a query may be used to look for neoplasms with marker results that match those of the case under consideration.

Relational databases with comprehensive content and efficient query mechanisms can perform effectively as decision-support systems to help users solve complicated problems in laboratory diagnosis.12-14 Such systems have 2 major elements: the development environment and the consultation environment ❚Figure 1❚. The development environment is used by the database builder to construct the components and to enter information into the knowledge base. The consultation environment is accessed by the user to obtain recommendations. The following components are seen in our database:

1. Knowledge base. This component contains the knowledge required for formulating and solving problems. It contains facts in the domain area and rules that direct the use of facts to diagnose specific disorders. The diagnostic criteria are the facts, also known as attributes, that are necessary to confirm a certain disorder. Potential sources of knowledge include human experts and literature.
2. Workplace. This component is an area of the computer’s working memory for the description of a current problem, as specified by the input data. The workplace also records intermediate conclusions.
3. Inference engine. The brain of the database, this component provides the method for query by using information in the knowledge base and in the workplace to formulate conclusions.
4. User interface. This component allows communication between the user and the database. The user uses this interface to input data (positive and negative attributes found in a patient), to obtain the results, and also to revise contents of the database as needed. This communication interface usually is in a graphics format for ease of use (graphic user interface).

Database CD Marker

CD Marker, a relational database for interpretation of immunophenotyping by flow cytometry, was implemented in Microsoft Access 97 (Microsoft, Redmond, WA) and was run on an IBM-compatible computer under Microsoft Windows 95. The inference engine of CD Marker was built with the Microsoft Visual Basic for Applications language. This inference engine uses a backward-chaining search strategy to draw conclusions.15,16 A total of 33 hematopoietic neoplasms were included in the database ❚Table 1❚. The diagnostic criteria for different neoplasms were based on the pattern of immunologic marker results.1-4 The marker result was designated as positive (or negative) for a neoplasm if more than 80% of the cases were found to be positive (or negative) for that marker.4 A total of 42 immunologic markers were included in CD Marker ❚Table 2❚. The current marker panel includes only the markers deemed to be used most commonly. As other markers become more extensively used with their added value in diagnosis, they will be incorporated into the database. A complete listing of immunophenotype for each neoplasm is


included in ❚Table 3❚. ❚Figure 2❚ shows the main menu of the database with 3 modules:

1. Consultation for differential diagnosis
2. Look up or revise a disorder in the database
3. Look up or revise a marker in the database

The graphic user interface for the differential diagnosis module is shown in ❚Figure 3❚. All the data that are available on marker results should be entered for the case under consideration. Lack of information in certain data fields does not prevent CD Marker from processing the data. However, the accuracy of the suggested diagnosis would be compromised if results of important markers were left out. A list of differential diagnoses is provided by CD Marker with each set of input data. Each differential diagnosis has an assigned matching factor (MF) value. The MF value for a neoplasm reflects how well its immunophenotyping pattern matches the marker data for a given case. This factor is defined as (Equation 1):

MF = M/(M + N)

where MF is the matching factor for a particular neoplasm (0 ≤ MF ≤ 1); M, the number of attributes of a neoplasm that match the input data; and N, the number of attributes of a neoplasm that do not match the input data.

Note that the value of MF, as defined by Equation 1, reflects only the similarity between the attributes of a neoplasm and the available data. A high value of MF for a neoplasm, such as 1, does not exclude the possibility that more data input with increased value of N actually may decrease its MF value.

❚Figure 1❚ Structure of a relational database used as a decision-support system. Diagram labels: Consultation, Development, User, User interface, Workplace, Inference engine, Knowledge base (facts, rules).

❚Table 1❚ Hematopoietic Neoplasms in CD Marker Database

Follicular small cleaved cell lymphoma
Mantle cell lymphoma
Large B-cell lymphoma
Mediastinal B-cell lymphoma
B-cell lymphoma, unclassifiable, mixed small and large cells
Lymphoplasmacytoid lymphoma
Marginal cell lymphoma
Burkitt lymphoma/ALL, L3
Splenic lymphoma with villous lymphocytes
Peripheral T-cell lymphoma
Lymphoblastic lymphoma (T cell)
Thymoma
Chronic lymphocytic leukemia (B cell)/small lymphocytic lymphoma
Chronic lymphocytic leukemia (T cell)
Prolymphocytic leukemia (B cell)
Hairy cell leukemia
Sézary syndrome/mycosis fungoides
Adult T-cell leukemia/lymphoma
ALL (T-cell precursor)
Acute myeloblastic leukemia without maturation, M1
Acute myeloblastic leukemia with maturation, M2
Acute promyelocytic leukemia, M3
Acute myelomonocytic leukemia, M4
Acute monoblastic leukemia, M5
Acute erythroleukemia, M6
Acute megakaryoblastic leukemia, M7
Biphenotypic acute leukemia, AML and T-cell ALL
Biphenotypic acute leukemia, AML and B-cell precursor ALL
Large granular lymphocyte leukemia, NK cell
Large granular lymphoproliferative disorder, T cell
ALL (early-B precursor)
ALL (CALLA)
ALL (pre-B)

ALL, acute lymphoblastic leukemia; AML, acute myeloblastic leukemia; NK, natural killer; CALLA, common acute lymphoblastic leukemia antigen.

❚Table 2❚ Immunologic Markers in CD Marker Database

CD1      CD33
CD2      CD38
CD3      CD41
CD4      CD42
CD5      CD43
CD7      CD45
CD8      CD56
CD10     CD57
CD11b    CD61
CD11c    CD71
CD13     CD77
CD14     CD79a
CD15     CD103
CD16     HLA-DR
CD19     sIg
CD20     cIg
CD21     PC-1
CD22     TdT
CD23     FMC-7
CD24     Glycophorin A
CD25     Cytokeratin

cIg, cytoplasmic immunoglobulin; sIg, surface immunoglobulin; TdT, terminal deoxynucleotidyl transferase.

To demonstrate how the knowledge base of CD Marker is implemented, the diagnostic criteria for chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL)2-4 are given as an example:


❚Table 3❚ Immunophenotype for Each Disorder1-4

Each disorder is listed with its positive and negative markers.

Follicular small cleaved cell lymphoma
  Positive: CD10, CD19, CD20, CD21, CD22, CD24, HLA-DR, sIg
  Negative: CD2, CD3, CD4, CD5, CD7, CD8, CD11c, CD23, CD25, CD43, cIg
Mantle cell lymphoma
  Positive: CD5, CD19, CD20, CD22, CD24, CD43, HLA-DR, sIg, CD5/CD19 or CD5/CD20
  Negative: CD11c, CD23, cIg
Large B-cell lymphoma
  Positive: CD19, CD20, CD22, CD79a, HLA-DR
  Negative: CD2, CD3, CD5, CD7
Mediastinal B-cell lymphoma
  Positive: CD19, CD20, CD22, CD79a, PC-1
  Negative: CD2, CD3, CD5, CD7, CD10, CD15, CD21, sIg, TdT
B-cell lymphoma, unclassifiable, mixed small and large cells
  Positive: CD19, CD20, CD22, CD79a, HLA-DR, sIg
  Negative: CD2, CD3, CD5, CD7
Lymphoplasmacytoid lymphoma
  Positive: CD19, CD20, CD22, CD79a, sIg, cIg
  Negative: CD5, CD10, CD23
Marginal cell lymphoma
  Positive: CD19, CD20, CD22, CD79a, sIg
  Negative: CD5, CD10, CD23
Burkitt lymphoma/ALL, L3
  Positive: CD10, CD19, CD20, CD22, CD77, CD79a, HLA-DR, sIg
  Negative: CD5, CD23, TdT
Splenic lymphoma with villous lymphocytes
  Positive: CD19, CD20, CD22, CD79a, HLA-DR, sIg
  Negative: CD5, CD10, CD23, CD25
Peripheral T-cell lymphoma
  Positive: CD2, CD3, CD5, CD7
  Negative: CD19, CD20, CD22, cytokeratin
Lymphoblastic lymphoma (T cell)
  Positive: CD2, CD3, CD5, CD7, CD38, CD71, TdT
  Negative: CD15, CD20, cytokeratin
Thymoma
  Positive: TdT, cytokeratin
  Negative: CD4, CD8, CD45
Chronic lymphocytic leukemia (B cell)/small lymphocytic lymphoma
  Positive: CD5, CD19, CD20, CD21, CD23, CD24, CD43, CD79a, HLA-DR, sIg, CD5/CD19 or CD5/CD20
  Negative: CD2, CD3, CD4, CD7, CD8, CD10, CD25, FMC-7
Chronic lymphocytic leukemia (T cell)
  Positive: CD2, CD3, CD5, CD7
  Negative: CD16, CD25, CD56, CD57, TdT
Prolymphocytic leukemia (B cell)
  Positive: CD19, CD20, CD22, HLA-DR, sIg, FMC-7
  Negative: CD2, CD3, CD4, CD7, CD8, CD10, CD11c, CD25, TdT
Hairy cell leukemia
  Positive: CD11c, CD19, CD20, CD22, CD25, CD79a, CD103, sIg, FMC-7
  Negative: CD2, CD3, CD4, CD5, CD7, CD8, CD10, CD23
Sézary syndrome/mycosis fungoides
  Positive: CD2, CD3, CD4, CD5
  Negative: CD1, CD7, CD8, CD10, CD11c, CD16, CD19, CD20, CD22, CD25, CD56, CD57, sIg, TdT, cytokeratin
Adult T-cell leukemia/lymphoma
  Positive: CD2, CD3, CD4, CD5, CD25, CD38, CD71, HLA-DR
  Negative: CD1, CD7, CD8, CD10, CD11c, CD16, CD19, CD20, CD22, CD56, CD57, sIg, TdT, cytokeratin
ALL (T-cell precursor)
  Positive: CD3, CD7, TdT
  Negative: CD10, CD19, CD20, CD22, HLA-DR, cytokeratin
Acute myeloblastic leukemia without maturation, M1
  Positive: CD13, CD33, HLA-DR
  Negative: CD2, CD3, CD5, CD7, CD11b, CD14, CD15, CD41, CD42, CD61, CD71, sIg, glycophorin A
Acute myeloblastic leukemia with maturation, M2
  Positive: CD13, CD15, CD33, HLA-DR
  Negative: CD2, CD3, CD5, CD7, CD11b, CD14, CD41, CD42, CD61, CD71, sIg, glycophorin A
Acute promyelocytic leukemia, M3
  Positive: CD13, CD15, CD33
  Negative: CD2, CD3, CD5, CD7, CD11b, CD14, CD41, CD42, CD61, CD71, HLA-DR, sIg, glycophorin A
Acute myelomonocytic leukemia, M4
  Positive: CD11b, CD13, CD14, CD15, CD33, HLA-DR
  Negative: CD2, CD3, CD5, CD7, CD41, CD42, CD61, CD71, sIg, glycophorin A
Acute monoblastic leukemia, M5
  Positive: CD11b, CD11c, CD13, CD14, CD15, CD33, HLA-DR
  Negative: CD2, CD3, CD5, CD7, CD41, CD42, CD61, CD71, sIg, glycophorin A
Acute erythroleukemia, M6
  Positive: CD71, glycophorin A
  Negative: CD2, CD3, CD5, CD7, CD11b, CD14, CD41, CD42, CD61, sIg
Acute megakaryoblastic leukemia, M7
  Positive: CD33, CD41, CD42, CD61
  Negative: CD2, CD3, CD5, CD7, CD11b, CD13, CD14, CD15, CD71, sIg, glycophorin A
Biphenotypic acute leukemia, AML and T-cell ALL
  Positive: CD3, CD7, CD13, CD33, TdT
  Negative: CD10, CD19, CD20, CD22, cytokeratin
Biphenotypic acute leukemia, AML and B-cell precursor ALL
  Positive: CD13, CD19, CD22, CD33, CD79a, HLA-DR, TdT
  Negative: CD3, CD7, sIg
Large granular lymphocyte leukemia, NK cell
  Positive: CD2, CD5, CD16, CD56, HLA-DR
  Negative: CD3, CD4
Large granular lymphoproliferative disorder, T cell
  Positive: CD2, CD3, CD16, CD56, CD57
  Negative: CD25, TdT
ALL (early-B precursor)
  Positive: CD19, CD22, CD79a, HLA-DR, TdT
  Negative: CD3, CD4, CD5, CD7, CD8, CD10, sIg, cIg
ALL (CALLA)
  Positive: CD10, CD19, CD22, CD79a, HLA-DR, TdT
  Negative: CD3, CD4, CD5, CD7, CD8, sIg, cIg
ALL (pre-B)
  Positive: CD10, CD19, CD22, CD79a, HLA-DR, cIg, TdT
  Negative: CD3, CD4, CD5, CD7, CD8, sIg

ALL, acute lymphoblastic leukemia; AML, acute myeloblastic leukemia; CALLA, common acute lymphoblastic leukemia antigen; cIg, cytoplasmic immunoglobulin; NK, natural killer; sIg, surface immunoglobulin; TdT, terminal deoxynucleotidyl transferase.
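Because each row of Table 3 reduces to (disorder, marker, expected result) triples, the knowledge base fits naturally into relational tables, and the matching step can be expressed as a query. The following is a minimal sketch in Python using the standard-library sqlite3 module. It is not the authors' Microsoft Access implementation. The table and column names (criterion, case_input, expected, result) are hypothetical, only two rows of Table 3 are loaded, and the marker results entered are invented for illustration.

```python
import sqlite3

# Hypothetical schema: one record per (disorder, marker, expected result).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE criterion (disorder TEXT, marker TEXT, expected TEXT);
    CREATE TABLE case_input (marker TEXT PRIMARY KEY, result TEXT);
""")

# Two rows of Table 3: (positive markers, negative markers).
TABLE3_ROWS = {
    "Marginal cell lymphoma": (
        ["CD19", "CD20", "CD22", "CD79a", "sIg"],
        ["CD5", "CD10", "CD23"],
    ),
    "Peripheral T-cell lymphoma": (
        ["CD2", "CD3", "CD5", "CD7"],
        ["CD19", "CD20", "CD22", "cytokeratin"],
    ),
}
for disorder, (positive, negative) in TABLE3_ROWS.items():
    rows = [(disorder, m, "pos") for m in positive]
    rows += [(disorder, m, "neg") for m in negative]
    conn.executemany("INSERT INTO criterion VALUES (?, ?, ?)", rows)

# Invented marker results, for illustration only (not a case from the study).
conn.executemany("INSERT INTO case_input VALUES (?, ?)", [
    ("CD19", "pos"), ("CD20", "pos"), ("CD5", "neg"),
    ("CD3", "neg"), ("CD45", "pos"),
])

# The inner join keeps only markers present on both sides, so attributes
# without input data (and input data without attributes) are ignored.
QUERY = """
    SELECT c.disorder,
           SUM(c.expected = i.result)  AS matched,
           SUM(c.expected <> i.result) AS unmatched
    FROM criterion AS c
    JOIN case_input AS i ON i.marker = c.marker
    GROUP BY c.disorder
"""
ranked = sorted(
    ((d, m, n, m / (m + n)) for d, m, n in conn.execute(QUERY)),
    key=lambda row: row[3],
    reverse=True,
)
for disorder, m, n, mf in ranked:
    print(f"{disorder}: M={m}, N={n}, MF={mf:.2f}")
```

Storing one criterion per record, rather than one column per marker, is a design choice of this sketch: adding a marker to the panel then requires new records but no change to the schema.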


❚Figure 2❚ The main menu of CD Marker.

❚Figure 3❚ Graphic user interface to enter data. cIg, cytoplasmic immunoglobulin; sIg, surface immunoglobulin; TdT, terminal deoxynucleotidyl transferase.

1. The malignant cells are positive for the following markers: CD5, CD19, CD20, CD21, CD23, CD24, CD43, CD79a, HLA-DR, surface immunoglobulin, and coexpression of CD5 and CD19 or CD20.
2. The malignant cells are negative for the following markers: CD2, CD3, CD4, CD7, CD8, CD10, CD25, and FMC-7.

A summary of diagnostic attributes for CLL/SLL is shown in ❚Figure 4❚. Note that some marker results are left blank in this set of attributes. Only the markers considered to have an important role in the differential diagnosis are included. This design helps to minimize the computing time in the inference process. When marker results of a given case are entered, they are processed by the database inference engine, and a list of differential diagnoses will be displayed. These diagnoses are listed with their associated MF value.

❚Figure 4❚ Immunophenotyping for chronic lymphocytic leukemia (B cell)/small lymphocytic lymphoma. cIg, cytoplasmic immunoglobulin; Neg, negative; sIg, surface immunoglobulin; TdT, terminal deoxynucleotidyl transferase; TRAP, tartrate-resistant acid phosphatase.

A demonstration of a case with CLL/SLL illustrates how CD Marker can be used for interpretation of immunophenotyping results and how its search mechanism works. ❚Figure 5❚ shows the marker data available for the patient sample. CD Marker attempts to match this set of data with the diagnostic attributes of 33 hematopoietic neoplasms in the knowledge base. The available data matched the following attributes of CLL/SLL:

1. Positive for CD5, CD19, CD20, CD23, HLA-DR, surface immunoglobulin, and coexpression of CD5 and CD19.
2. Negative for CD3, CD4, CD7, CD8, CD10, and CD25.

The total number of attributes of CLL/SLL that matched the input data was 13 (7 positive results and 6 negative results). This number was represented by the variable M in Equation 1 (M = 13). None of the attributes of CLL/SLL was in conflict with the input data (N = 0). Note that the following input data did not have a corresponding attribute for CLL/SLL in the knowledge base: CD11c, CD14, CD16, CD22, CD45, glycophorin A, and cytokeratin. These input data had no effect on the ranking of CLL/SLL since they were not included as part of the calculation for its MF. Similarly, the following attributes in the knowledge base without corresponding input data had no effect on the MF value for CLL/SLL: CD2, CD21, CD24, CD43, CD79a, and FMC-7. The intentional exclusion of input data without corresponding attributes in the database (or attributes without corresponding input data) in calculating MF serves the important purpose of maintaining a flexible design for the knowledge base and for the data input panel. Since different


flow cytometry laboratories may use different markers for immunophenotyping and various studies on marker pattern of neoplasms have used different marker panels, an absolute requirement of certain markers in the interpretation process would be too stringent to yield reasonable matches.5 The MF value for CLL/SLL at this point was:

MF = 13 / (13 + 0) = 1

After CD Marker calculated the MF value for all the remaining 32 hematopoietic neoplasms in the knowledge base and ranked them accordingly, it listed the following leading diagnoses:

1. CLL/SLL; MF = 1
2. Prolymphocytic leukemia (B cell); MF = 1
3. Mantle cell lymphoma; MF = 0.89
4. Mixed cell lymphoma; MF = 0.88
5. Large B-cell lymphoma; MF = 0.86

CLL/SLL and prolymphocytic leukemia had the same MF value (MF = 1) with the given input data. A second criterion, the difference between the matched attributes and the unmatched attributes for a neoplasm (M – N), is used to refine the ranking process for neoplasms with the same MF value. With this second criterion, CLL/SLL was ranked as the leading diagnosis (MF = 1, M – N = 13), followed by prolymphocytic leukemia (MF = 1, M – N = 12). ❚Figure 6❚ shows the list of differential diagnoses with all the calculation results. The mechanism of searching from neoplasms in the database to the input data for the best matches represents a strategy known as backward-chaining search.11,15,16

❚Figure 5❚ Input data for a case of chronic lymphocytic leukemia (CLL). cIg, cytoplasmic immunoglobulin; sIg, surface immunoglobulin; TdT, terminal deoxynucleotidyl transferase.

❚Figure 6❚ Output for the demonstration case of chronic lymphocytic leukemia (CLL). ALL, acute lymphoblastic leukemia; AML, acute myeloblastic leukemia; NK, natural killer.

This demonstration shows the open-ended format of the data input. The data panel consists of many immunologic markers, some of which may not be part of routine testing in a particular laboratory. Consequently, the actual data input for a case is unlikely to account for all the markers in the data panel. However, the availability of essential data would influence the accuracy of ranking by CD Marker. In the demonstration case of CLL/SLL, the ranking results would be different if the results for CD5 and CD23 were not available. In this scenario ❚Figure 7❚, prolymphocytic leukemia would be the leading diagnosis (MF = 1, M – N = 12), followed by CLL/SLL (MF = 1, M – N = 11). For comparison, the diagnostic criteria for prolymphocytic leukemia2-4 are shown in ❚Figure 8❚.

The critical role of the interpreting pathologist cannot be overemphasized. CD Marker is useful only for suggesting a list of differential diagnoses. The pathologist must establish the final diagnosis by correlating the histologic findings of the case with the immunophenotyping results. The immunologic marker pattern of neoplasms in the list of differential diagnoses can be reviewed during the interpretation process by using the “Look up a disorder in database” feature of CD Marker (Figures 4 and 8).

As shown in the preceding demonstration, CD Marker was designed with a user-friendly interface. This graphic user interface was arranged such that the sequence of data entry, display of results, and review of marker pattern should be intuitive to the user. Besides the essential features shown in the demonstration, CD Marker also has been designed to include the following features:

1. “What If” reruns can be performed whenever the user edits one or more of the input data. This editing may be in the form of changing the value of the data (positive to negative or vice versa), deleting the input data, or adding additional data. The option for reruns enhances the flexibility of CD Marker for evaluating data that may not be clear-cut, that is, when the marker results are borderline. Without requiring different sets of data entry, the What If

reruns offer the user a convenient way to consider all the potential diagnoses based on laboratory data that may be subject to errors for various reasons.
2. The user has the option to revise the database contents to fit individual needs. In other words, the immunologic marker pattern of neoplasms in CD Marker can be modified to the user’s preference. CD Marker has default tables that together form the knowledge base of the database. This knowledge base has been validated in the present study (see later sections on methods of validation and validation results). When the user edits the database tables, the tables are updated automatically to reflect the revisions made by the user. The revised tables will be used by CD Marker in subsequent runs. Since the revised tables contain information that has not been validated in our study, the user must be aware of the need to conduct individual validation studies to ensure that CD Marker performs adequately after each revision. CD Marker offers the user the option of selecting the default tables or the revised tables before each run. CD Marker makes this option possible by saving the default tables and the revised tables before any revision by the user.
3. The contents on the computer screen at any time during a session with CD Marker can be printed for hard copy report.
4. Online “HELP” instructions are available for using CD Marker.

❚Figure 7❚ Output for the demonstration case without data input for CD5 and CD23.
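The ranking in the CLL/SLL demonstration, and a What If rerun without CD5 and CD23, can be sketched as follows. This is a minimal Python illustration of Equation 1 with the M – N tie-break, not the authors' Visual Basic inference engine, and it rests on several assumptions. Only 5 of the 33 neoplasms are loaded, transcribed from Table 3. The function names and the "CD5/CD19" key for the coexpression attribute are hypothetical. The input is reconstructed from the text: CD22 positive and CD11c negative are inferred from the reported M – N = 12 for prolymphocytic leukemia, and markers whose results are not stated (CD14, CD16, CD45, glycophorin A, cytokeratin) are omitted because none of them is an attribute of these 5 neoplasms.

```python
from fractions import Fraction

# (positive markers, negative markers), transcribed from Table 3.
KNOWLEDGE_BASE = {
    "CLL/SLL": (
        {"CD5", "CD19", "CD20", "CD21", "CD23", "CD24", "CD43",
         "CD79a", "HLA-DR", "sIg", "CD5/CD19"},
        {"CD2", "CD3", "CD4", "CD7", "CD8", "CD10", "CD25", "FMC-7"},
    ),
    "Prolymphocytic leukemia (B cell)": (
        {"CD19", "CD20", "CD22", "HLA-DR", "sIg", "FMC-7"},
        {"CD2", "CD3", "CD4", "CD7", "CD8", "CD10", "CD11c", "CD25", "TdT"},
    ),
    "Mantle cell lymphoma": (
        {"CD5", "CD19", "CD20", "CD22", "CD24", "CD43", "HLA-DR", "sIg",
         "CD5/CD19"},
        {"CD11c", "CD23", "cIg"},
    ),
    "Mixed cell lymphoma": (
        {"CD19", "CD20", "CD22", "CD79a", "HLA-DR", "sIg"},
        {"CD2", "CD3", "CD5", "CD7"},
    ),
    "Large B-cell lymphoma": (
        {"CD19", "CD20", "CD22", "CD79a", "HLA-DR"},
        {"CD2", "CD3", "CD5", "CD7"},
    ),
}

# Demonstration input, reconstructed from the text (see assumptions above).
CASE = {
    "CD5": "pos", "CD19": "pos", "CD20": "pos", "CD22": "pos",
    "CD23": "pos", "HLA-DR": "pos", "sIg": "pos", "CD5/CD19": "pos",
    "CD3": "neg", "CD4": "neg", "CD7": "neg", "CD8": "neg",
    "CD10": "neg", "CD11c": "neg", "CD25": "neg",
}


def count_matches(criteria, case):
    """Return (M, N) for one neoplasm, skipping attributes with no input."""
    positive, negative = criteria
    m = n = 0
    for markers, expected in ((positive, "pos"), (negative, "neg")):
        for marker in markers:
            if marker not in case:
                continue  # attribute without input data: ignored
            if case[marker] == expected:
                m += 1
            else:
                n += 1
    return m, n


def rank(case, knowledge_base=KNOWLEDGE_BASE):
    """Rank neoplasms by MF = M/(M + N), then by M - N for equal MF."""
    rows = []
    for name, criteria in knowledge_base.items():
        m, n = count_matches(criteria, case)
        if m + n:
            rows.append((name, Fraction(m, m + n), m - n))
    rows.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return rows


ranked = rank(CASE)
# What If rerun: delete the CD5 and CD23 results. The coexpression entry is
# kept, which reproduces the reported M - N = 11 for CLL/SLL.
rerun = rank({k: v for k, v in CASE.items() if k not in ("CD5", "CD23")})

for name, mf, diff in ranked:
    print(f"{name}: MF = {float(mf):.2f}, M - N = {diff}")
```

Looping over each neoplasm's attributes and checking them against the input mirrors the direction of search described in the text (from neoplasms in the database to the input data). Exact fractions are used so that tied MF values compare equal without floating-point noise.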

Method for Validating Database CD Marker

In the second phase of our project, we used 92 cases involving various hematopoietic neoplasms to assess the performance of CD Marker. These were patient cases at Lyndon B. Johnson Hospital and Ben Taub Hospital (Harris County Hospital District, Houston, TX). Flow cytometry studies for these cases were performed between January 1995 and December 1996. Data for these 92 cases were retrieved retrospectively, and immunophenotyping data were tested on CD Marker. The number of cases for each disorder is shown in ❚Table 4❚. The final diagnosis for each case was established previously by histologic findings and correlation with flow cytometry results. The final diagnosis was documented in surgical pathology reports, including bone marrow reports. Data entry for each case for CD Marker included only marker results that were available in the flow cytometry laboratory at Ben Taub Hospital at the time of initial presentation. Only definitive marker results (positive or negative) in each case were used in the validation process. Equivocal results were not used owing to their lack of contribution to the validation results. To avoid potential bias in the design of CD Marker, its knowledge base was developed by one of us (A.N.D.N.) who had not interpreted the cases previously.

Immunophenotyping by flow cytometry was performed on a FACScan (Becton Dickinson, Mountain View, CA) as described previously.1 The specimens in our cases included bone marrow, lymph node, spleen, body fluid, and extranodal hematopoietic tumors. For a CD Marker session to be considered successful, the final diagnosis for a given case must be included in the list of 5 differential diagnoses suggested by CD Marker.

❚Figure 8❚ Immunophenotyping for prolymphocytic leukemia. cIg, cytoplasmic immunoglobulin; Glyco, glycophorin; Pos, positive; sIg, surface immunoglobulin; TdT, terminal deoxynucleotidyl transferase; TRAP, tartrate-resistant acid phosphatase.

Results

❚Table 5❚ shows the preliminary results for all the 92 cases used in this validation process with the accompanying information: ranking of the final diagnosis by CD Marker, the number of cases in each ranking category and the


❚Table 4❚ Number of Cases Used in Validation (N = 92)

Disorder  No. of Cases

Follicular small cleaved cell lymphoma  5
Mantle cell lymphoma  2
Large B-cell lymphoma  5
Mediastinal B-cell lymphoma  2
B-cell lymphoma, unclassifiable, mixed small and large cells  2
Lymphoplasmacytoid lymphoma  2
Marginal cell lymphoma  3
Burkitt lymphoma/ALL, L3  3
Splenic lymphoma with villous lymphocytes  4
Peripheral T-cell lymphoma  5
Lymphoblastic lymphoma (T cell)  2
Thymoma  2
Chronic lymphocytic leukemia (B cell)/small lymphocytic lymphoma  7
Chronic lymphocytic leukemia (T cell)  2
Prolymphocytic leukemia (B cell)  2
Hairy cell leukemia  4
Sézary syndrome/mycosis fungoides  2
Adult T-cell leukemia/lymphoma  2
ALL (T-cell precursor)  2
Acute myeloblastic leukemia without maturation, M1  3
Acute myeloblastic leukemia with maturation, M2  4
Acute promyelocytic leukemia, M3  2
Acute myelomonocytic leukemia, M4  2
Acute monoblastic leukemia, M5  2
Acute erythroleukemia, M6  2
Acute megakaryoblastic leukemia, M7  2
Biphenotypic acute leukemia, AML and T-cell ALL  2
Biphenotypic acute leukemia, AML and B-cell precursor ALL  2
Large granular lymphocyte leukemia, NK cell  2
Large granular lymphoproliferative disorder, T cell  2
ALL (early-B precursor)  2
ALL (CALLA)  3
ALL (pre-B)  4

ALL, acute lymphoblastic leukemia; AML, acute myeloblastic leukemia; CALLA, common acute lymphoblastic leukemia antigen.

❚Table 5❚ Summary of Validation Results

Ranking by CD Marker      Preliminary                          Final
(differential diagnosis)  No. (%) of Cases   Accumulated %     No. (%) of Cases   Accumulated %

First                     36 (39)            39                39 (42)            42
Second                    23 (25)            64                23 (25)            67
Third                     11 (12)            76                12 (13)            80
Fourth                    10 (11)            87                10 (11)            91
Fifth                     2 (2)              89                2 (2)              93
Lower ranking             10 (11)            -                 6 (6)              -
Total                     92 (100)           -                 92 (99)*           -

*Total does not equal 100 because of rounding.
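The accumulated percentages in Table 5 can be checked from the case counts alone. The success rate is the accumulated percentage through the fifth rank. The sketch below is a small arithmetic check in Python, assuming each accumulated percentage is the running case count divided by 92 and rounded to the nearest integer. The function and variable names are hypothetical.

```python
TOTAL_CASES = 92

# Cases whose final diagnosis was ranked first through fifth (Table 5).
PRELIMINARY = [36, 23, 11, 10, 2]
FINAL = [39, 23, 12, 10, 2]


def accumulated_percentages(counts, total=TOTAL_CASES):
    """Running share of cases, in whole percent, through each rank."""
    running = 0
    result = []
    for count in counts:
        running += count
        result.append(round(100 * running / total))
    return result


preliminary_acc = accumulated_percentages(PRELIMINARY)
final_acc = accumulated_percentages(FINAL)
print("Preliminary:", preliminary_acc)
print("Final:", final_acc)
```

Under this assumption the last entry of each list is the top-5 success rate, before and after the revisions.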

percentage and accumulated percentage of cases in each ranking category. The validation results showed a success rate of 89%. This success rate means that in 89% of the cases, the final diagnosis was included in the list of 5 differential diagnoses generated by CD Marker. In 11% of the cases, the final diagnosis was not included in this list for the following reasons:

1. Inadequate data input: Small lymphocytic lymphoma (SLL) in case 25 was not ranked by CD Marker owing to lack of a light chain restriction. Review of flow cytometric data revealed a B-cell subpopulation with a definite lambda light chain restriction. SLL subsequently was ranked first for this case by CD Marker.
2. Incorrect final diagnosis: Cases 1 and 11 were diagnosed as SLL, which was not ranked by CD Marker. Review of microscopic slides from these cases by a pathologist in

our group (E.I.B.) indicated that the diagnosis of these cases should be revised as mantle cell lymphoma (case 1) and large cell lymphoma (case 11). In fact, mantle cell lymphoma was ranked first and large cell lymphoma was ranked third by CD Marker for cases 1 and 11, respectively.

3. Limitations of CD Marker knowledge base: Case 9 was diagnosed as acute lymphoblastic leukemia (ALL), early pre-B cell type. This diagnosis was not ranked by CD Marker owing to a lack of B-cell ALL subtypes in the database. The general classification of B-cell precursor ALL included insufficient diagnostic criteria for early pre-B cell ALL. A revision was made for CD Marker to include all B-cell ALL subtypes. The correct diagnosis of early pre-B cell ALL subsequently was ranked first after this revision. Three cases of T-cell lymphoma (cases 37, 67, and 68) were not ranked by CD Marker. This deficiency was due to an intrinsic limitation of CD Marker in handling certain cases of T-cell malignant neoplasms. Aberrant loss of T-cell antigens is a characteristic finding in T-cell malignant neoplasms.2-4 However, a suitable inference mechanism has not been developed to detect such a manifestation. The difficulty in designing an algorithm for such detection lies in the random distribution of T-cell markers, which makes it impossible to program the values of diagnostic attributes into the database. Despite this shortcoming, a considerable number of T-cell cases were ranked successfully by CD Marker (11 [78%] of 14 T-cell cases).

4. Unusual marker results: One was a case of CD5+ large cell lymphoma (case 38). Review of microscopic slides from this case indicated that the histologic and immunophenotyping results were suggestive of Richter syndrome. However, the available clinical information was insufficient to confirm this diagnosis. Another was a case of HLA-DR–negative large cell lymphoma (case 14). The reason for the negative HLA-DR result was unknown but most likely was technical error. Last was a case of CD5– SLL (case 31). Review of microscopic slides from this case confirmed the final diagnosis of SLL.

In light of the new ranking by CD Marker for the cases under consideration, our validation results were revised to reflect the successful ranking for 4 additional cases (1, 9, 11, and 25). The revised results, summarized in Table 5, showed successful ranking in 93% of the cases (the final diagnosis included in the list of top 5 differential diagnoses). Furthermore, the final diagnosis was ranked first in 39 (42%) of 92 cases.

Discussion

A number of computer programs have been developed to facilitate interpretation of immunophenotyping for hematopoietic neoplasms by flow cytometry.6-10 Different approaches have been attempted, including rule-based systems, cluster analysis, semantic networks, and various mathematical algorithms.

Alvey et al6 designed a rule-based production system for the diagnosis of acute and chronic leukemia. Suggested diagnoses by the program are qualified as definite, probable, compatible, or possible. This program gave correct conclusions for 400 cases after 4 iterations in a retrospective study, along with a summary of its reasoning process and suggestions for further testing if necessary. This program was developed specifically for leukemia and, therefore, is limited to only a subset of hematopoietic neoplasms.

Petrovecki et al7 developed a mathematical algorithm based on matrix algebra. Their program calculates compatibility scores for potential diagnoses and covers a wide spectrum of hematopoietic neoplasms. The program was tested with 58 cases in a retrospective study and suggested the correct diagnosis as 1 of the top 4 choices in 93% of the cases. However, the correct diagnosis was ranked first in only 29% of the cases.

Verwer et al8 used cluster analysis to automatically assign cell lineage to acute leukemic cells during the analytic phase of flow cytometric study. In this method, multidimensional data from data files in list mode are divided into groups using the nearest-neighbor algorithm. The position of the cellular groups is analyzed by a criteria table. The program then classifies the groups into appropriate cell lineage. This technique of automated gating in the analytic phase of flow cytometric study is expected to reduce technical errors commonly seen with the conventional way of manual gating. This program, however, is limited to leukemia and was tested only on a small number of clinical cases.

Diamond et al9 developed a relational database, known as “Professor Fidelio,” that matches immunophenotyping patterns of hematopoietic neoplasms with the input data. This database also covers other types of diagnostic data, including cellular morphology, cytochemistry, DNA content, and proliferative activity. The program was validated successfully in 300 (82.0%) of 366 test cases. This is one of the programs that offer the most comprehensive option for data input. The only drawback is the limited number of diagnoses in the database, with only 12 general categories of hematopoietic neoplasms.

Thews et al10 used semantic networks to model a hierarchy of hematopoietic cells and their occurrence in neoplasms. Their program is limited to blood and bone marrow samples of a few specific types of neoplasms: acute myeloblastic leukemia, B-cell ALL, and B-cell lymphoma. The validation study showed an impressive success rate of 97.3% (616 of 633 cases).

Although several programs have been introduced to facilitate computer-assisted interpretation of flow cytometric
data, we believe that CD Marker possesses several unique features not found in other similar systems. The diagnostic criteria and the associated inference process we describe demonstrate a clear and understandable way of knowledge representation for marker interpretation. The inference process is no longer a “black box” with all the complex algorithms incomprehensible to people outside the fields of mathematics and artificial intelligence. This advantage, coupled with the user-friendly interface in a Windows environment, potentially can increase the acceptance of computer-assisted interpretation by pathologists who analyze flow cytometric data. Confidence in using CD Marker also is enhanced by the option to modify its database contents to suit individual needs. The database is comprehensive and also can be revised easily to incorporate more disease entities as needed. Furthermore, our program can run on an average personal computer with a minimum requirement for hardware and software (486 microprocessor with 16 megabytes of RAM, Microsoft Windows 95). The run-time version of CD Marker does not require that the Microsoft-Access database be installed on the user’s computer. Recently, we posted CD Marker on the World Wide Web to make it available to any user who needs access to this database. The URL for our Web page is as follows: http://dpalm.med.uth.tmc.edu/faculty/bios/-nguyen/nguyen.html

The only requirement to run this database directly on the Web is the availability of the Microsoft-Access 97 program on the user’s computer. Copyright is applicable to the contents of CD Marker to protect our ownership.

Despite the usefulness of CD Marker, the following constraints are inherent in its use.

1. This database is designed for pathologists who are not subspecialized in hematopathology and for pathology residents-in-training. It simply provides the differential diagnosis for a given set of marker expressions without regard to morphologic features, clinical manifestations, or response to therapy. Therefore, the differential diagnosis is designed to be broad. Just like trained hematopathologists, nonhematopathologists must correlate the immunophenotypic findings with the morphologic characteristics of the neoplasm, the most important basis of diagnosis and classification.

2. The user must have a functional knowledge of hematopoietic disorders to be able to use CD Marker effectively because this database serves only as a search tool to aid the user in making a diagnosis. The technical skills to perform the laboratory procedures and the experience needed to accurately gate the cellular populations are critical in the diagnostic process. CD Marker can generate a list of differential diagnoses in most cases if adequate data are input. The interpreting pathologist then can quickly compare the patient’s laboratory data with the marker patterns available from the CD Marker display module and make the appropriate diagnosis. It cannot be overemphasized that human judgment is the most important element in finalizing the diagnosis. The number and complexity of hematopoietic neoplasms require that the differential diagnoses suggested by CD Marker be reviewed before making a final diagnosis. We believe that the pathologist’s clinical judgment can be enhanced with the information from CD Marker to yield an accurate diagnosis.

3. The current version of CD Marker is deficient in handling some cases of T-cell malignant neoplasm owing to the difficulty of designing an algorithm for detection of the random loss of T-cell antigens, as discussed earlier. We are working on several approaches to alleviate this shortcoming and expect to implement a new technique to handle T-cell malignant neoplasms more effectively in future versions of CD Marker.

4. Not all the commercially available markers were used in our laboratory. Consequently, our validation results do not represent the maximal accuracy level that would have been achieved if all the available markers had been used. Further research is needed to analyze results obtained from different laboratories using markers not available in our laboratory.

5. CD Marker would not be useful in the diagnosis of neoplasms that traditionally have not been shown to benefit from flow cytometric immunophenotyping.1-4 These include the following: (1) Hodgkin disease; (2) multiple myeloma; (3) neoplasms with common immunophenotypes: mucosa-associated lymphoid tissue lymphoma, marginal cell lymphoma, monocytoid B-cell lymphoma, lymphoplasmacytoid lymphoma, and many cases of large B-cell lymphoma. These neoplasms have the same immunophenotype CD19+/CD20+, CD5–, CD10–, and CD23–. Also, follicular center cell lymphoma and Burkitt lymphoma have the same immunophenotype CD19+/CD20+, CD5–, CD10+, and CD23–. (4) Atypical immunophenotypes: CD5– CLL/SLL, CD23– CLL/SLL, CD23+ mantle cell lymphoma, CD10– follicular center cell lymphoma, CD5+ mucosa-associated lymphoid tissue lymphoma, and CD5+ marginal cell lymphoma; (5) T-cell lymphoma without loss of T-cell–associated antigens, which may be misdiagnosed as benign T-cell hyperplasia; (6) aberrant expression of T-cell antigens on B-cell lymphoma, such as CD2+ CLL/SLL, CD8+ CLL/SLL, and CD8+ mantle cell lymphoma; (7) polyclonal posttransplant B-cell lymphoproliferative disorders; (8) CD34– acute leukemia; (9) terminal deoxynucleotidyl transferase–negative acute leukemia; (10) acute promyelocytic leukemia without a diagnostic immunophenotypic profile, such as CD34–, HLA-DR–negative, CD13+, CD33+, CD41a–, and glycophorin A–negative. The immunophenotype merely reflects the presence of a myeloid cell population, and the same immunophenotype may be obtained in cases of chronic myelocytic leukemia
and secondary leukocytosis. (11) T-cell–rich B-cell lymphoma, which has a predominant population of reactive T lymphocytes (mostly CD4+) and in almost all instances does not show clonality by immunoglobulin light chain ratio because the population of malignant large B lymphocytes is very small. Such cases can be misdiagnosed easily as a reactive lymph node on flow cytometry data alone.

The development of user-friendly software and the rapidly increasing number of physicians with skills and interests in computer-based applications create an ideal climate for the use of computers for diagnostic purposes.17-34 The number of database applications in laboratory medicine is expected to increase. Laboratory clinicians are in an ideal position to exploit database applications since they tend to be computer literate and have access to a large amount of patient data on laboratory computers.32

We found that CD Marker provides a convenient interactive tool to assist clinical personnel in diagnosing hematopoietic neoplasms by using flow cytometric data. This decision-support tool is available to a large number of users via the World Wide Web and also can be useful for independent study on immunophenotyping patterns of hematopoietic neoplasms. A logical extension of CD Marker would be to integrate the knowledge base into an existing laboratory flow cytometer. Data could then be retrieved directly from the instrument for use by the database, thus simplifying user interaction.

From the 1Departments of Pathology and Laboratory Medicine and 2Health Informatics, University of Texas Health Science Center at Houston, and the 3Department of Pathology, Baylor College of Medicine, Houston, TX.

Address reprint requests to Dr Nguyen: Dept of Pathology and Laboratory Medicine, University of Texas Health Science Center at Houston, 6431 Fannin MSB 2.292, Houston, TX 77030.

Acknowledgments: We thank Alex Buraruk, MT(ASCP), for consulting with us on technical matters during the course of this project; Donna Obermeier, for many helpful suggestions; and Robert Hunter, MD, PhD, for reviewing the manuscript with many insightful comments.

References

1. Keren DF. Flow Cytometry in Clinical Diagnosis. Chicago, IL: ASCP Press; 1989.
2. Sun T. Color Atlas and Text of Flow Cytometric Analysis of Hematologic Neoplasms. New York, NY: Igaku-Shoin; 1993.
3. Kjeldsberg C, ed. Practical Diagnosis of Hematologic Disorders. Chicago, IL: ASCP Press; 1995.
4. Harris NL, Jaffe ES, Stein H, et al. A revised European-American classification of lymphoid neoplasms: a proposal from the International Lymphoma Study Group. Blood. 1994;84:1361-1392.
5. Check W. Diagnosing leukemia, lymphoma: when laboratories go with flow analysis. CAP Today. November 1998:1-37.
6. Alvey PL, Preston NJ, Greaves MF. High performance for expert systems: a system for leukemia diagnosis. Med Inf. 1987;12:97-114.
7. Petrovecki M, Marusic M, Dezelic G. An algorithm for leukaemia immunophenotype pattern recognition. Med Inf. 1993;18:11-21.
8. Verwer B, Terstappen L. Automatic lineage assignment of acute leukemia by flow cytometry. Cytometry. 1993;14:862-875.
9. Diamond LW, Nguyen DT, Andreeff M, et al. A knowledge-based system for the interpretation of flow cytometry data in leukemias and lymphomas. Cytometry. 1994;17:266-273.
10. Thews O, Thews A, Huber C, et al. Computer assisted interpretation of flow cytometry data in hematology. Cytometry. 1996;23:140-149.
11. Chignell M, Parsaye K. Expert Systems for Experts. New York, NY: John Wiley and Sons; 1988.
12. Ryan C. Cost reduction and QC software on a microbiology identification and susceptibility system [abstract]. Am Clin Lab. 1995:26.
13. Autio K, Kari A, Tikka H. Integration of knowledge-based system and database for identification of disturbances in fluid and electrolyte balance. Comput Methods Programs Biomed. 1991;34:201-209.
14. Connelly DP. Embedding expert systems in laboratory information systems. Am J Clin Pathol. 1990;94(suppl 1):7-14.
15. Buchanan BG, Shortliffe EH, eds. Rule-Based Expert Systems. Reading, MA: Addison-Wesley; 1984.
16. Turban E. Expert Systems and Applied Artificial Intelligence. New York, NY: Macmillan; 1992:73-144, 453-480.
17. Marquardt VC. Artificial intelligence and decision-support technology in the clinical laboratory. Lab Med. 1993;24:777-782.
18. Barnett GO, Cimino JJ, Hupp JA. DXplain: an evolving diagnostic decision-support system. JAMA. 1987;258:67-74.
19. Forsstrom J, Nuutila P, Irjala K. Using the ID3 algorithm to find discrepant diagnoses from laboratory databases of thyroid patients. Med Decis Making. 1991;11:171-175.
20. Shortliffe EH. Computer programs to support clinical decision making. JAMA. 1987;258:61-66.
21. Groth T, Moden H. A knowledge-based system for real-time quality control and fault diagnosis of multitest analyzers. Comput Methods Programs Biomed. 1991;34:175-190.
22. Tischler AS, Martin MR. Generation of surgical pathology reports using a 5,000 word speech recognizer. Am J Clin Pathol. 1989;92(suppl 4, pt 1):S44-S47.
23. Tong DA. Weaning patients from mechanical ventilation: a knowledge-based system approach. Comput Methods Programs Biomed. 1991;35:267-278.
24. Weiss SM, Kulikowski CA, Galen RS. Representing experience in a computer program: the serum diagnostic program. J Clin Lab Automation. 1983;3:383-387.
25. Shifman MA. FABHELP: a rule-based consultation program for FAB classification of and myelodysplastic syndromes. Lab Med. 1991;22:639-643.
26. O’Connor ML, McKinney T. The diagnosis of microcytic anemia by a rule-based expert system using VP-Expert. Arch Pathol Lab Med. 1989;113:985-988.
27. Spackman KA, Connelly DP. Knowledge-based systems in laboratory medicine and pathology. Arch Pathol Lab Med. 1987;111:116-119.
28. Bates JE, Bessman JD. Evaluation of BCDE, a microcomputer program to analyze automated blood counts and differentials. Am J Clin Pathol. 1987;88:314-323.
29. Blomberg DJ, Ladley JL, Fattu JM, et al. The use of an expert system in the clinical laboratory as an aid in the diagnosis of anemia. Am J Clin Pathol. 1987;87:608-613.
30. Nguyen DT, Diamond LW, Priolet G, et al. Expert system design in hematology diagnosis. Methods Inf Med. 1992;31:82-89.
31. Nguyen AND, Hartwell EA, Milam JD. A rule-based expert system for laboratory diagnosis of hemoglobin disorders. Arch Pathol Lab Med. 1996;120:817-827.
32. Spackman KA, Connelly DP. Knowledge-based systems in laboratory medicine and pathology: a review and survey of the field. Arch Pathol Lab Med. 1987;111:116-119.
33. Siegel JD. Computerized diagnosis: implications for clinical education. Med Educ. 1988;22:47-54.
34. Nguyen AND, Uthman M. XPCOAG: a teaching software for laboratory diagnosis of coagulation disorders. Lab Med. 1994;25:399-401.

106 Am J Clin Pathol 2000;113:95–106 © American Society of Clinical Pathologists