Welcome to the Data Analytics Toolkit PowerPoint presentation on EHR architecture and Meaningful Use. When data is collected and entered into the electronic health record, the data is ultimately stored in a database. When analyzing the objectives related to Meaningful Use, data needs to be extracted from the database. It is likely that you’ve all worked with databases in some capacity. Most of you are probably familiar with Microsoft Excel spreadsheets. This is a flat-file type of database. However, it is unlikely that the data that must be acquired for Meaningful Use is stored in a flat-file database. That is because this type of database has several limitations: it can’t handle large datasets (more than about 1 million rows), it is not designed to be queried using a programming language, and it is not multi-user friendly.

A better approach, which has been adopted by most healthcare organizations, is to store the data in a relational database, which is defined in terms of the relations among the data. These databases can handle very large datasets, can be accessed and queried using a programming language, support multiple users, and can be accessed and updated remotely.

A relational database is structured into tables, each of which represents an entity, or noun. For instance, one table might hold information about patients and therefore be referred to as the ‘Patient’ table. The adjectives that describe the entity are known as attributes and are listed in the table. For the patients table, the attributes may include gender, state of residence, year of birth, name, etc. There are typically many different tables, and they can often be related to other tables based on common attributes. The common attributes are known as relations, and they are the verbs that connect two entities. For instance, we might have a patients table and a medications table, which lists all the medications patients are taking. Both tables might share the common attribute of a patient identifier and therefore can be related.
The relation is similar to a verb in that a patient TAKES medication. The relation between the patients table and the medications table can be used to combine data for queries. Here is a picture of an entity-relation diagram, which represents a relational database. This database includes patient, medications, and diagnosis tables. The medications and diagnosis tables can be related to the patient table because they all share the attribute “PatientID”. Similarly, the medications table can be linked to the diagnosis table because they share the attribute “DiagnosisID”. There are three different types of relationships that can exist between tables. These types are known as the rules of cardinality.
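
Before turning to cardinality, the idea of combining data through a shared attribute can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module and an in-memory database, not the schema of any particular EHR. Only PatientID and MedicationID come from the diagram; the integer IDs, the other column names, and the sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory database; all rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (
        PatientID   INTEGER PRIMARY KEY,
        Gender      TEXT,
        YearOfBirth INTEGER              -- assumed column name
    );
    CREATE TABLE Medication (
        MedicationID   INTEGER PRIMARY KEY,
        PatientID      INTEGER,          -- shared attribute used to relate the tables
        MedicationName TEXT              -- assumed column name
    );
    INSERT INTO Patient VALUES (1, 'F', 1950), (2, 'M', 1984);
    INSERT INTO Medication VALUES
        (10, 1, 'metformin'), (11, 1, 'lisinopril'), (12, 2, 'albuterol');
""")

# Combine the two tables through the shared PatientID attribute.
rows = conn.execute("""
    SELECT p.PatientID, p.Gender, m.MedicationName
    FROM Patient AS p
    JOIN Medication AS m ON m.PatientID = p.PatientID
    ORDER BY m.MedicationID
""").fetchall()

for row in rows:
    print(row)
```

Each output row pairs a patient's attributes with one medication that the patient takes, which is the "patient TAKES medication" relation expressed as a query.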

The first type of relationship is a many-to-many relationship. These relationships are very common in the real world, but they are not stored directly in a relational database. An example would be that many different physicians may prescribe many different medications. The problem with these relationships is that they lead to a less efficient database. A solution to this problem is to create a third, intervening table called an intersection table. This breaks the many-to-many relationship into two one-to-many relationships.
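
As a rough sketch of the intersection-table idea, the example below again uses Python's sqlite3 module. The Physician, Drug, and Prescribes tables, their columns, and the sample rows are all hypothetical names chosen for illustration; they are not part of the diagram discussed in this presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Physician (PhysicianID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Drug      (DrugID      INTEGER PRIMARY KEY, Name TEXT);

    -- Intersection table: one row per (physician, drug) pairing.
    -- It sits on the "many" side of two one-to-many relationships.
    CREATE TABLE Prescribes (
        PhysicianID INTEGER REFERENCES Physician(PhysicianID),
        DrugID      INTEGER REFERENCES Drug(DrugID),
        PRIMARY KEY (PhysicianID, DrugID)
    );

    INSERT INTO Physician VALUES (1, 'Dr. A'), (2, 'Dr. B');
    INSERT INTO Drug VALUES (1, 'metformin'), (2, 'lisinopril');
    INSERT INTO Prescribes VALUES (1, 1), (1, 2), (2, 1);
""")

# Walk through the intersection table to list which physician prescribes which drug.
drugs_by_physician = conn.execute("""
    SELECT ph.Name, d.Name
    FROM Prescribes AS pr
    JOIN Physician AS ph ON ph.PhysicianID = pr.PhysicianID
    JOIN Drug      AS d  ON d.DrugID       = pr.DrugID
    ORDER BY ph.PhysicianID, d.DrugID
""").fetchall()

for pair in drugs_by_physician:
    print(pair)
```

One physician appears with several drugs and one drug appears with several physicians, yet each of the three tables only ever takes part in one-to-many relationships.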

The second type of relationship is known as a one-to-many relationship. This occurs when one instance of an entity is related to one or more instances of another entity. For example, one patient can have many diagnoses. This is the most common type of relationship in a relational database.
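
A one-to-many relationship can be seen by counting the diagnosis rows that belong to each patient. The sketch below uses Python's sqlite3 module; the integer IDs, the Description column, and the sample rows are assumptions for illustration. It also includes a patient with no diagnosis recorded, since in practice the "many" side may be empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY);
    CREATE TABLE Diagnosis (
        DiagnosisID INTEGER PRIMARY KEY,
        PatientID   INTEGER,     -- points back to exactly one patient
        Description TEXT         -- assumed column name
    );
    INSERT INTO Patient VALUES (1), (2), (3);
    INSERT INTO Diagnosis VALUES
        (100, 1, 'diabetes'), (101, 1, 'hypertension'), (102, 2, 'asthma');
""")

# LEFT JOIN keeps patients that have no matching diagnosis rows.
diagnosis_counts = conn.execute("""
    SELECT p.PatientID, COUNT(d.DiagnosisID)
    FROM Patient AS p
    LEFT JOIN Diagnosis AS d ON d.PatientID = p.PatientID
    GROUP BY p.PatientID
    ORDER BY p.PatientID
""").fetchall()

print(diagnosis_counts)
```

Every diagnosis row carries a single PatientID, while a given PatientID may appear in the Diagnosis table several times, once, or not at all.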

The last type of relationship is a one-to-one relationship. This occurs when each instance of one entity is related to one and only one instance of the other entity. For example, one patient can only have one date of birth. It is advised to combine the entities into one table if a one-to-one relationship exists.

In the entity-relation diagram shown on the previous slide, there are three tables and three one-to-many relationships. That is, one patient can have many medications, one patient can have many diagnoses, and one diagnosis can have many medications. The type of relationship for a specific entity is represented by symbols on an entity-relation diagram. If an entity has a relationship with another table where one and only one record matches, this is depicted as two straight lines. If one or many records match that of another table, this is depicted as a triangle and a line. A zero, one, or many relationship is depicted as a triangle with a circle. Finally, a zero or one relationship is depicted as a line and a circle. These symbols are important for interpreting an entity-relation diagram and determining the type of relationship between two tables. For instance, if we consider the entity-relation diagram shown previously, we see that the relationship between the patient and medication tables is one-to-many: a patient can be taking zero, one, or many medications, and one and only one patient is assigned to each instance of a medication.

Keys are the attributes that link entities. A primary key is an attribute that can uniquely identify a particular instance of an entity. For example, the primary key for the Patient table shown previously is PatientID. It is important to realize that a primary key must be distinct. Therefore, when considering this characteristic of primary keys, would a patient’s full name be acceptable? Probably not, as it may not be unique. A social security number may also not be unique. A medical record number could be used; however, there are instances where we find duplicate records and duplicate medical record numbers for a single patient.
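
The requirement that a primary key be distinct is something the database itself can enforce. The following sketch, using Python's sqlite3 module, shows two patients sharing a full name being accepted while a repeated PatientID is rejected. The FullName column, the integer IDs, and the names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY, FullName TEXT)")

# Two different patients may share a full name, so a name cannot serve as the key.
conn.execute("INSERT INTO Patient VALUES (1, 'Maria Garcia')")
conn.execute("INSERT INTO Patient VALUES (2, 'Maria Garcia')")

# Reusing an existing PatientID violates the primary key and is refused.
try:
    conn.execute("INSERT INTO Patient VALUES (1, 'John Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

patient_count = conn.execute("SELECT COUNT(*) FROM Patient").fetchone()[0]
print(duplicate_rejected, patient_count)
```

Note that the constraint only guarantees that identifiers are not repeated within the table. It cannot detect the situation mentioned above, where the same real-world patient has been entered twice under two different identifiers.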

When a table’s primary key is present in another table, it is known there as a foreign key. For instance, PatientID is present in both the medications and diagnosis tables. Therefore, although PatientID is the primary key in the patients table, it is also a foreign key in the medications and diagnosis tables. The foreign keys are used to create a link between the different entities. The primary keys are unique identifiers. Each table has a primary key. The primary key in the patient table is PatientID, the primary key in the medications table is MedicationID, and the primary key in the diagnosis table is DiagnosisID. Foreign keys are shown in both the medications and the diagnosis tables. The foreign keys in the medications table are PatientID and DiagnosisID, while the only foreign key in the diagnosis table is PatientID.
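
The key layout just described can be written down as table definitions. The sketch below uses Python's sqlite3 module and keeps only the key columns named in the diagram; the integer data types and the sample rows are assumptions. SQLite needs foreign key enforcement switched on per connection, which is a detail of this particular engine rather than of relational databases in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.executescript("""
    CREATE TABLE Patient (
        PatientID INTEGER PRIMARY KEY               -- primary key, no foreign keys
    );
    CREATE TABLE Diagnosis (
        DiagnosisID INTEGER PRIMARY KEY,
        PatientID   INTEGER NOT NULL REFERENCES Patient(PatientID)    -- foreign key
    );
    CREATE TABLE Medication (
        MedicationID INTEGER PRIMARY KEY,
        PatientID    INTEGER NOT NULL REFERENCES Patient(PatientID),  -- foreign key
        DiagnosisID  INTEGER REFERENCES Diagnosis(DiagnosisID)        -- foreign key
    );
    INSERT INTO Patient VALUES (1);
    INSERT INTO Diagnosis VALUES (100, 1);
    INSERT INTO Medication VALUES (10, 1, 100);
""")

# A medication row that points at a patient who does not exist is refused.
try:
    conn.execute("INSERT INTO Medication VALUES (11, 99, 100)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

medication_count = conn.execute("SELECT COUNT(*) FROM Medication").fetchone()[0]
print(orphan_rejected, medication_count)
```

With the constraints in place, every PatientID stored in the medications or diagnosis table is guaranteed to match a row in the patient table.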

Because the patient table does not have any instances of medications or diagnoses, the patient table does not have a foreign key. Consider the one-to-many relationship. For each patient there may be many diagnoses and medications. Therefore, in order to link the patient table with the other tables, we need to include the PatientID in the medication and diagnosis tables. Anytime there is a one-to-many relationship, the many side of that relationship holds the foreign key. Keys are used primarily for organizational purposes. Without them, the tables would become cumbersome and impossible to link or navigate.

If you consider the way the data is stored in spreadsheet form, this may become more apparent. The patient table includes data on each of the patients. Each row has information for one and only one patient. We have the gender, year of birth, and state of residence for each patient. However, because the PatientID also shows up in the diagnosis table, and because each patient can have zero, one, or many diagnoses, we find that one PatientID may show up there once, more than once, or not at all. The medications table is very similar. The same PatientID may show up in the medications table once, more than once, or not at all. Also, the same medication for the same patient can be used for more than one diagnosis. Therefore, the PatientID and MedicationID may match but the DiagnosisID may differ for those rows of data.

A data dictionary is essential in order to fully understand the data elements within a relational database. The data dictionary lists all of the attributes in each table and provides a brief description of each attribute, the data type (e.g., date/time, numerical), the length of the data in the field, and several other fields that can provide information about the data. For the example shown in this presentation, the data dictionary shows that the patient table includes 5 columns of data.
The PatientID is the primary key and has a unique identifier data type that is never left blank. The definition of the PatientID is a unique identifier for a patient record. Gender is stored as a character that is 1 letter long and is not left blank. The data is stored as “M” for male and “F” for female. You can go through each of the other columns to see their data types and descriptions. The medication table includes 8 columns of data. The MedicationID is a unique identifier for a patient medication and is the primary key. There are two foreign keys: the PatientID, which is the unique identifier for the patient taking the medication, and the DiagnosisID, which is the identifier for the diagnosis that the provider linked to the medication. The diagnosis table has a primary key called the DiagnosisID, which is the unique identifier for a patient diagnosis. The foreign key in this table is the PatientID, which is the unique identifier for the patient.

When considering EHR architecture, pay particular attention to the implications of how the data is stored. The data that is needed for assessing the core and menu objectives and the clinical quality measures is derived from these databases. Therefore, an understanding of relational databases is essential for understanding the data and ensuring data quality.
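
Much of what a data dictionary records (column name, data type, whether a value is required, and which column is the primary key) can also be read back from the database itself. The sketch below does this with Python's sqlite3 module for a simplified patient table. It defines only four columns rather than the five in the dictionary described above, and the column names, the text identifier, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Patient (
        PatientID   TEXT    NOT NULL PRIMARY KEY,                  -- unique identifier, never blank
        Gender      CHAR(1) NOT NULL CHECK (Gender IN ('M', 'F')), -- 1 letter, never blank
        YearOfBirth INTEGER,
        State       CHAR(2)
    )
""")

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary key position)
data_dictionary = [
    {"column": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute("PRAGMA table_info(Patient)")
]
for entry in data_dictionary:
    print(entry)

# A value outside the documented coding for Gender is refused by the CHECK constraint.
try:
    conn.execute("INSERT INTO Patient VALUES ('p-001', 'X', 1970, 'OH')")
    invalid_gender_rejected = False
except sqlite3.IntegrityError:
    invalid_gender_rejected = True
print(invalid_gender_rejected)
```

The listing covers structure only. The plain-language definitions that make a data dictionary useful, such as what “M” and “F” stand for, still have to be written and maintained by people.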