
Understanding differential attainment across medical training pathways: A rapid review of the literature

Final report prepared for The General Medical Council

Dr Sam Regan de Bere, Dr Suzanne Nunn, Dr Mona Nasser

21/08/2015

Funded by the General Medical Council.

The views expressed in this report are those of the participants and the authors and do not necessarily reflect those of the General Medical Council.


Contents

Table of Figures
Table of Abbreviations and Acronyms
Executive Summary
    Introduction
    Research design
    Analysis
    Current narratives of differential attainment
1. Introduction
2 Background
3 Aims and purposes of the review
4 Methodology
    4.1 Rapid review
    4.2 Narrative synthesis
5 Methods
    5.1 Development and registration of protocol
    5.2 Search strategy
    5.3 Data management and extraction
    5.4 Quality assurance
6 Research Ethics
7 Data Analysis
8 Narrative synthesis
9 Findings
    9.1 The Individual or discrete group
        Study habits
        Psycho-social
        Social and cultural capital
        Success
        Ethnicity
        IMG
        Language
        Gender
    9.2 The institutional
        The Medical School and the working environment
        Mentoring
        Selection
    9.3 Policy
        Predictors of success at postgraduate level
        PMQ
        High stakes examinations
        MRCGP and MRCGP Clinical Skills Assessment (CSA)
        Examiner bias
10 Discussion
    10.1 Causes
    10.2 Ways of researching
    10.3 Possible interventions
11 Conclusions (the story so far)
12 References
13 Appendices
    Appendix 1: Studies and other documents included in the synthesis
    Appendix 2: Quality evaluation of studies using primary data


Table of Figures

Figure 1. Flow diagram of study selection
Figure 2. Analysis of included studies and documents by methodology or type
Figure 3. Publication by date
Figure 4. Conceptual map of themes identified in the published literature

Table of Abbreviations and Acronyms

AoMRC      Academy of Medical Royal Colleges
BME        Black and minority ethnic
BAPIO      British Association of Physicians of Indian Origin
CSA        Clinical skills assessment
FRCA       Fellow of the Royal College of Anaesthetists (examination)
GMC        General Medical Council
HEFCE      Higher Education Funding Council for England
IELTS      International English Language Testing System
IMG        International medical graduates
MCAT       Medical College Admission Test (USA)
MRCOG      Member of the Royal College of Obstetricians and Gynaecologists (examination)
MRCPsych   Member of the Royal College of Psychiatrists
MCQ        Multiple choice question
NBME       National Board of Medical Examiners (USA)
OSCE       Objective Structured Clinical Examinations
PMQ        Primary medical qualification
PLAB       Professional and Linguistics Assessment Board examination
RCA        Royal College of Anaesthetists
RCGP       Royal College of General Practitioners
RCOG       Royal College of Obstetricians and Gynaecologists
USMLE      United States Medical Licensing Examination


Executive Summary

Introduction

Differential attainment is a term used to describe the variations in levels of educational achievement that occur between different demographic groups undertaking the same assessment. Differential attainment has been recognised as a challenge for medical professionals and educators since the 1990s, and has been observed in both undergraduate and postgraduate contexts. It is not specific to medical education; it is a feature of professional education more generally.

Since 2010 the GMC has worked with others analysing data in order to better understand the progress of trainees through their programmes and to identify any potential differences between demographic groups. This rapid review of literature published in the period between 2004 and the present day contributes to a wider programme of research being carried out by the GMC to explore differential attainment across training pathways.

Research design

The research was commissioned to provide a rapid review of the corpus of knowledge relating to differential attainment. The researchers adopted a narrative synthesis methodology in order to explore how contributions to the literature had sought to define, measure and explain differential attainment, and therefore to identify key factors that might be considered as having an impact upon attainment.

An initial scoping exercise highlighted that the current corpus of literature comprises materials in a variety of formats, including: qualitative and quantitative research reports, systematic reviews of attainment data patterns, policy documents and academic papers, and opinion pieces and editorials.

Narrative synthesis provides a useful framework for accessing and analysing such diverse and complex literatures. It lends itself to a ‘storytelling’ approach, by capturing a number of different insights, evidence bases, theories and position pieces in context, and presenting them together as an overarching narrative of differential attainment. In addition, rather than imposing a definitive structure or sequential process, which might preclude certain significant contributions that do not fit the initial review terms (1), narrative synthesis allows researchers to move iteratively within a systematic approach, picking up on leads to relevant information throughout the research process.

The search was conducted using the PubMed, MEDLINE and PsycINFO databases, within a search strategy that included Medical Subject Headings (MeSH) terms and text-word searches for maximal retrieval. These searches were supplemented with further iterative searching of reference lists, and a grey literature search of stakeholder websites. The research team was supplemented by an expert panel, members of which were selected in order to provide advice on search terms, to discuss the quality of the retrieved literature, to comment on any initial emergent themes and to review the final report prior to submission to the GMC.

We developed two frameworks against which to evaluate the retrieved papers and grey literature: PICOC (Population, Intervention, Comparison, Outcome, Context) for quantitative papers and SPICE (Setting, Perspective, Intervention/phenomena of Interest, Comparison, Evaluation) for qualitative papers and other documents. These frameworks provided transparency for our identification of included papers and other documents. A total of 39 papers were included in the synthesis with the addition of 24 documents from the grey literature.

The literature on differential attainment

The findings of narrative synthesis are grounded in the literature surveyed. The research process does not begin with a set of a priori assumptions: instead, using this method enables themes to emerge and be recorded as the literature is identified. The search process highlighted that the evidence base relating to differential attainment is disparate, that it includes a number of different research designs and variously applied methods, and that it does not feature definitive terminology across studies. Concepts and terms are often used interchangeably and are operationalised in research accordingly, which makes constant or consistent comparison difficult to validate.

Overall the peer reviewed literature was of a high quality, where research aims, objectives, methods and analyses were clearly articulated and justified. The main focus of primary research was on the relationship between ethnicity and differential attainment in high stakes examinations. While some studies are focused on undergraduate populations, some on postgraduate doctors, and a number include both, we found that the research questions, findings and conclusions were nevertheless relevant to understanding the emerging narrative of differential attainment in postgraduate cohorts.

Given the limitations in the literature, we read and re-read the selected materials individually and then discussed them as a team. During this process we used conceptual mapping to help us understand the categories and themes arising from the entire data set. This grounded approach led to the emergence of a three-level schema, providing three distinct but related categories, or layers, of information on:

• the macro or policy level (investigating the political agendas and practical activity surrounding high stakes examinations)

• the meso or institutional level (exploring the impact of the medical school, training contexts and/or working environment)

• the micro or individual/discrete group level (with a focus on individuals or groups of students, doctors, examiners and so on)

Quantitative studies dominated the research base (26 studies), focusing on the macro level and typically using large data sets to examine causal and associative relationships between various demographic groups and different high stakes exams. The focus of the qualitative research (5 studies) was more diverse, and explored the role of factors at the micro and meso levels of infrastructures built to support examinations, cultural contexts and personal interactions.

Two large scale commissioned studies in the grey literature examined the significance of language and cultural factors for IMGs (2, 3) using mixed methods approaches, and one further study combined a literature search with interviews.(4) In addition to this there was one systematic review and meta-analysis of ethnicity and performance in UK trained doctors and medical students, focusing on quantitative reports. (5)

Investigating assessment agendas and the design of high stakes examinations

The majority of studies dealing with the macro level focused on differential attainment in high stakes exams. The research upon which this aspect of the literature is based typically used quantitative methodologies and large datasets, with a focus on using exam data to test for bias in the exam or a component part of it. Their conclusions are founded on typically high quality, peer reviewed reports including clear validity measures.

Taken as a whole, these studies have broadly demonstrated the validity of high stakes exams, and discounted evidence of bias in the nature and structure of exams themselves as causal factors for differential attainment. However, the emerging narrative contains a recognition that the infrastructures and processes put in place to support selection (6) and high stakes exams may nevertheless encompass elements that lead to actual or implied bias and/or differential attainment. (7)

Examples of this include: i) potential examiner bias through levels of concordance between examiner and candidate in practical examinations (8-12), and ii) the lack of a universal terminology to classify data, which may lead to different interpretations of bias and/or differential attainment from the exam data. An example of the latter is the variation in the ways the Royal Colleges monitor for protected characteristics, which has been identified as a potential contributor to unfair bias. (13)

The impact of institutional structures and organisational contexts

Literatures focusing on the nuts and bolts of postgraduate education, at the level of the medical school or workplace, highlighted the paucity of well-developed research into postgraduate selection. These contributions drew on primary data, were typically published in high quality academic sources, and editorial comments were presented by authors with research or experience in this area. (14, 15)

In contrast to undergraduate selection, there is little research on postgraduate selection processes. In the literature, selection processes were presented as highly variable (4, 6), although it was recognised that a rigorous (or otherwise) selection process might have implications for attainment. Best practice selection methods highlighted in the sample involved the identification of required competencies and the development of reliable assessment methods for them. The narrative suggests that a validation process should be applied to assess the predictive value of the selection methods.

Pre-entry advice and proper induction processes were identified across the international literature as important factors for IMGs and other students who gained their PMQ outside the country they wished to work in. (16, 17) One significant UK focused study identified the GMC as having a central role, in addition to individual employers, in developing a ‘joined up’ approach to supporting PMQs and IMGs. (2)

Buddying or mentoring was highlighted as a useful approach to assisting acculturation. A literature search of PubMed identified mentoring programmes for undergraduates as having positive impacts on attainment levels but cautioned that this was relevant only if such programmes were based on robust designs and were evaluated to ensure effectiveness. This review demonstrated that most research in the area of mentoring to improve attainment has been undertaken in the USA. (18)

Understanding the role of the individual or discrete group

The literature pertaining to the individual or discrete groups suggested that a combination of factors may be associated with educational performance. These include: learning styles and psycho-social factors; demographic characteristics such as gender and ethnicity; wider social and cultural capital; language and other, more tacit, contributors to success. The literature exploring these factors used both qualitative and quantitative methodologies and was generally of a high academic quality whereby methods and findings were justified accordingly. Two of the four studies were UK-focused, both examining undergraduate medical students (19, 20); the other two were qualitative studies, more narrowly focused on specific types of student in the USA and Saudi Arabia, which examined contributors to success. (21, 22)

Numerous studies focused on ethnicity in relation to analysis of differential attainment at macro, meso and micro levels. However, whilst this issue dominated the literature, the complexity of the term was largely unaddressed. Terms such as IMG and BME were used interchangeably and uncritically.

For example, while “BME” is a widely used term in public and private sector organisations to incorporate a range of minority communities living in the UK, using it as an umbrella term to group together diverse socio-cultural demographics has been critiqued – but typically this is not addressed in the sampling or conclusions drawn from the various studies within the literature.


Whilst perhaps more obvious, IMG is another umbrella term specific to medicine that requires clear definition, for similar reasons. The narrative emerging from the literature identifies “IMGs” as being increasingly important to the delivery of healthcare, but nevertheless experiencing the inherent difficulties of migration and acculturation. However, the specifics of these difficulties, how they might vary, and why this might be important for the differential attainment of IMGs are absent from these discussions.

Similarly, ‘language’ is cited as a predictor of good performance but it is not proven to be, of itself, the reason why students and/or doctors fail high stakes examinations. Moreover, any sociological or psychological examination of ‘language’ is also missing, and the concept is treated as unproblematic in terms of its application as a potential factor underpinning attainment.

The key narratives of differential attainment

Following thematic analysis, narrative analysis was then used to identify any relationships emerging between and across these themes. As has already been acknowledged, the literatures are disparate and disjointed. However, their key messages are similarly structured around: i) the potential causes of differential attainment, ii) the ways in which differential attainment has been researched and iii) potential interventions to further our understanding and help inform strategies going forward.

Understanding causes and relationships

The initial research undertaken into understanding differential attainment tended to focus on the analysis of exam data with the aim of validating high stakes examinations or identifying bias. There were 5 high quality quantitative studies included in the analysis. (7, 12, 23-25) The dominant message from these studies was that, while the reasons for differential attainment remained unclear, they were likely to be multifactorial.

The chronological trajectory of the research demonstrates that research is increasingly emphasising the importance of educational and social factors in contributing to performance. In this area research is frequently qualitative. We found 8 studies, key among which were Woolf’s analysis exploring the relevance of stereotype threat (26) and Vaughan’s study using social capital theory to understand the role of networks and social behaviour (19).

Both of these studies focused on undergraduate medical students, but provided a way of analysing differential attainment that bears relevance for postgraduate patterns. In terms of studies examining the attainment levels of postgraduate students, Illing’s and Roberts’ studies were the most extensive in terms of scope and data analysed. (2, 3)

The general point to draw from this development of research foci in both undergraduate and postgraduate fields (and one that suggests we may be best served by considering both) is a growing consensus that researchers should not limit their analysis purely to exam results. Current thinking acknowledges the requirement to examine the ‘whole’ of the exam: its support structures (both formal and informal) and features of its candidature that go beyond demographics to attitudes and behaviours.

Selection, language and the identification of facilitators, as well as barriers, are factors that have been emphasised across a number of studies. In much of the literature, language is used as a proxy for communication broadly, which is an umbrella category incorporating gesture, pronunciation and intonation etc. This is an important observation, since communication skills form part of clinical skills assessments and these carry with them implicit cultural assumptions relating to the doctor-patient dynamic. The message emerging here is that lack of acculturation will impact on performance and ultimately attainment, even if clinical skills are to an expected standard or level of competency.

The literature also identifies poor induction, lack of support for IMGs in overcoming the difficulties inherent to migration, and career change as factors that may disadvantage IMGs in becoming better trained and acculturated doctors. A small number of studies have highlighted the importance of considering factors that support higher levels of attainment. Qualitatively, it is important to note these contributions to the building narrative: limiting analysis to why certain individuals or discrete groups might fail to progress along the career pathway risks ignoring evidence that identifies why other individuals or discrete groups succeed, all of which might help us to understand different levels of attainment along the spectrum.

The importance of appropriate research design

For the reasons outlined above, this review included studies employing different research methods, the majority of which undertook the quantitative analysis of primary data. In order to examine the complex nature of ‘causes’, qualitative research approaches have more recently been used to examine complex phenomena embedded in the culture and contexts of assessment.

This relatively recent turn to qualitative methodologies to capture evidence of complexity adds depth of understanding to the breadth of the quantitative research literature. Indeed, the narrative emerging from the more recent contributions to the literature suggests that innovative research approaches are required now that complexity is acknowledged. Specific recommendations within the literature include: longitudinal tracking, interdisciplinary research to provide fresh perspectives, and the development of more appropriately sophisticated theoretical frameworks.

A significant issue across the research is the lack of (either) transparency or consistent definition around the categories of explanation. While some contributions acknowledge the inherent difficulty in defining and categorising, it remains the case that umbrella categories like BME and IMG, ethnic group and ethnicity have not been subjected to full interrogation. In this sense, the development of suitable interventions to address the problem of differential attainment is compromised by the problem of inconsistently applied definitions and classifications across existing databases and research studies.

Possible interventions and future strategies

Overall, the differential attainment literatures suggest that a variety of factors may affect performance and attainment. These include issues around the background and characteristics of the individuals, the stage they are at in their medical career and the organisational structure of different workplace settings. These might have cumulative effects over time or ‘one-off’ effects at certain stages of their career.

Owing in part to the variety of factors identified as potentially affecting performance and attainment, the narrative emerging from the current body of knowledge recognises the need for a complex intervention incorporating analysis of the micro, meso and macro levels of engagement, rather than a simple intervention to establish cause and effect relationships of single factors.


The first consideration in designing an intervention relates to the level at which the intervention is required: at the individual level, the institutional level, a broader policy level, or a complex intervention with components on each level. It is important to recognise that any intervention targeted at a single level needs to be thought through across all levels in case unanticipated effects at other levels emerge as a consequence.

Conclusion

This review has found that differential attainment in postgraduate medical education in the UK cannot be attributed to a single identifiable cause, but results from a subtle combination of factors yet to be fully explored. Over time, research has moved from the quantitative analysis of exam data towards a more cross-disciplinary approach in order to explore a combination of educational and social factors (rather than single causes) as contributors to differential attainment. Such an interdisciplinary approach is now presented as essential for developing a nuanced understanding of the complexities of differential attainment across the micro, meso and macro-structure of medical education, and is viewed as the foundation upon which future interventions may succeed.


1. Introduction

Differential attainment is a term used to describe variations in educational achievement by different demographic groups undertaking the same assessment. It is a phenomenon readily identified across the educational landscape, and research by HEFCE and others has identified a complex range of personal, cultural, institutional and structural factors impacting on parity. (27)

Differential attainment has been a recognised feature of medical educational achievement since the 1990s in both undergraduate and postgraduate contexts. But interest in the underperformance of ethnic minority doctors has been heightened in recent years in the UK with a judicial review in the High Court (April 2014) for alleged racial discrimination against ethnic minority doctors by the RCGP in their high stakes examinations. The legal challenge from BAPIO was dismissed but the Judge recommended action on differential attainment and that the RCGP should focus on training to ensure that candidates are prepared for their examinations.

The judicial review is often presented as the catalyst for action, whereas the GMC has been working with others since 2010 to analyse data to better understand the progress of trainees through their programmes. A commissioned independent review of the RCGP CSA identified that overseas qualified doctors (IMGs) were 15 times more likely to fail the CSA, and UK qualified BME doctors were four times more likely to fail than their white counterparts at first attempt (the difference diminished for UK BME doctors on their second attempt but differences for IMG BME doctors persisted on second and third attempts). (28) Recent analysis of exam data has shown that in a simple univariate analysis the same patterns of attainment were present across speciality groups. (29)

The present literature review contributes to a wider programme of work being carried out by the GMC to explore differential attainment across training pathways.


2 Background

Differential attainment is a term used to describe variations in educational achievement by different demographic groups undertaking the same assessment. Characteristics including gender, age, ethnicity, nationality and socio-economic status, along with medical school and postgraduate training programme, are all factors that HEFCE have identified as having a correlation with performance and attainment.(27)

A search of PROSPERO and the Cochrane Library revealed that there are currently no registered substantive reviews of differential attainment specific to postgraduate medical education. There is, however, a growing body of literature examining potential causes and factors relating to differential attainment across both undergraduate and postgraduate medical education. (20, 21, 30, 31)

3 Aims and purposes of the review

The purpose of this review is to understand from the existing evidence the underlying causes of differential attainment in postgraduate medical education in the UK and English-language speaking countries with comparable medical education systems (USA, Canada, New Zealand and Australia). This includes identifying different causes and/or significance of causes across those countries, providing a conceptual framework to design interventions to address these issues in the UK, identifying possible methods for further research in this area and rating the strengths and weaknesses of evidence that may suggest areas for future research and/or work.

The aims of the review are as follows:

o To establish an evidence base for differential attainment in the UK and other comparable countries

o To identify any research methods pertinent to identifying and/or understanding the causes of differential attainment in UK postgraduate medical education

o To examine interventions that have been effective in reducing differential attainment that may be applicable to UK postgraduate medical education

o To rate the quality of evidence as a ‘springboard’ for future work

4 Methodology

4.1 Rapid review

Systematic reviews that engage with health policy are becoming increasingly valued by policy makers as the evidence base becomes more complex (32). However, policy makers often require a synthesis of knowledge on emerging issues within a short time frame in order to facilitate a timely response and/or decision. A traditional systematic review takes at least 12 months to complete; accelerating this process to produce a rapid review requires the reviewers to take methodological ‘shortcuts’ to streamline the process. There is currently no standardised method for undertaking rapid reviews, and indeed Oliver argues that standardisation may be counterproductive. (33) In a review of the methods used in rapid reviews, Ganann et al recommend transparency of reporting methods, in particular where ‘traditional’ processes had been streamlined. (34)

There is considerable debate about the relative merits of full systematic reviews over rapid reviews, with rapid reviews considered appropriate to answer focused questions or as an important intermediary step to further research where interventions are complex. Rapid reviews may lack the depth of full systematic reviews to present detailed recommendations, but a review comparing cases where both rapid and full systematic reviews were conducted found that overall there was no significant impact on the final conclusions of a review. (35)

4.2 Narrative synthesis

“‘Narrative synthesis’ refers to an approach to the systematic review and synthesis of findings from multiple studies that relies primarily on the use of words and text to summarise and explain the findings of a synthesis”. (1)

The flexibility of narrative synthesis lends itself to this type of ‘storytelling’ since rather than having a definitive structure or sequential process (1) it relies on a framework that can be broken down into four elements, through which the researchers can move iteratively:

• Developing a theory about how the intervention works, why and for whom

• Developing a preliminary synthesis of findings of included studies


• Exploring relationships within and between studies

• Assessing the robustness of the synthesis

5 Methods

5.1 Development and registration of protocol

The protocol for the research was developed by the core research team (Drs Regan de Bere, Nunn and Nasser) with support from the expert panel. It was agreed with the GMC on 06/02/2015 and registered with PROSPERO on 26/02/2015. The protocol was subsequently published on the PROSPERO website (http://www.crd.york.ac.uk/PROSPERO, reference no: CRD42015017130).

5.2 Search strategy

The inclusion and exclusion criteria were agreed between the lead researchers and the expert panel. These criteria set the boundaries for the research.

Table 1 Inclusion and exclusion criteria from the protocol

Inclusion:

• Published between 01/01/2004 and 01/01/2015
• UK and countries with comparable medical education systems (USA, Canada, New Zealand and Australia)
• In the English language
• Studies using any methodology singly or in combination, and ‘grey’ literature
• Studies or documents related to postgraduate, and where appropriate undergraduate, medical education
• Differential attainment/success or failure

Exclusion:

• Disciplines outside medicine (e.g. pharmacy, dentistry, nursing and midwifery)

However, as the research progressed we revisited and refined the initial criteria as we identified gaps and leads to important relevant literatures previously excluded. For example, although Norway was not on our original source list, we included one Norwegian study (36) since it addressed conceptual issues we considered relevant to the review (namely those surrounding gender and qualification related to working environments). We also included a study from Switzerland examining the impact of mentoring during postgraduate training. (37)
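The screening logic implied by Table 1 and the exceptions described above can be sketched in a few lines of Python. This is an illustration only, not the procedure the review team used: the `Record` fields, the `override` flag (standing in for a reviewer's judgement that an out-of-scope study is conceptually relevant, as with the Norwegian and Swiss studies) and the reason strings are all assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import date

# Values taken from Table 1; the record structure itself is hypothetical.
INCLUDED_COUNTRIES = {"UK", "USA", "Canada", "New Zealand", "Australia"}
EXCLUDED_DISCIPLINES = {"pharmacy", "dentistry", "nursing", "midwifery"}
START, END = date(2004, 1, 1), date(2015, 1, 1)

@dataclass
class Record:
    title: str
    published: date
    country: str
    language: str
    discipline: str
    override: bool = False  # hypothetical: reviewer judged the study conceptually relevant

def screen(record: Record) -> tuple[bool, list[str]]:
    """Return (included, reasons for exclusion) for a single record."""
    reasons = []
    if not (START <= record.published <= END):
        reasons.append("outside publication window")
    if record.language != "English":
        reasons.append("not in English")
    if record.discipline.lower() in EXCLUDED_DISCIPLINES:
        reasons.append("discipline outside medicine")
    # The country rule is the only one relaxed by reviewer judgement.
    if record.country not in INCLUDED_COUNTRIES and not record.override:
        reasons.append("country not on source list")
    return (not reasons, reasons)

# Illustrative records only; they do not correspond to specific cited studies.
norwegian = Record("Example study", date(2010, 6, 1), "Norway", "English", "medicine", override=True)
nursing = Record("Example study", date(2012, 3, 1), "UK", "English", "nursing")
```

Recording the reasons for exclusion, rather than a bare yes/no, keeps the screening decisions transparent, which is the property the rapid review literature asks for when processes are streamlined.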

We searched PubMed using the following search strategy that includes MeSH and ‘free text’.

#7 #3 AND #6 Filters: Publication date from 2004/01/01
#6 #4 OR #5
#5 (Attainment or success* or fail*)
#4 "Educational Status"[MeSH]
#3 #1 OR #2
#2 (postgraduate AND educat* AND med*)
#1 "Education, Medical, Graduate"[MeSH]
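To see what the numbered search lines amount to as a single Boolean expression, the references can be expanded mechanically. The Python sketch below is an illustration written for this purpose: the function name `expand` and the dictionary layout are assumptions, and the publication date filter is left out because it is applied as a separate filter rather than as part of the query text.

```python
import re

# Numbered search lines as reported above (date filter handled separately).
SEARCH_LINES = {
    1: '"Education, Medical, Graduate"[MeSH]',
    2: "(postgraduate AND educat* AND med*)",
    3: "#1 OR #2",
    4: '"Educational Status"[MeSH]',
    5: "(Attainment or success* or fail*)",
    6: "#4 OR #5",
    7: "#3 AND #6",
}

def expand(line_no: int, lines: dict[int, str]) -> str:
    """Recursively replace each #n reference with the bracketed text of line n."""
    def substitute(match: re.Match) -> str:
        return "(" + expand(int(match.group(1)), lines) + ")"
    return re.sub(r"#(\d+)", substitute, lines[line_no])

full_query = expand(7, SEARCH_LINES)
print(full_query)
```

The expanded form makes the logic explicit: a record must match the postgraduate medical education block (line 3) and the attainment block (line 6), each of which combines a MeSH heading with free-text terms.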

This search strategy was adapted for other databases such as PsycINFO. We also searched the reference lists of key papers to:

1. Ensure that our search criteria were identifying key papers
2. Identify additional papers and/or grey literature

We also added to the studies found through the searches from our own knowledge of the subject literature.

We did not consult authors directly but met several leading researchers in the field at related GMC events 27/2/15 and 16/03/15 where there was the opportunity to discuss the review.

We also placed a call on the GMC website for contributions from other researchers and interested parties. This call produced no new sources of information.

The results of the searches, conversations and prior knowledge of the literature identified prominent topic areas and issues in the medical education literature, as well as highlighting those which have been less well documented. This information was later used to conduct additional iterative searches in educational literature in order to fill any gaps identified.


As part of the selection process, we categorised relevant literature in medical education that fell outside our inclusion criteria, i.e. studies relating to other countries. The rationale was to allow us to decide, at the later analysis stage, whether such studies might help us fill any gaps.

After an initial screening of the results, we used NVivo 10, a data management software package, to code and count the themes identified across the literature. Individual papers may contain several foci, and each was coded individually. By listing the number of studies that referenced each descriptive theme we developed a simple schema to identify gaps in the literature. From this we conducted further iterative searches in the medical undergraduate literature to assess whether there were any generalizable findings from those studies.
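The counting step described above can be illustrated with a short sketch. This is not NVivo output and the study identifiers, theme labels and threshold are invented purely for illustration; it only shows the idea of counting how many studies reference each theme and flagging thinly covered themes as possible gaps.

```python
from collections import Counter

# Hypothetical coding result: study identifier -> themes coded in that study.
coded_studies = {
    "study_A": {"ethnicity", "high stakes exams"},
    "study_B": {"ethnicity", "language"},
    "study_C": {"gender"},
    "study_D": {"ethnicity", "high stakes exams", "language"},
}

def theme_counts(coded):
    """Count how many studies reference each theme (once per study)."""
    counts = Counter()
    for themes in coded.values():
        counts.update(set(themes))
    return counts

def sparse_themes(counts, threshold):
    """Themes referenced by fewer than `threshold` studies: candidate gaps."""
    return sorted(theme for theme, n in counts.items() if n < threshold)

counts = theme_counts(coded_studies)
gaps = sparse_themes(counts, threshold=2)  # threshold is an arbitrary assumption
print(counts.most_common())
print(gaps)
```

A schema of this kind says nothing about the quality or significance of the studies behind each theme; it only indicates where the literature is thin.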

We also undertook general searching of relevant stakeholder websites listed below for grey literature.

Table 2 Stakeholder websites searched

General Medical Council
British Medical Association
Royal College of Physicians and Surgeons of Glasgow
Royal College of Psychiatrists
Royal College of General Practitioners
Royal College of Ophthalmologists
Royal College of Obstetricians and Gynaecologists
Royal College of Radiologists
Royal College of Paediatrics and Child Health
Academy of Medical Royal Colleges (AoMRC)
Royal College of Physicians of Edinburgh
Royal College of Surgeons of England
Royal College of Physicians of London
Royal College of Surgeons in Ireland
Royal College of Physicians of Ireland
Royal College of Surgeons of Edinburgh
UK Higher Education Funding Council for England (HEFCE)
Other representative groups: BAPIO, Medical Women’s Federation

The initial search term used in ‘Google’ was: name of the stakeholder AND differential attainment

We then searched iteratively within the stakeholder websites for additional documents.

5.3 Data management and extraction

In defining eligible literature formats, we included all content-relevant documents and articles, regardless of the status of their publication. The final sample therefore included academic studies, unpublished research, conference papers, guidance documents, opinion pieces and so on. Editorial and opinion pieces were included since they can provide useful insights, offer potential solutions or identify areas for thought. They were not formally quality assessed, but we report on the perspective from which each piece was written (the author and their background) and how this may have shaped the argument.

We developed frameworks that disaggregated the elements of the research question, against which to map the papers. Due to their structured nature, quantitative studies tended to relate to the elements of the PICOC framework (Population, Intervention, Comparison, Outcome, Context), whilst qualitative studies were typically more effectively interrogated using the SPICE framework (Setting, Perspective, Intervention/phenomena of Interest, Comparison, Evaluation). The frameworks provided a transparent method of identifying papers to include and exclude from the synthesis.

We found no randomised or non-randomised controlled trials. Most studies focused on evaluating the effect of factors such as gender and ethnicity on student performance. We have therefore used a modified version of the PICOC and SPICE frameworks for the final synthesis presented in this report. This remains consistent with the methodology in our protocol registered with PROSPERO (CRD42015017130).

5.4 Quality assurance

Because the final synthesis includes a wide variety of material and used an iterative method of study and document extraction, we ensured the transparency of all inclusion decisions by thoroughly documenting each stage of the review and the decision-making processes.

We undertook a quality assessment of the studies that included primary data using an adapted version of the Critical Appraisal Skills Programme (CASP) framework for qualitative research. We used this for both qualitative and quantitative studies since the key issues around quantitative studies related to the approach to the questions, the design of the research as related to the question, the study’s population, and what was measured and how. The ratings of the studies (high quality / unclear quality / low quality) are included where appropriate in Appendix 1, and a fuller description of the evaluation of each study using primary data is included as Appendix 2. We included a question related to the generalizability of the study (direct / indirect / unclear). This question does not contribute to the quality evaluation but is reported separately to account for generalizability to the review.

The research team was supplemented by an expert panel to advise on search terms, discuss the retrieved literature and any initial emergent themes, and review the final report prior to submission to the GMC. The expert panel (Sam Regan de Bere, Suzanne Nunn, Mona Nasser, Paul Lambe, Julian Archer, Martin Roberts, Tom Gale and Rebecca Pitt) met to discuss various stages of the review, including: feeding back on the research design; ratifying the protocol; agreeing the selected academic literature; discussing themes emerging from the literature; quality assessment; and agreeing the structure of the final report.

During the process of the research the panel agreed that the retrieved literature was representative of the field and that the search terms used had been appropriate. The panel did not consider that there were any significant gaps in the literature: they suggested that, rather than reinforcing extant knowledge by including the literature from other health professions, the research team should concentrate on the emerging narratives and look to a broader cultural literature to inform the socio/cultural and pedagogic narratives that were emerging if required.

The panel did identify a lack of clarity in the terminology used in different studies across the literature: in particular the words ‘performance’ and ‘attainment’ have been used interchangeably. The panel suggested that, for the purposes of this review, the following definitions should be applied: attainment would be used in reference to a direct measurement, namely passing exams, whereas performance would refer to academic performance as a process which implies a temporal element, with attainment being a consequence of performance.


6 Research Ethics

The research for this review is desk based and ethical permission was not required.

7 Data Analysis

Initial database searches identified 3,044 potentially relevant documents. Duplicates (68) were removed, leaving 2,976 documents to be screened by title for possible inclusion in the synthesis. Documents rejected at this stage, after exclusions were applied, were categorised in case any gaps were identified in the literature and these documents needed to be revisited. Ninety-six documents were retrieved as papers for further review (10% of these being checked by SRdB against the inclusion criteria). From this tranche 40 papers were evaluated against the PICOC and SPICE frameworks, as described in the protocol. Eight failed on one or more of the criteria, leaving 32 documents extracted for discussion by the expert panel and potential synthesis. Following discussion, a further 3 papers were added on the advice of the expert panel from their subject knowledge and 4 papers were added as a result of iterative searching of the reference lists in the papers identified for synthesis.

A total of 39 papers were included in the synthesis with the addition of 24 documents from the grey literature. A flow diagram of the search process is shown in Figure 1 below.
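The selection counts reported above can be checked with simple arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative, and the combined total of published and grey documents is derived here rather than stated in the report.

```python
# Counts as reported in the Data Analysis section.
identified = 3044          # documents from initial database searches
duplicates = 68
screened = identified - duplicates            # screened by title

evaluated = 40             # assessed against PICOC and SPICE
failed = 8
extracted = evaluated - failed                # taken to the expert panel

added_by_panel = 3
added_from_reference_lists = 4
published_included = extracted + added_by_panel + added_from_reference_lists

grey_included = 24
all_included = published_included + grey_included  # derived, not stated in the text

print(screened, extracted, published_included, all_included)
```

The intermediate results agree with the figures in the text: 2,976 documents screened, 32 extracted and 39 papers included in the synthesis.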


Figure 1. Flow diagram of study selection


The studies and other documents included in the synthesis use a variety of formats and methodologies, as shown in Figure 2 below.

Figure 2 Analysis of included studies and documents by methodology or type

[Bar chart: number of published and grey documents by type (Lit review, Quantitative, Qualitative, Mixed method, Other).]

Quantitative research is the dominant methodology in the published research. Interestingly, mixed methods studies were found only in the grey literature. The ‘other’ category includes opinion pieces, letters and comment, conference and other reports. Not surprisingly, this is the area dominated by the grey literature.

Fifteen of the documents extracted from the grey literature were comment pieces in the online medical news and media: Pulse (n = 4), BMA (n = 6), GPonline (n = 1), BMJ Careers (n = 3) and Mancunian Matters (n = 1). The most disseminated document in the grey literature was the AoMRC 2013-14 review (38), which was linked to the Royal College sites and returned as a ‘hit’ when searching them. The document itself has little to say about differential attainment: a short paragraph identifies the judicial review as a catalyst for the AoMRC’s decision to “look at the wider question of differential attainment in medical education.” (38)

An examination of the dates of publication of the included documents testifies to a growing interest in differential attainment. This is with the caveat that there is a time lag between academic research and its publication that does not apply to online comment. But even taking this into account, a trend is clearly discernible.

Figure 3. Publication by date

[Chart: number of published and grey documents by year of publication, 2004 to 2015.]

Broadly speaking, the peaks of interest coincide with significant changes to the MRCGP in 2010 (specifically the CSA component), the publication of Esmail and Roberts’ report in 2013 and the judicial review in 2014.

8 Narrative synthesis

A narrative synthesis does not begin with a set of a priori assumptions. Using this method, themes emerge as the literature is identified and reviewed. The first level of thematic identification is descriptive and can be generated in a number of ways, including coding followed by conceptual mapping to help us think about the relationships between and across the themes identified.

Using the themes coded in NVivo 10 we identified two key areas of interest that emerged across the literature: high stakes exams and ethnicity.

Fig 4 shows a conceptual map of the relationship between high stakes exams and ethnicity in the published literature, with the sub-themes or factors either identified or investigated.


Figure 4. Conceptual map of themes identified in the published literature

N.B. the size of the ovals does not reflect significance of factor or quality of the research

The conceptual map, while amply demonstrating complexity, also provides a way of populating a micro, meso and macro analytical framework that broadly relates to three key levels of engagement: the individual or discrete group (students, doctors, examiners etc.), the institutional (medical school or work environment) and the level of policy (exams).

9 Findings

9.1 The Individual or discrete group

Although not discussing postgraduate education specifically, Schrewe makes a number of insightful observations around the place of the individual in medical education and the tension between the competing discourses of diversity (respect for the culture, gender and ethnicity of individuals) and standardisation (uniformity and consistency).(39) Arguing that these discourses need to be made explicit and “brought into the same conversation” in order to enable students and trainers to achieve their full potential, Schrewe suggests that the question needing to be addressed with some urgency is what common qualities are required and how far individual variation can be supported without detriment to the profession as a whole.(39)

In this section of the findings we discuss themes identified in the literature pertaining to the individual or discrete demographic group.

Study habits

Woolf examined ‘study habits’ as part of wider research into ethnic underperformance in Year 3 medical students, using a questionnaire to assess surface, deep and strategic learning processes. Deep learning is associated with an active search for meaning, whereas surface learning is associated with memorising rather than understanding. (40) Woolf found that minority ethnic students scored lower on deep learning study habits (p = .003) and higher on surface learning study habits (p = .008) than their white peers. (20) Strategic learning, where learners adopt the learning style that best fits the needs of the task, was identified by Woolf as a positive predictor of performance but was statistically related to other factors, including living at home and having English as a first language. It is also important to recognise that students should not be identified with a fixed approach to learning; curriculum design, assessment and teaching style all encourage students to adopt a particular approach. (41) This suggests potentially broader questions about ethnicity and learning.

Psycho-social

Psycho-social is a term used to describe an individual’s psychological development in, and interaction with, a social environment. In the literature on widening participation, psycho-social factors in relation to undergraduate degree choice are well documented. (42)

As part of a larger undergraduate research study, Woolf examined the personalities of white and non-white students using an adaptation of the NEO-PI-R (43) to measure five personality traits (neuroticism, openness to experience, agreeableness, extraversion and conscientiousness). The study of a total of 703 (51% minority ethnic) students found that ethnic minority students were lower on the personality trait “openness to experience” (p = 0041) (20) but this was not found to have a negative effect on final year examination performance.

Social and cultural capital

The ‘standing’ that medical professionals have within different cultures has been shown to have a significant effect on the choice of medicine as a career. A study linking informed choice and academic success in Iranian medical students provides a useful international review of studies that found many medical students to have an “over-dramatized and romanticized view of medicine at the beginning of academic studies”. (34) The Iranian study used a multiple choice questionnaire (n = 2208) for final year medical students and found that informed choice had a positive effect on attainment.

Success

Esmail recommends that more research is undertaken into factors for ethnic students’ success. (28) We identified one small scale qualitative study which, whilst not looking at ethnicity, used interviews with 10 black male medical students and 3 black male physicians at Florida State University College of Medicine to explore their perceptions of the factors contributing to their success in being admitted to and graduating from medical school. (22) The study, with its gender, geographical and numerical limitations, nevertheless presented an interesting line of enquiry, looking at contributors rather than barriers to attainment.

The study concluded that factors contributing to success were a balance between educational experiences, exposure to medicine, psychosocial-cultural experiences (including family and other support networks) and personal attributes. Participants in the research specifically identified structured activities like enrichment programmes and outreach programmes as significant. The Minority Association of Pre-Medical Students Programme (MAPS) was an example cited by the study participants. MAPS provided opportunities for networking with other premedical students, medical students and physicians and importantly provided the opportunity for shadowing experiences.

We then looked at the undergraduate literature to see if there were any other studies that looked at contributors. One qualitative study based in Saudi Arabia used focus groups to understand the perceptions of 19 high achieving medical students, of mixed gender, of the factors contributing to their success.(21) They identified learning strategies, resource management including family support, motivation and the efficient management of non-academic problems such as stress.

In a study examining the differential achievement between white medical students and their ethnic minority peers, Vaughan (19) used social capital theory to develop and analyse survey data from medical students in the clinical phase of their training (n = 158). The research found no link between ethnic and religious homophily and achievement. However, interacting with problem-based learning group peers in study related activities and having a wider academic support network were found to be directly linked to better achievement. Vaughan concluded that ethnic homophily may cut minority students off from potential and actual resources that facilitate learning and achievement. It is therefore key that students build wide relationships with colleagues at all levels of training.

All these papers evidence a mix of educational and social factors as contributing to the performance of individuals, in addition to individual characteristics.

The literature examining contributors to success is important, since looking only at why certain students might fail tells only half the story.

From the papers found, contributors to success seem to be international, but with so few studies the results are not generalizable.

Ethnicity

The underperformance of ethnic minorities compared to their white peers across the higher education landscape has been consistently identified.(44) (45) The studies discussed in this review focusing on the performance of UK-trained medical students and doctors from minority ethnic groups have corroborated broader HEFCE findings.(27)

Definitions of ethnicity are numerous and complex. In the UK studies we discuss in this review ethnicity was either self-declared specifically for an individual study or was a characteristic already identified in a data set being analysed.


Classification systems used in the research also varied, and included the 2001 UK census guidelines, (19, 20, 26) individual Royal College geographical bands, (46) white and non-white, (47) BME as an umbrella group, (5) (11) (24) categories approved by the UK Commission for Racial Equality, (9) and the GMC National Training Survey, (23) which uses UK census categories.

Studies use different categorizations and therefore comparisons between studies can be difficult. For example, Denney cites the conflation of all BME groups under one heading as a limitation of the study but states that it was necessary in order to compare and contrast the results with other studies and because the numbers were too small in some sub groups. (11) Woolf adopts the same approach, arguing that ethnic categories are to an extent artificial because they can never take into account the subtle variations between groups of people.(5)

There have been a number of key large scale quantitative research projects since the 1990s focusing on ethnicity and differential attainment. The catalyst for this area of research was the identification of a higher failure rate in clinical exams among non-white students at the University of Manchester in 1995. (48) The leading researchers in the field in the UK are Chris McManus, Katherine Woolf, Jane Dacre and Richard Wakeford.

In a systematic review of ethnicity and academic performance in UK undergraduate and postgraduate medical students, Woolf found ethnic differences in attainment to be widespread across different types of medical school and different types of exam at both levels of study.(5) The review focused on quantitative reports that measured performance and concluded that differential attainment was both “consistent and persistent”: but while ethnicity was clearly related to exam performance the reasons for this were not clear.(5)

The first large scale longitudinal study exploring in depth a number of potential psychological and demographic reasons for differential attainment in undergraduate and postgraduate medical students was led by Katherine Woolf (20).

In contrast to the studies focusing on measuring differences in attainment between different groups, Woolf’s qualitative study (26) using focus groups and semi-structured interviews (n = 27 medical students and 25 clinical teachers) followed earlier studies in the US and examined the potential of stereotype threat to provide an insight into the identified gap in attainment. Stereotype threat has been identified as a psychological phenomenon whereby individuals who are members of a group characterized by negative stereotypes perform below their actual abilities when group membership is emphasized. Woolf found that negative stereotyping could impact on the relationship between lecturer and student and therefore affect learning. She concluded that while a negative stereotype about an ethnic group had “numerous implications for teaching and learning”, the relationship was neither simplistic nor deterministic.(26) Woolf concluded that the student/teacher relationship was “vital for clinical learning”; in particular, the negative Asian stereotype was considered to be potentially jeopardising to Asian students’ relationships with their teachers. Woolf recommends that employers should facilitate teachers in getting to know their students as individuals. Although the study was limited to one London medical school, stereotype threat is an interesting line of inquiry that is not relevant to ethnicity alone. For example, Burgess has studied gender in terms of stereotype threat in the context of career advancement in academic medicine in the US. (49)

Definitions of ethnicity are numerous and complex.

BME is a widely used term in public and private sector organisations to incorporate a range of minority communities living in the UK. Such an umbrella term has been critiqued in terms of the validity of grouping together diverse groups in this way.

Conversely, for quantitative studies broad terms may need to be used to obtain statistically significant results.

IMG

IMGs are an important asset to the Health Service in the UK. In a review article in 2005 Sandhu opined that increasing numbers of IMGs would be needed to achieve the rapid increase of workers needed as a result of legislation relating to the creation of a consultant based service, and other working directives. (50)


Sandhu raises the concern that this requirement combined with the UK being a very attractive place for medical graduates to work and continue their training could encourage an influx of inexperienced doctors or doctors having poor communication skills seeking opportunities in the competitive specialities. Sandhu advocates that more realistic information about postgraduate opportunities and training be available to enable potential IMGs to make a more informed choice, but also praises the motivation and determination of IMGs as a group.

A study in the US found great persistence on the part of IMGs in pursuit of a US residency position.(51) The linked data study of a cohort of 10,328 IMGs, comprising both US citizen and non-US citizen IMGs, highlighted the importance of IMGs to the delivery of national healthcare.

In a large scale analysis of RCOG data, Rushd undertook a retrospective analysis of the performance of IMGs who appeared for the first time in the Part 1 (n = 11,863) and Part 2 written (n = 5,336) MRCOG examinations between 2000 and 2010. (46) Rushd’s evaluation of the first time performance of IMGs in the MRCOG Part 1 and Part 2 written examinations critiques IMG as a category by identifying variation in performance between students across the RCOG geographical bands.

Rushd was unable to perform statistical comparisons on the results of the study since geographical bands are not comparable: they contain different countries, different academic standards, different teaching methods etc. Rushd, however, found that variation in IMG performance was likely to be multifactorial and suggests that the introduction of e-learning modules may “go some way in equalising the learning opportunities among geographic regions and could prove useful for both trainers and trainees.” (46)

Aside from Illing’s study, discussed below, (2) we only found one qualitative study examining barriers and facilitators encountered by IMGs. The study was situated in the Netherlands and the findings related mainly to sociocultural rather than educational factors, including being able to access information and financial support. (16) Lack of command of the Dutch language (particularly the medical terminology) and age were seen as barriers to securing employment and entrance to specialism. Age was only a barrier in some specialisms since they set an upper age limit for postgraduate specialist training.

The study concluded that better support to overcome difficulties inherent to migration and career change would result in better trained and acculturated doctors. The GMC has recently undertaken some work in this area and developed a ‘Welcome to UK Practice’ programme to raise awareness about practice in the UK. (52)

In contrast to Vaughan, who cautioned against homophily, (19) a presenter at the RCPsych conference (2014) encouraged IMGs to join and become active in diaspora organisations, thereby familiarising themselves with working in the NHS and broadening their network of professional contacts. (53)

The RCPsych convened a conference in 2014 to focus on familiarising IMGs with working in psychiatry in the UK. The conference was organised in recognition “that IMGs face more problems than British graduates in succeeding in the system.” (53) The college is keen to support IMGs by commissioning an external review of the MRCPsych exam and ARCPs and appointing an Associate Dean for Trainee Support.

Feedback from the delegates was positive and the college plans to run another in 2015. Delegates recognised the importance of trainers and the role of employers in developing meaningful induction programmes and giving IMGs additional support and remediation if required. Among the recommendations proposed at the conference were that the College appoint local and national IMG Champions and improve examiner training to help recognise unconscious biases (accents, manner etc.).

IMG is a category that needs to be problematized and properly defined.

The literature identifies IMGs as increasingly important internationally to the delivery of healthcare.

IMGs are noted for their persistence and tenacity in pursuing postgraduate qualification.

IMGs face the inherent difficulties of migration and acculturation. These include language, accessing information, financial support and limited knowledge of the healthcare system.


Language

A number of studies discussed language either as a sole focus or as part of a number of compounding factors. Woolf’s longitudinal study using exam data and questionnaires over two consecutive year 5 cohorts (n = 703: 51% minority ethnic) found that speaking English as a first language, with one parent also speaking English as a first language and being schooled in the UK, was a predictor of good performance in final year UCL medical students. However, not having this level of English was not the reason why minority ethnic students underperformed. She suggests that where examinations like the OSCE require communication skills “country of schooling could be a proxy for communication or cultural differences.”(20)

This finding concurs with those of Watmough (21), discussed above, who was also unable to identify language as a determining factor in success in the RCA postgraduate examination.

The most significant study exploring language and cultural factors was undertaken by Roberts and funded by the Economic and Social Research Council (ESRC). (3) This study used a sociolinguistic methodology to examine not only how candidates performed in the RCGP exam but also how the specific conditions of the exam operated to determine behaviour.

In specific relation to the CSA, but with wider implications for other practical exams in both undergraduate and postgraduate contexts, Roberts’ study found the “relatively decontextualized nature of the CSA made it a ‘talk-heavy’ assessment from which a number of effects flow”. These include “communicative performance factors’ which relate to how IMGs talk and interact with role playing patients, examiner perceptions of candidates sounding formulaic and not engaging with the patient through a patient centred model.” The researchers suggest that the sociolinguistic “fingerprint” of the exam which assumes a patient centred approach could constitute a “hidden curriculum.”(3)

The study concludes that “Rather than talk of ‘cultural bias’ or not, there needs to be a debate about tolerances and communicative flexibility, about what are acceptable competencies in an increasingly diverse society and how, within these competencies, talk and interaction can be more explicitly addressed. ‘Cultural bias’ implies that there is a goal of neutrality that must be reached and that there is one ‘culture’, one way of doing things.”(3)

Memon argues that oral examination is an important element of postgraduate examinations, but ensuring its reliability and validity across specialisms is complex to design and implement. (35) Memon cites the work commissioned by the RCGP in this area of postgraduate examination as an example of good practice in providing an evidence base for the validity and reliability of the oral elements of their exam. Memon cautions that IMGs taking exams in other specialities may be disadvantaged if their English is less fluent and articulate than UK trained candidates.

Knight, an MRCGP examiner, argues in an editorial piece that while there is evidence that the MRCGP is reliable, IMGs are prone to failure because the exam is in English and they spend much of their practice consulting in other languages. (36) Aside from language, Knight also cites other factors that may affect IMG success in the MRCGP, including clinical environments in the UK that differ from the one in which they trained.(36) Knight and Roberts both identify the failure to acknowledge or assess multilingual expertise, which both see as an asset in an increasingly diverse UK society.

The specialities with the highest proportion of IMG candidates are the MRCGP and the MRCP (particularly psychiatry). (45) These specialities require significant levels of cultural awareness and advanced communication skills, both of which may place IMG students at a disadvantage. (17)

Issues around IMG students and language are not unique to the UK, but also evident in other countries where there are minority groups. (54)

While language may be a predictor of good performance it is not, of itself, the reason why students fail.

Language is often conflated with sociolinguistic performance.

There is currently no acknowledgement or assessment of multilingual expertise.


Gender

Two papers (both American) compared female attainment against male attainment in obstetrics and gynaecology (Obs/Gyn) (55, 56). Both studies conclude that women outperformed men in the Obs/Gyn specialism.

Bibbo’s study found that men outperformed women on the pre-clerkship measures (MCAT), but women outperformed men on the overall clerkship scores. This was due to women’s higher achievement on the standardised National Board of Medical Examiners (NBME) subject examination. Drawing on other literature, a number of proposals were made as to why this might be the case, including men being less interested in the specialism and consequently less motivated, combined with the perception that patients prefer a female physician. Women, in contrast, are potentially more motivated because they want to enter this specialism due to gender identification and the dominance of women already in the field.(55)

Cuddy’s study on examinee gender and United States Medical Licensing Examination (USMLE) performance also found men outperforming women at Clinical Knowledge (CK) Step 1 of the exam, but women outperforming men at CK Step 2 (clinical skills) and in most content areas of Obs/Gyn, paediatrics and psychiatry. In contrast, men outperformed women in medicine, surgery and preventative medicine.(56)

In a study of 2474 Norwegian residents who began specialization in 1999-2001 (36), Johannsen found that although women progressed more slowly than men, the gender variation was not significant when the effects of childbirth and having children under 18 were controlled for. However, gender was found to have a strong influence on choice of speciality due to longer required working hours, for example in emergency services.

In combination, these studies identify a gender split in specialisms, for example the dominance of women in Obs/Gyn.

Identified gender differences in exam performance may potentially be linked to gender motivation to succeed in specific specialisms and/or gender identification with certain specialisms.

Studies suggest that changes to the hospital environment, working practices and cultures could encourage a more even gender split across the specialities.

9.2 The institutional

The Medical School and the working environment

In a Norwegian study (36), Johannsen looked at hospital-specific factors in speciality choice and qualification. The study found that hospital factors were significant predictors of the participants’ (n = 2474) timely attainment of specialization. Working at university hospitals (regional) or central hospitals was associated with a reduction in the time taken to complete the specialization, “whereas an increased patient load and less supervision had the opposite effect.” Johannsen’s study suggested that more flexibility in the curriculum would be beneficial.

Illing, using quantitative and qualitative data, describes how senior overseas doctors who come to the UK with established clinical practices may find adapting to a different workplace culture difficult and not have access to the support available to less experienced doctors.(2) IMGs may also find difficulties understanding roles and responsibilities in the NHS structure in addition to patient-centred culture and a holistic model of care. (2)

Two studies identified a need for a greater emphasis on Equality and Diversity and cultural awareness in training within organisations with targeted events and diversity initiatives used as opportunities. (3, 4)

As part of McManus’s data linkage study into PLAB and UK graduates’ performance on MRCP(UK) and MRCGP examinations, a comparison between graduates from different UK medical schools was performed. (7) The study found “clear and large differences in performance at MRCP(UK) between graduates of different medical schools.” (7) However, the study concluded that the identified differences in training could not account for the poorer performance of IMGs.

Esmail advocates examining the distribution of IMGs and BME doctors across UK medical schools in order to ascertain if the selection and training placement processes could operate against the interests of weaker candidates, thus encouraging a cycle of educational deprivation. (10) This observation is supported by Tiffin. (23)

Mentoring

There is a significant body of literature around mentoring for medical students and doctors at all levels of study, with the majority of studies being undertaken in the USA. (18) Frei’s review concludes that mentoring is “an important career advancement tool for medical students” and that more programmes should be set up in Europe, but monitored and assessed for impact.(18)

In terms of mentoring in the context of postgraduate medical training, the literature is not well developed, although there is support from the Royal Colleges and the NHS generally. (57) (58) Stamm’s study examining mentoring as part of a developmental network, set in Switzerland, found that only 50% of doctors undergoing specialist training (n = 326) took advantage of mentoring despite the positive benefits identified and that, of those, females received less mentoring than their male colleagues. Reasons for this gender gap were identified as primarily due to extraprofessional concerns. Stamm concludes that, given the often less straightforward career path for females, mentoring is particularly important.(59)

Steven’s qualitative study over six NHS sites identified benefits across the professional-personal interface. Steven suggests that successful mentoring makes doctors feel more confident and satisfied in their work, and this will have beneficial impacts for organizations. (60)

Selection

The literature on postgraduate selection is less developed than that on undergraduate selection into medical school. The UK general practitioner selection process uses a national machine-markable shortlisting test to assess both cognitive and non-cognitive skills and a ‘corporately owned’ and validated selection methodology. Plint (42) summarises the success of the process and the confidence in it from both students and deaneries as: “Corporate commitment to national process; legitimate authority and locus of control; process of incremental convergence, rather than imposition; development and adoption of validated selection method; representative infrastructure operating the process.”

McManus undertook a significant study examining the educational background and qualifications of UK students from ethnic minorities and their selection for medical school. (47) The study addressed the assumption that entrants to medical school are equivalent in their academic ability and that, following on from this, differential attainment in undergraduate medical exams and beyond is accounted for at some point after selection to medical school. The study found, however, that non-white students had slightly lower A level and GCSE grades than their white peers. It concluded that while GCSE and A level grades might explain some of the effects found, they could not entirely explain the poorer performance of non-white students at medical school and beyond.

Citing the GP selection process as an example, Patterson recommends a more robust selection process.(4)

Understanding workplace culture is important.

Mentoring programmes are beneficial but they need to be robust and evaluated to ensure they are effective.

The literature on postgraduate selection is less developed than that of undergraduate selection into medical school.

Prior attainment could not entirely explain the poorer performance of non-white students at medical school and beyond.

Selection processes for postgraduate study are highly variable.

9.3 Policy

Predictors of success at postgraduate level

Woloschuk’s small-scale study (n = 244 medical graduates) at the University of Calgary, Canada, found that measures of undergraduate performance seemed to be poor predictors of postgraduate success. In particular they found a ‘weak’ relationship between performance in the Medical Council of Canada (MCC) national licensing exam, which they describe as a “rite of passage” to postgraduate training, and residency. They suggest that success may be due to non-cognitive attributes, for example work ethic, personality and motivation (61).

PMQ

In a high-quality comparative study of UK trained doctors and those whose PMQ was gained outside the UK sitting the RCA exam from 1999-2008, Watmough (62) found that candidates from Egypt, Iraq, Ireland and Pakistan performed significantly worse than those from Australia, New Zealand, South Africa, Zimbabwe and the UK. From June 1990 to February 2008, there were 9,315 attempts at the MCQ by 5,797 graduates from 70 countries, with 25 countries having candidates who made 15 or more attempts. The analysis was undertaken using data from the written part of the exam, which uses multiple choice questions to test a range of generic clinical skills. The MCQ is a high stakes exam essential for career progression to consultant level. The study did not find a coherent pattern to attainment and concluded that “some IMG graduates who sit UK postgraduate exams may require additional support prior to taking the exam.” Importantly, the underperformance of students from Ireland and Pakistan, where English is the main language in medical education, indicates that language is not a key factor in differential attainment in this exam. The authors suggest that, rather than language, it may be that cultural ties ease the transition of working in the UK; however, the poor performance of candidates from the Republic of Ireland casts doubt on this supposition.

High stakes examinations

High stakes exams all contain a number of components: assessment of practical skills using ‘real’ or simulated clinical scenarios, multiple choice, written and oral. Different elements of the exam are marked in different ways: by computer, by examiner and by assessment of skills.

It is important that the transparency and fairness of ‘high stakes’ exams be demonstrated, given the influence they have on a doctor’s career progression and employment opportunities. Memon, in relation to the specifics of oral examination in postgraduate examinations, argued for the Royal Colleges to undertake much more rigorous validity and reliability testing on their high stakes exams. (63)

Wakeford’s large scale assessment of validity and differential performance by ethnicity in the RCGP and MRCP(UK) examinations followed in the wake of a Judicial Review.(24) It sought to evaluate if the performance of candidates in the MRCP(UK) was predictive of their attainment in the MRCGP (usually taken 3-4 years after). The study found substantial correlations between a candidate’s performance in the two exams, which provides support for the validity of each. (24)

Wakeford identified a higher correlation between PACES and the new CSA than the old, suggesting that the new CSA is a more valid assessment. (24) In addition, the study found that BME candidates in particular showed a higher correlation between PACES and the CSA than white students, “suggesting that there is less extraneous variance between BME candidates making it a more valid assessment.”(24)

MRCGP and MRCGP Clinical Skills Assessment (CSA)

The CSA exam was revised in the autumn of 2010 to improve the reliability of the assessment. In their key study, using previously unavailable data from the GMC and the RCGP, Esmail and Roberts examined ethnic minority candidates’ performance in the MRCGP exams between 2010 and 2012, thereby testing the new CSA.(28)

The headline conclusion was that “subjective bias due to racial discrimination in the clinical skills assessment may be a cause of failure for UK trained graduates and international medical graduates.”(10)

Discrimination is an incendiary term. Judith Hawkins, writing in Mancunian Matters, explained the format of the CSA to its non-medical audience and outlined Esmail’s findings. The article stimulated a heated online debate among readers who were only too willing to support claims of racial discrimination.(64)

Esmail and Roberts suggested that the different training experience and other cultural factors (patient/doctor relationship and proficiency in spoken English, for example) between UK and non-UK trained candidates could affect exam outcomes. However, they did not consider that these cultural factors could entirely account for differential attainment between white and BME UK trained candidates. It was suggested that discrimination could occur at a number of points in the CSA: in the behaviour of standardised patients towards white and non-white candidates, and in bias on the part of the examiners.(10)

McManus’s study (7) leads on from this study by Esmail and Roberts (10), although there are significant differences between the two: McManus’s study analyses PLAB part 1 (which Esmail does not) and it analyses a larger dataset (n = 7,829) from MRCP(UK), compared with Esmail and Roberts’ dataset (n = 5,055 candidates + 1,175 not trained in the UK). Both studies use candidates’ marks at 1st attempt for all analysis. McManus’s study found that IMGs’ lower performance was “unlikely to result from systemic examiner bias or discrimination.”

Knight states in an opinion piece that, while there is evidence that the MRCGP is reliable, IMGs are prone to failure because the exam is in English and they spend much of their practice consulting in other languages. (15)

The CSA exam was revised in the autumn of 2010: the new CSA has been found not to discriminate between white and BME candidates. (24) However, the CSA will inevitably carry implicit cultural associations specific to UK medicine. Esmail states that the CSA is not, and was not intended to be, a culturally neutral exam. Therefore UK graduates are likely to be initially more successful, because they are acculturated. (28)

The CSA was consistently identified in the medical news media as a particular issue for IMGs. Commentary was prompted by both Esmail’s study (65, 66) and the judicial review. (67) (68)

MRCOG

The MRCOG is an internationally recognised standard and, at the time of Rushd’s study, more than 85% of the total candidates were IMGs. The study found that MRCOG examination success rates were significantly different according to the university of medical graduation. Rushd also identified a variation in performance among graduates from different medical schools in Parts 1 and 2 of the MRCOG written examination which was comparable to those schools’ performance on the MRCP(UK). (46)

PLAB and IELTS

If IMGs are going to sit the PLAB, they need to demonstrate that they have achieved an acceptable level of English via IELTS in the previous two years. PLAB was reviewed in 2011 to assess whether the knowledge and skills demonstrated by passing the PLAB continued to be equivalent to those demonstrated by an F1 doctor. A key component of this review was to examine any disparity between IMGs who successfully passed the PLAB test and their UK graduate peers in postgraduate examinations.(7) Aside from difficulties relating to direct comparison, the study concludes that there are good correlations between PLAB and the MRCP(UK) and MRCGP, which means that PLAB is a valid assessment of skills relevant to progression during UK postgraduate training. It should be noted, however, that PLAB is not designed to predict postgraduate exam performance, or to ensure that those passing PLAB can achieve at postgraduate level.

In order to produce outcome equivalence between IMG and white graduates, McManus suggested that the PLAB pass mark could be set higher; however, this would have significant impacts on health service delivery.(7)

Tiffin’s study of UK-based trainee doctors with at least one competency-related ARCP outcome in the study period (n = 53,463, of whom 11,419 were IMGs registered following a pass via the PLAB route) also found that the PLAB test was not generally equivalent to the requirements for UK graduates, and that the standard of English competency and the PLAB pass mark would need to be raised to ensure equity.(23) Tiffin also discusses how PLAB candidates with lower scores may not be able to secure a post in their preferred specialism and therefore successfully apply for “shortage specialities” like psychiatry and general practice. Given that these specialisms require enhanced communication skills, some IMGs may immediately be disadvantaged. (23)

Sandhu notes that the requirement to pass these exams cuts into IMGs’ time and can erode the time available for research. As a result, IMGs’ CVs may be weak in publications, which can affect their chances of being shortlisted for jobs in spite of their clinical experience.(50)

Much of the research into differential attainment is quantitative with a focus on testing for bias in the exam, or a component part of it using exam data. Taken as a whole these studies have broadly demonstrated the validity of some high stakes exams and discounted evidence of bias in the exams themselves leading to differential attainment. This view is endorsed by Patterson with the caveat that it is not an endorsement of all assessment tools.(14)

Examiner bias

Examiner bias in relation to examinations like the MRCGP and the MRCP, in which candidates are judged ‘live’ and examiners can therefore identify a candidate’s gender and ethnicity, has been frequently questioned (11) (8, 9).

Examiner bias is a potential risk in any examination and a threat to the validity of an examination. The first study in this area, by Dewhurst (9), focused on the MRCP(UK) and found that any potential examiner prejudice was only significant when two non-white examiners examined a non-white candidate. This, Dewhurst suggested, was not conscious and may relate to a consistency in communication style and cultural understanding.

McManus’s investigation into possible bias as a threat to validity used data from MRCP(UK) PACES and nPACES examinations.(8) The study found that having two independent examiners reduced any potential for bias and judged it a preferable method of assessment over a single examiner. This is an example of how the infrastructure around an exam can potentially impact on the outcome for the candidate.

Denney’s study, which investigated potential examiner bias as responsible for differential attainment in the MRCGP CSA, found no evidence to support examiner bias. It found that differential attainment was linked to the candidates’ demographic rather than the examiners’. (11)

In a letter to the BMJ, Shaw opined that the high failure rate of ethnic minority candidates in the new CSA was an unintended consequence of selection and examination rather than examiner bias, with the higher failure rate due to the combination of a disincentive for medical schools to increase trainee numbers and the raised standards of English required by the new exam. (69)

The BMA suggests that considerable variability between Royal Colleges in the ways in which they monitor for protected characteristics in exam candidates could be a potential contributor to unfair bias and differential attainment in specialist examinations. (70) This was endorsed in BMJ Careers. (13)

In addition, the BMA proposes that colleges should annually monitor the diversity of examiners and actors and cross-reference this data with individual candidates’ performance.

Much of the research into differential attainment is quantitative with a focus on testing for bias in the exam, or a component part of it using exam data.

Taken as a whole these studies have broadly demonstrated the validity of high stakes exams and discounted evidence of bias in the exams themselves leading to differential attainment.

It is important to acknowledge how the infrastructure around high stakes exams may also lead to bias and/or differential attainment.

10 Discussion

In this discussion section we return to the research question and structure the discussion around the three key areas of interest: potential causes of differential attainment identified in the literature, ways in which differential attainment has been researched and, finally, potential interventions. This segmentation, it should be stressed, is to an extent artificial since the three elements are interrelated. The identification of possible causes may suggest ways of researching, from which interventions can be developed; causes may suggest interventions; specific research methodologies may identify possible causes; and so on.

10.1 Causes

In the AoMRC statement of principles there is a recognition that differential attainment does not in itself demonstrate that the exam, curricula content, process or delivery is discriminatory.(71) This document cites the GMC’s independently commissioned report, which found that “the method of assessment was not the reason for the differential outcomes.” (28)

Studies are tending to move away from the analysis of exam data and towards less tangible educational and social factors as contributing to performance. The general point to draw from this development of research over time, is not to just look at results but look at the ‘whole’ of the exam and its candidature. This includes both the pragmatic decisions made by candidates (why they chose a specialism, for example), and the factors beyond the assessment process, for example the way Royal Colleges record protected characteristics data and the impact this may have on demographic analysis.(70)

Highly variable selection and induction processes for postgraduate study in the UK have been identified. The RCGP selection process is well developed and can be robustly defended. However, poor induction and lack of support for IMGs in overcoming the difficulties inherent to migration and career change may disadvantage IMGs in becoming better trained and acculturated doctors. The role of trainers and employers is potentially pivotal to supporting IMGs. In particular, identifying the culturally specific needs of IMGs will help them adjust to the realities of practice in their adopted country.

Language as a factor has been emphasised across a number of studies. However, quite what is meant by language is not always clear. Although this is not stated explicitly, ‘language’ is a category of explanation that includes more than just words; it is often conflated with ‘communication’ in the context of the CSA. One critique of the CSA and other similar assessments is that they carry with them implicit cultural assumptions, and that lack of acculturation will affect performance and ultimately attainment, even if clinical skills are at the expected standard.

Differential attainment is not just about ‘failure’; understanding why students succeed is also important and could provide pointers to factors that need to be encouraged or facilitated in order to improve student performance.

From the papers found, factors leading to success seem to be international, but with so few studies the results are not generalizable. Factors identified might therefore be unique to given student communities, and a UK-based study could be helpful because looking only at why certain students might fail tells only half the story. The information gained from such a study could be used by medical schools to inform both their teaching and their widening participation initiatives.

10.2 Ways of researching

Patterson suggests that more open dialogue between stakeholders is essential if the complexity of differential attainment is to be fully understood and addressed. (14) She suggests that more innovative research approaches are needed, including longitudinal tracking, interdisciplinary research to provide fresh perspectives and the development of appropriate theoretical frameworks. In particular she advocates detailed case studies of “outliers” as a way of identifying facilitators to success.(14) Woolf’s exploration of stereotype threat (26) is an example of research that adopts a fresh perspective by drawing on other disciplines. The literature strongly points towards interdisciplinary research as the future direction research will need to take in order to examine the complexity of differential attainment.

In this review we have discussed studies that have used a number of different research methods. The majority of studies included in the synthesis have used quantitative analysis (n = 25), but often recommended additional qualitative research. For example using qualitative analysis to drill down into Woolf’s findings (20) would be helpful in identifying where tailored support may be efficacious.

With the recognition that ‘causes’ are complex, it becomes appropriate to use qualitative research approaches, which are traditionally favoured when the main research objective is to improve our understanding of a complex phenomenon embedded in its context. Qualitative research is expensive and labour intensive, so much of it is small scale and local in nature.

Drawing on a wider qualitative literature in areas like ‘diversity’ could provide methodological and empirical insights. We have not identified any studies that draw on a wider literature. For example, medical education could learn much from the wider literature on learning styles, which could inform the current significant gap in knowledge. The wider educational literature on mentoring and other educational interventions, for example, would be a way of informing potential interventions.

A significant issue across the research is the lack of transparency around the categories of explanation. The category BME, for example, needs interrogating. Denney’s research critiques the conflation of BME into one category in research, suggesting qualitative research is required in order to “understand the detailed genesis of performance differences.” (11)

Ethnic and ethnicity are similarly ambiguous terms: interventions to address the differential attainment ‘problem’ cannot be developed without first recognising the ‘problem’ of definition and classification within BME and IMG across medical school and Royal College data and research.

Differential attainment is of course not confined to ethnicity, but this is the theme that has dominated the research. We found limited research into differential attainment in relation to gender and some in relation to age, the latter as an additional demographic category rather than the focus of any studies. That women outperform men in undergraduate education and continue to do so in certain specialisms at postgraduate level seems to be a given. We found no studies examining the contributors to the success of these female students aside from their gender in itself as a predictor.

10.3 Possible interventions

The review found only two examples evaluating specific interventions tailored for postgraduate study. Stamm’s longitudinal quantitative study of postgraduate Swiss medical students found that mentoring had a positive impact. (37) The study was limited by relying on self-reported data and the limited age range of participants. The second study by Plint discusses the GP recruitment process as an example of best practice selection process. (6)

As researchers in undergraduate selection have long recognised, there is a correlation between selection and subsequent performance, but there is little research in the postgraduate context. Anecdotally we know that medical schools and employing organisations implement interventions to support students, doctors, trainers and the workforce as a whole. However, we found scant evidence of this in the literature. Individual medical schools are probably doing excellent work but this is likely to remain undisseminated in the academic literature, because it is related to teaching practice and will not be available for public access through the internet. This is a key finding in its own right, raising questions about how this information can be found and assessed for impact.

The studies show that there are a variety of issues that affect the performance and attainment of students. These include issues around the background and characteristics of the individuals, the stage they are at in their medical career and the medical school or workplace environment. These might have cumulative effects over time or ‘one-off’ effects at certain stages of their career.

We found very few studies examining interventions, and those that we found are not robust enough to demonstrate clear evidence of a definitive intervention aside from mentoring.

However, we can identify some ‘trends’ in the literature that can inform the ‘look’ of the overall structure of an intervention to improve student performance and/or attainment.

Due to the variety of identified factors affecting performance and attainment, it is more likely that a complex intervention needs to be designed rather than a simple intervention in order to address the complexity of differential attainment. The first question in designing an intervention would relate to the level at which the intervention is required: at the individual level, the institutional level, a broader policy level, or a complex intervention with components on each level.

It is important to consider any intervention in the light of all levels. For example, raising the PLAB 1 and 2 pass marks (policy level) would provide an equivalence of performance with UK medical graduates (23) and candidates for the MRCP(UK) and MRCGP (7), but it would reduce attainment, with workforce implications.

An example of an intervention initiated at the policy level but operationalised at the institutional level to benefit individually identified learning needs was the proposed appointment of local and national IMG Champions and improved examiner training to help recognise unconscious biases (accents, manner etc). (53) This recommendation was developed through a conference workshop at the 2014 RCPsych conference devoted to IMGs.

The literature identifies that much of the support needed for IMGs, at least initially, is practical. One key area identified is the information available prior to arrival, appropriate induction and ongoing support. The undergraduate literature is well developed in this area but there should not be an uncritical translation of undergraduate processes to the postgraduate context since the postgraduate experience is much less structured and the curriculum more fragmented.

Development of intercultural competence is deemed essential for successful communication. Patterson identifies one of the most challenging aspects as being the “ability to distinguish between idiosyncratic and culturally conditioned behaviours.” (4) The RCPsych conference is an example of a targeted event (policy level) to examine what support IMGs need, but there are doubtless other targeted events and interventions at the institutional level as part of internal diversity initiatives. However, there is no pool of best practice to help develop networks.

The identification in the literature of the complexities of differential attainment strongly suggests that students need to be supported on an individual level, and this may require significant changes at the institutional level, including greater flexibility in the curriculum and the acknowledgement that some students will take longer than others to reach a stage where they can be confident of passing a given exam, rather than making multiple attempts.

11 Conclusions (the story so far)

This rapid review, conducted over three and a half months, has identified the multifactorial nature of differential attainment in postgraduate medical education in the UK.

The review has identified a narrative of research interests and methods that have developed through the quantitative analysis of exam data, with the aim of locating bias, towards a more nuanced research approach looking at both educational and social factors as potential contributors (rather than single causes) to differential attainment. The development of the research over time shows us that researchers need to understand the micro, meso and the macro-structure of medical education in order to understand differential attainment. The literature strongly points towards interdisciplinary research as a future direction research will need to take in order to examine the complexity of differential attainment.

Interventions will have cost implications, yet we found no examples of cost-benefit analysis in the literature. This is an important omission: cost-benefit analysis should be built into any evaluation of an intervention, along with a rigorous analysis of impact.

The literature clearly identifies the growing importance of IMGs internationally to the delivery of healthcare and the increasing globalization of the medical workforce. Rushd suggests that the introduction of e-learning modules for trainers and trainees may go some way towards developing parity across geographic regions, (46) while Roberts discusses the need for a debate about tolerances and communicative flexibility, and about what are acceptable competencies. (3) Certainly a better understanding of the challenges of transition faced by individuals entering the UK workplace could inform interventions at individual, institutional and policy levels.

IMGs are part of an implied movement towards ‘global standards’ for medical exams as medicine becomes internationalized. This is potentially the next chapter in the story and ensures that differential attainment will not only remain firmly on the agenda but potentially become central to this emerging discourse and a driver for change.

12 References

1. Popay J, Roberts H, Sowden A, Petticrew M, Arai L, Rodgers M. Guidelines on the conduct of narrative synthesis in systematic reviews. 2006.
2. Illing J. The experiences of UK, EU and non-EU medical graduates in the transition to the UK workplace. 2009.
3. Roberts Celia, Atkins Sarah, Hawthorne Kamila. Performance features in clinical skills assessment: Linguistic and cultural factors in the Membership of the Royal College of General Practitioners examination. London: Centre for Language, Discourse & Communication, King's College London; 2014.
4. Patterson Fiona, La-Band Analise, Koczwara Anna, Spicer John. GP National Selection Process: Equalities Impact. 2012.
5. Woolf K, Potts HWW, McManus IC. Ethnicity and academic performance in UK trained doctors and medical students: systematic review and meta-analysis. BMJ (Clinical Research Ed). 2011;342:d901-d.
6. Plint S, Patterson F. Identifying critical success factors for designing selection processes into postgraduate specialty training: the case of UK general practice. Postgraduate medical journal. 2010;86(1016):323-7.
7. McManus I C, Wakeford R. PLAB and UK graduates' performance on MRCP(UK) and MRCGP examinations: data linkage study. BMJ. 2014;348:g2612.
8. McManus C, Elder A, Dacre J. Investigating possible ethnicity and sex bias in clinical examiners: An analysis of data from the MRCP(UK) PACES and nPACES examinations. BMC Medical Education. 2013;13.
9. Dewhurst Neil, McManus C, Mollon J, Dacre J, Vale A. Performance in the MRCP(UK) Examination 2003-4: analysis of pass rates of UK graduates in relation to self-declared ethnicity and gender. BMC Medicine. 2007;5(8).
10. Esmail A, Roberts C. Academic performance of ethnic minority candidates and discrimination in the MRCGP examinations between 2010 and 2012: analysis of data. BMJ (Clinical Research Ed). 2013;347:f5662-f.
11. Denney ML, Freeman A, Wakeford R. MRCGP CSA: are the examiners biased, favouring their own by sex, ethnicity, and degree source? The British journal of general practice : the journal of the Royal College of General Practitioners. 2013;63(616):e718-25.
12. Hawtin KE, Williams HR, McKnight L, Booth TC. Performance in the FRCR (UK) Part 2B examination: analysis of factors associated with success. Clinical radiology. 2014;69(7):750-7.
13. Rimmer A. Royal colleges must improve data on diversity of exam candidates BMA says. BMJ Careers. 23 January 2014.
14. Patterson Fiona, Denney Mei-Ling, Wakeford R, Good D. Fair and equal assessment in postgraduate training? British Journal of General Practice. 2011:712-3.
15. Knight RA. Reasons why doctors who perform well as doctors may fail the MRCGP clinical skills assessment exam. BMJ (Clinical Research Ed). 2013;347:f6438-f.
16. Huijskens EG, Hooshiaran A, Scherpbier A, van der Horst F. Barriers and facilitating factors in the professional careers of international medical graduates. Med Educ. 2010;44(8):795-804.
17. Farrokhi-Khajeh-Pasha Y, Nedjat S, Mohammadi A, Malakan Rad E, Majdzadeh R. Informed choice of entering medical school and academic success in Iranian medical students. Medical Teacher. 2014;36(11):978-82.

18. Frei E, Stamm M, Buddeberg-Fischer B. Mentoring programs for medical students - a review of the PubMed literature 2000 - 2008. BMC Medical Education. 2010;10(1):32.
19. Vaughan S, Sanders T, Crossley N, O'Neill P, Wass V. Bridging the gap: the roles of social capital and ethnicity in medical student achievement. Medical Education. 2015;49(1):114-23.
20. Woolf K, McManus IC, Potts HWW, Dacre J. The mediators of minority ethnic underperformance in final medical school examinations. The British Journal Of Educational Psychology. 2013;83(Pt 1):135-59.
21. Abdulghani HM, Al-Drees AA, Khalil MS, Ahmad F, Ponnamperuma GG, Amin Z. What factors determine academic achievement in high achieving undergraduate medical students? A qualitative study. Medical Teacher. 2014;36 Suppl 1:S43-S8.
22. Thomas B, Manusov EG, Wang A, Livingston H. Contributors of black men's success in admission to and graduation from medical school. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2011;86(7):892-900.
23. Tiffin PA, Illing J, Kasim AS, McLachlan JC. Annual Review of Competence Progression (ARCP) performance of doctors who passed Professional and Linguistic Assessments Board (PLAB) tests compared with UK medical graduates: National data linkage study. BMJ: British Medical Journal. 2014;348.
24. Wakeford R, Denney M, Ludka-Stempien K, Dacre J, McManus I C. Cross-comparison of MRCGP & MRCP(UK) in a database linkage study of 2,284 candidates taking both examinations: assessment of validity and differential performance by ethnicity. BMC Medical Education. 2015;15:1.
25. Bowhay AR, Watmough SD. An evaluation of the performance in the UK Royal College of Anaesthetists primary examination by UK medical school and gender. BMC Med Educ. 2009;9:38.
26. Woolf K, Cave J, Greenhalgh T, Dacre J. Ethnic stereotypes and the underachievement of UK medical students from ethnic minorities: qualitative study. 2008.
27. HEFCE. Student ethnicity: experiences in full-time, first degree study. http://www.hefce.ac.uk/media/hefce1/pubs/hefce/2010/1013/10_13.pdf 2010.
28. Esmail A, Roberts C. Independent Review of the Membership of the Royal College of General Practitioners (MRCGP) examination. General Medical Council, 2013.
29. General Medical Council. Interactive reports to investigate factors that affect progression of doctors in training. http://www.gmc-uk.org/education/25495.asp. March 2015.
30. Woolf K, Potts HWW, McManus IC. Ethnicity and academic performance in UK trained doctors and medical students: systematic review and meta-analysis. 2011.
31. Haq I, Higham J, Morris R, Dacre J. Effect of ethnicity and gender on performance in undergraduate medical examinations. Med Educ. 2005;39:1126-28.
32. Lavis J, Davies H, Oxman A, Denis J, Golden-Biddle K, Ferlie E. Towards systematic reviews that inform health care management and policy making. Journal of Health Services Research & Policy. 2005;10(1):35-48.
33. Oliver S, Harden A, Rees R, Shepherd J, Brunton G, Garcia J, et al. An Emerging Framework for Including Different Types of Evidence in Systematic Reviews for Public Policy. Evaluation. 2005;11(4):428-46.
34. Ganann R, Ciliska D, Thomas H. Expediting systematic reviews: Methods and implications of rapid reviews. Implementation Science. 2010;5(56).

35. Watt A, Cameron A, Sturm L, Lathlean T, Babidge W, Blamey S, et al. Rapid reviews versus full systematic reviews: An inventory of current methods and practice in health technology assessment. International Journal of Technology Assessment in Health Care. 2008;24(02):133-9.
36. Johannessen K-A, Hagen TP. Individual and hospital-specific factors influencing medical graduates' time to medical specialization. Social Science & Medicine (1982). 2013;97:170-5.
37. Stamm M, Buddeberg-Fischer B. The impact of mentoring during postgraduate training on doctors’ career success. Medical Education. 2011;45(5):488-96.
38. Academy of Medical Royal Colleges. Academy of Medical Royal Colleges Review 2013-2014. 2014.
39. Schrewe B, Frost H. Finding potential in balance: navigating the competing discourses of diversity and standardization. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2012;87(11):1479-.
40. Marton F, Säljö R. On qualitative differences in learning: I - Outcome and Process. British Journal of Educational Psychology. 1976;46:4-11.
41. Biggs J. Teaching for Quality Learning at University: SRHE and Open University Press; 1999.
42. Moore J, Sanders J, Higham L. Literature review of research into widening participation to higher education: Report to HEFCE and OFFA by ARC Network. London: 2013.
43. Costa P T, McCrae R R. Revised NEO personality inventory (NEO-PI-R) and NEO five-factor inventory (NEO-FFI) professional manual. Odessa: Psychological Assessment Resources; 1992.
44. Feedback from the HEA ECU and HEFCE sponsored summit, editor. Supporting black and minority ethnic student success in higher education - narrowing the gap. 2012.
45. Singh G. A Synthesis of Research Evidence. Black and minority ethnic (BME) students' participation in higher education: Improving retention and success. 2012.
46. Rushd S, Landau A B, Lindow S W. An evaluation of the first time performance of international medical graduates in the MRCOG Part 1 and Part 2 written examinations. European Journal of Obstetrics & Gynecology and Reproductive Biology. 2013;166:124-6.
47. McManus IC, Woolf K, Dacre J. The educational background and qualifications of UK medical students from ethnic minorities. BMC Med Educ. 2008;8:21.
48. Dillner L. Manchester tackles failure rate of Asian students. BMJ. 1995;310:209.
49. Burgess DJ, Joseph A, van Ryn M, Carnes M. Does stereotype threat affect women in academic medicine? Academic Medicine: Journal Of The Association Of American Medical Colleges. 2012;87(4):506-12.
50. Sandhu DP. Current dilemmas in overseas doctors' training. Postgraduate medical journal. 2005;81(952):79-82.
51. Jolly P, Boulet J, Garrison G, Signer MM. Participation in U.S. graduate medical education by graduates of international medical schools. Academic medicine : journal of the Association of American Medical Colleges. 2011;86(5):559-64.
52. General Medical Council. Welcome to Practice. 2014.
53. Al-Taiar Hassan, Menzies Alexandra. Report on the RCPsych IMG Conference 2014: Royal College of Psychiatrists; 2014.
54. Rimmer A. RCP is to highlight gap in performance between overseas doctors and UK graduates. BMJ Careers. 02 December 2014.
55. Bibbo C, Bustamante A, Wang L, Friedman F, Jr., Chen KT. Toward a Better Understanding of Gender-Based Performance in the Obstetrics and Gynecology Clerkship: Women Outscore Men on the NBME Subject Examination at One Medical School. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2014.
56. Cuddy Monica, Swanson David, Clauser Brian. A Multilevel Analysis of the Relationships between Examinee Gender and United States Medical Licensing Exam (USMLE) Step 2 CK Content Area Performance. Academic Medicine. 2007;82(10):589-93.
57. Royal College of Obstetricians and Gynaecologists. Mentoring for all: RCOG press; 2005.
58. Department of Health. Mentoring for Doctors: Signposts to current practice for career grade doctors. Guidance from the Doctors' Forum. London: DOH; 2004.
59. Stamm M, Buddeberg-Fischer B. The impact of mentoring during postgraduate training on doctors’ career success. Medical Education. 2011;45(5):488-96.
60. Steven A, Oxley J, Fleming W G. Mentoring for NHS doctors: perceived benefits across the personal-professional interface. Journal of the Royal Society of Medicine. 2008;101:552-7.
61. Woloschuk W, McLaughlin K, Wright B. Is undergraduate performance predictive of postgraduate performance? Teach Learn Med. 2010;22(3):202-4.
62. Watmough S, Bowhay A. An evaluation of the impact of country of primary medical qualification on performance in the UK Royal College of Anaesthetists' examinations. Medical Teacher. 2011;33:938-40.
63. Memon M, Joughin G, Memon B. Oral assessment and postgraduate medical examinations: establishing conditions for validity, reliability and fairness. Advances in Health Sciences Education. 2010;15(2):277-89.
64. Hawkins Judith. Ethnic minority trainee GPs are suffering racial discrimination, claims Manchester uni study. Mancunian Matters. 28 September 2013.
65. Differing pass rates raise concerns about MRCGP exam. BMA. 24 May 2013.
66. Low ethnic minority exam pass rates sparks call for research. BMA. 24 June 2013.
67. Action needed to end college exam disparity. BMA. 11 April 2014.
68. Duffin Christian. RCGP will ensure examiners are 'representative of race and ethnicity'. PULSE. 21 May 2014.
69. Shaw Q. High failure rate of ethnic minority groups in MRCGP exam comes from changes to exam and candidate selection. BMJ (Clinical Research Ed). 2013;347:f6442-f.
70. British Medical Association. Examining Quality: A survey of royal college examinations. Progress review. Equality and Diversity Committee, 2014.
71. Academy of Medical Royal Colleges. Fairness, equality and medical royal college exams: Academy of medical royal colleges statement of principles. 2014.

13 Appendices

Appendix 1: Studies and other documents included in the synthesis

Recorded for each document: Reference / Data Collection (quantitative or qualitative?); Population & Setting (Context); Perspective (the objective and standpoint of the study); Intervention or Test used to evaluate the participants; Outcome (findings); Conclusion; Quality assessment of the study and Generalizability.

Reference: Abdulghani HM, Al-Drees AA, Khalil MS, Ahmad F, Ponnamperuma GG, Amin Z. What factors determine academic achievement in high achieving undergraduate medical students? A qualitative study. Medical Teacher. 2014;36 Suppl 1:S43-S8.
Population & Setting (Context): 10 male and 9 female high achieving (scores more than 85% in all tests) students, from the second, third, fourth and fifth academic years.
Perspective: The aim of this study is to explore the high achieving students’ perceptions of factors contributing to academic achievement.
Intervention or Test: Qualitative study using focus group discussions.
Outcome (findings): Factors influencing high academic achievement include: prioritization of learning, time management, and family support. Management of non-academic issues is also important.
Conclusion: Addressing these factors, which might be unique for a given student community, in a systematic manner would be helpful to improve students’ performance.
Quality assessment and Generalizability: High quality study. Generalizability = indirect.

Reference: Academy of Medical Royal Colleges. Academy of Medical Royal Colleges Review 2013-2014. 2014.
Population & Setting (Context): n/a
Perspective: Academy of Medical Royal Colleges.
Intervention or Test: n/a
Outcome (findings): Paragraph in the review about differential attainment.
Conclusion: AoMRC held a seminar and are co-ordinating work to take initiatives suggested forward.
Quality assessment and Generalizability: n/a

Reference: Academy of Medical Royal Colleges. Fairness, equality and medical royal college exams: Academy of medical royal colleges statement of principles. 2014.
Population & Setting (Context): Royal College examinations.
Perspective: Statement of principles.
Intervention or Test: n/a
Outcome (findings): 7 principles identified.
Conclusion: MRCGP exam is not the reason for differential attainment. Complex and varied factors lead to differential attainment and are not unique to medicine. Colleges must have no factors in their control that contribute to differential attainment.
Quality assessment and Generalizability: n/a

Reference: Al-Taiar Hassan, Menzies Alexandra. Report on the RCPsych IMG Conference 2014. London: Royal College of Psychiatrists; 2014.
Population & Setting (Context): IMGs taking the MRCPsych.
Perspective: Conference addressing the difficulties encountered by IMGs.
Intervention or Test: n/a
Outcome (findings): A number of papers and workshops.
Conclusion: Trainers and employers have key roles in supporting IMGs, including induction, training, supervision and feedback.
Quality assessment and Generalizability: n/a

Reference: BAPIO. Special Edition BAPIO conference. Sushruta. 2014;7(1).
Population & Setting (Context): BAPIO.
Perspective: Special conference edition of Sushruta following the Judicial Review.
Intervention or Test: n/a
Outcome (findings): Endorsements from high level policy makers and politicians about BME doctors' contribution to the NHS.
Conclusion: Although not successful, BAPIO describe the judicial review as a ‘moral victory’.
Quality assessment and Generalizability: n/a

Reference: Bibbo C, Bustamante A, Wang L, et al. Toward a Better Understanding of Gender-Based Performance in the Obstetrics and Gynaecology Clerkship: Women Outscore Men on the NBME Subject Examination at One Medical School. Academic Medicine: Journal Of The Association Of American Medical Colleges 2014.
Population & Setting (Context): Retrospective cohort study of students with Ob/Gyn rotation (2008-2011), the Icahn School of Medicine at Mount Sinai in New York City.
Perspective: To better understand why women outperform men in the Ob/Gyn clerkship.
Intervention or Test: Comparison of female and male students’ performance on MCAT, USMLE and Ob/Gyn clerkship components.
Outcome (findings): Women who took the MCAT scored lower than men. Similarly, in the USMLE, women scored lower than men. In the Ob/Gyn clerkship, most components showed no significant gender differences. But women outscored men on the NBME subject examination in Ob/Gyn and so outperformed men in the Ob/Gyn clerkship.
Conclusion: Interest in Ob/Gyn is declining, evidenced by decrease in U.S. medical school graduates entering residency programs (8% in 1993 to 4% in 2013). In addition, the aging Ob/Gyn workforce has a high level of career dissatisfaction, which leads to early retirement and decreased work hours.
Quality assessment and Generalizability: Unclear quality study. Generalizability = indirect.

Reference: British Medical Association. Examining Quality: A survey of royal college examinations. Progress review. Equality and Diversity Committee, 2014.
Population & Setting (Context): BMA members.
Perspective: BMA Equality and Diversity Committee. Monitoring for protected characteristics by medical schools.
Intervention or Test: Review of the monitoring of speciality examinations by a letter requesting information from the Colleges. All 18 responded.
Outcome (findings): Continuing variable processes and procedures for monitoring speciality examinations with respect to equality and diversity.
Conclusion: Concerns that insufficient attention is being paid to ensuring these examinations are not affected by unfair discrimination or bias. Further research needed beyond the assessment process.
Quality assessment and Generalizability: n/a

Reference: Bowhay AR, Watmough SD. An evaluation of the performance in the UK Royal College of Anaesthetists primary examination by UK medical school and gender. BMC Med Educ. 2009;9:38.
Population & Setting (Context): UK medical graduates in the postgraduate examinations in anaesthesia, which is the largest hospital based speciality in UK medicine.
Perspective: The impact that changes to undergraduate curricula might have on postgraduate academic performance.
Intervention or Test: Data from each sitting of the MCQ section of the primary FRCA examination from June 1999 to May 2008 were analysed for performance by medical school and gender.
Outcome (findings): 4983 attempts at the MCQ part of the FRCA examination by 3303 graduates from the 19 UK medical schools. Graduates from five schools performed significantly better than the mean for the group. Males performed significantly better than females in all aspects of the MCQ. Graduates from three medical schools that have undergone the change from a Traditional to a PBL curriculum did not show any change in performance in any aspects of the MCQ pre and post curriculum change.
Conclusion: Graduates from each of the medical schools in the UK show differences in performance in the MCQ section of the primary FRCA, but significant curriculum change does not lead to deterioration in performance. Whilst females now outnumber males taking the MCQ, they are not performing as well as the males.
Quality assessment and Generalizability: High quality study. Generalizability = direct.

Reference: Cuddy Monica, Swanson David, Clauser Brian. A Multilevel Analysis of the Relationships between Examinee Gender and United States Medical Licensing Exam (USMLE) Step 2 CK Content Area Performance. Academic Medicine. 2007;82(10):589-93.
Population & Setting (Context): 23,538 examinees from 136 Liaison Committee on Medical Education–accredited medical schools/campuses.
Perspective: To examine the effect of gender on Step 2 CK content area performance and on the relationships between Step 1 scores and Step 2 CK content area performance, and the effect of medical school characteristics on the relationships between examinee characteristics and Step 2 CK content area performance.
Intervention or Test: Descriptive statistics were computed, and a series of examinees-nested-in-schools hierarchical linear models were conducted.
Outcome (findings): Observed differences indicated that women generally outperformed men in most content areas. School characteristics were generally unrelated to the relationships between examinee characteristics and Step 2 CK content area performance.
Conclusion: While past research indicated that women outperformed men in some content areas, and men outperformed women in others, the current study revealed a somewhat different pattern, with women outperforming men in most content areas.
Quality assessment and Generalizability: High quality study. Generalizability = indirect.

Reference: Davis Joe. RCGP instigates wide-ranging review into diversity policies following CSA court case. Pulse. 19 March 2015.
Population & Setting (Context): RCGP. Post judicial review.
Perspective: Medical news media.
Intervention or Test: RCGP conducting wholesale review into its equality and diversity policies following judicial review.
Outcome (findings): RCGP will be exploring effective ways to collect ‘characteristic data’ on trainees and examiners.
Conclusion: n/a
Quality assessment and Generalizability: n/a

Reference: Denney ML, Freeman A, Wakeford R. MRCGP CSA: are the examiners biased, favouring their own by sex, ethnicity, and degree source? Br J Gen Pract 2013;63(616):e718-25.
Population & Setting (Context): Data on 4000 candidates (52 000 cases) sitting the MRCGP clinical skills assessment in 2011–2012.
Perspective: An investigation of candidates’ case performances by candidates’ and examiners’ demographics.
Intervention or Test: Univariate analyses were undertaken of subgroup performance (male/female, white/black and BME, UK/non-UK graduates) by parallel examiner demographics.
Outcome (findings): Univariate analysis showed some differences in outcomes between the same-group and other-group examiners: these were contradictory regarding examiners ‘favouring their own’.
Conclusion: Concern exists regarding differential performance of candidates in postgraduate clinical assessments by ethnicity, sex, and country of primary qualification.
Quality assessment and Generalizability: High quality study. Generalizability = direct.

Reference: Dewhurst Neil, McManus C, Mollon J, Dacre J, Vale A. Performance in the MRCP(UK) Examination 2003-4: analysis of pass rates of UK graduates in relation to self-declared ethnicity and gender. BMC Medicine. 2007;5(8).
Population & Setting (Context): UK medical graduates sitting the MRCP(UK) examination in 2003-4.
Perspective: Reported underperformance of students from ethnic minorities in undergraduate examinations.
Intervention or Test: Pass rates for each part of the MRCP(UK) Examination in 2003–4 were analysed for differences between graduate groupings based on self-declared ethnicity and gender. All candidates declared their gender, and 84–90% declared their ethnicity.
Outcome (findings): In all three parts of the examination, white candidates performed better than other ethnic groups (P < 0.001). Analysis of overall average marks showed no interaction between candidate gender and the number of assessments made by female examiners (P = 0.151).
Conclusion: The cause of these differences is most likely to be multifactorial. Potential examiner prejudice, significant only in the cases where there were two non-white examiners and the candidate was non-white, might indicate different cultural interpretations of the judgements being made.
Quality assessment and Generalizability: High quality study. Generalizability = direct.

Reference: Duffin Christian. RCGP will ensure examiners are 'representative of race and ethnicity'. PULSE. 21 May 2014.
Population & Setting (Context): IMG and BME trainees.
Perspective: Medical news media.
Intervention or Test: n/a
Outcome (findings): BAPIO and the British International Doctors Association jointly developing initiatives to help IMG and BME trainees.
Conclusion; Quality assessment and Generalizability: n/a

Reference: Esmail A, Roberts C. Independent Review of the Membership of the Royal College of General Practitioners (MRCGP) examination. General Medical Council, 2013.
Population & Setting (Context): CSA candidates October 2010 – November 2012.
Perspective: Understanding the difference in pass rates between IMG and BME candidates from UK and white graduates sitting the MRCGP examination.
Intervention or Test: Independent quantitative review. All CSA sittings from October 2010 – November 2012.
Outcome (findings): Significant differences in outcomes between IMG, BME, white and UK graduates in both the AKT and CSA components of the MRCGP.
Conclusion: Candidates need to be provided with better information from the GMC and the Deaneries. Training for educational supervisors and trainers is required. More research needed on factors for success.
Quality assessment and Generalizability: High quality study. Generalizability = direct.

Reference: Esmail A, Roberts C. Academic performance of ethnic minority candidates and discrimination in the MRCGP examinations between 2010 and 2012: analysis of data. BMJ (Clinical Research Ed) 2013;347:f5662-f62.
Population & Setting (Context): Cohort of 5095 candidates sitting the applied knowledge test and clinical skills assessment components of the MRCGP examination between November 2010 and November 2012.
Perspective: To determine the difference in failure rates in the postgraduate examination of the Royal College of General Practitioners (MRCGP) by ethnic or national background, and to identify factors associated with pass rates in the clinical skills assessment component of the examination.
Intervention or Test: Analysis of data provided by the RCGP and the GMC. A further analysis was carried out on 1175 candidates not trained in the United Kingdom, who sat the IELTS test and PLAB examination, controlling for scores on these examinations and relating them to pass rates of the clinical skills assessment.
Outcome (findings): After controlling for age, sex, and performance in the applied knowledge test, significant differences persisted between white UK graduates and other candidate groups. Black and minority ethnic (BME) graduates trained in the UK or who trained abroad were more likely to fail the clinical skills assessment than white UK candidates.
Conclusion: Consideration should be given to strengthening postgraduate training for international medical graduates.
Quality assessment and Generalizability: High quality study. Generalizability = direct.

Reference: Farrokhi-Khajeh-Pasha Y, Nedjat S, Mohammadi A, Malakan Rad E, Majdzadeh R. Informed choice of entering medical school and academic success in Iranian medical students. Medical Teacher. 2014;36(11):978-82.
Population & Setting (Context): 220 final-year medical students randomly selected from six Iranian medical schools.
Perspective: Compared students who made an informed choice about entering medicine with those who did not, in terms of academic success.
Intervention or Test: Self-administered questionnaire.
Outcome (findings): Students who had not made an informed choice had a higher tendency not to choose medicine if they were to start over (p < 0.001). The pre-admission scores of students who had made an informed choice of medicine were worse than the other group (p = 0.03). However, their final year scores as well as their satisfaction with medicine were higher than the other group.
Conclusion: Idealistic views of medicine should be replaced by rational and logical ones to help students select the careers best suited to their abilities and talents.
Quality assessment and Generalizability: Low quality study. Generalizability = indirect.

Reference: Frei E, Stamm M, Buddeberg-Fischer B. Mentoring programs for medical students - a review of the PubMed literature 2000 - 2008. BMC Medical Education. 2010;10(1):32.
Population & Setting (Context): Doctors and undergraduate medical students.
Perspective: Types of structured mentoring programmes that exist for doctors as well as for medical students.
Intervention or Test: A literature-search strategy was applied to Medline for 1966–2002 using keyword combinations.
Outcome (findings): A total of 162 publications were identified. 16 (9 for medical students and 7 for doctors) were included for review. The majority of the programmes lack a concrete structure as well as a short- and long-term evaluation. Main goals included increasing professional competence and building up a professional network for the mentees.
Conclusion: Although the results of mentoring are promising, more formal programmes with clear setup goals and a short- and long-term evaluation of the individual successes of the participants, as well as cost-benefit analysis, are needed.
Quality assessment and Generalizability: n/a

Reference: General Medical Council. Interactive reports to investigate factors that affect progression of doctors in training. 2015. http://www.gmc-uk.org/education/25495.asp
Population & Setting (Context): For those responsible for managing and delivering education and training in the UK.
Perspective: Initial findings may help to further identify effective mechanisms to support graduates through training pathways.
Intervention or Test: Reports cover one year of exam outcomes and 3 years of round one recruitment data.
Outcome (findings): Reports show exam pass rates between medical school and postgraduate training programmes in addition to some demographics.
Conclusion; Quality assessment and Generalizability: n/a

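Reference: Hawkins Judith. Ethnic minority trainee GPs are suffering racial discrimination, claims Manchester uni study. Mancunian Matters. 28 September 2013.
Population & Setting (Context): Refers to Esmail and Roberts report.
Perspective: Local news media.
Intervention or Test: n/a
Outcome (findings): Strongly suggest institutional racism.
Conclusion; Quality assessment and Generalizability: n/a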

Reference: Hawtin KE, Williams HR, McKnight L, et al. Performance in the FRCR (UK) Part 2B examination: analysis of factors associated with success. Clin Radiol 2014;69(7):750-7.
Population & Setting (Context): FRCR 2B candidates 2006-10.
Perspective: To assess factors that influence pass rates and examination scores in the Fellowship of the Royal College of Radiologists (FRCR) 2B examination. This is a high stakes examination. The final component of the FRCR examination, Part 2B (FRCR 2B), involves both oral and written assessments and has remained fundamentally unchanged for more than a decade.
Intervention or Test: 2238 examinations were evaluated between Spring 2006 and Spring 2010. Pass rates and examination scores were analysed by gender, ethnicity, and the influence of factors such as radiology training (UK versus non-UK), sitting (Spring versus Autumn), and the presence of an undergraduate or postgraduate degree.
Outcome (findings): UK candidates were significantly more likely to pass than non-UK candidates. White candidates were more likely to pass at 1st or 2nd attempt than non-white candidates, but when restricted to UK entrants ethnicity did not influence success at 1st attempt. Overall, females were more successful than males. Having an undergraduate or postgraduate degree did not affect pass rate at first attempt for UK candidates.
Conclusion: The FRCR 2B examination is non-discriminatory for UK candidates with respect to gender and ethnicity. Poorer performance of non-UK trained candidates is a consistent outcome in the literature.
Quality assessment and Generalizability: High quality study. Generalizability = direct.


Reference: Huijskens EG, Hooshiaran A, Scherpbier A, van der Horst F. Barriers and facilitating factors in the professional careers of international medical graduates. Med Educ. 2010;44(8):795-804.
Population & Setting: 32 IMGs who entered the Netherlands as refugees or as spouses of Dutch citizens. As their non-European medical qualifications are not considered equivalent to the Dutch qualifications, they are required to undertake additional medical training.
Perspective: Social and cultural diversity are increasingly important characteristics of the medical professional workforce. Every year, substantial numbers of IMGs seek jobs outside the countries in which they were educated.
Intervention or Test: Qualitative research using in-depth interviews
Outcome: Reported barriers included difficulties in accessing information and lack of (financial) support. Perseverance was reported to be essential. Lack of command of the Dutch language and age were seen as barriers to securing employment and entrance to specialisation.
Conclusion: Barriers identified have major implications for IMGs wishing to practise medicine in the Netherlands. Better support to overcome the difficulties inherent in migration and career change will result in better trained and acculturated doctors who will be more motivated to contribute to society.
Quality assessment and Generalizability: High quality study; Generalizability = indirect

Reference: Illing J. The experiences of UK, EU and non-EU medical graduates in the transition to the UK workplace. 2009.
Population & Setting: Doctors with a PMQ gained outside the UK
Perspective: Challenges faced by doctors with a PMQ gained outside the UK
Intervention or Test: Qualitative and quantitative data
Outcome: Overseas qualified doctors identified differences in practice, structural elements of healthcare and knowledge gaps as well as difficulties outside the workplace.
Conclusion: GMC may play a central role in developing a joined up approach to the support of overseas qualified doctors.
Quality assessment and Generalizability: High quality study; Generalizability = direct


Reference: Johannessen K-A, Hagen TP. Individual and hospital-specific factors influencing medical graduates' time to medical specialization. Social Science & Medicine (1982) 2013;97:170-75.
Population & Setting: All 2474 Norwegian residents who began specialization in 1999-2001
Perspective: Gender differences in relation to medical specialization have focused more on social variables than hospital-specific factors.
Intervention or Test: A multivariate analysis with extended Cox regression, using register data for socio-demographic variables together with hospital-specific variables to study the concurrent effect of these variables on specialty qualification
Outcome: Multivariate analysis showed that the smaller proportion of women who qualified for a specialty was explained principally by childbirth and by the number of children aged under 18 years.
Conclusion: Hospital factors were significant predictors for the timely attainment of specialization.
Quality assessment and Generalizability: High quality study; Generalizability = indirect

Reference: Jolly P, Boulet J, Garrison G, et al. Participation in U.S. graduate medical education by graduates of international medical schools. Acad Med 2011;86(5):559-64.
Population & Setting: IMGs are an important part of U.S. graduate medical education and medical practice. They make up a significant number of the participants in both the ERAS and NRMP.
Perspective: The multiple pathways used by IMGs in pursuit of a U.S. residency position.
Intervention or Test: Descriptive study of 10,328 IMGs certified by the ECFMG between July 1, 2005 and June 30, 2006. Linked data study on this cohort determined the numbers of members of the study cohort who participated in ERAS and/or the NRMP in 2003 through 2009, and who found a residency appointment.
Outcome: The IMGs in the study cohort began applying for residencies the year immediately following ECFMG certification, but almost half were unsuccessful in their first attempts. Three-quarters of the members of the cohort had begun a residency by 2010.
Conclusion: Although they face significant hurdles in achieving their goal, the majority of those who persist are ultimately successful. If enrolments and graduations of U.S. MD- and DO-granting medical schools continue to rise, IMGs’ difficulty in finding residencies is sure to increase.
Quality assessment and Generalizability: High quality study; Generalizability = indirect


Reference: Knight RA. Reasons why doctors who perform well as doctors may fail the MRCGP clinical skills assessment exam. BMJ (Clinical Research Ed) 2013;347:f6438-f38.
Population & Setting: IMGs
Perspective: Questions why IMGs fail an examination in a simulated environment when they can perform in the real environment.
Intervention or Test: Letter
Outcome: n/a
Conclusion: Provides eight points that outline possible reasons why ethnicity, training experience, and sex can disadvantage certain candidate groups in a high stakes simulated environment.
Quality assessment and Generalizability: n/a

Reference: McManus IC, Woolf K, Dacre J. The educational background and qualifications of UK medical students from ethnic minorities. BMC Med Educ 2008;8:21.
Population & Setting: White and non-white students entering medical school
Perspective: UK medical students and doctors from ethnic minorities underperform in undergraduate and postgraduate exams. Research examines the assumption that white and nonwhite students enter medical school with similar qualifications.
Intervention or Test: Attainment at GCSE and A level, and selection for medical school in relation to ethnicity, were analysed in two separate databases. The 10th cohort of the Youth Cohort Study (GCSEs and A level) and UCAS for medical school entry data.
Outcome: NW students have higher educational aspirations, being more likely to go on to take A-levels, especially in science and particularly chemistry, despite relatively lower achievement at GCSE. NW medical school entrants have lower A level grades than W entrants, with an effect size of about -0.10.
Conclusion: The effect size for the difference between white and non-white medical school entrants is about -0.10, which would mean that for a typical medical school examination there might be about 5 NW failures for each 4 W failures. However, this effect can only explain a portion of the overall effect size found in undergraduate and postgraduate examinations of about -0.32.
Quality assessment and Generalizability: High quality study; Generalizability = indirect


Reference: McManus I C, Wakeford R. PLAB and UK graduates' performance on MRCP(UK) and MRCGP examinations: data linkage study. BMJ. 2014;348:g2612.
Population & Setting: 7829, 5135, and 4387 PLAB graduates on their first attempt at MRCP(UK) Part 1, Part 2, and PACES assessments from 2001 to 2012 compared with 18 532, 14 094, and 14 376 UK graduates taking the same assessments; 3160 PLAB1 graduates making their first attempt at the MRCGP AKT during 2007-12 compared with 14 235 UK graduates; and 1411 PLAB2 graduates making their first attempt at the MRCGP CSA during 2010-12 compared with 6935 UK graduates.
Perspective: To assess whether IMGs passing PLAB1 and PLAB2 are equivalent to UK graduates at the end of the first foundation year of medical training (F1), as the GMC requires, and if not, to assess what changes in the PLAB pass marks might produce equivalence.
Intervention or Test: Data linkage of GMC PLAB performance data with data from the Royal Colleges of Physicians and the Royal College of General Practitioners on performance of PLAB graduates and UK graduates at the MRCP(UK) and MRCGP examinations.
Outcome: PLAB1 marks were a valid predictor of MRCP(UK) Part 1, MRCP(UK) Part 2, and MRCGP AKT (r=0.521, 0.390, and 0.490). PLAB graduates had significantly lower MRCP(UK) and MRCGP assessments and were more likely to fail assessments and to progress more slowly than UK medical graduates. IELTS scores correlated significantly with later performance, multiple regression showing that the effect of PLAB1 (β=0.496) was much stronger than the effect of IELTS (β=0.086).
Conclusion: PLAB is a valid assessment of medical knowledge and clinical skills, correlating well with performance at MRCP(UK) and MRCGP. PLAB graduates’ knowledge and skills at MRCP(UK) and MRCGP are over one standard deviation below those of UK graduates. To produce equivalent performance on the MRCP and MRCGP examinations, the pass mark for PLAB1 would require raising by about 27 marks (13%) and for PLAB2 by about 15-16 marks (20%).
Quality assessment and Generalizability: High quality study; Generalizability = direct


Reference: McManus C, Elder A, Dacre J. Investigating possible ethnicity and sex bias in clinical examiners: An analysis of data from the MRCP(UK) PACES and nPACES examinations. BMC Medical education. 2013;13.
Population & Setting: Candidates at all examination centres for the first 26 diets of PACES, the original form of the examination, held from 2001/1 to 2009/2, and for the next six diets of nPACES, diets 27–32, held from 2009/3 to 2011/2.
Perspective: Bias of clinical examiners against some types of candidate, based on characteristics such as sex or ethnicity, would represent a threat to the validity of an examination, since sex or ethnicity are ‘construct-irrelevant’ characteristics.
Intervention or Test: A statistical analysis comparing each examiner against a ‘basket’ of all of their co-examiners to identify examiners whose behaviour is anomalous. The results of 26 diets of PACES and six diets of nPACES were examined statistically to assess the extent of hawkishness, as well as sex bias and ethnicity bias in individual examiners.
Outcome: The method works well when there is more than one examiner at a station and in the case of the current MRCP(UK) clinical examination, nPACES, found possible sex bias in no examiners and possible ethnic bias in only one.
Conclusion: In examinations where there are two independent examiners at a station, our method can assess the extent of bias against candidates with particular characteristics. The method would be far less sensitive in exams with only a single examiner per station as examiner variance would be confounded with candidate performance variance.
Quality assessment and Generalizability: High quality study; Generalizability = direct

Reference: McManus I C, Woolf K, Dacre J, Paice E, Dewberry C. The Academic Backbone: longitudinal continuities in educational achievement from secondary school and medical school to MRCP(UK) and the specialist register in UK medical students and doctors. BMC Medicine. 2013;11(242).
Population & Setting: The 1980, 1985, and 1990 cohort studies (entered medical school in 1981, 1986, and 1991), and the UCLMS Cohort Study (entered clinical studies in 2005 and 2006)
Perspective: Selection of medical students in the UK.
Intervention or Test: This study analyses data from five longitudinal studies of UK medical students and doctors from 1970s - 2000s. Sex and ethnic differences were also analysed in light of the changing demographics of medical students over the past decades.
Outcome: There were robust correlations across different years at medical school, and medical school performance also predicted MRCP(UK) performance and being on the GMC Specialist Register. A-levels correlated somewhat less with undergraduate and post-graduate performance,
Conclusion: The existence of the Academic Backbone concept is strongly supported, with attainment at secondary school predicting performance in undergraduate and post-graduate medical assessments, and the effects spanning many years.
Quality assessment and Generalizability: High quality study; Generalizability = direct


Reference: Alex Matthews-King. RCGP and BAPIO collaborate to address pass rate discrepancies. PULSE. 20 June 2014.
Population & Setting: IMG and BME students
Perspective: Medical news media
Intervention or Test: n/a
Outcome: BAPIO and the British International Doctors Association jointly developing initiatives to help IMG and BME trainees
Quality assessment and Generalizability: n/a

Reference: Memon M, Joughin G, Memon B. Oral assessment and postgraduate medical examinations: establishing conditions for validity, reliability and fairness. Advances in Health Sciences Education. 2010;15(2):277-89.
Population & Setting: Postgraduate students
Perspective: Review to examine the practice of oral assessment in postgraduate medical education in the context of the core assessment constructs of validity, reliability and fairness.
Intervention or Test: n/a
Outcome: Highlights the complexity of oral assessment as an examination format, and raises concerns about the validity, reliability and fairness of such an assessment procedure for the award of certification of completion of the specialist training.
Conclusion: Calls for high quality published research to allay concerns about the transparency and fairness of these examinations, especially when assessing IMGs. The article concludes by proposing 15 conditions under which oral assessment is valid, reliable and fair.
Quality assessment and Generalizability: n/a

Reference: Millett David. Exclusive: BAPIO hails watershed year for BME doctors. GPonline. 27 November 2014.
Population & Setting: IMG and BME students
Perspective: Press release ahead of the 2014 BAPIO conference
Intervention or Test: n/a
Outcome: n/a
Conclusion: BAPIO describe 2014 as a ‘watershed year’ in which the judicial review ‘woke up the establishment’.
Quality assessment and Generalizability: n/a

Reference: Nash Sally. RCGP exam results reveal narrowing gaps between UK and overseas graduates. PULSE 23 January 2015.
Population & Setting: Candidates taking the MRCGP
Perspective: Medical news media
Intervention or Test: n/a
Outcome: Narrowing gap between white UK and other graduates taking the examination as a result of ‘a number of critical interventions’.
Quality assessment and Generalizability: n/a


Reference: Patterson Fiona, Denney Mei-Ling, Wakeford R, Good D. Fair and equal assessment in postgraduate training? British Journal of General practice. 2011:712-3.
Population & Setting: Responds to Woolf’s meta analysis (2011)
Perspective: Ethnic differences in attainment are a consistent feature of medical education in the UK. The most substantial differences being doctors taking postgraduate examinations as IMGs.
Intervention or Test: Editorial.
Outcome: New research methodologies could provide original insights. Four key areas to guide further research are presented, ranging from design issues to analysing outcomes in practice.
Quality assessment and Generalizability: n/a

Reference: Patterson Fiona, La-Band Analise, Koczwara Anna, Spicer John. GP National Selection Process: Equalities Impact. 2012.
Population & Setting: Candidates for GP selection
Perspective: Commissioned equalities impact project. Focusing on Equality and Diversity issues in relation to selection.
Intervention or Test: Multi-stage project comprising a desk review and data collection.
Outcome: Consistent findings in GP equal opportunities data over time, show that the largest group differences in performance in the national selection process relate to place of medical qualification, with UK trained candidates significantly outperforming others.
Conclusion: Appropriate and realistic efforts need to be made to understand and reduce group differences in performance. Additional analysis of national selection data required. Annual equalities impact monitoring required. Facilitation of qualitative research
Quality assessment and Generalizability: High quality study; Generalizability = direct


Reference: Plint S, Patterson F. Identifying critical success factors for designing selection processes into postgraduate specialty training: the case of UK general practice. Postgrad Med J 2010;86(1016):323-7.
Population & Setting: The general practitioner recruitment process introduced machine markable short listing assessments for the first time in the UK postgraduate recruitment context, and also adopted selection centre workplace simulations.
Perspective: Relatively little research on developing selection methodology for entry to postgraduate training.
Intervention or Test: Describes the history of the development of the GP recruitment for postgraduates
Outcome: The recruitment process is a robust national process which has high reliability and predictive validity, and is perceived to be fair by candidates and allocates applicants equitably across the country.
Conclusion: The key success factors have been identified as corporate commitment to the goal of a national process, with gradual convergence maintaining locus of control rather than the imposition of change without perceived legitimate authority.
Quality assessment and Generalizability: n/a

Reference: Rimmer A. Royal colleges must improve data on diversity of exam candidates BMA says. BMJ careers. 23 January 2014.
Population & Setting: The ways in which Royal colleges gather information on protected characteristics of candidates taking exams
Perspective: Medical news media
Intervention or Test: n/a
Outcome: Highly varied development of equality and diversity training across colleges. Call by the BMA for the colleges to match public sector requirements (section 149 of the Equality Act 2010)
Quality assessment and Generalizability: n/a

Reference: Rimmer A. RCP is to highlight gap in performance between overseas doctors and UK graduates. BMJ Careers. 02 December 2014.
Population & Setting: Report on RCP presentation at the BAPIO conference
Perspective: Medical news media
Intervention or Test: n/a
Outcome: RCP looking to train and include lay people in assessment. Issues faced by IMGs are not unique to the UK
Quality assessment and Generalizability: n/a


Reference: Rimmer A. BME doctors less likely to be offered postgraduate training posts than white doctors, says GMC. BMJ Careers. 12 March 2015.
Population & Setting: RCP raising awareness among its examiners about differential attainment between UK and IMGs
Perspective: Medical news media
Intervention or Test: n/a
Outcome: RCP looking to train and include lay people in assessment. Issues faced by IMGs are not unique to the UK
Quality assessment and Generalizability: n/a

Reference: Roberts Celia, Atkins Sarah, Hawthorne Kamila. Performance features in clinical skills assessment: Linguistic and cultural factors in the Membership of the Royal College of General Practitioners examination. London: Centre for Language, Discourse & Communication, Kings College London, 2014.
Population & Setting: IMGs and BME UK-trained graduates. For the purpose of this study all graduates who trained abroad both from the EU and elsewhere are included in the category IMG.
Perspective: To investigate the extent to which linguistic and cultural factors contribute to poor performance. To raise awareness among examiners, GP trainers and candidates of the linguistic and cultural demands of the CSA exam.
Intervention or Test: Quantitative and qualitative sociolinguistic methods supported by ethnographic information. Videoed 198 candidates over 2 exam diets and a detailed analysis of 40 cases and reviewed CSA paperwork
Outcome: Decontextualised nature of the CSA makes it ‘talk heavy’ requiring communicative fluency. Communicative performance factors contribute to gap in success rates. Higher rates of misunderstanding with role play patients. Multilingual expertise of IMGs not assessed
Conclusion: The need for focused training, support and preparation for specific areas of the exam
Quality assessment and Generalizability: High quality study; Generalizability = direct


Reference: Rushd S, Landau AB, Khan JA, et al. An analysis of the performance of UK medical graduates in the MRCOG Part 1 and Part 2 written examinations. Postgrad Med J 2012;88(1039):249-54.
Population & Setting: 1335 doctors graduating in UK medical schools who entered the Part 1 MRCOG and 822 doctors taking the Part 2 MRCOG written examination for the first time between 1998 and 2008.
Perspective: Evaluate the variations in performance of UK medical graduates in the MRCOG examination.
Intervention or Test: Comparison of graduate performance of UK medical schools in the two parts of the MRCOG examinations. The main outcome measures were to evaluate medical school effects, gender effects and academic performance effect.
Outcome: Graduates of UK medical schools performed differently in the Part 1 and Part 2 written MRCOG examination. No gender difference in the success rates of candidates in the Part 1; however, female candidates had a significantly better success rate in the Part 2 written examination than male candidates
Conclusion: There is variation in performance but a lack of evidence on whether graduates from different medical schools perform differently in postgraduate examinations.
Quality assessment and Generalizability: High quality study; Generalizability = direct

Reference: Rushd S, Landau AB, Lindow SW. An evaluation of the first time performance of international medical graduates in the MRCOG Part 1 and Part 2 written examinations. European Journal of Obstetrics and Gynaecology and Reproductive Biology 2013;166:124-126
Population & Setting: 11,863 candidates who appeared for the first time in Part 1 and 5336 in Part 2 (2000-2010)
Perspective: To evaluate the performance of IMGs in the MRCOG Part 1 and Part 2 written examinations.
Intervention or Test: Retrospective analysis using RCObs/Gyne database. Candidates were grouped according to geographical bands
Outcome: Candidates from different bands performed differently
Conclusion: A variation in performance among IMG from different geographical regions in the Part 1 and Part 2 written MRCOG examinations
Quality assessment and Generalizability: High quality study; Generalizability = direct


Reference: Sandhu DP. Current dilemmas in overseas doctors' training. Postgrad Med J 2005;81(952):79-82.
Population & Setting: IMGs
Perspective: Changes to DoH, Home Office, and deanery regulations with expansion of medical schools, implementation of European Working Time Directive, Modernising Medical Careers, and the future role of the Postgraduate Medical Education and Training Board, will have an important impact on IMGs’ training.
Intervention or Test: Opinion piece
Outcome: Their very success and media publicity about general practice and consultant shortages, has led to a large influx of inexperienced doctors seeking training opportunities in competitive specialties. Dissemination of realistic information about postgraduate training opportunities is important as the NHS for some time will continue to rely on IMGs.
Conclusion: IMGs are a remarkably successful professional group in the United Kingdom making up to 30% of the NHS work force. In 2003 a record 15 549 doctors joined the medical register of which 9336 doctors were non-European Economic Area citizens.
Quality assessment and Generalizability: n/a

Reference: Schrewe B, Frost H. Finding potential in balance: navigating the competing discourses of diversity and standardization. Academic Medicine: Journal Of The Association Of American Medical Colleges 2012;87(11):1479-79.
Population & Setting: Examines the place of the individual in the context of the profession.
Perspective: Response to earlier paper on diversity. Discussion of the tension between the discourse of diversity and the discourse of standardization
Intervention or Test: n/a
Outcome: Asks, which common qualities make us physicians and to what extent can individual variation around these qualities be supported before the very essence of the profession begins to dissipate?
Quality assessment and Generalizability: n/a


Reference: Shaw Q. High failure rate of ethnic minority groups in MRCGP exam comes from changes to exam and candidate selection. BMJ (Clinical Research Ed) 2013;347:f6442-f42.
Population & Setting: References Esmail A, BMJ 2013;347:f5662. (26 September.)
Perspective: Letter
Intervention or Test: n/a
Outcome: Unintended institutional racism resulting from changes to the RCGP combined with a disincentivization for deaneries to recruit more students.
Conclusion: Data needs to be re-analysed to expose the importance of language skills.
Quality assessment and Generalizability: n/a

Reference: Smits PB, Verbeek JH, Nauta MC, et al. Factors predictive of successful learning in postgraduate medical education. Med Educ 2004;38(7):758-66.
Population & Setting: A follow-up study of 118 doctors on a post-graduate occupational health training programme on the management of mental health problems.
Perspective: Gender difference in learning
Intervention or Test: The following personal and contextual variables were measured as potential predictors of outcome: gender; age; years of experience as a doctor; university of graduation; learning style (Kolb); present employer (occupational health service), and educational format (problem-based or lecture-based).
Outcome: After multivariate analysis female gender was positively related to accruements in both knowledge and performance independently of the influence of other factors. Accommodator learning style showed a relation with knowledge increase but had no influence on performance. The PBL format yielded a better performance outcome but had no influence on knowledge tests.
Conclusion: Gender and learning style found to be related to an increase in knowledge. Gender was also found to be related to improvement in performance after a postgraduate medical education programme.
Quality assessment and Generalizability: High quality study; Generalizability = indirect


Reference: Stamm M, Buddeberg-Fischer B. The impact of mentoring during postgraduate training on doctors’ career success. Medical Education. 2011;45(5):488-96.
Population & Setting: 326 doctors (172 women, 52.8%; 154 men, 47.2%) from a cohort of medical school graduates participating in the prospective SwissMedCareer Study.
Perspective: SwissMedCareer Study, assessing personal characteristics, the possession of a mentor, mentoring support provided by the development network, and career success.
Intervention or Test: Study made use of a longitudinal design to investigate the impact of mentoring during postgraduate specialist training on the career success of doctors.
Outcome: This study confirmed the positive impact of mentoring on career success in a cohort of Swiss doctors in a longitudinal design. However, female doctors, who are mentored less frequently than male doctors, appear to be disadvantaged in this respect.
Conclusion: Formal mentoring programmes could reduce barriers to mentorship and promote the career advancement of female doctors in particular.
Quality assessment and Generalizability: High quality study; Generalizability = indirect


Reference: Thomas B, Manusov EG, Wang A, et al. Contributors of black men's success in admission to and graduation from medical school. Academic Medicine: Journal Of The Association Of American Medical Colleges 2011;86(7):892-900.
Population & Setting: In 2010, one of the authors, a black man, interviewed 10 black male medical students enrolled at Florida State University College of Medicine and 3 black male physicians associated with that school, using consensual qualitative research methodology to analyse the data.
Perspective: Increasing the number of black physicians in medicine is a goal that continues to receive attention from researchers and medical schools. Currently, black Americans constitute approximately 13% of the U.S. population. They account, however, for only 4% of the U.S. physician workforce.
Intervention or Test: Qualitative research using in-depth interviews to determine characteristics and individual experiences that contribute to black men’s success in being admitted to and graduating from medical school.
Outcome: The authors identified six broad contributors to successful admission and completion of medical school: social support, education, exposure to the field of medicine, group identity, faith, and social responsibility.
Conclusion: Success for black men is achieved via a balance between educational experiences, psychosocial-cultural experiences, and personal attributes and individual perceptions. This information can be used by medical schools to strengthen their outreach programs, provide a theoretical construct for discussion and research, and generate questions for future quantitative studies.
Quality assessment and Generalizability: Low quality study; Generalizability = indirect


Reference: Tiffin PA, Illing J, Kasim AS, et al. Annual Review of Competence Progression (ARCP) performance of doctors who passed Professional and Linguistic Assessments Board (PLAB) tests compared with UK medical graduates: National data linkage study. BMJ: British Medical Journal 2014;348.
Population & Setting: 53 436 UK based trainee doctors with at least one competency related ARCP outcome reported during the study period, of whom 42 017 were UK medical graduates and 11 419 were international medical graduates who were registered following a pass from the PLAB route
Perspective: To determine whether use of the PLAB examination system used to grant registration for international medical graduates results in equivalent postgraduate medical performance, as evaluated at ARCP, between UK based doctors who qualified overseas and those who obtained their primary medical qualification from UK universities.
Intervention or Test: Observational study linking ARCP outcome data from the UK deaneries with PLAB test performance and demographic data held by the GMC.
Outcome: International medical graduates were more likely to obtain a less satisfactory outcome at ARCP compared with UK graduates.
Conclusion: PLAB test used for registration of IMGs is not generally equivalent to the requirements for UK graduates. This may be addressed by raising the standards of English language competency required as well as the pass marks for the two parts of the PLAB test. An alternative might be to introduce a different testing system.
Quality assessment and Generalizability: High quality study; Generalizability = direct

Reference: Tolan AM, Kaji AH, Quach C, et al. The electronic residency application service application can predict accreditation council for graduate medical education competency-based surgical resident performance. J Surg Educ 2010;67(6):444-8.
Population & Setting: A total of 77 residents from two (one university and one community based university-affiliate) general surgery residency programs were included in the analysis.
Perspective: Electronic Residency Application Service (ERAS)
Intervention or Test: Retrospective correlation of data points found in the ERAS application with core competency-based clinical rotation evaluations. The overall competency score was defined as an average of all 6 competencies and technical skills
Outcome: USMLE scores were only predictive of Medical Knowledge. Multivariable analysis showed honors in Ob/Gyn, female gender, older age, and total number of honors to be predictive of a number of individual core competencies.
Conclusion: The ERAS application is useful for predicting subsequent competency based performance in surgical residents. Receiving honors in the surgery clerkship, which has traditionally carried weight when evaluating a potential surgery resident, may not be as strong a predictor of future success.
Quality assessment and Generalizability: High quality study; Generalizability = indirect


Reference / Data Collection (quantitative or qualitative?) | Population & Setting (Context) | Perspective (the objective and standpoint of the study) | Intervention or Test used to evaluate the participants | Outcome (findings) | Conclusion | Quality assessment of the study and Generalizability

Reference: Vaughan S, Sanders T, Crossley N, et al. Bridging the gap: the roles of social capital and ethnicity in medical student achievement. Medical Education 2015;49(1):114-23.
Population & Setting: Participants were sampled across the four hospital placement sites; a total of 158 medical students in their clinical phase (Years 3 and 4) completed the survey.
Perspective: An identified discrepancy between the achievement level of White students and that of their ethnic minority peers. The processes underlying this disparity have not been adequately investigated or explained.
Intervention or Test: Data from a cross-sectional social network study conducted in one UK medical school are presented and are analysed alongside examination records obtained from the medical school. Study utilises social network analysis to investigate the impact of relationships on medical student achievement by ethnicity, specifically by examining homophily.
Outcome: Although significant patterns of ethnic and religious homophily emerged, no link was found between these factors and achievement. Lower levels of the social capital that mediates interaction with peers, tutors and clinicians may be the cause of underperformance by ethnic minority students.
Conclusion: Because of ethnic homophily, minority students may be cut off from potential and actual resources that facilitate learning and achievement.
Quality assessment: High quality study. Generalizability = indirect.


Reference: Wakeford R, Denney M, Ludka-Stempien K, Dacre J, McManus IC. Cross-comparison of MRCGP & MRCP(UK) in a database linkage study of 2,284 candidates taking both examinations: assessment of validity and differential performance by ethnicity. BMC Medical Education. 2015;15:1.
Population & Setting: 2,284 candidates who had taken one or more parts of both assessments, MRCP(UK) typically being taken 3.7 years before MRCGP.
Perspective: In the UK, underperformance of ethnic minority doctors taking MRCGP has had a high political profile. Substantial performance differences between white and BME doctors undoubtedly exist. Understanding ethnic differences can be helped by comparing the performance of doctors who take both MRCGP and MRCP(UK).
Intervention or Test: Analysis of performance on knowledge-based MCQs (MRCP(UK) Parts 1 and 2 and MRCGP Applied Knowledge Test (AKT)) and clinical examinations (MRCGP Clinical Skills Assessment (CSA) and MRCP(UK) Practical Assessment of Clinical Skills (PACES)).
Outcome: Correlations between MRCGP and MRCP(UK) were high. BME candidates performed less well on all five assessments (P < .001). Correlations disaggregated by ethnicity were complex. CSA changed its scoring method during the study; multiple regression showed the newer CSA was better predicted by PACES than the previous CSA.
Conclusion: High correlations between MRCGP and MRCP(UK) support the validity of each, suggesting they assess knowledge cognate to both assessments. Whilst the reason for the differential performance is unclear, the similarity of the effects in independent knowledge and clinical examinations suggests the differences are unlikely to result from specific features of either assessment and most likely represent true differences in ability.
Quality assessment: High quality study. Generalizability = direct.


Reference: Watmough S, Bowhay A. An evaluation of the impact of country of primary medical qualification on performance in the UK Royal College of Anaesthetists' examinations. Medical Teacher. 2011;33:938-40.
Population & Setting: This study summarises the performance of graduates by country of primary medical qualification in part one of the UK RCA examination from 1999 to 2008.
Perspective: From June 1990 to February 2008, there were 9315 attempts at the MCQ by 5797 graduates from 70 countries, with 25 countries having candidates who made 15 or more attempts.
Intervention or Test: Data were collated from RCA spreadsheets for each attempt of the primary examination from June 1999 to May 2008 from the main RCA trainee database. Candidates were ranked into groups according to the country of PMQ for the overall final percentage mark of the MCQ section of the Primary Fellowship of the Royal College of Anaesthetists examination.
Outcome: Candidates from Australia, New Zealand, South Africa, Zimbabwe and the UK performed significantly better than the mean for the group and candidates from Egypt, Iraq, Ireland and Pakistan performed significantly worse.
Conclusion: Some graduates who sit UK postgraduate exams may require additional support prior to taking these examinations.
Quality assessment: High quality study. Generalizability = direct.

Reference: Woloschuk W, McLaughlin K, Wright B. Is undergraduate performance predictive of postgraduate performance? Teach Learn Med 2010;22(3):202-4.
Population & Setting: Medical school graduates (Classes 2004–2006) at the end of the 1st postgraduate year.
Perspective: To determine whether undergraduate performance is predictive of postgraduate performance.
Intervention or Test: Residency program directors assessed the performance of medical school graduates (Classes 2004–2006) at the end of the 1st postgraduate year.
Outcome: Correlations between undergraduate and the two postgraduate measures were low (.03–.31).
Conclusion: Measures of undergraduate performance appear to be poor predictors of performance in residency that consisted of two primary dimensions (clinical acumen and human sensitivity).
Quality assessment: High quality study. Generalizability = indirect.


Reference: Woolf K, Cave J, Greenhalgh T, Dacre J. Ethnic stereotypes and the underachievement of UK medical students from ethnic minorities: qualitative study. 2008.
Population & Setting: 27 year 3 medical students and 25 clinical teachers, purposively sampled for ethnicity and sex. A London medical school.
Perspective: To explore ethnic stereotypes of UK medical students in the context of academic underachievement of medical students from ethnic minorities.
Intervention or Test: Qualitative study using semi structured one to one interviews and focus groups. Data were analysed using the theory of stereotype threat.
Outcome: The existence of a negative stereotype about their group also raises the possibility that underperformance of medical students from ethnic minorities may be partly due to stereotype threat.
Conclusion: Asian clinical medical students may be more likely than white students to be perceived stereotypically and negatively, which may reduce their learning by jeopardising their relationships with teachers.
Quality assessment: High quality study. Generalizability = indirect.

Reference: Woolf K, Potts HWW, McManus IC. Ethnicity and academic performance in UK trained doctors and medical students: systematic review and meta-analysis. BMJ (Clinical Research Ed) 2011;342:d901-d01.
Population & Setting: Medical students and doctors from different ethnic groups were included.
Perspective: To determine whether the ethnicity of UK trained doctors and medical students is related to their academic performance. Design: systematic review and meta-analysis.
Intervention or Test: Systematic review and meta-analysis. The study included quantitative reports that measured the performance of medical students or UK trained doctors from different ethnic groups in undergraduate or postgraduate assessments.
Outcome: Ethnic differences in academic performance are widespread across different medical schools, different types of exam, and in undergraduates and postgraduates. They have persisted for many years and cannot be dismissed as atypical or local problems.
Conclusion: More detailed information to track the problem as well as further research into its causes is required. Such actions are necessary to ensure a fair and just method of training and of assessing current and future doctors.
Quality assessment: High quality study. Generalizability = indirect.


Reference: Woolf K, McManus IC, Potts HWW, et al. The mediators of minority ethnic underperformance in final medical school examinations. The British Journal Of Educational Psychology 2013;83(Pt 1):135-59.
Population & Setting: Two consecutive cohorts of Year 5 (final year) UCL Medical School students (n = 703; 51% minority ethnic). A total of 587 (83%) had previously completed a questionnaire in Year 3. Participants were then followed up to final year (2007–2010).
Perspective: To investigate whether demographic and psychological factors mediate the relationship between ethnicity and final examination scores.
Intervention or Test: Participants were administered a questionnaire that included a short version of the NEO-PI-R, the Study Process Questionnaire, and the General Health Questionnaire (GHQ) as well as socio-demographic measures. Questionnaire responses and final examination grades were compared using univariate tests. The effect of ethnicity on final year grades after taking into account the questionnaire variables was calculated using hierarchical multiple linear regression.
Outcome: Ethnic differences in the final year performance of two cohorts of UCL medical students were not due to differences in psychological or demographic factors, which suggests alternative explanations are responsible for the ethnic attainment gap in medicine.
Conclusion: UK-trained medical students and doctors from minority ethnic groups underperform academically. It is unclear why this problem exists, which makes it difficult to know how to address it.
Quality assessment: High quality.

Reference: Action needed to end college exam disparity. BMA. 11 April 2014.
Population & Setting: Comment on the outcome of the judicial review.
Perspective: Medical news media.
Intervention or Test: n/a


Reference: GPs seek exam help for international doctors. BMA. 14 Jan 2013.
Population & Setting: GP trainers.
Perspective: Medical news media.
Intervention or Test: n/a
Outcome: CSA identified by GP trainers as contributing to low pass rates for IMGs.
Conclusion: Specifically tailored training required. Suggests that all CSA exams are videoed.

Reference: Low ethnic minority exam pass rates sparks call for research. BMA. 24 June 2013.
Population & Setting: BMA annual executive meeting calls for Royal colleges to publish analysis of their exam results.
Perspective: Medical news media.
Intervention or Test: n/a

Reference: Differing pass rates raise concerns about MRCGP exam. BMA. 24 May 2013.
Population & Setting: Concerns raised by GPs over the validity of the MRCGP given the disparity in pass rates between UK and IMG candidates.
Perspective: Medical news media.
Intervention or Test: n/a
Outcome: IMGs identified as not a homogenous group.

Reference: GMC report highlights exam disparity. BMA. 17 March 2015.
Population & Setting: Doctors in training.
Perspective: Medical news media.
Intervention or Test: n/a
Outcome: Relates to the GMC's publication of interactive reports to investigate factors that affect progression of doctors in training.


Appendix 2: Quality evaluation of studies using primary data

| Study | Was there a clear statement of the aims of the research? | Was the research design appropriate to address the aims of the research? | Were there any issues relating to the selection of measurements and categories in the research project? | Was the recruitment strategy appropriate to the aims of the research? | Was the data collected in a way that addressed the research issue? | Have ethical issues been taken into consideration? | Has the relationship between the researcher and participant been adequately considered? | Was the data analysis sufficiently rigorous? | Is there a clear and thorough statement of findings? | How valuable is the research? | Generalizability |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Rating scale | Yes / No / Unclear | Yes / No / Unclear | High / Low / Unclear | High / Low / Unclear | High / Low / Unclear | High / Low / Unclear | High / Low / Unclear | High / Low / Unclear | High / Low / Unclear | High / Low / Unclear | Direct / Indirect / Unclear |
| Abdulghani (2014) | Yes | Unclear | High | High | High | High | Unclear | High | High | High | Indirect |
| Bibbo (2014) | Yes | Yes | Low | High | High | High | High | High | High | Unclear | Indirect |
| Bohay (2009) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Cuddy (2007) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Denney (2013) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Dewhurst (2007) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Esmail (2013)a | Yes | Yes | High | High | Unclear | High | High | High | High | High | Direct |
| Esmail (2013)b | Yes | Yes | High | High | Unclear | High | High | High | High | High | Direct |
| Farrokhi-Khajeh-Pasha (2014) | Yes | Unclear | High | High | Unclear | High | Unclear | Unclear | Unclear | High | Indirect |
| Hawtin (2014) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Huijskens (2010) | Yes | Yes | High | High | High | High | Unclear | High | High | High | Indirect |
| Illing (2009) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Johannessen (2013) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Jolly (2011) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| McManus (2008) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| McManus (2013) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| McManus (2014) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Patterson (2012) | Yes | Unclear | High | Unclear | Unclear | Unclear | Unclear | Unclear | High | Unclear | Direct |
| Roberts (2014) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Rushd (2012) | Yes | Yes | High | High | Unclear | High | High | High | High | High | Direct |
| Rushd (2013) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Smits (2004) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Stamm (2011) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Thomas (2011) | Yes | Yes | High | High | Low | Unclear | Low | High | High | High | Indirect |
| Tiffin (2014) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Tolan (2010) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Vaughan (2015) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Wakeford (2015) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Watmough (2011) | Yes | Yes | High | High | High | High | High | High | High | High | Direct |
| Woloschuk (2010) | Yes | Yes | Unclear | High | High | High | Low | High | High | High | Indirect |
| Woolf (2011) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
| Woolf (2008) | Yes | Yes | High | High | High | High | High | High | High | High | Indirect |
