Toxicology 202 (2004) 1–20

Fraud, errors and gamesmanship in experimental toxicology

Iain F.H. Purchase∗

University of Manchester, Oxford Road, Manchester M13 9PT, UK

Available online 3 August 2004

Abstract

We expect moral behaviour from scientists. Morality implies being a good person and being good at one’s profession. The general view appears to be that the vast majority of scientists aim to achieve these high standards. Science prides itself on the ‘self-correcting’ mechanism in the scientific method, namely the requirement to reproduce findings before they are taken seriously. However, when findings are related to the adverse effects of chemicals there are several features that make this less effective than in some other fields of science. First is the perception that everyone is exposed to chemicals and that observations about chemical danger are immediately applicable to many people. Second, it is often easy to summarize adverse findings in attention-getting headlines seen by the lay public before the slow process of replication and interpretation has time to work. Third, most regulatory toxicology studies on a particular compound are only done once to minimise cost and the use of animals. Finally, the question posed about chemicals – are they safe? – is easy to ask but more difficult to test with appropriate studies. Fabrication of data in regulatory studies was found to occur in several contract laboratories in the 1960s and this led directly to the introduction of Good Laboratory Practice (GLP) regulations. Now studies submitted for regulatory purposes must comply with GLP regulations and this has virtually eliminated flawed studies due to fraudulent or careless behaviour. It is possible to discern different ways in which the expected standards have not been met. The first is in the intention of the work. Thus the report that the Roodeplaats Research Laboratory in South Africa was seeking to identify toxins that would kill without trace is an example where the intention is unacceptable. The second is in the conduct of the studies. Here the examples of William McBride and Michael Briggs, who falsified data, are pertinent. The example of the retraction of reports on the toxicity of ecstasy because the wrong compound had been administered indicates a degree of carelessness in the conduct of the study. The third is in the design and interpretation of studies. The report that genetic modification per se could render potatoes toxic has been criticised because of the inappropriate design and interpretation of the studies. Finally, reports of studies can be biased because of conflicts of interest. Journals often require a declaration that the author has no financial conflict of interest. However, there are many other conflicts of interest with just as large an impact on the author’s impartiality which are omitted from consideration. Gamesmanship has also entered the practice of toxicology, for example where strong assertions about conflict of interest are used to justify particular points of view. The main casualty from fraud, errors and gamesmanship is the perceived status of science itself. It is only gamesmanship that is on the increase. The remedies for these activities are explored.

© 2004 Elsevier Ireland Ltd. All rights reserved.

Keywords: Fraud; Errors; Gamesmanship

Paton Prize award lecture 2004.
∗ Tel.: +44 1620 515 458; fax: +44 1620 586 396. E-mail address: [email protected] (I.F.H. Purchase).

0300-483X/$ – see front matter © 2004 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.tox.2004.06.029

1. Introduction

It is a great honour to be asked to deliver the Paton Prize lecture. Sir William Paton, a former chairman of the British Toxicology Society, endowed the Paton prize at the BTS as a means of encouraging the study of the history of science. His interests covered a wide area of science. He first came into prominence as a consequence of his discovery of the methonium salts which caused neuromuscular and ganglion block. Hexamethonium was used to reduce blood pressure during surgical procedures. But he also became involved in many other projects, such as hyperbaric physiology, the mechanism of action of cannabis and the control of the doping of racehorses (Rang and Walton, 1996). During his time as chairman of the BTS he was responsible for encouraging the Society to look at the use of the LD50 test and this resulted in the proposal to modify the methods used to study acute toxicity (the BTS method). The BTS method provides information on acute toxicity while reducing the adverse effects on animals.

Scientific investigation is widely believed to be the pursuit of objective truth with a view to providing benefits for humankind. The pursuit of truth is through the development and testing of hypotheses using reliable methods. In this respect, toxicology is no different, for while its principal aim is to uncover adverse effects, this is done with the purpose of discovering the ‘truth’ about the toxicity of a particular chemical. Such knowledge is essential in protecting human health and the environment.

Misconduct in the form of fraud and carelessness seems out of place in this type of endeavour, because we expect that those involved in any enterprise should be acting morally (Oderberg, 2000). That is, we expect that their essential objective is to carry out their task well. Thus, we expect that a toxicologist would aim to carry out their toxicology work well and to act in the right way.

It is argued that one defining characteristic of the scientific method is that it is self-correcting. Through replication and verification of experimental work and by peer review of work proposals and reports of that work, the effects of error, bias and deception are eliminated (Grayson, 1995). Many cases of fraud and error have been identified through the operation of these ‘self-correcting’ mechanisms. However, the fact that many cases of fraud and error in scientific activities have been identified in other ways (e.g., whistle-blowing) suggests that the self-correcting mechanism is only partly successful.

2. What is misconduct, fraud, error and gamesmanship?

There have been many attempts at defining misconduct, both from a legalistic and practical point of view.

Some argue that the final output of the scientific process, the scientific paper, is itself fraudulent. This is because ‘it represents a mythical reconstruction of what actually happened. All of what are in retrospect mistaken ideas, badly designed experiments and incorrect calculations are omitted. The paper represents the research as if it had been carefully thought out, planned and executed according to a neat, rigorous process, for example involving testing of a hypothesis. The misrepresentation of the scientific paper is the formal aspect of the misrepresentation of science as an orderly process based on a clearly defined method’ (Martin, 1992). It could also be argued that the publication of scientific work overemphasizes the positive aspects of the work and exaggerates its quality and could, therefore, be called triumphalist. Part of the reason for this is that scientific journals’ policy for the selection of manuscripts for publication favours advances in science rather than statements of the status quo. There are few journals with negative results in the title.

The individual scientist is regularly faced with a choice about what to include in a paper. There is certainly no room for a description of the details of how individual experiments were conceived and conducted, or of where mistakes were made. The dividing line between what is acceptable and what is unacceptable in that choice is decided by unwritten rules of behaviour determined largely by what authoritative scientists find acceptable and convey to scientists in training.

Nevertheless, there are certain activities that all hold to be unacceptable because they distort the search for ‘truth’. The literature on this topic identifies fabrication, falsification and plagiarism as the principal concerns. The US Federal Policy on Research Misconduct provides the following definition (Rennie and Gunslas, 2001).

2.1. Federal policy on research misconduct

1. Research misconduct defined
Research misconduct is defined as fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results.
• Fabrication is making up data or results and recording and reporting on them.
• Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.
• Plagiarism is the appropriation of another person’s ideas, processes, results, or words without giving appropriate credit.
Research misconduct does not include honest error or differences of opinion.

2. Findings of research misconduct
A finding of research misconduct requires that:
• There be a significant departure from accepted practices of the relevant research community; and
• The misconduct be committed intentionally, or knowingly, or recklessly; and
• The allegation be proven by a preponderance of evidence.

Definitions such as this one provide a legal framework for investigating and making a finding on whether certain actions should be considered as research misconduct. I have taken misconduct to be equivalent to fraud.

In this paper, I have not included the question of plagiarism, on the basis that, while plagiarism is clearly misconduct, it usually does not have an impact on advances in knowledge.

I have, however, included gamesmanship in this essay, even if it is not strictly misconduct. Gamesmanship is where the normal paradigm of the self-corrective mechanism in science, that is verification and scientific review, is not followed. Rather, in place of scientific criticism, the focus of the criticism is the scientist or the scientist’s affiliation.

3. Is toxicology a special case?

In this report, I will concentrate on issues arising from the field of experimental toxicology – by which I mean laboratory experimental work aimed at identifying the toxicity of chemicals and their mechanisms of action.

Experimental toxicology, although practised for many centuries, came into its own as an experimental subject early in the 20th Century. Chemicals became important in commerce and affected lives in many ways (through health care, agriculture, fabrics, household products etc.) leading to the interest in their safety. Scientific methods of breeding and maintaining experimental animals were established, and methods for growing cells in culture and analytical methods for detecting small quantities of chemicals were also developing from the 1930s or before. Thus, both the incentive and the methods for studying the toxicity of chemicals provided an impetus to the subject.

The practice of toxicology has developed rapidly in the intervening years. Much experimental work is designed to describe the toxicity of new chemicals for regulatory purposes. Equally important are the investigative studies aimed at understanding the mechanisms of toxicity either as an aid to understanding basic biological processes or as a means of improving the ability to extrapolate laboratory experimental observations to humans.

Regulatory toxicology [the safety assessment of chemicals to meet regulatory requirements] started in earnest in the 1920s and 1930s. Initial studies at the FDA in the USA began with human volunteer trials of common food additives. Ethical and other concerns led to the use of experimental animal studies. Gradually, particularly late in the 20th Century, regulations were introduced to set common standards for toxicity testing. In the 1970s, good laboratory practice (GLP) regulations were introduced. Nowadays, we are all familiar with the processes of harmonisation of testing guidelines for medicines (through the International Conference on Harmonisation) and other chemicals (through the OECD).

Regulatory toxicity studies do require special attention because they are not regularly repeated, thus invalidating the most important self-correcting mechanisms on which science relies. They also have the potential to affect the lives of many people directly as a result of their contribution to safety assessment. Brock (2001) argues that toxicological studies in drug development require special attention for these reasons. There are many reasons why regulatory toxicology is rarely repeated, including the cost and ethical concerns about the unnecessary use of experimental animals. Investigative toxicology is similar to other areas of biological science in that studies may be replicated more frequently and hence it cannot be argued that they are a special case.

I have argued that regulatory toxicology requires special attention in the context of misconduct, because of the lack of opportunity for replication. The application of GLP assists in that respect, in that it provides other controls, not imposed on most scientific endeavour, in order to ensure high standards. There are of course other scientific activities that have similar controls, e.g. the conduct of clinical trials controlled by Good Clinical Practice. But the majority of scientific activity does not have such regulatory controls, although working to guidelines such as those of Good Scientific Practice may help to reduce transgressions (Rils, 2001).

4. How do we decide on the correctness or otherwise of actions?

The common approach to judging the rightness or wrongness of an action is to consider the outcome of that action. If the outcome provides benefits, it is considered to be a moral action. This consequentialist approach to the ethics of scientific activities is the predominant analytical paradigm in use in ethics applied to scientific issues today. It claims that individual action and public policy should aim at producing the greatest happiness to the greatest number. In the context of the practice of experimental toxicology, the rightness of particular actions taken by toxicologists would be assessed by considering the outcome and the benefits that accrue from that outcome.

Some moral philosophers consider that the exclusive use of utilitarian ethics does not provide a sufficient basis for considering the morality of particular actions (Oderberg, 2000). They argue that a more traditional view of ethics, which considers matters in addition to the outcome of an action, overcomes some of the major problems with consequentialist thinking. For example, if the judgement of the rightness of an action is based exclusively on the outcome of the action, then concepts such as ‘rights’ of the individual, or the intention of the scientist, have no place in judging the rightness of an action. It is this argument that supports the consideration of intention in the judgement of actions.

Man is endowed with free choices about what actions to take in that his activities are under the control of his free will. Moral goodness and badness can be seen through the operation primarily of acts of free will. In modern language, acts of free will are spoken of as intentions. Thus, the idea of the goodness or badness of an act of will can be expressed by saying that it is the intention with which something is done that gives it moral character (Oderberg, 2000).

One problem with the philosophical argument that moral character can be deduced from intention is that certain actions can have both good and bad consequences – the so-called Principle of Double Effect (Oderberg, 2000). Thus, we may choose to carry out research on the toxicity of ricin in order to establish how certain biological processes operate, but at the same time provide information to a terrorist about how to use ricin more effectively. Thus, an act that would be judged as good because of its intention could be bad because of some aspect of its outcome.

5. The processes of toxicological research

For the purpose of considering the probity of the conduct of research, the processes involved in scientific research, including investigative toxicology, can be divided into several steps:

1. The intention or motive that drives the scientist to carry out the research.
2. The design of the experimental work.
3. The conduct of the experimental work.
4. The outcome of the work and publication of the results.

5.1. Intention

I have discussed the ethical argument that the intention of the person carrying out an act has a strong bearing on a judgement of its moral character. In the toxicological context, the issues that impact on the scientist’s intention include scientific issues (discovering toxic effects or mechanisms of action or investigating biological processes) and other issues (commercial ambitions, environmental concerns). The latter issues form part of a consideration of conflict of interest, which is addressed later in this paper.

5.1.1. Project Coast (Case study 1)

A striking example of a research programme in which the intention of the work provides the basis of criticism is found in the description of Project Coast. Here the intention of the research programme carried out by scientists at the Roodeplaats Research Institute in South Africa was described as: ‘the Holy Grail of all research was the perfect murder weapon: a tasteless, colourless, odourless toxin that could not be traced post mortem’. From the context of the evidence given to the South African Truth and Reconciliation Commission, it is clear that this research was aimed at developing methods of killing those opposed to the apartheid regime. The research was unsuccessful, and hence a judgement of its ethical basis solely on outcome would be difficult. However, the intent was clear: to invent a tool that could be used to kill others unlawfully. This was, therefore, an unacceptable intention for a research programme.

5.2. Experimental design

The next step in the consideration of the toxicological research is the design of the experimental work. The intention of the project may be entirely justifiable, but if the design of the experiments is inappropriate they will not achieve the intentions.

It is worth noting the comments made earlier about the way in which the style of writing scientific publications avoids describing the mistakes and errors made during the research programme. There are, however, examples of published work where the experimental design is inadequate to support the intention and the conclusions of the work.

5.2.1. Genetic modification is toxic? (Case study 2)

The work of Ewen and Pusztai (1999) has many interesting facets. One that pervades any consideration of the published work is that the conclusions that Pusztai drew from this experiment were in the public domain (in 1998) following a television interview he gave before the results were published. His main assertion, that GM manipulation of potatoes was in itself responsible for the adverse effects seen in rats, was novel and had widespread implications for the future of GM crops and food. A frenzied debate then occurred in the media, with scientists, politicians and single interest groups expressing their views. About a year later the manuscript was published.

The basic design of the experiment was that six groups of six rats were fed diets of potatoes (either parent potatoes, potatoes modified by the expression of a plant lectin (GNA) or potatoes with added GNA; the potatoes were either raw or boiled). There was an increase in stomach crypt length that was greater in rats fed the GNA-GM potatoes than in those fed the parent potatoes with added GNA. It was thus concluded that the stimulatory effect of GNA-GM potatoes on the stomach was mainly due to the GNA transgene in the potatoes. Thus the insertion of the gene, rather than the expression of the gene’s products, was claimed to be responsible for the stimulatory effect on the stomach.

There are extensive criticisms of this work. Some of them are particularly pertinent for the toxicologist. The assessment of the toxicity of major components of the diet is challenging, because toxicity testing has traditionally involved several dose levels of a chemical, administered at multiples of the expected human dose without affecting the quality of the diet. Changes in food consumption and food conversion ratios are used to identify toxic effects of chemicals. This approach is not possible with a major dietary component and particular care has to be taken to ensure that dietary composition is not altered unintentionally by the component under consideration.

One major problem with the experiment reported by Ewen and Pusztai (1999) was that the rats were fed an inadequate diet, in that the protein content (6%) was very low for growing rats. There were also differences between the chemical composition of GM potatoes and the parent potatoes other than in the expression of the gene. Raw potatoes are indigestible and are not a suitable diet for growing rats. It is clear, therefore, that the comparability of the diet between groups and the nutritional suitability of the diets for growing rats are important confounding factors in the interpretation of the experiment. However, even a simple indication of the impact of diet on the rats’ health, such as weight gain, is not reported.

A second major criticism of this work is that a single control group (parent potato) is inadequate for the authors to draw their conclusion that the insertion of a gene was responsible for the changes. An additional control, a GM potato with the promoter connected to a non-functional gene, would be required for the conclusion to be valid. A third major criticism is that only a single dose level was used. Thus, there was no opportunity to assess whether increasing effects were seen with increasing expression of the gene.

There is a list of the other criticisms of the experiment in the Case study.

Ewen and Pusztai have continued to defend their work. From the voluminous correspondence on this topic, it seems clear that Ewen and Pusztai’s intention when carrying out this work was to obtain information on the effects of genetic modification and of GNA, an entirely acceptable intention. However, the design of the experiment reported by them had major limitations. This was an experiment that was not sufficiently well designed to test the hypothesis that genetic manipulation was itself toxic. In particular, the more general conclusion that the insertion of a transgene caused adverse effects is unwarranted as only one experiment with one gene had been conducted. However, the experiment was sufficiently informative to provide the basis for the design of a more specific experiment capable of testing the hypothesis about the effects of inserting genes into potatoes.

If, as I believe, the design of the experiment reported by Ewen and Pusztai was inadequate to support the conclusions drawn from it, does this constitute wrongdoing on the part of the authors? They clearly believe that their experiment and conclusions were valid, as is evidenced by Pusztai’s continued defence of their conclusions. If indeed the experimental design was flawed, the criticism is that errors were made.

5.3. Conduct of experimental toxicology

The accuracy and reliability of experimental work is at the heart of good science, for inaccurate or incorrect observations lead inevitably to incorrect conclusions. Unfortunately there are several examples of fabrication of data in toxicology studies or errors in their conduct that have had serious consequences.

The most startling example of misconduct in this field occurred in the 1960s and 1970s. Several contract testing laboratories in the USA were found to have fabricated results of tens of thousands of tests on the safety of chemicals (Case study 3). This is a clear example of fraud.

5.3.1. Is Ecstasy that bad? (Case study 4)

An example of how incorrect study conduct can have widespread effects is the report by Ricuarte et al. (2001). He and his colleagues reported that Ecstasy given to primates at doses intended to replicate the doses used by people caused dopaminergic neurotoxicity, which is known to lead to Parkinson’s disease. Subsequent correspondence commented that it was surprising that ecstasy was dopaminergic and that the human equivalent doses used were toxic to primates. Although Ricuarte et al. (2003b) defended their work, when they tried to repeat it they found that the original bottles had been mislabelled and that the primates had been given amphetamine. The information about the potential of ecstasy to produce Parkinson’s had been widely disseminated and had an effect on the approach to the control of drugs. However, in this case, there is no evidence of fraud; rather it was an error that led to the incorrect conclusions.

5.4. Outcome of the work and publication of results

5.4.1. Oestrogen synergy (Case study 5)

Another case where false data were reported involved a study of the synergy between oestrogenic chemicals. The effect of two weakly oestrogenic chemicals was reported to be amplified 1000-fold when they acted together. This was of profound importance, because of concerns about the effects of oestrogens on human health. Several scientists from other laboratories tried unsuccessfully to repeat the experiment using similar and different methods. Finally, the paper was withdrawn because the data were unreliable (McLachlan, 1997).

In this example, the results were found not to be verified after replication; thus the self-correcting nature of the scientific process (Grayson, 1995) is seen to work. There is little doubt that it is also an example of fraud, in that the experiment was either not carried out or the data were altered.

Clearly the final step in the research process, where the results of the experiment(s) are written up and submitted for publication, is important because it informs the scientific community about the advances in science from the experiment. The publication process is constrained by the requirements of the editors and publishers of scientific journals. Nevertheless, the scientific authors must leave out inconsequential detail and include all information relevant to the conclusions drawn. The judgements about what to leave out and what to include are not determined by written guidance (impossible given the wide range of scientific disciplines), but by what is considered acceptable by senior scientists – called the political scientific elite (Martin, 1992).

5.4.2. Debendox and all that (Case study 6)

William McBride, the clinician who identified the teratogenic effects of thalidomide, was found to have altered the results of an experiment designed to test the teratogenicity of scopolamine. McBride believed that anti-cholinergic agents such as dicyclomine, an ingredient of Debendox used for treating morning sickness, were potentially teratogenic. He therefore instructed his technician, Phil Vardy, to carry out an experiment involving the administration of scopolamine, another anti-cholinergic agent, to pregnant rabbits. The experiment was carried out as requested and the next that Vardy knew on the subject was when reprints bearing his name, and that of another technician, Jill French, as co-authors arrived at the institute. It was clear to him that the reported experiment was different from the one he had carried out. After a long period in which Vardy’s views were challenged by Foundation 41 managers, an investigation was carried out into the matter. It transpired that the manuscript had been submitted to Toxicology and Applied Pharmacology and rejected. McBride then altered the experimental results by increasing the number of animals reported to have been used. This revised manuscript was then submitted to Acta Pharmacologica et Toxicologica and rejected. Further alterations to the experiment were made, principally by altering the doses, and it was submitted to the Australian Journal of Biological Sciences, which published it.

McBride defended the changes by asserting that additional work had been carried out in the USA by a colleague, Professor Langman, who had subsequently died, and that he had added the results to the manuscript. However, the Foundation 41 Committee of Inquiry concluded that it was highly unlikely that Professor Langman had treated the animals but this could not be proved beyond reasonable doubt (Nicol, 1989, p. 162). McBride was found guilty of fraud and removed from the register of medical practitioners.

There is no doubt that reporting fabricated results as if they had been properly part of the conduct of a study is fraudulent. But there are additional issues raised by this case. The original experimental design was flawed, in that no controls were used and the use of six rabbits for such a study would be considered as too few. Once again, the experimental design was totally inadequate to establish whether the compound being tested was teratogenic. Indeed, one commentator reflects that McBride was not trained as a scientist and had access to public funds without the usual scientific processes of review of proposals and of peer review (Humphrey, 1992).

5.4.3. Safety of oral contraceptives (Case study 7)

A similar example of fraudulent generation of results by Michael Briggs has also been reported.

5.4.4. Potency of environmental oestrogens (Case study 8)

Over the last two decades there has been an increasing interest in the potential effects on human health and the environment of chemicals with beneficial uses (pesticides, plastics) that also have oestrogenic properties. The scientific community has responded with a vast range of publications and regulators are pondering the best way to control this potential risk. One area of controversy is the potential effect(s) of chemicals with oestrogenic effects on male reproduction and particularly whether these chemicals have effects at low doses. This latter concern is because some authors have claimed that effects are seen at doses far below those that are used for regulatory decisions based on the no adverse effect level.

The particular example I have chosen is that of the effect of Bisphenol A (BPA) on prostate weight of mice (Ashby, 2003a). A small study in 1997 reported that exposure of mice in utero to BPA increased the weight of the prostate gland in adult offspring at doses as low as 2 µg/kg (Nagel et al., 1997). Similar, larger experiments have been reported by two other groups (Ashby et al., 1999; Cagen et al., 1999) without seeing any changes in prostate weight even at much higher doses. Nevertheless Frederick vom Saal (the corresponding author in Nagel’s publication) argues the correctness of the result at 2 µg/kg and criticises technical aspects of the experiments from the other groups. Indeed he goes as far as to call others’ work pollution of the literature with false negative results. The self-correcting mechanism in science relies on repetition as a means of verifying that results are reproducible. Vom Saal has published a repeat of neither his own work nor Ashby’s. Advocating that effects are seen at low doses, when there is no evidence that the finding is reproducible and indeed evidence that it is not reproducible in other laboratories, is a form of gamesmanship.

6. Conflict of interest

A major concern pervading the public’s trust in science is that of conflict of interest. While the major concern has been in financial interests that may bias a person’s objectivity, other issues are also pertinent. An excellent summary of the issues involved in conflict of interest is given in Jordan Cohen’s Presidential address to the Annual Meeting of the Association of American Medical Colleges (Cohen, 2001). Although his comments refer primarily to clinical research, the principles are as relevant to toxicological experimentation. He argues that institutional and investigator’s conflicts of interest challenge the objectivity of the work. However, a conflict of interest does not imply a particular behaviour but describes a state of affairs; in our parlance it is a hazard, but not necessarily a risk! Indeed, he argues that such associations may be the very stimulus that drives scientists to do their best work. The conflict may be financial, in which case specific safeguards are needed, but may also be non-financial (career advancement, peer recognition, garnering grants and publications). His definition is: “a conflict of interest exists whenever an individual or an institution has a primary allegiance that requires certain actions and, simultaneously, has a secondary interest that (1) could abrogate that primary allegiance and (2) is sufficiently tempting to raise the possibility that it might actually do so”.

The World Association of Medical Editors (WAME, 2003) has provided policy statements about many of the matters discussed here. They have published a policy on the journals’ role in managing conflicts of interest. They state: “organisations that pay for research may have a vested interest in the results. Industry has a legitimate interest in the profitability of its products, which can be affected by the results of published research. Other funders, such as governments and foundations, have political and economic points of view that they may want to see reflected in the research results.” They then go on to provide guidance on how to manage these issues.

The Society of Toxicology has produced guidelines for their journal Toxicological Sciences (Lehman-McKeeman and Peterson, 2003). It requires authors and reviewers to declare a conflict of interest, particularly by declaring sources of funding and other potential financial or other interest that may be perceived to bias research. It is noteworthy that all four examples given involve financial conflicts of interest.

6.1. Re: Regulatory toxicology and pharmacology

Recently Elsevier received a letter from 45 authors, apparently co-ordinated by James Huff and supported by the Center for Science in the Public Interest (CSPI) (CSPI, 2002). This letter made a number of assertions about ‘the apparent conflicts of interest, lack of transparency and the absence of editorial independence of the journal Regulatory Toxicology and Pharmacology’ (RTP). As I am here addressing the question of conflicts of interest, I will not comment further on the other matters raised.

The journal is the official publication of the International Society of Regulatory Toxicology and Pharmacology, which is in turn supported by a number of industries. Members of the editorial board also have affiliations with industry. Thus, the main thrust of this assertion, namely, that there is a strong affiliation with industry, is correct.

Further assertions are also made in the letter. The most significant of these are: that the large number of industry-affiliated scientists and lawyers on the editorial board of RTP is indicative of the pro-industry bias of RTP; that RTP editorials commonly support industry’s anti-regulatory goals; and that the preponderance of publications are by industry-funded scientists.

6.1.1. Industry-affiliated scientists

It is quite clear that RTP has strong links with industry through its editorial board and authorship. What is not so clear is the affiliation of those writing the letter to Elsevier. A superficial examination of the authors’ affiliations reveals that some have a declared or undeclared association with single interest groups

(such as CSPI, Environmental Defense, MCS Referral and Resource, Cancer Prevention Coalition) or with Government-sponsored institutes (such as the National Institute of Environmental Health, Institute of Public Health). Cohen’s (2001) definition of conflict of interest is that any interest that interferes with an individual’s primary interest can create a conflict. So what is the primary interest in publication of a manuscript in a scientific journal? For all scientists it should be to report on high quality science accurately and fully. So it can be argued that those with industry affiliation have a potential for conflict of interest, because the profitability of their company, and thus their own job security, may be influenced by the publication of a scientific report critical of a company product or supporting further regulation. But equally, those working for single interest groups are keen to promote the success of their organisation, on which their job security rests. Those working for government sponsored institutes may have an interest that reflects the government’s view of regulations or toxic risk. These interests are just as powerful in influencing individuals to compromise their science as are the industrial affiliations. The latter are much easier to identify and quantify, particularly since disclosure focuses on financial interests. The interests of others may be just as persuasive, but are less measurable.

6.1.2. Pro-industry editorials

The letter asserts that RTP has pro-industry editorials that support industry’s anti-regulatory goals. I think here we come to the heart of the criticism of RTP. The authors of the letter appear to disagree with the views expressed in the editorials that seek to reduce regulation. Some of the authors work in organisations whose aim is to improve health by increasing the regulation of chemicals.

The authors of the letter have chosen to accuse the editors of bad faith through their conflict of interest rather than advancing an argument based on scientific knowledge. Hence, I believe this approach is unscientific and have called it gamesmanship.

7. Discussion

7.1. Consequences

It comes as no surprise that there are examples of various types of poor practice in the field of experimental toxicology, because similar instances are found in other branches of science. What makes these examples noteworthy is that they are in the field of toxicology where the findings often affect the lives of many people.

Probably the most serious examples of fraud come from the contract laboratories in the 1970s, where tens of thousands of studies were falsified (Case study 3). The immediate consequence of this was that the companies that held registrations for those compounds were required to retest them, at considerable expense. This scandal provided the stimulus for setting up GLP regulations, first in the USA and, soon afterwards, internationally. These regulations embody a set of principles that provide a framework within which laboratory studies are planned, performed, monitored, recorded, reported and archived.

The application of GLP procedures to toxicity testing has undoubtedly improved the reliability of regulatory toxicity studies. By changing the focus for ensuring reliability from the individual toxicologist to the procedures necessary for the conduct of toxicology, it has helped to reduce misconduct and errors. However, it is worth commenting that GLP cannot eliminate misconduct. Even with the most careful application of GLP, including comprehensive audit, there are probably some remaining cases of misconduct. On the other hand, the GLP regulatory authorities, who conduct regular inspections of laboratories involved in regulatory toxicology testing, have not reported major areas of non-compliance. One of the driving forces for this state of affairs is that regulatory toxicology testing laboratories require a certificate of compliance with GLP for their work to be acceptable to regulatory authorities; compliance therefore becomes a priority task.

A further beneficial consequence of working to GLP is that the general accusation that the scientific paper is fraudulent because of its post hoc selectivity does not apply. Good laboratory practice imposes the requirements that the protocol is clearly defined before the experiment is started, that any changes required in the protocol during the conduct of the study are noted in good time and that all the results specified in the protocol are reported.

The publication by William McBride of fraudulent information purporting to show that scopolamine, related to an ingredient of Debendox, was teratogenic added to the concerns about its safety and contributed to the drug being withdrawn from the market (Case study 6). Debendox was an efficacious medicine for treating nausea in pregnancy and had been shown to be safe; thus women were deprived of a useful and safe medicine (Nicol, 1989, p. 146). A second consequence was that his technicians Phil Vardy and Jill French were subjected to a very difficult period of 6 years before the matter was concluded. Thus, the actions taken by McBride were not just damaging to that particular experiment, but had adverse consequences on a wider public.

The fraudulent claim by Arnold that low doses of oestrogens had synergistic effects was of profound importance and caused considerable concern (Case study 5). In this case, however, the retraction of the paper by the senior author, after others had reported that they were unable to reproduce the results, corrected the record.

The report on the adverse effects of genetic manipulation of potatoes had a massive impact on the public’s perception of GM (Case study 2). Most scientists now accept that there were severe faults in the design of the experiments reported by Ewen and Pusztai (1999). Most agree that the claims made by Pusztai to the media, before publication, that it was genetic modification per se which caused the effects were an exaggeration and not warranted by the data available to him. However, the publicity surrounding this report contributed to the fact that the majority of the British public are now against the use of GM crops in the UK. From the scientific perspective, the problems inherent in the design of good toxicology experiments are not widely acknowledged. Indeed, toxicology is often viewed as an applied science with little to offer to basic science. This example demonstrates that toxicologists have much to contribute to good experimental design when the safety of foods and chemicals is being investigated. It is relatively easy to ask the question ‘is this safe’. Much more difficult is the design of the appropriate experiment to answer that question.

An error made by Ricuarte et al. (2001), where dopaminergic effects on the brain of primates were attributed incorrectly to ecstasy, also had widespread impact (Anon, 2003) because the doses used in the study aimed to mimic the recreational use of ecstasy by humans (Case study 4). The conclusion that the dopaminergic effects of ecstasy would lead to Parkinson’s disease received worldwide publicity. More important is the fear that, if the information on Parkinson’s in ecstasy users does not match ecstasy users’ experience, they have an excuse for ignoring health warnings in future.

In the Case study on environmental oestrogens (Case study 8), vom Saal has reported effects on prostate weight in mice at very low doses; no such effects have been seen by others. Nevertheless he argues that these effects are real and has not published evidence supporting his results since the negative results reported by others. Observations of effects at such low doses are very important from the public health point of view and need to be resolved so that appropriate regulatory action can be taken. There is a major difference in the regulatory consequences of the control of BPA if the NOAEL is 5 mg/kg or below 2 μg/kg. The obvious way to resolve this issue is for vom Saal to carry out a repeat study to establish whether the results are reproducible in his laboratory.

It has been argued very strongly that underreporting of clinical trials research is scientific misconduct (Chalmers, 1990) because it allows clinical practices to continue when they are not appropriate. Failure to confirm results in an important field which is so controversial is a step towards failure to publish known results. This has consequences for the particular field and for public trust in science.

7.2. Incidence of misconduct

The number of cases discussed here is relatively small. Other cases have been reported (Nigg and Radulescu, 1994), but even with these there are relatively few cases of fraud or error specifically in the field of toxicology in the literature. There is considerable disagreement about the likely extent of fraud (including plagiarism) in science. Some believed it to be almost non-existent, but it is only recently that formal mechanisms have been in place in some countries to investigate claims of fraud. Even with these mechanisms in place, relatively few cases are reported. The US Office of Research Integrity investigated 150 cases and 76 resulted in findings of scientific misconduct during the period 1993–1997 (Abbott, 1999). The Danish Committee on Scientific Dishonesty considered 25 cases over 7 years, and found only four involved dishonesty (Abbott, 1999). However, surveys of scientists’ experience reveal that a relatively high proportion of scientists claim to have experienced fraudulent behaviour. It is unlikely that we will ever know the true incidence of fraud.

The incidence of major errors in the literature that are missed by peer review also appears to be relatively low. Peer review and verification through repetition are the major mechanisms for revealing these errors.

7.3. Changes over time

Reviewing the cases discussed here fails to reveal any evidence of changes in incidence or seriousness of fraud or error over time. The possible exception is in those examples classified here as gamesmanship. The pressure on scientists to deliver results, measured through publication records, has never been greater. Commercial pressures are clearly believed to be important, as evidenced by the requirement of scientists to disclose conflicts of interest, with an emphasis on financial conflicts. The stakes are also high, with governments using the output of scientific investigations to guide regulatory policy and companies using the results of studies to support their commercial success. So, it is not surprising that accusations of conflict of interest occur when the regulatory consequences of scientific work are in dispute. Equally, presentation of a particular conclusion from scientific work, such as the effect of oestrogens at low doses, seems aimed at a particular regulatory outcome. These examples of ‘gamesmanship’ are part of a common trend in science and are frequently seen in the field of toxicology.

7.4. Remedies

The introduction of good laboratory practice regulations internationally was the response to the major fraud in regulatory testing of chemicals and pharmaceuticals. Any study that supports a registration application must be conducted in compliance with GLP, reducing the opportunity for fraud or errors. This means that fraud and errors are now more likely to occur in studies not conducted to GLP standards, which includes most studies from academia and many of those investigating mechanisms of action. Certainly the cases identified here confirm that conclusion.

The main criticism of the processes for dealing with accusations of fraud or error is that the response is the responsibility of the research institute. This was the case with McBride, Briggs, Pusztai and Arnold. Too often the institute’s response is driven by the need to defend the reputation of the institute; in other words, it has a conflict of interest. The subsequent enquiries have been prolonged and have had a deleterious effect on the whistle-blower. In some cases the investigation is stopped when the accused resigns, as in the case of Briggs.

A review of fraud and misconduct in biomedical research (Lock et al., 2001) covers many of the approaches to reducing fraud. Some countries (USA, Denmark, Finland, Norway and Sweden) have now introduced an independent mechanism for investigating accusations of scientific fraud. These mechanisms have many advantages, not the least of which is that the results of the investigations are made public. Of course, the pharmaceutical industry has a major interest in combating fraud in the work carried out for it, and Wells (2001) has described its approach. Scientific institutes have an important role in preventing fraud and error in scientific work. The introduction of Good Research Practice, modelled on GLP but with academic research institutes in mind, should be encouraged. Institutional ethics committees also have a role.

The mechanism for investigating fraud in the UK relies on institutional review (Abbott, 2004) and management responsibility for implementing Good Scientific Practice, both in terms of its implementation and for the training of young researchers (BBSRC, 1998; MRC, 2000). There are those who believe that the UK needs to have a national mechanism for dealing with fraud, based on the experiences of the USA and Scandinavian countries (Smith, 2001).

Others involved in the scientific enterprise also have an important task. The editors of scientific journals can take steps to discourage fraud. The World Association of Medical Editors has a significant programme to encourage editors to take their role in fraud prevention more seriously. The Committee on Publication Ethics also works to use the editorial process to reduce fraud and misconduct. In their 2003 report of 29 cases referred to them, 1 or 2 were related to fabrication of data and several were of plagiarism.

7.5. Conflict of interest

The attitude of toxicologists to risk and to the interpretation of toxicological experiments has been studied among members of the British Toxicology Society

(Slovic et al., 1997). The method was to survey attitudes, perceptions and beliefs about risk through a questionnaire sent to members of the BTS. Women had higher perceptions of risk than did men, toxicologists working in industry had a more favourable attitude towards chemicals and their use than did those working in academic settings, and older people had a lower perception of risk than did younger people. When considering the evaluation of technical summaries of various animal studies based on NTP reports, there was considerable disagreement among toxicologists and with judgements made by NTP panels. About 40% disagreed that animal studies of this type permit reliable extrapolation to humans.

The interesting issue arising from this work is that there is no generally agreed conclusion among toxicologists about perception of risks, or about interpretation of animal studies. This is understandable, because the concept of risk as a representation of the interaction between several variables inevitably leads to differences of opinion. The assessment of risk is not an exact science, and this precludes exact answers. Recently, evidence has been accumulating regarding the importance of ‘world views’ or cultural biases or general dispositions in determining an individual’s perception of risks (Mertz et al., 1998). Differences in opinions among toxicologists may also be closely associated with differences in ‘world views’. Kraus et al. (1992) have suggested from their studies that controversies about chemical risks may be caused as much by weaknesses in the science of risk assessment as by misconceptions (of the public).

Thus, my view of the assertions in the letter to Elsevier is that there are likely to be genuine differences of opinion between the authors and the editorial board members in respect of, for example, regulation. These derive from different affiliations and probably different ‘world views’. But does a scientist affiliated to a group promoting the regulation of chemicals have greater objectivity on the subject of chemical regulation than a scientist affiliated to industry? The answer is: no.

The definitions of conflict of interest encompass many different interests. However, the detailed guidelines given by WAME or by the Society of Toxicology in their policies focus almost exclusively on financial associations which may cause a conflict. This leads to strange outcomes. I can remember, as a member of the National Toxicology Programme, being asked about conflict of interest. The company I worked for had about 45,000 chemicals as intermediates or products. I therefore had to check on the computer record whether the company made a particular chemical; if so, I had a conflict of interest, if not I had no conflict of interest to declare. And this was about chemicals I had never heard of before checking the computer record. It is time that the practical management of conflict of interest was broadened to include interests in addition to financial interests.

7.6. Gamesmanship

Of all the subjects reviewed here, gamesmanship is the one which appears to be on the increase. I have chosen two examples here where the characteristic is that arguments are advanced to discredit particular scientists’ work. In one, conflict of interest is advanced as a reason to seek wholesale changes to a journal’s management; if such changes were made, they would align the journal’s management along the lines of the interests of the authors of the critical letter. In the second, it is argued that negative results are polluting the literature; if this were true the advocate’s own results would stand without challenge. In both cases, reliance on challenging the scientific basis of the work would be a more appropriate way forward.

With the struggle for influence in presenting scientific findings, positive results and accusations of unfairness are more likely to reach the headlines. Negative results and compliance with scientific good practice do not make headlines. Thus, there is fertile ground for gamesmanship as an attempt to advance a particular viewpoint.

7.7. Public trust in science

The main casualty from fraud, errors and gamesmanship in science is the perceived status of science itself. While scientific endeavour will always have its fair share of uncertainties, errors and disputes, it is misconduct that eats away at the trust conferred on science and scientists (Anon, 2004). The mechanisms of policing scientific activity, such as peer review and Good Scientific Practice, need to be supported and strengthened. Above all, it is transparency of these regulatory processes and the mechanisms for investigating allegations of misconduct that will improve the trustworthiness of science in the long run.

8. Conclusion

Misconduct and errors in experimental toxicology do not appear to be on the increase. However, ‘gamesmanship’ does appear to be becoming more frequent. All of these activities are bad for science because of the impact they have on public trust of science. They are also bad for people who become involved: the scientist who commits misconduct, the whistle-blower, those investigating cases of misconduct, the gamesman and those on the receiving end of personal criticism.

The GLP systems already in place for regulatory toxicology studies go a long way towards remedying the problem. Similar approaches for investigative toxicology, such as Good Scientific Practice supported by laboratory management, excellence in peer review and editorial overview, can also contribute to reducing the problem of misconduct and errors. None of these will have any impact on gamesmanship.

The problem of gamesmanship is not unique to experimental toxicology. So far there is little attention focussed on it, but I believe that those engaged in the science and debate of toxicological issues should find ways of discouraging such activity. As a start, the operational definition of conflict of interest should be broadened from its current focus on financial matters. The editor’s role in this area should also be explored. It is in the interests of all involved in experimental toxicology to seek ways of reducing the impact of gamesmanship on the publication of research findings.

9. Case studies

9.1. Case study 1: Project Coast

In 1981, the South African Defence Force approved a chemical and biological warfare project named Project Coast. By 1982 the construction of two production facilities was approved: Delta G for chemicals and the Roodeplaats Research Laboratory (RRL) for biological products. RRL was completed in 1987 (Burger and Gould, 2002 pp. 19–20).

The Truth and Reconciliation Commission took evidence in 1998 from some of those involved in Project Coast. It is from these public hearings (and from the trial of Dr. Wouter Basson, the Director of Project Coast) that information on the activities in Project Coast became widely known.

Evidence was given to the Commission by Dr. Schalk Van Rensburg. He had joined the RRL because he believed that the work was defensive in nature and designed to protect South African troops. However, he became aware that the work of RRL was largely offensive in nature. Of the products delivered to the Defence Force, 36% were lethal toxins, 36% were applicators, 18% were pathogens that could cause severe illness, 10% were irritants and 3% were psychotic agents (CCR, 2000).

He claimed that ‘the holy grail of all research was the perfect murder weapon: a tasteless, colourless, odourless toxin that could not be traced post mortem. Almost as urgent, according to Van Rensburg, was the fruitless quest for a form of birth control which he believed would have been covertly administered to the burgeoning black population of apartheid South Africa’. The project involved developing a vaccine able to control human fertility that could be given orally without the knowledge of the recipients. He believed that the project was in line with the World Health Organisation’s attempt to curb rising global birth rates. He was told by Dr. Basson that the anti-fertility vaccine was required so that female Unita soldiers would not fall pregnant. However, Dr. Van Rensburg and his colleague Dr. Goosen believed that the intention was to secretly give the contraceptive vaccine to black South African women (Burger and Gould, 2002 p. 31). The project to produce an anti-fertility vaccine was not successful (Burger and Gould, 2002 p. 23).

9.2. Case study 2: genetic modification is toxic?

Arpad Pusztai, from the Rowett Research Institute, appeared on television in August 1998 to announce that he had carried out experiments with genetically modified (GM) potatoes that showed that a diet of GM potatoes could stunt rats’ growth and impair their immune system. He was retired from his institute, the Rowett Research Institute, in August. On 12th February 1999 a letter was published in the Guardian newspaper from 20 international scientists defending his still unpublished work (POST, 2000). This started a frenzied public debate on the question of the safety of GM food (Enserink, 1999).

The Rowett Research Institute set up an audit com- were counted in jujenal villi from each of six rats fed mittee to review all the work done on this project, be- diets containing GNA-GM potatoes or parent potatoes, cause of concerns about its quality. The debate contin- both raw and boiled. No measurements were made on ued in the media and also involved scientists. The Royal groups fed potatoes spiked with GNA because GNA Society arranged for 12 documents to be peer reviewed does not induce lymphocyte infiltration. by a panel of six reviewers (which included exper- The presence of GNA in the diet from GNA-GM tise on clinical trials, physiology, nutrition, quantitative potatoes or from potatoes supplemented with GNA was genetics, growth and development, and immunology). associated with an increase in mucosal thickness in the The documents included Pusztai’s project report of the stomach. Crypt length in the jujenum of rats fed raw, but work, the report of an audit committee from the Rowett not boiled, GNA-GM potatoes was significantly greater Research Institute and a detailed statistical analysis of than those given parent line or parent line spiked with the work carried out by an independent organisation. GNA. Rats fed boiled GNA-GM potatoes had signif- The Royal Society published its report in June 1999 icantly thinner caecal mucosae than rats given boiled (TRS, 1999), before Pusztai and Ewen had published parent potatoes with or without GNA. There was also in the scientific literature. an increase in intra-epithelial lymphocyte counts in The Lancet published a research letter from Ewen groups fed GNA-GM potatoes whether raw or boiled. and Pusztai (1999), which only dealt with a single ex- Ewen and Pusztai (1999) concluded that the stimu- periment on the effects of GM potatoes on the intestines latory effect of GNA-GM potatoes on the stomach was of rats (The Royal Society Report deals with more ex- mainly due to the expression of GNA transgene in the periments). 
The report was of an experiment with six potatoes. The potent proliferative effect of raw GNA- groups of six rats fed raw or boiled potatoes. The groups GM potatoes on the jujenum and the anti-proliferative received (either raw or boiled) the parent potatoes, par- effect of boiled GNA-GM potatoes on the caecum is ent potatoes with added snowdrop lectin (Galanthus only partly due to GNA gene expression. Other parts nivalis agglutinin or GNA) or GM potatoes expressing of the construct or the transformation could have con- GNA for 10 days. Crypt length was measured in the tributed to the overall effect (see Table 1). stomach, jujenum, ileum, caecum and colon. Statisti- cal comparisons were made between the boiled and 9.2.1. Criticism of the experiment raw potatoe groups, and between the groups on the Because of the media debate on this work, many three diets. The number of intraepithelial lymphocytes have commented on the quality of the work reported by

Table 1 showing an extract of the results from Ewen and Puzstai (1999) Mean (S.D.) crypt length (␮m) and difference between treatments Parent Parent versus parent + Parent + GNA Parent + GNA vs. GNA-GM Parent vs. GNA (p) GNA-GM GNA-GM Stomach Boiled 294 [46] 0.29 347 [42] 0.37 339 [36] 0.02 Raw 261 [32] 0.03 312 [32] 0.98 323 [54] 0.07 P 0.18 0.94 0.35 Jejunum Boiled 75 [19] 0.72 78 [17] 0.97 78 [12] 0.71 Raw 57 [8] 0.01 64 [11] 0.01 90 [20] <0.01 P 0.06 0.09 0.24 Caecum Boiled 95 [19] 0.90 98 [21] 0.04 70 [15] 0.05 Raw 132 [19] 0.02 104 [17] 0.25 119 [25] 0.35 P <0.01 0.55 <0.01 I.F.H. Purchase / Toxicology 202 (2004) 1–20 15

Pusztai. He has responded to some of these comments, but his responses are not summarised here.

• The diets were protein deficient, with only 6% protein (Kuiper et al., 1999). The GM potatoes contained almost 20% less protein than the parent potatoes (TRS, 1999).
• The composition of the diets was not well characterised. Pusztai released details of the analyses of the diets which showed that the content of starch, glucose polymers, lectin, and trypsin and chymotrypsin inhibitors in GM potatoes differed from the parental line (Kuiper et al., 1999). But it is not known whether these differences are attributable to the GM manipulation or not. Thus, it is not possible to tell whether the effects seen were due to the transgene, GNA or to some other factor.
• The rats were underfed. Raw potato is indigestible and a poor diet for rats.
• The study design was poor.
  ◦ The number of animals per group was too small.
  ◦ There was no control group fed a normal rat diet with and without GNA which would help to interpret the changes found.
  ◦ Only one dose was used, reducing the opportunity to assess the consistency of response (Kuiper et al., 1999).
  ◦ A group fed parent potatoes spiked with GNA for the intra-epithelial lymphocyte counts was omitted (Lachmann, 1999).
• There was no prior hypothesis which the experiment was designed to test (Lachmann, 1999). Pusztai, in his reply, stated that it was obvious: “it was thought that comparison of the histological parameters of the gut of rats fed potato diets containing either GM potatoes or non-GM potatoes with or without being supplemented with GNA should give a clear indication whether GNA gene insertion had affected the nutritional and physiological impact of potatoes on the mammalian gut”.
• The method of statistical analyses was flawed, because multiple comparisons were made and because changes in the various parameters within one animal were likely to be linked.
• To confirm that the effects were due to the transgene, a group would have been fed a diet of potatoes transformed only by the use of the promoter connected to a non-functional gene.
• Inappropriate statistical analyses were carried out on the lymphocyte data. When appropriate comparisons are made, there are no interpretable differences (TRS, 1999).
• The claim that GNA caused jejunal hyperplasia by a direct stimulatory effect on crypt cells cannot be substantiated, because no measurement of mitotic activity was made (Mowatt, 1999).
• General inferences should not be drawn from an experiment which used one insertion of one transgene by one method, with results tested in one strain of one species at one age and at one dose level.

9.3. Case study 3: the origins of good laboratory practice

The most startling example of fraud in this field occurred in the 1960s and 1970s. Several contract testing laboratories were found to have fabricated results of tens of thousands of tests on the safety of chemicals. A major consequence was that many chemicals had to be retested. This scandal provided the stimulus for setting up good laboratory practice (GLP) regulations. The Food and Drug Administration was the first to propose GLP regulations for non-clinical studies (FDA, 1976), followed by the US Environmental Protection Agency (EPA, 1979) and subsequently the Organisation for Economic Cooperation and Development published the international regulations (OECD, 1982). These regulations embody a set of principles that provide a framework within which laboratory studies are planned, performed, monitored, recorded, reported and archived.

9.4. Case study 4: is Ecstasy that bad?

MDMA or ‘Ecstasy’ was reported to cause dopaminergic neurotoxicity in squirrel monkeys and baboons (Ricuarte et al., 2001). The primates were exposed to several sequential doses of MDMA, a regimen modelled after one used by humans. These results implied that, in addition to the serotonergic toxicity associated with MDMA, dopaminergic neurotoxicity that could lead to Parkinson’s disease would also occur. Indeed, they claimed that humans who used repeated doses of MDMA over several hours (as might happen at a rave party) are at high risk for incurring severe dopaminergic neural injury which may put them at risk of developing Parkinsonism.

These findings caused alarm and strengthened the anti-drugs campaigners’ arguments against the use of ecstasy (Anon, 2003).

The report was viewed with disbelief by many neuroscientists (Anon, 2003). One of the five squirrel monkeys and one of the five baboons dosed with MDMA died and a further two primates had to have the third planned dose omitted because of toxicity. If this was a dose regimen based on human use, why was mortality after using MDMA rare in humans? Several studies, including one by these same authors, failed to find reduced dopamine levels in the brain, or its metabolites in cerebrospinal fluid, in heavy MDMA users. Finally, although d-amphetamine and d-methamphetamine produce similar effects on dopamine levels, there is no evidence linking their use to Parkinson’s disease (Mithoffer et al., 2003). Ricuarte et al. (2003a) rebutted these concerns.

Finally, the report was withdrawn because further studies had failed to reproduce the findings. As a result, the original drug samples were checked. It was reported that the animals thought to have received MDMA were most likely to have received methamphetamine (Ricuarte et al., 2003b). Of course methamphetamine would be expected to produce dopaminergic neurotoxicity, thus explaining the anomalous results.

9.5. Case study 5: oestrogen synergy?

The effect of oestrogens on human health has been the subject of controversy for some time. In particular the speculation that two chemicals with oestrogenic activity could have synergistic activity became the subject of research.

Arnold et al. (1996) reported that combinations of two weak environmental oestrogens (such as dieldrin, endosulfan or toxaphene) were 1000 times as potent as any chemical alone. This was electrifying. The implications were that even low concentrations of these chemicals in combination would present a risk to human health.

The experiments were carried out in in vitro systems which relied on human oestrogen receptor transactivation. Others tried to replicate these findings but were unsuccessful (e.g. Ashby et al., 1997; Ramamoorthy et al., 1997).

Subsequently, McLachlan (1997) withdrew the report.

9.6. Case study 6: Debendox and all that

The case of Dr. William McBride is extensively recorded in the literature because a formal enquiry was carried out into allegations of falsification of data and because a reporter became interested and has published a book with details of the case (Nicol, 1989).

In the period of 4th May–6th June 1961, Dr. William McBride delivered three babies suffering from particular birth deformities which involved limb reduction. He submitted a paper to the Lancet which was rejected, but a second shorter version was published on 16th December 1961. Dr. William McBride came to prominence because he recognised that all three mothers had received thalidomide during their pregnancy, thus branding thalidomide as a human teratogen. Because of his fame he was able to attract public funding for the formation of Foundation 41, a charitable institution sponsoring studies on birth defects (Humphrey, 1992).

We move on to the 1980s. McBride became convinced that Debendox (also known as Bendectin), a medicine used in the treatment of morning sickness, was teratogenic because it contained an anti-cholinergic constituent. In August 1980 he arranged for his technical assistants, Phil Vardy and Jill French, to carry out an experiment in rabbits to assess the teratogenicity of scopolamine. Scopolamine, although not a constituent of Debendox, is an anti-cholinergic compound, as is dicyclomine, the anti-cholinergic constituent of Debendox.

In June 1982, reprints of a paper (McBride et al., 1982) were received at Foundation 41 and Phil Vardy first became aware of its contents. He noted that the number of rabbits reported in the paper was greater than the number used in the experiment he had conducted, and that other inconsistencies were present in the paper. This was the beginning of a long process which eventually resulted in McBride being found guilty of deliberately falsifying the experimental results. While the story of those proceedings makes fascinating reading (Nicol, 1989), this summary covers the main scientific issues raised (Nicol, 1989, Chapter 7; Humphrey, 1992).

McBride prepared a manuscript reporting an experiment in which scopolamine was administered to rabbits and chick embryos. The rabbit experiment had been preceded by administration of scopolamine to a few animals to establish the appropriate dose to be used. The manuscript, with Vardy as a co-author, reported on the administration of scopolamine to six rabbits by injection and six via the drinking water. One of the does receiving scopolamine via the drinking water had malformed foetuses. It was submitted to Toxicology and Applied Pharmacology with McBride and Vardy as authors and rejected. McBride then made several changes to the text, increasing the number of does in each part of the experiment to eight and including eight controls. In the group dosed via the drinking water, one of the additional does also had deformed foetuses and both had resorptions. The doses administered to the six does were different. The paper stated that the foetuses were sectioned to examine the brain, palate, abdominal and thoracic organs. It was submitted to Acta Pharmacologica et Toxicologica and rejected. A typographical error was corrected, Jill French’s name added to the author list and it was submitted to the Australian Journal of Biological Sciences which published it.

The original manuscript reported the experiment relatively accurately; however, the published manuscript differed substantially from the experiment. An enquiry initiated by Foundation 41 in June 1987 concluded that the paper was fraudulent in that many statements in it were not accurate (Humphrey, 1992). In his book Bill Nicol questions the accuracy of additional items; together the inaccurate statements were:

• Only six rabbits were used in each of the oral and injection experiments, not eight as stated.
• There were no control animals as was stated in the paper.
• The doses in the oral experiment had been altered from those administered to the rabbits. McBride modified the doses after comments from the editor of the Journal by taking the average of the water intake by the other four rabbits and substituting it for the seemingly high values (Nicol, 1989, p. 150).
• McBride claimed that the additional rabbits had been part of the experiment carried out in the USA by Jim Langman, Professor of Anatomy at the University of Virginia a year after the rabbits had been dosed in Australia. Although Professor Langman had since died and could not therefore corroborate the claim, there was no record in Professor Langman’s files of any work on rabbits. However, the Foundation 41 committee of enquiry concluded that it was highly unlikely that Professor Langman had treated the animals but this could not be proved beyond reasonable doubt (Nicol, 1989, p. 162).
• The paper claimed that the embryos had been sectioned to examine the brain, palate and thoracic and abdominal organs, but this had not been so.

Little was said about the intramuscular experiment, but the data had been altered in that experiment too. Apart from the glaring inconsistencies, the work was of poor scientific standard, e.g. in that no statistical analysis was carried out and there were no controls.

An idea of just how the additional rabbits in the oral experiment made the results much more striking is given in the tables.

Original manuscript:

Doe      Weight   Daily dose   Number living   Number     Number
number   (kg)     (µg/kg)      foetuses        deformed   resorptions
R67      2.64     895          7               0          0
R68      2.62     513          6               0          0
R69      4.62     504          6               0          0
R70      2.99     1582         5               0          0
R71      3.37     473          8               8          1
R72      2.77     624          7               0          0
Total                          39              8          1

Published report (values that differ from the original manuscript are marked *):

Doe      Weight   Daily dose   Number living   Number     Number
number   (kg)     (µg/kg)      foetuses        deformed   resorptions
R67      2.64     495*         7               0          0
R68      2.62     513          6               0          0
R69      4.02*    504          6               0          0
R70      2.99     582*         5               0          0
R71      3.37     473          8               8          1
R72      2.77     424*         7               0          0
+1*      3.42     520          4               4          2
+2*      2.96     483          5               0          1
Total                          48*             12*        4*
Controls (8)*
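The arithmetic behind the two oral-experiment tables can be checked with a short sketch. The values are transcribed from the tables above; the names `ORIGINAL`, `PUBLISHED`, `totals` and `changed_doses` are hypothetical and introduced only for illustration. This is a consistency check on the printed figures, not a statistical analysis of the experiment.

```python
# Rows are (doe, weight_kg, daily_dose_ug_per_kg, living, deformed, resorptions),
# transcribed from the two tables in the text. All names here are illustrative.
ORIGINAL = [
    ("R67", 2.64, 895, 7, 0, 0),
    ("R68", 2.62, 513, 6, 0, 0),
    ("R69", 4.62, 504, 6, 0, 0),
    ("R70", 2.99, 1582, 5, 0, 0),
    ("R71", 3.37, 473, 8, 8, 1),
    ("R72", 2.77, 624, 7, 0, 0),
]

PUBLISHED = [
    ("R67", 2.64, 495, 7, 0, 0),
    ("R68", 2.62, 513, 6, 0, 0),
    ("R69", 4.02, 504, 6, 0, 0),
    ("R70", 2.99, 582, 5, 0, 0),
    ("R71", 3.37, 473, 8, 8, 1),
    ("R72", 2.77, 424, 7, 0, 0),
    ("+1", 3.42, 520, 4, 4, 2),
    ("+2", 2.96, 483, 5, 0, 1),
]


def totals(rows):
    """Return (living, deformed, resorptions) summed over all does."""
    living = sum(row[3] for row in rows)
    deformed = sum(row[4] for row in rows)
    resorptions = sum(row[5] for row in rows)
    return living, deformed, resorptions


def changed_doses(original, published):
    """Map each doe present in both tables to (old, new) dose where they differ."""
    old = {row[0]: row[2] for row in original}
    return {
        row[0]: (old[row[0]], row[2])
        for row in published
        if row[0] in old and old[row[0]] != row[2]
    }


def does_with_deformities(rows):
    """Count the does that had at least one deformed foetus."""
    return sum(1 for row in rows if row[4] > 0)


if __name__ == "__main__":
    print(totals(ORIGINAL))    # (39, 8, 1), matching the printed Total row
    print(totals(PUBLISHED))   # (48, 12, 4), matching the printed Total row
    print(changed_doses(ORIGINAL, PUBLISHED))
    print(does_with_deformities(ORIGINAL), "of", len(ORIGINAL), "does")
    print(does_with_deformities(PUBLISHED), "of", len(PUBLISHED), "does")
```

On these figures the printed totals are internally consistent in both tables. The published version changes three of the six doses and moves from one affected doe in six to two in eight.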

Pressure from members of Foundation 41’s Research Advisory Committee forced McBride to submit a note to the Journal correcting the original article. An initial note, withdrawing Table 2, was not published. A second note reporting a repeat experiment was published (McBride, 1983). This meant that the information in the original article remained in the scientific literature (Nicol, 1989, p. 140).

In 1993 a New South Wales Medical Tribunal found McBride guilty of 14 counts of scientific fraud (Ragg, 1993a) and he was deregistered as a medical practitioner (Ragg, 1993b).

9.7. Case study 7: safety of oral contraceptives

Professor Michael Briggs was appointed Foundation Professor of Biology at Deakin University in Australia in 1977. His expertise was in the safety of oral contraceptives, a subject for which he had an international reputation.

Professor E.J.R. Rossiter was the chair of the University Ethics Committee. On the basis of concerns expressed to him about the work that Michael Briggs had published, he reported his concerns about Briggs’ work to the Chancellor of the University.

Michael Briggs published several important papers on the subject of the safety of oral contraceptives. Several allegations were made against him. The first was that a compound (desogestrel) which was reported to have been tested (Briggs and Briggs, 1980a) was at the time unavailable in Australia and that Briggs did not have an import permit. The second allegation was that he had received payment from a pharmaceutical company for results on desogestrel when in fact it was not available for testing. The third was that a study claiming to use 291 human volunteers (Briggs, 1980, 1981) did not in fact have that number of volunteers and thus the studies were not carried out to the extent claimed. In this study, blood protein measurements were reported, when no such measurements had been made. A fourth allegation was that measurements of FSH and LH reported in a 1980 paper (Briggs and Briggs, 1980b) were not carried out. Similarly, a fifth allegation related to a study in beagle dogs to measure progesterone receptors (Briggs, 1980). It was claimed that the measurements were carried out at Huntingdon Laboratories in the UK, when in fact they had not been (Rossiter, 1992).

In the event, Dr. Briggs resigned from Deakin University just before a committee of enquiry was due to consider his case.

An article in the Sunday Times in September 1986 revealed that Dr. Briggs admitted to generalising small-scale findings to large, apparently convincing trials. Dr. Briggs died in 1986 in Europe.

9.8. Case study 8: environmental oestrogens – are they really that potent?

Bisphenol A (BPA), an ingredient of plastics used in many products and therefore a chemical to which people are widely exposed, is oestrogenic. In a study aimed at establishing the doses at which BPA induced changes in the secondary sex organs in mice, increases were reported in the weights of prostate and preputial glands at doses of BPA of 2 µg/kg administered in utero (Nagel et al., 1997). This dose is 2500 times lower than the NOAEL derived from standard reproductive toxicology studies (5 mg/kg) and was thus of considerable importance.

So important was it that others attempted to reproduce the results (Ashby, 2003a). Two studies aimed at duplicating the results have failed to find any effects at the doses used by Nagel (Ashby et al., 1999) or at those and higher doses up to 200 µg/kg (Cagen et al., 1999). These results have been dismissed by a group of participants, including vom Saal, at conferences as ‘pollution of the literature and false negative endocrine toxicity data’ (Ashby, 2003b). First the technical competence of Ashby’s group was questioned, but a member of vom Saal’s group trained Ashby’s group in the technique of prostate dissection. Then high levels of phytoestrogens in Ashby’s laboratory diets were claimed to invalidate the results, but the phytoestrogen levels in the diets employed by vom Saal were higher. Then the environment in the animal rooms was considered an issue and finally it was suggested that the mice used in Ashby’s experiments were too heavy, rendering their reproductive tissues resistant to BPA-mediated effects; however, analysis of all the studies available renders this explanation unlikely (Ashby, 2003a).

It is important to note that only seven animals per group were used in Nagel’s original experiment while larger numbers were used in the repeats by Ashby and Cagen. Thus the argument that BPA causes effects at 2 µg/kg is based on observations on 14 dosed animals in two groups and 11 controls. Ashby used 66 animals in two dosed groups and 54 controls; Cagen used 87 animals in four dosed groups and 44 controls.

Nevertheless, Dr. Frederick vom Saal, the corresponding author on the original Nagel paper, continues to advocate that the effects of BPA and other oestrogenic chemicals occur at low doses and that there is an inverted dose-response relationship at these low doses. He has failed to publish any evidence or to demonstrate that the result is reproducible.

Acknowledgements

I thank Malcolm Lovibond for constructive criticism and Tim Hammond, Gerry Oliver and Suzanne Cotton for help in accessing the literature. From 1980 until 1998, I was the Director of the laboratory where John Ashby (Case study 8) works.

References

Abbott, A., 2004. Science comes to terms with the lessons of fraud. Nature 398, 13–17.
Anon., 2003. Editorial, New Scientist, 13 September, 2003, p. 5.
Anon., 2004. All above board. Editorial, New Scientist, 6 March, 2004.
Arnold, S.F., Klotz, D.M., Collins, B.M., Vonier, P.M., Guillette, L.J., McLachlan, J.A., 1996. Synergistic activation of estrogen receptor with combinations of environmental chemicals. Science 272, 1489–1491.
Ashby, J., 2003a. Endocrine disruption occurring at doses lower than those predicted by classical chemical toxicology evaluations: the case of bisphenol A. Pure Appl. Chem. 75, 2167–2180.
Ashby, J., 2003b. Ashby corrected. Endocrine/Estrogen Lett. 9, 8–10.
Ashby, J., Lefevre, P.A., Odum, J., Harris, C.A., Routledge, E.J., Sumpter, J.P., 1997. Synergy between synthetic oestrogens? Nature 385, 494.
Ashby, J., Tinwell, H., Haseman, J., 1999. Lack of effects for low dose levels of bisphenol A and diethylstilbestrol on the prostate gland of CF1 mice. Regul. Toxicol. Pharm. 30, 156–166.
BBSRC, 1998. Safeguarding Good Scientific Practice. Biotechnology and Biological Sciences Research Council. http://www.bbsrc.ac.uk/funding/overview/good practice.pdf? Accessed 15 March 2004.
Briggs, M., Briggs, M., 1980a. Reproduction 4 (Suppl.), 79 (from Rossiter, 1992).
Briggs, M., Briggs, M., 1980b. The Development of a New Triphasic Oral Contraceptive, 79 (from Rossiter, 1992).
Briggs, M., 1980. Res. Vet. Sci. 28, 199 (from Rossiter, 1992).
Briggs, M., 1980/1981. Bulletin of the Post Graduate Committee On Medicine of the 36: 148. and New Considerations in Oral Contraception Ed I Brossens, p. 9, 1981 (from Rossiter, 1992).
Brock, P.A., 2001. A pharmaceutical approach to the threat of research fraud, 3rd edition. BMJ Books, London.
Burger, M., Gould, C., 2002. Secrets and Lies: Wouter Basson and South Africa’s Chemical and Biological Warfare Programme. Zebra Press, Cape Town.
Cagen, S.Z., Waechter, J.M., Dimond, S.S., Breslin, W.J., Butala, J.H., Jekat, F.W., Joiner, R.L., Shiotsuka, R.N., Veenstra, G.E., Harris, L.R., 1999. Normal reproductive organ development in CF1 mice exposed in utero. Toxicol. Sci. 50, 36–44.
CCR, 2000. Centre for Conflict Resolution, University of Cape Town, Private Bag, Rondebosch 7701, South Africa, 2000. http://ccrweb.ccr.uct.ac.za/20b.html. Accessed March 2004.
Chalmers, I., 1990. Underreporting research is scientific misconduct. JAMA 263, 1405–1408.
CSPI, 2002. Center for Science in the Public Interest, Press Release, November 19, 2002.
Cohen, J.J., 2001. Trust us to make a difference. Ensuring public confidence in the integrity of clinical research. Acad. Med. 76, 209–214.
Enserink, M., 1999. The Lancet scolded over Pusztai paper. Science 286, 656.
EPA, 1979. Proposed health effects test standards for Toxic Substances Control Act test rules. Federal Register, 44: no. 91, pp. 27334–27375.
Ewen, S.W.B., Pusztai, A., 1999. Effect of diets containing genetically modified potatoes expressing Galanthus nivalis lectin on rat small intestine. Lancet 354, 1353–1354.
FDA, 1976. Proposed Regulations for Good Laboratory Practice Regulations. Federal Register, 41: no. 225, 51208–51230.
Grayson, L., 1995. Scientific deception. The British Library, London.
Humphrey, G.F., 1992. Scientific fraud: the McBride Case. Med. Sci. Law 32, 199–203.
Kraus, N., Malmfors, T., Slovic, P., 1992. Intuitive toxicology: expert and lay judgements of chemical risks. Risk Anal. 12, 215–232.
Kuiper, H.A., Noteborn, H.P.J.M., Peijnenburg, A.A.C.M.P., 1999. The Lancet 354, 1315–1316.
Lachmann, P., 1999. GM Food debate. Lancet 354, 69.
Lehman-McKeeman, L., Peterson, R.E., 2003. Guidelines governing conflict of interest. Toxicol. Sci. 72, 183–184.
Lock, S., Wells, F., Farthing, M., 2001. Fraud and Misconduct in Biomedical Research. BMJ Books, London.
Martin, B., 1992. Scientific fraud and the power structure of science. Prometheus 10, 83–89.
McBride, W.G., 1983. Note on the paper ‘Effects of scopolamine hydrobromide on the development of the chick and rabbit embryo’ by W.G. McBride, P. Vardy, J. French. Aust. J. Biol. Sci. 36, 171–172.
McBride, W.G., Vardy, P.H., French, J., 1982. Effects of scopolamine hydrobromide on the development of the chick and rabbit embryo. Aust. J. Biol. Sci. 35, 173–178.
McLachlan, J.A., 1997. Synergistic effect of environmental estrogens: report withdrawn. Science 277, 462–463.
Mertz, C.K., Slovic, P., Purchase, I.F.H., 1998. Judgements of chemical risks: comparisons among senior managers, toxicologists and the public. Risk Anal. 18, 391–404.

Mithoffer, M., Jerome, L., Doblin, R., 2003. MDMA (“Ecstasy”) and Neurotoxicity. Science 300, 1504.
Mowatt, A., 1999. GM Food debate. The Lancet 354, 69.
MRC, 2000. Good Research Practice. Medical Research Council. www.mrc.ac.uk. Accessed 15 March 2004.
Nagel, S.C., vom Saal, F., Thayer, M.G., Dhar, M.G., Boechler, M., Welshons, W., 1997. Relative binding affinity-serum modification access (RBA-SMA) assay predicts the relative in vivo bioactivity of the xenoestrogens bisphenol A and octylphenol. Environ. Health Perspect. 105, 70–76.
Nicol, B., 1989. McBride: Behind the myth. ABC Enterprises for the Australian Broadcasting Corporation, 20 Atchinson Street, Crows Nest, NSW, Australia.
Nigg, H.N., Radulescu, G., 1994. Scientific misconduct in environmental toxicology. JAMA 272, 168–170.
Oderberg, D.S., 2000. Moral theory. Blackwell Publishers, Oxford, UK.
OECD, 1982. Good Laboratory Practice in the Testing of Chemicals. Organisation of Economic Cooperation and Development, 75775, Cedex, Paris.
POST, 2000. Parliamentary Office of Science and Technology. The ‘great GM food debate’ – a survey of media coverage in the first half of 1999. Parliamentary Office of Science and Technology, Report no. 138.
Ragg, M., 1993a. Australia: McBride guilty of scientific fraud. Lancet 341, 550.
Ragg, M., 1993b. William McBride’s penalty. Lancet 342, 361–362.
Rang, H.P., Walton, Lord P., 1996. Sir William Drummond Macdonald Paton, C.B.E. Biographical Memoirs of The Royal Society, 291–314.
Ramamoorthy, K., Wang, F., Chen, I-C., Safe, S., Norris, J.D., McDonnell, D.P., Guido, K.W., Bocchinfuso, W.P., Korach, K.S., 1997. Potency of combined estrogenic pesticides. Science 275, 405.
Rennie, D., Gunsalus, C.K., 2001. Regulations on scientific misconduct: lessons from the US experience, 3rd edition. BMJ Books, London.
Ricuarte, G.A., Yuan, J., Hatzidimitriou, G., Cord, B.J., McCann, U.D., 2001. Severe dopaminergic neurotoxicity in primates after a common recreational dose of MDMA (‘Ecstasy’). Science 297, 2260–2263.
Ricuarte, G.A., Yuan, J., Hatzidimitriou, G., Cord, B.J., McCann, U.D., 2003a. Response. Science 300, 1504.
Ricuarte, G.A., Yuan, J., Hatzidimitriou, G., Cord, B.J., McCann, U.D., 2003b. Retraction. Science 301, 1479.
Riis, P., 2001. The concept of scientific dishonesty: ethics, value systems, and research. In: Lock, S., Wells, F., Farthing, M. (Eds.), Fraud and Misconduct in Biomedical Research, 3rd edition. BMJ Books, London, pp. 3–12.
Rossiter, E.J.R., 1992. Reflections of a whistle-blower. Nature 357, 434–436.
Slovic, P., Malmfors, T.J., Mertz, C.K., Neil, N., Purchase, I.F.H., 1997. Evaluating chemical risks: results of a survey of the British Toxicology Society. Hum. Exp. Toxicol. 16, 289–304.
Smith, R., 2001. Meeting calls for a national body to respond to research misconduct. Br. Med. J. 323, 889.
The Royal Society, 1999. Review of data on possible toxicity of GM potatoes. Statement 9/99, The Royal Society, London, June 1999.
World Association of Medical Editors. http://www.wame.org/wamestmt.htm. Accessed on 24 December 2004.