Report on the “Review of the Status of the Development of Alternatives to using Animals in Chemical Safety Testing and Identification of New Areas for Development or Research in the Context of the Proposed REACH Regulation”

May 2005

Report Prepared By:

Liverpool John Moores University and The Fund for the Replacement of Animals in Medical Experiments (FRAME)

Dr Mark Cronin School of Pharmacy and Chemistry Liverpool John Moores University Byrom Street Liverpool L3 3AF

e-mail: [email protected] Tel: 0151 231 2066 Fax: 0151 231 2170


A Report Prepared in Partial Fulfilment of Defra Contract Number: CPEC 42


Report Prepared for:

Dr John Garrod
UK Chemicals Policy
Chemicals and GM Policy Division
3/F7 Ashdown House
123 Victoria Street
London SW1E 6DE

Report Prepared By:

Liverpool John Moores University and The Fund for the Replacement of Animals in Medical Experiments (FRAME)

Dr Mark Cronin
School of Pharmacy and Chemistry
Liverpool John Moores University
Byrom Street
Liverpool L3 3AF

e-mail: [email protected]

Tel: 0151 231 2066 Fax: 0151 231 2170

Important Notices

Confidentiality

The contents of this report and its findings are strictly confidential and may not be released to any third party outside of the authors (QSAR and Modelling Research Group, School of Pharmacy and Chemistry, Liverpool John Moores University; Fund for the Replacement of Animals in Medical Experimentation) and the recipients (Department of Environment, Food and Rural Affairs).

Disclaimer

The information and conclusions in this report have been prepared in good faith. Whilst every effort has been made to ensure the accuracy of the information in this report the authors cannot be held responsible for any event or incident that may arise from actions taken as a result of information contained herein.

Contents

1. Introduction
  1.1 Registration, Evaluation and Authorisation of CHemicals (REACH)
  1.2 Alternatives to Animal Testing With Regard to REACH Legislation
  1.3 Purpose of this Progress Report
  1.4 Bibliography for Section 1
2. Survey of Alternative Methods to Animal Testing Pertinent to the Forthcoming REACH Legislation
  2.1 Introductory Comments
    2.1.3 Source of information for in silico methods
      2.1.3.1 Expert Systems for Predicting Toxicity and ADME Properties
  2.2 Eye Irritation
  2.3 Skin Irritation
  2.4 Skin Corrosivity
  2.5 Skin Sensitisation
  2.6 Acute Toxicity
  2.7 Chronic Toxicity
  2.8 Mutagenicity
  2.9 Carcinogenicity
  2.10 Development / Reproductive Toxicity
  2.11 Bioavailability – including Toxicokinetics, Metabolism,
  2.12 Acute Environmental Toxicity
  2.13 Chronic Environmental Toxicity
  2.14 Bioaccumulation (Environmental)
3. Additional Information on In Silico Methodologies With Regard to Their Usage in REACH
  3.1 International Efforts Associated with QSAR and In Silico Modelling
  3.2 Regulatory Use of QSAR
  3.3 Strategies for the Use of QSARs
  3.4 The Application Toolbox for QSAR
4. REACH Implementation Projects (RIPs)
  4.1 Introduction
  4.2 Description of REACH Implementation Projects
  4.3 Further Information on RIPs

1. Introduction

1.1 Registration, Evaluation and Authorisation of CHemicals (REACH)

The Registration, Evaluation and Authorisation of CHemicals (REACH) proposal for the regulation of chemical substances was published on the 29th October 2003 by the European Commission (European Commission, 2003a). The proposal had been preceded by an internet Consultation Document concerning REACH which was published by DG ENTR and DG ENV in May 2003. The draft REACH legislation addresses one of the key issues for the safety evaluation of chemicals, namely the long-held view that there is a lack of publicly available data.

A recent publication has evaluated the need for further testing under REACH (Pedersen et al., 2003). This takes into account current existing obligations and voluntary industry initiatives, and is based on a number of assumptions regarding the use of estimation techniques such as (Q)SARs and the outcome of screening tests and risk assessments. The report focussed on the testing costs for REACH but did not address the number of (vertebrate) test animals that could potentially be used as a consequence of implementing the new legislation.

1.2 Alternatives to Animal Testing With Regard to REACH Legislation

In order to counter the increase in the use of animals with regard to the forthcoming REACH legislation, the European Commission has made some suggestions for minimising the use of animals. These are summarised in Table 1.2 below.

Table 1.2. Proposed suggestions by the European Union for minimising animal testing in the REACH system (adapted from Combes et al., 2004)

1. Encouraging the use of validated in silico techniques (including quantitative structure-activity relationships, QSARs).
2. Promoting the development of new in vitro test methods.
3. Minimising the actual numbers of animals used in each of the tests needed.
4. Making obligatory the provision of data, cost sharing and the establishment of a substances information exchange forum to facilitate dialogue.
5. Requiring that all testing must comply with Good Laboratory Practice and other legislation to protect the use of animals in experiments.
6. Replacing animal tests wherever possible by the use of validated alternative methods.
7. Requiring that proposals to undertake tests from Annexes VII and VIII must be officially sanctioned.

1.2.1 Animal Testing and the Three Rs

The Three Rs concept as applied to laboratory animal experimentation was first proposed by Russell and Burch in their book “The Principles of Humane Experimental Technique” published in 1959. The Three Rs are summarised as follows (there are many references on this topic; see, for instance, Balls et al. (1995) and Festing et al. (1998)):

Reduction

Reduction is defined as a means of lowering the number of animals used to obtain information of a given amount and precision, without increasing the welfare costs to individual animals.

Refinement

Refinement is defined as any development leading to a decrease in the incidence or severity of procedures applied to those animals that have to be used.

Replacement

Replacement is defined as the use of any scientific method employing non-sentient material, which could replace procedures that use conscious living vertebrates.

The gradual adoption of the Three Rs has encouraged better scientific practice. The principles have been accepted to underpin the promotion of alternatives with regard to the REACH legislation (Combes et al., 2004).

While it is recognised that the ultimate goal is to entirely replace animal testing with in silico predictions and in vitro methods of hazard characterisation and risk assessment, it is also recognised that replacement alternatives are likely to be slower in their application to risk assessment than reduction and refinement initiatives, since there are few alternative methods that have the same predictive value as tests conducted in animals. Nevertheless, with an improvement in our understanding of the mechanisms of toxicity, opportunities for alternatives development, reduction and refinement are arising.

A reduction in the number of animals used is heavily dependent on careful experimental design to reduce or share animals, improved statistical analysis, and the sharing of information such as historical controls. Refinement is generally reliant on high standards of husbandry, environmental enrichment, housing requirements and companionship, the use of minimally invasive methods and an increase in the use of pain relief.

1.2.2 In vitro alternatives

In vitro methodologies are those which use cells or parts of tissues/organs from humans or animals to assess chemicals thus avoiding the use of a whole living organism (in vivo testing). They implicitly require the use of prediction models to extrapolate in vitro data such that this information mirrors an in vivo toxicity endpoint.
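A prediction model in this sense can be as simple as a cut-off applied to an in vitro readout. The sketch below is illustrative only: the 50% viability cut-off, the optical density inputs and the function name are assumptions made for the example, not values taken from any validated protocol.

```python
def classify_from_viability(treated_od, control_od, cutoff_percent=50.0):
    """Convert an in vitro viability readout into an in vivo-style hazard label.

    treated_od / control_od: optical densities (e.g. from a viability assay)
    for treated and untreated cultures. cutoff_percent is an assumed cut-off.
    """
    if control_od <= 0:
        raise ValueError("control reading must be positive")
    viability = 100.0 * treated_od / control_od
    label = "irritant" if viability <= cutoff_percent else "non-irritant"
    return viability, label


# Hypothetical readings, for illustration only
strong_effect = classify_from_viability(0.25, 1.0)
weak_effect = classify_from_viability(0.75, 1.0)
```

The point of the sketch is that the cut-off and the mapping to a hazard class are part of the method itself, which is why the prediction model has to be assessed together with the test during validation.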

Many in vitro methods exist and are used in academia and industry for screening. However, for regulatory use a method must be validated to assess its relevance and reliability for a specific purpose. This is necessary to ensure that it is reproducible when carried out by other scientists and in other laboratories and that it generates meaningful data that assists decision making regarding potential hazards of the chemical in question.

Types of in vitro tissue culture systems

There are various types of cell culture systems used for in vitro toxicology testing. Table 1.2.2 outlines their advantages and disadvantages as an aid to the relevance of specific cell culture systems to the different types of in vitro test discussed within this report.

Primary cells
Advantages:
• Obtainable from different target tissues
• Possess more tissue-specific characteristics
Disadvantages:
• Short lifespan in vitro
• Progressively lose in vivo properties
• Prone to contamination

Monolayers and monocultures
Advantages:
• Can be grown to confluency and subcultured
• Used as barrier models
• Used to quantify cell proliferation/growth
• Suitable for genetic manipulation
Disadvantages:
• Simplistic with limited interactions between cells
• Absence of other cell types, nervous, immune and endocrine systems

Co-cultures
Advantages:
• More than one cell type so resemble in vivo situation closely e.g. blood-brain barrier models
Disadvantages:
• Some cell combinations are incompatible with each other in culture
• Complicated/conflicting cell culture requirements

Continuous cell lines
Advantages:
• Readily available and continuous source of cells
• Avoids repeated cell isolation from animals or humans
Disadvantages:
• Tend to lose in vivo differentiation and take on properties induced by culture conditions
• Enter senescence and decline after a certain number of population doublings

Genetically engineered cell lines
Advantages:
• Generated by transforming cells with foreign DNA:
• DNA can confer cell line stability
• DNA may encode structural or functional proteins
• Used to create polymorphic cell line batteries
Disadvantages:
• Techniques are specialized
• Methods do not always lead to permanent changes
• Limited potential for altering cellular features

Immortalised cell lines
Advantages:
• Generated from human/animal cells by introducing oncogenes/telomere-controlling DNA
• Cells have cell line longevity but retain character
Disadvantages:
• The techniques are specialized
• There is not always permanent transformation

Stem cells
Advantages:
• Cells are able to differentiate into many cell types
• Can be from animals or humans
• Human adult stem cells can be used subject to availability and consent
Disadvantages:
• Limitations on cell types that can be generated
• Some species/strain limitations
• Ethical problems when using human embryonic stem cells

Tissue slices
Advantages:
• Represents complexity of the organ
• Easy cross-species comparisons
• Many organs from same donor can be obtained
• Allows histological and biochemical tests
• Allows co-culturing of slices from different organs
• Regional effects in same organ are particularly useful for metabolism studies
Disadvantages:
• Difficult to produce
• Exposure and activity of cells of slices can vary

Organotypic cultures
Advantages:
• Multilayered and spatially differentiated
• Exhibit cellular communication
• Good retention of in vivo physiology
• Can be generated from primary/immortalized cells
• Proprietary models available
Disadvantages:
• Correct culture conditions can be difficult to define
• Batch variation of proprietary models
• Limited lifespan

Perfused cultures
Advantages:
• Applicable to a variety of the systems above
• Perfusion restores media and removes metabolites
• Allows cells to grow for extended periods
• High cell densities possible
• Long-term repeat exposure/recovery
Disadvantages:
• Technically complex
• High risk of contamination

Reconstructed tissue cultures
Advantages:
• Since they are based on cells taken from a tissue and added to a tissue culture model they are more advanced than organotypic cultures
Disadvantages:
• Technically complex
• High risk of contamination

Whole organs
Advantages:
• Organ functions modelled closely
• Different cell types with cellular interactions
• Particularly useful for embryotoxicity studies
Disadvantages:
• Can be difficult to culture
• Limited culture life
• Must be freshly isolated from animals
• Very difficult to acquire human tissue

Table 1.2.2. The advantages and disadvantages of the various types of tissue culture systems used in in vitro toxicology (table adapted from Bhogal et al, 2005 with permission).

1.2.3 In silico alternatives

In silico technologies for alternatives to animal testing include any approach that attempts to formalise the relationship between chemical structure and biological activity. This includes the use of (quantitative) structure-activity relationships ((Q)SARs) and other molecular modelling techniques. The whole area of in silico techniques to predict the toxicity and fate of chemicals has been the subject of a recent review by Cronin and Livingstone (2004).
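To illustrate what formalising a structure-activity relationship can look like in its simplest form, the sketch below fits a one-descriptor linear QSAR by ordinary least squares. The choice of descriptor (log P), the toxicity scale (log(1/LC50)), the function names and all data values are assumptions invented for the example; real (Q)SARs are built on measured data and are checked for their domain of applicability.

```python
from statistics import mean


def fit_linear_qsar(descriptor, activity):
    """Least-squares fit of: activity = slope * descriptor + intercept."""
    if len(descriptor) != len(activity) or len(descriptor) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(descriptor), mean(activity)
    sxx = sum((x - x_bar) ** 2 for x in descriptor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(descriptor, activity))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, descriptor_value):
    """Estimate the activity of an untested chemical from its descriptor."""
    return slope * descriptor_value + intercept


# Hypothetical training set (invented values, for illustration only)
log_p = [1.0, 2.0, 3.0, 4.0]
log_inv_lc50 = [0.9, 1.8, 2.7, 3.6]

slope, intercept = fit_linear_qsar(log_p, log_inv_lc50)
estimate = predict(slope, intercept, 2.5)  # chemical inside the training range
```

The estimate is only meaningful for chemicals that resemble the training set; predicting far outside the descriptor range used to fit the model would be an extrapolation.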

Current uses of (Q)SARs to predict endpoints of regulatory significance have been reviewed recently by Cronin et al. (2003a, 2003b). These include applications for prioritisation of chemicals for further testing, for classification and labelling and ultimately for the registration of chemicals. The regulatory use of in silico techniques for the prediction of toxicity and fate is currently more advanced in North America than Europe by agencies such as the United States Environmental Protection Agency, Health Canada and Environment Canada.

Within the European Union greater usage of computational techniques to predict toxicity and fate is envisaged (Worth et al., 2004). This is due to the adoption of the so-called REACH (Registration, Evaluation and Authorisation of Chemicals) system (European Commission, 2003a). Another development within the EU is the 7th Amendment to the Cosmetics Directive (European Commission, 2003b), which foresees the phasing out of animal testing on cosmetics, combined with the imposition of marketing bans on cosmetics that have been tested on animals.

1.2.4 Other Approaches for the Replacement, Reduction and Refinement of Animal Tests

Optimisation of in vivo testing

There is hope that the number of animals in in vivo tests can be reduced. This view has recently been published in a critical analysis of the OECD health effects test guidelines for in vivo testing (Combes et al., 2004). The authors conclude that the opportunities for streamlining individual assays are very limited, but they put forward the view that in vivo testing can be made more efficient by, for example:

(a) performing only tests that provide relevant data
(b) eliminating redundant tests
(c) using one sex, wherever possible
(d) applying some tests simultaneously to the same animals
(e) making greater use of screens and preliminary testing.

Read-across

Read-across is probably the most effective tool to reduce animal testing under the REACH legislation. Its strength is that it combines in vivo testing information (experimental data) with the grouping of chemicals belonging to one family to predict the effects of the other “family members” for which no animal test data are available. A chemical category is a “family” of chemicals that have been grouped together because they share similar chemical structures and/or physicochemical properties, and are consequently considered to share similar environmental, ecotoxicological or toxicological properties. Chemical categories are “designed” on the basis of scientific considerations, including SAR, QSAR and read-across (where an endpoint value or classification for one chemical is used as the best estimate for a related chemical).
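The read-across step described above can be sketched as a nearest-analogue lookup within a category. This is a deliberately simplified illustration: using a single descriptor (log P) as the similarity measure, the class and function names, and all values are assumptions made for the example. In practice the choice of analogue rests on expert judgement about structure and mechanism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chemical:
    name: str
    log_p: float                      # descriptor used here to judge similarity
    endpoint: Optional[float] = None  # measured endpoint value; None if untested


def read_across(target, category):
    """Use the endpoint of the most similar tested category member as the estimate."""
    tested = [c for c in category if c.endpoint is not None]
    if not tested:
        raise ValueError("no tested category member to read across from")
    source = min(tested, key=lambda c: abs(c.log_p - target.log_p))
    return source, source.endpoint


# Hypothetical category (invented values, for illustration only)
category = [
    Chemical("analogue A", 1.2, 10.0),
    Chemical("analogue B", 2.1, 25.0),
    Chemical("analogue C", 3.4, 60.0),
]
target = Chemical("untested member", 2.3)
source, estimate = read_across(target, category)
```

Returning the source analogue alongside the estimate keeps the basis for the prediction visible, which matters when a read-across argument has to be justified.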

Thresholds of toxicological concern

Thresholds of toxicological concern (TTCs) are exposure threshold values for chemicals below which no significant risk is expected. TTCs have been developed both for human health and environmental health considerations (Bradbury et al., 2004). The approach is basically a worst-case estimate of the toxicity of a compound expressed as an exposure threshold. If exposure information shows that TTCs will not be reached in the human body, in food or in the environment, this could be used as a screening tool to set aside a chemical as being of “no concern”. If the measured or predicted exposure concentration comes close to the TTC, this could trigger a requirement for further information on the toxicity of the chemical. This is a promising approach that can limit testing when combined with adequate information on the use of and exposure to chemicals and chemicals in products. Until now this approach has not been frequently used.
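The screening logic described above can be sketched as a comparison of exposure with the TTC. The safety margin used to decide when exposure is “close to” the threshold, the function name and the example figures are all assumptions made for illustration; exposure and TTC must be expressed in the same units.

```python
def ttc_screen(exposure, ttc, margin=10.0):
    """Screen a chemical by comparing its exposure with a TTC.

    margin is an assumed factor: exposure must be at least `margin` times
    below the TTC before the chemical is set aside as of no concern.
    """
    if exposure < 0 or ttc <= 0 or margin < 1:
        raise ValueError("exposure must be >= 0, ttc > 0 and margin >= 1")
    if exposure * margin <= ttc:
        return "no concern"
    return "further information needed"


# Hypothetical figures in the same (arbitrary) exposure units
low_exposure = ttc_screen(exposure=0.1, ttc=1.5)
near_threshold = ttc_screen(exposure=1.0, ttc=1.5)
```

Because the TTC is a worst-case estimate, a result of “further information needed” does not indicate a hazard; it only means that exposure data alone are not sufficient to set the chemical aside.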

Exposure-based waiving

Increased realism in the exposure evaluation will allow stakeholders to eliminate a higher number of substances that are of no concern. In other words, if the exposure to a chemical can be predicted or measured adequately and the (predicted) exposure concentration is much lower than the concentration at which toxic effects are expected, animal testing may be waived.

1.2.5 Test Strategies for Implementing Alternatives into the REACH System

The implications of the REACH proposals for the levels of laboratory animals used for regulatory testing have prompted an urgent need for the development and adoption of new approaches to toxicity testing and risk assessment. Emphasis is placed on the adoption of integrated testing schemes that make maximum use of non-animal testing methods, and provide a more scientific assessment of human risk from exposure to chemicals. FRAME has proposed an overall scheme whereby non-animal methods can be used in conjunction with each other in a tiered approach, in order to provide a rapid and scientifically justifiable basis for the risk assessment of chemicals for their toxic effects in humans.

The strategy, particularly for existing chemicals, is based on the need for reliable and high quality exposure data first, leading to an assessment of the hazard information required for making a useful risk assessment. The scheme starts with a preliminary risk assessment process, involving the use of available information on exposure and hazard, followed by assessment that is initially based on physicochemical properties and (Q)SAR approaches. The (Q)SAR approach is used in conjunction with expert system modelling, and consideration of information on metabolism and the identification of the principal human metabolites (to estimate bioavailability, and relevance to human hazard). This information is then combined with an assessment of potential exposure based on production levels, distribution, and proposed chemical usage, to provide an acceptable and adequate risk assessment. It is recommended that additional testing should only be required where essential information is missing, rather than testing to cover all data gaps according to a generalised, check-list approach. In other words, the official waiving of the need for hazard data by regulatory agencies, on a case-by-case basis, is an integral component of the scheme. The selection of tests to provide this essential information should be based on the use of non-animal methods, as far as possible.
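The tiered logic of the scheme, in which testing is considered only for essential information that the earlier tiers cannot supply, can be sketched as follows. The tier names paraphrase the text above; the data structures, function name and example endpoints are assumptions made for illustration and are not part of the FRAME proposal itself.

```python
# Tiers in the order described in the text (names paraphrased)
TIERS = (
    "existing exposure and hazard information",
    "physicochemical properties and (Q)SAR",
    "expert systems and metabolism information",
)


def remaining_data_gaps(essential, evidence):
    """Walk the tiers in order and return essential endpoints still uncovered.

    essential: set of endpoints judged essential for the risk assessment
    evidence:  dict mapping a tier name to the endpoints it covers adequately
    """
    covered = set()
    for tier in TIERS:
        covered |= evidence.get(tier, set())
        if essential <= covered:
            break  # nothing essential is missing, so later tiers are not needed
    return essential - covered


# Hypothetical case, for illustration only
essential = {"skin irritation", "mutagenicity", "acute toxicity"}
evidence = {
    "existing exposure and hazard information": {"acute toxicity"},
    "physicochemical properties and (Q)SAR": {"skin irritation"},
}
gaps = remaining_data_gaps(essential, evidence)
```

Only the endpoints returned as gaps would be candidates for further testing, and the text recommends that such testing use non-animal methods as far as possible.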

1.3 Purpose of this Progress Report

The purpose of this report is to provide a “Review of the Status of the Development of Alternatives to using Animals in Chemical Safety Testing and Identification of New Areas for Development or Research in the Context of the Proposed REACH Regulation”.

This Report is organised by endpoint in Section 2, with general comments and recommendations in Sections 3 and 4.

Please note that for ease of use, bibliographic information is provided with each sub-section in Section 2.

1.4 Bibliography for Section 1

Bhogal N, Grindon C, Combes R, Balls M (2005) Toxicity testing: creating a revolution based on new technologies. Trends in Biotechnology, in press


Balls M, Goldberg AM, Fentem JH et al. (1995) The three Rs: The way forward - The report and recommendations of ECVAM workshop 11. ATLA 23: 838-866.

Bradbury SP, Feijtel TCJ, van Leeuwen CJ (2004) Ecological risk assessment: scientific needs in a regulatory context. Environ. Sci. Technol. 38: 463A-470A.

Combes R, Gaunt I, Balls M (2004) A scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208.

Cronin MTD, Jaworska JS, Walker JD, Comber MHI, Watts CD, Worth AP (2003b) Use of quantitative structure-activity relationships in international decision-making frameworks to predict health effects of chemical substances. Environmental Health Perspectives 111: 1391-1401.

Cronin MTD, Livingstone DJ (2004) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton, FL, USA.

Cronin MTD, Walker JD, Jaworska JS, Comber MHI, Watts CD, Worth AP (2003a) Use of quantitative structure-activity relationships in international decision-making frameworks to predict ecologic effects and environmental fate of chemical substances. Environmental Health Perspectives 111: 1376-1390.

European Commission (2003a) Proposal for a Regulation of the European Parliament and of the Council concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency and amending Directive 1999/45/EC and Regulation (EC) {on Persistent Organic Pollutants}.

European Commission (2003b) Directive 2003/15/EC of the European Parliament and of the Council of 27 February 2003 amending Council Directive 76/768/EEC on the approximation of the laws of the Member States relating to cosmetic products (Text with EEA relevance). Official Journal of the European Union L66: 26–35.

Festing MFW, Baumans V, Combes RD, et al. (1998) Reducing the use of laboratory animals in biomedical research: Problems and possible solutions - The report and recommendations of ECVAM workshop 29. ATLA 26: 283-301.

Pedersen F et al. (2003) Assessment of Additional Testing Needs under REACH. Effects of QSARs, Risk Based Testing and Voluntary Industry Initiatives. Report EUR 20863 EN. European Commission, Joint Research Centre: Ispra, Italy.

Worth AP, van Leeuwen CJ, Hartung T (2004) The prospects for using (Q)SARs in a changing political environment – high expectations and a key role for the Commission’s Joint Research Centre. SAR and QSAR in Environmental Research 15: 331-343.

2. Survey of Alternative Methods to Animal Testing Pertinent to the Forthcoming REACH Legislation

2.1 Introductory Comments

The following endpoints have been identified as priorities in terms of this review.

Mammalian (Human Health) Endpoints

• Eye irritation (Section 2.2)
• Skin irritation (Section 2.3)
• Skin corrosivity (Section 2.4)
• Skin sensitisation (Section 2.5)
• Acute toxicity (Section 2.6)
• Chronic toxicity (Section 2.7)
• Mutagenicity (Section 2.8)
• Carcinogenicity (Section 2.9)
• Reproductive/developmental toxicity (Section 2.10)
• Bioavailability, including toxicokinetics, metabolism, absorption and others (Section 2.11)

Environmental Endpoints

• Acute toxicity (Section 2.12)
• Chronic toxicity (Section 2.13)
• Bioaccumulation (Section 2.14)

For each of the endpoints listed above the following items will be addressed:

• Established toxicity tests (e.g. OECD Guidelines etc)
• Issues relating to the feasibility of applying the 3Rs to the endpoint. In particular this can be addressed by an appreciation of
  o The mechanistic understanding of each endpoint
  o The availability of reliable data for modelling and validation
  o How these factors may influence the search for alternatives
• In vitro methods
• In silico approaches
• Integrated testing strategies for specific endpoints
• Reduction and Refinement opportunities
• Scope for future work

2.1.1 Source of information for in vitro methods

The in vitro methods discussed were found through the following reports/websites:

- Animal testing and alternative approaches for the human health risk assessment under the proposed new European chemicals regulation. (2004). Hofer T et al, Arch. Toxicol., 78, 549-564.

- Alternative (Non-animal) methods for chemicals testing: Current status and future prospects. (2002). Eds Worth A. and Balls M. ATLA, 30 Supplement 1.

- The Way Forward – Action to end animal toxicity testing. Compiled for the BUAV by Gill Langley (2001). Website: http://www.buav.org/pdf/TheWayForward.pdf

- Prospects for the use of alternative test methods. Compiled by the Institute for Environment and Health (2001). Website: http://www.le.ac.uk/ieh/pdf/HoLfinal.pdf

- European Commission Enterprise DG (alternatives to animal testing for cosmetics) Website: http://pharmacos.eudra.org/F3/cosmetic/AnimalTest.htm

- The Database on Test Method Protocols (INVITTOX). Compiled by the European Centre for the Validation of Alternative Methods (ECVAM). Website: http://ecvam-sis.jrc.it/

The validation status of the resultant methods was determined using the ECVAM/ESAC website: http://ecvam.jrc.it/index.htm

2.1.2.1 Validation status of in vitro alternatives

Many in vitro alternatives for regulatory toxicity testing have been developed over the years. The first in vitro methods were adopted by the OECD in 1986. These were for mutagenicity testing, an area where the development of in vitro tests has been highly successful. There are now eight in vitro methods adopted for mutagenicity testing from twelve such methods within the OECD Health Effects Test Guidelines. Two additional methods are currently under review (see Table 2.1.2.1a).

Before an in vitro test method is adopted by the OECD, it must be fully validated by an independent body such as the European Centre for the Validation of Alternative Methods (ECVAM) or the American Interagency Co-ordinating Committee on the Validation of Alternative Methods (ICCVAM). Validation is the process by which the reliability (reproducibility) and relevance (scientific basis and predictive capacity) of a method are established for a particular purpose.

Test method: In vitro skin absorption
Description: Uses excised skin (human/animal). Monitors substance/metabolite concentrations in receptor fluid.
Endpoint: Skin absorption
OECD/EU test guideline reference: TG 428

Test method: Transcutaneous Electrical Resistance Test (TER)
Description: Monitors changes in electrical resistance following 24 hr exposure to the test substance as an indicator of loss of corneum integrity and barrier function. Uses (rat) skin discs from 28-30 day old, humanely killed animals.
Endpoint: Skin corrosion
OECD/EU test guideline reference: TG 430, B40

Test method: Human Skin Models (EpiDerm™, EPISKIN™)
Description: Reconstructed human epidermal equivalent (commercial system) used to assess cell viability using the MTT reduction test following test substance exposure.
Endpoint: Skin corrosion
OECD/EU test guideline reference: TG 431, B40

Test method: 3T3 NRU phototoxicity test
Description: Balb/c 3T3 (murine) cell line used. The cytotoxicity of test substance in the presence of non-cytotoxic levels of light using a Neutral Red Uptake test for cell viability. Not a true alternative; there is no in vivo equivalent.
Endpoint: Cell NR uptake
OECD/EU test guideline reference: TG 432, B41

Test method: Corrositex™ membrane barrier test
Description: An artificial barrier system coupled to a pH-based chemical detection system which utilises pH indicators and colorimetric determination to monitor passage of the test substance across the barrier.
Endpoint: Skin corrosion
OECD/EU test guideline reference: Draft TG 435

Test method: Bacterial Reverse Mutation test (Ames)
Description: Detects mutations that revert mutations present in strains of bacteria which affect ability of the bacteria to synthesize an essential amino acid. Revertant bacteria are detected by their ability to grow in the absence of the amino acid.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 471, B13/14

Test method: In vitro Mammalian Chromosome Aberration test
Description: Cells are exposed to the test substance, chromosomal preparations made and subjected to microscopy to determine gross structural changes.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 473, B10

Test method: In vitro Mammalian Cell Gene Mutation test
Description: Cell line based assay used to look at the mutagenicity of chemicals employing changes in DNA encoding a number of enzymes. Relies on functional bioassays being available and mutations causing changes in enzyme activity.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 476, B17

Test method: Sex-linked recessive lethal test
Description: Involves exposure of male fruit flies to the test substance and the germline transmission of mutations monitored through two successive generations.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 477

Test method: Sister Chromatid Exchange assay
Description: Cell based assay which involves exposure of cells in culture to the test substance, two rounds of division and then metaphase arrest and chromosomal preparation. Chromatid exchange is monitored by microscopy.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 479, B19

Test method: Gene mutation assay in yeast
Description: Haploid or diploid S. cerevisiae strains are exposed to the test substance and growth under different culture conditions used to monitor mutagenic potential similarly to the bacterial reversion test.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 480, B15

Test method: Mitotic Recombination assay in yeast
Description: Used to follow cross-over or gene conversion following exposure of yeast to the test substance. Relies on different growth requirements of mutated and wildtype yeast strains.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 481, B16

Test method: Unscheduled DNA Synthesis in Mammalian cells
Description: Mainly in vitro cell-based assay. Measures the DNA repair synthesis after deletions caused by the test substance. Based on the incorporation of radioactive nucleotides into the newly synthesised DNA and monitored by autoradiography or by scintillation counting of DNA from the treated cells.
Endpoint: Mutagenicity
OECD/EU test guideline reference: TG 482, B18

Test method: In vitro Micronucleus test
Description: Cell-based assay. Supplement to TG474. Cells are exposed to the test substances with/without metabolic activation, grown to allow chromosome damage and formation of micronuclei in interphase. Cells are stained and analysed microscopically for the presence of micronuclei.
Endpoint: Mutagenicity
OECD/EU test guideline reference: Draft TG 487

Table 2.1.2.1a. In vitro test methods which are validated and accepted for regulatory use by the OECD (http://www.oecd.org/home/; methods with draft guidelines are currently under review for acceptance).

For validation, the test must first be optimised and standardised, via a pre-validation study (Curren et al., 1995). The validation process then includes evaluation of inter- and intra-laboratory reproducibility, and the predictive ability of the test and its associated prediction model (Worth and Balls, 2004; Hartung et al., 2004). If the in vitro method is proposed as an alternative to an in vivo method it must also give results relevant to the final endpoint of the in vivo test. Once a method has been validated, it must then undergo peer review (at EU and/or at OECD level) before it is accepted by the regulatory bodies (OECD, 2003).
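As an illustration of how the predictive ability of a test and its prediction model might be summarised, the sketch below compares in vitro classifications with in vivo reference classifications for the same chemicals and reports sensitivity, specificity and overall concordance. These are commonly used summary statistics, but their use here, the function name and the data are assumptions made for the example rather than a description of any ECVAM or ICCVAM procedure.

```python
def predictive_capacity(in_vitro, in_vivo):
    """Compare paired boolean classifications (True = toxic/positive).

    in_vitro: predictions from the alternative method and its prediction model
    in_vivo:  reference classifications from the animal test
    """
    if len(in_vitro) != len(in_vivo) or not in_vitro:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(in_vitro, in_vivo))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference set needs both positives and negatives")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / len(pairs),
    }


# Hypothetical results for eight chemicals (invented, for illustration only)
in_vitro = [True, True, True, False, False, False, True, False]
in_vivo = [True, True, False, False, False, True, True, False]
summary = predictive_capacity(in_vitro, in_vivo)
```

Reproducibility would be assessed separately, for example by repeating the same comparison on results from different laboratories.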

The 7th Amendment to the EU Cosmetics Directive, which will halt all animal testing of cosmetic products or ingredients by 2013, has focused research into alternatives for the major toxicity endpoints, especially skin and eye irritation. Similarly, the impending REACH regulation, which will require up to 4 million animals to be used over 11 years for the risk assessment of a large number of chemicals, has increased efforts to develop in vitro methods for regulatory chemicals testing. Table 2.1.2.1b shows the status of various methods which are currently undergoing validation procedures at ECVAM (http://ecvam.jrc.it/index.htm) or ICCVAM (http://iccvam.niehs.nih.gov/home.htm).

Test method | Test system | Endpoint | Comments

Validated Methods

EpiOcular™ | Human keratinocyte derived model of the corneal epithelium. Assays are based on changes in barrier function. | Eye irritation | Industry validated, may undergo ECVAM retrospective validation. ESAC document under preparation
In vitro Micronucleus test | CHL/IU, CHO, SHE and V79 are commonly used cell lines. Cells are exposed to the test substances with/without metabolic activation, grown to allow chromosome damage and formation of micronuclei in interphase. Cells are stained and analysed microscopically for the presence of micronuclei. | Mutagenicity | Retrospective validation. Submitted to ESAC for review
Embryonic stem cell test | 3T3 or embryonic stem murine cell lines used to examine teratogenic potential. | Developmental toxicity | ESAC endorsed for use in a screening test
Post implantation rat whole embryo test | 10 days post gestation rat embryos are surgically removed, exposed to the test substance and morphology is assessed. | Developmental toxicity | ESAC endorsed for use in a screening test
Micromass test | Rat micromass cultures of limb bud are subjected to the test substance and inhibition of cell differentiation and division monitored. | Developmental toxicity | ESAC endorsed for use in a screening test
Colony Forming Unit – Granulocyte (CFU-GM) Assay | Murine bone marrow cells or human umbilical cord blood cells used to evaluate the inhibition of CFU-G growth to predict neutropenia. | Haematotoxicity | Validated for the prediction of acute neutropenia in humans as an alternative to second species testing

Methods Undergoing Validation

EPISKIN™ | Reconstructed human skin system. Assay is based on the formation of a coloured product in a mitochondrial-dependent reaction (MTT assay). This is dependent on the integrity of barrier function. | Skin irritation | Awaiting lab report
EpiDerm™ | Similar to EPISKIN™ | Skin irritation | Awaiting lab report
Neutral Red Uptake assay | Based on assessing uptake of neutral red dye by cells in culture following test substance exposure. | Basal cytotoxicity | Using Balb/c 3T3 and NHK cells - awaiting lab report

Pre-validated Methods

SkinEthic eye model | Epithelial corneal cell line used for cytotoxicity testing based on the MTT reduction assay. | Eye irritation | Industry pre-validated. ESAC document under preparation
Mouse skin integrity function test (SIFT) | Uses ex vivo mouse skin. Assessment of mouse stratum corneum integrity using trans-epidermal water loss (TEWL) and electrical resistance (ER) measurements. | Skin irritation | Failed phase 1 of validation study

Methods Undergoing Pre-validation or Evaluation

Tissue culture models | Neutral Red release and silicon microphysiometry or fluorescein leakage bioassays using human keratinocytes and MDCK cells, respectively. Red blood cell (RBC) haemolysis test. | Eye irritation | Models under review at ECVAM for possible retrospective validation
Organotypic models | Bovine Corneal Opacity and Permeability (BCOP) using post-mortem corneas, hen's egg test on the chorio-allantoic membrane (HET-CAM Assay) and isolated rabbit and chicken eye tests (IRE and ICE). | Eye irritation | Models under review at ICCVAM for possible retrospective validation
Perifusion systems | | Chronic toxicity |
Cell transformation assay | Using SHE and Balb/c 3T3 cell lines. | Carcinogenicity |
Frog Embryo Teratogenesis Assay Xenopus (FETAX) | Uses whole embryos and relies on the collection of mortality, malformation, and growth-inhibition data. | Developmental toxicity | Evaluated in 2000 – no updates since
Modified Leydig cell line | Analysis of progesterone production as a measure of the test substance effects on hormone production. | One/two generation study | For use as part of test battery
Testis slices | Assessment of steroid production capacity of the Leydig cells upon toxicant exposure of ex vivo rat tissue. | One/two generation study | For use as part of test battery
Human adrenocortical carcinoma cell line | Assay to allow entire steroid pathway effects to be mapped. | One/two generation study | For use as part of test battery
Placental microsomal aromatase assay | Monitors the ability of substances to affect steroid production. A subcellular microsomal assay is used industrially. | One/two generation study | For use as part of test battery
Rat uterine cytosol assay | Based on direct interaction of test substance with nuclear oestrogen receptors | Endocrine disruption/Reproductive toxicity | Partial replacement
Rat prostate cytosol assay | Based on direct interaction of test substance with nuclear oestrogen receptors | Endocrine disruption/Reproductive toxicity | Partial replacement
Androgen receptor transcriptional assay | Uses genetically engineered cell lines where reporter gene expression is regulated by an androgen-regulated responsive element. | Endocrine disruption/Reproductive toxicity | Partial replacement
Gut absorption models | Using Caco-2 cell line to monitor gut biotransformation and barrier function | Toxicokinetics/Bioavailability |
Blood brain barrier models | | Toxicokinetics/Bioavailability |
Fish cell lines | | Acute Environmental Toxicity |

Table 2.1.2.1b Methods currently undergoing independent evaluation or (pre)validation studies.

2.1.3 Source of information for in silico methods

2.1.3.1 Expert Systems for Predicting Toxicity and ADME Properties

2.1.3.2 Definition

According to ECVAM Workshop Report Number 24 (Dearden et al 1997) expert systems may be considered to be:

"… any formalised system, not necessarily computer-based, which enables a user to obtain rational predictions about the toxicity of chemicals. All expert systems for the prediction of chemical toxicity are built upon experimental data representing one or more toxic manifestations of chemicals in biological systems (the database), and/or rules derived from such data (the rulebase)."

Since this definition was stated, the role of expert systems has expanded to include the prediction of ADME properties, which will also be included in this report. This latter development is mainly due to the desire of the pharmaceutical industry to develop in silico screens for the assessment of new pharmaceutical substances.

Further details of expert systems are provided in Tables 2.1.3.3-2.1.3.5 and 3.5.

2.1.3.3 Summary of the Contact Details of Commercial and Non-Commercial Expert Systems for Toxicity and ADME Prediction

Table 2.1.3.3 Contact Details of Commercial and Non-Commercial Expert Systems for Toxicity and ADME Prediction

Name | Supplier | Web-Site | Address | Telephone | Fax | Contact Person (if known) and e-mail (if known)

TOPKAT | Accelrys | http://www.accelrys.com/products/topkat/ | Accelrys Ltd., 334 Cambridge Science Park, Cambridge CB4 0WN | 01223 228500 | 01223 228501 | Dr Zahra Parandoosh, [email protected] OR [email protected]
MCASE, CASE, CASETOX etc | MultiCASE Inc | http://www.multicase.com/ | MULTICASE Inc., 23811 Chagrin Blvd. Ste 305, Beachwood, OH 44122, USA | 001 216 831 3740 | 001 216 831 3742 | Prof Gilles Klopman, [email protected]
PASS | Laboratory of Structure Function Based Drug Design, V.N. Orekhovich Institute of Biochemical Chemistry | http://www.ibmh.msk.su/PASS/ | The Russian Academy of Medical Sciences, Pogodinskaya Str., 10, Moscow, 119992, Russia | 007 095 246 6980 (Number of Institute) | 007 095 245 0857 (Number of Institute) | Prof Vladimir Poroikov, [email protected] or [email protected]
ToxScope | LeadScope Inc. | http://www.leadscope.com/products/txs.htm | 1245 Kinnear Road, Columbus, Ohio 43212-1155, USA | 001 614 675 3777 | 001 614 675 3732 | Dr Gregory Beavers, [email protected]
DEREK for Windows, METEOR | LHASA Limited | http://www.chem.leeds.ac.uk/luk | LHASA Limited, 22-23 Blenheim Terrace, Woodhouse Lane, Leeds, LS2 9HD | 0113 2336533 | 0113 2336535 | Ms Edith Laraway, [email protected]
OncoLogic® | LogiChem Inc., US EPA | http://www.epa.gov/opptintr/cahp/actlocal/can.html | N.B. This system is being updated to allow free distribution in 2005 | | | Dr Yintak Woo, [email protected]
HazardExpert | CompuDrug | http://www.compudrug.com | CompuDrug International, Inc., 115 Morgan Drive, Sedona, AZ 86351, USA | 001 928 284 4757 | 001 928 284 4775 | Dr Aida Citti, [email protected]
EPISuite (including ECOSAR, DERMWIN etc) | US EPA | Free to download at: http://www.epa.gov/oppt/exposure/docs/episuitedl.htm | | | |
OASIS / TIMES | University of Bourgas | http://omega.btu.bg/ | | | | Prof Ovanes Mekenyan, [email protected]
Tox Boxes / ADME Boxes | Pharma Algorithms | http://www.ap-algorithms.com/index.html | A. Mickeviciaus g. 29, LT-08117, Vilnius, Lithuania | 00370 5 262 4032 | 00370 5 262 3728 | Contact not known, [email protected]
ADMEWORKS | FQS Poland | http://www.fqspl.com.pl/?a=product_viewandid=18 | FQS Poland Sp. z o. o., ul. Starowislna 13-15, Palac Pugetow, 31-038 Krakow, Poland | 0048 12 429 43 45 | 0048 12 429 61 24 | Contact not known, [email protected]
PK-MAP | Nimbus Biotechnology | http://www.nimbus-biotechnology.net/products_pkmap.html | NIMBUS Biotechnology GmbH, Eilenburger Str. 4, 04317 Leipzig, Germany | 0049 341 4793 350 | 0049 341 4793 349 | Contact person not known, [email protected]
ADMET Predictor | Simulations Plus | http://www.simulations-plus.com/products/predictor.html | Simulations Plus, Inc., 1220 W. Avenue J., Lancaster, CA 93534-2902, USA | 001 661 723 7723 | 001 661 723 5524 | Contact not known, [email protected]
TerraQSAR | TerraBase Inc | http://www.terrabase-inc.com/ | TerraBase Inc., 1063 King St. W., Suite 130, Hamilton, ON, L8S 4S3, Canada | 001 905 802 0154 | 001 905 527 0263 | Dr Klaus Kaiser, [email protected]
PBT Profiler | US EPA | Freely available at: http://www.pbtprofiler.net/ | | | |


2.1.3.4 Summary of the Toxicity Endpoints Predicted by Commercial and Non-Commercial Expert Systems

Table 2.1.3.4 Summary of the Toxicity Endpoints Predicted by Commercial and Non-Commercial Expert Systems

Systems (table columns, in order): TOPKAT | MCase | PASS | ToxScope(a) | DEREK | OncoLogic | HazardExpert | ECOSAR | OASIS / TIMES | Tox Boxes | ADMEWORKS | ADMET Predictor | TerraQSAR | PBT Profiler

Mammalian (Human Health) Endpoints

Eye irritation: 1 1 1 1
Skin irritation: 1 1 1
Skin corrosivity:
Skin sensitisation: 1 1 1 1 1
Acute toxicity: 1 1 1 1 1 1 1 1
Chronic toxicity: 1 1 1 1i 1 1
Mutagenicity: M M M 1 1 1 1 1
Carcinogenicity: M M M 1 1 M 1 1 1 1
Reproductive/developmental toxicity: M M M 1 1

Environmental Endpoints

Acute toxicity: 1 1 M 1 1 1 1
Chronic toxicity: M 1
Bioaccumulation: 1 (BCF) 1 1 1

1 = one model available; M = many models available; I = by implication
(a) This is a database tool

2.1.3.5 Summary of the ADME Endpoints Predicted by Commercial and Non-Commercial Expert Systems

Table 2.1.3.5 Summary of the ADME Endpoints Predicted by Commercial and Non-Commercial Expert Systems

Systems (table columns, in order): MCase | Meteor | Metabol-Expert | EPIWIN | Compact | TIMES (OASIS) | Molinspiration | ADME Boxes | ADME-WORKS | PK-MAP | ADMET-Predictor

Mammalian (Human)

Bioavailability (toxicokinetics): M 1 M
Bioavailability (metabolism): M 1 1 1 1 1 M 1
Bioavailability (absorption): M Skin 1 1 M 1

1 = one model available; M = many models available

2.1.3.6 Bibliography for Section 2.1

Combes RD, Rodford RA (2004) The use of expert systems for toxicity prediction: illustrated with reference to the DEREK program. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 193-204.

Curren RD, Southee JA, Spielmann H, Liebsch M, Fentem JH, Balls M (1995) The role of prevalidation in the development, validation and acceptance of alternative methods. ECVAM Prevalidation Task Force Report 1. ATLA 23: 211-217.

Dearden JC, Barratt MD, Benigni R, Bristol DW, Combes RD, Cronin MTD, Judson PM, Payne MP, Richard AM, Tichy M, Worth AP, Yourick JJ (1997) The development and validation of expert systems for predicting toxicity. The report and recommendations of an ECVAM/ECB workshop (ECVAM workshop 24). ATLA 25: 223-252.

Greene N (2002) Computer systems for the prediction of toxicity: an update. Advanced Drug Delivery Reviews 54: 417-431.

Hartung T, Bremer S, Casati S, Coecke S, Corvi R, Fortaner S, Gribaldo L, Halder M, Hoffmann S, Janusch Roi A, Prieto P, Sabbioni E, Scott L, Worth A, Zuang V (2004) A modular approach to the ECVAM principles on test validity.

Helma C (2005) Predictive Toxicology. Marcel Dekker / CRC Press.

Langowski J, Long A (2002) Computer systems for the prediction of xenobiotic metabolism. Advanced Drug Delivery Reviews 54: 407-415.

OECD (2003) Draft Guidance document on the validation and international acceptance of new or updated test methods for hazard assessment. OECD Environment, Health and Safety Publications, Series on Testing and Assessment No. 34.

Tunkel J, Mayo K, Austin CE, Hickerson A, Howard P (2005) Practical considerations on the use of predictive models for regulatory purposes. Environmental Science and Technology 39: 2188-2199.

Worth AP, Balls M (2004) The Principles of Validation and the ECVAM Validation Process. ATLA Suppl. 1: 623-629.

2.2 Eye Irritation

2.2.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Eye irritation is covered in both Annex V and Annex VI of the REACH proposals. Annex V states that the assessment of eye irritation should comprise the following steps:

1. an assessment of available human and animal data
2. an assessment of the acid or alkaline reaction
3. an in vitro study for eye irritation

If the test substance is found to be corrosive, a strong acid (pH < 2.0) or base (pH > 11.5), flammable in air at room temperature or very toxic in contact with the skin, then the in vitro study need not be conducted and the test substance is classified accordingly.

Annex VI states that an in vivo eye irritation study should be carried out if data from Annex V studies are inadequate to quantify the eye irritancy potential of the test substance. At present there are no validated in vitro methods for eye irritation, therefore it is likely that in vivo studies will continue.
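The Annex V waiver conditions described above amount to a simple screening rule. The sketch below is illustrative only: the class and field names are hypothetical, and only the pH limits (below 2.0 and above 11.5) and the four conditions themselves are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubstanceProfile:
    """Hypothetical record of the properties named in the Annex V waiver text."""
    corrosive: bool = False
    ph: Optional[float] = None  # None when no pH measurement is available
    flammable_in_air_at_room_temp: bool = False
    very_toxic_by_skin_contact: bool = False


def in_vitro_eye_study_waived(s: SubstanceProfile) -> bool:
    """Return True if any of the waiver conditions applies."""
    strong_acid = s.ph is not None and s.ph < 2.0
    strong_base = s.ph is not None and s.ph > 11.5
    return (
        s.corrosive
        or strong_acid
        or strong_base
        or s.flammable_in_air_at_room_temp
        or s.very_toxic_by_skin_contact
    )
```

For example, `in_vitro_eye_study_waived(SubstanceProfile(ph=1.5))` returns `True`, whereas a neutral, non-corrosive substance would proceed to the in vitro study.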

Method outline:

OECD TG 405 (REACH B5) was updated in April 2002 to include a testing strategy. This strategy is not an integral part of the test method but is recommended.

Testing strategy:

1. Evaluate existing human and animal data
2. Perform SAR analysis
3. Assess physicochemical properties and chemical reactivity (pH and buffering capacity)
4. Consider other existing information (from systemic toxicity via dermal route)
5. Consider results from in vitro or ex vivo tests (if a chemical has been shown to be corrosive or irritant in a validated in vitro or ex vivo test it need not be tested further in animals)
6. Assess in vivo skin corrosion/irritancy (if positive in this animal test further in vivo tests need not be carried out)
7. Assess in vivo eye corrosion/irritancy
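The stepwise character of this strategy, in which testing stops as soon as one tier yields a classification, can be sketched as follows. This is a minimal illustration and not part of OECD TG 405: the dictionary keys and function names are hypothetical, and the pH cut-offs in tier 3 are borrowed from the Annex V text as an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A tier inspects the substance record and returns a classification if it can
# decide, or None to pass the substance on to the next tier.
TierFn = Callable[[Dict], Optional[str]]


def stored_result(key: str) -> TierFn:
    """Build a tier that simply reads a (hypothetical) pre-recorded result."""
    return lambda substance: substance.get(key)


def physchem_tier(substance: Dict) -> Optional[str]:
    """Tier 3. The pH cut-offs are borrowed from the Annex V text (assumption)."""
    ph = substance.get("ph")
    if ph is not None and (ph < 2.0 or ph > 11.5):
        return "corrosive"
    return None


# Tier order follows the seven steps listed above; key names are illustrative.
TIERS: List[Tuple[str, TierFn]] = [
    ("1 existing human and animal data", stored_result("existing_data")),
    ("2 SAR analysis", stored_result("sar_prediction")),
    ("3 physicochemical properties", physchem_tier),
    ("4 other existing information", stored_result("dermal_systemic_toxicity")),
    ("5 in vitro or ex vivo tests", stored_result("in_vitro_result")),
    ("6 in vivo skin corrosion/irritancy", stored_result("in_vivo_skin_result")),
    ("7 in vivo eye corrosion/irritancy", stored_result("in_vivo_eye_result")),
]


def run_strategy(substance: Dict) -> Tuple[str, str]:
    """Return (deciding tier, classification); stop at the first tier that decides."""
    for name, tier in TIERS:
        outcome = tier(substance)
        if outcome is not None:
            return name, outcome
    return "none", "no classification reached"
```

With this layout a substance recorded as `{"ph": 12.5}` is classified at tier 3 and never reaches the animal tests, which is the point of placing the non-animal tiers first.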

In vivo Draize ocular irritancy test

The test substance is applied in a single dose to one of the eyes of the experimental animal (usually albino rabbits); the untreated eye serves as control. A single animal should be used first, with up to two more being used to confirm non-corrosive results. The degree of eye irritation/corrosion is evaluated by scoring lesions of conjunctiva, cornea and iris at specific time intervals. Other effects in the eye and adverse systemic effects are also recorded to provide a complete evaluation. The duration of the test should be sufficient to evaluate the reversibility or irreversibility of the effects.

Number of animals used – 1-3

2.2.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

A number of mechanisms for eye irritation have been established. These indicate that a chemical may be classified as an irritant through a number of very different biological responses. It is probable that many chemicals will act by a variety of different mechanisms to elicit irritation.

The (in silico) modelling of eye irritation is complex due to the multiple mechanisms. In addition, the development of qualitative models (i.e. SARs) is complicated by the differences in classifications between, for example, OECD, EU and North American guidelines.

Some “high quality” data are available for modelling purposes. Bagley et al (1999) provided a data bank of 149 in vivo rabbit eye irritation data for 132 chemicals. Care was taken to select only high quality data, and the chemicals tested were known to be available at high and consistent purity and were expected to be stable in storage. All in vivo data compiled had been generated since 1981 in studies carried out according to OECD Test Guideline 405 and following the principles of Good Laboratory Practice.

2.2.3 In vitro methods

At present, no alternative tests are fully validated and accepted for regulatory use. The testing strategy proposed for eye irritation by the OECD includes steps where validated methods can be used. There are a large number of potential replacements to the Draize eye test, although none of these alternatives can reproduce all aspects of the in vivo method and so are most likely to be used in combinations of complementary tests to form test batteries. Table 2.2.3 shows the currently available alternative methods and the endpoints they measure.

Method | Test system | Endpoints measured | Developers/References

EpiOcular™ | Reconstituted human corneal epithelium | Cell viability, release of inflammatory mediators, membrane permeability | www.mattek.com
SkinEthic HCE model | Reconstituted human corneal epithelium | Cell viability, histology, release of inflammatory mediators | www.skinethic.com
Neutral Red Release (NRR) assay (INVITTOX No. 54) | Rabbit corneal/mouse 3T3-L1 cells/normal human keratinocytes | Damage to cell membrane | Reader et al, 1989, 1990
Red Blood Cell haemolysis (RBC) assay (INVITTOX No. 37/99) | RBCs from calf blood samples | Damage to cytoplasmic membrane in combination with damage to liberated cellular proteins | Pape et al, 1987; Pape and Hoppe, 1990
Fluorescein Leakage (FL) assay (INVITTOX No. 71) | Madin-Darby canine kidney (MDCK) cells | Damage caused to the tight junctions in MDCK monolayers | Tchao, 1988; Shaw et al, 1990, 1991
Silicon Microphysiometer (SM) (INVITTOX No. 97/102) | Normal human keratinocytes/L-929 mouse fibroblasts | Decrease in the extracellular acidification rate | Bruner et al, 1991; McConnell et al, 1992; Catroux et al, 1993
Neutral Red Uptake (NRU) assay (INVITTOX No. 46) | 3T3-L1 cells; Balb/c 3T3 cells | Cell viability via neutral red dye uptake | Borenfreund and Borrero, 1984
Agarose diffusion method (INVITTOX No. 31/40) | L-929 mouse cells; mammalian corneal cell culture | Cell viability via NRU or MTT | Wallin et al, 1987; Jackson et al, 1987; Milstein and Hume, 1991
Bovine Corneal Opacity and Permeability test (BCOP) (INVITTOX No. 124) | Excised cornea from the bovine eye | Effects on opacity and permeability of the cornea | Gautheron et al, 1992, 1994
Hen’s Egg test (HET-CAM) (INVITTOX No. 47/96) | Hen’s egg | Damage to egg chorioallantoic membrane showing acute irritation potential to mucous membranes | Luepke, 1985; Gilleron et al, 1997; de Silva et al, 1992
Chorio-allantoic membrane – trypan blue staining (CAM-TBS) (INVITTOX No. 108) | Hen’s egg | Damage to egg chorioallantoic membrane | Hagino et al, 1991, 1993
Isolated chicken eye test (ICE) (INVITTOX No. 80) | Isolated chicken eye | Assessment of corneal swelling, corneal opacity and fluorescein retention | Burton et al, 1981; Price and Andrews, 1985; Prinsen and Koeter, 1993
Isolated rabbit eye test (IRE) (INVITTOX No. 85) | Isolated rabbit eye | Assessment of corneal swelling, corneal opacity and fluorescein retention | Burton et al, 1981; Whittle et al, 1992
Irritection assay system (IAS, formerly EYTEX™ – INVITTOX No. 110) | Macromolecular matrix which mimics the cornea | Perturbation or denaturation of corneal proteins | www.invitrointl.com
Pollen tube growth test (PTG) (INVITTOX No. 55) | Pollen tube | Inhibition of pollen tube growth in vitro | Kappler and Kristen, 1987; Kristen et al, 1993
Mucosal irritation model using slugs | Arion lusitanicus | Membrane damage by the release of proteins, LDH and ALP | Adriaens and Remon, 2002; Ceulemans et al, 2001

Table 2.2.3. In vitro alternatives for eye irritancy testing. INVITTOX protocols can be found from the ECVAM SIS website http://ecvam-sis.jrc.

Several validation studies have been carried out (1991-1997) by using many of these methods, but unfortunately none has led to any validated alternatives for regulatory use. These studies are:

1. The European Commission/Home Office (EC/HO) study (Balls et al, 1995)
2. The European Cosmetic, Toiletry and Perfumery Association (COLIPA) study (Brantom et al, 1997)
3. The Bundesgesundheitsamt/German Department of Research and Technology (BGA/BMBF) study (Spielmann et al, 1993, 1996)
4. The Cosmetics, Toiletries and Fragrances Association (CTFA) study (Gettings et al, 1991, 1994, 1996)
5. The Interagency Regulatory Alternatives Group (IRAG) study (Bradlaw et al, 1997)
6. The Japanese Ministry of Health and Welfare/Japanese Cosmetic Industry Association (MHW/JCIA) study (Ohno et al, 1994, 1995)

The outcome of all of these studies has been summarised in ECVAM Workshop Report 34 on ‘Eye Irritation Testing: The Way Forward’ (Balls et al, 1999). The apparent lack of success has been attributed to various aspects of the validation studies (e.g. in vitro tests being able to model the complex in vivo eye irritation response only partially; inappropriate choice of test substances) and also to the fact that the in vivo Draize test, against which they were all validated, itself often provides very variable estimates of eye irritancy. The report suggests the use of reference standards to overcome this problem, although subsequent testing only marginally increased predictability (Worth and Balls, 2002). These validation studies did show that combinations of tests could give better predictions of eye irritancy than tests on their own. Additional studies showed that combinations of data from epithelial integrity studies (FL), ex vivo models (IRE, ICE) and cytotoxicity tests (NRU) explained more of the variability in the data than any single test used alone (Balls et al, 1995).

Combinations of assays could be used to form a test battery to predict eye irritation, although a careful choice of assays is crucial. Tests with similar mechanisms would serve to confirm each other's results, increasing reliability, whereas tests with differing mechanisms would provide a broader screen, increasing the predictive ability of the test battery (Knight and Breheny, 2002).

ICCVAM and ECVAM are currently investigating a number of in vitro eye irritation models with a view to retrospective validation. ECVAM is reviewing the tissue culture models NRR, RBC, FL and SM and has pre-validated the two reconstituted human corneal models. ICCVAM has reviewed the ex vivo and organotypic models BCOP, HET-CAM, IRE and ICE (ICCVAM, 2005) and concluded that, for use in tiered testing strategies:

- BCOP is an accurate and reliable model for the prediction of severe eye irritants for certain substances only
- HET-CAM is useful for the prediction of severe eye irritants but further optimisation is required before a validation study can be conducted
- IRE appears to be capable of predicting severe eye irritants but more substances need to be tested to prove the reliability of the test
- ICE did not meet the criteria for ICCVAM validation because: the intra-laboratory reliability was not adequately evaluated; the raw data were not available for review; detailed diagrams of the apparatus required were not made available to allow for transferability of the test set-up

At a joint ECVAM/ICCVAM meeting in February 2005, two similar in vitro based testing strategies were identified: a ‘top down’ approach and a ‘bottom up’ approach (Chantra Eskes, personal communication). In the ‘top down’ approach, step one involves the prediction of severe or non-severe irritancy. If irritancy is deemed to be non-severe, step two proceeds to predict classified or non-classified irritancy. If irritancy is then classified, step three either classifies the substance in question as an irritant or proposes that it is subjected to a Draize rabbit test. The ‘bottom up’ approach tests for classified irritancy at step one, before determining severe irritancy at step two; step three is the same as in the ‘top down’ approach. Classification is the level of toxicity which corresponds to a specific hazard label; for example, a European classification of R41 would equal ‘severe eye irritant’, whereas that of R36 would equal ‘irritating to the eyes’.
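The two strategies differ only in the order in which the first two questions are asked, which the following sketch makes explicit. It is a hedged illustration, not the scheme agreed at the meeting: the three predictor callables stand in for whatever in vitro tests would be used, and the criterion applied at step three (`can_classify_in_vitro`) is a hypothetical placeholder, since the text does not say how that choice is made.

```python
from typing import Callable

# Each predictor takes a substance identifier and returns a yes/no call.
Predictor = Callable[[str], bool]

SEVERE = "severe eye irritant (R41)"
IRRITANT = "irritating to the eyes (R36)"
NOT_CLASSIFIED = "not classified"
REFER = "refer to Draize rabbit test"


def step_three(substance: str, can_classify_in_vitro: Predictor) -> str:
    """Shared final step: classify as irritant or fall back to the animal test."""
    return IRRITANT if can_classify_in_vitro(substance) else REFER


def top_down(substance: str, is_severe: Predictor, is_classified: Predictor,
             can_classify_in_vitro: Predictor) -> str:
    if is_severe(substance):          # step one: severe or non-severe?
        return SEVERE
    if not is_classified(substance):  # step two: classified or not?
        return NOT_CLASSIFIED
    return step_three(substance, can_classify_in_vitro)


def bottom_up(substance: str, is_severe: Predictor, is_classified: Predictor,
              can_classify_in_vitro: Predictor) -> str:
    if not is_classified(substance):  # step one: classified or not?
        return NOT_CLASSIFIED
    if is_severe(substance):          # step two: severe or non-severe?
        return SEVERE
    return step_three(substance, can_classify_in_vitro)
```

A design consequence worth noting is that the ‘top down’ order never runs the classified/non-classified test on a severe irritant, while the ‘bottom up’ order never runs the severity test on a non-classified substance.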

The meeting also identified areas for follow up, including: reversibility of irritation; evaluation of applicability domains; definition of decision criteria and comparisons to human data.

As some European Competent Authorities already allow the use of alternatives (e.g. IRE, BCOP, HET-CAM) for the assessment of severe eye irritants (Worth and Balls, 2002), it has been suggested that ECVAM should perform a validation study so that a European-wide system for the classification of severe eye irritants is produced that involves the use of in vitro methods only.

The detection of mild and moderate eye irritants is more complicated. To distinguish between mild and moderate irritants, a test method needs to assess reversibility, because the degree of persistence and reversibility of the effect, rather than just the extent of initial toxicity, is a major factor determining the difference in classification. Assays for this could be based on the release of inflammatory mediators (Balls et al, 1999). The reconstituted human corneal epithelium models, the EpiOcular™ assay and the SkinEthic HCE model, therefore appear most likely to be able to determine lower levels of irritancy. In fact, the manufacturers of EpiOcular™, the Mattek Corporation (www.mattek.com), claim that it can predict mild, milder and mildest formulations.

A study by Unilever involved testing the EpiOcular™ system and four other in vitro alternatives (BCOP, IRE, FL, NRR; Jones et al, 2001) for their ability to predict eye irritancy potentials of hair care products. The results showed that, when using carefully derived prediction models, EpiOcular™, FL and IRE were all able to predict mild eye irritancy potentials satisfactorily, but exhibited varying abilities to predict moderate and substantial irritancy potentials. For overall predictability, the IRE performed best, with EpiOcular™ and FL being unable to distinguish between different levels of higher irritancy potentials. BCOP and NRR were the least predictive systems, being unable to distinguish between any of the irritancy potentials, confirming the results of an earlier study on BCOP and IRE (Cooper et al, 2001). Further studies need to be conducted to assess whether this predictability of mild irritancy can be repeated in substances other than hair care products.

More recent studies have discussed further developments in predicting mild eye irritants. Lilja and Forsby (2004) have developed a neuronal cell model to measure neuronal stimulation by test substances and Debbasch et al. (2005) have developed a novel method which uses a human corneal epithelial cell line, with cytotoxicity and IL-8 release measured. The development of models that include a neuronal component for detecting irritation is considered to be an important consideration for their overall improvement (Garle and Fry, 2003). With further evaluation, these models could potentially be used alongside some of the more common alternatives in an in vitro test battery, helping to identify mild eye irritants.

Developments are also ongoing at the FRAME Alternatives Laboratory in Nottingham on a 3D corneal epithelial model. A project to assess effects on viability and barrier function of a human corneal epithelial cell line from the repeat exposure of surfactants is currently underway (Richard Clothier, personal communication). Also, innervation of the corneal model is being achieved by the introduction of ND7/23 sensory neurons (Moore et al, 2005), since the corneal epithelium in vivo is a highly innervated tissue.

There is great diversity within the tests available for eye irritation. Some use biological systems that are very remote from those of the eye and have rather dubious mechanistic relationships with the phenomenon of eye irritation; examples are the pollen tube and slug assays. It is therefore unlikely that such tests, even as part of a test battery, would ever gain regulatory acceptance. Moreover, few, if any, of the test systems model the essential mechanism involved in the induction of corneal opacity: hydration levels of the endothelial stromal cells in the cornea, which in turn are controlled by the sodium/potassium ATPase system (Cjeckova et al, 1988; Ubels et al, 2000). Our ability to model this process in vitro would greatly facilitate the complete replacement of in vivo methods for eye irritation.

2.2.4 In silico approaches

The prediction of eye irritation has been reviewed by Cronin and Dearden (1995d) and more recently by Patlewicz et al (2003).

QSARs

A number of QSARs have been determined using quantitative data such as molar eye irritation scores. Some of these efforts have been based around particular chemical classes, e.g. cationic surfactants (Patlewicz et al 2000) and neutral organic chemicals (Barratt 1997). Other workers have developed models on more diverse series of chemicals, typically using available databases such as the ECETOC data compilation. Examples of more general models include Abraham et al (2003; 1998a, b) and Cronin et al (1994). These models are developed using “traditional” QSAR descriptors and a variety of statistical techniques ranging from regression analysis to neural networks. Other, more novel approaches include the use of membrane-interaction QSARs (MI-QSARs) (Kulkarni et al 2001). MI-QSAR uses calculations from a molecular modelling study of an idealised membrane.

Expert Systems

The TOPKAT and MultiCASE (Klopman et al 1993) expert systems both contain models to discriminate between irritants and non-irritants.

DEREK and HazardExpert are coded with a number of rules, mainly relating to strongly acidic and basic molecular features. The number of rules is generally low: OECD (2004) reports fewer than 30 rules that may be relevant to eye irritation.

2.2.5 Integrated testing strategies for specific endpoints

A number of integrated testing strategies have been proposed, including those described in Cronin et al (2003) and Worth and Cronin (2001). Such proposals integrate in silico and in vitro techniques in a tiered scheme to identify irritants and to test experimentally only non-irritants. Chamberlain and Barrett (1995) also described the integration of computational predictions into the process in more detail.

The recent update of OECD TG 405 for eye irritation (OECD, 2002) now includes a recommended step-wise strategy in which, in the absence of prior information, SARs for eye corrosion/irritation and for skin corrosion are performed and pH measurements are used to determine whether further testing is needed. Otherwise, where a substance is classified as systemically toxic, a severe dermal irritant or corrosive on skin, no in vivo eye irritancy testing is required. However, although validated non-animal methods can now be used to classify test substances as irritating to the eye, at present no such methods exist and the Draize test is still used.

The BUAV (Langley, 2001) has also proposed a similar scheme where in silico methods are used to predict potential irritants. Strong acids/bases are automatically assumed to be corrosive to the eye and are not tested further, and severe skin irritants are also assumed to be eye irritants for classification and labelling purposes. The strategy suggests use of alternative tests such as the isolated rabbit eye, isolated chicken eye, HET-CAM, BCOP tests. No in vivo testing is incorporated into the testing strategy.

2.2.6 Reduction and Refinement opportunities

There are presently no validated alternatives for completely replacing the Draize test. However, its use has been reduced and there have been several refinements to the methodology over the years. The OECD TG was updated in 2002. It recommends the use of a testing strategy where in silico and in vitro methods are used to assess corrosive and irritancy potentials before the Draize test is used. It recommends that skin corrosion and irritation studies are carried out prior to eye irritation studies, so that results from these studies can be taken into account. Thus, for example, if a test substance is found to be corrosive in a skin study the corresponding eye study need not be conducted. Further reduction and refinement measures have been proposed by Combes et al. (2004). These include introducing mandatory requirements that ensure that only a single animal is used in preliminary testing and only one additional animal is used in confirmatory studies. Refinement measures include training animals for use in the required restraining devices.

Another proposed refinement measure is the low volume eye test (LVET; Freeberg et al, 1986; Bruner et al, 1992). In contrast to the traditional Draize test, the LVET involves the use of 10 µl rather than 100 µl dose volumes and application is directly to the cornea of the eye rather than into the conjunctival sac. The eye is not forced closed after administration of the test substance to retain the sample, but is left so that natural mechanisms, such as blinking, can dilute or remove the test material, as would happen naturally. Although the LVET requires the same number of animals as the traditional Draize test, less test substance is applied to the eye, causing less suffering to the animal. The LVET is also used as a reference test when developing alternatives to the Draize test (Roggeband et al, 2000).

Many studies have been carried out to test the predictive ability of the LVET (Cormier et al, 1995, 1996; Gettings et al, 1998a, 1998b). Some of these showed that it can provide better predictions of human response to eye irritation than the standard Draize test. However, as the LVET has never undergone any formal independent validation, it cannot be used for regulatory testing, although it has recently been submitted to ECVAM for evaluation with regard to possible future validation.

2.2.7 Scope for future work

There is real potential for developing alternatives for this endpoint. This is especially important given the ethical concerns associated with eye irritation testing.

(Q)SARs are likely to play only a small role in the development of alternative strategies but, used in combination with, for example, physical measurements of pH, they may provide a filter to identify irritants. It is unlikely at the moment that any accurate predictions of "relative" effect, e.g. potency, will be made from in silico approaches.

Future efforts may also wish to address the modelling of fundamental mechanisms of eye irritation.

2.2.8 Other Information

- ECVAM Task Force on eye irritants, 22-23 June 2004 (awaiting report)

- ECVAM Eye irritation expert meeting 'In vitro testing strategies for eye irritation' and Extended Task Force meeting, Feb 2005 (awaiting report)

2.2.9 Bibliography

Abraham MH, Hassanisadi M, Jalali-Heravi M, Ghafourian T, Cain WS, Cometto-Muniz JE (2003) Draize rabbit eye test compatibility with eye irritation thresholds in humans: A quantitative structure-activity relationship analysis. Toxicological Sciences 76: 384-391

Abraham MH, Kumarsingh R, Cometto-Muniz JE, Cain WS (1998a) A quantitative structure-activity relationship (QSAR) for a Draize eye irritation database. Toxicology in Vitro 12: 201-207

Abraham MH, Kumarsingh R, Cometto-Muniz JE, Cain WS (1998b) Draize eye scores and eye irritation thresholds in man can be combined into one QSAR. Olfaction and Taste XII 855: 652-656

Adriaens E, Remon JP (2002) The evaluation of an alternative mucosal irritation test using slugs. Toxicology and Applied Pharmacology 182: 169-175

Bagley DM, Gardner JR, Holland G, Lewis RW, Vrijhof H, Walker AP (1999) Eye irritation: Updated reference chemicals data bank. Toxicology in Vitro 13: 505-510

Balls M, Berg N, Bruner LH, Curren RD, de Silva O, Earl LK, Esdaile DK, Fentem JH, Liebsch M, Ohno Y, Prinsen MK, Spielmann H, Worth AP (1999) ECVAM Workshop Report 34: Eye Irritation Testing: The Way Forward. ATLA 27: 53-77

Balls M, Botham PA, Bruner LH, Spielmann H (1995) The EC/HO international validation study on alternatives to the Draize rabbit eye irritation test. Toxicology in Vitro 9: 871-929

Barratt MD (1997) QSARS for the eye irritation potential of neutral organic chemicals. Toxicology in Vitro 11: 1-8

Borenfreund E, Borrero O (1984) In vitro cytotoxicity assays: potential alternatives to the Draize ocular irritancy test. Cell Biology and Toxicology 1: 55-65

Bradlaw J, Gupta K, Green S, Hill R, Wilcox N (1996) Practical application of non-animal alternatives program: summary of IRAG workshop on eye irritation testing. Food and Chemical Toxicology 35: 79-117

Brantom PG, Bruner LH, Chamberlain M, DeSilva O, Dupuis J, Earl LK, Lovell DP, Pape WJW, Uttley M, Bagley DM, Baker FW, Bracher M, Courtellemont P, Declercq I, Freeman S, Steiling W, Walker AP, Carr GJ, Dami N, Thomas G, Harbell J, Jones PA, Pfannenbecker U, Southee JA, Tcheng M, Argembeaux H, Castelli D, Clothier R, Esdaile DJ, Itigaki H, Jung K, Kasai Y, Kojima H, Kristen U, Larnicol M, Lewis RW, Marenus K, Moreno O, Peterson A, Rasmussen ES, Robles C, Stern M (1997) A summary report of the COLIPA international validation study on alternatives to the Draize rabbit eye irritation test. Toxicology in Vitro 11: 141-179

Bruner LH, Miller KR, Owicki JC, Parce JW, Muir VC (1991) Testing ocular irritancy in vitro with the silicon microphysiometer. Toxicology in Vitro 5: 277-284

Bruner LH, Parker RD, Bruce RD (1992) Reducing the number of rabbits in the low-volume eye test. Fundamental and Applied Toxicology 19: 330-335

Burton ABG, York M, Lawrence RS (1981) The in vitro assessment of severe irritants. Food and Cosmetics Toxicology 19: 471-480

Catroux P, Rougier A, Dossou KG, Cottin M (1993) The silicon microphysiometer for testing ocular toxicity in vitro. Toxicology in Vitro 7: 465-469

Cejkova J, Lojda Z, Brunova B, Vacik J, Michalek J (1988) Disturbances in the rabbit cornea after short-term and long-term wear of hydrogel contact lenses: usefulness of histochemical methods. Histochemistry 89: 91-97

Ceulemans J, Vermeire A, Adriaens E, Remon JP, Ludwig A (2001) Evaluation of a mucoadhesive tablet for ocular use. Journal of Controlled Release 77: 333-344

Chamberlain M, Barratt MD (1995) Practical applications of QSAR to in-vitro toxicology illustrated by consideration of eye irritation. Toxicology in Vitro 9: 543-547

Combes RD, Gaunt I, Balls M (2004) A Scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208

Cooper KJ, Earl LK, Harbell J, Raabe H (2001) Prediction of ocular irritancy of prototype shampoo formulations by the isolated rabbit eye (IRE) test and bovine corneal opacity and permeability (BCOP) assay. Toxicology in Vitro 15: 95-103

Cormier EM, Hunter JE, Billhimer W, May J, Farage MA (1995) Use of clinical and consumer eye irritation data to evaluate the low-volume eye test. Journal of Toxicology - Cutaneous and Ocular Toxicology 14: 197-205

Cormier EM, Parker RD, Henson C, Cruse LW, Merritt AK, Bruce RD, Osborne R (1996) Determination of the intra- and inter-laboratory reproducibility of the low volume eye test and its statistical relationship to the Draize Eye Test. Regulatory Toxicology and Pharmacology 23: 156-161

Cronin MTD, Basketter DA, York M (1994) A quantitative structure-activity relationship (QSAR) investigation of a Draize eye irritation database. Toxicology in Vitro 8: 21-28

Cronin MTD, Dearden JC (1995d) QSAR in toxicology. 4. Prediction of non-lethal mammalian toxicological endpoints, and expert systems for toxicity prediction. Quantitative Structure-Activity Relationships 14: 518-523

Cronin MTD, Dearden JC, Walker JD, Worth AP (2003) Quantitative structure-activity relationships for human health effects: commonalities with other endpoints. Environmental Toxicology and Chemistry 22: 1829-1843

de Silva O, Rougier A, Dossou KG (1992) The HET-CAM test: a study of the irritation potential of chemicals and formulations. ATLA 20: 432-437

Debbasch C, Ebenhahn C, Dami N, Pericoi M, Van den Berghe C, Cottin M, Nohynek GJ (2005) Eye irritation of low-irritant cosmetic formulations: correlation of in vitro results with clinical data and product composition. Food and Chemical Toxicology 43: 155-165

Freeberg FE, Nixon GA, Reer PJ, Weaver JE, Bruce RD, Griffith JF, Sanders LW (1986) Human and rabbit eye responses to chemical insult. Fundamental and Applied Toxicology 7: 626-634

Garle MJ, Fry JR (2003) Sensory nerves, neurogenic inflammation and pain: missing components of alternative irritation strategies? A review and a potential strategy. ATLA 31: 295-316

Gautheron P, Dukic M, Alix D, Sina JF (1992) Bovine corneal opacity and permeability test: an in vitro assay of ocular irritancy. Fundamental and Applied Toxicology 18: 442-449

Gautheron P, Giroux J, Cottin M, Audegond L, Morilla A, Mayordomo-Blanco L, Tortjada A, Haynes G, Vericat JA, Priovano R, Gillio Tos E, Hagemann C, Vanpary S, Deknudt G, Jacobs G, Prinsen M, Kalweit S, Spielmann H (1994) Interlaboratory assessment of the bovine corneal opacity and permeability (BCOP) assay. Toxicology in Vitro 8: 381-392

Gettings SD, DiPasquale LC, Bagley DM, Casterton PL, Chudkowski M, Curren RD, Demetrulias JL, Feder PI, Galli CL, Gay R, Glaza SM, Hintze KL, Janus J, Kurtz PJ, Lordo RA, Marenus KD, Moral J, Muscatiello MJ, Pape WJW, Renskers KJ, Roddy MT, Rozen MG (1994) The CTFA evaluation of alternatives program: an evaluation of in vitro alternatives to the Draize primary eye irritation test. Phase II. Oil/water emulsions. Food and Chemical Toxicology 32: 943-976

Gettings SD, Lordo RA, Feder PI, Hintze KL (1998a) A comparison of low volume: Draize and in vitro eye irritation test data. III. Surfactant-based formulations. Food and Chemical Toxicology 36: 209-231

Gettings SD, Lordo RA, Feder PI, Hintze KL (1998b) Comparison of low volume: Draize and in vitro eye irritation test data. II. Oil/water emulsions. Food and Chemical Toxicology 36: 47-59

Gettings SD, Lordo RA, Hintze KL, Bagley DM, Casterton PL, Chudkowski M, Curren RD, Demetrulias JL, DiPasquale LC, Earl LK, Feder PI, Galli CL, Gay R, Glaza SM, Gordon VC, Janus J, Kurtz PJ, Marenus KD, Moral J, Pape WJW, Renskers KJ, Rheins LA, Roddy MT, Rozen MG, Tedeschi JP, Zyracki J (1996) The CTFA evaluation of alternatives program: an evaluation of in vitro alternatives to the Draize primary eye irritation test. Phase III. Surfactant based formulations. Food and Chemical Toxicology 34: 79-117

Gettings SD, Teal JJ, Bagley DM, Demetrulias JL, DiPasquale LC, Hintze KL, Rozen MG, Weise SL, Chudkowski M, Marenus KD, Pape WJW, Roddy MT, Schnitzinger R, Silber PM, Glaza SM, Kurtz PJ (1991) The CTFA evaluation of alternatives program: an evaluation of in vitro alternatives to the Draize primary eye irritation test. Phase I. Hydro-alcoholic formulations: Part 2: data analysis and biological significance. In Vitro Toxicology 4: 247-288

Gilleron L, Coecke S, Sysmans M, Hansen E, van Oproy S, Marzin D, van Cauteren H, Vanparys P (1997) Evaluation of the HET-CAM-TSA method as an alternative to the Draize eye irritation test. Toxicology in Vitro 11: 641-644

Hagino S, Itagaki H, Kato S, Kobayashi T (1993) Further evaluation of the quantitative chorioallantoic membrane test using Trypan Blue stain to predict the eye irritancy of chemicals. Toxicology in Vitro 7: 35-39

Hagino S, Itagaki H, Kato S, Kobayashi T, Tanaka M (1991) Quantitative evaluation to predict the eye irritancy of chemicals: modification of chorioallantoic membrane test by using Trypan Blue. Toxicology in Vitro 5: 301-304

ICCVAM (2005) Expert Panel Report: Evaluation of the Current Validation Status of In Vitro Test Methods for Identifying Ocular Corrosives and Severe Irritants. http://iccvam.niehs.nih.gov/methods/ocudocs/EPreport/ocureport.htm

Jackson EM, Hume RD, Wallin RF (1987) The agarose diffusion method for ocular irritancy screening cosmetic products, Part II. Journal of Toxicology - Cutaneous and Ocular Toxicology 7: 187

Jones PA, Budynsky E, Cooper KJ, Decker D, Griffiths HA, Fentem JH (2001) Comparative evaluation of five in vitro tests for assessing the eye irritation potential of hair-care products. ATLA 29: 669-692
Kappler R, Kristen U (1987) Photometric quantification of in vitro pollen tube growth: A new method suited to determine the cytotoxicity of various environmental substances. Environmental and Experimental Botany 27: 305-310

Klopman G, Ptchelintsev D, Frierson M, Pennisi S, Renskers K, Dickens M (1993) Multiple computer automated structure evaluation methodology as an alternative to in vivo eye irritation testing. ATLA 21: 14-27

Knight DJ, Breheny D (2002) Alternatives to animal testing in the safety evaluation of products. ATLA 30: 7-22

Kristen U, Hoppe U, Pape W (1993) The pollen tube growth test: A new alternative to the Draize eye irritation assay. Journal of the Society of Cosmetic Chemistry 44: 154-162

Kulkarni A, Hopfinger AJ, Osborne R, Bruner LH, Thompson ED (2001) Prediction of eye irritation from organic chemicals using membrane-interaction QSAR analysis. Toxicological Sciences 59: 335-345

Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.

Lilja J, Forsby A (2004) Development of a sensory neuronal cell model for the estimation of mild eye irritation. ATLA 32: 339-343

Luepke NP (1985) Hen's egg chorioallantoic membrane test for irritation potential. Food and Chemical Toxicology 23: 287-291

McConnell HM, Owicki JC, Parce JW, Miller DL, Baxter GT, Wada HG, Pitchford S (1992) The cytosensor microphysiometer: biological applications of silicon technology. Science 257: 1906-1912

Milstein SR, Hume RD (1991) Correlating the L-929 and SIRC variants of the in vitro agarose diffusion method for the assessment of cosmetic product eye irritation potential. Journal of Toxicology - Cutaneous and Ocular Toxicology 10: 3-14

Moore P, Ogilvie J, Horridge E, Mellor IR, Clothier RH (2005) The development of an innervated epithelial barrier model using a human corneal cell line and ND7/23 sensory neurons. European Journal of Cell Biology, in press

OECD (2002) OECD Guideline for the testing of chemicals. Acute Eye Irritation/Corrosion.

Ohno Y, Kaneko T, Kobayashi T, Inoue T, Kuroiwa Y, Yosjide T, Momma J, Hayashi M, Akyama J, Astumi T, Chiba K, Endo T, Fujii A, Kakishima H, Kojima H, Masamoto K, Masuda M, Matsukawa S, Ohkoshi K, Okada J, Sakamoto K, Takano K, Takanaka A (1994) First-phase validation of the in vitro eye irritation tests for cosmetic ingredients. In Vitro Toxicology 7: 89-94

Ohno Y, Kaneko T, Kobayashi T, Inoue T, Kuroiwa Y, Yosjide T, Momma J, Hayashi M, Akyama J, Astumi T, Chiba K, Endo T, Fujii A, Kakishima H, Kojima H, Masamoto K, Masuda M, Matsukawa S, Ohkoshi K, Okada J, Sakamoto K, Takano K, Takanaka A (1995) First-phase inter-laboratory validation of the in vitro eye irritation tests for cosmetic ingredients. I. Overview, organisation and results of the validation study. AATEX 3: 123-136

Pape WJW, Hoppe U (1990) Standardization of an in vitro red blood cell test for evaluating the acute cytotoxic potential of tensides. Arzneimittelforschung 40: 498-502

Pape WJW, Pfannenbecker U, Hoppe U (1987) Validation of the red blood cell test as an in vitro assay for the rapid screening of irritation potential of surfactants. Molecular Toxicology 1: 525-536

Patlewicz G, Rodford R, Walker JD (2003) Quantitative structure-activity relationships for predicting skin and eye irritation. Environmental Toxicology and Chemistry 22: 1862-1869

Patlewicz GY, Rodford RA, Ellis G, Barratt MD (2000) A QSAR model for the eye irritation of cationic surfactants. Toxicology in Vitro 14: 79-84

Price JB, Andrews IJ (1985) The in vitro assessment of eye irritancy using isolated eyes. Food and Chemical Toxicology 23: 313-315

Prinsen MK, Koëter HBWM (1993) Justification of the enucleated eye test with eyes of slaughterhouse animals as an alternative to the Draize eye irritation test with rabbits. Food and Chemical Toxicology 31: 69-76

Reader SJ, Blackwell V, O'Hara R, Clothier RH, Griffin G, Balls M (1989) A vital dye release method for assaying the short-term cytotoxic effects of chemicals and formulations. ATLA 17: 28-33

Reader SJ, Blackwell V, O'Hara R, Clothier RH, Griffin G, Balls M (1990) Neutral Red Release from pre-loaded cells as an in vitro approach to testing for eye irritancy potential. Toxicology in Vitro 4: 264-266

Shaw AJ, Balls M, Clothier RH, Bateman ND (1991) Predicting ocular irritancy and recovery from injury using Madin-Darby canine kidney cells. Toxicology in Vitro 5: 569-571

Shaw AJ, Clothier RH, Balls M (1990) Loss of trans-epithelial impermeability of a confluent monolayer of Madin-Darby Canine Kidney (MDCK) cells as a determinant of ocular irritancy potential. ATLA 18: 145-151

Spielmann H, Kalweit S, Liebsch M, Wirnsberger T, Gerner I, Bertram-Neis E, Krauser K, Kreiling R, Miltenburger G, Pape W, Steiling W (1993) Validation study of alternatives to the Draize eye irritation test in Germany: cytotoxicity testing and HET-CAM test with 136 industrial chemicals. Toxicology in Vitro 7: 505-510

Spielmann H, Liebsch M, Kalweit S, Moldenhauer F, Wirnsberger T, Holzhutter H-G, Schneider B, Glaser S, Gerner I, Pape WJW, Kreiling R, Krauser K, Miltenburger HG, Steiling W, Luepke NP, Muller N, Kreuzer H, Murmann P, Spengler J, Bertram-Neis E, Siegemund B, Wiebel FJ (1996) Results of a validation study in Germany on two in vitro alternatives to the Draize eye irritation test, the HET-CAM test and the 3T3-NRU cytotoxicity test. ATLA 24: 741-858

Tchao R (1988) Trans-epithelial permeability of fluorescein in vitro as an assay to determine eye irritants. In: Alternative Methods in Toxicology, Volume 6, Progress in In Vitro Toxicology (ed. AM Goldberg), Mary Ann Liebert Inc., New York, pp. 271-283

Ubels JL, Prius RM, Sybesma JT, Casterton PL (2000) Corneal opacity, hydration and endothelial morphology in the cornea opacity and permeability assay using reduced treatment times. Toxicology in Vitro 14: 379-386

Wallin RF, Hume RD, Jackson EM (1987) The agarose diffusion method for ocular irritancy screening cosmetic products, Part I. Journal of Toxicology - Cutaneous and Ocular Toxicology 6: 239

Whittle E, Basketter D, York M, Kelly L, Hall T, McCall J, Botham P, Esdaile D, Gardner J (1992) Findings of an inter-laboratory trial of the enucleated eye method as an alternative eye irritation test. Toxicological Methods 2: 30-41

Worth AP, Balls M (2002) Alternative (Non-animal) methods for chemicals testing: Current status and future prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30: Suppl. 1

Worth AP, Cronin MTD (2001) The use of pH measurements to predict the potential of chemicals to cause dermal and ocular toxicity. Toxicology 169: 119-131

2.3 Skin Irritation

2.3.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Skin irritation is covered in both Annex V and Annex VI of the REACH proposals. Annex V states that the assessment of skin irritation should comprise the following steps:

1. an assessment of available human and animal data
2. an assessment of the acid or alkaline reaction
3. an in vitro study for skin corrosion
4. an in vitro study for skin irritation

If the test substance is found to be corrosive, is a strong acid (pH < 2.0) or base (pH > 11.5), is flammable in air at room temperature or is very toxic in contact with the skin, or if a dermal acute toxicity study has shown no irritation up to the limit dose of 2000 mg/kg, the in vitro study need not be conducted and the test substance is classified accordingly.
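The waiver conditions in the paragraph above can be read as a simple checklist. The following is a minimal sketch of that reading, not an implementation of the regulation: the class `SubstanceProfile`, its field names and the function `in_vitro_waiver_reasons` are hypothetical names introduced here for illustration, and only the pH limits and the 2000 mg/kg limit dose are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubstanceProfile:
    """Hypothetical record of the information named in the Annex V text."""
    corrosive: bool = False
    ph: Optional[float] = None  # measured pH, if one is available
    flammable_in_air_at_room_temp: bool = False
    very_toxic_by_skin_contact: bool = False
    # True when an acute dermal toxicity study showed no irritation
    # up to the limit dose of 2000 mg/kg.
    no_irritation_at_limit_dose: bool = False


def in_vitro_waiver_reasons(s: SubstanceProfile) -> List[str]:
    """Return every condition from the text that makes the in vitro study unnecessary."""
    reasons = []
    if s.corrosive:
        reasons.append("corrosive")
    if s.ph is not None and s.ph < 2.0:
        reasons.append("strong acid (pH < 2.0)")
    if s.ph is not None and s.ph > 11.5:
        reasons.append("strong base (pH > 11.5)")
    if s.flammable_in_air_at_room_temp:
        reasons.append("flammable in air at room temperature")
    if s.very_toxic_by_skin_contact:
        reasons.append("very toxic in contact with the skin")
    if s.no_irritation_at_limit_dose:
        reasons.append("no irritation up to 2000 mg/kg in an acute dermal study")
    return reasons


# An empty list means none of the waiver conditions applies.
print(in_vitro_waiver_reasons(SubstanceProfile(ph=12.3)))
```

Returning the list of matching reasons, rather than a single yes/no flag, keeps a record of why a study was waived, which is the information needed to classify the substance accordingly.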

Annex VI states that an in vivo skin irritation study should be carried out if data from Annex V studies are inadequate to quantify the skin irritancy potential of the test substance. At present there are no validated in vitro methods for skin irritation, therefore it is likely that in vivo studies will continue to be used.

Method outline:

OECD TG 404 (REACH B4) was updated in April 2002 to include a testing strategy. This strategy is not an integral part of the test method but is recommended.

Testing Strategy:

1. Evaluate existing human and animal data
2. Perform SAR analysis
3. Determine physicochemical properties and chemical reactivity (pH and buffering capacity)
4. Consider other existing information (from systemic toxicity via dermal route)
5. Perform in vitro or ex vivo corrosivity tests (if a chemical has been shown to be corrosive in a validated in vitro or ex vivo test it need not be tested further in animals)
6. Perform in vitro or ex vivo irritancy tests (if a chemical has been shown to be irritant in a validated in vitro or ex vivo test it need not be tested further in animals)
7. Assess in vivo skin irritancy
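The step-wise character of this strategy, in which in vivo testing is reached only when every earlier step is inconclusive, can be sketched as follows. This is an illustrative reading only: the function `run_strategy`, the dictionary keys and the outcome strings are assumptions made for the example, and "inconclusive" is modelled simply as a missing or `None` entry.

```python
from typing import Dict, Optional, Tuple

# Steps 1-6 of the strategy, in the order given above (labels shortened).
STEPS = [
    "existing human and animal data",
    "SAR analysis",
    "physicochemical properties and chemical reactivity",
    "other existing information",
    "in vitro or ex vivo corrosivity test",
    "in vitro or ex vivo irritancy test",
]


def run_strategy(results: Dict[str, Optional[str]]) -> Tuple[int, str, str]:
    """Return (step number, step name, outcome) for the first conclusive step.

    `results` maps a step name to a classification such as "corrosive" or
    "irritant" when that step was conclusive; a missing key or None means
    the step was inconclusive and the strategy moves on.
    """
    for number, name in enumerate(STEPS, start=1):
        outcome = results.get(name)
        if outcome is not None:
            return number, name, outcome
    # Step 7 is reached only when nothing earlier was conclusive.
    return len(STEPS) + 1, "in vivo skin irritancy assessment", "testing required"


print(run_strategy({"in vitro or ex vivo corrosivity test": "corrosive"}))
print(run_strategy({}))
```

With the first call the strategy stops at step 5 and no animal test is needed; with no conclusive information at all it falls through to step 7.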

In vivo Draize dermal irritation/corrosion test

The test substance is applied to a small (6 cm²) area of shaved skin on the experimental animal (usually albino rabbits) and covered with a gauze patch held in place with non-irritating tape for 4 hours. One animal should be used first with up to two more being used to confirm non-corrosive results. The degree of dermal irritation/corrosion is evaluated by scoring signs of erythema and oedema at certain time intervals after patch removal. Other dermal and adverse systemic effects are also recorded to provide a complete evaluation. Duration should be sufficient to evaluate the reversibility or irreversibility of the effects, with an observation period of up to 14 days. If reversibility is observed before 14 days, or the animal shows signs of severe pain or distress, the test should be terminated.

Number of animals used: 1-3

2.3.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

There appears to be a reasonable understanding of the mechanisms of action of skin irritation, although knowledge of some aspects, e.g. inflammatory mechanisms, is still sparse. As with eye irritation, it is possible that chemicals may elicit a response via multiple mechanisms.

The modelling of, and development of alternatives for, skin irritation is overshadowed in many ways by the importance placed on eye irritation and skin corrosivity.

There are few readily available (and high quality) data for modelling. One example, however, is provided by Bagley et al (1996) who listed in vivo rabbit skin irritation data for 176 chemicals. All chemicals were known to be of high or consistent purity and stable on storage. The chemicals were tested undiluted in in vivo studies, apart from those chemicals where high concentrations could be expected to cause severe effects. In vivo data were generated in studies carried out since 1981 according to OECD Test Guideline 404 and following the principles of Good Laboratory Practice.

Other data exist, a good example being those analysed by Gerner et al (2004), but they have not been published due to commercial confidentiality.

2.3.3 In vitro methods

The testing strategy proposed for skin irritation by the OECD includes steps where validated alternative methods can be used. Although the skin corrosion methods can be used, efforts need to be made to ensure that methods for skin irritation also become available. It is important to remember that, while corrosion involves necrosis of cells and is irreversible, irritation is a reversible process that involves non-lethal effects on cells. As a result, it is important to consider measuring recovery levels from repeat dose exposures rather than the initial toxicity to single acute doses. As with eye irritation, there are no alternative tests fully validated and accepted for the regulatory assessment of skin irritancy, although a validation study is currently being carried out by ECVAM.

In vitro test systems currently available to model skin irritancy comprise the reconstituted skin models such as EpiDerm™, EPISKIN™, Prediskin™, and the model by SkinEthic; excised skin models such as the mouse skin integrity function test (SIFT) and the Pig ear test; and cell culture models such as the arachidonic acid release test (Table 2.3.3). Human volunteer testing has also been suggested. Human volunteers are widely used in the cosmetic industry to test finished products (the use of animals for finished products is banned) but in most instances the hazard of the test substance has already been characterised and the use of volunteers is only to finalise risk assessment by performing skin compatibility and exposure studies, to confirm that there are no harmful effects. Human volunteer testing is therefore not applicable to the industrial chemicals likely to require testing under the REACH system.

| Method | Test system | Endpoints measured | Developers/References |
|---|---|---|---|
| EpiDerm™ | Reconstituted human epidermal equivalent | Cell viability measured by MTT assay | www.mattek.com |
| EPISKIN™ | Reconstituted human epidermal equivalent | Cell viability measured by MTT assay | www.loreal.com |
| Prediskin™ | Reconstituted human epidermal equivalent | Histology, cell viability measured by MTT assay | www.biopredic.com |
| SkinEthic model | Reconstituted human epidermal equivalent | Tissue histology, cell viability, release of inflammatory mediators | www.skinethic.com |
| Pig ear test | Isolated pig ear | Trans epidermal water loss (TEWL) is measured to assess percutaneous absorption | Fentem et al, 2001 |
| Mouse skin integrity function test (SIFT) | Excised mouse skin | TEWL and electrical resistance | Heylings et al, 2001, 2003 |
| Cutaneous toxicity testing using skin organ culture (INVITTOX No. 103) | Rabbit, human or pig skin in two compartment model | Dermal toxicity, absorption and metabolism of test compound; 7-day culture period permits some aspects of recovery to be studied | Van de Sandt et al, 1991, 1993, 1995, 2000 |
| Arachidonic acid release test (INVITTOX No. 87) | Promonocytic human cell line U937 | Rate of release of arachidonic acid to assess membrane-toxic effects and thus inflammatory potential | Klöcking et al, 1993, 1994 |
| The Zein test (used as a screening test for evaluation of relative mildness of surfactants/detergents) (INVITTOX No. 26) | Corn protein | Amount of corn protein (Zein) dissolved by a surfactant as mg nitrogen in 100 ml of surfactant solution | Götte, 1964 |

Table 2.3.3. In vitro methods for the prediction of skin irritation. INVITTOX protocols can be found via the ECVAM-SIS website http://ecvam-sis.jrc.it/.

The pre-validation study at ECVAM was carried out on five of the methods described in the table above: EpiDerm™, EPISKIN™, Prediskin™, SIFT and the Pig ear test (Fentem et al, 2001). The model by SkinEthic is also currently undergoing pre-validation at ECVAM. Prediskin™ and the Pig ear test both failed the pre-validation stages and have not been taken forward to the full validation. The other three methods initially failed the pre-validation but additional work was carried out to enable them to be assessed further in the full validation study (Zuang et al, 2002; Fentem and Botham, 2002).

The full validation consisted of:

- Phase 1 – the optimisation and confirmation of the standard protocol and prediction models. This phase involved the testing of 20 coded chemicals in the lead laboratories only, to confirm intra-laboratory reproducibility. The SIFT method failed this phase and therefore only the EpiDerm and EPISKIN models continued into Phase 2.

- An ECVAM Management Team meeting after Phase 1 discussed the determination of interleukin-1 alpha (IL-1α) as a complementary test to the cell viability MTT assay already performed in the EpiDerm and EPISKIN models. It was agreed that this would help to increase the sensitivity of the tests and therefore the method was evaluated prior to commencement of Phase 2 (Botham, 2004).

- Phase 2 – evaluation of inter-laboratory reproducibility and the predictive ability of the models. This phase involved the testing of 60 coded chemicals in three different laboratories per model. This phase was scheduled for October 2004 – April 2005 and a final report of the validation study is expected by July 2005.
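The "predictive ability" assessed in Phase 2 is usually summarised by comparing the in vitro calls for the coded chemicals with their in vivo classifications. The sketch below shows one common way of doing this (sensitivity, specificity and overall concordance). It is an assumption that the validation study reports these particular statistics, the function name `predictive_ability` is hypothetical, and the example data are invented purely to show the arithmetic.

```python
from typing import Dict, Optional, Sequence


def predictive_ability(predicted: Sequence[bool],
                       observed: Sequence[bool]) -> Dict[str, Optional[float]]:
    """Compare in vitro calls with in vivo classifications (True = irritant)."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # irritant, called irritant
    tn = sum(1 for p, o in pairs if not p and not o)  # non-irritant, called non-irritant
    fp = sum(1 for p, o in pairs if p and not o)      # non-irritant, called irritant
    fn = sum(1 for p, o in pairs if not p and o)      # irritant, called non-irritant

    return {
        # Fraction of in vivo irritants correctly identified.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Fraction of in vivo non-irritants correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        # Fraction of all chemicals classified correctly.
        "concordance": (tp + tn) / len(pairs),
    }


# Invented example: five chemicals, one false positive.
in_vitro_calls = [True, True, False, False, True]
in_vivo_classes = [True, False, False, False, True]
print(predictive_ability(in_vitro_calls, in_vivo_classes))
```

In this framing, adding a complementary endpoint such as IL-1α is aimed at reducing false negatives, that is, at raising sensitivity.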

A recent review paper by Welss et al (2004) discussed the relevance of IL-1α and other biomarkers for skin irritation, including IL-6, IL-8, IL-10, tumour necrosis factor alpha (TNF-α) and arachidonic acid metabolites. These are areas where further research could be focused to increase the predictive ability of current in vitro methods.

2.3.4 In silico approaches

(Q)SARs for skin irritation have been reviewed recently by Patlewicz et al (2003) and Walker et al (2004).

QSARs

There are few QSARs for skin irritation. This may reflect the paucity of high quality data, or that efforts have hitherto been concentrated on eye irritation and skin corrosivity.

A membrane-interaction QSAR approach has been attempted with encouraging success (Kodithala et al 2002).

SAR-type rules based on physicochemical properties such as melting point, log P and aqueous solubility have been developed and are reported by Gerner et al (2004).
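Rules of this kind typically take the form "if a property lies beyond a limit, the substance is predicted to have no irritation potential". The sketch below illustrates only that general form. The property names, comparison directions and numerical limits are placeholders invented for the example and are NOT the values published by Gerner et al (2004); the function `matching_exclusion_rules` is likewise hypothetical.

```python
import operator
from typing import Dict, List

# PLACEHOLDER rules for illustration only: (property, comparison, limit).
# These limits are invented and are not the published rule set.
EXCLUSION_RULES = [
    ("melting_point_c", operator.gt, 200.0),
    ("log_p", operator.gt, 9.0),
    ("aqueous_solubility_g_per_l", operator.lt, 0.0001),
]


def matching_exclusion_rules(properties: Dict[str, float]) -> List[str]:
    """Return the names of the properties whose placeholder limit is exceeded.

    A non-empty result would suggest, under these invented limits, that the
    substance is unlikely to be a skin irritant; an empty result means the
    rules are silent and further assessment is needed.
    """
    hits = []
    for name, compare, limit in EXCLUSION_RULES:
        value = properties.get(name)
        # Properties that were not measured are simply skipped.
        if value is not None and compare(value, limit):
            hits.append(name)
    return hits


print(matching_exclusion_rules({"melting_point_c": 250.0, "log_p": 2.0}))
```

Keeping the rules in a data table rather than in code makes each prediction traceable to a named rule, which matters for the transparency of (Q)SAR predictions.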

Expert Systems

The TOPKAT and MultiCASE expert systems both contain models to discriminate between irritants and non-irritants.

DEREK is coded with a number of rules, mainly relating to strongly acidic and basic molecular features. The number of rules is generally low; OECD (2004) reports fewer than 25 rules that may be relevant to skin irritation.

2.3.5 Integrated testing strategies for specific endpoints

The recently amended OECD TG 404 for skin corrosion/irritation testing (OECD, 2002) now includes a recommended step-wise strategy for skin corrosion and irritation (outlined in the section on skin corrosion).

The BUAV (Langley, 2001) proposed a similar scheme but suggested the use of the human 4-hour patch test (Griffiths et al, 1997) instead of using the Draize test to finally quantify skin irritation.

2.3.6 Reduction and Refinement opportunities

The recent change to the OECD test guideline to include a testing strategy is a welcome refinement to the Draize skin irritancy test. At present it has not led to a significant reduction in the number of animals used. However, since the guideline is now in place, any alternative irritancy models that are successfully validated will mean that fewer animals are used for Draize testing.

Reduction of the number of animals could also be achieved by making the one-animal preliminary study mandatory and only allowing one more animal for confirmatory studies. It may also be possible to re-use an animal and study up to 3 chemicals per animal, but this should be carefully balanced against the welfare of individual animals (Combes et al, 2004).

Refinement will require a consideration of the dose and dose volume used, the type of data collected and its subsequent analysis and how the need to restrain animals can be reduced or avoided.

2.3.7 Scope for future work

The use of QSARs is probably limited in the area of skin irritation. However, there may be potential in the use of rules and SARs for predictive purposes. Skin irritation is clearly an area where considerable quantities of data exist but are not readily available (cf. Gerner et al 2004). Attempting to negotiate access to some, or all, of these data may prove a useful exercise in data accessibility.

2.3.8 Bibliography

Bagley DM, Gardner JR, Holland G, Lewis RW, Regnier JF, Stringer DA, Walker AP (1996) Skin irritation: Reference chemicals data bank. Toxicology in Vitro 10: 1-6.

Botham PA (2004) The validation of in vitro methods for skin irritation. Toxicology Letters 149: 387-390.

Combes RD, Gaunt I, Balls M (2004) A Scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208.

Fentem JH, Botham PA (2002) ECVAM's activities in validating alternative tests for skin corrosion and irritation. ATLA 30 Suppl. 2: 61-67.

Fentem JH, Briggs D, Chesné C, Elliott GR, Harbell JW, Heylings JR, Portes P, Roguet R, van de Sandt JJM, Botham PA (2001) A prevalidation study on in vitro tests for acute skin irritation: results and evaluation by the Management Team. Toxicology in Vitro 15: 57-93.

Gerner I, Schlegel K, Walker JD, Hulzebos E (2004) Use of physicochemical property limits to develop rules for identifying chemical substances with no skin irritation or corrosion potential. QSAR and Combinatorial Science 23: 726-733.

Götte E (1964) Hautverträglichkeit von Tensiden, gemessen am Lösevermögen für Zein. 4th International Congress on Surfactants, Volume III, p. 83.

Griffiths HA, Wilhelm KP, Robinson MK, Wang XM, McFadden J, York M, Basketter DA (1997) Interlaboratory evaluation of a human patch test for the identification of skin irritation potential/hazard. Food and Chemical Toxicology 35: 255-260.

Heylings JR, Clowes HM, Hughes L (2001) Comparison of tissue sources for the skin integrity function test (SIFT). Toxicology in Vitro 13: 597-600.

Heylings JR, Diot S, Esdaile DJ, Fasano WJ, Manning LA, Owen HM (2003) A prevalidation study on the in vitro skin irritation function test (SIFT) for prediction of acute skin irritation in vivo: results and evaluation of ECVAM Phase III. Toxicology in Vitro 17: 123-138.

Klöcking H-P, Schlegelmilch U, Klöcking R (1994) Assessment of Membrane Toxicity using [3H]-AA release in U937 cells. Toxicology in Vitro 8: 775-777.

Klöcking R, Schlegelmilch U, Klöcking H-P (1993) [3H]-Arachidonic acid release and other in vitro methods as alternatives to the eye irritation test. Third International Conference on Practical In Vitro Toxicology, 25-29 July 1993, Nottingham, UK.

Kodithala K, Hopfinger AJ, Thompson ED, Robinson MK (2002) Prediction of skin irritation from organic chemicals using membrane-interaction QSAR analysis. Toxicological Sciences 66: 336-346.

Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.

OECD (2002) OECD Guideline for the testing of chemicals. Acute Dermal Irritation/Corrosion. www.oecd.org/home/.

Patlewicz G, Rodford R, Walker JD (2003) Quantitative structure-activity relationships for predicting skin and eye irritation. Environmental Toxicology and Chemistry 22: 1862-1869.

Van de Sandt JJM, Maas WJM, Doornink PC, Rutten AAJJL (1995) Release of arachidonic and linoleic acid metabolites in skin organ cultures as characteristics of in vitro skin irritancy. Fundamental and Applied Toxicology 25: 20-28.

Van de Sandt JJM, Meuling WJA, Elliott GR, Cnubben NHP, Hakkert BC (2000) Comparative in vitro - in vivo percutaneous absorption of the propoxur. Toxicological Sciences 58: 23-31.

Van de Sandt JJM, Rutten AAJJL, Koëter HBWM (1991) A new two-compartment skin model for cutaneous toxicity testing. In: Alternative Methods in Toxicology. Vol. 8 (Eds. A M Goldberg and M L Principe) Mary Ann Liebert Inc.; New York. pp. 363-369.

Van de Sandt JJM, van Schoonhoven J, Maas WJM, Rutten AAJJL (1993) Skin organ culture as an alternative to in vivo dermatotoxicity testing. ATLA 21: 443-449.

Walker JD, Gerner I, Hulzebos E, Schlegel K (2004) (Q)SARs for predicting skin irritation and corrosion: mechanisms, transparency and applicability of predictions. QSAR and Combinatorial Science 23: 721-725.

Welss T, Basketter DA, Schroder KR (2004) In vitro skin irritation: facts and future. State of the art review of mechanisms and models. Toxicology in Vitro 18: 231-243.

Zuang V, Balls M, Botham PA, Coquette A, Corsini E, Curren RD, Elliott GR, Fentem JH, Heylings JR, Liebsch M, Medina J, Roguet R, van de Sandt H, Wiemann C, Worth AP (2002) Follow-up to the ECVAM Prevalidation Study on In vitro Tests for Acute Skin Irritation. ECVAM Skin Irritation Task Force Report 2. ATLA 30: 109-129.

2.4 Skin Corrosivity

2.4.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Skin corrosion is covered in Annex V of the REACH proposals. Annex V states that the assessment of skin irritation or corrosion should comprise the following steps:

1. an assessment of available human and animal data
2. an assessment of the acid or alkaline reserve
3. an in vitro study for skin corrosion

If the test substance is a strong acid (pH < 2.0) or base (pH > 11.5), is flammable in air at room temperature, or is very toxic in contact with the skin, or if a dermal acute toxicity study has shown no irritation up to the limit dose of 2000 mg/kg, the study need not be conducted and the test substance is classified accordingly.
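
The waiver conditions above can be expressed as a simple screening check. The following is a minimal sketch only: the pH limits and the limit dose are those quoted in the text, whereas the record type, its field names and the function name are hypothetical and are not taken from the REACH proposals.

```python
from dataclasses import dataclass
from typing import List, Optional

# Limits quoted in the text above.
STRONG_ACID_PH = 2.0
STRONG_BASE_PH = 11.5
DERMAL_LIMIT_DOSE_MG_PER_KG = 2000.0


@dataclass
class ScreeningData:
    """Hypothetical record of the information needed for the waiver check."""
    ph: Optional[float] = None
    flammable_in_air_at_room_temperature: bool = False
    very_toxic_by_skin_contact: bool = False
    # Highest dermal dose (mg/kg) showing no irritation in an acute toxicity study.
    highest_non_irritant_dermal_dose: Optional[float] = None


def waiver_reasons(data: ScreeningData) -> List[str]:
    """Return the reasons, if any, why the corrosion study need not be conducted."""
    reasons = []
    if data.ph is not None and data.ph < STRONG_ACID_PH:
        reasons.append("strong acid")
    if data.ph is not None and data.ph > STRONG_BASE_PH:
        reasons.append("strong base")
    if data.flammable_in_air_at_room_temperature:
        reasons.append("flammable in air at room temperature")
    if data.very_toxic_by_skin_contact:
        reasons.append("very toxic in contact with skin")
    dose = data.highest_non_irritant_dermal_dose
    if dose is not None and dose >= DERMAL_LIMIT_DOSE_MG_PER_KG:
        reasons.append("no irritation up to the dermal limit dose")
    return reasons
```

An empty list means that none of the waiver conditions applies and the stepwise assessment continues; for example, `waiver_reasons(ScreeningData(ph=1.2))` reports only the strong acid condition.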

Method outline:

Two in vitro methods for the assessment of skin corrosion have recently been adopted by the OECD, with a third method under review for selected substances only.

Transcutaneous electrical resistance test (TER; OECD TG 430)

Monitors changes in electrical resistance, following 24 h exposure to the test substance, as an indicator of loss of stratum corneum integrity and barrier function. Uses (rat) skin discs from 28-30 day old, humanely killed animals.

Human skin model tests using EPISKIN™ or EpiDerm™ skin models (OECD TG 431).

Reconstructed human epidermal equivalent (commercial systems) used to assess cell viability using the MTT reduction test following test substance exposure.

Corrositex® assay (OECD draft TG 435)

An artificial barrier system coupled to a pH-based chemical detection system which utilises pH indicators and colorimetric determination to monitor passage of the test substance across the barrier.

No live animals are required for any of these tests.

2.4.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

Skin corrosivity is often considered to be a “physical” and irreparable effect, in contrast to irritation, which is biological. Taken in its simplest terms, therefore, the mechanism is well understood. It is possible, however, that there are complex and, as yet, unappreciated mechanisms involved in skin corrosion.

On the surface, there are few high quality data available for modelling. However, the situation is probably confused by the similarity of this endpoint to skin irritation.

One implication of these issues is that it may be beneficial to consider the development of alternatives for skin irritation and corrosion together.

2.4.3 In vitro methods

Recently, the OECD has accepted two in vitro methods for the assessment of skin corrosion: the transcutaneous electrical resistance test (TER, TG 430) and the human skin model test (using EPISKIN™ or EpiDerm™ skin models, TG 431). Both methods have been validated through ECVAM (Fentem et al, 1998; Liebsch et al, 2000; ECVAM 1998a, 1998b, 2000a). Another method, the membrane barrier test (Corrositex® assay, draft TG 435), is being reviewed with the potential to become a third alternative method, for selected substances only. The Corrositex® assay has also been validated by ECVAM and has been endorsed by ESAC and ICCVAM for testing specific classes of chemicals, such as organic bases and inorganic acids (Fentem et al, 1998; NIH, 1999; ECVAM 2000b).

Transcutaneous electrical resistance (TER) test

The TER test (Barlow et al., 1991; Oliver et al., 1986; 1988; Oliver, 1990) is based on the experience that TER measurements have been shown to be of value in predicting severe cutaneous effects in vivo. In addition to the changes in TER, a second endpoint, dye binding (sulforhodamine B), is used to reduce the number of false positive predictions encountered previously with surfactants and neutral organics.

Human skin model test

The human skin model tests are based on the experience that corrosive chemicals show cytotoxic effects following short-term exposure of the stratum corneum of the epidermis. The tests are designed to predict and classify the skin corrosivity potential of a chemical by assessment of its effect on a reconstituted human epidermis.

EPISKIN™ (www.loreal.com) and EpiDerm™ (www.mattek.com) are three-dimensional human skin models comprising a reconstructed epidermis with a functional stratum corneum. Their use for skin corrosivity testing involves topical application of test materials to the surface of the skin, and the subsequent assessment of their effects on cell viability. Cytotoxicity is expressed as the reduction of mitochondrial dehydrogenase activity measured by formazan production from MTT.
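
The viability readout described above can be illustrated with a short calculation. This sketch assumes that formazan production is quantified as optical density (OD) readings for treated and negative-control tissues; the function names are hypothetical, and the cut-off is left as a caller-supplied parameter because the prediction model thresholds are not given in this section.

```python
from typing import Sequence


def relative_viability(od_treated: Sequence[float],
                       od_negative_control: Sequence[float]) -> float:
    """Viability of treated tissue as a percentage of the negative control.

    Both arguments are replicate OD readings of the formazan product
    (an assumed readout format).
    """
    if not od_treated or not od_negative_control:
        raise ValueError("at least one reading is needed for each group")
    mean_treated = sum(od_treated) / len(od_treated)
    mean_control = sum(od_negative_control) / len(od_negative_control)
    if mean_control <= 0:
        raise ValueError("negative control OD must be positive")
    return 100.0 * mean_treated / mean_control


def below_cutoff(viability_percent: float, cutoff_percent: float) -> bool:
    """True if viability falls below the cut-off chosen by the user.

    The cut-off must come from the relevant test guideline; no default
    is supplied here on purpose.
    """
    return viability_percent < cutoff_percent
```

A lower percentage corresponds to a greater reduction in mitochondrial dehydrogenase activity, and hence to greater cytotoxicity.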

Corrositex® membrane barrier test

Corrositex® (www.invitrointl.com) is a standardised, quantitative in vitro test for skin corrosivity, based upon determination of the time which is required for a test material to pass through a bio-barrier membrane (a reconstituted matrix, constructed to have physico-chemical properties similar to rat skin), and produce a visually detectable change. The time required for this change to occur (the breakthrough time) is reported to be inversely proportional to the degree of corrosivity of the test material, i.e. the longer it takes to detect a change in the Chemical Detection System (CDS), the less corrosive is the substance.

2.4.4 In silico approaches

QSARs

QSARs for skin corrosivity are reviewed briefly by Walker et al (2004).

There are relatively few QSARs specifically for skin corrosivity. Of the studies available, it is noticeable that many are integrated with skin irritation. Of successful models, most are qualitative in nature and based on physicochemical properties deemed to be important (cf. Barratt 1995; 1996; Eriksson et al 1994).

SAR-type rules based on physicochemical properties such as melting point, log P and aqueous solubility have been developed and are reported by Gerner et al (2004). These are integrated into schemes which also identify irritants.
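
The general form of such physicochemical exclusion rules can be sketched as below. This is an illustration of the rule structure only: the numerical limits are placeholders invented for the example and are not the limits published by Gerner et al (2004), and all names in the code are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ExclusionRule:
    """A rule that, when it fires, suggests absence of the effect."""
    name: str
    applies: Callable[[Dict[str, float]], bool]


# Placeholder limits for illustration only; NOT the published rule limits.
EXAMPLE_RULES = [
    ExclusionRule("melting point above placeholder limit",
                  lambda p: p["melting_point_c"] > 200.0),
    ExclusionRule("log P above placeholder limit",
                  lambda p: p["log_p"] > 9.0),
    ExclusionRule("aqueous solubility below placeholder limit",
                  lambda p: p["aqueous_solubility_mg_per_l"] < 0.1),
]


def fired_rules(properties: Dict[str, float],
                rules: List[ExclusionRule]) -> List[str]:
    """Return the names of all rules satisfied by the substance properties."""
    return [rule.name for rule in rules if rule.applies(properties)]
```

Keeping each rule as a named predicate means that a prediction can always be traced back to the property limit that produced it, which supports the transparency discussed by Walker et al (2004).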

Expert Systems

No explicit expert system models for skin corrosivity have been identified. This may be due to the lack of data available for this endpoint. In addition, many of the long-standing models for irritation (e.g. TOPKAT and MultiCASE) may need updating to reflect current knowledge better, i.e. compounds in the database that are modelled as irritant may in fact be corrosive.

2.4.5 Integrated testing strategies for specific endpoints

The recently updated OECD TG 404 for skin corrosion/irritation testing (OECD, 2002) now includes a recommended step-wise strategy. Existing information, pH and SAR are used to decide whether testing is needed. ECVAM has confirmed the usefulness of pH in skin corrosion testing (Worth and Cronin 2001). If the test substance is a strong acid/base, it is automatically excluded from further testing. Otherwise, validated and OECD-accepted in vitro models for skin corrosion, and then validated in vitro skin irritation models (dependent on the outcome of the validation study), are used in preference to in vivo Draize studies.
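
The step-wise logic described above can be sketched as a small decision function. This is a simplified reading of the strategy, not a transcription of OECD TG 404: the function and argument names are hypothetical, existing information and SAR are collapsed into a single flag, the pH limits are those quoted in Section 2.4.1, and the final in vivo fallback is an assumption made for the example.

```python
from typing import Optional


def next_step(existing_data_sufficient: bool,
              ph: Optional[float] = None,
              in_vitro_corrosive: Optional[bool] = None,
              in_vitro_irritant: Optional[bool] = None) -> str:
    """Suggest the next action in a tiered corrosion/irritation assessment.

    None for a test result means that the test has not yet been run.
    """
    if existing_data_sufficient:
        return "classify on existing information"
    if ph is not None and (ph < 2.0 or ph > 11.5):
        return "no further testing: strong acid or base"
    if in_vitro_corrosive is None:
        return "run validated in vitro skin corrosion test"
    if in_vitro_corrosive:
        return "classify as corrosive"
    if in_vitro_irritant is None:
        return "run validated in vitro skin irritation test"
    if in_vitro_irritant:
        return "classify as irritant"
    # Assumption for this sketch: animal testing is considered only last.
    return "consider in vivo test as a last resort"
```

Calling the function again as each result becomes available walks a substance through the tiers, so that animal testing is reached only when every earlier tier has been inconclusive or negative.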

The BUAV (Langley, 2001) has proposed a combined testing strategy for dermal irritancy and corrosion. For skin corrosion, initial assessments are based on prior information, pH and computer modelling, then validated and OECD accepted in vitro skin corrosion methods are used.

2.4.6 Bibliography

Barlow A, Hirst RA, Pemberton MA, Rigden A, Hall TJ, Botham PA, Oliver GJA (1991) Refinement of an in vitro test for the identification of skin corrosive chemicals. Toxicology Methods 1: 106-115.

Barratt MD (1995) Quantitative structure-activity-relationships for skin corrosivity of organic-acids, bases and phenols. Toxicology Letters 75: 169-176.

Barratt MD (1996) Quantitative structure-activity relationships for skin irritation and corrosivity of neutral and electrophilic organic chemicals. Toxicology in Vitro 10: 247-256.

Barratt MD, Dixit MB, Jones PA (1996) The use of in vitro cytotoxicity measurements in QSAR methods for the prediction of the skin corrosivity potential of acids. Toxicology in Vitro 10: 283-290.

ECVAM (1998a) Statement on the scientific validity of the EPISKIN™ test (an in vitro test for skin corrosivity). ATLA 26: 277-280.

ECVAM (1998b) Statement on the scientific validity of the rat skin transcutaneous electrical resistance (TER) test (an in vitro test for skin corrosivity). ATLA 26: 275-277.

ECVAM (2000a) Statement on the application of the Epiderm™ human skin model for skin corrosivity testing. ATLA 28: 365-366.

ECVAM (2000b) Statement on the application of the Corrositex® assay for skin corrosivity testing. http://ecvam.jrc.it/index.htm.

Eriksson L, Berglind R, Sjostrom M (1994) A multivariate quantitative structure-activity relationship for corrosive carboxylic-acids. Chemometrics and Intelligent Laboratory Systems 23: 235-245.

Fentem JH, Archer GEB, Balls M, Botham PA, Curren RD, Earl LK, Esdaile DJ, Holzhütter HG, Liebsch M (1998) The ECVAM international validation study on in vitro tests for skin corrosivity. 2. Results and evaluation by the Management Team. Toxicology in Vitro 12: 483-524.

Gerner I, Schlegel K, Walker JD, Hulzebos E (2004) Use of physicochemical property limits to develop rules for identifying chemical substances with no skin irritation or corrosion potential. QSAR and Combinatorial Science 23: 726-733.

Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.

Liebsch M, Traue D, Barrabas C, Spielmann H, Uphill P, Wilkins S, Wiemann C, Kaufmann T, Remmele M, Holzhütter HG (2000) The ECVAM prevalidation study on the use of EpiDerm for skin corrosivity testing. ATLA 28: 371-401.

NIH (1999) Corrositex®: An In Vitro Test Method for Assessing Dermal Corrosivity Potential of Chemicals. The Corrositex® Assay Peer Review Meeting Final Report, ICCVAM and NICEATM. US National Institute of Environmental Health Sciences. NIH publication No. 99-4495.

OECD (2002) OECD Guideline for the testing of chemicals. Acute Dermal Irritation/Corrosion. www.oecd.org/home/.

Oliver GJA (1990) The evaluation of cutaneous toxicity: past and future. In: Skin Pharmacology and Toxicology: Recent Advances (ed. C.L. Galli, C.N. Hensby and M. Marinovich), pp. 147-173. New York: Plenum Press.

Oliver GJA, Pemberton MA, Rhodes C (1986) An in vitro skin corrosivity; modifications and validation. Food and Chemical Toxicology 24: 507-512.

Oliver GJA, Pemberton MA, Rhodes C (1988) An in vitro model for identifying skin corrosive chemicals. Initial validation. Toxicology In Vitro 2: 7-17.

Walker JD, Gerner I, Hulzebos E, Schlegel K (2004) (Q)SARs for predicting skin irritation and corrosion: mechanisms, transparency and applicability of predictions. QSAR and Combinatorial Science 23: 721-725.

Whittle E, Barratt MD, Carter JA, Basketter DA, Chamberlain M (1996) Skin corrosivity potential of fatty acids: In vitro rat and human skin testing and QSAR studies. Toxicology in Vitro 10: 95-100.

Worth AP, Cronin MTD (2001) The use of pH measurements to predict the potential of chemicals to cause acute dermal and ocular toxicity. Toxicology 169: 119-131.

2.5 Skin Sensitisation

2.5.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Skin sensitisation testing is required under Annex V of the REACH proposals. The proposals state that an assessment of the available human and animal data should be carried out before performing the murine Local Lymph Node Assay (LLNA). If the test substance is found to be corrosive, very toxic or irritant to the skin, a strong acid (pH < 2.0) or base (pH > 11.5), or flammable in air at room temperature, the LLNA need not be carried out and the substance is classified accordingly. The REACH proposals also state that if the LLNA is not adequate for the test substance in question, the Guinea Pig Maximisation Test (GPMT) may be used.

Method outline:

The murine LLNA is recommended for use wherever possible as it uses fewer animals and is less invasive than the GPMT. Unlike the GPMT, it can also give a measure of the potency of a skin sensitiser.

Local Lymph Node Assay (OECD TG 429)

The test substance is applied to the back of each ear of the mouse, and the application is repeated each day for three days. After two days with no treatment, phosphate-buffered saline (PBS) containing 3H-methyl thymidine, or 125I-iododeoxyuridine and fluorodeoxyuridine, is injected into each mouse via the tail vein. After five hours the mice are killed and the draining auricular lymph nodes from each ear are excised. A single cell suspension of lymph node cells is prepared and the incorporated radioactivity (3H or 125I) is measured.

Number of animals used – minimum of 4 per dose/control group, approximately 16 per study.

Guinea Pig Maximisation Test (REACH B6; OECD TG 406)

The test animals are initially exposed to the test substance by intra-dermal injection followed by epidermal application (induction exposure) after 6-8 days. Following a rest period of 10 to 14 days (induction period), during which an immune response may develop, the animals are exposed to a challenge dose. The extent and degree of skin reaction to the challenge exposure in the test animals is compared with that demonstrated by control animals.

Number of animals used – 10 per dose group and 5 per control group; a minimum of 30 per study is recommended.

2.5.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

The mechanisms of skin sensitisation are reasonably well established, both in terms of immunology and chemical reactivity. These have the potential to form the basis of alternatives to whole animal testing.

Many in vivo data are available for various skin sensitisation endpoints, including the Guinea Pig Maximisation Test (Cronin and Basketter, 1994), the local lymph node assay (Ashby et al 1995) and various human data (cf. Benezra et al 1985). These could form the basis of the development of alternatives.

2.5.3 In vitro methods

At present there are no in vitro alternatives for skin sensitisation which are likely to be ready for use within REACH. There are in vitro methods under development and these have been helped by the elucidation of the mechanism of skin sensitisation (Grabbe and Schwarz, 1998; Kimber et al, 2002; Kimber and Dearman, 1997, 2002; Smith and Hotchkiss, 2001). There are three main stages in the induction of skin sensitisation: 1) the ability of the test substance to penetrate the skin and react with skin proteins; 2) the activity of Langerhans cells, which is in turn dependent on the availability of relevant epidermal cytokines; and 3) the stimulation of a T lymphocyte response.

Chemical allergens (haptens) must have the ability to access the epidermis and to react with proteins found within the skin to form stable complexes. Some chemicals may require metabolic activation before being able to penetrate the skin and/or react with skin proteins; these are known as prohaptens. This first stage in the skin sensitisation mechanism is mainly studied using structure-activity relationships (discussed below). It is on the following two stages that research using in vitro techniques has been focused.

Stage two involves specific dendritic cells (DC) found in the epidermis called Langerhans cells (LC). These types of cell are found in lymphoid organs and are responsible for monitoring changes in the antigenic environment (de Silva et al, 1996). The main functions of LC are the interaction with, processing and transport of antigens encountered in the skin. Upon topical sensitisation, LC at the site of exposure are induced to leave the epidermis and travel to the skin draining lymph nodes, during which time they acquire immuno-stimulatory properties enabling them to present antigen to responsive T lymphocytes (Kimber et al, 2000). This process is regulated via cytokines such as granulocyte/macrophage colony-stimulating factor (GM-CSF), tumour necrosis factor alpha (TNFα) and interleukin 1β (IL-1β).

It is at this stage where the greatest efforts have been concentrated using in vitro techniques, with a recent ECVAM workshop being devoted to the topic (Casati et al, 2005 and references therein). The biggest problem with using LC in vitro is obtaining a sufficient number of cells for routine use; research has therefore focused on other forms such as those derived from DC, which can be easily obtained from bone marrow and umbilical cord blood (Kimber et al, 2004 and references therein). Studies of DC generally consist of analysing induced changes in cytokine production, especially IL-1β, upon exposure to known sensitisers. While research goes on to find the best methods for assessing sensitisation hazard, more sophisticated models are likely to be required to completely remove the need for animal experimentation. Reconstituted skin models containing these functional DC are under development to facilitate the assessment of sensitisation potential alongside a more relevant exposure scenario (Regnier et al, 1997; Schempp et al, 2000; Facy et al, 2004).

At stage three, the T lymphocytes are activated by the antigen and cell proliferation takes place with the production of allergen–specific T lymphocytes. The quantitative increase in T cells, capable of recognising and responding to the inducing allergen, represents the cellular basis of sensitisation.

Efforts to generate in vitro techniques for this stage have not been as widespread as for DC. Although comparatively straightforward, the stimulation of T lymphocytes in vitro has the disadvantage of requiring primed T lymphocytes from previously sensitised animals (Hauser and Katz, 1990; Caux et al, 1995). There are no other ways of gaining pre-sensitised T lymphocytes to specific test substances, although recent attempts to use LC modified with haptens to provoke T lymphocyte proliferative responses in vitro have had some success (Rustemeyer et al, 1999; Dai et al, 1998; Guironnet et al, 2000).

A pharmaceutically based immunotoxicity method has recently been evaluated and prevalidated by ECVAM for the prediction of immunostimulants and immunosuppressants (Langezaal et al, 2002). The human whole-blood cytokine release model analyses the release of Interleukin-1β and Interleukin-4 by monocytes and lymphocytes respectively. Thirty-one pharmaceutical compounds known to affect the immune system were used to standardise and optimise the procedure by analysing their effects on cytokine release. The in vitro results were expressed as IC50 values for immunosuppression, and SC4 (fourfold increase) values for immunostimulation, and correlated well with therapeutic serum concentrations of the compounds in patients, and in vivo LD50 values from animal studies. Further studies, on a broader range of chemicals, would be required to assess the relevance of this test for the potential screening of skin sensitisation within the REACH system.
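
To illustrate what IC50 and SC4 values represent, the sketch below estimates the concentration at which a measured response crosses a target level. Log-linear interpolation between neighbouring test concentrations is an assumption made for this example and is not necessarily the method used by Langezaal et al (2002); the function name is hypothetical and no measured data are reproduced here.

```python
import math
from typing import Optional, Sequence


def crossing_concentration(concentrations: Sequence[float],
                           responses: Sequence[float],
                           target: float) -> Optional[float]:
    """Estimate the concentration at which the response reaches `target`.

    Concentrations must be positive and listed in ascending order.
    For an IC50, responses are cytokine release as % of control and
    target is 50; for an SC4, responses are fold-of-control and target is 4.
    Returns None if the target level is never crossed.
    """
    if len(concentrations) != len(responses):
        raise ValueError("concentrations and responses differ in length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = list(zip(concentrations, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 != r1 and min(r0, r1) <= target <= max(r0, r1):
            # Interpolate linearly in response, on a log10 concentration scale.
            fraction = (target - r0) / (r1 - r0)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10.0 ** log_c
    return None
```

The same routine serves both endpoints because an IC50 and an SC4 differ only in the direction of the response and in the target level.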

2.5.4 In silico approaches

A recent comprehensive review of QSARs for skin sensitisation appears to be currently lacking, although skin sensitisation was an endpoint addressed in the OECD study (OECD 2004). Further, it should be noted that the definitive book on the chemistry behind skin sensitisation is Dupuis and Benezra (1982) which, despite its age, is still the basis of much knowledge in this area.

QSARs

Models are divided between QSARs for “similar” compounds, normally defined by a chemical structure which is related to mechanism of action, and models for large, diverse databases (see the following paragraph). The former have confirmed the mechanistic basis of this endpoint. Good examples of this approach include Sosted et al (2004); Patlewicz et al (2003, 2004), Roberts and Basketter (2000), Mekenyan et al (1997) and other papers noted in the bibliography. The majority of these studies either relate some form of skin permeability (e.g. log P) to potency and assume reactivity is constant, or have some description of reactivity such as a molecular orbital property or structural feature.

A number of studies have also attempted to model skin sensitisation for large databases (Cronin and Basketter 1994; Cronin and Dearden 1997). QSARs have struggled in this area, in particular to model reactivity of multiple mechanisms of action.

Expert Systems

There are a number of expert system approaches to predict skin sensitisation. A TOPKAT model is based on Guinea Pig Maximisation Test data (Enslein et al 1997). The MultiCASE model is based on human occurrences of sensitisation (Graham et al 1996). The OASIS / Times models are based on various data and include a module to predict metabolic activation.

Much has also been made of the DEREK rulebase for sensitisation, which includes over 60 rules for this endpoint (making it one of the most developed endpoints in DEREK). There have also been efforts to validate this part of the DEREK rulebase (e.g. Zinke et al 2002).

2.5.5 Integrated testing strategies for specific endpoints

A stepwise process for the determination of skin sensitisation is proposed in Worth and Balls (2002) including: a) an assessment of historical data; b) an assessment of physicochemical properties; c) screening of structures using the DEREK skin sensitisation rulebase; d) assessment of partition parameters and e) in vitro assessment of skin sensitisation before the LLNA test is performed.
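
The stepwise process above can be sketched as an ordered sequence of assessment steps, each of which either reaches a conclusion or passes the substance on. The step names follow the text; stopping at the first conclusive step is an assumption made for this sketch, and the assessor functions are hypothetical stand-ins that do not call DEREK or any other real system.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An assessor returns a conclusion string, or None if it is inconclusive.
Assessor = Callable[[Dict], Optional[str]]

STEP_NAMES = [
    "a) historical data",
    "b) physicochemical properties",
    "c) DEREK skin sensitisation rulebase",
    "d) partition parameters",
    "e) in vitro assessment",
]


def run_strategy(substance: Dict,
                 assessors: List[Assessor]) -> Tuple[str, str]:
    """Apply one assessor per step, in order, and stop at the first conclusion.

    If no step is conclusive, the LLNA is reported as the remaining option.
    """
    if len(assessors) != len(STEP_NAMES):
        raise ValueError("exactly one assessor is needed per step")
    for name, assess in zip(STEP_NAMES, assessors):
        conclusion = assess(substance)
        if conclusion is not None:
            return name, conclusion
    return "LLNA", "earlier steps inconclusive"
```

Ordering the steps in this way means that an animal test is requested only for substances that pass through every non-animal step without a conclusion.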

A similar scheme is used by Unilever. This scheme involves the use of data from DEREK and from in vitro skin penetration studies before the LLNA test is conducted (Barratt, 1995).

The BUAV (Langley, 2001) has proposed an animal-free testing strategy which initially uses DEREK and then, dependent on the predictions made, uses a validated in vitro skin penetration study on skin fragments, an in vitro test to see if the test substance reacts with human serum proteins and finally further in vitro tests using Langerhans cells and dendritic cells. This would seem to be quite optimistic as these latter tests are not yet sufficiently developed for validation or regulatory use (see Section 2.5.3).

2.5.6 Reduction and Refinement opportunities

The LLNA has recently been incorporated into the OECD Health Effects Test Guidelines after being validated and endorsed by both ECVAM (ECVAM, 2000) and ICCVAM (ICCVAM, 2001). The LLNA test requires the use of fewer animals than the GPMT. It is also less invasive and gives quantitatively more accurate data. Further reductions in the number of animals used for the LLNA test rely on the use of pre-existing control data. It would then be possible, for instance, to eliminate the need for positive controls, reducing the number of animals needed per test by at least 4. This is suggested in the OECD TG, subject to historical positive control data being updated every six months. The TG also states that, where possible, dermal irritancy and acute toxicity data should be taken into account to determine dose ranges. These reduction measures should be further evaluated and, subject to their workability, made mandatory.
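
The animal numbers quoted in Section 2.5.1 and above can be checked with simple arithmetic. The sketch assumes a study design of three dose groups plus one vehicle control group, with any positive control as an additional group; this design is an assumption chosen to reproduce the figure of approximately 16 animals, and the names in the code are hypothetical.

```python
ANIMALS_PER_LLNA_GROUP = 4      # minimum per dose/control group (Section 2.5.1)
GPMT_RECOMMENDED_MINIMUM = 30   # recommended minimum per GPMT study (Section 2.5.1)


def llna_animals(dose_groups: int,
                 vehicle_control: bool = True,
                 positive_control: bool = False) -> int:
    """Minimum number of mice for an LLNA with the given group layout."""
    groups = dose_groups + int(vehicle_control) + int(positive_control)
    return groups * ANIMALS_PER_LLNA_GROUP


# Assumed design: three dose groups and a vehicle control.
without_positive = llna_animals(3)
with_positive = llna_animals(3, positive_control=True)

saving_from_historical_controls = with_positive - without_positive
saving_versus_gpmt = GPMT_RECOMMENDED_MINIMUM - without_positive
```

Under these assumptions, replacing a concurrent positive control group with historical data saves one group of four animals, consistent with the reduction of at least 4 noted above.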

For those cases where the LLNA is not suitable, the Buehler test (a similar test to the GPMT; OECD TG 406) is preferred over the GPMT as it uses topical application of test substance, instead of intra-dermal injections, and avoids the use of an adjuvant. A refinement of these methods would be to train the animals for use with the restraining apparatus required.

2.5.7 Scope for future work

This is an endpoint with a strong mechanistic basis. It should be ripe for the development of an integrated strategy based on in vitro and in silico approaches to predict skin permeability. There could also be scope for the development of in chemico reactivity assays.

2.5.8 Other Information

ECVAM meetings:
- Workshop (April 2004) on ‘Dendritic cells as a tool for a predictive identification of skin sensitisation hazard’, reported in ATLA 33(1): 47-62 (2005).
- Sensitisation Task Force meeting (July 2005) on testing strategies for skin and respiratory sensitisation.

2.5.9 Bibliography

Ashby J, Basketter DA, Paton D, Kimber D (1995) Structure-activity relationships in skin sensitisation using the murine local lymph node assay. Toxicology 103: 177-194.

Barratt MD (1995) The role of structure-activity relationships and expert systems in alternative strategies for the determination of skin sensitisation, skin corrosivity and eye irritation. ATLA 23: 111-122.

Barratt MD, Basketter DA, Roberts DW (1994) Skin sensitization structure-activity-relationships for phenyl benzoates. Toxicology in Vitro 8: 823-826.

Benezra C, Sigman CC, Perry LR, Helmes T, Maibach HI (1985) A systematic search for structure-activity-relationships of skin contact sensitizers: methodology. Journal of Investigative Dermatology 85: 351-356.

Caux C, Massacrier C, Dezutter-Dambuyant C, Vanbervliet B, Jacquet C, Schmitt D, Banchereau J (1995) Human dendritic cells Langerhans cells generated in vitro from CD34+ progenitors can prime naïve CD4+ T cells and process soluble antigen. Journal of Immunology 155: 5427-5435.

Cronin MTD, Basketter DA (1994) A multivariate QSAR analysis of a skin sensitisation database. SAR and QSAR in Environmental Research 2: 159-179.

Cronin MTD, Dearden JC (1997) Correspondence analysis of the skin sensitization potential of organic chemicals. Quantitative Structure-Activity Relationships 16: 33-37.

Dai R, Streilein JW (1998) Naïve, hapten-specific human T lymphocytes are primed in vitro with derivatized blood mononuclear cells. Journal of Investigative Dermatology 110: 29-33.

De Silva O, Basketter DA, Barratt MD, Corsini E, Cronin MTD, Das PK, Degwert J, Enk A, Garrigue JL, Hauser C, Kimber I, Lepoittevin JP, Peguet J, Ponec M (1996) Alternative methods for skin sensitisation testing. The report and recommendations of the ECVAM workshop 19. ATLA 24: 683-705.

ECVAM (2000) Statement on the validity of the Local Lymph Node Assay for skin sensitisation testing. http://ecvam.jrc.it/index.htm.

Enslein K, Gombar VK, Blake BW, Maibach HI, Hostynek JJ, Sigman CC, Bagheri D (1997) A quantitative structure-toxicity relationships model for the dermal sensitization guinea pig maximization assay. Food and Chemical Toxicology 35: 1091-1098.

Estrada E, Patlewicz G, Gutierrez Y (2004) From knowledge generation to knowledge archive. A general strategy using TOPS-MODE with DEREK to formulate new alerts for skin sensitisation. Journal of Chemical Information and Computer Science 44: 688-698.

Facy S, Flouret V, Regnier M, Schmidt R (2004) Langerhans cells integrated into human reconstructed epidermis respond to known sensitisers and ultraviolet exposure. Journal of Investigative Dermatology 122: 552-553.

Fedorowicz A, Zheng LY, Singh H, Demchuk E (2004) QSAR study of skin sensitization using local lymph node assay data. International Journal of Molecular Sciences 5: 56-66.

Franot C, Roberts DW, Basketter DA, Benezra C, Lepoittevin JP (1994) Structure-activity-relationships for contact allergenic potential of α,α-dimethyl-γ-butyrolactone derivatives. 2. Quantitative structure skin sensitization relationships for γ-substituted-α-methyl-α,α-dimethyl-γ-butyrolactones. Chemical Research in Toxicology 7: 307-312.

Grabbe S, Schwarz T (1998) Immunoregulatory mechanisms involved in the elicitation of allergic contact hypersensitivity. Immunology Today 19: 37-44.

Graham C, Gealy R, Macina OT, Karol MH, Rosenkranz HS (1996) QSAR for allergic contact dermatitis. Quantitative Structure-Activity Relationships 15: 224-229.

Guironnet G, Dalbiez-Gauthier C, Rousset F, Schmitt D, Peguet-Navarro J (2000) In vitro human sensitisation to haptens by monocyte-derived dendritic cells. Toxicology in Vitro 14: 517-522.

Hostynek JJ, Magee PS (1997) Fragrance allergens: Classification and ranking by QSAR. Toxicology in Vitro 11: 377-384.

Hostynek JJ, Magee PS (1999) Performance of an SAR-QSAR model predictive of human ACD. In Vitro and Molecular Toxicology - a Journal of Basic and Applied Research 12: 203-211.

Hostynek JJ, Maibach HI (1998) Scope and limitation of some approaches to predicting contact hypersensitivity. Toxicology in Vitro 12: 445-453.

ICCVAM (2001) Protocol: Murine Local Lymph Node Assay (LLNA). http://iccvam.niehs.nih.gov/home.htm.

Kimber I, Basketter DA, Gerberick GF, Dearman RJ (2002) Allergic contact dermatitis. International Immunopharmacology 2: 201-211.

Kimber I, Cumberbatch M, Betts CJ, Dearman RJ (2004) Dendritic cells and skin sensitisation hazard assessment. Toxicology in Vitro 18: 195-202.

Kimber I, Cumberbatch M, Dearman RJ, Bhushan M, Griffiths CEM (2000) Cytokines and chemokines in the initiation and regulation of epidermal Langerhans cell mobilization. British Journal of Dermatology 142: 401-412.

Kimber I, Dearman RJ (1997) Cell and molecular biology of chemical allergy. Clinical Reviews in Allergy and Immunology 15: 145-168.

Kimber I, Dearman RJ (2002) Allergic contact dermatitis: the cellular effectors. Contact Dermatitis 46: 1-5.

Langezaal I, Hoffmann S, Hartung T, Coecke S (2002) Evaluation and Prevalidation of an Immunotoxicity Test Based on Human Whole-blood Cytokine Release. ATLA 30: 581-595.

Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.

Mekenyan O, Roberts DW, Karcher W (1997) Molecular orbital parameters as predictors of skin sensitization potential of halo- and pseudohalobenzenes acting as SNAr electrophiles. Chemical Research in Toxicology 10: 994-1000.

Patlewicz G, Roberts DW, Walter JD (2003) QSARs for the skin sensitization potential of aldehydes and related compounds. QSAR and Combinatorial Science 22: 196-205.

Patlewicz GY, Basketter DA, Pease CKS, Wilson K, Wright ZM, Roberts DW, Bernard G, Arnau EG, Lepoittevin JP (2004) Further evaluation of quantitative structure-activity relationship models for the prediction of the skin sensitization potency of selected fragrance allergens. Contact Dermatitis 50: 91-97.

Régnier M, Staquet MJ, Schmitt D, Schmidt R (1997) Integration of Langerhans cells into a pigmented reconstructed human epidermis. Journal of Investigative Dermatology 109: 510-512.

Roberts DW, Basketter DA (1997) Further evaluation of the quantitative structure-activity relationship for skin-sensitizing alkyl transfer agents. Contact Dermatitis 37: 107-112.

Roberts DW, Basketter DA (2000) Quantitative structure-activity relationships: sulfonate esters in the local lymph node assay. Contact Dermatitis 42: 154-161.

Rustemeyer T, De Ligter S, Von Blomberg BM, Frosch PJ, Scheper RJ (1999) Human T lymphocyte priming in vitro by haptenated autologous dendritic cells. Clinical and Experimental Immunology 117: 209-216.

Schempp CM, Dittmar HC, Hummier D, Simon-Haarhaus B, Schulte-Monting J, Schopf E, Simon JC (2000) Magnesium ions inhibit the antigen-presenting function of human epidermal Langerhans cells in vivo and in vitro. Involvement of ATPase, HLA-DR, B7 molecules, and cytokines. Journal of Investigative Dermatology 115: 680-686.

Smith CK, Hotchkiss SAM (2001) Allergic Contact Dermatitis: Chemical and Metabolic Mechanisms. Taylor and Francis, London.

Sosted H, Basketter DA, Estrada E, Johansen JD, Patlewicz GY (2004) Ranking of hair dye substances according to predicted sensitization potency: quantitative structure-activity relationships. Contact Dermatitis 51: 241-254.

Worth AP, Balls M (Eds.) (2002) Alternative (Non-animal) Methods for Chemicals Testing: Current Status and Future Prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30 Supplement 1.

Zinke S, Gerner I, Schlede E (2002) Evaluation of a rule base for identifying contact allergens by using a regulatory database: Comparison of data on chemicals notified in the European Union with ‘structural alerts’ used in the DEREKFW Expert System. ATLA 30: 285-298.

2.6 Acute Toxicity

2.6.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Acute toxicity is covered in Annex VI of the REACH proposals. For gases and volatile liquids (vapour pressure above 10^-2 Pa at 20°C) the inhalation route is preferred. For all other test substances two routes are required, one of which is the oral route. The second route depends on exposure potential and physicochemical properties. The study need not be conducted if the substance is corrosive, is flammable in air at room temperature, or cannot be administered in precise doses because of its chemical or physical properties.
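The route-selection rule in the paragraph above can be written as a small decision function. This is a minimal sketch only: the class and function names are hypothetical, the physical state is reduced to a simple "gas or volatile" test, and "preferred" is treated as "selected". It is not a statement of the legal text.

```python
from dataclasses import dataclass
from typing import List

# Vapour pressure cut-off (Pa, at 20 °C) quoted in the text above.
VOLATILITY_THRESHOLD_PA = 1e-2


@dataclass
class Substance:
    """Hypothetical record of the properties used by the rule."""
    is_gas: bool
    vapour_pressure_pa: float      # measured at 20 °C
    corrosive: bool = False
    flammable_in_air: bool = False
    precisely_dosable: bool = True


def acute_toxicity_routes(s: Substance) -> List[str]:
    """Return the exposure routes to test; an empty list means the study may be waived."""
    # Waiver conditions listed in the text.
    if s.corrosive or s.flammable_in_air or not s.precisely_dosable:
        return []
    # Gases and volatile liquids: inhalation is the preferred route.
    if s.is_gas or s.vapour_pressure_pa > VOLATILITY_THRESHOLD_PA:
        return ["inhalation"]
    # Everything else: oral plus a second route chosen case by case.
    return ["oral", "second route (from exposure potential and physicochemical properties)"]
```

The choice of the second route is left as a placeholder string because the text makes it depend on a case-by-case assessment.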

Method outline:

Oral acute toxicity (FDP - REACH B1BIS, OECD TG 420; ATC - REACH B1TRIS, OECD TG 423; UDP - OECD TG 425)

The original oral LD50 test (OECD TG 401) for measuring acute toxicity was deleted from the OECD test guidelines in 2002 because it was deemed excessively severe in terms of animal welfare. Three less severe tests have been developed, validated and accepted by the OECD: the fixed dose procedure (FDP), the acute toxic class method (ATC) and the up and down procedure (UDP). Only one of the new test guidelines, the UDP, specifically uses lethality as the endpoint; its use is therefore not recommended for REACH.

The tests involve administering the test substance in a single dose by gavage or intubation cannula. Animals are observed for 7-14 days (depending on the test) for changes to skin, eyes, fur and mucous membranes, and to the respiratory, circulatory, autonomic and central nervous systems. Any animal in severe distress is humanely killed. All test animals are subjected to necropsy and all pathological changes are recorded for each animal.

Number of animals used – minimum of 3 per dose group, approximately 12 per study (limit test uses 6), all figures for acute toxic class method.

Dermal acute toxicity (REACH B3, OECD TG 402)

The dermal LD50 test is still used although a draft dermal fixed dose procedure is under review by the OECD (Draft TG 434) which will use fewer animals and be less severe.

The tests involve administering the test substance to a shaved piece of animal skin (approx 10% of body surface area). The test substance is held in place with a porous gauze dressing and non-irritating tape for 24 hours, then removed. Animals are observed for 14 days and necropsy carried out as in the oral methods.

Number of animals used – 5 per dose group, approximately 15 per study (limit test uses 5 of each sex).

Inhalation acute toxicity (REACH B2, OECD TG 403)

The inhalation LC50 test is still used, although it was updated in 1996 to refine the method and reduce the number of animals required by using only one sex per dose group. As for dermal toxicity, there are draft OECD test guidelines under review for methods which refine the use of animals (inhalation acute toxic class (Draft TG 436) and inhalation fixed dose procedure (Draft TG 433)).

The tests involve administering the test substance via inhalation equipment (either head/nose only or whole body exposure) for approximately 4 hours. Animals are observed for at least 14 days and necropsy of the animals carried out as in the oral methods.

Number of animals used – 5 of each sex per dose group, approximately 30 per study (limit test uses 5 of each sex).

2.6.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

There are many and varied mechanisms of acute toxicity, and the full range of mechanisms is probably not yet fully understood. Indeed, due to the relatively “crude” nature of this test there may be limited scope for alternatives.

There are thousands of data for acute toxicity, and there is very great potential here for read-across approaches. Many of the data are available through commercial databases, e.g. the MDL Toxicity Database (http://www.mdl.com/products/predictive/toxicity/index.jsp), which includes the Registry of Toxic Effects of Chemicals listing. Another product is LeadScope’s Toxscope (http://www.leadscope.com/products/txs.htm). Like the MDL product, it has access to 150,000 chemical structures from sources such as RTECS, NTP, and CPDB.

Whilst such databases provide an excellent repository of data, their practical use for in silico modelling is limited because the quality of the data is highly variable and has not been recorded. The in silico modelling of acute toxicity is complicated further by a lack of knowledge of mechanisms of action. The difference in the advance of the science between the modelling of fish and mammalian acute toxicity is striking.

The area of acute toxicity has seen considerable steps forward in the areas of refinement and reduction, particularly in the redefinition of the LD50 test.

2.6.3 In vitro methods

A number of recent studies have shown that in vitro basal cytotoxicity can be used as a surrogate endpoint for predicting acute systemic toxicity in vivo, leading to the assumption that in vitro tests may be able to help to reduce the number of animals used for acute toxicity testing. Table 2.6.3a gives an overview of basal cytotoxicity tests currently used for in-house screening studies.

Method | Test System | Endpoint | Developer/References
Neutral Red Uptake (INVITTOX No. 3/46) | Balb/c 3T3 or Normal Human Keratinocyte cell cultures | Cell viability | Borenfreund and Puerner, 1985; Riddell et al, 1986; Liebsch and Spielmann, 1995; Spielmann et al, 1999
Kenacid Blue R dye binding (INVITTOX No. 15) | Cell culture | Change in total cell protein arising from the inhibition of cell proliferation | Clothier et al, 1987, 1988; Hulme et al, 1987; Smith et al, 1992
In vitro prediction of maximum tolerated dose (INVITTOX No. 66) | Primary cultures of rat hepatocytes, and MDBK and McCoy cells | Cytotoxicity as indicator to predict in vivo 28-day max. tolerated dose | Shrivastava et al, 1991, 1992
Transepithelial resistance (TER) and paracellular permeability (PCP) | Renal cell lines - LLC-PK1, epithelial proximal tubular cells and MDCK epithelial distal cells | Measurements of TER and PCP as generalized predictors of nephrotoxicity | Pfaller et al, 2000; Duff et al, 2002
HepG2 cell/protein content (MEIC protocol) | Hepatoma cell line HepG2 | Cytotoxicity as a measure of changes in protein content | Dierickx, 1989; Clemedson et al, 1996
HL-60/ATP content (MEIC protocol) | HL-60 cells (human acute promyelocytic leukaemia) | ATP content is measured as the bioluminescence generated from the enzymatic luciferin-luciferase reaction | Kangas et al, 1984; Wakuri et al, 1993; Clemedson et al, 1996
Chang cell/MIT-24 assay (MEIC protocol) | Chang liver cells | Deficient outgrowth of fusiform or spindle-shaped cells is used as a criterion of cyto-inhibition | Ekwall and Sandström, 1978; Clemedson et al, 1996
Chang cell/pH change (MEIC protocol) | Chang cells from MIT-24 assay | Colour of the pH indicator phenol red: violet - total inhibition, red - partial inhibition | Ekwall and Sandström, 1978; Clemedson et al, 1996
Human lymphocyte cytotoxicity assay (INVITTOX No. 6) | Cell culture e.g. HeLa cells | DNA leakage and lactate dehydrogenase (LDH) release from lymphocytes | Ekwall, 1980; Ekwall and Johanson, 1980; Skaanild and Clausen, 1989
LS-L929 cytotoxicity assay (INVITTOX No. 38) | Cell culture e.g. mouse fibroblasts | Cell viability determined by the uptake of dyes - ethidium bromide and fluorescein acetate | Kemp et al, 1983, 1988
Membrane permeability in perfused cell cultures (INVITTOX No. 9) | Perfused cell cultures e.g. rat liver, human glioma | Efflux of [3H]-2-deoxy-d-glucose-6-phosphate, as an indicator of cytotoxicity | Walum and Peterson, 1982; Walum and Marchner, 1983

Table 2.6.3a. An overview of methods currently used for non-regulatory purposes for the assessment of basal cytotoxicity. INVITTOX protocols can be found via the ECVAM SIS website http://ecvam-sis.jrc.it .

The MEIC programme was a seven-year study in 59 laboratories worldwide, which tested 50 reference chemicals selected by the Swedish Poison Information Centre, using in-house protocols (Clemedson et al, 1996). For each chemical, information on human toxicity and kinetics was available from drug/chemical overdoses, from which 50% lethal blood concentrations (LC50) could be derived (Ekwall et al, 1998). These LC50s were compared with the in vitro data produced by the testing laboratories. The outcome of the MEIC study was that results from certain in vitro human basal cytotoxicity tests correlated well with the derived LC50 values (Clemedson et al, 2000); specifically, a battery of four tests showed favourable results. The four tests are:

- HepG2 cell/protein content
- HL-60/ATP content
- Chang liver cell/morphology (MIT-24 assay)
- Chang cell/pH change

Supplementary in vitro tests also showed that improved results could be obtained if biokinetic factors such as passage across the blood-brain barrier were taken into account. An ongoing study, EDIT (Clemedson et al, 2002), was subsequently set up to evaluate tests relevant to toxicokinetics and organ-specific toxicity for eventual incorporation into an in vitro test battery for the prediction of acute systemic toxicity, particularly to address the outliers found when the correlation between basal cell cytotoxicity and acute lethal potency in vivo was examined.

The Willi Halle Registry of Cytotoxicity (Halle, 2003) is a database of LD50 values obtained from studies using rats and mice and IC50 values obtained from in vitro cytotoxicity assays available in the literature. There are data for a total of 347 chemicals, from which a prediction model has been derived. The use of this prediction model results in a good correlation between the in vivo and in vitro data, again demonstrating the possibilities of using a battery of in vitro tests in the prediction of systemic toxicity.
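A prediction model of this kind is typically a linear regression between log-transformed in vitro IC50 values and log-transformed in vivo LD50 values. The sketch below shows how such a model could be fitted and used to estimate an LD50. The function names, the units (mmol/l and mmol/kg) and the three data points are illustrative assumptions, not values taken from the Registry of Cytotoxicity.

```python
import math


def fit_log_log(ic50_values, ld50_values):
    """Least-squares fit of log10(LD50) = slope * log10(IC50) + intercept."""
    xs = [math.log10(v) for v in ic50_values]
    ys = [math.log10(v) for v in ld50_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_ld50(ic50, slope, intercept):
    """Estimate an LD50 from an IC50 using the fitted regression line."""
    return 10 ** (slope * math.log10(ic50) + intercept)


if __name__ == "__main__":
    # Hypothetical paired data (IC50 in mmol/l, LD50 in mmol/kg); illustration only.
    ic50 = [0.01, 1.0, 100.0]
    ld50 = [0.3, 3.0, 30.0]
    slope, intercept = fit_log_log(ic50, ld50)
    print("slope:", slope, "intercept:", intercept)
    print("estimated LD50 for IC50 = 0.5:", predict_ld50(0.5, slope, intercept))
```

An estimate obtained this way is only as good as the underlying data set; chemicals whose toxicity depends on metabolism or organ-specific effects would be expected to fall away from the regression line.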

During an international workshop on ‘In vitro Methods for Assessing Acute Systemic Toxicity’ in 2000 (NIH, 2001), the above studies, amongst others, were discussed. The workshop concluded that none of the in vitro models had been formally validated for their reliability and relevance and recommended that a validation study should be carried out to predict human lethal concentrations in order to improve initial dose selection for in vivo studies. In 2002, the ECVAM/NICEATM validation study of an in vitro basal cytotoxicity test was started, with three laboratories involved – two in the USA and one in the UK (the FRAME Alternatives Lab). Seventy-two chemicals, including the 50 used in the MEIC study, are being tested in mouse Balb/c 3T3 cells and normal human keratinocyte cells by using the Neutral Red Uptake (NRU) assay (Gennari et al, 2004). The objectives of this study are: a) to classify the chemicals according to the GHS categories; b) to evaluate the predictability of starting doses for in vivo studies by using reliable in vitro data and determining the potential reduction in animal numbers required for cytotoxicity testing; and c) to evaluate the correlation with human lethal concentrations. The laboratory testing was due to be completed by the end of 2004, with a final report due in August 2005. Preliminary studies with several chemicals have shown that it is possible to reduce the numbers of animals required for acute toxicity testing by using in vitro data on basal cytotoxicity to select starting doses for the in vivo studies. The general applicability of this approach will have to await the analysis of the outcome of the study with data for more chemicals.

Following on from these studies, an integrated project entitled A-Cute-Tox has been set up within the EU with the aim of optimising and prevalidating an in vitro test strategy for predicting human acute toxicity. The fundamental premise underlying the project is that a high proportion of acute systemic toxicity is due to loss of critical organ function, arising from basal cytotoxicity in target tissue cells. Many other factors are involved, and these will sometimes account for substances whose acute toxicity does not correlate with basal cytotoxicity. The project will involve developing a system of alerts to account for such known outliers, primarily on the basis of biokinetics, metabolism and target organ specific effects. The scientific objectives of the project are:
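Objective (a) above, classification by GHS category, amounts to comparing an LD50 estimate with a set of cut-off values. The following minimal sketch assumes the GHS acute oral toxicity bands (upper limits of 5, 50, 300, 2000 and 5000 mg/kg body weight for Categories 1 to 5). These cut-offs are not given in this report and should be checked against the current GHS text; the function name is hypothetical.

```python
from typing import Optional

# Assumed GHS acute oral toxicity bands: (upper limit in mg/kg body weight, category).
GHS_ORAL_UPPER_LIMITS = [
    (5.0, 1),
    (50.0, 2),
    (300.0, 3),
    (2000.0, 4),
    (5000.0, 5),
]


def ghs_oral_category(ld50_mg_per_kg: float) -> Optional[int]:
    """Return the GHS acute oral category for an LD50, or None if above all bands."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be a positive dose")
    for upper_limit, category in GHS_ORAL_UPPER_LIMITS:
        if ld50_mg_per_kg <= upper_limit:
            return category
    return None  # above 5000 mg/kg: not classified in this sketch
```

The LD50 passed in could be a measured value or an estimate derived from in vitro cytotoxicity data; in the latter case the resulting category inherits the uncertainty of the estimate.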

- Compilation, critical evaluation and generation of high quality in vitro and in vivo data for comparative analysis.
- Identifying factors that influence the correlation between in vitro toxicity (concentration) and in vivo toxicity (dosage), and to define an algorithm that accounts for this.
- Explore innovative tools and cellular systems to identify new endpoints and strategies to better anticipate animal and human toxicity.
- To design a simple, robust and reliable in vitro test strategy amenable for robotic testing, associated with the prediction model for acute toxicity.

Although there has been a lot of activity in this area over recent years, full replacements of in vivo acute toxicity tests still require further research. The main problem associated with replacing the acute toxicity tests is that although in vitro basal cytotoxicity tests may be able to predict general cellular toxicity, they are not able to predict the toxicokinetic properties of a chemical or the organs or tissues in which it is likely to accumulate. Toxicokinetic effects are difficult to predict in vitro and will be discussed in the toxicokinetics section. In vitro models of tissues and organs are available (Table 2.6.3b), although it is unlikely that any will be validated in the near future (except the Colony forming unit-granulocyte/macrophage (CFU-GM) assay, which has been validated for the detection of acute neutropenia, see below).

Method | Test System | Endpoint | Developer/References

Hepatotoxicity/metabolism-mediated toxicity

Reactive metabolite formation by fortified liver microsomes (INVITTOX No. 10) | Hepatic microsomal fractions | Measurement of glutathione depletion | Garle and Fry, 1988; Garle et al, 1988
Hepatoma cell cultures as in vitro models for hepatotoxicity (INVITTOX No. 13) | Hepatoma cell lines derived from human, rat or mouse | Evaluation of colony-forming efficiency in hepatoma cell lines to detect irreversible toxic effects on both cell growth and survival | Ferro et al, 1988, 1992; Bassi et al, 1991, 1993
Isolation of rat hepatocytes (INVITTOX No. 20) | Collagenase perfusion of rat liver | Assesses cell viability and enzyme leakage | Sippel and Estler, 1990
Rat hepatocyte flow cytometric cytotoxicity test (INVITTOX No. 23) | Hepatocyte culture | Induced changes in DNA and protein contents | Holzer and Maier, 1987a, 1987b; Maier et al, 1991
Liver slice hepatotoxicity screening system (INVITTOX No. 42) | Rat or mouse liver slices | Leakage of lactate dehydrogenase and alanine aminotransferase | Wormser et al, 1990a, 1990b, 1990c
Serum-free liver mitogen test (INVITTOX No. 67) | Primary rat hepatocytes culture | Assessment of growth response | Parzefall et al, 1989
Primary human hepatocytes cultures (from surgical biopsies) (INVITTOX No. 69) | Human hepatocytes culture | Measurement of 7-ethoxy- and pentoxy-resorufin O-dealkylase activities | Gómez-Lechón et al, 1990
Use of stable cell lines expressing Cyp cDNA (INVITTOX No. 107) | V79 Chinese hamster cells | Assessment of cytotoxicity of xenobiotics and metabolites | Doehmer, 1993; Jensen et al, 1993
Two-compartment human tissue cytotoxicity test (INVITTOX No. 73) | Human mononuclear leucocytes and human liver microsomes | Identification of cytotoxic metabolites capable of diffusing away from the site of production | Riley et al, 1990; Tingle et al, 1990, 1991

Neurotoxicity

Spontaneously contracting cultured rat skeletal muscle cells for testing toxic effects on excitable tissues (INVITTOX No. 93). Also used in cardiotoxicity testing | Primary cultures of rat myogenic satellite cells | Simultaneous determination of cytotoxicity, metabolic disturbances and gross disruption of membrane systems for identification of sub-cytotoxic effects peculiar to the excitable membrane | Gulden et al, 1992, 1994a, 1994b, 1994c
Whole rat brain reaggregate spheroid culture (INVITTOX No. 11) | Rat brain reaggregate | Monitoring of development, differentiation and relative maturity of the brain reaggregate | Atterwill, 1989

Nephrotoxicity

Isolated rat glomeruli and proximal tubules (INVITTOX No. 5) | Specific kidney derived cells | Examination of cell glucose and/or fatty acid oxidation and de novo protein synthesis | Bach, 1989; Kwizera et al, 1990; Wilks et al, 1990
LLC-RK1 cell screening test (INVITTOX No. 51) | Kidney derived cells | Cytotoxicity determined by the neutral red method (see basal cytotoxicity table) | Borenfreund and Puerner, 1985; Williams et al, 1988
Alpha-methyl glucose uptake in isolated proximal tubular cells (INVITTOX No. 63) | Freshly isolated proximal tubular cells from rat kidney | Inhibition of the uptake of alpha-methyl glucose to assess acute early stage nephrotoxicity | Boogaard et al, 1989a, 1989b

Cardiotoxicity

Embryonic myocardial myocyte reaggregation cultures as a model for cardiotoxicity (INVITTOX No. 106) | Reaggregates of isolated chick embryo myocardial myocytes | Direct acting cardiotoxins distinguished from toxic compounds with indirect cardiotoxicity mechanisms | Earl et al, 1991, 1992; Seaman et al, 1994

Haematotoxicity

Colony forming unit-granulocyte/macrophage assay (CFU-GM) (INVITTOX No. 101) | Murine bone marrow cells or human umbilical cord blood cells | Evaluation of the inhibition of CFU-GM growth to predict neutropenia | Lewis et al, 1996; Pessina, 1998; Pessina et al, 2000, 2001, 2003
Colony forming unit-megakaryocyte assay (CFU-MK) | Bone marrow cells, platelets | Evaluation of the inhibition of CFU-MK growth to predict thrombocytopenia | Gribaldo, 2002

Respiratory tract toxicity

Dust toxicity in rat alveolar macrophage cultures (INVITTOX No. 32) | Rat alveolar macrophage cells | Cell viability determined by vital dye exclusion and enzyme leakage assays to assess toxicity of particulate matter | Collan et al, 1988
Isolation of type II alveolar epithelial cells | Rat lung tissue, rabbit lung tissue, CHO cells | Obtain cell suspension enriched in type II alveolar epithelial cells | Devereux et al, 1982; Bond et al, 1983; Zamora et al, 1983; Devereux, 1984

Table 2.6.3b. An overview of methods currently available for non-regulatory screening of specific organ/tissue toxicities. INVITTOX protocols can be found via the ECVAM SIS website http://ecvam-sis.jrc.it .

The Colony forming unit-granulocyte/macrophage (CFU-GM) assay for haematotoxicity has recently undergone a validation study by ECVAM for the prediction of acute neutropenia (an abnormal decrease of neutrophils, a type of white blood cell, in the blood) in humans (Pessina et al, 2005). The assay relies on the rapid division of haematopoietic stem cells, CFU-GM, to form colonies of granulocyte (mostly neutrophils) and macrophage (e.g. phagocytes) type white blood cells which are visible by microscopy. Upon exposure to a haematotoxic substance, the number and/or size of the colonies is reduced, indicating destruction of the haematopoietic stem cells. The ESAC recently issued a statement on the validation study (http://ecvam.jrc.it/index.htm) which stated that the assay is not validated as a full replacement method but can be used as an alternative to a second species, thereby reducing the number of animals required for the pharmaceutical evaluation of clinically relevant, tolerated doses of anti-cancer drugs (Pessina et al, 2002). It is unclear whether this method would be applicable to REACH testing, but if haematotoxicity testing were required, further validation studies would need to be carried out on a more diverse range of test substances before its use could be permitted.

The recent ECVAM Workshop on ‘Strategies to replace in vivo acute systemic toxicity testing’ (Gennari et al, 2004) gives examples of further susceptible functions that have potential for assay system development. These include reactive oxygen species, which could be detected by fluorescein diacetate, and mitochondrial energy production/metabolism, which could be detected by an ATP chemiluminescence assay.

2.6.4 In silico approaches

QSARs for acute toxicity have been reviewed recently by Lessigiarska et al (2005).

QSARs

There are relatively few published QSAR models for acute toxicity. Examples include Devillers (2004) and Wang and Baj (1998). The models available in the literature relate mainly to regulatory data, and few list the data. The published models also tend to be non-linear, i.e. involving neural networks. The reason for the use of non-linear techniques is not always clear and may relate to the quality of the data, mixed and complex mechanisms of action, or a personal modelling paradigm.

Expert Systems

In contrast to the paucity of QSARs, there are a number of expert system models of acute toxicity including models of mammalian LD50 from TOPKAT and MultiCASE. Other, less well established, systems with acute toxicity models include PASS, OASIS, ToxBoxes and TerraQSAR. All these systems have probably taken advantage of the large numbers of toxicity data available. How the modellers have dealt with issues of mechanisms and data quality is less clear.

There are also some SARs for acute toxicity. DEREK lists at least four rules for “high acute toxicity”. Whilst this is a small number, this may indicate some possibility to expand the rulebase, depending on a careful definition of “high acute toxicity”.

2.6.5 Integrated testing strategies for specific endpoints

A testing strategy for acute toxicity which includes various in silico and in vitro steps is described in Worth and Balls (2002). In silico methods are required for initial toxicity screening. Basal cytotoxicity tests are then proposed, since the MEIC and EDIT projects (Clemedson et al, 2000; 2002) indicate that human basal cytotoxicity tests are predictive of human acute toxicity. The NRU assay for basal cytotoxicity currently undergoing validation is one of the assays proposed. The later stages of the testing strategy involve the use of in vitro biotransformation tests for metabolism and cell-specific toxicity tests. As a last resort a limited in vivo study may be carried out.

The BUAV proposes a four step testing strategy for acute and repeated dose toxicity whereby no animals are used. Step one involves the determination of physical and chemical information about the test substance. Step two uses basal cell cytotoxicity studies to classify highly toxic substances. Step three requires in vitro studies of metabolism integrated with computer simulations of absorption, distribution, metabolism and excretion (ADME), with the outcome predicting local concentration with time in target organs. Step four uses the information from step three to select more specialised tests for organ or tissue toxicity. The strategy is heavily reliant on the use of MEIC protocols which have yet to be validated and accepted for regulatory testing.

Another strategy, proposed by Seibert et al (1994), uses mainly in vitro tests with limited in vivo tests used to confirm the lowest toxicity classes. Three orders of tests are used within this strategy to determine the level of toxicity: first order tests determine basal cytotoxicity; second order tests detect hepatocyte specific toxicity and the role of biotransformation for cytotoxic activity; finally, third order tests provide information concerning further selective cytotoxic activities and interferences with specific, non-vital cell functions. According to the results, the test substance is classified as very toxic, toxic or harmful, depending on which level of testing provides a positive outcome (first, second, third order respectively).
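The tiered logic of this strategy can be expressed as a short classification function. This is a sketch under stated assumptions: the function name and boolean inputs are hypothetical, and each input stands for the overall outcome of one order of tests rather than for any particular assay.

```python
from typing import Optional


def seibert_classification(first_order_positive: bool,
                           second_order_positive: bool,
                           third_order_positive: bool) -> Optional[str]:
    """Map tiered in vitro results to a toxicity class.

    The earliest tier giving a positive outcome decides the class:
    first order -> very toxic, second order -> toxic, third order -> harmful.
    """
    if first_order_positive:
        return "very toxic"
    if second_order_positive:
        return "toxic"
    if third_order_positive:
        return "harmful"
    # No tier positive: the strategy relies on limited in vivo tests
    # to confirm the lowest toxicity classes, so no class is assigned here.
    return None
```

In practice the tiers would be run in sequence, so later tiers need only be performed when earlier tiers are negative.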

At an ECVAM workshop in September 2003 entitled ‘Strategies to replace in vivo acute systemic toxicity testing’ (Gennari et al, 2004) it was concluded that an integrated test strategy should be developed for the replacement of in vivo acute toxicity testing based on:

- Physico-chemical data
- In vitro-in vivo data
- Computational methods
- Basal cytotoxicity assays
- Complementary assays (for metabolism, transport, kinetics and target organ toxicity)

2.6.6 Reduction and Refinement opportunities

As discussed in the established toxicity test section, acute oral toxicity methods that reduce and refine animal testing have been accepted for regulatory use. Similarly, draft acute dermal and inhalation methods are undergoing review. The common link between these newer test guidelines is that they do not rely on lethality as the endpoint but, instead, allow an estimation of toxicity from clinical signs following the administration of lower test concentrations. Further refinement opportunities include looking at the need for restraint devices, especially during acute inhalation toxicity studies. Group housing and environmental enrichment should be used where appropriate.

The acceptance of the basal cytotoxicity validation study on NHK and Balb/c 3T3 cells would reduce animal use by providing a means by which starting doses can be calculated without sighting studies and limit tests being carried out. The number of animals used for preliminary assessment, and as controls, can be reduced by data-sharing or by limiting the number of animals needed to confirm results from a sighting study.

2.6.7 Bibliography

Atterwill CK (1989) Brain reaggregate cultures in neurotoxicological investigations: studies with cholinergic neurotoxins. Scandinavian Cell Toxicology Congress 1. ATLA 30 Suppl. 2: 53-59.

70 Bach PH (1989) The detection of chemically induced renal injury, the cascade of degenerative morphological and functional changes that follow the primary nephrotoxic insult and the evaluation of these changes by in vitro methods. Toxicology Letters 46: 237-250 Bassi AM, Bosco O, Brenci S, Adamo D, Penco S, Piana S, Ferro M, Nanni G (1993) Evaluation of the cytotoxicity of the first 20 MEIC chemicals in two hepatoma cell lines with different xenobiotic metabolism capacities. ATLA 21: 65-72 Bassi AM, Piana S, Penco S, Bosco O, Brenci S, Ferro M (1991) Use of an established cell line in the evaluation of the cytotoxic effects of various chemicals. Bolletin Society Italian Biology Sperimentale 8: 809-816 Bond JA, Mitchell CE, Li AP (1983) Metabolism and macromolecular covalent binding of benzo(a)pyrene in cultured Fischer-344 rat lung type II epithelial cells. Biochemical Pharmacology 32: 3711 Boogaard PJ, Commandeur JNM, Mulder JG, Vermeulen NPE, Nagelkerke JF (1989a) Toxicity of the cysteine-S-conjugates and mercapturic acids of four structurally related difluorethylenes in isolated proximal tubular cells from rat kidney. Uptake of the conjugates and activation to toxic metabolites. Biochemical Pharmacology 38: 3731-3741 Boogaard PJ, Mulder GJ, Nagelkerke JF (1989b) Isolated proximal tubular cells from rat kidney as an in vitro model for studies on nephrotoxicity. II. - Methylglucose uptake as a sensitive parameter for mechanistic studies of acute toxicity by xenobiotics. Toxicology and Applied Pharmacology 101: 144-157. Borenfreund E, Puerner JA (1985) Toxicity determined in vitro by morphological alterations and Neutral Red absorption. Toxicology Letters 24: 118 Clemedson C, Barile FA, Chesne C, Cottin C, Curren R, Ekwall B, Ferro M, Gomez- Lechon M, Imai K, Janus J, Kemp RB, kerszman G, Kjellstrand P, Lavrijsen K, Logemann P, McFarlane-Abdulla E, Roguet R, Segner H, Thuvander A, Walum E, Ekwall B. 
(2000) MEIC evaluation of acute systemic toxicity: part vii prediction of human toxicity by results from testing of the first 30 reference chemicals with 27 further in vitro assays. ATLA 28: 161-200 Clemedson C, Barile FA, Chesne C, Cottin C, Curren R, Ekwall B, Ferro M, Gomez- Lechon M, Imai K, Janus J, Kemp RB, kerszman G, Kjellstrand P, Lavrijsen K, Logemann P, McFarlane-Abdulla E, Roguet R, Segner H, Thuvander A, Walum E, Ekwall B (2000) MEIC evaluation of acute systemic toxicity: part vii prediction of human toxicity by results from testing of the first 30 reference chemicals with 27 further in vitro assays. ATLA 28: 161-200 Clemedson C, McFarlane-Abdulla E, Andersson M, Barile FA, Calleja MC, Chesné C, Clothier R, Cottin M, Curren R, Daniel-Szolgay E, Dierickx P, Ferro M, Fiskesjö G, Garza-Ocanas L, Gómez-Lechón MJ, Gülden M, Isomaa B, Janus J, Judge P, Kahru A, Kemp RB, Kerszman G, Kristen U, Kunimoto M, Kärenlampi S, Lavrijsen K, Lewan L, Lilius H, Ohno T, Persoone G, Roguet R, Romert L, Sawyer T, Seibert H, Shrivastava R, Stammati A, Tanaka N, Torres Alanis 0, Voss J-U, Wakuri S, Walum E, Wang X, Zucco F, Ekwall B (1996) MEIC evaluation of acute systemic toxicity: Part I. Methodology of 68 in vitro toxicity assays used to test the first 30 reference chemicals. ATLA 24: 251-272

Clemedson C, Nordin-Andersson M, Bjerregaard HF, Clausen J, Forsby A, Gustafsson H, Hansson U, Isomaa B, Jørgensen C, Kolman A, Kotova N, Krause G, Kristen U, Kurppa K, Romert L, Scheers E (2002) Development of an in vitro test battery for the estimation of acute human systemic toxicity: an outline of the EDIT project. ATLA 30: 313–321.
Clothier RH, Hulme L, Ahmed AB, Reeves HL, Smith M, Balls M (1988) In vitro cytotoxicity of 150 chemicals to 3T3-L1 cells, assessed by the FRAME Kenacid Blue Method. ATLA 16: 84-95.
Clothier RH, Hulme L, Smith M, Balls M (1987) A comparison of the in vitro cytotoxicities and acute in vivo toxicities of 59 chemicals. Molecular Toxicology 1: 571-577.
Collan Y, Kosma V-M, Kulju T, Väänänen I, Remola-Pärssinen E, Pesonen E, Puhakainen R, Rytöluoto-Kärkkäinen R, Manninen R, Pasanen J (1988) Estimation of dust toxicity in rat alveolar macrophage cultures. In: Safety Evaluation of Chemicals on Laboratory Animals, Proceedings of the Finnish-Soviet Symposium, Kuopio 20-22 May 1986 (Nevalainen T, Voipio H-M, Haataja H, eds.). Kuopio, Finland: University of Kuopio, pp. 105-123.
Devereux TR (1984) Alveolar type II and Clara cells: isolation and xenobiotic metabolism. Environmental Health Perspectives 56: 95.
Devereux TR, Jones K, Bend J, Fouts J, Stratham C, Boyd MR (1982) In vitro metabolic activation of the pulmonary toxin, 4-ipomeanol, in non-ciliated bronchiolar epithelial (Clara) and alveolar type II cells isolated from rabbit lung. Journal of Pharmacology and Experimental Therapeutics 242: 485.
Devillers J (2004) Prediction of mammalian toxicity of organophosphorus from QSTR modelling. SAR and QSAR in Environmental Research 15: 501-510.
Dierickx P (1989) Cytotoxicity testing of 114 compounds by the determination of the protein content in HepG2 cell cultures. Toxicology in Vitro 3: 189-193.
Doehmer J (1993) V79 Chinese hamster cells genetically engineered for cytochrome P450 and their use in mutagenicity and metabolism studies. Toxicology 82: 105-118.
Duff T, Carter S, Feldman G, McEwan G, Pfaller W, Rhodes P, Ryan M, Hawksworth G (2002) Transepithelial resistance and inulin permeability as endpoints in in vitro nephrotoxicity testing. ATLA 32: 437-459.
Earl LK, Kesingland K, Davis KP, Brocklehurst SR, Jones HB (1992) Allylamine toxicity in embryonic myocardial myocyte reaggregate cultures: The role of extracellular metabolism by benzylamine oxidase. Toxicology in Vitro 6: 405-416.

Ekwall B (1980) Preliminary Studies on the Validity of In Vitro Measurement of Drug Toxicity Using HeLa Cells. II. Drug Toxicity in the MIT-24 System Compared with Mouse and Human Lethal Dosage of 53 Drugs. Toxicology Letters 5: 309-317.
Ekwall B, Clemedson C, Crafoord B, Ekwall B, Hallander S, Walum E, Bondesson I (1998) MEIC Evaluation of Acute Systemic Toxicity: Part V. Rodent and Human Toxicity Data for the 50 Reference Chemicals. ATLA 26: 571-616.
Ekwall B, Johanson A (1980) Preliminary Studies on the Validity of In Vitro Measurement of Drug Toxicity Using HeLa Cells. I. Comparative In Vitro Cytotoxicity of 27 Drugs. Toxicology Letters 5: 299-307.
Ekwall B, Sandström B (1978) Combined toxicity to HeLa cells of 30 drug pairs, studied by a two-dimensional microtitre method. Toxicology Letters 2: 285-292.
Ferro M, Bassi AM, Nanni G (1988) Hepatoma cell cultures as in vitro models for the hepatotoxicity of xenobiotics. ATLA 16: 32-37.
Ferro M, Bassi AM, Penco S, Piana S, Usiglio D, Nanni G (1992) Comparative assessment of the cytotoxic effects of different xenobiotics in three hepatoma cell lines. Arzneim.-Forsch./Drug Research 442: 1053-1057.
Garle MJ, Fry JR (1988) Detection of reactive metabolites in vitro. Toxicology 54: 101-110.
Garle MJ, Khan J, Fry JR (1988) Depletion of glutathione by the hepatotoxins paracetamol and bromobenzene, and their non-hepatotoxic analogues, in a fortified liver microsomal system. Toxicology in Vitro 2: 247-252.
Gennari A, van den Berge C, Casati S, Castell J, Clemedson C, Coecke S, Colombo A, Curren R, Dal Negro G, Goldberg A, Gosmore C, Hartung T, Langezaal I, Lessigiarska I, Mass W, Mangelsdorf I, Parchment R, Prieto P, Sintes JR, Ryan M, Schmuck G, Stitzel K, Stokes W, Vericat J-A, Gribaldo L (2004) ECVAM Workshop Report 50: Strategies to Replace In vivo Acute Systemic Toxicity Testing. ATLA 32: 437-459.
Gómez-Lechón MJ, López P, Donato T, Montoya A, Larrauri Giménez P, Trullenque R, Fabra R, Castell JV (1990) Culture of human hepatocytes from small surgical biopsies. Biochemical characterization and comparison with in vivo. In Vitro Cellular and Developmental Biology 26: 67-74.
Gribaldo L (2002) Haematotoxicology: Scientific basis and regulatory aspects. ATLA 30 Suppl. 2: 111-113.
Gülden M, Seibert H, Voss J-U (1994a) In vitro toxicity screening using cultured rat skeletal muscle cells. II. Agents affecting excitable membranes. Toxicology in Vitro 8: 197-206.
Gülden M, Seibert H, Voss J-U (1994b) Inclusion of physicochemical data in quantitative comparisons of in vitro and in vivo toxic potencies. ATLA 22: 185-192.
Gülden M, Seibert H, Voss J-U (1994c) The use of cultured skeletal muscle cells in testing for acute systemic toxicity. Toxicology in Vitro 8: 779-782.
Gülden M, Seibert H, Voss J-U, Wassermann O (1992) Animal cells in vitro as supplement or alternative to animals in acute toxicity testing. Schriftenreihe des Instituts für Toxikologie der Universität Kiel Heft 21: 1-185.

Halle W (2003) The registry of cytotoxicity: toxicity testing in cell cultures to predict acute toxicity (LD50) and to reduce testing in animals. ATLA 31: 89-198.
Holzer C, Maier P (1987a) DNA and protein contents of hepatocytes in primary cultures monitored by flow cytometry: effect of phenobarbital and dimethylsulphoxide. Toxicology in Vitro 1: 203-213.
Holzer C, Maier P (1987b) Maintenance of periportal and pericentral oxygen tensions in primary rat hepatocyte cultures: influence on cellular DNA and protein content monitored by flow cytometry. Journal of Cellular Physiology 133: 297-304.
Hulme L, Reeves HL, Clothier RH, Smith M, Balls M (1987) An assessment of two alternative methods for predicting the in vivo toxicities of metallic compounds. Molecular Toxicology 1: 589-596.
Jensen KG, Loft S, Doehmer J, Poulsen HE (1993) Metabolism of phenacetin in V79 Chinese hamster cell cultures expressing rat liver cytochrome P450 1A2 compared to isolated rat hepatocytes. Biochemical Pharmacology 45: 1171-1173.
Kangas L, Gronroos M, Nieminen AL (1984) Bioluminescence of cellular ATP: a new method for evaluating cytotoxicity agents in vitro. Medical Biology 62: 338-343.
Kemp RB, Cross DM, Meredith RWJ (1988) Comparison of cell death and adenosine triphosphate as indicators of acute toxicity in vitro. Xenobiotica 18: 633-639.
Kemp RB, Meredith RWJ, Gamble S, Frost M (1983) A rapid cell culture technique for assessing the toxicity of detergent-based products in vitro as a possible screen for eye irritancy in vivo. Cytobiosystems 36: 153-159.
Kesingland K, Earl LK, Roberts JC, Jones HB (1991) Allylamine toxicity in embryonic myocardial myocyte reaggregate cultures. Toxicology in Vitro 5: 145-156.
Kwizera EN, Wilks MF, Bach PH (1990) Effects of aminoglycosides on the incorporation of amino acids and on fatty acid oxidation in freshly isolated rat renal proximal tubules. In Vitro Toxicology 3: 243-253.
Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.
Lessigiarska I, Worth AP, Netzeva TI (2005) Comparative Review of QSARs for Acute Toxicity. European Commission Report EUR 21559 EN, Ispra, Italy (contact Dr Andrew Worth for more information [email protected]).
Lewis ID, Rawling T, Dyson PG, Haylock DN, Juttner DN, To LB (1996) Standardization of the CFU-GM assay using hematopoietic growth factors. Journal of Hematotherapy 5: 625-630.
Liebsch M, Spielmann H (1995) Balb/c 3T3 cytotoxicity test. In: Methods in Molecular Biology, vol. 43, In Vitro Toxicity Testing Protocols (O’Hare S, Atterwill CK, eds.). Totowa, NJ: Humana Press, pp. 177-187.
Maier P, Schawalder H, Elsner J (1991) Single cell analysis in toxicity testing: the mitogenic activity of thioacetamide in cultured rat hepatocytes analyzed by DNA/protein flow cytometry. Archives of Toxicology 65: 454-464.

NIH (2001) Report of the International Workshop on In Vitro Methods for Assessing Acute Systemic Toxicity. NIH Publication 01-4499, 186pp. Research Triangle Park, NC, USA: NIEHS.
Parzefall W, Monschau P, Schulte-Hermann R (1989) Induction by cyproterone acetate of DNA synthesis and mitosis in primary cultures of adult rat hepatocytes in serum free medium. Archives of Toxicology 63: 456-46
Pessina A (1998) The Granulocyte Macrophage Colony-Forming Unit Assay. In: Animal Cell Culture Techniques (Clynes M, ed.). Berlin-Heidelberg, New York: Springer-Verlag, pp. 217-230.
Pessina A, Albella B, Bayo B, Bueren J, Brantom P, Casati S, Croera C, Gagliardi G, Foti P, Parchment R, Parent-Massin D, Schoeters G, Sibiril Y, Van den Heuvel R, Gribaldo L (2003) Application of the CFU-GM assay to predict acute drug-induced neutropenia: an international blind trial to validate a prediction model for the maximum tolerated dose (MTD) of myelosuppressive xenobiotics. Toxicological Sciences 75: 355-367.
Pessina A, Albella B, Bayo M, Bueren J, Brantom P, Casati S, Croera C, Parchment R, Parent-Massin D, Schoeters G, Sibiril Y, Van den Heuvel R, Gribaldo L (2002) In Vitro Tests for Haematotoxicity: Prediction of drug-induced myelosuppression by the CFU-GM assay. ATLA 30 Suppl. 2: 75-79.
Pessina A, Albella B, Bueren J, Brantom P, Casati S, Corrao G, Gribaldo L, Parchment R, Parent-Massin D, Piccirillo M, Rio B, Sacchi S, Schoeters G, Van den Heuvel R (2000) Method development for a prevalidation study of in vitro GM-CFU assay for predicting myelotoxicity. In: Progress in the Reduction, Refinement and Replacement of Animal Experimentation (Balls M, van Zeller A-M, Halder M, eds.). Amsterdam: Elsevier, pp. 679-691.
Pessina A, Albella B, Bueren J, Brantom P, Casati S, Gribaldo L, Croera C, Gagliardi G, Foti P, Parchment R, Parent-Massin D, Sibiril Y, Schoeters G, Van den Heuvel R (2001) Prevalidation of a model for predicting acute neutropenia by colony forming unit granulocyte/macrophage (CFU-GM) assay. Toxicology in Vitro 15: 729-740.
Pessina A, Malerba F, Gribaldo L (2005) Hematotoxicity testing by cell clonogenic assay in drug development and preclinical trials. Current Pharmaceutical Design 11: 1055-1065.
Pfaller W, Troppmair E (2000) Renal transepithelial resistance (TER) and paracellular permeability (PCP) are reliable endpoints to screen for nephrotoxicity. In: Progress in the Reduction, Refinement and Replacement of Animal Experimentation (Balls M, van Zeller A-M, Halder M, eds.). Elsevier Science B.V., Vol. 1, pp. 291-304.
Riddell RJ, Clothier RH, Balls M (1986) An evaluation of three in vitro cytotoxicity assays. Food and Chemical Toxicology 24: 469-471.
Riley RJ, Roberts P, Coleman MD, Kitteringham NR, Park BK (1990) Bioactivation of dapsone to a cytotoxic metabolite: in vitro use of a novel two compartment system which contains human tissues. British Journal of Clinical Pharmacology 30: 417-426.

Seaman CW, Toseland CDN, Maile PA, Francis I, White DJ (1994) A study of the cardiotoxic potential of pharmaceutical compounds on chick myocardial myocyte reaggregate cultures. Toxicology in Vitro 8: 543-544.
Seibert H, Gulden M, Voss J-U (1994) An In Vitro Toxicity Testing Strategy for the Classification and Labelling of Chemicals According to Their Potential Acute Lethal Potency. Toxicology in Vitro 8: 847-850.
Shrivastava R, Delomenie C, Chevalier A, John G, Ekwall B, Walum E, Massingham R (1992) Comparison of in vivo acute lethal potency and in vitro cytotoxicity of 48 chemicals. Cell Biology and Toxicology 8: 157-170.
Shrivastava R, John GW, Rispat G, Chevalier A, Massingham R (1991) Can the in vivo maximum tolerated dose be predicted using in vitro techniques? A working hypothesis. ATLA 19: 393-402.
Sippel H, Estler C-J (1990) Comparative evaluation of hepatotoxic side effects of various new trypanocidal diamidines in rat hepatocytes and mice. Arzneimittel-Forschung/Drug Research 40: 290-293.
Skaanild MT, Clausen J (1989) Estimation of LC50 values by assay of lactate dehydrogenase and DNA redistribution in human lymphocyte cultures. ATLA 16: 293-296.
Smith LM, Clothier RH, Hillidge S, Balls M (1992) Modification of the FRAME Kenacid Blue method for cytotoxicity tests on volatile materials. ATLA 20: 230-234.
Spielmann H, Genschow E, Liebsch M, Halle W (1999) Determination of the starting dose for acute oral toxicity (LD50) testing in the up and down procedure (UDP) from cytotoxicity data. ATLA 27: 957-966.
Tingle MD, Coleman MD, Park BK (1990) An investigation of the role of metabolism in dapsone-induced methaemoglobinaemia using a two compartment in vitro test system. British Journal of Clinical Pharmacology 30: 829-838.
Tingle MD, Coleman MD, Park BK (1991) A comparison between in vitro and in vivo haemotoxicity of dapsone analogues. British Journal of Clinical Pharmacology 31: 594.
Wakuri S, Izumi J, Sasaki K, Tanaka N, Ono H (1993) Cytotoxicity study of 32 MEIC chemicals by colony formation and ATP assays. Toxicology in Vitro 7: 517-521.
Wang GL, Bai NB (1998) Structure-activity relationships for rat and mouse LD50 of miscellaneous alcohols. Chemosphere 36: 1475-1483.
Wilks MF, Kwizera EN, Bach PH (1990) Assessment of heavy metal nephrotoxicity in vitro using isolated rat glomeruli and proximal tubular fragments. Renal Physiology and Biochemistry 13: 275-284.
Williams PD, Laska DA, Tag LK, Hottendorf GH (1988) Comparative toxicities of cephalosporin antibiotics in a rabbit kidney cell line (LLC-RK1). Antimicrobial Agents and Chemotherapy 32: 314.
Wormser U, Ben Zakine S, Eisen O, Nyska A (1990c) The liver slice system: a rapid and simple acute toxicity test for assessment of environmental toxic substances. Proceedings of the Second International Conference on Environmental Analytical Chemistry: January 17-19, 1990, Honolulu, Hawaii, US. US Environmental Protection Agency, US National Institute of Standards and Technology, The Center for Environmental Research, Cornell University.
Wormser U, Ben Zakine S, Nyska A (1990b) Cadmium-induced metallothionein synthesis in the rat liver slice system. Toxicology in Vitro 4: 791-794.
Wormser U, Ben Zakine S, Stivelband E, Eisen O, Nyska A (1990a) The liver slice system: A rapid in vitro acute toxicity test for primary screening of hepatotoxic agents. Toxicology in Vitro 4: 783-789.
Worth AP, Balls M (Eds.) (2002) Alternative (Non-animal) Methods for Chemicals Testing: Current Status and Future Prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30 Suppl. 1.
Zamora PO, Benson JM, Marshall TC, Mokler BV, Li AP, Dahl AR, Brooks AL, McClellan RO (1983) Cytotoxicity and mutagenicity of vapor phase environmental pollutants in rat lung epithelial cells and Chinese hamster ovary cells. Journal of Toxicology and Environmental Health 12: 27.


2.7 Chronic Toxicity

2.7.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements: Repeated dose toxicity studies are covered in Annex VI and Annex VII of the REACH proposals. The 28-day repeated dose study is required at Annex VI level and the 90-day repeated dose study at Annex VII level (the 90-day study may also be requested at Annex VI level if results from a 28-day study suggest the need for further studies). For both studies, the proposals specify the use of one species (preferably a rodent) and the most appropriate route of administration, chosen according to the exposure potential and physicochemical properties of the test substance.

A 28-day study is not necessary if relevant information from a previous 90-day study is available, if the test substance undergoes immediate disintegration and there are sufficient data on the cleavage products, or if relevant human exposure can be excluded. The 90-day study is not needed if a 90-day NOAEL can be extrapolated from the 28-day study, if results from a reliable chronic study are available, or if the test substance is unreactive, insoluble and not inhalable and there is no evidence of absorption or toxicity from a 28-day ‘limit test’.
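The waiver conditions above can be read as two simple boolean rules. The following is a minimal sketch in Python, not an official REACH decision tool: the class name, field names and function names are all hypothetical labels invented here for the conditions listed in the text, and real waiving decisions would rest on expert judgement.

```python
from dataclasses import dataclass


@dataclass
class SubstanceInfo:
    """Hypothetical flags, one per waiver condition named in the text."""
    has_relevant_90_day_study: bool = False
    disintegrates_immediately: bool = False
    sufficient_cleavage_product_data: bool = False
    human_exposure_excluded: bool = False
    noael_90_day_extrapolable_from_28_day: bool = False
    has_reliable_chronic_study: bool = False
    unreactive_insoluble_not_inhalable: bool = False
    # Defaults to True (conservative): evidence of absorption or toxicity
    # is assumed unless a 28-day limit test has shown otherwise.
    absorption_or_toxicity_in_limit_test: bool = True


def needs_28_day_study(s: SubstanceInfo) -> bool:
    """Return False if any of the three 28-day waiver conditions holds."""
    waived = (
        s.has_relevant_90_day_study
        or (s.disintegrates_immediately and s.sufficient_cleavage_product_data)
        or s.human_exposure_excluded
    )
    return not waived


def needs_90_day_study(s: SubstanceInfo) -> bool:
    """Return False if any of the three 90-day waiver conditions holds."""
    waived = (
        s.noael_90_day_extrapolable_from_28_day
        or s.has_reliable_chronic_study
        or (s.unreactive_insoluble_not_inhalable
            and not s.absorption_or_toxicity_in_limit_test)
    )
    return not waived
```

Note that two of the conditions are compound: immediate disintegration only waives the 28-day study when cleavage product data are also sufficient, and the physicochemical argument only waives the 90-day study when the limit test shows no absorption or toxicity.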

Further studies, for example, a long-term repeated toxicity study (≥ 12 months; REACH B30, OECD TG 452) should only be carried out under specific circumstances, for example if severe toxicity is seen in either of the repeated dose tests but the available evidence is inadequate for toxicological and/or risk characterisation.

Method outline:

28-day repeated dose study (oral – REACH B7, OECD TG 407; dermal – REACH B9, OECD TG 410; inhalation – REACH B8, OECD TG 412)

The test substance is administered daily by the specified route for a period of 28 days. During this period the animals are observed for signs of clinical toxicity. Animals observed to be in severe distress are killed and full necropsy carried out. All other animals are killed at the end of the test period and detailed necropsy and histopathology performed.

Number of animals used – 5 of each sex per dose/control group, approximately 40 per study (limit test uses 20 animals for oral and dermal studies).

90-day repeated dose study (oral – REACH B26, OECD TG 408; dermal – REACH B28, OECD TG 411; inhalation – REACH B27, OECD TG 413)

The 90-day study is carried out as for the 28-day study except that the test substance is administered daily for 90 days.

Number of animals used - 10 of each sex per dose/control group, approximately 80 per study (limit test uses 30-40 animals for oral and dermal studies).
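The approximate totals quoted for the 28-day and 90-day studies follow from simple multiplication. The sketch below reproduces them under an assumption that is not stated in the text: a full study is taken to have four groups (three dose levels plus one control) and a limit test two groups (one dose level plus one control). The function name is hypothetical.

```python
def animals_per_study(per_sex_per_group: int, n_groups: int, n_sexes: int = 2) -> int:
    """Total animals = animals per sex per group x number of sexes x number of groups."""
    return per_sex_per_group * n_sexes * n_groups


# Assumed group counts: 4 for a full study, 2 for a limit test.
full_28_day = animals_per_study(5, n_groups=4)    # 40
limit_28_day = animals_per_study(5, n_groups=2)   # 20
full_90_day = animals_per_study(10, n_groups=4)   # 80
limit_90_day = animals_per_study(10, n_groups=2)  # 40, the upper end of the quoted 30-40
```

Under these assumptions the 90-day study uses twice as many animals as the 28-day study purely because the group size doubles from 5 to 10 per sex.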

2.7.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

Chronic toxicity may manifest as a number of endpoints, each of which may be elicited by a number of mechanisms, some (or many) of which may be only poorly understood. Because of this complexity, finding effective alternatives to chronic toxicity testing may be difficult.

There are fewer chronic toxicity data for modelling and comparative purposes than for acute toxicity.

It is probable that no single alternative method will be effective. Given the complexity of the mechanisms of action, a combination of methods, each addressing particular mechanisms and individual endpoints, may be worth investigating.

2.7.3 In vitro methods

Repeat dose studies more accurately reflect potential exposure scenarios in that the effects of lower doses over longer exposure times (i.e. several weeks) are assessed, and there is the possibility of assessing recovery in addition to initial toxicity. Prospects for replacing animal use for chronic and repeat dose toxicity are, at present, very limited and there are no alternative repeat-dose/subchronic tests accepted for regulatory testing. Indeed, very few alternatives are suited to long-term toxicity studies. Sustainability is the key to longer-term in vitro studies and requires cells that do not change structurally or functionally during the course of the study. Hence, primary cells, tissue slices and organ preparations are not normally suited to these types of test, as they last only a few days (Worth and Balls, 2002).

To address this issue, several cell culture based systems are being developed. An in vitro cell culture system for chronic studies must be suited to the assessment of toxicokinetics: it must mimic in vivo blood flow and the generation of tissue fluid, and provide nutrients, oxygen, enzyme cofactors and growth factors. Such systems may allow ADMET studies to be conducted. Single-cell or co-culture-based systems are in development in which in vivo rates of perfusion are mimicked, so that the bioaccumulation of reactive metabolites is avoided and the metabolic kinetics more closely resemble the in vivo situation. For example, metabolically competent hepatocytes maintained as reconstructed collagen sandwich monolayers show extended viability such that repeat-dose, long-term and reversible effects can be studied (Canova et al., 2004). Similarly, cells can be grown around hollow fibre bioreactors through which the culture medium flows, or on perfusable inert polymer support membranes or culture plate inserts such as two-compartment human skin models, where cells cultured on an inert filter can be washed and analysed after an initial exposure using a non-toxic indicator, then re-challenged and re-analysed (Hanley et al, 1999; Pazos et al, 2002; Minuth et al, 1992).

The proceedings of the ECVAM status seminar 2002 also discuss long-term in vitro studies being developed at ECVAM, with the perfusion culture systems EpiFlow and Minucell™ under evaluation (Prieto, 2002). Perfusion systems require cell types that can be maintained in culture for weeks without the need for sub-culturing. Although cell cultures cannot truly mimic whole-body responses to toxicity, they are suited to low cost, high throughput testing (Pfaller et al, 2001).

It is also important for the cells to retain the functional and structural features seen in vivo. Genetic transformation of primary cells and cell lines to prevent senescence and cell death or to create engineered cell lines that stably display the desired characteristics is becoming increasingly commonplace. More recently, it has become possible to guide differentiation of stem cells into specific cell types. Such systems form the basis of the embryonic stem cell test. Stem cells can be programmed to form spheroid cultures such as neurospheres with a self renewing inner core mass of stem cells and an outer layer of neural cells which are suited to long-term assessment of neurotoxicity (Tahti et al., 2003). Such sustainable co-culture systems allow important intercellular networks to be recreated in vitro. This is of significance to long-term toxicity testing since repeated or sustained exposure to the test substance may result in longer range effects than seen following acute exposure.

In these cases it is necessary to ensure that the cells in culture retain both their structural and functional features during the course of the study. They must also be confirmed to accurately represent in vivo equivalent systems. This has been made possible using genomic-based technologies such as microarray and proteomics to monitor protein expression and metabolic profiling to ensure metabolic competence is retained.

The use of reporter systems and reporter-metabolic co-cultures provides mechanistic details which can be directly extrapolated to in vivo effects. For example, oestrogen-responsive reporter cell lines are being developed for long-term use as screening tools for endocrine disruption (Pennie et al., 1998) and as a means to identify biomarkers of endocrine disruption based on microarray and proteomics analysis. A recent proposal, the Predictomics STREP project (ECVAM), ‘Short term in vitro assay for long term toxicity’, is unique in its mechanistic integration of the three levels of cellular dynamics (genome, proteome, cytome) together with advanced cell culture technology to detect early biomarkers of cellular injury.

The advantages of using microarray analysis, proteomics and metabolic profiling, both to confirm the status of the cell system and to make toxicity predictions, are yet to be fully realised. However, it is now recognised that long-term exposure to a test substance may alter protein expression and that individual susceptibility to chemicals is related to variations in genetic make-up and protein expression. A wide range of endpoints needs to be investigated in order for the data to be comparable to results obtained by in vivo testing. Biomarker screening and metabolic profiling may make the assessment of hepatotoxicity, nephrotoxicity, neurotoxicity, cardiotoxicity and respiratory toxicity more manageable by providing surrogate endpoints, since they are suitable for high throughput screening platforms.

2.7.4 In silico approaches

QSARs

There are few or no established QSARs to predict chronic mammalian toxicity.

Expert Systems

There are a number of expert system models of mammalian chronic toxicity, including models from TOPKAT (Gombar et al 1991) and MultiCASE. Other, less well established, systems with chronic toxicity models include PASS, OASIS, ToxBoxes and TerraQSAR. All these systems have probably taken advantage of the large numbers of toxicity data available. How the modellers have dealt with issues of mechanisms and data quality is less clear.

There have been some attempts to validate the predictions from expert systems, and from TOPKAT in particular e.g. Mumtaz et al (1995); OECD (2004); Venkatapathy et al (2004).

There are no true SARs directly for chronic toxicity. However, chronic toxicity could be implied by some of the rules provided by DEREK for organ toxicity. Whilst this is a small number, this may indicate some possibility to expand the rulebase, depending on a careful appraisal of mechanisms of action.

2.7.5 Integrated testing strategies for specific endpoints

The BUAV (Langley, 2001) concedes that in vitro tests for chronic toxicity are less well developed than those for acute toxicity but still proposes an entirely non-animal testing strategy. The first step consists of the identification of likely toxic activity using in silico methods such as QSARs and DEREK. The second step proposes that basal cell cytotoxicity tests are carried out, similar to those for acute toxicity but conducted over a longer period, to identify non-specific toxicity. Step three involves in vitro metabolism studies and computer simulations of ADME, to predict concentration and time course in target organs and tissues. As with acute toxicity, step four consists of specialised in vitro tests on target organs and tissues, but with longer exposure times. This testing strategy, like many proposed by the BUAV, is overly optimistic regarding the likely time required to develop and validate non-animal methods. For chronic toxicity, long-term culturing methods are required and any potential methods need to involve repeat dose administration and measurement of recovery. As such, it will be some considerable time before an entirely non-animal testing strategy is available for regulatory use.

2.7.6 Reduction and Refinement opportunities

For oral toxicity, the 90-day study is preferred as many toxic effects are not seen as early as 28 days via this route of administration. For dermal and inhalation studies, however, the 28-day study is preferred as toxic effects are much more likely to be seen within this timeframe. With the procedures being very stressful for the animals, a shorter testing time, and the use of half the number of animals, is highly recommended (Combes et al, 2004).

The number of animals used for each study should be determined on a case-by-case basis, taking into consideration the requirements for statistical validity. All the tests could be refined using preliminary range-finding or pilot studies, to reduce the incidence of morbidity and death in the main study.

2.7.7 Other Information

ECVAM meetings: First meeting of Task Force on chronic toxicity, Feb 2004 – awaiting report.

2.7.8 Bibliography

Canova N, Kmonickova E, Lincova D, Vitek L, Farghali H (2004) Evaluation of a Flat Membrane Hepatocyte Bioreactor for Pharmacotoxicological Applications: Evidence that Inhibition of Spontaneously Produced Nitric Oxide Improves Cell Functionality. ATLA 32: 25-35.
Combes RD, Gaunt I, Balls M (2004) A scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208.
Gombar VK, Enslein K, Hart JB, Blake BW, Borgstedt HH (1991) Estimation of maximum tolerated dose for long-term bioassays from acute lethal dose and structure by QSAR. Risk Analysis 11: 509-517.
Hanley AB, McBride J, Oehlschlager S, Opara E (1999) In vitro models for investigations of chronic toxicity and reversibility: use of a flow cell bioreactor as a chronic toxicity model system. Toxicology in Vitro 13: 847-851.
Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.
Minuth WW, Dermietzel R, Kloth S, Hennerkes B (1992) A new method culturing renal cells under permanent superfusion and producing a luminal basal medium gradient. Kidney International 41: 215-219.
Mumtaz MM, Knauf LA, Reisman DJ, Peirano WB, Derosa CT, Gombar VK, Enslein K, Carter JR, Blake BW, Huque KI, Ramanujam VMS (1995) Assessment of effect levels of chemicals from quantitative structure-activity relationship (QSAR) models. 1. Chronic lowest-observed-adverse-effect level (LOAEL). Toxicology Letters 79: 131-143.
Pazos P, Fontaner S, Prieto P (2002) Long term in vitro toxicity models: comparisons between a flow cell bioreactor, a static cell bioreactor, and static cell cultures. ATLA 30: 515-523.
Pennie WD, Aldridge TC, Brooks AN (1998) Differential activation by xenoestrogens of ER alpha and ER beta when linked to different response elements. Journal of Endocrinology 158: R11-R14.
Pfaller W, Balls M, Clothier R, Coecke S, Dierickx P, Ekwall B, Hanley BA, Hartung T, Prieto P, Ryan MP, Schmuck G, Šladowski D, Vericat J-A, Wendel A, Wolf A, Zimmer J (2001) Novel advanced in vitro methods for long-term toxicity testing. The report and recommendations of ECVAM Workshop 45. ATLA 29: 393-426.
Prieto P (2002) Barriers, nephrotoxicity and chronic testing in vitro. ATLA Suppl. 2: 101-105.
Tahti H, Nevala H, Toimela T (2003) Refining in vitro neurotoxicity testing: the development of blood-brain barrier models. ATLA 31: 273-276.
Venkatapathy R, Moudgal CJ, Bruce RM (2004) Assessment of the oral rat chronic lowest observed adverse effect level model in TOPKAT, a QSAR software package for toxicity prediction. Journal of Chemical Information and Computer Sciences 44: 1623-1629.
Worth AP, Balls M (2002) Alternative (Non-animal) Methods for Chemicals Testing: Current Status and Future Prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30 Suppl. 1.

2.8 Mutagenicity

2.8.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Mutagenicity studies are required in Annexes V, VI and VIII of the REACH proposals, as shown in Figure 2.8.1. Unlike most other endpoints, a negative mutagenicity result in vitro can be considered sufficient evidence for non-mutagenic potential, but positive results must be confirmed in vivo. At Annex V level the in vitro gene mutation study in bacteria (Ames test) is required. At Annex VI level, further in vitro studies are required: a cytogenicity study and, if both the Ames test and the cytogenicity test are negative, a gene mutation study in mammalian cells. If there are any positive results within these in vitro tests, in vivo mutagenicity studies are required.

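Annex V: Ames test. Negative result: STOP. Positive result: proceed to in vivo tests.
Annex VI: Cytogenicity test. If negative (and the Ames result is negative): gene mutation test. If that is also negative: STOP. Positive result in either test: proceed to in vivo tests.
Annex VIII: In vivo tests.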

Figure 2.8.1 Mutagenicity tests required at the various Annex levels of REACH and the order in which they should be performed.
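The tiered sequence described above can be sketched as a small decision function. This is an illustrative reading of the text, not a regulatory tool: the function name, the string constants and the convention that `None` means "not yet tested" are all assumptions introduced here, and the sketch simplifies by sending any positive in vitro result straight to in vivo testing.

```python
from typing import Optional

IN_VIVO = "in vivo mutagenicity tests required"
STOP = "stop: no further mutagenicity testing"


def next_step(annex: str,
              ames: Optional[bool],
              cytogenicity: Optional[bool] = None,
              gene_mutation: Optional[bool] = None) -> str:
    """Return the next action. True = positive, False = negative, None = not yet tested."""
    if annex not in ("V", "VI"):
        raise ValueError("this sketch only covers the in vitro tiers (Annex V and VI)")
    if ames is None:
        return "perform Ames test"
    if ames:
        return IN_VIVO                      # positive in vitro result must be confirmed in vivo
    if annex == "V":
        return STOP                         # negative Ames test is sufficient at Annex V
    if cytogenicity is None:
        return "perform in vitro cytogenicity test"
    if cytogenicity:
        return IN_VIVO
    if gene_mutation is None:
        # Only reached when both the Ames and cytogenicity tests are negative.
        return "perform in vitro mammalian gene mutation test"
    return IN_VIVO if gene_mutation else STOP
```

For example, `next_step("VI", ames=False, cytogenicity=False)` asks for the mammalian gene mutation test, while `next_step("VI", ames=False, cytogenicity=True)` goes directly to in vivo testing.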

The REACH proposals do not state which in vivo tests are required, but it would be most logical to carry out the in vivo versions of the test(s) which proved positive at in vitro level. Table 2.8.1 shows the available test methods for in vitro and in vivo mutagenicity testing.

REACH/OECD | In vivo / in vitro | Name | Endpoint measured
B13/14/TG471 | In vitro | Bacterial reverse mutation (Ames) test | Gene mutations in bacteria
B10/TG473 | In vitro | Mammalian chromosomal aberration test | Chromosome aberrations
B12/TG474 | In vivo | Mammalian erythrocyte micronucleus test | Structural and numerical chromosomal changes
B11/TG475 | In vivo | Mammalian bone-marrow chromosomal aberration test | Chromosomal aberrations
B17/TG476 | In vitro | Mammalian gene mutation test | Gene mutations
B20/TG477 | In vivo | Sex-linked recessive lethal test in Drosophila melanogaster | Gene mutations in germ line
B22/TG478 | In vivo | Rodent dominant lethal test | Chromosomal aberrations/gene mutations in germinal tissue
B19/TG479 | In vitro | Sister chromatid exchange assay in mammalian cells | DNA damage
B15/TG480 | In vitro | Saccharomyces cerevisiae gene mutation assay | Gene mutations in yeast
B16/TG481 | In vitro | Saccharomyces cerevisiae mitotic recombination assay | Mitotic recombination in yeast
B18/TG482 | In vitro | Unscheduled DNA synthesis in mammalian cells | DNA damage/induction of DNA repair
B23/TG483 | In vivo | Mammalian spermatogonial chromosome aberration test | Inheritable chromosomal aberrations
B24/TG484 | In vivo | Mouse spot test | Mutations in foetal cells
B25/TG485 | In vivo | Mouse heritable translocation assay | Heritable structural and numerical chromosomal changes
B39/TG486 | In vivo | Unscheduled DNA synthesis in mammalian liver cells | DNA damage/induction of DNA repair

Table 2.8.1 – Methods available for the regulatory testing of mutagenicity.

The micronucleus test (MNT) and the chromosome aberration test (CAT) are the most commonly used. It has been suggested that one of these two tests (preferably the MNT), together with an unscheduled DNA synthesis test, are the only in vivo mutagenicity tests that are scientifically necessary (Combes et al, 2004).

Method outline:

Micronucleus Test (MNT) and Chromosome Aberration Test (CAT)

The test substance is administered either orally or by intraperitoneal injections. Samples of either bone marrow or peripheral blood are taken, at least twice (starting between 24 and 48 hours after treatment) and analysed for the appropriate mutagenic effects.

Number of animals used – 5 animals per sex in each dose/control group, approximately 60 per study (limit test uses 30 animals).
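The totals quoted above follow from simple multiplication, which the following minimal Python sketch makes explicit. The group counts (6 for a full study, 3 for a limit test) are assumptions inferred from the quoted totals of 60 and 30 animals; they are not taken from the test guidelines, and the function name is illustrative only.

```python
def animals_per_study(per_sex: int, groups: int, sexes: int = 2) -> int:
    """Total animals = animals per sex x sexes x dose/control groups."""
    return per_sex * sexes * groups


# Group counts are assumptions inferred from the quoted totals, not guideline values.
FULL_STUDY_GROUPS = 6   # assumed: 60 animals / (5 per sex x 2 sexes)
LIMIT_TEST_GROUPS = 3   # assumed: 30 animals / (5 per sex x 2 sexes)

full_study = animals_per_study(per_sex=5, groups=FULL_STUDY_GROUPS)
limit_test = animals_per_study(per_sex=5, groups=LIMIT_TEST_GROUPS)
print(full_study, limit_test)  # prints "60 30"
```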

2.8.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

The concept of mutagenicity is very well established in toxicology and many of the mechanisms that bring about a mutagenic event have been researched thoroughly. There are large numbers of high quality data available for this endpoint, many of which have already been used for modelling purposes. Richard and Williams (2003) provide an excellent overview of sources of data for mutagenicity.

There should, therefore, be excellent opportunities for modelling and replacement for this endpoint.

2.8.3 In vitro methods

A number of in vitro methods are already available for the testing of mutagenicity, as discussed in the established toxicity tests section. These tests are generally used in a battery to decide if further in vivo studies are required. The two most common tests are the Micronucleus test (MNT), which is used to detect cytogenetic damage in bone marrow, and the chromosome aberration test, which detects chromosome damage. In vivo versions of both these tests are currently accepted for regulatory use, but only the chromosome aberration test has an accepted in vitro version.

An in vitro version of the MNT (Garriott et al, 2002; Kirsch-Volders et al, 2003; Phelps et al, 2002) is currently undergoing retrospective validation at ECVAM. The OECD currently has a draft test guideline (OECD TG 487) under review, so it seems highly likely that, if the validation is accepted, a full OECD test guideline could soon be adopted. The in vitro micronucleus test would form part of the in vitro test battery already used and could supersede the in vitro chromosome aberration test (as required at Annex VI level of the REACH testing requirements), as the in vitro MNT can detect both clastogens (which cause breakage of DNA) and aneugens (which induce aneuploidy, an abnormal number of chromosomes). The MNT is also easier to carry out than the chromosome aberration test, as scoring micronuclei is much easier than scoring chromosome damage, making the time required and the cost of in vitro cytogenetics testing equivalent to that of the in vivo method (Worth and Balls, 2002).

A problem encountered in the early development of the in vitro MNT was that apoptosis could interfere with the scoring of micronuclei, leading to false positive results. To overcome this problem a transfected cell line was developed (the CTLL-2 cell line transfected with the bcl2 gene) which could not undergo apoptosis (Meintieres et al, 2003).

Another potential in vitro alternative is the COMET assay (Singh et al, 1988; Tice et al, 2000; Hartmann et al, 2003), although this assay is not as well developed as the in vitro MNT. The COMET assay detects DNA strand breaks which, when subjected to electrophoresis, result in migration of DNA fragments out of the nucleus to form the tail of a comet-like structure. The extent of migration of the DNA fragments, and therefore the size of the comet’s tail, indicates the amount of DNA damage. Double strand breaks are detectable under conditions of DNA denaturation, such as high alkalinity. Advantages of the COMET assay are that it can be carried out in any type of mammalian cell line or tissue, that proliferating cells are not required and that it is very sensitive, being able to detect strand breaks in individual cells. This assay has the potential to replace the in vivo unscheduled DNA synthesis test and, if mutagenicity can be detected in specific tissues, could also potentially replace other in vivo tests. At present it has not been decided whether a formal validation study should be carried out or whether a weight of evidence validation would be more appropriate.

The rapidly developing field of toxicogenomics is also expected to have a large impact on the fields of mutagenicity and carcinogenicity. Changes in gene and protein expression resulting from exposure to a toxic chemical can be measured in virtually any tissue (in vitro or in vivo) and, as our understanding of these processes increases, more relevant tools for assessing these endpoints are likely to be developed. Initial studies suggest that patterns of induced gene expression changes may be characteristic of specific classes of toxic compounds, and that the identification of these distinctive fingerprints could assist in the classification of agents with different mechanisms of action (Aardema and MacGregor, 2002; Farr and Dunn, 1999). Although there are rapid developments within this field, it is unlikely that any new tests for mutagenicity will be available for use during the implementation period of REACH.

2.8.4 In silico approaches

As befits an important endpoint with plenty of data, there is a multitude of models for mutagenicity prediction. It is well beyond the scope of this report to review all these models, and it is not necessary to do so as some excellent reviews exist. For good reviews in this area, refer to the work of Dr Romualdo Benigni, in particular the book (Benigni, 2003) and other review articles (Benigni 2004; Benigni et al 2003). Other good review articles are also available (including Cronin and Dearden 1995c; Lewis et al 2003; Patlewicz et al 2003).

QSARs

As noted above there are many QSARs for mutagenicity. As an example, a search on ISI Web of Science (18th April 2005) using the keywords “QSAR” and “mutagenicity” alone revealed well over 100 articles in the last ten years (this figure is for information only and does not represent a complete search). Many of these papers are based around models for particular classes of compounds, for instance amines and aldehydes (Benigni et al 2003). The topic of QSARs for individual classes of compounds is reviewed well by Passerini (2003) and more generally throughout Benigni (2003). The advantage of models based on chemical class is that there is generally a strong mechanistic basis to them. In addition, they are often transparent and easily understood. The disadvantage is that they are local models, with only a limited applicability domain. In terms of REACH this would result in extremely patchy coverage, and many models would be needed. Benigni has also discussed the application of the Setubal principles to QSARs for mutagenicity in the recent OECD exercise (OECD 2004).

In addition to class-based models, there are also modelling approaches based on much more heterogeneous databases. Often these are multivariate in nature and suffer from a lack of transparency and ease of use. Examples include the neural network approach of Vracko et al (2004) and the other approaches of Hawkins et al (2004) and Votano et al (2004).

Expert Systems

Large heterogeneous databases of compounds are ideal for certain types of expert system modelling. Most expert systems have a variety of models based on mutagenicity data. For expert systems such as MultiCASE there are numerous models available for purchase (see Klopman et al 2003; 2004). The application of a “global” MultiCASE model for in vitro chromosomal aberrations in mammalian cells is described in Annex 4 of the OECD Document (2004) by Niemelä and Wedebye. Fewer models are available in the TOPKAT system, as well as other software such as PASS, ToxScope, etc. The OASIS predictive method is described by Mekenyan et al (2004).

SARs have been developed for mutagenicity starting with the Ashby and Tennant “supermolecule”. Many of these alerts, along with others, have been coded into the DEREK and HazardExpert models. DEREK contains a comprehensive set of rules (over 75) for mutagenicity.

2.8.5 Integrated testing strategies for specific endpoints

A testing strategy for mutagenicity, proposed in Worth and Balls (2002), involves four steps. Firstly, validated QSARs are used to predict mutagenic potential. Secondly, validated in vitro tests for gene mutation and chromosome aberration are carried out; unless the results at this stage are ambiguous, they should be used to classify the test substance. The third step involves a short in vivo test, and the fourth step is only carried out if this gives a positive result. The fourth step uses a combination of a validated germ line mutagenicity test and biokinetic data to finally confirm the mutagenic potential of the test substance.
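The tiered logic of this strategy can be sketched in a few lines of Python. This is only an illustration of the flow described above, not an implementation from Worth and Balls (2002): the function name, the string labels and the handling of a non-positive step 3 result are assumptions made for the example.

```python
from typing import List, Optional, Tuple

POSITIVE, NEGATIVE, AMBIGUOUS = "positive", "negative", "ambiguous"


def worth_balls_steps(
    in_vitro: str,
    short_in_vivo: Optional[str] = None,
    germ_line_confirms: Optional[bool] = None,
) -> Tuple[List[int], str]:
    """Return the steps carried out and the outcome for one test substance."""
    steps = [1, 2]  # step 1: validated QSARs; step 2: validated in vitro tests
    if in_vitro != AMBIGUOUS:
        # Clear in vitro results are used to classify the substance.
        return steps, f"classified from in vitro results: {in_vitro}"
    steps.append(3)  # step 3: short in vivo test
    if short_in_vivo != POSITIVE:
        # Assumption: without a positive in vivo result, testing stops here.
        return steps, "step 4 not required"
    steps.append(4)  # step 4: germ line test plus biokinetic data
    if germ_line_confirms:
        return steps, "mutagenic potential confirmed"
    return steps, "mutagenic potential not confirmed"


print(worth_balls_steps(AMBIGUOUS, POSITIVE, True))
```

The sketch shows why the strategy reduces animal use: in vivo work (steps 3 and 4) is reached only when the in vitro results cannot classify the substance.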

The BUAV (Langley, 2001) proposes a strategy using computational approaches and in vitro testing. The first step uses expert systems such as DEREK, COMPACT and TOPKAT to flag up potential mutagens. The next step requires the use of three regulatory-approved in vitro tests: the Ames test, the mammalian cell mutation test and the chromosome aberration test. If all the tests agree, a classification can be made, but in the event of mixed results further in vitro tests (the in vitro micronucleus test and the COMET assay) are used to finally distinguish mutagenic from non-mutagenic potential.
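The agreement rule applied to the three in vitro tests could be expressed as follows. This is a minimal sketch under the assumption that each test yields a simple positive or negative call; the function name and return strings are hypothetical and do not come from Langley (2001).

```python
def buav_in_vitro_outcome(ames: bool, mammalian_cell_mutation: bool,
                          chromosome_aberration: bool) -> str:
    """Each argument is True for a positive (mutagenic) test result."""
    results = (ames, mammalian_cell_mutation, chromosome_aberration)
    if all(results):
        return "classify as mutagenic"
    if not any(results):
        return "classify as non-mutagenic"
    # Mixed results trigger the additional in vitro tests named in the strategy.
    return "mixed results: run in vitro micronucleus test and COMET assay"


print(buav_in_vitro_outcome(True, False, True))
```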

The Department of Health has produced a testing strategy (DOH, 2000) whereby physical and chemical properties should be determined, and SAR and computer predictions carried out, prior to any testing. Stage one involves using a minimum of two in vitro assays: the bacterial reverse mutation assay and the in vitro micronucleus assay. The need to use the ICH approved in vitro mouse lymphoma assay is assessed on a case-by-case basis, taking into account likelihood of human exposure and production volume. No further testing is required if all three tests are negative. At stage two, in vivo testing is conducted to confirm positive results or where results from stage one are equivocal. Assay choice is assessed case-by-case, although the rodent bone marrow or peripheral blood micronucleus test is recommended. Stage three in vivo testing is limited to those substances that are considered to affect germ cells.
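The staged nature of this strategy can be illustrated with a short Python sketch. It assumes that stage one results are reported as the strings "negative", "positive" or "equivocal", and that stage three is only considered after stage two; both are assumptions made for the example rather than statements from the DOH guidance, and the function name is hypothetical.

```python
from typing import List, Sequence


def doh_stages(stage_one_results: Sequence[str],
               germ_cell_concern: bool = False) -> List[str]:
    """List the stages a substance passes through under the staged strategy."""
    stages = ["stage 1: in vitro assays"]
    if all(result == "negative" for result in stage_one_results):
        return stages  # all stage one assays negative: no further testing
    # Positive or equivocal stage one results are followed up in vivo.
    stages.append("stage 2: in vivo testing")
    if germ_cell_concern:
        stages.append("stage 3: germ cell in vivo testing")
    return stages


print(doh_stages(["negative", "equivocal"]))
```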

2.8.6 Reduction and Refinement opportunities

There are several in vitro methods accepted for regulatory assessment of mutagenicity including the MNT, the CAT and the Unscheduled DNA synthesis assays. The use of in vitro methods, scoring systems and test strategies to pre-screen for mutagens will reduce the number of animals used, especially for tests such as the Mouse heritable translocation assay (for detecting germ-line genotoxicity), which uses a minimum of 500 animals. Single sex assays, data-sharing and the use of historical positive and negative controls would greatly reduce the number of animals required per test.

Initiatives that maximise the amount of information from the study are important both in terms of reduction and refinement. For instance, by taking bone-marrow and/or peripheral blood samples from animals from a 28 day repeated dose study, statistical confidence is improved and the need to recruit new animals to a study avoided. Two recent methods describe how flow cytometry or imaging methods can be used to analyse blood from rodents for the presence of micronuclei (Criswell et al, 2003; Torous et al, 2003). These methods are potentially very useful in this respect, provided that they do not increase the pain and suffering of individual animals.

Wherever possible, the test substance should be administered orally and intraperitoneal injections avoided. In any case, the route of administration should reflect the most probable exposure scenario.

There are transgenic mouse and rat strains which can be used to assess the mutagenic potential of chemicals (Dean et al, 1999; Nohmi et al, 2000; Dashwood, 2003). The principle behind using such strains is that exposure to a mutagen will cause random changes in the genetic make-up of the transgenic animal such that the functional expression of specific genes is altered. These changes can be detected by bioassay of blood or tissue samples. One example is the Big Blue™ strains, where mutations within the lac repressor result in the expression of beta-galactosidase, which catalyses the production of a coloured product. In so far as such transgenic strains are already available, they are less invasive and more sensitive than traditional in vivo genotoxicity methods and can potentially reduce the number of animals required for each test and the exposure dose required. However, their predictive powers are variable and the OECD proposes that they are more suited to be used as part of a testing strategy (OECD, 2005).

2.8.7 Bibliography

Aardema M, MacGregor JT (2002) Toxicology and genetic toxicology in the new era of “toxicogenomics”: impact of “omics” technologies. Mutation Research 499: 13-25

Benigni R (2004) Chemical structure of mutagens and carcinogens and the relationship with biological activity. Journal of Experimental and Clinical Cancer Research 23: 5-8

Benigni R (ed.) (2003) Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and Carcinogens. Boca Raton FL, USA: CRC Press

Benigni R, Passerini L, Rodomonte A (2003) Structure-activity relationships for the mutagenicity and carcinogenicity of simple and alpha-beta unsaturated aldehydes. Environmental and Molecular Mutagenesis 42: 136-143

Combes RD, Gaunt I, Balls M (2004) A scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208

Criswell KA, Krishna G, Zielinski D, Urda GA, Juneau P, Bulera S, Bleavins MR (2003) Validation of a flow cytometric acridine orange micronuclei methodology in rats. Mutation Research 528: 1-18

Cronin MTD, Dearden JC (1995) QSAR in toxicology. 3. Prediction of chronic toxicities. Quantitative Structure-Activity Relationships 14: 329-334

Dashwood RH (2003) Use of transgenic and mutant animal models in the study of heterocyclic amine-induced mutagenesis and carcinogenesis. Journal of Biochemistry and Molecular Medicine 36: 35-42

Dean SW, Brooks TM, Burlinson B, Mirsalis J, Myhr B, Recio L, Thybaud V (1999) Transgenic mouse mutation assay systems can play an important role in regulatory mutagenicity testing in vivo for the detection of site-of-contact mutagens. Mutagenesis 14: 141-151

DOH (2000) Guidance on a strategy for testing of chemicals for mutagenicity. Committee on Mutagenicity of chemicals in food, consumer products and the environment. December 2000: http://www.doh.gov.uk/com/htm

Farr S, Dunn RT (1999) Concise review: gene expression applied to toxicology. Toxicological Sciences 50: 1-9

Garg A, Bhat KL, Bock CW (2002) Mutagenicity of aminoazobenzene dyes and related structures: a QSAR/QPAR investigation. Dyes and Pigments 55: 35-52

Garriott ML, Phelps JB, Hoffman WP (2002) A protocol for the in vitro micronucleus test. I. Contributions to the development of a protocol suitable for regulatory submissions from an examination of 16 chemicals with different mechanisms of action and different levels of activity. Mutation Research 517: 123-134

Hartmann A, Agurell E, Beevers C, Brendler-Schwaab S, Burlinson B, Clay P, Collins A, Smith A, Speit G, Thybaud V, Tice RR (2003) Recommendations for conducting the in vivo alkaline Comet assay. Mutagenesis 18: 45-51

Hawkins DM, Basak SC, Mills D (2004) QSARs for chemical mutagens from structure: ridge regression fitting and diagnostics. Environmental Toxicology and Pharmacology 16: 37-44

Kirsch-Volders M, Sofuni T, Aardema M, Albertini S, Eastmond D, Fenech M, Ishidate M Jr, Kirchner S, Lorge E, Morita T, Norppa H, Surralles J, Vanhauwaert A, Wakata A (2003) Report from the in vitro micronucleus assay working group. Mutation Research 540: 153-163

Klopman G, Chakravarti SK, Harris N, Ivanov J, Saiakhov RD (2003) In-silico screening of high production volume chemicals for mutagenicity using the MCASE QSAR expert system. SAR and QSAR in Environmental Research 14: 165-180

Klopman G, Zhu H, Fuller MA, Saiakhov RD (2004) Searching for an enhanced predictive tool for mutagenicity. SAR and QSAR in Environmental Research 15: 251-263

Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE,

Lewis DFV, Ioannides C, Parke DV (2003) A quantitative structure-activity relationship (QSAR) study of mutagenicity in several series of organic chemicals likely to be activated by cytochrome P450 enzymes. Teratogenesis, Carcinogenesis and Mutagenesis 187-193

Meintieres S, Biola A, Pallardy M, Marzin D (2003) Using CTLL-2 and CTLL-2 bcl2 cells to avoid interference by apoptosis in the in vitro micronucleus test. Environmental Molecular Mutagenesis 41: 14-27

Mekenyan O, Dimitrov S, Serafimova R, Thompson E, Kotov S, Dimitrova N, Walker JD (2004) Identification of the structural requirements for mutagenicity by incorporating molecular flexibility and metabolic activation of chemicals I: TA100 model. Chemical Research in Toxicology 17: 753-766

Nohmi T, Suzuki T, Masumura K (2000) Recent advances in the protocols of transgenic mouse mutation assays. Mutation Research-Fundamental and Molecular Mechanisms of Mutagenesis 455: 191-215

OECD (2005) Draft Detailed Review Paper on Transgenic Rodent Mutation Assays. OECD Environment, Health and Safety Publications. Series on Testing and Assessment, No. XX,

Passerini L (2003) QSARs for individual classes of chemical mutagens and carcinogens. In: Benigni R ed. Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and Carcinogens. Boca Raton FL, USA: CRC Press pp. 81-123

Patlewicz G, Rodford R, Walker JD (2003) Quantitative structure-activity relationships for predicting mutagenicity and carcinogenicity. Environmental Toxicology and Chemistry 22: 1885-1893

Phelps JB, Garriott ML, Hoffman WP (2002) A protocol for the in vitro micronucleus test. II. Contributions to the validation of a protocol suitable for regulatory submissions from an examination of 10 chemicals with different mechanisms of action and different levels of activity. Mutation Research 521: 103-112

Richard AM, Williams CR (2003) Public sources of mutagenicity and carcinogenicity data: use in structure-activity relationship models. In: Benigni R ed. Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and Carcinogens. Boca Raton FL, USA: CRC Press pp. 145-173

Singh NP, McCoy MT, Tice RR, Schneider EL (1988) A simple technique for quantification of low levels of damage in individual cells. Experimental Cell Research 175: 184-191

Tice RR, Agurell E, Anderson D, Burlinson B, Hartmann A, Kobayashi H, Miyamae Y, Rojas E, Ryu JC, Sasaki YF (2000) The single cell gel/comet assay: Guidelines for in vitro and in vivo genetic toxicology testing. Environmental and Molecular Mutagenesis 35: 206-221

Torous DK, Hall NE, Murante FG, Gleason SE, Tometsko CR, Dertinger SD (2003) Comparative scoring of micronucleated reticulocytes in rat peripheral blood by flow cytometry and microscopy. Toxicological Sciences 74: 309-14

Votano JR, Parham M, Hall LH, Kier LB, Oloff S, Tropsha A, Xie QA, Tong W (2004) Three new consensus QSAR models for the prediction of Ames genotoxicity. Mutagenesis 19: 365-377

Vracko M, Mills D, Basak SC (2004) Structure-mutagenicity modelling using counter propagation neural networks. Environmental Toxicology and Pharmacology 16: 25-36

Worth AP, Balls M (Eds.) (2002) Alternative (Non-animal) Methods for Chemicals Testing: Current Status and Future Prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30 Supplement 1.

2.9 Carcinogenicity

2.9.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Carcinogenicity studies may be required in Annex VIII of the REACH proposals if the test substance has a widespread dispersive use, or there is evidence of long-term human exposure, and in addition either the test substance is classified as a category 3 mutagen or repeated dose studies show signs of hyperplasia and/or pre-neoplastic lesions.

The number of animals used and the duration of carcinogenicity tests mean that they can cost in the region of €1 million per test substance and therefore are only carried out when absolutely necessary.

Method Outline:

Although mutagenicity testing detects genotoxic carcinogens it has been recognised that not all carcinogens are genotoxic and therefore non-genotoxic carcinogens need to be tested for.

Rodent bioassay (REACH B32, OECD TG 451)

Two species of animal are required, generally rats and mice, as tumour induction potential is species-dependent. Animals are exposed to the test substance via oral, dermal or inhalation routes depending on its physical and chemical characteristics. The oral route is most commonly used. Daily exposures are normally used but this may vary depending on the route of administration. Tests are conducted for the majority of the animals’ normal lifespan, with termination of the study generally occurring at 18 months for mice and hamsters and 24 months for rats. Full necropsy and histopathology are then carried out.

Number of animals used – 50 animals of each sex per dose/control group, approximately 400 per study.

2.9.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

Carcinogenicity has been viewed as one of the most difficult endpoints to model, for a number of reasons including the complexity of mechanisms, ambiguity of measurement and the relevance of animal testing to humans.

From a mechanistic point of view, there are various and complex mechanisms of carcinogenicity. Some, or many, of these mechanisms are well established. Information on others is lacking. The complexity of these mechanisms, as well as their very specific nature, makes modelling and the search for replacements difficult.

There are many data available for modelling, and many or most of these have been used in the development of QSARs and expert systems. Richard and Williams (2003) provide an excellent overview of sources of data for carcinogenicity. Many of these data are compilations of existing data. With regard to data release, there have been some interesting and significant developments in this area. A collaborative research agreement between the US Food and Drug Administration and MultiCASE has expanded the data available for modelling (Matthews and Contrera 1998); however, due to the commercial nature of these data, they have not been made openly available to the public.

2.9.3 In vitro methods

In vitro methods for modelling the carcinogenic process have been established for a number of years and are based on morphological cell transformation. Until recently, these methods had generally been used to determine mechanisms associated with carcinogenic potential.

The Cell transformation assay (CTA) is used to detect genotoxic and non-genotoxic carcinogens and was reviewed in an ECVAM workshop report (Combes et al, 1999). Cell transformation is the induction of certain phenotypic alterations in cells in tissue culture that are characteristic of tumorigenic cells (Barratt et al, 1986, cited in Combes et al, 1999). Several different cell transformation assays have been devised, involving rodent cell lines e.g. Balb/c 3T3 cells (Kakunaga, 1973; Tsuchiya and Umeda, 1995), immortalized fibroblast cell lines of rodent or human origin e.g. C3H10T1/2 cells (Renzikoff et al, 1973; Landolph, 1985), or systems comprising primary cells, such as those used in the Syrian Hamster Embryo (SHE) cell assay (Di Paolo, 1980; LeBoeuf et al, 1999). Some of these systems are used to detect initiators of carcinogenesis, acting by a genotoxic mechanism via electrophilic interactions with cellular macromolecules such as DNA, while others can be used in a two-stage process to additionally detect tumour promoters acting via non-genotoxic mechanisms. More recently, an in vitro CTA for tumour promoters has been developed (Sasaki et al, 1998; 1990) involving Bhas 42 cells derived from Balb/c 3T3 cells that have been transfected with the v-Ha-ras oncogene.

In all these cases, exogenous metabolising systems for the detection of pro-carcinogens are not routinely used. When metabolising systems have been added, there have been some problems with the assays, although it is known that SHE cells can express some metabolic competence (Stuard et al, 1999). The ECVAM workshop (Combes et al, 1999) concluded that ‘a thorough investigation is needed to ascertain the necessity of using exogenous metabolising systems in the various cell transformation assays.’ Unfortunately, such an investigation has not been conducted, and it is suggested that it should form part of the design of a pre-validation study of the CTA, involving the use of SHE and Balb/c 3T3 cells, currently being planned by ECVAM.

The OECD has produced a detailed review paper on the CTA (OECD, 2001) and comments have been requested from member countries. Dependent on the outcome of current validation studies, a specific test guideline is expected to be produced.

Another potential in vitro method which is able to detect non-genotoxic carcinogens is the Gap junction intercellular communication (GJIC) assay (Rivedal et al, 2000; Blaha et al, 2002). Structure-activity relationships have shown that substances which are able to inhibit GJIC can induce tumours in rodents but that there is no relationship between GJIC inhibition and genotoxic activity. This implies that GJIC is involved in non-genotoxic tumour induction and is a potential endpoint for screening methods for non-genotoxic carcinogenicity. The GJIC assay is currently only used for non-regulatory purposes but, with further investigation, its potential as a toxicological test method could be realised.

2.9.4 In silico approaches

There are many models for the prediction of carcinogenicity and it is appropriate to view these in the same light as models for mutagenicity. It is well beyond the scope of this report to review all these models; indeed, some excellent reviews exist. As with mutagenicity, for good reviews in this area, refer to the work of Dr Romualdo Benigni, in particular the book (Benigni, 2003). Other relevant review articles include Benigni and Zito (2003), Cronin and Dearden (1995c), Patlewicz et al (2003) and Richard and Benigni (2003).

QSARs

As noted above there are many QSARs for carcinogenicity. As with mutagenicity, many of these studies are based around models for particular classes of compounds, for instance amines (Benigni et al 2000). The topic of QSARs for individual classes of compounds is well reviewed by Passerini (2003) and more generally throughout Benigni (2003). The advantage of models based on chemical class is that there is generally a strong mechanistic basis to them. In addition, they are often transparent and easily understood. The disadvantage is that they are local models, with only a limited applicability domain. In terms of REACH this would result in extremely patchy coverage, and many models would be needed. Benigni has also discussed the application of the Setubal principles to QSARs for mutagenicity in the recent OECD exercise (OECD 2004).

In addition to class based models, there are also modelling approaches based on much more heterogeneous databases. Often these are multivariate in nature and suffer from the problems of lack of transparency and ease of use. Examples of these approaches include Contrera et al (2003) as well as many others.

Expert Systems

As with mutagenicity, large heterogeneous databases of compounds are ideal for certain types of expert system modelling. Most expert systems have a variety of models based on carcinogenicity data. For expert systems such as MultiCASE there are numerous models available for purchase (see Klopman et al 2004; 2005). As noted above, the MCASE systems have benefited from the inclusion of FDA regulatory data, which are claimed to have improved the predictive ability of these models (Matthews and Contrera 1998). Fewer models are available in the TOPKAT system, as well as other software such as PASS, ToxScope, etc.

OncoLogic is a rule-based system that has been developed by the US EPA. At the time of writing this report a version is being prepared for free distribution by the EPA (although it is not yet available). It is often viewed as one of the most comprehensive pieces of software due to its chemical coverage. More information on OncoLogic is available from Woo and Lai (2005).

SARs have been developed for carcinogenicity starting with the Ashby and Tennant “supermolecule”. Many of these alerts, along with others, have been coded into the DEREK and HazardExpert models. DEREK contains a set of rules (over 45) for carcinogenicity.

The prediction of carcinogenic potency has also been evaluated through the two comparative exercises from the US National Toxicology Program. The results from these were frankly less than encouraging, with human experts often performing better than computational systems. Both exercises are well described by Benigni (2004).

2.9.5 Integrated testing strategies for specific endpoints

Following on from the mutagenicity testing strategy proposed in Worth and Balls (2002), if a test substance is found to be mutagenic after both in vitro and in vivo testing it should not be subjected to further studies for carcinogenesis and should be classified as a mutagenic carcinogen.

The BUAV (Langley, 2001) follows on from its mutagenicity strategy to propose the cell transformation assay using Syrian hamster embryo cells, followed by mechanism-based in vitro studies such as the intercellular communication assay if equivocal results are reported from the previous studies.

2.9.6 Reduction and Refinement opportunities

The 1981 test guideline for carcinogenicity is in urgent need of revision in that the test should not be carried out if a chemical has already proved to be genotoxic, since such a chemical should already be considered to be carcinogenic. The only exception to this rule is when there is unavoidable and widespread exposure to the chemical (Combes et al, 2004).

Inhalation studies should be avoided if possible and whole body exposure should only be used as a last resort. The oral route of administration is preferred.

Results from a 90-day sub-chronic study should be taken into account if the rodent bioassay is conducted. If there is evidence of tumorigenesis or positive genotoxicity from the sub-chronic study then the bioassay should not be conducted; otherwise dosing for the bioassay should be based on the sub-chronic results so that fewer treatment groups are needed (Combes et al, 2004). If chronic studies are also being carried out, use of the combined chronic toxicity/carcinogenicity study (OECD TG 453) is recommended.

There have been discussions about the relevance of using two species in the rodent bioassay as non-genotoxic chemicals display species-specific tumorigenic potential. Carcinogenic responses seen in rodents often have no corresponding human risk; of the hundreds of pharmaceuticals which have proven positive in the rodent bioassay only 20 are known to be carcinogenic to humans. The International Conference on the Harmonisation of Technical Requirements for the Registration of Pharmaceuticals for human use (ICH) agreed that, in certain circumstances, only one long term rodent bioassay is required (preferably in the rat as all known human carcinogens are also positive in rats) along with a supplementary study to provide extra information (ICH, 1997). This second study may use a transgenic mouse model.

The use of transgenic mice in the rodent bioassay (Tennant et al, 1995; van Zeller and Combes, 1999; Battershill and Fielder, 1998; Blain et al, 1998) would lead to shorter assay times (6-9 months instead of 2 years) and fewer animals would be required (20-25 animals of each sex per treatment/control group instead of 50). However, the transgenic rodent bioassay has failed to identify some known carcinogens such that some caution is needed when deciding whether it is suitable. A study by the Health and Environmental Sciences Institute (HESI) branch of the International Life Sciences Institute (ILSI) was undertaken between 1996 and 2000 to validate a number of transgenic mouse models (rasH2, TgAC, p53, XPA, XPA/p53) proposed for use within the ICH guideline. The outcome of this study was that although the models could generally predict positive and negative human carcinogens, the organ specificity of the models did not correlate well with potential carcinogenicity at specific sites in humans. Using these models could significantly limit insight into potential mechanisms of action (Cohen et al, 2001) and therefore the validation of these transgenic models for complete replacement of the two year mouse bioassay was unsuccessful.
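The scale of the possible reduction in animal numbers can be estimated with a short calculation. The sketch below assumes four dose/control groups for both designs, a figure inferred from the conventional bioassay totals quoted in this section (50 animals per sex per group, approximately 400 per study) rather than taken from a guideline; the function name is illustrative only.

```python
def animals_per_bioassay(per_sex: int, groups: int = 4, sexes: int = 2) -> int:
    """Total animals = animals per sex x sexes x groups (4 groups assumed)."""
    return per_sex * sexes * groups


conventional = animals_per_bioassay(50)      # conventional two-year bioassay
transgenic_low = animals_per_bioassay(20)    # lower end of the transgenic range
transgenic_high = animals_per_bioassay(25)   # upper end of the transgenic range

# Fractional saving relative to the conventional design.
saving_min = 1 - transgenic_high / conventional
saving_max = 1 - transgenic_low / conventional

print(f"{conventional} vs {transgenic_low}-{transgenic_high} animals "
      f"({saving_min:.0%} to {saving_max:.0%} fewer)")
```

Under these assumptions the transgenic design would use roughly half the animals of the conventional bioassay, before any account is taken of the shorter study duration.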

The OECD has reviewed the use of the transgenic rodent bioassay for both mutagenicity and carcinogenicity testing with reference to results for a series of chemicals (OECD, 2005). The results indicate that such models are better suited to the assessment of some forms of carcinogenicity than others. A testing strategy is essential if unnecessary or inappropriate use of the transgenic bioassay is to be avoided.

2.9.7 Bibliography

Battershill JM, Fielder RJ (1998) Mouse-specific carcinogens: an assessment of hazard and significance for validation of short-term carcinogenicity bioassays in transgenic mice. Human and Experimental Toxicology 17: 193-205

Benigni R (2004) Prediction of human health endpoints: mutagenicity and carcinogenicity. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 173-192.

Benigni R, Giuliani A (2003) Putting the Predictive Toxicology Challenge into perspective: reflections on the results. Bioinformatics 19: 1194-1200

Benigni R, Giuliani A, Franke R, Gruska A (2000) Quantitative structure-activity relationships of mutagenic and carcinogenic aromatic amines. Chemical Reviews 100: 3697-3714

Benigni R, Zito R (2003) Designing safer drugs: (Q)SAR-based identification of mutagens and carcinogens. Current Topics in Medicinal Chemistry 3: 1289-1300

Blaha L, Kapplova P, Vondracek J, Upham B, Machala M (2002) Inhibition of gap-junctional intercellular communication by environmentally occurring polycyclic aromatic hydrocarbons. Toxicological Sciences 65: 43-51

Blain PG, Battershill JM, Venitt S, Cooper CC, Fielder RJ (1998) Consideration of short-term carcinogenicity tests using transgenic mouse models. Mutation Research 403: 259-263

Cohen SM, Robinson D, MacDonald J (2001) Alternative models for carcinogenicity testing. Toxicological Sciences 64: 14-17

Combes R, Balls M, Curren R, Fischbach M, Fusenig N, Kirkland D, Lasne C, Landolph J, LeBoeuf R, Marquardt H, McCormick J, Muller L, Rivedal E, Sabbioni E, Tanaka N, Vasseur P, Yamasaki H (1999) Cell transformation assays as predictors of human carcinogenicity. ECVAM Workshop Report 39. ATLA 27: 745-767

Combes RD, Gaunt I, Balls M (2004) A scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208

Contrera JF, Matthews EJ, Benz RD (2003) Predicting the carcinogenic potential of pharmaceuticals in rodents using molecular structural similarity and E-state indices. Regulatory Toxicology and Pharmacology 38: 243-259

Di Paolo JA (1980) Quantitative in vitro transformation of Syrian golden hamster embryo cells with the use of frozen stored cells. Journal of the National Cancer Institute 64: 1485-1489

ICH (1997) Harmonised Tripartite Guideline, Testing for Carcinogenicity of Pharmaceuticals. International Conference on Harmonisation of Technical Requirements for the Registration of Pharmaceuticals for human use.

Kakunaga T (1973) A quantitative system for assay of malignant transformation by chemical carcinogens using a clone derived from Balb 3T3. International Journal of Cancer 12: 463-473.

Klopman G, Chakravarti SK, Zhu H, Ivanov JM, Saiakhov RD (2004) ESP: A method to predict toxicity and pharmacological properties of chemicals using multiple MCASE databases. Journal of Chemical Information and Computer Science 44: 704-715.

Klopman G, Ivanov J, Saiakhov R, Chakravarti S (2005) MC4PC – An artificial intelligence approach to the discovery of quantitative structure-toxic activity relationships. In Helma C (ed) Predictive Toxicology. CRC Press, Boca Raton FL., pp. 423-457.

Landolph RA (1985) Chemical transformation in C3H10T1/2 CCl18 mouse embryo fibroblasts: historical background, assessment of transformation assay and evolution and optimization of the transformations assay protocol. In Transformation assay of established cell lines: Mechanisms and Application, IARC Scientific Publications No. 67 (ed. T Kakunaga and H Yamasaki) pp 185-203. Lyon, France.

Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE.

LeBoeuf RA, Kerckaert KA, Aardema MJ, Isfort RJ (1999) Use of Syrian hamster embryo and Balb/c 3T3 cell transformation for assessing the carcinogenicity potential of chemicals. In The Use of Short- and Medium-term tests for Carcinogenic Hazard Evaluation. IARC Scientific Publications No. 146 (ed DB McGregor, JM Rice and S Venitt), pp 409-425. Lyon, France

Matthews EJ, Contrera JF (1998) A new highly specific method for predicting the carcinogenic potential of pharmaceuticals in rodents using enhanced MCASE QSAR-ES software. Regulatory Toxicology and Pharmacology 28: 242-264

OECD (2001) Detailed Review Paper on Non-Genotoxic Carcinogens Detection: The Performance of In-Vitro Cell Transformation Assays. OECD, Environment Health and Safety Publications Series on Testing and Assessment No. 31, May 2001

OECD (2005) Draft Detailed Review Paper on Transgenic Rodent Mutation Assays. OECD Environment, Health and Safety Publications, Series on Testing and Assessment, No. XX

Passerini L (2003) QSARs for individual classes of chemical mutagens and carcinogens. In: Benigni R ed. Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and Carcinogens. Boca Raton FL, USA: CRC Press pp. 81-123.

Patlewicz G, Rodford R, Walker JD (2003) Quantitative structure-activity relationships for predicting mutagenicity and carcinogenicity. Environmental Toxicology and Chemistry 22: 1885-1893

Reznikoff T, Bertram JS, Brankow DW, Heidelberger C (1973) Quantitative and qualitative studies on chemical transformation of cloned C3H mouse embryo cell, sensitive to post-confluence inhibition of cell division. Cancer Research 33: 3239-3249

Richard AM, Benigni R (2002) AI and SAR approaches for predicting chemical carcinogenicity: Survey and status report. SAR and QSAR in Environmental Research 13: 1-19

Richard AM, Williams CR (2003) Public sources of mutagenicity and carcinogenicity data: use in structure-activity relationship models. In: Benigni R ed. Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and Carcinogens. Boca Raton FL, USA: CRC Press pp. 145-173.

Rivedal E, Mikalsen SO, Sanner T (2000) Morphological transformation and effect on gap junction intercellular communication in Syrian hamster embryo cells as screening tests for carcinogens devoid of mutagenic activity. Toxicology In Vitro 14: 185-192

Sasaki K, Mizusawa H, Ishidate M, Tanaka N (1990) Transformation of ras transfected BALB 3T3 clone (Bhas 42) by promoters: Application for screening and specificity of promoters. Toxicology in Vitro 4: 657-659

Sasaki K, Mizusawa H, Ishidate M (1988) Isolation and characterization of ras-transfected BALB/3T3 clone showing morphological transformation by 12-O-tetradecanoylphorbol-13-acetate. Japanese Journal of Cancer Research 79: 921-930

Stuard SB, Kerckaert GA, Lehman-McKeeman LD (1999) Characterisation of the metabolic capacity of Syrian Hamster Embryo (SHE) Cells. Toxicological Sciences 48: 366

Tennant RW, French JE, and Spalding JW (1995) Identifying chemical carcinogens and assessing potential risk in short-term bioassays using transgenic mouse models. Environmental Health Perspectives 103: 942-950

Tsuchiya T, Umeda M (1995) Improvement in the efficiency of the in vitro transformation assay method using Balb/3t3 A31-1-1 cells. Carcinogenesis 8: 1887-1894

van Zeller A-M and Combes RD (1999) Transgenic mouse bioassays for carcinogenicity testing: a step in the right direction? ATLA 27: 839–846.

Woo Y-T, Lai DY (2005) OncoLogic: A mechanism-based expert system for predicting the carcinogenic potential of chemicals. In Helma C (ed) Predictive Toxicology. CRC Press, Boca Raton FL., pp. 385-413.

Worth AP, Balls M (Eds.) (2002) Alternative (Non-animal) Methods for Chemicals Testing: Current Status and Future Prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30 Supplement 1.

2.10 Development / Reproductive Toxicity

2.10.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

The reproductive/developmental screening study is required in Annex VI of the REACH proposals with a full developmental toxicity study carried out if there are positive results from the screening study. At Annex VII level, a full developmental study is required unless it has already been performed. Depending on the outcome of these tests, it may be necessary to conduct further studies on a second non-rodent species. The two generation reproductive study is required at Annex VII level if results from 28-day or 90-day repeated dose studies suggest adverse effects on reproductive organs. Otherwise this test should be carried out at Annex VIII level. Figure 2.10.1 shows diagrammatically the order in which these tests should be carried out.

[Flow diagram. Annex VI: Reproductive/Developmental Screening Study (-ve: STOP; +ve: proceed to Annex VII). Annex VII: Developmental Study (Rodent), 28/90 day repeated dose results and Developmental Study (Second species), linked by -ve (STOP) and +ve branches. Annex VIII: Two Generation Reproductive Study.]

Figure 2.10.1. Reproductive and Developmental toxicity tests required at the various Annex levels of REACH and the order in which they should be performed.

These studies are not necessary if the test substance is known to be either a genotoxic carcinogen or a germ cell mutagen and appropriate risk management measures are implemented. Two generation studies are not necessary if the test substance is shown to be of low toxicological activity, if no systemic absorption occurs via any relevant route of exposure, or when there is no significant human exposure to the test substance.
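The order of testing described above can be read as a small set of decision rules. The following Python sketch is illustrative only and is not part of the REACH proposals: the `Substance` record, its field names and the encoding of Annex VI, VII and VIII as 6, 7 and 8 are all hypothetical, and the three conditions for waiving the two generation study are treated as alternatives, following the wording of this section. The second species developmental study is left out because the criteria that trigger it are not stated here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Substance:
    """Hypothetical record of what is already known about a test substance."""
    annex_level: int  # 6, 7 or 8, standing for Annex VI, VII or VIII
    screening_positive: bool = False
    developmental_study_done: bool = False
    repeated_dose_reproductive_effects: bool = False  # from 28-day or 90-day studies
    genotoxic_carcinogen_or_germ_cell_mutagen: bool = False
    risk_management_in_place: bool = False
    low_toxicological_activity: bool = False
    no_systemic_absorption: bool = False
    no_significant_human_exposure: bool = False


def required_studies(s: Substance) -> List[str]:
    """Return the studies triggered at the substance's Annex level."""
    if s.annex_level not in (6, 7, 8):
        raise ValueError("annex_level must be 6, 7 or 8")

    # General waiver: known genotoxic carcinogen or germ cell mutagen
    # with appropriate risk management measures implemented.
    if s.genotoxic_carcinogen_or_germ_cell_mutagen and s.risk_management_in_place:
        return []

    studies: List[str] = []

    if s.annex_level == 6:
        # Screening study first; a positive result triggers the full study.
        studies.append("screening")
        if s.screening_positive:
            studies.append("developmental")
    elif not s.developmental_study_done:
        # Annex VII and above: full developmental study unless already performed.
        studies.append("developmental")

    # Waiver specific to the two generation study (read here as alternatives).
    two_generation_waived = (
        s.low_toxicological_activity
        or s.no_systemic_absorption
        or s.no_significant_human_exposure
    )
    # Due at Annex VII only if repeated dose results raise concern, else at Annex VIII.
    two_generation_due = s.annex_level >= 8 or (
        s.annex_level == 7 and s.repeated_dose_reproductive_effects
    )
    if two_generation_due and not two_generation_waived:
        studies.append("two-generation")

    return studies
```

For example, under these assumptions a substance at Annex VII level with adverse effects on reproductive organs in a 90-day study would be assigned both the developmental and the two generation study.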

Method outline: The REACH proposals require that the reproductive/developmental screening study is carried out if there is no evidence from in silico or in vitro studies to suggest that the test substance is a developmental toxicant. If there is any evidence of potential developmental toxicity the full developmental toxicity study must be carried out. If there are any indications of reproductive toxicity from any previous repeated dose toxicity studies then the two-generation reproductive toxicity study should also be performed.

Reproductive/Developmental screening study (OECD TG 421)

The test substance is administered orally, for a minimum of 4 weeks for males (two weeks prior to mating, during the mating period and two weeks post mating) and throughout the study for females (starting two weeks prior to mating). The length of the study is subject to the fecundity of the females but is generally expected to last approximately 54 days (14 days prior to mating, up to 14 days mating, 22 days gestation and 4 days lactation). During the administration period, the animals are observed each day for signs of clinical toxicity. Animals observed to be in severe distress are killed and full necropsy carried out. All other animals are killed at the end of the test period, and full necropsy and histopathology performed.
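The expected study length quoted above is the sum of the four phases. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and treating the mating period as the only variable phase (between 1 and 14 days) is an assumption made for illustration.

```python
# Phase lengths in days, as quoted for OECD TG 421 above.
PRE_MATING_DAYS = 14
MAX_MATING_DAYS = 14
GESTATION_DAYS = 22
LACTATION_DAYS = 4


def study_length_days(mating_days: int = MAX_MATING_DAYS) -> int:
    """Total study length for a given mating period (assumed 1 to 14 days)."""
    if not 1 <= mating_days <= MAX_MATING_DAYS:
        raise ValueError("mating period must be between 1 and 14 days")
    return PRE_MATING_DAYS + mating_days + GESTATION_DAYS + LACTATION_DAYS
```

With the full 14-day mating period this gives the 54 days stated in the text; a shorter mating period shortens the study accordingly.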

Number of animals used – 10 of each sex per dose/control group plus their offspring, approximately 80 per study plus offspring (limit test uses 40 animals plus offspring).

Developmental study (REACH B31, OECD TG 414)

The test substance is administered via the most appropriate route (dependent on exposure information) daily to pregnant animals (mating is either natural or by artificial insemination) throughout the period of organogenesis. One day prior to term, the mother is sacrificed, the uterus removed and foetuses examined for visceral or skeletal abnormalities.

Number of animals used – 12-20 pregnant females (depending on species) per treatment/control group, approximately 48-80 pregnant females per study plus their offspring (limit test uses 24-40 pregnant females plus offspring).

Two-generation reproductive study (REACH B35, OECD TG 416)

The test substance is administered daily via the most appropriate route as determined from exposure information (preferably orally in diet or water) to both male and female animals. Males of the parental generation should be dosed during growth and for at least one complete spermatogenic cycle to observe adverse effects on spermatogenesis. Females of the parental generation should be dosed for at least two complete oestrous cycles to observe any adverse effects on oestrus. At weaning, administration of the test substance is continued to the F1 offspring through adulthood, mating and production of an F2 generation, up until the F2 generation is weaned. All animals should be observed daily for behavioural changes, problems with labour and all signs of toxicity. All parental and F1 offspring should be killed when they are no longer needed to allow assessment of reproductive effects and F2 offspring should be killed after weaning. All animals undergo full necropsy and histopathology.

Number of animals used – 20 females and 10-20 males, plus two generations of offspring, per treatment/control group, approximately 80 females and 40-80 males per study plus two generations of offspring (limit test uses 60-80 animals plus two generations of offspring).
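The per-study totals quoted for the three guideline studies follow from the group sizes. The Python sketch below reproduces them under two assumptions that are inferred from the quoted totals rather than stated in the text: a full study uses four groups (three dose groups plus one control) and a limit test uses two (one dose group plus one control). Offspring are not counted. The names are hypothetical.

```python
from typing import Dict, Tuple

FULL_STUDY_GROUPS = 4  # assumption: three dose groups plus one control
LIMIT_TEST_GROUPS = 2  # assumption: one dose group plus one control

# (minimum, maximum) parent animals per treatment/control group, from the text.
ANIMALS_PER_GROUP: Dict[str, Tuple[int, int]] = {
    "OECD TG 421": (20, 20),  # 10 males + 10 females
    "OECD TG 414": (12, 20),  # pregnant females, depending on species
    "OECD TG 416": (30, 40),  # 20 females + 10-20 males
}


def study_total(guideline: str, groups: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) number of parent animals per study."""
    low, high = ANIMALS_PER_GROUP[guideline]
    return low * groups, high * groups
```

On these assumptions the two generation study uses 120-160 parent animals in total (80 females and 40-80 males), and its limit test 60-80, which agrees with the figures given above.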

2.10.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

Developmental and reproductive toxicology are viewed as complex endpoints. A variety of different effects may be brought about following exposure to toxicants. The issue is further complicated by the fact that the same chemical may cause different effects depending on the type of exposure and the time during gestation at which it occurs. Because of these factors, knowledge across the broad spectrum of effects may be considered only moderate.

There are comparatively few reproductive or developmental toxicity data available for modelling or replacement purposes.

2.10.3 In vitro methods

Three in vitro tests for developmental toxicity testing have recently been validated by ECVAM and endorsed for use within screening studies. These tests are the Embryonic stem cell test (EST; Genschow et al, 2004), the Post implantation rat whole embryo test (Piersma et al, 2004) and the Micromass test (Spielmann et al, 2004). Although they are presently not capable of completely replacing the in vivo developmental toxicity tests, their use as screening tests within a testing strategy could reduce the number of animals required.

Developmental toxicity, particularly that which leads to teratogenicity, can be due to a variety of mechanisms, including reactive binding of chemicals with nucleic acids and proteins leading to mutagenesis and carcinogenesis, as well as disruption of tubulin, microfilaments and the cytoskeleton (Schardein, 2000; Shepard, 1998). As such, it would not be surprising if metabolites of chemicals were responsible for some of the effects observed. Incorporating metabolising systems into the in vitro developmental toxicity tests would, therefore, be expected to improve their predictivity, as has been shown for chemicals such as cyclophosphamide in the micromass assay (Wiger et al, 1989). The ECVAM validation study focused entirely on test chemicals that do not require metabolic activation, since the addition of metabolising systems might have complicated the standardisation of these tests (Genschow et al, 2004). Thus, further use of these assays will largely depend on how they can be adapted to routinely include metabolising systems to test indirect-acting agents, and how relevant and reliable these modified protocols turn out to be. For the testing of large numbers of chemicals, it should be noted that the EST: a) is the only method that could be adapted for relatively high throughput; and b) does not involve the killing of large numbers of pregnant animals. It is therefore recommended that an improved protocol for this test should be developed as a first priority for future validation. There are plans for further test development incorporating metabolism, followed by a validation study in the recently initiated ReProTect Project (outlined below).

Large numbers of in vitro methods are under development for reproductive and developmental toxicity endpoints; the most advanced are shown in Table 2.10.3. The reproductive cycle is very complex, and no single in vitro test will ever be able to replace any of the in vivo tests. It is more likely that batteries of complementary tests will be required, covering the many different aspects of both male and female fertility, and their reproductive organs and functions (Brown et al, 1995). These batteries will then form part of integrated testing strategies to screen out the most toxic chemicals prior to any in vivo testing and may eventually replace the in vivo tests. However, much research has to be undertaken before this will be realised.

Another area of concern for reproductive toxicity is that of endocrine disruption. Over the last decade or so, it has become apparent that exposure to certain chemicals can have harmful effects such as modulating or disrupting hormones of the endocrine system. Initially, there was concern for the general effects of chemicals on the environment, but more recently the issue has broadened to include chemicals which can affect the endocrine system in humans and wildlife, and which therefore could cause developmental and reproductive disorders. At present there are no validated test guidelines for endocrine disruption, although two in vivo assays are undergoing validation studies at the US EPA. The Uterotrophic assay (Dorfman and Dorfman, 1950) is used to detect substances acting via interaction with the oestrogen receptor or by interfering with normal binding to this receptor by endogenous hormones. The endpoint measured is uterine weight change in immature (intact) or ovariectomised rodents, following administration of the test substance. To measure disruption of the male endocrine system the Hershberger assay (Hershberger et al, 1953) is used, which detects the ability of a test substance to elicit agonistic or antagonistic effects on the androgen receptor. These tests, like most for human endocrine disruption, focus on the oestrogen and androgen hormones and their functions, although endocrine disruption can also affect thyroid function and numerous other processes and involve many different mechanisms and nuclear receptor pathways.

A large number of in vitro assays are under development for the screening of endocrine disrupting chemicals (including some of the tests listed in the reproductive toxicity section of Table 2.10.3). Evaluation of some of these tests is on-going at the US EPA (http://www.epa.gov/scipoly/oscpendo/edsparchive/standards.htm), and as part of the work of the OECD in validating methods for Endocrine Disruptors.

In vitro tests for endocrine disruption fall into four main categories (Baker, 2001):

- receptor binding assays, which measure binding affinity of a test substance to a hormone receptor
- cell proliferation assays, which measure the ability of a test substance to stimulate growth of hormone responsive cells
- reporter gene assays, which measure the ability of a test substance to activate transcription of a reporter gene construct in cells
- analysis of hormone-sensitive gene expression, which measures the ability of a test substance to induce expression of hormone-sensitive genes

To date, little effort has been made to incorporate metabolising systems into in vitro endocrine disrupter tests, even though there is much evidence to show that endogenous and exogenous compounds are extensively metabolised by Phase I and Phase II enzymes, both in the liver and in hormonally active tissues (Tanaka et al, 2000; MacGregor et al, 2001). The absence of metabolism could lead to either false-positive results (lack of detoxification) or false-negative results (lack of activation) and as such it is recommended that the metabolising capacity of cell systems used for tests be assessed (Combes, 2004).

Method | Test System | Endpoint | Developer/References

Developmental toxicity

Embryonic stem cell test (EST) (INVITTOX No. 113) | 3T3 cells or ES cells; a permanent cell line derived from mouse embryonic stem cells | Inhibition of differentiation of ES cells (measured by microscopy); inhibition of ES and 3T3 cell viability (measured by MTT) | Spielmann et al, 1997; Genschow et al, 2000
Post implantation rat whole embryo test (INVITTOX No. 123) | Rat embryos | Assessment of morphology | Piersma, 1993; Van Maele-Fabry et al, 1992
Micromass test (INVITTOX No. 122/114) | Rat micromass cultures of limb bud | Inhibition of formation of foci of differentiating chondrocytes in micromass culture | Flint, 1983, 1993
Rabbit articular chondrocyte functional toxicity test (INVITTOX No. 41) | Rabbit articular chondrocytes | Effect on the production of proteoglycan by the cells, as detected by the dye Alcian Blue | Hassell and Horigan, 1982
Lung cell assay (INVITTOX No. 48) | Human foetal lung fibroblasts or rat lung epithelial cells | Effect of the test compound on total protein synthesis, and DNA synthesis | Barile et al, 1988, 1989, 1990
Chick embryo retina cell culture | Chick embryo neural retina cells | Effects on cell-cell recognition and interaction, growth and differentiation | Daston et al, 1991, 1995
Frog Embryo Teratogenesis Assay - Xenopus (FETAX) | Xenopus frog embryos | Abnormalities of the developing tadpoles | Dawson and Bantle, 1987; Fort et al, 1998

Reproductive toxicity

Analysis of progesterone produced by a modified Leydig cell line | Modified Leydig cell line (cLHMRMA 10) | Evaluation of the effects toxicants have on the process of cAMP-controlled steroidogenesis, cytotoxicity, proliferation and apoptosis | Chaudhary and Stocco, 1989; Freeman et al, 1993; Nikula et al, 1999
Follicle culture bioassay | Mouse follicle | Changes in steroidogenesis and receptor expression profiles in relation to oocyte quality | Cortvrindt et al, 1996; Cortvrindt and Smitz, 2002
Human adreno-cortical carcinoma cell line H295R | Human adreno-cortical carcinoma cell line H295R | Assesses the complete steroidogenesis pathway, including aromatase activity from cholesterol to 17β | Sanderson et al, 2000
Aromatase assay | Human placental JEG-3 and JAR choriocarcinoma cell lines | The release of tritiated water from the substrate [1β-3H]-androstenedione | Bueggemeier and Katlic, 1990; Letcher et al, 1999
Testis slices assay | Sliced testes | Measurement of [...] and lactate dehydrogenase | EPA, 2003
Computer assisted sperm analysis | Spermatozoa | Fertility potential, viability, motility, velocity, motion, and morphology of mammalian semen analysed in real time | Hong et al, 1981; Weiss, 1989; Seibert and Gosch, 1990
Hamster egg penetration test (HEPT) / Hypo-osmotic swelling test (HOS) | Hamster egg | Information about the functional status of sperm including penetration | Francavilla et al, 2000, 2002; Zahalsky et al, 2003
Bovine spermatozoa cytotoxicity test (INVITTOX No. 21) | Bovine spermatozoa | Change in spermatozoa motility and velocity and ATP contents | Seibert et al, 1989; Seibert and Gosch, 1990
Blood-testis barrier model | Sertoli cells | Secretion of lactate and inhibin B, cell viability | Monsees et al, 2000
Sertoli cell co-cultures | Sertoli cells cultured with spermatocytes, Leydig cell or peritubular cells | Disruption of testicular spermatogenesis | Fielder et al, 1997
In vitro fertilisation sperm toxicity test (INVITTOX No. 59) | Spermatozoa | Assessment for morphologic, physiologic and biochemical parameters of sperm which can then be directly correlated with fertilising capacity | Holloway et al, 1990a, 1990b
Culture of human cumulus granulosa cells (INVITTOX No. 92) | Human granulosa luteal cells | Inhibition of progesterone production | Hughes et al, 1990; Miller et al, 1992
Oestrogen receptor binding assays (e.g. PANVERA) | Oestrogen and an oestrogen receptor | Ability of substances to compete with oestrogen for binding | Rijks et al, 1996; Stoessel and Leclercq, 1986; Fertuck et al, 2001
Oestrogen receptor transcriptional assays (e.g. ER-EcoScreen-assay™) | Oestrogen-responsive reporter cell line (e.g. using MCF-7 or T47D cell lines transfected with a reporter gene - pLuc) | Ability of a test substance to inhibit the induction of the reporter gene product or the stimulation of cell growth by a reference oestrogen | ICCVAM, 2002
Androgen receptor binding assays (e.g. PANVERA) | Androgen and an androgen receptor | Ability of substances to compete with androgen for binding | Wilson et al, 2002; Kemppainen et al, 1999
Androgen receptor transcriptional assays (e.g. AR-EcoScreen-assay™) | Androgen responsive reporter cell line (e.g. CV-1 cell line transfected with pLuc) | Ability of a test substance to inhibit the induction of the reporter gene product or the stimulation of cell growth by a reference androgen | ICCVAM, 2002; Hartig et al, 2002

Table 2.10.3. The more advanced of the available methods for non-regulatory assessment of developmental and reproductive toxicity. INVITTOX protocols can be found via the ECVAM-SIS website http://ecvam-sis.jrc.it/.

Several of the tests in Table 2.10.3 are currently undergoing some form of evaluation. The Frog Embryo Teratogenesis Assay - Xenopus (FETAX) was extensively reviewed by ICCVAM in 2000 for its ability to predict developmental toxicity. The review concluded that the method was promising but not yet suitably validated. However, no further validation studies have yet been performed. ECVAM is undertaking an evaluation of the use of Leydig cells within a battery of tests in an integrated strategy for reproductive toxicity testing. The US EPA is evaluating a number of tests including the testis slice assay, the human adreno-cortical carcinoma cell line (H295R), the aromatase assay, and various other assays for endocrine disruption.

Furthermore, an international study entitled ‘ReProTect’ has recently been initiated to further the technological development of in vitro methods and sensor technologies for developmental and reproductive toxicity (including endocrine disruption). By engaging a number of international laboratories with differing expertise to further develop tests such as those in Table 2.10.3, and the many more tests which are not so advanced, this should facilitate their eventual validation. The potential opportunities arising from the results of ‘ReProTect’ for reducing animal use are substantial. By breaking down the reproductive system into well defined sub elements, the development of alternatives will expand as it becomes clear where new research is needed. A number of in vitro methods discussed in Table 2.10.3 are to be evaluated as part of ReProTect, including: Computer assisted sperm analysis; oestrogen and androgen receptor binding studies and oestrogen and androgen transcriptional assays. Since the same in vivo experiments are used for drugs, chemicals, household products and cosmetics, the results from the ReProTect project will be applicable to all product and chemical types.

The ReProTect project was discussed at a recent ECVAM Workshop – ‘Chemical effects on mammalian fertility’ for which a full report is not yet available. According to the ECVAM website, the key topics discussed were:

- Identification of 26 toxicological target cells, tissues, organs and biological mechanisms involved in fertility that are targeted by reproductive toxicants
- Presentation and discussion of in vitro models currently under evaluation in ReProTect or by US EPA in order to define the toxicological information that they can provide
- Identification and prioritisation of 11 in vitro models that are available and need further exploration before integration into a testing strategy
- Identification of information gaps for which participants could not think of any in vitro models
- Information exchange on 5 in vitro tests between ECVAM and US EPA that are currently in optimisation/validation phase
- Questionnaire concerning fertility that should be sent to regulators

It was recommended that ECVAM conduct a literature search on alternative methods to fill the information gaps, where there seem to be no current methods, in order to facilitate the development and improvement of in vitro models that are relevant to a testing strategy for reproductive toxicity. It was also recommended that the cell lines developed through the ReProTect project be made available from a public cell bank, so that if regulatory approval is eventually sought, toxicologists will easily be able to acquire the relevant cells to carry out the tests.

2.10.4 In silico approaches

There are no recent reviews in this area that could be obtained. The topic was partially reviewed by Cronin and Dearden (1995d).

QSARs

There are comparatively few QSARs for reproductive effects. Except within restricted chemical classes, e.g. Devillers et al (2002b), Dawson et al (1996), Richard and Hunter (1996), most QSARs are non-linear and multivariate in nature. This again reflects the difficulty of modelling this endpoint. Examples of non-linear and multivariate QSARs include Arena et al (2004) and Devillers et al (2002a).

Expert Systems

A number of quantitative models for developmental toxicity effects are available in the MultiCASE (Ghannoni et al 1997) and TOPKAT (Gombar et al 1991; 1995) expert systems. SARs for developmental toxicity are found in DEREK and HazardExpert; however, the rule bases are small and poorly developed.

2.10.5 Integrated testing strategies for specific endpoints

The BUAV (Langley, 2001) proposes a simple two step strategy for developmental toxicity. Step one involves computational screening using DEREK and TOPKAT and step two involves the use of alternative methods validated by ECVAM for use in a testing strategy before any in vivo tests are conducted – namely the embryonic stem cell test, the Micromass test and the Post implantation rat whole embryo test.

For reproductive toxicity, the BUAV (Langley, 2001) proposes a strategy which could identify substances likely to affect fertility, fertilization and implantation. Step one involves in silico screening for potentially relevant structural alerts and physicochemical properties. Step two involves a number of in vitro tests to assess chemical effects on:

- viability, motility, morphology and biochemistry of human sperm
- sperm development (using Sertoli cells in culture)
- oocyte development (using ovarian follicle cells in culture)
- testosterone production (using rodent Leydig cells in culture)
- endocrine disruption (using oestrogen and androgen receptor binding assays)

Step three allows for more complex tests, such as in vitro fertilization studies, to be conducted if the results from previous tests are inconclusive.
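The three steps of the BUAV proposal for reproductive toxicity can be pictured as a tiered pipeline. The Python sketch below is a loose illustration and not the BUAV's own scheme: the test labels, the `combine` rule for merging the step two battery (any positive result counts as positive, all negative counts as negative, anything else is inconclusive) and the shape of the returned record are hypothetical. The text specifies only that step three is reserved for inconclusive results.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional


class Outcome(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    INCONCLUSIVE = "inconclusive"


# Step two in vitro tests, labelled after the five areas listed above.
STEP_TWO_TESTS = (
    "human sperm function",
    "sperm development",
    "oocyte development",
    "testosterone production",
    "receptor binding",
)


def combine(results: List[Outcome]) -> Outcome:
    """Hypothetical rule for merging the step two battery into one call."""
    if Outcome.POSITIVE in results:
        return Outcome.POSITIVE
    if all(r is Outcome.NEGATIVE for r in results):
        return Outcome.NEGATIVE
    return Outcome.INCONCLUSIVE


def assess(
    structural_alert: bool,
    step_two: Dict[str, Outcome],
    step_three: Optional[Callable[[], Outcome]] = None,
) -> dict:
    """Run the tiers in order and record which steps were needed."""
    missing = [t for t in STEP_TWO_TESTS if t not in step_two]
    if missing:
        raise ValueError(f"missing step two results: {missing}")

    steps_run = [1, 2]
    outcome = combine([step_two[t] for t in STEP_TWO_TESTS])

    # Step three is only reached when the earlier results are inconclusive.
    if outcome is Outcome.INCONCLUSIVE and step_three is not None:
        outcome = step_three()
        steps_run.append(3)

    return {
        "in_silico_alert": structural_alert,  # step one result, carried through
        "outcome": outcome,
        "steps_run": steps_run,
    }
```

Passing the step three test as a callable means the more complex study is only invoked when it is actually needed, which mirrors the intent of a tiered strategy.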

A testing strategy specifically for endocrine disruption in humans has been proposed by the US EPA Endocrine Disruptors Screening and Testing Advisory Committee (Worth and Balls, 2002) and comprises both in vitro and in vivo tests. Compound prioritisation is followed by tier one screening assays to identify substances with endocrine disrupting potential. The substances identified then go on to tier two assays, designed to identify adverse effects and establish dose-response relationships for hazard and risk assessment.

2.10.6 Reduction and Refinement opportunities

The use of the combined repeated-dose toxicity study with the reproductive/developmental screening test (OECD TG 422) would considerably reduce the number of animals, since the two tests would not need to be conducted separately. Positive results from either of the screening tests should be used without the need for further testing.

Specific refinement opportunities include using group housing where appropriate (except during the latter stages of pregnancy, littering and rearing), and changing how dose levels are calculated from previous toxicity tests, so that more relevant dose levels are used. The oral route of administration is preferred in all studies as dermal and inhalation administration would need to be terminated at the end of pregnancy (Combes et al, 2004).

2.10.7 Bibliography

Abe T, Saito H, Niikura V, Shigeoka T, Nakano Y (2000) Embryonic development assay with Daphnia magna: application to toxicity of chlorophenols. Water Science and Technology 42: 297-304
Arena VC, Sussman NB, Mazumdar S, Yu S, Macina OT (2004) The utility of structure-activity relationship (SAR) models for prediction and covariate selection in developmental toxicity: Comparative analysis of logistic regression and decision tree models. SAR and QSAR in Environmental Research 15: 1-18
Baker V (2001) Endocrine disruptors – testing strategies to assess human hazard. Toxicology In Vitro 15: 413-419
Barile FA, Guzowski DE, Ripley C, Siddiqi Z, Bienkowski RS (1990) Ammonium chloride inhibits basal degradation of newly synthesised collagen in human fetal lung fibroblasts. Archives of Biochemistry and Biophysics 276: 125-131
Barile FA, Ripley-Rouzier C, Siddiqi Z, Bienkowski RS (1988) Effects of prostaglandin E1 on collagen production and degradation in human fetal lung fibroblasts. Archives of Biochemistry and Biophysics 265: 441-446

Barile FA, Siddiqi Z, Ripley-Rouzier C, Bienkowski RS (1989) Effects of puromycin and hydroxynorvaline on net production and intracellular degradation of collagen in human fetal lung fibroblasts. Archives of Biochemistry and Biophysics 270: 294-301
Brown NA, Spielmann H, Bechter R, Flint OP, Freeman SJ, Jelinek RJ, Koch E, Nau H, Newall DR, Palmer AK, Renault J-Y, Repetto MF, Vogel R, Wiger R (1995) Screening chemicals for reproductive toxicity: the current alternative. ECVAM Workshop Report 12. ATLA 23: 868-882
Bueggemeier RW, Katlic NE (1990) Aromatase inhibition by an enzyme-activated irreversible inhibitor in human carcinoma cell cultures. Cancer Research 50: 3652-3656
Chaudhary LR, Stocco DM (1989) Inhibition of hCG- and cAMP-stimulated progesterone production in MA-10 mouse Leydig tumor cells by ketoconazole. Biochemistry International 18: 251-262
Combes RD (2004) The case for taking account of metabolism when testing for potential endocrine disruptors in vitro. ATLA 32: 121-135
Combes RD, Gaunt I, Balls M (2004) A scientific and animal welfare assessment of the OECD Health Effects Test Guidelines for the safety testing of chemicals under the European Union REACH system. ATLA 32: 163-208
Cortvrindt R, Smitz J (2002) Follicle culture in reproductive toxicology: a tool for in vitro testing of ovarian function? Human Reproduction Update 8: 243-254
Cortvrindt R, Smitz J, Van Steirteghem AC (1996) In-vitro maturation, fertilization and embryo development of immature oocytes from early preantral follicles from prepuberal mice in a simplified culture system. Human Reproduction 11: 2656-2666
Daston GP, Baines D, Elmore E, Fitzgerald MP, Sharma S (1995) Evaluation of chick embryo neural retina culture as a screen for developmental toxicants. Fundamental and Applied Toxicology 26: 203-210
Daston GP, Baines D, Yonker JE (1991) Chick embryo neural retina cell culture as a screen for developmental toxicity. Toxicology and Applied Pharmacology 109: 352-366
Dawson DA, Bantle JA (1987) Development of a reconstituted water medium and preliminary validation of the frog embryo teratogenesis assay - Xenopus (FETAX). Journal of Applied Toxicology 7: 237-244
Dawson DA, Schultz TW, Hunter RS (1996) Developmental toxicity of carboxylic acids to Xenopus embryos: A quantitative structure-activity relationship and computer-automated structure evaluation. Teratogenesis Carcinogenesis and Mutagenesis 16: 109-124
Devillers J, Chezeau A, Thybaud E (2002a) PLS-QSAR of the adult and developmental toxicity of chemicals to Hydra attenuata. SAR and QSAR in Environmental Research 13: 705-712
Devillers J, Chezeau A, Thybaud E, Rahmani R (2002b) QSAR modeling of the adult and developmental toxicity of glycols, glycol ethers and xylenes to Hydra attenuata. SAR and QSAR in Environmental Research 13: 555-566

Dorfman RI, Dorfman AS (1950) assays using the rat uterus. Endocrinology 55: 65-69
EPA (2003) Study Plan for Prevalidation of the Sliced Testis Assay. US EPA, http://www.epa.gov/scipoly/oscpendo/docs/edmvs/prevalidation_study_plan.pdf
Fertuck KC, Kumar S, Sikka HC, Matthews JB, Zacharewski TR (2001) Interaction of PAH-related compounds with the alpha and beta isoforms of the . Toxicology Letters 121: 167-177
Fielder RJ, Atterwill CK, Anderson D, Boobis AR, Botham P, Chamberlain M, Combers R, Duffy PA, Lewis RW, Lumley CE, Kimber I, Newall DR (1997) BTS Working Party Report on In Vitro Toxicology. Human and Experimental Toxicology 16: S1-S40
Flint OP (1993) In vitro tests for teratogens: desirable endpoints, test batteries, and current status of the micromass teratogen test. Reproductive Toxicology 7 Suppl 1: 103-111
Flint OP (1983) A micromass culture method for rat embryonic neural cells. Journal of Cell Science 61: 247-262
Fort DJ, Dawson DA, Bantle JA (1988) Development of a metabolic-activation system for the frog embryo teratogenesis assay - Xenopus (FETAX). Teratogenesis Carcinogenesis and Mutagenesis 8: 251-263
Francavilla F, Romano R, Santucci R, Macerola B, Ruvolo G, Francavilla S (2002) Effect of human sperm exposure to progesterone on sperm-oocyte fusion and sperm-zona pellucida binding under various experimental conditions. International Journal of Andrology 25: 106-112
Francavilla F, Santucci R, Macerola B, Ruvolo G, Romano R (2000) Nitric oxide synthase inhibition in human sperm affects sperm-oocyte fusion but not zona pellucida binding. Biology of Reproduction 63: 425-429
Freeman DA, Goeze PM, Porpaczy Z (1993) Finasteride blocks progesterone synthesis in MA-10 Leydig tumor cells. Endocrinology 133: 1915-1917
Genschow E, Scholz G, Brown N, Piersma A, Brady M, Clemann N, Huuskonen H, Paillard F, Bremer S, Becker K, Spielmann H (2000) Development of prediction models for three in vitro embryotoxicity tests in an ECVAM validation study. In Vitro Molecular Toxicology 13: 51-66
Genschow E, Spielmann H, Scholz G, Pohl I, Seiler A, Clemann N, Bremer S, Becker K (2004) Validation of the Embryonic Stem Cell Test in the International ECVAM Validation Study on Three In Vitro Embryotoxicity Tests. ATLA 32: 209-244
Ghanooni M, Mattison DR, Zhang YP, Macina OT, Rosenkranz HS, Klopman G (1997) Structural determinants associated with risk of human developmental toxicity. American Journal of Obstetrics and Gynecology 176: 799-805
Gombar VK, Borgstedt HK, Enslein K, Hart JB, Blake BW (1991) A QSAR model of teratogenesis. Quantitative Structure-Activity Relationships 10: 306-332
Gombar VK, Enslein K, Blake BW (1995) Assessment of developmental toxicity potential of chemicals by quantitative structure-toxicity relationship models. Chemosphere 31: 2499-2510

Hartig PC, Bobseine KL, Britt BH, Cardon MC, Lambright CR, Wilson VS, Gray LE Jr (2002) Development of two androgen receptor assays using adenoviral transduction of MMTV-luc reporter and/or hAR for endocrine screening. Toxicological Sciences 66: 82-90
Hassell JR, Horigan EA (1982) Chondrogenesis: A model developmental system for measuring the teratogenic potential of compounds. Teratogenesis, Carcinogenesis and Mutagenesis 2: 325-331
Hershberger L, Shipley E, Meyer R (1953) Myotrophic activity of 19-nortestosterone and other steroids determined by modified levator ani muscle method. Proceedings of the Society for Experimental Biology 83: 175-180
Holloway AJ, Moore HDM, Foster PMD (1990a) The use of rat in vitro fertilization to detect reductions in the fertility of spermatozoa from males exposed to ethylene glycol monomethyl ether. Reproductive Toxicology 4: 21-27
Holloway AJ, Moore HDM, Foster PMD (1990b) The use of in vitro fertilization to detect reductions in the fertility of male rats exposed to 1,3-dinitrobenzene. Fundamental and Applied Toxicology 14: 113-122
Hong CY, Chaput de Saintonge DM, Turner P (1981) A simple method to measure drug effects on human spermatozoa motility. British Journal of Clinical Pharmacology 11: 385-387
Hughes SF, Haney AF, Hughes CL Jr (1990) Use of granulosa cells for in vitro screening of reproductive toxicants. Reproductive Toxicology 4: 11-15
ICCVAM (2002) Development of stably transfected cell lines to screen Endocrine Disrupters. Website http://iccvam.niehs.nih.gov/methods/endodocs/final/erta_brd/ertaappx/erta_b4.pdf
Kemppainen JA, Langley E, Wong CI, Bobseine K, Kelce WR, Wilson EM (1999) Distinguishing androgen receptor agonists and antagonists: distinct mechanisms of activation by medroxyprogesterone acetate and . Molecular Endocrinology 13: 440-454
Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE
Letcher RJ, Drenth HJ, Norstrom RJ, Bergman A, Safe S, Pieters R (1999) Cytotoxicity and aromatase (CYP 19) activity modulation by organochlorines in human placental JEG-3 and JAR choriocarcinoma cells. Toxicology and Applied Pharmacology 160: 10-20
MacGregor JT, Collins JM, Sugiyama Y, Tyson CA, Dean J, Smith L, Andersen M, Curren RD, Houston JB, Kadlubar FF, Kedderis GL, Krishnan D, Li AP, Parchment RE, Thummel K, Tomaszewski JE, Ulrich R, Vickers AEM, Wrighton SA (2001) In vitro human tissue models in risk assessment: report of a consensus-building workshop. Toxicological Sciences 59: 17-36
Miller MM, Weitzman GA, Delacey NW Jr, Breckinridge S, London SN (1992) Improved reproductive toxicant screening with human granulosa cells. Biology of Reproduction 46 (suppl 1): 122

Monsees TK, Franz M, Gebhardt S, Winterstein U, Schill WB, Hayatpour J (2000) Sertoli cells as a target for reproductive hazards. Andrologia 32: 239-246
Nikula H, Talonpoika T, Kaleva M, Toppari J (1999) Inhibition of hCG-stimulated steroidogenesis in cultured mouse Leydig tumor cells by and octylphenols. Toxicology and Applied Pharmacology 157: 166-173
Piersma AH (1993) Whole embryo culture and toxicity testing. Toxicology In Vitro 7: 763-768
Piersma AH, Genschow E, Verhoef A, Spanjersberg MQI, Brown NA, Brady M, Burns A, Clemann N, Seiler A, Spielmann H (2004) Validation of the Postimplantation Rat Whole-embryo Culture Test in the International ECVAM Validation Study on Three In Vitro Embryotoxicity Tests. ATLA 32: 275-307
Richard AM, Hunter ES (1996) Quantitative structure-activity relationships for the developmental toxicity of haloacetic acids in mammalian whole embryo culture. Teratology 53: 352-360
Rijks LJ, Boer GJ, Endert E, de Bruin K, van den Bos JC, van Doremalen PA, Schoonen WG, Janssen AG, van Royen EA (1996) The stereoisomers of 17alpha-[123I]iodovinyloestradiol and its 11beta-methoxy derivative evaluated for their oestrogen receptor binding in human MCF-7 cells and rat uterus, and their distribution in immature rats. European Journal of Nuclear Medicine 23: 295-307
Sanderson JT, Seinen W, Giesy JP, van den BM (2000) 2-Chloro-s-triazine herbicides induce aromatase (CYP 19) activity in H295R human adrenocortical carcinoma cells: a novel mechanism for estrogenicity. Toxicological Sciences 54: 121-127
Schardein JL (2000) Chemically Induced Birth Defects. Third Edition, Revised and Expanded. Marcel Dekker Inc, New York
Seibert H, Gosch U (1990) A short-term bovine sperm cell assay for the evaluation of the in vitro cytotoxicity of chemicals. ATLA 17: 228-232
Seibert H, Kolossa M, Wasserman O (1989) Bovine spermatozoa as an in vitro model for studies on the cytotoxicity of chemicals: Effects of chlorophenols. Cell Biology and Toxicology 5: 315-330
Shepard TH (1998) Catalog of Teratogenic Effects. 9th Ed. Johns Hopkins University Press, Baltimore
Spielmann H, Genschow E, Brown NA, Piersma AH, Verhoef A, Spanjersberg MQI, Huuskonen H, Paillard F, Seiler A (2004) Validation of the Rat Limb Bud Micromass Test in the International ECVAM Validation Study on Three In Vitro Embryotoxicity Tests. ATLA 32: 245-274
Spielmann H, Pohl I, Döring B, Liebsch M, Moldenhauer F (1997) The embryonic stem cell test (EST), an in vitro embryotoxicity test using two permanent mouse cell lines: 3T3 fibroblasts and embryonic stem cells. In Vitro Toxicology 10: 119-127
Stoessel S, Leclercq G (1986) Competitive binding assay for estrogen receptor in monolayer culture: measure of receptor activation potency. Journal of Steroid Biochemistry 25: 677-682

Tanaka E, Terada M, Misawa S (2000) Cytochrome P450 2E1: its clinical and toxicological role. Journal of Clinical Pharmacy and Therapeutics 25: 165-175
Van Maele-Fabry G, Delhaise F, Picard JJ (1992) Evolution of the developmental scores of sixteen morphological features in mouse embryos displaying 0-30 somites. International Journal of Developmental Biology 36: 161-167
Weiss DG (1989) Videomicroscopic measurements in living cells: dynamic determination of multiple endpoints for in vitro toxicology. Journal of Molecular Toxicology 1: 465-488
Wiger R, Trygg B, Holme JA (1989) Toxic effects of cyclophosphamide in differentiating chicken limb bud cell-culture using rat-liver 9000-G supernatant or rat-liver cells as an activation system - an in vitro short-term test for proteratogens. Teratology 40: 603-613
Wilson VS, Bobseine K, Lambright CR, Gray LE Jr (2002) A novel cell line, MDA-kb2, that stably expresses an androgen- and glucocorticoid-responsive reporter for the detection of hormone receptor agonists and antagonists. Toxicological Sciences 66: 69-81
Worth AP, Balls M (Eds.) (2002) Alternative (Non-animal) Methods for Chemicals Testing: Current Status and Future Prospects. A report prepared by ECVAM and the ECVAM Working Group on Chemicals. ATLA 30 Supplement 1
Zahalsky MP, Zoltan E, Medley N, Nagler HM (2003) Morphology and the sperm penetration assay. Fertility and Sterility 79: 39-41

2.11 Bioavailability – including Toxicokinetics, Metabolism,

2.11.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Toxicokinetics data is required in Annex VI of the REACH proposals where it is stated that for toxicokinetics there should be an ‘assessment of the toxicokinetic behaviour of the substance to the extent that can be derived from the relevant available information’. This implies that specific toxicokinetic studies should not be carried out. The current methods are therefore only described for completeness.

Method Outline:

Toxicokinetics (REACH B36, OECD TG 417)

The test substance is administered via an appropriate route (determined by data on human exposure), either in a single dose or as repeated doses for a defined period. Depending on the type of study, the substance and/or metabolites are determined in body fluids, tissues and/or waste products.

Number of animals used – 20 per dose/control group, approximately 60 per study.

2.11.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

There is much interest in being able to predict “bioavailability”. This interest has come largely from the pharmaceutical industry, which wishes to screen out poorly bioavailable compounds early in the drug development process. Bioavailability is the outcome of a very complex set of processes, and it varies widely with the compound, the species and even whether or not the animals have been fed (in the case of oral uptake). There are a large number of competing kinetic processes such as absorption, distribution, clearance, metabolism etc.

For the successful modelling of bioavailability and the ultimate replacement of animal tests, it seems unlikely that a single assay will be sufficient, and it may be better to attempt to model the individual components of the process (e.g. absorption, clearance, metabolism etc). Many of these are well known and their mechanisms are established. Many mechanisms are simple physicochemical processes, such as passive diffusion. However, more knowledge is still required on some more specific events, such as active influx or efflux through a membrane, or particular metabolic routes.
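The component-wise approach described above can be illustrated with a minimal Python sketch that combines separately estimated components into a single estimate of oral bioavailability. It assumes the standard pharmacokinetic decomposition F = Fa x Fg x (1 - Eh) and the well-stirred liver model for hepatic extraction; neither is taken from this report, and the function names and all numerical values are hypothetical.

```python
def hepatic_extraction(q_h, fu, cl_int):
    """Hepatic extraction ratio from the well-stirred liver model (assumed).

    q_h    : hepatic blood flow (L/h)
    fu     : fraction of compound unbound in blood (0-1)
    cl_int : intrinsic metabolic clearance, e.g. scaled from in vitro data (L/h)
    """
    return fu * cl_int / (q_h + fu * cl_int)


def oral_bioavailability(f_abs, f_gut, e_hepatic):
    """Combine the individual components into oral bioavailability F.

    f_abs     : fraction absorbed across the gut wall (e.g. from a Caco-2 model)
    f_gut     : fraction escaping metabolism in the gut wall
    e_hepatic : hepatic extraction ratio (first-pass loss in the liver)
    """
    return f_abs * f_gut * (1.0 - e_hepatic)


# Hypothetical compound: each component would come from its own assay or model.
e_h = hepatic_extraction(q_h=90.0, fu=0.1, cl_int=300.0)
f_oral = oral_bioavailability(f_abs=0.8, f_gut=0.9, e_hepatic=e_h)
print(f"hepatic extraction = {e_h:.2f}, oral bioavailability = {f_oral:.2f}")
```

The value of such a decomposition is that each factor can be estimated by a different alternative method (in vitro or in silico), and the uncertainty in each can be examined separately.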

There are many data for bioavailability and its component parts that may give clues for modelling. For instance, some authors have used the World Drug Index as a starting point for in silico modelling. However, in terms of reliable data for validation and computational modelling, there are still relatively few appropriate data sets.

There is wide and growing knowledge of in vivo metabolism. This includes the routes of metabolism and likely metabolic events. Despite this knowledge, metabolism remains a difficult endpoint to model. This is due to a number of factors, including the fact that a molecule may be a substrate for a number of different enzymes, and often the relative contributions of these are not known. In addition, possible metabolites may be predicted but their relative proportion (or half-life) is seldom known or quantified. Given these problems, the replacement of in vivo testing with accurate alternatives will be time-consuming.

Absorption, assumed to be a process of passive diffusion through a membrane, can be considered to be well understood. Processes such as active transport and facilitated diffusion are less well studied and quantified, although they are the subject of on-going research, particularly by the pharmaceutical industry. Many databases for a whole variety of membrane permeability data are available. These are, however, mainly of a highly variable nature. Bearing these points in mind, there should be possibilities for the imaginative use of alternatives and computational models for absorption.

The science of pharmacokinetics is complex, although well established. Many data for endpoints such as volume of distribution, clearance etc are available through standard text book sources e.g. Hardman et al (2001). The quality of these data has been observed to be variable however.

2.11.3 In vitro methods

There are no in vitro alternatives that have been fully validated for any of the areas of bioavailability. However, one method, Gut absorption using Caco-2 cells, has been subjected to a pre-validation study at ECVAM. Table 2.11.3 below shows some of the techniques under development for the various endpoints associated with toxicokinetics and bioavailability.

| Endpoint | Test | Test system | Key References |
| --- | --- | --- | --- |
| Oral bioavailability | Cell based assays | Caco-2, TC-7 cells | Gan and Thakker, 1997 |
| Intestinal absorption | Artificial membranes | PAMPA | Kansy et al, 1998 |
| Inhalation absorption | Primary cells, cell lines and reconstituted airway epithelial systems | Pulmonary epithelium e.g. A549, Calu-3, primary alveolar cells. A549 cells can also be used to look at airway metabolism. | Hukkanen et al, 2000; MatTek's EpiAirway™ (www.mattek.com); SkinEthic’s Reconstituted human alveolar epithelium in vitro (www.skinethic.com); Dimova et al, 2005 |
| Ocular absorption | Cells and organotypic coculture systems | e.g. human corneal epithelial, stromal, and endothelial cell co-cultures | Reichl et al, 2004 |
| Estimation of free plasma concentrations | Plasma protein/erythrocyte binding studies and blood-tissue partitioning | Various methods including equilibrium dialysis, microdialysis, ultrafiltration, ultracentrifugation, circular dichroism, chromatography and polymer-based microextraction | Heringa et al, 2004 |
| Blood-brain transport | Primary brain cells or 3D cell culture | e.g. bovine brain epithelial cells or neurosphere culture of neuronal cells | Garberg, 1998; Tahti et al, 2003 |
| Bioaccumulation | 3T3 L1 adipocyte cell line | Bioaccumulation of aryl hydrocarbons in body fat | Viravaidya and Schuler, 2002 |
| Biotransformation/Metabolism | Cell based | Hepatocyte bioreactor | Canova et al, 2004 |
| Biotransformation/Metabolism | Isolated/recombinant enzymes, tissue slices, tissue homogenates/microsomal fractions | Metabolic stability tests and high throughput screening systems e.g. MetaChip | Lee et al, 2005; Elaut et al, 2004 |
| Biotransformation/Metabolism | Cell lines | V79 Cell Battery to monitor human polymorphic differences in CYP function. Reporter cell lines | Krebsfaenger et al, 2003; Bull et al, 2001; Towensend et al, 2002 |
| Renal clearance/metabolism | Cell based assay | HK-2 proximal tubule epithelial cell line | Peraza et al, 2003 |

Table 2.11.3. An outline of some of the tests under development for various endpoints involved in toxicokinetics and bioavailability.

Biokinetic modelling methods are becoming more and more relevant to the testing of substances other than pharmaceuticals, where they are mostly used, although their validation for regulatory testing is unlikely in the near future. They can provide information useful in the design of in vivo toxicokinetic studies, and for interpreting in vitro and in vivo data. Biokinetic modelling is also useful in addressing a fundamental problem in risk assessment: the need to relate the external dose to the internal dose, i.e. the dose that actually reaches the target in humans. The internal target organ dose can be predicted by undertaking toxicokinetic studies, taking account of absorption, distribution, metabolism and excretion (ADME). Physiologically-based biokinetic (PBBK) modelling, also referred to as physiologically-based pharmacokinetic (PBPK) modelling, is a technique for predicting ADME in vivo by combining results from the literature and computational techniques, and by extrapolating data from in vitro studies between species (Blaauboer, 2002; 2003). It involves representing organs or groups of organs as discrete compartments, characterised by physiological volumes and interconnected by blood flows. Such models can account for physiological influences and internal tissue dose, allowing interspecies extrapolation, and can predict metabolic and physiological parameters.
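To make the compartmental idea concrete, the following is a minimal Python sketch of a flow-limited physiologically-based model with three compartments (blood, liver and the rest of the body), integrated with a simple Euler scheme after an intravenous bolus. It is an illustration only and not a model from any of the cited works: the structure is deliberately simplified, and every volume, blood flow, partition coefficient and clearance value is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass
class Tissue:
    volume_l: float    # tissue volume (L)
    flow_l_h: float    # blood flow to the tissue (L/h)
    partition: float   # tissue:blood partition coefficient (unitless)


def simulate_pbpk(dose_mg, hours, dt_h=0.001):
    """Euler integration of a three-compartment, flow-limited model.

    All parameter values are hypothetical. Returns the amount (mg) in each
    compartment, and the amount metabolised, at the end of the simulation.
    """
    blood_volume_l = 5.0
    liver = Tissue(volume_l=1.8, flow_l_h=90.0, partition=4.0)
    rest = Tissue(volume_l=60.0, flow_l_h=210.0, partition=1.5)
    cl_int_l_h = 30.0  # hepatic intrinsic clearance (L/h), assumed

    blood, in_liver, in_rest, eliminated = dose_mg, 0.0, 0.0, 0.0
    for _ in range(int(round(hours / dt_h))):
        c_blood = blood / blood_volume_l
        # Concentration in the venous blood leaving each tissue.
        c_liver_out = in_liver / (liver.volume_l * liver.partition)
        c_rest_out = in_rest / (rest.volume_l * rest.partition)
        # Net uptake = blood flow x (arterial - venous concentration).
        to_liver = liver.flow_l_h * (c_blood - c_liver_out)
        to_rest = rest.flow_l_h * (c_blood - c_rest_out)
        metabolised = cl_int_l_h * c_liver_out
        blood -= dt_h * (to_liver + to_rest)
        in_liver += dt_h * (to_liver - metabolised)
        in_rest += dt_h * to_rest
        eliminated += dt_h * metabolised
    return {"blood": blood, "liver": in_liver,
            "rest": in_rest, "eliminated": eliminated}


result = simulate_pbpk(dose_mg=100.0, hours=24.0)
total = sum(result.values())  # mass balance: should equal the dose
print(result, total)
```

Because each compartment is defined by physiological quantities, interspecies extrapolation in such a sketch amounts to replacing the volumes and flows with those of another species while keeping the compound-specific parameters, which is the property highlighted in the text.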

Significant advances are currently being made in biokinetic modelling, including the development of software programs and databases for the rapid generation of new models (George Loizou, personal communication). These will improve the usefulness of the approach for evaluating large numbers of chemicals, and will assist with the interpretation of in vitro hazard predictions for risk assessment purposes.

An ECVAM Workshop, ‘Metabolism: a bottle neck in in vitro toxicological test development’, was held in 2004. Although no full report has yet been published, according to the ECVAM website (http://ecvam.jrc.it/index.htm) the aim was to identify toxicokinetic and metabolism issues in the different key areas of regulatory toxicity testing and to integrate the knowledge obtained into strategies towards the replacement of in vivo testing. The workshop concluded that: validation requires careful selection of chemicals with clear information on the role of metabolism in their toxicity; human-based metabolising systems (e.g. human hepatocytes) should be used in preference to animal-based systems to reduce the need for species extrapolation, although their suitability for regulatory testing needs to be demonstrated; and the future validation of biokinetic modelling requires good quality data for a sufficient number of chemicals that can be used as a test set.

The recommendations of the workshop require ECVAM to commission a study to establish a database of chemicals that are toxic to humans and which are detoxified as a consequence of metabolism and to commission an initiative to create a depository of reference chemicals for validation of metabolising systems.

2.11.4 In silico approaches

For ease, the in silico approaches are separated into different aspects of bioavailability.

2.11.4.1 Toxicokinetics

There are a large number of excellent reviews in the area of “Predictive ADME”, which encompasses all aspects of predicting bioavailability. The number of reviews probably reflects the recent interest in this area, in particular from the pharmaceutical industry. Good reviews include Duffy (2004), Egan et al (2002), Ekins et al (2000a; 2000b), Winkler (2004) and several others noted in the bibliography.

QSARs

There are many QSARs for various toxicokinetic effects (many for membrane absorption; see the section below). These range from very simple models, based on a few fundamental parameters, to much more complex models which are multivariate in nature. From a pragmatic point of view, simpler models are often of greater use, although they may sacrifice some statistical fit.

Expert Systems

There has been a burgeoning commercial market in expert systems for ADME property prediction. These include MultiCASE, ADME Boxes and ADME-Works. It should be noted that comparatively little has been done to evaluate and validate these products, and it is possible that many are still under development.

2.11.4.2 Metabolism

For many decades there have been attempts to develop QSARs for metabolic processes. More recent and relevant attempts are brought together very well by de Graaf et al (2005), de Groot et al (2004), Lewis (1999) and several others.

QSARs

QSARs for metabolism have tended to address kinetic effects such as rate constants and binding affinities. These are well reviewed, e.g. by Lewis (1999) and Hansch et al (2004). Such QSARs have normally been quite simple and based on fundamental physico-chemical properties as well as molecular orbital properties. The practical use of such QSARs is often limited by the very restricted endpoint, as well as by the limited chemical diversity of the training set.

Expert Systems

There are a variety of expert systems for the prediction of metabolism. The majority of the better established systems include “rules” that are associated with metabolic events, mostly taken from the standard literature. These rules take the form: “Sub-Structure A” will be metabolised to “Sub-Structure B”. Examples of systems that apply such rules are Meta (MultiCASE), Meteor, MetabolExpert and TIMES. The application of such rules may often lead to the prediction of large numbers of metabolites, especially when secondary metabolites are included. The TIMES software also includes some estimate of the probability of individual metabolites being produced (Mekenyan et al 2004).
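The way in which “Sub-Structure A to Sub-Structure B” rules multiply the number of predicted metabolites can be seen in the following toy Python sketch. It does not reproduce the rule base or the algorithm of any of the systems named above: the compound is represented simply as a tuple of substituent labels, and the four rules are hypothetical examples chosen for illustration.

```python
# Hypothetical transformation rules: substituent label -> product label.
RULES = {
    "CH3": "CH2OH",          # aliphatic hydroxylation
    "CH2OH": "COOH",         # oxidation of the alcohol to the acid
    "OCH3": "OH",            # O-demethylation
    "OH": "O-glucuronide",   # conjugation
}


def one_step(compound):
    """Return every product formed by applying one rule at one position."""
    products = set()
    for i, group in enumerate(compound):
        if group in RULES:
            products.add(compound[:i] + (RULES[group],) + compound[i + 1:])
    return products


def enumerate_metabolites(parent, generations):
    """Breadth-first application of the rules.

    Returns the set of all metabolites found and a list giving the number
    of new structures appearing in each generation.
    """
    seen = {parent}
    frontier = {parent}
    new_per_generation = []
    for _ in range(generations):
        frontier = {p for c in frontier for p in one_step(c)} - seen
        seen |= frontier
        new_per_generation.append(len(frontier))
    return seen - {parent}, new_per_generation


parent = ("CH3", "OCH3", "CH3")  # toy compound with three reactive sites
metabolites, counts = enumerate_metabolites(parent, generations=3)
print(counts, len(metabolites))
```

For this toy compound, three first-generation products grow to 16 distinct structures once second- and third-generation metabolites are included, which illustrates why an estimate of the likelihood of each metabolite, as offered by TIMES, is useful for prioritisation.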

Some more modern expert systems for the prediction of metabolism (ADME Boxes, ADME-Works, ADMET-Predictor) claim to be more quantitative in their assessment. A further system, COMPACT (Lewis 2001), is based more on molecular modelling, assessing whether a compound has the properties that may allow oxidation and whether its dimensions (i.e. molecular planarity) are appropriate.

2.11.4.3 Absorption

There is currently no comprehensive review in the specific area of modelling uptake and absorption across various membranes. General information is given by Duffy (2004), which is a good starting point for this area.

QSARs

Many QSARs have been proposed for absorption through membranes. These can be divided into two groups. The first comprises models based on a small number of descriptors chosen on a mechanistic basis, e.g. the modelling of skin permeability coefficients (Moss et al 2002). There is probably a strong argument in favour of this type of approach for modelling passive diffusion. The second comprises more multivariate approaches (cf. Basak et al 2002; 2004). Other endpoints that have been modelled include intestinal permeability (Egan and Lauri 2002).
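As an example of a QSAR built on a small number of mechanistically interpretable descriptors, the following Python sketch estimates a skin permeability coefficient from the octanol-water partition coefficient (hydrophobicity) and molecular weight (molecular size). The coefficients are those commonly attributed to the relationship of Potts and Guy (1992), log Kp (cm/s) = -6.3 + 0.71 log Kow - 0.0061 MW; they are not taken from this report and should be checked against the primary source before any use. The example compound values are hypothetical.

```python
def log_kp_cm_per_s(log_kow, mol_weight):
    """Log10 skin permeability coefficient (cm/s).

    Two-descriptor model in the form commonly attributed to Potts and Guy
    (1992); the coefficients are an assumption, not taken from this report.
    """
    return -6.3 + 0.71 * log_kow - 0.0061 * mol_weight


def kp_cm_per_h(log_kow, mol_weight):
    """Permeability coefficient converted from cm/s to cm/h."""
    return 3600.0 * 10 ** log_kp_cm_per_s(log_kow, mol_weight)


# Hypothetical compound with log Kow = 2.0 and molecular weight = 100.
print(log_kp_cm_per_s(2.0, 100.0))
print(kp_cm_per_h(2.0, 100.0))
```

The signs of the two coefficients carry the mechanistic interpretation: permeability by passive diffusion increases with hydrophobicity and decreases with molecular size.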

Expert Systems

There are a number of expert systems available to model absorption. DermWin is freely available through EPIWIN to model skin permeability. Other products such as MultiCASE, Molinspiration, ADME Boxes, ADME-Works and PK-Map have an emphasis on the oral absorption of drugs.

2.11.4.4 Other Pharmacokinetic Endpoints

A good review of the prediction of these endpoints is provided by Duffy (2004).

QSARs

QSARs are available for various pharmacokinetic endpoints. Non-linear, e.g. decision tree, approaches are seen to be particularly valuable in this regard (Manga et al 2003).

Expert Systems

MultiCASE has a number of modules to predict relevant endpoints including protein binding, urinary excretion and volume of distribution.

2.11.5 Integrated testing strategies for specific endpoints

A three-tiered strategy for assessing metabolism is proposed in Worth and Balls (2002). The first tier involves using in vitro cell culture assays to screen for metabolism. The second tier looks at the induction of biotransformation enzymes, by analysing the increased transcription of the respective genes. Finally, the third tier uses in vitro models for evaluating polymorphic effects on metabolism.

The BUAV (Langley, 2001) discussed a strategy for predicting toxicokinetic properties which involves the combination of in vitro studies, preferably with human cells, with computer simulations of the organ systems of the human body – physiologically-based pharmacokinetic modelling (PBPK).

2.11.6 Reduction and Refinement opportunities

Toxicokinetic information can be obtained from acute and repeated dose studies without the need for specific studies. Temporal measurements utilising radioactive tracers, in vivo reporter systems and/or imaging make it possible to monitor biological events within a small number of animals (Pogge and Slikker, 2004; Zhao et al, 2001). Such techniques also allow more mechanistic information to be derived, and they reduce the problems with the statistical handling of data from a large number of individual animals, because the course of the disease from initial infection/tumour growth through to therapeutic treatment can be monitored within the same individual. The use of metabolism cages is encouraged to avoid more invasive methods of sample collection such as blood sampling. Where blood biochemistry is essential, cannulas should be used in conjunction with any necessary pain relief. Group housing should be avoided where there is a risk of infection or of sampling devices being dislodged. The prior application of alternative methods will substantially reduce the number of animals used for toxicokinetic studies. Preliminary in vitro, ex vivo or in vivo studies should aim to establish the measurements that will need to be taken, and the experimental species and conditions should be selected for their suitability for sampling and handling in such procedures.

2.11.7 Bibliography

Basak SC, Mills D, El-Masri HA, Mumtaz MM, Hawkins DM (2004) Predicting blood:air partition coefficients using theoretical molecular descriptors. Environmental Toxicology and Pharmacology 16: 45-55
Basak SC, Mills D, Hawkins DM, El-Masri HA (2002) Prediction of tissue-air partition coefficients: A comparison of structure-based and property-based methods. SAR and QSAR in Environmental Research 13: 649-665
Beliveau M, Tardif R, Krishnan K (2003) Quantitative structure-property relationships for physiologically based pharmacokinetic modeling of volatile organic chemicals in rats. Toxicology and Applied Pharmacology 189: 221-232
Bender A, Glen RC (2004) Molecular similarity: a key technique in molecular informatics. Organic and Biomolecular Chemistry 2: 3204-3218
Blaauboer BJ (2002) The applicability of in vitro-derived data in hazard identification and characterisation of chemicals. Environmental Toxicology and Pharmacology 11: 213-225
Blaauboer BJ (2003) The integration of data on physico-chemical properties, in vitro-derived toxicity data and physiologically based kinetic and dynamic modelling as a tool in hazard and risk assessment. A commentary. Toxicology Letters 138: 161-171
Bull S, Langezaal I, Clothier R, Coecke S (2001) A genetically engineered cell-based system for detecting metabolism-mediated toxicity. ATLA 29: 703-716
Butina D, Segall MD, Frankcombe K (2002) Predicting ADME properties in silico: methods and models. Drug Discovery Today 7: S83-S88

Canova N, Kmonicova E, Lincova D, Vitek L, Farghali H (2004) Evaluation of a Flat Membrane Hepatocyte Bioreactor for Pharmacotoxicological Applications: Evidence that Inhibition of Spontaneously Produced Nitric Oxide Improves Cell Functionality. ATLA 32: 25-35
Chen HF, Yao XJ, Petitjean M, Xia HO, Yao JH, Panaye A, Doucet JP, Fan BT (2004) Insight into the bioactivity and metabolism of human glucagon receptor antagonists from 3D-QSAR analyses. QSAR and Combinatorial Science 23: 603-620
de Graaf C, Vermeulen NPE, Feenstra KA (2005) Cytochrome P450 in silico: an integrative modeling approach. Journal of Medicinal Chemistry 48: 2725-2755
de Groot MJ, Kirton SB, Sutcliffe MJ (2004) In silico methods for predicting ligand binding determinants of cytochromes P450. Current Topics in Medicinal Chemistry 4: 1803-1824
Dimova S, Brewster ME, Noppe M, Jorissen M, Augustjins P (2005) The use of human nasal in vitro cell systems during drug discovery and development. Toxicology In Vitro 19: 107-122
Duffy JC (2004) Prediction of pharmacokinetic parameters in drug design and toxicology. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 229-261
Egan WJ, Lauri G (2002) Prediction of intestinal permeability. Advanced Drug Delivery Reviews 54: 273-289
Egan WJ, Walters WP, Murcko MA (2002) Guiding molecules towards drug-likeness. Current Opinion in Drug Discovery and Development 5: 540-549
Ekins S, Boulanger B, Swaan PW, Hupcey MAZ (2000a) Towards a new age of virtual ADME/TOX and multidimensional drug discovery. Molecular Diversity 5: 255-275
Ekins S, Waller CL, Swaan PW, Cruciani G, Wrighton SA, Wikel JH (2000b) Progress in predicting human ADME parameters in silico. Journal of Pharmacological and Toxicological Methods 44: 251-272
Elaut G, Török G, Papeleu P, Vanhaecke T, Laus G, Tourwé D, Rogiers V (2004) Rat Hepatocyte Suspensions as a Suitable In Vitro Model for Studying the Biotransformation of Histone Deacetylase Inhibitors. ATLA 32 Suppl. 1: 105-112
Gan LSL, Thakker DR (1997) Applications of the Caco-2 model in the design and development of orally active drugs: Elucidation of biochemical and physical barriers posed by the intestinal epithelium. Advanced Drug Delivery Reviews 23: 77-98
Garberg P (1998) In vitro models of the blood brain barrier. ATLA 26: 821-847
Hansch C, Mekapati SB, Kurup A, Verma RP (2004) QSAR of cytochrome P450. Drug Metabolism Reviews 36: 105-156

123 Hardman JG, Goodman AG, Limbird LE (eds) (2001) Goodman and Gilman's The Pharmacological Basis of Therapeutics. 10th ed. McGraw-Hill, New York. Heringa MB, Schreurs RHMM, Busser F, Van Der Saag PT, Van Der Burg B, Hermens JLM (2004) Toward more useful in vitro toxicity data with measured free concentrations. Environmental Science and Technology 38: 6263-6270 Hukkanen J, Lassila A, Päivärinta K, Valanne S, Sarpo S, Hakkola J, Pelkonen O, Raunio H (2000) Induction and regulation of xenobioticmetabolising cytochrome P450s in human A549 lung adenocarcinoma cells. American Journal of Respiratory Cellular Molecular Biology 22: 360-366 Kansy M, Senner F, Gubernator K (1998) Physicochemical high throughput screening: parallel artificial membrane permeation assay in the description of passive absorption processes. Journal of Medicinal Chemistry 41: 1007-1010 Krebsfaenger N, Muedter TE, Zanger UM, Eichelbaum MF, Doehmer J (2003) V79 Chinese Hamster Cells genetically engineered for polymorphic cytochrome P450 2D6 and their predictive value for humans. ALTEX 20: 143-154 Langley G (2001) The Way Forward. Action to end animal toxicity testing. Report compiled for the BUAV and ECEAE Lee MY, Park CB, Dordick JS, Clark DS (2005) Metabolizing enzyme toxicology assay chip (MetaChip) for high-throughput microscale toxicity analyses. Proceedings of the National Academy of Sciences USA 102: 983-987 Lewis DFV (1999) Frontier orbitals in chemical and biological activity: Quantitative relationships and mechanistic implications. Drug Metabolism Reviews 31: 755-816 Lewis DFV (2001) COMPACT: a structural approach to the modelling of cytochromes P450 and their interactions with xenobiotics. Journal of Chemical Technology and Biotechnology 76: 237-244 Manga N, Duffy JC, Rowe PH, Cronin MTD (2003) A hierarchical QSAR model for urinary excretion of drugs in humans as a predictive tool for biotransformation. QSAR and Combinatorial Science 22:263-273. 
Mekenyan OG, Dimitrov SD, Pavlov TS, Veith GD (2004) A systematic aroach to simulating metabolism in computational toxicology. I. The TIMES heuristic modelling framework. Current Pharmaceutical Design 10: 1273-1293 Moss GP, Dearden JC, Patel H, Cronin MTD (2002) Quantitative structure- permeability relationships (QSPRs) for percutaneous absorption. Toxicology in Vitro 16: 299-317. Peraza MA, Carter DE, Gandolfi AJ (2003) Toxicity and metabolism of subcytotoxic inorganic arsenic in human renal proximal tubule epithelial cells (HK-2). Cell Biology and Toxicology 19: 253-264 Pogge A, Slikker W Jr. (2004) Neuroimaging: new approaches for neurotoxicology. Neurotoxicology 25: 525-531. Reichl S, Bednarz J, Muller-Goymann CC (2004) Human corneal equivalent as cell culture model for in vitro drug permeation studies. British Journal of Ophthalmology 88 560-565

124 Rettie AE, Jones JP (2005) Clinical and toxicological relevance of CYP2C9: Drug- drug interactions and pharmacogenetics. Annual Review of Pharmacology and Toxicology 45: 477-494 Tahti H, Nevala H, Toimela T (2003) Refining in vitro neurotoxicity testing –The development of blood-brain barrier models. ATLA 31: 273-276 Townsend AJ, Kabler SL, Doehmer J, Morrow CS (2002) Modeling the metabolic competency of glutathione S-transferases using genetically modified cell lines. Toxicology 181-182: 265-269 van de Waterbeemd H (2002) High-throughput and in silico techniques in drug metabolism and pharmacokinetics. Current Opinion in Drug Discovery and Development 5: 33-43 Viravaidya K, Shuler ML (2002) Prediction of naphthalene bioaccumulation using an adipocyte cell line model. Biotechnology Progress 18: 174-181 Winkler DA (2004) Neural networks in ADME and toxicity prediction. Drugs of the Future 29: 1043-1057 Worth AP, Balls M (Eds.) (2002) Alternative (non-animal) methods for chemicals testing: current status and future prospects. a report prepared by ECAVM and the ECVAM Working Group on chemicals. ATLA 30 Supplement 1. Zhao M, Yang M, Baranov E, Wang X, Penman S, Moossa AR and Hoffman RM (2001) Spatial-temporal imaging of bacterial infection and antibiotic response in intact animals. Proceedings of the National Academy of Sciences 98: 9814-9818. Zhao YH, Le J, Abraham MH, Hersey A, Eddershaw PJ, Luscombe CN, Boutina D, Beck G, Sherborne B, Cooper I, Platts JA (2001) Evaluation of human intestinal absorption data and subsequent derivation of a quantitative structure-activity relationship (QSAR) with the Abraham descriptors. Journal of Pharmaceutical Sciences 90: 749-784

2.12 Acute Environmental Toxicity

2.12.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Short-term fish toxicity testing is required in Annex VI of the REACH proposals. It is not necessary if the test substance is highly insoluble (water solubility < 10 µg/l), if it is unlikely to cross biological membranes (molecular weight > 800, molecular diameter > 15 Å), or if a long-term fish toxicity study has been conducted. Depending on the results of the short-term study, a long-term fish toxicity study may be recommended.

Method Outline:

Acute fish toxicity study (REACH C1, OECD TG 203)

The test substance is dissolved in water (low solubility substances may be ultrasonically dispersed or organic solvent used to assist dissolution). Fish are exposed to the test substance for 96 hours and mortalities recorded at 24, 48, 72 and 96 hours. The concentration which kills 50% of the fish (LC50) is determined.

Number of fish used – 7 per dose/control group, approximately 42 per study (limit test uses 14 fish).
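As an illustration of how an LC50 can be read off the 96-hour mortality data described above, the following is a minimal sketch using linear interpolation on log concentration between the two test concentrations that bracket 50% mortality. This is one simple estimation approach assumed here for illustration; it is not stated in this report to be the method prescribed by the guideline, and the concentrations and mortality counts are hypothetical.

```python
import math


def lc50_log_interpolation(concentrations, mortality_fractions):
    """Estimate the LC50 by linear interpolation on log10(concentration).

    concentrations: test concentrations (e.g. mg/l), ascending, all > 0
    mortality_fractions: fraction of fish dead at 96 h at each concentration
    """
    pairs = list(zip(concentrations, mortality_fractions))
    # Walk through adjacent concentration pairs looking for the 50% bracket.
    for (c_lo, m_lo), (c_hi, m_hi) in zip(pairs, pairs[1:]):
        if m_lo <= 0.5 <= m_hi and m_hi > m_lo:
            frac = (0.5 - m_lo) / (m_hi - m_lo)
            log_lc50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_lc50
    raise ValueError("50% mortality is not bracketed by the tested concentrations")


# Hypothetical results: five concentrations, 7 fish per group.
concs = [1.0, 2.2, 4.6, 10.0, 22.0]
dead_at_96h = [0, 1, 3, 6, 7]
lc50 = lc50_log_interpolation(concs, [d / 7 for d in dead_at_96h])
print(f"Estimated 96 h LC50: {lc50:.2f} mg/l")
```

With only seven fish per group the estimate is coarse, which is one reason regression-based methods with confidence limits are generally preferred where the data allow.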

2.12.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

There are many acute environmental toxicity endpoints that may be considered. For ease of evaluation it is perhaps most pertinent to consider only toxicity to algae, Daphnia and fish, i.e. the EU base set of toxicities.

Many modes of toxic action are well established, and for some modes (e.g. narcosis) there is much fundamental knowledge relating to specific mechanisms.

There are some very well established databases for modelling purposes. Many of these databases, along with an assessment of their relative quality, are summarised by Cronin (2005).

Given all the evidence, such as mechanistic insight and available data, the modelling of acute environmental toxicities should be achievable for many classes of chemicals.

2.12.3 In vitro methods

A recent ECETOC workshop on Alternative Testing Approaches in Environmental Risk Assessment (ECETOC, 2004) reviewed the available alternatives to using fish in toxicity and bioaccumulation studies. The in vitro alternatives reviewed include fish cell lines and fish embryo tests.

The fish embryo test (DarT, Danio rerio toxicity test; Nagel, 2002) has been accepted by the German authorities for the testing of effluents. It was proposed at the ECETOC workshop that recommendations should be made to replace fish tests with a fish embryo test in all countries which require fish tests for effluent testing. In theory this is acceptable, but there is a problem in that embryos may be deemed protected living animal material in some countries. There was concern that, even if the embryo test were validated, its use would not be accepted by certain regulators, and that the fish test would still need to be carried out to fulfil their requirements.

Research into the acceptability of embryonic tests as non-animal alternatives needs to be carried out to ascertain the potential regulatory acceptance of this test. Further evaluation of the test also needs to be carried out, including its use for a broader range of chemicals and with different fish species.

The report of ECVAM workshop no. 47 discusses fish cells and cell lines with respect to their use in ecotoxicology testing (Castano et al, 2003). Various fish cell lines (see Table 2.12.2) and primary fish cells, for example gill epithelia (Pärt et al, 1993) and fish liver hepatocytes (Braunbeck and Segner, 2000), can be used to test chemicals and environmental samples. Most show good correlations of toxicity rankings when in vitro EC50s are compared with in vivo LC50s. Cytotoxicity is measured using similar endpoints to those used for acute toxicity to humans, such as cell viability via the MTT assay and neutral red uptake. Exceptions were noted for certain types of chemicals, and further analysis showed that these were generally toxic via specific modes of action which could not be detected in cells; for example, organophosphates inhibit acetylcholinesterase.

Fish cell line   Fish species                    References
FHM              Golden orfe                     Brandao et al, 1992; Dierickx, 1993
GFS              Carp, fathead minnow, guppy     Saito et al, 1991, 1993
GFS              Goldfish, medaka, guppy         Saito et al, 1994
RTG-2            Rainbow trout                   Castano et al, 1996; Bols et al, 1985
RTG-2            Zebra fish                      Lange et al, 1995
BG/F             Rainbow trout                   Babich et al, 1990
BG/F             Platessa                        Babich and Borenfreund, 1990
PLHC-1           Medaka                          Bruschweiler et al, 1995
BF-2             Bluegill                        Babich et al, 1986
R1               Rainbow trout                   Rusche and Kohlpoth, 1993
CHSE-214         Chinook salmon                  Castano et al, 1995

Table 2.12.2. Details of different fish cell lines available for acute ecotoxicology studies (adapted from Castano et al, 2003).

Although there are several fish primary cells and cell lines for ecotoxicology testing, it is frequently noted that they are not as sensitive as in vivo fish studies (Castano et al, 1996; Bols et al, 1985; Segner and Lenz, 1993). This is the main argument against their use, and in the short term it is most likely that a battery of fish cell tests will only be used for screening purposes, rather than as a full replacement for fish acute toxicity studies.
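The two observations above (good agreement in toxicity ranking, but lower absolute sensitivity in vitro) can be illustrated with a short sketch. It computes a Spearman rank correlation between in vitro EC50s and in vivo LC50s, together with the per-chemical EC50/LC50 ratio. All values are hypothetical and chosen only to show the calculation; the simple ranking function assumes there are no tied values.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for untied data."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data for six chemicals (mg/l).
in_vitro_ec50 = [120.0, 35.0, 8.0, 410.0, 2.5, 60.0]
in_vivo_lc50 = [30.0, 12.0, 1.5, 95.0, 0.9, 8.0]

rho = spearman_rho(in_vitro_ec50, in_vivo_lc50)
# A ratio above 1 means the cell assay needed a higher concentration,
# i.e. it was less sensitive than the fish test for that chemical.
ratios = [ec / lc for ec, lc in zip(in_vitro_ec50, in_vivo_lc50)]
print(f"Spearman rho = {rho:.3f}")
print("EC50/LC50 ratios:", [round(r, 1) for r in ratios])
```

A high rank correlation combined with ratios consistently above 1 is the pattern that would support using cell assays for screening and prioritisation rather than as a direct replacement for the LC50.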


2.12.4 In silico approaches

There are a large number of reviews in this area including the works of Schultz et al (2003a, 2003b). Cronin (2003) also gives an account of the mechanistic basis of acute environmental toxicity prediction. Other pertinent references to this topic are given in the bibliography. An assessment of QSARs to predict acute environmental effects is given in OECD (2004). A very complete review of QSARs for the prediction of acute toxicity in general (and including environmental effects) is given by Lessigiarska et al (2005).

QSARs

Due to the quantity, and in some cases quality, of the toxicological data, many QSARs have been developed for acute environmental toxicity. These are often split between those developed on a mechanistic basis and those derived from multivariate statistical methods. Both approaches have some merit, and their relative advantages are described in Cronin (2003) and Cronin et al (2002). QSARs have been developed for large numbers of toxicity data for various endpoints, most notably using the fathead minnow database and also Tetrahymena pyriformis toxicity data. Attempts have also been made to model data from regulatory sources, although the results were relatively poor, suggesting that these data are of poor quality (Lessigiarska et al 2004b).
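To make the idea of a simple mechanistically based QSAR concrete, the sketch below fits a one-descriptor linear model of the form log(1/LC50) = a × log Kow + b by ordinary least squares, the form typically used for narcosis-type toxicity. The data points are hypothetical and the fitted coefficients have no regulatory meaning; the function names are illustrative only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept; returns r2 too."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical training set: log Kow and log(1/LC50) with LC50 in mmol/l.
log_kow = [0.5, 1.0, 2.0, 3.0, 4.0]
log_inv_lc50 = [-0.8, -0.4, 0.5, 1.3, 2.2]

slope, intercept, r2 = fit_line(log_kow, log_inv_lc50)


def predict_log_inv_lc50(kow_log):
    """Predict log(1/LC50) for a chemical assumed to act by the same mechanism."""
    return slope * kow_log + intercept


print(f"log(1/LC50) = {slope:.2f} log Kow + {intercept:.2f} (r2 = {r2:.3f})")
print(f"Prediction at log Kow 2.5: {predict_log_inv_lc50(2.5):.2f}")
```

Such a model should only be applied to chemicals within the descriptor range and mechanistic class of the training set, which is why mode-of-action assignment precedes prediction in most schemes.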

Expert Systems

With the number and high quality of toxicity data available for fish such as the guppy and fathead minnow, it is not surprising that a variety of expert systems have been developed to model fish toxicity.

The ECOSAR program has been developed by the US EPA as a means of providing information for the Pre-Manufacture Notices required by US legislation. It is freely downloadable through EPIWIN. It contains separate hydrophobicity-based QSARs for individual classes of chemicals, covering acute toxicity to many species. Similar to this is the PBT Profiler, which provides estimates of toxicity through a similar process.

A number of commercial packages are also available to predict toxicity, again based on the fathead minnow and guppy databases. These include TOPKAT, MultiCASE, OASIS and TerraQSAR. An excellent comparison of their performance was made by Moore et al (2003).

2.12.5 Integrated testing strategies for specific endpoints

No testing strategies were found for this endpoint.

2.12.6 Reduction and Refinement opportunities

A reduction in the number of fish used in acute toxicity tests has recently been proposed by Hutchinson et al (2003). The objective of the ‘acute threshold (step down) test’ is to apply comparative threshold data from algae and Daphnia acute tests to estimating the LC50, thereby reducing the number of treatment groups required. It is estimated that the number of fish per study can be reduced from 42 to 10. Although this test has been developed for pharmaceuticals, its application to chemical testing should be straightforward, although it would require validation.

2.12.7 Bibliography

Babich H, Goldstein SH, Borenfreund E (1990) In vitro cyto- and genotoxicity of organomercurials to cells in culture. Toxicology Letters 50: 143-149.
Babich H, Borenfreund E (1990) In vitro cytotoxicities of inorganic lead and di- and trialkyl lead compounds to fish cells. Bulletin of Environmental Contamination and Toxicology 44: 456-460.
Babich H, Puerner JA, Borenfreund E (1986) In vitro cytotoxicity of metals to bluegill (BF-2) cells. Archives of Environmental Contamination and Toxicology 15: 31-37.
Bols NC, Boliska SA, Dixon DG, Hodson PV, Kaiser KLE (1985) The use of fish cell cultures as an indication of contaminant toxicity to fish. Aquatic Toxicology 6: 147-155.
Brandao JC, Bohets HHL, van de Vyver IE, Dierickx PJ (1992) Correlation between the in vitro cytotoxicity to cultured fathead minnow fish cells and fish lethality data for 50 chemicals. Chemosphere 25: 553-562.
Braunbeck T, Segner H (2000) Isolation and cultivation of teleost hepatocytes. In MN Berry, AM Edwards (eds) The Hepatocyte Review. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp 49-72.
Brüschweiler BJ, Würgler FE, Fent K (1995) Cytotoxicity in vitro of organotin compounds to fish hepatoma cells PLHC-1 (Poeciliopsis lucida). Aquatic Toxicology 32: 143-160.
Carlsen L, Walker JD (2003) QSARs for prioritizing PBT substances to promote pollution prevention. QSAR and Combinatorial Science 22: 49-57.
Castaño A, Bols N, Braunbeck T, Dierickx P, Halder M, Isomaa B, Kawahara K, Lee LEJ, Mothersill C, Pärt P, Repetto G, Riego Sintes J, Rufli H, Smith R, Wood C, Segner H (2003) The use of fish cells in ecotoxicology. The report and recommendations of ECVAM Workshop 47. ATLA 31: 317-351.
Castaño A, Cantarino MJ, Castillo P, Tarazona JV (1996) Correlations between the RTG-2 cytotoxicity test and in vivo LC50 rainbow trout bioassay. Chemosphere 32: 2141-2157.

Castaño A, Vega MM, Tarazona JV (1995) Acute toxicity of selected metals and phenols on RTG-2 and CHSE-214 fish cell lines. Bulletin of Environmental Contamination and Toxicology 55: 222-229.
Cronin MTD (2003) Quantitative structure-activity relationships for acute aquatic toxicity: the role of mechanism of toxic action in successful modeling. In R Benigni (ed) Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and Carcinogens. CRC Press, Boca Raton FL, USA, pp 235-258.
Cronin MTD, Aptula AO, Duffy JC, Netzeva TI, Rowe PH, Valkova IV, Schultz TW (2002) Comparative assessment of methods to develop QSARs for the prediction of the toxicity of phenols to Tetrahymena pyriformis. Chemosphere 49: 1201-1221.
Davies J, Ward RS, Hodges G, Roberts DW (2004) Quantitative structure-activity relationship modeling of acute toxicity of quaternary alkylammonium sulfobetaines to Daphnia magna. Environmental Toxicology and Chemistry 23: 2111-2115.
Di Marzio W, Saenz ME (2004) Quantitative structure-activity relationship for aromatic hydrocarbons on freshwater fish. Ecotoxicology and Environmental Safety 59: 256-262.
Dierickx PJ (1993) Comparison between fish lethality data and the in vitro cytotoxicity of lipophilic solvents to cultured fish cells in a two-compartment model. Chemosphere 27: 1511-1518.
Dimitrov S, Koleva Y, Lewis M, Breton R, Veith G, Mekenyan O (2003) Modeling mode of action of industrial chemicals: Application using chemicals on Canada's Domestic Substances List (DSL). QSAR and Combinatorial Science 22: 5-17.
Dimitrov S, Koleva Y, Schultz TW, Walker JD, Mekenyan O (2004) Interspecies quantitative structure-activity relationship model for aldehydes: Aquatic toxicity. Environmental Toxicology and Chemistry 23: 463-470.
ECETOC (2004) Workshop on Alternative Testing Approaches in Environmental Risk Assessment. Workshop Report 5. European Centre for Ecotoxicology and Toxicology of Chemicals, Brussels.
Escuder-Gilabert L, Martin-Biosca Y, Sagrado S, Villanueva-Camanas RM, Medina-Hernandez MJ (2001) Biopartitioning micellar chromatography to predict ecotoxicity. Analytica Chimica Acta 448: 173-185.
Espinosa G, Arenas A, Giralt F (2002) An integrated SOM-fuzzy ARTMAP neural system for the evaluation of toxicity. Journal of Chemical Information and Computer Sciences 42: 343-359.
Gini G, Craciun MV, Konig C, Benfenati E (2004) Combining unsupervised and supervised artificial neural networks to predict aquatic toxicity. Journal of Chemical Information and Computer Sciences 44: 1897-1902.
Huang H, Wang XD, Ou WH, Zhao JS, Shao Y, Wang LS (2003) Acute toxicity of benzene derivatives to the tadpoles (Rana japonica) and QSAR analyses. Chemosphere 53: 963-970.
Hutchinson TH, Barrett S, Buzby M, Constable D, Hartmann A, Hayes E, Huggett D, Laenge R, Lillicrap AD, Straub JO, Thompson RS (2003) A strategy to reduce the numbers of fish used in acute ecotoxicity testing of pharmaceuticals. Environmental Toxicology and Chemistry 22: 3031-3036.
Lange M, Gebauer W, Markl J, Nagel R (1995) Comparison of testing acute toxicity on embryo of zebra fish, Brachydanio rerio and RTG-2 cytotoxicity as possible alternatives to the acute fish test. Chemosphere 30: 2087-2102.
Lessigiarska I, Cronin MTD, Worth AP, Dearden JC, Netzeva TI (2004a) QSARs for toxicity to the bacterium Sinorhizobium meliloti. SAR and QSAR in Environmental Research 15: 169-190.
Lessigiarska I, Worth AP, Netzeva TI (2005) Comparative Review of QSARs for Acute Toxicity. European Commission Report EUR 21559 EN, Ispra, Italy (contact Dr Andrew Worth for more information [email protected]).
Lessigiarska I, Worth AP, Sokull-Kluttgen B, Jeram S, Dearden JC, Netzeva TI, Cronin MTD (2004b) QSAR investigation of a large data set for fish, algae and Daphnia toxicity. SAR and QSAR in Environmental Research 15: 413-431.
Licht O, Weyers A, Nagel R (2004) Ecotoxicological characterisation and classification of existing chemicals - Examples from the ICCA HPV initiative and comparison with other existing chemicals. Environmental Science and Pollution Research 11: 291-296.
Maeder V, Escher BI, Scheringer M, Hungerbuhler K (2004) Toxic ratio as an indicator of the intrinsic toxicity in the assessment of persistent, bioaccumulative, and toxic chemicals. Environmental Science and Technology 38: 3659-3666.
Moore DRJ, Breton RL, MacDonald DB (2003) A comparison of model performance for six quantitative structure-activity relationship packages that predict acute toxicity to fish. Environmental Toxicology and Chemistry 22: 1799-1809.
Nagel R (2002) DarT: The embryo test with the zebrafish Danio rerio - a general model in ecotoxicology and toxicology. ALTEX 19 Suppl. 1: 38-48.
Niculescu SP, Atkinson A, Hammond G, Lewis M (2004) Using fragment chemistry data mining and probabilistic neural networks in screening chemicals for acute toxicity to the fathead minnow. SAR and QSAR in Environmental Research 15: 293-309.
Niederlehner BR, Cairns J, Smith EP (1998) Modeling acute and chronic toxicity of nonpolar narcotic chemicals and mixtures to Ceriodaphnia dubia. Ecotoxicology and Environmental Safety 39: 136-146.
Pärt P, Norrgren L, Bergström E, Sjöberg P (1993) Primary cultures of epithelial cells from rainbow trout gills. Journal of Experimental Biology 175: 219-232.
Roberts DW, Costello J (2003) QSAR and mechanism of action for aquatic toxicity of cationic surfactants. QSAR and Combinatorial Science 22: 220-225.
Roberts DW, Costello JF (2003) Mechanisms of action for general and polar narcosis: A difference in dimension. QSAR and Combinatorial Science 22: 226-233.
Rusche B, Kohlpoth M (1993) The R1 cytotoxicity test as replacement for the fish tests stipulated in the German Waste Water Act. In T Braunbeck, W Hanke, H Segner (eds) Fish Ecotoxicology and Ecophysiology. VCH-Wiley, Weinheim, Germany, pp 81-92.

Saito H, Iwami S, Shigeoka T (1991) In vitro cytotoxicity of 45 pesticides to goldfish GF-Scale (GFS) cells. Chemosphere 23: 525-537.
Saito H, Koyasu J, Yoshida K (1993) Cytotoxicity of 109 chemicals to goldfish GFS cells and relationships with 1-octanol/water partition coefficients. Chemosphere 26: 1015-1028.
Saito H, Koyasu J, Shigeoka T, Tomita I (1994) Cytotoxicity of chlorophenols to goldfish GFS cells with the MTT and LDH assays. Toxicology in Vitro 8: 1107-1112.
Schultz TW, Cronin MTD, Netzeva TI (2003a) The present status of QSAR in toxicology. Journal of Molecular Structure (Theochem) 622: 23-38.
Schultz TW, Cronin MTD, Netzeva TI, Aptula AO (2002) Structure-toxicity relationships for aliphatic chemicals evaluated with Tetrahymena pyriformis. Chemical Research in Toxicology 15: 1602-1609.
Schultz TW, Cronin MTD, Walker JD, Aptula AO (2003b) Quantitative structure-activity relationships (QSARs) in toxicology: a historical perspective. Journal of Molecular Structure (Theochem) 622: 1-22.
Segner H, Lenz D (1993) Cytotoxicity assays with the rainbow trout R1 cell line. Toxicology in Vitro 7: 537-540.
Tao S, Xi XH, Xu FL, Dawson R (2002) A QSAR model for predicting toxicity (LC50) to rainbow trout. Water Research 36: 2926-2930.

2.13 Chronic Environmental Toxicity

2.13.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

The long-term fish toxicity study is required in Annex VII of the REACH proposals. There are three methods for determining this endpoint: the fish early life stage (FELS) toxicity test; the fish short-term toxicity test on embryo and sac-fry stages; and the fish juvenile growth test. The FELS toxicity test is recommended when the test substance has the potential to bioaccumulate. Tests are not necessary if the test substance is unlikely to cross biological membranes (molecular weight > 800, molecular diameter > 15 Å) or if exposure of the aquatic compartment is unlikely.

Method outline:

Fish early life stages toxicity test (OECD TG 210)

This test involves exposing fertilised fish eggs to an aqueous solution of the test substance. The test is continued until all control fish are free-feeding. Lethal and sub-lethal effects are assessed and compared with control values to determine lowest and no observed effect levels.

Number of fish eggs used – 60 per dose/control group, approximately 360 per study.

Fish short-term toxicity test on embryo and sac-fry stages (REACH C15, OECD TG 212)

This test involves exposing fertilised fish eggs to an aqueous solution of the test substance. The test is continued until just before the yolk-sac of any larvae has been completely absorbed or before mortalities by starvation occur in the controls. Lethal and sub-lethal effects are assessed and compared with control values to determine lowest and no observed effect levels.

Number of fish eggs used – 30 per dose/control group, approximately 180 per study.

Fish juvenile growth test (REACH C14, OECD TG 215)

This test involves exposing juvenile fish in the exponential growth phase to an aqueous solution of the test substance. Fish are fed daily for 28 days then weight changes since the start of the experiment are calculated and the effects on growth rate determined to estimate the concentration that would cause an x% variation in growth rate, i.e. ECx (e.g. EC10, EC20 or EC30). Alternatively a lowest or no observed effect level can be calculated as in the other tests.

Number of fish used – minimum 16 per dose/control group, approximately 96 per study.
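The growth-rate calculation in the juvenile growth test can be sketched as follows. The example assumes the common definition of specific growth rate as the change in the natural logarithm of weight per day (expressed as a percentage), and expresses the effect of a treatment as percentage inhibition relative to the control. The weights are hypothetical, and a full ECx estimate would additionally require fitting a concentration-response curve across all treatment groups.

```python
import math


def specific_growth_rate(weight_start, weight_end, days):
    """Specific growth rate in % per day (assumed definition: 100 * d(ln w)/dt)."""
    return 100.0 * (math.log(weight_end) - math.log(weight_start)) / days


def percent_inhibition(rate_control, rate_treated):
    """Reduction in growth rate relative to the control group, in %."""
    return 100.0 * (rate_control - rate_treated) / rate_control


# Hypothetical mean weights (g) at day 0 and day 28.
rate_control = specific_growth_rate(1.00, 2.00, 28)
rate_treated = specific_growth_rate(1.00, 1.60, 28)
inhibition = percent_inhibition(rate_control, rate_treated)

print(f"Control growth rate: {rate_control:.2f} %/day")
print(f"Treated growth rate: {rate_treated:.2f} %/day")
print(f"Growth rate inhibition: {inhibition:.1f} %")
```

In this hypothetical case the treatment concentration would lie slightly above the EC30, since the growth rate is reduced by a little more than 30%.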

2.13.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

Compared to acute toxicity, relatively little is known about the mechanisms of chronic environmental toxicity. It is assumed in terms of lethality that the mechanisms are similar, i.e. a narcotic will still act as a narcotic and more specific toxicants will cause an effect in the same manner. However, with regard to potency, issues such as degradation (both biotic and abiotic) and bioaccumulation of hydrophobic chemicals become more important.

Compared with acute environmental effects, far fewer data are available for chronic effects. This severely limits the possibilities for replacement in this area.

2.13.3 In vitro methods

Although there are many potential cell systems available for acute fish toxicity, longer-term effects are more difficult to establish using cells. They are, however, perhaps more important in terms of toxicity testing, as long-term exposure to low levels of chemicals is more common than acute exposure. One promising approach has been noted for testing prolonged exposure in fish cells; this is the method of Mothersill et al (1995), which uses fish fins or skin explants. Techniques are described which can maintain primary cell cultures for approximately 8 weeks so that longer-term toxicity studies can be conducted. In vivo exposure experiments performed in parallel with in vitro exposures, at similar doses, have confirmed that toxicity in the explant cultures arises in the same dose range as in vivo.

Endocrine disruption has become an important area for environmental toxicity testing as many pollutants have been found to affect the endocrine systems of numerous species of wildlife. Studies have shown that various industrial chemicals which are discharged into rivers can affect the endocrine functions of fish (Larsson et al, 2000; Parks et al, 2001) and that municipal effluent can even feminise downstream male fish because of the high levels of steroidal oestrogens from the contraceptive pill (Larsson et al, 1999; Purdom et al, 1994; Jobling et al, 1998). Masculinisation of aquatic species has also been recorded (Gimeno et al, 1998).

As yet, there are no fully validated test guidelines for environmental endocrine disruptor testing, although the Fish Screening Assay and the Amphibian Metamorphosis Assay have both been the subject of recent reviews by the OECD (OECD 2004a; OECD 2004b) and are both under validation at the US EPA (www.epa.gov). Other screening studies have also been developed using small fish such as the fathead minnow, medaka and zebrafish (Ankley and Johnson, 2004) or the rainbow trout (Thorpe et al, 2000).

Several in vivo biomarker endpoints have been developed for detecting oestrogenic activity of xenobiotics or water samples, including plasma steroid concentrations (Kramer et al, 1998), binding to and activation of the oestrogen receptor (Yadetie et al, 1999) and the induction of vitellogenin production in male and female fish (Christiansen et al, 2000; Tyler et al, 1999; Sumpter and Jobling 1995). The induction of vitellogenin production (the major egg yolk precursor protein) is the most used of these biomarkers for the development of endocrine disruptor screening assays. It is oestrogen dependent and can be assessed in vitro using either primary fish hepatocytes (Jobling and Sumpter, 1993; Pelissero et al, 1993; Smeets et al, 1999) or permanent fish cell lines with hepatoma cells (Gagné and Blaise, 2000; Maitre et al, 1986).

Other in vitro assays are based on the same principles as those for human endocrine disruption, such as competitive ligand binding, cell proliferation and oestrogen receptor transcription assays (Zacharewski, 1997). Fent (2001) has developed a reporter gene system using the rainbow trout cell line RTG-2. It is based on the transfection of a reporter gene plasmid consisting of an oestrogen responsive element fused to a firefly luciferase gene; luciferase is expressed after an oestrogenic agonist binds to the oestrogen receptor and activates transcription. The recombinant yeast assay (RTS; Garcia Reyero et al, 2001) is one of the most convenient screens of environmental samples. It detects both natural oestrogens and xenoestrogens, with the added advantage that the yeasts used are easy to manipulate and grow, therefore providing the opportunity to screen large numbers of test substances quickly (Ballatori and Villalobos, 2002).

Thus far, mainly cell lines from the rainbow trout have been used for in vitro studies to detect endocrine disruptors in the environment. However, it has been suggested that, in addition to this species, carp, medaka, stickleback and especially zebrafish should also be considered for use in endocrine disruptor testing, on the grounds that they occur in different regions of the world. With respect to Europe, it is probably most appropriate to focus on the use of rainbow trout and stickleback. It is suggested that cell lines from all these species should be developed and investigated for use in screening, so as to be able to compare the in vitro and in vivo responses in cells of the same species. Indeed, carp primary hepatocytes have also been used to detect chemicals that bind to the oestrogen receptor and induce vitellogenin (Sanderson et al., 2001).

The need for using cultures of fish cells, in contrast to mammalian cell cultures, is a controversial issue. It is possible to generate mammalian cell lines with nuclear receptors linked to fish specific gene expression systems, such as vitellogenin synthesis (see for example Mao and Shapiro, 2000). Further studies are required to assess the importance of tests with fish cell lines for screening endocrine disruptors.

2.13.4 In silico approaches

There are few good reviews in the area of chronic environmental toxicity prediction, probably reflecting the lack of models available.

QSARs

Very few QSARs for chronic toxicity have been reported; examples include Dyer et al (2000) and Versteeg et al (1997).

Expert Systems

The only expert systems that appear to have predictive capacity for chronic environmental effects are ECOSAR and the PBT Profiler, as described in Section 2.12.4.

QSARs for Endocrine Disruption

A number of QSAR models have been developed for ligand binding to the oestrogen receptor (ER) (Tong et al 1997a, b; Waller et al 1996 and many others). Unfortunately, most of these QSAR models were developed from datasets available in the literature; these datasets were too small and/or lacked structural diversity. Although these models yield good statistical results in the training and cross-validation steps and explain some structural determinants for ER binding, they have limited applicability in predicting the ER-ligand binding affinity of chemicals that cover a wide range of structural diversity.

In order to obtain an adequate training set to develop a more robust QSAR model, Tong et al (2004) developed a rat ER binding assay. For many years the ER competitive binding assay was considered the gold standard. However, many variants have been developed, leading to some significant differences in results. The assay of Tong et al (2004) was rigorously validated, and so provides high quality data for model development. Each experimental value is replicated at least twice. Tong et al (2004) assayed 232 chemicals to obtain a training set for model development. The ER binding activity is represented by the Relative Binding Affinity (RBA), where the RBA value for the endogenous ER ligand, 17β-estradiol (E2), was set to 100. This NCTR dataset contains chemicals that were selected to cover the structural diversity of chemicals that bind to ER, with an activity distribution ranging over six orders of magnitude; both properties are important for the prediction of structurally diverse oestrogens. This NCTR dataset has been used extensively to build and validate a series of QSAR models for ER binding as shown in Figures 2.13.4a and 2.13.4b below.
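For readers unfamiliar with the RBA scale, the short sketch below shows how such a value is commonly derived from competitive binding data. It assumes the usual convention that RBA is the ratio of the IC50 of 17β-estradiol to the IC50 of the test chemical, multiplied by 100; this formula is an assumption made here for illustration and is not quoted from Tong et al (2004). The IC50 values are hypothetical.

```python
import math


def relative_binding_affinity(ic50_e2, ic50_test):
    """RBA on a scale where 17beta-estradiol (E2) itself scores 100.

    Assumed convention: RBA = 100 * IC50(E2) / IC50(test chemical),
    with both IC50 values in the same units.
    """
    return 100.0 * ic50_e2 / ic50_test


def log_rba(ic50_e2, ic50_test):
    """log10 of the RBA, the form usually modelled in QSAR studies."""
    return math.log10(relative_binding_affinity(ic50_e2, ic50_test))


# Hypothetical IC50 values in mol/l.
ic50_e2 = 1.0e-9
ic50_weak_binder = 1.0e-6

print(f"RBA of E2: {relative_binding_affinity(ic50_e2, ic50_e2):.0f}")
print(f"RBA of weak binder: {relative_binding_affinity(ic50_e2, ic50_weak_binder):.2f}")
print(f"log RBA of weak binder: {log_rba(ic50_e2, ic50_weak_binder):.1f}")
```

On this scale, a dataset spanning six orders of magnitude corresponds to a range of about six log RBA units.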


[Figure 2.13.4a: diagram of four hierarchical phases. Phase I, Filtering: molecular weight < 94 or > 1000; no ring structure. Phase II, Active/Inactive Assignment: 3 structural alerts, 7 pharmacophore queries, 1 decision tree. Phase III, Quantitative Activity Prediction: CoMFA model. Phase IV, Priority Setting: incorporate human knowledge (develop a set of rules, re-evaluate some chemicals, consider other priority setting factors). Side labels: Compound Priority and Computational Intensity, on a scale from LOW to HIGH.]

Figure 2.13.4a. Overview diagram of the National Center for Toxicological Research (NCTR) “Four-Phase” approach for priority setting. In Phase I, chemicals with molecular weight <94 or >1000, or containing no ring structure, are rejected. In Phase II, three approaches (structural alerts, pharmacophores, and classification methods) that include a total of 11 models are used to make a qualitative activity prediction. In Phase III, a 3D QSAR/CoMFA model is used to make a more accurate quantitative activity prediction. In Phase IV, an expert system is expected to make a decision on priority setting based on a set of rules. The phases are hierarchical; the methods within each phase are complementary (taken from Tong et al 2004).
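The Phase I rejection filter described in the caption can be sketched in a few lines of Python. This is a minimal illustration only: the `Chemical` record and its field names are hypothetical, and the sketch assumes that molecular weight and ring count have already been computed elsewhere. It is not the NCTR implementation.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    """Minimal record for the sketch; the field names are hypothetical."""
    name: str
    molecular_weight: float  # g/mol
    ring_count: int          # number of rings in the structure


def passes_phase_one(chem: Chemical) -> bool:
    """Return True if the chemical survives the Phase I rejection filters.

    A chemical is rejected if its molecular weight is < 94 or > 1000,
    or if it contains no ring structure (thresholds from the caption).
    """
    if chem.molecular_weight < 94 or chem.molecular_weight > 1000:
        return False
    if chem.ring_count == 0:
        return False
    return True


def phase_one(chemicals):
    """Split a list of chemicals into (retained, rejected)."""
    retained = [c for c in chemicals if passes_phase_one(c)]
    rejected = [c for c in chemicals if not passes_phase_one(c)]
    return retained, rejected
```

Only the retained chemicals would be passed on to the more computationally intensive Phase II and Phase III models, which is the point of placing the cheapest filters first.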


[Figure 2.13.4b: decision flowchart applied to chemicals. Decision nodes (YES/NO): Ring; Aromatic Ring; Containing O, S, N; Phenolic ring. Outcomes: Unlikely to be ER ligand; Likely to be ER ligand. The binding potential is determined by these structural features: H-bond ability; precise O-O distance; rigid structure; steric moieties that mimic the 7α- and 11β-positions; satisfactory hydrophobicity (log Kow). Inset structure: a phenolic ring joined to a second ring through a bridge Xn, where X = C or other atoms and n = 1-3.]

Figure 2.13.4b. Flowchart for the identification of ER ligands using a set of "IF-THEN" rules: a) IF a chemical contains no ring structure THEN it is unlikely to be an ER ligand; b) IF a chemical has a non-aromatic ring structure THEN it is unlikely to be an ER ligand if it does not contain O, S, N or other heteroatoms for H-bonding; otherwise, its binding potential is dependent on the occurrence of the key structural features; c) IF a chemical has a non-OH aromatic structure THEN its binding potential is dependent on the occurrence of the key structural features; and d) IF a chemical contains a phenolic ring THEN it tends to be an ER ligand if it contains any additional key structural features. A chemical containing a phenolic ring separated from another benzene ring by a bridge of none to three atoms will most likely be an ER ligand (taken from Tong et al 2004).
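The IF-THEN rules in the caption can be expressed as a short decision function. The sketch below is an illustration only: the function name, argument names and output labels are hypothetical, the structural flags are assumed to have been determined beforehand, and the outcome for a phenolic chemical with no additional key features is not stated in the caption and is therefore returned as "undetermined".

```python
def classify_er_ligand(has_ring, has_aromatic_ring, has_heteroatom,
                       has_phenolic_ring, has_key_features,
                       bridge_atoms_to_benzene=None):
    """Apply the IF-THEN rules of Figure 2.13.4b to pre-computed flags.

    bridge_atoms_to_benzene is the number of bridge atoms between a
    phenolic ring and a second benzene ring, or None if there is no
    such second ring.
    """
    # Rule a: no ring structure.
    if not has_ring:
        return "unlikely"
    # Rule b: non-aromatic ring; needs O, S, N or another heteroatom.
    if not has_aromatic_ring:
        if not has_heteroatom:
            return "unlikely"
        return "depends on key features"
    # Rule c: aromatic but not phenolic.
    if not has_phenolic_ring:
        return "depends on key features"
    # Rule d: phenolic ring bridged to a benzene ring by 0 to 3 atoms.
    if bridge_atoms_to_benzene is not None and 0 <= bridge_atoms_to_benzene <= 3:
        return "most likely"
    # Rule d: phenolic ring with any additional key structural feature.
    if has_key_features:
        return "likely"
    # Assumption: the caption does not cover this case.
    return "undetermined"
```

A chemical labelled "depends on key features" would need the listed features (H-bond ability, O-O distance, rigidity, steric moieties, hydrophobicity) to be assessed separately; the sketch does not attempt this.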


2.13.5 Integrated testing strategies for specific endpoints

No testing strategies were found for this endpoint.

2.13.6 Reduction and Refinement opportunities

Acute and chronic environmental toxicity testing in fish is not necessary if there is evidence to show that the test substance has the potential to bioaccumulate (ECETOC, 2004). Where chronic testing is required, the dose selection and route of exposure should be representative of the most relevant exposure scenarios. Species selection should be guided by the procedures envisaged, particularly if long-term effects are to be monitored. Where possible, sighting studies or dose information from acute toxicity studies should be used to calculate dose ranges. Long-term studies should only be considered where there is sufficient evidence to suggest that a test substance has effects on chronic exposure which are not seen during acute trials and where there is a perceivable risk of long-term environmental pollution.

2.13.7 Bibliography

Ankley GT, Johnson RD (2004) Small fish models for identifying and assessing the effects of endocrine-disrupting chemicals. ILAR Journal 45: 469-483
Ballatori N, Villalobos AR (2002) Defining the molecular and cellular basis of toxicity using comparative models. Toxicology and Applied Pharmacology 183: 207-220
Christiansen LB, Pedersen KL, Pedersen SN, Korsgaard B, Bjerregaard P (2000) In vivo comparison of xenoestrogens using rainbow trout vitellogenin induction as a screening system. Environmental Toxicology and Chemistry 19: 1867-1874
Dyer SD, Stanton DT, Lauth JR, Cherry DS (2000) Structure-activity relationships for acute and chronic toxicity of alcohol ether sulfates. Environmental Toxicology and Chemistry 19: 608-616
ECETOC (2004) Workshop on Alternative Testing Approaches in Environmental Risk Assessment. Workshop report 5. European Centre for Ecotoxicology and Toxicology of Chemicals, Brussels
Fent K (2001) Fish cell lines as versatile tools in ecotoxicology: assessment of cytotoxicity, cytochrome P4501A induction potential and estrogenic activity of chemicals and environmental samples. Toxicology in Vitro 15: 477-488
Gagné F, Blaise C (2000) Evaluation of environmental with a fish cell line. Bulletin of Environmental Contamination and Toxicology 65: 494-500
Garcia-Reyero N, Grau E, Castillo M, Lopez de Alda M, Barcelo D, Pina B (2001) Monitoring of endocrine disruptors in surface waters by the yeast recombinant assay. Environmental Toxicology and Chemistry 20: 1152-1158
Gimeno S, Komen H, Jobling S, Sumpter J, Bowmer T (1998) Demasculinisation of sexually mature male common carp, Cyprinus carpio, exposed to 4-tert-pentylphenol during spermatogenesis. Aquatic Toxicology 43: 93-109
Jobling S, Nolan M, Tyler CR, Brighty G, Sumpter JP (1998) Widespread sexual disruption in wild fish. Environmental Science and Technology 32: 2498-2506
Jobling S, Sumpter JP (1993) Detergent components in sewage effluent are weakly oestrogenic to fish: An in vitro study using rainbow trout (Oncorhynchus mykiss) hepatocytes. Aquatic Toxicology 27: 361-372
Kramer VJ, Miles-Richardson S, Pierens SL, Giesy JP (1998) Reproductive impairment and induction of alkaline-labile phosphate, a biomarker of estrogen exposure, in fathead minnows (Pimephales promelas) exposed to waterborne 17β-estradiol. Aquatic Toxicology 40: 335-360
Larsson DGJ, Adolfsson ER, Parkkonen J, Pettersson M, Berg AH, Olsson PE, Forlin L (1999) Ethinyloestradiol – An undesired fish contraceptive? Aquatic Toxicology 45: 91-97
Larsson DGJ, Hallman H, Forlin L (2000) More male fish embryos near a pulp mill. Environmental Toxicology and Chemistry 19: 2911-2917
Maitre JL, Valotaire Y, Guguen-Guillouzo C (1986) Estradiol-17β stimulation of vitellogenin synthesis in primary culture of male rainbow trout hepatocytes. In Vitro Cellular and Developmental Biology 22: 337-343
Mao CJ, Shapiro DJ (2000) A histone deacetylase inhibitor potentiates estrogen receptor activation of a stably integrated vitellogenin promoter in HepG2 cells. Endocrinology 141: 2361-2369
Mothersill C, Lyng F, Lyons M, Cottell D (1995) Growth and differentiation of epidermal cells of the rainbow trout established as explants and maintained in various media. Journal of Fish Biology 46: 1011-1025
OECD (2004a) Detailed Review Paper on Fish Screening Assays for the Detection of Endocrine Active Substances. OECD Series on Testing and Assessment, Number 47
OECD (2004b) Detailed Review Paper on Amphibian Metamorphosis Assay for the Detection of Thyroid Active Substances. OECD Series on Testing and Assessment, Number 46
Parks LG, Lambright CS, Orlando EF, Guillette LJ, Ankley GT, Gray LE (2001) Masculinization of female mosquitofish in kraft mill effluent contaminated Fenholloway River water is associated with androgen receptor agonist activity. Toxicological Sciences 62: 257-267
Pelissero C, Flouriot G, Foucher JL, Bennetau B, Dunogues J, Le Gac F, Sumpter JP (1993) Vitellogenin synthesis in cultured hepatocyte – an in vitro test for the estrogenic potency of chemicals. Journal of Steroid Biochemistry and Molecular Biology 44: 263-272
Purdom CE, Hardiman PA, Bye VJ, Eno NC, Tyler CR, Sumpter JP (1994) Estrogenic effects of effluents from sewage treatment works. Chemistry and Ecology 8: 275-285
Sanderson JT, Letcher RJ, Heneweer M, Giesy JP, van den Berg M (2001) Effects of chloro-s-triazine herbicides and metabolites on aromatase activity in various human cell lines and on vitellogenin production in male carp hepatocytes. Environmental Health Perspectives 109: 1027-1031
Smeets JMW, Rouhani Rankouhi T, Nichols KM, Komen H, Kaminski NE, Giesy JP, van den Berg M (1999) In vitro vitellogenin production by carp (Cyprinus carpio) hepatocytes as a screening method for determining (anti)estrogenic activity of xenobiotics. Toxicology and Applied Pharmacology 157: 68-76
Sumpter JP, Jobling S (1995) Vitellogenesis as a biomarker for estrogenic contamination of the aquatic environment. Environmental Health Perspectives 103: 173-178
Thorpe KL, Hutchinson TH, Hetheridge MJ, Sumpter JP, Tyler CR (2000) Development of an in vivo screening assay for estrogenic chemicals using juvenile rainbow trout (Oncorhynchus mykiss). Environmental Toxicology and Chemistry 19: 2812-2820
Tong W, Fang H, Hong H, Xie Q, Perkins R, Sheehan DM (2004) Receptor-mediated toxicity: QSARs for estrogen receptor binding and priority setting of potential estrogenic endocrine disruptors. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 285-314
Tong WD, Perkins R, Strelitz R, Collantes ER, Keenan S, Welsh WJ, Branham WS, Sheehan DM (1997a) Quantitative structure-activity relationships (QSARs) for estrogen binding to the estrogen receptor: Predictions across species. Environmental Health Perspectives 105: 1116-1124
Tong WD, Perkins R, Xing L, Welsh WJ, Sheehan DM (1997b) QSAR models for binding of estrogenic compounds to and beta subtypes. Endocrinology 138: 4022-4025
Tyler CR, van Aerle R, Hutchinson TH, Maddix S, Trip H (1999) An in vivo testing system for endocrine disruptors in fish early life stages using induction of vitellogenin. Environmental Toxicology and Chemistry 18: 337-347
Versteeg DJ, Stanton DT, Pence MA, Cowan C (1997) Effects of surfactants on the rotifer, Brachionus calyciflorus, in a chronic toxicity test and in the development of QSARs. Environmental Toxicology and Chemistry 16: 1051-1058
Waller CL, Oprea TI, Chae K, Park HK, Korach KS, Laws SC, Wiese TE, Kelce WR, Gray LE Jr (1996) Ligand-based identification of environmental estrogens. Chemical Research in Toxicology 9: 1240-1248
Yadetie F, Arukwe A, Goksoyr A, Male R (1999) Induction of hepatic estrogen receptor in juvenile Atlantic salmon in vivo by the environmental estrogen, 4- . Science of the Total Environment 233: 201-210
Zacharewski T (1997) In Vitro Bioassays for Assessing Estrogenic Substances. Environmental Science and Technology 31: 613-623


2.14 Bioaccumulation (Environmental)

2.14.1 Established toxicity tests (e.g. OECD Guidelines etc)

REACH requirements:

Testing for environmental bioaccumulation is required in Annex VII of the REACH proposals. This testing is carried out on plants, terrestrial organisms and aquatic species. For aquatic species, REACH recommends the use of a single species, preferably fish. Bioaccumulation testing is not necessary if the test substance has a low bioaccumulation potential (log Kow < 3) or is unlikely to cross biological membranes (molecular weight > 800, molecular diameter > 15Å) or if exposure of the aquatic compartment is unlikely.
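The waiver criteria in the paragraph above can be written as a simple screening check. This is a hedged sketch rather than a regulatory tool: the function and parameter names are hypothetical, and it assumes that the molecular weight and molecular diameter thresholds each count independently as an indication that the substance is unlikely to cross biological membranes, which the text does not state explicitly.

```python
def waiver_reasons(log_kow=None, molecular_weight=None,
                   molecular_diameter_angstrom=None,
                   aquatic_exposure_unlikely=False):
    """Return the list of reasons why bioaccumulation testing may be waived.

    An empty list means none of the stated criteria is met. Properties
    passed as None are treated as unknown and are not evaluated.
    """
    reasons = []
    if log_kow is not None and log_kow < 3:
        reasons.append("low bioaccumulation potential (log Kow < 3)")
    if molecular_weight is not None and molecular_weight > 800:
        reasons.append("unlikely to cross membranes (molecular weight > 800)")
    if (molecular_diameter_angstrom is not None
            and molecular_diameter_angstrom > 15):
        reasons.append("unlikely to cross membranes (molecular diameter > 15 Å)")
    if aquatic_exposure_unlikely:
        reasons.append("exposure of the aquatic compartment is unlikely")
    return reasons
```

For example, `waiver_reasons(log_kow=2.1)` returns a single reason, while `waiver_reasons(log_kow=5.0, molecular_weight=350)` returns an empty list, indicating that testing would not be waived on these grounds.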

Method outline:

Flow-through fish test (REACH C13, OECD TG 305)

This test is used to determine bioconcentration factor (BCF) values. The test consists of an exposure (uptake) phase and a post-exposure (depuration) phase. During the uptake phase fish are exposed to an aqueous solution of the test substance for 28 days or until equilibrium is reached (up to 60 days). The fish are then transferred to water free of the test substance for the depuration phase. The concentration in/on the fish is monitored during both phases and the BCF is calculated as the ratio of the concentration in the fish to that in the water at apparent steady-state. A kinetic BCF, the ratio of the rate constants of uptake and depuration (assuming first order kinetics), can also be calculated.
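The two BCF calculations described above can be illustrated with a short Python sketch. The function names are hypothetical, the units (for example mg/kg in fish and mg/L in water) are assumptions, and the depuration rate constant is estimated here by an ordinary least-squares fit of ln(concentration) against time, which is one simple way of applying the first order assumption and not necessarily the procedure prescribed by the test guideline.

```python
import math


def steady_state_bcf(conc_fish, conc_water):
    """BCF as the ratio of fish to water concentration at steady state."""
    if conc_water <= 0:
        raise ValueError("water concentration must be positive")
    return conc_fish / conc_water


def depuration_rate_constant(times, conc_fish):
    """Estimate the first order depuration rate constant k2.

    Under first order kinetics ln(C) = ln(C0) - k2 * t, so k2 is the
    negated least-squares slope of ln(C) against time.
    """
    if len(times) != len(conc_fish) or len(times) < 2:
        raise ValueError("need at least two matched time/concentration points")
    logs = [math.log(c) for c in conc_fish]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx


def kinetic_bcf(k_uptake, k_depuration):
    """Kinetic BCF as the ratio of uptake to depuration rate constants."""
    if k_depuration <= 0:
        raise ValueError("depuration rate constant must be positive")
    return k_uptake / k_depuration
```

When true steady state is reached and first order kinetics hold, the two estimates should agree; the kinetic form is useful when the uptake phase ends before equilibrium.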

Number of fish used – min 36 per dose/control group, approximately 108 per study.

2.14.2 Issues relating to the feasibility of applying the 3Rs to the endpoint.

The mechanism of environmental bioaccumulation is assumed to be one of passive accumulation. As such this mechanism is well established and should be relatively straightforward to model. There are many data available for bioaccumulation, although the quality of these data is highly variable and often not known.

2.14.3 In vitro methods

Most bioaccumulation assessments rely on computer model predictions which, in turn, rely on the octanol–water partition coefficient (Kow). A more recent model relies on the use of biopartitioning micellar chromatography to evaluate bioaccumulation (Bermúdez-Saldaña et al, 2005). While these models are very useful for an initial screening of existing chemicals, they are less reliable for less well characterised chemical classes or chemicals that are metabolised to less lipophilic products. Indeed, bioaccumulation integrates absorption, distribution, metabolism, and excretion, and it is generally agreed that a better analysis of these processes, especially of metabolism, will improve assessment of bioavailability and bioaccumulation. Metabolically competent fish cell lines (Lee et al, 1993) are being developed for use as alternatives to in vivo metabolic studies. Methods for the primary culture of gill epithelial cells (Fletcher et al, 2000; Kelly et al, 2000) are also being developed, as fish gills are generally the first point of contact for waterborne toxicants and the major route of uptake into the body.

2.14.4 In silico approaches

Computational methods to predict bioaccumulation are well reviewed by Dearden (2004).

QSARs

Most QSARs for bioaccumulation are based on the use of partition coefficients. Whilst a linear relationship would be expected, as noted by Dearden (2004), non-linear models based on partitioning are also common, e.g. Dimitrov et al (2002b). Such models are generally simple to use and implement.
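The difference between a linear and a non-linear partition-based model can be sketched as follows. The functional forms are generic (a straight line and a parabola in log Kow) and are not the published equations of Dearden (2004) or Dimitrov et al (2002b); all coefficients are parameters to be supplied by the user, and any values passed in are purely illustrative.

```python
def log_bcf_linear(log_kow, slope, intercept):
    """Linear model: log BCF = slope * log Kow + intercept."""
    return slope * log_kow + intercept


def log_bcf_parabolic(log_kow, a, b, c):
    """Generic non-linear model: log BCF = a * (log Kow)^2 + b * log Kow + c.

    With a < 0 the predicted log BCF rises with log Kow, reaches a
    maximum and then declines for very hydrophobic chemicals.
    """
    return a * log_kow ** 2 + b * log_kow + c


def parabolic_optimum(a, b):
    """log Kow at which the parabolic model predicts its maximum (a < 0)."""
    if a >= 0:
        raise ValueError("a must be negative for the model to have a maximum")
    return -b / (2 * a)
```

A linear model predicts ever-increasing bioconcentration with increasing log Kow, whereas the parabolic form allows the prediction to fall again at high log Kow, which is the behaviour that motivates non-linear models.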

Expert Systems

ECOSAR and the PBT Profiler provide free models for the prediction of bioaccumulation. Commercial models are provided by MultiCASE and ADMEWorks, but they are unlikely to provide a significant improvement over the ECOSAR models.

2.14.5 Integrated testing strategies for specific endpoints

No testing strategies were found for this endpoint.

2.14.6 Reduction and Refinement opportunities

Bioaccumulation studies should only be undertaken where there is evidence to suggest that bioaccumulation occurs (ECETOC, 2004). The use of radioactive tracers or other methods which allow tissue-specific bioaccumulation to be monitored is encouraged. Any information from acute toxicity or sighting studies should be used to guide dose selection.

2.14.7 Bibliography

Arnot JA, Gobas FAPC (2003) A generic QSAR for assessing the bioaccumulation potential of organic chemicals in aquatic food webs. QSAR and Combinatorial Science 22: 337-345
Baker JR, Mihelcic JR, Sabljic A (2001) Reliable QSAR for estimating K-oc for persistent organic pollutants: correlation with molecular connectivity indices. Chemosphere 45: 213-221
Bermúdez-Saldaña JM, Escuder-Gilabert L, Medina-Hernandez MJ, Villanueva-Camanas RM, Sagrado S (2005) Modelling bioconcentration of pesticides in fish using biopartitioning micellar chromatography. Journal of Chromatography A 1063: 153-160
Dearden JC (2004) QSAR modeling of bioaccumulation. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 333-355
Dimitrov SD, Dimitrova NC, Walker JD, Veith GD, Mekenyan OG (2002a) Predicting bioconcentration factors of highly hydrophobic chemicals. Effects of molecular size. Pure and Applied Chemistry 74: 1823-1830
Dimitrov SD, Dimitrova NC, Walker JD, Veith GD, Mekenyan OG (2003) Bioconcentration potential predictions based on molecular attributes - an early warning approach for chemicals found in humans, birds, fish and wildlife. QSAR and Combinatorial Science 22: 58-68
Dimitrov SD, Mekenyan OG, Walker JD (2002b) Non-linear modeling of bioconcentration using partition coefficients for narcotic chemicals. SAR and QSAR in Environmental Research 13: 177-184
ECETOC (2004) Workshop on Alternative Testing Approaches in Environmental Risk Assessment. Workshop report 5. European Centre for Ecotoxicology and Toxicology of Chemicals, Brussels
Fletcher M, Kelly S, Pärt P, O’Donnell MJ, Wood CM (2000) Transport properties of cultured branchial epithelia from freshwater rainbow trout: a novel preparation with mitochondria-rich cells. Journal of Experimental Biology 203: 1523-1537
Kelly SP, Fletcher M, Pärt P, Wood CM (2000) Procedures for the preparation and culture of “reconstructed” rainbow trout branchial epithelia. Methods in Cell Science 22: 153-163
Knekta E, Andersson PL, Johansson M, Tysklind M (2004) An overview of OSPAR priority compounds and selection of a representative training set. Chemosphere 57: 1495-1503
Lee LEJ, Clemons JH, Bechtel DG, Caldwell SJ, Han K-B, Pasitschniak-Arts M, Mosser DD, Bols NC (1993) Development and characterization of a rainbow trout liver cell line expressing cytochrome P450-dependent mono-oxygenase activity. Cell Biology and Toxicology 9: 279-294
MacDonald D, Breton R, Sutcliffe R, Walker J (2002) Uses and limitations of quantitative structure-activity relationships (QSARs) to categorize substances on the Canadian domestic substance list as persistent and/or bioaccumulative, and inherently toxic to non-human organisms. SAR and QSAR in Environmental Research 13: 43-55
Mackay D, Webster E (2003) A perspective on environmental models and QSARs. SAR and QSAR in Environmental Research 14: 7-16
Sverdrup LE, Nielsen T, Krogh PH (2002) Soil ecotoxicity of polycyclic aromatic hydrocarbons in relation to soil sorption, lipophilicity, and water solubility. Environmental Science and Technology 36: 2429-2435
Tao S, Hu HY, Xu FL, Dawson R, Li BG, Cao J (2001) QSAR modeling of bioconcentration factors in fish based on fragment constants and structural correction factors. Journal of Environmental Science and Health Part B - Pesticides Food Contaminants and Agricultural Wastes 36: 631-649

3. Additional Information on In Silico Methodologies With Regard to Their Usage in REACH

3.1 International Efforts Associated with QSAR and In Silico Modelling

There is no professional society (either national or international) overseeing the development and use of in silico techniques. The nearest available institution is the “International QSAR and Modelling Society” although it must be recognised that this is a relatively informal body, and has no professional status. In addition, the remit of the International QSAR and Modelling Society is focussed more towards the issue of drug design, than toxicity prediction. Other worldwide organisations, such as the Society of Environmental Toxicology and Chemistry (SETAC), have some interests in QSAR but no formal jurisdiction in areas such as evaluation of models.

There are a number of international bodies that have taken some initiatives with regard to the development of QSARs, and in particular their evaluation, possible validation and provision of guidance. The European Commission is represented by the European Chemicals Bureau’s (Ispra, Italy) Action on QSAR (contact Dr Andrew Worth [[email protected]]). This action integrates with the OECD activities on QSAR (which are chaired by Dr Gil Veith [[email protected]]). More details on these activities and their possible relevance to alternatives to animal testing are provided in the following sections.

In Europe, industry (i.e. chemical and personal products) is represented by the European Chemical Industry Council (CEFIC) in Brussels. CEFIC has been able to lobby policy makers. It is also able to fund research through its “Long-range Research Initiative” (LRI), and several existing and historical projects have related to in silico methods. Some of the REACH Implementation Projects are being developed through CEFIC.

There are numerous other groups that may be relevant to the development of in silico methods, including the European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC) in Brussels.

3.1.1 Further information

CEFIC LRI: http://www.cefic-lri.org
European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC): http://www.ecetoc.org/Content/Default.asp?
European Chemicals Bureau activity on QSAR: http://ecb.jrc.it/QSAR/
International QSAR and Modelling Society: http://www.qsar.org/
International QSAR Foundation to Reduce Animal Testing. Contact Dr Gil Veith ([email protected])
OECD Activities on QSAR: http://www.oecd.org/document/23/0,2340,en_2649_34365_33957015_1_1_1_1,00.html
OECD’s Database on Chemical Risk Assessment Models: http://webdomino1.oecd.org/comnet/env/models.nsf
Society of Environmental Toxicology and Chemistry: http://www.setac.org/

3.2 Regulatory Use of QSAR

There is relatively widespread use of (Q)SARs and other in silico methods to predict toxicity and fate by regulatory agencies worldwide. Some of the experiences of these agencies may be useful and illustrative for the implementation of REACH. In addition, some tools may be available for use. The areas in which (Q)SAR has been used by regulatory agencies are well reviewed by Cronin et al (2003a, b) and Cronin (2004). Specific issues of (Q)SARs for REACH implementation are described by Worth et al (2004).

3.2.1 A Summary of the Use of In Silico Prediction Methods by International Regulatory Agencies

Details of initiatives and uses of in silico methods by international regulatory agencies are summarised below. This is not a complete listing, but is intended to update the reviews of Cronin et al (2003a, b) and Cronin (2004).

3.2.1.1 Canada

Health Canada

Health Canada has applied a simple, discriminating tool to address the hazard for all 23,000 substances in the Domestic Substances List (DSL). The intention is to prioritise compounds on the DSL into one of the following categories: “greatest potential for exposure” (GPE), “intermediate potential for exposure” (IPE) and “lowest potential for exposure” (LPE). The procedure draws on information submitted in compilation of the Domestic Substances List.

The Complex Hazard (ComHaz) Tool has been developed. This is a hierarchical approach for multiple endpoints and data sources. It deals with qualitative and quantitative endpoints and is considered to be conservative. It also includes a weight of evidence approach for qualitative endpoints. Sources of information are considered hierarchically. It performs comprehensive literature searching of both electronic and hardcopy resources. It also takes account of, or reviews, toxicological or epidemiological studies.

A large variety of QSAR models are applied to chemical structures of concern, modules from TOPKAT and CASETOX in particular. Other models used include DEREK for Windows and surrogate/analogue approaches (Leadscope, visual grouping).

Further information can be obtained from the Health Canada Existing Substances Division Website http://www.hc-sc.gc.ca/exsd-dse

Environment Canada

The Artificial Intelligence Expert System Project has been initiated by Environment Canada’s National Water Research Institute (NWRI). The system is based on Artificial Neural Networks and can be used to predict a variety of endpoints. It is trained with structural information and the endpoint data to be modelled, i.e. physical chemical or toxicity endpoints. To help interpret the predictions and support its use in regulatory applications, the output includes the Tanimoto similarity with members of the training set and the identity of the ten closest analogues with experimental data.
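The similarity output described above can be illustrated with a minimal Python sketch of the Tanimoto coefficient and a nearest-analogue search. It assumes that each chemical is represented by a set of structural feature identifiers (for example fingerprint bit positions); the function names and the data layout are hypothetical and do not describe the Environment Canada system itself.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient of two feature sets: |A and B| / |A or B|."""
    a, b = set(features_a), set(features_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / union


def closest_analogues(query_features, training_set, k=10):
    """Return the k training chemicals most similar to the query.

    training_set maps a chemical name to its feature set. The result is
    a list of (name, similarity) pairs, most similar first; ties are
    broken alphabetically by name.
    """
    scored = [(name, tanimoto(query_features, features))
              for name, features in training_set.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]
```

Reporting the similarity values alongside a prediction lets the assessor judge whether the query chemical lies close to chemicals with experimental data or outside the region covered by the training set.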

The current model is for acute fathead minnow toxicity. It is intended to apply the program to Daphnia and algal endpoints. The supplementation of these modules with “in house” New Substances data will be explored, at least for internal use. The program is also being applied to a physical chemical endpoint (water solubility).

Environment Canada have also developed the so-called “Green Sheets”. These are a mechanism to assess the quality of studies and are designed to handle data submitted by industry. They also have the advantage of assisting in the identification of robust data for QSAR development and analogue use.

Further information can be obtained from the Environment Canada’s Website: http://www.ec.gc.ca/substances/nsb/eng/index_e.htm

3.2.1.2 Denmark

Danish Environmental Protection Agency – Database and Decision Support System

The Danish EPA has developed a QSAR database and support system. The database contains predictions on 166,000 chemicals (including about 47,000 discrete EINECS chemicals). The database includes predictions from around 60 models of physico-chemical properties, ecotoxicological and environmental fate properties and human health endpoints.

The models applied to make predictions stored in the database are from:

• TOPKAT
• MultiCASE
• EPISuite
• Danish EPA models based on test data or equations from the literature and data from laboratory tests

The database is currently stored in a Chem-X database and is in the process of being transferred into the OASIS Database Manager System. This will include the CAS number and name, as well as a sub-structure search function.

Other tools that are being applied include:

• LeadScope
• META
• CATABOL
• Pharma's Tox-Boxes

The following physico-chemical endpoints are recorded: molecular mass, melting and boiling point, vapour pressure, log P, log Koc, Fugacity I & III, hydrolysis, atmospheric oxidation (OH and ozone), Henry's law constant, water solubility.

The ecotoxicity and environmental fate endpoints predicted include: bioconcentration factor, biodegradation (BIOWIN1-6 and M-CASE), steady-state fish (T95), fish LC50s (bilinear non-polar and polar narcosis, ECOSAR, DK-EPA), Daphnia LC50 (ECOSAR, DK-EPA), algae EC50 (ECOSAR, DK-EPA), Tetrahymena (ciliate) EC50 (DK-EPA).

The human health endpoints predicted include: oral uptake and dermal permeability, rat LD50 (TOPKAT), rat 28 day NOAEL (TOPKAT), skin irritation (severe) (M-CASE), skin sensitisation (3 models, TOPKAT and M-CASE), reproductive toxicity (TOPKAT), human developmental toxicity (M-CASE), oestrogen binding (3 models, DK-EPA), anti-androgen reporter gene (DK-EPA), Ames mutagenicity (3 models), DNA reactivity (Ashby), chromosomal aberrations (CHO and CHL), mouse lymphoma, mouse micronucleus, mouse SCE bone marrow (in vivo), rodent dominant lethal test, CHO forward mutation assay, Drosophila SLRL, mouse Comet assay, 12 cancer models (mouse and rat, NCI, FDA, FDA proprietary), mouse and rat TD50.

3.2.1.3 Japan

National Institute of Technology and Evaluation (NITE)

The Japanese National Institute of Technology and Evaluation (NITE) has developed a program for the utilisation of QSAR to examine existing chemicals. QSAR predictions provide important pieces of information for the evaluation of chemicals in the making of regulatory decisions. The targets under the current Chemical Substances Control Law (CSCL) are biodegradation, log P and bioaccumulation. The Institute has organised the “NITE QSAR Committee” (NQC), whose members are experts in QSAR and chemical hazard. Through collaboration with METI and NQC members, there is an intention to establish the following methodology:

• External validation of models
• Screening of existing chemicals not yet measured
• Protocol for the evaluation of individual chemicals by QSAR for biodegradation and bioaccumulation
• Provision of the QSAR Reports on individual chemicals to METI and the Council for their decision making
• NITE to provide free public access to chemical data

Further information is available from http://www.safe.nite.go.jp/japan/db.html

3.2.1.4 United States Environmental Protection Agency

US Environmental Protection Agency

The US Environmental Protection Agency has more than 25 years' experience in the development of (Q)SARs. This has led to the development and refinement of new U.S. assessment methods:

Chemical Categories: Currently 45 categories which are routinely updated. This provides information on chemical analogues, potential concerns, and testing recommendations.

EPA/OPPT Predictive Models: These include EPISuite, ECOSAR, OncoLogic, ChemSTEER, E-FAST

Computational Toxicology/Genomics: More information is available on this programme from http://www.epa.gov/comptox/comptox_framework.html

PBT Profiler (2002): Developed jointly by EPA, The American Chemistry Council, The Chlorine Chemistry Council, The Synthetic Organic Chemical Manufacturers Association, and Environmental Defense.

Analog Identification Method (AIM) (2004): A web-based tool that identifies chemical analogues and points the user to experimental data. It addresses the need for robust, searchable databases. AIM currently searches: IUCLID, NTP, HSDB, RTECS, AEGLS, DSSTox, ATSDR, HPV, IRIS, TSCATS.

Sustainable Futures (SF): The SF objective is to inform decision-making within the chemical industry by educating them on the use of (Q)SAR for determining potential hazards and risks of untested materials. Under SF, OPPT gives companies the same U.S. EPA assessment tools used to evaluate new chemical submissions (PMNs), and provides hands-on training to industry.

Other U.S. Applications of (Q)SAR: U.S. (Q)SAR models were used as a quality control check on SIDS data starting in the early 1990s. (Q)SAR predictions are currently accepted under the U.S. HPV Program to fill data gaps for certain endpoints. The 2003 Framework for a Computational Toxicology Program in ORD addresses use of computational toxicology to improve the Agency’s prioritisation of data requirements and risk assessments. U.S. Office of Pesticide Programs uses ECOSAR to establish food tolerance exemptions under the Food Quality Protection Act and utilises (Q)SAR to determine biodegradation potential. U.S. Office of Water uses OncoLogic to predict cancer potential of chemicals for prioritisation, and uses AQUATOX to predict the environmental fate of chemicals.

US Food and Drug Administration

The US Food and Drug Administration (FDA) Center for Drug Evaluation and Research (CDER) operates a number of programs and activities in the area of in silico toxicology. Some of these are summarised below.

Database Projects: Creating an FDA knowledge base and institutional memory. This is a unique repository of the results of clinical and non-clinical studies and post-marketing clinical adverse events. The Informatics and Computational Safety Analysis Staff (ICSAS) Toxicology Database Project was initiated to develop an electronic database for pharmaceutical toxicology studies stored in FDA archives. This database also provides information needed to develop computational toxicology (ComTox) software program database modules and serves as a resource for Center research and regulatory decisions.

Chemical Structure Similarity Searching: The ISIS/Host software program has been provided to CDER to evaluate the capability to search toxicological and chemical structure information in the database. A new project has also begun under which CACTVS chemoinformatics software will be used for this purpose. When fully implemented, CDER reviewers will be able to quickly receive a list of drugs in FDA files that are structurally related to a compound in an Investigational New Drug (IND) application, including links to background resource material.

The Computational Toxicology Program and ComTox Consulting Service: Computational toxicology (ComTox) incorporates information from toxicology databases and applies advances in computer technology and quantitative structure activity relationship (QSAR) methods to screen compounds for potential toxicity. To overcome a lack of effective commercial programs, ICSAS and MultiCASE, Inc. established a Cooperative Research and Development Agreement (CRADA). Together, they developed human expert rules to enhance the performance of the MCASE/MC4PC quantitative structure activity relationship software program.

• MCASE/MC4PC software reduces chemicals to 2 - 10 atom fragments and sorts the fragments in relation to biological activity or toxicity (structural alerts).
• MCASE/MC4PC lists the structural alerts linked to a query compound and lists the structures, names, and activity of compounds in the database that are related to the substance.

More recently, ICSAS has begun to work with other QSAR products, including MDL QSAR, Bioreason ClassPharmer, LHASA DEREK for Windows, Prous Science BioEpisteme, and Leadscope Enterprise, which have completely different logical approaches to toxicological predictions.
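The idea of reducing a chemical to 2 - 10 atom fragments can be illustrated with a simple enumeration of linear atom paths in a molecular graph. This sketch is an assumption-laden illustration and not the MCASE/MC4PC algorithm: the function name and the input format (a list of element symbols plus a list of bonded index pairs, hydrogens omitted) are hypothetical, bond orders are ignored, and only unbranched paths are generated.

```python
def linear_fragments(atoms, bonds, min_size=2, max_size=10):
    """Enumerate linear fragments (simple paths) of min_size..max_size atoms.

    atoms is a list of element symbols; bonds is a list of (i, j) index
    pairs. Returns a sorted list of fragment labels such as "C-C-O",
    one per distinct path of atom indices.
    """
    adjacency = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)

    found = set()

    def extend(path):
        if len(path) >= min_size:
            # A path and its reverse are the same fragment; keep one form.
            found.add(min(tuple(path), tuple(reversed(path))))
        if len(path) == max_size:
            return
        for neighbour in adjacency[path[-1]]:
            if neighbour not in path:
                extend(path + [neighbour])

    for start in adjacency:
        extend([start])

    return sorted("-".join(atoms[i] for i in path) for path in found)
```

For the heavy-atom skeleton of ethanol, `linear_fragments(["C", "C", "O"], [(0, 1), (1, 2)])` returns `["C-C", "C-C-O", "C-O"]`. In a fragment-based system, each such fragment would then be tested statistically for association with activity across a training set, a step that is not attempted here.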

ComTox Regulatory Application of ICSAS MCASE/MC4PC-ES by the Center for Food Safety and Applied Nutrition (CFSAN): The CFSAN Office of Food Additive Safety developed risk management methods to prioritise the use of limited review resources on premarket applications that present the greatest potential risk to the public health. With partial funding by the Office of the FDA Commissioner, ICSAS is assisting CFSAN in developing and applying appropriate MCASE/MC4PC toxicology and clinical effects modules to meet their needs and in training personnel in the use of MCASE/MC4PC. Further investigations will also be conducted to evaluate ComTox programs to estimate potential FCS reproductive toxicity.

Application of Computational Toxicology to Assess Clinical Adverse Drug Reactions: ICSAS has developed a chemical structure (".mol"-file) based indexing system for the CDER drug dictionary to facilitate the analysis of information in the clinical Spontaneous Reporting System (SRS) database. With the help of CDER's Office of Post Marketing Drug Risk Assessment, a feasibility study is underway to evaluate the application of MCASE/MC4PC to predict post-marketing adverse events using the SRS database and the more recently developed Adverse Event Reporting System (AERS) database.

Further information on the US FDA CDER activities is available from: http://www.fda.gov/cder/Offices/OPS_IO/ICSAS.htm#ComToxProgram

3.2.2 Further information

Breton R, Boxall A (2003) Pharmaceuticals and personal care products in the environment: Regulatory drivers and research needs. QSAR and Combinatorial Science 22: 399-409.
Brock WJ, Rodricks JV, Rulis A, Dellarco VL, Gray GM, Lane RW (2003) Food safety: Risk assessment methodology and decision-making criteria. International Journal of Toxicology 22: 435-451.
Carlsen L, Walker JD (2003) QSARs for prioritizing PBT substances to promote pollution prevention. QSAR and Combinatorial Science 22: 49-57.
Contrera JF, Matthews EJ, Benz RD (2003) Predicting the carcinogenic potential of pharmaceuticals in rodents using molecular structural similarity and E-state indices. Regulatory Toxicology and Pharmacology 38: 243-259.
Contrera JF, Matthews EJ, Kruhlak NL, Benz RD (2004) Estimating the safe starting dose in Phase I clinical trials and No Observed Effect Level based on QSAR modeling of the human maximum Recommended Daily Dose. Regulatory Toxicology and Pharmacology 40: 185-206.
Cronin MTD (2004) The use by governmental regulatory agencies of quantitative structure-activity relationships and expert systems to predict toxicity. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 413-427.
Cronin MTD, Jaworska JS, Walker JD, Comber MHI, Watts CD, Worth AP (2003a) Use of QSARs in international decision-making frameworks to predict health effects of chemical substances. Environmental Health Perspectives 111: 1391-1401.
Cronin MTD, Walker JD, Jaworska JS, Comber MHI, Watts CD, Worth AP (2003b) Use of QSARs in international decision-making frameworks to predict ecologic effects and environmental fate of chemical substances. Environmental Health Perspectives 111: 1376-1390.
Danish EPA. Environmental Project No 636 (2001) Report on the advisory list for self-classification of dangerous substances. (http://www.mst.dk/homepage/)
Licht O, Weyers A, Nage R (2004) Ecotoxicological characterisation and classification of existing chemicals - Examples from the ICCA HPV initiative and comparison with other existing chemicals. Environmental Science and Pollution Research 11: 291-296.
Mackay D, Hubbarde J, Webster E (2003) The role of QSARs and fate models in chemical hazard and risk assessment. QSAR and Combinatorial Science 22: 106-112.
Matthews EJ, Benz RD, Contrera JF (2000) Use of toxicological information in drug design. Journal of Molecular Graphics and Modeling 18: 605-614.
Matthews EJ, Contrera JF (1998) A new highly specific method for predicting the carcinogenic potential of pharmaceuticals in rodents using enhanced MCASE QSAR-ES software. Regulatory Toxicology and Pharmacology 28: 242-264.
Matthews EJ, Kruhlak NL, Benz RD, Contrera JF (2004) Assessment of the health effects of chemicals in humans: I. QSAR estimation of the Maximum Recommended Therapeutic Dose (MRTD) and No Effect Level (NOEL) of organic chemicals based on clinical trial data. Current Drug Discovery Technologies 1: 61-76.
Matthews EJ, Machuga EJ (1995) Threshold of estimated toxicity for regulation of indirect food additives. Toxicology Letters 79: 123-129.
Nabholz JV (1991) Environmental hazard and risk assessment under the United States Toxic Substances Control Act. Science of the Total Environment 109/110: 649-665.
Polloth C, Mangelsdorf I (1997) Commentary on the application of (Q)SAR to the toxicological evaluation of existing chemicals. Chemosphere 35: 2525-2542.
Richard AM (1998) Commercial toxicology prediction systems: a regulatory perspective. Toxicology Letters 103: 611-616.
Russom CL, Anderson EB, Greenwood BE, Pilli A (1991) ASTER - integration of the AQUIRE database and the QSAR system for use in ecological risk assessments. Science of the Total Environment 109/110: 667-670.
Sosted H, Basketter DA, Estrada E, Johansen JD, Patlewicz GY (2004) Ranking of hair dye substances according to predicted sensitization potency: quantitative structure-activity relationships. Contact Dermatitis 51: 241-254.
Tong WD, Fang H, Hong HX, Xie Q, Perkins R, Anson J, Sheehan DM (2003) Regulatory application of SAR/QSAR for priority setting of endocrine disruptors: A perspective. Pure and Applied Chemistry 75: 2375-2388.
Vedani A, Dobler M, Lill MA (2003) Internet laboratory for predicting harmful effects triggered by drugs and chemicals - A progress report. ALTEX-Alternativen zu Tierexperimenten 20: 85-91.
Worth AP, van Leeuwen CJ, Hartung T (2004) The prospects for using (Q)SARs in a changing political environment-high expectations and a key role for the European Commission's Joint Research Centre. SAR and QSAR in Environmental Research 15: 331-343.

3.3 Strategies for the Use of QSARs

Section 3.2 and the associated reviews describe the use of in silico methods by regulatory authorities worldwide to predict toxicological and fate endpoints. With regard to REACH, technical guidance is being developed at the time of preparation of this report. This may come from the REACH Implementation Projects (see Section 4) and also from the OECD Guidance Document, which is due to be complete in late 2005 or early 2006. The OECD Guidance Document will provide “instruction” in how to evaluate predictions from QSARs, the basis for which is given in Section 3.3.1.

3.3.1 Evaluation and Validation of QSARs

Following an industry funded meeting, “Regulatory Acceptance of (Q)SARs for Human Health and Environmental Endpoints”, held in Setubal, Portugal (4-6 March 2002), there was an emphasis on the validation, and ultimately the regulatory acceptance, of in silico models (Jaworska et al, 2003). The so-called “Setubal Principles” were proposed: a set of six criteria against which to validate QSARs for possible regulatory acceptance. It should be noted that there are many practical difficulties in the validation of (Q)SARs, in particular obtaining data for a meaningful external validation, as well as obtaining transparent models for some methodologies (e.g. commercial expert systems, neural networks etc). The Setubal Principles have been discussed by the OECD, which has resulted in the principles defined in Section 3.3.1.1. Further information regarding the process of QSAR validation is available from Worth et al (2004a, b).

3.3.1.1 Agreed OECD Principles for the Evaluation and Validation of QSARs

The following information is taken from the OECD (2004). The agreed OECD Principles for the Validation, for regulatory purposes, of (Q)SAR Models, which are intended to be read in conjunction with the associated explanatory comments, are as follows:

To facilitate the consideration of a (Q)SAR model for regulatory purposes, it should be associated with the following information:

1. a defined endpoint
2. an unambiguous algorithm
3. a defined domain of applicability
4. appropriate measures of goodness-of-fit, robustness and predictivity
5. a mechanistic interpretation, if possible

Notes

1. The intent of Principle 1 (defined endpoint) is to ensure clarity in the endpoint being predicted by a given model, since a given endpoint could be determined by different experimental protocols and under different experimental conditions. It is therefore important to identify the experimental system that is being modeled by the (Q)SAR. Further guidance is being developed regarding the interpretation of “defined endpoint”. For example, a no-observed-effect level might be considered to be a defined endpoint in the sense that it is a defined information requirement of a given regulatory guideline, but cannot be regarded as a defined endpoint in the scientific sense of referring to a specific effect within a specific tissue/organ under specified conditions.

2. The intent of Principle 2 (unambiguous algorithm) is to ensure transparency in the model algorithm that generates predictions of an endpoint from information on chemical structure and/or physicochemical properties. It is recognized that, in the case of commercially-developed models, this information is not always made publicly available. However, without this information, the performance of a model cannot be independently established, which is likely to represent a barrier for regulatory acceptance. The issue of reproducibility of the predictions is covered by this Principle, and will be explained further in the guidance material.

3. The need to define an applicability domain (Principle 3) expresses the fact that (Q)SARs are reductionist models which are inevitably associated with limitations in terms of the types of chemical structures, physicochemical properties and mechanisms of action for which the models can generate reliable predictions. Further work is recommended to define what types of information are needed to define (Q)SAR applicability domains, and to develop appropriate methods for obtaining this information.

4. The revised Principle 4 (appropriate measures of goodness-of-fit, robustness and predictivity) includes the intent of the original Setubal Principles 5 and 6. The wording of the principle is intended to simplify the overall set of principles, but not to lose the distinction between the internal performance of a model (as represented by goodness-of-fit and robustness) and the predictivity of a model (as determined by external validation). It is recommended that detailed guidance be developed on the approaches that could be used to provide appropriate measures of internal performance and predictivity. Further work is recommended to determine what constitutes external validation of (Q)SAR models.

5. It is recognised that it is not always possible, from a scientific viewpoint, to provide a mechanistic interpretation of a given (Q)SAR (Principle 5), or that there may even be multiple mechanistic interpretations of a given model. The absence of a mechanistic interpretation for a model does not mean that a model is not potentially useful in the regulatory context. The intent of Principle 5 is not to reject models that have no apparent mechanistic basis, but to ensure that some consideration is given to the possibility of a mechanistic association between the descriptors used in a model and the endpoint being predicted, and to ensure that this association is documented.
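The distinction drawn in Note 4 between internal performance and predictivity can be illustrated with a minimal sketch. It assumes a one-descriptor linear QSAR, takes R² as the measure of goodness-of-fit, leave-one-out cross-validated Q² as the measure of robustness, and an external R² (referenced to the training-set mean, one of several conventions in use) as the measure of predictivity. The OECD text does not prescribe these particular statistics, and all data values are hypothetical.

```python
# Minimal sketch of goodness-of-fit, robustness and predictivity statistics.
# ASSUMPTION: one-descriptor linear model; all data values are hypothetical.
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def predict(model, x):
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


def r_squared(observed, predicted, reference_mean):
    """1 - SS_res / SS_tot, with SS_tot taken about reference_mean."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def q2_leave_one_out(x, y):
    """Each compound is predicted by a model fitted without that compound."""
    preds = []
    for i in range(len(x)):
        model = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, [x[i]])[0])
    return r_squared(y, preds, mean(y))


# Hypothetical training set (descriptor, measured toxicity) and external test set.
train_x = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
train_y = [0.9, 1.6, 1.9, 2.6, 2.9, 3.6]
test_x = [1.2, 2.8]
test_y = [1.1, 2.9]

model = fit_line(train_x, train_y)
r2_fit = r_squared(train_y, predict(model, train_x), mean(train_y))  # goodness-of-fit
q2_loo = q2_leave_one_out(train_x, train_y)                          # robustness
r2_ext = r_squared(test_y, predict(model, test_x), mean(train_y))    # predictivity

print(f"R2 = {r2_fit:.3f}, Q2(LOO) = {q2_loo:.3f}, R2(external) = {r2_ext:.3f}")
```

The first two statistics use only the training compounds (internal performance), whereas the third uses compounds that played no part in model development, which is the distinction the revised Principle 4 is intended to preserve.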

3.3.1.2 Application of the OECD Principles

The OECD states that the principles should be considered as scientific goals that provide generic base-line guidance for integrating the use of (Q)SAR models into regulatory frameworks. The validation of individual (Q)SAR models using these principles will be done as part of national or regional initiatives, from which other member countries can benefit in considering their regulatory application.

Flexibility will be needed in the interpretation and application of each principle because, ultimately, the proper integration of the use of (Q)SAR models into any type of regulatory/decision-making framework depends upon the needs and constraints of the specific regulatory authority.

It should be emphasised that these principles only identify the types of information that are considered useful for the regulatory application of (Q)SAR models in a regulatory context. The definition of criteria for determining the scientific validity and national regulatory acceptability of (Q)SAR models falls outside the scope of this work item, but could eventually be considered by national authorities.

3.3.2 Further information

Barratt MD (1995) The role of structure-activity-relationships and expert-systems in alternative strategies for the determination of skin sensitization, skin corrosivity and eye irritation. ATLA 23: 111-122.
Barratt MD (1998) Integrating computer prediction systems with in vitro methods towards a better understanding of toxicology. Toxicology Letters 103: 617-621.
Barratt MD, Langowski JJ (2000) Validation and development of the Derek skin sensitisation rulebase by analysis of the BgVV list of contact allergens. In: Progress in the Reduction, Refinement and Replacement of Animal Experimentation. Balls M, van Zeller A-M, Halder ME (Eds). Elsevier, pp. 493-512.
Benigni R (2004) Prediction of human health endpoints: mutagenicity and carcinogenicity. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 173-192.
Benigni R, Giuliani A (2003) Putting the Predictive Toxicology Challenge into perspective: reflections on the results. Bioinformatics 19: 1194-1200.
Chamberlain M (1997) Application of QSAR, expert systems and in vitro methods. In van Zutphen LFM and Balls M (eds) Animal Alternatives, Welfare, and Ethics: Developments in Animal and Veterinary Sciences 27: 723-730.
Contrera JF, Matthews EJ, Benz RD (2003) Predicting the carcinogenic potential of pharmaceuticals in rodents using molecular structural similarity and E-state indices. Regulatory Toxicology and Pharmacology 38: 243-259.
Cronin MTD (2002) The current status and future applicability of quantitative structure-activity relationships (QSARs) in predicting toxicity. Alternatives to Laboratory Animals 30 (Supplement 2): 81-84.
Cronin MTD, Dearden JC, Walker JD, Worth AP (2003) Quantitative structure-activity relationships for human health effects: Commonalities with other endpoints. Environmental Toxicology and Chemistry 22: 1829-1843.
ECETOC (European Centre for Ecotoxicology and Toxicology of Chemicals) (2003) (Q)SARs: Evaluation of the commercially available software for human health and environmental endpoints with respect to chemical management applications. Technical Report No 89. ECETOC, Brussels. Strategy Papers.
Eriksson L, Jaworska J, Worth AP, Cronin MTD, McDowell RM, Gramatica P (2003) Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environmental Health Perspectives 111: 1361-1367.
Jaworska JS, Comber M, Auer C, van Leeuwen CJ (2003) Summary of a Workshop on regulatory acceptance of (Q)SARs for human health and environmental endpoints. Environmental Health Perspectives 111: 1358-1360.
Mekenyan O, Dimitrov S, Schmieder P, Veith G (2003) In silico modelling of hazard endpoints: Current problems and perspectives. SAR and QSAR in Environmental Research 14: 361-371.
Moore DRJ, Breton RL, MacDonald DB (2003) A comparison of model performance for six quantitative structure-activity relationship packages that predict acute toxicity to fish. Environmental Toxicology and Chemistry 22: 1799-1809.
Öberg T (2004) A QSAR for baseline toxicity: Validation, domain of application, and prediction. Chemical Research in Toxicology 17: 1630-1637.
Organisation for Economic Co-Operation and Development (OECD) (2004) Report from the Expert Group on (Quantitative) Structure-Activity Relationships [(Q)SARs] on the principles for the validation of (Q)SARs. OECD Environment Health and Safety Publications, Series on Testing and Assessment, Paris, France, p. 206 (available from http://appli1.oecd.org/olis/2004doc.nsf/linkto/env-jm-mono(2004)24)
Rosenkranz HS (2003) SAR modelling of complex phenomena: Probing methodological limitations. ATLA 31: 393-399.
Tong WD, Xie W, Hong HX, Shi LM, Fang H, Perkins R (2004) Assessment of prediction confidence and domain extrapolation of two structure-activity relationship models for predicting estrogen receptor binding activity. Environmental Health Perspectives 112: 1249-1254.
Veith GD (2004) On the nature, evolution and future of quantitative structure-activity relationships (QSAR) in toxicology. SAR and QSAR in Environmental Research 15: 323-330.
Worth AP, Cronin MTD, van Leeuwen CJ (2004a) A framework for promoting the acceptance and regulatory use of (quantitative) structure-activity relationships. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 429-440.
Worth AP, Hartung T, van Leeuwen CJ (2004b) The role of the European Centre for the Validation of Alternative Methods (ECVAM) in the validation of (Q)SARs. SAR and QSAR in Environmental Research 15: 345-358.

3.4 The Application Toolbox for QSAR

There is a general appreciation, going back to the Setubal meeting (Jaworska et al 2003), that some kind of automated system is required to assist risk assessors in the use of in silico methods, including both predictive technologies and existing data. Originally termed the “Decision Support System”, this has evolved into what is now termed the “Application Toolbox for QSAR” or “QSAR Application Toolbox”. The latter term is currently preferred to ensure that the system is not seen as a decision making technology.

In early 2005 the European Chemicals Bureau commissioned a Scoping Study into the current state of the art of technology for a possible QSAR Application Toolbox. The Scoping Study is being performed by Prof Giuseppina Gini at the Politecnico di Milano. At the time of preparing this report, the final conclusions of the study have not been published. However, some of the material presented below (as well as elsewhere in Section 3) has been taken from an expert meeting held in Milan (28-29 April 2005).

The functionalities and components of the QSAR Application Toolbox are envisioned as follows:

• Input, storage and visualisation of chemical data
• Allow user to input, store, visualise and interconvert 1D, 2D and 3D representations of chemicals
• Storage of common chemical identifiers
• Storage of common descriptors
• User-defined searching of local databases and storage of imported data
• Link to OECD Global Portal and storage of imported data
• Allow user to define regulatory context (one or more chemicals and endpoints for assessment)
• Use of prediction tools
• Allow user to develop simple models and generate statistical characteristics and defined applicability domains
• Apply (Q)SARs or (Q)SAR batteries to provide estimates for single chemicals, mixtures and their components
• Assess whether estimates are within domains of built-in models
• Identify groups of analogous chemicals to enable read-across and chemical category formation
• Identify metabolites, and other chemicals sharing same metabolites
• Output of predictions and their interpretation
• Allow user to define format (and level of detail) for output
• Provide supporting information on models and tools, including information on model validity and guidance on interpretation of outputs

This is clearly an ambitious, but valuable, project. The intention is to bring together and integrate as much as possible of the available technologies in this area. Some specific aspects of the QSAR Application Toolbox are described below.

3.4.1 Projects Relating to the Application Toolbox for QSAR

Some funding has already been applied to develop the QSAR Application Toolbox.

TANGRAM

This was originally a project put forward for EU 6th Framework funding. The project is currently underwritten by: ECETOC, CEFIC LRI, CEFIC, IHCP, EPA ORD, EPA OPPT, OECD. The focus is on solutions rather than “deeper technical understanding”. The attributes of the system being developed are:

• user friendliness in its operation
• incorporation of decision support tools for appropriate (Q)SAR selection
• provision of guidance for non-(Q)SAR experts
• general availability
• accessible through the internet
• a dynamic system, i.e. allowing for the continuous refinement of existing (Q)SARs

It is also possible that, following the ECB’s Scoping Study mentioned above, further specific funding could be made available for the Toolbox. A decision on this is possible in late 2005.

3.4.2 Applicability Domain

In the area of predictive toxicology the applicability domain is taken to express the scope and limitations of a model, i.e. the range of chemical structures for which the model is considered to be applicable (Netzeva et al 2005). The concept of the applicability domain is enshrined within the OECD principles for validation (see Section 3.3.1.1). Whilst this issue has been fundamental to the use of QSARs (and indeed any predictive technique) since their conception, there remain few reliable methods to define and apply an applicability domain in predictive toxicology. The current status of methods to define the applicability domain for use in (Q)SAR has been assessed recently by Netzeva et al (2005).
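One of the simplest ways to express an applicability domain is as the range of each descriptor spanned by the training set. The following minimal sketch assumes such a range-based ("bounding box") approach; it is only one of several possible methods and is not put forward here as a recommended or agreed one. The descriptor names and values are hypothetical.

```python
# Minimal sketch of a range-based applicability domain check.
# ASSUMPTION: the domain is the min-max range of each training-set descriptor;
# descriptor names and values are hypothetical.

def descriptor_ranges(training_set):
    """Return {descriptor: (min, max)} over all training compounds."""
    names = training_set[0].keys()
    return {
        name: (min(c[name] for c in training_set), max(c[name] for c in training_set))
        for name in names
    }


def out_of_range(compound, ranges):
    """List the descriptors for which the compound lies outside the training range."""
    return [
        name for name, (low, high) in ranges.items()
        if not low <= compound[name] <= high
    ]


def in_domain(compound, ranges):
    """A compound is inside the domain if no descriptor is out of range."""
    return not out_of_range(compound, ranges)


# Hypothetical training compounds described by two descriptors.
TRAINING_SET = [
    {"log_p": 1.2, "mol_weight": 120.0},
    {"log_p": 3.4, "mol_weight": 250.0},
    {"log_p": 2.1, "mol_weight": 180.0},
]
RANGES = descriptor_ranges(TRAINING_SET)

query_inside = {"log_p": 2.5, "mol_weight": 200.0}
query_outside = {"log_p": 5.0, "mol_weight": 200.0}

print(in_domain(query_inside, RANGES), out_of_range(query_inside, RANGES))
print(in_domain(query_outside, RANGES), out_of_range(query_outside, RANGES))
```

A check of this type addresses only the descriptor (parameter) space; it says nothing about structural features or mechanisms of action that are absent from the training set, which is part of the reason the definition of the domain remains a matter of debate.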

There is current debate on the best methods to define the applicability domain for a model in predictive toxicology. A definitive solution is unlikely to be available for a number of years. However, there are some initiatives that are beginning to address the issue of applicability domain, as described below.

AMBIT

This project is supervised by Dr Joanna Jaworska (Procter and Gamble, Brussels) in collaboration with the Bulgarian Academy of Sciences (Sofia). A database is the heart of the AMBIT software. In an IT sense the database is a software system allowing users to store, retrieve and search structured information. It stores chemical information (structure, names, CAS, SMILES, many other computer representations of structure, properties, etc.) and allows for the following:

• Applicability Domain (assessment in parameter and structural space)
• Chemical Grouping (by structure and activity)
• CAS-SMILES converter (many other computer formats can be generated)
• Guidance on model choice in case of multiple models (based on decision theory)
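When identifiers such as CAS numbers are stored or converted, a basic integrity check is possible because the final digit of a CAS Registry Number is a check digit. The following minimal sketch applies the standard check-digit rule; it is offered only as an illustration of identifier validation and is not a description of how AMBIT itself handles CAS numbers. The function name is hypothetical.

```python
# Minimal sketch of CAS Registry Number validation using the check digit.
# ASSUMPTION: illustrative only; not taken from the AMBIT software.
import re

# Two to seven digits, two digits, then a single check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas_number: str) -> bool:
    """Return True if the string is well formed and its check digit is correct."""
    match = CAS_PATTERN.match(cas_number.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # The rightmost digit is weighted by 1, the next by 2, and so on.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit


for candidate in ("7732-18-5", "71-43-2", "7732-18-4", "not-a-cas"):
    print(candidate, is_valid_cas(candidate))
```

A check of this kind detects most transcription errors but cannot confirm that a number has actually been assigned to a substance.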

At the time of preparation of this report, the AMBIT Database holds 463,426 compounds. The structures are stored in a compressed CML format with SMILES and fingerprints generated. A web version is ready. The structures stored include:

• NCI dataset (249071 structures) http://cactus.nci.nih.gov/ncidb2/download.html
• Ligand.info (251369 structures) http://ligand.info
• SRC KOWWIN Training data set (2464 structures)
• SRC KOWWIN Validation data set (10839 structures)

The complete documentation of the AMBIT Database is available at http://ambit.acad.bg/docs/

3.4.3 Data Presentation and Quality

There are a number of reasons for collecting data and assessing its quality:

• Toxicological information and data underpin all predictive approaches
• Should suitable existing data exist for a compound then testing should not be necessary
• Existing data may also be useful for read across and analogue approaches to toxicity prediction.

A number of databases that collect together toxicological data and information are available; these have been reviewed recently by Cronin (2005). Modern database formats now enable more information to be stored regarding a toxicity test than simply the potency or qualitative outcome; this information includes details of the test protocol etc. Some new databases, e.g. ToxScope from LeadScope, use the XML format for data presentation, which allows databases to be searched e.g. via web-sites. Underpinning many of these efforts is the production of the OECD Harmonised Template for toxicological information. Some of the current initiatives relating to the REACH proposal are described below.

REACH-IT System and the OECD Harmonised Templates

The REACH Implementation Project 2 (see Section 4.2.2) includes the REACH-IT system. This integrates and connects internationally accepted and harmonised IT tools and data formats including the following:

• IUCLID 5
• OECD Harmonised Templates
• Connection to the OECD Global Portal

IUCLID 5 is an upgrade to IUCLID 4. It has a totally new design, with building of the new system starting in 2005, beta testing to end by 2005 and deployment in mid-2006. For each endpoint study description, all kinds of endpoint study records are encoded using the same data format, as defined by the data format project (OECD Harmonised Templates).

The Harmonised Templates will:

• Be used in IUCLID 5 for endpoint data storage and exchange
• Be used in IUCLID 5 / REACH-IT for endpoint data storage and exchange
• Most probably be the base for Global Portal endpoint data storage and exchange

Intergovernmental Forum on Chemical Safety (IFCS)

The Intergovernmental Forum on Chemical Safety is an alliance of all stakeholders concerned with the sound management of chemicals, operating on the basis of full and open participation of all partners. It provides a global platform where governments, international, regional and national organisations, industry groups, public interest associations, labour organisations, scientific associations and representatives of civil society can meet to build partnerships, provide advice and guidance, make recommendations and monitor progress.

More information on the IFCS is available from: http://www.who.int/ifcs/

The ECOTOX Database

ECOTOX is an environmental database developed by the US EPA. It is a good example of the possibilities of bringing data together. ECOTOX is a series of linked databases containing comprehensive data on the toxicity of chemicals to aquatic and terrestrial organisms, including plants. The ECOTOX database includes test results published in the open literature and from other government data sources.

More information on, and access to, the ECOTOX Database is available from: http://www.epa.gov/ecotox/

3.4.4 Further information

Benfenati E (2004) Modelling aquatic toxicity with advanced computational techniques: Procedures to standardize data and compare models. Proceedings Knowledge Exploration in Life Science Informatics (2004), Lecture Notes In Artificial Intelligence 3303: 235-248.
Bradbury SP, Russom CL, Ankley GT, Schultz TW, Walker JD (2003) Overview of data and conceptual approaches for derivation of quantitative structure-activity relationships for ecotoxicological effects of organic chemicals. Environmental Toxicology and Chemistry 22: 1789-1798.
Cronin MTD (2005) Toxicological information for use in predictive modelling: quality, sources, and databases. In Helma C (ed) Predictive Toxicology. Marcel Dekker. pp. 93-133.
Jaworska JS, Comber M, Auer C, van Leeuwen CJ (2003) Summary of a Workshop on regulatory acceptance of (Q)SARs for human health and environmental endpoints. Environmental Health Perspectives 111: 1358-1360.
Jaworska JS, Nikolova-Zheliazkova N, Aldenberg T (2005) Review of statistical methods for QSAR AD estimation by the training set. ATLA in press.
Kaiser KLE (2004) Toxicity data sources. In MTD Cronin, DJ Livingstone (eds) Predicting Chemical Toxicity and Fate. CRC Press, Boca Raton FL, USA, pp 17-29.
Netzeva TI, Worth AP, Aldenberg T, Benigni R, Cronin MTD, Gramatica P, Jaworska JS, Klopman G, Marchant CA, Myatt G, Nikolova-Jeliazkova N, Patlewicz GY, Perkins R, Roberts DW, Schultz TW, Stanton DT, van de Sandt JJM, Tong W, Veith G, Yang C (2005) Current Status of Methods for Defining the Applicability Domain of (Quantitative) Structure-Activity Relationships. The Report and Recommendations of ECVAM Workshop 52. ATLA 33: 152-173.
Nikolova N, Jaworska J (2003) Approaches to measure chemical similarity - a review. QSAR and Combinatorial Science 22: 1006-1026.
Roncaglioni A, Benfenati E, Boriani E, Clook M (2004) A protocol to select high quality datasets of ecotoxicity values for pesticides. Journal of Environmental Science and Health Part B - Pesticides Food Contaminants and Agricultural Wastes 39: 641-652.
Tong W, Xie Q, Hong H, Shi L, Fang H, Perkins R (2004) Assessment of prediction confidence and domain extrapolation of two structure-activity relationship models for predicting estrogen receptor binding activity. Environmental Health Perspectives 112: 1249-1254.


3.5 Brief Details of Commercial and Non-Commercial Expert Systems for ADMET Prediction

Table 3.5 Brief Details of Commercial and Non-Commercial Expert Systems for ADMET Prediction

Name Brief Possibility for Computer Available Database Summary of Description validation of Requirements Endpoints Performance predictions Statistics (where available)

Statistical Systems TOPKAT QSAR models Optimum PC / UNIX Carcinogenicity FDA Specificity 82- developed from Prediction Space 95% large estimates give an Sensitivity 85- heterogeneous indication of 96% databases of whether the Carcinogenicity NTP Specificity 82- toxicological compounds falls 94% information within the model Sensitivity 82- using domain. 90% substructural Database can be Carcinogenicity Weight of Specificity 93- fragments and searched for Evidence 97% (electro)- similar Sensitivity 93- topological compounds 95% indices Mutagenicity Ames Specificity 83- 100% Sensitivity 75- 100%

Rat Oral LD50: range of factors that 95% of compounds were predicted within: 3-9
Rat Inhalation LD50: R2 varies between 0.85-0.87
Rat LOAEL: range of factors that 95% of compounds were predicted within: 3-5
Rat MTD: R2 varies between 0.80-0.92
Developmental Toxicity Potential: Specificity 86-97%; Sensitivity 86-88%

MCASE, CASE, CASETOX etc: QSAR models derived from large heterogeneous data bases using uniquely derived molecular fragments; missing fragments (i.e. those not represented in the database) identified; PC / VAX. Modules and performance statistics:

Carcinogenicity
A07: NTP Rodent
A08: NTP Mouse
A09: NTP Rat
A0C: Gold CPDB Rodent
A0D: Gold CPDB Rat (Sensitivity 52-67%; Specificity 63-68%)
A0E: Gold CPDB Mouse (Sensitivity 45-50%; Specificity 64-72%)
A0F: NTP Female Rat
A0G: NTP Male Rat
A0H: NTP Female Mouse
A0I: NTP Male Mouse

Mutagenicity
A20: NTP Salmonella
A2C: GeneTox
A2F: Mutations in Mouse Lymphoma (Sensitivity 64-100%; Specificity not determined)
A2H: Salmonella (Ames) (Sensitivity 75-78.5%; Specificity 78.2-90%)
A60: NTP Sister Chromatid Exchange
A61: NTP Chromosomal Aberration (Sensitivity 44-80%; Specificity 50-80%)
A62: Micronuclei Induction (Sensitivity 80-100%; Specificity 50-70%)

Teratogenesis
A44: Composite
A45: Developmental Toxicants - Mouse
A46: Developmental Toxicants - Rat
A47: Developmental Toxicants - Rabbit
A48: Developmental Toxicants - Human
A49: FDA + TERIS

Rat Toxicity
A50: Maximum Tolerated Dose - Mouse
A51: Maximum Tolerated Dose - Rat

PASS: Assesses similarity of molecules to those with known activity; None known; Internet connection; Various Endpoints; Literature data; Maximal Error of Prediction ranges from 10-29%

ToxScope – Models: Model developed from a library of molecular fragment; None known; PC; Various Endpoints, though little detail on individual models; Literature data; Not given

ToxScope – Data base only: N / A - data base; N / A; PC; N / A; N / A; N / A

Expert Systems

DEREK for Windows: Structural fragment developed by an expert user group are related to specific toxicity; Searchable database of ‘knowledge’ and toxicological information available; PC; Many endpoints; Literature data; No current performance statistics available

OncoLogic®: Class based selection of expert rules and SARs; Reasoning for evaluations is provided in an expert report; PC; Carcinogenicity; Literature and regulatory submission data; No current performance statistics available

HazardExpert: Rule based prediction system in conjunction with quantitative toxicokinetic assessment; Searchable knowledge database provided; PC; Many endpoints; Literature data; No current performance statistics available
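The sensitivity and specificity figures quoted in the table above follow the usual definitions for a two-class (toxic / non-toxic) prediction: sensitivity is the fraction of experimentally active compounds that are predicted active, and specificity is the fraction of experimentally inactive compounds that are predicted inactive. A minimal sketch of the calculation is given below. The counts are hypothetical and are not taken from any of the systems listed, and the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts obtained by comparing predictions with experimental results."""
    true_positives: int   # active compounds predicted active
    false_negatives: int  # active compounds predicted inactive
    true_negatives: int   # inactive compounds predicted inactive
    false_positives: int  # inactive compounds predicted active


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of experimentally active compounds predicted active."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of experimentally inactive compounds predicted inactive."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


# Hypothetical validation set: 100 active and 100 inactive compounds.
counts = ConfusionCounts(
    true_positives=60,
    false_negatives=40,
    true_negatives=65,
    false_positives=35,
)

print(f"Sensitivity: {sensitivity(counts):.0%}")
print(f"Specificity: {specificity(counts):.0%}")
```

A model with high sensitivity but low specificity flags most toxic compounds at the price of many false alarms, which is why the two figures are reported together.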

4. REACH Implementation Projects (RIPs)

4.1 Introduction

REACH Implementation Projects (RIPs) are being performed to assess the logistics of the REACH proposals before their implementation. There are seven RIPs of which one has already been performed (RIP1 – process definition), three are ongoing (RIP2 – IT infrastructure, RIP3 – guidance to industry, RIP4 – guidance to authorities) and three are yet to be started (RIP5, 6 and 7 – setting up agency). The full titles are listed below:

RIP 1: REACH Process Description
RIP 2: REACH-IT
RIP 3: Technical Guidance and Tools for Industry
RIP 4: Technical Guidance and Tools for Authorities
RIP 5: Setting up the Pre-Agency
RIP 6: Setting up the Agency
RIP 7: Commission preparations for REACH

4.2 Description of REACH Implementation Projects

4.2.1 REACH Implementation Project 1

RIP 1 involved the REACH Process Description, i.e. the development of a detailed description of the REACH processes. The aim of RIP 1 was to achieve a better stakeholder understanding of the REACH procedures and to provide a basis for the detailed work in the other RIPs. It has been finalised and the main findings are published on the home pages of DG ENV, DG ENTR and the JRC:

http://europa.eu.int/comm/environment/chemicals/reach.htm
http://europa.eu.int/comm/enterprise/chemicals/chempol/index.htm
http://ecb.jrc.it/REACH/

4.2.2 REACH Implementation Project 2

RIP 2 is to develop the IT system to support REACH implementation (REACH-IT). The aim of this project is to ensure that the REACH processes in the Agency, the MS Competent Authorities, the industry, the Commission and other affected stakeholders are supported (and partially enabled) by (an) appropriate IT system(s) and corresponding interfaces.

RIP 2 includes the following objectives:

• The REACH workflow is automated by an IT system;
• The REACH dossier submission is mainly organised through an IT system;
• The REACH dossier creation and management is supported by a new version 5 of IUCLID, already well introduced in the stakeholder community in its version 4;
• Non-confidential REACH data are published on a REACH dissemination website;
• Future corrective and evolutive maintenance of the REACH-IT and IUCLID 5 systems are taken over by the most appropriate organisations;
• Future first-level support (helpdesk) is run by the most appropriate organisation;
• The IUCLID 5 system must also be able to accommodate non-REACH requirements.

4.2.3 REACH Implementation Project 3

RIP 3 is to develop guidance documents for industry. The aim of this project is to develop, in good time before entry into force of the REACH legislation, the appropriate guidance documents and tools for industry in order to facilitate a smooth implementation of the legislation.

Of all the RIPs, RIP 3 is acknowledged to have the greatest relevance to the use of Alternatives to Animal Testing and some of the initiatives are described in more detail below.

RIP 3 is divided into 10 subprojects, of which two are particularly pertinent to this report, namely RIP 3.2 - Technical Guidance Document on preparing the Chemical Safety Report and RIP 3.3 - Technical Guidance Document on information requirements. Various consortia have been formed to bid for the tenders for these projects. Details of the 10 RIP 3 subprojects are given below.

RIP 3.1: Guidance Document on Preparing the Technical Dossier for Registration. Objective: Develop guidance for manufacturers and importers on how they can prepare their technical dossiers for registration under REACH.

RIP 3.2: TGD on preparing the Chemical Safety Report. Objective: Develop guidance for manufacturers, importers and downstream users of chemicals on how they can:

• carry out the chemical safety assessment covering Workers, Consumers and the Environment;
• document the assessment in the report, including listing of the exposure scenarios with appropriate Risk Management Measures;
• communicate information using the safety data sheet according to REACH.

RIP 3.3: TGD on Information Requirements on Intrinsic Properties of Substances. Objective: Develop guidance for industry on how they can fulfil the information requirements on intrinsic properties of substances.

RIP 3.4: Guidance Document on Data Sharing (Pre-registration). Objective: Develop guidance for manufacturers and importers (M/I) on how to share vertebrate animal data as a consequence of pre-registration of phase-in substances and inquiries for non-phase-in substances.

RIP 3.5: Guidance Document on Downstream-User Requirements. Objective: The guidance document should provide the downstream users (DUs) with clear guidance on what their obligations are under the REACH legislation regarding use of substances and exposure scenarios and what information they should have available, identify and communicate up/down the supply chain. Where possible the guidance shall be supply-chain role driven, differentiating for instance between formulators of a preparation and the users of substances or preparations.

RIP 3.6: Guidance on Classification and Labelling under the Globally Harmonised System (GHS). On hold until the proposal for implementing GHS is adopted by the Commission.

RIP 3.7: Guidance on preparing an Application Dossier for Authorisation. Objective: The guidance document should provide industry with all necessary guidance on the process to be followed in applying for an authorisation.

RIP 3.8: Guidance on fulfilling the Requirements for Articles. Objective: Develop guidance that enables producers/importers of articles (and everyone else, including the enforcing authority) to judge whether a substance in an article should be registered/notified or not.

RIP 3.9: Technical Guidance Document on carrying out a Socio-Economic Analysis or input for one. Objective: The guidance document should specify when and how to conduct a socio-economic analysis (SEA) under REACH. The guidance should strive to make SEA outputs as comprehensive, consistent and user-friendly (for the SEA committee) as possible, taking into account the broad range of chemicals, uses, alternatives, etc. to be analysed and the various parties and processes to be covered by the guidance.

RIP 3.10: Technical Guidance Document for Characterisation and Checking of Substance Identity. Objective: Develop guidance for manufacturers and importers (M/I) on how to characterise and record the identity of a substance.

More detail on the relevant aspects of RIP 3 is given below.

RIP 3.2 involves determining how to perform hazard assessment and how to complete chemical data sheets. RIP 3.3 combines this information with exposure data to provide guidance on risk assessment and management. Work on both projects started in January 2005, and initial scoping study reports are due to be submitted to the Commission in July 2005. It is not known whether these reports will be made publicly available. Two stakeholder meetings will take place during the scoping studies: the first took place in March and the second is scheduled for June.

The RIP 3.3 scoping study will provide more information regarding when it will be possible to apply alternative methods. The tender requires the consortia to look into certain minimal considerations:

- use of existing information (human data, non-GLP studies etc)
- application of a weight of evidence approach
- appropriate use of (Q)SARs – when and how?
- when and how to use in vitro methods
- application of using categories and read-across
- when testing is technically impossible
- when testing can be omitted based on exposure considerations
- further guidance on implementation of the rules for adaptation as provided in the annexes to REACH

Several groups are looking into these considerations: three information working groups (IMGs), a framework working group and four endpoint working groups.

IMG1 deals with exposure considerations, i.e. defining when there are grounds for derogation from testing because of exposure conditions. The consortia proposed the use of thresholds of toxicological concern (TTCs) at the first stakeholder meeting. IMG2 deals with testing requirements: how to use all existing data and how in vitro studies can be used to reduce the need for further animal testing. IMG3 deals with non-testing approaches, i.e. the use of (Q)SARs, categories and read-across. These IMG projects are all finished and will be presented at the next stakeholder meeting in June.

The Framework working group is headed by Dr Geoff Randell (formerly a toxicologist at Astra Zeneca, now a consultant) and is working on developing a general decision-making process.

The four endpoint working groups (irritation/corrosion, reproductive/developmental, aquatic toxicity and biodegradation) are producing flow charts to test ten chemicals (five initially) for these specific endpoints. These flow charts are to be a guide for industry to decide if testing is required and how to go about it rather than which specific tests to conduct. These studies should be finalised in June to go into the scoping study report in July.

Within the scope of RIP 3.3 the consortia were not recommending an alternative to the tonnage trigger approach. The Commission will allow derogation from the required testing only for the higher tonnage bands (100-1000 tonnes and 1000+ tonnes; level 1 and level 2 testing), whereas derogation could also be allowed for base set testing (10-100 tonnes). The fact that the Commission may not allow any flexibility in the testing requirements at this level is likely to lead to large numbers of animals being used for reproductive and developmental screening studies, even where existing data show that these studies are not required.

Another anomaly the consortia have to get round is that the REACH proposals require Classification and Labelling data before a risk assessment is carried out. Normally this would be done the other way round; otherwise the classification may change after the risk assessment is carried out, which would require new labelling, at a cost to industry.

4.2.4 REACH Implementation Project 4

RIP 4 will develop guidance documents for authorities. The aim of RIP 4 is to develop, in good time before entry into force of the REACH legislation, the appropriate guidance documents and tools for the authorities in order to facilitate a smooth implementation of the legislation.

4.2.5 REACH Implementation Project 5

RIP 5 will set up the Pre-Agency. The specific aim of this RIP is to establish the Pre-Agency prior to entry into force of REACH, ensuring that it can effectively, efficiently, and transparently carry out the tasks allocated to it.

4.2.6 REACH Implementation Project 6

RIP 6 will set up the Agency. The specific aim of this RIP is to establish the European Chemicals Agency within 18 months of entry into force of REACH, ensuring that it can effectively, efficiently, and transparently carry out the tasks allocated to it.

4.2.7 REACH Implementation Project 7

RIP 7 will prepare the European Commission for REACH. The specific aim of this RIP is to ensure that the Commission can effectively, efficiently, and transparently carry out the tasks allocated to it on entry into force of REACH.

4.3 Further Information on RIPs

There appears to be little published documentation relating specifically to RIPs. This is probably because these are new and on-going projects (at the time of writing this report) and subject to commercial confidentiality. The most productive source of information is therefore the internet. The following web-sites were found to be useful in the preparation of this report.

http://europa.eu.int/comm/environment/chemicals/reach.htm
http://europa.eu.int/comm/enterprise/chemicals/chempol/index.htm
http://ecb.jrc.it/REACH/
http://www.cefic.be/Templates/shwStory.asp?NID=29&HID=450
