<<

Integrated Biological-Chemical Approach for the Identification of

Polyaromatic Mutagens in Surface

Von der Fakultät für Mathematik, Informatik und Naturwissenschaften der RWTH Aachen University zur Erlangung des akademischen Grades einer Doktorin der Naturwissenschaften genehmigte Dissertation

vorgelegt von

D.E.A

Christine Gallampois aus Châtillon sous Bagneux (Frankreich)

Berichter: Universitätsprofessor Dr. Henner Hollert Universitätsprofessor Dr. Andreas Schäffer

Tag der mündlichen Prüfung: 11.07.2012

Diese Dissertation ist auf den Internetseiten der Hochschulbibliothek online verfügbar.

A Mamée

SUMMARY

Surface waters are essential for human , to supply of drinking and as an important resource for agricultural, industrial and recreational activities. However, tonnes of pollutants enter these surface waters every year. Amongst the substances discharged into the environment, a large number are known to be mutagenic, forming part of a complex mixture of effluents found in municipal waste waters from domestic and industrial waste water treatment plants and in surface runoff from agricultural land . Moreover, these surface waters contain many unknown compounds. Consequently, water pollution can be a serious problem for public and aquatic ecosystem health. Effect-directed analysis (EDA) is a tool to identify chemicals responsible for the observed toxic effects. It is based on a combination of chemical and biological analysis. The chemical analysis enable the separation and isolation of toxicants from complex samples, while the biological analysis enables the “tracking” of the toxicants during the separation of the chemicals, which allows decreasing the complexity of the sample in focusing only on chemicals causing the biological effects. Among planar aromatic compounds such as azaarenes, polyaromatic (PAHs), keto-, nitro-keto-, hydroxyl-, amino- and nitro-PAHs, as well as quinones and hydroxy- quinones there are many known environmental mutagens often present in surface waters in complex mixtures and at low concentrations. Planar aromatic compounds may be sampled by selective adsorption to blue rayon (BR). The first part of the thesis work presented here aimed to provide a specific fractionation method for these compounds applicable in effect-directed analysis. This procedure relies on three fractionation steps: (i) solid phase extraction using mixed-mode cation- and anion-exchange sorbents, (ii) reversed-phase HPLC polymeric C18 column and (iii) reversed-phase HPLC phenyl-hexyl column. Based on 47 analytical standards and a BR extract tested with the Ames fluctuation test, the ability of the method to recover and isolate mutagenicity is shown. Standard group recoveries ranked from 37% (quinones) to 85% (keto-polyaromatic hydrocarbons and amides) and these chemicals were separated satisfactorily, with little overlap between neighbouring fractions. The recovery of the BR mutagenicity was over 75 %. As the sample mutagenicity was mainly present in only 7 of 50 fractions (neutral compounds in fractions N-2-6, N-2-7, N-2-8 and N-7-12 and acidic compounds in fractions A-2-9, A-2-10 and A-8-9), this three step fractionation method has been shown to be a reliable method to decrease sample complexity in order to identify polar planar compounds responsible for mutagenic effects. The aim of the second part of the EDA study was to develop and apply an identification strategy to handle the data produced by liquid chromatography-high resolution mass spectrometry (LC-HRMS) and to identify chemicals that are responsible for the mutagenic effects in selected fractions. This was done in two steps: (i) Reference standards were analysed in order to understand their ionisation behaviour by electrospray (ESI) and atmospheric pressure chemical ionisation (APCI) and their MS/MS fragmentation behaviour with the aim of deriving rules that can be used for the unknown identification, and (ii) the method was then applied in the frame of a target-, suspect compounds- and non-target- screening for two mutagenic environmental fractions, N-2-8 and N-7-12 (target-and suspect- compounds-screening for this latter). Typical losses associated with certain chemical classes

i

(e.g. nitro-, hydroxyl-, keto-PAHs) were determined for both APCI- and ESI-, positive and negative mode. As a larger range of compound groups (particularly nitro compounds) could be ionised in APCI, this technique was used to identify unknowns of interest in original sample screening. The target screening based on the exact mass and retention time of the reference standards did not provide any hit and hence, no compounds were identified in the two fractions. The suspect-screening strategy based on the exact mass of the compounds enabled the identification of pigment yellow 1 in fraction N-7-12. Finally non-target screening was applied, which relied on (i) the determination of the empirical formula based on the exact mass of the parent ion, its isotopic pattern and its MS2 fragmentation, (ii) the selection of the candidates based on the comparison of MSn fragments of the unknown and compounds present in chemical database such as ChemSpider or PubChem, and (iii) the match of the physico-chemical properties of the unknown compound with candidates. This strategy enabled to identify and confirm the presence of benzyl(diphenyl) oxide in the fraction N-2-8 and to propose a list of 92 candidates, including 46 aromatic amines.. The aim of the third part of this study was to apply two different approaches to further decrease the candidates list, containing 92 candidates for fraction N-2-8. This was done by applying (i) retention prediction in reversed phase liquid chromatography calculating the Chromatographic Hydrophobicity Index (CHI) using Linear Solvation-Energy Relationship (LSER), which is based on the Abraham equation and (ii) the mutagenicity prediction of aromatic amines on the basis of the stability of corresponding nitrenium ions as the ultimate electrophiles attacking the DNA. Applying the LSER model, the list of candidates was decreased to 22. The mutagenicity prediction method identified nine amino-candidates as mutagens, 24 amino-candidates for which the mutagenicity could not be predicted and thirteen as non-mutagens. In combining the results of these two methods, 22 compounds remained on the list. For two of them, the retention prediction and mutagenicity were positive, while for the 20 other the retention was in agreement with the prediction, but no mutagenicity was predicted. Hence, these two methods allowed to drastically decrease the number of candidates. The methods described in this thesis work provide a good basis with new criteria for candidate selection (acid/base property and APCI/ESI ionisation behaviour of the different compound classes, which can be used in any identification procedure) for EDA studies of planar mutagens in water samples. Two approaches (LSER and mutagenicity prediction) showed their efficiency in reducing the number of candidates. As general considerations, the LSER approach can easily be included in EDA studies as a pre-step for confirmation. Furthermore, this work showed that new tools are needed to improve the identification procedure, which requires databases (e.g. self-made databases for target and suspect- compounds screening), analytical and separation techniques (e.g. ionisation and fragmentation, to gain even more information regarding the analytes to be identified), as well as certain software tools (e.g. helping in the adduct formation prediction).

ii ZUSAMMENFASSUNG

Oberirdische Gewässer haben für das menschliche Leben, insbesondere für die Trinkwasserversorgung und als wichtige Ressource für Landwirtschaft und Industrie sowie für Freizeitaktivitäten eine große Bedeutung. Dennoch gelangen jedes Jahr Tonnen von Schadstoffen in diese Gewässer. Unter den Substanzen, die in die aquatische Umwelt gelangen, sind eine große Zahl bekannter Mutagene, die als Teil komplexer Mischungen aus häuslichen und industriellen Abwässern, sowie im Oberflächenabfluss landwirtschaftlich genutzten Flächen gefunden werden. Daneben sind diese Gewässer aber auch einer Vielzahl unbekannter Verbindungen belastet. Folglich kann die Verschmutzung des Wassers eine ernste Gefahr für die Volksgesundheit und das Gewässermanagement sein. Die Methode der Wirkungsorientierten Analytik (EDA) erlaubt es, Chemikalien zu identifizieren, welche für beobachtete toxische Wirkungen verantwortlich sind. Es basiert auf einer Kombination von chemischen und biologischen Analysen. Die chemische Analyse erlaubt die Trennung und Isolierung von Schadstoffen aus komplexen Proben, während die biologische Analyse ermöglicht, die Komplexität der Probe zu verringern, indem man sich nur auf die Fraktionen fokussiert, welche die Stoffe enthalten, die für die biologischen Effekte verantwortlich sind. Unter der Gruppe der planaren aromatischen Verbindungen, wie z.B. Azaarene, polyzyklische aromatische Kohlenwasserstoffe (PAK), Keto-, Nitro-Keto-, Hydroxyl-, Amino-und Nitro- PAKs sowie Chinone und Hydroxy-Chinone, gibt es viele bekannte Umweltmutagene, die oft in niedrigen Konzentrationen und in komplexen Gemischen in unseren Oberflächengewässern vorkommen. Solche planar-aromatischen Verbindungen können durch selektive Adsorption an Blau-Rayon (BR) als Sammelmatrix beprobt werden. Der erste Teil der hier vorgestellten Arbeit zielte daher darauf ab, ein spezifisches Verfahren zur Fraktionierung dieser Verbindungen im Rahmen einer Wirkungsorientierten Analytik zu entwickeln. Dieses neue Verfahren basiert auf drei Fraktionierungsschritten: (i) Festphasenextraktion unter Verwendung von gemischten Kationen- und Anionenaustausch-Sorptionsmitteln, (ii) Umkehrphasen-HPLC mit einer polymeren C18-Säule und (iii) Umkehrphasen-HPLC mit einer Phenyl-Hexyl Säule. Basierend auf 47 Analysenstandards und einem BR-Extrakt, welche mit dem Ames-Test überprüft wurden, wurde die Fähigkeit des Verfahrens zur Extraktion und Isolierung von mutagenen Proben gezeigt. Die Methode erzielte Wiederfindungen von 37 (Chinone) bis 85% (Keto-polycyclische aromatische Kohlen- wasserstoffe und Amide) und trennte die Chemikalien zufriedenstellend auf, mit wenig Überlappung zwischen den benachbarten Fraktionen. Die Mutagenität in der BR-Probe war im Wesentlichen in 7 der 50 Fraktionen zu finden (neutrale Verbindungen in Fraktionen N-2- 6, N-2-7, N-2-8 und N-7-12 und sauren Verbindungen in Fraktionen A-2-9, A- 2-10 und A-8- 9) und konnte zu 75% wiedergefunden werden. Es konnte gezeigt werden, dass die entwickelte Drei-Stufen Fraktionierungsmethode zuverlässig war, um die Proben-Komplexität zu verringern und die polar-planaren Verbindungen als verantwortlich für die mutagenen Wirkungen zu identifizieren. Das Ziel des zweiten Teils der hier vorgestellten Studie war die Entwicklung und Anwendung einer Identifikations-Strategie anhand von Flüssigchromatographie - hochauflösenden Massenspektrometern (LC-HRMS) erzeugten Daten, um die für die mutagene Wirkungen

iii

verantwortlichen Chemikalien in den ausgewählten Fraktionen zu identifizieren. Dies wurde in zwei Schritten erreicht: (i) Analysestandards wurden gemessen, um deren Verhalten im Elektrospray Ionisation (ESI) und Atmospheric Pressure Chemical Ionisation (APCI) Modus hinsichtlich ihrer MS2-Fragmentierungsverhalten zu verstehen. Dies hatte zum Ziel, sich Regeln zu erarbeiten, die man für die Identifizierung unbekannter Substanzen verwenden kann. (ii) Diese Methode wurde dann im Rahmen einer Target-, Suspect- und Non-Target Screening Analyse für zwei mutagene Fraktionen angewendet, N-2-8 und N-7-12 (Target und Suspect-Screening für die Letztere). Typische Verluste in einigen Stoffklassen (zB Nitro-, Hydroxyl-, Keto-PAKs) wurden sowohl für APCI und ESI im positiven und negativen Modus zugeordnet. Da ein größerer Teil der mutagenen Stoffklassen (insbesondere Nitro- verbindungen) auch in APCI ionisiert werden können, wurde diese Technik verwendet, um Unbekannte in den Screening-Proben zu identifizieren, während die Informationen aus dem ESI Modus dazu verwendet wurden, um die Anwesenheit oder Abwesenheit von bestimmten funktionellen Gruppen in der molekulare Struktur der unbekannten Verbindungen zu bestätigen. Das Target-Screening, basierend auf der genauen Masse und Retentionszeit der Analysestandards, ergab keinen Treffer, und somit wurden zunächst keine Verbindungen in den beiden Fraktionen identifiziert. Das Suspect-Screening, basierend auf der genauen Masse der Verbindung, ermöglichte die Identifizierung von Pigment Yellow 1 in der Fraktion N-7-12. Schließlich wurde eine Non-Target-Screening Methode angewandt, welche auf folgenden Schritten aufgebaut ist: (i) Bestimmung der Summenformel anhand der exakten Masse des Mutter-Ions, sowie anhand dessen Isotopenmusters und ihrer MS2-Fragmentierung; (ii) Auswahl der Kandidaten basierend auf dem Vergleich von MSn Fragmenten der unbekannten Verbindungen mit denen, die in chemischen Datenbanken wie z.B. ChemSpider oder PubChem vorhandenen sind, und (iii) Überprüfung der Übereinstimmung der physikalisch-chemischen Eigenschaften der unbekannten Verbindungen mit denen der Kandidaten. Mit dieser Strategie konnte das Vorhandensein von Benzyl (diphenyl) phosphinoxid in Fraktion N-2-8 identifiziert und bestätigt werden und eine Liste von 92 Kandidaten vorgeschlagen werden. Das Ziel des dritten Teils der Studie war es, anhand zweier unterschiedlicher Ansätze die Anzahl der 92 Kandidaten für die Fraktion N-2-8 zu verringern. Dies wurde folgendermaßen erreicht: (i) Vorhersage der Retenzionszeit in Umkehrphasen-Flüssigchromatographie anhand des so genannten Chromatographischen Hydrophobizitätsindex (CHI) unter Verwendung der linearen Solvatation-Energie-Beziehung (LSER), die auf der Abraham Gleichung beruht, und (ii) Vorhersage der Mutagenität von aromatischen Aminen auf der Grundlage der Stabilität entsprechender Nitrenium Ionen - Zwischenprodukte, welche letzten Endes den elektrophilen Angriff auf die DNS auslösen. Durch die Anwendung des LSER Modells konnte die Liste der Kandidaten auf 22 verringert werden. Die Mutagenitäts-Vorhersage wiederum identifizierte neun der Amino-Kandidaten als potenziell mutagen, während für 24 Amino-Kandidaten die Mutagenität nicht vorhergesagt werden konnte und weitere dreizehn Kandidaten als nicht- mutagen identifiziert wurden. In Kombination der Ergebnisse dieser beiden Methoden blieben 22 Verbindungen auf der Liste. Für zwei von ihnen waren sowohl die Vorhersage der Retention als auch der Mutagenität positiv, während für die anderen 20 Substanzen zwar die Retentionszeit übereinstimmte, aber keine Mutagenität vorhergesagt wurde. Diese beiden Methoden erlaubten somit eine drastische Verringerung der Zahl der Kandidaten. iv Die in dieser Arbeit beschriebenen Methoden bilden eine gute Grundlage für die Auswahl von Kandidaten anhand neuer Kriterien (Säure-/Base-Eigenschaften und APCI / ESI Ionisation Verhalten von verschiedenen Stoffklassen, die in praktisch jedem Identifikationsverfahren verwendet werden können) für die Wirkungsorientierte Analytik von polar-planaren Mutagenen in Wasserproben. Zwei Ansätze (LSER und Mutagenitätsvorhersage) waren geeignet, der Zahl der Kandidaten zu reduzieren. Als allgemeine Anregung kann der LSER- Ansatz leicht in EDA-Studien als Vorab-Schritt für die Bestätigung von Unbekannten integriert werden. Des Weiteren konnte diese Arbeiten zeigen, dass neue Werkzeuge benötigt werden, um das Verfahren der Stoffidentifizierung zu verbessern, wie z.B. Datenbanken (z.B. selbst erstellte Target und Suspect-Screening Datenbanken), Analytische- und Trennverfahren (z.B. Ionisierung und Fragmentierung, um noch mehr Informationen über den zu identifizierenden Analyten zu gewinnen), sowie bestimmte Software Anwendungen (z.B. für die Bestimmung der Summenformel).

v

LIST OF PUBLICATIONS

Bougeard, C., Gallampois, C., Brack, W. (2011). Passive dosing: An approach to control mutagen exposure in the Ames Fluctuation test, Chemosphere, 83 (4), 409-414.

Akkanen, J., Slootweg, T., Mäenpää, K., Leppänen, M.T., Agbo, S., Gallampois, C., Kukkonen, J.V.K. (2012) Book Chapter: Bioavailability of organic contaminants in freshwater environments. The handbook of Environmental Chemistry: Emerging and Priority Pollutants in Rivers: Bringing science into River Management Plans, Vol. 19, Eds. Guasch, Helena, Ginebreda, Antoni and Geiszinger, Anita, Springer.

Schymanski, E.L., Gallampois, C.M.J., Krauss, M., Meringer, M., Neumann, S., Schulze, T., Wolf, S., Brack, W. (2012) Consensus Structure Elucidation Combining GC/EI-MS, Structure Generation and Calculated Properties, Analytical Chemistry in press. DOI:10.1021/ac203471y.

Gallampois, C.M.J., Schymanski, E.L., Bataineh, M., Krauss, M., Brack, W. Sampling and isolation of planar polycyclic mutagens in river water using Blue Rayon (submitted in Analytical and Bioanalytical Chemistry).

Gallampois, C.M.J., Schymanski, E.L, Buchinger, S., Krauss, M., Reifferscheid, G., Brack, W. Development of a HPLC-ESI/APCI-Orbitrap method and a procedure to identify planar polycyclic mutagens in water (In prep.).

Gallampois, C.M.J., Schymanski, E., Krauss, M., Brack, W. Identification of pigment yellow 1 in a blue rayon extract by suspect-compounds screening procedure (In prep.).

PLATFORM PRESENTATIONS

Gallampois, C., Schymanski, E.L., Bataineh, M., Brack “Use of effect-directed analysis to identify mutagens in a blue rayon extract”. Oral presentation at the 3rd International Symposium: Genotoxicity in aquatic systems: Causes, effects and future needs (22-24 September 2010), Freiburg in Breisgau, Germany.

Gallampois, C., Bataineh, M., Löffler, I., Sperreuter, A., Streck, G., Brack, W. “Identification of mutagenic compounds in a blue rayon extract”. Oral presentation at the 20th SETAC Europe (23-27 May 2010), Seville, Spain.

Bougeard, C., Gallampois, C., Kormos, J., Simon, E., Guash, H., Brack, W., Lamoree, M., Ternes, T. “Identification of mutagens in the European river”. Joined oral presentation at the "Emerging and Priority Pollutants: Bringing Science into River Management Plans" meeting (25-26 March 2010), Girona, Spain.

Gallampois, C., Bataineh, M., Umbuzeiro, G.A, Löffler, I., Sperreuter, A., Streck, G., Brack, W. “Mutagenic compounds in the River Elbe- Development of a RP-Fractionation method applied to a blue rayon extract”. Oral presentation at the 19th SETAC Europe (31 May-4 June 2009), Gothenburg, Sweden.

vi

POSTER PRESENTATIONS

Gallampois, C., Schymanski, E.L., Bataineh, M., Streck, G., Brack, W. “Use of computer tools for structure elucidation in effect-directed analysis - Mass frontier vs. MZmine”. Poster presentation at the 20th SETAC Europe (23-27 May 2010), Seville, Spain.

Gallampois, C., Bataineh, M., Löffler, I., Streck, G., Brack, W. “Blue rayon, a passive sampler to monitor mutagenicity in surface waters”. Poster presentation at the 20th SETAC Europe (23-27 May 2010), Seville, Spain.

Gallampois, C., Bataineh, M., Löffler, I., Sperreuter, A., Streck, G., Brack, W. “Effect directed analysis of a River Elbe water sample”. Poster presentation at the Chemicals In The Environment conference (24 March 2010), Leipzig, Germany.

Gallampois, C., Bataineh, M., Löffler, I., Sperreuter, A., Streck, G., Brack, W. “Development of an effect directed analysis method to identify polar planar mutagenic compounds in waters”. MODELKEY meeting “How to assess the impact of key pollutants in aquatic ecosystems” (30 November- 2 December 2009), Leipzig, Germany.

Gallampois, C., Umbuzeiro, G.A, Streck, G., Bataineh, M., Brack, W. “Blue rayon: a passive sampler to study the presence of mutagenic compounds in surface waters”. Poster presentation at the 3rd International Passive Sampler Workshop (27-30 May 2009), Prague, Czech Republic.

Gallampois, C., Streck, G., Brack, W. “Blue rayon: a passive sampler to extract and concentrate polycyclic compounds in waters”. Poster presentation at the 18th SETAC Europe (25-29 May 2008), Warsaw, Poland.

vii

TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION 1

1.1 CHEMICALS IN FRESHWATER RESOURCES 1 1.2 IMPORTANCE OF PLANAR MUTAGENS IN THE ENVIRONMENT 2 1.3 OBJECTIVE 2

CHAPTER 2: SCIENTIFIC BACKGROUND 5

2.1 EFFECT-DIRECTED ANALYSIS 5 2.2 SAMPLING WITH BLUE RAYON 6 2.2.1 USING BLUE FIBRE PASSIVE SAMPLER 9 2.2.1.1 Cleaning process of the blue fibres 9 2.2.1.2 How to use the blue fibres 9 2.2.1.3 Hanging method vs. solid phase extraction method 10 2.2.1.4 Extraction of the chemicals adsorbed on blue fibres. 11 2.3 MUTAGENICITY TESTING OF SURFACE WATERS 11 2.3.1 AMES BIOASSAY 11 2.3.2 DIFFERENT STRAINS OF SALMONELLA AND THEIR USE 12 2.4 SEPARATION IN EDA STUDIES 16 2.5 ANALYTICAL METHODS IN EDA STUDIES 17 2.6 STRATEGIES FOR IDENTIFICATION OF TOXICANTS 18 2.6.1 TARGET ANALYSIS APPROACH 19 2.6.2 SUSPECT COMPOUNDS SCREENING APPROACH 20 2.6.3 NON-TARGET SCREENING APPROACH (UNKNOWN IDENTIFICATION) 20 2.7 FURTHER CLASSIFIERS TO SUPPORT IDENTIFICATION 22 2.7.1 RETENTION PREDICTION 22 2.7.2 MUTAGENICITY PREDICTION 23

CHAPTER 3: DEVELOPMENT AND APPLICATION OF INTEGRATED TOOLS FOR SAMPLING AND ISOLATION OF PLANAR POLYCYCLIC MUTAGENS 25

3.1 INTRODUCTION 25 3.2 MATERIALS AND METHODS 25 3.2.1 CHEMICALS 25 3.2.2 MUTAGENICITY ASSAY 29 3.2.3 OPTIMISATION OF THE FRACTIONATION PROCEDURE 30 3.2.3.1 Method development strategy 30 3.2.3.2 Ion exchange cartridges 30 3.2.3.3 Separation on analytical columns 31 3.2.3.4 Fractionation on preparative columns 32 3.2.4 COLLECTION AND PREPARATION OF ORGANIC EXTRACT FROM RIVER WATER 32 3.2.5 EVALUATION OF THE METHOD 33 3.3 RESULTS AND DISCUSSION 33 3.3.1 EVALUATION OF THE METHOD BASED ON STANDARDS 33 3.3.1.1 Ion exchange recoveries and separation 33 3.3.1.2 Selection of the stationary phases for the preparative liquid chromatography 38 3.3.1.3 Separation with the combination of the polymeric C18 and phenyl hexyl columns 38 3.3.1.4 Recoveries of the liquid chromatography fractionations 44 3.3.2 FRACTIONATION OF THE BR SAMPLE 44

ix

3.3.2.1 Fractionation by ion exchange 45 3.3.2.2 Fractionation by polymeric C18 47 3.3.2.3 Fractionation by phenyl-hexyl 50 3.4 CONCLUSIONS 54

CHAPTER 4: DEVELOPMENT OF A HPLC‐ESI/APCI‐ORBITRAP METHOD AND A PROCEDURE TO IDENTIFY PLANAR POLYCYCLIC MUTAGENS IN RIVER WATER 55

4.1 INTRODUCTION 55 4.2 MATERIAL AND METHODS 56 4.2.1 STANDARDS AND SAMPLES 56 4.2.2 LC-MS SYSTEM 56 4.2.3 PEAK DETECTION AND DATA PROCESSING 56 4.2.4 IDENTIFICATION PROCEDURE OF ACTIVE FRACTIONS 57 4.2.5 MUTAGENICITY ASSAY 59 4.3 RESULTS AND DISCUSSION 60 4.3.1 IONISATION OF STANDARD COMPOUNDS BY ESI AND APCI SOURCES 60 4.3.2 TARGET AND SUSPECT COMPOUNDS SCREENING IN FRACTIONS N-2-8 AND N-7-12 67 4.3.3 UNKNOWN IDENTIFICATION IN FRACTION N-2-8 69 4.3.3.1 Determination of the empirical formula 71 4.3.3.2 From the empirical formula to the candidate(s) 76 4.3.4 LIMITATION OF THE METHOD AND NEW CHALLENGES 144 4.3.5 STRAINS TO DIAGNOSTIC MUTAGENICITY TO SUPPORT STRUCTURE ELUCIDATION OF UNKNOWN MUTAGENS 145 4.4 CONCLUSIONS 147

CHAPTER 5: USE OF RETENTION AND MUTAGENICITY PREDICTION TOOLS TO DECREASE THE NUMBER OF CANDIDATES FOR CONFIRMATION 149

5.1 INTRODUCTION 149 5.2 MATERIEL AND METHODS 149 5.2.1 LC-MS/MS SYSTEM 149 5.2.2 LSER METHOD 150 5.2.3 MUTAGENICITY PREDICTION 151 5.3 RESULTS AND DISCUSSION 152 5.3.1 APPLICATION OF THE LSER METHOD TO FRACTION N-2-8 152 5.3.2 APPLICATION OF THE NITRENIUM STABILITY TO THE SELECTED CANDIDATES FOR FRACTION N-2-8 158 5.3.3 COMBINING RETENTION AND MUTAGENICITY PREDICTION 160 5.4 CONCLUSIONS 161

CHAPTER 6: SUMMARY AND FUTURE WORK 163

ACKNOWLEDGEMENTS 163

REFERENCES 169

APPENDIX 175

CURRICULUM VITAE 181

x

LIST OF FIGURES

Figure 2- 1: Scheme of effect-directed analysis ...... 6 Figure 2- 2: Structures of hemin and copper phthalocyanine ...... 6 Figure 2- 3: Synthesis of blue fibres ...... 7 Figure 2- 4: Hanging method...... 9 Figure 2- 5: Mechanism of nitroarenes activation ...... 12 Figure 2- 6: Mechanism of aromatic amines activation ...... 12 Figure 2- 7: Identification procedures of pollutants in environmental samples ...... 19 Figure 3- 1: Structure of the MCX and MAX sorbents ...... 34 Figure 3- 2: LC-APCI-MS chromatograms of the standard compounds mixture ...... 40 Figure 3- 3: Recoveries of ion exchange separation...... 46 Figure 3- 4: Mutagenicity of the ion exchange fractions ...... 46 Figure 3- 5: Recoveries of the neutral fraction ...... 47 Figure 3- 6: Mutagenicity of neutral sub-fractions (polymeric C18 fractionation step) ...... 48 Figure 3- 7: Recoveries of the acidic fraction A ...... 49 Figure 3- 8: Mutagenicity of acidic sub-fractions (polymeric C18 fractionation step)...... 49 Figure 3- 9: Recoveries of fraction N-2 ...... 50 Figure 3- 10: Recoveries of fraction N-7...... 51 Figure 3- 11: Mutagenicity of the N-2 sub-fractions (phenyl-hexyl fractionation step)...... 51 Figure 3- 12: Mutagenicity of the N-7 sub-fractions (phenyl-hexyl fractionation step)...... 52 Figure 3- 13: Recoveries of fraction A-2...... 53 Figure 3- 14: Recoveries of fraction A-8...... 53 Figure 3- 15: Mutagenicity of the A-2 sub-fractions (phenyl-hexyl fractionation step)...... 54 Figure 3- 16: Mutagenicity of the A-8 sub-fractions (phenyl-hexyl fractionation step)...... 54 Figure 4- 1: Suspect-compounds screening identification procedure ...... 58 Figure 4- 2: Strategy for the unknown identification ...... 59 Figure 4- 3: Structure of Pigment Yellow 1 ...... 67 Figure 4- 4: Mass spectrum (MS2) of Pigment Yellow 1...... 68 Figure 4- 5: Full scan chromatogram with the peaks of interest ...... 70 Figure 4- 6: Mass spectrum of Peak 3 ...... 72 Figure 4- 7: Mass spectrum of Peak 19 ...... 74 Figure 4- 8: MS2 spectrum of the unknown compound Peak 3 ...... 77 Figure 4- 9: Predicted fragments for candidate 1 ...... 79 Figure 4- 10: Predicted fragments for candidate 2 ...... 79 Figure 4- 11: Predicted fragments for candidate 3 ...... 80 Figure 4- 12: Predicted fragments for candidate 4 ...... 80 Figure 4- 13: MS2 spectrum of unknown compound Peak 19 ...... 82 Figure 4- 14: MS2 spectrum of standard 1H-benz[g]indole ...... 84 Figure 4- 15: Predicted MS2 spectrum of candidate 1 for Peak 35 ...... 85 Figure 4- 16: Predicted MS2 spectrum of candidates 2, 3 and 4 for Peak 35 ...... 86 Figure 4- 17: MS2 spectrum of unknown compound Peak 36 ...... 87 Figure 4- 18: Predicted MS2 spectrum of candidates 1, 2, 3 and 4 for Peak 36 ...... 88 Figure 4- 19: MS2 spectrum of unknown compound Peak 40 ...... 89 Figure 4- 20: Predicted MS2 spectrum of best candidates for Peak 40 ...... 90 Figure 4- 21: MS2 spectrum of unknown compound Peak 50 ...... 94 Figure 4- 22: MS2 spectrum of 1-isopropyl-5-methyl-1H-indole-2,3-dione ...... 98 Figure 4- 23: Predicted MS2 spectrum of best candidates for Peak 50 ...... 99 Figure 4- 24: MS2 spectrum of unknown compound Peak 60 ...... 101

xi

Figure 4- 25: Predicted MS2 spectrum of candidate 1 for Peak 60 ...... 103 Figure 4- 26: Predicted MS2 spectrum of candidates 2, 3and 4 for Peak 60 ...... 104 Figure 4- 27: MS2 spectrum of unknown compound Peak 99 ...... 105 Figure 4- 28: Predicted MS2 spectrum of candidates 1, 2, 3, 4, 5 and 6 for Peak 99 ...... 108 Figure 4- 29: Predicted MS2 spectrum of candidate 7 for Peak 99 ...... 108 Figure 4- 30: MS2 spectrum of unknown compound Peak 109……………………………..110 Figure 4- 31: Predicted MS2 spectrum of candidates 1 and 2 for Peak 109 ...... 112 Figure 4- 32: Predicted MS2 spectrum of candidates 3, 4 and 5 for Peak 109 ...... 112 Figure 4- 33: Predicted MS2 spectrum of candidate 1 for Peak 138 ...... 117 Figure 4- 34: Predicted MS2 spectrum of candidates 2, 3, 5 and 6 for Peak 138 ...... 118 Figure 4- 35: Predicted MS2 spectrum of candidates 7 and 8 for Peak 138 ...... 119 Figure 4- 36: Predicted MS2 spectrum of candidate 4 for Peak 138 ...... 120 Figure 4- 37: Predicted MS2 spectrum of candidate 9 for Peak 138 ...... 120 Figure 4- 38: Predicted MS2 spectrum of candidate 10 for Peak 138 ...... 121 Figure 4- 39: Predicted MS2 spectrum of 3-(2-hydroxybenzoyl)-5H-indeno[1,2-b]pyridin-5- one for Peak 143 ...... 122 Figure 4- 40: Predicted MS2 spectrum of 6H-indolo[6,7-a]pyrrolo[3,4-c]carbazole-6,8(7H)- dione for Peak 208 ...... 124 Figure 4- 41: Predicted MS2 spectrum of candidate 1 and 2 from mass search for Peak 208 126 Figure 4- 42: MS2 spectrum of unknown compound Peak 212 ...... 127 Figure 4- 43: Predicted MS2 of candidate 1 for Peak 212 ...... 128 Figure 4- 44: Predicted MS2 of candidates 2 and 3 for Peak 212 ...... 129 Figure 4- 45: MS2 spectrum of unknown compound Peak 213 ...... 130 Figure 4- 46: Predicted MS2 of candidate 1 for Peak 233 ...... 137 Figure 4- 47: Predicted MS2 of candidate 2 for Peak 233 ...... 137 Figure 4- 48: Predicted MS2 of candidates 3, 4 and 5 for Peak 233 ...... 138 Figure 4- 49: MS2 spectrum of unknown compound Peak 242 ...... 140 Figure 4- 50: Mutagenicity of fraction N-2-8 using three strains without S9 activation……146 Figure 4- 51: Mutagenicity of fraction N-2-8 using three strains with S9 activation……….147

xii

LIST OF TABLES

Table 2- 1: Recoveries of compounds by blue cotton ...... 8 Table 2- 2: found with the blue fibres with the Salmonella strains used ...... 13 Table 2- 3: Substructures involved in mutagenicity...... 23 Table 3- 1: Name, structure and suppliers of the analytical standards used ...... 26 Table 3- 2: Gradient used on preparative columns...... 32 Table 3- 3: Recoveries of the standard compounds in the individual fractionation steps ...... 35 Table 3- 4: Pearson correlation coefficients for the selection of preparative columns ...... 38 Table 3- 5: Capacity factors k and retention times of the 47 standard compounds on the preparative polymeric C18 and phenyl-hexyl columns...... 41 Table 3- 6: Calculated slopes of the raw and recombined sample/fractions for each fractionation step...... 45 Table 4- 1: Ionisation and fragmentation behaviour of the standards in APCI and ESI...... 61 Table 4- 2: Ionisation rules for the presence of functional group in ESI and APCI +/-...... 66 Table 4- 3: Peaks of interest contained in the fraction N-2-8 ...... 69 Table 4- 4: Isotopic composition ...... 71 Table 4- 5: Proposed empirical formulas for Peak 3 ...... 73 Table 4- 6: Determination of empirical formulas for remaining peaks ...... 75 Table 4- 7: Top match candidates for Peak 3...... 78 Table 4- 8: Example ChemSpider hits for Peak 19 ...... 82 Table 4- 9: Top Candidates for Peak 19...... 83 Table 4- 10: Fragments for the Peak 35 ...... 84 Table 4- 11: Candidates for Peak 35 ...... 85 Table 4- 12: Candidates for Peak 36 ...... 87 Table 4- 13: Candidates for Peak 40 ...... 89 Table 4- 14: Candidates for Peak 40 from the mass search ...... 92 Table 4- 15: Candidates for Peak 50 ...... 95 Table 4- 16: Top match candidates for Peak 50 ...... 100 Table 4- 17: Fragments for Peak 60 ...... 101 Table 4- 18: Candidates Peak 60 ...... 102 Table 4- 19: Candidates Peak 99 ...... 106 Table 4- 20: Candidates for Peak 109 ...... 111 Table 4- 21: Fragments for Peak 138 ...... 113 Table 4- 22: Candidates for Peak 138………………………………………………………114 Table 4- 23: Fragments for Peak 143 ...... 122 Table 4- 24: Candidate for Peak 143 ...... 122 Table 4- 25: Candidate for Peak 208 ...... 123 Table 4- 26: Candidates for Peak 208 from the mass search...... 125 Table 4- 27: Candidate for Peak 212 ...... 128 Table 4- 28: Candidate for Peak 213 ...... 131 Table 4- 29: Fragments for Peak 228 ...... 134 Table 4- 30: Fragments for Peak 233 ...... 134 Table 4- 31: Candidates for Peak 233 ...... 135 Table 4- 32: Candidates for Peak 242 ...... 141 Table 5- 1: Calibration standards...... 151 Table 5- 2: Descriptor and CHI values for selection of candidates...... 153 Table 5- 3: Mutagenicity prediction results for the selected candidates for fraction N-2-8 . 158 Table 5- 4: Candidates of interest for the confirmation step of the blue rayon EDA study 161

xiii

LIST OF EQUATIONS

Equation 2-1: Abraham equation…………………………………………………………. 22 Equation 2-2: Relative energies calculation for the mutagenicity prediction ……………..24 Equation 3-1: Mutagenicity baseline……………………………………………………….30 Equation 5-1: Calibration of the C18 column for the determination of the CHI of the unknowns………………………………………………………………..... 152 Equation 5-2: Resulting Abraham equation for the calculation of the CHI of the candidates……………………………………………………………..….. 152

xiv Chapter 1

CHAPTER 1: INTRODUCTION

1.1 Chemicals in freshwater resources

Surface waters are essential for human life, they supply drinking water and are an important resource for agricultural, industrial and recreational activities. However, tonnes of pollutants enter surface waters every year. Domestical, agricultural and industrial effluents contribute significantly to the contamination of surface waters [1]. Industries produce a wide range of chemicals, including pesticides, pharmaceuticals, dyes, solvents, plasticizers, antioxidants and surfactants [2]. In general, industrial wastewaters have high concentrations of organic pollutants. Furthermore, these chemicals exhibit a large diversity of functionality in their molecular structure and thus different properties (e.g. solubility, volatility and lipophilicity) affecting their fate and behaviour in the environment. They also exhibit different acute and chronic toxicity. Wastewater effluents have pronounced toxic effects on aquatic organisms [3-6] at different trophic levels (e.g. algae, plants, crustaceans and fish) [7] and on human consumers of drinking water and fish. Thus, water pollution can result in serious public health and aquatic ecosystem problems [8]. In Europe, the Water Framework Directive (WFD-2000/60/EC), effective from 2000, aims to reduce chemical pollution of surface waters [9]. This directive establishes guidelines for water pollution by prioritising substances that are highly toxic for the aquatic environment. According to the Annex II of the Directive 2008/105/EC, this list contains 33 substances or groups of substances (e.g. biocides, plant protection products, metals, polyaromatic hydrocarbons, polybromated biphenylether). These priority compounds are used to assess the organic contamination of aquatic systems. However, about 60 million compounds are known and registered in the Chemical Abstract Service (CAS). The number of these compounds that are of potential environmental concern is unknown but, it clearly exceeds the number of chemicals that can be regulated and monitored individually. In addition, numerous transformation products further enhance the complexity of the chemical mixtures present in water. Organic pollutants can react in aquatic environment, for example through reactions involving hydrolysis, photodegradation, oxidation, or metabolisation. These transformation reactions lead to the creation of numerous transformation products that in most cases nothing is known. Furthermore, some of the transformation products can be more persistent and/or more toxic than their parent compounds [10]. This may be highlighted by the highly mutagenic phenylbenzotriazoles that form from dinitrophenyl azo dye during the industrial process and disinfection at sewage plants [11,12]. It is also possible that genotoxic compounds can be generated by the interaction of chlorine and organic components that are naturally present in surface waters (e.g. fulvic and humic acids) [13]. Thus, it is evident that the impact assessment of chemicals on the water system cannot solely be done on the basis of the priority substances. The identification of toxic compounds in complex chemical mixtures is a new challenge in environmental analytical chemistry that must be addressed.

1 Chapter 1

1.2 Importance of planar mutagens in the environment

A wide range of industrial effluents that enter surface waters have been associated with mutagenic effects, including those from organic chemical manufacturers, metal refining operations, dye manufacturers, petroleum refineries and pulp and paper mills [5,14]. Thus, humans, terrestrial organisms and aquatic organisms are exposed to a large variety of naturally-occurring and man-made mutagens. Amongst these, nitrosamines (e.g. N- nitrosodymethylamine) [15] and polycyclic aromatic compounds [8,16] are classes of chemicals well known to include potent mutagens. The abundant presence of polycyclic aromatic compounds with few rings in the environment is cause for concern. Several heterocyclic amines, which are emitted into the atmosphere through combustion of various materials such as wood, grass, garbage and petroleum, as well as into the terrestrial and aquatic environments through domestic and industrial waste [17], are identified as mutagens [17,18]. Nitro-aromatic compounds are another group of chemicals that include potent mutagens such as nitropyrenes and nitrobenzenes. Their use in pharmaceuticals, dyes, food additives, pesticides, and explosives result in their wide distribution in the environment [19]. Monitoring the effects of organic environmental pollutants is an on-going challenge for researchers. A large battery of bioassays is now available to assist in this task. Compounds contributing to surface water mutagenicity are often present at very low concentrations and their presence can vary temporally [16,20,21]. Furthermore, trace mutagens in water co-exist with a multitude of other chemicals. These may be present at greater concentration masking mutagenic effects (e.g. by reducing bioavailability or by cytotoxicity) [22]. Thus, passive samplers or extraction devices that are selective for specific groups of mutagens are required to assess mutagenicity using bioassays and subsequent chemical analysis [13,23].

1.3 Objective

The overall objective of the present thesis was the development of an integrated approach to identify mutagens in contaminated river waters. The approach was tested at the a site on the River Elbe that is downstream from the industrial area of Pardubice (Prelouc, Czech Republic) where dye, explosive and pharmaceutical industries release effluents. To fulfil this objective, the present thesis will present:

(i) The scientific background of the approach and the individual tools used in the development and application of the different methods (Chapter 2).

(ii) The establishment and application of integrated tools for the sampling and isolation of planar polycyclic mutagens from surface waters and development of an effective fractionation procedure for these compounds (Chapter 3). Blue rayon passive sampling, ion exchange solid-phase extraction, RP-LC fractionation and Ames testing for mutagenicity were the major tool utilised.

(iii) The development of analytical tools for the analysis of planar mutagens using an LTQ-Orbitrap, with the aim of providing a list of candidate compounds in mutagenic fractions (Chapter 4). The optimisation of the LC-MS method using chemical standard is described

2 Chapter 1 followed by two procedures to identify contaminants present in fractions N-2-8 and N-7-12. The first approach was target identification, using a chemical library composed by the chemical standards used for methods development and compounds produced in the sampling site. The second approach was identification of unknowns based on fragmentation studies of the standards by ESI and APC using search and fragmentation prediction in ChemSpider/PubChem.

(iv) The development of a chemical-analytical approach to prioritise and decrease the number of candidates (Chapter 5). The chemical method used chromatographic retention prediction of candidate structures in comparison to measured retention of the analytes. The biological approach was based on nitrenium “theory” energy to predict the mutagenicity of the candidates.

Finally, this thesis concludes with the discussion of future research directions for the field.

3

Chapter 2

CHAPTER 2: SCIENTIFIC BACKGROUND

2.1 Effect-directed analysis

One of the difficulties encountered in understanding the biological effects of exposure to organic pollutants is the multitude of contaminants at trace concentrations that combined with numerous natural compounds that are present in surface waters. Although chemical analytical techniques have advanced detection methods for most target analytes in trace concentrations, major challenges cannot be addressed by analytical chemistry alone. The complexity of environmental samples which typically contain tens of thousands of compounds precludes the analysis of all individual compounds. Also, the need for understanding the how concentrations of target analytes affect the hazards and risks of the complex mixture. To address these limitations, bioassays can be utilised to detect and quantify the effects of complex mixtures. However, they provide no or very limited information on the individual substances causing the effects. Thus, only the integration of chemical-analytical and bioanalytical techniques allow the identification of which components of a mixture significantly contribute to the toxic effects. This requires effect-directed complexity reduction by combined fractionation and biotesting aiming at an elimination of non-toxic fractions from the sample. Such combinations of chemical and biological methods are becoming more common (as seen in [24] with about 10 publications in 1990 and over 300 in 2008). Effect-directed analysis (EDA) is the most prominent non-target approach to identify toxicants in the environment and has been applied since the late 1970s [5]. The method has been improved significantly. However, EDA is still far from being applicable in routine monitoring and success rates are still well below what is required. Therefore, the development and integration of novel promising analytical and prediction tools into EDA is urgently needed. Since its development EDA has been applied to complex mixtures from different pools of aquatic systems [25-30]. The principle of EDA is presented in Figure 2-1 [31]. EDA involves stepwise fractionation procedures to separate the compounds according to their physico-chemical properties (e.g. hydrophobicity, planarity, polarity, molecular size and presence of functional groups). At each fractionation step, bioassays identify the active fractions. The non-active fractions can then be discarded from further processing, enabling the prioritisation of the most active fractions. These fractionation procedures can involve as many steps as necessary to simplify the fractions for chemical analysis and structure elucidation, which are increasingly integrating predictive computer tools. The most commonly used techniques for the identification step are chromatographic methods coupled with mass spectrometry. Nuclear magnetic resonance (NMR) or infra-red (IR) are often not suitable for the analysis in EDA as the amount and purity of compounds are often insufficient [22]. Finally, the tentatively identified compounds need to be confirmed as the cause of toxicity. This includes the confirmation of the molecular structure e.g. by comparison with analytical standards and the confirmation of the contribution to toxic effects [22].

5 Chapter 2

Complex mixture

Biological analysis

Chemical analysis

Biological Fractionation analysis

Responsible toxicant(s)

Figure 2- 1: Scheme of effect-directed analysis (redrawn from [31])

2.2 Sampling with blue rayon

According to Hayatsu [32], the discovery of copper phthalocyaninne trisulphonate (CPT) and its ability to adsorb planar compounds was unexpected. Indeed, while they were working on mutagenesis modulators using the Salmonella bioassays, they fofound that hemin and other porphyrins had the ability to inhibit the mutagenicity of polycyclic aromatic compounds [16,32]. Copper phthalocyanine derivatives are among the blue pigments used as dyes. This pigment has a similar structure to hemin (Fiigure 2-2). Hayatsu and co-workers already knew that hemin interacted with planar structures to form a complex. Thus, they investigated if CPT could be used as a ligand to specifically adsorb planar compounds.

Hemin Copper phthalocyanine

Figure 2- 2: Structures of hemin and copper phthalocyanine

Phthalocyanines are blue and green pigments with commercial vallues suitable for many different supports ranging from cellulose to glass beads [16]. Cotton was firstly chosen as a support because of easy manipulation as well as the ability to swell in water or other liquids, allowing efficient contact with molecules in solution. C.I. Reacttive Blue 21 (a commercial dye, see Figure 2-3) was selected to prepare the blue cotton because of its availability and its ability to react with hydroxyl groups to form a stable etheer linkage [16]. It is strongly hydrophilic due to the three sulphonate substituents on the ring. The reaction between C.I. Reactive Blue 21 and cellulose (cotton) occurs under alkaline conditions. One sulphonic group condensates with one hydroxyl group of the cellulose to form CPT, with a

6 Chapter 2 loss of sodium bisulphate. This reaction is illustrated in Figure 2-3. One gram of dry cotton contains 10 µmol of CPT.

SO 3 Na SO 3 Na

N N N N N SO 3 Na N SO Na - 3 - Na O 3S N Cu N + Cellulose - OH Na O 3S N Cu N N N N N N N NaHSO4

SO 2NH SO -CH -CH -OSO Na SO 2NH 2 2 2 3 SO2-CH2-CH2-O-Cellulose C.I Reactive Blue 21 Copper phthalocyanine trisulsulphonate-fibres

Figure 2- 3: Synthesis of blue fibres (redrawn from [16])

In order to increase the concentration of CPT in the support and therefore the number of binding sites for contaminants, amorphous rayon fibres, which is an artificial fibre, was used to carry the CPT ligands. Rayon fibres can be efficiently washed with solvents. With the same synthesis process as blue cotton, about three more CPT was bound to the rayon than bound to the cotton (30 µmol of CPT /g of dry rayon). More recently, blue chitin [16], which is the powder version of the blue rayon, was developed. Chitin is a poly-N-acetylglucosamine and can be stained with copper phthalocyanine trisulphonate (CPT) pigments in its powder form [16] and has an even greater concentration of CPT than blue rayon (40 µmol of CPT /g of powder). Blue chitin is suitable for preparing a packed column [33]. CPT complexes are very stable and cannot be released without the use of solvent mixtures [34]. When used in situ, the adsorption process is dependent on concentration and the velocity of the water stream [35,36]. The pigment itself is not sensitive to light and temperature, but the adsorbed compounds can be, and care has to be taken following exposure. Hayatsu [16] tested the recoveries of planar compounds with different numbers of aromatic rings. In general, recoveries are good (between 60 and 100 %) for compounds with three or more fused aromatic rings, decreasing with decreasing numbers of fused rings as seen in Table 2-1. CPT, shown in Figure 2-3, has a large planar structure and the mechanism of adsorption to the blue fibre is directly related to the planarity of compounds. Planar chemicals in solution form a 1:1 complex with CPT based on strong π-π interactions between the extensive heterocyclic aromatic system of CPT and the aromatic ligands. The increasing polarisability of the ligands due to an increasing aromatic system or specific substituents results in greater dispersion forces (dipole-induced dipole interactions) and thus greater stability of the complexes. However, the stabilisation effect of substituents can be disturbed if coplanar interaction is hindered [37]. Thus, large aromatic compounds will form more stable complexes with CPT than small aromatic compounds. However, there are some exceptions, in which the geometry of the substituents may be involved. For instance, actinomycine D and carminic acid, which both contain three fused rings, have poor recoveries in comparison with other compounds with three fused rings. This can be explained by the presence of non-planar substituents causing a steric distortion of the aromatic , decreasing the π-π interactions and thus the stability of the complex [37]. In contrast, 2-amino-1-methyl-6- phenylimidazo[4,5-b]pyridine and quercetin have high recoveries (95 %) despite containing

7 Chapter 2 only two aromatic rings in their structure. In this case, the aromatic ring substituent is directly linked and conjugated to the two ring structure, such that the size of the planar molecule is comparable to a three ring system [16]. Therefore, blue fibres are specifically able to adsorb planar compounds, with the best recoveries expected for compounds with more than three fused rings.

Table 2- 1: recoveries of compounds by blue cotton [16] Number Example (recovery) Example exceptions Recovery of rings (from-to %) ≥ 5 Benzo(a) (95%) 85 -95

4 1-nitro-pyrene (80%) 80-100

NO2 3 2-amino-3- Actinomycine D (25%) 60 - 95 methylimidazo[4,5-f] H3C CH3 H3C CH3 N N NH HN quinolone (90%) H H H3C O O CH3 N O ON O O O CH3 CH3 O N N N N H3C O O O O CH3 H2N HON O NH H3C CH3 H3C CH3

N N NH2 H C 3 O O

CH3 CH3 Carminic acid (40%) OH OH O CH3 O OH O

HO OH OH HO OH O OH 2 Adenine (10%) 2-amino-1-methyl-6-phenylimidazo[4,5-b] 0 - 35 H NH2 pyridine (95%) N N CH3 N N N NH2 N N Quercetin (95%) OH HO

O OH

HO O OH 1 Histidine (2%) 0 - 35 N O N HO NH2H

0 Nitrosodimethylamine 0 - 20 (0%) O N N CH3 H3C

8 Chapter 2

2.2.1 Using Blue fibre Passive Sampler 2.2.1.1 Cleaning process of the blue fibres

Blue fibres contain a small amount of non-bound pigment that remains in the material even after exhaustive washing [13,16]. Thus, the fibres must be cleaned before the first use or re-use and the cleaning process shouuld be followed by a mutagenicity test prior to use. Hayatsu [16] recommended washing the blue cotton with a //aammonia solution and methanol, and Kummrow et al. [13] proposed a full protocol to clean any blue rayon sample, based on the same solvents. Such protocols could be extended to any type of blue fibres. The washing procedure described by Kummrow et al. [13] was efficient for the purpoose of this work, and 93% of the blue rayon fibres washed once showed negative results for thhe Salmonella typhimurium (Ames) test. A second washing procedure was performed when necessary and this additional washing step was 100% effective to clean blue rayon [13].

2.2.1.2 How to use the blue fibres

The blue fibres can be used either as passive sampler (hanging method [38]) or as a sorbent for solid phase extraction (drop in flask [34] or packed in column [39]).

- The hanging method The hanging method was first described by Sakamoto and Hayatsu in 1990 [38] (Figure 2-4). The blue rayon (BR) is placed in plastic nets linked to a flooating boardr . It is then deployed in the river and removed 24 hours later. The blue rayon from the nets is rinsed with distilled water to remove all solid material and non-adsorbed compoundds, then dried in paper towels and stored in glass bottles until extraction in the laboratory. This method is commonly used to sample surface watere s [11,12,40-51].

Figure 2- 4: Hanging method (redrawn from [38])

9 Chapter 2

- Solid phase extraction The blue fibres can also be used as a sorbent for solid phase extraction to concentrate the planar compounds from sampled waters. They can be used in two different ways:

(i) Drop in flask: water is collected and bottled at the sampling site. Fibres packed in a plastic net are immersed in the bottle (e.g. 1 g of BR/L of sampled water [34]), which is manually shaken for 30 min. The net is then taken out of the bottle, the fibres are rinsed with distilled water, dried by blotting the fibres onto a paper towel and kept for extraction in the laboratory. This method can be used in the field [34] to avoid transporting samples, or in the laboratory [52].

(ii) Fibres in column: the blue rayon/cotton is packed in glass columns. Water samples are collected and passed through the column. The recommended ratio of BR (g) per litre of sample is from 0.2 to 0.5 g for treated water and 1 g per litre for raw water [13]. The flow rates ranged from 10 mL/min [53] to 50 mL/min at room temperature [13,39,54,55]. After passing the sample through the column, the blue rayon is removed from the glass column, washed several times with distilled water to remove suspended matter and non-adsorbed compounds [13,39,54], dried with a clean paper towel and kept for extraction. The blue rayon can also be kept in the column [55], rinsed with distilled water and dried using airflow.

2.2.1.3 Hanging method vs. solid phase extraction method

In general, there are advantages and disadvantages to both passive and active sampling methods. Drawbacks to traditional water sampling methods are: (i) insufficient volume of water to satisfy the detection limit requirement of analytical methods, (ii) the water sample represents only the contaminants present at the time of sampling, such that episodic events are often missed, (iii) the need of repetitive sampling in order to estimate the time-weighted average (TWA) concentration of the pollutant of interest, which are essential in risk assessment of chemical stressors, and (iv) repetitive sampling can be physically, logistically and financially difficult [56]. However, these shortcomings are overcome by the use of passive sampling methods, that have the advantages of: (i) devices used are generally simple and reliable, (ii) low cost as only one device is necessary at a given sampling location, (iii) provide TWA concentrations of the contaminants of interest, (iv) concentrating ultra-trace levels of contaminants, enabling sufficient sampling requirements for the detection limit of the analytical methods and (v) no large volumes of water have to be transported. The main drawbacks of the passive samplers are that: (i) they do not enable the calculation of toxicant concentration in water, as it is not possible to calculate the amount of water sampled, and hence provide only semi quantitative concentrations (e.g. BR concentrations are given in the form of g of chemical/g of BR [30]), (ii) to predict TWA concentrations they require calibration at appropriate temperature ranges that will be encountered in the field deployment for each targeted toxicant [57], and (iii) the uptake rate can be disturbed by water turbulences, water temperature and biofouling [21]. In effect-directed analysis (EDA) studies, both sampling methods can be used as long as there is enough material for fractionation, biotest and then identification of the chemicals responsible for the toxic effects. In the case of BR, the hanging method has the advantages of passive sampling over conventional sampling (water collection and extraction). As the blue

10 Chapter 2 fibres are placed for a longer period of time (24 hours), the analytes are accumulated. The result is time-weighted average concentration, which eliminates the risk of non-detection due to variations within 24 hours. Furthermore this technique proved to be an efficient, reliable, easy and convenient method to concentrate polyaromatic mutagens in water and thus was chosen to collect the sample used in this thesis work.

2.2.1.4 Extraction of the chemicals adsorbed on blue fibres.

Compounds adsorbed to the blue fibres, either from the field or from columns, can be recovered by elution with hydrophilic organic solvents, most effectively with methanol/ mixtures (50:1, v/v). Ammonia may help to dissociate the complex by coordinating itself to the central metal of the ligand [16]. The pieces of blue fibres are immersed in this methanolic mixture and agitated for at least 30 minutes, then repeated with a fresh methanolic mixture. The extracts are then combined and evaporated.

2.3 Mutagenicity testing of surface waters 2.3.1 Ames bioassay

There are many tests for detecting the mutagenicity of surface waters, and bioassays with bacteria have proven to be very effective for monitoring due to their sensitivity, low cost, reliability and performance within a short time [8]. Amongst these bacteria assays, the Salmonella mutagenicity test has been the most widely used to screen mutagenicity in waters [8,14,53,58-60]. The Salmonella mutagenicity test was developed by Ames in 1975 [61] and its use as well as the number of bacteria strains has been growing ever since. It is based on strains of Salmonella typhimurium that carry mutations in the genes involved in histidine synthesis and thus require histidine for growth. In the presence of mutagenic agents, reversions back to the non-auxotrophic state can occur resulting in growth in histidine- deficient medium, while non-mutated cells do not grow [61]. Each strain contains a different type of mutation in the histidine functional unit of the bacteria genome [61,62]. DNA mutation can occur via several mechanisms. Compounds can react with the DNA, resulting in the creation of DNA adducts or base deletion, which distort the DNA structure [63]. This distortion can disrupt enzymatic DNA repair and replication, increasing the chance of erroneous base replacements, deletions or insertion of base pairs [63]. There are two types of mutations in the Salmonella strains. Firstly, in base-pair substitutions only a single base of the DNA sequence is modified, such as, the replacement of leucine (GAG/CTC) by proline (GGG/CCC). Secondly, frameshift mutations occur where a few bases are inserted or deleted. Mutation will affect the reading frame of a nearby repetitive C-G-C-G- C-G-C-G sequence [64]. The strain TA100 is specially constructed to detect base pair mutations while TA98 is sensitive to frameshift mutations. Some chemicals (e.g. nitro-compounds and aromatic amines) are biologically inactive and need to be metabolised to form active compounds (e.g. by cytochrome P450). Thus, to make the Ames test sensitive to these indirect mutagens, external metabolic activation was introduced using S9 mix, which contains liver homogenates from rats [61].

11 Chapter 2

2.3.2 Different strains of Salmonella and their use

Within the last decade, new strains of Salmonella were engineered on the basis of TA98 and TA100 which are particularly sensitive to specific groups of mutagens with certain functional groups in their molecular structure (e.g. nitro- and amino-groups). The aim of the new engineered strains, such as YG-type strains, is to gain information regarding the relationship between the presence of functional groups and mutagenicity. For instance some nitroarenes and aromatic amines are highly mutagenic and are widely distributed in the environment due to their common use in many industries (e.g. dyes used in textile, paper, plastic, pharmaceuticals, cosmetics and food) [65]. These chemicals require several metabolic steps to become mutagenic [65-68]. The mechanisms of activation of nitroarenes and aromatic amines are presented in Figure 2-5 and Figure 2-6, respectively. The initial step is the N- reduction (nitroarenes) or N-oxidation (aromatic amines) to the corresponding N- hydroxylamine, which undergoes N-O bond cleavage to a reactive electrophilic nitrenium ion intermediate [65,67,69]. This is a di-coordinate intermediate with a single pair of electrons and a positive charge on the nitrogen [67]. This nitrenium is the highly reactive intermediate able to covalently bind and damage the DNA. The YG-type strains over-express nitroreductase- and/or O-acetyltransferase, resulting in an increase of metabolisation to the DNA-reactive form and, thus increased mutagenicity.

- Nitroreductase OH - OH ArN + ArNO 2 ArN H H

Mutations DNA adducts

Figure 2- 5: Mechanism of nitroarenes activation

Cytochrome P450 OH O-acetyltransferase O-CO-CH3 ArN ArN ArNH2 H H Non enzymatic loss of acetate + Mutations DNA adducts ArN H Figure 2- 6: Mechanism of aromatic amines activation [69]

The YG-type strains are all derivatives of TA98 or TA100. YG1021 is derived from TA98, has elevated nitroreductase activity and is highly sensitive to nitroarenes [68]. YG1024 and YG1029 were derived from TA98 and TA100 respectively. Both strains have elevated O- acetyltransferase activity, increasing their sensitivity to aromatic amines [68]. YG1041 and YG1042 are also derivatives of TA98 and TA100, respectively and have enhanced levels of nitroreductase and O-acetyltransferase activity and are highly sensitive to nitroarenes and aromatic amines [70]. Thus, significantly higher mutagenicity in these YG-type strains in comparison to TA98 and TA100 is expected where aromatic amines and/or nitroarenes are present [8]. This can be observed in Table 2-2, where examples of identified chemicals using

12 Chapter 2 the BR sampler and Salmonella strains (TA- and YG-types) are presented. The YG-type strains are much more sensitive than TA-type strains as soon as aromatic amines and/or nitroarenes are involved in the mutagenicity. The number of revertants is always much greater in the YG-type strains than it is in the TA-type strains with and without the S9 mix. Thus, the use of these specific strains can provide information regarding the presence or absence of nitroarenes and/or aromatic amines. The second observation concerning the different strains is that base pair substitution strains (TA100, YG1029 and YG1042) are not as sensitive as the frameshift strains (TA98, YG1024 and YG1041), inducing less revertants than the frameshift strains.

Table 2- 2: Molecules found with the blue rayon/cotton or blue chitin with the Salmonella strains used. # no CAS Number present in databases. – no mutagenicity detected Molecules identified and involved in Salmonella strains Mutagenicity Reference the mutagenicity of a BR extract (number of [CAS Number] revertants) 2-amino-3-methylimidazo[4,5- TA98 + S9 (1 g BR) From 161 to 890 [42] f]quinoline (IQ) [76180-96-6] YG1024 + S9 (0.2 g BR) From 502 to 1775 NH2 TA100 + S9 (1 g BR) - N YG1029 + S9 (0.2 g BR) - N CH3

N 2-amino-9H-pyrido[2,3-b]indole (AαC) [26148-68-5]

NH2 N These three N identified H compounds 1,4-dimethyl-5H-pyrido[4,3-b]indol-3- accounted for amine (Trp-P-1) [62450-06-0] 26% of the total H CH3 mutagenicity of N the BR sample NH2 N

H3C 1,4-dimethyl-5H-pyrido[4,3-b]indol-3- TA100 – S9 - amine (Trp-P-1) TA100 + S9 - [47] YG1024 – S9 - 1-methyl-5H-pyrido[4,3-b]indol-3- YG1024 + S9 From 0 to 4571 amine (Trp-P-2) YG1041 – S9 From 1297 to H YG1041 + S9 3278 N YG1042 – S9 From 0 to 3185 NH2 YG1042 + S9 From 0 to 1731 N -

H3C

13 Chapter 2

Molecules identified and involved in Salmonella strains Mutagenicity Reference the mutagenicity of a BR extract (number of [CAS Number] revertants) 2-[2-(acetylamino)-4-[bis(2- TA98 + S9 7300 [71] methoxyethyl)amino]-5- TA100 + S9 223 methoxyphenyl]- 5-amino-7-bromo-4- chloro-2H-benzotriazole (PBTA-1) H3C O Cl HN OCH3 H2N N N N N OCH Br 3 2-[2-(acetylamino)-4-[N-(2- TA98 + S9 8600 [71] cyanoethyl)ethylamino]-5- TA100 + S9 307 methoxyphenyl]- 5-amino-7-bromo-4- chloro-2H-benzotriazole (PBTA-2) TA98 + S9 93000 [11] [215245-16-2] TA100 + S9 520 H3C YG1024 + S9 3200000 Cl O YG1029 + S9 39000 H N 2 N CH3 N N N Br HN O N H3C 2-[2-(acetylamino)-4-[(2- TA98 + S9 81000 [12] hydroxyethyl)amino]-5- YG1024 + S9 3000000 methoxyphenyl]-5-amino-7-bromo-4- chloro-2H-benzotriazole (PBTA-3) # H3C O Cl HN H2N N H N N N OH Br 2-[2-(acetylamino)-4-amino-5- TA98 + S9 19000 [43] methoxyphenyl]-5-amino-7-bromo-4- TA100 + S9 310 chloro-2H-benzotriazole (PBTA-4) # YG1024 + S9 7800000 H3C YG1029 + S9 15000 O Cl HN H2N N

N NH2 N

Br 2-[4-[bis(2-acetoxyethyl)amino]-2- TA98 + S9 56000 [48] (acetylamino)-5 -methoxyphenyl] -5- TA100 + S9 387 amino-7-bromo-4-chloro-2H- YG1024 + S9 723000 benzotriazole (PBTA-5) # YG1029 + S9 1580 H3C O CH3 Cl HN O H2N N O N N N O O Br CH3

14 Chapter 2

Molecules identified and involved in Salmonella strains Mutagenicity Reference the mutagenicity of a BR extract (number of [CAS Number] revertants) 2-[2-(acetylamino)-4-[bis(2- TA98 + S9 17900 [48] hydroxyethyl)-amino]-5-methoxy- TA100 + S9 146 phenyl]-5-amino-7-bromo-4-chloro- YG1024 + S9 485000 2H-benzotriazole (PBTA-6) # YG1029 + S9 953 H3C O Cl HN OH H2N N N N N OH Br 2-[2-(acetylamino)-4-(diethylamino)-5- TA98 + S9 43000 [50] methoxyphenyl]-6-amino-4-bromo-7- TA100 + S9 393 chloro-2H-benzotriazole (PBTA-7) # YG1024 + S9 1430000 H3C YG1029 + S9 648 O Cl HN H N 2 N CH3 N N

N CH3

Br 2-[2-(acetylamino)-4-[(2- TA98 + S9 1060 [49] hydroxyethyl)amino]-5- TA100 + S9 36 methoxyphenyl]-6-amino-4-bromo-2H- YG1024 + S9 159000 benzotriazole (non-Cl-PBTA-3) # YG1029 + S9 640 H3C O HN H2N N H N N N OH Br 2-[2-(acetylamino)-4-(diethylamino)-5- TA98 + S9 1520 [49] methoxyphenyl]-6-amino-4-bromo-2H- TA100 + S9 512 benzotriazole (non-Cl-PBTA-7) # YG1024 + S9 178000 H3C YG1029 + S9 1100 O HN H N 2 N CH3 N N

N CH3

Br 4-amino-3,3'-dichloro-5,4'- TA98 - S9 66000 [46] dinitrobiphenyl# TA98 + S9 1700 TA100 - S9 880 O2N TA100 + S9 90 YG1024 - S9 140000 H N NO YG1024 + S9 2900 2 2 YG1021 - S9 7700 YG1021 + S9 3000 Cl Cl

15 Chapter 2

Molecules identified and involved in Salmonella strains Mutagenicity Reference the mutagenicity of a BR extract (number of [CAS Number] revertants) 3,3’-dichlorobenzidine [1194-65-6] YG1024 - S9 266000 [30] Cl YG1024 + S9 189000 YG1029 - S9 2100 YG1029 + S9 25300 H2N NH2

Cl 4,4’-diamino-3,3’-dichloro-5- TA98 + S9 8700 [45] nitrobiphenyl TA100 + S9 100 O2N YG1024 + S9 24200 YG1029 + S9 900

H2N NH2

Cl Cl

2.4 Separation in EDA studies

In general, separation in EDA is an important step as it is used to isolate compounds within different fractions, but also provides information that can be used during analytical identification. In order to sequentially reduce the complexity of environmental samples by the removal of non-toxic substances, fractionation methods are used. They are based on different physico-chemical properties of the analyte including: polarity, hydrophobicity, ability of protonation/deprotonation, molecular size, planarity and the presence of functional groups [72]. The fractionation provides information on the physical-chemical properties of the compounds present in the sample. Such information is then used during the identification procedure of the EDA study. In most cases, aqueous samples are fractionated using reversed-phase high- performance liquid chromatography (RP-HPLC) to simplify the samples [72]. RP-HPLC separation is predominantly based on lipophilicity, expressed by the octanol-water partitioning coefficient (log Kow). Most of RP-HPLC applications are carried out with octadecyl silica (ODS) columns [12,30,46,49,73,74]. However, numerous new stationary phases with other functionalities have been developed and provide different selectivity to the traditional C18 and C8 stationary phases [73,75,76]. For instance, in addition to lipophilicity, amino- phases provide acid-base interaction with analytes, cyano- phases provide dipolar interactions, phenyl- phases provide π-π interactions, and nitro- phases provide strong dipolar interactions [75]. Thus, the different physical-chemical properties of analytes, such as size and electron density of aromatic systems, intramolecular steric hindrance or bonds can be exploited to reach the maximum separation between the different compounds in water [77]. The main disadvantage of RP-HPLC fractionation is the requirement of additional solid phase extraction (SPE) to concentrate the chemicals for analysis or biotesting. The main advantage of using RP-HPLC fractionation method is it provides information about lipophilicity of the toxic analyte [72].

Log Kow represents a good model for retention of neutral compound on C18 columns. A correlation between the retention time and the log Kow can be used to determine the log Kow range of each fraction and thus, assign a log Kow range for peaks within the fraction of

16 Chapter 2

interest. During the identification procedure, the log Kow of each candidate can be calculated by EPISuite [78] and compared with the determined range of the fraction. Thus, candidates can be eliminated or kept depending if their log Kow is within the defined log Kow range of the fraction. However, in the case of complex molecules, for which the presence of multiple functional groups can involve additional interactions to the lipophilicity, the prediction based on the log Kow approach can be less accurate. To address to this limitation another tool, based on the Abraham equation [79], can be used and is further described below, Section 2.7.1.

Alternatively or in addition, chemicals in aqueous samples can be separated into acid, base and neutral fractions, using ion-exchange separation techniques. This separation procedure is based on the ability of chemicals to be protonated and deprotonated according to the solution pH. This ability is given by the acid dissociation constant of the analytes (Ka, typically expressed by the logarithm form pKa). Thus, in solution at low pH, basic compounds are protonated, while neutral and acidic compounds remain non-charged. In high pH solutions, acidic compounds are deprotonated while neutral and basic compounds are not charged. By adjusting the pH of the solution, chemicals are protonated, deprotonated or neutral and can be separated by charge. Laven et al. [80] used serial mixed-mode cation- and anion-exchange solid-phase extraction to separate basic, neutral and acidic pharmaceuticals in wastewater. The two different polymeric phases used in SPE contain sulphonate anion functional groups to selectively interact with the protonated basic compounds at pH 2 or quaternary cation functional groups to selectively interact with deprotonated acid at pH 11. By the combination of these two ion-exchange phases the three classes of pharmaceutical were successfully separated.

2.5 Analytical methods in EDA studies

In general, identification of unknowns relies mainly on information gained by chromatography connected to mass spectrometry. Gas chromatography-mass spectrometry methods (GC-MS) have largely contributed to the characterisation of non-polar contaminants in different matrices and are often used in EDA studies [4,26,81,82]. This is due to the availability, ease of use and reproductibility of the method. This method presents also the advantage of a hard ionisation (e.g. electron ionisation) enabling the production of numerous fragments and/or easily identifiable mass spectra [83]. Large spectral databases (e.g. NIST) are available and suitable to identify pollutants. The major disadvantage of GC-MS methods is the limited range of compounds that can be analysed. Only compounds that vaporize below about 300°C and are stable up to this temperature can be successfully detected, eliminating those with low volatility, thermal instability, or high polarity [84]. Liquid chromatography-mass spectrometry (LC-MS) methods are becoming increasingly popular as they extend the investigation of water contaminants to non-volatile, polar and highly polar, and thermally labile compounds. This has allowed the detection of chemicals that were not routinely detected historically (e.g. pharmaceuticals, pesticides and personal care products) [85]. Many more compound classes can be determined by LC-MS than GC-MS because of the broad scope of high pressure liquid chromatography (HPLC), including reversed phase, normal phase, and ion exchange [84]. This variety of stationary

17 Chapter 2 phases allows for selective, efficient separation without derivatisation. For instance, the use of different stationary phases in combination proved to be efficient in separing polyaromatic compounds into distinct compound groups according to polarity, planarity and number of aromatic rings, in one HPLC run [77]. LC-MS/MS analyses are performed most commonly with (i) electrospray (ESI), an ionisation process that uses electrical fields to generate charged droplets and subsequently generate analyte ions by evaporation and (ii) atmospheric pressure chemical ionisation (APCI), which is a gas phase chemical ionisation process where the evaporating solvent acts as the chemical reagent to ionise the analytes [86]. As a simple rule, analytes that can be ionised in solution are more suitable for ESI methods, while other analytes are more suitable for APCI [87,88]. Complementary ionisation techniques can be used to cover the widest range of identified compounds. In comparison with electron ionisation, ESI and APCI are considered as soft ionisation techniques and generally only provide information about the molecular weight of substances. However, this can be useful, especially with tandem high accuracy mass spectrometry. This method provides the exact molecular weight of the compounds (MS) and the exact mass of fragments (MSn) and thus information regarding the substructure of the compounds. LC coupled to high resolution mass spectrometry is an efficient technique for identification of pollutants in water. There are different instruments that provide high resolution mass spectra including time-of-flight (TOF), quadrupole TOF and Orbitrap. Of particular interest, the LTQ-Orbitrap MSMS combines the tandem mass spectrometry capability of the ion trap with the high resolution and mass accuracy capability of the Orbitrap. This combination provides high- accuracy mass measurements of the precursor ions and their product ions (MSn), resulting in the determination of the m/z of an ion within 5 ppm and covers a wide mass range for detection [89]. However this technique suffers from the lack of spectral libraries as the libraries existing for GC-MS techniques and identification by LC-MS methods requires strategies and the assistance of computer programs.

2.6 Strategies for identification of toxicants

When analysing a mixture that contains known and unknown compounds at low concentrations in complex samples, LC coupled to high resolution MS has opened new perspectives in chemical identification. In recent years, several strategies were developed to identify target and non-target pollutants in environmental samples. There are three types of analytical approaches: (i) target analysis based on analytical standards, (ii) suspects screening without reference standards, and (iii) non-target analysis for identification of unknowns without reference standards [90]. A workflow for these approaches is presented in Figure 2-7.

18 Chapter 2

Sampling, extraction, purification

HPLC separation and high resolution mass spectrometry

Target analysis Suspect screening Non-target screening with reference with reference without reference standards standards standards

Target ion list Suspect ion list

Automated exact mass Exact mass filtering Exact mass filtering filtering and peak detection

Non-target ion list

Matching of measured Elemental formula fit by and theoretical isotope heuristic filtering of pattern of suspects molecular formula

Structure search in database

Matching of measured Rt Matching of measured Rt Matching of measured Rt with predicted Rt of with Rt of standards with predicted Rt of suspects database hits

Matching of measured and Matching of measured and Matching of measured and predicted MS/MS with that of predicted MS/MS predicted MS/MS reference standards fragmentation of suspects fragmentation of database hits

Quantification of targets List of likely present suspects List of likely present unknowns

Figure 2- 7: Workflows for the three identification procedures of pollutants in environmental samples by using LC-high resolution-MS (redrawn from [90]).

2.6.1 Target analysis approach

This approach is designed to identify known compounds that are present in samples. It relies on the use of reference standards for which exact mass, retention time and MSn fragments are known (Figure 2-7). The identification criteria for confirmation of a target analysis approach using HRMS and an Orbitrap have been discussed by Hogenboom et al. [91] and are: (i) LC retention time, (ii) accurate mass of the precursor ion (using a resolving power of 100000) and, (iii) fragmentation pattern. Full-scan accurate mass measurements of the analytes are compared with theoretical exact masses of standard references. Confirmation of the identity is done by comparing the retention time and the fragmentation pattern of the analyte to the reference standard. Most of the target analysis approaches are applied to water analysis of pharmaceuticals [92] and required a self-created accurate mass database. Such methodology allows the determination of a large number of compounds in a single analysis, provides information about the presence of contaminants in the environment [93] and can be applied in routine analysis [92]. The main drawback of this approach is the limitation of the database due to the lack of environmental standard references.

19 Chapter 2

2.6.2 Suspect compounds screening approach

In contrast to target analysis, the suspects screening approach does not rely on reference standards, but on suspect compounds for which specific information, such as molecular formula and structure are available. Suspect-compounds can include transformation products (TPs) and/or industrial produced compounds that are expected in waste water effluents. This approach relies on exact mass filtering followed by the match of predicted parameters of the suspected compounds with the parameters of the unknown compounds to identify (Figure 2-7). One parameter involved in this approach is retention time in RP-LC n often using hydrophobicity (log Kow) as a retention predictor. The other is MS fragmentation interpretation and prediction [94]. Both approaches are described in more detail in Section 2.4 n (log Kow) and in Section 2.6.3 (MS prediction). First, the empirical formula allows the calculation of an exact m/z of the suspected ion, which is then extracted from the high resolution full-scan chromatogram. Second, in the case of a positive match, predicted retention time is compared to the retention time of the unknown, and third, in the case of a positive match, the predicted MSn fragments are compared with the unknown MSn fragments. This approach was used by Kern et al. [94], who developed a procedure to screen for large number of suspected TPs in environmental samples. The created list of target TPs included 1794 compounds generated by a rule-based system to predict products of microbial metabolism. Using a six-step funnelling procedure (see [94] for more details), 19 TPs were identified in surface waters from agricultural areas. This approach requires an efficient filtering procedure, due to the large number of suspects, comprising rather straightforward and obvious criteria such as absence in analytical blanks and the match of the observed isotope patterns with theoretically predicted ones for the molecular formula of the suspect [90]. The suspect target screening procedure is highly efficient in narrowing down a large number of plausible TPs to those few that are actually present in a given environmental sample [94].

2.6.3 Non-target screening approach (unknown identification)

Identification of unknown compounds is more challenging as the procedure starts without any information on the compounds to be detected [90]. There are several unknown identification procedures using different software (e.g. [91,95-101]), but three steps are common in typical screening workflow (Figure 2-7): (i) an automated peak detection by exact mass filtering from the high resolution full-scan chromatogram (e.g. Pitarch et al. [102] used the Chromalynx XS software to screen for non-targeted organic contaminants in environmental waters, Bataineh et al. [95] used the MZmine software for the identification of polar compounds in freshwater sediments), (ii) an assignment of an empirical formula to the exact mass of interest (e.g. Weiss et al. [101] used the composition tool provided in Xcalibur) and, (iii) a database search of plausible structures for the determined empirical formula (e.g. PubChem, ChemSpider) [93]. The determination of the molecular ion from a given spectrum is one of the crucial steps during mass spectral evaluation. In most LC-MS studies, electrospray (ESI) and atmospheric pressure chemical ionisation (APCI) are used and both produce adducts such as [M+H]+ or [M+Na]+ or multiple others. Thus, the nature of the adduct has to be determined

20 Chapter 2 before assigning the elemental composition using accurate mass measurements [103]. As soon as the elemental composition is known, database searches are performed and the resulting possible structures can be numerous. At this stage it is important to be able to rank the candidates for selection and exclusion from the list of candidates. This procedure usually involves HR-MSMS. However the search for unknowns in MSMS ion libraries is limited to the recorded spectra of analytical standards [84], which is not sufficient for a real unknown screening [90]. Hill et al. [98] showed that unknown chemical structures can be determined by matching their experimental fragmentation spectra with computational fragmentation spectra of compounds retrieved from chemical database and matching experimental MSn fragments with predicted fragments is a valuable tool to select candidates from database.

MZmine is a software enabling data processing of LC-MS analysis. Its special features are: (i) input file manipulation, (ii) spectral filtering, (iii) peak detection, aiming to find the peaks in the measurement data, chromatographic alignment, which allow comparison in between LC-MS/MS runs by the search for corresponding peaks across different runs, (iv) visualisation of the peak shape of the extracted m/z as well as its area and height (after processing, data is exported as a delimited peak height and area matrix) and (v) data export to an excel sheet in order to create csv files which are then used as “libraries” to compare and search other LC-MS/MS runs [104].

MOLGEN-MSMS [105] is a computer program that assigns the empirical formula to an unknown compound based on MS and MS2 experimental data. The program uses the measured exact mass of the unknown molecular ion to generate candidate molecular formulas. For each of these, the theoretical isotope pattern is compared to the measured intensities of the isotopic peaks in the MS to generate a match value. The MS2 fragments are then investigated to determine if a sub-formula of the candidate formula can be assigned to each peak, with a match value calculated based on the number of assigned fragments (within a given error range). The combined match value (based on the MS and MS2 data) can then be used to select the formula best fits the data. The exact mass error can also be taken into account.

MetFrag [106] is a computer program designed to support the identification of metabolite using tandem mass spectra. However, it can also be used for the identification of other organic compounds such as environmental pollutants. This program is able to screen dozens to thousands of candidates retrieved from compound databases such as PubChem or ChemSpider. It performs an exact mass search in these databases using the m/z value of the unknown compound of interest. A list of candidates is provided along with their fragmentation based on the bond disconnection approach instead of using cleavage rules. These “theoretical” fragments are then compared with the experimental fragments of the unknown compounds. Based on the m/z and intensity of the fragments, bond dissociation energy and neutral loss comparison between “theoretical” and experimental fragments, the candidates are ranked. It is then easy to remove unlikely candidates with low scores from the list. For instance a candidate can be removed from this list if it cannot explain the most intense fragment. Thus, MetFrag is able to identify small molecules from tandem MS measurement among a large set of candidate structures.

21 Chapter 2

2.7 Further classifiers to support identification

Matching experimental with computational fragmentation spectra is a valid method for rapid discrimination of compounds having the same empirical formula. However, there are often several possible candidate structures for an individual peak and thus an empirical formula of a compound to be identified. Further classifiers are needed to exclude as many candidates as possible. Retention prediction and toxicity prediction can help in the selection of candidates.

2.7.1 Retention prediction

Retention indices are important tools to help reduce the number of possible structures derived from mass spectra [22]. However, retention indices are not available for all compounds and have to be predicted. Retention time predictions are done with quantitative structure retention relationships (QSRRs) that describe properties and retention behaviours of analytes [107]. A simple approach is the retention prediction from log Kow as explained above (Section 2.4). Since octadecylsilica stationary phases differ significantly from octanol with respect to possible analyte interactions, log Kow based predictions may be rather poor for compounds exhibiting polar interaction or hydrogen bonding. For more exact retention prediction for structure elucidation, linear solvation-energy relationships (LSERs) [79] may be used. They allow the prediction of retention properties of organic compounds in GC and LC. The equation is presented below (Equation 2-1) and contains different parameters describing the interactions between analyte and solvent. Each of these interaction terms is calculated as the product of two complementary descriptors, one representing the sorption properties of the chromatographic column and the other describing the complementary property of the analyte [83]. The idea of Abraham equation is that the solute property has to be related to the partitioning process (e.g. log Kow, solubility) and retention [108].

Equation 2-1: solute property = aA + bB + sS + eE + vV + c

The solute property describes the interactions between analytes and mobile phase and between analytes and stationary phase. The descriptor A, B, S, E and V characterise the potency of the compounds to interact with surrounding phases while parameters a, b, s, e and v characterise the phase. Descriptor A corresponds to the hydrogen bond acidity, expressing the ability of the compound to donate a proton in a hydrogen bond. Descriptor B corresponds to the hydrogen bond basicity, expressing the ability of a compound to accept a proton in a hydrogen bond. Descriptor S measures the solute dipolarity and polarisability. E describes the excess molar refraction, and McGowan volume V is calculated from the molecular composition. The coefficient a reflects the ability of the solvent to accept a proton in a hydrogen bond, coefficient b describes the ability of the solvent to donate a proton in a hydrogen bond, coefficient s describes polarisability and dipolarity of the phase, coefficient e represents the Van-der-Waals forces of the solvent, coefficient v is the cavity formation of the phase and coefficient c is the intercept of the equation [109-111].

The solute property is the dependant variable of this equation and can represent different properties including the octanol-water partition coefficient (log Kow) or the chromatographic hydrophobicity index (CHI) [110]. The latter is directly connected to the

22 Chapter 2 elution gradient of the LC system. CHI is defined as the percentage of solvent (organic modifier) needed to elute the analyte in a linear gradient. For instance, with a gradient from 0 to 100 % of solvent, a CHI value of 80 means that the compound will elute when the gradient reaches 80% of solvent. The advantage of the CHI is that it is nearly independent from the gradient setup and column dimensions. The model does not need to be developed on each column and require only a “calibration” of the column used for the analyses. This calibration is usually done with standards that have known CHI value (see Chapter 5, Section 5-2 for the application). This prediction was found to be accurate enough to be an excellent tool to efficiently reduce the list of possible candidates [83] and could be included in EDA study as an additional classifier to MSn fragments.

2.7.2 Mutagenicity prediction

In EDA studies, only those chemicals that may cause the effect of concern are important. However, standards are not always available. Therefore, quantitative structure activity relationships (QSARs) [112] and structural alerts [63] are valuable tools to (i) select potentially active compounds from identified substances in a toxic fraction and (ii) to reduce and prioritise the number of candidates [113]. Kazius et al. [63] showed that mutagenicity is often related to specific chemical substructures, which are presented in Table 2-3. These structural alerts may be used as a starting point of mutagenicity prediction. However, not all of the compounds sharing these substructures are mutagenic. For instance within the large group of aromatic amines, some are known to be mutagenic (e.g. 2-aminoanthracene (used as positive control in the Ames test), 2- amino-3-methylimidazo-[4,5-f]quinolone [42], 3-amino-1,4-dimethyl-5H-pyrido[4,3-b]indole [42]) while others are not (e.g. 1-amino,2-hydroxynaphthalene [69], aniline).

Table 2- 3: Substructures involved in mutagenicity. Table adapted from [63]. Substructures Representation Substructures Representation - O O + Unsubstituted NH2,OH Aromatic nitro N heteroatom-bonded heteroatom N,O aro NH2 Aromatic amine Azo-type N = N aro Three membered NH,O,S Aliphatic halide -Cl,Br,I heterocycle arom. ring Polycyclic aromatic nitroso N = O aro system arom. ring

Quantitative structure activity relationships (QSARs) are used to predict physico- chemical properties and effects of chemicals from the structure [114]. Aromatic amines and azaarenes are widespread substances used in industry (e.g. drugs, dyes, paper, leather, plastics, cosmetics, and food) including many mutagens. While diagnostic strains can be used to link mutagenic effects to this compound group, QSARs may be helpful to predict the individual mutagenic potency of candidate structures. This is particularly useful since the

23 Chapter 2 mode of action of aromatic amines is well investigated and is related to the stability of nitrenium ions as the ultimate electrophilic metabolite that covalently binds and alters DNA [65-67,69] (see section 2.3.2). Nitrenium ions are di-coordinate nitrogen intermediates with a single pair and positive charge on the nitrogen [67]. Such intermediates are short-lived and recent studies have shown that increasing their life-time will increase their potential toward DNA binding [65,67,69], suggesting nitrenium stability is the key factor in mutagenicity. In general, nitrenium stability is strongly correlated with charge density at the exocyclic nitrogen (charge distribution over the molecule) and the following factors were found to have stabilising effects on the nitrenium ion: (i) hyperconjugation and inductive effects [65,69,115], involving mainly methyl substituent in ortho- position of the amine group. (ii) mesomeric effects (resonance effects) [65,115], involving delocalisation of the cationic charge through the aromatic system. These effects increase with the increase aromatic rings due to greater delocalisation of the cationic charge. These effects can also be linked to hydrophobicity, since the increased number of rings leads to the increase of hydrophobicity. It was noted by some authors [63,115], that bulk substituent in the positions adjacent to the amino group is unfavourable and leads to a decrease of potency. Borosky [65] also showed that the nitrenium ion stability decreases when the number of nitrogen atoms increases in the ring system.

One calculation method to predict mutagenicity of aromatic amines, was developed by Ford et al. [116,117] and used a semiempirical method (AM1) to calculate the stability of the nitrenium. In this method, reaction energies are calculated relative to aniline, which is a non mutagenic aromatic amine. It was shown that the energy of formation of the nitrenium ion varies significantly between different amines compared to other reaction steps during aromatic amine activation. This suggests nitrenium ion formation is the rate limiting step. Using aniline as a reference compound for relative formation energy can be calculated for the reaction below.

+ + Ar-NH2 + PhN H → Ar-N H + PhNH2

+ Ar-NH2 represents the parent amine, while Ar-N H is the corresponding nitrenium ion. + Similarly PhNH2 is aniline with PhN H as the corresponding nitrenium. From this reaction, relative energies ΔΔE are calculated by the equation 2-2, where ΔE values correspond to the heat of formation (ΔHf) of the compounds:

Equation 2-2: ΔΔE = ΔEAr-N+H + ΔEPhNH2 - ΔEAr-NH2 – ΔEPhN+H

Since aniline has been found to be non-mutagenic in the Ames test, positive ΔΔE values are related to a low probability of being mutagenic. Negative ΔΔE values indicate high probabilities of Ames positive results. In the present thesis this method of mutagenicity prediction was used to decrease the list of candidates by removing candidates with a low probability of being mutagenic.

The methods described above were implemented in the following chapters of this work, with the aim of proposing a limited list of candidates and/or identifying mutagens present in a water sample.

24 Chapter 3

CHAPTER 3: DEVELOPMENT AND APPLICATION OF INTEGRATED TOOLS FOR SAMPLING AND ISOLATION OF PLANAR POLYCYCLIC MUTAGENS

3.1 Introduction

Planar aromatic compounds, including many polar compounds, have been identified as a key compound group in waste and surface waters responsible for mutagenicity [8]. These compounds include combustion and pyrolysis products [118] as well as many products and by-products of chemical industry such as dyes [119,120] and are often present as complex mixtures in waters [121]. Planar compounds including those with multiple fused aromatic rings can be sampled selectively using blue rayon (BR), a fibre (e.g. rayon, cotton) with a covalently-bound copper phthalocyanine trisulfonate pigment [16,40]. The separation of this highly complex compound group is one of the major challenges for the isolation and identification of individual components and requires adequate fractionation procedures. Poly-aromatic compounds include different sub-groups, such as polycyclic aromatic hydrocarbons (PAHs) and more polar, water soluble derivatives [122,123], such as quinones, hydroxyl-, keto-, nitro-keto-, carboxylic- and nitro-PAHs and heterocyclic amines with one or more in the ring system [123]. These different compound groups are characterised by different physico-chemical properties, which can be used to efficiently separate them. Their ability or non-ability to act as acids or bases is a key property that can be exploited for separation using ion exchange [73,80,122]. Another important property is lipophilicity as the tendency to partition from water into , octanol or reversed-phase (RP) chromatographic stationary phases. Many EDA studies use reversed- phase high pressure liquid chromatography (RP-HPLC) [72]. While lipophilicity is the main driving force in RP-HPLC, the availability of a broad variety of stationary-phases allows for the exploitation of additional and complementary interactions (see Chapter 2, Section 2.4) supporting a selective and efficient separation. The aim of this study was to use a passive sampling method and develop a fractionation method for planar aromatic and potentially mutagenic compounds with a broad range of polarity from surface waters. The development aimed at a multistep fractionation procedure with an acid/base partitioning step and orthogonal separation on different RP stationary phases. To optimise recovery, two different approaches were applied. Fractionation of artificial mixtures of standard compounds provided precise chemical analytical recoveries for initial method development steps. In addition, recovery studies on the basis of mutagenicity were performed.

3.2 Materials and methods 3.2.1 Chemicals

HPLC grade methanol (MeOH), GC grade toluene, analytical reagent grade dimethyl sulfoxide (DMSO), acetic and methanoic acid were obtained from Merck (Darmstadt,

Germany). The ammonium hydroxide solution (NH4OH) (25% in water) was obtained from Fluka and ammonium acetate from Sigma-Aldrich. Other analytical standards suppliers are

25 Chapter 3 given in Table 3-1. The water used was bi-distilled in the laboratory. The blue rayon was purchased from Funakoshi Co. Ltd. (Tokyo, Japan). OASIS MCX, MAX and HLB solid phase extraction cartridges (1 g, 500 mg, and 200 mg respectively) were purchased from Waters (Milford, MA, USA).

Table 3- 1: Name, structure and suppliers of the analytical standards used. Name Chemical structure Suppliers N acridine Aldrich

Cl

atrazine CH3 N N Promochem

H3C NHN NH CH3 NH2

N Toronto research 2-amino-3,8-dimethylimidazo[4,5- N CH3 Chemicals f]quinoxaline (MeIQX) N H3C N NH2 N 2-amino-3-methylimidazo[4,5-f]quinoline N CH3 ABCR (IQ)

N

2-amino-1-methyl-6-phenylimidazo[4,5- CH3 N ABCR b]pyridine (PHIP) NH2 N N

2-amino-9H-pyrido[2,3-b]indole NH2 MP biomedicals N (AαC) N LCC H O O

H3C O N benalaxyl CH3 Ehrenstorfer H3C

H3C

Ges. für benz[c]acridine N Teerverwertung

benzanthrone Aldrich

O

N benzo[h]quinoline Aldrich

O benzophenone Fluka

N benzotriazole N Fluka N H

26 Chapter 3

Name Chemical structure Suppliers

CH3

N N O caffeine N Merck N CH3 H3C O

Sigma carbamazepine N Aldrich O H2N H N carbazole Aldrich

Cl

NO2 1-chloro-2,4-dinitrobenzene Sigma Aldrich

NO2 O

1,4 chrysenquinone Chiron

O

5,6 chrysenquinone Chiron O

O

4H-cyclopenta[def]phenanthren-4-one Chiron

O NH2

1,8 diaminopyrene H2N Chiron

Cl

H N 3,3’dichlorobenzidine 2 NH2 Riedel

Cl

H3C CH3

N,N’-diethyl-N,N’-diphenylurea N N Aldrich

O

O OH

1,5-dihydroxyanthraquinone Aldrich

OH O NO2 1,3-dinitrobenzene Fluka

NO2

27 Chapter 3

Name Chemical structure Suppliers

NO2

1,8-dinitropyrene O2N Aldrich

CH3

NO2 2,4-dinitrotoluene Sigma Aldrich

NO2

9-fluorenone Aldrich

O O

1H-isoindole-1,3(2H)-dione NH Riedel

O OH O

1-hydroxyanthraquinone Synthesised

O O OH 2-hydroxyanthraquinone Tokyo Kasei

O OH

3-hydroxybenzo(a)pyrene Chiron

OH

1-hydroxypyrene Chiron

N N O Fluka metazachlor N CH3 Cl

H3C

OH

NO2 4-methyl-2-nitrophenol Alfa Aesar

CH3

1-methyl-9H-pyrido[3,4-b]indole N Aldrich (Harmane) N CH3 H NH 2-naphthalenamine, N-phenyl- Aldrich

O

1,4 -dione Fluka

O

28 Chapter 3

Name Chemical structure Suppliers

2-naphthoic acid Alfa Aesar

HO O OH 1-naphthol Fluka

NO2

3-nitrobenzanthrone Chiron

O Cl nitrofen ABCR Cl ONO2

NO2 2-nitro-9-fluorenone Aldrich

O NO2

1-nitropyrene Aldrich

NO2

4-nitroquinoline 1-oxid Fluka + N - O CH3 2-nitrotoluene NO2 Riedel

9,10-phenanthrenedione Promochem

O O CH3

O2N NO2 trinitrotoluene Sigma Aldrich

NO2

3.2.2 Mutagenicity assay

To assess recovery based on mutagenicity, the Ames fluctuation assay (Ames II test) was performed as described by Perez et al. [124] with slight variations. Samples were tested using the tester strain TA98 with and without metabolic activation by S9 on 24-well and 384- well microplates, with three replicates per sample. Cultures of TA98 were grown overnight in oxoid nutrient broth Number 2 and ampicillin (50 μg ml-1) in an incubation shaker at 37°C. The growth of the bacterial suspension was controlled by measuring the bacterial density with a spectrophotometer and adjusted with minimal medium to a given value. After the samples were transferred to 24-well plates (10 μL per well), the culture was added (490 μL per well)

29 Chapter 3 and the suspensions were pre-incubated (90 min at 37°C with shaking). The pre-incubated suspensions were diluted six-fold in indicator medium and distributed into 384-well plates (50 μL per well, 48 wells per replicate). Because growth of reversed bacteria leads to acidification and change of the indicator color from purple to yellow, yellow wells were counted after incubation (48 h at 37°C). Positive controls included 2-nitrofluorene (20 ng/well) for TA98 and 2-aminoanthracene (5 ng/well) for the TA98 with activation and were performed with four replicates and at least eight replicates for the negative controls (only DMSO).

A sample was considered mutagenic when the effect was higher than the baseline (BL), which was calculated based on the mean value (MV), standard deviation (SD), and negative control (K-) as shown in Equation 3-1:

Equation 3-1: BL = MVK- + 2 SDK-

3.2.3 Optimisation of the fractionation procedure 3.2.3.1 Method development strategy

The fractionation procedure was developed stepwise as follows:

First, 47 planar aromatic standard compounds were selected covering a broad range of lipophilicity (log Kow = 0 to 6). Priority was given to compounds that had been detected previously at the sampling site and those environmental chemicals reported in the literature to adsorb to BR. Secondly, mixed-cation exchange (MCX) and mixed-anion exchange (MAX) solid phase extraction (SPE) sorbents were used to separate the compounds into three distinct fractions. The elution mixtures were optimised for polar planar compounds and the ability of these sorbents to separate the chemicals into acidic, basic and neutral fractions according to their behaviour in water was studied. Following the SPE step, two sequential separation steps using reversed-phase HPLC columns were introduced into the procedure. Six different stationary phases were tested to find complementary elution times (orthogonal elution) and resolution of the maximum number of individual peaks. This was done with analytical columns. Finally, the three-step fractionation method was tested with the selected preparative columns and validated based on the separation and recoveries of the standard compounds as well as the recovery of the mutagenicity of a BR extract using the Ames fluctuation assay test.

3.2.3.2 Ion exchange cartridges

The protocol used was based on Lavén’s procedure for pharmaceuticals [80], with adapted and optimised elution mixtures for polar planar compounds. The optimisation was based on attaining maximum standard recovery. Standards mixtures and the BR extract, both in methanol, were diluted 20 times with water and the pH was adjusted to approximately 2.5 with methanoic acid, protonating the basic compounds for maximum ionic interaction with the sulfonic acid groups of the sorbent. The MCX sorbent was washed with 20 mL of methanol/ammonium hydroxide (90:10%, v/v) followed by 10 mL of bi-distilled water. The diluted standard mixture or the BR extract were loaded on the MCX sorbent, which was then rinsed with 10 mL of bi-distilled water. The

30 Chapter 3 acidic and neutral compounds were eluted first as a single fraction (A+N) with 20 mL of a methanol/toluene mixture (50:50%, v/v). Following this, the basic compounds (fraction B) were eluted with 24 mL of methanol/ammonium hydroxide (90:10%,v/v) and 12 mL of toluene/ ammonium hydroxide (90:10%,v/v). Both fractions were evaporated to dryness. The A+N fraction was then redissolved in methanol, diluted 20 times with water and the pH was adjusted to approximately 11 using ammonium hydroxide to deprotonate acidic compounds for ionic interaction with the quaternary amine groups of the MAX sorbent. The sorbent was washed with 20 mL of methanol/methanoic acid (90:10%, v/v) followed by 10 mL of bi- distilled water. The diluted A+N fraction was loaded on the MAX sorbent, which was then rinsed with 10 mL of bi-distilled water. The neutral compounds (fraction N) were eluted with 20 mL of a methanol/toluene mixture (50:50%, v/v) followed by the acidic compounds (fraction A) with 24 mL of methanol/methanoic acid (90:10%, v/v) and 12 mL of toluene/methanoic acid (90:10%, v/v). These two fractions were evaporated to dryness.

3.2.3.3 Separation on analytical columns

LC separation of standards on analytical columns was performed on an Agilent series 1200 HPLC system consisting of a degasser, a high-pressure binary SL pump, an autosampler, a column oven and a diode array detector operated from 210 to 250 nm (Agilent Technologies, Santa Clara, CA, USA). This system was controlled by the ChemStation software. Standard compounds were analysed with an ion trap-Orbitrap hybrid instrument (LTQ Orbitrap, Thermo Scientific, Bremen, Germany) equipped with an atmospheric pressure chemical ionization (APCI) source and controlled by the Xcalibur software. MS spectra in the mass range of m/z 60-400 were acquired in the Orbitrap with a resolving power of R = 60 000. Tandem MS (MS2) experiments were conducted by collision-inducted dissociation (CID) in the ion trap and product ions were transferred to the Orbitrap for detection. All data were processed by Qual Browser (Thermo Fisher Scientific, San Jose, USA). The next two steps of the fractionation procedure relied on reversed-phase HPLC columns. In this set-up the two columns that allowed for the best separation of the whole set of standards were selected. The target compound groups were expected to display the following interactions with the chromatographic phases: (i) lipophilicity according to log Kow, (ii) shape selectivity, (iii) polar selectivity and (iv) aromatic selectivity (π-π interactions). Orthogonal elution of both stationary phases with low correlation between retention times on the different columns and the greatest number of eluted peaks possible was the goal. Six stainless steel analytical columns were tested, a polymeric C18 (Zorbax Eclipse PAH Plus rapid resolution, 150 x 3.0 mm, 1.8µm, Agilent, Santa Clara, CA,USA), a C18 (Lichrospher 100 RP-18, 250 x 4.0 mm, 5µm, Merck, Darmstadt, Germany), a phenyl-hexyl (Zorbax Eclipse Plus rapid resolution, 150 x 3.0 mm, 1.8µm, Agilent), a pentafluorophenyl (Pursuit 5 PFP, 250 x 4.0 mm, 5µm, Varian, Palo Alto, CA,USA), a pyrenyl (Cosmosil 5PYE Waters, 150 x 4.6 mm, Nacalai Tesque, Kyoto, Japan) and a ß-cyclodextrin phase (ChiraDex, 250 x 4.0 mm, 5µm, Agilent). Columns were tested with five different gradients each using a solution containing all 47 standards at 1 mg/L each in methanol. The orthogonality of different chromatographic settings was determined by plotting retention times on one column versus another column.

31 Chapter 3

3.2.3.4 Fractionation on preparative columns

Separation of standards and samples on preparative LC columns was performed on a HPLC system consisting of a Rheodyne manual valve, a Varian ProStar 210 Binary Pump System with 25 mL stainless steel pump heads (Varian, Darmstadt, Germany), an UVD340U diode-array-detector (Dionex, Idstein, Germany) and a Foxy®2000 fraction collector (Teledyne Isco Inc., Lincoln, USA) controlled by the software Chromeleon® 6.7 (Dionex). The polymeric C18 (Zorbax Eclipse PAH, 250 x 9.4 mm, 5 µm, Agilent) and the phenyl hexyl (Zorbax Eclipse, 21 x 250 mm, 5µ m, Agilent) stationary phases were selected for the fractionation procedure (see Section 3.1.2 for details). These two columns were used for the validation of the methods with the standards and a BR sample. For fractionation, an up-scaling from analytical columns to the preparative system was done. In the case of the phenyl-hexyl column the equipment available required a reduction to half of the calculated preparative flow, resulting in a doubling of retention times. Up-scaling had no impact on elution orders and provided highly correlated retention times on the analytical and corresponding preparative system (r2: > 0.975 for both columns). Chromatographic details are given in Table 3-2. On this preparative fractionation set-up, fractions were collected every minute. Adjacent fractions were combined when they contributed to the same peak based on the UV absorption chromatogram. The fractions were diluted in water (neutral compounds) or in an ammonium acetate buffer, 50 mM, pH 4 (acidic compounds) and then extracted with solid phase extraction (SPE) OASIS HLB cartridges to remove water from the fractions. The cartridges were then eluted with 30 mL of methanol and the volume reduced to 2-3 mL with a rotary evaporator and gentle nitrogen stream. The recoveries of standards were determined by LC-MS/MS after the fraction were recombined.

Table 3- 2: Gradient used on preparative columns. Stationary phase Flow Gradient Mobile phase (mL/min) 0-40 min 50-100%MeOH, 40-60 MeOH/water for the neutral Polymeric C18 3.3 min 100% MeOH, then fraction equilibration of the phase MeOH/water in ammonium 0-80 min 40-100%MeOH, 80-110 acetate buffer (50mM, pH 4) Phenyl-hexyl 8.0 min 100% MeOH, then for the acidic fraction equilibration of the phase

3.2.4 Collection and preparation of organic extract from river water

The sample was collected in May 2008 on the River Elbe at Prelouc, Czech Republic, downstream of dye factory discharges. Before use, BR was washed according to the protocol developed by Kummrow et al. [13]. The final extract of the BR washing procedure was tested for mutagenicity to exclude blank mutagenicity. Ten bags, each containing 10 g BR were deployed in the river for 24 hours as described by Sakamoto and Hayatsu [38]. The BR was removed and washed several times with bi-distilled water to remove the trapped particles from the BR. The BR was extracted with 160 mL of methanol/ammonium hydroxide (50:1; v/v) per 1 g of BR and agitating for 30 minutes [38]. The procedure was repeated once and the combined extracts were evaporated to dryness and redissolved in methanol. A sub-sample of the residue, corresponding to 5 g of BR equivalent, was taken and prepared for the bioassay

32 Chapter 3 by evaporating it to dryness and redissolving it in dimethyl sulfoxide (DMSO). The remaining sample (95 g of BR equivalent) was retained for fractionation and analysis.

3.2.5 Evaluation of the method

The three-step fractionation method was evaluated based on the ability of each step (i) to recover reference standard mixtures and the mutagenicity in a BR sample and (ii) to separate the standard mixtures and the BR sample within clearly defined fractions. This evaluation was done as follow: Firstly, reference standard mixtures (Table 3-1) at individual concentrations of 1 mg/L were prepared in methanol and were used to calculate the recoveries of all single standards for each step. Secondly, the same reference standard mixtures were used to evaluate the ability of the ion exchange step to separate these different standards within the acidic, basic and neutral fraction, with limited overlapping between each fraction. Actual separation was compared to the pKa of each standard to determine if pKa could be used to predict fractionation. The capacity factor of each standard on the analytical and preparative columns for the two RP- HPLC steps was tested to ensure separation was not affected by up-scaling. Thirdly, a BR sample was used to evaluate the recovery of the sample for each step. This was done by comparing the mutagenicity of the non-fractionated sample/fraction and the fractionated-recombined sample/fraction for each step. The strain TA 98 with and without S9 was used and the comparison was based on the mutagenic concentration response curves, using eight concentrations of BR equivalent (further details below, Section 3.3.2). Finally, the separation of the different compounds contained in the sample/fraction was studied to ensure that mutagenicity was distributed on few isolated fractions with very limited overlapping within neighbouring fractions. This was done by measuring the mutagenicity of each fraction, prioritising the most mutagenic fractions for further fractionation and chemical analysis.

3.3 Results and discussion 3.3.1 Evaluation of the method based on standards 3.3.1.1 Ion exchange recoveries and separation

Mixed-cation- and anion-exchange SPE columns were used to separate the analytes into three fractions (acidic, basic and neutral fractions), based on their pKa. The recoveries for the different classes of chemicals range from 32% (quinones) to 92% (amides). In general, azaarenes, keto-PAHs, nitro-keto-PAHs, hydroxy-quinones, amides, amino-compounds and large nitro-PAHs had high recoveries (Table 3-3). Quinones (16 to 56 ± 3%) and the 1-2 ring structure nitro-compounds (0 to 44 ± 8%) exhibited the lowest recoveries. Recoveries in ion exchange fractionation were reproducible with standard deviations (STDV %) below 10 % (Table 3-3). The structures of the MCX and MAX sorbent are presented in Figure 3-1 and enable two sites and types of interactions between sorbents and analytes: (i) the first possibility of interactions is reversed-phase, driven by π-π interactions due to the presence of aromatic rings in the sorbent and in the analytes. Such interactions are apolar interactions. Bäuerlein et al.

33 Chapter 3

[125] showed that with the increase of the number of aromatic bonds in the analyte, the π-π interactions are intensified, resulting in a higher sorption capacity. Reciprocally, the smaller the number of aromatic bond is the lower the interactions are between sorbent and analytes, and hence decreasing the possibilities for π-π interactions. Furthermore, the presence of polar moieties can disturb these apolar interactions, resulting in the decrease of retention capability of the sorbent for the more polar analyte [125]; (ii) the second possibility of interactions is of type strong ion exchange interaction ruled by Coulomb forces, which are electrostatic interactions between charged groups on a sorbent and charged molecules [125]. Thus, the low recoveries of the one aromatic ring nitro-compounds can be connected to incomplete retention on MCX and MAX sorbents due to their small aromatic structure and the presence of the polar nitro groups. 4-nitroquinoline-1-oxide also shows a low elution efficiency from the MCX column, as the nitrogen of the amine oxide group bears a partial positive charge even at the high pH during elution and is thus retained on the sulphonic acid groups of the MCX.

Strong ion-exchange interactions

- R = SO3 (MCX) CH3 R + R = N (CH3)2C4H9 (MAX)

H3C

H3C CH3

N

CH3 CH3 O

Reversed-phase interactions Figure 3- 1: Structure of the MCX and MAX sorbents

All of the standards eluted according to their pKa. In general, compounds containing hydroxyl and/or carboxylic functional groups were recovered in the acidic fraction, those with a carbonyl and/or nitro functional group were in the neutral fraction and those with an amino functional group (including quaternary amines) were collected in the basic fraction (as shown in Table 3-3). The combined use of the MCX and MAX sorbents allowed separation with negligible overlap (< 5 %) except 9,10 phenanthrenedione with 14% present in the acidic fraction and 30 % in the neutral one. Acid-base separation at defined pH values assisted in optimising buffer selection during subsequent RP-HPLC separation to ensure the analytes remained uncharged to enable optimum separation and minimise peak-tailing. The probable presence or absence of certain functional groups indicated by the ion exchange SPE method also supported the identification and confirmation of unknowns in the fractions by reducing the number of possible molecular structure candidates (see e.g. [126,127]).

34

Table 3- 3: Recoveries (average ± standard deviation of three replicates) of the standard compounds in the individual fractionation steps. *pKa values were calculated with the SPARC Online Calculator, na: pKa outside of the pH range (1-14), (1 to 47): standard number used for the chromatogram in Figure 3-2

pKa* Ion exchange step Polymeric C18 step Phenyl hexyl step Acidic Basic Neutral Standards class fraction fraction fraction Azaarenes (3) acridine 4.9 x 50.24(2.5) x 68.8 (1.4) 66.2 (2.0) (6) 2-amino-3,8-dimethylimidazo[4,5- 4.1; 2.3; 3.1 x 71.6 (6.2) x 71.5 (3.1) 73.6 (4.9) f]quinoxaline (5) 2-amino-3-methylimidazo[4,5-f] 5.7 x 83.8 (3.0) x 81.3 (1.8) 83.3 (3.9) quinoline (9) 2-amino-1-methyl-6- 5.4; 4.7 x 17.2 (0.2) x 72.9 (4.1) 58.4 (3.7) phenylimidazo[4,5-b]pyridine (8) 2-amino-9H-pyrido[2,3-b]indole 1.4; 4.9 x 68.8 (3.0) x 70.9 (2.5) 34.3 (0.9) (10) benz[c]acridine 4.3 x 84.4 (3.7) 2.5 (0.4) 71.0 (4.6) 73.1 (5.1) (4) benzo[h]quinoline 4.2 0.01 (0.0) 69.9 (5.0) 0.3 (0.2) 65.7 (1.1) 77.8 (4.4) (2) benzotriazole 8.5 97.0 (0.9) 1.2 (1.3) 0.3 (0.2) 81.6 (0.9) 111.3 (79.8) (1) carbazole na 0.1 (0.1) 0.5 (0.2) 91.2 (5.5) 84.2 (1.5) 82.0 (3.4) (7) 1-methyl-9H-pyrido[3,4-b]indole 5.6 x 94.3 (8.2) x 83.1 (1.0) 88.0 (9.9) Keto-PAHs (12) benzanthrone na 0.2 (0.2) 3.3 (0.9) 83.9 (1.7) 81.2 (5.3) 82.9 (5.8) (13)4H-cyclopenta[def]phenanthrene- na 0.2 (0.1) 0.5 (0.8) 93.2 (1.0) 77.7 (1.6) 89.9 (9.0) 4-one (11) 9-fluorenone na x x 72.1 (1.8) 70.2 (1.2) 102.0 (5.3) Ketone (14) benzophenone na 0.2 (0.1) x 53.4 (4.6) 70.9 (2.2) 77.5 (5.5) Nitro-keto-PAHs (15) 3-nitrobenzanthrone na x x 83.7 (9.1) 70.5 (2.9) 79.4 (2.2) (16) 2-nitro-9-fluorenone na 0.1 (0.1) x 95.2 (5.6) 71.6 (2.1) 73.2 (8.8) Quinones (20) 1,4-chrysenequinone na x x 15.7 (1.5) 18.0 (1.0) 6.3 (6.0) (21) 5,6-chrysenequinone na x x 29.1 (6.2) 38.5 (0.1) 16.3 (20.9)

pKa* Ion exchange step Polymeric C18 step Phenyl hexyl step Acidic Basic Neutral Standards class fraction fraction fraction (19) 1H-isoindole-1,3(2H)-dione 9.9 55.9 (2.5) x x 58.7 (4.1) 75.6 (1.4) (17) 1,4-naphthalene-dione na 2.7 (4.7) x 32.6 (2.4) 41.7 (0.5) 50.3 (8.0) (18) 9,10-phenanthrenedione na 14.4 (9.1) x 30.5 (3.9) 39.5 (4.4) 34.3 (13.8) OH-quinones (22) 1,5-dihydroxyanthraquinone 9.2 76.5 (4.2) x x 72.0 (8.0) 76.1 (3.6) (24) 1-hydroxyanthraquinone 7.5 64.9 (2.6) x 2.3 (2.0) 80.94 (5.71) 88.5 (9.8) (23) 2-hydroxyanthraquinone 7.5 92.8 (4.1) 3.2 (3.2) x 87.4 (4.5) 84.9 (8.3) Amino-compounds (27) atrazine 1.11 x 75.2 (0.6) x 82.5 (3.7) 71.9 (6.2) (46) 3,3’dichlorobenzidine 2.64 x 70.8 (5.3) x 74.2 (3.9) 15.1 (1.1) (25) 1,8-diaminopyrene 4.0 x 50.8 (3.8) x 54.74 (5.41) 43.3 (5.6) (26) N-phenyl-2-naphthalenamine na 0.2 (0.2) 4.0 (1.7) 50.5 (2.0) 30.5 (2.7) 12.0 (10.2) Amides (30) benalaxyl na 0.2 (0.1) 0.7 (0.5) 92.3 (3.3) 81.2 (2.9) 87.4 (6.8) (28) carbamazepine na 0.3 (0.1) 3.3 (0.8) 90.7 (7.6) 86.8 (7.7) 86.1 (3.9) (31) N,N’-dimethyl-N,N’-diphenylurea na 0.3 (0.1) 0.6 (0.2) 90.54 (3.3) 81. 5 (1.6) 66.1 (6.5) (29) metazachlor na 0.2 (0.1) 1.6 (0.3) 93.9 (7.0) 79.9 (2.3) 76.8 (1.7)

NO2-compounds (39) 1-chloro-2,4-dinitrobenzene - x x 0.3 (0.4) 63.0 (0.5) 65.1 (13.4) (37) 1,3-dinitrobenzene - x x 44.0 (7.7) 56.5 (3.8) 51.0 (17.2) (33) 1,8-dinitropyrene - x x 61.0 (4.6) 59.0 (8.2) 65.1 (10.2) (38) 2,4-dinitrotoluene - x x 13.0 (7.2) 58.0 (4.3) 62.1 (13.6) (34) 4-methyl-2-nitrophenol 7.4 x x 16.2 (3.7) 17.1 (3.0) 4.2 (0.7) (41) nitrofen - x x 77.6 (1.5) 75.1 (3.7) 70.1 (8.6) (35) 4-nitroquinoline 1-oxid - x x 24.7 (8.3) 65.2 (3.3) 71.9 (2.6) (32) 1-nitropyrene - x 2.2 (1.9) 78.5 (7.8) 70.0 (5.4) 78.7 (5.6) (36) 2-nitrotoluene - x x x 8.5 (0.0) 7.0 (12.1)

pKa* Ion exchange step Polymeric C18 step Phenyl hexyl step Acidic Basic Neutral Standards class fraction fraction fraction (40) trinitrotoluene - x x x 53.0 (0.0) 51.7 (7.6) OH-PAHs (43) 3hydroxybenzo(a)pyrene 7.3 54.6 (10.0) x x 68.2 (12.8) 69.4 (9.4) (42) 1 hydroxypyrene 7.9 57.4 (2.8) x x 82.1 (3.9) 84.0 (2.0) (44) 1-naphthol 9.3 31.4 (9.1) x x 55.0 (11.3) 65.6 (5.3) COOH-PAHs (45) 2-naphthoic acid 4.1 75.5 (3.6) x x 52.3 (2.9) 67.6 (3.7) Purine-dione (47) caffeine na 0.6 (0.1) 5.0 (2.9) 90.8 (3.3) 84.5 (4.0) 61.9 (15.6)

Chapter 3

3.3.1.2 Selection of the stationary phases for the preparative liquid chromatography

Column combinations and gradients were optimised by investigating standard mixtures on six analytical columns and minimising the co-elution of analytes. The retention times on different columns were plotted against each other, with the resulting 2D-plot correlation coefficients ranging between 0.27 for Chiradex and C18 to 0.85 for pentafluorophenyl and C18 (Table 3-4). Thirty-eight out of 47 compounds could be separated on the phenyl-hexyl phase, while the other phases separated between 20 and 26 compounds. The polymeric C18 phase eluted slightly more peaks than the monomeric C18 phase, probably due to shape selectivity particularly for isomers such as 1-hydroxyanthraquinone and 2- hydroxyanthraquinone eluting at 32 and 27 min respectively and the two chrysenequinone isomers. The combination of polymeric C18 and the phenyl-hexyl stationary phase exhibited good orthogonality combined with a maximum number of separated peaks and was thus selected for the optimised fractionation procedure.

Table 3- 4: Pearson correlation coefficients between the retention times on the two C18 columns, on the four polar columns and number of peaks obtained from the separation of 47 standards compounds. Phenyl-hexyl Chiradex Pyrenyl Pentafluorophenyl Eluted peak Polymeric C18 0.3632 0.3214 0.4207 0.5897 23 C18 0.5886 0.2664 0.6131 0.8486 20 Eluted peak 38 26 22 24

3.3.1.3 Separation with the combination of the polymeric C18 and phenyl hexyl columns

The chromatograms of the standards were obtained by LC-APCI-MSMS (Figure 3-2a (polymeric C18 phase) and Figure 3-2b (phenyl-hexyl phase)). The polymeric C18 phase is expected to separate complex mixtures according to lipophilicity modified by the shape selectivity of this phase. Within one compound class, the retention times increase with the number of fused rings, for example 9-fluorenone (3 fused rings) eluted at 22.7 min, 4H-cyclopenta[def]phenanthrene-4-one (4 fused rings including one ring with 5 carbons) at 29.6 min and benzanthrone (4 fused rings) at 32.10 min. Shape selectivity allowed the separation of similar compounds and isomers based on the intramolecular steric hindrance and maximum length to breadth ratio [95]. This is showed by 1-hydroxyanthraquinone, which is able to form intramolecular H-bonding between the hydroxy- and neighbouring carbonyl- group, resulting in a reduction of its ability to interact with the mobile phase of the system. Thus, this compound becomes more lipophilic than its isomer, 2-hydroxyanthraquinone, which is not able to create such intramolecular H-bonding and elutes at 31.6 minutes instead of 26.9 min for the latter. The capacity factors, k, used to evaluate the ability of this stationary phase to separate the standards are listed in Table 3-5. All standards within the same compound class could be separated, except 2-amino-3- methylimidazo[4,5-f]quinoline (IQ) and 2-amino-3,8-dimethylimidazo[4,5-f]quinoxaline (MeIQx) (azaarenes) and 4-methyl-2-nitrophenol and 1-chloro-2,4-dinitrobenzene (nitro- compounds). The phenyl-hexyl phase is expected to have a high selectivity for aromatic compounds due to π-π interactions between aromatic analytes and the phenyl group in addition to lipophilic interactions [76]. The capacity factors, k, for the standards on analytical and

38 Chapter 3 preparative phenyl-hexyl are listed in Table 3-5. Both sets of capacity factors are very similar. The chromatogram obtained with LC-MS/MS (Figure 3-2b) exhibits 23 completely isolated standards and the 24 remaining standards were divided among 15 peaks. For instance benzo(h)quinoline, 1-chloro-2,4-dinitrobenzene and naphthoic acid co-eluted at 32 + 0.1 min. The complementarity of the two columns is confirmed by the chromatograms presented in Figure 3-2a and 3-2b, which show that the standards that are not separated on the polymeric C18 column (e.g. IQ – MeIQx –caffeine) are separated perfectly on the phenyl- hexyl column and vice versa (e.g. benzo(h)quinoline – 1-chloro-2,4-dinitrobenzene – 2- naphthoic acid separate completely with the polymeric C18 column but co-elute on the phenyl-hexyl ). This complementarity in the separation and elution order is important as the aim of the EDA study is to provide an efficient method to isolate compounds in order to reduce the complexity of each fraction step by step for the analysis and identification of mutagenic compounds.

39

4.00E+07

3.50E+07

3.00E+07

2.50E+07

2.00E+07

a) 3.00E+07 43 5 20 32 6 18 46 3 13 23 24 47 31 1.50E+07 17 30 41 10 33 28 8 14 42 y 37 34 2 11 19 36 2.50E+07 38 Thiourea 35 1 1.00E+07 16 7

i t i t 27 s n 9 29 t e

i n 39 s k a e P 5.00E+062.00E+07 21 15 45 4 22 12 25 44 26 0.00E+00 1.50E+070 1020304050607080Retention time (min) b) 26 17 18 36 15 35 29 38 23 22 46 31 12 24 2 28 32 5 27 4 10 y 1.00E+07 8 39 13 41 47 9 45 16 6 44 43

19 3 14 34 i t i t s Thiourea Thiourea n t e t 7 i n 5.00E+06

s 1 k a e P

20 30 37 11 40 42 21 33 25 0.00E+00 0 1020304050607080Retention time (min)

Figure 3- 2: LC-APCI-MS chromatograms of the standard mixtures on analytical columns a) polymeric C18, b) phenyl-hexyl column. See Table 3-3 for corresponding names of the 1 to 47 numbers. Encircled numbers were obtained with APCI negative mode, numbers without circles in positive mode.

Table 3- 5: Capacity factors k and retention times of the 47 standard compounds on the preparative polymeric C18 and phenyl-hexyl columns. t phenyl-hexyl column k phenyl hexyl column t polymeric k polymeric t phenyl- k phenyl hexyl R Standards class log Kow R R adapted flow for adapted flow for C18 column C18 column hexyl column column preparative preparative Azaarenes acridine 3.4 25.58 8.92 27.17 6.59 57.77 7.29 2-amino-3,8- 1.4 4.94 0.79 7.17 1.00 15.25 1.19 dimethylimidazo [4,5-f] quinoxaline 2-amino-3-methylimidazo 1.2 4.42 0.77 5.87 0.64 12.58 0.80 [4,5-f]quinoline 2-amino-1-methyl-6- 1.3 9.39 2.32 19.82 4.54 42.74 5.13 phenyl imidazo [4,5-b] pyridine 2-amino-9H-pyrido[2,3-b] 2.0 15.29 4.51 22.71 5.34 47.79 5.86 indole benz[c]acridine 4.6 38.24 12.49 41.46 10.58 85.29 11.24 benzo[h]quinoline 3.3 32.90 10.63 31.90 7.91 65.79 8.44 benzotriazole 1.3 5.84 0.92 8.95 1.50 18.73 1.69 carbazole 3.7 28.07 9.11 33.41 8.33 68.63 8.85 1-methyl-9H-pyrido[3,4- 2.6 8.41 1.96 12.09 2.38 25.77 2.70 b] indole Keto-PAHs benzanthrone 4.8 32.10 10.52 39.15 9.94 80.43 10.54 4H-cyclopenta[def] 3.9 29.55 9.45 38.24 9.68 78.68 10.29 phenanthrene-4-one 9-fluorenone 3.6 22.69 6.81 32.88 8.18 67.69 8.71 Ketone benzophenone 3.2 19.44 5.52 32.55 8.09 67.07 8.62 Nitro-keto-PAHs 3-nitrobenzanthrone 4.5 34.74 11.61 44.68 11.48 91.53 12.13 2-nitro-9-fluorenone 3.4 29.89 10.13 38.69 9.81 80 10.48 Quinones 1,4-chrysenequinone 4.3 37.30 12.04 38.35 9.71 79.12 10.35 5,6-chrysenequinone 4.4 40.50 13.16 44.12 11.32 90.37 11.97 1H-isoindole-1,3(2H)- dione 1.3 5.79 0.61 10.78 2.01 24.05 2.45 1,4-naphthalene-dione 1.7 10.06 2.52 23.88 5.67 49.87 6.15

t phenyl-hexyl column k phenyl hexyl column t polymeric k polymeric t phenyl- k phenyl hexyl R Standards class log Kow R R adapted flow for adapted flow for C18 column C18 column hexyl column column preparative preparative 9,10-phenanthrenedione 3.1 16.19 4.44 29.28 7.18 60.54 7.69 OH-quinones 1,5- dihydroxyanthraquinone 3.9 37.94 12.41 44.61 11.46 91.42 12.12 1-hydroxyanthraquinone 3.6 31.56 10.05 41.11 10.48 84.41 11.11 2-hydroxyanthraquinone 2.9 26.91 8.62 32.05 7.95 65.79 8.44 Amino-compounds atrazine 2.6 13.66 3.39 24.67 5.89 51.49 6.39 3,3’dichlorobenzidine 3.5 24.47 7.55 31.01 7.66 64.06 8.19 1,8 diaminopyrene 2.6 36.96 12.06 47.87 12.37 97.91 12.92 2-naphthalenamine, N- 4.2 33.49 11.21 40.87 10.42 83.82 11.03 phenyl- Amides benalaxyl 3.2 24.47 7.31 38.75 9.82 79.62 10.42 carbamazepine 2.7 10.82 2.43 24.32 5.79 50.6 6.26 N,N’-dimethyl-N,N’- diphenylurea 4.2 17.46 4.56 32.15 7.98 66.34 8.52 metazachlor 2.1 12.37 2.90 29.18 7.15 60.49 7.68 NO2-compounds 1-chloro-2,4- dinitrobenzene 2.3 13.54 3.62 32.07 7.96 66.01 7.36 1,3-dinitrobenzene 1.5 10.69 2.79 26.51 6.41 55.12 6.91 1,8-dinitropyrene 4.7 38.24 12.37 46.54 12.00 96.79 12.89 2,4-dinitrotoluene 2.2 14.05 4.30 31.47 7.79 65.18 8.35 4-methyl-2-nitrophenol 2.5 15.74 3.72 27.70 7.96 58.25 7.36 nitrofen 4.6 31.95 10.29 44.88 11.54 91.97 12.20 4-nitroquinoline 1-oxid 0.9 6.83 1.21 21.34 4.96 44.37 5.37 1-nitropyrene 5.1 45.56 13.13 44.57 11.45 91.27 12.09 2-nitrotoluene 2.4 14.49 4.12 31.47 7.79 65.18 8.35 trinitrotoluene 2.0 11.44 2.61 34.97 8.77 72 9.35 OH-PAHs 3hydroxybenzo(a)pyrene 5.6 55.35 18.56 42.98 11.01 87.96 11.62 1 hydroxypyrene 4.5 38.79 13.38 38.06 9.63 78.14 10.21 1-naphthol 2.7 20.30 3.97 26.06 6.28 53.98 6.74 COOH-PAHs

t phenyl-hexyl column k phenyl hexyl column t polymeric k polymeric t phenyl- k phenyl hexyl R Standards class log Kow R R adapted flow for adapted flow for C18 column C18 column hexyl column column preparative preparative 2-naphthoic acid 3.1 27.64 8.77 32.01 7.94 66.01 8.47 Purine-dione caffeine 0.0 4.34 0.42 9.68 1.70 20.04 1.88

Chapter 3

3.3.1.4 Recoveries of the liquid chromatography fractionations

The recoveries and repeatability from triplicates listed in Table 3-3 ranked from 8.5 ± 0% (2-nitrotoluene) to 87.4 ± 4.5% (2-hydroxyanthraquinone) for the polymeric C18 column and from 4.2 ± 1% (4-methyl-2-nitrophenol) to 102 ± 5% (9-fluorenone) for the phenyl hexyl column. On both columns, the azaarenes, keto-PAHs, ketone, nitro- keto-PAHs, hydroxy-quinones, amino-compounds (except PNA), amides, hydroxy- PAHs, -PAH and caffeine had high recoveries (single standard recovery ≥ 70%). Nitro-compounds (except 2,4-dinitrophenol and 2-nitrotoluene) recovered from 57 to 75 %. The quinones had poor recoveries with values from 18 ± 1 to 59 ± 0.5% for the polymeric C18 column and values from 6 ± 14% to 59 ± 6% for the phenyl hexyl column. For both columns, the standard deviations were in most cases below 10%, except for some compounds with the lowest recoveries compounds. This shows the good reproducibility of the method.

3.3.2 Fractionation of the BR sample

The aim of the following section was to confirm that the three step fractionation procedure allows separation and recovery of mutagens from a complex environmental mixture (the BR extract). After each fractionation step, an aliquot of each fraction was taken and recombined (termed “recombined sample/fraction”). The mutagenicity of the raw sample/fraction was compared to the mutagenicity of the recombined sample/fraction with and without S9. Eight concentrations of BR equivalent/well were used for concentration response curves. In the first step, non logarithmic concentrations of BR were plotted versus the number of revertants obtained for each tested dose. The linear part of the curve was used to calculate the slope, which represents the number of revertants per well/g of BR, for the raw and recombined sample/fraction and are presented in Table 3-6 with the difference expressed in percentage. In the second step, the logarithmic concentrations were plotted versus the number of revertants using eight doses and a sigmoidal regression was applied. The resulting concentration-response curves are presented below (Figure 3-3, Figure 3-5, Figure 3-7, Figure 3-9, Figure 3-10, Figure 3-13 and Figure 3-14). The recovery comparisons were based on a quantitative comparison using the slope comparison between the raw and recombined sample/fractions (number of revertants per well/g of BR). The differences in percentage enabled the evaluation of each fractionation step in terms of mutagenicity recovery.

44 Chapter 3

Table 3- 6: Calculated slopes of the raw and recombined sample/fractions for each fractionation step. Number of Number of revertants per Loss revertants per Loss Step and fraction well/g of BR (%) well/g of BR (%) - S9 + S9 Ion Exchange raw fraction 2989.6 980.2 Ion Exchange recombined fraction 2340.1 -22 1107.1 +13 Polymeric C18 raw N fraction 883.1 874.8 Polymeric C18 recombined N fraction 739.9 -16 835.1 -5 Polymeric C18 raw A fraction 1624 554.1 Polymeric C18 recombined A fraction 1402.2 -14 589.6 +6 Phenyl-hexyl raw N-2 fraction - 299.2 Phenyl-hexyl recombined N-2 fraction - - 282.9 -5 Phenyl-hexyl raw N-7 fraction 143.9 - Phenyl-hexyl recombined N-7 fraction 180.4 +25 - - Phenyl-hexyl raw A-2 fraction 223.1 108.2 Phenyl-hexyl recombined A-2 fraction 238.3 +7 101.5 -6 Phenyl-hexyl raw A-8 fraction 138.8 - -14 - Phenyl-hexyl recombined A-8 fraction 120 -

3.3.2.1 Fractionation by ion exchange

Three fractions were collected (acidic, basic and neutral) and tested using eight concentrations ranging from 0.3 mg to 80 mg of BR equivalent/well. The concentration response curves (Figure 3-3) show that the mutagenicity is preserved as the trends with and without S9 for both raw and recombined BR extract are similar, with slightly higher mutagenicity of the recombined extract without S9 with the highest concentration of BR (80 mg of BR equivalent/well). This was confirmed by the slopes presented in Table 3- 6, showing 2990 and 2340 revertants per well/g of BR without S9 for the raw and the recombined extracts, respectively, representing a loss of 22% of mutagenicity. In contrast, the recombined extract with S9 activation (1107 revertants per well/g of BR) was 13% more mutagenic than the raw sample (980 revertants per well/g of BR). The mutagenicity results of the three fractions showed that the acidic and neutral fractions were the most mutagenic (Figure 3-4). The mutagenicity of the acidic fraction was surprising as the effects were expected to be in the basic fraction, consistent with previous studies which generally identified basic mutagens when using the passive sampler BR. (e.g. 2-amino-3-methylimidazo[4,5-f]quinoline, 2-amino-3,8- dimethylimidazo[4,5-f]quinoxaline, 3-amino-1,4-dimethyl-5H-pyrido[4,3-b]indole, 3- amino-1-methyl-5H-pyrido[4,3-b]indole, phenylbenzotriazoles-type compounds, 3,3’- dichlorobenzidine) [30,39,42,47,49,74,128]. The acidic and the basic fraction were further fractionated.

45 Chapter 3

Raw BR extract without S9 45 Recombined BR extract without S9 40 Raw BR extract with S9 35 Recombined BR extract with S9 30

25

20

15

10

Number of revertants per plate (max=48) 5

0 0.0001 0.001 0.01 0.1 Concentration ( g of BR/ plate)

Figure 3- 3: Recoveries of ion exchange separation, raw BR without S9 ( ), recombined BR without S9 ( ), raw BR with S9 ( ) and recombined BR with S9 ( ).

50.00 80 mg of BR without S9 45.00 40 mg of BR without S9 80 mg of BR with S9 40.00 40 mg of BR withS9 35.00

30.00

25.00

20.00

15.00

10.00

Number of revertants per plate (max= 48) 5.00

0.00 Acid fraction Basic fraction Neutral fraction

Figure 3- 4: Mutagenicity of the ion exchange fractions

46 Chapter 3

3.3.2.2 Fractionation by polymeric C18

The neutral (N) and acidic (A) fractions were fractionated with the preparative polymeric C18 column, using a solvent mixture of methanol/bi-distilled water (N) and methanol/bi-distilled water in ammonium acetate (50mM, pH 4) buffer (A). In both cases eleven sub-fractions were collected as described above. The eight tested doses ranked from 0.6 to 160 mg of BR equivalent/well. In the case of the fraction N, the dose response curves (Figure 3-5) show that the recombined fraction N is slightly less mutagenic than the raw fraction N without S9 and there are no significant mutagenicity losses with S9. The comparison of the raw and recombined N fraction showed a loss of 16 % of the mutagenicity in the recombined fraction without S9 and a loss of 8 % with S9 (see Table 3-6 for details). The mutagenicity results of the eleven sub-fractions (Figure 3-6) show that the first three fractions are mutagenic only with S9. The remaining fractions are mutagenic mainly without S9. The chemicals eluting in N-1 to N-3 fractions are more hydrophilic compounds (approximate log Kow = 0 to 3) than the chemicals eluting in the following fractions (N-4 to N-11), which contain compounds with increasing lipophilicity (higher log Kow, expected range 3 to 8.5). The sub-fractions exhibiting the greatest mutagenicity (N-2, 33 revertants with S9 and N-7, 41 revertants without S9, respectively) were selected for fractionation with the phenyl-hexyl column.

45 Raw Fraction N without S9 Recombined fraction N without S9 40 Raw fraction N with S9 35 Recombined fraction N with S9 30

25

20

15

10

5 Number of Revertants per plate (max=48)

0 0.0001 0.001 0.01 0.1 1 Concentration (g of BR per plate)

Figure 3- 5: Recoveries of the neutral fraction, raw fraction N without S9 ( ), recombined fraction N without S9 ( ), raw fraction N with S9 ( ) and recombined fraction N with S9 ().

47 Chapter 3

50 160 mg of BR without S9 45 80 mg of BR without S9 40 160 mg of BR with S9 80 mg of BR with S9 35

30

25

20

15

10

Number of revertants per plate (max=48) 5

0 N1 N2 N3 N4 N5 N6 N7 N8 N9 N10 N11

Figure 3- 6: Mutagenicity of neutral sub-fractions (polymeric C18 fractionation step)

In the case of fraction A, the dose response curves (Figure 3-7) show that the recombined fraction A is slightly less mutagenic than the raw fraction A without S9 and there are no significant mutagenicity losses with S9. The comparison of raw and recombined fractions showed a loss of 14 % of the mutagenicity in the recombined fraction without S9 and an increase in the mutagenicity of 6 % with S9 (see Table 3-6 for details). After the fractionation step the mutagenicity was more widely distributed over sub-fractions, with high mutagenicity spread over eight of the eleven fractions (Figure 3-8). As a result, the possibility of overlapping compounds in neighbouring fractions was investigated. The fractions were analysed by LC-MS/MS, along with a blank (non exposed piece of BR). The program MZmine [129] was used to extract exact m/z and retention times of ions from the mass spectrum and to create a mass list for each fraction and the blank. These lists were compared to the blank list and the common ions (mass tolerance of 0.003 u and retention time tolerance of 30 sec [95]) were removed. Then the lists of neighbouring fractions were compared. The maximum overlapping was found between fraction A-7 and A-8 with 13 % (26 ions in common over 200). The number of overlapping peaks for the other fractions was under 10%, which was considered to be a minimal overlap. Thus, the fact that mutagenicity was spread over 75% of the collected fractions suggests a broad range of mutagenic compounds rather than inefficiency of the polymeric C18 stationary phase to separate the compounds. The first fractions were mutagenic both with and without S9, while the following fractions were mutagenic only without S9 activation. Again the mutagenicity with S9 activation was associated with compounds of low to medium hydrophobicity (lower log Kow, approximately 0-4). A-2 and A-8 were selected for further fractionation as they exhibited a slightly higher mutagenicity than the other fractions.

48 Chapter 3

Raw fraction A without S9 45 Recombined fraction A without S9 40 Raw fraction A with S9 35 Recombined fraction A with S9 30

25

20

15

10

5 Number of revertants per plate(max=48)

0 0.0001 0.001 0.01 0.1 1 Concentration (g of BR per plate)

Figure 3- 7: Recoveries of the acidic fraction A, raw fraction A without S9 ( ), recombined fraction A without S9 ( ), raw fraction A with S9 ( ) and recombined fraction A with S9 ().

50

45 160 mg of BR without S9 80 mg of BR without S9 40 160 mg of BR with S9 35 80 mg of BR with S9

30

25

20

15

10

Numberof revertants per plate (max=48) 5

0 A-1 A-2 A-3 A-4 A-5 A-6 A-7 A-8 A-9 A-10 A-11

Figure 3- 8: Mutagenicity of acidic sub-fractions (polymeric C18 fractionation step).

49 Chapter 3

3.3.2.3 Fractionation by phenyl-hexyl

The neutral (N-2 and N-7) and acidic (A-2 and A-8) fractions were fractionated with the preparative phenyl-hexyl column using a solvent mixture of methanol/bi- distilled water (N) and methanol/bi-distilled water in ammonium acetate (50mM, pH 4) buffer (A). Sub-fractions were collected as described above. The eight tested concentrations ranked from 0.8 to 200 mg of BR equivalent/well. In the case of the neutral fractions, N-2 was tested only with S9 and N-7 only without S9. The concentration-response curves (Figure 3-9 for N-2 and Figure 3-10 for N-7) show that both recombined fractions remained as mutagenic as the corresponding raw fractions. The results in Table 3-6 show a loss of 5 % of the mutagenicity in the recombined N-2 fraction with S9 and an increase of 25% in the mutagenicity in the recombined N-7 fraction without S9 For N-2 (Figure 3-11), three fractions showed mutagenicity, N-2-6, N-2-7 and N-2-8 with the last fraction showing the highest response (44.8, 46.6 and 85.2 revertants per well/g of BR, respectively). For N-7 (Figure 3-12), only the fraction N-7-12 exhibited mutagenicity (73.6 revertants per well/g of BR).

45 Raw fraction N-2 with S9

40 Recombined fraction N-2 with S9

35

30

25

20

15

10

5 Number of revertants per plate(max=48)

0 0.0001 0.0010 0.0100 0.1000 1.0000 Concentration (g of BR per plate)

Figure 3- 9: Recoveries of fraction N-2, raw fraction N-2 with S9 ( ) and recombined fraction A-2 with S9 ( ).

50 Chapter 3

45 BR extract Fraction N7 without S9 BR extract recombined fraction N7 without S9 40

35

30

25

20

15

10

5 Number of revertants per plate(max=48)

0 0.0001 0.0010 0.0100 0.1000 1.0000 Concentration (g of BR/well)

Figure 3- 10: Recoveries of fraction N-7, raw fraction N-7 without S9 ( ) and recombined fraction N-7 without S9 ( ).

45.00

40.00 200 mg of BR with S9 100 mg of BR with S9 35.00

30.00

25.00

20.00

15.00

10.00

5.00 Number ofrevertants per plate(max=48) 0.00 N2-1 N2-2 N2-3 N2-4 N2-5 N2-6 N2-7 N2-8 N2-9 N2-10 N2-11 N2-12 N2-13 N2-14 N2-15 N2-16

Figure 3- 11: Mutagenicity of the N-2 sub-fractions (phenyl-hexyl fractionation step).

51 Chapter 3

45

40

35

30

25 200 mg of BR without S9 100 mg of BR without S9 20

15

10

Numberof revertants per plate (max=48) 5

0 N7-1 N7-2 N7-3 N7-4 N7-5 N7-6 N7-7 N7-8 N7-9 N7-10 N7-11 N7-12 N7-13 N7-14

Figure 3- 12: Mutagenicity of the N-7 sub-fractions (phenyl-hexyl fractionation step).

In the case of the acidic fractions, A-2 was tested with and without S9 and A-8 only without S9. The concentration response curves (Figure 3-13 for A-2 and Figure 3- 14 for A-8) show that raw and recombined fractions of A-2 exhibited the same mutagenicity (with and without S9). The recombined fraction A-8 is slightly less mutagenic than the raw fraction A-8. The slopes (Table 3-6) represented an increase of 7 % and a decrease of 6 % of the mutagenicity in the recombined A-2 fractions without and with S9, respectively. A loss of 14 % of the mutagenicity was observed in the recombined A-8 fraction without S9. For A-2 (Figure 3-15), two fractions showed mutagenicity, A-2-9 and A-2-10. Fraction A-2-9 was the most mutagenic. The slopes resulted in 184.2 and 102.2 revertants per well/g of BR without S9, and 15.8 and 2.4 revertants per well/g of BR with S9 for fraction A-2-9 and A-2-10, respectively. For fraction A-8 (Figure 3-16), only fraction A-8-9 showed mutagenicity, exhibiting the use of this method to separate complex samples into only a few fractions containing mutagenic compounds.

52 Chapter 3

Raw fraction A-2 without S9 45 Recombined fraction A-2 without S9 40 Raw fraction A-2 with S9 Recombined fraction A-2 with S9 35

30

25

20

15

10

5 Number of revertants per plate(max=48)

0 0.0001 0.0010 0.0100 0.1000 1.0000 Concentration (g of BR per plate)

Figure 3- 13: Recoveries of fraction A-2, raw fraction A-2 without S9 ( ), recombined fraction A-2 without S9 ( ), raw fraction A-2 with S9 ( ) and recombined fraction A-2 with S9 ( ).

Raw fraction A-8 without S9 45 Recombined fraction A-8 without S9 40

35

30

25

20

15

10

5 Number of revertants per plate (max=48)

0 0.0001 0.001 0.01 0.1 1 Concentration (g of BR per plate)

Figure 3- 14: Recoveries of fraction A-8, raw fraction A-8 without S9 ( ) and recombined fraction A-8 without S9 ( ).

53 Chapter 3

45

40 200 mg of BR without S9 35 100 mg of BR without S9 30 200 mg of BR with S9 100 mg of BR with S9 25

20

15

10

5

Number of revertents per (max plate = 48) 0 A-2-1 A-2-2 A-2-3 A-2-4 A-2-5 A-2-6 A-2-7 A-2-8 A-2-9 A-2-10 A-2-11 A-2-12 A-2-13 A-2-14 A-2-15

Figure 3- 15: Mutagenicity of the A-2 sub-fractions (phenyl-hexyl fractionation step).

45.00

40.00 200 mg of BR without S9 100 mg of BR without S9 35.00

30.00

25.00

20.00

15.00

10.00

Number of revertants per plate(max=48) 5.00

0.00 A-8-1 A-8-2 A-8-3 A-8-4 A-8-5 A-8-6 A-8-7 A-8-8 A-8-9 A-8-10 A-8-11 A-8-12 A-8-13 A-8-14

Figure 3- 16: Mutagenicity of the A-8 sub-fractions (phenyl-hexyl fractionation step).

3.4 Conclusions

The above results show that the three step fractionation method is efficient and reliable for recovering most of the mutagenicity of the BR extract and isolating the mutagens within a few fractions. Furthermore, information regarding the presence or absence of certain functional groups in the structure of the compounds involved are gained. This, could assist in the eventual identification of candidate structures.

54 Chapter 4

CHAPTER 4: DEVELOPMENT OF A HPLC-ESI/APCI-ORBITRAP METHOD AND A PROCEDURE TO IDENTIFY PLANAR POLYCYCLIC MUTAGENS IN RIVER WATER

4.1 Introduction

In EDA studies, samples are progressively fractionated until the toxic fractions are suitable for chemical analysis [31,130]. However, the identification of unknowns remains one of the most challenging steps, complicated by the fact that these toxic fractions remain complex even after extensive fractionation [22]. As the amount and purity of the compounds present are often not sufficient for nuclear magnetic resonance (NMR) or (IR) , mass spectrometry remains the method of choice for identification [95]. In many cases fractions are analysed by gas chromatography coupled to mass spectrometry (GC-MS) methods, due to the availability, ease of use, reproducibility of the method and the extensive databases available (e.g. NIST) [83]). The major drawback of GC-MS is that many compounds need to be derivatised prior to analysis [84,131]. Liquid chromatography-mass spectrometry (LC-MS) based methods are becoming increasingly popular for the structure elucidation of toxicants [132] as they are suitable for polar, thermally-unstable and higher molecular weight compounds. However, a major drawback of LC-MS/MS is the lack of spectral libraries for structure elucidation [133,134]. Thus, alternative approaches for structure elucidation are required and have to be integrated with computer tools, involving an identification strategy (see Chapter 2, Section 2.6). Furthermore, the success of identification relies as well on information gained during fractionation and biotests and the use of different strains of Salmonella could provide information regarding the presence of specific functional group. Indeed as explained in Chapter 2, Section 2.3.2, the YG strains such as YG 1024 and YG 1041, which are highly sensitive to nitroarene and/or aromatic amines can be used to confirm the presence or absence of amino- and nitro- functional groups. As the sampling site was located in an industrial area, where dyes and pharmaceuticals are produced, such amino- and nitro-compounds are expected to be present, thus the YG 1024 and YG 1041 were used to help during the identification process. The aim of this chapter is to compare the use of ESI and APCI to “target” a range of potentially mutagenic planar compounds and to develop a strategy to identify mutagenic compounds in two fractions (N-2-8 and N-7-12, see Chapter 3, Section 3.3.2). The first approach is target screening using a set of planar mutagens, comparing ESI and APCI to define ionisation and fragmentation rules connected to the presence of certain functional group, the second approach is a suspect-compound screening using exact mass filtering of compounds produced in the sampling site and the third approach is a non-target screening using fragmentation prediction and database search.

55 Chapter 4

4.2 Material and methods 4.2.1 Standards and samples

The solvents and the 47 standards compounds used are given in Chapter 3 (Section 3.2.1 and Table 3-1). The fractions N-2-8 and N-7-12 were collected according to the three step fractionation procedure described in Chapter 3. These two fractions were selected for further investigation because they showed the highest mutagenicity within the neutral sub-fractions (see Chapter 3, Section 3.2.3 Figure 3-10 for N-2-8 and Figure 3-11 for N- 7-12).

4.2.2 LC-MS system

The LC system used is described in Chapter 3 (Section 3.2.3.3). Standards and active fractions were separated on an analytical C18 reversed-phase column (LC-PAH, 250 x 2.1 mm, 5 µm particle size, Supelco, CA, USA), designed specifically for the analysis of the priority PAHs listed in US EPA Method 610 [135]. This phase is able to separate complex mixtures of polyaromatic compounds including isomers due to shape selectivity. A gradient elution was used, with water and methanol for negative ionisation or an ammonium acetate buffer (10 mM, pH 4) in water and methanol for positive ionisation. The samples were eluted at a flow of 0.2 mL/min with the following conditions: 0-50 min, 50-95% of methanol; 50-65 min, 95% of methanol, then re- equilibration of the phase. A volume of 5 µL was injected for the standards and 10 µL for the fractions. This LC system was coupled to an LTQ-Orbitrap hybrid instrument (Thermo Fisher Scientific, Bremen, Germany), described in Chapter 3 (Section 3.2.3.3). The mass spectrometer was equipped with an atmospheric pressure chemical ionisation (APCI) source or an electrospray ionisation (ESI) source and controlled by the Xcalibur software. MS spectra in the mass range of m/z 140-1400 were acquired in the Orbitrap at a resolving power of 60 000 (full width at half maximum, referenced to m/z 400). Tandem MS (MS2) experiments were conducted using collision-inducted dissociation (CID). The MS2 products were transferred to the Orbitrap for detection, such that high resolution (HR) fragments were obtained. Further details regarding MSn experiments for the active fractions are given below (Section 4.2.4). Calibration of the Orbitrap was performed prior to sample analysis. The deviation of the calibration masses from theoretical values was within the range -1 to 3 ppm. The deviation of the measured and exact mass for the 47 method development standards was always below 5 ppm. Thus, this value was used for candidate selection.

4.2.3 Peak detection and data processing

Full-scan HRMS chromatograms were processed using the free software MZmine [104,129], which performs a peak detection, calculates peak heights and areas and peak identification via a user-defined database search. The noise level was set to 1 × 103, the minimum peak height to 1 × 105, mass resolution to 60 000, m/z tolerance to 0.001 Da and minimum peak duration to 0.2 min.

56 Chapter 4

The target identification list contained all the standard compounds used during method development, including the retention time and exact mass in positive and negative ionisation mode. The suspect screening compound list contained the chemicals known to be produced at the sampling site and contained many dyes as well as pharmaceuticals. The original list of compounds obtained was complemented using additional information found in ChemSpider (e.g. empirical formula, exact mass, structure). Potential transformation products of the dyes were added, including the products of azo- and C-N bond cleavage of the dyes as described by Rehorek and Plum [136,137]. The masses of all compounds and potential degradation products were added to the suspect-compounds list.

MZmine uses csv files to compare target compounds with extracted masses, using two different approaches. The first, target identification, relies on matching both the exact mass and the retention time whereas the suspect-compounds screening identification relies only on matching the exact mass of compounds that could be present in the sample. The second strategy has a higher uncertainty, as only one parameter is used to identify a compound. A m/z tolerance of 0.003 Da and a retention time tolerance of 30 seconds were used for the blank search and the target search [95], while only the mass tolerance was applied for the suspect-compounds screening. MOLGEN-MSMS was used to determine the empirical formula of extracted m/z and to assign fragmentation losses [105], using the elements C, H, N, O, S, Si, P, Br, Cl, F, I, K and Na. The error margin used was 5 ppm in the formula and fragment assignment, while both ionisation products ([M]+ and [M+H]+ for positive mode, [M]- or [M-H]- for negative mode) were considered. MetFrag [106] was used to identify candidate structures based on the MS2 match. PubChem and ChemSpider searches were performed using the formula calculated with MOLGEN-MSMS, or using the exact mass where no clear matching formula was found. A search margin of 5 ppm was used for exact mass searches, while the Mzabs and Mzppm were set to 0.001 and 5, respectively. The fragmentation mode was set according to the MOLGEN-MSMS results and could include [M]+, [M+H]+, [M]- or [M-H]-.

4.2.4 Identification procedure of active fractions

The peaks to be investigated were selected from those detected by MZmine according to the following criteria: (i) absence in the blank [94] and in the non-mutagenic neighbouring fractions (ii) signal at least 100 times higher than noise (103) [94], (iii) mass ≥ 140 Da (BR adsorption limited to large planar molecules) and (iv) degree of unsaturation greater than 7 (planar compounds with at least 2 aromatic rings) The selected peaks were compared to the target list of standard compounds. The MS2 fragmentation patterns of all matches (exact mass and retention time) were investigated to support or reject the tentative identification.

57 Chapter 4

The selected peaks were then searched against the suspect-compounds list. Any matches obtained for the exact mass were checked against the physical-chemical properties (pKa and log Kow) to determine whether this compound was likely to end up in the fraction, as the fractionation steps included pKa and log Kow based-separations. pKa values were predicted using the SPARC online calculator [138] and the log Kow values using Kowwin from EPISuite [78] and ChemSpider database (ACD values) [139]. The range of log Kow of the fraction of interest was calculated from the polymeric C18 column fractionation (second analytical step) by plotting the log Kow versus the retention time of the analytical standards presented in Table 3-1, Chapter 3. Fraction N- 2-8 was collected between 10 and 15 min and fraction N-7-12 between 38 and 46 min, resulting in calculated log Kow ranges of 1.5–3.1 and 3.9–5.1, respectively. For candidate selection, fragmentation pattern was investigated for substructural information. This procedure for the suspect-compounds is summarised in Figure 4-1

m/z selection

m/z hit No m/z hit

Physico chemical properties fitting

Fit No fit Unknown identification

MSn comparison (observed & No fit predicted)

Standard confirmation

Figure 4- 1: Suspect-compounds screening identification procedure

Finally, in the case of unsuccessful match with the target and suspect-compounds screening, unknown identification was undertaken. This included the following steps: (1) Determination of the empirical formula. The MS full scan was run with a resolving power of 100 000 to obtain the most precise isotopic pattern for determination of the number of carbons (13C ratio), as well as the presence of sulphur, nitrogen, , chlorine and bromine (see e.g. [140,141]). Both Xcalibur and MOLGEN- MSMS were used to calculate the empirical formulas. (2) Structural information collection by exploiting the MS2 and MS3. Here, the fractions were run in data-dependent acquisition mode including tandem MS

58 Chapter 4 experiments using collision-inducted dissociation (CID) (MS2 and MS3) and the higher energy collision dissociation (HCD) cell (MS2). The MSn products were transferred to the Orbitrap for detection, such that high resolution fragments were obtained. (3) Candidate selection. The calculated empirical formula was used to generate candidates using MetFrag searching ChemSpider. The log Kow, pKa and fragment loss information was used in selection. The unknown identification procedure is presented in Figure 4-2. All candidates provided by target, suspect-compounds screening or unknown identification need to be confirmed by standards when available and/or affordable [22].

m/z selection

Database candidates with same empirical formula

Candidates fitting log Kow range assigned to the fraction

Candidates possessing the functional groups allowing neutral losses observed in MSn Candidates fitting pKa value allowing their presence in the neutral, basic or acidic fraction Candidate(s) matching the best with the predicted fragmentation pattern

Figure 4- 2: Strategy for the unknown identification

4.2.5 Mutagenicity assay

Fractions N-2-8 showed mutagenic effects when tested with the TA98 strain without and with activation. Additional testing was also performed with two other strains, YG 1024 and YG 1041, both without and with activation. These experiments were performed in collaboration with Dr. G. Reifferscheid at the Federal Institute for Hydrology (Bundesanstalt für Gewässerkunde), Koblenz, Germany. The protocol was as described by Perez et al. [124]. The tested concentration of the fractions ranged from 6.3 to 200 mg of BR equivalent/well. 4-Nitro-o-phenylenediamine (4.6 ng/well) and 2- aminoanthracene (0.77 ng/well) were used as positive control without and with activation respectively. The samples were incubated at 37ºC for 72 h (YG 1041) or 48 h (TA98 and YG 1024).

59 Chapter 4

4.3 Results and discussion 4.3.1 Ionisation of standard compounds by ESI and APCI sources

The ionisation source most suitable for a screening of compounds was selected based on the comparison of the ionisation modes APCI and ESI for standard compounds. Furthermore, the fragmentation pattern in positive and negative mode was studied to identify typical losses of each class of compounds to assist the detection of functional groups present in the molecular structure of unknown compounds. The detection limits, observed signals and fragmentation losses are presented in Table 4-1. Detection limits were determined by injecting 5 µL of standard solutions at different concentrations (from 1 ng/mL to 2 µg/mL). In APCI positive mode, [M+H]+ ions were observed for all standards except for 1,8-diaminopyrene, 1,3-diaminopyrene and 3-hydroxybenzo(a)pyrene, which were observed as [M]+ ions. In APCI negative mode, [M-H]- ions were observed for the azaarenes, hydroxy- and carboxylic acid-PAHs, while [M]- ions were observed for the keto-, nitro-keto-PAHs, quinones, hydroxyl-quinones and nitro-compounds. The detection of azaarenes in APCI positive mode was achieved by adding an ammonium acetate/ buffer (10 mmol/L at pH 4). In ESI positive mode, [M+H]+ ions were observed for all compounds, except for 3-hydroxybenzo(a)pyrene, for which both signals [M+H]+ and [M]+ were present. [M-H]- ions were observed for all the analytical standards in ESI negative mode. As mentioned above, chemicals ionisable in solution will be better detected in ESI [87]. This was confirmed for the azaarenes and amino-compounds, with detection limits up to five times lower in ESI compared with APCI. However, the hydroxy-quinones and hydroxy-PAHs, which are acidic in solution, were detected better with APCI. Furthermore, non polar compounds lacking a protonation site are difficult to ionise using ESI [142,143] and this was confirmed for the nitro-compounds, which do not have proton accepting or proton donating functional group and were not ionisable in ESI negative or positive mode. The MSn fragments of the standards were assigned using MOLGEN-MSMS to determine the typical losses associated with a compound group (e.g. nitro-, hydroxyl-, keto-PAHs) to identify similar substructures present in unknowns. The typical losses observed for APCI+ are: CHN (azaarenes), CO (keto-PAHs, quinones), NO (nitro-keto-

PAHs, nitro-PAHs), NH2 and NH3 (amino-PAHs) and H2O and CO (hydroxy-PAHs). Generally, compounds observed in APCI- did not generate many fragments and the molecular or protonated ion ([M]- or [M-H]-) was often the most abundant ion, confirming the observations of Lupton et al. [144] for polybrominated diphenyl ethers.

Typical neutral losses for ESI+ were: CHN and/or CH2N (azaarenes), CO and CHO (keto-PAHs), CO (quinones), NH2 and NH3 (amino-PAHs) and OH and H2O (hydroxyl- PAHs). As a larger range of compound groups (particularly nitro compounds) could be ionised in APCI, this technique was used to identify unknowns of interest in original sample screening. However, the information from ESI was used to confirm the presence or absence of certain functional groups in the molecular structure of the unknown compounds according to the criteria presented in Table 4-2.

60

Table 4- 1: Ionisation of standard compounds in APCI and ESI, limits of detection (DL), and neutral losses observed during CID fragmentation. nd = non determined (no signal or > 1000 pg). Mass APCI Neutral losses Neutral losses Neutral losses ESI DL Neutral losses recorded Name error in DL recorded in APCI recorded in ESI recorded in ESI pg (-/+) in APCI positive mode ppm (-/+) pg (-/+) negative mode positive mode negative mode Azaarenes acridine -/-2.4 nd/50 nd/10 M+H no signal M+H no signal no fragments CH2N 2-amino-3,8- -2.6/-2.0 nd/25 nd/5 M+H M-H M+H M-H dimethylimidazo[4,5-f] C3H4N2, CH2N2, C2H3N, CH3 C2H3N3, C3H4N2, no fragments quinoxaline CHN, H3N, CH3 CH2N2, C2H3N, CHN, H3N, H2N, CH3 2-amino-3- -3.4/-2.2 nd/25 nd/5 M+H M-H M+H M-H methylimidazo[4,5-f] C3H4N2, CH2N2, C2H3N, CH3 C2H3N3, C3H4N2, no fragments quinoline CHN, H3N, CH3 CH2N2, C2H3N, CHN, H3N, H2N, CH3 2-amino-1-methyl-6- -2.8/-1.6 nd/50 nd/5 M+H M-H M+H no signal phenylimidazo[4,5- CHN, H3N, C2H3N CHN, H3N b]pyridine 2-amino-9H-pyrido[2,3- -2.1/-1.9 nd/25 nd/5 M+H M-H M+H M-H b]indole C3H4N2, C2H3N, CHN, no fragments C3H4N2, C2H3N, no fragments H3N, CH4, CH3 CHN, H3N, CH4, CH3 benz(c)acridine -/-1.3 nd/25 nd/5 M+H M-H M+H no signal C3H7N3, C3H5N3, C3H4N3, CH3 C3H7N3, C6H6, C6H6, C2H5N3, C3H4N3, C3H4N2, C2H6N2, C2H6N2, C2H5N2, CH4N2, CH4N2, CH2N2, CH2N2, C2H3N, CHN, C2H3N, CHN, H3N, H3N, CH3 CH3 benzo(h)quinoline -/-2.3 nd/50 nd/10 M+H no signal M+H no signal CHN CH3N, CH2N, CHN benzotriazole -0.9/-4.1 50/50 50/50 M+H M-H M+H M-H N2 no fragments CHN3, N2 no fragments carbazole -2.5/-2.4 50/50 nd/500 M+H M-H M+H no signal no fragments no fragments no fragments

Mass APCI Neutral losses Neutral losses Neutral losses ESI DL Neutral losses recorded Name error in DL recorded in APCI recorded in ESI recorded in ESI pg (-/+) in APCI positive mode ppm (-/+) pg (-/+) negative mode positive mode negative mode 1-methyl-9H-pyrido[3,4- -2.8/-2.4 nd/25 nd/5 M+H M-H M+H M-H b]indole C3H4N2, C2H6N2, C2H2N2, CH3 C3H4N2, C2H6N2, no fragments C3H3N, CH4N2, CH2N2, C2H2N2, CH2N2, C2H3N, CH3N, CHN, C2H3N, CH3N, C2H2, H3N, H2N, CH3 CHN, H3N, H2N, CH3 Keto-PAHs benzanthrone -3.5/-1.4 50/50 nd/10 M+H M- M+H no signal CHO, CO no fragments CHO, CO 4Hcyclopenta [def] -3.4/-2.2 50/50 nd/250 M+H M- M+H no signal phenanthrene-4-one C3H2O, CHO, CO no fragments CO 9-fluorenone -2.8/-1.5 50/50 nd/100 M+H M- M+H no signal CHO, CO no fragments CHO, CO Ketone benzophenone -/-1.9 nd/25 nd/25 M+H no signal M+H no signal C7H6O, C6H6 C7H6O, C6H6 Nitro-keto-PAHs 2-nitro-9-fluorenone -3.46/-2.2 75/nd nd/nd M+H M- no signal no signal CNO2, NO2, NO, OH NO 3-nitrobenzanthrone -4.0/-0.7 50/25 nd/500 M+H M- (no M+H signal) no signal C3H5O3, C2HNO3, CNO2, NO, CHO, CO CNO2, NO2, NO, C2O2, NO2, NO, HO, N HO Quinones 1,4-chrysenequinone -3.9/-0.9 50/400 nd/1000 M+H M- (no M+H signal) no signal No fragments no fragments C3H2O2, C2O2, CO2, C2H2O, CO, H2O 5,6-chrysenequinone -3.9/-1.1 50/500 nd/1000 M+H M- M+H no signal C2HO2, CO no fragments C2HO2, CO 1-Hisoindole- -2.1/-3.2 50/nd 500/nd M+H M-H (no M+H signal) M-H 1,3(2H)dione H2O no fragments H2O no fragments naphthalene-1,4-dione -1.9/- 100/nd nd/nd M+H M- no signal no signal CO no fragments

Mass APCI Neutral losses Neutral losses Neutral losses ESI DL Neutral losses recorded Name error in DL recorded in APCI recorded in ESI recorded in ESI pg (-/+) in APCI positive mode ppm (-/+) pg (-/+) negative mode positive mode negative mode phenanthrene-9,10-dione -3.4/-2.0 50/50 nd/nd M+H M- no signal no signal C4H2O2, C2HO2, C2O2, no fragments CO OH-quinones 1,5- -3.75/- 50/nd nd/nd M+H M- M+H no signal dihydroxyanthraquinone C7H4O2, C3O4, C2O4, C3O3, no fragments C8H4O4, C7H4O2, C2H2O3, C2O3, C2O2, C3O4, C3O3, C2O3, CH2O2, CO2, C2H2O, CO, C2O2, CH2O2, CO2, H2O C2H2O, CO, H2O 1-hydroxyanthraquinone -4.0/-1.9 50/50 nd/50 M+H M- M+H M-H C8H4O3, C7H4O2, C3O3, no fragments C7H4O2, C3O3, CO C2H2O3, C2HO3, C2O3, C2HO3, C2O3, C2O2, C2O2, CH2O2, CO2, CO2, C2H2O, CO, C2H2O, CO, H2O H2O 2-hydroxyanthraquinone -3.6/-1.6 50/75 nd/50 M+H M- M+H M-H C8H4O3, C7H4O2, C3O3, no fragments C7H4O2, C3O3, CO C2H2O3, C2HO3, C2O3, C2O3, C2O2, C2H2O, C2O2, CH2O2, CO2, CO C2H2O, CO, H2O Amino-compounds 27) atrazine -/-0.6 nd/25 nd/5 M+H no signal M+H no signal C4H9N2Cl, C6H13N2Cl, C6H12N2, C4H8N2, C7H11N3, CH3N2Cl, C3H6N2, C3H6, C4H9N2Cl, C6H12N2, HCl, C2H4 C4H8N2, CH3N2Cl, C5H10, C3H6N2, C3H6, HCl, C2H4 46) 3,3'- -/-0.7 nd/50 nd/500 M+ no signal M+H no signal dichlorobenzidine HCl2, Cl2, CHNCl, H3NCl, H2Cl2, HCl2, HCl, H2NCl, Cl, H2N Cl, H

Mass APCI Neutral losses Neutral losses Neutral losses ESI DL Neutral losses recorded Name error in DL recorded in APCI recorded in ESI recorded in ESI pg (-/+) in APCI positive mode ppm (-/+) pg (-/+) negative mode positive mode negative mode 1,8-diaminopyrene -/-1.3 nd/25 nd/10 M+ no signal M+H no signal CH3N2, HN2, CH2N, CH2N, CHN, H3N CHN, H3N, H2N, HN, H N-phenylnaphthalen-2- -/-0.9 nd/10 nd/5 M+H no signal M+H no signal amine C10H8, C9H8, C7H7N, C10H8, C9H8, C7H5N, C6H6N, C6H6, C7H7N, C7H5N, C6H5, C2H4N, C2H3N, C6H6N, C6H6, C6H5, C2H5, CHN, H4N, H3N, C2H4N, C2H3N, CH4, CH3, H3,H2 C2H5, CHN, H4N, H3N, CH4, CH3, H3,H2 Amides benalaxyl -/-0.6 nd/25 nd/5 M+H no signal no signal no signal C10H10O3, C8H6O, C2H4O2, CH4O 28) carbamazepine -/-1.7 nd/50 nd/5 M+H no signal (no M+H signal) no signal CH3NO, CHNO, H3N CH3NO, CH2NO, CHNO, H3N N,N’-dimethyl-N,N’- -/-1.4 nd/10 nd/5 M+H no signal M+HC9H12N2O, no signal diphenylurea C8H9NO, C7H9N C9H10N2O, C8H9NO, C8H12N, C7H9N metazachlor -/-1.25 nd/50 nd/5 no signal no signal (no M+H signal) no signal C5H5N2OCl,C3H4N2 NO2-compounds 1- -3.5/- 50/nd nd/nd no signal M- no signal no signal chloro2,4dinitrobenzene NO2, NO 1,3-dinitrobenzene -3.0/- 50/nd nd/nd no signal M- no signal no signal NO 1,8-dinitropyrene -/-1.1 50/nd nd/nd M+H M- no signal no signal C2NO4, HN2O4,HN2O3, NO2, NO N2O3, CNO3, HNO3, N2O2, CNO2, HNO2, NO2, NO, HO

Mass APCI Neutral losses Neutral losses Neutral losses ESI DL Neutral losses recorded Name error in DL recorded in APCI recorded in ESI recorded in ESI pg (-/+) in APCI positive mode ppm (-/+) pg (-/+) negative mode positive mode negative mode 2,4-dinitrotoluene -3.3/- 50/nd nd/nd no signal M- no signal no signal CH2NO3, CNO2, H2NO2, CH3O2, HNO2, CH2O2, CHO2, NO, CHO, H2O, HO 4-methyl-2-nitrophenol -2.0/- 50/nd nd/nd no signal M- no signal no signal no fragments nitrofen -3.9/- 50/nd nd/nd no signal M+ no signal no signal C6H2Cl2, C6H3Cl, Cl, NO 4-nitroquinoline-1-oxide -3.2/-2.0 50/50 nd/75 M+H M- M+H no signal C3H2N2O3, C2HN2O3, CNO3, C2H2NO2, C3H2N2O3, C3H2NO3, C2HNO3, CNO2, H2NO2, C2HN2O3, C2NO3, CHNO3, CNO3, NO2, NO, HO C3H2NO3, C2HNO3, CHNO2, CNO2, HNO2, C2NO3, CHNO3, NO2, HNO, NO, HO CNO3, CHNO2, CNO2, O3, HNO2, NO2, HNO, NO 1-nitropyrene -3.6/-1.3 25/25 nd/nd M+H M- no signal no signal HNO2, NO2, NO, OH NO 2-nitrotoluene -1.5/- 50/nd nd/nd no signal M- no signal no signal no fragments trinitrotoluene -4.0/- 125/nd nd/nd no signal M- no signal no signal C2H2NO4, CHNO4, N3O3, CH2NO3, CHNO3, H2NO3, CH3O3, CH2NO2, N2O2, CHNO2, CNO2, H2NO2, HNO2, CH2O2, CHO2, CH2O, NO, CHO, HO

Mass APCI Neutral losses Neutral losses Neutral losses ESI DL Neutral losses recorded Name error in DL recorded in APCI recorded in ESI recorded in ESI pg (-/+) in APCI positive mode ppm (-/+) pg (-/+) negative mode positive mode negative mode OH-PAHs 3-hydroxybenzo -3.2/-2.1 50/100 nd/750 M+ M-H M+ & M+H M-H (a)pyrene CHO, CO, H2O, HO, O, H no fragments CHO, CO, HO, O no fragments 1-hydroxypyrene -3.7/-1.5 125/75 50/nd M+H M-H no signal M-H (& M-) C10H6O, CHO, CO, H2O, no fragments CO HO, H 1-naphthol -2.1/-3.4 50/500 50/nd M+H M-H no signal M-H CO, H2O no fragments no fragments COOH-PAHs 2-naphthoic acid -2.9/-2.5 100/- 50/- M+H M-H M+H M-H H2O, CO, CO2 CO2 H2O, CO, CO2 CO, CO2 Purine-dione caffeine -/-2.3 nd/25 nd/50 M+H no signal M+H no signal C5H6N2O2, C3H3NO2, C5H6N2O2, C2H3NO2, C2HNO2, C4H4N2O2, CH3NO2, C2H3NO, CO2, C3H3NO2, CH4O, CH3, CH2 C3H4N2O, C2H3NO, CO2, CH4O

Table 4- 2: Ionisation rules for the presence of functional group in ESI and APCI +/-. +: ionisable/detected, -: non ionisable/not detected, I+: ionisable and stronger signal associated to one ionisation source vs. the other source and nd: dependant of the compounds Chemical classes ESI + ESI - APCI + APCI - Azaarenes I+ - + + Keto-PAHs + - I+ I+ Nitro-keto-PAHs - - I+ nd Quinones - - + I+ Hydroxy-quinones + + + I+ Amino compounds I+ - + - Amides I+ - + - Nitro compounds - - nd I+ Hydroxy-PAHs + + + I+

Chapter 4

4.3.2 Target and suspect compounds screening in fractions N-2-8 and N-7-12

The mutagenic fractions N-2-8 and N-7-12 were run using the APCI source, along with the non-mutagenic neighbouring fractions to eliminate possible overlapping non-mutagenic peaks. No hits were recorded using the target screening described above (Section 4.2.4) The suspect-compounds screening was run on both mutagenic fractions, with one hit in the N-7-12 fraction. A peak with m/z of 341.1247 Da and a retention time of 47.0 minutes was extracted using MZmine and identified by mass in the suspect- compounds screening list as Pigment Yellow 1, shown Figure 4-3.

H3C NH O - O + O NN N O

CH3 Figure 4- 3: Structure of Pigment Yellow 1 [2512-29-0]

The predicted pKa of 10.7 (SPARC) indicates that this compound could be TM present in the neutral and acidic fractions, while the log Kow (EPISuite ) was 3.9, within the defined range of 2.9 – 6.1 (including the error margin). The MS2 fragments of the suspected Pigment Yellow 1 are shown in Figure 4-4 a. Pigment Yellow 1 was purchased from Tokyo Chemical Inc. Europe for confirmation and analysed as for the sample, using APCI in positive mode. The standard had a mass of 341.1258 Da at 46.9 minutes with matching MS2 fragments (see Figure 4-4 b for the proposed substructures). Thus, the Pigment Yellow 1 was identified as present in fraction N-7- 12. Testing of this chemical with the Ames fluctuation assay (with and without S9), however, revealed no mutagenic effects despite the presence of azo and nitro groups [63]. Thus, this compound cannot be considered responsible for the mutagenicity in this fraction. It is suspected that the lack of mutagenic activity could result from the lack of access to the functional groups, such that the azo group cannot be metabolised with the bulky substituents (only a metabolised product is mutagenic), while the nitro group may be sterically hindered.

67 Chapter 4

248.0668 100 a) 90

80

70

60

50

40 Relative Abundance 30 150.0661 178.0611 20

10 220.0718 266.0770 323.1143 130.7827 162.5951 197.7279 94.0649 238.4487 280.0933 299.9018 340.9434 0 100 150 200 250 300 350 m/z

248.0683 100 H C - 3 O H C b) + + 3 + NN N O O 90 - O O + 80 H + NN N O + N N NO CH 70 3 H H3C + 60 O - CH3 O + 50 CH3 NN N O

40 150.0670

Relative Abundance 178.0621 30 CH3 20 323.1161 10 135.0559 220.0731 266.0790 120.0448 280.0947 94.0653 197.9103 308.0921 346.7660 0 100 150 200 250 300 350 m/z Figure 4- 4: Mass spectrum (MS2) of Pigment Yellow 1 a) Suspected Pigment Yellow in the sample b) Pigment Yellow standard.

68 Chapter 4

4.3.3 Unknown identification in fraction N-2-8

The fraction N-2-8 was analysed by APCI positive mode (APCI +) at a resolution power of 100 000 in order to have the most accurate masses of the chemicals to determine their empirical formula. Using MZmine software, 159 m/z ions were detected (m/z ≥ 140 Da and intensity ≥ 105). After comparison with a blank and non mutagenic neighbouring fractions, 17 peaks (m/z ions) of interest were left and are presented in Table 4-3. The full scan chromatogram with the extracted peaks is presented in Figure 4-5. We can see that the compounds of interest for identification are eluting between 13 and 25 minutes. This is consistent with the fractionation procedure described in chapter 3, which included a polymeric C18 reversed-phase column. The fraction N-2-8 was collected between 10 and 15 minutes on this column. For the LC-MS/MS analysis, another polymeric C18 reversed-phase column was used, so it is expected that the unknowns contained in the fraction N-2-8 are still eluting together. In the second part of the chromatogram, only peaks belonging to the blank and non mutagenic neighbouring fraction where found. The blank consisted of 10 g of clean and non exposed blue rayon, which was extracted and fractionated as described in Chapter 3 for the blue rayon sample. Thus, our interest focused on the first part of the chromatogram.

Table 4- 3: peaks of interest contained in the fraction N-2-8 Peak identity m/z Retention time (min) Peak_3 259.0528 16.2 Peak_19 168.0801 24.2 Peak_35 225.0540 20.5 Peak_36 293.1084 20.0 Peak_40 233.0963 18.5 Peak_50 204.1012 13.2 Peak_60 377.1598 20.2 Peak_99 273.1226 19.4 Peak_109 250.0855 16.5 Peak_138 330.1438 17.2 Peak_143 302.0804 14.5 Peak_208 323.0694 25.3 Peak_212 276.1020 13.5 Peak_213 241.1329 20.4 Peak_228 326.0802 17.0 Peak_233 360.1545 18.4 Peak_242 185.1066 16.6

69

Figure 4- 5: Full scan chromatogram with the peaks of interest

Chapter 4

4.3.3.1 Determination of the empirical formula

The high resolution isotopic pattern of the selected m/z was used to determine the elements composing the formula. The atoms considered were C, H, N, O, S, Si, P, Br, Cl, F, I, K and Na. The isotopic composition of these atoms (except for phosphorus, fluorine and iodine which are monoisotopic elements) are presented in Table 4-4. Sodium is also monoisotopic but as it is able to form adduct, the formed adduct [M+Na]+ was also considered. When the elemental composition is determined, the relative intensity of the isotope peak is used to calculate the number of each element contained in the empirical formula. This procedure was done automatically by MOLGEN-MSMS [105] and formulas were proposed. In case of several fitting formulas, selection of the “best” empirical formula was made with the following criteria: (i) number of carbons close to the calculated number, (ii) the degree of unsaturation at least 7 and (iii) a mass error (difference between the theoretical mass and the mass detected by the Orbitrap) of less than 5 ppm. As seen in Table 4-1, we observed negative error on the masses detected by the machine with all the standards. Thus, the experimental masses were always lower than the theoretical masses. Furthermore these error were within 5 ppm. Thus, negative errors are considered more likely than positive errors. Furthermore, ion such as diisooctyl phthalate and dimethylphthalate are common background ions and can be used to estimate the mass error. The contaminant ion diisooctyl phthalate (391.2843 Da) was found with -3.57 ppm error and the dimethylphthalate (195.0657 Da) with -3.4 ppm error. To eliminate fitting formulas to the isotopic pattern an error of about -3.4 ppm was considered.

Table 4- 4: Isotopic composition

Atom or adduct [Misotop- Mmost abundant isotop] Relative abundancy (%) (Da) 13C 1.0034 1.02 2H 1.0063 0.01 15N 0.9970 0.37 17O 1.0042 0.04 18O 2.0042 0.20 33S 0.9994 0.75 34S 1.9958 4.21 36S 3.9950 0.02 29Si 0.9996 4.35 30Si 1.9968 3.26 81Br 1.9980 98 37Cl 1.9970 32 40K 1.0003 0.01 41K 1.9981 6.45 CNa instead of CH 21.9819 100

71 Chapter 4

The determination of the empirical formula for 17 peaks is presented below. In this part we search the mass spectra for some of the isotope peaks presented in Table 4- 4.

Peak 3: The mass spectrum is presented in Figure 4-6. A small peaak at 261.0487 Da wass detected with a relative abundancy of about 4. This peak corressponds to the presence of the isotope 34S (+1.0058 Da as seen in Table 4-4). Furthermore the 13C isotope peak was detected with a relatiive abundance of 13 (+1.0034 Da). Thus, the chemical corresponding to the m/z 259.0528 is expected to have aboutt 13 atoms of carbon and one atom of sulphur in its empirical formula. The different formulas are presented in Table 4-5. We know that in the empirical formula sulphur is present, therefore we can remove from the list all formulas which do not contain this element (formulas 4, 6, 7, 8, 9, 10, 12 and 13). Furthermore some formulas have a low number of carbon in comparison of the estimated one which is around 13 carbons and can also be removed from the list ( formulas 1, 2, 3, 15, 16 and 17). Now for the thrree remaining possible empirical formulas, two of them have a positive mass error, so we can also remove them (formula 5 and 14). Thus, only one formula remains, C13H10O2N2S and will be used for the identification procedure.

Figure 4- 6: Mass spectrum of the m/z 259.0528 with the isotope peaks and their relaattive abundances

72 Chapter 4

Table 4- 5: Proposed empirical formulas for peak 3 Formula Degree of unsaturation Mass error (ppm) 1) C3H15ON7P2S 1 -0.21 2) C5H14O4N4S2 1 -0.47 3) C7H11ON6PS 6 0.99 4) C6H17O3N2P3 1 1.24 5) C11H9ON5S 10 2.19 6) C5H6O5N8 7 -2.28 7) C6H13O10N 1 -2.31 8) C10H15O3NP2 5 2.44 9) C7H19N2PS3 0 2.80 10) C4H10O9N4 2 2.88 11) C13H10O2N2S 10 -2.99 12) CH10O5N9P 2 -3.48 13) C14H11O3P 10 3.64 14) C11H17NS3 4 4.00 15) C9H14O2N3PS 5 -4.19 16) C3H13O3N7S2 1 4.71 17) CH13N10P2S 2 4.98

Peak 19: The mass spectrum is presented in Figure 4-7. Only the 13C isotope peak was detected with a relative abundance of 12 (+1.0034). Thus, the formula is expected to possess about 12 carbon atoms . Only two formulas were proposed, (i)

C4H14O2N3S with no degree of unsaturation and an error of -0.08, and (ii) C12H9N with 9 degrees of unsaturation and an error of -3.78 ppm. The first formula does not fit the number of estimated carbon, the isotopic pattern and the mass error. Thus, the C12H9N is the formula we will use for the identification procedure. This formula is also the formula of one of the standards used in Section 4.3.1 of this chapter, carbazole which have a mass of 168.0808 Da and a retention time of 28.1 min. The detected peak is an isomer of carbazole.

73 Chapter 4

Figure 4- 7: Mass spectrum of the m/z 168.0801 with the isotope peaks and their relaattive abundances

As the determination of the empirical formulas for peaks 35, 36, 40, 50, 60, 99, 109, 1388, 143, 212, 213, 228, 233 and 242 is similar to the determination of the formula for peaks 3 and 19, the “best “ empirical formulas are presented in Tablle 4-6 and the MS spectra are presented in Annex I.

74 Chapter 4

Table 4- 6: Determination of empirical formulas for remaining peaks Peak Isotope Estimated Proposed Degree of Mass Formula peak number Formulas fitting unsaturation error selected of carbons the best the (ppm) isotopic pattern 13 C14H8O3 11 -2.76 35 C 14 C14H8O3 C12H7O2N3 11 3.21 13 C21H12N2 16 2.51 36 C 20 C19H17OP C19H17OP 11 -3.13 C17H12O 12 -2.32 13 40 C 20 C15H11N3 12 3.44 C17H12O C14H13FO2 8 -3.42 13 C10H12ON4 7 3.12 50 C 13 C12H13NO2 C12H13NO2 7 -3.46 C20H24O7 9 -0.51 C19H19N7O2 14 0.52 13 60 C 23 C20H30NP3 8 2.94 C21H20N4O3 C21H20N4O3 14 -3.04 C24H26P2 13 3.76 13 C15H16N2O3 9 -2.82 99 C 16 C15H16N2O3 C13H15N5O2 9 2.17 13 C14H10N4O 12 2.34 109 C 16 C16H11NO2 C16H11NO2 12 -3.02 13 138 C 18 C17H19NO4 10 -3.04 C17H19NO4 13 C17H10N4O2 15 2.03 143 C 21 C19H11NO3 C19H11NO3 15 -2.42 C H N O 18 -1.74 C H N O 208 13C 20 20 9 3 2 20 9 3 2 C22H10O3 18 -2.42 C22H10O3 13 C16H12N4O 13 2.20 212 C 18 C18H13NO2 C18H13NO2 13 -2.67 13 213 C 16 C15H16N2O 9 -3.07 C15H16N2O C19H10N4O2 17 0.78 13 228 C 22 C21H11NO3 17 -3.34 C21H11NO3 C18H14O6 12 4.90 C H N 15 -2.70 C H N 233 13C 18 17 16 10 17 16 10 C18H21N3O5 10 -2.71 C18H21N3O5 13 242 C 11 C12H12N2 8 -3.86 C12H12N2

75 Chapter 4

4.3.3.2 From the empirical formula to the candidate(s)

In this section, candidates are selected from the databases ChemSpider and

Pubchem. This selection was based on the match of each candidate to the log Kow range defined for the fraction N-2-8 (0.5 – 4.2 including the error range) and pKa values. The candidates were selected from the databases using MetFrag, which performs an empirical formula search (or exact mass search) in these databases. A list of candidates is provided, ranked according to their fragmentation according to the bond disconnection approach (instead of using cleavage rules). These “theoretical” fragments are then compared with the experimental fragments of the unknown compounds. The candidates are ranked based on the m/z and intensity of the fragments, bond dissociation energy and neutral loss comparison between “theoretical” and experimental fragments [106]. Unlikely candidates are easily removed from the list, as they have a score close or equal to 0. The closer to 1 the score is, the better, especially if many of the fragments are predicted. Thus, the selection is based on the score and the number of predicted fragments.

Peak 3: This peak of formula C13H10O2N2S (m/z 259.0528 Da), eluting at 16.2 min was detected in APCI + and ESI +. The APCI + MS2 spectrum of the unknown is presented in Figure 4-8a (CID fragmentation) and Figure 4-8b (HCD fragmentation). Nine fragments in total were found (encircled in Figures 4-8a and 4-8b, including the parent ion). The neutral losses identified by MOLGEN-MSMS were: CH4N2, C2H4N2O and C2H4N2O2. Using the ChemSpider database, 293 chemicals having C13H10N2O2S as an empirical formula and within the determined log Kow range were found. This number of candidates was decreased to four using MetFrag. These top match candidates are presented in Table 4-7 with their structure, pKa, log Kow, MetFrag score and number of predicted fragment. For the remaining 289 compounds, only one or two fragments were predicted and they had a very low score (< 0.263). The corresponding predicted MS2 spectrum of the top match candidates are presented in Figures 4-9 to 4-12.

76 Chapter 4

a)

b)

Fiigure 4- 8: MS2 spectrum of unknown ccompound Peak 3 a) obtained by CID and b) obtained by HCD. Fragments of interest are encircled

77 Chapter 4

Table 4- 7: Top match candidates for Peak 3. na: pKa out of range 0-14

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 2-amino-5-(1,3-benzodioxol-5- yl)-4-methylthiophene-3-carbonitrile (ChemSpider ID 291445) N 1 5 1.8/1.4 na CH3

H2N S O

O (2) 2-amino-4-(2,3-dihydro-1,4- benzodioxin-6-yl)thiophene-3- carbonitrile (ChemSpider ID 4204525) 0.939 4 1.8/1.4 na N O

O H2N S (3) methyl 3-amino-5-(4- cyanophenyl)thiophene-2-carboxylate (Chemspider ID 10167241) H3C O 0.81 4 3.2/2.0 na

O S N H2N (4) 2-amino-5-(2H-chromen-3- ylmethylidene) -1,3-thiazol-4(5H)- one (ChemSpider ID 3386684) 3.9 0.778 3 1.9/1.7 + O N O NArH ↔ NAr

H2N S

78 Chapter 4

Fiigure 4- 9: Predicted fragmments for candidate 1

Fiigure 4- 10: Predicted fragments for candidate 2

79 Chapter 4

Figure 4- 11: Predicted fragments for candidate 3

Figure 4- 12: Predicted fragments for candidate 4

80 Chapter 4

Candidate 1 is better ranked than the other candidates due to the greater number of predicted fragments but also due to the “ease” of fragmentation. Indeed this candidate lost functional groups bounded to the structure skeleton. In all four cases the most intense fragment (m/z 216) was always assigned (loss of CH4N2). For candidates 1, 2 and 3 this loss corresponded to the loss of two functional groups (NH3 and HCN). The amino group loss was commonly observed with the standards (e.g. 2-amino-3- methylimidazo[4,5-f]quinoline, 2-amino-3,8-dimethylimidazo[4,5-f]quinoxaline, 2- amino-9H-pyrido[2,3-b]indole), supporting more these three candidates than candidate

4. Furthermore, for candidates 1, 2 and 3 the pKa values allow them to elute completely in the neutral fraction, as opposed to candidate 4, which would be mainly positively charged at pH 2 and thus would mostly elute in the basic fraction. The unknown was not detected in the basic fraction, thus the compound 4 (2-amino-5-(2H-chromen-3- ylmethylidene)-1,3-thiazol-4(5H)-one) can be removed from the list of candidates. Furthermore the unknown compound was also observed in ESI + mode (higher intensity in ESI than APCI), supporting the amino substituent present in the structure of the three remaining candidates (1, 2 and 3) (see Table 4-2). No information regarding the use of these chemicals were found on Google and ChemSpider, and only candidate 2 is available as standard.

Peak 19: This peak of formula C12H9N (m/z 168.0801 Da), eluting at 24.1 min was detected in APCI + only. Its MS2 spectrum obtained by APCI + revealed only one fragment, corresponding to the loss of a hydrogen atom (Figure 4-13), similar to carbazole (m/z 168.0808, empirical formula C12H9N and retention time of 28.1 min). The search performed with ChemSpider provided 103 hits, with only 58 fitting the log

Kow range (0.5 – 4.2 including the error range). As only the parent ion was present in the MS2 spectrum, MetFrag could not be used to reduce the number of candidates. However, the fragmentation behaviour of the standards presented in Table 4-1 was used instead. Indeed the 58 hits presented two types of structure, one for which fragment(s) were likely and one for which fragment(s) were less likely. Some examples of these structures are presented in Table 4-8. As seen in Table 4-1, substitutents on aromatic rings are easily lost during fragmentation (e.g. methyl group: 2-amino-3- methylimidazo[4,5-f]quinoline or 1-methyl-9H-pyrido[3,4-b]indole, amino group: N- phenylnaphthalen-2-amine or 2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine or 1,8- diaminopyrene). Furthermore, this unknown was only observed in APCI + and not in ESI mode, excluding the presence of an amino group. Using these expected fragmentation and ionisation behaviour the number of hits was decreased to 32 compounds.

81 Chapter 4

Figure 4- 13: MS2 spectrum of unknown compound Peak 19 obtained by CID

Table 4- 8: Example ChemSpider hits for Peak 19, indicatingn expected frf agmentation behaviour Structure 1 Structure 2 Fragmentation unlikely H N N

Structure 3 Structure 4 Structure 5 Structure 6 Fragmentation N expected

NH2 N N CH3

In the remaining possible structures (Table 4-8) the main diffeference is the number of hydrogen bound to the nitrogen atom. If we can determine how many of the hydrogen are exchangeable then it will be possible to further decrease the number of candidates. One approach already used in the past was the use of hydrogen/ exchange reaction for counting the active hydrogen (e.g. hydrogen attached to heteroatoms such as O, N and S) [145]. This was done by post column enrichment of the mobile phase with deuterated water (D2O) [146], leading to the deviatiion of the m/z in the MS spectrum of 1.0058 for one hydrogen exchanged, 2.0116 for two exchanged hydrogen. These experiments were run for the unkn nown and for the carbazole and showed that signal intensities of the deuterated ions of the unknown in APCI + was very

82 Chapter 4 similar to carbazole, even if the carbazole can exchange four while the unknown exchanges only three hydrogens [146]. Thus, the type of structure expected is similar to carbazole, structure 1, which also explains the loss of a hydrogen atom. Seven compounds in the ChemSpider database are still possible. The number of exchangeable hydrogen atoms bound to nitrogen is two. However, in addition, the two hydrogen atoms in the peri position may be partly exchanged [147]. The retention time of the carbazole was 28.1 min while the unknown’s was 24.2. According to Bataneih et al. [95], the decrease of the number of peri- or bay-region hydrogen leads to a decrease in the retention time of the chemical. Thus, our unknown compound should possess one peri- or one bay-region hydrogen instead of two peri-hydrogens. Only four candidates remain with these properties and are presented in Table 4-9 (structure, pKa, log Kow). Only the 1h-benz[g]indole was available as standard (Sigma Aldrich) and was purchased for confirmation. This compound showed the same loss of hydrogen, as the unknown compound (Figure 4-14), but had a different retention time (22.6 min,instead of 24.2 min). Within the remaining candidates, 3H-benz[e]indole is used in the synthesis of pharmaceuticals, and some pharmaceutical industries are present in the sampling area, while the 1H-azuleno[1,2-b]pyrrole could be involved in the synthesis of squarilium compounds or liquid crystal and no information was available for the 1H- benz[f]indole. The remaining candidates are not available as standards and would need to be synthesised for confirmation. However, as 1h-benz[g]indole and carbazole are not mutagenic, the mutagenicity observed in this fraction N-2-8 is not likely to be connected to this unknown compound.

Table 4- 9: Top Candidates for Peak 19. na: pKa out of range 0-14

Structure log Kow pKa information ACD/EPI SUITE (1) 3H-benz[e]indole (ChemSpider ID 9519149) Mainly used in the synthesis of anti-tumour 3.4/3.2 na pharmaceuticals (8 patents) [139] N H (2) 1H-benz[g]indole (ChemSpider ID 89061), Used in the synthesis of different H 3.4/3.2 na pharmaceuticals (4 N patents) [139] Incorrect retention time

(3) 1H-benz[f]indole (Chemspider ID 10496645) H No information 3.7/3.2 na N available

(4) 1H-azuleno[1,2-b]pyrrole (ChemSpider ID 14524926) H Use in the synthesis of N 3.7/3.2 na squarilium compounds (1 patent) [139]

83 Chapter 4

Figure 4- 14: MS2 spectrum of standard 1H-beenz[g]indole obtained by CID

Peak 35: This peak of formula C14H8O3 (m/z 225.0540), eluting at 20.5 min was detected in APCI +/- and ESI -. The APCI + MS2 spectrum of the unknown was obtained from CID and HCD fragmentations, and three fragments were observed and are presented in Table 4-10 with the corrresponding loss assigned by MOLGGEN-MSMS. For peak at 100.1116 Da no assignment was possible. This peak could come from co- eluting peaks at 225.1479 or 225.1955, which are also more intense than our unknown peak. Using ChemSpider, 44 chemicalls with the formula C14H8O3 were proposed. Metfrag identified three chemicals with a score greater than 0.956, explaining the two assigned fragments, and also identified three other chemicals with a score between 0.77 and 0.956 explaining only one of the assigned fragment. Two candidates did not fit the log Kow range and pKa values. The remaaining candidates are presented iin Table 4-11 with the predicted MS2 in Figure 4-15 and Figure 4-16.

Table 4- 10: Fragments for Peak 35 m/z Assigned formula Neutral loss 225.0540 C14H9O3 Parent ion 197.0596 C13H9O2 CO 141.0695 C11H9 N2O3 C3O3 100.1116 No assignment No assignment

84 Chapter 4

Table 4- 11: Candidates for Peak 35, na: pKa out of range 0-14

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 2-oxo-2H-benzo[g]chromene-3- carbaldehyde (ChemSpider ID 9416609) 0.996 2 2.6/1.8 na O

O O (2) 1 dibenzo[b,d]furan-3- yl(oxo) (ChemSpider ID 4436917)

O O 0.77 1 2.9/3.0 na

O (3) dibenzofuran-2,8- dicarbaldehyde (Chemspider ID 13920496)

O 0.77 1 3.0/3.5 na

O

O (4) dibenzo[b,d]furan-4,6- dicarbaldehyde (ChemSpider ID 10129130) O O 0.77 1 2.8/3.5 na O

Fiigure 4- 15: Predicted MS2 spectrum of candidate 1 for Peak 35

85 Chapter 4

Figure 4- 16: Predicted MS2 spectrum of candidates 2, 3 and 4 for Peak 35

Based on the structures of candiddates 2, 3 and 4 it is impossible that they lose a

C3O3 fragment without splitting the whole molecule in two. Thus, candidatte 1 is the best candidate that we can propose as it is the only one explaining the ttwo assigned fragments. No information was found for this candidate.

Peak 36: This peak of formula C19H17OP (m/z 293.1084), eluting at 20.0 min was detected in APCI + and ESI +. The APCI + MS2 spectrum of the unknown is presented in Figure 4-17 (CID fragmentation) and four fragments in total were observed. Howeveer, as the peak at the m/z of 237 was a fragment of co-eeluting peak at 293.2105, this peak was not predicted by MOLGEN-MSMS. The neutral losses identified by MOLGEN-MSMS were: C6H6, C7H8 and C12H11OP. Using ChemSpider database 20 chemicals having C19H17OP as empirical formula were propoosed with only eight within the determined log Kow range. Using MetFrag, this number of candidates was decreased to four (highest score of 1), while the four others have a score of 0 and fewer fragments were predicted (one or two). For the four highest scoreed candidates (Table 4-12) all three fragments belonging to the unknown were assigned and the predicted MS2 spectrum is presented in Figure 4-17. The benzyl(diphenyl) phosphine oxide was available as standard and was purchased for confirmation. The retention time

86 Chapter 4 of the standard was the same than the unknown (19.8 min) and the MS2 fragmentation pattern was also identical to the unknown’s, confirming thus, the presence of this chemical in the blue rayon sample. However this compound was nott responsible for the mutagenicity observed in the sampple. This compound can be used in organic synthesis, explaining its presence in the sampling site, which is an industrial area where dyes, some pharmaceuticals and explosive are produced.

Figure 4- 17: MS2 spectrum oof unknown compound Peak 36 obtained by CID

Table 4- 12: Candidates for Peak 36, na: pKa out of range 0-14, nd: log Koww not predicted

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) benzyl(diphenyl) phosphine oxide (ChemSpider ID 68772)

1 3 3.5/3.6 na

P O (2) (3-methylphenyl) (diphenyl)phosphane oxide (ChemSpider ID 265165) (3) (2-methylphenyl) (diphenyl)phosphane oxide (ChemSpider ID 1046297) (4) (4-methylphenyl) (diphenyl)phosphane oxide (ChemSpider 1 3 3.4/3.6 na ID 1046298)

P O CH3

87 Chapter 4

Figure 4- 18: Predicted MS2 spectrum of candidates 1, 2, 3 and 4 for Peak 36

Peak 40: this peak of formula C17H12O (m/z 233.0963), eluting at 18.5 min was detected in APCI + only. The APCI + MS2 spectrum of the unknown is presented in Figure 4-19 (CID fragmentation) and five fragments in total were observed. The neutral losses were assigned to four of these fragments: CH3, H2O, CO and CHO. Using ChemSpider, 113 chemicals with C17H12O as an empirical formula were foound. MetFrag identified five chemicals explaining all four fragments observed (score from 1 to 0.862), but none fitted the log Kow range (> 3.2) or the pKa value allowing an eelution in the neutral fraction. Other structures fitting these properties were found, with a maximum of two fragments predicted. Their score ranked from 0 to 0.687, with only one molecule having a score higher than 0.5. Thus, the best candidate found was 7,9--dimethyl-8H- cyclopenta[a]acenaphthylen-8-one (Table 4-13). Its predicted mass spectrum is presented in Figure 4-20. The most intense fragment was predicted and coorresponded to the loss of the methyl group. The loss of the carbonyl group was also prediccted.

88 Chapter 4

Fiigure 4- 19: MS2 spectrum of unknown compound Peak 40 obtained by CIDD

Table 4- 13: Candidates for Peak 40, na: pKa out of range 0-14 log K MetFrag Predicted ow Structure pK Score fragments ACD/EPI a SUITE 7,9-dimethyl-8H-cyclopenta[a] acenaphthylen-8-one (ChemSpider ID 9141590) O H C CH 3 3 0.687 2 2.8/4.9 na

89 Chapter 4

Figure 4- 20: Predicted MS2 spectrum of best candidates for Peak 40

The search within databases using properties of the chemical and tthe prediction of fragments allowed us to propose as best candidate the 7,9--dimethyl-8H- cyclopenta[a]acenaphthylen-8-one. Howeever, it is important to be awarre that it also could be another isomer of this compound with different positions of tthe functional groups on the PAH skeleton, which is not present in the database. Furthermore, we can observe that there is a big difference in the prediction of the log Kow. Using the ACD values in ChemSpider, this chemical fit the determined range of the neutrral fraction N- 2-8 (0.5 – 4.2 including the error range), but not with the prediction of EPISuite, which is too high. If we compare the retention time of the standards used (Chapter 3 Section 3-

3-1-3, Table 3-5), the keto-PAHs 9-fluorenone (EPISuite log Kow 3.6, 3 aromatic rings and elution at 23 min), benzanthrone (EPISuite log Kow 4.8, 4 aromatic ring and elution at 32 min) and 4H-cyclopenta [def] phenanthrene-4-one (EPISuite log Kow 3.9, 4 aromatic rings and elution at 30 min) our candidate should elute after 22.7 min. Indeed, the retention time increases with increasis of number of aromatic rings. Thus, with five rings and a log Kow predicted at 4.9, it is unlikely that it would elute at 18..5 min. As the chemical search is database dependant, our unknown may not be present in ChemSpider and in PubChem. Another reason for not being able to provide any candiidate could be the empirical formula mismatching the MS spectrum. Indeed in the identification of unknown compounds, one of the first challenges is to determine the correct elemental composition, which is not always possible unambiguously even with the hiigh resolution isotopic pattern as it requires a clean mass spectrum with no interfering noise or coeluting compounds [133]. One reason to the mismatching of the empirical formula to the MS spectrum could be the presence of one or more adducts [133,148] or fragmentation of the molecule during iionisation. The formation of adducts can be influenced by the solvents and buffer constitution, and the chemical proopperties of the compounds [149], but programs to predict the probability of adduct formation are yet not available [133]. However, sodium and potassium isotopic peaks (Table 4-3) were not found in the mass spectrum. Further experiments regarding the formatiion of adducts

90 Chapter 4 could be undertaken, in order to determine the empirical formula. At this stage we cannot propose another candidate having this formula. Structure generation, as developed for GC-MS identification could be considered [150], but in this case the MS2 provides little information about the structure.

An exact mass search was also attempted using the neutral exact mass of 232.0884 Da, 290 hits were found. The first three chemicals proposed had a formula of

C9H16N2O3S, explaining 3 or 4 fragments with scores from 0.977 to 1. However the degree of unsaturation was too low (5 unsaturations). Nine chemicals had a formula of

C14H13FO2, explaining 4 fragments (scores 0.876 to 0.966) and another 27 structures explained 3 fragments and still had a high score (0.862 - 0.864). The third best formula found was C11H16NOFCl with three explained fragments and score of 0.864. However, the degree of unsaturation was also too low (4 unsaturations), while we set up the minimum of unsaturation to 7 to have large planar molecules, as they are known to better adsorbed on the blue rayon. The nine chemicals (highest score, four fragments predicted) resulting from this mass search were checked for log Kow and pKa and one candidate was removed. The eight remaining candidates are presented in Table 4-14. All these candidates have a hydroxyl group, which is well detected in ESI + and ESI -. However, the unknown was only detected in APCI +, which does not support the presence of this group. Thus, the possibility that the unknown is one of these candidates is unlikely, and even with the mass search we cannot provide a candidate for this peak.

91 Chapter 4

Table 4- 14: Candidates for Peak 40 from the mass search, na: pKa out of range 0-14, nd: log Kow not predicted MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 5-fluoro-2-(2-methoxybenzyl) phenol (ChemSpider ID 14273794)

9.8 H3C 0.966 4 3.8/3.8 - O OH ↔ O

F OH (2) 4-fluoro-2-(3-methoxybenzyl) phenol (ChemSpider ID 11343506)

9.8 CH3 0.966 4 3.6/3.8 O OH ↔ O- OH

F (3) 4-fluoro-2-(4-methoxybenzyl) phenol (Chemspider ID 11445011)

F

9.9 0.966 4 3.4/3.8 OH ↔ O- OH

O CH3 (4) 2-[1-(3-fluorophenyl)-1- hydroxyethyl]phenol (ChemSpider ID 15185584)

10.8 0.883 4 2.4/2.9 - OH OHAr ↔ OAr F H3C HO

92 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (5) 4-[(4-fluorophenyl) methoxy]-3- methyl-phenol (ChemSpider ID 25530933) F 10.1 0.883 4 3.2/nd OH ↔ O- O

HO CH 3 (6) 2-[(4-fluorophenyl) methoxy]-5- methyl-phenol (ChemSpider ID CH3 9.7 0.883 4 3.1/nd - F OH ↔ O HO O 25531216)

(7) 2-[(2-fluorophenyl)methoxy]-5- methyl-phenol (ChemSpider ID 25531498)

CH3 9.7 0.883 4 3.1/nd OH ↔ O- HO O

F (8) 1-[5-(4-fluorobenzyl)furan-3- yl]prop-2-en-1-ol (ChemSpider ID 14669156)

F

0.876 4 3.0/3.7 na O

OH

CH2

93 Chapter 4

Peak 50: This peak of formula C12H13NO2 (m/z 204.1012), elutinng at 13.2 min was detected in APCI + and ESI +. The APCI + MS2 spectrum of the unknown is presented in Figure 4-21 (CID fragmentation) and only one fragment was observed, corresponding to the loss of C2O2 (or two carbonyl groups). Using ChemSpider, 1345 chemicals with C12H13NO2 as an empirical formula were proposed. MetFFrag identified twenty chemicals explaining the fragment observed with a score greatter than 0.89.

After matching the pKa and log Kow, nineteen candidates remained. The chemical with the highest score of 1 was a carboxylic acid and thus, could not elute iin the neutral fraction. The remaining candidates are prresented in Table 4-15.

Figure 4- 21: MS2 spectrum of unknown compound Peak 50 obtained by CID

94 Chapter 4

Table 4- 15: Candidates for Peak 50, na: pKa out of range 0-14, nd: log Kow not predicted

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 1-benzyl-4-methyl pyrrolidine-2,3- dione (ChemSpider ID 219476)

O H C 0.945 1 1.0/0.2 na 3 O N

(2) 1-ethyl-5,7-dimethyl-1H-indole-2,3- dione (ChemSpider ID 20523428) O H3C 0.945 1 2.0/1.9 na O N

CH3 CH3 (3) 5,6-diethyl-1H-indole-2,3-dione (Chemspider ID 13494962)

H 10.9 N 0.945 1 2.5/2.5 H3C - O NH ↔ N

O CH3 (4) 7-methyl-1-propyl-1H-indole-2,3- dione (ChemSpider ID 2407695)

CH3 CH3 0.945 1 2.1/1.8 na N O

O (5) 1-(2-phenylethyl) pyrrolidine-2,3- dione (ChemSpider ID 24798308)

N 0.945 1 1.8/nd na

O O (6) 5-isopropyl-1-methyl-indoline-2,3- dione (ChemSpider ID 24021760) CH3 O

H3C 0.945 1 1.9/nd na O N

CH3 (7) 4,5,6,7-tetramethyl-1H-indole-2,3- dione (ChemSpider ID 19461480)

CH3 O

H3C 11.4 0.945 1 2.4/2.6 - O NH ↔ N H C N 3 H CH3

95 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (8) 5-tert-butyl-1H-indole-2,3-dione (ChemSpider ID 517156)

CH H C 3 O 10.6 3 0.945 1 2.2/2.4 NH ↔ N- H3C O N H (9) 1-isopropyl-5-methyl-1H-indole- 2,3-dione (ChemSpider ID 1606080)

H3C CH3 0.945 1 1.0/1.7 na N O

H3C O (10) 5-butyl-1H-indole-2,3-dione (ChemSpider ID 2339601) O 10.6 0.945 1 2.6/2.5 - H C NH ↔ N 3 O N H (11) 1-(2-methylpropyl)-1H-indole-2,3- dione (ChemSpider ID 1783307)

CH3

CH 0.945 1 2.0/1.7 na N 3 O

O (12) 4,4-dimethyl-1-phenyl-pyrrolidine- 2,3-dione (ChemSpider ID 9238673)

O O 0.945 1 1.6/0.4 na N H3C H3C (13) 5-(1-methylpropyl)-1H-indole-2,3- dione (ChemSpider ID 16812332) H 10.6 N 0.945 1 2.4/1.6 O NH ↔ N- CH3 O CH3 (14) 7-(2-methylpropyl)-1H-indole-2,3- dione (ChemSpider ID 513041) CH3

H3C 10.6 H 0.945 1 2.4/2.4 - N NH ↔ N O

O

96 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (15) 1-butyl-1H-indole-2,3-dione (ChemSpider ID 1283920) O

O N 0.945 1 2.2/1.8 na CH3

(16) 5-methyl-1-propyl-1H-indole-2,3- dione (ChemSpider ID 2378726) CH3

N 0.945 1 2.1/1.8 na O

H3C O (17) 7-methyl-1-(1-methylethyl)-1H- indole-2,3-dione (ChemSpider ID 3347751) H C CH 3 3 CH3 0.945 1 1.9/1.7 na N O

O (18) 1-ethyl-4,6-dimethyl-indoline-2,3- dione (ChemSpider ID 24209284) O CH3 0.945 1 2.1/nd na O N CH3 H3C (19) 7-amino-4-propyl-2H-chromen-2- one (ChemSpider ID 2012579) O O NH2 2.2 0.89 1 2.2/2.1 + NH3 ↔ NH2

CH3

All of these candidates may explain the fragment 148. However, based on the study of standards (Table 4-1), some candidates could be removed from this list. Indeed for structures 1, 5 and 12 the loss of the benzyl group is expected, as seen for the 2- amino-1-methyl-6-phenylimidazo[4,5-b]pyridine, N,N’-diethyl-N,N’-diphenylurea, benzophenone and the N-phenyl-2-naphthalenamine. Candidate 9 (1-isopropyl-5-methyl-1H-indole-2,3-dione) was available as standard and was purchased for confirmation. Its retention time was 13.0 min instead of 13.2 min for the unknown and its MS2 fragmentation pattern (Figure 4-22) and was different to the unknown fragmentation pattern (Figure 4-21). The only fragment observed was at 162.0544 Da corresponding to the loss of C3H6 (tert-butyl group bound

97 Chapter 4 to the nitrogen) while the unknown had a loss of C2O2. Thus, the 1-isopropyl-5-methyl- 1H-indole-2,3-dione enabled the removal of other candidates from the list. Candidates 1, 2, 4, 5, 6, 11, 12, 15, 16, 17 and 18 have a similar structure with an alkyll group bound to the nitrogen and the release of this grooup is strongly expected. Furthermore candidate 9 did not show any mutagenicity in the Ames test.

Figuure 4- 22: MS2 spectrum of 1-isopropyl-5-methyl-1H-indole-2,3-dione obtained by CID

The best candidates remaining on the list are compounds 3, 7, 8, 10, 13, 14 and

19 (Table 4-16) and all of them are ablle to release a C2O2 fragment. Their predicted MS2 is presented in Figure 4-23. Furthermore the ions generated during fragmentation are all similar with and amino groups at a structure. At tthis stage we cannot select one candidate against another one based on the structures alone. Further fragmentation expectation and physical chemical properties and standards would be helpful. Nevertheless, candidate 19 is the only candidate having an amino group in its structure and such functional group is known to be often involved in muttagenicity, but nothing indicates that this unknown is or not involved in the observed toxic effect and could be as the phosphine oxide, present in the fraction but not mutagennic. It is also difficult to select candidates based on their use, because the informattion collected involves a use in pharmaceuticals or no information was found. We know that pharmaceutical industries are located close to the sampling site but we do not possess information regarding the type of pharmaceuticals produced.

98 Chapter 4

Fiigure 4- 23: Predicted MS2 spectrum of best candidates for Peak 50

99 Chapter 4

Table 4- 16: Top match candidates for Peak 50 Name of Candidate Information (3) 5,6-diethyl-1H-indole-2,3-dione Use in the synthesis of imino-indeno[1,2-c]quinoline derivatives (anti-cancer therapy) (2 patents) [139]

(7) 4,5,6,7-tetramethyl-1H-indole-2,3-dione No information available (8) 5-tert-butyl-1H-indole-2,3-dione Use in the preparation of six membered heterocycle derivatives which are used as selective inhibitor of serine protease enzyme (1 patent) [139]

(10) 5-butyl-1H-indole-2,3-dione No information available (13) 5-(1-methylpropyl)-1H-indole-2,3-dione No information available (14) 7-(2-methylpropyl)-1H-indole-2,3-dione No information available (19) 7-amino-4-propyl-2H-chromen-2-one No information available

Peak 60: This peak of formula C21H20N4O3 (m/z 377.1598), eluting at 20.2 min was detected in APCI +/- and ESI +. The MS2 fragments were obtained with APCI + (Figure 4-24) and these two fragments corresponded to the two halves of the molecule

(254.0930 corresponding to C14H12N3O2 and 124.0756 to C7H10NO). Thus, the molecule fragmented easily in two fragments as observed for two standards, benzophenone and 1,3-dimethyl-1,3-diphenylurea, for which the cleavage appeared at the carbonyl group. MS3 fragmentation of the most intense ion, 254.0930 Da was performed. The fragments obtained from this peak are presented in Table 4-17, as well as their corresponding losses. In total six fragments were used to search the ChemSpider database with MetFrag. There were more than 1600 hits, but only seven of them showed a very high score (≥ 0.856) and could explain four or five fragments of the six observed fragments.

After verification of the log Kow and pKa values, three were excluded as their pKa values would have led to elution the basic fraction. These four candidates are presented in Table 4-18 and the predicted fragments in Figure 4-25 (Structure 1) and Figure 4-26 (Structures 2, 3 and 4).

100 Chapter 4

Fiigure 4- 24: MS2 spectrum of unknown compound Peak 60 obtained by CID

Table 4- 17: Fragments for Peak 60 m/z Assigned formula Neutral loss 254.0921 C14H12N3O2 C7H9NO 169.0488 C6H7N3O C8H5O (from 254 ion) 135.0549 C7H7N2O C7H5NO (from 254 ion) 124.0752 C7H10NO C14H11N3O2 107.0599 C6H7N2 CO (from 135 ion) 94.0647 C6H8N C8H4N2O2 (from 254 ion)

101 Chapter 4

Table 4- 18: Candidates for Peak 60

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 2-ethoxy-N-(3- 0.5 [(phenylcarbamoyl) amino] a + a phenyl)pyridine-3-carboxamide N H ↔ N (ChemSpider ID 16999411) 11.8 NbH ↔ Nb- b a NH N 1 4 3.9/3.5 13.5 O O NcH ↔ Nc- c O NH CH3 d 14.0 NH NdH ↔ Nd-

(2) N-(2-aminophenyl)-4-([2- (pyridin-3-yloxy)propanoyl] 2.8 amino)benzamide (ChemSpider ID a + a 11536340) N H3 ↔ N H2

a NH2 13.2 b b- b O N H ↔ N NH 0.945 5 2.0/1.9 11.3 NcH ↔ Nc-

c d 3.8 N NH O NdH ↔ Nd-

H3C O (3) N-(2-aminophenyl)-4-([(pyridin- 2-ylmethoxy)acetyl]amino) benzamide (Chemspider ID 2.8 11536336) a + a N H3 ↔ N H2 O b a NH NH2 13.2 NbH ↔ Nb-

0.945 5 2.5/2.5 11.4 c c- c N H ↔ N HN 3.3 d d- O O N H ↔ N d N

102 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (4) N-(2-aminophenyl)-4-([(pyridin- 3-ylmethoxy)acetyl]amino) benzamide (ChemSpider ID 8197974) 2.8 a + a N H3 ↔ N H2

a NH2 13.2 b b- O NH b N H ↔ N

0.945 5 3.8/1.4 11.4 NcH ↔ Nc-

O NH c 4.4 NdH ↔ Nd- O

N d

Fiigure 4- 25: Predicted MS2 spectrum of candidate 1 for Peak 60

103 Chapter 4

Figure 4- 26: Predicted MS2 spectrum of candidates 2, 3and 4 for Peak 60

From the predicted MS2 spectra for the last three candidates, we can see that all the strucctures may release these fragments, thus it is difficult to make further selection of one structure. However, the ESI + signal was not more intense thhaan the signal recorded in APCI +. Thus, compounds possessing an amino group in theiir structure do not fully support the information gained by the ESI intensity. The candidate 1 may be the best candidate, but no information was found, while the three others can be involved in the synthesis of anticancer drugs (existing patents) [139].

104 Chapter 4

Peak 99: This peak of formula C15H16N2O3 (m/z 273.1226), eluting at 19.4 min was detected in APCI + and ESI +. The APCI + MS2 spectrum of the unknown is presented in Figure 4-27 (CID fragmentation) and only two fragments were observed, corresponding to the loss of C8H7NO2 and C9H10NO2 (or loss of CH3 from the fragment at 124 Da). Using ChemSpider database more than 2900 chemicalls with the formula

C12H13NO2 were proposed. MetFraag identified nine chemicals that explained one or two of the observed fragments with a score greater than 0.873. After matching the pKa and log Kow seven remained. These candidates are presented in Table 4-19, and their predicted MS2 spectra in Figure 4-28 and Figure 4-29.

Fiigure 4- 27: MS2 spectrum of unknown compound Peak 99 obtained by CID

105 Chapter 4

Table 4- 19: Candidates for Peak 99

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 2-methoxy-N'-[1-(5-methylfuran- 2-yl) ethylidene]benzo hydrazide (ChemSpider ID757961) H C 3.8 3 a + a N H ↔ N O 11.7 b b- a 1 1 2.3/3.4 N H ↔ N

NCH3 O NH b 1.8 b + b N H2 ↔ N H O H3C

(2) benzamide, N-(4-amino-2- methoxyphenyl)-4-methoxy- (ChemSpider ID 833606) 3.4 CH3 a + a O N H3 ↔ N H2

13.6 0.987 1 0.9/1.4 NbH ↔ Nb-

b HN O 0.07 b + b O N H2 ↔ N H H3C

a NH2 (3) , N-(4-amino-2- methoxyphenyl)-2-phenoxy- (ChemSpider ID 833624) 3.4 a + a N H3 ↔ N H2

13.6 O NbH ↔ Nb- 0.987 1 1.6/1.4 b HN O 0.07 b + b O N H2 ↔ N H H3C

a NH2

106 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (4) 1,3-Bis(4-methoxyphenyl)urea (Chemspider ID 120696)

H3C O

13.6 HN NH ↔ N- 0.987 1 2.8/3.1 O HN

O H3C (plus 5 of OCH3 rearrangement isomers) (5) 3-amino-4-methoxy-N-(2- methoxyphenyl) benzamide (Chemspider ID 8811261)

2.7 a + a CH N H3 ↔ N H2 O 3 b O NH 13.6 0.987 1 2.1/1.4 NbH ↔ Nb-

a 0.07 H N b + b 2 N H2 ↔ N H O CH3 (plus two isomers ID 16805321 and 13317570) (6) bis(3-amino-4-methoxyphenyl) methanone (ChemSpider ID 660964)

a NH2 O 2.7 CH 3 0.987 1 2.5/1.5 NaH + ↔ NaH O 3 2

a H2N O CH3 (7) 6-amino-3-(piperidin-1- ylcarbonyl) -2H-chromen-2-one (Chemspider ID 26898)

O O 3.6 + 0.873 2 0.4/0.7 NH3 ↔ NH2 O NH2 N

107 Chapter 4

Figure 4- 28: Predicted MS2 spectrum of candidates 1, 2, 3, 4, 5 and 6 for Peak 99

Figure 4- 29: Predicted MS2 spectrum of candidate 7 (6-amino-3-(piperidin-1-ylcarbonyl) -2H- chromen-2-one) for Peak 99

108 Chapter 4

While only candidate 7 has two fragments predicted with MetFrag, the other six candidates could generate a loss of a methyl group from the fragment at 124 Da. The presence of this unknown in ESI + is supporting all the candidates presented above (most of them possessing an amino function group). We observed similar fragmentation pattern with the 1,3-diethyl-1,3-diphenylurea and benzophenone which enable the observation of two fragments (snap bond on one side of the carbonyl group). The fact that the fragment 124 is the most intense could indicate a symmetrical compound such as candidates 4 and 6. Furthermore urea derivatives already have been found at the sampling site. Information regarding the use of these candidates was checked and only candidate 4 had a specific use (one patent [139]). The urea derivative is needed in the production of urea, which is an important compound in the preparation of various products (particularly agricultural chemical for the treatment of soils). We know about the pharmaceutical and dyes industries at the site, but nothing about the production of agricultural chemicals. Nevertheless, a direct use of the chemical in field can be possible. It is here again difficult to choose one candidate over another one, but urea derivatives are not known to be mutagenic, and the standard we used for the study (1,3- diethyl-1,3-diphenylurea) was not mutagenic either.

Peak 109: This peak of formula C16H11NO2 (m/z 250.0855), eluting at 16.5 min was detected in APCI +/- and ESI +. The APCI + MS2 spectrum of the unknown is presented in Figure 4-30a (CID fragmentation) and Figure 4-30b (HCD fragmentation).

Only two fragments were observed, corresponding to the loss of NH3 and CH3NO. The signal resulting from a loss of NH3 was the most intense and the molecule probably contains an amino group (see Section 4.3.1, Table 4-1). The NH3 loss was associated with high intensity peaks in the standards 2-amino-3-methylimidazo[4,5-f]quinoline, 2- amino-3,8-dimethylimidazo[4,5-f]4quinoxaline, 2-amino-1-methyl-6-phenylimidazo4,5- b]pyridine, 2-amino-9H-pyrido[2,3-b] indole, 3,3’dichlorobenzidine and 1,8- diaminopyrene. The second loss, CH3NO (NH3 +CO), indicates the possible presence of a carbonyl group, although the corresponding [M+H-CO] signal was not detected.

Using ChemSpider, 346 chemicals with the formula C16H11NO2 were proposed. MetFrag identified five chemicals with an amino group in their structure, explaining one or two of the fragments. Furthermore, these five candidates showed a high score with MetFrag. No further decrease of the number of candidates was possible as they all matched the log Kow and pKa values. These candidates are presented in Table 4-20 and the predicted MS2 in Figure 4-31 (Structures 1 and 2) and Figure 4-32 (Structure 3, 4 and 5). This peak was also present in the acidic fraction A-2-8, showing low mutagenicity.

109 Chapter 4

Figure 4- 30: MS2 spectrum of unknown compound Peak 109 a) CID fraga mentation and b) HCD fragmentation.

110 Chapter 4

Table 4- 20: Candidates for Peak 109, na: pKa out of range 0-14

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 2-[amino(phenyl)methylidene]- 1H-indene-1,3(2H)-dione (ChemSpider ID 4373802)

O na 1 1 2.5/2.0 NH2

O

(2) 3-amino-2-benzoyl-1H-inden-1- one (ChemSpider ID 10520884) O 12.8 O - NH2 ↔ NH 1 1 1.0/1.7

NH 2 (3) 1-oxo-3-phenyl-1H-indene-2- carboxamide (Chemspider ID 216520)

na 0.879 2 2.4/2.2

O

NH2 O (4) 9-ethynyl-9H-fluoren-1-yl carbamate (Chemspider ID 2315078)

H 12.2 NH2 - 0.879 2 2.9/2.9 NH2 ↔ NH O O

(5) 2-aminoanthracene-9,10- dicarbaldehyde (Chemspider ID 15224742)

O 2.1 + 0.879 2 2.2/2.9 NH3 ↔ NH2

H2N

O

111 Chapter 4

Figure 4- 31: Predicted MS2 spectrum of candidates 1 and 2 for Peak 109

Figure 4- 32: Predicted MS2 spectrum of candidates 3, 4 and 5 for Peak 109

We saw with the benzophenone and the 2-amino-1-methyl-6-phenylimidazo[4,5- b]pyridine, standards used for the study (Section 4.3.1 Table 4-1), that they released a phenyl group during the fragmentation process. Thus, such fragments would be expected in the case of the structures 2 and 3. As the fragments are absent from the MS2 spectrum of the unknown, we can remove them from the list of the candidates. This unknown was also present in the acidic fraction A-2-8, and candidate 5 would not be

112 Chapter 4 expected to elute in the acidic fraction. Only candidate 4 would be present in the acidic fraction in some extent. From Structure 4, more fragments would be expected, as the loss of an ethyne group. Indeed we saw for some of the standards (Table 4-1) that functional groups on a PAH skeleton were the first to be fragmented (e.g. CH3 in 2- amino-3-methylimidazo[4,5-f] quinoline). However, in the present selection of standards (Table 4-1), none of them have in their structure an ethyne group on a PAH skeleton, so we cannot assure that this functional group will be lost during fragmentation. At this stage and based on the database, 2-[amino(phenyl)methylidene]- 1H-indene-1,3(2H)-dione and 9-ethynyl-9H-fluoren-1-yl carbamate are the best candidates we can propose, but no information regarding their use was found.

Peak 138: This peak of formula C17H19N3O4 (m/z 330.1438), eluting at 17.2 min was detected in APCI +/- and ESI +. The APCI + MS2 spectrum of the unknown was obtained from CID fragmentation, and two fragments were observed (211.1081 and 165.0657). MS3 (CID) of the 211 Da fragment was performed, and two fragments were obtained (196.0844 and 183.0762 Da). All are presented in Table 4-21 with the corresponding loss assigned by MOLGEN-MSMS. Using ChemSpider, 2257 chemicals with the formula C17H19N3O4 were proposed. Metfrag identified ten chemicals with a score greater than 0.853 and explaining at least two fragments, with three other chemicals with a score between 0.722 and 0.792 explaining three fragments.

These ten candidates fitted the log Kow range and pKa values. They are presented in Table 4-22 with the predicted MS2 in Figure 4-33 to Figure 4-38.

Table 4- 21: Fragments for Peak 138 m/z Assigned formula Neutral loss 330.1452 C17H20N3O4 Parent ion 211.1076 C10H15N2O3 C7H5NO 196.0844 C9H12 N2O3 CH3O (from 211) 183.0762 C8H11 N2O3 C2H4 (from 211) C9H11NO2 (from 330) 165.0657 C8H9 N2O2 H2O (from 183)

113 Chapter 4

Table 4- 22: Candidates for Peak 138, na: pKa out of range 0-14, nd: log Kow not predicted

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 1-methyl-2-oxo-2-(phenylamino) ethyl 6-oxo-1-propyl-1,6- dihydropyridazine-3-carboxylate (ChemSpider ID 16867074) O H C 3 N 11.1 N 1 3 1.5/1.9 NaH ↔ Na-

O O O CH3 a NH

(2) ethyl 5-[(2-benzoylhydrazinyl) carbonyl]-2,4-dimethyl-1H-pyrrole-3- carboxylate (ChemSpider ID 832693)

CH3 9.0 O NaH ↔ Na-

O CH3 9.1 c 0.923 2 2.9/2.5 NH NbH ↔ Nb- H3C

b 9.1 HN O NcH ↔ Nc- O NH a

(3) ethyl 5-[(4-carbamoylphenyl) carbamoyl]-2,4-dimethyl-1H-pyrrole- 3-carboxylate (Chemspider ID 7177168)

H3C 0.7 a + a O N H3 ↔ N H2

H3C O 13.1 c 0.923 2 1.9/2.4 b b- HN N H ↔ N CH3 13.5 b HN O NcH ↔ Nc-

a H2N O

114 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (4) ethyl 3-[(2-aminophenyl) carbonyl]-1-(cyclopropyl carbonyl)- 4,5-dihydro-1H-pyrazole-5- carboxylate (ChemSpider ID 9338397 ) 0.8 O O a + a CH3 N H3 ↔ N H2 O 0.892 4 0.2/2.4 N 0.7 b + b b N N H ↔ N

O

a H2N (5) 4-([(5-ethoxy-2-methyl-4- oxopyridin-1(4H)-yl)acetyl] amino)benzamide (ChemSpider ID 20907755 ) O 0.5 a + a O N H3 ↔ N H2

CH 0.859 2 1.0/2.0 3 N CH 10.1 3 NbH ↔ Nb- O

b HN O

NH2 a (6) ethyl 4-[(4-carbamoylphenyl) carbamoyl]-3,5-dimethyl-1H-pyrrole- 2-carboxylate (ChemSpider ID 7724394)

CH3 0.8 O a + a N H3 ↔ N H2

O c NH 13.3 0.859 2 0.8/1.8 NbH ↔ Nb- H3C CH3 12.9 b HN O NcH ↔ Nc-

a H2N O

115 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (7) N-[3-(4-carbamoylanilino)-3-oxo- propyl]-N-ethyl-furan-3-carboxamide (ChemSpider ID 22444805)

a ONH2

0.6 a + a N H3 ↔ N H2

0.859 2 0.9/na 12.3 b HN O NbH ↔ Nb-

N O

CH3

O (8) 2-[(4-methoxy phenyl)carbamoylamino]ethyl N- phenyl carbamate (ChemSpider ID 24782508)

a HN O 10.5 NaH ↔ Na-

O 0.853 2 2.6/na 12.8 b b- b N H ↔ N NH

O NH

O H3C (9) N-(3-(2-[(2,5-dimethylfuran-3-yl) carbonyl]hydrazinyl)-3-oxopropyl) benzamide (ChemSpider ID 11702249)

H3C O 9.5 a a- CH N H ↔ N 3 a 9.3 HN O 0.792 3 0.1/1.6 b NbH ↔ Nb- O NH 13.9 NcH ↔ Nc- c O NH

116 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (10) ethyl 5-(((2Z)-2-[(2- hydroxyphenyl) methylidene])carbonyl)- 2,4-dimethyl-1H-pyrrole-3- carboxylate (ChemSpider ID 18392286)

H C 3 3.3 NaH+ ↔ Na O H C 11.1 3 O 0.792 3 4.0/3.3 NbH ↔ Nb-

c HN CH3 135 NcH ↔ Nc-

O NH b

a N

HO

Fiigure 4- 33: Predicted MS2 spectrum of candidate 1 for Peak 138

117 Chapter 4

Figure 4- 34: Predicted MS2 spectrum of candidates 2, 3, 5 and 6 for Peak 138

118 Chapter 4

2-[(4-methoxy phenyl)carbamoylamino]ethyl N-phenyl carbamate (structure 8)

N-[3-(4-carbamoylanilino)-3-oxo-propyl]-N-ethyl-furan-3-carboxamide (structure 7)

165.0664 211.1083 330.1452 + + + [C8H9N2O2] [C10H15N2O3] [C17H20N3O4]

Figure 4- 35: Predicted MS2 spectrum of candidates 7 and 8 for Peak 138

119 Chapter 4

Figure 4- 36: Predicted MS2 spectrum of candidate 4 for Peak 138

Figure 4- 37: Predicted MS2 spectrum of candidate 9 for Peak 138

120 Chapter 4

Fiigure 4- 38: Predicted MS2 spectrum of candidate 10 for Peak 138

Based on the explained fragments, candidate 4 would be thhe best as the four fragments can be predicted from thhe structure and candidates 2, 3, 5, 6, 7 and 8 only enabled the prediction of two fragments. Within the remaining candidates, number 9 could not explain the second most intense fragment. At this stage the favourite candidates (most fragments and high score) are: (i) 1-methyl-2-oxo-2- (phenylamino)ethyl 6-oxo-1-propyl-1,6-dihydropyridazine-3-carboxylate (candidate 1), (ii) ethyl 3-[(2-aminophenyl) carbonyl]-1-(cyclopropyl carbonyl)-4,5-dihydro-1H- pyrazole-5-carboxylate (candidate 4) and (iii) ethyl 5-(((2Z)-2--[(2-hydroxyphenyl) methylidene]hydrazine)carbonyl)-2,4-dimethyl-1H-pyrrole-3-carboxylate (candidate 10).

Peak 143: This peak of formula C19H11NO3 (m/z 302.0804), eluting at 14.5 min was detected in APCI +/+ - and ESI +/-. The APCI + MS2 spectra of the unknown were obtained from CID and HCD frfragmentation, and four fragments were observed (284.0711, 274.0862, 256.0756 and 227.0756). MS3 (CID) of fragments 274 and 284 Da were performed, and three fragments were obtained (256.07556, 246.0912 and 228.0806). They are all presented in Table 4-23 with the correspondiing loss assigned by

MOLGEN-MSMS. Using ChemSpider, 21 chemicals with the formula C19H11NO3 were proposed. MetFrag identified three chemicals with a high scoore (0.925) and the three most intense fragments could be explained. Howevere , only onee of these candidates 2 fitted the log Kow range and is presented in Table 4-24 with the prediicted MS in Figure 4-39. For the other candidates over the 21 proposed, the score was llower than 0.291 so we removed them from the candidates list.

121 Chapter 4

Table 4- 23: Fragments for Peak 1433 m/z Assigned formula Neutral loss 284.0711 C19H10NO2 H2O 274.0862 C18H12NO2 CO CH2O2 (from parentt ion) 256.0756 C18H10NO CO (from 284) H2O (from 274) 246.0912 C17H12NO CO (from 274) 228.0806 C17H10N C2H2O2 (from 284) 227.0756 C177H9N C2H3O3 (from parentt ion)

Table 4- 24: Candidate for Peak 143

MetFrag Predicted log Kow Structure pKa Score fragmments ACD/EPI SUITE 3-(2-hydroxybenzoyl)-5H- indeno[1,2-b]pyridin-5-one (ChemSpider ID 10514029)

OH

10.0 O 0.925 3 3.8/3.8 OH ↔ O-

N

O

Figure 4- 39: Predicted MS2 spectrum of 3-(2-hhydroxybenzoyo l)-5H-indeno[1,2-b]pyriidin-5-one for Peak 143

122 Chapter 4

Regarding the MS2 prediction, the most three intense fragments are explained. The fragment 246 could also be possible by the loss of the carbonyl group present in the 5 rings of the molecule skeleton, by breakage of two C-C bonds. We saw that keto- PAHs can release such fragments (e.g. 9-fluorenone, benzanthrone, 4H-cyclopenta- [def]phenanthren-4-one). Thus, we would be able to explain four of the six fragments. The two fragments which cannot be explained with the presented structure are very low in intensity and this structure enabled to explain the most intense fragments, which is a good indicator. However, a loss of a phenyl group could be expected from parent ion or the fragment 284, as seen for the benzophenone (phenyl group bounded to a carbonyl group Table 4-1). Furthermore, the presence of signal in ESI +/- support this candidate. This candidate remain the best choice found in database. The search for information on ChemSpider or Google was not successful.

Peak 208: This peak of formula C22H10O3 (m/z 323.0694), eluting at 25.3 min was detected in APCI + and ESI +. The APCI + MS2 spectrum of the unknown showed one fragment at 295.0763, corresponding to the loss of a carbonyl group. Using

ChemSpider, five chemicals with the formula C22H10O3 were proposed, but none of them matched the log Kow range (> 4.5).

The second formula to search for a structure was C20H9N3O2. Using ChemSpider, two hits were found but only one chemical fitted the log Kow range. Most of this compound should elute in the basic fraction, due to the pKa values. However, a small part is still expected in the neutral fraction (Table 4-25). The loss of the carbonyl group is also possible and the prediction MS2 is presented in Figure 4-40.

Table 4- 25: Candidate for Peak 208 log K MetFrag Predicted ow Structure pK Score fragments ACD/EPI a SUITE 6H-indolo[6,7-a]pyrrolo[3,4- c]carbazole-6,8(7H)-dione 1.7 NaH+ ↔ Na (ChemSpider ID 10705285)

a c 0.3 N N NbH+ ↔ Nb O 1 1 2.9/4.5 2.5 NcH+ ↔ Nc

b O N

123 Chapter 4

Figure 4- 40: Predicted MS2 spectrum of 6H-indolo[6,7-a]pyrrolo[3,4-c]carbazole-6,8(7H)-dione for Peak 208

From the structure of the 6H-indolo[6,7-a]pyrrolo[3,4-c]carbazole-6,8(7H)- dione more fragments would be expected. For instance another loss of a carbonyl group as seen for the keto-PAHs and hydroxyqquinone (Table 4-1) and also the looss of a CHN group as seen for most of the azaarenes presented in Table 4-1. Furthermoorre as for peak

40 and its candidate for formula C17H12O, 7,9-dimethyl-8H--cyclopenta[a] acenaphthylen-8-one, there is a large difference in between the predicted log Kow, and if considering only the value given by EPISuite (4.5), our candidate for Peak 208 would be expected to elute after 25 minutes. In general, we observed large planar molecules eluting after 30 minutes (benz(c)acridine, benzanthrone, 4Hcyclopenta[def] phenanthrene-4-one…). As for Peak 40, mismatching of the MS2 spectrum and the formula could be due to the presence of adducts (K and Na adducts not present), or the compound does not exist in the database, or the compound we tried to identify here is part of a bigger molecule which fragmented in the ionisation source. At this stage we cannot propose another candidate using this formula. Structure generation ccould be used to generate structures that are not present in databases, but again the fragmentation information does not provide sufficient substructural information in this casse.

The second method we tried with this unknown was the mass search (neutral mass of 322.0627 Da), MetFrag found 4550 molecules fitting the MS2 patttern. The only other formulas, that showed a high score (≥ 0.814), explained the fragment and had at least seven degrees of unsaturation were: C15H10N6OS (14 degrees of unsaturation) and C14H14N2O5S (11 degrees of unsaturation). In total five chemicals were found, but aftf er matching the log Kow and pKa values only two remained as candidates and are presented in Table 4-26 and their MS2 predicted in Figure 4-41.

124 Chapter 4

Table 4- 26: Candidates for Peak 208 from the mass search log K MetFrag Predicted ow Structure pK Score fragments ACD/EPI a SUITE (1) 4-[(2-amino-4-oxo-1,3- thiazol-5(4H)-ylidene) methyl]-2,6-dimethoxyphenyl acetate (ChemSpider ID 3421619)

CH3 9.1 a + a N H3 ↔ N H2 O O OO 3.5 H C CH 0.903 1 1.0/0.7 3 3 NbH+ ↔ Nb

O

b N S a H2N

C14H14NO5S

(2) 5-(1-[(4-amino-6- cyanopyrimidin-2-yl)sulfanyl] ethyl)furo[2,3-c]pyridine-3- carbonitrile (ChemSpider ID 13748586)

O

N 3.1 a + a a N H ↔ N N 0.814 1 0.4/1.5

H3C S

N N

H2N N

C15H10N6OS

125 Chapter 4

Figure 4- 41: Predicted MS2 spectrum of candidate 1 and 2 from mass search for Peaak 208

The presence of an amino group in the two structures is supported by the presence of the unknown in ESI +. The mass search provided two canddidates for the unknown, but with such molecules, where nitrogen and oxygen are numerous, we would expect many more fragments than just a loss of a carbonyl group. Indeed C-O or C-N bond were the first observed bonds to be broken during fragmentation. Thhe study of the standards (Table 4-1) showed that the azzaarenes released the different niitrogen atoms from the rings. Thus, fragments involvinng nitrogen losses are expected, and not present in the unnknown MS2 spectrum. These two candidates are considered unlikkely and either adduct formation, in source fragmentation, co-eluting compounds could be involved in the mismmatching of the formula with the MS spectrum or the compounnd we tried to identify is not present in databases and structure generation could be considered, if enough substructural information can be gained.

Peak 212: This peak of formula C18H13NO2 (m/z 276.1020), eluting at 13.5 min was detected in APCI +/- and ESI +. The APCI + MS2 spectrum of the unknown showed five fragments in total (Figure 4-42a: CID and Figure 4-42b: HCD fragmentation). Four of the five neutral losses were identified, and corresponded to

NH3, H2O, C2H2O, CH3NO. Using ChemSpider, 417 chemicals withh the formula C18H13NO2 were proposed and Metfrag identified seven of them with a hhigh score (≥ 0.749), explaining two or three fragments. After matching the log Kow range and the pKa values only three of them remained as candidates, and are presented in Table 4-27 and their predicted MS2 in Figure 4-43 and Figure 4-44.

126 Chapter 4

Figure 4- 42: MS2 spectrum of unknown compound Peak 212 a) CID ffragmentation and b) HCD fragmentation.

127 Chapter 4

Table 4- 27: Candidate for Peak 212

MetFrag Predicted log Kow Structure pKa Score fragmments ACD/EPI SUITE (1) 3-amino-2,5-diphenylcyclohexa- 2,5-diene-1,4-dione (PubChem ID 12251096) 12.1 H N O - 2 1 2 3.3/3.1 NH2 ↔ NH

O (2) 7-hydroxy-3-(4-methoxyphenyl) naphthalene-1-carbonitrile (ChemSpider ID 9429393)

CH3 8.2 O OH ↔ O- 0.927 3 4.4/4.1

OH

N (3) 2-[(E)-2-[4-(hydroxymethyl) phenyl]ethenyl]-1-benzofuran-5- carbonitrile (Chemspider ID 13815809)

14.8 N 0.758 3 3.5/3.7 OH ↔ O-

O OH

Figure 4- 43: Predicted MS2 of candidate 1 for Peak 212

128 Chapter 4

Fiigure 4- 44: Predicted MS2 of candidates 2 and 3 for Peak 212

Candidate 1 has two predicted fragments, including the frragment releasing a

NH3 group. We saw that NH3 losses are mainly connected to the prresence of an amino group in PAHs skeleton (seen in Table 4-1 for 2-amino-3-methylimidazo[4,5-f] quinoline, 2-amino-3,8-dimethylimidazo[4,5-f] quinoxaline, 2-amino-9H-pyrido[2,3- b]]indole, 2-amino-1-methyl-6-phenylimidazo [4,5-b]pyridine, and 1,8-diaminopyrene). Some azaarenes (e.g. 1-methyl-9H-pyrido[3,4-b]indole and benz(c)acridine) enabled the loss of NH3 as well. This loss was also linked to a high signal. In contrary candidates 2 and 3 are not able to release a NH3 group, making them less likely. In addition, these two last candidates possess a hydroxyl group in their structure, but were not detected in ESI-, which is contrary with what we saw for hydroxy compounds. Candidates 2 and 3 can be removed from the candidate list. Furthermore candidate 1 could also be able to produce the fragment at 231 Da (loss of CH3NO) by the breakage of two C-C bonds to release a carbonyl group and the release of the already observed of NH3. Thus, the 3- amino-2,5-diphenylcyclohexa-2,5-diene-1,4-dione is our best candiidate and could be mutagenic as it has in its structure an amino group, but no information about its use was found.

129 Chapter 4

Peak 213: This peak of formula C15H16N2O (m/z 241.1329), eluting at 20.4 min was detected in APCI + and ESI +. Thee APCI MS2 spectrum of the unknown showed two fragments (Figure 4-45). The assignned fragments were C8H8NO and C7H9N with the respective loss of C7H9N and C8H9NNO. Thus, the breakage of only one bond is involved in the formation of these two fragments. Furthermore, the mass difference in between the two fragments corresponds to a carbonyl group. This type of fragmentation was also observed in the case of the benzophenone and 1,3-diethyl-1,3--diphenylurea, where one C-C bond was broken releasing the two fragments. Using ChemSpider, 1767 chemicals with the formula C15H16N2O were proposed and Metfrag identified 116 of them with a high score (≥ 0.865), explaining one or two fragments of the ttwo observed.

After matching the log Kow and pKa values, 31 compounds were removed, leaving still 85 compounds on the list. As explained above, from one bond split the two fragments should be observed. Thus, we can remove the 28 chemicals for which only one fragment is predicted. The remaining candidates are presented in Table 4-28.

Figure 4- 45: MS2 spectrum of unknown compound Peak 213

130 Chapter 4

Table 4- 28: Candidate for Peak 213

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) (6Z)-6-(1-[2-(2- methylphenyl)hydrazino] ethylidene)cyclohexa-2,4-dien-1-one (ChemSpider ID 5201826)

O 13.0 1 2 3.0/2.8 NaH ↔ Na- NH a H3C NH

H3C + 2 isomers

(2) 4-(1-[2-(4- methylphenyl)hydrazinyl]ethylidene)c yclohexa-2,5-dien-1-one (ChemSpider ID 7349810)

O 12.2 NaH ↔ Na- 1 2 2.9/2.5

NH a H3C NH CH3

+ 1 isomer (3) N-(5-amino-2-methylphenyl)-2- methylbenzamide (Chemspider ID 963260)

a O NH2 4.0 a + a NH 0.96 2 2.4/2.3 N H3 ↔ N H2

CH3 H3C

+ 10 isomers

(4) 3-amino-4-methyl-N-(4- methylphenyl) benzamide (ChemSpider ID 15117243 )

O NH 3.1 a + a 0.96 2 2.6/2.9 N H3 ↔ N H2

a H3C NH2 CH3 + 13 isomers

131 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (5) N-(5-Amino-2-methylphenyl)-2- phenylacetamide (ChemSpider ID 6863132)

a O NH 4.2 2 a + a 0.96 2 1.3/1.9 N H3 ↔ N H2

NH

H3C + 2 isomers

(6) 3-amino-N,4-dimethyl-N- phenylbenzamide (ChemSpider ID 11856278)

CH3 2.6 a + a N 0.96 2 2.2/2.2 N H3 ↔ N H2 CH3 O a NH2 + 4 isomers (7) 1-[2-amino-5- (benzylamino)phenyl]ethanone (ChemSpider ID 9086908) 2.7 CH a + a 3 N H3 ↔ N H2 O 0.96 2 2.5/2.1 a b 4.3 H2N NH b + b N H2 ↔ N H

+ 1 isomer (8) bis(4-amino-3- methylphenyl)methanone (ChemSpider ID 11255017)

O CH 3.9 3 + 0.96 2 2.3/2.4 NH3 ↔ NH2

H2N NH2

CH3 + 6 isomers

132 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (9) 4-amino-N-benzyl-3- methylbenzamide (ChemSpider ID 16491734)

O

CH3 2.7 HN a + a 0.956 2 2.5/2.5 N H3 ↔ N H2 a NH2

+ 5 isomers (10) N-(4-aminophenyl)-N,4- dimethyl-benzamide (ChemSpider ID 16491734)

H3C 4.2 a + a N NH2 0.879 2 2.7/2.5 N H3 ↔ N H2

H3C O + 3 isomers

Based on the structure of the candidates presented in Table 4-28, we could removed candidates 6 and 10. Indeed the standards used (Table 4-1) showed that any alkyl group bound to a nitrogen atom was released during fragmentation. It was also seen with the purchased standard for Peak 50 (1-isopropyl-5-methyl-1H-indole-2,3- dione) which lost its tert-butyl group linked to the nitrogen. Candidates 6 and 10 have a methyl group on a nitrogen atom, but no loss of CH3 was observed and hence, these nine candidates are not likely. Furthermore we saw with the benzophenone and the 1,3- diethyl-1,3-diphenylurea, that there is always a bond split on one side or the other from the carbonyl group, and in the case of our unknown compound two fragments are obtained containing each an atom of nitrogen. However, for candidates 3 and 5 the two nitrogen atoms are located on the same side of the carbonyl group, thus fragment with

CxHyN2 is expected, but not seen. These 14 candidates were also removed from the list. And 34 candidates are still remaining on the list, but we cannot further decrease the number of possible candidates.

Peak 228: This peak of formula C21H11NO3 (m/z 326.0802), eluting at 17.0 min was detected in APCI +/- and ESI +. The APCI + MS2 and MS3 spectra of the unknown showed six fragments in total, which are presented in Table 4-29 with their assigned loss. Using ChemSpider, eight chemicals with the formula C21H11NO3 were proposed and Metfrag identified only one with a score of 1, 2-phenylanthra[2,3-d][1,3]oxazole- 5,10-dione, explaining only two of six fragments, not including the most intense fragments. Furthermore its log Kow did not fit the range (5.2 and 4.8 predicted by ACD and EPISuite respectively). All the other candidates had a score of 0. We cannot provide any candidate for this unknown, because it does not exist on the database or because the empirical formula is incorrect due to adducts (Na and K were not present) or coeluting compounds.

133 Chapter 4

Table 4- 29: Fragments for Peak 228 m/z Assigned formula Neutral loss 326.0619 C21H12NO3 298.0868 C20H12NO2 CO 282.0919 C20H12NO CO2 241.0501 No assignement No assignement 223.0394 C14H7O3 C7H5N C H NO (from parent ion) 195.0436 C H O 8 5 13 7 2 CO (from 223) 139.0537 C11H7 C10H5NO3

The second method we tried with this unknown was the exact mass search (neutral mass of 325.0732 Da), MetFrag found 240 molecules fitting the MS2 pattern. The only formula that showed a high score (≥ 0.897) and had at least seven degrees of unsaturation was C13H15N3O5S (10 degrees of unsaturation), explained only one fragment. In total two chemicals were found, but after matching the log Kow and pKa values none of them would have eluted in the neutral fraction, but in the acidic fraction instead. Adduct study to clarify the empirical formula and structure generation should be proceeded for further identification.

Peak 233: This peak of formula C18H21N3O5 (m/z 360.1545), eluting at 18.4 min was detected in APCI + and ESI +. The APCI + MS2 and MS3 spectra of the unknown showed six fragments in total, which are presented in Table 4-30 with their assigned loss. Using ChemSpider, 1520 chemicals with the formula C21H11NO3 were proposed and Metfrag identified ten of them with a high score (> 0.833), explaining at least three fragments of the six observed. After matching the log Kow range and pKa values, three candidates were removed and another two were also removed due to their unplanar structure. The five remaining compounds are presented in Table 4-31 and the predicted MS2 in Figure 4-46 (candidate 1), Figure 4-47 (candidate 2) and Figure 4-48 (candidates 3, 4 and 5).

Table 4- 30: Fragments for Peak 233 m/z Assigned formula Neutral loss 360.1557 C18H22N3O5 211.1080 C10H15N2O3 C8H7NO2 196.0838 C9H12N2O3 CH3 (from 211) 183.0760 C8H11N2O3 C2H4 (from 211) 164.0655 C8H9N2O2 C2H6O (from 211) 124.0755 C7H10NO C11H12N2O2 109.0521 C6H7NO C12H15N2O2

134 Chapter 4

Table 4- 31: Candidates for Peak 233, nd: log Kow not predicted

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 2-[(4-methoxyphenyl)amino]-1- methyl-2-oxoethyl 6-oxo-1-propyl- 1,6-dihydro pyridazine-3-carboxylate (ChemSpider ID 16896706) O CH3 a 11.3 NH 1 4 1.5/2.0 NaH ↔ Na - H C 3 O OO

N N CH3 O (2) ethyl 3-(2-amino-5-methoxy benzoyl)-1-(cyclopropylcarbonyl)- 4,5-dihydro-1H-pyrazole-5- carboxylate (ChemSpider ID 9385631) 1.2 a + a N H3 ↔ N H2

O 0.906 5 0.3/2.5 b N O 0.9 N b + b O N H ↔ N O a H N 2 CH3

O CH3 (3) N-(2,4-dimethoxy-5-([(2- methoxyphenyl) carbamoyl]amino) phenyl)acetamide (Chemspider ID 4991426) 11.9 NaH↔ Na-

c 12.8 HN 0.833 3 2.3/1.8 NbH↔ Nb- b O HN O CH3 13.5 O c c- H3C O N H↔ N

NH a CH3 O H3C

135 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (4) N-(2,5-dimethoxy phenyl)-2-[(3- methoxy phenyl)carbamoylamino] acetamide (Chemspider ID 21985955) O CH3 13.5 a a- H3C N H↔ N O

c HN O 13.4 0.833 3 2.5/nd NbH↔ Nb- b NH 10.9 NcH↔ Nc-- O NH a

O CH3 (5) N-(2,4-dimethoxy phenyl)-2-[(3- methoxy phenyl) carbamoylamino] acetamide (Chemspider ID 15224742)

CH3 O 13.5 NaH↔ Na-

H3C O 13.5 NbH↔ Nb- HN O 0.833 3 2.1/nd c 11.4 b NH NcH↔ Nc--

O NH a

O CH3

136 Chapter 4

Fiigure 4- 46: Predicted MS2 of candidate 1 for Peak 233

Fiigure 4- 47: Predicted MS2 of candidate 2 for Peak 233

137 Chapter 4

Figure 4- 48: Predicted MS2 of candidates 3, 4 and 5 for Peak 233

All these candidates enabled the explanation of the most intense fragment (211 Da) coming with the loss of a large part of the parent ion and in all the candidate the bond involved in the formation of this fragment is on one side of a carbonyl group. This type of break was also observed for the benzophenone and 1,3-diethyl-1,3-ddiphenylurea. Based on the explained fragments, candidates 2 and 1 would be the best as five and four fragments can be respectively predicted from the structures, while only three fragments were predicted for candidates 3, 4 and 5. 2-[(4-methoxyphenyl)amino]-1- methyl-2-oxoethyl 6-oxo-1-propyl-1,6-dihydro pyridazine-3-carboxylate (candidate 1) was available as standard and was purchased for confirmation. This chemical revealed to be very unstable and we did not manage to record any MS spectrum, as it fragmented in the source during ionisation and thus cannot be the unknown. As expplained above, MetFrag uses bond dissociation energy to rank candidates and weaker bonds, thus have a priority. We can see from its structure that there are several C-O and C-N bonds which are weaker than C-C bonds, which would give instability to the molecule, explaining the high ranking of the candidate but also the instability of the candidate.

138 Chapter 4

Peak 242: This peak of formula C12H12N2 (m/z 185.1066), eluting at 16.6 min was detected in APCI +, only. MS2 fragments were obtained with CID (Figure 4-49a) and HCD (Figure 4-49b). The losses corresponding to the fragments were assigned using MOLGEN-MSMS and were: CH3, CH4, CHN, N2, C2H4, C2H2N and C2H3N. Using ChemSpider, 500 chemicals with the formula C12H12N2 were proposed and Metfrag identified 81 candidates with a high score (≥ 0.802 including 15 with a score ≥ 0.916 with six assigned fragments) and explaining five or six of the seven observed fragments. Amongst these candidates thirteen of them had an amino functional group and were removed from the list as amino-compounds are better ionised in ESI + and should have a greater signal than in APCI +. This unknown was not detected in ESI +, supporting the absence of amino group. After matching the log Kow and pKa values, 37 candidates remained, with 15 within the score range of 0.916 to 1. These 15 candidates are presented in Table 4-32. In this case we cannot further decrease the number of candidates based on physical-chemical properties of the candidates, and/or MS2 prediction.

139 Chapter 4

Figure 4- 49: MS2 spectrum of unknown compound Peak 242 a) CID fraga mentation and b) HCD fragmentation.

140 Chapter 4

Table 4- 32: Candidates for Peak 242, na: pKa out of range 0-14, nd: log Kow not predicted

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (1) 4-(cyanomethyl)-2-propyl (ChemSpider ID 15554727) N

1 6 2.4/2.6 na

N CH3 (2) 2-(4-ethylphenyl) butanedinitrile (ChemSpider ID 25620509) N 1 6 1.6/nd na

CH3 N (3) 4-(cyanomethyl)-3-propyl benzonitrile (Chemspider ID 15555002)

N 1 6 2.4/2.6 na

H3C

N (4) 3-(cyanomethyl)-2- propylbenzonitrile (ChemSpider ID 15555691 ) N

1 6 2.4/2.6 na

N CH3 (5) 3-(cyanomethyl)-4- propylbenzonitrile (ChemSpider ID 15555015)

CH3 1 6 2.4/2.6 na N

N

141 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (6) (2,3-dimethyl-1H-indol-1- yl) (ChemSpider ID 15425225) CH3 0.953 6 2.7/2.7 na CH3 N

N (7) 3-(2-methyl-1H-indol-1-yl) propanenitrile (ChemSpider ID 15350264)

0.945 6 2.5/2.7 na N N

H3C (8) 2,2'-(4,5-dimethyl benzene-1,2- diyl) diacetonitrile (ChemSpider ID 269821) N

0.941 6 1.8/2.2 na N

CH3 CH3 (9) 3-(3-methylphenyl) pentanedinitrile (ChemSpider ID 15160596)

N 0.941 6 1.4/2.0 na

CH3 N (10) 3-(2-methylindolizin-3-yl) propanenitrile (ChemSpider ID 1735125) N 0.941 6 2.5/3.2 na

N CH3

142 Chapter 4

MetFrag Predicted log Kow Structure pKa Score fragments ACD/EPI SUITE (11) 3-(4-methylphenyl) pentanedinitrile (ChemSpider ID 14306461)

N 0.941 6 1.4/2.0 na

H3C N (12) 3-(3-methyl-1H-indol-2-yl) propanenitrile (ChemSpider ID 2820995) H N N 0.941 6 2.1/2.7 na

CH3 (13) 3-(2-methylphenyl) pentanedinitrile (ChemSpider ID 15160320)

0.941 6 1.4/2.0 na

CH3 N N (14) 3-(3-methyl-1H-indol-1-yl) propanenitrile (ChemSpider ID 2075988)

0.941 6 1.4/2.0 na

N N H3C (15) 3-(2-methylbenzyl) pyridazine (ChemSpider ID 15426110)

2.8 NaH+ ↔ Na 0.916 7 1.7/nd CH3 2.8 a b + b N N H ↔ N b N

143 Chapter 4

4.3.4 Limitation of the method and new challenges

We saw that for some peaks (40, 143, 208 and 228) it was impossible to propose either candidates at all or good candidates and this can be explained with two main reasons:

(i) Wrong empirical formula assigned. The determination of the elemental composition requires a clean mass spectrum with no interfering noise or co-eluting compounds. This is the case for Peak 40 and Peak 208 with two or three co-eluting compound with similar mass. Co-elution could be solved by using a different solvent gradient during the separation phase or another reversed-phase column. Another reason for unclear formula is that LC-MS using ESI and APCI can produce adducts, including [M+H]+ , [M+Na]+ and others [103,148]. While [M+H]+ adducts are not a problem, because already integrated in programs (e.g. MOLGEN-MSMS, MetFrag) or in databases (e.g. ChemSpider), and [M+Na]+ or [M+K]+ can be easily detectable due to their isotopic pattern on mass difference, (Table 4-4), the presence of other adducts or combination of different adducts can be a more difficult to determine. However, the nature of the adduct has to be determined before accurate mass data acquisition can be used for assigning elemental composition. One possible solution is to increase the concentration of a specific ion in the liquid phase to force the generation of the corresponding adduct [133]. For the peaks, where no candidates were proposed an exact mass search was used, but was not very successful. This method presents also the disadvantage to propose several empirical formulas in the range above 185 Da when elements C, H, N, S, O and P were included in the search, even with the most accurate spectrometer [103]. The candidates can be numerous and with the wrong elemental composition assignment. Thus, applying isotopic abundance pattern is the best way of determining the elemental composition and also removes most of the wrongly assigned molecular formulas.

(ii) The unknown compound is absent from the databases. At this stage, database searching, especially combined with predicted properties, is very useful to assist in the selection of the structure. However, the results depend greatly on the compounds present in the database. The results for Peak 208 show this, as there are no satisfactory candidates fitting the log Kow, pKa and ionisation information. An alternative approach would be the use of structure generation software which delivers all chemically possible candidates for a given molecular formula. A recent study has shown that the combination of mass spectral substructure information and calculated properties is effective in reducing the number of possible candidates using structure generation techniques for GC-EI-MS [30]. Although this could be applied to peaks with no or no good candidates (e.g. Peak 143 or Peak 208), there is generally insufficient information from fragmentation patterns to limit the number of candidates sufficiently to apply structure generation here. For instance, the losses of H2O, CO, CH2O2 (H2O + CO), C2O2 (2 CO), C2H2O3 (H2O + 2 CO) and C2H3O3 were identified for Peak 143, which results in the generation of more than 10 000 000 possible structures. As aromatic structures fragment much less no information on ring structure is available, resulting in millions of possible structures. This is shown by Peak 208, with only one fragment

144 Chapter 4 corresponding to a loss of carbonyl ([M+H-CO]+), resulting in even fewer restrictions to candidate numbers. Again here more than 10 000 000 structures were generated.

In a lot of cases a single candidate is not possible, but the number of possible candidates remains below 10 candidates, which is acceptable. Peak 213 and Peak 242 are two examples where the number of candidates is higher (34 for Peak 213 and 15 for Peak 242), including stereoisomers and many structurally highly similar compounds. MetFrag improves the identification in enabling the selection of candidates, but then it is not possible to distinguish stereoisomers from MS data alone [106] and standards or additional spectroscopic techniques, preferably NMR, to confirm the structure of the unknown are strongly needed. Sample preparation is crucial for NMR analysis, as a sufficient amount and high purity are essential and this would involve further fractionation steps to reach single compound isolation, which can be a problem for sample amount. Furthermore if standards are available and can be used to elucidate the structure, the high number of candidates can be a limitation, especially when synthesis is required for the standard. However, standards remain important in the study of the fragmentation behaviour and fragmentation rules can be implemented in the selection of candidates using standard information. This was the case for Peak 50, where we proposed 19 candidates, with some of them having a very similar structure. The purchased standard enabled us to remove 11 candidates from the list due to the fragmentation pattern.

4.3.5 Strains to diagnostic mutagenicity to support structure elucidation of unknown mutagens

Mutagenicity in specific diagnostic strains may be used to identify substructures likely contributing to mutagenicity. The sample was taken from an industrial area, where dyes, pharmaceuticals and explosives are produced. These types of chemicals often present in their structure amino- and nitro-groups and thus their presence at the sampling site is strongly expected. As seen in Chapter 2, Section 2.3.2, different strains of Salmonella exist and can be extremely sensitive to the presence of particular functional groups. The YG 1024 and YG 1041 are very sensitive to the nitroarenes and/or aromatic amines. Thus, in addition to strain TA 98, these two YG-type strains were used in this study to provide additional information during the identification process. Mutagenicity comparisons were performed as described in Chapter 3, Section 3.3.2, based on slopes expressing the number of revertants per well/g of BR. The results are presented in Figure 4-50 without S9 and Figure 4-51 with S9. Without S9 activation, the TA98 and YG1024 strains did not show any mutagenicity, while only the YG1041 strain showed some slight mutagenicity in fraction N2-8. However, the number of revertants remained low, even at the highest concentration, which is contradictory to the high sensitivity of this strain for nitroarenes. Indeed the presence of such compounds should show a very high number of revertants in comparison to the TA98, which is not the case here. Thus, the presence of nitroarenes in the fraction N-2-8 can be excluded. This was also confirmed by the results of mutagenicity with S9, where the YG1041 exhibited a slightly higher mutagenicity than the YG1024. With the YG 1041, 624.5 revertants per well/g of BR were induced and

145 Chapter 4

521.8 revertants per well/g of BR with the YG 1024. This represented a difference of mutagenicity of 16%. However, both YG-type strains showed much greater mutagenic effects after S9 activation than the TA98, for which 120.5 revertants per well/g of Br were induced. The mutagenic effects with S9 activation were 5.2 times and 4.4 times higher in YG 1041 and YG 1024 respectively than in TA 98. Thereby aromatic amines are expected to contribute to mutagenicity of fraction N-2-8. This is in accordance with the list of candidates presented above (Section 4.3.3), where most of the proposed candidates are aromatic amines.

45

40

35

30

25

20

15

10

5

0 Number of revertants per Number of plate (max=48) revertants 0.005 0.05 0.5 Concentration ( g of BR/ plate)

Figure 4- 50: Mutagenicity of fraction N-2-8 using three strains without S9 activation, TA98 ( ), YG1024 ( ) and YG1041 ( ).

146 Chapter 4

45

40

35

30

25

20

15

10

5

0 Number of revertants per Number of plate (max=48) revertants 0.005 0.05 0.5 Concentration ( g of BR/ plate)

Figure 4- 51: Mutagenicity of fraction N-2-8 using three strains with S9 activation, TA98 ( ), YG1024 ( ) and YG1041 ( ).

4.4 Conclusions

The identification of unknown compounds is of fundamental importance for the success of EDA studies. The success of the developed strategy is highly dependent on the availability of compounds databases and the compounds which are present. However, the availability of exact mass, enabling the determination of the empirical formula of the unknown compound combined with the fragmentation pattern allowed for the identification of candidates. The majority of this process combining different computer programs (MOLGEN-MSMS, MetFrag, Mzmine) remains manual and is time consuming. Thus, there is still a great need for automated database search development and improvement for the identification of polar unknown compounds, such as NIST for GC-MS. The identification is fundamental, but the confirmation is also an important step in EDA. In the case where the lack of commercially available or affordable standards present problems, a weight-of-evidence confirmation may be performed initially by matching the Ames test results with mutagenicity prediction results for the candidates, and/or by matching the retention behaviour of the unknown with predicted retention behaviour of candidates, in an effort to prioritize synthesis of standards for confirmation.

147

Chapter 5

CHAPTER 5: USE OF RETENTION AND MUTAGENICITY PREDICTION TOOLS TO DECREASE THE NUMBER OF CANDIDATES FOR CONFIRMATION

5.1 Introduction

In non-target screening and effect-directed analysis, confirmation of tentatively identified structures is typically achieved by the use of analytical standards if commercially available [22]. However, a lack of available standards for many components of environmental samples hampers confirmation, and chemical synthesis of the required standards may cause great expenses. Therefore, it is important to be able to prioritise the chemicals to be confirmed and to have a limited number of them and hence other evidences can be helpful in the selection of the candidate(s). These evidences can include retention prediction such as quantitative structure-retention relationships (QSSRs) [151] or linear solvation energy relationships (LSERs) [109] and toxicity prediction of candidates (quantitative structure-activity relationships (QSARs)) [114]. In the previous chapter, 92 candidates were proposed as possible components of the mutagenic fraction N-2-8, represented by 17 peaks of compounds with unknown structure. About half of the candidates are aromatic amines compounds. This class of chemicals is known to be widely used (e.g. dyes, pharmaceuticals), has been found in the environment [152] and is known to include potent mutagens [8]. The presence of aromatic amines was confirmed by the higher mutagenic effects in the YG 1041 and YG 1024 than in the TA 98. Based on these analytical and biological results, it was hypothesised that aromatic amines cause or at least significantly contribute to the measured mutagenicity in the fraction N-2-8. Two complementary approaches were applied to reduce the number of candidate structures for every peak including: (i) retention prediction in reversed phase liquid chromatography calculating the Chromatographic Hydrophobicity Index (CHI) using Linear Solvation Energy Relationships (LSERs) as described in Ulrich et al. [110] and (ii) mutagenicity prediction of aromatic amines on the basis of the stability of corresponding nitrenium ions as the ultimate electrophiles attacking the DNA as described in Bentzien et al. [69].

5.2 Materiel and methods 5.2.1 LC-MS/MS system

The LC system used is described in Chapter 3 (Section 3.2.3.3). Active fraction N-2-8 and calibration standards (Table 5-1) used for the determination of the Chromatographic Hydrophobicity Index (CHI) were separated on an analytical C18 reversed-phase column (Kinetex™ C18, 150 x 3.0 mm, 2.6 µm particle size, Phenomenex, Aschaffenburg, Germany). A linear gradient of ammonium acetate buffer (10 mmol/L, pH 7.3) and 0.1% in methanol was used with a flow of 0.2 mL/min with the following conditions: 0-3.2 min, 10% of methanol; 3.2-17.8 min, 10- 95% of methanol; 17.8-37.8 min, 95% of methanol, then re-equilibration of the phase. The column was placed in an oven at 22°C [153]. This LC system was coupled to a

149 Chapter 5

LTQ-Orbitrap hybrid instrument (Thermo Fisher Scientific, Bremen, Germany), described in Chapter 3 (Section 3.2.3.3).

5.2.2 LSER method

Linear Solvation Energy Relationship (LSER) applied to reduce the list of candidates for fraction N-2-8 was taken from Ulrich et al. [110]. In a first step, calibration standards (Table 5-1) with known CHI values were measured three times with the defined gradient system and column described above (Section 5.2.1). The mean value was used and plotted versus their CHI value (found in literature), providing a linear regression equation, which enabled to calculate the CHI of the unknowns from their measured retention times. In a second step, the phase parameters (descriptors a, b, s, e and v seen in Chapter 2, Section 2.7.1) were determined by measuring the solute property for a wide range of substances on different gradient systems with different pH and stationary phases. A multi-parameter linear regression was performed using Statistica 8.0 (StatSoft, Hamburg, Germany). The resulting equation was used to calculate the CHI of the candidates provided in Chapter 4.

In a third step, the solute descriptors A, B0, S, E and V for all candidate structures were predicted using ACD/ADME Suite 5.0.7 Absolv (ACD/Labs, Toronto, Canada). In a fourth step, CHI values of unknowns were compared to the CHI values of the candidates. Thus, it was possible to exclude candidates if the CHI value of the unknown did not fit the CHI range of the candidate. The comparison was based on a 95% prediction interval. Structures with predicted CHI values outside this interval were excluded. The candidate structures in this study were rather complex with the possibility of intramolecular interactions and hence may present a greater risk of deviation from prediction than most of the compounds used to develop the Absolv model. This was considered by not excluding compounds if outside of the prediction interval by less than + 0.2.

150 Chapter 5

Table 5- 1: Calibration standards Standard name Retention time (min) CHI (literature) 21.4 67.68 23.5 76.31 25.0 82.31 Valerophenone 26.1 87.33 Paracetamol 11.3 40.29 1-naphthol 24.1 78.92 Diethyl phthalate 24.5 78.96 Anisole 23.4 77.49 2-naphthol 23.7 76.09 Pyrene 28.9 100.24 Ethylbenzene 26.5 90.29 Propylbenzene 27.5 94.16 2-nitrophenol 21.3 71.08 Acetanilide 17.7 57.87 Naphthalene 26.3 89.39 28.1 96.55 3-nitrobenzoic acid 21.0 79.75 Caffeine 17.1 49.70 Benzene 23.0 78.21 Chlorbenzene 25.3 85.05 Toluene 25.3 85.76 Ibuprofen 26.9 90.85 Vanillin 18.0 56.22

5.2.3 Mutagenicity prediction

The mutagenicity prediction method applied to reduce the list of candidates for fraction N-2-8 was based on the nitrenium theory as described by Bentzien et al. [69] (Chapter 2, Section 2.7.2). In this study, 257 amines were used (183 mutagenic and 74 non mutagenic) to validate the method. The selected aromatic amines fitted the following criteria: (i) no formal charge, (ii) molecular weight lower than 500 Da, (iii) no more than one stereocentre in the molecule, (iv) less than 10 rotable bonds and (v) only one aromatic amine functionality. The method follows the suggestion of Ford et al. [117], applying reaction energies calculated relative to aniline according to Equation 2-2 in Chapter 2, Section 2.7.2. While Ford et al. [116,117] used a ΔΔE of 0 kcal/mol as a cut off criteria discriminating non-mutagenic compounds from possible mutagens, Bentzien et al. [69] included a “margin error” of ± 5 kcal/mol in the prediction of ΔΔE values. The authors could show that using this margin error the number of false positives and false negatives were greatly improved. Thus, criteria for probable mutagenicity of candidate structures were applied according to Bentzien et al [69] and are as follow: (i) ΔΔE < -5 kcal/mol compounds considered as positive Ames compounds, (ii) ΔΔE within the range [-5 + 5] kcal/mol compounds are called non-predicted and (iii) ΔΔE > +5 kcal/mol compounds considered as negative Ames compounds. This approach is applicable to mono- and poly-cyclic aromatic amines.

151 Chapter 5

The calculations were done using MOPAC (Molecular Orbital PACkage), which is a semi-empirical quantum chemistry program, allowing the calculation of the heat of formation of the different compounds present in the reaction presented in Chapter 2, section 2.7.2.

5.3 Results and Discussion 5.3.1 Application of the LSER method to fraction N-2-8

In the first step, the column was calibrated using the standard references in Table 5-1. The resulting equation is presented below and enabled the calculation of the CHI of the unknowns from their retention time (Rt), (Table 5-2).

Equation 5-1: CHI = 3.55 Rt – 4.86 with r2 = 0.9543

All the CHI values of the unknowns in this study were between 78 and 88, except for the unknown 242 with a CHI of 49. Thus, most of the unknowns required at least 78% of methanol to be eluted.

In the second step, the phase descriptors were determined by multiple regression using the Abraham equation seen in Chapter 2 (Section 2.7.1) and the resulting equation is presented below. The predicted analyte descriptors for the candidates (A, B0, S, E and V) are presented in Table 5.2 as well as the calculated CHI, their range of prediction (including the error ± 0.2).

Equation 5-2: CHIcalculated = -6.75(± 0.88)A - 34.05(± 1.13)B0 - 9.47(± 0.98)S + 1.71(± 0.58)E + 35.20(± 0.91)V + 62.80(± 0.87)

From Table 5-2, we can see if the measured CHI of the unknown is within the calculated CHI range of the corresponding candidates and predicted retention range of 22 out of all 92 candidates was in agreement with measured retention of the unknowns in fraction N-2-8.

152

Table 5- 2: Descriptor values, CHI and CHI range for the candidates in fraction N-2-8, CHI of the unknowns. +: candidate retained on the list, -: candidate eliminated from the list Candidates parameters Unknowns parameters Retention Name (ChemSpider ID) A B S E V CHI CHI Range CHI Prediction 0 calc time (min) unknown Unknown Peak 3 23.4 78.23 2-amino-5-(1,3-benzodioxol-5-yl)-4-methylthiophene- 0.23 1.04 2.18 2.15 1.79 71.99 65.62 – 78.36 + 3-carbonitrile (291445) 2-amino-4-(2,3-dihydro-1,4-benzodioxin-6-yl) 0.23 1.09 2.65 2.16 1.79 65.85 59.36 – 72.34 - thiophene-3-carbonitrile (4204525) methyl 3-amino-5-(4-cyanophenyl)thiophene-2- 0.18 0.93 2.09 1.93 1.85 78.86 72.52 – 84.80 + carboxylate (10167241) Unknown Peak 19 26.0 87.57 3H-benz[e]indole (9519149) 0.31 0.36 1.43 1.94 1.32 84.51 78.20 – 90.82 + 1H-benz[f]indole (10496645) 0.31 0.36 1.43 1.94 1.32 84.51 78.20 – 90.82 + 1H-azuleno[1,2-b]pyrrole (14524926) 0.13 0.87 1.01 1.62 1.32 71.79 65.32 – 77.84 - Unknown Peak 35 25.8 86.58 2-oxo-2H-benzo[g]chromene-3-carbaldehyde 0 0.85 1.98 1.96 1.59 74.32 67.95 – 80.68 - (9416609) Unknown Peak 50 23.9 80.01 5,6-diethyl-1H-indole-2,3-dione (13494962) 0.36 0.91 1.42 1.62 1.58 72.22 65.92 – 78.54 - 4,5,6,7-tetramethyl-1H-indole-2,3-dione (19461480) 0.36 0.9 1.5 1.47 1.58 73.78 67.47 – 81.10 + 5-tert-butyl-1H-indole-2,3-dione (517156) 0.36 0.95 1.62 1.38 1.58 70.79 64.48 – 77.10 - 5-butyl-1H-indole-2,3-dione (2339601) 0.36 0.91 1.69 1.39 1.58 71.51 65.20 – 77.81 - 5-(1-methylpropyl)-1H-indole-2,3-dione (16812332) 0.36 0.94 1.67 1.40 1.58 70.69 64.38 – 77.00 - 7-(2-methylpropyl)-1H-indole-2,3-dione (513041) 0.36 0.94 1.67 1.40 1.58 70.69 64.38 – 77.00 - 7-amino-4-propyl-2H-chromen-2-one (2012579) 0.23 0.80 1.49 1.29 1.58 77.85 71.56 – 84.15 + Unknown Peak 60 24.5 82.03 2-ethoxy-N-(3-[(phenylcarbamoyl) amino]phenyl) 1.00 1.92 3.35 2.59 2.84 63.45 56.76 – 70.14 - pyridine-3-carboxamide (16999411) N-(2-aminophenyl)-4-([2-(pyridin-3-yloxy)propanoyl] 1.06 2.26 3.69 2.80 2.84 48.60 41.75 – 55.46 - amino)benzamide (11536340) N-(2-aminophenyl)-4-([(pyridin-2-ylmethoxy)acetyl] 1.06 2.23 3.71 2.79 2.84 49.42 42.56 – 56.28 - amino)benzamide (11536336) N-(2-aminophenyl)-4-([(pyridin-3-ylmethoxy)acetyl] 1.06 2.23 3.71 2.79 2.84 49.42 42.56 – 56.28 - amino)benzamide (8197974)

Candidates parameters Unknowns parameters Retention Name (ChemSpider ID) A B S E V CHI CHI Range CHI Prediction 0 calc time (min) unknown Unknown Peak 99 24.2 81.07 2-methoxy-N'-[1-(5-methylfuran-2-yl) ethylidene] 0.26 1.17 1.77 1.50 2.08 80.19 73.81 – 86.56 + benzohydrazide (757961) benzamide, N-(4-amino-2-methoxyphenyl)-4- 0.65 1.45 2.44 1.91 2.08 62.37 55.94 – 68.81 - methoxy- (833606) acetamide, N-(4-amino-2-methoxyphenyl)-2-phenoxy- 0.65 1.46 2.50 1.88 2.08 61.41 54.97 – 67.86 - (833624) 1,3-bis(4-methoxyphenyl)urea (120696) 0.59 1.25 2.19 1.73 2.08 71.65 65.27 – 78.02 - 1,3-bis(2-methoxyphenyl)urea (260536) 0.62 1.19 2.03 1.78 2.08 75.09 68.73 – 81.45 + 1-(2-methoxyphenyl)-3-(4-methoxyphenyl)urea 0.60 1.22 2.11 1.75 2.08 73.40 67.03 – 79.56 - (260538) 1-(3-methoxyphenyl)-3-(4-methoxyphenyl)urea 0.59 1.2 2.15 1.71 2.08 73.70 67.53 – 80.06 - (761644) 1,3-bis(3-methoxyphenyl)urea (641853) 0.59 1.15 2.10 1.68 2.08 75.82 69.46 – 82.18 + 1-(2-methoxyphenyl)-3-(3-methoxyphenyl)urea 0.6 1.17 2.07 1.73 2.08 75.44 69.08 – 81.80 + (260537) 3-amino-4-methoxy-N-(2-methoxyphenyl)benzamide 0.66 1.48 2.37 1.95 2.08 62.02 55.58 – 68.45 - (8811261) 3-amino-4-methoxy-N-(3-methoxyphenyl)benzamide 0.65 1.46 2.40 1.90 2.08 62.40 55.96 – 68.83 - (16805321) 3-amino-4-methoxy-N-(4-methoxyphenyl)benzamide 0.65 1.51 2.45 1.93 2.08 60.23 53.82 – 66.72 - (13317570) bis(3-amino-4-methoxyphenyl) methanone (660964) 0.48 1.56 2.47 2.15 2.08 59.90 53.42 – 66.38 - 6-amino-3-(piperidin-1-ylcarbonyl)-2H-chromen-2- 0.23 1.44 2.52 1.85 2.01 62.38 55.88 – 68.87 - one (26898) Unknown Peak 109 23.7 79.12 2-[amino(phenyl)methylidene]-1H-indene-1,3(2H)- 0.21 1.31 2.32 2.18 1.87 64.24 57.79 – 70.69 - dione (8197974) 9-ethynyl-9H-fluoren-1-yl carbamate (2315078) 0.53 0.80 1.98 2.29 1.87 82.86 76.52 – 89.19 + Unknown Peak 138 23.9 80.08 1-methyl-2-oxo-2-(phenylamino)ethyl 6-oxo-1-propyl- 0.41 1.86 2.78 1.82 2.48 60.64 54.02 – 67.26 - 1,6-dihydropyridazine-3-carboxylate (16867074) ethyl 3-[(2-aminophenyl)carbonyl]-1-(cyclopropyl 0.18 1.83 2.64 2.12 2.41 62.74 56.11 – 69.37 -

Candidates parameters Unknowns parameters Retention Name (ChemSpider ID) A B S E V CHI CHI Range CHI Prediction 0 calc time (min) unknown carbonyl)-4,5-dihydro-1H-pyrazole-5-carboxylate (9338397) ethyl 5-(((2Z)-2-[(2-hydroxyphenyl) methylidene] 1.06 1.63 2.43 1.96 2.48 67.64 61.12 – 74.06 - hydrazine)carbonyl)-2,4-dimethyl-1H-pyrrole-3- carboxylate (18392286) Unknown Peak 143 23.3 77.73 3-(2-hydroxybenzoyl)-5H-indeno[1,2-b]pyridin-5-one 0.13 1.1 2.68 2.87 2.15 79.79 73.34 – 86.25 + (10514029) Unknown Peak 212 22.9 76.46 3-amino-2,5-diphenylcyclohexa-2,5-diene-1,4-dione 0.21 1.40 1.40 2.24 2.11 78.40 71.72 – 85.07 + (PubChem ID 12251096) Unknown Peak 213 24.4 81.71 (6Z)-6-[1-[2-(2-methylphenyl)hydrazino]ethylidene] 0.26 1.37 1.81 1.79 1.96 69.36 62.92 – 75.80 - cyclohexa-2,4-dien-1-one (5201826) (6Z)-6-[1-[2-(3-methylphenyl)hydrazino]ethylidene] 0.26 1.37 1.81 1.79 1.96 69.36 62.92 – 75.80 - cyclohexa-2,4-dien-1-one (5247178) 6-[1-[2-(4-methylphenyl)hydrazino]ethylidene] 0.26 1.37 1.81 1.79 1.96 69.36 62.92 – 75.80 - cyclohexa-2,4-dien-1-one (5253314) 4-[1-[2-(4-methylphenyl)hydrazinyl]ethylidene] 0.26 1.37 1.81 1.79 1.96 69.36 62.92 – 75.80 - cyclohexa-2,5-dien-1-one (7349810) 4-[1-[2-(3-methylphenyl)hydrazinyl]ethylidene] 0.26 1.37 1.81 1.79 1.96 69.36 62.92 – 75.80 - cyclohexa-2,5-dien-1-one (7350957) 4-[1-[2-(2-methylphenyl)hydrazino]ethylidene] 0.26 1.37 1.81 1.79 1.96 69.36 62.92 – 75.80 - cyclohexa-2,5-dien-1-one (7305321) 3-amino-4-methyl-N-(4-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (15117243) 4-amino-3-methyl-N-(2-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (14815425) 3-amino-4-methyl-N-(3-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (16801713) 4-amino-3-methyl-N-(3-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (16808230) 3-amino-4-methyl-N-(2-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 -

Candidates parameters Unknowns parameters Retention Name (ChemSpider ID) A B S E V CHI CHI Range CHI Prediction 0 calc time (min) unknown (16817934) 3-amino-2-methyl-N-(2-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (16817998) 3-amino-2-methyl-N-(3-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (16802496) 3-amino-2-methyl-N-(4-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (16810443) 4-amino-3-methyl-N-(4-methylphenyl)benzamide 0.64 1.12 2.16 1.82 1.96 72.04 65.69 – 78.40 - (16810500) 2-amino-3-methyl-N-(2-methylphenyl)benzamide 0.59 1.06 1.95 1.85 1.96 76.47 70.13 – 82.80 + (16814789) 2-amino-3-methyl-N-(3-methylphenyl)benzamide 0.59 1.06 1.95 1.85 1.96 76.47 70.13 – 82.80 + (16810019) 2-amino-3-methyl-N-(4-methylphenyl)benzamide 0.59 1.06 1.95 1.85 1.96 76.47 70.13 – 82.80 + (16801544) 2-amino-6-methyl-N-(2-methylphenyl)benzamide 0.59 1.06 1.95 1.85 1.96 76.47 70.13 – 82.80 + (16824275) 2-amino-6-methyl-N-(3-methylphenyl)benzamide 0.59 1.06 1.95 1.85 1.96 76.47 70.13 – 82.80 + (16805125) 1-[2-amino-5-(benzylamino)phenyl]ethanone 0.31 1.09 2.02 1.93 1.96 76.81 70.47 – 83.15 + (9086908) 1-[4-amino-3-(benzylamino)phenyl]ethanone 0.37 1.12 2.15 1.91 1.96 74.12 67.77 – 80.47 - (19441053) bis(4-amino-3-methylphenyl)methanone (11255017) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - 3,3'-diamino-4,4'-dimethylbenzophenone (508004) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - (4-amino-2-methylphenyl)(4-amino-3-methylphenyl) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - methanone (15990795) bis(4-amino-2-methylphenyl)methanone (15990812) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - (3-amino-5-methylphenyl)(4-amino-3-methylphenyl) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - methanone (15990839) bis(3-amino-5-methylphenyl)methanone (15990847) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - (3-amino-5-methylphenyl)(4-amino-2-methylphenyl) 0.45 1.20 2.25 2.02 1.96 70.09 63.72 – 76.46 - methanone (15990856)

Candidates parameters Unknowns parameters Retention Name (ChemSpider ID) A B S E V CHI CHI Range CHI Prediction 0 calc time (min) unknown 4-amino-N-benzyl-3-methylbenzamide (16491734) 0.48 1.11 2.30 1.79 1.96 72.09 65.71 – 78.46 - 3-amino-N-benzyl-2-methylbenzamide (16815979) 0.48 1.11 2.30 1.79 1.96 72.09 65.71 – 78.46 - 3-amino-N-benzyl-4-methylbenzamide (16822378) 0.48 1.11 2.30 1.79 1.96 72.09 65.71 – 78.46 - 2-amino-N-benzyl-3-methylbenzamide (16821380) 0.43 1.05 2.10 1.82 1.96 76.41 70.08 – 82.75 + 2-amino-N-benzyl-6-methylbenzamide (16818487) 0.43 1.05 2.10 1.82 1.96 76.41 70.08 – 82.75 + 2-amino-N-benzyl-5-methylbenzamide (497152) 0.43 1.05 2.10 1.82 1.96 76.41 70.08 – 82.75 + Unknown Peak 233 24.1 80.72 ethyl 3-(2-amino-5-methoxybenzoyl)-1-(cyclopropyl- 0.18 2.03 2.77 2.19 2.61 61.65 55.12 – 68.58 - carbonyl)-4,5-dihydro-1H-pyrazole-5-carboxylate (9385631) N-(2,4-dimethoxy-5-([(2-methoxy phenyl) carbamoyl] 1.04 1.81 2.85 2.17 2.68 65.05 58.46 – 71.64 - amino)phenyl)acetamide (4991426) N-(2,5-dimethoxy phenyl)-2-[(3-methoxy phenyl) 0.94 1.96 2.97 2.06 2.68 59.29 52.66 – 65.52 - carbamoylamino] acetamide (21985955) N-(2,4-dimethoxy phenyl)-2-[(3-methoxy phenyl) 0.41 2.07 2.90 1.88 2.68 59.48 52.76 – 66.19 - carbamoylamino]acetamide (15224742) Unknown Peak 242 15.2 49.08 4-(cyanomethyl)-2-propylbenzonitrile (15554727) 0 0.44 1.62 0.96 1.59 90.05 83.67 – 96.43 - 2-(4-ethylphenyl) butanedinitrile (25620509) 0 0.56 1.70 1.1 1.59 85.44 79.08 – 91.80 - 4-(cyanomethyl)-3-propylbenzonitrile (15555002) 0 0.44 1.62 0.96 1.59 90.05 83.67 – 96.43 - 3-(cyanomethyl)-2-propylbenzonitrile (15555691) 0 0.44 1.62 0.96 1.59 90.05 83.67 – 96.43 - 3-(cyanomethyl)-4-propylbenzonitrile (15555015) 0 0.44 1.62 0.96 1.59 90.05 83.67 – 96.43 - (2,3-dimethyl-1H-indol-1-yl)acetonitrile (15425225) 0 0.49 1.46 1.46 1.52 88.41 82.11 – 94.70 - 3-(2-methyl-1H-indol-1-yl) propanenitrile (15350264) 0 0.50 1.52 1.44 1.52 87.46 81.17 – 93.76 - 2,2'-(4,5-dimethyl benzene-1,2-diyl) diacetonitrile 0 0.45 1.57 1.02 1.59 90.28 83.93 – 96.64 - (269821) 3-(3-methylphenyl) pentanedinitrile (15160596) 0 0.49 1.67 0.97 1.59 87.89 81.51 – 94.27 - 3-(2-methylindolizin-3-yl) propanenitrile (1735125) 0 0.55 1.38 1.29 1.52 86.83 80.53 – 93.12 - 3-(4-methylphenyl) pentanedinitrile (14306461) 0 0.49 1.67 0.97 1.59 87.89 81.51 – 94.27 - 3-(3-methyl-1H-indol-2-yl) propanenitrile (15160320) 0.31 0.47 1.55 1.45 1.52 86.12 79.83 – 92.42 - 3-(2-methylphenyl) pentanedinitrile (2075988) 0 0.49 1.67 0.97 1.59 87.89 81.51 – 94.27 - 3-(3-methyl-1H-indol-1-yl) propanenitrile (15426110) 0 0.50 1.52 1.44 1.52 87.46 81.17 – 93.76 -

Chapter 5

Applying the LSER model, predicted retention of 22 out of 92 candidates was in agreement with measured retention of the unknowns in fraction N-2-8. Thus, the most likely candidates remain for peaks 3, 19, 50, 99, 109, 143, 212 and 213. Unfortunately no candidate remains for peaks 35, 60, 138, 233 and 242. Since this approach was based on database search and databases are always limited, it may be hypothesised that the compounds behind these peaks may be absent from the database.

5.3.2 Application of the nitrenium stability to the selected candidates for fraction N-2-8

Using MOPAC program, the ΔHf for the aniline (ΔEPhNH2) was 20.41 kcal/mol and ΔHf for the corresponding nitrenium ion of the aniline (ΔEPhN+H) was 126.27 kcal/mol. The calculated values of ΔHf for the candidates having an amino group in their structure, the ΔHf of the corresponding nitrenium as well as the resulting values of ΔΔE and the prediction are presented in Table 5-3.

Table 5- 3: Mutagenicity prediction results for the selected candidates for fraction N-2-8, with np: non predicted and *: should be listed as np Amine Nitrenium ΔH ΔH ΔΔE Mutagenicity Candidate name f f (ΔEAr-NH2) (ΔEAr-N+H) (kcal/mol) prediction (kcal/mol) (kcal/mol) Peak 3 2-amino-5-(1,3-benzodioxol-5-yl)-4- 398.75 72.26 - 432.34 Ames + methylthiophene-3-carbonitrile 2-amino-4-(2,3-dihydro-1,4-benzodioxin-6-yl) 12.67 69.51 - 49.01 Ames + thiophene-3-carbonitrile methyl 3-amino-5-(4-cyanophenyl)thiophene-2- - 3.87 102.46 0.48 np carboxylate Peak 50 7-amino-4-propyl-2H-chromen-2-one - 52.73 58.60 5.48 Ames - Peak 60 N-(2-aminophenyl)-4-([2-(pyridin-3- - 21.97 78.04 - 5.84 Ames + yloxy)propanoyl] amino)benzamide N-(2-aminophenyl)-4-([(pyridin-2- - 20.37 73.07 - 12.42 Ames + ylmethoxy)acetyl] amino)benzamide N-(2-aminophenyl)-4-([(pyridin-3- - 15.71 78.43 - 11.71 Ames + ylmethoxy)acetyl] amino)benzamide Peak 99 benzamide, N-(4-amino-2-methoxyphenyl)-4- -58.02 91.88 44.05 Ames - methoxy- acetamide, N-(4-amino-2-methoxyphenyl)-2- -57.78 201.41 153.34 Ames - phenoxy- 3-amino-4-methoxy-N-(2- -53.86 90.53 38.54 Ames - methoxyphenyl)benzamide 3-amino-4-methoxy-N-(3- -57.62 35.83 -12.40 Ames + methoxyphenyl)benzamide 3-amino-4-methoxy-N-(4- -57.31 93.22 44.68 Ames - methoxyphenyl)benzamide bis(3-amino-4-methoxyphenyl) methanone -57.19 199.21 150.54 Ames - 6-amino-3-(piperidin-1-ylcarbonyl) -2H- - 71.16 33.20 - 1.49 np chromen-2-one Peak 138 ethyl 3-[(2-aminophenyl)carbonyl]-1- 0 61.36 - 44.50 Ames + (cyclopropyl carbonyl)-4,5-dihydro-1H- pyrazole-5-carboxylate

158 Chapter 5

Amine Nitrenium ΔH ΔH ΔΔE Mutagenicity Candidate name f f (ΔEAr-NH2) (ΔEAr-N+H) (kcal/mol) prediction (kcal/mol) (kcal/mol) Peak 212 3-amino-2,5-diphenylcyclohexa-2,5-diene-1,4- 24.30 134.72 4.57 np dione Peak 213 3-amino-4-methyl-N-(4- 2.67 107.53 - 0.99 np methylphenyl)benzamide 4-amino-3-methyl-N-(2- 2.40 110.75 2.49 np methylphenyl)benzamide 3-amino-4-methyl-N-(3- 2.49 108.46 0.11 np methylphenyl)benzamide 4-amino-3-methyl-N-(3- 1.81 110.14 2.47 np methylphenyl)benzamide 3-amino-4-methyl-N-(2- 3.27 108.94 - 0.19 np methylphenyl)benzamide 3-amino-2-methyl-N-(2- 4.87 110.57 - 0.16 np methylphenyl)benzamide 3-amino-2-methyl-N-(3- 4.36 110.10 - 0.12 np methylphenyl)benzamide 3-amino-2-methyl-N-(4- 4.37 110.06 - 0.16 np methylphenyl)benzamide 4-amino-3-methyl-N-(4- 1.72 110.04 2.46 np methylphenyl)benzamide 2-amino-3-methyl-N-(2- 2.44 111.22 2.92 np methylphenyl)benzamide 2-amino-3-methyl-N-(3- 0.03 108.47 2.58 np methylphenyl)benzamide 2-amino-3-methyl-N-(4- 0.03 108.48 2.59 np methylphenyl)benzamide 2-amino-6-methyl-N-(2- 3.11 111.52 2.56 np methylphenyl)benzamide 2-amino-6-methyl-N-(3- 2.59 111.17 2.72 np methylphenyl)benzamide 1-[2-amino-5-(benzylamino)phenyl]ethanone 11.47 99.21 - 18.12 Ames + 1-[4-amino-3-(benzylamino)phenyl]ethanone 12.61 97.10 - 21.36 Ames + bis(4-amino-3-methylphenyl)methanone 2.52 212.08 103.69 Ames - 3,3'-diamino-4,4'-dimethylbenzophenone 0.53 217.64 111.26 Ames -* (4-amino-2-methylphenyl)(4-amino-3- 1.94 219.89 112.10 Ames -* methylphenyl) methanone bis(4-amino-2-methylphenyl)methanone 3.35 222.40 113.19 Ames -* (3-amino-5-methylphenyl)(4-amino-3- 1.51 215.87 108.50 Ames -* methylphenyl) methanone bis(3-amino-5-methylphenyl)methanone 2.59 215.17 106.73 Ames -* (3-amino-5-methylphenyl)(4-amino-2- 2.88 218.42 109.68 Ames -* methylphenyl) methanone 4-amino-N-benzyl-3-methylbenzamide 1.55 111.04 3.64 np 3-amino-N-benzyl-2-methylbenzamide 6.18 111.44 - 0.60 np 3-amino-N-benzyl-4-methylbenzamide 4.57 109.73 -0.70 np 2-amino-N-benzyl-3-methylbenzamide 1.82 110.48 2.80 np 2-amino-N-benzyl-6-methylbenzamide 19.87 128.90 3.17 np 2-amino-N-benzyl-5-methylbenzamide 3.56 110.29 0.88 np Peak 233 ethyl 3-(2-amino-5-methoxybenzoyl)-1- (cyclopropyl carbonyl)-4,5-dihydro-1H- - 84.80 19.81 - 1.25 np pyrazole-5-carboxylate

159 Chapter 5

The mutagenicity prediction method identified nine amino-candidates as mutagens and thirteen as non-mutagens. However, six of the latter have two amino- groups in their structure and do not fulfil the criteria defined by Bentzien et al. [69] to develop his prediction (only one amino functionality). Thus, these six candidates are listed as non-predicted. Twenty four amino-candidates cannot be determined as mutagens or non mutagens, because their ΔΔE value are within the [-5 +5] kCal/mol range.

5.3.3 Combining retention and mutagenicity prediction

The results combining prediction from LSER and mutagenicity are presented in Table 2-4. Two candidates, 1-[2-amino-5-(benzylamino)phenyl]ethanone (Peak 213) and 2-amino-5-(1,3-benzodioxol-5-yl)-4-methylthiophene-3-carbonitrile (Peak 3) were positively identified by both methods, meaning that they could be strongly expected to be present in the fraction N-2-8 and involved in the mutagenic effects. Furthermore ten other compounds can be considered as very good candidates. They are all possible candidates (aromatic amine structure) with the LSER approach and they also could be responsible for the high mutagenicity response with the YG-types strains. Another nine candidates (positive with LSER) could not have their mutagenicity potential evaluated by the model we described above, as they do not possess an amino group in their structure. However, such compounds cannot be excluded from the list of candidates because they may still contribute to the mutagenicity of the fraction, via other mechanisms of reaction than the nitrenium ion. For instance, the unknown corresponding to Peak 109 was also found in fraction A-2-8 which showed low mutagenicity, none of the proposed candidates are aromatic amines, but we cannot exclude them of being involved in the low mutagenicity for fraction A-2-8. 7-amino-4-propyl-2H-chromen-2-one, which is a candidate for Peak 50, was predicted to be non mutagenic, but its elution properties matched the elution properties of the unknown and was kept in the list of candidates for the confirmation step. This choice was justified by the fact that some compounds are not directly mutagenic but become mutagenic when reacting with another compound present in the mixture. This type of chemicals is called co-mutagens. Totsuka et al. [154] and Nishigaki et al. [155] showed that norharman (non-mutagenic) and aniline (non-mutagenic) when present in the same mixture formed the aminophenylnorharman compound, which is mutagenic with TA 98 and YG1024. This chemical was formed by the cytochrome P450 present in the S9 mix, followed by further activations to form the corresponding nitrenium. Fraction N-2-8 showed high mutagenicity only by activation, so some kind of synergy with the matrix of the blue rayon or other compounds present in this fraction could also be possible.

160 Chapter 5

Table 5- 4: Candidates of interest for the confirmation step of the blue rayon EDA study, with np: non predicted, na: non applicable and #: candidate available as standard Retention Mutagenicity Candidate name time prediction prediction Peak 3 2-amino-5-(1,3-benzodioxol-5-yl)-4-methylthiophene-3-carbonitrile + + methyl 3-amino-5-(4-cyanophenyl)thiophene-2-carboxylate + np Peak 19 3H-benz[e]indole + na 1H-benz[f]indole + na Peak 50 4,5,6,7-tetramethyl-1H-indole-2,3-dione + na 7-amino-4-propyl-2H-chromen-2-one # + - Peak 99 2-methoxy-N'-[1-(5-methylfuran-2-yl)ethylidene]benzohydrazide + na 1,3-Bis(2-methoxyphenyl)urea # + na 1,3-Bis(3-methoxyphenyl)urea # + na 1-(2-methoxyphenyl)-3-(3-methoxyphenyl)urea # + na Peak 109 9-ethynyl-9H-fluoren-1-yl carbamate + na Peak 143 3-(2-hydroxybenzoyl)-5H-indeno[1,2-b]pyridin-5-one + na Peak 212 3-amino-2,5-diphenylcyclohexa-2,5-diene-1,4-dione + np Peak 213 2-amino-3-methyl-N-(2-methylphenyl)benzamide # + np 2-amino-3-methyl-N-(3-methylphenyl)benzamide # + np 2-amino-3-methyl-N-(4-methylphenyl)benzamide # + np 2-amino-6-methyl-N-(2-methylphenyl)benzamide # + np 2-amino-6-methyl-N-(3-methylphenyl)benzamide # + np 1-[2-amino-5-(benzylamino)phenyl]ethanone + + 2-amino-N-benzyl-3-methylbenzamide # + np 2-amino-N-benzyl-6-methylbenzamide # + np 2-amino-N-benzyl-5-methylbenzamide + np

5.4 Conclusions

The combination of the LSER and mutagenicity prediction enabled us to severely decrease the number of candidates for the confirmation step. Two compounds were positively predicted by both methods, but they are not commercially available as standards and would require to be synthesised. In addition ten other candidate compounds are commercially available. Thus, the confirmation involving standards could easily start with these available standards. The combination of these two approaches enabled us to prioritise compounds for the confirmation step of the EDA study.

161

Chapter 6

CHAPTER 6: SUMMARY AND FUTURE WORK

The methods developed throughout this thesis work provide a good basis for EDA studies of planar mutagens in water samples. The passive sampler used, blue rayon (BR), is specific to this class of compounds and thus lowers the complexity of the water extract due to its high selectivity. The Ames Fluctuation Test is a reliable test enabling the detection of mutagenicity in sample and has the advantage of the availability of multiple strains of Salmonella with different sensitivity to different classes of compounds.

The aim of this was to end up with the chemical and biological measures of the candidates provided for fraction N-2-8. Despite our extensive efforts so far, we were unable to clarify the mutagenicity. We saw that this fraction showed mutagenicity only with S9 activation, and it would be worthwhile to assess the mixture toxicity of the identified compounds in the fraction N-2-8 by measuring the mutagenicity. Further, it may be possible that due to some kind of synergism or matrix/mixture effects, mutagenicity is measurable only in the fraction, while pure single compounds will not show any mutagenicity.

The three-step fractionation method presented in this work showed that losses in the sample were limited to 20% and that the compounds were efficiently separated within the fractions/subfractions, with overlapping less than 10%. Beside these efficiencies in separating and recovering the sample, the method also provides information regarding physico-chemical properties of the eluted substances. Most of the previously published fractionation procedures used with BR extracts rely mainly on chromatography, such that only information on partitioning coefficient between octanol/water (log Kow) is available. The method described in this thesis work gives information about the acid/base properties of the compounds in addition to the log Kow. Thus, this method provides two criteria to help in the identification of unknown compounds. In general this fractionation method can be applied to any other samples collected with BR, as it is specific for planar compounds. As seen in [83] and Chapter 4 of this work, a given fraction can still be quite complex, even after extensive fractionation. The inclusion of a fourth step may improve this fractionation procedure. There are numerous stationary phases that provide other types of interactions in addition of the lipophilicity, thus changing the elution of analytes. The 2D-plot used to select the two RP-HPLC columns (polymeric C18 and phenyl-hexyl) was effective in selecting the best complementarity between these two phases and could be also used for the selection of a third column. While addition of a third column could improve the existing procedure, a fourth step was not possible in this study due to the small amount of sample remaining. This is the main disadvantage of subsequent fractionation. However, an additional fractionation step could result in single component isolation and hence NMR analysis would be accessible for structure elucidation, should sufficient sample be available. Such structural information would be of great help in the identification of compounds as this procedure often relies on

163 Chapter 6 database searches and the confirmation remains very difficult due to the lack of standards. Another enhancement to the fractionation procedure could be implementation as an automatic setup such as the multi-step normal-phase fractionation method developed by Lubcke-von Varel et al. [77]. This would decrease the amount of solid phase extraction required to concentrate the sample and could help in limiting the risk of losses and contamination when handling the sample/fractions.

High resolution mass spectrometry coupled to LC is a powerful combination for screening and identification purposes. Three different types of identification procedures were used in this thesis work: (i) The target screening. Due to the limited number of standards used here (47), this strategy did not yield any identification. Expansion of the target library can improve this strategy as shown by Hogenboom and al. [91]. The authors used a self-created database containing about 3000 water pollutants, enabling the identification of many pollutants in sewage effluents, surface waters and groundwater. (ii) The suspect-compounds screening. This yielded the identification of pigment yellow 1 in fraction N-7-12. This strategy, if relying on a larger library including the suspect compounds as well as their transformation products, is a promising strategy for identification of compounds in water samples as showed in Kern et al. [94], using a large library containing more than 1700 suspect compounds. Awareness of the role of transformation products in water contamination is growing rapidly. Thus, there is a huge need for studies investigating the fate of contaminants in the environment in order to predict their transformation, and a need of software/program development such as the University of Minnesota Pathway Prediction System, able to predict products of microbial metabolism. Further development involving primarily products of abiotic reactions (e.g. hydrolysis, photolysis) could also be a great improvement in the creation of suspect compounds libraries. In general, the more compounds are present in databases, the greater the chances for identification. (iii) The non-target screening. This was by far the most challenging and time consuming strategy applied to this work. This strategy relies on the empirical formula determination, followed by database searches using the formula and the match of predicted and experimental MSn fragmentation pattern. In general, this identification procedure requires as much information as possible and satisfying the highest number of criteria to select the candidates. In many cases MSn fragmentation and retention

(expressed by the log Kow) are most commonly used criteria in the identification of chemicals by LC-MS/MS. In the present work two other selection criteria were introduced and proved to be helpful in the selection of the candidates. The first one was the pKa of candidates to determine if the candidate could be present in the fraction of interest (acidic, neutral or basic fraction). The second classifier introduced was the ionisation behaviour of analytes. The analytical study of reference standards enabled the identification of typical losses for the different classes of compounds (Chapter 4, Section 4.3.1) and to derive ionisation rules (see Chapter 4, Table 4-2) for the different classes of compounds. Such rules were applied during the identification process and were found to be useful to decrease the number of candidate for the peaks of interest, as

164 Chapter 6 seen for instance for Peak 40 in Chapter 4, Section 4.3.3.2. The acid/base properties

(pKa) for the selection of candidates is specific to the procedure described in this work as it involved a fractionation step for the separation of chemicals according to their pKa value(s). In contrast, the behaviour of the different classes of compounds during LC- MS/MS analysis is more a universal classifier that could be used in any identification procedure. Applying this strategy, benzyl(diphenyl) phosphine oxide was identified and confirmed and a list of 92 candidates was provided for 13 peaks. The use of MetFrag greatly improved the searches through thousands of compounds having the same empirical formula, in matching the predicted and experimental MSn fragments in database (e.g. Peak 213 in Chapter 4, Section 4.3.3.2). Thus, this computer program was rather efficient and helpful in the selection of the best candidates due to the ranking score and represents one of the strengths of the strategy applied here for the identification of compounds. Additionally, MetFrag is very user-friendly and freely available. However, even with strong software, the searches remain dependant on the databases and the compounds present. Thus, these databases need to be developed further, to help in the identification of unknown compounds. This dependence of databases is a limiting factor to non-target identification. For instance we saw that for many peaks of interest it was not possible to suggest any candidates. One of the likely reasons is the absence of the chemical from the databases. In GC-MS, structure generation proved to be an alternative strategy to overcome the absence of structures from database [156,157], but we saw in this work that the lack of fragmentation in LC- MS/MS is a severe obstacle to the application of structure generation to LC-MS/MS data. The lack of fragmentation means that insufficient structural information is available to limit the number of candidates. Higher collision energies increase the fragmentation of chemicals and may provide more information. Another possibility is the in-source fragmentation. In most cases, negative ionisation provides very few fragments. However, in a recent study [158], Wu and co-workers used in-source fragmentation under negative APCI to provide intense fragments allowing the structural identification of triclosan. Improvement in negative ionisation mode could provide some more information regarding the fragmentation pattern of an unknown, and could be combined with the identified fragments obtained in positive mode, enabling the use of the structure generation. Failure in the determination of the correct empirical formula, leading to the lack of candidates for some peaks could be attributed to adduct formation. Solvent adducts or adducts containing one or more alkali metals could be present as mentioned by Weiss et al. [101]. These adducts should be removed prior to determination of the molecular formula, which is the basis of the unknown strategy identification. It could be valuable to incorporate into the method a computer program enabling the determination of the nature of the adducts before proceeding to the determination of the elemental formula [103]. This feature is now being built into many workflows (e.g. [159]) that have not yet been evaluated fully (an evaluation was beyond the scope of work here). Another reason could be also due to the loss of a portion of the molecule during the ionisation process (e.g. loss of water or alkyl group). This results in the wrong molecular ion detection and hence to the wrong empirical formula. Thus, it would be worthwhile to further

165 Chapter 6 investigate the role of in-source fragmentation and possible losses during ionisation and retry the database searches.

Matching experimental with computational fragmentation spectra and physical- chemical properties is a valid method for rapid discrimination of compounds with the same empirical formula. However, there are often several possible candidate structures for an individual peak. Thus, further classifiers are needed to exclude as many candidates as possible. In this work we proposed the use of two methods. The first one was based on the quantitative structure retention relationship (QSRR) approach, using the linear solvation-energy relationship (LSER) and was applied to fraction N-2-8. This method is based on the Abraham equation, which although developed in the 1990’s, has only recently been applied in in EDA studies. We proved here that it helps to drastically reduce the number of candidates. The proposed list of 92 candidates was reduced to 22 using this approach. The chromatographic hydrophobicity index (CHI) equation proposed by Ulrich et al. [110] was easy to use and only requires a calibration of the LC-column with standards before application. In general, this approach can be easily included in EDA studies as a pre-step for confirmation. The LSER approach using CHI is of universal approach as is it nearly independent from gradient set-up and column dimension and thus can be used in any other EDA studies. The second approach was based on the quantitative structure activity relationship (QSAR), approach using the stability of nitrenium ion theory of aromatic amines. This method was easy to use to predict mutagenicity of compounds and did not require further experiments, only calculations. In comparison to the LSER approach, this method is specific to mutagenic compounds involving aromatic amines. However, in such EDA studies with BR sampling, this method can be easily applied to prioritise candidates for confirmation. Thus, it is really important to develop and apply new classifier in the selection of candidates during identification process.

The combination of specific passive sampler, fractionation and novel candidate selection for unknown identification is a very promising method for the isolation and investigation of mutagenic compounds in water matrices.

166 Acknowledgements

ACKNOWLEDGEMENTS

First of all, I would like to thank Prof. Dr. Henner Hollert, Department of Ecosystem Analysis, Institute for Environment Research, RWTH Aachen University, for giving me the opportunity to prepare my thesis under his official supervision and, thus, being my first referee.

My expressed appreciation also goes to Prof. Dr. Andreas Schäffer, RWTH Aachen University for being my second referee and by this showing his interest in research on effect directed analysis.

Most important, I am thankful to my supervisor, Dr. Werner Brack, Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research (UFZ), Leipzig, Germany, for the opportunity to be involved in an interdisciplinary group, for his constructive advices and comments all along the experimental work and writing time, and for his patience in dealing with my daily complains about the food at UFZ and the weather in Leipzig.

Special thanks go to Emma Schymanski, for her continuing support, help and discussions until the submission of this work. It would not have been possible without you and your expertise in dealing with MOLGEN-MSMS and MetFrag. Beside the fantastic collaboration thanks Emma, Stan and Angus for your friendship, all shared diners, bike rides and our crazy UNO games…Stan, you are still the L…

Many thanks to Mahmoud Bataineh, and Martin Krauss for their support in dealing with the Orbitrap, in providing experimental help, ideas and comments.

Furthermore I want to thanks Ronald de Boer and Rikke Brix for their LC-MS teaching, Gisela Umbuzeiro for her kind advices about the Ames test, Pavel Juradja and his group for their help during the sampling campaigns in Czech Republic, Georg Reifferscheid, Sebastian Buchinger and Ramona Pfänder for welcoming me at the BfG Institute to conduct further Ames test and finally all indirect contributors, who developed programs used in this thesis, Markus Meringer, Sebastian Wolf and Steffen Neumann.

Thanks to the past and present WANA members, Angela Sperreuter for her help in the HPLC labs and for teaching me some German, Ivonne Löffler and Margit Petre for providing me with Ames test results, Tobias Schulze for the energy calculations and his friendship, Peter Van der Ohe for the correction of the German abstract and his friendship, Georg Streck for organising the sampling campaigns, Ines, Marion, Katrin, Conny, Steffi L., Thomas, Sonja and Egina for their kindness, fun in the labs, and for their efforts to communicate with me.

This work was supported by the Marie Curie Research Training Network KEYBIOEFFECTS (RTN-CT-2006-035695) and many thanks to Helena Guash and Anita Geiszinger for the coordination and management of the project. Thanks to all the Keybioeffects people for the good time spent together, Anja, Tineke, Stanley, Candida, Helen, Jennifer, Lorenzo, Roberta, Guillaume and Ricardo.

Acknowledgements

Thanks to my “PhD” friends for our great discussions, big laughs and for being of great support especially when the mood was bad. Thanks for making this adventure with me: Steffi (unforgettable SETAC holidays, fun…the only one who could fly with a SETAC tag as ID paper!!!), Micha (our hilarious discussions, your hope for improving my Deutsch pronunciation…and…Kirsche oder Kirche? Still no differences), Nathalie and Saskia (the inseparable friends…who drove with me to Linköping…what an adventure and the Scandic Hotel still remembers us!!!), Fred (les nombreux fous rire, ton sens de l’humour, tes commentaires, ton accueil a Leipzig et les soirées Feuerzangenbowle), Cynthia and Marco (merci pour votre gentillesse et ma journée au ski…slow motion ski day and big laugh), Christine (thanks for your help in the labs and the great taste of the Federweisser wine sitting on the bean bag!!!), Nadin (thanks for your help in applying your method and our great laugh in the zoo…next time we go to monkey island!!!), Nicole (thanks for the support in the labs and the enjoyable week end in Warnemunde during Father’s day…), Urte (thanks for your advises and the nice day at the swimming pool…), Leslie and Chloe (merci de votre accueil à Girona, pour les fous rire…et le deliverable), Jana, Eszter and Marja (always had great time in meetings and thanks for my memorable time in Amsterdam), Uli, Caro, Arne, Daniel and René for the fun we had together.

Thanks to my friends from all over the World for the nice holidays we had together and for forgetting me not giving news, Jérôme, Caro et Walter, Véro et Jean- Louis, Myra and Grant, Bronwyn, Melanie and Simon. Thanks Melanie for correcting my Frenglish.

Thanks to my adopted Australian family Paul, Brian and Gail, and Belinda for making me laugh and smile even from so far away.

Thanks to the Linköping gang, Sara, Chamilly and Jacob for your support and friendship during the hard time of the writing in Sweden.

"Tack till "Plan 13" i Linköping, för att ni välkomnat mig och hjälpt mig överleva min första svenska vinter": Hanna, Björn, Anita S., Anders, Kerstin, Birgitta, Cilla, Vivian, Lotta, Emelie, Anita L., Annelie, Viviana, Narges, Ingmar.

Thanks to Susana Cristobal for giving me some freedom to finish my PhD.

Merci à mon amie Isa pour son soutien, ses encouragements, ses coups de téléphone, sa gentillesse, sa bonne humeur et aussi pour nos fous rire.

Michèle et Gerard Le Pestipon, merci pour tous les bons moments passés ensemble, les Noëls et vacances en Normandie.

Enfin, je voudrais remercier ma famille, qui a toujours été présente pour moi. Merci papa et maman pour votre soutien moral, les bon petits plats maison et les aller- retours à l’aéroport ou la gare. Yayane pour toutes les cartes envoyées et les petites douceurs rapportées de Suisse. Laurent et Nadine, Blandine, Sophie et Davy, et Nicole pour tous les week end et vacances passés ensemble, les bons moments devant une assiette, un japonais ou une tisane-carré de chocolat. Mes quatre petits bouts, Juliette, Elise, Mathias et Erwan (mes nièces et neveux, nés pour faire des bêtises et largement supportés par tata) et ma grand-mère, Mamée pour son éternel soutien.

168 References

REFERENCES

[1] O. Botalova, J. Schwarzbauer, T. Frauenrath, L. Dsikowitzky, Water Research 43 (2009) 3797. [2] O. Botalova, J. Schwarzbauer, N. al Sandouk, Water Research 45 (2011) 3653. [3] M. Castillo, M.C. Alonso, J. Riu, M. Reinke, G. Kloter, H. Dizer, B. Fischer, P.D. Hansen, D. Barcelo, Analytica Chimica Acta 426 (2001) 265. [4] M. Castillo, D. Barcelo, Analytica Chimica Acta 426 (2001) 253. [5] M.L. Hewitt, C.H. Marvin, Mutation Research 589 (2005) 208. [6] A. Soupilas, C.A. Papadimitriou, P. Samaras, K. Gudulas, D. Petridis, Desalination 224 (2008) 261. [7] V. Tigini, P. Giansanti, A. Mangiavillano, A. Pannocchia, G.C. Varese, Ecotoxicology and Environmental Safety 74 (2011) 866. [8] T. Ohe, T. Watanabe, K. Wakabayashi, Mutation Research-Reviews in Mutation Research 567 (2004) 109. [9] J. Robles-Molina, B. Gilbert-Lopez, J.F. Garcia-Reyes, A. Molina-Diaz, Talanta 82 (2010) 1318. [10] M.H. Devier, P. Mazellier, S. Ait-Aissa, H. Budzinski, Comptes Rendus Chimie 14 (2011) 766. [11] A. Oguri, T. Shiozawa, Y. Terao, H. Nukaya, J. Yamashita, T. Ohe, H. Sawanishi, T. Katsuhara, T. Sugimura, K. Wakabayashi, Chemical Research in Toxicology 11 (1998) 1195. [12] T. Shiozawa, A. Tada, H. Nukaya, T. Watanabe, Y. Takahashi, M. Asanoma, T. Ohe, H. Sawanishi, T. Katsuhara, T. Sugimura, K. Wakabayashi, Y. Terao, Chemical Research in Toxicology 13 (2000) 535. [13] F. Kummrow, C. Rech, C.A. Coimbrao, D.A. Roubicek, G.A. Umbezeiro, Mutation Research 541 (2003) 103. [14] V.S. Houk, Mutation Research 277 (1992) 91. [15] E. Pehlivanoglu-Mantas, D.L. Sedlak, Water Res 40 (2006) 1287. [16] H. Hayatsu, Journal of chromatography A 597 (1992) 37. [17] H. Kataoka, Journal of Chromatography A 774 (1997) 121. [18] M. Murkovic, Analytical and Bioanalytical Chemistry 389 (2007) 139. [19] R.S. Padda, C. Wang, J.B. Hughes, R. Kutty, G.N. Bennett, Environ Toxicol Chem 22 (2003) 2293. [20] J. Namiesnik, B. Zabiegala, A. Kot-Wasik, M. Partika, A. Wasik, Analytical and Bioanalytical Chemistry 381 (2005) 279. [21] B. Vrana, G.A. Mills, I.J. Allan, E. Dominiak, K. Svensson, J. Knutsson, G. Morrison, G. R., Trac-Trends in Analytical Chemistry 24 (2005) 845. [22] W. Brack, M. Schmitt-Jansen, M. Machala, R. Brix, D. Barcelo, E. Schymanski, G. Streck, T. Schulze, Analytical and Bioanalytical Chemistry 390 (2008) 1959. [23] M.C. Diaz-Baez, L.E. Cruz, D. Rodriguez, J. Perez, C.M. Vargas, Environmental Toxicology 15 (2000) 345. [24] M. Hecker, H. Hollert, Environmental Science and Pollution Research 16 (2009) 607. [25] N. Bandow, R. Altenburger, G. Streck, W. Brack, Environmental Science & Technology 43 (2009) 7343. [26] M. Grung, R. Lichtenthaler, M. Ahel, K.E. Tollefsen, K. Langford, K.V. Thomas, Chemosphere 67 (2007) 108. [27] M. Grung, K. Naes, O. Fogelberg, A.J. Nilsen, W. Brack, U. Lubcke-von Varel, K.V. Thomas, Journal of Toxicology and Environmental Health-Part a-Current Issues 74 (2011) 439. [28] S. Kaisarevic, U. Lubcke-von Varel, D. Orcic, G. Streck, T. Schulze, K. Pogrmic, I. Teodorovic, W. Brack, R. Kovacevic, Chemosphere 77 (2009) 907. [29] T. Smital, S. Terzic, R. Zaja, I. Senta, B. Pivcevic, M. Popovic, I. Mikac, K.E. Tollefsen, K.V. Thomas, M. Ahel, Ecotoxicology and Environmental Safety 74 (2011) 844.

169 References

[30] T. Watanabe, T. Hasei, T. Ohe, T. Hirayama, K. Wakabayashi, Genes and Environment 28 (2006) 173. [31] W. Brack, Analytical and Bioanalytical Chemistry 377 (2003) 397. [32] H. Hayatsu, T. Oka, A. Wakata, Y. Ohara, T. Hayatsu, H. Kobayashi, S. Arimoto, Mutation Research 119 (1983) 233. [33] H. Hayatsu, T. Hayatsu, S. Arimoto, H. Sakamoto, Analytical Biochemistry 235 (1996) 185. [34] S. Kira, Y. Nogami, T. Ito, H. Hayatsu, Environmental Toxicology 14 (1999) 279. [35] S. Kira, K. Nogami, H. Taketa, H. Hayatsu, Bulletin of Environmental Contamination and Toxicology 57 (1996) 278. [36] S. Kira, Y. Nogami, T. Ito, H. Hayatsu, Marine Environmental Research 46 (1998) 267. [37] N.S. Lebedeva, E.V. Parfenyuk, Journal of Thermal Analysis and Calorimetry 87 (2007) 437. [38] H. Sakamoto, H. Hayatsu, Bulletin of Environmental Contamination and Toxicology. 44 (1990) 521. [39] T. Morisawa, T. Mizuno, T. Ohe, T. Watanabe, T. Hirayama, H. Nukaya, T. Shiozawa, Y. Terao, H. Sawanishi, K. Wakabayashi, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 534 (2003) 123. [40] T. Fujimoto, Y. Tsuchiya, N. Shibuya, M. Taiyoji, T. Nishiwaki, K. Nakamura, M. Yamamoto, Tohoku Journal of Experimental Medicine 211 (2007) 171. [41] H. Fukazawa, H. Matsushita, Y. Terao, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 491 (2001) 65. [42] H. Kataoka, T. Hayatsu, G. Hietsch, H. Steinkellner, S. Nishioka, S. Narimatsu, S. Knasmuller, H. Hayatsu, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 466 (2000) 27. [43] H. Nukaya, T. Shiozawa, A. Tada, Y. Terao, T. Ohe, T. Watanabe, M. Asanoma, H. Sawanishi, T. Katsuhara, T. Sugimura, K. Wakabayashi, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 492 (2001) 73. [44] H. Nukaya, J. Yamashita, K. Tsuji, Y. Terao, T. Ohe, H. Sawanishi, T. Katsuhara, K. Kiyokawa, M. Tezuka, A. Oguri, T. Sugimura, K. Wakabayashi, Chemical Research in Toxicology 10 (1997) 1061. [45] T. Ohe, T. Watanabe, Y. Nonouchi, T. Hasei, Y. Agou, M. Tani, K. Wakabayashi, Mutation Research 655 (2008) 28. [46] T. Takamura-Enya, T. Watanabe, A. Tada, T. Hirayama, H. Nukaya, T. Sugimura, K. Wakabayashi, Chemical Research in Toxicology 15 (2002) 419. [47] T. Tsukatani, Y. Tanaka, N. Sera, N. Shimizu, S. Kitamori, N. Inoue, Environmental Health and Preventive Medicine 8 (2003) 133. [48] T. Watanabe, H. Nukaya, Y. Terao, Y. Takahashi, A. Tada, T. Takamura, H. Sawanishi, T. Ohe, T. Hirayama, T. Sugimura, K. Wakabayashi, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 498 (2001) 107. [49] T. Watanabe, H. Ohba, M. Asanoma, T. Hasei, T. Takamura, Y. Terao, T. Shiozawa, T. Hirayama, K. Wakabayashi, H. Nukaya, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 609 (2006) 137. [50] T. Watanabe, T. Shiozawa, Y. Takahashi, T. Takahashi, Y. Terao, H. Nukaya, T. Takamura, H. Sawanishi, T. Ohe, T. Hirayama, K. Wakabayashi, Mutagenesis 17 (2002) 293. [51] T. Watanabe, Y. Takahashi, T. Takahashi, H. Nukaya, Y. Terao, T. Hirayama, K. Wakabayashi, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 519 (2002) 187. [52] Y. Nogami, R. Imaeda, T. Ito, S. Kira, Environmental Toxicology 15 (2000) 500. [53] W.R. Kusamran, K. Wakabayashi, A. Oguri, A. Tepsuwan, M. Nagao, T. Sugimura, Mutation Research Letters 325 (1994) 99. [54] T. Ohe, N. Takeuchi, T. Watanabe, A. Tada, H. Nukaya, Y. Terao, H. Sawanishi, T. Hirayama, T. Sugimura, K. Wakabayashi, Environmental Health Perspectives 107 (1999) 701. [55] Y. Ono, I. Somiya, Y. Oda, Water Research 34 (2000) 890.

170 References

[56] D.A. Alvarez, P.E. Stackelberg, J.D. Petty, J.N. Huckins, E.T. Furlong, S.D. Zaugg, M.T. Meyer, Chemosphere 61 (2005) 610. [57] L.J. Bao, E.Y. Zeng, Trac-Trends in Analytical Chemistry 30 (2011) 1422. [58] F. Kummrow, C.M. Rech, C.A. Coimbrao, G.A. Umbuzeiro, Mutation Research- Genetic Toxicology and Environmental Mutagenesis 609 (2006) 60. [59] G.A. Umbuzeiro, F. Kummrow, D.A. Roubicek, M.Y. Tominaga, Environment International 32 (2006) 359. [60] G.A. Umbuzeiro, D.A. Roubicek, P.S. Sanchez, M.I.Z. Sato, Mutation Research- Genetic Toxicology and Environmental Mutagenesis 491 (2001) 119. [61] B.N. Ames, J. McCann, E. Yamasaki, Mutation Research 31 (1975) 347. [62] D.M. Maron, B.N. Ames, Mutation Research 113 (1983) 173. [63] J. Kazius, R. McGuire, R. Bursi, Journal of Medicinal Chemistry 48 (2005) 312. [64] K. Mortelmans, E. Zeiger, Mutat Res 455 (2000) 29. [65] G.L. Borosky, J Mol Graph Model 27 (2008) 459. [66] M.G. Knize, F.T. Hatch, M.J. Tanga, E.Y. Lau, M.E. Colvin, Environ Mol Mutagen 47 (2006) 132. [67] P. McCarren, G.R. Bebernitz, P. Gedeck, S. Glowienke, M.S. Grondine, L.C. Kirman, J. Klickstein, H.F. Schuster, L. Whitehead, Bioorg Med Chem 19 (2011) 3173. [68] M. Watanabe, M. Ishidate, T. Nohmi, Mutation Research 234 (1990) 337. [69] J. Bentzien, E.R. Hickey, R.A. Kemper, M.L. Brewer, J.D. Dyekjaer, S.P. East, M. Whittaker, J Chem Inf Model 50 (2010) 274. [70] Y. Hagiwara, M. Watanabe, Y. Oda, T. Sofuni, T. Nohumi, Mutation Research 291 (1993) 171. [71] T. Ohe, D.T. Shaughnessy, S. Landi, Y. Terao, H. Sawanishi, H. Nukaya, K. Wakabayashi, D.M. DeMarini, Mutation Research-Fundamental and Molecular Mechanisms of Mutagenesis 429 (1999) 189. [72] W. Brack, T. Kind, H. Hollert, S. Schrader, M. Möder, Journal of chromatography A 986 (2002) 55. [73] H. Qiu, M. Takafuji, X. Liu, S. Jiang, H. Ihara, Journal of chromatography A 1217 (2010) 5190. [74] G.A. Umbuzeiro, H.S. Freeman, S.H. Warren, D.P. de Oliveira, Y. Terao, T. Watanabe, L.D. Claxton, Chemosphere 60 (2005) 55. [75] J.J. Kirkland, Journal of chromatography A 1060 (2004) 9. [76] M. Yang, S. Fazio, D. Munch, P. Drumm, Journal of Chromatography A 1097 (2005) 124. [77] U. Lubcke-von Varel, G. Streck, W. Brack, Journal of Chromatography A 1185 (2008) 31. [78] USEPA (2007) Estimation Program Interface (EPI) Suite ™, V3.20. United States Environmental Protection Agency, USA. [79] M.H. Abraham, M. Roses, C.F. Poole, S.K. Poole, Journal of Physical Organic Chemistry 10 (1997) 358. [80] M. Laven, T. Alsberg, Y. Yu, M. Adolfsson-Erici, H.W. Sun, Journal of Chromatography A 1216 (2009) 49. [81] W. Brack, K. Schirmer, L. Erdinger, H. Hollert, Environmental Toxicology and Chemistry 24 (2005) 2445. [82] C.J. Houtman, A.M. Van Oostveen, A. Brouwer, M.H. Lamoree, J. Legler, Environmental Science & Technology 38 (2004) 6415. [83] E.L. Schymanski, M. Bataineh, K.U. Goss, W. Brack, Trac-Trends in Analytical Chemistry 28 (2009) 550. [84] W.T. Liao, W.M. Draper, S.K. Perera, Analytical Chemistry 80 (2008) 7765. [85] S.D. Richardson, Analytical Chemistry 79 (2007) 4295. [86] N.S. Zimmermann, J. Gerendás, A. Krumbein, Molecular Nutrition & Food Research 51 (2007) 1537. [87] T. Reemtsma, Journal of chromatography A 1000 (2003) 477. [88] M.E. Rybak, D.L. Parker, C.M. Pfeiffer, Journal of Chromatography B-Analytical Technologies in the Biomedical and Life Sciences 861 (2008) 145.

171 References

[89] A. Makarov, E. Denisov, A. Kholomeev, W. Baischun, O. Lange, K. Strupat, S. Horning, Analytical Chemistry 78 (2006) 2113. [90] M. Krauss, H. Singer, J. Hollender, Analytical and Bioanalytical Chemistry 397 (2010) 943. [91] A.C. Hogenboom, J.A. van Leerdam, P. de Voogt, Journal of chromatography A 1216 (2009) 510. [92] R. Rodil, J.B. Quintana, P. Lopez-Mahia, S. Muniategui-Lorenzo, D. Prada-Rodriguez, Journal of Chromatography A 1216 (2009) 2958. [93] A. Ballesteros-Gomez, S. Rubio, Analytical Chemistry 83 (2011) 4579. [94] S. Kern, K. Fenner, H.P. Singer, R.P. Schwarzenbach, J. Hollender, Environmental Science & Technology 43 (2009) 7039. [95] M. Bataineh, U. Lubcke-von Varel, H. Hayen, W. Brack, Journal of the American Society for Mass Spectrometry 21 (2010) 1016. [96] I. Bobeldijk, J.P.C. Vissers, G. Kearney, H. Major, J.A. van Leerdam, Journal of Chromatography A 929 (2001) 63. [97] J.F. Garcia-Reyes, I. Ferrer, E.M. Thurman, A. Molina-Diaz, A.R. Fernandez-Alba, Rapid Communications in Mass Spectrometry 19 (2005) 2780. [98] D.W. Hill, T.M. Kertesz, D. Fontaine, R. Friedman, D.F. Grant, Analytical Chemistry 80 (2008) 5574. [99] J. Hollender, H. Singer, D. Hernando, T. Kosjek, E. Health, Xenobiotics in the urban water cycle: mass flows, environmental processes, mitigation and treatment strategies, Springer, Dordrecht, 2009. [100] E.M. Thurman, I. Ferrer, A.R. Fernandez-Alba, Journal of Chromatography A 1067 (2005) 127. [101] J.M. Weiss, E. Simon, G.J. Stroomberg, R. de Boer, J. de Boer, S.C. van der Linden, P.E.G. Leonards, M.H. Lamoree, Analytical and Bioanalytical Chemistry 400 (2011) 3141. [102] E. Pitarch, T. Portoles, J.M. Marin, M. Ibanez, F. Albarran, F. Hernandez, Analytical and Bioanalytical Chemistry 397 (2010) 2763. [103] T. Kind, O. Fiehn, Bmc Bioinformatics 8 (2007). [104] M. Katajamaa, M. Oresic, Bmc Bioinformatics 6 (2005). [105] M. Meringer, S. Reinker, J.A. Zhang, A. Muller, Match-Communications in Mathematical and in Computer Chemistry 65 (2011) 259. [106] S. Wolf, S. Schmidt, M. Muller-Hannemann, S. Neumann, Bmc Bioinformatics 11 (2010). [107] K. Heberger, Journal of chromatography A 1158 (2007) 273. [108] K. Valko, M. Plass, C. Bevan, D. Reynolds, M.H. Abraham, Journal of chromatography A 797 (1998) 41. [109] M.H. Abraham, A. Ibrahim, A.M. Zissimos, Journal of chromatography A 1037 (2004) 29. [110] N. Ulrich, G. Schüürmann, W. Brack, Journal of chromatography A (2011). [111] M. Vitha, P.W. Carr, Journal of chromatography A 1126 (2006) 143. [112] R. Benigni, A. Giuliani, R. Franke, A. Gruska, Chemical Reviews 100 (2000) 3697. [113] H.X. Liu, E. Papa, P. Gramatica, Chemical Research in Toxicology 19 (2006) 1540. [114] R. Benigni, L. Passerini, G. Gallo, F. Giorgi, M. Cotta-Ramusino, Environ Mol Mutagen 32 (1998) 75. [115] R. Franke, A. Gruska, C. Bossa, R. Benigni, Mutat Res 691 (2010) 27. [116] G.P. Ford, G.R. Griffin, Chemico-Biological Interactions 81 (1992) 19. [117] G.P. Ford, P.S. Herman, Chemico-Biological Interactions 81 (1992) 1. [118] P. Plaza-Bolanos, A.G. Frenich, J.L.M. Vidal, Journal of chromatography A 1217 (2010) 6303. [119] P.A. Carneiro, G.A. Umbuzeiro, D.P. Oliveira, M.V.B. Zanoni, Journal of Hazardous Materials 174 (2010) 694. [120] D.P. Oliveira, P.A. Carneiro, M.K. Sakagami, M.V.B. Zanoni, G.A. Umbuzeiro, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 626 (2007) 135.

172 References

[121] S.K. Brack W., Environmental Sciences and Technology 37 (2003) 3062. [122] H.-Y. Chen, M. Preston, Analytica Chimica Acta 501 (2004) 71. [123] M.T. Galceran, E. Moyano, Journal of chromatography A 715 (1995) 41. [124] S. Pérez, G. Reifferscheid, P. Eichhorn, D. Barceló, Environmental Toxicology Chemistry 22 (2003) 2576. [125] P.S. Bäuerlein, J.E. Mansell, T.L. ter Laak, P. de Voogt, Environmental Science & Technology 46 (2012) 954. [126] E.L. Schymanski, C. Meinert, M. Meringer, W. Brack, Analytica Chimica Acta 615 (2008) 136. [127] E.L. Schymanski, M. Meringer, W. Brack, dx.doi.org/10.1021/ac102574h. [128] T. Ohe, Mutation Research-Genetic Toxicology and Environmental Mutagenesis 393 (1997) 73. [129] M. Katajamaa, J. Miettinen, M. Oresic, Bioinformatics 22 (2006) 634. [130] J.M. Weiss, T. Hamers, K.V. Thomas, S. van der Linden, P.E.G. Leonards, M.H. Lamoree, Analytical and Bioanalytical Chemistry 394 (2009) 1385. [131] I. Becue, C. Van Poucke, C. Van Peteghem, Journal of Mass Spectrometry 46 (2011) 327. [132] A. Wick, G. Fink, T.A. Ternes, Journal of chromatography A 1217 (2010) 2088. [133] T. Kind, O. Fiehn, Bioanal Rev (2010). [134] S. Neumann, S. Böcker, Analytical Bioanalytical Chemistry 398 (2010) 2779. [135] SigmaAldrich (2010), http://www.sigmaaldrich.com/analytical-chromatography/ analytical-products.html?TablePage=9657655. [136] A. Plum, A. Rehorek, Journal of Chromatography A 1084 (2005) 119. [137] A. Rehorek, A. Plum, Analytical and Bioanalytical Chemistry 388 (2007) 1653. [138] SPARC (2010) SPARC Online Calculator http://sparc.chem.uga.edu/sparc. [139] ChemSpider (2010) http://chemspider.com. [140] F.W. McLafferty, F. Turecek (F.W. McLafferty, F. Turecek(F.W. McLafferty, F. Tureceks), Interpretation of Mass Spectra University Science Books 1993. [141] R.M. Smith (R.M. Smith),R.M. Smiths), Understanding Mass Spectra: A Basic Approach John Wiley & Sons, Inc., Hoboken, New Jersey, 2004. [142] S.A. Lanina, P. Toledo, S. Sampels, A. Kamal-Eldin, J.A. Jastrebova, Journal of chromatography A 1157 (2007) 159. [143] E. Straube, W. Dekant, W. Volkel, Naunyn-Schmiedebergs Archives of Pharmacology 369 (2004) R140. [144] S.J. Lupton, B.P. McGarrigle, J.R. Olson, T.D. Wood, D.S. Aga, Rapid Communications in Mass Spectrometry 24 (2010) 2227. [145] J.S. Brodbelt, Mass Spectrometry Reviews 16 (1997) 91. [146] M. Krauss, C. Gallampois, E. Schymanski, W. Brack, SETAC, Milan, Italy, 2011. [147] N. Leroux, M. Goethals, T. Zeegershuyskens, Vibrational Spectroscopy 9 (1995) 235. [148] E.M. Thurman, I. Ferrer, D. Barcelo, Analytical Chemistry 73 (2001) 5441. [149] E. Ventola, P. Vainiotalo, M. Suman, E. Dalcanale, Journal of the American Society for Mass Spectrometry 17 (2006) 213. [150] E. Schymanski, T. Schulze, J. Hermans, W. Brack, in D.B.l.A.G. Kostianoy (Editor), The Handbook of Environmental Chemistry: Effect-Directed Analysis of Complex Environmental Contamination, Springer-Verlag Berlin Heidelberg, 2011. [151] S. Funar-Timofei, W.M.F. Fabian, G.M. Simu, T. Suzuki, Croatica Chemica Acta 79 (2006) 227. [152] A. Fekete, A.K. Malik, A. Kumar, P. Schmitt-Kopplin, Critical Reviews in Analytical Chemistry 40 (2010) 102. [153] C. Hug, N. Ulrich, M. Krauss, T. Schulze, W. Brack, SETAC North America, Boston, MA, USA (2011). [154] Y. Totsuka, T. Takamura-Enya, R. Nishigaki, T. Sugimura, K. Wakabayashi, J Chromatogr B Analyt Technol Biomed Life Sci 802 (2004) 135. [155] R. Nishigaki, Y. Totsuka, T. Takamura-Enya, T. Sugimura, K. Wakabayashi, Mutat Res 562 (2004) 19. [156] E.L. Schymanski, M. Meringer, W. Brack, Analytical Chemistry 83 (2011) 903.

173 References

[157] E.L. Schymanski, M.J. Gallampois, M. Krauss, M. Meringer, S. Neumann, T. Schulze, S. Wolf, W. Brack (2012), DOI:10.1021/ac203471y. [158] J.L. Wu, J. Liu, Z.W. Cai, Rapid Communications in Mass Spectrometry 24 (2010) 1828. [159] T. Kind, K.H. Liu, D.Y. Lee, O. Fiehn, Abstracts of Papers of the American Chemical Society 239 (2010).

174 Appendix

APPENDIX

List of Figures in Appendix I

Figure AI-1: Mass spectrum of peak 35 Figure AI-2: Mass spectrum of peak 36 Figure AI-3: Mass spectrum of peak 40 Figure AI-4: Mass spectrum of peak 50 Figure AI-5: Mass spectrum of peak 60 Figure AI-6: Mass spectrum of peak 99 Figure AI-7: Mass spectrum of peak 109 Figure AI-8: Mass spectrum of peak 138 Figure AI-9: Mass spectrum of peak 143 Figure AI-10: Mass spectrum of peak 208 Figure AI-11: Mass spectrum of peak 212 Figure AI-12: Mass spectrum of peak 213 Figure AI-13: Mass spectrum of peak 228 Figure AI-14: Mass spectrum of peak 233 Figure AI-15: Mass spectrum of peak 242

175 Appendix

Figure AI-1: Mass spectrum of peak 35

Figure AI-2: Mass spectrum of peak 36

Figure AI-3: Mass spectrum of peak 40

176 Appendix

Fiigure AI-4: Mass spectrum of peak 50

Fiigure AI-5: Mass spectrum of peak 60

Fiigure AI-6: Mass spectrum of peak 99

177 Appendix

Figure AI-7: Mass spectrum of peak 109

Figure AI-8: Mass spectrum of peak 138

Figure AI-9: Mass spectrum of peak 143

178 Appendix

Fiigure AI-10: Mass spectrum of peak 208

Figure AI-11: Mass spectrum of peak 212

Fiigure AI-12: Mass spectrum of peak 213

179 Appendix

Figure AI-13: Mass spectrum of peak 228

Figure AI-14: Mass spectrum of peak 233

Figure AI-15: Mass spectrum of peak 242

180 Curriculum Vitae

CURRICULUM VITAE

Name: Christine Gallampois Birthdate: 05/02/1974 Birth place: Chatillon sous Bâgneux, France

From 11/2010 Research position, University of Linköping, Cell Biology Department, Linköping, Sweden 04/2007 – 11/2010 PhD Student, Helmholtz Research Centre for Environment – UFZ, Department of Effects-Directed-Analysis, Leipzig, Germany

Education

05/2000 DEA: analytical and Marine Chemistry, University of Bretagne Occidentale, Brest, France 09/1996 – 09/1998 Licence and Maitrise: Organic and Analytical Chemistry, University of Paris 11, Orsay, France 09/1995 – 06/1996 Chemistry studies, University of René Descartes, Paris, France

181