University of North Dakota UND Scholarly Commons
Theses and Dissertations Theses, Dissertations, and Senior Projects
January 2019 Applying Machine Learning To Asteroid Classification Utilizing Spectroscopically Derived Spectrophotometry Kathleen Jacinda Mcintyre
Follow this and additional works at: https://commons.und.edu/theses
Recommended Citation Mcintyre, Kathleen Jacinda, "Applying Machine Learning To Asteroid Classification Utilizing Spectroscopically Derived Spectrophotometry" (2019). Theses and Dissertations. 2474. https://commons.und.edu/theses/2474
This Thesis is brought to you for free and open access by the Theses, Dissertations, and Senior Projects at UND Scholarly Commons. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of UND Scholarly Commons. For more information, please contact [email protected]. APPLYING MACHINE LEARNING TO ASTEROID CLASSIFICATION UTILIZING SPECTROSCOPICALLY DERIVED SPECTROPHOTOMETRY
by
Kathleen Jacinda McIntyre Bachelor of Science, University of Florida, 2011 Bachelor of Arts, University of Florida, 2011
A Thesis
Submitted to the Graduate Faculty
of the
University of North Dakota
in partial fulfillment of the requirements
for the degree of
Master of Science
Grand Forks, North Dakota
May 2019 ii PERMISSION
Title Applying Machine Learning to Asteroid Classification Utilizing Spectroscopically Derived Spectrophotometry
Department Space Studies
Degree Master of Science
In presenting this thesis in partial fulfillment of the requirements for a graduate degree from the University of North Dakota, I agree that the library of this University shall make it freely available for inspection. I further agree that permission for extensive copying for scholarly purposes may be granted by the professor who supervised my thesis work, or in his absence, by the Chairperson of the department of the dean of the School of Graduate Studies. It is understood that any copying or publication or other use of this thesis or part thereof for financial gain shall not be allowed without my written permission. It is also understood that due recognition shall be given to me and to the University of North Dakota in any scholarly use which may be made of any material in my thesis.
Kathleen Jacinda McIntyre April 23, 2019
iii TABLE OF CONTENTS
LIST OF FIGURES………………………………………………………………….vii
LIST OF TABLES……………………………………………………………………ix
ACKNOWLEDGEMENTS…………………………………………………………...x
ABSTRACT…………………………………………………………………………..xi
CHAPTER
1. MOTIVATION……………………………………………………………………...1
1.1. Why Asteroids? ………………………..………..……………………………...1
1.2. Statement of Purpose………………………..………..…………………………5
1.3. Outline………………………..………..……………………………...... 5
2. BACKGROUND………………………..………..……………………………...... 6
2.1. Reflectance Spectroscopy……………………..……………………………...... 6
2.2. Spectra-Altering Surface Properties…………………………………………...... 6
2.2.1. Mineralogical Composition………………………..………..…………….7
2.2.2. Space Weathering………………………..………..……………………...8
2.2.3. Particle Size……………..………..……………………………...... 10
2.2.4. Temperature and Albedo……………..……….…………………...... 11
2.3. Taxonomy……………..………..……………………………...... 12
iv 2.3.1. Early Classifications……………..………..………………………….....13
2.3.2. Bus-DeMeo Taxonomy……………..………..………………………….14
3. SPECTRAL ANALYSIS TECHNIQUES……………..………..…………………..17
3.1. Principal Component Analysis……………..………..…………………………17
3.2. Band Parameters……………..………..……………………………………….19
3.3. Machine Learning……………..………..……………………………………...21
4. OBSERVATIONS AND DATA ANALYSIS……………..………..……………….25
4.1. Observations……………..………..…………………………………………..25
4.1.1. Near-Infrared Spectra……………..………..……………………………25
4.1.2. Visible Spectra……………..………..………………………………….26
4.1.3. Assigning Bus-DeMeo Classes: c2 Thresholds……………..………..…...27
4.2. Spectrophotometry……………..………..…………………………………….33
4.2.1. Photometric Band Selection……………..………..……………………..33
4.2.2. Vega Magnitude Calculation……………..………..…………………….35
4.2.3. Color Indices……………..………..……………………………………36
4.3. Implementing Machine Learning Method using Python……………..………..…37
4.3.1. Training and Test Dataset Selection……………..………..…….……..…37
4.3.2. k-Nearest Neighbor Algorithm……………..………..………….…….…38
4.3.3. Interpreting Metrics……………..………..…………………….……….39
v 5. RESULTS AND DISCUSSION……………...………..…………………………...41
5.1. Visible Dataset…………..………..…………………………………………..41
5.2. NIR Dataset……………..………..…………………………………………..46
6. CONCLUSIONS AND FUTURE WORK……………..………..……………….…51
REFERENCES……………..………..………………………………………………53
vi LIST OF FIGURES
Figure 1-1: Plots showing the known asteroids and comets, as of 04/18/2019……………….2
Figure 1-2: Bennu taken on 12/02/2018 (NASA/Goddard/University of Arizona..…..………4
Figure 2-1: Plots of the Band I center vs. the Band II/Band I area ratio……………………...8
Figure 2-2: General space weathering processes…………………………..………………….9
Figure 2-3: Vesta’s normalized reflectance spectra compared to a Howardite ……………..10
Figure 2-4: PCA Bus taxonomic classes……………………………………………………..14
Figure 2-5: Vis and NIR spectra of Bus-DeMeo taxonomic classes………………………...15
Figure 3-1: PCA projection of Tholen taxonomy…………………………………………....18
Figure 3-2: Plot of the offset reflectance spectra of olivine and low-Ca pyroxene………….20
Figure 3-3: Machine learning algorithm flowchart…………………………………………..23
Figure 3-4: Example of k-Nearest Neighbor classification method………………………….24
Figure 4-1: Visible and NIR spectrum of (4) Vesta………………………………………….27
Figure 4-2: V, R, and I filters with Bus-DeMeo taxonomy………………………………….33
Figure 4-3: Z, J, H, and K bandpasses with Bus-DeMeo taxonomy………………………...34
Figure 4-4: Varying the number of neighbors, k…………………………………………….38
Figure 5-1: The training dataset using the V-I and V-R color indices………………………42
Figure 5-2: The test dataset using the V-I and V-R color indices………...…………………44
Figure 5-3: The S-, C-, X-class training dataset using the V-I and V-R color indices………44
Figure 5-4: The S-, C-, and X-class test dataset using the V-I and V-R color indices………45
vii Figure 5-5: The training dataset using the Z-J and J-K color indices……..…………………47
Figure 5-6: The test dataset using the Z-J and J-K color indices……….....…………………47
Figure 5-7: The S-, C-, and X-class training dataset using the Z-J and J-K color indices..….49
Figure 5-8: The S-, C-, and X-class test dataset using the Z-J and J-K color indices….……49
viii
LIST OF TABLES
Table 2-1: Descriptions of the classes in the Bus-DeMeo taxonomy…………………....….16
Table 4-1: Data, ß values, and classifications used in analysis……..…………………....….28
Table 4-2: Bandpass values used in this analysis……………………………………………36
Table 4-3: Solar color correction factors…………………………………………………….37
Table 5-1: Classification report for the visible dataset………………………………………43
Table 5-2: Classification report for the visible data S-, C-, and X-class subset………..……45
Table 5-3: Classification report for the NIR dataset…………………………………………58
Table 5-4: Classification report for the NIR data S-, C-, and X-class subset………..………50
ix ACKNOWLEDGEMENTS
I wish to express my sincere appreciation to my advisory, Dr. Mike Gaffey, and the members of my committee (Dr. Sherry Fieber-Beyer, Dr. Nels Forsman, Dr. David Whalen, and Dr. Taufique Mahmood) for their guidance and support throughout my time in the master’s program at the University of North Dakota. I would also like to thank my parents,
John and Susan McIntyre, who put up with more than a lot over the years but never stopped believing in me. And a special thanks to my sister, Tina McIntyre, who ensured my survival with coffee and granola bars while I was writing this thesis. I couldn’t have done it without all of you.
x ABSTRACT
Taxonomic studies of asteroids have been ongoing for more than fifty years without a clear understanding of the class parameters. The current method of Principal Component
Analysis is computationally expensive and leaves ambiguous results. In this study, I selected the machine learning algorithm, k-Nearest Neighbor in combination with the current Bus-
DeMeo (DeMeo, et al. 2009) taxonomic classification schema to test if machine learning can take the place of Principal Component Analysis. Using a dataset of spectrophotometric color indices derived from combined visible and near-infrared (NIR) observations and paired with
Bus-DeMeo taxonomic class, I created a training dataset for the model to learn. The results support the visible wavelength region as more diagnostic of spectral slope and the NIR wavelength region as more diagnostic for surface mineralogy. The overall accuracy scores
(>80%) of the machine learning test dataset validate the methodology, but fall short of the threshold necessary to replace current methods of classification (>95%). The overall robustness of the Bus-DeMeo taxonomy is corroborated through the relatively similar grouping structure between the C-, S-, and X-complexes in both wavelength ranges, suggesting an overall relationship between slope and qualities present across multiple wavelength regimes. This is possibly due to spectral features being closely tied to surface mineralogy and spectral reddening of the slope believed to be tied to the effects of space weathering.
xi
1 MOTIVATION
1.1 Why Asteroids?
There are currently over a half million numbered and catalogued asteroids in our solar system. A little over two hundred years ago, there were none. It was not until 1801 when
Giuseppe Piazzi was searching for the elusive planet theorized to be between Mars and Jupiter
(Kepler 1596; Bode 1772), that he discovered (1) Ceres. In the following years (2) Pallas, (3)
Juno, and (4) Vesta were all discovered around the same orbital distance. The gap between
Mars and Jupiter had been filled, not by a single large planet, but seemingly by multiple smaller bodies. The theory was soon developed that these smaller bodies were the broken pieces of a fragmented planet (Olbers 1803). It was such a captivating proposal that it took over a century for a new idea to gain traction. In the mid-1900s, Otto Schmidt proposed that asteroids were not the result of a catastrophic collision, but rather a failure of a planet to fully aggregate. The current understanding of asteroid formation is that they are the original building blocks of our solar system and studying them offers us a window back in time to the early days of our planet’s formation.
The theory of solar system formation began when a large cloud of molecular gas and dust became massive enough to trigger gravitational collapse. The majority of the mass in the cloud went to the center, where the Sun formed, and the rest flattened into a protoplanetary disk
(Mizuno 1980; Pollack 1984; Boss 1997, 1998). It was from this disk that the planets and the asteroids formed (Chambers & Wetherill, 1998). The early inner solar system was mostly rocky material, while the outer solar system was cold enough for volatiles and ices to condense on to aggregating bodies. This led to the primary divide in the main asteroid belt, where
1 siliceous S-type asteroids dominant the inner region of the main asteroid belt and carbonaceous
C-type asteroids dominate the outer areas of the Main Belt (Fisher 1941, Kitamura 1959, Wood
& Kuiper 1963, Chapman et al. 1971, Chapman et al. 1975).
The two panels on Figure 1-1 represent the asteroid population of the inner solar system.
The majority of the asteroids are in the Main Belt, located between Mars and Jupiter, (1.52-
5.20 AU), but there are also significant asteroid populations around the orbit of Jupiter
(Trojans) and near Earth’s orbit as well (near-Earth asteroids, or NEAs). Compare the area around Earth’s orbit in the left panel of Figure 1-1 to that in the right panel of Figure 1-1. The
NEA population is sparse compared to the Main Belt, but there are thousands of catalogued bodies for us to investigate in our cosmic backyard.
Figure 1-1: Plots showing the known asteroids and comets, as of April 18, 20191. Green circles represent Main Belt asteroids, red circles represent NEAs, blue circles are Jupiter Trojans, and comets are shown as blue squares. The Sun is represented by the yellow star in the center. The left panel depicts the solar system out to the orbit of Jupiter, with the orbits of the five innermost planets shown. The right panel provides a better view of the NEA population, extending out to the orbit of Mars.
1 https://cgi.minorplanetcenter.net/iau/lists/MPLists.html
2 Asteroids are of great scientific value as they allow scientists to study the early solar system, but there are more reasons to pay attention to asteroids. The general definition of an asteroid is a rocky small solar-system body that orbits the Sun at heliocentric distances ranging from interior to Earth’s orbit to Jupiter’s orbit (IAU 2006). While the general asteroid distribution of volatile rich outer-belt asteroids and rocky, thermally processed inner-belt asteroids gives us clues as to the conditions of the early solar system, there is still much we do not understand about how the orbital properties of asteroids evolved over time. Assigning asteroid families is one way to try to determine dynamical origins and migrations of asteroids.
Asteroid families are believed to be the product of the collisional breakup of a progenitor body
(Hirayama 1918). This breakup results in many smaller asteroids, which can be traced back to their parent body through orbital dynamics. To further assign an asteroid as a member of a family, its spectra can be used to genetically link members together (Farinella et al. 1992). A homogeneous progenitor would result in compositionally similar family members, but if the progenitor was a differentiated body, there should be family members of varying compositions and taxonomic classifications, making family classification more complex.
Of great importance to these dynamical studies is trying to understand and predict how asteroids migrate inwards, potentially towards Earth. NEAs are asteroids with perihelion distances less than or equal to 1.3 AU and aphelion distances greater or equal to 0.983 AU. In an effort to associate an impact risk factor with orbital properties, NEAs have been subdivided into four classes: Atiras are NEAs whose orbits are entirely encompassed within the orbit of
Earth. Atens are NEAs with Earth-crossing orbits but a semi-major axis less than Earth’s.
Apollos are NEAs with Earth-crossing orbits that have semi-major axes greater than Earth’s.
Finally, Amors are NEAs whose orbits are outside of Earth’s but within 1.3AU of the Sun at perihelion. Asteroids whose Minimum Orbit Intersection Distance with Earth is equal to or
3 less than 0.05 AU are known as Potentially Hazardous Asteroids (PHAs). The IAU currently lists 724 NEAs with non-negligible probabilities of impacting Earth (IAU 2019). One such
PHA is Bennu, currently being visited by the OSIRIS-REx spacecraft.
Figure 1-2: Compilation of eight images of asteroid Bennu taken with OSIRIS-REx’s PolyCam instrument on 12/02/2018 (NASA/Goddard/University of Arizona).
What makes NEAs a threat to Earth is also what makes them prime for scientific study: proximity. Asteroids do not emit significant amounts of their own light in the visible and near- infrared (NIR), so spectroscopic and photometric studies use the reflected light from the Sun, which I will discuss in Section 2. They also tend to be faint and difficult to observe. NEAs are closer than the majority of asteroids located in the Main Belt, and therefore relatively brighter.
They are also the topic of the new field of space resource allocation (Fieber-Beyer & Gaffey
2019). Asteroids formed from the same protoplanetary disk as Earth, and so the majority have a low relative inclination (Lecar & Franklin 1973). In addition to a low inclination, NEAs are close to Earth’s orbit, making them the easiest targets to visit for scientific and resource
4 utilization (Yarnoz et al. 2013). Motivating this study is the desire to better understand the properties of these asteroids through remote sensing and modern computational tools.
1.2 Statement of Purpose
The primary objective of this paper is to further the understanding of the relationship between asteroid classifications and observational properties by combining historical tools and perspectives with the computational capabilities of the present. To meet this objective, I will apply a machine learning algorithm to derived spectrophotometric data and compare to current classification efforts through the Bus-DeMeo taxonomy (DeMeo, et al. 2009).
1.3 Outline
Section 2 will examine the relevant background and historical information necessary to support the motivation behind this work. Section 2 outlines why scientists strive to classify observations into groups, so it only follows that Section 3 explores the techniques commonly used to justify these classifications. Section 4 defines the observations and describes the steps followed to implement a machine learning routine using derived spectrophotometry. Finally,
Sections 5 and 6 discuss the results, conclusions, and future work that can utilize this method.
5
2 BACKGROUND
2.1 Reflectance Spectroscopy
Reflectance spectroscopy is an effective measurement for the physical characterization of asteroids. Reflectance spectroscopy studies light that has been reflected, refracted, or scattered from a material, as a function of wavelength. For example, a mineral on the surface of an asteroid is hit by incident photons. Some photons will be absorbed by the mineral, some will be transmitted through the grains, while others will be scattered via reflection and refraction.
What we are detecting when we observe that asteroid are the scattered photons, leaving an absorption band feature in the spectra where the absorbed photons are missing from the continuum (Burns 1970).
The absorbed photons are converted to heat and re-emitted as thermal radiation (McCord
& Adams 1977), which will affect the flux received from the asteroid. Also, the Sun’s peak emission (~0.5 µm) is located within the wavelength region included in this study (0.8-2.5
µm), so reflectance will be maximized. Additionally, Earth’s atmosphere is relatively transparent in the visible band, with telluric corrections applied to the NIR band, making the visible-NIR the ideal wavelength region to study asteroid spectroscopy.
2.2 Spectral-Altering Surface Properties
There are multiple factors contributing to the alteration of an asteroid’s spectrum: chemical composition, space weathering, particle size, temperature, and albedo (Burns 1970; Adams
1974; Adams & McCord 1971; Hiroi et al. 1994; Lebofsky et al. 1986). This is why asteroid
6 surface characterization and diagnostic taxonomic classifications are so difficult using remote sensing techniques. We can only speculate the surface mineralogy of asteroids through observational spectra alone.
2.2.1 Mineralogical Composition
The most diagnostic wavelength region for mineralogy of silicate-based compounds is in the NIR spanning the 0.8 – 2.5 µm range (across the 1 and 2 µm absorption bands of olivine and pyroxene). Studies of the spectral properties of minerals in the NIR region have concentrated primarily on olivine and pyroxene (Clark 1957; Burns 1970; Adams 1974; King
& Ridley 1987; Cloutis & Gaffey 1991). Minerals are characterized by a unique reflectance spectrum with different spectral features. The features specific to a certain mineral come from the electronic and vibrational transitions within the molecules. At what wavelengths these features will appear in the spectrum depends on the chemical species involved and the structure of the mineral (Burns 1970).
Olivine and pyroxene appear to be the two most abundant minerals on the surface of asteroids, but the possibility that you can obtain a similar spectrum through the combination of various minerals cannot be ignored. The primary diagnostic features of olivine is comprised of three overlapping individual bands located in the ~1 µm region, while the primary diagnostic features in pyroxenes are two distinct absorption bands located near 0.9 µm and 2 µm (Clark
1957; Burns 1970). Section 3 will discuss how to use these band features to distinguish the mineralogical subtypes of the S-class asteroids. To that end, Gaffey et al. (1993) developed a classification scheme based on the ratios of olivine to pyroxene in the spectra. An advantage of exploring mineralogical sub-types is that the effects of space weathering (see section 2.2.2) are virtually nonexistent, making the classification system sensitive to only composition
(Gaffey et al. 2010). Figure 2-1 depicts the seven mineralogical subtypes along the olivine-
7 orthopyroxene mixing line from olivine-dominated at S(I)-type to pyroxene-dominated at
S(VII)-type.
A B
Figure 2-1: Plots of the Band I center vs. the Band II/Band I area ratio. (A) General representation of the mineralogical zones of the S-type subgroups. (B) The corresponding subgroups labelled S(I) – S(VII). Adapted from Figure 1E and 1F of Gaffey et al. 1993.
2.2.2 Space Weathering
An essential step in our comprehension of asteroids requires the understanding of the processes resulting in space weathering. All airless bodies in the solar system are affected by space weathering, including asteroids. Effects of space weathering are induced through solar ion irradiation, including solar wind, solar flares, and coronal mass ejections, galactic cosmic- ray ion irradiation, and micrometeoroid bombardment (Clark & Johnson 1996; Moroz et al.
1996; Hapke, 2001; Sasaki et al. 2001). Figure 2-2 in displays these processes and the resulting sputtering, displacement, and vaporization of asteroidal surface particles, which we observe as space weathering effects.
Space weathering was first predicted to occur on the lunar surface (Gold, 1955), and return samples from the Apollo program confirmed that space weathering of the lunar regolith caused its spectra to exhibit weaker features and a redder continuum (Adams & McCord 1971;
Pieters et al. 2000; Noble et al. 2001; Taylor et al. 2001). Extrapolating this theory to asteroids,
8 it becomes apparent that space weathering may alter the appearance of remote sensing observations and can explain the missing link between ordinary chondrite meteorites and their
S-type asteroid parent bodies, which plagued planetary scientists for decades (Bell et al. 1989).
Figure 2-2: Depiction of the general space weathering processes occurring on asteroidal surfaces. Adapted from Figure 1 of Pieters & Noble 2016.
Irradiation from solar wind or galactic cosmic rays can alter the surface grains through high-energy interactions that result in sputtering or chemical alteration (Keller & McKay
1997). Solar wind particles and galactic cosmic rays both primarily consist of hydrogen ions.
The flux of solar wind impacts is much higher than the galactic cosmic ray flux, but the galactic cosmic rays are much more energetic and so cannot be discounted (Brunetto et al., 2015).
Brunetto and Strazzulla (2005) describes an experiment where samples of silicate-rich rock were irradiated with various ions (H+, He+, Ar+, Ar2+) at energies ranging from 60 to 400 keV,
9 to simulate space weathering on asteroids. The reflectance spectra were both reddened and darkened in the 0.25 – 2.7 μm range, but Brunetto and Strazzulla believe that while ion irradiation may dominate in the inner solar system, micrometeoroid bombardment could be the main contributor to space weathering in the asteroid belt.
2.2.3 Particle Size
The particle size distribution of the surface regolith of asteroids can also impact its reflectance spectrum. Figure 2-3 compares the reflectance spectra between Vesta and a
Howardite EET87503 of different grain sizes, ranging from less than 25 µm to 125 µm (Hiroi et al. 1994).
Figure 2-3: Vesta’s normalized reflectance spectra compared to a Howardite EET85703 of varying particle size across the visible and NIR. Figure 2 from Hiroi et al. 1994.
10
We can see that fine particles fit the spectrum of Vesta well, indicating that the surface of Vesta is covered with a fine regolith. Also of note is how the spectrum changes as the grain size increases. The overall slope becomes bluer and the absorption band depths increase. Grain size does not affect the NIR as much as the mid-IR because the mid-IR region is dominated by thermal emission, which is more strongly affected by changes in surface grain structure
(Hargrove, 2009). The Dawn spacecraft gave us a close-up view of Vesta and its regolith
(Russell et al. 2012, 2013). Material excavated from large craters on Vesta indicates that the regolith is more than 1-kilometer-thick in some places (Denevi et al. 2016), supporting the hypothesis of a thick, fluffy layer of regolith fines.
2.2.4 Temperature and Albedo
Surface temperatures of asteroids can vary wildly depending on heliocentric distance. As noted in Section 1, asteroids are defined as small bodies from within Earth’s orbit to Jupiter’s orbit. This means the surface temperature of airless bodies at these semi-major axes ranges from about 120 K for the Trojans to over 300 K for near-Earth asteroids (Hinrichs et al., 1999).
The surface temperature of asteroids is also linked to albedo (Lebofsky et al. 1986):