Quantitative Planetary Image Analysis Via Machine Learning
Total Page:16
File Type:pdf, Size:1020Kb
Tina Memo No. 2013-008 External, PhD Thesis, University of Manchester Quantitative Planetary Image Analysis via Machine Learning. Paul Tar Last updated 25 / 09 / 2014 Centre for Imaging Sciences, Medical School, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PT. Quantitative Planetary Image Analysis via Machine Learning A thesis submitted to the University of Manchester for the degree of PhD in the faculty of Engineering and Physical Sciences 2014 Paul D. Tar School of Earth, Atmospheric and Environmental Sciences 2 Contents 1 Introduction 19 1.1 Theriseofimagingfromspace. ...... 19 1.1.1 Historicalimages ............................... 20 1.1.2 Contemporaryimages . 20 1.1.3 Futureimages.................................. 21 1.2 Sciencecase ..................................... .. 22 1.2.1 Lunarscience .................................. 22 1.2.2 Martianscience ................................ 22 1.3 Imageinterpretation ............................. ..... 23 1.3.1 Manualanalysis................................ 24 1.3.2 Automatedanalysis.............................. 24 1.4 Measurements.................................... .. 25 1.4.1 Quantitative measurements and The Scientific Method . .......... 26 1.4.2 Theroleofstatistics . ... 27 1.4.3 Assumptionsandapproximations . .... 29 1.5 Argumentforquantitativeautomation . ........ 30 1.6 Criteriaforaquantitativesystem . ......... 31 1.7 Thesisoutline ................................... ... 32 2 Literature Review 35 2.1 Representations ................................. .... 35 2.1.1 Properties .................................... 36 2.1.2 Imageencodings ................................ 36 2.1.3 Summary .................................... 40 2.2 Statisticalmodelling . ...... 40 2.2.1 Datamodellingmethods . .. 40 2.2.2 Summary .................................... 42 2.3 Classifiers ...................................... .. 43 2.3.1 ABayesOptimalclassifier . ... 43 2.3.2 Fixeddecisionboundarymethods . ..... 44 3 2.3.3 Densitymethods ................................ 46 2.3.4 Summary .................................... 47 2.4 Performanceevaluation. ...... 47 2.4.1 Empiricalperformanceevaluations . ....... 48 2.4.2 Theoreticalperformanceevaluations . ........ 51 2.4.3 Summary .................................... 52 2.5 Applicability to quantitative measurements . ............ 52 2.5.1 Criterion: a measurement must be a numerical estimate driven by evidence foundwithinthedata ............................. 52 2.5.2 Criterion: a measurement must be accompanied by an error estimate indi- cating the expected accuracy of the measurement; and Criterion: a measure- ment must be shown in practice to not deviate from the true measurement bymorethanispredictedbytheestimatederror . ... 54 2.5.3 Criterion: a measurement, where possible, should be supported by addi- tionalcheckstoensuretrustworthiness . .... 56 2.6 Summary ........................................ 57 3 AStatisticalModelforPlanetaryImageData 58 3.1 Propertiesofplanetaryimages. ........ 58 3.1.1 Featuresandtheirmeasurements . ..... 58 3.1.2 Highlevelsofvariationwithinfeatures . ........ 61 3.1.3 Overview .................................... 62 3.2 Modelrequirements............................... .... 62 3.3 Linearhistograms................................ .... 63 3.4 Components, quantities and measurements . .......... 65 3.5 Likelihoodparameterestimation. ......... 66 3.5.1 Quantity estimation: Q ............................ 66 3.5.2 Component PMF estimation: P ........................ 67 3.6 Goodnessoffit .................................... 69 3.6.1 Modelselection ................................ 71 3.6.2 Measurementsuccessindicator. ...... 71 3.7 Monte-Carlotesting. .. .. .. .. .. ... .. .. .. .. .. ... .. ..... 72 4 3.8 Discussion...................................... .. 76 3.9 Summary ........................................ 77 4 Statistical Error Estimation 79 4.1 CramerRaoBound.................................. 79 4.1.1 Anticipatedproperties . ... 80 4.1.2 Lowerboundproperties . .. 81 4.2 Classquantitycovariance. ....... 81 4.3 Monte-Carlotesting. .. .. .. .. .. ... .. .. .. .. .. ... .. ..... 82 4.4 Discussion...................................... .. 83 4.5 Summary ........................................ 84 5 Systematic Error Estimation 86 5.1 EMErrorpropagation .............................. ... 86 5.1.1 Sourcesofuncertainty . ... 86 5.1.2 SingleEMsteperror ............................. 87 5.1.3 Erroramplification ............................. .. 89 5.2 Monte-Carlotesting. .. .. .. .. .. ... .. .. .. .. .. ... .. ..... 92 5.3 Discussion...................................... .. 94 5.4 Summary ........................................ 95 6 Martian Terrains: BRIEF Representation 97 6.1 LocalBRIEFdescriptors . ..... 97 6.2 Generating test data: Martian terrain simulator . ............. 99 6.3 Statistical properties of BRIEF histograms . ............ 100 6.4 Syntheticterraintesting . ....... 104 6.5 Discussion...................................... 105 6.5.1 Behaviourconsistentwiththeory . ...... 109 6.5.2 Problematicbehaviour . 109 6.6 Summary ........................................ 110 7 MartianTerrains:PoissonBlobRepresentation 112 7.1 BlobsandAreas.................................... 112 5 7.2 AreaErrors ....................................... 114 7.2.1 PerXbincovariancescalingapproach . ...... 114 7.2.2 Errorpropagationapproach . .... 116 7.3 Monte-Carlo and synthetic terrain testing . ........... 117 7.4 Discussion...................................... 118 7.4.1 Areameasurementsfromblobs . 123 7.4.2 Poissonblobimprovements. .... 123 7.4.3 Poissonbloblimitations . .... 124 7.5 Summary ........................................ 125 8 Lunar Crater Counting: Moon Zoo Part 1 126 8.1 MoonZooproject.................................. 126 8.1.1 PropertiesofMoonZoocraterdata . ..... 127 8.2 Craterprocessingpipeline . ....... 130 8.3 Step1:Coalescence............................... .... 131 8.3.1 ALikelihoodsolution. 131 8.3.2 Parameteraccuracy. 133 8.3.3 Coalescencealgorithm . 136 8.3.4 Testing...................................... 137 8.3.5 Discussion.................................... 137 8.4 Step2:Refinement ................................. 138 8.4.1 ALikelihoodsolution. 140 8.4.2 Templateselection ............................. 142 8.4.3 Matchscoreselection. 145 8.4.4 Refinementalgorithm. 147 8.4.5 Testing...................................... 149 8.4.6 Discussion.................................... 149 8.5 Summary ........................................ 151 9 Lunar Crater Counting: Moon Zoo Part 2 152 9.1 MoonZooreducedcraterdata. ..... 152 9.1.1 Groundtruth .................................. 153 6 9.2 Step3:Linearmodelling . ..... 154 9.2.1 Match score distributions and histogram selection . ............ 155 9.2.2 Populatinghistograms . 156 9.2.3 Testing...................................... 157 9.2.4 Discussion.................................... 160 9.3 Step4: Falsenegativecalibration . ......... 162 9.3.1 Testing...................................... 162 9.3.2 Discussion.................................... 163 9.4 Summary ........................................ 163 10 Conclusions 166 10.1Theorysummary .................................. 167 10.1.1 Findings..................................... 167 10.1.2 Strengths .................................... 168 10.1.3 Limitations .................................. 168 10.2Application ..................................... 169 10.2.1 Findings..................................... 169 10.2.2 Strengths .................................... 170 10.2.3 Limitations .................................. 171 10.3Futurework..................................... 172 10.4Conclusion..................................... 173 Word count: 55,000 7 List of Figures 1 2D two class data density with Bayes Optimal decision boundary. ......... 44 2 Example ROC curves: Algorithm A performs best, giving largest area under curve, where as algorithm C performs no better than chance.. ......... 50 3 Top: original decision boundary using training data of figure 1. Bottom: appli- cation of trained boundary when class 2 has increased in quantity, i.e. P (X 2) is | fixed, but P (2) has increased. Here, the boundary is no longer optimal, i.e. at the point where P (1 X)= P (2 X). ............................ 53 | | 4 Features found within planetary terrain images. Top row, from left to right: Craters; Dunes; Fissures. Bottom row, from left to right: Drainage network; Martian ‘spiders’; Martian chaos terrain. Images courtesy of NASA, Lunar Re- connaissance Orbiter and Mars Reconnaissance Orbiter. ............ 60 2 5 This plot shows that by using the χD goodness-of-fit function the EM ICA algo- rithm can extract an appropriate number of linear components to describe a set of example training histograms. Each curve represents a Monte-Carlo data source generated with between 1 and 10 components. Each curve crosses unity at the most appropriate point, i.e. when the number of extracted components equals the numberofgeneratingcomponents.. ..... 74 6 This plot shows that the EM quantity parameter estimation algorithm can suc- cessfully fit previously learned components to new unknown mixtures of the same components in the case when they are representative. Each data point corresponds to a model of different complexity of between 1 and 10 components. It can be seen that the goodness-of-fit function remains stable around unity for varying quantities oftrainingandtestingdata. .... 74 2 7 This plot shows that the