2009:029 MASTER'S THESIS

Collocating Satellite-Based Radar and Radiometer Measurements to Develop an Ice Water Path Retrieval

Gerrit Holl

Luleå University of Technology Master Thesis, Continuation Courses Space Science and Technology Department of Space Science, Kiruna

2009:029 - ISSN: 1653-0187 - ISRN: LTU-PB-EX--09/029--SE Master’s Thesis


June 11, 2009

[Figure: Approximate footprints for different sensors (CloudSat, MHS, HIRS, AMSU-A); axes: UTM x-pos (km) and UTM y-pos (km).]

Abstract

Remote sensing satellites can roughly be divided into operational satellites and scientific satellites. Generally speaking, operational satellites have a long lifetime and often several near-identical copies, whereas scientific satellites are unique and have a more limited lifetime, but produce more advanced data. An example of a scientific satellite is CloudSat, a NASA satellite flying in the so-called "A-Train" formation with other satellites. Examples of operational satellites are the NOAA and MetOp meteorological satellite series.

CloudSat carries a 94 GHz nadir-viewing radar instrument measuring profiles of clouds. The NOAA-15 to NOAA-18 and MetOp-A satellites carry radiometers at various frequencies ranging from the infrared (3.76 µm) to around 183 GHz (≈ 1.6 mm). The full range is covered by the High Resolution Infrared Radiation Sounder (HIRS) and the Advanced Microwave Sounding Units (AMSU-A and AMSU-B). On newer satellites, AMSU-B has been replaced by the Microwave Humidity Sounder (MHS) with nearly the same characteristics. Those instruments scan the atmosphere at angles from approximately −50° to +50° perpendicular to the ground track.

The large amount of data from operational satellites is interesting to the scientific community, particularly when combined with measurements from a scientific satellite. The degree project focuses on this combination and consists of two parts:

• The first part of the project involves searching for collocations¹ between the CloudSat radar and one of the NOAA or MetOp-A instruments. A collocation between two instruments is defined to occur when both look at the same place at the same time (within pre-set thresholds). This has been done with software developed by the student.

• Those collocations are then used to find the relation between the radiances and physical data (such as Ice Water Path (IWP)) derived from CloudSat measurements. For the tropical ocean, this relation has been compared with data from models. Additionally, an artificial neural network has been trained to retrieve IWP.

¹ Two different spellings of the word collocation are found: either co + location, which makes colocation, or con + location, which makes collocation. This text uses the spelling collocation.

Acknowledgements

This thesis work would not have existed in this form without the help of a large number of people. First of all, I would like to thank Stefan Buehler, my supervisor and teacher. It was your idea to look for collocated satellite data. Through your supervision, I have learnt a lot about atmospheric remote sensing, a field about which I had little to no knowledge when I started. Your feedback along the way was highly valuable. You were always easy to reach and it was impressive how fast you always replied to e-mails. Tack Stefan!

I would like to thank Carlos Jiménez (at Observatoire de Paris) for giving valuable feedback about neural networks. Bengt Rydberg (Chalmers University of Technology) and Ajil Kottayil (SAT-group) have helped by providing plots I used to validate my collocations. Jasna Pittman (National Center for Atmospheric Research) and collaborators have done similar research. A poster published by Pittman and collaborators was useful for comparing different results.

I have used some freely available code that was not written by myself. I would like to thank Patrick Eriksson and the atmlab community for land_sea_mask.m, Alexandre Schimel for wgs2utm.m, Matt G. for ignoreNaN.m, David Dean for hist2d.m, and the Python community for the Python programming language.

The following figures were not made by me: (2.1) (NASA/JPL), (2.2) (USGS), (2.3), (2.5), (2.6), (2.7), (2.8) (ESA), (2.4) (Arash Houshangpour), (2.9) (Viju Oommen John), (4.4) (Wikipedia contributor Colin M.L. Burnett).

My thanks to all members of the SAT-group: Stefan Buehler, Salomon Eliasson, Erik Johansson, Ajil Kottayil, Thomas Kuhn, Oliver Lemke, Mathias Milz, Isaac Moradi and Simon Östman. At the weekly group meetings, you have all helped by considering the questions arising from my work. You were and are good colleagues to have, both at work and outside work.

My mother, Tinelot Wittermans, has proofread my thesis and suggested corrections to the language.

Thanks go to all the teachers and assistants for all the courses I have taken throughout my studies. Thanks to Sven Molin, Victoria Barabash and others, for making the Erasmus Mundus Master Course in Space Science and Technology possible. Also thanks to the administrative staff at the Department of Space Science (IRV), Anette Snällfot-Brändström and Maria Winnebäck, and to the head of the department, Hans Weber.

I also want to thank my science teachers during secondary education, Hans van Dijk and Hans van Riet, for stimulating my interest in science in general and physics in particular. Without your interesting science lessons at the Pieter Nieuwland College, I am not sure if I would have chosen to study Applied Physics in my Bachelor.

Thanks to all my friends in the SpaceMaster programme for an experience I will never forget. The international environment, the time spent together in Würzburg and Kiruna, all the trips and parties, particularly in the first year, have provided me with memories that will stay with me for a very long time.

Thanks to my friends and family for sharing joy and sorrow in good and bad times, for supporting me and for simply being there.

Finally, thanks to my girlfriend, Catherine Dieval, for enriching my life.

Table of Contents

1 Introduction
1.1 Scientific background
1.2 Tools

2 Satellites and sensors
2.1 CloudSat
2.1.1 Cloud Profiling Radar
2.2 NOAA-15 – NOAA-18, MetOp-A
2.2.1 Radiometers
2.2.1.1 HIRS
2.2.1.2 AMSU
2.2.1.3 Channel summary

3 Finding collocations
3.1 Introduction
3.2 Method
3.2.1 Input data
3.2.2 Preprocessing
3.2.2.1 Finding matching granules
3.2.2.2 Converting units
3.2.2.3 Checking for data validity
3.2.2.4 Checking temporal overlap
3.2.3 Collocation algorithm
3.2.3.1 Implementation
3.2.3.1.1 HIRS and AMSU-A
3.2.4 Postprocessing
3.3 Verification and statistics

4 Using collocations
4.1 Validating simulated ice water path
4.2 Comparing ice water path retrievals
4.3 Using NN to develop IWP retrieval
4.3.1 Adding HIRS

5 Conclusions and outlook

A Complete software description
A.1 Python source file
A.1.1 relate_hdf.py
A.2 MATLAB source files
A.2.1 Core program
A.2.1.1 find_overlap.m
A.2.1.2 compare_granule.m
A.2.1.3 compare_date.m
A.2.1.4 find_all_overlap.m
A.2.2 Post-processing
A.2.2.1 find_mean_CSIWP_per_AMSU_pixel.m
A.2.2.2 fill_missing_noaa18_amsua.m
A.2.2.3 fill_mspps.m
A.2.2.4 get_data_from_overlap.m
A.2.2.5 remove_doubles.m
A.2.2.6 remove_all_doubles.m
A.2.3 Helper functions
A.2.3.1 COLNO.m
A.2.3.2 calc_distance.m
A.2.3.3 land_or_sea.m
A.2.3.4 put_in_bins.m
A.2.3.5 satboxplot.m
A.2.3.6 for_each_granule.m
A.2.3.7 find_datafile_by_date.m
A.2.3.8 find_datafile_by_unixtime.m
A.2.3.9 find_short_distance.m
A.2.4 Reader function
A.2.4.1 extract_from_overlap.m
A.2.5 Statistics and verification
A.2.5.1 make_stats_colloc.m
A.2.5.2 find_gridded_average_IWP_month.m
A.2.5.3 find_IWPcorrelated_channels
A.2.6 Functions by others
A.2.6.1 ignoreNaN.m
A.2.6.2 wgs2utm.m
A.2.6.3 hist2d.m
A.2.7 Plotter functions
A.2.7.1 plot_footprints.m
A.2.7.2 plot_example_dlat_dlong.m
A.2.7.3 plot_hist2d_dist_int.m
A.2.7.4 plot_hist2d_logIWP_BT.m
A.2.7.5 plot_scatter_CSIWP_NESDISIWP.m
A.2.7.6 plot_extract_from_overlap.m
A.2.7.7 find_lowestlat_by_interval.m
A.2.7.8 plot_average_BT_month_midlat.m
A.2.7.9 plot_hist2d_latitude_angle.m
A.2.7.10 plot_latitude_hist.m
A.2.7.11 plot_monthly_IWP_PDF.m
A.2.7.12 plot_overlap_column.m
A.2.7.13 plot_overlap_time.m
A.2.8 Test functions
A.2.8.1 echonames.m
A.2.8.2 test_modis.m
A.2.8.3 test_data_comp.m
A.2.8.4 compare_amsua_amsub_hirs.m
A.2.8.5 getfieldnames.m
A.2.8.6 LWC_height.m
A.2.9 Neural network
A.2.9.1 test_neuralnet.m

B File format
B.1 File overlaps
B.2 Collocations

C Websites

D Acronyms and glossary

List of Figures

2.1 CloudSat artist impression
2.2 CloudSat groundtrack
2.3 HIRS picture
2.4 HIRS channels
2.5 AMSU-A1 picture
2.6 AMSU-A2 picture
2.7 MHS drawing
2.8 MHS channels
2.9 AMSU-B Jacobians

3.1 Different granules time coverage
3.2 Ground track comparison
3.3 Footprint comparison
3.4 Distances CS to MHS, AMSU-A and HIRS
3.5 Number of collocations
3.6 Collocation angles, NOAA-18/MHS and CPR, 15 minutes
3.7 Collocation angles, NOAA-18/MHS and CPR, 1 minute
3.8 Collocation angles, NOAA-16/AMSU-B and CPR, 15 minutes
3.9 Collocation time interval
3.10 Collocation aliasing
3.11 Collocations January 2008
3.12 Number of collocations versus distance and interval
3.13 IWP histogram

4.1 IWP/Tb boxplot
4.2 IWP/Tb hist2d
4.3 CS IWP versus NOAA NESDIS MSPPS IWP
4.4 Neural net schematic
4.5 NN scatter plot, MHS only
4.6 NN scatter plot, MHS and HIRS
4.7 Comparison NN with and without HIRS

List of Tables

2.1 Orbital parameters
2.2 Channels used

3.1 Collocation statistics

B.1 Overlap file description
B.2 Overlap data file description
B.3 Mean overlap data file description

Chapter 1

Introduction

Spaceborne remote sensing instruments have provided the earth sciences with a wealth of information in recent decades. The first earth observation satellite was the Television InfraRed Observation Satellite-1 (TIROS-1), launched in April 1960 by the National Aeronautics and Space Administration (NASA) as part of the TIROS programme, which was eventually superseded by the satellite series operated by the National Oceanic and Atmospheric Administration (NOAA) (Rees, 2001; NASA website). Scientific and operational earth observation satellites carry a variety of passive and active instruments operating in wavelength regions ranging from microwaves to UV and in geometries ranging from limb to nadir observation. Data acquired from such instruments is used by the scientific community, meteorological agencies, and other organisations.

Combining data from different satellites allows for powerful and innovative applications. This study combines measurements from the CloudSat Cloud Profiling Radar (CPR) with radiances obtained by a variety of instruments on NOAA and MetOp satellites¹.

Chapter (2) below describes which satellites and instruments are considered. Then, in chapter (3), the methodology of finding collocations is described. Chapter (4) describes how those collocations are used. Among other things, it contains a brief introduction to artificial neural networks. Finally, before the appendices, chapter (5) concludes with a summary of the results obtained and ideas for further research. The appendices provide a description of the written software (appendix A), a description of the file format of the files containing collocation data (appendix B), various relevant and interesting websites (appendix C), and finally an overview of acronyms used in this work (appendix D).

¹ Before launch, NOAA satellites are assigned a letter. After launch, they are assigned a number. Sensor documentation is often written before the satellite launch, and may thus refer to NOAA-K through -N, meaning NOAA-15 through -18. In this thesis and the corresponding software, NOAA satellites are referred to by their number, not by their letter.

1.1 Scientific background

As defined by Rees (2001), remote sensing is the measuring of physical properties of a system without being in physical contact with it. It can be divided into active remote sensing, where the instrument transmits a signal and measures the signal coming back, and passive remote sensing, where the system measures the radiation from the source (originating at or behind the source, or scattered from elsewhere). Examples of common technologies for active remote sensing are RADAR and LIDAR. Passive remote sensing is commonly done with radiometers or imagers. For measuring the atmosphere, sensors can either look at the limb, which means looking sideways through the atmosphere with the sun or space behind it, or look toward the Earth, at a certain angle relative to the nadir direction. In this work, the latter category is studied, particularly the nadir-looking geometry.

In the atmospheric sciences, such as climatology, atmospheric remote sensing has become a very important source of information. Information can be retrieved on a wide variety of subjects such as the presence of trace gases, temperature profiles, or properties of (ice) clouds.

A good understanding of ice clouds is important for climate modelling, but considerable uncertainties currently exist. Whether ice clouds have a cooling or a heating effect depends on specific properties that are not resolved by climate models. Among those properties are the median cloud mass height, the particle size distribution and the Ice Water Path (IWP) (Buehler et al., 2007). The latter, defined as the vertically integrated mass of the ice in a column of the atmosphere, is the main property considered in this study.

Remote sensing does not measure physical properties directly. Rather, radiation is measured, and from this, properties such as Ice Water Path can be calculated. This calculation, also known as retrieval, is a non-trivial process. The problem is ill-posed, and the methods used to retrieve physical properties from remote sensing form a field of study in themselves. One such method uses an artificial neural network. Such a network can be trained by feeding it a number of inputs (radiances expressed in brightness temperature, for various channels) and targets (such as ice water path), and it then develops a mapping to calculate output values from a set of input values. How this works exactly is beyond the scope of this Master's Thesis, but Jiménez (2003) provides a good introduction.

In this thesis work, a training dataset is created by collocating measurements from the CloudSat Cloud Profiling Radar with measurements from radiometers on NOAA and MetOp satellites. The IWP product of the former forms the target (and is assumed to be correct), whereas various channels on one or more radiometers constitute the input.

1.2 Tools

The programming for this thesis was mostly done in MATLAB 7.7.0 (R2008b). A small part was done in Python 2.5.2. The thesis was written in LaTeX and compiled with pdftex 3.141592, using the text editor Vi IMproved 7.1 on an Ubuntu 8.10 Linux machine. Versioning was done with Subversion 1.5.1. All tools used, except MATLAB, are open-source software.

Chapter 2

Satellites and sensors

In this study, one active instrument is compared with many passive instruments. The active instrument is the Cloud Profiling Radar, carried on CloudSat. The passive instruments are radiometers on various weather satellites, measuring radiation from the Earth at various wavelengths, mainly in the infrared and microwave parts of the electromagnetic spectrum.

2.1 CloudSat

Figure 2.1: Artist's impression of CloudSat in orbit. Image by NASA/JPL.

CloudSat (figure (2.1)) is a polar-orbiting Earth observation satellite, launched by NASA on 28 April 2006 (Durden and Boain, 2004). It is part of the A-Train constellation, consisting of a variety of specialised remote sensing satellites. The lead A-Train satellite, Aqua, is aligned with the Worldwide Reference System-2 (WRS-2) (Stephens et al., 2002), a system originally developed for Landsat (WRS). CloudSat is 215 km to the east of WRS-2. The ground track repeats every 233 orbital revolutions or every 16 days. After this period, adjacent orbits at the equator are 172 km apart. However, the distance between two consecutive ascending nodes is 2752 km, again measured at the equator. The orbit is sun-synchronous and has a Local Time Ascending Node (LTAN) of 13:45. The CloudSat orbit is almost circular, with a semi-major axis of 7083.4456 km and an inclination of 98.2464 degrees. See figure (2.2) for a map showing WRS-2 (figure by the USGS).

Figure 2.2: Idealised groundtrack of satellites following the World Reference System. The figure shows only the descending nodes. CloudSat follows this system approximately. Figure by the USGS.
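The two spacings quoted for the CloudSat ground track follow directly from the repeat cycle of 233 revolutions in 16 days. The sketch below reproduces that arithmetic; the equatorial circumference and the function names are assumptions introduced here for illustration and do not come from the thesis software.

```python
# Assumed value (not given in the text): equatorial circumference of the Earth in km.
EQUATOR_KM = 40075.017


def adjacent_track_spacing_km(orbits_per_cycle):
    """Spacing at the equator between neighbouring ground tracks once the
    full repeat cycle has been flown."""
    return EQUATOR_KM / orbits_per_cycle


def consecutive_node_spacing_km(orbits_per_cycle, days_per_cycle):
    """Distance at the equator between two consecutive ascending nodes.

    For a sun-synchronous orbit the Earth turns once per day underneath the
    orbital plane, so each revolution shifts the ground track by
    days/orbits of the circumference.
    """
    return EQUATOR_KM * days_per_cycle / orbits_per_cycle


if __name__ == "__main__":
    print(adjacent_track_spacing_km(233))        # compare with the 172 km quoted above
    print(consecutive_node_spacing_km(233, 16))  # compare with the 2752 km quoted above
```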

2.1.1 Cloud Profiling Radar

CloudSat's main instrument is the Cloud Profiling Radar (CPR), a nadir-looking coherent radar instrument operating at a frequency of 94 GHz in the Extremely High Frequency (EHF) band. The antenna has a diameter of 1.85 m. A profile footprint is approximately 1.7 km along-track by 1.3 km across-track and profiles are generated every 1.1 km along-track. One profile consists of 125 bins of 240 m height, so the total profile is around 30 km high (Stephens et al., 2002; Cloudsat).

A radar works by transmitting pulses and measuring the signal coming back to the detector, called the "backscatter ratio". By measuring the time between the transmitted and reflected signal, one can calculate the distance to the feature that causes the backscattering. By measuring the frequency shift between the signals, one can determine the relative velocity of the target. The target can be a solid object (ice), a liquid object (water), or any gradient in the refractive index of a transmitting medium (due to differences in temperature or humidity). For different frequencies, different features are prominent in the backscattering signal (Röttger, 1989).

The physical quantity directly measured by the CPR is the backscattered power as a function of distance from the radar (Li et al., 2007). After calibration and geolocation, this is collected in a product called "1B-CPR" (Cloud Profiling Radar, level 1B). The 1B-CPR data are combined with forecast data from ECMWF (Partain, 2007a) and measurement data from MODIS (Partain, 2007b) to determine a "cloud mask" in a product called "2B-GEOPROF" (Mace, 2007). In turn, from the 2B-GEOPROF and the 1B-CPR, the product 2B-CWC-RO (level 2B, Cloud Water Content, Radar Only) is calculated. This contains the Cloud Ice Water Content (among many other fields) (Austin, 2004). One of the other fields is the "Ice Water Path", which is the vertically integrated ice water content, and thus the total amount of ice per unit area in a vertical column of the atmosphere between 0 and 30 km. The algorithm is described in the aforementioned sources and in (Austin, 2007).
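To make the definition of Ice Water Path concrete, the following sketch integrates an ice water content profile over the 125 range bins of 240 m described above. It is a plain rectangle-rule sum under assumed units (g/m³ in, g/m² out) with hypothetical function names; it is not the 2B-CWC-RO algorithm, which is described in the cited sources.

```python
def profile_height_km(n_bins=125, bin_height_m=240.0):
    """Total height covered by one CPR profile."""
    return n_bins * bin_height_m / 1000.0


def ice_water_path(iwc_profile, bin_height_m=240.0):
    """Vertically integrate ice water content (assumed g/m^3) into an
    ice water path (g/m^2) with a simple rectangle rule.

    Bins without a valid value are passed as None and are skipped.
    """
    return sum(iwc * bin_height_m for iwc in iwc_profile if iwc is not None)


if __name__ == "__main__":
    # Hypothetical profile: clear air except for five bins containing ice.
    profile = [0.0] * 120 + [0.01] * 5
    print(profile_height_km())
    print(ice_water_path(profile))
```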
Different algorithms exist, and two of those are contained in the standard published IWP fields: ROIWP (Radar-Only Ice Water Path) and IOROIWP (Ice-Only Radar-Only Ice Water Path). The latter assumes the atmospheric column of water consists only of ice, whereas the former also allows for mixed clouds. In this study, ROIWP was used.

2.2 NOAA-15 – NOAA-18, MetOp-A

The passive instruments considered are carried on five different satellites: NOAA-15 to NOAA-18 and MetOp-A. NOAA-19 was launched 6 February 2009 and is not considered in this study (at the time of writing, its status is still "on-orbit verification"). All those satellites are in sun-synchronous orbits and thus pass the equator at the same local solar time every orbit. The orbital parameters and equator-crossing times are summarised in table (2.1).

Table 2.1: Orbital parameters for the various satellites. LTAN means Local Time Ascending Node and is a special parameter that is only applicable to sun-synchronous satellites. Letters in parentheses refer to the notes below the table.

| Satellite | Semi-major axis (km) | Eccentricity | Inclination (°) | Period (min) | LTAN |
|---|---|---|---|---|---|
| CloudSat | 7083.4456 (a) | 0.0001283 (b) | 98.2580 (b) | 98.88 (c) | 13:30–13:45 (c) |
| NOAA-15 (K) | 7178 (d) | 0.0011463 (e) | 98.5 (f) | 101.1 (f) | 16:55 (f) |
| NOAA-16 (L) | 7220 (d) | 0.0009746 (e) | 99.0 (f) | 102.1 (f) | 17:12 (f) |
| NOAA-17 (M) | 7181 (d) | 0.0012787 (e) | 98.7 (f) | 101.2 (f) | 21:43 (f) |
| NOAA-18 (N) | 7225 (d) | 0.0014257 (e) | 98.74 (f) | 102.12 (f) | 13:39 (f) |
| NOAA-19 (N') | 7241 (g) | 0.0013947 (h) | 98.7 (i) | 102.14 (i) | 14:00 (i) |
| MetOp-A | 7188 (d) | 0.0000983 (e) | 98.7 (f) | 101.36 (f) | 9:30 (f) |

a (Durden and Boain, 2004)
b CloudSat Two-Line-Elements (TLE) from the CloudSat Data Processing Center, 2009-02-18 16:00:00 (see appendix (C) for links to relevant websites).
c Data for Aqua, which is closely followed by CloudSat (Durden and Boain, 2004).
d Altitude as of 2009-02-05 00:00:00 from the POES Status website, plus mean radius of the Earth (6371.009 km), rounded to the nearest whole kilometre, because with an eccentricity on the order of 0.001, apogee and perigee differ on the order of 15 km, which is of a similar magnitude as the difference between the equatorial and polar Earth radius (22 km).
e NORAD TLE, 2009-02-19 (time not given). NOAA and MetOp TLE are acquired by the US Space Surveillance Network (SSN) from optical and radar observations and not provided directly by the responsible agencies, and are thus of limited accuracy.
f As of 2009-02-05 00:00:00 from the POES Spacecraft Status website. Inclinations reported in the TLE are slightly different from those reported by the POES Spacecraft Status website; the latter are reported here for the reasons described in note (e).
g Like (d), but as of 2009-02-25 00:00:00. Not yet operational, not used for collocations.
h Like (e), but as of 2009-03-01. Not yet operational, not used for collocations.
i Like (f), but as of 2009-02-25 00:00:00. Not yet operational, not used for collocations.
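The relation between the semi-major axis and the period columns of table (2.1) can be checked with Kepler's third law. The sketch below assumes a standard value for the gravitational parameter of the Earth (not quoted in the text) and uses hypothetical function names; the altitude passed in the usage example is illustrative only. Because the NOAA values in the table come from different sources (notes d and f), small differences between computed and tabulated periods are to be expected.

```python
import math

# Assumed constant (not from the text): gravitational parameter of the Earth.
MU_EARTH_KM3_S2 = 398600.4418
# Mean Earth radius as quoted in note (d) of table 2.1.
EARTH_MEAN_RADIUS_KM = 6371.009


def semi_major_axis_from_altitude_km(altitude_km):
    """Altitude plus mean Earth radius, rounded to whole kilometres (note d)."""
    return round(altitude_km + EARTH_MEAN_RADIUS_KM)


def orbital_period_min(a_km):
    """Kepler's third law: T = 2*pi*sqrt(a^3 / mu), returned in minutes."""
    return 2.0 * math.pi * math.sqrt(a_km ** 3 / MU_EARTH_KM3_S2) / 60.0


if __name__ == "__main__":
    # CloudSat semi-major axis from table 2.1; compare with the tabulated 98.88 min.
    print(orbital_period_min(7083.4456))
    # Hypothetical altitude, for illustration only.
    print(semi_major_axis_from_altitude_km(854.0))
```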

2.2.1 Radiometers

A radiometer is an instrument designed to measure radiation intensity. Radiation from a certain target can originate from emission or from scattering; it can also originate from lower-lying regions and be transmitted by the target.

Each object (including air) emits thermal radiation according to its temperature. An object that emits the maximum amount of radiation for its temperature is called a black body. The spectrum of the emitted radiation is described by Planck's function (Rees, 2001, equation 2.31):

$$I(\lambda, T) = \frac{2hc^2}{\lambda^5} \, \frac{1}{e^{hc/(\lambda k T)} - 1} \qquad (2.1)$$

where h is Planck's constant, c the speed of light and k Boltzmann's constant.

A black body is an idealised case that does not exist in reality. In reality, objects have an emissivity ε (0 < ε < 1), defined as the fraction of black-body radiation actually emitted at a certain temperature. However, the concept of a black body is used to define a measurement unit that is used extensively in atmospheric remote sensing. The brightness temperature of a source is the temperature (expressed in kelvin) of a black body emitting the same amount of radiation at a specific frequency. It depends on radiation intensity and frequency (or wavelength).

The Earth and clouds scatter radiation from sources such as the sun. In the far infrared and in the microwave region, this is negligible relative to the radiation emitted by the Earth and its atmosphere. The details are governed by the absorption and scattering coefficients and can be described by the radiative transfer equation (Rees, 2001, section 3.4.1). The radiative transfer equation can be solved numerically by a system such as ARTS (Buehler et al., 2005), developed at the SAT-group in collaboration with Chalmers University. The details are beyond the scope of this thesis.

The radiation measured by instruments such as HIRS and AMSU is emitted by the layers of air below the cloud (it is not emitted by the cloud itself). The region where the radiation originates depends on the atmospheric temperature and on the wavelength. On its way up from the lower atmosphere to the instrument, the radiation might pass through clouds and get attenuated (scattered or absorbed). The presence of more clouds means more attenuation, so less radiation reaches the instrument. If less radiation reaches the instrument, the brightness temperature is lower; the measurement is "cold".

At some frequencies, the atmosphere is transparent and radiation emitted by the surface (or by higher layers of air) gets transmitted to the satellite unhindered. Channels measuring at those frequencies are called window channels. Other channels measure at frequencies corresponding to absorption lines of atmospheric species such as ozone, carbon dioxide or water vapour. Those channels can be used to measure the presence of those species. MHS, described in section (2.2.1.2), has channels centred around a water vapour absorption line. The brightness temperature for those channels depends mostly on relative humidity, as shown by Buehler and John (2005).
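The definition of brightness temperature amounts to inverting Planck's function (2.1) for T. A minimal sketch, assuming SI units throughout and using hypothetical function names, is given below; it illustrates the definition only and is not how the ATOVS data are produced.

```python
import math

H = 6.62607015e-34  # Planck constant, J s
C = 2.99792458e8    # speed of light, m/s
K = 1.380649e-23    # Boltzmann constant, J/K


def planck_radiance(wavelength_m, temperature_k):
    """Black-body spectral radiance per unit wavelength (equation 2.1),
    in W m^-2 sr^-1 m^-1."""
    x = H * C / (wavelength_m * K * temperature_k)
    return 2.0 * H * C ** 2 / wavelength_m ** 5 / math.expm1(x)


def brightness_temperature(wavelength_m, radiance):
    """Temperature of the black body that emits `radiance` at this
    wavelength, obtained by solving equation 2.1 for T."""
    prefactor = 2.0 * H * C ** 2 / wavelength_m ** 5
    return H * C / (wavelength_m * K * math.log1p(prefactor / radiance))


if __name__ == "__main__":
    wavelength = 11.11e-6  # HIRS channel 8 centre wavelength, from table 2.2
    radiance = planck_radiance(wavelength, 250.0)
    print(brightness_temperature(wavelength, radiance))
    # Less radiation reaching the sensor gives a lower ("colder") value.
    print(brightness_temperature(wavelength, 0.8 * radiance))
```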

2.2.1.1 HIRS

The sources for this paragraph are (NOAA; ESA). The High Resolution Infrared Radiation Sounder (HIRS) is a radiometer measuring radiation in 20 spectral bands. One channel (channel 20) is in the visible range (λ = 690 nm); the other channels measure in the infrared between 3.76 µm and 14.95 µm. It scans at 56 angles from −49.5° via nadir to 49.5°. HIRS/3 has a Field-Of-View (FOV) of 20.3 km by 18.9 km at nadir. HIRS/4 has a FOV of 10.0 km at nadir. HIRS/3 is present on NOAA-15 to NOAA-17. HIRS/4 is present on NOAA-18, NOAA-19 and MetOp-A. A photograph of the HIRS instrument can be seen in figure (2.3).

Figure 2.3: Photograph of the HIRS instrument. Image by ESA.

Figure (2.4) shows a plot of the locations of the various HIRS channels. Note that this applies to HIRS/3, whereas the HIRS on NOAA-18 is HIRS/4. However, the differences between HIRS/3 and HIRS/4 are small.

Figure 2.4: Location of the HIRS/3 channels as a function of wavelength. The wavenumber is defined as 2π/λ (Rees, 2001, page 10). This plot was made by Arash Houshangpour. See also table (2.2).
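Because channel positions are quoted in wavelength, wavenumber or frequency depending on the source, a small conversion helper can be useful. The sketch below uses hypothetical function names and gives both the 2π/λ wavenumber used by Rees and the reciprocal-wavelength form (in cm⁻¹) that is also common; the wavelengths in the usage example are the HIRS channel 8 and 11 values from table (2.2).

```python
import math

C = 2.99792458e8  # speed of light in vacuum, m/s


def frequency_thz(wavelength_um):
    """Frequency in THz for a wavelength given in micrometres."""
    return C / (wavelength_um * 1e-6) / 1e12


def angular_wavenumber_per_m(wavelength_um):
    """Wavenumber as 2*pi/lambda, in rad/m (the convention of Rees)."""
    return 2.0 * math.pi / (wavelength_um * 1e-6)


def reciprocal_wavenumber_per_cm(wavelength_um):
    """Wavenumber as 1/lambda, in cm^-1."""
    return 1.0e4 / wavelength_um


if __name__ == "__main__":
    for wavelength in (11.11, 7.33):  # HIRS channels 8 and 11
        print(wavelength,
              frequency_thz(wavelength),
              angular_wavenumber_per_m(wavelength),
              reciprocal_wavenumber_per_cm(wavelength))
```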

2.2.1.2 AMSU

The sources for this paragraph are (NOAA; ESA). The Advanced Microwave Sounding Unit (AMSU) is a radiometer measuring radiation in different polarisations with frequencies in twenty channels ranging from 23.8 GHz to 190.31 GHz, a segment of the electromagnetic spectrum known as the microwave range. It consists of two sub-instruments, AMSU-A and AMSU-B. AMSU-B is replaced by the Microwave Humidity Sounder (MHS) on recent satellites.

Figure 2.5: Photograph of the subsystem AMSU-A1. Image by ESA.

Figure 2.6: Photograph of the subsystem AMSU-A2. Image by ESA.

• AMSU-A (figures (2.5) and (2.6)) measures radiation in 15 bands from 23.8 GHz to 89.0 GHz¹. It scans at 30 angles between −48.33° and 48.33°. With a satellite altitude of 833 km, the resolution at nadir is 50 km and the swath width 2343 km. It is carried on NOAA-15 through NOAA-18 and on MetOp-A and will also fly on NOAA-19, MetOp-B and MetOp-C.

• AMSU-B measures radiation in five channels. Three of those (channels 18–20) measure around 183.31 GHz at distances of 1, 3 and 7 GHz respectively, covering a major absorption line of water vapour (Rees, 2001, Table 4.2, Figure 4.5). It scans at 90 angles between −48.50° and 48.50°. With an altitude of 850 km, it has a nadir resolution of 16.3 km. AMSU-B scans three times as fast as AMSU-A, and the footprint size is one-third of AMSU-A's in each dimension. Nine AMSU-B measurements fit in one AMSU-A measurement (see also figure (3.3)). AMSU-B is present on NOAA-15 through NOAA-17.

• The Microwave Humidity Sounder (MHS) (figure (2.7)) has superseded AMSU-B. It scans at nearly identical frequencies in 90 angles from −49.44° to 49.44°. Channels 1–5 on MHS correspond to channels 16–20 on AMSU-B. One difference between MHS and AMSU-B is that channel 20 on AMSU-B measures at 183.31 ± 7 GHz, whereas channel 5 on MHS measures only on the positive side, thus at 190.31 GHz. It is carried on NOAA-18, NOAA-19 and MetOp-A, and will be carried on MetOp-B and MetOp-C. A plot of the location of the MHS channels is shown in figure (2.8).

¹ Actually, AMSU-A consists of AMSU-A1 and AMSU-A2. However, this is an implementation detail that is not relevant for this study and will subsequently be ignored.
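The nadir resolutions quoted in the list above are consistent with simple flat-Earth geometry: footprint diameter ≈ 2·h·tan(θ/2) for altitude h and beamwidth θ. The sketch below assumes beamwidths of 1.1° (AMSU-B) and 3.3° (AMSU-A); these values are not given in the text and are stated here as assumptions, as is the function name.

```python
import math


def nadir_footprint_km(altitude_km, beamwidth_deg):
    """Approximate footprint diameter at nadir, neglecting Earth curvature."""
    return 2.0 * altitude_km * math.tan(math.radians(beamwidth_deg) / 2.0)


if __name__ == "__main__":
    # Assumed beamwidths (not from the text): 1.1 deg for AMSU-B, 3.3 deg for AMSU-A.
    print(nadir_footprint_km(850.0, 1.1))  # compare with the 16.3 km quoted for AMSU-B
    print(nadir_footprint_km(833.0, 3.3))  # compare with the roughly 50 km quoted for AMSU-A
```

With these assumed beamwidths the AMSU-A footprint comes out three times as wide as the AMSU-B footprint to within a few percent, in line with the statement that nine AMSU-B measurements fit in one AMSU-A measurement.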

Figure 2.7: Schematic of MHS. Figure by ESA.

Those sensors are carried on six different satellites: five NOAA satellites (NOAA-15 to NOAA-19) and MetOp-A, the first satellite in the European MetOp series. Figure (2.8) shows the location of the MHS channels 3, 4 and 5 along with the absorption spectrum of water. MHS 3–5 are similar to AMSU-B 18–20 (see table (2.2)). Figure (2.9) shows from which altitudes the radiation measured in the different channels of interest originates for a clear-sky, standard atmosphere at mid-latitude summer. It can be seen in figure (2.8) that MHS-3 (AMSU-B 18) has its band closest to the water vapour absorption line; the radiation is more quickly attenuated by water vapour, so the radiation originates from a higher altitude (figure (2.9); the radiation from lower altitudes is invisible). Since the temperature is lower at higher altitudes, this is the "coldest" channel.

Figure 2.8: Location of MHS channels. Figure by ESA. See also table (2.2).

The temperature for those microwave channels depends primarily on the relative humidity, which in turn depends on the temperature and the partial pressure. The change in temperature is compensated by the change in relative humidity, so that the brightness temperature seems to be independent of the physical temperature (Buehler and John, 2005).

[Figure 2.9: two panels showing altitude (km) against Jacobian (K/1) for channels 18, 19 and 20.]

Figure 2.9: Jacobians for the AMSU-B radiometer. The Jacobian, also known as the weighting function, describes from which altitude the radiation reaching the sensor originates. The figure shows the Jacobian for the tropics, clear-sky, climatological mean state. Channels 18–20 on AMSU-B correspond to channels 3–5 on MHS. Figure by Buehler and John (2005).

2.2.1.3 Channel summary AMSU, MHS and HIRS contain altogether around 40 channels, only a few of which are relevant and used in this study. In the literature, various different units are used to describe the position of a band in the electromagnetic spectrum: wavelength, wavenumber (the reciprocal of wavelength), spatial frequency and temporal frequency. In table (2.2) below is a summary of the channels used.

Channel              Centre frequency     Centre wavelength    Bandwidth
HIRS 8               26.98 THz            11.11 µm             0.286 mm
HIRS 11              40.90 THz            7.33 µm              0.25 mm
MHS 3 / AMSU-B 18    183.31 ± 1.00 GHz    1.631 ± 0.009 mm     500 MHz
MHS 4 / AMSU-B 19    183.31 ± 3.00 GHz    1.631 ± 0.027 mm     1000 MHz
MHS 5                190.31 GHz           1.575 mm             2200 MHz
AMSU-B 20            183.31 ± 7.00 GHz    1.631 ± 0.060 mm     2000 MHz

Table 2.2: Channels on AMSU-B, MHS and HIRS/4 used in this study. The MHS and AMSU-B channels are all centred around a water absorption band (see figure 2.8) (NOAA).

Chapter 3

Finding collocations

3.1 Introduction

A collocation between two satellite sensors is an occasion where sensors on different satellites observe the same place at the same time. The spatial extent of a measurement is on the order of 1–50 km, and the duration is on the order of a fraction of a second. This means that measurements might truly cover the same area in space, but probably not in time. For that reason, a time lag between observations is allowed. A collocation is defined to occur between two measurements of sensors on different satellites if an intersection exists between the footprints and the time interval is smaller than a specified number tmax.

3.2 Method

Special software was developed by the student to find the collocations between CloudSat and the various sensors on the NOAA and MetOp satellites. The programming languages used are Python and MATLAB. The input to the software is a large amount of data distributed over many files. The output is a description of the collocations. The following sections describe the intermediate steps.

3.2.1 Input data

One data file contains one orbit of data. The CloudSat CPR data are stored as compressed Hierarchical Data Format (HDF) files. The CloudSat product used is “2B-CWC-RO” (Austin, 2004). This contains Ice Water Path in grams per square meter. It also contains fields for latitude and longitude (both in degrees) and time. The time is contained in the fields UTC_start (seconds since the start of the day for the first profile), Profile_time (seconds since the start of the first profile) and TAI_start (seconds since 00:00:00 Jan 1 1993 for the first profile).

The NOAA radiometer data are stored as ATOVS data¹. The granule² filename contains information on the date and the time of the start of the orbit (Cloudsat). Each granule contains the Planck brightness temperatures for each channel for all viewing angles. For each scan, it also contains the time in milliseconds since the start of the day (NOAA).

The AMSU-B or MHS scans at 90 angles (see section (2.2.1.2)) and the N scan lines of one granule are stored in a 90×N matrix. The 90 columns are linearly spaced in the angle interval [−49 4/9, 49 4/9] degrees (MHS) or [−48.5, 48.5] degrees (AMSU-B). There is no nadir angle, but columns 45 and 46 correspond to scans at approximately −0.5° and 0.5° respectively. Since nine AMSU-B or MHS pixels fit in one AMSU-A pixel, the AMSU-A matrix has a size of 30×(N/3); in both dimensions, the AMSU-B matrix is three times as long as the AMSU-A matrix. HIRS pixels are not on the same grid and are in a 56×M matrix. See also figure (3.3).

On the system used, granule files were stored in directories per sensor and day.
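The relation between matrix column and viewing angle can be illustrated with a short Python sketch. It is not part of the thesis software; the function name `scan_angles` is hypothetical, and the only assumption is the linear spacing between the two outermost angles stated above.

```python
def scan_angles(half_width_deg, n_columns=90):
    """Linearly spaced viewing angles from -half_width_deg to +half_width_deg.

    Assumption: the columns are evenly spaced between the two outermost
    angles, as described in the text.
    """
    step = 2 * half_width_deg / (n_columns - 1)
    return [-half_width_deg + k * step for k in range(n_columns)]


# MHS: outermost angles of 49 4/9 degrees
mhs_angles = scan_angles(49 + 4 / 9)

# Columns 45 and 46 (1-based) are the two near-nadir columns
print(mhs_angles[44], mhs_angles[45])
```

Under this assumption the angular step for MHS is 10/9 degrees, and the two central columns come out at roughly −0.56° and +0.56°, in line with the statement that there is no exact nadir view.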

3.2.2 Preprocessing

3.2.2.1 Finding matching granules

Different satellites have different equatorial crossing times. The time covered by a CloudSat file overlaps with the times of two files for each of the radiometers (see figure (3.1)). Since there are five satellites each carrying three radiometers, this would require comparing a CloudSat orbit against 5 × 3 × 2 = 30 radiometer orbits. However, it is only necessary to compare against one radiometer per satellite. If a collocation is found for one sensor, the other sensors will have a collocation within the same scan line, or no collocation at all.

A program was written to go through all CloudSat granule files and determine which radiometer orbits overlap with each of them. This task was implemented in Python in a script called relate_hdf.py. The output file does not contain paths to the granule files, but the start times of the relevant orbits. Swath files for sensors on the same satellite have the same start time, so one timestamp is enough to point to all three granule files. Additionally, writing timestamps rather than paths makes the output easier to use on other systems and easier to read. A full description of the file format can be found in appendix (B).

¹ ATOVS stands for Advanced TIROS Operational Vertical Sounder, where TIROS stands for Television InfraRed Observation Satellite.
² One granule is a file containing one orbit of data.

[Figure 3.1: two rows of blocks labelled with time ranges. Upper row: 5-7, 7-9, 9-11, 11-13, 13-15. Lower row: 4-6, 6-8, 8-10, 10-12, 12-14.]

Figure 3.1: Illustration of time coverage of granule files for different satellites. Since equatorial crossing times are different, granule files start at different times. Suppose the turquoise blocks represent CloudSat granules and the red blocks NOAA-18 AMSU-B granules; the numbers in the blocks represent times. The CloudSat orbit lasting from 7–9 should be compared with the NOAA-18 orbits from 6–8 and 8–10; the CloudSat orbit lasting from 9–11 should be compared to the NOAA-18 orbits from 8–10 and 10–12.
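The matching of granules by time coverage can be sketched as a simple interval-overlap test. The following Python fragment is only an illustration using the numbers from figure (3.1); it is not the code of relate_hdf.py, and the function name `overlapping_granules` is hypothetical.

```python
def overlapping_granules(target, candidates):
    """Return the candidate (start, end) intervals that overlap the target.

    Two intervals overlap if each one starts before the other one ends.
    """
    t_start, t_end = target
    return [(s, e) for (s, e) in candidates if s < t_end and e > t_start]


# Time coverage of the NOAA-18 granules in figure (3.1)
noaa18 = [(4, 6), (6, 8), (8, 10), (10, 12), (12, 14)]

print(overlapping_granules((7, 9), noaa18))   # [(6, 8), (8, 10)]
print(overlapping_granules((9, 11), noaa18))  # [(8, 10), (10, 12)]
```

Each CloudSat granule is matched with two radiometer granules, as described in the caption of figure (3.1).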

3.2.2.2 Converting units

CloudSat and NOAA/MetOp granules use different time units. To be able to compare them, one format needs to be converted to the other.

1. CloudSat contains the time for the start of the first profile and all other times relative to that start. The NOAA/MetOp files contain the time for each measurement since the start of the day (UTC). To convert this to a common unit, the relative times are added to the CloudSat start time.

2. Since the NOAA/MetOp time is expressed in milliseconds and the CloudSat time in seconds, the NOAA/MetOp times are divided by 1000.

3. If the orbit crosses a day boundary (in Universal Time), the timestamps for the NOAA/MetOp measurements will “reset” to 0 (as each measurement time is expressed in seconds since the start of the day), but the CloudSat times will not, because they were generated as described in point (1). This is compensated by adding the number of seconds in a day, 86400, to all NOAA/MetOp timestamps smaller than the NOAA/MetOp timestamp for the first measurement in the granule.
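The three conversion steps can be sketched in Python as follows. This is a minimal illustration under the assumptions stated in the comments, not the thesis code; the function names are hypothetical, and the inputs are plain lists rather than the actual HDF or ATOVS fields.

```python
SECONDS_PER_DAY = 86400


def cloudsat_times(utc_start, profile_time):
    """Step 1: absolute CloudSat times in seconds since the start of the day.

    utc_start    -- time of the first profile (s since start of day)
    profile_time -- times relative to the first profile (s)
    """
    return [utc_start + dt for dt in profile_time]


def radiometer_times(scan_time_ms):
    """Steps 2 and 3: convert ms to s and undo the reset at the day boundary.

    Assumption: a granule covers about one orbit, so any timestamp smaller
    than the first one belongs to the next day.
    """
    seconds = [t / 1000 for t in scan_time_ms]
    first = seconds[0]
    return [t + SECONDS_PER_DAY if t < first else t for t in seconds]


# A granule that starts two seconds before midnight (UTC)
print(radiometer_times([86398000, 86399000, 0, 1000]))
# [86398.0, 86399.0, 86400.0, 86401.0]
```

After this conversion both time axes increase monotonically within a granule, so time differences can be taken directly.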

3.2.2.3 Checking for data validity

Some files contain invalid data. If any longitude or latitude values are outside the valid range, or if all data are zero, the file is rejected, logged and skipped.

3.2.2.4 Checking temporal overlap

If two orbits are known to have a certain temporal overlap, the next step is to find out if their orbits contain collocations. The first step is to select the overlapping time period. Suppose the CloudSat orbit lasts from 7:26 to 9:05 and a NOAA-16 orbit lasts from 6:50 to 8:14. If the temporal condition for the collocation is such that measurements should happen within ten minutes of each other, only the time interval 7:16 – 8:24 needs to be checked; many points can be discarded immediately.
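The selection of the time window can be written down in a few lines. The sketch below reproduces the example above with times expressed in minutes since midnight; the helper names `hm` and `overlap_window` are hypothetical and not taken from the thesis software.

```python
def hm(hours, minutes):
    """Time of day in minutes since midnight."""
    return 60 * hours + minutes


def overlap_window(orbit_a, orbit_b, t_max):
    """Time window in which collocations between two orbits are possible.

    orbit_a, orbit_b -- (start, end) times of the two orbits
    t_max            -- maximum allowed time difference, same unit
    Returns None if the orbits are further apart in time than t_max.
    """
    latest_start = max(orbit_a[0], orbit_b[0])
    earliest_end = min(orbit_a[1], orbit_b[1])
    if latest_start - earliest_end > t_max:
        return None
    return latest_start - t_max, earliest_end + t_max


cloudsat = (hm(7, 26), hm(9, 5))
noaa16 = (hm(6, 50), hm(8, 14))
print(overlap_window(cloudsat, noaa16, 10) == (hm(7, 16), hm(8, 24)))  # True
```

Only measurements inside the returned window need to be passed on to the spatial search.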

3.2.3 Collocation algorithm

When the preprocessing has been done, the actual collocations in this time period are located. Several approaches to this problem are possible. One could determine the distance between the sensor ground tracks as a function of time. A problem with this approach is that it may be difficult to find overlaps with a certain time lag. If one satellite is trailing another by 100 km, no overlaps would be found without considering a time lag, even if the orbits would be equal if a time lag of several minutes is introduced.

Instead, a two-step approach is chosen. The two conditions (spatial and temporal) are split. First it is tested for which points the spatial condition is met. Here, each column (each viewing angle) of radiometer data is treated as a separate ground track. If two ground tracks are plotted (see figure (3.2)), a human observer can see immediately whether there is any spatial overlap or not. However, this does not give information about the time between the overpasses.

A computer cannot judge the overlap visually, so the following algorithm is used to identify points where the spatial overlap condition is met. The algorithm is around one order of magnitude faster than a brute force method (brute force would be to compare each point in track A with each point in track B).

1. The maximum speed of the ground tracks in km/point is determined by calculating the derivative of the ground track.

2. Start with n = 1 and find the points in B close to A_n by the following method:

(a) Choose N samples spread over B dividing B in N+1 intervals. Rudimentary profiling with different values showed that N = 200 works well.

(b) Find which sample is closest to A_n.

[Figure 3.2: world map titled "Ground-tracks CloudSat (7:26 - 9:05), NOAA-16 (6:50 - 8:14), 2006-08-10 7:26". Latitude from 90° S to 90° N, longitude from 180° W to 180° E. Legend: CloudSat, NOAA-16.]

Figure 3.2: Comparison of two ground tracks for 10 August 2006. The red line shows the position of one of the near-nadir measurements (column 45 in the data, angle −0.55°) for AMSU-B on NOAA-16 between 6:50 and 8:14 UT. The green line shows the CloudSat sub-satellite-point (SSP) between 7:26 and 9:05 UT.

(c) Take the neighbouring intervals of this sample (two unless the sample is at either end of the ground track). If the spatial condition is met for the edges of any interval (e.g. neighbouring sample), include the next interval as well, until either the spatial condition is no longer met or the start or end of the ground track is reached.

(d) Calculate the distance for all points in this interval.

(e) Note all points for which the spatial condition is met. If there are no such points, remember the distance of the closest point.

3. If there were any points for which the spatial condition was met, increase n by 1 and repeat.

4. If there were no points for which the spatial condition was met, calculate the least number of points remaining before it could be met: increase n by

\[ \frac{\text{smallest distance} - \text{spatial condition distance}}{\text{max speed}} \]

For example, if the shortest distance is 120 km, the spatial condition distance 20 km, and the max speed 10 km/point, n will be increased by (120 − 20)/10 = 10.

This algorithm works because the points on a ground track lie on a continuous line. On the other hand, the distance from a ground track to a point has local minima that are not global minima, so a faster search algorithm that assumes a single minimum will not work (it might find the wrong minimum). It is probably not the fastest algorithm, but it is considerably faster than brute force and does not miss any collocations.

Finally, all points for which the temporal criterion is met are selected.
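The skip-ahead idea of step 4 can be illustrated with the following Python sketch. It is a simplified illustration and not the thesis implementation: the coarse sampling of track B in step 2 is replaced here by a full scan of B for each tested point, distances are computed with the haversine formula on a spherical Earth (an assumption), and all function names are hypothetical. Indices are 0-based, whereas the description above counts from n = 1.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth


def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def spatial_matches(track_a, track_b, max_dist_km):
    """Index pairs (i, j) for which A_i and B_j are within max_dist_km."""
    if not track_b:
        return []
    # Step 1: largest step between consecutive points of A, in km/point
    max_speed = max(
        (great_circle_km(p, q) for p, q in zip(track_a, track_a[1:])),
        default=0.0,
    )
    matches = []
    n = 0
    while n < len(track_a):
        # Simplification: scan all of B instead of sampling it (step 2)
        dists = [great_circle_km(track_a[n], b) for b in track_b]
        hits = [j for j, d in enumerate(dists) if d <= max_dist_km]
        if hits:
            # Step 3: condition met, continue with the next point
            matches.extend((n, j) for j in hits)
            n += 1
        elif max_speed > 0:
            # Step 4: skip the points that cannot meet the condition yet;
            # rounding down keeps the skip conservative
            n += max(1, int((min(dists) - max_dist_km) / max_speed))
        else:
            n += 1
    return matches


def collocations(matches, times_a, times_b, max_dt_s):
    """Final step: keep the spatial matches that also meet the time criterion."""
    return [(i, j) for i, j in matches
            if abs(times_a[i] - times_b[j]) <= max_dt_s]
```

The skip in step 4 relies on the triangle inequality: if the closest point of B is d km away from A_n, a point k steps further along A cannot be closer to B than d minus k times the maximum step size.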

3.2.3.1 Implementation

[Figure 3.3: plot titled "Approximate footprints for different sensors". Horizontal axis: UTM x-pos (km), 390 to 480; vertical axis: UTM y-pos (km), 4300 to 4480. Legend: CloudSat, MHS, HIRS, AMSU-A.]

Figure 3.3: Approximate footprint of the MHS, HIRS and CPR sensors. As described in sections (2.1.1), (2.2.1.1) and (2.2.1.2), the footprint for the CPR (1.7x1.5 km) is an order of magnitude smaller than the one for the AMSU-B/MHS (16.7 km diameter) or the HIRS (20.3 km diameter). This figure is approximate. The data is for the time interval 2007-03-12 2:07:55.7 – 2:08:10.8. In reality, the size of the pixels depends on viewing angle and footprints are not circular.

The CloudSat footprint is around 1.3 km by 1.7 km (see section (2.1.1)) and the AMSU-B or MHS footprint is 16.3 km at nadir, but larger at off-nadir angles (see section (2.2.1.2)). For collocating precisely, one would need to consider not only the varying size of the AMSU-B or MHS footprints with the viewing angle, but also the shape of both the CloudSat and the AMSU-B or MHS footprints. Instead, the size of the footprints was overestimated and, for finding the collocations, a maximum distance of 15 km between the pixel centerpoints was taken. This was done because it is easy to select a subset of this (for example, all points with centerpoints less than 7 km apart), but finding points with larger distances would require rerunning the collocation algorithm.

Figure (3.3) shows an approximation of the footprint sizes of the MHS, HIRS and CPR sensors. Even though the HIRS and the MHS are on the same satellite, their footprints do not match exactly, so a CPR pixel inside a MHS pixel is not always inside a HIRS pixel. A maximum time interval of 900 s was chosen.

3.2.3.1.1 Finding the HIRS and AMSU-A pixels corresponding to an AMSU-B or MHS pixel

The algorithm described above is carried out between a CloudSat granule and an AMSU-B or MHS granule. However, the AMSU-A and HIRS sensors are present on the same satellite, so if a collocation with AMSU-B or MHS is found, there is also an AMSU-A or HIRS pixel “nearby”.

As described in section (2.2.1.2) and visible in figure (3.3), nine MHS/AMSU-B pixels fit in one AMSU-A pixel. If the AMSU-B/MHS pixel at position (i, j) in the matrix in the granule data file collocates with CloudSat, the corresponding AMSU-A pixel can be found by (round(i/3), round(j/3)) bounded by the size of the AMSU-A matrix. This means that the maximum distance from CloudSat to AMSU-A is three times as large as the maximum distance from CloudSat to AMSU-B.

Finding the corresponding HIRS pixel is less straightforward, because HIRS is on a different grid. Each HIRS and AMSU-B or MHS scanline has an associated time. First, the HIRS scanline occurring closest in time to the AMSU-B/MHS scanline is located. The HIRS pixel in this scanline closest to the CloudSat pixel is added to the collocated dataset.

Both HIRS and AMSU-A pixels located by the described method can be quite far from the CloudSat pixel, with extreme cases currently larger than can be explained. This is visible in figure (3.4). The maximum distance for the MHS is 15 km, which is as expected because a collocation is defined like that. It is expected that the maximum distance to AMSU-A is 45 km and to HIRS not considerably larger than that. However, figure (3.4) shows larger distances. The reason for this needs to be investigated.
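The index mapping from the AMSU-B/MHS grid to the AMSU-A grid can be written as a small function. The sketch below follows the formula quoted above; it assumes 1-based (MATLAB-style) indices, which the text does not state explicitly, and the function name `amsua_pixel` is hypothetical.

```python
def amsua_pixel(i, j, shape):
    """Map an AMSU-B/MHS matrix position (i, j) to an AMSU-A position.

    Assumptions: indices are 1-based and `shape` is the size of the
    AMSU-A matrix, e.g. (30, N/3). The result is bounded to that size.
    """
    def bounded(value, upper):
        return min(max(value, 1), upper)

    return bounded(round(i / 3), shape[0]), bounded(round(j / 3), shape[1])


# AMSU-A matrix of 30 columns by 100 scan lines (AMSU-B: 90 by 300)
print(amsua_pixel(1, 1, (30, 100)))     # (1, 1)
print(amsua_pixel(5, 7, (30, 100)))     # (2, 2)
print(amsua_pixel(90, 300, (30, 100)))  # (30, 100)
```

Since i/3 is never exactly halfway between two integers for integer i, the rounding convention (half to even in Python, half away from zero in MATLAB) does not affect the result.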

[Figure 3.4: plot titled "Distances between pseudo-collocated pixels from AMSU-A, MHS and HIRS with Cloudsat". Horizontal axis: Distance (km), 0 to 70; vertical axis: Normalised frequency, 0 to 0.14. Legend: AMSU-A, MHS, HIRS.]

Figure 3.4: Normalised histogram showing the distances between the centerpoints of the CloudSat pixels with MHS, AMSU-A and HIRS pixels, using the data that are output by the collocation algorithm. See text for a discussion.

3.2.4 Postprocessing

The algorithm described in section (3.2.3) is performed between a CloudSat granule and all NOAA and MetOp files that overlap with it. This is repeated for all CloudSat granules in a day. For each day, the points are sorted by the following fields:

• CloudSat start time (thus CloudSat swath)

• AMSU start time (thus AMSU swath)

• CloudSat measurement time

• AMSU-B/MHS measurement time³

• AMSU-B/MHS column (viewing angle)
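Sorting on several fields in this order corresponds to sorting on a tuple of keys. A minimal Python sketch is given below; the field names are hypothetical and do not reflect the actual file format described in appendix (B).

```python
from operator import itemgetter

# Hypothetical field names, listed in the sort order given above
SORT_KEYS = ("cloudsat_start", "amsu_start", "cloudsat_time",
             "amsu_time", "amsu_column")


def sort_collocations(rows):
    """Sort collocation records (dicts) by the fields in SORT_KEYS."""
    return sorted(rows, key=itemgetter(*SORT_KEYS))


rows = [
    {"cloudsat_start": 2, "amsu_start": 1, "cloudsat_time": 5,
     "amsu_time": 5, "amsu_column": 1},
    {"cloudsat_start": 1, "amsu_start": 1, "cloudsat_time": 9,
     "amsu_time": 5, "amsu_column": 7},
    {"cloudsat_start": 1, "amsu_start": 1, "cloudsat_time": 9,
     "amsu_time": 5, "amsu_column": 3},
]
print([r["amsu_column"] for r in sort_collocations(rows)])  # [3, 7, 1]
```

Changing the sort order only requires reordering the entries of `SORT_KEYS`.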

For each day and satellite, three files are generated:

³ It was later realised it would be better to first sort on AMSU-B/MHS measurement time, then on CloudSat measurement time, because in many cases, there are many CloudSat pixels inside an AMSU-B/MHS pixel. However, this was only realised when all the data had already been collected and stored.

1. The first file contains information on the actual collocations. One AMSU-B or MHS pixel may collocate with many CloudSat pixels. The file format is described in table (B.1) in appendix (B.2).

2. Secondly, a file with data from the sensors on those satellites is generated. It contains brightness temperatures for all the channels on AMSU, HIRS and MHS, as well as the CloudSat Ice Water Path products. It is described in table (B.2) in appendix (B.2).

3. A third datafile contains one row for each AMSU-B or MHS pixel with at least one collocation. Since the AMSU-B or MHS pixel is much larger than the CloudSat pixel (see figure (3.3)), many CloudSat pixels collocate with the same AMSU-B or MHS pixel. For many applications, a one-to-one relation is desirable. For that reason, a number of statistics is calculated for each of the AMSU-B pixels with at least one CloudSat pixel:

• The number of CloudSat pixels inside the AMSU-B or MHS pixel.

• The mean value of the CloudSat pixels in the AMSU-B or MHS pixel. The mean value includes both cloudy and non-cloudy pixels.

• The standard deviation of the CloudSat pixels.

• The coefficient of variation (standard deviation divided by mean value).

• The cloud fraction: the number of pixels with positive Ice Water Path divided by the total number of pixels.

• The number of CloudSat pixels inside the (nearest) HIRS pixel.

For those files, a smaller maximum distance between the pixel centerpoints is chosen, a maximum of 7.0 km, so that CloudSat pixels are certain to be inside the AMSU-B or MHS pixel. Also, all pixels with a flagged value for the Ice Water Path (such as -9999) are discarded.
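The statistics per AMSU-B or MHS pixel can be sketched as follows. This is an illustration only: it assumes that all flagged values are negative (the text only gives -9999 as an example), it uses the population standard deviation (the text does not say which definition was used), and the function name `pixel_statistics` is hypothetical. The HIRS count is left out because it requires the HIRS geometry.

```python
import statistics


def pixel_statistics(iwp_values):
    """Summary statistics for the CloudSat IWP values inside one pixel.

    Assumption: flagged values (such as -9999) are negative and are
    discarded; zero means a non-cloudy CloudSat pixel.
    """
    valid = [v for v in iwp_values if v >= 0]
    if not valid:
        return None
    mean = statistics.fmean(valid)
    std = statistics.pstdev(valid)  # assumption: population definition
    return {
        "count": len(valid),
        "mean": mean,
        "std": std,
        # Coefficient of variation is undefined for an all-clear pixel
        "cv": std / mean if mean > 0 else float("nan"),
        "cloud_fraction": sum(v > 0 for v in valid) / len(valid),
    }


stats = pixel_statistics([0.0, 100.0, 300.0, -9999.0])
print(stats["count"], stats["cloud_fraction"])
```

In this example three valid CloudSat pixels remain, two of which are cloudy, so the cloud fraction is 2/3.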

3.3 Verification and statistics

The correctness of the collocation method was mainly verified by inspection of the design and a study of the results. The design and implementation have gone through some iterations before reaching the current stage. No unit tests were conducted and it has not been proven that the method is either correct or optimal. The method is thought to be correct, but it is probably not optimal.

[Figure 3.5: histogram titled "Histogram <15km <900 s overlaps Cloudsat vs. amsub/mhs on various satellites, 2007-01". Horizontal axis: Absolute latitude (degrees), 0 to 90; vertical axis: Number of occurrences (×10^5), 0 to 6. Legend: NOAA15, NOAA16, NOAA17, NOAA18, METOPA.]

Figure 3.5: A histogram of the number of collocations between the CloudSat CPR and the AMSU-B or MHS sensors on various satellites in January 2008. The maximum distance for a collocation is 15 km; the maximum time between the collocated measurements is 15 minutes (900 seconds).

Figure (3.5) is a histogram showing the distribution of collocations over different latitudes for the AMSU-B/MHS sensor on all five satellites considered. In this figure, it is clear that NOAA-18 has by far the most collocations with CloudSat, and is the only satellite that has nonpolar collocations with CloudSat. This can be explained by the LTAN (see table (2.1)). Whereas the LTAN difference between CloudSat and NOAA-18 is only around six minutes, the LTAN difference between CloudSat and the other satellites is in the order of hours.

Figures (3.6), (3.7) and (3.8) show the collocations as a function of latitude and viewing angle, for NOAA-18 and NOAA-16. In figure (3.6) it can be seen that close to the equator, collocations happen at a wider range of MHS viewing angles than at higher latitudes. If the two satellites pass through the same point in inertial space, above the equator, fifteen minutes after each other, the Earth rotates so their sub-satellite-points are $\frac{15\,\mathrm{min}}{24\,\mathrm{h}} \cdot 40000\,\mathrm{km} \approx 417\,\mathrm{km}$ apart. If the NOAA-18 altitude is 850 km, the viewing angle needs to be $\tan^{-1}(417/850) \approx 26^\circ$.

Another feature is that the figure is not mirror-symmetric around the equator. CloudSat has a slightly lower inclination than NOAA-18 (see table (2.1)), so there can be no collocation at nadir if the NOAA-18 is at its maximum latitude; it has to look off-nadir in the direction of the equator to collocate with the NOAA-18 maximum latitude. On one pole that is to the left, on the other pole that is to the right.

[Figure 3.6: two-dimensional histogram titled "Histogram <15km <900s CloudSat vs NOAA18 AMSUB/MHS, 2007-01". Horizontal axis: Viewing angle (°), −50 to 50; vertical axis: Latitude (°), −80 to 80; colour scale: occurrences, 0 to 10000.]

Figure 3.6: A two-dimensional histogram showing at which angles collocations between NOAA-18 MHS and the CloudSat CPR occurred in January 2007. See text for discussion.

[Figure 3.7: two-dimensional histogram titled "Histogram <15km <60s CloudSat vs NOAA18 AMSUB/MHS, 2007-01". Horizontal axis: Viewing angle (°), −50 to 50; vertical axis: Latitude (°), −80 to 80; colour scale: occurrences, 0 to 2500.]

Figure 3.7: This 2D-histogram is similar to the histogram in figure (3.6), but the collocation time interval criterion is one minute rather than fifteen minutes. See text for discussion.

[Figure 3.8: two-dimensional histogram titled "Histogram <15km <900s CloudSat vs NOAA16 AMSUB/MHS, 2007-01". Horizontal axis: Viewing angle (°), −50 to 50; vertical axis: Latitude (°), −80 to 80; colour scale: occurrences, 0 to 10000.]

Figure 3.8: This plot is similar to figure (3.6), but for NOAA-16 rather than NOAA-18.

In figure (3.7), the variation of angles at which collocations occur is much smaller, because the time interval is only one minute rather than fifteen minutes.

Figure (3.8) shows the same information for NOAA-16, and the figure looks very different. NOAA-16 is not near the A-train. Its LTAN is very different from the CloudSat LTAN (see table (2.1)). The only collocations occur near the poles, at higher angles as the latitude gets slightly lower. This makes sense because the satellites can only be close near the poles, which they pass each orbit, and NOAA-16 AMSU-B might need to “look over the pole” to see the same place as the CloudSat CPR does.

Figure (3.9) shows how the time interval for collocations for various satellites changes over the course of a sidereal day. The time interval is defined as the time for the AMSU-B or MHS measurement minus the time for the CloudSat measurement, so a positive number means that the AMSU-B or MHS comes first, and a negative number means that CloudSat comes first. It can be seen that the NOAA or MetOp satellite slowly catches up with CloudSat. That might be counter-intuitive at first, because the CloudSat orbit is lower and shorter, so CloudSat is faster (table (2.1)). However, CloudSat also has a higher inclination, so the time derivative of the latitude is higher, as can be seen in figure (3.10). The NOAA or MetOp satellite, with the lower latitude time derivative, can thus slowly approach CloudSat in this dimension.

Figure 3.9: The time interval for the collocations between CloudSat and the various NOAA and MetOp satellites, shown for the entire day of 2007-01-03.

[Figure 3.10: plot titled "NOAA-18, CloudSat, 2008-03-31 8:40-12:02". Horizontal axis: Time of day (UTC), 8.5 to 13; vertical axis: Latitude (degrees), −100 to 100. Legend: NOAA-18/AMSU-B 8:40-10:22, NOAA-18/AMSU-B 10:17-12:02, CloudSat 9:12-10:51, CloudSat 10:51-12:30, collocation.]

Figure 3.10: This figure illustrates the counter-intuitive feature seen in figure (3.9) that, despite CloudSat being faster, lower and having a shorter orbit, the time interval AMSU-B - CloudSat steadily increases, as if CloudSat was slower. The red sections show where collocations happen; naturally, this depends on the longitude as well (not shown).

Figure (3.10) shows the latitude as a function of time for both CloudSat and NOAA-18/MHS for 2008-03-31 8:40 – 12:30. It can be seen again that NOAA-18/MHS catches up with CloudSat. CloudSat has a higher inclination and a shorter period, so it needs to cover a larger range of latitudes in a shorter amount of time. Hence the time derivative of the latitude is larger, the line in the figure is steeper, and the NOAA-18/MHS can slowly approach CloudSat.

All subsequent statistics and figures apply to NOAA-18/MHS, because the only combination with global collocations is CloudSat vs. NOAA-18/MHS.

Figure (3.11) shows when collocations occur in January 2008. A regular pattern can be seen where around 16 hours with collocations are followed by around 32 hours with no collocations. Collocations within such a 16-hour period can be seen in figure (3.9).

Figure (3.12) shows how the number of collocations depends on the time interval and the distance criterion. It can be seen that for longer distances, the number of collocations increases. This makes sense, because the area increases quadratically with the distance. The time interval has only one dimension, and it can be seen that for a longer time interval, the number of collocations does not increase much.
However, one can see a preference for a negative time difference, with CloudSat in front of AMSU. This is in agreement with the LTAN from table (2.1).

The initial collocation dataset was made for any angle, any region, any time less than 900 seconds and any distance less than 15 km. This leads to many millions of collocations; for the year 2007, for all the collocations between CloudSat and NOAA-18/MHS, there are 124,822,977 collocations, where one NOAA-18/MHS pixel collocating with ten CloudSat pixels is counted as ten collocations. This corresponds to 2,669,135 MHS pixels with the CloudSat pixel certainly inside the MHS pixel (less than 7.0 km from the centerpoint). Naturally, this number decreases if conditions are set. For example, if only MHS pixels are considered that are within 20 degrees of the equator, within 5 degrees of nadir, contain only cloudy (positive) CloudSat IWP measurements, have a standard deviation smaller than the mean, and the collocation occurs within 5 minutes, only 3,058 pixels are left for the year 2007.
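Such a selection amounts to a simple filter on the per-pixel records. The Python sketch below illustrates the example conditions; the field names (`lat`, `angle`, `cloud_fraction`, `std`, `mean`, `dt`) are hypothetical and do not correspond to the actual file format.

```python
def select_pixels(rows, max_abs_lat=20.0, max_abs_angle=5.0, max_abs_dt=300.0):
    """Keep the MHS pixels that meet all example conditions from the text.

    Each row is a dict with hypothetical keys: lat (degrees), angle
    (degrees from nadir), cloud_fraction (0..1), std and mean (IWP
    statistics) and dt (time difference in seconds).
    """
    return [
        r for r in rows
        if abs(r["lat"]) <= max_abs_lat
        and abs(r["angle"]) <= max_abs_angle
        and r["cloud_fraction"] == 1.0   # only cloudy CloudSat pixels
        and r["std"] < r["mean"]
        and abs(r["dt"]) <= max_abs_dt   # within five minutes
    ]


rows = [
    {"lat": 3.0, "angle": -0.6, "cloud_fraction": 1.0,
     "std": 40.0, "mean": 120.0, "dt": -45.0},
    {"lat": 62.0, "angle": 0.6, "cloud_fraction": 1.0,
     "std": 40.0, "mean": 120.0, "dt": -45.0},
    {"lat": 3.0, "angle": 0.6, "cloud_fraction": 0.5,
     "std": 40.0, "mean": 120.0, "dt": -45.0},
]
print(len(select_pixels(rows)))  # 1
```

Because the initial dataset is stored with the widest criteria, stricter subsets like this one can be derived without rerunning the collocation algorithm.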

[Figure 3.11: plot titled "Collocations CloudSat CPR with NOAA-18/MHS, January 2007". Horizontal axis: Time of day (UTC), 0 to 20; vertical axis: Day of month, 0 to 30; colour scale: 0 to 6 (×10^4).]

Figure 3.11: Number of collocations in January 2008. The vertical axis shows the day of the month. The horizontal axis shows the universal time.

[Figure 3.12: two-dimensional histogram titled "Histogram of total number of collocations in 2007, NOAA-18/MHS". Horizontal axis: Distance (km), 0 to 15; vertical axis: Time interval NOAA-18/MHS − CloudSat/CPR (s), −800 to 800; colour scale: 0 to 14 (×10^4).]

Figure 3.12: The total number of collocations in the year 2007 in a 2-dimensional histogram as a function of interval and distance. The boxes are 1 km by 30 seconds and the histogram is non-cumulative.

Number of MHS pixels

Condition on the left   Second condition   Total        Total percentage   Relative percentage
nadir                   nadir              177,959      6.67%              100.00%
nadir                   tropical           26,410       0.99%              14.84%
nadir                   cloudy             85,542       3.20%              48.07%
tropical                nadir              26,410       0.99%              4.61%
tropical                tropical           572,444      21.45%             100.00%
tropical                cloudy             167,112      6.26%              29.19%
cloudy                  nadir              85,542       3.20%              7.35%
cloudy                  tropical           167,112      6.26%              14.35%
cloudy                  cloudy             1,164,362    43.62%             100.00%

Number of CloudSat pixels

Condition on the left   Second condition   Total        Total percentage   Relative percentage
nadir                   nadir              8,366,870    6.70%              100.00%
nadir                   tropical           1,209,730    0.97%              14.46%
nadir                   cloudy             3,452,065    2.77%              41.26%
tropical                nadir              1,209,730    0.97%              4.64%
tropical                tropical           26,064,247   20.88%             100.00%
tropical                cloudy             6,332,964    5.07%              24.30%
cloudy                  nadir              3,452,065    2.77%              7.42%
cloudy                  tropical           6,332,964    5.07%              13.60%
cloudy                  cloudy             46,549,333   37.29%             100.00%

Table 3.1: The total number of collocations for the year 2007 was 124,822,977 CloudSat pixels (distance up to 15 km) or 2,669,135 MHS pixels (distance up to 7.0 km). Nadir means a viewing angle less than one degree. Tropical means a latitude within 20 degrees of the equator. A MHS pixel is cloudy if all CloudSat pixels are cloudy. “Total” indicates the number of collocations meeting both requirements: there are 6,332,964 collocated CloudSat pixels that are tropical and cloudy. “Total percentage” divides “total” by the total number of collocations: 6,332,964 / 124,822,977 = 5.07% of all collocations are tropical and cloudy. “Relative percentage” divides by the number of points meeting the condition on the left (the denominator is on the diagonal): 6,332,964 / 26,064,247 = 24.30% of all tropical collocations are cloudy, and 6,332,964 / 46,549,333 = 13.60% of all cloudy collocations are tropical.

Table (3.1) shows more detailed statistics on the number of collocations.
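The two kinds of percentage in table (3.1) can be checked with a few lines of Python, using the CloudSat pixel counts for the tropical and cloudy conditions. The variable and function names are chosen for this illustration only.

```python
# CloudSat pixel counts taken from table (3.1)
TOTAL = 124_822_977
TROPICAL = 26_064_247
CLOUDY = 46_549_333
TROPICAL_AND_CLOUDY = 6_332_964


def percentage(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


# Total percentage: share of all collocations
print(f"{percentage(TROPICAL_AND_CLOUDY, TOTAL):.2f}%")     # 5.07%
# Relative percentage: share of the tropical collocations that are cloudy
print(f"{percentage(TROPICAL_AND_CLOUDY, TROPICAL):.2f}%")  # 24.30%
# Relative percentage: share of the cloudy collocations that are tropical
print(f"{percentage(TROPICAL_AND_CLOUDY, CLOUDY):.2f}%")    # 13.60%
```

The relative percentage is not symmetric: it depends on which of the two conditions is used as the denominator.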

[Figure 3.13: histogram titled "Histogram of 1x1 degree grid 2007 mean IWP, tropical ocean". Horizontal axis: yearly mean IWP for grid cell (g/m2), logarithmic, 10^-4 to 10^4; vertical axis: Normalised occurrence frequency, 0 to 0.1. Legend: collocated, all.]

Figure 3.13: CloudSat IWP was gridded in 1x1 degree boxes and averaged over the year 2007. Then, a histogram of the average IWP inside those grid boxes was created. This process was done twice: once for all IWP, once for AMSU-collocated IWP. This plot was created in cooperation with Carlos Jiménez and Salomon Eliasson.

Figure (3.13) shows several IWP histograms that were plotted to verify that no biases or other unexpected statistics exist in collocated IWP. It can be seen that the histogram that considers only collocated IWP is slightly more noisy than the histogram considering all IWP. This makes sense, because the total number of points is smaller. The rough shape is the same, as can be expected from a sample that should not have any bias.

Chapter 4

Using collocations

Ice water path (IWP) can be retrieved from measurements or simulated with radiative transfer systems. A primary application for collocations is to compare retrievals from radar measurements to simulations or to other retrievals. Retrievals from radar measurements are not perfect, but they can be expected to be considerably more reliable than passive measurements, because the vertical structure of clouds is resolved, which is not the case for passive radiometry. It is thus a suitable method to validate IWP retrievals from radiometer measurements or simulated IWP retrievals.

4.1 Validating simulated ice water path

A forward model is a model which takes an atmospheric state and simulates the brightness temperature that would be observed by a spaceborne radiometer in a certain frequency range. An inverse model is a model that takes a brightness temperature and calculates atmospheric properties from it.

The forward model is mathematically well-defined (although the physics might be complicated). It can calculate the radiance the instrument would measure when observing a predefined atmospheric state. The inverse model is non-trivial. The measured brightness temperature is a function of many variables: ice water content, ice particle size distribution, physical temperature of the atmosphere, physical temperature of the sensor, and many others. It is fundamentally impossible to know which of those properties accounts for an observed variation in measured radiance. The best that can be done is an educated guess.

There are different ways to test the quality of an inverse model. One way is to go “round”: given an atmospheric state, one uses a forward model to calculate the brightness temperature. One then inputs the brightness temperature into the inverse model. The output of the inverse model can then be compared with the atmospheric state that was used as the input to the forward model.

An example of a forward model is the Atmospheric Radiative Transfer Simulator (ARTS) (Buehler et al., 2005), which was developed at the Satellite Atmospheric Science group, in collaboration with Chalmers University.

[Figure 4.1: plot titled "Cloudsat IWP and NOAA18 MHS BT (ν=190.311 GHz) for 2007-00". Horizontal axis: 10log(IWP) [log(g/m2)], 0 to 5; vertical axis: Brightness Temperature (K), 120 to 300. Legend: CloudSat collocations, Bengt/Ajil simulations.]

Figure 4.1: Modified boxplot of Ice Water Path and MHS channel 5 brightness temperature. The horizontal bars show the median value of all the MHS channel 5 brightness temperatures inside the 10log IWP box. The upper and lower bars of the rectangle show the 1st and 3rd quartile (25th and 75th percentile). The lines extending from the boxes show the 1st and 99th percentile. All other points are plotted as outliers. Blue shows collocated measurements (MHS 5 and CloudSat IWP), red shows simulations (AMSU-B 20). The simulations were obtained by Bengt Rydberg and Ajil Kottayil with the Atmospheric Radiative Transfer Simulator (ARTS). See also figure (4.2).

The collocations can be used to relate brightness temperature to Ice Water Path. Figures (4.1) and (4.2) show the relation between the Ice Water Path and the MHS channel 5 brightness temperature. For these plots, only pixels are considered that are within 20 degrees of the equator, are within 5 degrees of nadir, contain only cloudy CloudSat pixels, contain at least ten CloudSat pixels including five collocated with HIRS, have a standard deviation smaller than the mean value, and have a maximum time interval of ten minutes. It can be seen that the brightness temperature only drops significantly at high IWP values (assuming CloudSat IWP is correct), and that thin clouds are not visible using only microwave radiometer data. The physical process causing this drop is attenuation. The water in the cloud attenuates the radiation on its way up from the lower troposphere to the satellite sensor. A relatively large amount of water is required before this effect becomes noticeable. Additionally, a variability in brightness temperatures for a given log10(IWP) is observed. This is caused by the natural variability of relative humidity in the troposphere, as described in section (2.2.1.2). The variability is smaller for the data simulated with ARTS than for the collocated measurements. This may indicate that the ARTS simulations insufficiently model the natural variability in the troposphere.

[Figure 4.2 plot: “Cloudsat IWP and NOAA18 MHS BT (ν=190.311 GHz) for 2007−00”. Horizontal axis: log10(IWP) [log(g/m2)]. Vertical axis: Brightness Temperature (K). Colour axis: occurrences.]

Figure 4.2: Two-dimensional histogram of CloudSat Ice Water Path and MHS channel 5 brightness temperature. This plot uses the same data as the blue boxes and lines in figure (4.1).

4.2 Comparing ice water path retrievals

Ice water path (IWP) can be retrieved from radar measurements or from passive measurements. CloudSat carries the Cloud Profiling Radar (see section (2.1.1)). IWP retrievals from passive sensors can use microwave measurements alone (Zhao and Weng, 2002; Liu and Curry, 2000), usually from AMSU-B/MHS (see section (2.2.1.2)), or, more commonly, combine measurements in various frequency ranges, for example using MODIS (Baum et al., 2000).

The NESDIS MSPPS (National Environmental Satellite, Data and Information Service Microwave Surface and Precipitation Products System) publishes IWP values derived from microwave radiometer measurements from the AMSU-B and MHS sensors.

[Figure 4.3 plot: “Scatter density plot IWP NOAA−18/MHS NESDIS versus CloudSat, 2007”. Horizontal axis: NOAA−18/MHS NESDIS Ice Water Path (g/m2). Vertical axis: CloudSat CPR Radar Only Ice Water Path (g/m2). Colour axis: log10(occurrences).]

Figure 4.3: Two-dimensional histogram of CloudSat Ice Water Path (averaged over an AMSU pixel) and NOAA NESDIS MSPPS IWP, for all collocations in the year 2007. This kind of figure is similar to a scatter plot, but it shows the density of points rather than the actual points. The colour axis is logarithmic.

Figure (4.3) shows a two-dimensional histogram comparing CloudSat Ice Water Path with NOAA NESDIS MSPPS Ice Water Path. It can be seen that the CloudSat CPR IWP is consistently higher than the NOAA NESDIS MSPPS IWP. This can be explained by looking at figures (4.1) and (4.2): only for high IWP does the brightness temperature drop significantly. Thin clouds are invisible, so ice water path is underestimated when only radiometer data are used (assuming CloudSat CPR IWP is correct).

4.3 Using an artificial neural network for developing an ice water path retrieval

An artificial neural network is defined by Jiménez (2003) as an interconnected assembly of processing units. Those units are called nodes. There are two types of nodes: source nodes and computation nodes; the latter are also known as neurons. Nodes are divided into layers: an input layer, zero or more hidden layers, and an output layer. A network without any hidden layers is called a single-layer network; a network with one or more hidden layers is called a multi-layer network.

Figure 4.4: Schematic of a simple neural net with three nodes in the input layer, four nodes in the hidden layer, and two nodes in the output layer. Figure by Colin M.L. Burnett, retrieved from the Wikipedia article artificial neural network.

A schematic of a neural network is shown in figure (4.4). A neuron calculates a linear combination of its inputs (vector i), using weights W and bias b, and transforms the resulting value using a transfer function:

$$o = f(\mathbf{W}^{T}\mathbf{i} + b) \qquad (4.1)$$

The function f is often called the activation function. Typical activation functions are the linear function (f(a) = a) and the hyperbolic tangent function (f(a) = tanh(a)), but there are many others.

A neural network can be trained for a specific task, such as regression. In this case, the network is shown a number of examples, representing a function. A training algorithm adjusts weights, biases and parameters of the activation functions to make the function fit the data as well as possible (regression). The details of neural networks are beyond the scope of this master thesis, but Jiménez (2003) provides a good introduction.

The collocated dataset (see chapter (3)) provides a suitable training dataset. For the year 2007, all MHS-averaged collocations were collected as described in section (3.2). From this, all collocations meeting the following criteria were selected:

• The latitude of the centerpoint of the MHS-pixel is between -20 and +20 degrees.

• The viewing angle of the MHS is between -5 and +5 degrees.

• All CloudSat pixels inside the MHS pixel detect an ice cloud. In practice, this means those CloudSat pixels measure a nonzero Ice Water Path.

• The MHS pixel contains at least 10 CloudSat pixels.

• At least 5 of the CloudSat pixels inside the MHS pixel are also inside a HIRS pixel (see figure (3.3)).

• The coefficient of variation of the CloudSat pixels is at most 1. That means that the standard deviation of the CloudSat pixels shall not exceed the mean value.

• The centerpoint of the MHS-pixel is over ocean.

• The time interval for the collocation is at most ten minutes.
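The selection criteria above can be sketched as a single predicate. This is only an illustration: the dictionary keys (`lat`, `view_angle`, and so on) are hypothetical names, not those of the thesis code, and the time criterion is assumed to apply to the absolute value of the time difference.

```python
def is_ideal(c):
    """Return True if one MHS-averaged collocation meets all selection criteria.

    `c` is a dict with hypothetical keys, one per criterion in the list above.
    """
    return (
        -20.0 <= c["lat"] <= 20.0             # centerpoint in the tropics
        and -5.0 <= c["view_angle"] <= 5.0    # near nadir
        and c["cloudy_fraction"] == 1.0       # every CloudSat pixel has IWP > 0
        and c["n_cloudsat"] >= 10             # enough CloudSat pixels in the MHS pixel
        and c["n_cloudsat_in_hirs"] >= 5      # enough of them also inside a HIRS pixel
        and c["iwp_std"] <= c["iwp_mean"]     # coefficient of variation at most 1
        and c["over_ocean"]                   # centerpoint over ocean
        and abs(c["dt_seconds"]) <= 600.0     # at most ten minutes apart
    )


candidate = {
    "lat": 3.2, "view_angle": -1.1, "cloudy_fraction": 1.0,
    "n_cloudsat": 12, "n_cloudsat_in_hirs": 6,
    "iwp_std": 80.0, "iwp_mean": 150.0,
    "over_ocean": True, "dt_seconds": -240.0,
}
print(is_ideal(candidate))
```

Because all CloudSat pixels must be cloudy, the mean IWP is strictly positive, so comparing the standard deviation with the mean is equivalent to requiring a coefficient of variation of at most 1.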

For the year 2007, 2,627 collocations met those criteria.

All neural network operations were performed using the MATLAB Neural Network toolbox V6.0.1 (R2008b). The collocations were randomly divided into 60% training, 15% testing and 25% validation. From the training dataset, MHS channels 3, 4 and 5 were selected as inputs and the CloudSat IWP was selected as target. The hidden layer had a hyperbolic tangent sigmoid activation function and the output layer a linear activation function. Those were used to train the neural network using the trainbr training method. The training was considered to be finished when the error with the testing data had increased for fifteen iterations.

[Figure 4.5 plot: “Neural network retrieval test, 2007”. Horizontal axis: independent IWP (g/m2), logarithmic. Vertical axis: retrieved IWP (g/m2), logarithmic.]

Figure 4.5: Scatter plot to show the performance of the neural network.

After training, the validation data were used to retrieve IWP from the observations. A scatter plot showing the neural network IWP versus the collocated CloudSat IWP is shown in figure (4.5). The retrieval is of reasonable quality, but there is a lot of room for improvement, particularly for lower IWP. This is expected, because figures (4.1) and (4.2) show that the difference between low and medium IWP is not visible in microwave radiometer data (at least not in MHS channel 5). See also figure (4.6) for a similar plot with HIRS data added to the input.
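The network structure described in this section (three brightness temperatures as input, one hidden layer with a hyperbolic tangent activation, a linear output) and the random 60/15/25 division can be sketched as below. This is an illustration under assumptions: the number of hidden nodes is not stated in the text, the weights in the example are arbitrary rather than trained, and the training itself (trainbr in the MATLAB toolbox) is not reproduced. All function names are hypothetical.

```python
import math
import random


def layer(inputs, weights, biases, activation):
    """Apply equation (4.1) to every neuron in one layer.

    weights[j] holds the weights of neuron j, one per input.
    """
    return [
        activation(sum(w * x for w, x in zip(w_j, inputs)) + b_j)
        for w_j, b_j in zip(weights, biases)
    ]


def forward(bt_channels, hidden_w, hidden_b, out_w, out_b):
    """Brightness temperatures in, one retrieved value out."""
    hidden = layer(bt_channels, hidden_w, hidden_b, math.tanh)
    return layer(hidden, out_w, out_b, lambda a: a)[0]  # linear output neuron


def split_indices(n, fractions=(0.60, 0.15, 0.25), seed=0):
    """Randomly divide n sample indices into training, testing and validation sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n)
    n_test = round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


train, test, validation = split_indices(2627)
print(len(train), len(test), len(validation))
```

In practice the inputs and targets would be scaled before training, and the retrieval may work better on the logarithm of the IWP; neither choice is specified here.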

4.3.1 Adding HIRS

The process described above was repeated with two added channels from the High Resolution Infrared Radiation Sounder (HIRS)¹. In theory, this should significantly improve the retrieval, particularly for lower values of the ice water path, because thin clouds are more visible in the infrared than in the microwave region. Figure (4.6) shows a scatter plot with HIRS and MHS channels as input, and figure (4.7) shows the improvement from adding HIRS data. It can be seen that the retrieval error decreases when adding HIRS data, but only slightly. Further work is needed to investigate this and to reduce the error further (see also the outlook in chapter (5)).

¹ Initially, a naive approach was chosen: all channels that had a significant correlation with the IWP were selected as inputs. However, since (some of) those channels might see the surface, this approach was abandoned.

[Figure 4.6 plot: “Neural network retrieval test, 2007”. Horizontal axis: independent IWP (g/m2), logarithmic. Vertical axis: retrieved IWP (g/m2), logarithmic.]

Figure 4.6: Scatter plot like in figure (4.5), but with HIRS channels 8 and 11.

[Figure 4.7 plot: “Neural Network improvement adding HIRS”. Horizontal axis: log10 IWP (g/m2). Vertical axis: median relative error. Legend: MHS 3+4+5; MHS 3+4+5, HIRS 8+11.]

Figure 4.7: Comparison of median absolute error with and without adding the HIRS channels.

Chapter 5

Conclusions and outlook

Satellite collocations can provide a very useful combination of data for atmospheric remote sensing. Finding collocations can be less trivial than it might initially seem. First the data need to be converted to a common format, then an algorithm needs to be developed that finds all collocations in an efficient way. Even sensors that are on the same satellite are not perfectly aligned, which significantly reduces the total number of three-way collocations between AMSU, HIRS and CPR. Since NOAA-18 is the only operational satellite near the A-Train used in this study, it is the only satellite whose instruments have equatorial collocations with the CloudSat CPR.

The number of collocations seems huge at first, but is much lower once the small core of “ideal” collocations meeting many requirements is selected. However, it is still large enough to perform basic neural network analysis. The preliminary neural network analysis that has been done raises many new questions, some of which are described in the following section. However, the initial results are also encouraging and suggest that good results can probably be obtained using the approach described in this thesis.

The correctness of the algorithm used to find collocations has not been proven. It would be good to establish the correctness of this method, either mathematically or by rigorous testing using an extensive unit test framework. The current algorithm only works for the CloudSat / AMSU combination. Generalising the collocation algorithm can be useful for other applications in the future.

One such application could be to add new data to the dataset. NOAA-19 was launched on 6 February 2009, and carries another copy of the MHS and HIRS/4 sensors. Advanced imagers and radiometers, such as MODIS inside the A-Train (and therefore near CloudSat) and AVHRR on the NOAA and MetOp satellites, can also be added to the collocations. This also raises the question of the correctness of the algorithm used to locate the HIRS pixels; this has not been tested rigorously and there might be room for improvement here. As seen in figure (3.4) on page (25), the distance to HIRS and AMSU-A can get quite large; this is puzzling and the reason for this needs to be investigated.

The quality of the neural network training can be further improved by studying the variability of the AMSU pixels. If the temperature of an AMSU pixel differs widely from the temperature of neighbouring pixels, the cloud system might be so small or so highly variable that the CloudSat measurements are not at all representative of the AMSU pixel. However, if this variability is low and we are looking at a large, homogeneous cloud system, we can even use CloudSat pixels that are between neighbouring AMSU pixels to determine the mean Ice Water Path of the region.

The current system assumes the CloudSat data are correct. In reality, CloudSat measurements have a finite error. Further work could take this error into account.

Many restrictions were applied to improve the quality of the neural network, but the improvement gained from each of those restrictions has not yet been studied in detail. Also, the algorithm currently only works for tropical ocean cases. It could be extended to work globally, possibly by adding additional inputs (latitude, surface type) to the network, or by using several networks.

The discussion above shows that, in addition to the encouraging results in this master thesis, a lot of work remains to be done. This could be the subject of continued research, for example in the form of a PhD project.

Appendix A

Complete software description

This appendix contains a description of all source files used in the implementation of this Master’s Thesis. For the purpose of the implementation, AMSU-B is equivalent to MHS and MetOp-A is equivalent to the NOAA satellites. Inside the source code and debugging output, MHS is treated as if it were AMSU-B, and MetOp-A is sometimes referred to as NOAA-99. This was done for ease of implementation. In the description below, AMSU-B means “AMSU-B or MHS”. This appendix only contains a brief description of the source files. For detailed usage and implementation, the reader is referred to the source files themselves.

A.1 Python source file

A.1.1 relate_hdf.py

The Python source file relate_hdf.py loops through all the CloudSat files and finds any AMSU-B granules that overlap. It writes the output to small text files with the start times for all the relevant granules.

A.2 MATLAB source files

More complete information can be found in the m-file headers and source code.


A.2.1 Core program

The following m-files implement the actual algorithm as described in chapter (3).

A.2.1.1 find_overlap.m

The m-file find_overlap.m contains the core algorithm as described in section (3.2.3). As arguments, it takes the path to a CloudSat file, the path to an AMSU file, the maximum distance and the maximum time. The preprocessing (decompressing the CloudSat file and processing the AMSU file) needs to be done beforehand. As output, find_overlap.m gives an N x 5 matrix, where N is the number of collocations found. Each row corresponds to a collocation. The first column contains the index in the CloudSat data. The second and third columns contain the index and column in the AMSU data. The fourth column has the distance, and the fifth column has the time interval, defined as the AMSU time minus the CloudSat time.
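The output format described above can be illustrated with a naive brute-force search. This sketch is not the algorithm of section (3.2.3), which is designed to be efficient; it only shows what each output row contains. The input layout, the function names and the Earth radius are assumptions of this example, and indices count from 0 here, whereas MATLAB counts from 1.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def brute_force_collocations(cloudsat, amsu, max_dist_km, max_dt_s):
    """Compare every CloudSat profile with every AMSU pixel.

    cloudsat: list of (lat, lon, unix_time), one entry per profile.
    amsu: list of scanlines, each a list of (lat, lon, unix_time) pixels.
    Returns rows (cs_index, amsu_index, amsu_column, distance_km, dt_s).
    """
    rows = []
    for i, (clat, clon, ctime) in enumerate(cloudsat):
        for j, scanline in enumerate(amsu):
            for k, (alat, alon, atime) in enumerate(scanline):
                dt = atime - ctime  # AMSU time minus CloudSat time
                if abs(dt) > max_dt_s:
                    continue
                dist = haversine_km(clat, clon, alat, alon)
                if dist <= max_dist_km:
                    rows.append((i, j, k, dist, dt))
    return rows


cloudsat = [(0.0, 0.0, 1000)]
amsu = [[(0.0, 0.05, 1100), (0.0, 1.0, 1100)], [(0.0, 0.05, 5000)]]
print(brute_force_collocations(cloudsat, amsu, max_dist_km=15.0, max_dt_s=900))
```

The cost of this approach grows with the product of the number of CloudSat profiles and AMSU pixels, which is why a dedicated algorithm is needed for a full year of data.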

A.2.1.2 compare_granule.m

The m-file compare_granule.m takes one CloudSat granule and runs find_overlap.m for each AMSU-B granule on each satellite that has any time overlap with the CloudSat granule. It also carries out the required preprocessing (decompressing, converting). Then, for each AMSU-B point, it locates the nearest AMSU-A and HIRS points. The first argument is the path to a file containing the start date and times for a CloudSat granule and all overlapping AMSU-B granules, as output by relate_hdf.py. Additionally, it takes arguments for the maximum distance and time (passed on to find_overlap.m), for the satellite number (number 99 is MetOp, number 0 means “all satellites”), and, optionally, two file descriptors for a normal logfile and an error logfile. The output is an N x 29 matrix, where N is the number of collocations found. The columns are described in table (B.1) in appendix (B.2).

A.2.1.3 compare_date.m

The m-file compare_date.m runs compare_granule.m for all CloudSat granules starting on a certain day. As arguments, it takes the year, the month and the day to consider, the maximum distance, the maximum time and the satellite (passed on to compare_granule.m). A special dataset can be created and distinguished in the output files by passing a string that is added to all output filenames. Optionally, two file descriptors for a normal logfile and an error logfile can be given; they are passed on to compare_granule.m.

The output of compare_date.m, which has the same format as the output of compare_granule.m, is written to a file. It writes one file per day and satellite.

A.2.1.4 find_all_overlap.m

The m-file find_all_overlap.m runs compare_date.m for all days. It reads a file alldates that contains a 3xN matrix with the year, month and day for all dates to consider.

A.2.2 Post-processing

The following m-files implement various post-processing functions, such as finding averages, filling gaps and removing doubles.

A.2.2.1 find_mean_CSIWP_per_AMSU_pixel.m

This m-file groups all CloudSat pixels inside an AMSU pixel and performs some statistics on those, as described in section (3.2.4).
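The per-pixel statistics listed in table (B.3) can be sketched as follows. This is a minimal illustration with hypothetical names; in particular, the population standard deviation (dividing by n) is an assumption, since the normalisation used in the thesis code is not stated here.

```python
import math


def cloudsat_stats(iwp_values):
    """Summarise the CloudSat IWP values (g/m2) that fall inside one AMSU-B/MHS pixel."""
    n = len(iwp_values)
    if n == 0:
        raise ValueError("no CloudSat pixels in this AMSU pixel")
    mean = sum(iwp_values) / n
    # Population standard deviation (divide by n); an assumption of this sketch.
    std = math.sqrt(sum((v - mean) ** 2 for v in iwp_values) / n)
    return {
        "n": n,
        "mean": mean,
        "std": std,
        "cv": std / mean if mean > 0 else float("nan"),  # coefficient of variation
        "cloudy_fraction": sum(1 for v in iwp_values if v > 0) / n,
    }


print(cloudsat_stats([0.0, 100.0, 200.0, 100.0]))
```

A pixel such as the one in the example, with a cloudy fraction below 1, would be rejected by the selection criteria of section (4.3).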

A.2.2.2 fill_missing_noaa18_amsua.m

When get_data_from_overlap.m was initially run, no AMSU-A data were available for NOAA-18. This m-file collects AMSU-A data and writes them to the overlap data files.

A.2.2.3 fill_mspps.m

This m-file adds the NOAA NESDIS MSPPS IWP product to the overlap data files.

A.2.2.4 get_data_from_overlap.m

Collects NOAA/MetOp brightness temperatures and CloudSat IWP for all collocations and writes them to data files (see section (3.2.4)).

A.2.2.5 remove_doubles.m

An early version of the collocation algorithm led to some collocations being found twice. This m-file removes the duplicates.

A.2.2.6 remove_all_doubles.m

Runs remove_doubles.m for all granules using for_each_granule.m.

A.2.3 Helper functions

The m-files in this section can be used in various circumstances.

A.2.3.1 COLNO.m

The m-file COLNO.m contains definitions for the columns used in the collocation files, collocation data files and the collocation mean data files. Those definitions make it easier and more portable to access certain columns of data in those matrices.
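The idea of named column definitions can be sketched as below for a few columns of table (B.1). The names are hypothetical and are not taken from COLNO.m; the positions count from 0, as is usual in Python, whereas table (B.1) counts from 1.

```python
from enum import IntEnum


class OverlapCol(IntEnum):
    """0-based positions of selected columns in an overlap row (table (B.1) minus 1)."""
    C_LON = 0      # CloudSat longitude
    C_LAT = 1      # CloudSat latitude
    C_START = 2    # CloudSat orbit start time
    C_TIME = 3     # CloudSat measurement time
    C_INDEX = 4    # CloudSat index
    B_TIME = 16    # AMSU-B/MHS measurement time
    B_INDEX = 17   # AMSU-B/MHS index
    B_COLUMN = 18  # AMSU-B/MHS column
    B_DIST = 19    # AMSU-B/MHS distance to CloudSat
    B_DT = 20      # AMSU-B/MHS time since CloudSat


def amsub_time_lag(row):
    """Recompute column 21 of table (B.1): AMSU-B time minus CloudSat time."""
    return row[OverlapCol.B_TIME] - row[OverlapCol.C_TIME]


row = [0.0] * 29
row[OverlapCol.C_TIME] = 1000.0
row[OverlapCol.B_TIME] = 1120.0
print(amsub_time_lag(row))
```

If the file layout changes, only the definitions need to be updated, not every place where a column is read.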

A.2.3.2 calc_distance.m

calc_distance.m takes two (lat, lon) pairs and calculates the distance using the Haversine formula (Wikipedia, 2009).

A.2.3.3 land_or_sea.m

Using atmlab’s land_sea_mask.m, returns a logical array for a matrix whose rows are (lat, lon) pairs, describing whether each point is on land or on sea.

A.2.3.4 put_in_bins.m

Puts y-values in bins according to the corresponding x-values.

A.2.3.5 satboxplot.m

Modified boxplot that connects lines to the 1st and 99th percentile instead of the usual convention.

A.2.3.6 for_each_granule.m

Wrapper function that applies a function to each granule (overlap, overlap data, overlap meandata).

A.2.3.7 find_datafile_by_date.m

For a certain date/time, satellite and sensor, returns the path to the granule starting on this date (if available).

A.2.3.8 find_datafile_by_unixtime.m

Like find_datafile_by_date.m, but takes UNIX time instead of date. Seconds are ignored.

A.2.3.9 find_short_distance.m

For two ground tracks, finds the segments where the distance from the first to the second ground track is shorter than maxdist, and describes to which region of the second ground track the distance is shorter than maxdist.

A.2.4 Reader function

This m-file is used to read data from the collocation files, collocation data files and collocation mean data files.

A.2.4.1 extract_from_overlap.m

Reads indicated columns of data from overlap files, overlap data files and overlap mean data files, for the indicated time period (one month or one year) and satellite.

A.2.5 Statistics and verification

These m-files are related to section (3.3).

A.2.5.1 make_stats_colloc.m

Makes statistics like those in table (3.1).

A.2.5.2 find_gridded_average_IWP_month.m

Makes figure (3.13).

A.2.5.3 find_IWPcorrelated_channels

While experimenting to find which AMSU and HIRS channels are good to use as input in a neural network (see section (4.3.1)), all AMSU channels were correlated with IWP, using this function. However, this approach was later abandoned.

A.2.6 Functions by others

These m-files were downloaded from the MATLAB website, where they were contributed for common usage.

A.2.6.1 ignoreNaN.m

Helper function by Matt G. to calculate statistics for an array or matrix while ignoring NaN values (normally, the mean of (1, 2, NaN) is NaN, but with ignoreNaN it becomes 1.5).
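The behaviour described above can be reproduced in a few lines. This is a sketch of the idea only, not a port of ignoreNaN.m; the function name is hypothetical.

```python
import math


def mean_ignoring_nan(values):
    """Mean of the non-NaN entries; NaN if there are none."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")


data = [1.0, 2.0, float("nan")]
print(sum(data) / len(data))    # nan: a single NaN contaminates the plain mean
print(mean_ignoring_nan(data))  # 1.5
```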

A.2.6.2 wgs2utm.m

Converts geographical coordinates to UTM coordinates. Used by plot_footprints.m to draw circles. Written by Alexandre Schimel.

A.2.6.3 hist2d.m

Creates a two-dimensional histogram. Written by David Dean.

A.2.7 Plotter functions

These m-files are used to plot.

A.2.7.1 plot_footprints.m

Used to plot figure (3.3).

A.2.7.2 plot_example_dlat_dlong.m

Used to plot figure (3.10).

A.2.7.3 plot_hist2d_dist_int.m

Used to plot figure (3.12).

A.2.7.4 plot_hist2d_logIWP_BT.m

Used to plot figure (4.1).

A.2.7.5 plot_scatter_CSIWP_NESDISIWP.m

Used to plot figure (4.3).

A.2.7.6 plot_extract_from_overlap.m

Earlier version of plot_hist2d_logIWP_BT.m, not used.

A.2.7.7 find_lowestlat_by_interval.m

Plots the lowest latitude for which any collocations are found at all, as a function of the time interval chosen, for NOAA-16. This plot is not included in the thesis.

A.2.7.8 plot_average_BT_month_midlat.m

Used to plot an earlier variant of figure (3.13).

A.2.7.9 plot_hist2d_latitude_angle.m

Used to plot figures (3.6) through (3.8).

A.2.7.10 plot_latitude_hist.m

Used to plot figure (3.5).

A.2.7.11 plot_monthly_IWP_PDF.m

For each month, makes various plots such as the ascending node latitude histogram, the ascending node longitude histogram, universal and local time histograms, ascending/descending ice water path, and MHS channel 5 Tb minus MHS channel 3 Tb vs. MHS channel 5. An example of such a plot is figure (3.13).

A.2.7.12 plot_overlap_column.m

Plots the overlap time interval as a function of the angle for equatorial collocations. Not included in the thesis.

A.2.7.13 plot_overlap_time.m

Used to plot figure (3.9) (for each day in January 2007).

A.2.8 Test functions

The following functions are used for testing m-files.

A.2.8.1 echonames.m

Used to test for_each_granule.m, merely writing down the names and arguments.

A.2.8.2 test_modis.m

Used to test how to read MODIS data. MODIS data are not included in the thesis.

A.2.8.3 test_data_comp.m

Various tests, keeps changing.

A.2.8.4 compare_amsua_amsub_hirs.m

Used to test how to read various data.

A.2.8.5 getfieldnames.m

Used to check what fields are available in a CloudSat file.

A.2.8.6 LWC_height.m

Used to check how to read CloudSat data.

A.2.9 Neural network

A.2.9.1 test_neuralnet.m

Used to get the neural network results described in section (4.3).

Appendix B

File format

B.1 File overlaps

The granule overlap files have a filename Cloudsat-YYYYMMDD-hhmm and contain a matrix describing which CloudSat files have a temporal overlap with which NOAA/MetOp files (see section (3.2.2)). The first five columns correspond to the year, month, day, hour and minute at which the datafile starts. The last column is the satellite number: 00 means CloudSat, 15–18 mean NOAA-15 through NOAA-18, and 99 means MetOp-A. The first row contains the information for CloudSat. This information duplicates the filename. The remaining rows describe the NOAA and MetOp satellite AMSU-B/MHS granules that contain overlap. From this, the path to the actual files containing the data for the different sensors can be calculated. On the system on which the collocations were calculated, the m-file find_datafile_by_date.m was used to do so. On other systems, only a small adaptation of this m-file should be required.
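Reading the information in a granule overlap file can be sketched as follows. The function names are hypothetical, the dictionary simply encodes the satellite numbers listed above, and all times are assumed to be in UTC.

```python
from datetime import datetime, timezone

# Satellite numbers as listed in the text above.
SATELLITE_NAMES = {0: "CloudSat", 15: "NOAA-15", 16: "NOAA-16",
                   17: "NOAA-17", 18: "NOAA-18", 99: "MetOp-A"}


def parse_overlap_filename(name):
    """'Cloudsat-YYYYMMDD-hhmm' -> start time of the CloudSat granule (UTC assumed)."""
    prefix, date_part, time_part = name.split("-")
    if prefix != "Cloudsat":
        raise ValueError("unexpected filename: " + name)
    start = datetime.strptime(date_part + time_part, "%Y%m%d%H%M")
    return start.replace(tzinfo=timezone.utc)


def parse_overlap_row(row):
    """(year, month, day, hour, minute, satellite number) -> (satellite name, start time)."""
    year, month, day, hour, minute, sat = row
    return SATELLITE_NAMES[sat], datetime(year, month, day, hour, minute, tzinfo=timezone.utc)


print(parse_overlap_filename("Cloudsat-20070101-0123"))
print(parse_overlap_row((2007, 1, 1, 0, 45, 18)))
```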

B.2 Collocations

The data are stored as matrices in MAT files using the MATLAB save command (version 7.3 or later). Table (B.1) describes the columns in the files containing collocation information. Table (B.2) describes the data corresponding to this. Finally, table (B.3) describes the mean data files (see section (3.2.4)). Each overlapfile corresponds to a datafile and a meandatafile. The overlapfile and the datafile have the same number of rows; row N in the overlapfile describes the same collocation as row N of the datafile. The meandatafile has fewer rows, because the overlapfile and datafile have a row for each CloudSat pixel, whereas the meandatafile has a row for each AMSU-B/MHS pixel.

A CloudSat measurement is uniquely determined by columns 3 (start time) and 5 (index). Times are stored in UNIX time (seconds since 1970-01-01 00:00:00) in UTC. An AMSU-B or MHS measurement is uniquely determined by columns 16 (start time), 18 (index) and 19 (column, directly related to the angle). Similar relations apply to AMSU-A and HIRS.
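The relation between the files can be sketched as follows: UNIX times convert directly to UTC, and the rows of an overlapfile that share one AMSU-B/MHS pixel collapse into a single row of the meandatafile. The function names are hypothetical, row positions count from 0, and the sketch assumes that rows belonging to the same AMSU-B/MHS pixel are stored consecutively.

```python
from datetime import datetime, timezone
from itertools import groupby


def unix_to_utc(seconds):
    """UNIX time (seconds since 1970-01-01 00:00:00 UTC) -> timezone-aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def amsub_key(row):
    """Columns 16, 18 and 19 of table (B.1), which count from 1; Python counts from 0."""
    return (row[15], row[17], row[18])


def group_by_amsub_pixel(overlap_rows):
    """Yield (first_row, last_row, n_cloudsat) per run of rows sharing an AMSU-B/MHS pixel."""
    position = 0
    for _, run in groupby(overlap_rows, key=amsub_key):
        n = len(list(run))
        yield (position, position + n - 1, n)
        position += n


def make_row(start, index, column):
    """Build a 29-column overlap row with only the AMSU-B key columns filled in."""
    row = [0.0] * 29
    row[15], row[17], row[18] = start, index, column
    return row


rows = [make_row(100, 5, 45), make_row(100, 5, 45), make_row(100, 5, 46), make_row(100, 6, 45)]
print(list(group_by_amsub_pixel(rows)))
print(unix_to_utc(1167609600))
```

The three values yielded per pixel correspond to the first three columns of table (B.3).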

Table B.1: Columns in the overlap files. A = NOAA/MetOp-A AMSU-A; B = NOAA/MetOp-A AMSU-B/MHS; C = CloudSat; H = NOAA/MetOp-A HIRS

no   quantity                 unit        comment
1    C longitude              degrees     Obtained from columns 3, 5
2    C latitude               degrees     Obtained from columns 3, 5
3    C orbit start time       UNIX time   Determines granulefile
4    C measurement time       UNIX time   Calculated from 3, 5
5    C index                  integer     Core information
6    A longitude              degrees     Obtained from columns 8, 10, 11
7    A latitude               degrees     Obtained from 8, 10, 11
8    A orbit start time       UNIX time   Determines granulefile
9    A measurement time       UNIX time   Scan time neglected
10   A index                  integer     Calculated from 18
11   A column                 integer     Calculated from 19
12   A distance to CloudSat   km          Calculated from 1, 2, 6, 7
13   A time since CloudSat    s           Equal to 9 - 4
14   B longitude              degrees     Obtained from 16, 18, 19
15   B latitude               degrees     Obtained from 16, 18, 19
16   B orbit start time       UNIX time   Determines granulefile
17   B measurement time       UNIX time   Scan time neglected
18   B index                  integer     Core information
19   B column                 integer     Core information
20   B distance to CloudSat   km          Calculated from 1, 2, 14, 15
21   B time since CloudSat    s           Equal to 17 - 4
22   H longitude              degrees     Obtained from 24, 26, 27
23   H latitude               degrees     Obtained from 24, 26, 27
24   H orbit start time       UNIX time   Determines granulefile
25   H measurement time       UNIX time   Scan time neglected
26   H index                  integer     Calculated from 18 and data
27   H column                 integer     Calculated from 19 and data
28   H distance to CloudSat   km          Calculated from 1, 2, 22, 23
29   H time since CloudSat    s           Equal to 25 - 4

Table B.2: Columns in the data files

no      quantity                                               unit
1       CloudSat Radar-Only (RO) IWP                           g m−2
2       CloudSat RO IWP uncertainty                            g m−2
3       CloudSat Ice-Only (IO) RO IWP                          g m−2
4       CloudSat IO RO IWP uncertainty                         g m−2
5–19    AMSU-A TB channels 1–15                                K
20–24   AMSU-B TB channels 16–20 or MHS TB channels 1–5        K
25–44   HIRS TB channels 1–20                                  K
45      MSPPS IWP                                              g m−2

Table B.3: Columns in the mean data files

no   quantity                                    unit
1    First corresponding row in overlapfile
2    Last corresponding row in overlapfile
3    Number of CloudSat pixels
4    Mean CloudSat RO IWP                        g m−2
5    CloudSat RO IWP standard deviation          g m−2
6    CloudSat RO IWP coefficient of variation
7    Fraction of cloudy CloudSat pixels

Appendix C

Websites

POES Spacecraft Status http://www.oso.noaa.gov/poesstatus/index.asp

CloudSat Data Processing Centre TLE http://www.cloudsat.cira.colostate.edu/dpcstatusElements.php

NOAA and MetOp TLE http://celestrak.com/NORAD/elements/

Appendix D

Acronyms and glossary

This appendix contains a glossary explaining both acronyms and terms not usually known to people not working in the field (more specifically, terms not known by the student prior to this study).

AMSU Advanced Microwave Sounding Unit

ANN Artificial Neural Network

ARTS Atmospheric Radiative Transfer Simulator

collocation see section (3.1)

CDPC CloudSat Data Processing Centre

CPR Cloud Profiling Radar

EHF Extremely High Frequency

granule One orbit of satellite data, starting at the ascending node

HIRS High Resolution Infrared Radiation Sounder

MHS Microwave Humidity Sounder

ice water path The vertically integrated ice content in a column of atmosphere (mass per area)

IOROIWP Ice-Only Radar-Only Ice Water Path

IRV Institutet för Rymdvetenskap (Department of Space Science)

IWP Ice Water Path


JPL Jet Propulsion Laboratory

LTAN Local Time Ascending Node

MSPPS Microwave Surface and Precipitation Products System

NASA National Aeronautics and Space Administration

NESDIS National Environmental Satellite, Data and Information Service

NN Neural Network

NOAA National Oceanic and Atmospheric Administration

NORAD North American Aerospace Defense Command

POES Polar Orbiting Environmental Satellite

RADAR RAdio Detection And Ranging

ROIWP Radar-Only Ice Water Path

SAT SATellite Atmospheric Science Group (the research group where the thesis was written)

SRF Sensor Response Function

SSN Space Surveillance Network

SSP Sub-Satellite Point

swath One scanline of data (across-track for NOAA and MetOp)

TIROS Television Infra-Red Observation Satellite

TLE Two-Line Elements

URL Uniform Resource Locator

USGS United States Geological Survey

UTC Coordinated Universal Time

WRS Worldwide Reference System

Bibliography

Austin, R. (2004), Level 2 Cloud Ice Water Content Product Process Description and Interface Control Document, Fort Collins, Colorado, version 3.0, 30 July 2004.

Austin, R. (2007), Level 2B Radar-only Cloud Water Content (2B-CWC-RO) Process Description Document, version 5.1, 27 October 2007.

Baum, B. A., P. F. Soulen, K. I. Strabala, M. D. King, S. A. Ackerman, W. P. Menzel, and P. Yang (2000), Remote sensing of cloud properties using MODIS airborne simulator imagery during SUCCESS. 2. Cloud thermodynamic phase, J. Geophys. Res., 105, 11,781–11,792, doi:10.1029/1999JD901090.

Buehler, S. A., and V. O. John (2005), A simple method to relate microwave radiances to upper tropospheric humidity, J. Geophys. Res., 110, D02110, doi:10.1029/2004JD005111.

Buehler, S. A., P. Eriksson, T. Kuhn, A. von Engeln, and C. Verdes (2005), ARTS, the atmospheric radiative transfer simulator, J. Quant. Spectrosc. Radiat. Transfer, 91 (1), 65–93, doi:10.1016/j.jqsrt.2004.05.051.

Buehler, S. A., et al. (2007), A concept for a satellite mission to measure cloud ice water path and ice particle size, Q. J. R. Meteorol. Soc., 133 (S2), 109–128, doi:10.1002/qj.143.

Cloudsat (2008), CloudSat Standard Data Products Handbook, Fort Collins, Colorado, revised 25 April, 2008.

Durden, S., and R. Boain (2004), Orbit and transmit characteristics of the cloudsat cloud profiling radar (CPR), Tech. Rep. D-29695, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109.

ESA (2006), ESA - Living Planet Programme - MetOp - MetOp instrument overview, [Online; accessed 21 January 2009].


Jiménez, C. (2003), A neural network technique for retrieving atmospheric species from microwave limb sounders, Ph.D. thesis, Chalmers University of Technology, Göteborg, Sweden.

Li, L., S. Durden, and S. Tanelli (2007), Level 1 B CPR Process Description and Interface Control Document, Pasadena, California, version 5.3, 27 June 2007.

Liu, G., and J. A. Curry (2000), Determination of ice water path and mass median particle size using multichannel microwave measurements, J. Appl. Meteorol., 39, 1318–1329.

Mace, G. (2007), Level 2 GEOPROF Product Process Description and Interface Control Document, version 5.3, 28 June 2007.

NASA website (2008), Tiros mission - science mission directorate, http://nasascience.nasa.gov/missions/tiros, [Online; accessed 18 January 2009].

NOAA (2007), NOAA KLM User’s Guide, [Online; accessed 21 January 2009].

Partain, P. (2007a), Cloudsat ECMWF-AUX Auxiliary Data Process Description and Interface Control Document, Fort Collins, Colorado, version 5.2, 18 July 2007.

Partain, P. (2007b), Cloudsat MODIS-AUX Auxiliary Data Process Description and Interface Control Document, Fort Collins, Colorado, version 5.1, 18 July 2007.

Rees, G. (2001), Physical Principles of Remote Sensing, Cambridge University Press.

Röttger, J. (1989), The instrumental principles of MST radars and incoherent scatter radars and the configuration of radar system hardware, in Middle atmosphere program - Handbook for MAP: International school on atmospheric radar, vol. 30, pp. 54–113, SCOSTEP Secretariat, Urbana (IL).

Stephens, G. L., et al. (2002), The cloudsat mission and the A-train, Bull. Amer. Met. Soc., 83, 1771–1790.

Wikipedia (2009), Haversine formula - wikipedia, the free encyclopedia, http://en.wikipedia.org/w/index.php?title=Haversine_formula&oldid=277051691, [Online; accessed 20-April-2009].

WRS (), The worldwide reference system, http://landsat.gsfc.nasa.gov/about/wrs.html, [Online; accessed 13 February 2009].

Zhao, L., and F. Weng (2002), Retrieval of ice cloud parameters using the advanced microwave sounding unit, J. Appl. Meteorol., 41, 384–395.