The Pennsylvania State University The Graduate School College of Engineering

STEGANOGRAPHIC

A Dissertation in Electrical Engineering by Joonho Daniel Park

© 2016 Joonho Daniel Park

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

December 2016 The dissertation of Joonho Daniel Park was reviewed and approved∗ by the following:

John F. Doherty Professor of Electrical Engineering Dissertation Advisor, Chair of Committee

Vishal Monga Associate Professor of Electrical Engineering

David M. Jenkins Research Associate of Applied Research Laboratory

R. Lee Culver Senior Research Associate of Applied Research Laboratory Associate Professor of Acoustics

Kultegin Aydin Professor of Electrical Engineering Head of the Department of Electrical Engineering

∗Signatures are on file in the Graduate School.

ii Abstract

Steganographic transmission waveforms are desirable for many underwater appli- cations such as underwater acoustic communications, surveillance, detection, and tracking. In the context of active sonar tracking, an overall approach is presented that covers many topics including background sound modeling, steganographic security assessment of transmission waveform, target motion modeling, and batch track detection. Under shallow water environment background sound dominated by snapping shrimps, the presented approach shows feasibility of steganographic sonar for target track detection. A background sound modeling approach is presented that utilizes symbolic time series analysis with phase space representation and clustering of time series segments. The time evolving characteristics are captured with the hidden Markov model (HMM). The steganographic security of the transmission waveform can be assessed by measuring the Kullback-Leibler divergence (KLD) between the cover model and the stego model. Conventional tracking approaches fail to perform reliably, and existing track- before-detect approaches also fail due to overwhelming search computation when they use simplistic short timescale motion models. A long timescale target motion model with considerations for plausibility is presented and utilized in a batch track detection approach. A Monte Carlo-based simulation was performed to find the scenarios that demonstrate feasibility of steganographic sonar in a shallow water environment with snapping shrimp sound.

iii Table of Contents

List of Figures vi

Acknowledgments x

Chapter 1 Introduction 1 1.1 Motivation ...... 1 1.2 Sonar Tracking ...... 2 1.2.1 Underwater Sound Propagation ...... 3 1.2.2 Sonar Equations ...... 3 1.3 Steganography ...... 4 1.3.1 Steganography in Communications ...... 6 1.3.2 Steganography in Sonar ...... 7 1.4 Contributions ...... 8 1.5 Dissertation Organization ...... 9

Chapter 2 Track-Before-Detect 10 2.1 Problem Definition ...... 11 2.1.1 Measurement Model ...... 11 2.1.2 Motion Model ...... 12 2.2 Dynamic Programming ...... 13 2.3 Particle Filter ...... 15 2.4 Role of the Motion Model ...... 17

Chapter 3 Weak Target Track Detection 19 3.1 Long Timescale Motion Model ...... 20 3.2 Long Timescale Batch Track Detection ...... 23 3.3 Comparison with TBD Methods ...... 27

iv 3.4 Motion Similarity ...... 29 3.5 Motion Model Homogeneity ...... 30 3.6 Batch Track Detection Performance Prediction ...... 32

Chapter 4 Steganography 36 4.1 Information Hiding and Covert Waveform ...... 37 4.1.1 Spread Spectrum ...... 39 4.2 Adaptive Steganography ...... 39 4.3 Epistemological Steganography ...... 41 4.4 Steganography in Sonar ...... 43 4.5 Steganographic Security of Sonar Waveform ...... 46

Chapter 5 Background Sound Modeling 50 5.1 Time Series Modeling ...... 52 5.1.1 Time Series Clustering ...... 53 5.1.2 HMM Parameter Estimation ...... 56 5.2 Song Modeling and Steganography ...... 60

Chapter 6 Steganographic Sonar 63 6.1 Matched Filter and Energy Detector ...... 64 6.2 Monte Carlo Simulation ...... 69 6.3 Results ...... 72

Chapter 7 Conclusion and Discussion 87 7.1 Future Effort: Model Mismatch and Model Complexity ...... 88 7.2 Discussion ...... 89

Bibliography 91

v List of Figures

3.1 PDFs of acceleration, Singer’s model and the acceleration-based HMM 22 3.2 Sample motions from the Singer’s motion model ...... 22 3.3 Sample motions from the acceleration-based HMM ...... 23 3.4 Entire observed acoustic data with the target motion depicted as the curved dashed line...... 24 3.5 Aligned observation using the motion information ...... 24 3.6 Processed with incorrect motion ...... 25 3.7 Well aligned autocorrelation peaks produces higher contrast inte- grated peak as seen in (c) and (f) ...... 26 3.8 Track detection performance with varying SNR. Pulse duration: 1000 28 3.9 Rxx(t) is the autocorrelation function of a band-limited transmission waveform, and R(t) is a function with the same 3dB width...... 30 3.10 Distribution function of similarity to a true target motion with associated parameter value λ = −2...... 32 3.11 Track detection performance comparison between simulation and model. Solid lines are simulation and dashed lines are model. Black lines are when M = 200 and Blue lines are when M = 400...... 35

4.1 The striping pattern of zebras, while effective against its predator and parasite, do not adapt to the environment, and are easily recognized by humans...... 40 4.2 Octopus can blend into the environment with very high fidelity making it very difficult for its predators to recognize it...... 41 4.3 Venn diagram of ideal cover model and estimate models by the hider and the seeker ...... 42 4.4 Wenz curve. It describes the pressure spectral density levels of under- water ambient noise from weather, wind, geologic, and commercial shipping ...... 45

vi 5.1 Segments of recording of snapping shrimp. The direct path arrivals are aligned, and the second arrival are reflections off the water-air interface...... 54 5.2 An illustration of the clustering topology. Each point represents a segment and a circle represents a cluster with radius d. Segments within the a circle are assigned the same symbol. Segments not included in any circle is a assigned a unique symbol...... 55 5.3 Histogram of effective segment durations characterized by the mod- eling scheme...... 57 5.4 A simplex illustration of the true model and the estimated model for both cover and stego, and the maximum entropy model. . . . . 59 5.5 Steganographic security vs. Waveform strength ...... 62

6.1 Distribution function of motion similarity for a complex target motion with associated parameter λ = 2...... 66 6.2 Power spectral density of white noise, pink noise with α = 1, 1.5, 1.9, and snapping shrimp recording ...... 67 6.3 Zoomed in peak of autocorrelation functions of white noise, pink noise with α = 1, 1.5, 1.9, and snapping shrimp recording. The dotted line indicates −3dB...... 67 6.4 Comparison of estimated PDF of energy detector decision statistics of white Gaussian and pink Gaussian (α = 1), and other colored Gaussian processes with α = 1.5, 1.9, for integration length of 1 second...... 68 6.5 Matched filtered data with a strong signal for illustration...... 70 6.6 Processed data with true target motion...... 71 6.7 Expected lower bound of missed detection probability computed from the estimated Kullback-Leibler divergence of the cover and the stego for white Gaussian, snapping shrimp, pink Gaussian (α = 1), and other colored Gaussian processes with α = 1.5, 1.9...... 73 6.8 Probability of missed detection of energy detector with integration time of 1 second. The stego for white Gaussian, snapping shrimp, pink Gaussian (α = 1), other colored Gaussian with α = 1.5, 1.9. . . 74 6.9 Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with white Gaussian process...... 75 6.10 Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with snapping shrimp process...... 75

vii 6.11 Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with pink (α = 1) Gaussian process...... 76 6.12 Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with colored Gaussian process with α = 1.5...... 77 6.13 Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with colored Gaussian process with α = 1.9...... 77 6.14 Batch Track Detection performance vs. received target echo level. . 78 6.15 Steganographic sonar performance of white Gaussian process in snapping shrimp background...... 79 6.16 Steganographic sonar performance of snapping shrimp process in snapping shrimp background...... 80 6.17 Steganographic sonar performance of pink Gaussian process (α = 1) in snapping shrimp background...... 80 6.18 Steganographic sonar performance of colored Gaussian process with α = 1.5 in snapping shrimp background...... 81 6.19 Steganographic sonar performance of colored Gaussian process with α = 1.9 in snapping shrimp background...... 81 6.20 Mean received level of the waveform at the target and the mean target echo level at the platform, for each grid...... 83 6.21 Probability of missed detection at the target and probability of track detection at the platform, for each grid...... 83 6.22 Steganographic sonar performance computed by taking the product of the individual performances at the target and at the platform compared to the jointly computed performance...... 84 6.23 Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation re- sult and computing the joint performance with the independence assumption, for pink Gaussian waveform...... 85 6.24 Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation re- sult and computing the joint performance with the independence assumption, for white Gaussian waveform...... 85 6.25 Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation re- sult and computing the joint performance with the independence assumption, for colored Gaussian waveform with α = 1.5...... 86

viii 6.26 Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation re- sult and computing the joint performance with the independence assumption, for colored Gaussian waveform with α = 1.9...... 86

ix Acknowledgments

December in State College can be very long and cold with snowy weather and overcast skies. This one would have felt longer without the help from many people I would like to express my gratitude to. I would like to first acknowledge the support from the Office of Naval Research Code 33 as most of the work was done under award N00014-10-1-0817, and the Applied Research Laboratory for the administrative support. Academic advisor Dr. Doherty has shown much more patience and generosity anyone could ask for, and his gentle way of putting me back on track from numerous straying excursions into the deep woods have saved me from being completely lost. His sense of humor has also allowed me to see the absurdity in some of my initial ideas without the feel of devastation as many graduate work experience can be. I am deeply grateful for his guidance. Very special thanks to Dr. Culver, who has always kept interest and provided valuable advice, and to Dr. Thompson, who had initially suggested this problem to me. I feel fortunate to be working for Tom and Jim, I appreciate their encouragement and administrative support for this work. Brett had provided the snapping shrimp recording, and Shawn had introduced the EDM track I used to test the modeling, their constructive feedbacks are also appreciated. I am deeply indebted to my wife for her trust and sacrifices, and to my parents for their encouragement and patience. The support from my family from near and far are greatly appreciated. Dedications to my grandfather and grandmother, whom I will miss dearly.

x Chapter 1 | Introduction

1.1 Motivation

One of the main objectives for underwater acoustic signal processing is to find objects of interest using the knowledge of wave propagation and scattering theory. In typical SONAR (Sound Navigation And Ranging) applications, the source at the platform transmits an acoustic waveform pulse which propagates through the water and scatters off the target object and propagates back to the platform. The returned echo signal is processed for detection and tracking. Traditionally, the type of sound used for this process has been determined by the physical constraints imposed by the electro-acoustic transducer, and by the signal processing constraints due to computational efficiency. A linear frequency modulated (LFM) waveform, also known as a chirp, is typically used in many sonar applications. An LFM chirp is not a natural sound in underwater environment, therefore it is not difficult for the target to recognize it is being probed by a man-made sound. There are many applications where a less conspicuous waveform is desired. Also, short pulses with high sound pressure level (SPL) can have adverse effects on the marine life. Efforts are being made to find alternative sonar waveforms that are less intrusive. By spreading the transmission waveform energy in time and adopting a different signal processing approach, sonar can have less environmental harm while still achieving its goal. In this dissertation, a steganographic approach for sonar tracking is presented with the comparison to existing methods for covert sonar waveform design and weak target tracking. Elements from steganography are adopted to guide the waveform

1 design process with an empirical assessment of the steganographic security of the waveforms. For tracking weak targets, a motion model for marine vehicles is proposed which improves upon existing models and tracking approaches. Typicality and plausibility are of great interest, for both environmental sound and target motion.

1.2 Sonar Tracking

Conventional active sonar tracking is done by transmitting a waveform, followed by thresholding the matched filter output of the received echo to detect and estimate the target position, repeated for each transmission. Track of a target is the connection of the position estimates across multiple transmissions [1]. Thresholding, or detection, is an irreversible process through which information is lost, but it allows for efficient information processing afterwards since the following processes only need to work with position estimates and is only concerned with how to connect them, without the need to retain all of the sensor data. One of the shortcomings of conventional tracking methods is that when target echo level is low relative to the clutter level, the thresholding process does not sufficiently separate the target energy and energy from non-targets, also known as clutter. As a consequence, higher probability of detection cannot be achieved without having to tolerate an increase in probability of false detection. Thresholding becomes an unreliable signal processing step for weak targets, or also known as low-observable targets in literature [2–6]. Track-before-detect (TBD) approaches were devised to improve the reliability of tracking performance by retaining the sensor data longer and making a number of assumptions on viable motion of the target for short periods of time. For example, by assuming that the maximum speed is known, one can hypothesize a number of potential tracks and compute scores for each hypothesized track for the observed data to find the track with the highest score [7]. This approach was theorized and demonstrated by [8–13]. By making assumptions for motion characteristics, TBD approaches can limit the search space of viable tracks and make decisions within the viable track space, making it computationally feasible. TBD approaches are explicitly or implicitly embedding information regarding target motion in the signal processing,

2 thereby assisting the decision-making and improving the performance. Dynamic programming and particle filter are the two main approaches for TBD. They will be discussed in more detail in the following chapters.

1.2.1 Underwater Sound Propagation

Due to its relative ease of underwater propagation over other forms of radiation, sound waves have been used extensively for underwater exploration and for other purposes [14]. Sound travels as a longitudinal compression wave and its propagation characteristics depend on the properties of the medium, ocean water, including salinity, temperature gradient, both of which affect the density, and the bulk modulus of elasticity [14,15]. In an ideal homogeneous medium, sinusoidal wave from an oscillating source propagates spherically in all directions symmetrically. The spherical curvature of the wavefront is prominent in close range, but as the range from the source grows sufficiently larger relative to the wavelength, the wavefront curvature becomes sufficiently small such that the propagating wave can be approximated as a plane wave. This approximation is valid for many practical situations and simplifies the spherical wave equations to plane wave equations [15]. The intensity of an acoustic wave is defined as the acoustic power per unit area, and in acoustic system analysis it is customary to use decibel, or dB, notation to represent the level of the acoustic intensity relative to the intensity of a plane wave with an rms (root mean square) pressure equal to 1µP a. This notation is also carried out in sonar equations which is traditionally used for assessing performance of sonar systems [14,15].

1.2.2 Sonar Equations

Source level, SL, is the amount of sound radiated by the platform, defined as the intensity of the radiated sound at a distance of 1 meter from the source, where intensity is the amount of sound power transmitted through a unit area in a certain direction in units of dB relative to 1µP a. Transmission loss, TL, is a measure of the reduction in sound intensity due to propagation loss between two points, for example, between the platform and the target, due to spreading and absorption. The received level at the target, RLT of

3 the waveform transmitted by the platform is

RLT = SL − T . (1.1)

At the platform, the transmitted waveform is scattered off the target, at the level called the target strength, TS, and the attenuation due to propagation is twice that of what the target receives

RLP = SL + TS − 2T L. (1.2)

Both of the received levels are expressed in the log scale, and can also be expressed in linear scale as, RLT RLP σT = 10 20 , σP = 10 20 . (1.3)

Effects of spreading on transmission loss dominates at close ranges and absorp- tion effects are more significant at longer ranges. For the ranges considered in the dissertation, the transmission loss is dominated by the spreading loss and the effect of absorption loss is less significant. Therefore, TL is primarily a function of range,

R, is typically expressed as 20 log10 R in deep water by spherical spreading and as 10 log10 R in shallow water by cylindrical spreading. In practice, other factors contribute to more attenuation and (1.4) is often used as a simple approximation for many underwater applications in shallow water and shallow-to-deep transition regions [16].

TL = 15 log10 R (1.4)

1.3 Steganography

Steganography is the practice of concealing information in plain sight [17, 18]. Undetected transmission of waveform is desired for many applications including communications, radar, and sonar. Covertness is not a well-defined quantity, and it is typically assessed by evaluating the performance of potential intercept detectors, making it an assessment subjective to the type of detector. Detectors make assumptions on the signal or the noise, and try to find the best way to separate the two based on the assumption. When the assumptions match reality, intercept detectors perform well finding anomalous events while making little errors.

4 On the contrary, if the intercept detector makes incorrect or incomplete assump- tions, it could make it easier to deceive such a detector. All of the information may be available to distinguish between the normal background and the transmission waveform present in it, but when the detector does not know how to look for one, it will either miss it completely or it may take a long time to find one, which may be too late. Therefore, from the perspective of an intercept detector, it is important to have a good understanding of the background. The steganography literature points out that in practice, one has to consider the empirical nature of real world implementation of steganographic methods [19]. It is natural to assume one should learn the patterns of the background in order to be effective when trying to hide information in the background. Learning the background as accurately as resources allow could be the key to effective adaptive steganography [20]. However, the idea of adapting to the background and changing the hiding scheme is yet to be adopted in most current waveform designs for sonar and radar. Most existing covert waveform design approaches employ either a spread spec- trum pseudo-random chaotic waveform [21,22] or a frequency-hopping method with some coding scheme to remove man-made structures [23–25], with little or no regard to what environment it is being transmitted into. These approaches employed in different environments will achieve inconsistent levels of stealth depending on how similar the background perturbed by the transmitted waveform, which does not adapt to the environment, is to the background itself. Suppose there exists a model with which the ambient sound is described exactly and future sound can be predicted. Neither the hider nor the seeker, i.e., the waveform transmitter and the intercept detector, respectively, has this model. They may, however, observe the ambient sound and approximate the exact model with their respective model estimates. The hider would use this model to design transmission waveforms and the seeker would use it to design a detector to detect anomalous sounds. It is important to note that their model estimates are not perfect and that they are likely imperfect in different ways. The missing part from the seeker’s model is what the hider can exploit. Ultimately, the hide and seek competition becomes one of modeling the background in which the one with the better model is more likely to outperform the opponent. Modeling capabilities continues to grow and characterizing the inaccuracies

5 of models by comparing them to reality in an effort to improve them is certainly extremely valuable and of interest to many scientists and engineers. This dissertation also presents an approach to developing a model for the environmental ambient sound via symbolic time series analysis with phase space representation, clustering, symbolization, and model parameter estimation. However, instead of investigating the inaccuracies and shortcomings of the model, this dissertation focuses on the ability to distinguish between two scenarios using the developed model, with the understanding the model is not perfect. With the presented model, it is possible to evaluate the steganographic security of a candidate waveform by measuring the Kullback-Leibler divergence between the model for the ambient background and the model with the waveform transmitted onto the ambient background, using the framework built during the background modeling process. This process is interpreted as assessing the similarity between the background with it perturbed by the waveform through the perspective of the hider’s modeling structure.

1.3.1 Steganography in Communications

Steganography is similar to cryptography in that it is used for information security, but different from it in that while cryptography focuses on making it difficult to decode the message, steganography endeavors to conceal the existence of the message. Combining the two could result in a more secure approach in practice, but the focus of this dissertation is better aligned with steganography alone. Simmons’ motivational scenario in [26] illustrates a typical steganography prob- lem. Alice and Bob are arrested accomplices in separate cells and they want to secretly communicate with each other using messages through a channel allowed by the warden. They need to make the messages seem innocuous while conspiring an escape plan, and the warden is monitoring the messages to intercept potential secret hidden message. This is a covert communications problem, and one of the variational scenarios is that the warden can alter the messages to disrupt any potential hidden commu- nication even though detecting and decrypting the messages may not be possible. The warden in this case is called an active warden and the distortion introduced is called an attack. A message with a hidden message embedded is called a stego,

6 and otherwise, a cover. The warden’s objective is two-fold; to detect the existence of hidden secret messages, and to decode them.

1.3.2 Steganography in Sonar

Applying this framework to sonar tracking needs to be done with caution. First, active sonar tracking is not a typical communications problem involving two parties, although it can be viewed as one where the transmitter and the receiver are the same and the goal is to confirm the receipt and the time of receipt of the message, not to decode an unknown message. Secondly, the warden does not receive the message as was sent by Alice. It will be distorted due to the one-way underwater propagation and interfered by other sources of natural sound which actually make up the cover. Thirdly, the message received by Bob may be distorted more severely due to the two-way underwater propagation, which can be thought of as an active warden scenario. Also, the warden is likely to see a much less distorted message than Bob. To translate this in terms of sonar, the objective of the platform is not com- munication, but detection of target echo, typically using a matched filter since the transmitted waveform is known to the platform. The target’s intercept detector is the warden, and the distortion due to the two-way propagation seen at the platform can be thought of as the warden’s attack. The ambient environmental sound is the cover, and the transmission from the platform is the hidden message, the two together make up the stego. One of the aspects of steganography that is only being investigated recently is adaptive steganography [27–29]. Stochastic and statistical steganography has been studied in [30–33], however, adaptive steganography for heterogeneous cover is still an active field of study. Underwater acoustic environment is highly non-stationary and heterogeneous. Furthermore, existing models for underwater sound are made on very large timescales [34], much larger timescales than what is typical for many sonar applications. Epistemological approach for steganography has been suggested for practical implementation of steganography and Böhme [19] points out the importance of modeling the cover and the empirical nature of such an approach. This perspective emphasizes the characterization framework of the cover and it also points out that

7 the security of real world implementation depends on the relative accuracy of the cover models of the hider and the seeker. Given perfect models, an information- theoretic approach can be shown to work at a desired level of security [35]. The modeling approach presented in this dissertation focuses on capturing the temporal characteristics of the underwater background sound. Spectral char- acterization achieves some aspects of it, but it does not capture the short-time local structures that repeat at random intervals and its coexistence with temporal events on longer timescales. With sonar detection and tracking applications in mind, capturing local structures on shorter timescales may prove to be useful. How various types of transmission waveforms is seen through this model is of particular interest.

1.4 Contributions

The following is a list of contributions of this dissertation

• Sound modeling approach that captures the time-evolving aspect of underwa- ter ambient acoustic environment

• Model-based steganographic security assessment approach that is independent of the type of intercept detector in the context of underwater sonar applications

• Application of the steganographic security assessment approach to realistic en- vironment and the investigation of effectiveness of various types of waveforms including mimic waveform

• A track detection approach for weak targets using a long timescale motion model designed with considerations for plausibility of typical marine vehicle motion

• A novel distance/similarity measure for time series

List of publications J. D. Park and J. F. Doherty, “Track detection of low observable targets using a motion model,” Access, IEEE, vol. 3. pp. 1408-1415, 2015. J. D. Park and J. F. Doherty, “A steganographic approach for covert waveform

8 design,” Military Communications Conference, 2016, MILCOM 2016, IEEE.

1.5 Dissertation Organization

Chapter 1 introduces the background and summarizes the contribution of this dissertation. Chapter 2 discusses the existing track-before-detect (TBD) approaches and their limitations for weak target track detection. An approach to improve upon the TBD approaches by utilizing a plausible long timescale motion model, a batch track detection approach is discussed in Chapter 3. In Chapter 4, the background on steganography is provided and this is further developed in the context of underwater steganographic sonar. Chapter 5 discusses a time series modeling approach that makes it feasible to assess the steganographic security. Chapter 6 combines the two main components of steganographic sonar tracking in a Monte Carlo simulation to demonstrate feasibility, and Chapter 7 concludes and discusses potential improvements on this dissertation.

9 Chapter 2 | Track-Before-Detect

Conventionally, tracking with pulsed active sonar has been done by making point estimates of the target position from each sonar transmission, followed by a process to connect the point estimates to form a track [1]. Point estimation of the target position is typically done by thresholding the normalized data and declaring detections on threshold-crossings. Given sufficiently high SNR of the target echo, this approach is reliable and efficient since the target echoes are more likely to survive the thresholding process and connecting only the detected target positions does not require large amount of computation. Low echo level, either due to reduced source level or low target strength, renders conventional pulsed active sonar detection and tracking less reliable. Detection of such targets requires lower threshold at the expense of more false detections. Lowering the threshold value not only makes the detection less reliable but also makes the process of forming the track more difficult and computationally inefficient since the track-forming process cannot rely on all of the threshold crossings to be coming from targets, and needs to further process the data to distinguish between target and clutter. Track-before-detect (TBD) approaches were devised to improve the reliability of tracking performance by retaining the sensor data across multiple transmissions and integrating the target echo energy along potential paths by making a number of assumptions on viable motion of the target. These approaches are assisted by using short timescale motion models, and have been used in many tracking applications [1,6,8–13,36–39]. For example, by assuming that the maximum speed is known, one can hypothesize a number of potential tracks and compute scores for each hypothesized track by applying the data to find the track with the highest

10 score [7]. Intuitively, TBD approaches process the collected data and search for the target path that is best supported by the data. By making assumptions for motion characteristics, TBD approaches can limit the search space for viable tracks and make decisions within the viable track space, making it computationally more feasible. TBD approaches are explicitly or implicitly embedding information regarding the target motion in the signal processing, thereby assisting the decision- making process and improving the performance. Dynamic programming and particle filter are the two main approaches for TBD.

2.1 Problem Definition

Consider the data with a set of time series recorded by an acoustic sensor from multiple active transmissions in which a target is present. The objective is to process this data to detect and track the target whose echoes are weak such that they are not reliably detected by conventional tracking methods. Each time series is denoted as a scan. A scan is matched filtered with the transmitted waveform or with the Doppler-compensated matched filter kernel if the radial velocity of the target is known or assumed. It is also considered to be basebanded so that the effects of the carrier frequency is removed and the focus is on the bandwidth of transmitted waveforms. The following sections will describe the measurement model and the motion model of the target.

2.1.1 Measurement Model

The measurement at each scan, indexed by m, is the recorded acoustic time series denoted as ym(t), where the argument t is the time since each transmission. The matched filtered and basebanded output of each scan, zm(t), can be expressed as the sum of basebanded autocorrelation function scaled due to propagation attenuation and delayed by the round trip time of the transmitted waveform, σRm(t − τ), and the filtered noise, vm(t).

zm(t) = σRm(t − τ) + vm(t) (2.1)

One of the properties of the basebanded autocorrelation function, correlation

11 width, wc, is defined as the 3dB width of R(t). This width is known to be inversely proportional to the bandwidth of the transmitted waveform [40]. This quantity will be used to further analyze the target motion. Note that the relative strength between Rm(t) and vm(t) is described by the scaling factor σ, which is determined by the sonar equations introduced in Chapter 1.

2.1.2 Motion Model

Target motion model, also referred to as kinematic model or dynamic model in literature, is the stochastic description of the target motion over time. Position and velocity are typical descriptors for the motion and they constitute the state of the motion. The probabilistic transition of the state is described by the motion model. In many of the literature including [1,36,41–43] in which the data is a series of 2-D radar or sonar images, the state of the target motion used is expressed as,

∆ T xm = [xm x˙ m ym y˙m] . (2.2)

The state consists of the position and the velocity in both x and y directions. The evolution of the state can be modeled as a linear stochastic process as,

xm = F xm−1 + um, (2.3)

  1 T 0 0      0 1 0 0  F =   , (2.4)    0 0 1 T    0 0 0 1 where T is the sampling period and the noise term um is also a stochastic process, typically sampled from a Gaussian process with covariance,

  T 3/3 T 2/2 0 0    2   T /2 T 0 0  Q = q   . (2.5) s  3 2   0 0 T /3 T /2    0 0 T /2 T

The preceding scalar, qs, is the intensity of the acceleration noise process in the spatial dimension. Dynamic programming utilizes this discrete state model and

12 particle filter uses a continuous-valued state model, but the underlying ideas of the models are equivalent.

2.2 Dynamic Programming

Dynamic programming, also well known for its application in sequence estimation as Viterbi algorithm for sequences modeled with hidden Markov models, is applied to TBD to find the overall maximum likelihood (ML) track by finding the maximum likelihood subsections of the overall track based on the observed data, recursively [36, 37,44]. For each new update, likelihood values are computed for each hypothesized subsection of the track that terminate at the same discretized state, or position. Subsequent subsections of the track are appended to the previously determined maximum likelihood track. The underlying state sequence is,

Xm = {x0, x1, ··· , xm} (2.6) and the sequence of observations is,

Zm = {z0, z1, ··· , zm} (2.7) and the goal is to maximize the conditional probability density function, or the likelihood function,

p(Xm|Zm) = p(x0, x1, ··· , xm|z0, z1, ··· , zm) (2.8)

ˆ which leads to the maximum likelihood (ML) estimate of the state sequence Xm|m expressed as, ˆ p(Xm|m|Zm) = max {p(Xm|Zm)} . (2.9) Xm

Exhaustive search in all possible Xm could yield the solution, but dynamic programming achieves it via recursive maximization by first rewriting (2.9) as,

  ˆ p(Xm|m|Zm) = max max {p(Xm|Zm)} . (2.10) xm Xm−1

13 For simplicity of notation,

I(xm, m) = max {p(Xm|Zm)} (2.11) Xm−1 and the iterative relation between I(xm, m) and I(xm+1, m + 1) allows the recursive computation of (2.9). Using Bayes’ theorem,

p(Zm+1|Xm+1)p(Xm+1) p(Xm+1|Zm+1) = (2.12) P (Zm+1) p(z |X ,Z )p(Z |X )p(X ) = m+1 m+1 m m m+1 m+1 (2.13) P (Zm+1) p(z |X ,Z )p(X |Z ) = m+1 m+1 m m+1 m (2.14) p(zm+1|Zm)

Since zm+1 only depends on xm+1, (2.14) is further simplified and developed into,

p(z |x )p(X |Z ) p(z |x )p(x |x )p(X |Z ) m+1 m+1 m+1 m = m+1 m+1 m+1 m m m . (2.15) p(zm+1|Zm) p(zm+1|Zm)

Now I(xm+1, m + 1) is expressed in terms of I(xm, m) as,

( ) p(zm+1|xm+1)p(xm+1|xm) I(xm+1, m + 1) = max p(Xm|Zm) (2.16) Xm p(zm+1|Zm) ( ) p(zm+1|xm+1)p(xm+1|xm) = max I(xm, m) . (2.17) x m p(zm+1|Zm)

Therefore, the ML track estimate is determined from

ˆ p(Xm|m|Zm) = max {I(xm, m)} . (2.18) xm where p(x0) is used for the initial condition for the iteration. One shortcoming of dynamic programming against weak targets is the fact its process is inherently sequential making it difficult to extend to longer integration time without loss of performance due to insufficient accumulation of target echo energy. However, increasing the integration time asymptotically approaches exhaus- tive search and the number of potential paths to consider grows exponentially. The computational inefficiency can be alleviated, as illustrated in [45], by utilizing a

14 better fitting motion model.

2.3 Particle Filter

Also known as sequential Monte Carlo method [46, 47], particle filter is a Monte Carlo based filtering approach used in many fields including molecular chemistry, computational physics, economics and mathematical science, and signal processing. For on-line tracking applications, particle filter estimates the latent dynamical states via estimation of the posterior density function based on the sequence of partial observations. [1,48] With the same measurement model and the motion model, particle filter ap- proximates the posterior distribution of the state by randomly sampling particles and associated weights and computing the state estimate, for each measurement. As the number of samples increases it approaches the optimal Bayesian estimate, however becomes computationally inefficient. As in dynamic programming, the two models, motion model and measurement model, form an HMM, but the observation space can be continuous. Starting from the sequences of states and observations,

Xm = {x0, x1, ··· , xm} , (2.19)

Zm = {z0, z1, ··· , zm} , (2.20) the full posterior distribution of sequence of states given the sequence of observations can be expressed as,

p(Zm|Xm)p(Xm) p(Xm|Zm) = R . (2.21) p(Zm|Xm)p(Xm)dXm

Particle filter estimates the posterior distribution as new observations are obtained with the following recursive relation,

p(zm+1|xm+1)p(xm+1|xm) P (Xm+1|Zm+1) = P (Xm|Zm) . (2.22) p(zm+1|Zm)

15 The marginal distribution also satisfies the following relations.

P rediction : Z p(xm|Zm−1) = p(xm|xm−1)p(xm−1|Zm−1)dXm−1, (2.23) Updating :

p(zm|xm)P (xm|Zm−1) p(xm|Zm) = R . p(zm|xm)P (xm|Zm−1)dxm

These distributions, however, are not easy to sample from directly. Therefore an alternative approximate distribution is used to sample from via importance sampling. The importance function of choice is the prior as,

m Y π(Xm|Zm) = p(Xm) = p(x0) p(xi|xi−1). (2.24) i=1

This can be decomposed into the following,

π(Xm|Zm) = π(Xm−1|Zm−1)π(xm|Xm−1,Zm) (2.25) which allows for recursive evaluation of the importance weight, or the likelihood ratio associated with each ith particle,

(i) (i) (i) (i) (i) p(zm|xm )p(xm |xm−1) wm ∝ wm−1 (i) (i) . (2.26) π(xm |Xm−1,Zm)

With (2.24), the expression (2.26) becomes,

(i) (i) (i) wm ∝ wm−1p(zm|xm ), (2.27) and the normalized weights are,

w(X(i)) w˜(i) = m . (2.28) m PN (j) j=1 w(Xm )

These weights serve as the approximation of the filtering probability density p(xm|Zm) from which underlying state is estimated. Finally, the track state estimate can be computed, for example, by taking the mean of the state from all the particles.

16 Particle filter as an approach for TBD can achieve more reliable tracking performance against weak targets by retaining more information from the data through the sequence of estimated posterior distribution function rather than point estimates as done in conventional tracking [1]. However, it is still inherently a sequential approach and suffers from the same problem as dynamic programming due to insufficient accumulation of target echo energy from small number of state updates. Increasing the integration time requires estimating the joint posterior distribution, and the number of particles required to estimate the joint distribution, (2.21), grows rapidly with the number of iterations [48].

2.4 Role of the Motion Model

The motion model describes the evolution of the dynamic state as well as what is called the noise process which actually drives the evolution of the overall motion. This process, denoted as u in section 2.1.2, is typically modeled as a zero-mean Gaussian process in most TBD methods, and assumes little structure about the target motion evident by the fact it is referred to as the noise process [1]. The lack of structure in motion model is beneficial in the context of sequential dynamic state estimation since it allows the tracking filter to search in all directions during each state update. However, when the search for state sequence extends beyond a few iterations, the number of unique state transition sequences that can be generated by the motion model grows exponentially with the number of iterations. This implies the number of hypothesized paths, thus the number of likelihood evaluations, also grows exponentially, rendering direct application of TBD methods to lower SNR targets computationally infeasible. Many of the hypothesized paths, however, are not plausible. Majority of the motions generated by the sequence of accelerations sampled from a Gaussian process are highly fluctuating and would be associated with a sequence of abrupt acceleration changes. Intuitively, a motion of this nature may be considered complex. In contrast, a motion with consistent or slowly changing acceleration would result in a relatively simple motion. If the true target motion were characteristically simple, the likelihood evaluated with most complex motion hypotheses would be low. This inefficiency can be avoided by adopting a more plausible motion model with appropriate assumptions on the motion structure. Details of such a motion

17 model will be discussed in the following chapter.

18 Chapter 3 | Weak Target Track Detection

With sonar waveforms using lower transmission levels to meet the steganographic security requirement, track-before-detect (TBD) methods may become computa- tionally infeasible to achieve desired detection performance. Therefore, a novel track detection approach that uses a long timescale motion model is presented and its performance will be shown in comparison with existing TBD methods. After transmission from the platform, the waveform scattered off the target and propagated back to the platform is attenuated even more than what would have been already small when seen at the target. This level is approximated as 2 a positive scaling factor, σP ≈ σT , where σT is typically small in order to meet the steganographic security constraint. The platform is interested in detecting and tracking this weak target. The recorded data for each transmission segment is denoted as a scan with index m. The motion of the target, τ, is the sequence of the round trip times of the waveform pulse, τm’s, and the travel times depend on the speed of sound, c, and the ranges between the platform and the target, rm’s, yielding τm = 2rm/c. For reliable detection of a weak target, the echo energy needs to be integrated along the target track for a long duration. The required integration duration depends on the range as well as the steganographic security constraint. Longer range to the target results in larger difference between the signal level observed by the target and by the platform, and requires larger processing gain via longer integration time to overcome the difference.

Each scan, ym(t), is the sum of the target echo and the ambient sound, and zm(t) is the matched filtered and basebanded output, where t is the elapsed time

19 since each transmission.

ym(t) = σP sm(t − τm) + pm(t) (3.1)

zm(t) = σP Rm(t − τm) + vm(t). (3.2)

The autocorrelation function of sm(t) of the processed scan m, Rm(t), has a correlation width, wc, defined as the 3dB width of R(t), which is inversely proportional to the bandwidth of the transmitted waveform [40]. This width is considered the tolerance for misalignment when integrating target echo energy across multiple scans. The last term in (3.1), pm(t), is the ambient sound and vm(t) in (3.2) is the cross-correlation function of pm(t) and sm(t). Single-scan detection-based conventional tracking is unreliable for weak targets and TBD approaches using short timescale motion models become computationally infeasible since the search space grows exponentially with number of scans [1,36,39]. An approach to overcome this problem is to devise a long timescale motion model that can be utilized for more efficient path search and track detection of weak targets.

3.1 Long Timescale Motion Model

Integrating enough target echo energy for reliable detection of target track can be done by extending the integration time [39]. However, taking the motion model designed for tracking with sequential state updates on a short timescale and extending it to longer timescales can make the approach computationally infeasible [36]. A motion model that captures the physical characteristics of target motion may improve the search efficiency and achieve more reliable tracking performance [49]. Typical motion model used in TBD applications does not describe the charac- teristics of acceleration [1]. The noise term, u, in section 2.1.2 describes the random fluctuations of target position and uses the covariance matrix, Q, to implicitly characterize the effect of the hidden acceleration process. Considering the fact that observed motion over time is determined by the underlying acceleration process, it is important to capture the physical characteristics of acceleration in the motion model.

20 Motion models with consideration for physical plausibility have been proposed and applied to tracking maneuvering targets. An acceleration-based model by Singer [49], and turn-rate-based model as an extension of Singer’s model [50] showed improvement in tracking performance against their respective targets of interest. The assumption for Singer’s acceleration-based model is that the maneuvering pattern of manned target consists of constant velocity for the majority of the time with occasional turns. This idea is represented by using a combination of uniform distribution and specified probability masses, P0 and Pmax, for no acceleration and maximum acceleration, respectively. The main observation made in this dissertation on marine vehicle is that most of them exhibit smooth [51,52] motion due to infrequent and small accelerations on the timescale of sonar tracking. The motion model presented in this dissertation is a hidden Markov model which considers the radial acceleration of the target as the hidden dynamic state with the stochastic matrix describing the transition between the states. The state represents the direction and magnitude of the acceleration which is sampled from the Laplace distribution, also known as the double exponential distribution [53], with varying parameter values for each state. The acceleration is also applied according to Gaussian kernel over the span of each state segment. As a result, the overall smoothness of the sampled motion is more physically plausible for the targets of interest. A comparison of the distribution functions for acceleration is shown in Figure 3.1. The sampled motion does not, however, describe the position of the target for reasons that will become clear in the following section, instead describes the overall shape of the motion, or the evolution of the relative positions. Initial dynamic states are radial velocity and radial acceleration both of which are randomly sampled from uniform distribution with plausible support. The track shape over the observation time is determined with Newtonian dynamics computed from the sampled acceler- ation process from the acceleration-based hidden Markov motion model. (3.3) is the relation between observed motion and the hidden state, acceleration. This is consistent with the motion model framework introduced in Chapter 2.

dv d2r a = = (3.3) dt dt2

Figure 3.2 shows a collection of motion samples generated with the Singer’s

21 0.5 proposed model 0.45 Singer model

0.4

0.35

0.3

(x) 0.25

X f p 0.2 0

0.15 p p max max 0.1

0.05

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 x (m/s2)

Figure 3.1. PDFs of acceleration, Singer’s model and the acceleration-based HMM motion model. This is a good model for tracking targets making frequent abrupt changes in direction, but most marine vehicles do not exhibit such behavior. Figure 3.3 shows a collection of motion samples from the acceleration-based HMM and it exhibits generally smooth and simple overall motion relative to those from the Singer’s model.

Figure 3.2. Sample motions from the Singer’s motion model

22 Figure 3.3. Sample motions from the acceleration-based HMM

3.2 Long Timescale Batch Track Detection

Since the acceleration-based HMM generates samples of long timescale motion associated with the entire observation time, evaluation of likelihood for a sample motion given the recorded data is no longer required to be sequential. Consider a collection of recordings from multiple transmissions with a weak target maneuvering as seen in Figure 3.4. Given this true target motion, a candidate motion represented as a sequence of acceleration and velocity for each transmission allows for Doppler compensation and alignment of target echo energy across the transmissions as shown in Figure 3.5. Instead of processing each scan with banks of Doppler compensated matched filters and choosing the highest likelihood state for each iteration, the likelihood of the entire motion is evaluated as a whole by batch processing all of the data with each of the hypothesized trial motion. The trial motion associated with the highest output is chosen as the estimated motion if the aligned target echo energy peak is large enough such that it is unlikely to have occurred by random clutter energy accumulation. The final track estimate is shifted by the location of the peak. Since the processing for each candidate motion is independent from others, the entire collection of transmission echoes can be processed in parallel by architectures such as graphical processing unit (GPU). The overall approach can be considered a form of informed random search akin to the approach used for efficient motion planning [54]. If the target echo energy is aligned well, as shown in Figure 3.5 and Figure 3.7. (c), the integrated peak as in

23 0 N m=1 m=2

m=M scans of # time since Tx Target position

Figure 3.4. Entire observed acoustic data with the target motion depicted as the curved dashed line.

0 N m=1 m=2

m=M scans of # time since Tx Motion compensated target position

Figure 3.5. Aligned observation using the motion information

24 0 N m=1 m=2

m=M scans of # Partially compensated time since Tx Motion compensated target position

Figure 3.6. Processed with incorrect motion

Figure 3.7.(f) is considered strong evidence for having made the correct hypothesis T for target motion, τ = [τ1τ2...τM ] for M scans. For large M, it is unlikely for clutter energy to be concentrated into a small number of range cells. In contrast, if a trial motion is similar enough to the true target motion, relative to the correlation width wc as seen in Figure 3.7.(b), it may still produce a peak with high enough contrast, as seen in Figure 3.7.(e), indicating a track detection. The batch track detection algorithm consists of three steps. First, using the acceleration-based HMM, sample a specified number, N, of trial overall motions, ξ(n), n = 1, ..., N. Secondly, process the entire data of M scans with each of the overall trial motion. Lastly, find the trial motion that yields the maximum integrated target echo energy peak. A detection is made if the peak is larger than a specified threshold determined based on the noise level, M, and tolerable false detection level. Since the motion model generates only the shape of the motion, the final track estimate needs to be referenced to where the integrated energy peak is located. The peak is the maximum of the integration of the aligned matched filtered output.

25 (a) (b) (c)

푤푐

(d) (e) (f)

Figure 3.7. Well aligned autocorrelation peaks produces higher contrast integrated peak as seen in (c) and (f)

Restating the processed scan from (3.2),

zm(t) = vm(t) + σP Rm(t − τm), m = 1, ..., M, (3.4) the motion compensated and integrated output is still a function of time, t, as,

M (n) X (n) (n) o (t) = vm(t + ξm ) + σP Rm(t − τm + ξm ). (3.5) m=1

The output peak is the maximum value of (3.5),

( M ) (n) X (n) (n) O(ξ ) = max vm(t + ξm ) + σP Rm(t − τm + ξm ) , (3.6) m=1

∗ and the track shape estimate is ξ(n ), where n∗ is expressed as,

n∗ = argmax O(ξ(n)). (3.7) n

26 Finally, the track estimate (3.9) is referenced to the peak location as (3.8)

(n∗) τref = argmax {o (t)} (3.8) t

(n∗) τˆ = ξ + τref (3.9)

The success of the batch track detection approach relies on the assumption that aligned and integrated target echo energy will produce a high-contrast peak distinguishable from random high peak from clutter energy integration. The performance depends on how well the motion model generates a trial motion that is similar enough to the actual target motion. One of the benefits of batch processing is its robustness against fluctuating target levels across scans. For example, as shown in Figure 3.6, the target echo energy from each scan is aligned with part of the candidate trial motion that compensates for the true target motion while other range bins are integrating random clutter energy. If the SNR is high enough for the section that are aligned, it will be integrated along the true target position to produce a high-contrast peak. As a note on motion complexity, in contrast to motions depicted in Figures 3.5 and 3.6, a typical motion sample generated by Singer’s model is likely to generate a complex motion shape associated with large accelerations. For a predetermined number of trials, the likelihood of such model generating a smooth motion similar to the true target motion is small, or alternatively, it would take many more trials to generate a motion sample with smooth characteristics. Therefore, capturing the overall physical characteristics in the motion model is a critical factor in terms of processing efficiency. Assuming that the complexity of target’s motion is constrained, the performance of the batch track detection approach can be predicted based on the number of trial motions and the distribution of similarity among the hypothesized motions generated by the motion model. Motion similarity is discussed in detail in section 3.4.

3.3 Comparison with TBD Methods

The simulation set-up for comparing the batch track detection (BTD) algorithm to the two TBD methods, dynamic programming (DP) and particle filter (PF), is as follows. For each trial, the true overall target motion denoted as τ, of length M, is

27 sampled from the acceleration-based motion model. Target’s echo energy, scaled by th σP , is added in the target position, τm, of each m scan. The matched filtered and basebanded time series data is provided to each of the algorithms, DP, PF, and the BTD. Each algorithm produces an estimate of the track, τˆ, and it is compared to the true track, τ. The performance of tracking algorithms are presented as a function of target

σP SNR, expressed as 20 log , where σN is the ambient noise level. The performance σN metric is the probability of success, where success is defined as when the root mean squared error (RMSE) between τˆ and τ is less than the specified threshold, τ . As seen in Figure 3.8, the performance of BTD algorithm is higher than those of DP or PF at lower SNR values. The performance gap is larger for longer tracks, M = 400, dotted lines. This illustrates how the BTD approach can perform reliably as long as the total integrated energy is high enough to yield a high-contrast peak, while TBD methods begin to fail once the SNR falls below a critical value.

Probability of track detection, e :5 D:1000 rmse 1

0.9

0.8

0.7

0.6

0.5

P(Success) 0.4

0.3 DP(M:200) PF(M:200) 0.2 BTD(M:200) DP(M:400) 0.1 PF(M:400) BTD(M:400) 0 -40 -38 -36 -34 -32 -30 -28 -26 -24 SNR (dB)

Figure 3.8. Track detection performance with varying SNR. Pulse duration: 1000

28 3.4 Motion Similarity

Measuring motion similarity is a crucial component for characterizing the perfor- mance of the batch track detection approach. For motion analysis in the context of this approach, a useful similarity measure is one that correlates well with the integrated autocorrelation peak when the data is processed with a trial motion, given a true target motion. As demonstrated in Figure 3.7, the loss of peak contrast due to misalignment of the autocorrelation functions is associated with the difference between the true target motion, a, and the trial motion, b, being used to process the data. How to measure the difference between the two motions is the question of measuring similarity between the two vectors that represent motions. Euclidean distance is a widely used distance metric, which is the L2 norm of the difference of two vectors, e, expressed as

v v u M u M u X 2 u X 2 kek2 = t em = t (am − bm) = ka − bk2. (3.10) m=1 m=1

From (3.10), it can be seen that a large partial difference can make a big contri- bution to the distance metric. However, in the context of target echo integration in

(3.5) when the difference is larger than the autocorrelation width, wc, there is no further loss of contribution to the peak contrast. A similarity measure reflecting this nature of signal processing is expressed as

( M ) ( M ) ∆ 1 X 1 X s(a, b) = max R(am − bm − γ) = max R(em − γ) . (3.11) γ γ M m=1 M m=1

The shifting parameter γ is to account for when the most frequent value of em is not 0. The function R(t) in (3.5) is the autocorrelation function of a band-limited

transmission waveform, but as long as the effective width of R(t) is wc, it serves its purpose. The 3dB width of 1 R(t) = (3.12) 1 + (ct)2

29 1 is c . The similarity measure with this function is

( 1 M 1 ) ( 1 M 1 ) s(a, b) =∆ max X = max X . γ 2 2 γ 2 2 M m=1 1 + c (am − bm − γ) M m=1 1 + c (em − γ) (3.13) As a reference, a comparison of a typical autocorrelation function for a band- limited waveform and (3.12) is made in Figure 3.9.

푤푐

Figure 3.9. Rxx(t) is the autocorrelation function of a band-limited transmission waveform, and R(t) is a function with the same 3dB width.

3.5 Motion Model Homogeneity

In this section, an analytic expression of performance for the batch track detection method will be presented. The performance is dependent on SNR or σP , on the duration of motion or number of scans, M, and on the motion model homogeneity, h, the parameter that describes the typicality of sampled target motion in comparison

30 to other trial motions. Large h value implies a sampled target motion is more likely to find other trial motions similar to it. This quantity can also be interpreted as the typical simplicity of a sampled target motion. While simple motions requires relatively small number of descriptors for representation, complex motions require more descriptors and are therefore high-dimensional, which makes it less likely to find other similar motions. With respect to the proposed batch track detection method, it is more likely to find a trial motion similar enough to a simple target motion, given a fixed number of trials. Signal to noise ratio, or SNR, defined as 20 log σP is a function of target echo σN level observed at the platform, σP , and the ambient noise level, σN . The number of scans is denoted as M, and the transmission pulse duration as D. Therefore, 2 the total integrated target energy can be computed as σP × M × D. As described in section 3.1, the motion model, denoted as H, is the probabilistic description of the target motion. Each sampled target motion, τ, is associated with parameter λ, and the conditional distribution function of similarity to other trial motions will be modeled as a truncated exponential distribution function expressed as λ exp−λs f (s|τ) = , 0 ≤ s ≤ 1. (3.14) S 1 − exp−λ The parameter λ is dependent on the typicality of the sampled target motion; negative λ value indicates other trial motions are likely to be similar to the sampled target motion, τ, as shown in Figure 3.10, and positive λ value indicates other trial motions are likely to be dissimilar to the sampled target motion. The motion model, H, with respect to the distribution of similarity for different target motion samples, is also modeled as a truncated exponential distribution function with the homogeneity parameter, h. With λmax as the limit for homogeneity, the distribution function of λ is expressed as,

h (1− λ ) he 2 λmax fΛ|h(λ) = h , −λmax ≤ λ ≤ λmax (3.15) 2λmax(e − 1)

A motion model with positive homogeneity, h > 0, is when there are more true target motion samples associated with negative λ value. In practice, homogeneity is likely to decrease as M increases since the complexity of the overall motion tends to grow with M. The observation and the assumption made in this dissertation is

31 that positive homogeneity value is reasonable for motions of marine vehicles.

3

2.5

2

(s) 1.5

S f

1

0.5

0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 s

Figure 3.10. Distribution function of similarity to a true target motion with associated parameter value λ = −2.

3.6 Batch Track Detection Performance Prediction

The performance metric considered is the probability of successful track detection,

PTD. The criterion for success is defined in terms of the root mean squared error (RMSE), which makes the comparison with other tracking algorithms straightfor-

ward. Given the true track, τ, an estimated track τˆ, and a threshold, τ , it is considered a success when,

v u M u X 2 RMSE(τ, τˆ) = t (τm − τˆm) ≤ τ (3.16) m=1

There are certainly differences between the similarity measure and RMSE as discussed in section 3.4, but the mapping between the small-value region in RMSE and the near-1 region in similarity is close to one-to-one. In terms of the threshold,

τ ' 0 in RMSE corresponds to 1 − s ' 1 in the similarity measure, where 1 − s

32 is the threshold in similarity. In this analysis, it is considered a success when the maximum peak value associated with trial motions with s ≥ 1 − s is greater than the maximum peak value associated with trial motions with s < 1 − s. Consider a trial motion with similarity S = s. The integrated peak of the target echo energy as a random variable, O(s), associated with this trial motion is expressed as a parameterized Gaussian distribution,

2 σN O(s) ∼ N(Ms, M 2 ) (3.17) σP

The output peak values indexed by the sample, On, n = 1, 2, ..., N, are parti- tioned into two groups; A = {On|sn ≥ 1 − s} and B = {On|sn < 1 − s}. The number of samples in each group are NA = |A|, NB = |B|. It is considered a success when the maximum of A is larger than the maximum of B, assuming a trial motion with similarity to true motion close to 1 will yield a reliable integrated target energy. For notational convenience, the maximum output values are denoted as Amax = max {A} and Bmax = max {B}. Therefore, with the detection threshold

O, the probability of success is,

∆ PTD = P rob{Amax > Bmax,Amax > O}, (3.18)

and sufficiently large M will likely ensure Amax > O. The cumulative distribution functions (CDF) of maximum value of random variables with different underlying distribution functions, assuming independence, is expressed as the product of individual CDFs with different parameter values sn,

N YB FBmax|λ (o) = FOn|Sn (o(sn)). (3.19) n=1

N Y FAmax|λ (o) = FOn|Sn (o(sn)), (3.20) n=NB +1

The arguments of (3.20) and (3.19) are the CDFs of (3.17), and sn values are determined by N and λ associated with the true target motion as

1 1 ! sn = log n −λ . (3.21) λ 1 − N (1 − e )

33 Note that NA and NB = N − NA can also vary with different true motions and

associated similarity distribution function, (3.14), and the spacing between sn’s are

inversely proportional to the PDF in Figure 3.10. Those with sn ≥ 1−s contribute

to (3.20) and those with sn < 1 − s contribute to (3.19). The conditional PDF of

Amax for a given λ value is

dFA (o) f (o) = max|λ . (3.22) Amax|λ do

Probability of successful Track Detection, PTD, for a given σP and λ, using (3.18), (3.20), (3.19), (3.22), is expressed as

Z

PTD(σP , λ) = fAmax|λ (o)FBmax|λ(o)do. (3.23) O

Finally, using (3.23) and (3.15), the probability of success that only depends on

σP is expressed as

Z Z

PTD(σP ) = fΛ|h(λ)fAmax|λ (o)FBmax|λ(o)dodλ. (3.24) Λ O

Figure 3.11 compares the performance from the simulation with the prediction. The difference in the shape comes from the fact the modeled PDF of similarity does not exactly match the actual distribution of measured similarity. However, the overall performance and its SNR dependence is captured by the prediction model.

34 Probability of track detection, e :5 D:1000 rmse 1 Sim,M:200 0.9 Model,M:200 Sim,M:400 0.8 Model,M:400

0.7

0.6

0.5

P(Success) 0.4

0.3

0.2

0.1

0 -50 -45 -40 -35 -30 -25 -20 SNR (dB)

Figure 3.11. Track detection performance comparison between simulation and model. Solid lines are simulation and dashed lines are model. Black lines are when M = 200 and Blue lines are when M = 400.

35 Chapter 4 | Steganography

Steganography, a term from the Greek origin meaning “covered writing”, is the practice of concealing information in plain sight [18]. It is similar to cryptography in that it is used for information security, but different from it in that while cryptography focuses on making it difficult to decode the message, steganography endeavors to conceal the existence of a message. Although combining the two would result in a more secure approach in practice, the focus of this dissertation is better aligned with steganography alone. Simmons’ motivational scenario in [26] illustrates a typical steganography prob- lem. Alice and Bob are arrested accomplices in separate cells and they want to secretly communicate with each other using messages through a channel allowed by the warden. They need to make the messages seem innocuous while conspiring an escape plan, and the warden is monitoring the channel to intercept potential adverse messages. This is a covert communications problem, and one of the variational scenarios is when the warden can alter the messages to disrupt any potential hidden commu- nication even though detecting and decrypting the messages may not be possible. The warden in this case is called an active warden and the distortion introduced is called an attack. A message with an embedded hidden message is called a stego, and a cover, otherwise. The warden’s objective is two-fold; to detect the existence of hidden secret messages, and to decode them [18]. Detection of anomalous messages can be difficult if the warden does not have adequate information about the cover and the stego. To simplify the task, normal exchange of messages between Alice and Bob may be restricted to be on certain topics and any other topics may be prohibited. Alternatively, the warden may be

36 looking for a specific keyword that triggers the intercept detector. If Alice and Bob knew the warden’s detection scheme, they can use analogies to forgo the prohibited topic or use secret codes to circumvent the usage of trigger keywords. However, without this knowledge, hiding the escape plan can be equally difficult. Assumptions on the cover, ability to distinguish between cover and stego, ability to embed secret messages, and how these factors relate to underwater sonar tracking are discussed in this chapter. Transmitting sonar waveforms that are difficult to distinguish from underwater ambient background is the goal. While many of the elements in steganography can be transferred to sonar, some practical considerations, e.g., dynamic background and unknown waveform strength seen at the target, make the problem difficult to solve as well as interesting.

4.1 Information Hiding and Covert Waveform

One of the commonly used information hiding techniques for audio and image files is the least significant bit (LSB) manipulation [55–57]. The message is embedded in the cover by manipulating the LSBs of the digital data to minimize the perceptual change between the cover and the stego. The receiver only has to work with the LSBs to decode the embedded message. For most digital representation of images and audio with messages of modest size, the overall change is imperceptible by seeing the image or listening to the audio. Detection of the message requires knowledge of the encoding scheme unless one had access to the original file to make a direct comparison. Applying typical steganographic methods such as LSB manipulation to under- water sonar tracking needs to be done with caution. First, active sonar tracking is not a typical communication problem involving two parties, although it can be viewed as a variant where the transmitter and the receiver are the same and the goal is to confirm the receipt and estimate the time of receipt of the message, not to decode an unknown message. Secondly, the warden does not see the message as was sent by Alice. It will be distorted due to the one-way underwater propagation and interfered by other sources of natural sound which actually make up the cover. Thirdly, the message seen by Bob, the receiver, may be distorted and more severely attenuated due to the two-way underwater propagation, which can be thought of as the active warden scenario and the warden is likely to see a much less distorted

37 message than Bob. To translate these into sonar terms, the objective of the platform is not commu- nication, but detection of target echo. The target’s intercept detector is the warden, and the distortion due to the two-way propagation observed at the platform is seen as the warden’s attack. The cover is the ambient environmental sound, and the hidden message is the transmission from the platform, the two together make up the stego. Another critical difference between image/audio steganography and underwater sonar is the variability of the cover. Underwater ambient sound is constantly evolving stochastically due to numerous and diverse sound sources. Many of the steganographic methods developed for image and audio files are devised under the assumption that the cover digital file does not change significantly once it has been created. Approaches such as LSB steganography rely on the fact that most of the cover is intact while only a small portion of the file is intentionally and specifically altered with a sophisticated encoding scheme. In sonar, the particular background sound heard at the time of transmission from the platform will be different from the background sound heard when the transmitted waveform impinges on the target. The transmitter cannot predict the background sound that will be heard at the target to make subtle changes where it would best hide them. The statistical characteristics of the ambient sounds may be similar, but they are highly unlikely to be a recorded copy of the other. Steganographic methods used in such scenario as spread spectrum [58] can improve robustness, but there is an inherent correlation between survivability and detectability of the embedded message [18]. While minimizing the overall induced change to make the stego less conspicuous is the key idea for most information hiding techniques, a steganographic scheme without considerations for the cover in which it is hiding can be fragile and may have limited hiding capacity [59]. Existing methods for covert radar and sonar, also referred to as low probability of intercept (LPI), such as direct-sequence spread spectrum and frequency hopping suffer from this problem.

38 4.1.1 Spread Spectrum

Conventional sonar systems employed a narrowband pulsed waveform for detection and tracking [15]. Initial intercept detectors, therefore, employed narrowband filters to efficiently detect and track these waveforms. Knowing the structure of potential intercept detectors, spreading the waveform energy in the spectral domain is the natural path to evade detection. While there are several approaches for spectral redistribution of waveform energy, most of them are categorized as either frequency hopping or direct-sequence spread spectrum. Hollywood actress publicly known as Hedy Lamarr (or Hedy Kiesler Markey) and composer George Antheil are considered the inventors of the first frequency hopping spread spectrum communication method as the first patent awardees in 1942 [60]. Entitled “Secret Communication System”, it was intended for remote torpedo control. The major advantage of frequency hopping method is its robustness against potential jamming attempts to disrupt covert communications. However, since it is still a sequence of narrowband waveforms with varying carrier frequency it is relatively easy to detect compared to direct-sequence spread spectrum [61]. Its application to steganography and sonar include [24,62–65]. Direct-sequence spread spectrum is performed by modulating the message signal with pseudo noise (PN) sequence to achieve the bandwidth close to that of the PN sequence itself. The resulting noise-like waveform and its spectral characteristics make this approach widely accepted for communications [25,66–68] as well as image and audio steganography [31,55,69–71] Other applications in radar and sonar use noise-like chaotic waveform without modulating a message-bearing signal [21,72–74]

4.2 Adaptive Steganography

Spectral spreading of the waveform energy is an effective scheme to avoid detection for many applications. However, it may not be the most secure method in all envi- ronments. To use animal camouflage as illustrative examples, the striping pattern of zebras as seen in Figure 4.1 is understood to be effective against its predator and its parasite for its visual pattern causing motion confusion or camouflage [75,76]. However, this pattern is unchanged regardless of where it resides, making its effec- tiveness as camouflage variable depending on the background. Furthermore, it may

39 be effective against some of its predators, but not against humans, who can clearly see them stand out from the background. On the contrary, octopus can alter its skin pattern and color to match its surroundings with high fidelity, rapidly [77]. Figure 4.2 shows an octopus and compared to it camouflaged into the background which is very plausible in the given environment. The ability adapt to its surroundings quickly and to a wide range of environments is made possible due to its ability to sense color and the fine texture details of the background through chromatic aberration and u-shaped pupil [78].

Figure 4.1. The striping pattern of zebras, while effective against its predator and parasite, do not adapt to the environment, and are easily recognized by humans.

It is intuitively clear that the ability to understand the environment and adapting to it is critical to effective steganography. Spread spectrum approaches are effective due to its lack of exploitable structure and its similarity to unstructured noise. Arrival of spread spectrum waveforms appears as increased level of noise, which is plausible in many situations where it is natural for the noise level to fluctuate. The amount of natural fluctuation of the ambient noise level becomes useful information to exploit and adjusting the transmission level using this information can be considered a simple case of adaptive steganography. Adaptive steganography is an area that is only being investigated recently [27–29]. Stochastic and statistical steganography has been studied in [30–33], however, adaptive steganography for heterogeneous cover is still an active field of study. Underwater acoustic environment is highly non-stationary and heterogeneous. Furthermore, due to the cost and difficulty of collecting underwater data, existing

40 Figure 4.2. Octopus can blend into the environment with very high fidelity making it very difficult for its predators to recognize it. models for underwater sound are exploratory [34] and less sophisticated for purposes of steganography.

4.3 Epistemological Steganography

In abstract ideal settings with known cover models, discussions on model-based steganography provide insights on many elements of steganography including limits on security of steganographic systems and steganographic capacity, the amount of information that can be hidden without being detected [79]. The information- theoretic bound for steganographic security defined by Cachin as the Kullback- Leibler divergence, or the relative entropy, between the cover model and the stego model [35,80] is widely accepted in literature [19,81–86]. Results on steganographic capacity include the entropy of the cover model from an information-theoretic perspective [79], the square root law as a heuristic rule in practical steganography [87,88], and results similar to Shannon’s channel capacity [89–91]. The difference in these results can be attributed partly to the assumptions on the cover model. The indeterminacy as well as the temporal variability of the cover is a key factor of steganography that makes it possible and difficult in practice at the same time [84]. That is, if everything is known about the cover, small changes will be

41 detected and secure steganography is impossible. Gaps in knowledge about the cover is what makes hiding information in it, with plausible changes, possible and the amount of gap determines the amount of information allowed to be hidden. Epistemological approach for steganography has been suggested for practical implementation of steganography and the importance of modeling the knowledge about the cover and the empirical nature of such an approach is emphasized [19,85]. This perspective emphasizes the characterization framework of the cover and it points out that the security of real world implementation depends on the relative accuracy of the cover models of the hider and the seeker. The cover model estimated by the hider and the seeker may describe different aspects of the cover with different levels of accuracy. Given the true ideal cover model, P , consider estimated models, Pˆh and Pˆs by hider and seeker, respectively.

Some aspects of P may be described by both Pˆh and Pˆs, while there may be aspects of P only described by one or none of them, as shown in Figure 4.3.

푷 푷�풉 푷�풔

푷�풉 − 푷�풔 푷�풉 ∩ 푷�풔 푷�풔 − 푷�풉

Figure 4.3. Venn diagram of ideal cover model and estimate models by the hider and the seeker

Superiority of the estimated cover model will likely be an advantage and measur- ing the absolute accuracy of the estimated model is certainly of practical interest. However, with limited computational resources preventing perfect models for both

42 the hider and the seeker, investigating the subjective, or perceived, difference be- tween model and observation is an illustrative process for practical steganography, and is more of interest to this dissertation.

4.4 Steganography in Sonar

SONAR is originally an acronym for SOund Navigation And Ranging. Passive sonar is simply listening to the sound without any transmission from the platform, typically trying to detect sound made by marine vehicles. Active sonar involves sound transmission by the platform, and it is successful when the platform can detect and track the echo from targets of interest. A constraint on covertness, or steganographic security, is imposed to the design of the transmit waveform so that it is difficult to distinguish from the background environmental sound. Passive sonar may be more secure, but estimating the range to the target with passive sonar is significantly more unreliable and may require maneuvering for triangulation [92,93]. As the literature in steganography points out, steganographic security is achieved by understanding the cover and utilizing the knowledge for hiding the messages [19]. In the context of sonar, characterizing the underwater environmental sound and designing the transmission waveform that is difficult to distinguish from it aligns well with this approach. One experienced in sonar recognizes the diversity of underwater acoustic envi- ronment. Depending on the sea state, water depth, sound velocity profile, bottom type, and the distribution of marine biology, the acoustic background can vary sig- nificantly from one environment to another. Conventional sonar tracking methods were designed heavily based on the knowledge on target scattering and underwater acoustic propagation, which may be less affected by environmental changes. Once steganographic security becomes one of the considerations, environmental charac- terization and modeling can play a crucial role in determining the security of the transmission waveform. Heterogeneous cover may provide higher security for the hider since it is more difficult to model compared to homogeneous cover [84]. The hider may be at an advantage when both parties are unfamiliar with the cover. Adaptive steganography is necessary for heterogeneous and non-stationary cover such as the underwater environment. For example, deep water acoustic environment has been known to be

43 well modeled with Gaussian distributed noise with a spectrum known as the Wenz curve shown in Figure 4.4 [94, 95]. However, in shallow water, the environment is more impulsive with heavier tails that are due to diverse sources of biological sounds such as a swarm of snapping shrimp, and hydrodynamic activities such as rain drops and breaking waves from the surface and bottom interaction effects. Prior work on covert sonar [21] assumed chaotic transmission waveform and energy detector for the intercept detector, while the platform employed a matched filter. The background was assumed to be Gaussian. With feasible range defined as the range at which the platform and the target achieve equivalent detection performance, it was shown that feasibility was only limited to very close range and therefore impractical for typical marine vehicle encounter. The main conclusion was that the matched filter processing gain advantage at the platform against the energy detector at the target is not enough to overcome the two-way propagation loss compared to the one-way loss seen at the target. To overcome the difference in propagation loss the waveform duration would have to be impractically long. Follow-on work [74,96,97] showed that if the background is more heavy-tailed, the feasible range can be extended. For all of these works, ambient background models were relatively simple with a few parameters such as variance and kurtosis and were assumed to be spectrally white. The focus of these works were on detection performance characterization of energy detector and matched filter. More sophisticated models may be capable of capturing more details, and this dissertation focuses on characterizing the time-dependent evolution of the underwater ambient sound. For example, breaking waves on the surface can cause quasi-periodic fluctuations in sound level and this can be modeled with Gaussian noise with time-varying variance. Another example may involve a swarm of snapping shrimp [98]. The clicks generated by the shrimps’ claws can occur at different distances from the location of observation with various intensities by different individual shrimp. The distribution of shrimp can be modeled as a Poisson point process and the collection of clicking sounds arriving from each shrimp can be modeled as a random process with parameters such as swarm density. The underlying idea for the proposed modeling approach is to capture the time- evolving patterns that are seemingly random and to exploit the pattern for designing the steganographic transmission waveforms. The steganographic security of the waveforms can be computed by measuring the Kullback-Leibler divergence between

44 Figure 4.4. Wenz curve. It describes the pressure spectral density levels of underwater ambient noise from weather, wind, geologic, and commercial shipping

45 the cover model and the stego model. As will be presented in the next section, the performance of potential intercept detector can be linked to Kullback-Leibler divergence [99].

4.5 Steganographic Security of Sonar Waveform

Let P be the model of the underwater acoustic background, and let S be the model for platform’s transmission waveform. The model for the sum of the two, received at the target, denoted as Q, is the model of underwater ambient acoustic background perturbed by the platform’s transmission waveform. A realization of each these models, as a function of time, t, is expressed as,

q(t) = p(t) + s(t). (4.1)

The quantity D(P 1kP 2), Kullback-Leibler divergence (KLD) or the relative entropy between P 1 = P and P 2 = Q, measures the probabilistic distance between the two models. In steganography literature, this quantity is widely used as the steganographic security [19, 35, 80–86]. KLD is shown to be linked to the error probabilities in the context of hypothesis testing. By the Chernoff-Stein lemma, when the type I error is fixed, it is shown that the type II error is exponentially small, with an error exponent equal to D(P 1kP 2). Detailed proof is shown in [99]. It is shown that relative entropy typical sequence set achieves this error exponent and any other sequence of sets have larger error exponent. The lower bound for type II error is shown as (4.2). A practical interpretation of this result is that given an upper bound on the type I error, α < α, the lower bound of type II error, β, is expressed as (4.2), where n is the number of independent samples.

−n(D(P kQ)+α) β > (1 − 2α)2 (4.2)

In the context of intercept detector performance, type I error is false alert and type II error is missed detection, i.e., PMD ≡ β. Since this is the lower bound of

PMD for any detector, the steganographic security of the random process S can be evaluated regardless of the type of intercept detector the target might employ. By having a measure with respect to the model for the ambient background, one can avoid relying on particular detector structures for assessing the level of security.

46 With a desired lower bound on probability of missed detection, β, an expression for the steganographic security constraint is given as

PMD ≥ β. (4.3)

In the search for transmission waveform model that results in higher PMD to meet the desired steganographic security constraint, a number of practical factors should be considered. The signal, transmitted waveform from the platform, received at the target will be attenuated due to underwater propagation (see section 1.2.2). This attenuation is frequency-selective and range-dependent, which will be approximated as a scaling factor, σT , for simplicity of the analysis. Therefore, (4.1) becomes

q(t) = p(t) + σT s(t), σT ≥ 0. (4.4)

The energy of s(t) seen at the target is summarized in σT and the model S describes only the type of the waveform. For example, if the waveform is a constant- envelope phase modulated sinusoidal waveform, the carrier frequency and the phase information are described by the model while the received level at the target is described by σT . The received level at the target increases monotonically with the transmission level, or the source level (SL), at the platform. Also, as σT increases D(P kQ(σT )) increases and PMD decreases as

−nD(P kQ(σT )) PMD ∝ 2 (4.5)

As σT → 0, there is no transmission, or all transmitted energy is lost before reaching the target, making Q → P and PMD → 1. Alternatively, as σT → ∞, then PMD → 0. However, the rate at which PMD falls depends on the waveform type, determined by S.

This level, σT , as seen in section 1.2.2, is a function of the range to the target, which is an unknown for sonar tracking. Also, although the sonar equations describe the general relationship between the intensity levels seen at the target and the platform, the transmission loss, TL, has a high variance due to the random nature of underwater propagation. Therefore, it is difficult to control the strength of waveform observed at the target.

47 The rate at which D(P kQ(S, σT )) increases with σT , expressed as

dD(P kQ(S, σ )) T , (4.6) dσT or the logarithm of (4.6), d log D(P kQ(S, σ )) T , (4.7) dσT is practically useful for understanding how sensitively the steganographic security changes with waveform intensity. While the slope (4.7) may not be constant over ∗ σT , the optimal transmission waveform process model S can be defined as the random process that yields the lowest increase rate of log D(P kQ(S, σT )) with respect to σT .

d log D(P kQ(S, σ )) S∗ =∆ argmin T (4.8) S dσT A potential intuitive solution is to mimic the environment and use P itself for the waveform model, resulting in

0 q(t) = p(t) + σT p (t). (4.9)

Note that p0(t) is a realization from the same model P , but it will be uncorrelated with p(t) since it is unlikely for the platform to coherently predict p(t) when p0(t) arrives at the target. Given P as the choice for S and with the steganographic security constraint, (4.3), the upper bound for σT and the associated upper bound for the transmission source level, SL can be determined.

−n(D(P kQ(σT ))+α) PMD(σT ) > (1 − 2α)2 (4.10) 1 D(P kQ(SL)) ∝ D(P kQ(σ )) ≤ − log (P (σ )) (4.11) T n 2 MD T

In practice, this source level constraint can serve as the guideline for determining a sequence of transmission schemes to prevent inadvertently exposing the platform by gradually increasing the source level after initially transmitting at a very low level [96]. It is also instructive to reverse the problem and consider the cover model P

48 itself as the solution of S∗, which is mimicking the background, and find the cover model structure for which the defined steganographic security measure is optimized. Since mimicking waveforms are one of the most intuitively natural waveform to consider for steganographic waveforms, investigating under which condition this intuition holds will provide a valuable lesson. Another reason to consider this approach is the difficult nature of solving the former problem. Given a cover model, it is found that determining the optimal embedding function can be an NP-hard combinatorial optimization problem [35,84, 100]. Given the high complexity of empirical cover models, finding analytic solutions or performing numerical optimization is beyond the scope of this dissertation, especially with the added complexity with the unknown σT .

49 Chapter 5 | Background Sound Modeling

Modeling efforts for underwater acoustic environment has been driven by its potential applications as well as academic interests to understand marine life as well as dynamic mechanisms by which sound is generated and propagated. Various models characterize different aspects of underwater sounds some of which are considered noise defined as undesired sound [34]. A number of comprehensive overviews of underwater ambient sound include diverse sources of man-made and natural phenomena [34, 94, 95, 101, 102]. More detailed analyses of regional ambient sound are performed from long-duration observations [103–107]. Other detailed characterizations of ambient sound are done with specific applications in mind such as underwater detection [108, 109], acoustic communications [110–112] and energy harvesting [106]. Deep water ambient sound statistics in the absence of distinct near-field sources are known to be well modeled with Gaussian amplitude distribution [34,113] and with spectra known as the Wenz curves as shown in Figure 4.4. Shallow water ambient sound is more impulsive and dynamic with water surface interactions, e.g., rainfall, shipping, and breaking waves due to winds [105,114,115], and marine life activities such as snapping shrimps [116]. The temporal and spatial statistics of these phenomena have also been characterized [117]. Although most of the literature on underwater ambient noise find it to be non-stationary [34,113,118] , using tools from nonlinear dynamic systems analysis and chaos theory [119–121], the chaotic nature of ocean ambient sound and its predictability has been studied and it is found to be somewhat predictable [110, 122,123] and has been applied to various applications [124,125]. From the perspective of steganography, any ambient sound is of interest since

50 it is the cover in which the transmission waveform intends to hide under. The models for underwater ambient sound, or in some cases considered noise, contains knowledge of the cover which will aid in generating plausible sounds that can be used for steganographic sonar. The result that it is predictable to a certain extent indicates there is room for a good enough model to be useful for underwater steganography. For sonar detection and tracking, information on seasonal variations of ambient sound level may not be useful for designing plausible sonar waveforms since its timescales are too long. The timescales involved with these operations are on the order of seconds or minutes for a few transmissions, maybe hours at most when considering the batch track detection approach discussed in Chapter 3. However, existing models typically measure the sound level by taking the average of the recorded sound for a few seconds or minutes, repeated at hourly or daily intervals [101,107,113]. A desired sound model is one that captures the short timescale structures of the background sound relevant to sonar detections, which will likely be attempted by potential intercept detectors. The ability to generate plausible temporal evolution of short sound segments is critical to deceiving potential intercept detectors, therefore a model structure that accommodates this is crucial. Finally, as presented in Chapter 4, the steganographic security of sonar waveform is evaluated via the Kullback-Leibler divergence (KLD) between the cover model and the stego model. Being able to measure KLD is another important consideration for selecting the model structure. Considering these factors, a modeling framework that lends itself to stegano- graphic security evaluation and sonar waveform design is the hidden Markov model (HMM). Due to its ability to capture temporal patterns, HMMs were applied to speech [126], handwriting [127], and gesture [128] recognition and bioinformatics for protein sequencing [129]. In this dissertation, each state of HMM represents a distinct type of sound, and the emission density function captures the magnitude distribution for each state.

51 5.1 Time Series Modeling

Phase space representation is chosen in this work as a method to capture the information and structures of waveforms on short timescales. For speech signals short-time Fourier analysis is used to characterize the evolution of vocal harmonics and it is a suitable choice since speech sound is well modeled as a combination of sinusoids. However, underwater environment consists of not only periodic sounds but also sounds that are impulsive and transient due to natural events and marine life activities such as snapping shrimps. Phase space representation does not assume periodicity of these waveforms and is able to preserve the information without distortions due to inapt assumptions. As an illustration of the underwater , consider the ambient sound being recorded through a at a certain location. Sound from diverse sources from different distances are all impinging on it to make up the overall sound it receives. Sound waveforms from similar sources exhibit similar and consistent shapes. For example, a click from a snapping shrimp would sound similar to other shrimps’ clicks. As another example, sound created by one rain drop on the water surface would also be similar to other rain drops. The sound generated by natural events arriving from a long distance tend to have smaller magnitude than that from a short distance. Also, there would be more of these events at longer distances than those close to the hydrophone. This observation has also been made and modeled in [34,116,117]. Many of these sounds are transient and distinctly identifiable due to the physical mechanism that governs the sound generation. Phase space time series analysis has been used extensively in dynamical systems analysis for detecting anomalies [130, 131], improving SNR of brain signal [132], speech recognition [133], and body movement recognition [134]. Measurements on dynamical systems exhibit regular patterns in the phase space and small deviations, for example, due to mechanical fatigue or failure are easier to detect and can even be predicted to a certain extent [131]. The phase space representation of time series is done by sampling at a fixed interval, TS, for de samples as (5.1).

T y(n) = [x(t0 + n), x(t0 + n + TS), ..., x(t0 + n + (de − 1)TS)] , (5.1)

52 and the time evolution of y’s is given by y(n) → y(n + 1), and each segment, y(n), is assigned a symbol for sequence analysis. [135] discusses the process in detail of how to determine TS and the minimum embedding dimension, de. This process is intended to use the fewest number of samples while retaining as much independent samples to extract useful information from x(t). The consensus in the literature is that the most useful TS is the first local minimum of the average mutual information of x(t), and de is determined by observing the behavior of nearest neighbors and increasing its value until there are no more false nearest neighbors. False neighbors of y(n) are those whose distance to y(n) increases more than a specified threshold when de increases from d to d + 1.

5.1.1 Time Series Clustering

Prior to symbol assignment, similarly shaped segments are grouped to form the basis of sound segments via time series clustering using a distance metric devised for time series analysis similar to the motion similarity measure developed in Chapter 3. Figure 5.1 shows segments of shrimp clicks from a hydrophone recording near a swarm of snapping shrimp from Key West, FL. These segments are clustered and assigned a unique symbol for each cluster. For clustering, a distance metric is defined as

d(a, b) =∆ 1 − s(a, b). (5.2) where a and b are normalized by the maximum magnitude within the segment in order to capture only the shape of the sound, and as in Chapter 3, s(a, b) is defined as

( de ) ( de ) ∆ 1 X 1 X s(a, b) = max R(ai − bi − γ) = max R(ei − γ) . (5.3) γ γ de i=1 de i=1

However, for sound pressure time series similarity, γ can be set to 0 since acoustic pressure signal oscillates around the ambient mean pressure level. Therefore (5.3) is simplified to de de ∆ 1 X 1 X s(a, b) = R(ai − bi) = R(ei). (5.4) de i=1 de i=1 The choice R(t) = 1/(1+(ct)2) is made to ensure the distance metric (5.2), expressed

53 Figure 5.1. Segments of recording of snapping shrimp. The direct path arrivals are aligned, and the second arrival are reflections off the water-air interface. as de ∆ 1 X 1 d(a, b) = 1 − 2 2 , (5.5) de i=1 1 + c (ai − bi) is a normalized quantity that takes values between 0 and 1, and the allowed mismatch can be adjusted with c, the sample-by-sample tolerance. Given all of the sound segments from the observed ambient sound, the distance between each pair of segments are computed. Although there are many approaches for clustering with pair-wise distance, the main objective of clustering for this work is to find similar sounding segments within a specified threshold, d. Therefore, a cluster is formed when the pair-wise distances of all of the segments are less than

2d. An illustration of the clustering approach is shown in Figure 5.2. Each cluster is represented by the center of the circle, or the segments with the minimum average distance to other sound segments within the circle. The set of center segments form the basis set of sound segments representing the set of symbols, or the states, of the cover model. In Figure 5.2, parameter c in (5.5), which is the sample-by-sample difference tolerance, effectively adjusts the distance between each points, and the clustering threshold d determines the cluster radius.

54

휖푑

Figure 5.2. An illustration of the clustering topology. Each point represents a segment and a circle represents a cluster with radius d. Segments within the a circle are assigned the same symbol. Segments not included in any circle is a assigned a unique symbol.

Therefore, their values play a role in determining the number of clusters that are formed.

Larger c and d results in smaller number of clusters and requires fewer param- eters to describe the cover model. However, this may be at the cost of lumping different types of sound segments into one, therefore losing the ability to accurately synthesize plausible cover sound from the model. On the other hand, smaller c and d would generate larger number of clusters resulting in more parameters to estimate. Although the fidelity of synthesized cover sound may be higher, since small perturbations from the same type of sound may become separate basis seg- ments it could effectively capture noise as part of the model. This also results in sparser stochastic matrix and emission function which would require more data to estimate the HMM parameters reliably. This phenomenon is also known as the curse of dimensionality [136]. In essence, c and d can determine the complexity of the model.

55 5.1.2 HMM Parameter Estimation

After time series clustering and assignment of symbols, the ambient sound is represented as a sequence of symbols with associated magnitudes. Characterization of a hidden Markov model is done by estimating the stochastic matrix and the emission probabilities. While a typical HMM formulation considers that the state transitions are hidden and that only the output values are observed, in this work the states are considered observable, i.e., there is no uncertainty in the type of sound segment given the state symbol other than the magnitude. Therefore, the symbol sequence can be considered a realization of a Markov chain of which parameters are estimated by counting the number of state-to-state transitions [137]. Since each short time segment, associated with each state symbol, is normalized by the maximum magnitude within the segment, the emission density function is a PMF (probability mass function) of the quantized maximum magnitude of the sound segment. The estimated model, Pˆ , consists of a set of basis sound segments, the stochastic ˆ ˆ matrix, PN×M , and the emission matrix, EN×L. These are considered to be estimates of the underlying true model P , associated with true stochastic matrix, P , and the true emission matrix, E, respectively. Although the stochastic matrix only describes the first order transitions, a sparse stochastic matrix with some transitions with probability of 1 implies smaller number of elemental segments with different lengths. Therefore, the effective timescale the model captures can be longer than the unit segment length, de. By connecting the transitions with probability of 1 and treating the connected sound segments as the elemental sound, these elemental sounds with different lengths can be identified. An example histogram of elemental sound length from a snapping shrimp recording is shown in Figure 5.3. The estimation process of the true underlying stego model, Q, utilizes the structure of the cover model discussed thus far. Recall the stego process (4.4),

q(t) = p(t) + σT s(t), σT ≥ 0. (5.6)

The observed time series, q(t), is converted into the symbol sequence via the phase space representation, then by comparing each segment to the basis set of sound segments and assigning the symbol with the smallest distance. With small

σT , the perturbations caused by the transmitted sonar waveform, p(t), should be

56 Figure 5.3. Histogram of effective segment durations characterized by the modeling scheme.

small enough to be within the tolerance determined by c in (5.5). Increasing σT will begin to cause some of the segments to be not similar to any of the existing sound segments described in the model. Although the model does not accommodate adding more basis sound segments to capture the emergence of new types of sound, this effect will be reflected as changes in the estimated HMM parameter values. Maintaining the structure of the model of Pˆ and observing the change in the estimated parameter values makes it feasible to measure the Kullback-Leibler divergence (KLD) between Pˆ and Qˆ as in [138,139],

N N L ! ˆ ˆ X X pˆnm X eˆnl D(P kQ) = πn pˆnm log + eˆnl log . (5.7) n=1 m=1 qˆnm l=1 rˆnl where π’s are the steady-state distribution, and qˆnm and rˆnm are elements of ˆ ˆ QN×N , the stochastic matrix, and RN×L, the emission matrix, respectively, for the estimated model Qˆ. As in the case for the cover model, estimating the HMM parameters for the stego model involves counting the number of transitions between different symbols.

57 However, it is possible that some of the symbols are not observed in the data resulting in empty rows in the stochastic matrix, which is not valid. The parameter cannot be left out in order to compare the estimated model with the background model and compute the KLD. A regularization step is required to remedy this situation. A potential approach is to use the maximum entropy solution, the uniform distribution, for the unobserved symbol. This is done based on the assumption there is no information observed to use for distribution estimation. The only available information is that the symbol in question exists in the cover model as one of the basis segments. Given no observation of this symbol, the estimated probability of transitioning to other symbols should be equally likely for all of the symbols. This regularization method will likely cause the estimated model to deviate from the underlying true stego model, Q, in the direction of higher entropy given the fact the stochastic matrix is typically a sparse one associated with low entropy. In Figure 5.4, Qˆ0 is moving away from the initially estimated Qˆ in the direction of P MaxEnt, a model with the same basis set, but with a stochastic matrix, Pmax, that has a uniform distribution for all states. This regularization approach results in a conservative estimate in the context of steganography in that it will likely overestimate KLD, causing the estimated steganographic security to be worse. However, the inconsistency of this approach is the fact all of the other compo- nents of the cover model is still being utilized. The idea of maintaining the model structure is to use the perspective that was constructed in the cover modeling process, and projecting the observed data onto the existing framework. Therefore, it is reasonable to assume the parameter values for the background model are known and to utilize them. The second approach and the one adopted in this work is to use the values in the cover model for the unobserved symbols. It should be noted that this is an optimistic estimate for the platform in the context of steganography since it tends to underestimate KLD since Qˆ will shift toward Pˆ , therefore overestimating the steganographic security. The radii of the circles in Figure 5.4, D(P ||Pˆ ) and D(Q||Qˆ ) are the estimation errors, and for ergodic processes these will likely converge to 0 as Pˆ → P with infinite observation time [137,140]. However, it should be noted this convergence only implies that the estimated HMM parameters converged to the true values given the model choice, not that the true complete model for the cover has been

58 푃푀푎푥퐸푛푡

푄 ′ 푃 푄

푃 푄

Figure 5.4. A simplex illustration of the true model and the estimated model for both cover and stego, and the maximum entropy model. achieved, illustrated in Figure 4.3. Another aspect to consider for HMM parameter estimation is the numerical stability of estimated KLD with sparse stochastic matrices. It is computed as (5.7), and this quantity may be undefined when pnm 6= 0, qnm = 0 for any (n, m). Even with the regularization steps taken as discussed, this is not an unlikely condition for sparse stochastic matrices with p(t) affecting the HMM parameters.

Imagine a maximum entropy model, P MaxEnt, with the same basis set, but with a stochastic matrix, PMaxEnt, that has a uniform distribution for all states.

Small perturbation of the estimated parameters toward PMaxEnt can prevent the numerical instability problem by ensuring qnm 6= 0 for all (n, m). From Figure 5.4, it can be thought of as moving the estimated stochastic matrix, Qˆ of Qˆ, away from

59 the edge of the simplex. Using the Jensen-Shannon divergence defined [141] as

1 1 1 JSD(QˆkZˆ) = D(QˆkM) + D(ZˆkM),M = (Qˆ + Zˆ), (5.8) 2 2 2 where Zˆ is the initial estimated stochastic matrix with potential numerical instability for KLD. The final estimated stochastic matrix, Qˆ can be determined as,

1(2i − 1 1!i−1 ) Qˆ = Zˆ + P . (5.9) 2 2i−1 2 MaxEnt

The value of i is increased until JSD(QˆkZˆ) is less than a specified threshold,

jsd. The Jensen-Shannon divergence is symmetric and it is ensured to be stable 1 ˆ ˆ ˆ ˆ since M = 2 (Q + Z), where Q is a weighted sum between PMaxEnt and Z, therefore, mnm > 0 for all (n, m).

5.2 Song Modeling and Steganography

The background sound used to demonstrate the modeling approach is the audio track of a video entitled “Boots and Cats” by Robert Clouth [142], a sound designer based in Barcelona, Spain. This sound consists of a few number of distinct elements repeated with small variations, which makes it an apt candidate for the experiment. The duration of the recording is 100 seconds which was partitioned into 30ms segments. Using the time series distance metric and the clustering method described in this chapter with c = 5 and d = 0.1, the segments were clustered into 1562 symbols. This is a generative model that can be used to sample random realizations, which preserves the rhythm and local temporal characteristics of the original sound. This can be considered a qualitative affirmation of the modeling capability of the presented approach. Although the short timescale temporal features are captured with reasonable reproducibility, the long timescale structure of the sound is not present in the random realization. While some transitions with certainty effectively captures some temporal patterns longer than the unit segment length as shown in Figure 5.3, capturing the subtle variations of the repeated theme that gradually intensifies is a characteristic beyond the capability of the presented modeling

60 approach. However, this modeling framework does make it feasible to measure the proba- bilistic difference between the original sound and the altered sound via estimated Kullback-Leibler divergence (KLD). This is useful for quantitatively evaluating the performance of audio steganography without resorting to evaluating the per- formance of any specific anomaly detection method. It should be noted that one of the potential shortcomings of using KLD with this model framework for as- sessing steganographic security is the deficiency of the model. A superior model that captures longer timescale patterns may provide a more accurate assessment of steganographic security if measuring the distance between two models is also feasible. Considering an active sonar scenario where the transmitted waveform is added to the background, or an additive audio steganographic scenario, the steganographic security of such waveform will degrade as the relative magnitude, or the signal to noise ratio (SNR), of the waveform increases. However, different levels of security may be achieved for various types of waveforms, even with the same embedding strength. ˆ With the estimated reference model, P ref , samples of the cover sound and stego ˆ sound are generated as (5.6), with varying σT values. The estimated cover model P ˆ and the estimated stego model Q(σT ) are used to estimate the KLD with respect ˆ to P ref . Figure 5.5 is a comparison of expected lower bound of the probability of missed detection against waveform strength for two types of embedded waveforms, the mimic waveform and the white Gaussian chaotic waveform. ˆ ˆ For each set of samples, D(P ref kP ) is computed as the baseline, and for varying values of σT to represent the level of the waveform strength seen at the target, ˆ ˆ D(P ref kQ(σP )) is computed from which the lower bound of PMD is computed as ˆ ˆ D(P ref kQ(σT )) PMD ≥ 2 . The slopes of the curves of lower bound of PMD in Figure

5.5 show the loss in steganographic security as σT increases, for candidate waveform models Smimic and SGauss. It is also worth noting the PMD level of Pˆ, which may converge to 1 with longer observation time as Pˆ → P .

61 Figure 5.5. Steganographic security vs. Waveform strength

62 Chapter 6 | Steganographic Sonar

Practical application of steganographic sonar involves all of the elements discussed in the previous chapters. It also involves carefully reconsidering the assumptions made on signal processing methods, e.g., matched filtering as the preprocessing step, especially in more impulsive background sound environments. For example, for heavy-tailed noise environment, locally optimum detector can achieve higher detection performance than the matched filter discussed in Chapters 2 and 3. The processing gain of matched filter increases with waveform duration and the processing gain of energy detector also increases with integration time, for a given received signal level. However, the rates at which they increase are different due to the increase in variance of the test statistic of the energy detector [21]. Also, the performance of an energy detector degrades when the observation time is not aligned with the beginning and the end of transmitted waveform [96]. Another crucial element to consider is the difference in the strength of the observed waveform at the target and the received echo level at the platform. While the platform can potentially achieve greater processing gain with the information advantage of knowing exactly what was transmitted, the target has an energy advantage since the waveform has only propagated half the distance and therefore would be attenuated much less. In prior work, having taken many of these factors into account, the overall assessment on the feasibility of steganographic sonar is that it is only able to achieve acceptable detection performance against targets close in range with long waveform durations [21,96]. However, it is also noted that the feasibility increases in more impulsive environment where the variability of the background is relatively larger. In this chapter, a Monte Carlo simulation framework with synthesized time

63 series is presented to investigate the feasibility of steganographic sonar as well as to confirm the utility of using Kullback-Leibler divergence (KLD) as a measure of steganographic security. In particular, the background model is estimated from a snapping shrimp recording and this model is used as the reference cover model, from which instances of background sound are sampled. Various types of transmission waveforms are sampled from different waveform models, including the snapping shrimp cover model, and are added to the cover sample to make up the stego sample. The strength of the waveform is determined by the range to the target and model parameters are estimated from both cover and stego samples, as well as the KLD between them. The security measure via KLD is compared with the empirical performance of energy detector. Performance of the batch track detection approach is also evaluated by processing the received echo at the platform and comparing the estimated track with the true target track.

6.1 Matched Filter and Energy Detector

In the context of steganographic sonar, detailed analysis was performed on the decision statistics of matched filter and energy detector, the choice of processing methods for the platform and the target, respectively [21]. Among the assumptions made in this work, the ambient noise was white Gaussian, the platform used a white Gaussian chaotic waveform, and for comparison, the processing window lengths were assumed to be the same for both matched filter and energy detector. This dissertation investigates the cases when these assumptions do not hold. Matched filter is widely used for detecting known signals, especially in white Gaussian noise for its optimality as well as in other types of noise for its robustness [143]. In sonar, its performance begins to suffer as the target echo becomes distorted due to many factors including target motion-induced Doppler effect, frequency- selective attenuation during underwater propagation, and when the characteristics of the ambient noise deviates from white Gaussian. The processing gain for matched filter is primarily determined by the time-bandwidth product of the waveform [40]. For steganographic sonar, when the waveform energy is spread in time and the waveform duration is extended, due to the change in target velocity within the waveform duration the target echo will likely be distorted beyond what can be corrected with a constant Doppler-compensated matched filter. The batch track

64 detection approach discussed in Chapter 3 hypothesizes the entire evolution of target acceleration and velocity, and processes short segments, or scans, of the raw sonar recording with the associated time-varying Doppler compensated matched filters, and coherently integrates the matched filter outputs across multiple scans. As seen in Chapter 3, narrow 3dB autocorrelation width makes the coherent integration process more sensitive to small motion differences. This width is inversely proportional to the bandwidth of the transmission waveform [40]. The value of the integrated peak depends not only on the autocorrelation width, but also on the similarity between the true target motion and the most similar trial motion sampled from the motion model presented in Chapter 3. Therefore, a simple target motion has a higher probability of yielding a larger integrated peak value than what a complex target motion would produce. While one does not have control over the complexity of target motion, finding a trial motion similar enough to the target motion can be done in two different ways. One is by increasing the number of trial motions possibly from a more complex motion model, and the other is by increasing the autocorrelation width of the transmission waveform to increase the tolerance of motion dissimilarity. The latter effectively increases the chance of aligning the target echoes, which can be achieved by reducing the bandwidth of the waveform, but this needs to be done with caution to avoid detection by the target. The motion similarity distribution is approximately modeled as a truncated exponential distribution, as was shown in Figure 3.10. This distribution function may be associated with a positive λ value, as seen in Figure 6.1 for which, given a finite number of trial motions, it is less likely to find a motion that is similar to the target motion. Since the motion similarity, as defined in (3.11), depends on wc, the 3dB width of the autocorrelation function of the transmission waveform, this effect is amplified when the autocorrelation width is small resulting in greater value of λ. Figure 6.2 compares the power spectral density of synthetic white Gaussian, colored Gaussian with three different spectral exponents α = 1, 1.5, 1.9, and a recording of snapping shrimp. Figure 6.3 shows the comparison of noise-free autocorrelation function of the same data used for Figure 6.2. Although the processing gain of matched filter is proportional to the time-bandwidth product of the waveform, robust coherent integration of target echoes relies on the individual correlation function widths being wide enough to overlap despite small motion

65 3

2.5

2

(s) 1.5

S f

1

0.5

0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 s

Figure 6.1. Distribution function of motion similarity for a complex target motion with associated parameter λ = 2. differences. Therefore, there is an inherent trade-off between individual processing gain and the overall processing gain. Processing gain of an energy detector is maximized when the integration window is matched with the duration of the signal it is searching for [96]. Since the target is assumed to be unaware of the details of the transmission waveform from the platform, it is reasonable to assume a mismatch between the platform’s waveform duration and the integration window length of the energy detector. If the platform employs a form of on-off keying transmission scheme, the mismatch can be greater. In addition to the integration window length and the underlying amplitude distribution, the spectral characteristics of the ambient sound, without any pre- processing steps such as whitening, can also affect the performance of the energy detector. Figure 6.4 shows a comparison of the distribution function of energy detector decision statistics for white Gaussian process, pink Gaussian process with α = 1 and other colored Gaussian processes with α = 1.5, 1.9, and snapping shrimp, with 1 second integration time. Snapping shrimp case is excluded from this figure due to its much higher variance which makes the interpretation of the illustration difficult.

66 Power Spectral Density -60 White Noise Pink Noise, ,: 1 -70 Pink Noise ,: 1.5 Pink Noise, ,: 1.9 Snapping Shrimp -80

-90

-100 PSD (dB/Hz) PSD

-110

-120

-130 100 101 102 103 104 Frequency (Hz)

Figure 6.2. Power spectral density of white noise, pink noise with α = 1, 1.5, 1.9, and snapping shrimp recording

Autocorrelation Function 5 White Noise Pink Noise, ,: 1 Pink Noise ,: 1.5 Pink Noise, ,: 1.9

0 Snapping Shrimp ) (dB) )

= -5 Rxx(

-10

-15 -10 -8 -6 -4 -2 0 2 4 6 8 10 " = (ms)

Figure 6.3. Zoomed in peak of autocorrelation functions of white noise, pink noise with α = 1, 1.5, 1.9, and snapping shrimp recording. The dotted line indicates −3dB.

67 Estimated PDF of Energy Detector Statistic 2 White Gaussian Pink (, = 1) , = 1.5 1 , = 1.9

) 0

ED

T

(f 10

log -1

-2

-3 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 T ED

Figure 6.4. Comparison of estimated PDF of energy detector decision statistics of white Gaussian and pink Gaussian (α = 1), and other colored Gaussian processes with α = 1.5, 1.9, for integration length of 1 second.

As shown in Figure 6.4, the decision statistic of an energy detector with a fixed integration length for signals with larger α value yields higher variance, which is determined by the peak of the short-time estimate of the autocorrelation function. Estimation with low variance requires sufficient number of independent samples. Within a fixed integration window, there are larger number of independent samples of white Gaussian process than what a pink Gaussian process contains. With longer integration window lengths, this effect becomes less prominent and the spectral shape becomes a less significant factor since there are sufficient number independent samples for pink Gaussian process as well. Based on the analysis on matched filter and energy detector, snapping shrimp background is chosen for investigating the steganographic sonar performance. The transmission waveforms considered are white Gaussian, colored Gaussian with varying spectral exponents (α = 1, 1.5, 1.9), and snapping shrimp sound itself. Details of the overall experiment are described in following sections.

68 6.2 Monte Carlo Simulation

The main parameters of the simulation are the source level (SL) at the platform and

the initial range to the target, Rinit. The observation time per scan is determined

by the maximum range considered for tracking, tscan = 2Rmax/c. Rmax considered in this thesis is 3km, and c is the speed of sound in water. The total observation

time across M scans is ttot = M × tscan. As inferred from the sonar equations in Chapter 1, range to the target determines the difference in the transmission loss

(TL) and the received signal levels, RLT at the target and RLP at the platform. The initial range to the target is randomly sampled from a uniform distribution

in the logarithm of Rinit. The support is bounded by the minimum initial range, 750m which makes the round trip time of 1 second, and the maximum range 3km with the round trip time of 4 seconds. The target track starts from the sampled range and travels according to the sampled motion relative to the platform. The source level, SL, is randomly sampled from a uniform distribution between 6dB

below the level that makes RLT = 0 when the target is at the minimum initial

range, and 6dB above what makes RLT = 0 when the target is at the maximum initial range. These are summarized as (6.1) and (6.2).

Rinit ∼ Unif(log(750), log(3000)) (6.1)

SL ∼ Unif(15 log10(750) − 6, 15 log10(3000) + 6) (6.2) The snapping shrimp process is chosen as the background for the simulation. It is a realistic sound, and it is complex enough to have room for un-modeled part

of it, as shown in Figure 4.3 as P − (Pˆh ∪ Pˆs), which allows for demonstration of steganography. The background sound model is estimated with a recording from a snapping shrimp environment, Key West, FL, and samples of cover sound is generated with the estimated model. The energy of the background sound is

normalized to be the same for each trial duration of ttot. Five types of candidate transmission waveforms are considered - white Gaussian, pink Gaussian (α = 1), snapping shrimp, and two other colored Gaussian processes with α = 1.5, 1.9. Pink noise signal approximates the overall spectral shape of the background sound for the entire frequency band, and waveform with spectral exponent α = 1.9 which is close to brown noise, matches the mid-region of the

69 bandwidth of the background sound, and α = 1.5 is chosen as the mid-point in the spectral exponent between pink and brown. Having various types of transmission waveform scenarios provides insight as to how each of the processing steps, including energy detector, matched filter, batch track detection, perform in each case. For each trial, a sample target motion is drawn from the motion model presented in Chapter 3, and the transmitted waveform arrives at the target at times computed with the sampled target motion for each scan, or coherent processing interval (CPI). The speed of sound is constant at c = 1500m/s. The round trip time for the target echo received at the platform is also computed with this sampled motion. The same target motion is used for all waveform scenarios. Figure 6.5 is an example of the target motion with amplified target echo level for illustration. When processed with the exact same target motion, the target echoes are aligned for coherent integration across scans as shown in Figure 6.6.

Figure 6.5. Matched filtered data with a strong signal for illustration.

At the target, energy detector statistic is computed with three different integra- tion times - 1 second, the CPI which is equal to tscan, and the total observation time, ttot. The test statistic is the squared sum of the observed recording for these integration times. Its structure is not optimized to account for the fact the cover sound is deviated from white Gaussian, and the effect on the performance will be

70 Figure 6.6. Processed data with true target motion. observed. The transmitted sonar waveform is present the entire time at the level determined by the source level and the transmission loss. Model parameters are estimated for Qˆ , and KLD is computed for both the cover and the stego, with respect to the reference model from which cover sound realizations are sampled. Processing steps at the platform consists of matched filtering with the associated Doppler-compensation based on the trial motion, followed by the batch track detection (BTD) algorithm discussed in Chapter 3. The trial motions used for each scenario are the same for all five waveform scenarios to make a more direct comparison of the performance, and BTD produces an estimate track regardless of the integrated peak value. When the root mean square error (RMSE) of the estimated track is smaller than the specified threshold, R = cτ /2, it is considered a successful target track detection. The performance metrics, as discussed in Chapters 3 and 4, are probability of missed detection at the target and probability of track detection at the platform. The overall success of steganographic sonar is defined as when the energy detector at the target fails to detect the existence of the sonar waveform at a specified probability of false alarm rate threshold, F AR, and the RMSE of the estimated track is smaller than a specified threshold, R.

71 6.3 Results

Simulation results are organized mainly in three parts; steganographic performance, batch track detection performance, and steganographic sonar performance. The last is the overall performance that includes the first two. It will be shown that the two events are statistically independent and that their joint performance can be computed as the product of their individual performances. Some of the key parameters for the simulations are listed below.

Parameter Value Description M 200 Number of scans tscan 4 seconds Time per scan ttot 800 seconds Total processing time

NTM 1024 Number of trial motions

Rinit 750 ∼ 3000 meters Initial range to the target SL 38 ∼ 58 dB Source level at the platform

LED 1 second Integration time for energy detector

F AR 1/3 hours Threshold for false alarm rate

R 50 meters Threshold for RMSE of estimated track

For each trial of the simulation, the source level at the platform and the initial range to the target are randomly sampled from the respective distribution, and the received levels at the target and at the platform are determined by the sonar equations. Results will be shown both in terms of the received level, and in terms of the combination of the source level, SL, and the range to the target in logarithm with base 10, log10 Rinit. At the target, the mean estimated Kullback-Leibler divergence (KLD) for the cover realization of snapping shrimp background sound with the entire observation h i of duration ttot is E D(P ref kPˆ) = 0.6471. This value is not as small as is the Boots and Cats example presented in Chapter 5, which indicates the inherent complexity of the sound is higher. However, estimated KLD computed as the the difference between two measured KLD values, or ∆KLD,

" pˆ# Dˆ(PˆkQˆ) = E log = D(P kQˆ) − D(P kPˆ) (6.3) P ref qˆ ref ref

72 is used for practical evaluation of the steganographic security of transmission waveforms. Note that the expectation of the log likelihood is taken with respect to ˆ P ref , and that this estimate converges, D(PˆkQˆ) → D(PˆkQˆ) as Pˆ → P ref . The expected lower bound on missed detection probability computed from the estimated KLD between cover and stego as a function of the received level at the target is shown in Figure 6.7. White Gaussian process maintains its steganographic security higher than other types of waveforms achieving expected lower bound of missed detection probability of over 0.4 at 0dB received signal level. Snapping shrimp does not show the best steganographic performance as was expected as the mimicking waveform, but better than two of the colored Gaussian processes in terms of the estimated KLD.

Expected Lower bound on Missed Detection Probability 1 White Gaussian Pink (, = 1) , = 1.5 0.8 , = 1.9

Snapping Shrimp )]

0.6 MD

0.4 E[min(Prob

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.7. Expected lower bound of missed detection probability computed from the estimated Kullback-Leibler divergence of the cover and the stego for white Gaussian, snapping shrimp, pink Gaussian (α = 1), and other colored Gaussian processes with α = 1.5, 1.9.

In comparison, the performance of the energy detector with 1 second integration time with the false alarm rate set to be one every three hours, are shown in Figure

6.8. This result is using all of the 1 second samples during the ttot observation time which is M × ttot = 800 samples. A detection is made for the trial when at least

73 one of the 800 samples is detected. This is equivalent to using the maximum of all of the decision statistics as the trial decision statistic. This skews the distribution of the waveform present case to the right, causing abrupt complete separation of distributions beyond a thresholding RL value, between 7 and 8 dB for most waveform types. The effect is shown as the missed detection probability suddenly converging to 0 in all of the curves.

P of Energy Detector, L : 1 second MD ED 1 White Gaussian Pink (, = 1) , = 1.5 0.8 , = 1.9 Snapping Shrimp

0.6

MD P 0.4

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.8. Probability of missed detection of energy detector with integration time of 1 second. The stego for white Gaussian, snapping shrimp, pink Gaussian (α = 1), other colored Gaussian with α = 1.5, 1.9.

The dominating factor of the performance plots in Figure 6.8 is the variability of the cover itself rather than the waveform type, which is why the curves are all clustered together, especially in the low RL region. It is worth noting snapping shrimp and colored Gaussian with α = 1.9 are the ones that deviate from the cluster due to their own high variance, as illustrated in Figure 6.4 for the colored Gaussian with α = 1.9 and the support for distribution of the snapping shrimp case does not even fit on the horizontal axis limit of Figure 6.4. The following Figures 6.9, 6.10, 6.11, 6.12, 6.13 compare the expected lower bound of missed detection probability via estimated KLD and the measured missed detection probability of energy detector with 1 second integration time.

74 Energy Detector and Estimated KLD, Stego: White Gaussian 1 Energy Detector 2(-"KLD)

0.8 )]

MD 0.6

, E[min(P , 0.4

MD P

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.9. Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with white Gaussian process.

Energy Detector and Estimated KLD, Stego: Snapping Shrimp 1 Energy Detector 2(-"KLD)

0.8 )]

MD 0.6

, E[min(P , 0.4

MD P

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.10. Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with snapping shrimp process.

75 The curves for white Gaussian and snapping shrimp imply the steganographic security estimated with KLD seem to be in agreement with the measured missed detection probability of the energy detector based on a qualitative observation that both pair of curves show poorer performance for snapping shrimp than white Gaussian. However, in the colored Gaussian cases, while the energy detector performance is not affected by the change in spectral shape of the transmitted waveform, KLD is able to capture the change in the temporal characteristics. Since the cover sound model with the HMM framework can measure the change in the HMM parameters of the stego sound, the steganographic security estimated with KLD is showing higher sensitivity to the temporal characteristics than the energy detector with 1 second integration time.

Energy Detector and Estimated KLD, Stego: Pink Gaussian (, = 1) 1 Energy Detector 2(-"KLD)

0.8 )]

MD 0.6

, E[min(P , 0.4

MD P

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.11. Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with pink (α = 1) Gaussian process.

The performance of energy detector with longer integration length tscan is not

shown since it is very similar to that of 1 second case, and the case for ttot = 800 does not yield any missed detection. While this result is expected since processing gain of an energy detector does grow with time, it is difficult to assume the waveform length to be known at the energy detector. The batch track detection performance at the platform also shows difficulty for the snapping shrimp process as shown in Figure 6.14. This is likely due to

76 Energy Detector and Estimated KLD, Stego: Colored Gaussian, , = 1.5 1 Energy Detector 2(-"KLD)

0.8 )]

MD 0.6

, E[min(P , 0.4

MD P

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.12. Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with colored Gaussian process with α = 1.5.

Energy Detector and Estimated KLD, Stego: Colored Gaussian, , = 1.9 1 Energy Detector 2(-"KLD)

0.8 )]

MD 0.6

, E[min(P , 0.4

MD P

0.2

0 -10 -8 -6 -4 -2 0 2 4 6 8 10 Received Level (dB)

Figure 6.13. Expected lower bound of missed detection probability and the missed detection probability of energy detector with 1 second integration time of stego with colored Gaussian process with α = 1.9.

77 the high side lobes of the autocorrelation function shown in Figure 6.3, and due to the background being similar to the waveform itself making the matched filter gain insufficient to overcome the interference. This interference is severe enough to result in near zero probability of track detection at RL = −40dB when all of the other waveforms achieve detection probability of near one.

Batch Track Detection Probability 1 White Gaussian Pink (, = 1) , = 1.5 0.8 , = 1.9 Snapping Shrimp

0.6

Prob(TD) 0.4

0.2

0 -60 -55 -50 -45 -40 -35 -30 Received Level (dB)

Figure 6.14. Batch Track Detection performance vs. received target echo level.

The colored Gaussian process with α = 1.9 achieves the best overall performance, closely followed by white Gaussian process. It is interesting to note the performance of colored processes are not ranked by their spectral exponent. With relatively simple motion model with low complexity chosen for reasons of demonstration of feasibility, wider autocorrelation peak is not shown to be strictly advantageous. However, the fact that white Gaussian process, with the most narrow autocor- relation width, achieves less than perfectly reliable performance even in the high RL value region indicates the target motion is complex enough such that small motion dissimilarities cause misalignment of target echo energy. More complex target motion has potential to separate the performance of the four waveform types that are clustered in Figure 6.14.

78 The steganographic performance and the track detection performance are pri- marily determined by the received signal level, RLT , or the target echo level, RLP . The two factors that serve as the link between the two are source level, SL, and initial range to the target, Rinit. Since the simulation sampled the two parameters randomly in the SL × log10 Rinit space, this space is partitioned in both dimensions into smaller bins. Two marginal performance metrics, PMD, and PTD, and the joint performance metric, PSS, are measured for each bin.

The overall performance, PSS, probability of steganographic sonar is determined by counting the cases when the energy detector at the target fails to detect the transmitted waveform and the platform detects the target track with RMSE less than the threshold, R, then dividing the total number by the number of samples within the grid boundaries. Figures 6.15, 6.16, 6.17, 6.18, 6.19 show their respective steganographic sonar performance.

P , White Gaussian in Snapping Shrimp SS 1

55 0.8

50 0.6

45 0.4 Source Level (dB) Level Source 0.2 40

0 2.9 3 3.1 3.2 3.3 3.4 log Range (m) 10

Figure 6.15. Steganographic sonar performance of white Gaussian process in snapping shrimp background.

At false alarm rate of one per three hours, an arbitrarily chosen value for feasibil- ity demonstration, the simulation shows reliable steganographic sonar performance close to probability of 1 in some combination of SL and Rinit values. The overall

79 P , Snapping Shrimp in Snapping Shrimp SS 1

55 0.8

50 0.6

45 0.4 Source Level (dB) Level Source 0.2 40

0 2.9 3 3.1 3.2 3.3 3.4 log Range (m) 10

Figure 6.16. Steganographic sonar performance of snapping shrimp process in snapping shrimp background.

P , Pink Gaussian in Snapping Shrimp SS 1

55 0.8

50 0.6

45 0.4 Source Level (dB) Level Source 0.2 40

0 2.9 3 3.1 3.2 3.3 3.4 log Range (m) 10

Figure 6.17. Steganographic sonar performance of pink Gaussian process (α = 1) in snapping shrimp background.

80 P , Colored Gaussian (, :1.5) in Snapping Shrimp SS 1

55 0.8

50 0.6

45 0.4 Source Level (dB) Level Source 0.2 40

0 2.9 3 3.1 3.2 3.3 3.4 log Range (m) 10

Figure 6.18. Steganographic sonar performance of colored Gaussian process with α = 1.5 in snapping shrimp background.

P , Colored Gaussian (, :1.9) in Snapping Shrimp SS 1

55 0.8

50 0.6

45 0.4 Source Level (dB) Level Source 0.2 40

0 2.9 3 3.1 3.2 3.3 3.4 log Range (m) 10

Figure 6.19. Steganographic sonar performance of colored Gaussian process with α = 1.9 in snapping shrimp background.

81 region of reliable performance is where the pair of received levels at the target and at the platform are shown to achieve reliable performance jointly. While the region with high performance are roughly consistent across waveform types, the boundaries of reliable performance region do show dependence on the waveform type. As for the mimicking waveform, the snapping shrimp waveform in snapping shrimp cover, again, does not show promising performance while other types of waveforms show a similar patterns of performance. The most desirable source level, source level that achieves the maximum per- formance, increases with range to the target. However, the maximum achievable performance degrades with increasing range. In practice, it is important to un- derstand the reason for failure in the low performance regions. The [high source level/close range] region of the parameter space fails due to detection at the target, and the [low source level/far range] region of the parameter space fails due to either insufficient source level or insufficient number of scans, M. Figure 6.20 shows the mean received level (RL) of the waveform at the target and the target echo level at the platform, respectively. Note the difference in the color axis. Figure 6.21 shows the missed detection probability and the batch track detection probability of pink Gaussian process waveform scenario. The plots for RL values at the target and at the platform, Figure 6.20, show slight difference in slope due to the difference in the propagation path length to the target and to the platform. This difference in slope is consistent with the performance seen in Figure 6.21. The corresponding RL values for the target and the platform are also consistent with the RL values in Figure 6.11 and the pink Gaussian curve in Figure 6.14. Taking the Hadamard product of the two marginal performance results in Figure 6.21, i.e., performing a point-by-point multiplication of the probability values, yields the left plot in Figure 6.22. This is very similar to the right plot, the actual joint performance of steganographic sonar, indicating independence between the detection events at the target and track detection events at the platform. This independence makes the evaluation of the performance convenient since the individual performances can be estimated separately, then the joint performance can be predicted by taking the dot product of the two. This allows for computationally efficient prediction or estimation and sensitivity analysis for other factors without having to run the full simulation for each permutation of parameters.

82 Mean RL, Target Mean Echo Level, Platform -30

55 10 55 -35

-40 5 50 50 -45 0 -50 45 45

-5 -55

Source Level (dB) Level Source (dB) Level Source

40 -10 40 -60

-65 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.20. Mean received level of the waveform at the target and the mean target echo level at the platform, for each grid.

P , Target, Pink P , Platform, Pink MD TD 1 1

55 55 0.8 0.8

50 0.6 50 0.6

45 0.4 45 0.4

Source Level (dB) Level Source (dB) Level Source 0.2 0.2 40 40

0 0 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.21. Probability of missed detection at the target and probability of track detection at the platform, for each grid.

Figure 6.23 is a comparison of PSS from the simulation and the prediction based on the estimated curve parameters from Figures 6.14 and 6.11, with finer resolution in the SL × log10 Rinit parameter space. This also allows one to extrapolate the performance prediction in other regions of the parameter space with relatively small number of simulations or measurements. From the predicted performance, the range of SL values that achieve reliable

83 P = P # P , Pink P , Pink SS MD TD SS

0.8 0.8 55 55 0.7 0.7

0.6 0.6 50 50 0.5 0.5

0.4 0.4 45 45

0.3 0.3

Source Level (dB) Level Source (dB) Level Source 0.2 0.2

40 0.1 40 0.1

0 0 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.22. Steganographic sonar performance computed by taking the product of the individual performances at the target and at the platform compared to the jointly computed performance. steganographic sonar performance is shown to grow wider in close ranges. Figures 6.24, 6.25, 6.26 show the results for white Gaussian, and colored Gaussian with α = 1.5, 1.9, respectively. The reliable performance boundaries are clearer in the predicted plots in comparison with the coarsely binned estimates.

84 P , Joint Simulation, Pink P , Estimated Model, Pink SS SS 1 1

55 55 0.8 0.8

50 0.6 50 0.6

45 0.4 45 0.4

Source Level (dB) Level Source (dB) Level Source 0.2 0.2 40 40

0 0 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.23. Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation result and computing the joint performance with the independence assumption, for pink Gaussian waveform.

P , Joint Simulation, White Gaussian P , Estimated Model, White Gaussian SS SS 1 1

55 55 0.8 0.8

50 0.6 50 0.6

45 0.4 45 0.4

Source Level (dB) Level Source (dB) Level Source 0.2 0.2 40 40

0 0 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.24. Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation result and computing the joint performance with the independence assumption, for white Gaussian waveform.

85 P , Joint Simulation, ,:1.5 P , Estimated Model, ,:1.5 SS SS 1 1

55 55 0.8 0.8

50 0.6 50 0.6

45 0.4 45 0.4

Source Level (dB) Level Source (dB) Level Source 0.2 0.2 40 40

0 0 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.25. Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation result and computing the joint performance with the independence assumption, for colored Gaussian waveform with α = 1.5.

P , Joint Simulation, ,:1.9 P , Estimated Model, ,:1.9 SS SS 1 1

55 55 0.8 0.8

50 0.6 50 0.6

45 0.4 45 0.4

Source Level (dB) Level Source (dB) Level Source 0.2 0.2 40 40

0 0 3 3.2 3.4 3 3.2 3.4 log Range (m) log Range (m) 10 10

Figure 6.26. Steganographic sonar performance from the overall simulation and from the estimated model based on the individual simulation result and computing the joint performance with the independence assumption, for colored Gaussian waveform with α = 1.9.

86 Chapter 7 | Conclusion and Discussion

Steganographic transmission waveforms are desirable for many underwater appli- cations such as underwater acoustic communications, surveillance, detection, and tracking. In the context of active sonar tracking, an overall approach is presented that covers many topics including background sound modeling, steganographic security assessment of transmission waveform, target motion modeling, and batch track detection. Under shallow water environment background sound dominated by snapping shrimps, the presented approach shows feasibility of steganographic sonar for target track detection. For effective steganography, the hider should have a better model of the cover than the seeker. A background sound modeling approach is presented that utilizes symbolic time series analysis with phase space representation and clustering of time series segments. The time evolving characteristics are captured with the hidden Markov model (HMM). Covertness, or steganographic security, of the transmission waveform can be assessed by measuring the Kullback-Leibler divergence (KLD) between the cover model and the stego model. Using the presented sound modeling framework, KLD between the background sound and it perturbed by the transmission waveform is estimated with their respective HMM parameters. This is shown to be useful in capturing the different temporal characteristics in various stego processes that an energy detector fails to differentiate. For security, the transmission waveform energy is spread over longer time du- ration. Conventional tracking approaches fail to perform reliably, and existing track-before-detect approaches also fail due to overwhelming search computation with simplistic short timescale motion model. A plausible long timescale target mo-

87 tion model was presented that enforces constraints on the evolution of acceleration of the target motion. This improved the search efficiency and allowed for coherent integration of target echo energy and reliable target track detection. With success defined as when the target fails to detect the existence of the transmission waveform, and the platform detects the target track, a Monte Carlo- based simulation was performed to find the desirable source level for a given range to the target. Five types of transmission waveforms were investigated to understand the effects of the spectral shape and the similarity to the background sound of the waveform on steganographic security and track detection performance. In the snapping shrimp environment mimicking waveform did not prove to be a favorable waveform, but white Gaussian and colored Gaussian with spectral exponent of α = 1.9, close to brown noise, showed better performance for steganographic sonar against energy detector. The intention of the dissertation is to demonstrate the feasibility of stegano- graphic sonar for tracking, and each component has many variations that can potentially affect the overall performance. The following section discusses some of these topics that can be further investigated and their potential impact on the feasibility of steganographic sonar for tracking and other applications.

7.1 Future Effort: Model Mismatch and Model Com- plexity

It is pointed out in [19] that practical implementation of steganography involves good understanding of the empirical cover model and real-world covers are so complex that simple models cannot capture all of the relevant information. This is true for both of the models discussed in this dissertation, background sound model, and target motion model. Understanding the effect of model mismatch is a topic that can be further investigated with diverse background scenarios and target motion analysis with real navigation data. The motion similarity distribution modeled as a truncated exponential distribu- tion is an approximation devised for analytical derivation of the track detection performance. This distribution is sensitive to the actual target motion and the duration of the motion. While the sensitivity can be adjusted by the autocorrelation

88 width of the transmission waveform, changing the characteristics of the waveform also affects the steganographic security. Therefore, the effect of true motion com- plexity and the mismatch between the true target motion model and the motion model used for the batch track detection approach is of interest as future effort. On background sound modeling, in order to faithfully synthesize plausible sound the acoustic model needs to capture not only the amplitude distribution, but also the time-evolving characteristics. The presented symbolic time series analysis approach with hidden Markov model (HMM) is capable of capturing the stochastically dynamic characteristics of various types of cover background including the underwater environment. There are other acoustic modeling approaches such as long short-term memory (LSTM) recurrent neural network (RNN) [144] that have been shown to outperform HMM in its modeling and classification capability [145]. Artificial neural networks have gained significant attention for their ability to accurately approximate complex functions and have been applied to various domains including acoustic signal processing. The modeling approach via phase space representation and clustering introduced in this dissertation can be further improved by utilizing some of the work done in machine learning, especially LSTM RNNs when combined with a probabilistic description structure that allows for computation of KLD.

7.2 Discussion

A large body of work has been done in many of the related fields including, but not limited to, sonar tracking, waveform design, theoretical development in steganography, motion modeling, and underwater sound propagation. These fields have matured in their respective domain and has been optimized for different applications. Effective steganographic sonar requires all of these elements working together under a common objective and some of the conventional methods and assumptions in the respective fields need to be reconsidered when being applied to a different context. This is true for any field, but it was one of the most challenging aspects of demonstrating the feasibility of steganographic sonar. Taking the insights gained from this work and applying to a more diverse set of backgrounds and against various

89 classes of marine vehicles will involve reconsidering many of the assumptions made in this dissertation as well. Furthermore, although tracking was the application of interest, steganographic waveforms have potential for many other applications including communications, remote sensing, and navigation. Another interesting aspect of this topic is the importance of Figure 4.3. Although it is likely advantageous to learn the true model, P , as accurately as resources allow, deception of the seeker may be more effective with the knowledge of the seeker’s estimated model, Pˆs, rather than the true model. Game-theoretic approaches can be taken to investigate optimal strategies with the assumption that both parties are aware of the other. Lastly, the total amount of information that can be hidden under different backgrounds is of significant interest. Steganographic capacity has been studied from a theoretical standpoint, but practical estimation of this quantity may involve computing Kolmogorov complexity of empirically estimated model, which is known to be computationally prohibitive. However, it does not prevent one from pursuing it.

90 Bibliography

[1] Davey, S. J., M. G. Rutten, and B. Cheung (2008) “A comparison of detection performance for several track-before-detect algorithms,” EURASIP Journal on Advances in Signal Processing, 2008, p. 41.

[2] Arnold, J. F. and H. Pasternack (1990) “Detection and tracking of low-observable targets through dynamic programming,” in OE/LASE’90, 14-19 Jan., Los Angeles, CA, International Society for Optics and Photonics, pp. 207–217.

[3] Fernandez, M. F., T. Aridgides, and D. Bray (1990) “Detecting and tracking low-observable targets using IR,” in OE/LASE’90, 14-19 Jan., Los Angeles, CA, International Society for Optics and Photonics, pp. 193–206.

[4] Kirubarajan, T. and Y. Bar-Shalom (1996) “Low observable target motion analysis using amplitude information,” Aerospace and Electronic Systems, IEEE Transactions on, 32(4), pp. 1367–1384.

[5] Misovec, K., T. Inanc, J. Wohletz, and R. M. Murray (2003) “Low- observable nonlinear trajectory generation for unmanned air vehicles,” in Decision and Control, 2003. Proceedings. 42nd IEEE Conference on, vol. 3, IEEE, pp. 3103–3110.

[6] Hadzagic, M., H. Michalska, and E. Lefebvre (2005) “Track-before detect methods in tracking low-observable targets: A survey,” Sensors Trans Mag, 54(1), pp. 374–380.

[7] Barniv, Y. (1985) “Dynamic programming solution for detecting dim moving targets,” Aerospace and Electronic Systems, IEEE Transactions on, (1), pp. 144–156.

[8] Blostein, S. D. and H. S. Richardson (1994) “A sequential detection approach to target tracking,” Aerospace and Electronic Systems, IEEE Trans- actions on, 30(1), pp. 197–212.

91 [9] Boers, Y. and H. Driessen (2003) “Particle filter-based track before detect algorithms,” in Optical Science and Technology, SPIE’s 48th Annual Meeting, International Society for Optics and Photonics, pp. 20–30.

[10] Berger, C. R., M. Daun, and W. Koch (2007) “Low complexity track initialization from a small set of non-invertible measurements,” EURASIP Journal on Advances in Signal Processing, 2008(1), pp. 1–15.

[11] McDonald, M. and B. Balaji (2008) “Track-before-detect using Swerling 0, 1, and 3 target models for small manoeuvring maritime targets,” EURASIP Journal on Advances in Signal Processing, 2008(1), pp. 1–9.

[12] Nichtern, O. and S. R. Rotman (2008) “Parameter adjustment for a dynamic programming track-before-detect-based target detection algorithm,” EURASIP Journal on Advances in Signal Processing, 2008, p. 141.

[13] Wieneke, M. and W. Koch (2008) “On sequential track extraction within the PMHT framework,” EURASIP Journal on Advances in Signal Processing, 2008, p. 90.

[14] Urick, R. J. (1967) Principles of underwater sound for engineers, Tata McGraw-Hill Education.

[15] Burdic, W. S. (1991) Underwater acoustic system analysis, Prentice Hall.

[16] Stojanovic, M. and J. Preisig (2009) “Underwater acoustic communica- tion channels: Propagation models and statistical characterization,” IEEE Communications Magazine, 47(1), pp. 84–89.

[17] Provos, N. and P. Honeyman (2003) “Hide and seek: An introduction to steganography,” Security & Privacy, IEEE, 1(3), pp. 32–44.

[18] Cox, I., M. Miller, J. Bloom, J. Fridrich, and T. Kalker (2007) Digital watermarking and steganography, Morgan Kaufmann.

[19] Böhme, R. (2009) “An epistemological approach to steganography,” in Information Hiding, Springer, pp. 15–30.

[20] Pevny,` T., T. Filler, and P. Bas (2010) “Using high-dimensional image models to perform highly undetectable steganography,” in Information hiding, Springer, pp. 161–177.

[21] Willett, P., J. Reinert, and R. Lynch (2004) “LPI waveforms for active sonar?” in Aerospace Conference, 2004. Proceedings. 2004 IEEE, vol. 4, IEEE, pp. 2236–2248.

92 [22] Blanding, W. R., P. K. Willett, Y. Bar-Shalom, and R. S. Lynch (2005) “Covert sonar tracking,” in Aerospace Conference, 2005 IEEE, IEEE, pp. 2053–2062.

[23] Golomb, S. W. and H. Taylor (1984) “Constructions and properties of Costas arrays,” Proceedings of the IEEE, 72(9), pp. 1143–1163.

[24] Park Jr, J. (1986) “LPI techniques in the underwater acoustic channel,” in Military Communications Conference-Communications-Computers: Teamed for the 90’s, 1986. MILCOM 1986. IEEE, vol. 1, IEEE, pp. 10–5.

[25] Yang, T. and W.-B. Yang (2008) “Low probability of detection underwater acoustic communications using direct-sequence spread spectrum,” The Journal of the Acoustical Society of America, 124(6), pp. 3632–3647.

[26] Simmons, G. J. (1984) “The prisonersŠ problem and the subliminal channel,” in Advances in Cryptology, Springer, pp. 51–67.

[27] Fridrich, J., J. Kodovsky`, V. Holub, and M. Goljan (2011) “Steganal- ysis of content-adaptive steganography in spatial domain,” in Information Hiding, Springer, pp. 102–117.

[28] Schöttle, P. and R. Böhme (2012) “A game-theoretic approach to content- adaptive steganography,” in Information Hiding, Springer, pp. 125–141.

[29] Sedighi, V., R. Cogranne, and J. Fridrich (2016) “Content-Adaptive Steganography by Minimizing Statistical Detectability,” Information Foren- sics and Security, IEEE Transactions on, 11(2), pp. 221–234.

[30] Crandall, R. (1998) “Some notes on steganography,” Posted on steganog- raphy mailing list.

[31] Fridrich, J. and M. Goljan (2003) “Digital image steganography using stochastic modulation,” in Electronic Imaging 2003, International Society for Optics and Photonics, pp. 191–202.

[32] Fridrich, J., M. Goljan, and D. Soukal (2004) “Perturbed quantization steganography with wet paper codes,” in Proceedings of the 2004 workshop on Multimedia and security, ACM, pp. 4–15.

[33] Fridrich, J. and T. Filler (2007) “Practical methods for minimizing em- bedding impact in steganography,” in Electronic Imaging 2007, International Society for Optics and Photonics, pp. 650502–650502.

[34] Urick, R. J. (1984) Ambient noise in the sea, Tech. rep., DTIC Document.

93 [35] Cachin, C. (2004) “An information-theoretic model for steganography,” Information and Computation, 192(1), pp. 41–56.

[36] Tonissen, S. M. and R. J. Evans (1996) “Peformance of dynamic program- ming techniques for Track-Before-Detect,” Aerospace and Electronic Systems, IEEE Transactions on, 32(4), pp. 1440–1451.

[37] Johnston, L. A. and V. Krishnamurthy (2002) “Performance analysis of a dynamic programming track before detect algorithm,” Aerospace and Electronic Systems, IEEE Transactions on, 38(1), pp. 228–242.

[38] Orlando, D., G. Ricci, and Y. Bar-Shalom (2011) “Track-before-detect algorithms for targets with kinematic constraints,” Aerospace and Electronic Systems, IEEE Transactions on, 47(3), pp. 1837–1849.

[39] Rabaste, O., C. Riché, and A. Lepoutre (2012) “Long-time coherent integration for low SNR target via particle filter in track-before-detect,” in Information Fusion (FUSION), 2012 15th International Conference on, IEEE, pp. 127–134.

[40] North, D. O. (1963) “An analysis of the factors which determine signal/noise discrimination in pulsed-carrier systems,” Proceedings of the IEEE, 51(7), pp. 1016–1027.

[41] Tonissen, S. M. and Y. Bar-Shalom (1998) “Maximum likelihood track- before-detect with fluctuating target amplitude,” IEEE transactions on aerospace and electronic systems, 34(3), pp. 796–809.

[42] Bar-Shalom, Y., X. R. Li, and T. Kirubarajan (2004) Estimation with applications to tracking and navigation: theory algorithms and software, John Wiley & Sons.

[43] Rutten, M. G., B. Ristic, and N. J. Gordon (2005) “A comparison of particle filters for recursive track-before-detect,” in 2005 7th International Conference on Information Fusion, vol. 1, IEEE, pp. 7–pp.

[44] Larson, R. E. and J. Peschon (1966) “A dynamic programming approach to trajectory estimation,” Automatic Control, IEEE Transactions on, 11(3), pp. 537–540.

[45] Barniv, Y. and O. Kella (1987) “Dynamic programming solution for detecting dim moving targets part II: analysis,” Aerospace and Electronic Systems, IEEE Transactions on, (6), pp. 776–788.

[46] Liu, J. S. and R. Chen (1998) “Sequential Monte Carlo methods for dynamic systems,” Journal of the American statistical association, 93(443), pp. 1032–1044.

94 [47] Doucet, A., N. De Freitas, and N. Gordon (2001) “An introduction to sequential Monte Carlo methods,” in Sequential Monte Carlo methods in practice, Springer, pp. 3–14.

[48] Gordon, N. J., D. J. Salmond, and A. F. Smith (1993) “Novel approach to nonlinear/non-Gaussian Bayesian state estimation,” in IEE Proceedings F-Radar and Signal Processing, vol. 140, IET, pp. 107–113.

[49] Singer, R. A. (1970) “Estimating optimal tracking filter performance for manned maneuvering targets,” Aerospace and Electronic Systems, IEEE Transactions on, (4), pp. 473–483.

[50] Helferty, J. P. (1996) “Improved tracking of maneuvering targets: The use of turn-rate distributions for acceleration modeling,” Aerospace and Electronic Systems, IEEE Transactions on, 32(4), pp. 1355–1361.

[51] Gulati, S. and B. Kuipers (2008) “High performance control for graceful motion of an intelligent wheelchair,” in Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on, IEEE, pp. 3932–3938.

[52] Park, J. J. and B. Kuipers (2011) “A smooth control law for graceful motion of differential wheeled mobile robots in 2D environment,” in Robotics and Automation (ICRA), 2011 IEEE International Conference on, IEEE, pp. 4896–4902.

[53] Kotz, S., T. Kozubowski, and K. Podgorski (2012) The Laplace dis- tribution and generalizations: a revisit with applications to communications, economics, engineering, and finance, Springer Science & Business Media.

[54] Kalisiak, M. and M. van de Panne (2007) “Faster motion planning using learned local viability models,” in Robotics and Automation, 2007 IEEE International Conference on, IEEE, pp. 2700–2705.

[55] Petitcolas, F. A., R. J. Anderson, and M. G. Kuhn (1999) “Informa- tion hiding-a survey,” Proceedings of the IEEE, 87(7), pp. 1062–1078.

[56] Katzenbeisser, S. and F. Petitcolas (2000) Information hiding tech- niques for steganography and digital watermarking, Artech house.

[57] Johnson, N. F., Z. Duric, and S. Jajodia (2000) Information Hiding: Steganography and Watermarking-Attacks and Countermeasures: Steganog- raphy and Watermarking: Attacks and Countermeasures, vol. 1, Springer Science & Business Media.

[58] Malvar, H. S. and D. A. Florêncio (2003) “Improved spread spectrum: a new modulation technique for robust watermarking,” IEEE transactions on signal processing, 51(4), pp. 898–905.

95 [59] Chandramouli, R. and N. Memon (2001) “Analysis of LSB based image steganography techniques,” in Image Processing, 2001. Proceedings. 2001 International Conference on, vol. 3, IEEE, pp. 1019–1022.

[60] George, A. and M. H. Kiesler (1942), “Secret communication system,” US Patent 2,292,387.

[61] Glenn, A. (1983) “Low probability of intercept,” IEEE communications magazine, 21(4), pp. 26–33.

[62] Mills, R. E. and G. E. Prescott (1995) “Waveform design and analysis of frequency hopping LPI networks,” in Military Communications Conference, 1995. MILCOM’95, Conference Record, IEEE, vol. 2, IEEE, pp. 778–782.

[63] Stojanovic, M., J. Proarkis, J. Rice, and M. Green (1998) “Spread spectrum underwater acoustic telemetry,” in OCEANS’98 Conference Pro- ceedings, vol. 2, IEEE, pp. 650–654.

[64] Cox, I. J., G. Doërr, and T. Furon (2006) “Watermarking is not cryp- tography,” in International Workshop on Digital Watermarking, Springer, pp. 1–15.

[65] Bandyopadhyay, S. K., D. Bhattacharyya, D. Ganguly, S. Mukher- jee, and P. Das (2008) “A tutorial review on steganography,” in International conference on contemporary computing, vol. 101, Citeseer.

[66] Heidari-Bateni, G. and C. D. McGillem (1994) “A chaotic direct- sequence spread-spectrum communication system,” IEEE Transactions on communications, 42(234), pp. 1524–1527.

[67] Nicholson, D. L. (1988) Spread spectrum signal design: LPE and AJ systems, Computer Science Press, Inc.

[68] Dillard, G. M., M. Reuter, J. Zeiddler, and J. Zeidler (2003) “Cyclic code shift keying: a low probability of intercept communication technique,” Aerospace and Electronic Systems, IEEE Transactions on, 39(3), pp. 786–798.

[69] Cox, I. J., J. Kilian, F. T. Leighton, and T. Shamoon (1997) “Secure spread spectrum watermarking for multimedia,” Image Processing, IEEE Transactions on, 6(12), pp. 1673–1687.

[70] Marvel, L. M., C. G. Boncelet, and C. T. Retter (1999) “Spread spectrum image steganography,” IEEE Transactions on image processing, 8(8), pp. 1075–1083.

[71] Morkel, T., J. H. Eloff, and M. S. Olivier (2005) “An overview of image steganography.” in ISSA, pp. 1–11.

96 [72] Lew, H. (1996) Broadband Active Sonar: Implications and Constraints., Tech. rep., DTIC Document.

[73] Pace, P. E. (2009) Detecting and classifying low probability of intercept radar, Artech House.

[74] Lynch, R. S., P. K. Willett, and J. M. Reinert (2012) “Some Analysis of the LPI Concept for Active Sonar,” Oceanic Engineering, IEEE Journal of, 37(3), pp. 446–455.

[75] How, M. J. and J. M. Zanker (2014) “Motion camouflage induced by zebra stripes,” Zoology, 117(3), pp. 163–170.

[76] Caro, T., A. Izzo, R. C. Reiner Jr, H. Walker, and T. Stankowich (2014) “The function of zebra stripes,” Nature communications, 5.

[77] Hanlon, R. (2007) “Cephalopod dynamic camouflage,” Current Biology, 17(11), pp. R400–R404.

[78] Stubbs, A. L. and C. W. Stubbs (2016) “Spectral discrimination in color blind animals via chromatic aberration and pupil shape,” Proceedings of the National Academy of Sciences, 113(29), pp. 8206–8211.

[79] Sallee, P. (2003) “Model-based steganography,” in International workshop on digital watermarking, Springer, pp. 154–167.

[80] Cachin, C. (1998) “An information-theoretic model for steganography,” in Information Hiding, Springer, pp. 306–318.

[81] Ker, A. D. (2008) “Perturbation hiding and the batch steganography problem,” in Information Hiding, Springer, pp. 45–59.

[82] Pevny,` T. and J. Fridrich (2008) “Benchmarking for steganography,” in International Workshop on Information Hiding, Springer, pp. 251–267.

[83] Filler, T. and J. Fridrich (2009) “Complete characterization of per- fectly secure stego-systems with mutually independent embedding operation,” in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, pp. 1429–1432.

[84] Böhme, R. (2010) Advanced statistical steganalysis, Springer Science & Business Media.

[85] Ker, A. D., P. Bas, R. Böhme, R. Cogranne, S. Craver, T. Filler, J. Fridrich, and T. Pevny` (2013) “Moving steganography and steganalysis from the laboratory into the real world,” in Proceedings of the first ACM workshop on Information hiding and multimedia security, ACM, pp. 45–58.

97 [86] Sedighi, V., J. Fridrich, and R. Cogranne (2015) “Content-adaptive pentary steganography using the multivariate generalized Gaussian cover model,” in SPIE/IS&T Electronic Imaging, International Society for Optics and Photonics, pp. 94090H–94090H.

[87] Ker, A. D. (2007) “A capacity result for batch steganography,” IEEE Signal Processing Letters, 14(8), pp. 525–528.

[88] Filler, T., A. D. Ker, and J. Fridrich (2009) “The square root law of steganographic capacity for Markov covers,” in IS&T/SPIE Electronic Imaging, International Society for Optics and Photonics, pp. 725408–725408.

[89] Harmsen, J. J. and W. A. Pearlman (2005) “Capacity of steganographic channels,” in Proceedings of the 7th workshop on Multimedia and security, ACM, pp. 11–24.

[90] Wang, Y. and P. Moulin (2008) “Perfectly secure steganography: Capacity, error exponents, and code constructions,” IEEE Transactions on Information Theory, 54(6), pp. 2706–2722.

[91] Harmsen, J. J. and W. A. Pearlman (2009) “Capacity of steganographic channels,” IEEE Transactions on Information Theory, 55(4), pp. 1775–1792.

[92] Carter, G. C. (1981) “Time delay estimation for passive sonar signal processing,” Acoustics, Speech and Signal Processing, IEEE Transactions on, 29(3), pp. 463–470.

[93] Fortmann, T. E., Y. Bar-Shalom, and M. Scheffe (1983) “Sonar tracking of multiple targets using joint probabilistic data association,” Oceanic Engineering, IEEE Journal of, 8(3), pp. 173–184.

[94] Wenz, G. M. (1962) “Acoustic ambient noise in the ocean: spectra and sources,” The Journal of the Acoustical Society of America, 34(12), pp. 1936–1956.

[95] Committee on Potential Impacts of Ambient Noise in the Ocean on Marine Mammals, National Research Council, Ocean Studies Board, Division on Earth and Life Studies, Washington, DC (2003) Ocean noise and marine mammals, National Academies Press.

[96] Park, J. D., D. J. Miller, J. F. Doherty, and S. C. Thompson (2009) “A study on the feasibility of low probability of intercept sonar,” in Information Sciences and Systems, 2009. CISS 2009. 43rd Annual Conference on, IEEE, pp. 284–289.

98 [97] ——— (2010) “Feasibility of range estimation using sonar LPI,” in Informa- tion Sciences and Systems (CISS), 2010 44th Annual Conference on, IEEE, pp. 1–6.

[98] Versluis, M., A. Heydt, D. Lohse, and B. Schmitz (2000) “On the sound of snapping shrimp: The collapse of a cavitation bubble,” Journal of the Acoustical Society of America, 108(5), pp. 2541–2542.

[99] Cover, T. M. and J. A. Thomas (2012) Elements of information theory, John Wiley & Sons.

[100] Gary, M. R. and D. S. Johnson (1979) Computers and Intractability: A Guide to the Theory of NP-completeness, WH Freeman and Company, New York.

[101] Wenz, G. M. (1972) “Review of research: Noise,” The Journal of the Acoustical Society of America, 51(3B), pp. 1010–1024.

[102] Dahl, P. H., J. H. Miller, D. H. Cato, and R. K. Andrew (2007) “Underwater ambient noise,” Acoustics Today, 3(1), pp. 23–33.

[103] Ramji, S., G. Latha, V. Rajendran, and S. Ramakrishnan (2008) “Wind dependence of ambient noise in shallow water of Bay of Bengal,” Applied acoustics, 69(12), pp. 1294–1298.

[104] Andrew, R. K., B. M. Howe, J. A. Mercer, and M. A. Dzieciuch (2002) “Ocean ambient sound: comparing the 1960s with the 1990s for a receiver off the California coast,” Acoustics Research Letters Online, 3(2), pp. 65–70.

[105] Nystuen, J. A., S. E. Moore, and P. J. Stabeno (2010) “A sound budget for the southeastern Bering Sea: Measuring wind, rainfall, shipping, and other sources of underwater sound,” The Journal of the Acoustical Society of America, 128(1), pp. 58–65.

[106] Bassett, C., J. Thomson, and B. Polagye (2010) “Characteristics of underwater ambient noise at a proposed tidal energy site in Puget Sound,” in OCEANS 2010 MTS/IEEE SEATTLE, IEEE, pp. 1–8.

[107] Haxel, J. H., R. P. Dziak, and H. Matsumoto (2013) “Observations of shallow water marine ambient sound: the low frequency underwater sound- scape of the central Oregon coast,” The Journal of the Acoustical Society of America, 133(5), pp. 2586–2596.

99 [108] Bouvet, M. and S. Schwartz (1986) “Detection in underwater noises modeled as a Gaussian-Gaussian mixture,” in Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP’86., vol. 11, IEEE, pp. 2795–2798.

[109] Bouvet, M. and S. C. Schwartz (1988) “Underwater noises: Statistical modeling, detection, and normalization,” The Journal of the Acoustical Society of America, 83(3), pp. 1023–1033.

[110] Alvarez, A., C. Harrison, and M. Siderius (2001) “Predicting under- water ocean noise with genetic algorithms,” Physics Letters A, 280(4), pp. 215–220.

[111] Chitre, M., J. Potter, and O. S. Heng (2004) “Underwater acoustic channel characterisation for medium-range shallow water communications,” in OCEANS’04. MTTS/IEEE TECHNO-OCEAN’04, vol. 1, IEEE, pp. 40–45.

[112] Stojanovic, M. (2007) “On the relationship between capacity and distance in an underwater acoustic communication channel,” ACM SIGMOBILE Mobile Computing and Communications Review, 11(4), pp. 34–43.

[113] Arase, T. and E. M. Arase (1968) “Deep-Sea Ambient-Noise Statistics,” The Journal of the Acoustical Society of America, 44(6), pp. 1679–1684.

[114] Kerman, B. R. (1984) “Underwater sound generation by breaking wind waves,” The Journal of the Acoustical Society of America, 75(1), pp. 149–165.

[115] Carey, W. M. and R. B. Evans (2011) Ocean ambient noise: measurement and theory, Springer Science & Business Media.

[116] Radford, C. A., A. G. Jeffs, C. T. Tindle, and J. C. Montgomery (2008) “Temporal patterns in ambient noise of biological origin from a shallow water temperate reef,” Oecologia, 156(4), pp. 921–929.

[117] Potter, J. R., L. T. Wei, and M. Chitre (1997) “High-frequency ambient noise in warm shallow waters,” Sea Surface Sound, UK.

[118] Anderson, V. C. (1980) “Nonstationary and nonuniform oceanic back- ground in a high-gain acoustic array,” The Journal of the Acoustical Society of America, 67(4), pp. 1170–1179.

[119] Takens, F. (1981) “Detecting strange attractors in turbulence,” in Dynamical systems and turbulence, Warwick 1980, Springer, pp. 366–381.

[120] Abarbanel, H. D., R. Brown, and M. B. Kennel (1992) “Local Lya- punov exponents computed from observed data,” Journal of Nonlinear Science, 2(3), pp. 343–365.

100 [121] Abarbanel, H. D., R. Brown, J. J. Sidorowich, and L. S. Tsimring (1993) “The analysis of observed chaotic data in physical systems,” Reviews of modern physics, 65(4), p. 1331.

[122] Frison, T. W., H. D. Abarbanel, J. Cembrola, and B. Neales (1996) “Chaos in ocean ambient “noise”,” The Journal of the Acoustical Society of America, 99(3), pp. 1527–1539.

[123] Frison, T. W., H. D. Abarbanel, M. D. Earle, J. R. Schultz, and W. D. Scherer (1999) “Chaos and predictability in ocean water levels,” Journal of Geophysical Research: Oceans, 104(C4), pp. 7935–7951.

[124] Abarbanel, H. D., T. W. Frison, and L. S. Tsimring (1998) “Obtain- ing order in a world of chaos [signal processing],” IEEE Signal Processing Magazine, 15(3), pp. 49–65.

[125] Frison, T. W., M. D. Earle, H. D. Abanel, and W. D. Scherer (1999) “Interstation prediction of ocean water levels using methods of nonlinear dynamics,” Journal of Geophysical Research: Oceans, 104(C6), pp. 13653– 13666.

[126] Rabiner, L. and B. Juang (1986) “An introduction to hidden Markov models,” ieee assp magazine, 3(1), pp. 4–16.

[127] Starner, T., J. Makhoul, R. Schwartz, and G. Chou (1994) “On- line cursive handwriting recognition using speech recognition methods,” in Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on, IEEE, pp. V–125.

[128] Starner, T. and A. Pentland (1997) “Real-time american sign language recognition from video using hidden markov models,” in Motion-Based Recog- nition, Springer, pp. 227–243.

[129] Sonnhammer, E. L., G. Von Heijne, A. Krogh, et al. (1998) “A hidden Markov model for predicting transmembrane helices in protein sequences.” in Intelligent Systems for Molecular Biology ISMB, vol. 6, pp. 175–182.

[130] Chin, S. C., A. Ray, and V. Rajagopalan (2005) “Symbolic time series analysis for anomaly detection: a comparative evaluation,” Signal Processing, 85(9), pp. 1859–1868.

[131] Gupta, S., A. Ray, and E. Keller (2007) “Symbolic time series analysis of ultrasonic data for early detection of fatigue damage,” Mechanical Systems and Signal Processing, 21(2), pp. 866–884.

101 [132] Beim Graben, P. (2001) “Estimating and improving the signal-to-noise ratio of time series by symbolic dynamics,” Physical Review E, 64(5), p. 051104.

[133] Lindgren, A. C., M. T. Johnson, and R. J. Povinelli (2003) “Speech recognition using reconstructed phase space features,” in Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). 2003 IEEE Interna- tional Conference on, vol. 1, IEEE, pp. I–60.

[134] Campbell, L. W. and A. E. Bobick (1995) “Recognition of human body motion using phase space constraints,” in Computer Vision, 1995. Proceedings., Fifth International Conference on, IEEE, pp. 624–630.

[135] Kennel, M. B., R. Brown, and H. D. Abarbanel (1992) “Determining embedding dimension for phase-space reconstruction using a geometrical construction,” Physical review A, 45(6), p. 3403.

[136] Bellman, R. (1956) “Dynamic programming and Lagrange multipliers,” Proceedings of the National Academy of Sciences, 42(10), pp. 767–769.

[137] Papoulis, A. and S. U. Pillai (2002) Probability, random variables, and stochastic processes, Tata McGraw-Hill Education.

[138] Do, M. N. (2003) “Fast approximation of Kullback-Leibler distance for dependence trees and hidden Markov models,” IEEE signal processing letters, 10(4), pp. 115–118.

[139] Xie, L., V. A. Ugrinovskii, and I. R. Petersen (2005) “Probabilistic distances between finite-state finite-alphabet hidden Markov models,” IEEE transactions on automatic control, 50(4), pp. 505–511.

[140] Grimmett, G. and D. Stirzaker (2001) Probability and random processes, Oxford university press.

[141] Lin, J. (1991) “Divergence measures based on the Shannon entropy,” IEEE Transactions on Information theory, 37(1), pp. 145–151.

[142] Clouth, R. (2012), “Boots and Cats,” . URL http://robclouth.com/

[143] Poor, H. V. (2013) An introduction to signal detection and estimation, Springer Science & Business Media.

[144] Hochreiter, S. and J. Schmidhuber (1997) “Long short-term memory,” Neural computation, 9(8), pp. 1735–1780.

102 [145] Sak, H., A. W. Senior, and F. Beaufays (2014) “Long short-term memory recurrent neural network architectures for large scale acoustic modeling.” in INTERSPEECH, pp. 338–342.

103 Vita Joonho Daniel Park

EDUCATION

Ph.D. Electrical Engineering, The Pennsylvania State University 2016 M.S. Electrical Engineering, The Pennsylvania State University 2009 B.S. Electrical Engineering, Seoul National University, South Korea 2007

PROFESSIONAL WORK EXPERIENCE

Mr. J. Daniel Park is an Associate Research and Development Engineer at the Applied Research Laboratory The Pennsylvania State University, where he has been employed for seven years. He conducts research and development on various topics including waveform design, tracking, signal processing, and classification. He has been the principal investigator of multiple research programs from government sponsors on quality assessment of signal processing, performance evaluation for classification algorithms, and signal design for acoustic tracking.