ENHANCING FLUID MODELING WITH TURBULENCE AND ACCELERATION
A dissertation submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy
by
Fan Chen
May 2015
Dissertation written by
Fan Chen
B.S., Huazhong University of Science and Technology, 2005
M.S., Huazhong University of Science and Technology, 2007
M.S., Kent State University, 2009
Ph.D., Kent State University, 2015
Approved by
Dr. Ye Zhao, Chair, Doctoral Dissertation Committee
Dr. Feodor Dragan, Members, Doctoral Dissertation Committee
Dr. Arden Ruttan
Dr. Jing Li
Dr. Mina Katramatou
Accepted by
Dr. Javed Khan, Chair, Department of Computer Science
Dr. James L. Blank, Dean, College of Arts and Sciences
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
1 Introduction
1.1 Significance, Challenge and Objectives
1.2 Methodology and Contribution
1.3 Publications
2 Background
2.1 Fluid Simulation
2.2 Distance Field
2.3 Fluid Turbulence
2.4 GPU acceleration in Fluid Modeling
3 Distance Field
3.1 Introduction
3.2 Distance Field Transform
3.2.1 Definition
3.2.2 Vector-Based Distance Transform
3.2.3 Our Computational Scheme
3.3 Active Band Scheme
3.3.1 Propagation Procedure
3.3.2 Lifespan of a Point
3.3.3 Grid Structures and Lifespan Coefficient
3.4 Computational Procedure
3.5 Multiple-Segment Distance Transform
3.6 Results and Discussion
4 Adaptive and Controllable Turbulence Enhancement
4.1 Introduction
4.2 Random Forcing
4.3 Turbulence Synthesis
4.3.1 Frequency Domain Generation
4.3.2 Energy Spectrum Control
4.3.3 Computation
4.4 Turbulence Integration
4.5 Conditional and Intermittent Turbulence
4.6 Experiments
5 Langevin Particles in Flow Simulation
5.1 Introduction
5.2 Langevin Model
5.2.1 Particle Motion: A Random Process
5.2.2 Generalized Langevin Model
5.2.3 Flow Turbulence
5.2.4 Computational Scheme
5.3 Langevin Particles in Flow Simulation
5.3.1 Langevin Force
5.3.2 Particle Evolution
5.3.3 Turbulence Control
5.3.4 Simulation Procedure
5.4 Results
5.5 Discussion
6 Using GPU in Fluid Modeling
6.1 GPU computation with CUDA
6.2 LBM Simulation
6.2.1 Introduction
6.2.2 Implementation and Results
6.3 FTLE
6.3.1 Introduction
6.3.2 Implementation and Results
6.4 Fluid Compression/Decompression
6.4.1 Compression
6.4.2 Decompression
6.4.3 Result
7 Conclusion
BIBLIOGRAPHY
LIST OF FIGURES
1 Floating-Point Operations per Second [1].
2 Vector based distance propagation.
3 Distance transform on a rectangular grid.
4 Distance transform on a triangle grid.
5 The 12 neighbors of an FCC lattice site form a cuboctahedron (Image courtesy of Dr. Feng Qiu).
6 Distance transform on an FCC grid.
7 Computational time for distance transform on different distance ranges.
8 Performance of using different segment sizes.
9 Distance field of the two points is rendered as isosurfaces with different distance values.
10 Distance field of the Armadillo is rendered as isosurfaces with different distance values.
11 Distance field of the bunny is rendered as isosurfaces with different distance values.
12 Random vector fields generated for a preferred scale with different deviations.
13 Divergence-free vector fields with two scales. Top: Spectrum; Bottom: Vector field. 1 = √2, 2 = 8 and σ1 = σ2 = 0.7.
14 Data flow of FNS() computation.
15 Snapshots of turbulence enhancement simulations: (a) Original coarse simulation; (b) Wavelet subgrid turbulence; (c) Our subgrid turbulence; (d) Add vorticity confinement to (a); (e) Wavelet turbulence to (d); (f) Our turbulence to (d) with q = 0.8; (g) Our turbulence to (d) with q = 0.2; (h) Our turbulence to (d) with q = 0.1.
16 Snapshots of integrating turbulence to a laminar smoke.
17 Snapshots of turbulence enhancement conditioned by the distance to obstacles: (a) a laminar smoke simulation; (b) direct turbulence integration to (a) at the same resolution; (c) Finer turbulent behavior achieved by executing simulation on a coarser grid than (a), while coupling turbulence to the interpolated flow.
18 Snapshots of turbulence enhancement with SPH ((b) and (d)), in comparison with original simulation ((a) and (c)).
19 Compute Langevin force F(t). u(t) is the particle velocity, u(t) is the mean flow velocity, and u(t + 1) is the particle’s target velocity at the following time step t + 1 computed by Eqn. 33.
20 Snapshots of integrating turbulence to a rising smoke flow with two different turbulence levels controlled by the characteristic length scale lm. (a) Original flow; (b) Vorticity confinement; (c) Random forcing; (d) lm = 0.001; (e) lm = 0.003.
21 Snapshots of integrating turbulence to a smoke simulation with diminishing wind. Left: before wind stops; Right: after wind stops.
22 Snapshots of turbulence enhancement of a smoke past obstacles with two different turbulence levels.
23 Snapshots of turbulence enhancement of a flow over a table top with two different turbulence levels.
24 The GPU Devotes More Transistors to Data Processing [1].
25 GPU programming model [1].
26 2D and 3D LBM lattice [2].
27 Flow pattern with FTLE and LCS. (a) Red: upward velocity. Green: downward velocity; (b)(c) Red: high FTLE value; Blue: low FTLE value.
28 Smoke animation framework overview.
29 Bidirectional advection for P-Frames estimation from two consecutive K-Frames. Red and purple arrow lines represent forward advection and backward advection, respectively.
30 Stream compaction [3].
31 Scan Algorithm [3].
32 Scan and Scatter [3].
LIST OF TABLES
1 Performance of our distance transform method on multiple data sets. For each model in different grid size, we compare the computational time between multiple-segment method with a segment size l_max^seg = 20 (see Section 3.5), and the method with no segment. Computational time is measured in seconds.
2 Performance report.
3 SRT LBM time comparing between CPU and GPU.
4 MRT LBM time comparing between CPU and GPU.
5 FTLE computation Performance.
6 Fluid decompression.
CHAPTER 1
Introduction
1.1 Significance, Challenge and Objectives
Modeling of fluid phenomena such as water, smoke, gas and fire is widely used in computer graphics, physics and other fields. Computational Fluid Dynamics (CFD), a well-developed research subject, offers many advanced numerical methods for simulating fluids.
The foundation of CFD is the Navier-Stokes equations [4], and how to solve these partial differential equations efficiently and correctly has been broadly studied. In computer graphics, fluid simulation carries high expectations of realism, but the result needs to be visually plausible rather than physically exact. At the same time, speed is very important in graphics applications.
Therefore, efficiently simulating turbulent and realistic fluids has become an important objective for graphics researchers. Most efforts solve the Navier-Stokes equations with one of two numerical methods: the stable fluid solver [5] and the Lattice Boltzmann Method (LBM) [6]. The stable fluid solver employs semi-Lagrangian advection to guarantee unconditionally stable results. It is an implicit finite-difference method for solving the NS equations. Although this method can achieve real-time speed at low resolution, acceleration is still needed for high-resolution simulation. Because of resource and time limitations, it is not practical to simulate the fluid at very high resolution, especially in real-time applications. On the other hand, low-resolution simulation satisfies the speed requirement but cannot provide abundant details and turbulence due to numerical dissipation. Many researchers have endeavored to introduce turbulence to enhance fluid animations based on the stable solver. Synthetic noise and energy injection have been used to enhance details and turbulence.
From another aspect, the simulation can be accelerated by exploiting the programmability of the graphics processing unit (GPU). Many researchers study how to use the GPU to parallelize fluid modeling, so that fluids can still be simulated efficiently at relatively high resolution with abundant details. The Lattice Boltzmann Method (LBM) [6] solves the Navier-Stokes equations explicitly on a lattice, which maps naturally onto the programmable GPU.
Beyond flow solvers, the GPU can accelerate other parts of fluid modeling. The finite-time Lyapunov exponent (FTLE) is exploited in fluid animation for controlling fluid behavior; its computation is very intensive but parallel in nature, which suits GPU acceleration. Another application is fluid compression/decompression.
Since decompression is expected to be fast, resorting to the GPU for acceleration is necessary to achieve better performance.
Another important task in fluid simulation is boundary handling, which strongly influences the realism of the fluid. For obstacles inside the fluid, one usually computes their distance fields, which implicitly represent the geometric shapes. Previous methods of computing the distance field are not fast enough, especially for moving obstacles.
1.2 Methodology and Contribution
Although flow simulation in computer graphics has achieved an astounding appearance of various natural phenomena, graphical fluid solvers are continuously being improved to confront the challenges of realistic simulation, energy dissipation and limited computational resources. To enhance and accelerate fluid modeling, we propose new methods that achieve better visual results and performance, as follows:
• The obstacle boundaries inside fluids are usually represented by distance fields. Complicated fluid behavior is most of the time affected by obstacles inside the fluid. Previous methods of computing the distance field are not fast enough, especially for moving obstacles. We propose a novel distance field transform method based on an iterative method adaptively performed on an evolving active band. Our method uses a narrow band to store the active grid points being computed. Unlike the conventional fast marching method, we do not maintain a priority queue and instead perform iterative computing inside the band. This new algorithm alleviates the programming complexity and the data-structure (e.g. a heap) maintenance overhead, and leads to a parallel-amenable computational process. While the active band propagates from a starting boundary layer, each grid point stays in the band for a lifespan, which is determined by analyzing the particular geometric property of the grid structure. In this way, we find that the Face-Centered Cubic (FCC) grid is a good 3D structure for distance transform. We further develop a multiple-segment method for the band propagation, achieving the computational complexity of O(mN) with a segment-related constant m.
• We propose a new scheme for enhancing fluid animation with controllable turbulence. An existing fluid simulation from ordinary fluid solvers is perturbed by turbulent variation modeled as a random process of forcing. The variation is pre-computed as a sequence of solenoidal noise vector fields directly in the spectral domain, which is fast and easy to implement. The spectral generation enables flexible vortex scale and spectrum control following a user-prescribed energy spectrum, e.g. Kolmogorov’s cascade theory, so that the fields provide fluctuations in subgrid scales and/or in preferred large octaves. The vector fields are employed as turbulence forces to agitate the existing flow, where they act as a stimulus of turbulence inside the framework of the Navier-Stokes equations, leading to natural integration and temporal consistency. The scheme also facilitates adaptive turbulence enhancement steered by various physical or user-defined properties, such as strain rate, vorticity, distance to objects and scalar density, in critical local regions. Furthermore, an important feature of turbulent fluid, intermittency, is created by applying turbulence control during randomly selected temporal periods.
• We develop a new Lagrangian primitive, named the Langevin particle, to incorporate turbulent flow details in fluid simulation. A group of these particles is distributed inside the simulation domain based on a turbulence energy model with turbulence viscosity. Each particle moves according to the generalized Langevin equation, a well-known stochastic differential equation that describes the particle’s motion as a random Markov process. The resultant particle trajectory shows self-adapted fluctuation in accordance with the turbulence energy, while following the global flow dynamics. We then feed Langevin forces, based on the stochastic trajectory, back to the simulation, which drive the flow with the necessary turbulence. The new hybrid flow simulation method features unrestricted particle evolution requiring minimal extra manipulation after initiation. The flow turbulence is easily controlled, and the total computational overhead of the enhancement is minimal on top of typical fluid solvers.
• To accelerate fluid modeling, we resort to the computational power of the graphics processing unit (GPU). The Lattice Boltzmann Method (LBM) [6] simulates the fluid by solving the Navier-Stokes equations with an explicit numerical scheme. This solver requires only local information from neighbors, which makes it naturally suitable for GPU acceleration. The finite-time Lyapunov exponent (FTLE), used for extracting structural features from the fluid, seeds particles on each grid point, and the divergence of these particles measures the distortion of the fluid. These particles increase the computational intensity, and implementing the algorithm on the GPU largely improves the speed. Finally, fluid decompression has a stricter speed requirement than compression in user applications. The decompression scheme suggested in this dissertation includes frequency transform, velocity reconstruction, advection and particle generation. Each method in this system is parallelized into a new scheme to accommodate the programmability of the GPU.
1.3 Publications
1. Fan Chen, Ye Zhao. Distance Field Transform with an Adaptive Iteration Method. IEEE International Conference on Shape Modeling and Applications (SMI), pages 111-118, Beijing, China, June, 2009.
2. Fan Chen, Ye Zhao, Zhi Yuan. Langevin Particle: A Self-Adaptive Lagrangian Primitive For Flow Simulation Enhancement. Computer Graphics Forum (Eurographics11), 30(4).
3. Fan Chen, Ye Zhao, Zhi Yuan. Spectral Modeling of Divergence-Free Vector Fields. IEEE VisWeek, 2010 (poster).
4. Zhi Yuan, Fan Chen, Ye Zhao, Zhiqiang Wang, Sean Reber, Cheng-Chang Lu. Ad Hoc Compression of Smoke Animation. Submitted to IEEE TVCG.
5. Zhi Yuan, Fan Chen, Ye Zhao. Pattern-Guided Smoke Animation with Lagrangian Coherent Structure. ACM Transactions on Graphics (SIGGRAPH Asia 2011), 30(6), 2011.
6. Zhi Yuan, Ye Zhao, Fan Chen. Incorporating Stochastic Turbulence in Particle-Based Fluid Simulation. The Visual Computer, 28(5), 435-444, 2012, Springer.
7. Zhi Yuan, Ye Zhao, Fan Chen. Stochastic Modeling of Light-weight Floating Objects. The 24th International Conference on Computer Animation and Social Agents, May, 2011. Extended abstract in ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, March, 2011.
8. Ye Zhao, Zhi Yuan, Fan Chen. Enhancing Fluid Animation with Adaptive, Controllable and Intermittent Turbulence. Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation (July, 2010).
CHAPTER 2
Background
2.1 Fluid Simulation
Fluid simulation is generally achieved by solving the incompressible Navier-Stokes (NS) equations [4]. These equations, which apply Newton’s second law to fluid motion, are widely used by animators and researchers in fluid simulation. They read as follows:
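\[ \nabla \cdot \mathbf{u} = 0, \tag{1} \]
\[ \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla P + \nu \nabla^2 \mathbf{u} + \mathbf{F}. \tag{2} \]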
Eqn. 1 is called the incompressibility condition, which ensures that the mass of the fluid is conserved.
In other words, for any point in the fluid volume, the inflow and outflow must sum to zero.
Eqn. 2 is called the momentum equation, which describes the fluid motion driven by pressure and forces. Here u stands for the velocity, t is the time, P is the pressure, ν is the kinematic viscosity coefficient, and F is the external force. The second term on the left of Eqn. 2 is the advection term, expressing that the velocity is advected by itself. The first term on the right means that the velocity changes along the negative gradient of the pressure, so the pressure pushes the fluid from high-pressure regions to low-pressure regions. The second term on the right is the diffusion term, which says that the velocity diffuses under viscosity. The external force can be, for example, gravity or buoyancy.
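As a small illustration of the incompressibility condition in Eqn. 1, the following sketch evaluates a central-difference divergence on a uniform 2D grid. The grid layout, the helper names `divergence` and `sample_field`, and the two test fields are assumptions made for this example only; they are not part of the solvers discussed in this dissertation.

```python
def divergence(u, v, h):
    """Central-difference divergence of a 2D field on interior grid points.

    u[j][i] and v[j][i] are the x- and y-components at grid point (i, j);
    h is the grid spacing. Boundary points are left at 0.0.
    """
    ny, nx = len(u), len(u[0])
    div = [[0.0] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            du_dx = (u[j][i + 1] - u[j][i - 1]) / (2.0 * h)
            dv_dy = (v[j + 1][i] - v[j - 1][i]) / (2.0 * h)
            div[j][i] = du_dx + dv_dy
    return div


def sample_field(fx, fy, n, h):
    """Sample velocity components fx(x, y), fy(x, y) on an n-by-n grid."""
    u = [[fx(i * h, j * h) for i in range(n)] for j in range(n)]
    v = [[fy(i * h, j * h) for i in range(n)] for j in range(n)]
    return u, v


n, h = 8, 0.5
# Rigid rotation (-y, x): an incompressible field, divergence is zero.
u_rot, v_rot = sample_field(lambda x, y: -y, lambda x, y: x, n, h)
# Radial expansion (x, y): a source field whose divergence is 2 everywhere.
u_src, v_src = sample_field(lambda x, y: x, lambda x, y: y, n, h)

max_rot = max(abs(d) for row in divergence(u_rot, v_rot, h) for d in row)
interior_src = divergence(u_src, v_src, h)[3][3]
```

The rotating field satisfies Eqn. 1, while the expanding field violates it, which is the kind of velocity component a pressure projection step would have to remove.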
These partial differential equations (PDEs) can be solved by different methods. The Lattice Boltzmann Method (LBM) [7,8] solves the NS equations on a lattice; it is well suited to simulating flow with complex geometries and can be easily implemented in a parallel environment due to its explicit scheme. However, it requires small time steps and a fine discrete grid to achieve stable and decent results, which may decrease the overall speed. Another grid-based solver, the stable solver [5] proposed by Stam, is unconditionally stable thanks to semi-Lagrangian advection. All these grid-based methods demand a high discretization resolution to achieve realistic results with rich details. Therefore, in large scenes or high-definition applications, the computational cost is very high, which is unsuitable for real-time applications. The emergence of Lagrangian approaches [9], which employ particles to model the fluid, releases the fluid solver from the limitation of the grid (lattice). Each particle carries material properties as it moves in the fluid, and mass conservation is inherently satisfied. Meanwhile, particle-based methods do not suffer from the numerical dissipation introduced by interpolation operations during the advection step. Smoothed Particle Hydrodynamics (SPH) [10] was first proposed by Lucy [11] and by Gingold and Monaghan [12]. Then Reeves [13] introduced the particle-based scheme into computer graphics. Since then, a large amount of work has been done in this field with great success [14–18].
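The trade-off described above, unconditional stability at the price of interpolation-induced dissipation, can be seen in a minimal 1D sketch of semi-Lagrangian advection. The periodic grid, the constant velocity, the function name `advect_semi_lagrangian` and the parameter values are assumptions chosen for illustration; this is not the solver of [5] itself.

```python
import math


def advect_semi_lagrangian(q, vel, dt, h):
    """One semi-Lagrangian advection step on a periodic 1D grid.

    q[i] is the advected quantity and vel[i] the velocity at grid point i.
    Each grid point traces back along the velocity over dt and takes the
    linearly interpolated value found at the departure point.
    """
    n = len(q)
    out = [0.0] * n
    for i in range(n):
        x = (i - vel[i] * dt / h) % n      # departure point in grid units
        base = math.floor(x)
        i0 = int(base) % n
        i1 = (i0 + 1) % n
        w = x - base                       # interpolation weight in [0, 1)
        out[i] = (1.0 - w) * q[i0] + w * q[i1]
    return out


n = 32
q = [1.0 if 8 <= i < 12 else 0.0 for i in range(n)]   # square pulse
vel = [1.0] * n
# dt * vel / h = 3.5 grid cells per step, far beyond an explicit CFL limit.
for _ in range(20):
    q = advect_semi_lagrangian(q, vel, dt=3.5, h=1.0)

peak, total = max(q), sum(q)
```

Because every new value is a convex combination of old values, the result stays bounded for any time step. The same interpolation smears the pulse, so `peak` drops below its initial value of 1 while `total` is preserved for this constant velocity, which is the numerical dissipation that the turbulence enhancement methods in later chapters aim to compensate.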
2.2 Distance Field
In fluid simulation, boundary handling is very important for generating realistic fluid. The representation of an obstacle can affect the final animation of the flow, and an inaccurate representation may make the fluid unrealistic. An obstacle is usually represented by a distance field [19], which is an implicit representation of a geometric shape. The generation of distance fields has been an essential research topic in computer vision, graphics and visualization, as well as applied mathematics [20–24].
A distance field can be generated directly by computing the distance from each point to a geometric primitive. An interactive algorithm computes 3D Euclidean distance fields on the GPU [25] by rasterizing the distance vectors from the points on a slice to the primitives.
Related to our research, the distance transform is a general approach that forms the distance field by propagation from a starting set computed by direct geometric and analytic algorithms. Iteration methods were first applied in the numerical computing field to solve the shape-from-shading problem on the whole domain [26,27]. Another strategy is to propagate the distances to neighbors with special templates through the domain. The template can be designed based on chamfer distance [19] or, more precisely, on vector distance [28]. Fast marching methods [29–31] compute the arrival time of a wavefront expanding in the normal direction at an active band of grid points, which actually solves the Eikonal equation from a given boundary condition. Zhao et al. [32,33] present a sweeping method that solves the Eikonal equation by Gauss-Seidel iterations with alternating sweeping orderings on rectangular grids or meshes. The fast sweeping method is also applied to static convex Hamilton-Jacobi equations [34]. Frisken et al. [35] propose a hierarchical computation for distance field generation. For more details, we refer interested readers to good surveys on 2D [36] and 3D distance transforms [37].
Except for the whole-domain iteration methods, these basic algorithms cannot be easily parallelized due to their computational schemes. Several approaches employ specially designed GPU algorithms, such as a tile-based updating scheme for fast marching [38], domain division for the sweeping method [39], and delicate narrow band packing [40]. Working on polygon meshes, a method based on scan-conversion of the mesh is proposed by Mauch [41, 42]. Sigg et al. improve this algorithm with a hardware implementation [43]. Another approach is to construct a Voronoi diagram, which leads to distance field generation for 2D and 3D polygons [44]. Weber et al. present a parallel algorithm for approximating geodesic distances on geometry images [45]. These methods process the geometric elements independently, and thus can use the parallel nature of graphics hardware to achieve good performance in distance generation. In comparison, our method does not rely on meshes to represent geometric shapes. In our approach, we use the inherent parallel scheme of iteration methods, which provides a wavefront scheme to further improve performance and flexibility.
2.3 Fluid Turbulence
“Turbulence is an irregular motion which in general makes its appearance in fluids, gaseous or liquid”, stated Taylor and von Kármán in 1937. Turbulence is therefore very significant in fluid modeling and a key factor in enhancing the realism of the fluid. However, modeling turbulence is quite difficult due to its irregularity and the high Reynolds number (which characterizes the degree of turbulence).
Also, these fine-scale details are contingent on the resolution of the computing grid, which is usually restricted by computational resources and performance requirements. Moreover, numerical dissipation causes significant energy loss, which further damps details unrealistically. Alternatively, fully particle-based solvers have been used; e.g., Smoothed Particle Hydrodynamics [46] is employed in a large body of research work such as [14–18]. The pure Lagrangian approach usually needs a large number of Lagrangian primitives (particles) distributed in the domain, and it has not been intensively studied in computer graphics for simulating turbulent smoke.
Advanced numerical scheme. Many advanced numerical schemes have been proposed for solving the governing NS equations with reduced energy damping. The advection is replaced by the Lagrangian fluid-implicit-particles (FLIP) method [47] and by higher-order schemes with repeated semi-Lagrangian steps [48, 49]. Furthermore, different numerical schemes are introduced, including the higher-order advection scheme BFECC [48]. Alternative paths include adaptive high-resolution simulation (e.g. [50]), particle fluids (e.g. [14]) and precomputation (e.g. [51]).
The enforced circulation preservation [52] from Stokes’ theorem and an energy-preserving scheme [53] in a finite volume manner provide stable Eulerian solutions on simplicial grids.
These methods commonly work on a stationary grid and require solving a large linear system whose complexity grows rapidly as the grid size increases. The computational cost is reduced by coarsening the grid, in particular for the pressure solver, with an efficient approximation of the pressure gradient on the fine grid [54]. Spatial refinements [50, 55, 56] adaptively provide high-resolution details in parts of the simulation domain, at the cost of extra on-the-fly grid manipulation.
Noise field integration. Fluid turbulence manifests stochastic fluctuation, and direct numerical simulation cannot readily model highly turbulent behavior of this intrinsic nature. Synthetic noise is therefore naturally integrated with the simulated velocities, which reduces cost and creates natural-looking results. A handful of recent approaches use Perlin or wavelet noise to generate spectrum-controllable divergence-free fields with the curl operation [57], which are added to the simulated flow fields. Divergence-free fields for artistic simulation are calculated by a fast simulation noise [58]. Beyond fluids, fractal mountains were created in the frequency domain according to a fractal spectrum [59], an approach that can be applied to fluid turbulence. Forces have also been used in animated fluid control [60–63]. Since such isotropic noise is not directly applicable to fast-evolving anisotropic flow fields, these methods endeavor to manipulate the noise with turbulence parameters computed from special energy transport models.
For this purpose, Schechter and Bridson [64] propose a simple linear model, Kim et al. [65] use locally assembled wavelets, and Narain et al. [66] apply an advection-reaction-diffusion equation. Most recently, a particle-based method was developed [67] to create scalable chaotic effects based on a very coarse grid simulation, using a particularly stretched wavelet noise for turbulence production. The energy transport is solved by a two-equation k−ε model on particles to incorporate anisotropic noise and update particle velocities. This method achieves very fast performance by rendering particles directly. However, it aims at creating very chaotic flows and does not perform well for non-turbulent fluids. These models rely on complex estimation (e.g., statistical Kolmogorov theory [68]) of turbulence evolution from the simulation results. They integrate noise in a post-processing stage. Therefore, extra effort is needed to make the noise coupling temporally consistent with the evolving simulation.
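The curl construction mentioned above can be sketched in 2D: taking the velocity as (∂ψ/∂y, −∂ψ/∂x) of a scalar potential ψ yields a field whose discrete divergence vanishes. In this sketch uniform random values stand in for Perlin or wavelet noise, and the periodic grid and function names are assumptions made for illustration; it is not the noise generator of [57].

```python
import random


def curl_noise_2d(psi, h):
    """Velocity (u, v) = (d(psi)/dy, -d(psi)/dx) on a periodic n-by-n grid."""
    n = len(psi)
    u = [[0.0] * n for _ in range(n)]
    v = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            u[j][i] = (psi[(j + 1) % n][i] - psi[(j - 1) % n][i]) / (2.0 * h)
            v[j][i] = -(psi[j][(i + 1) % n] - psi[j][(i - 1) % n]) / (2.0 * h)
    return u, v


def max_divergence(u, v, h):
    """Largest absolute central-difference divergence over the periodic grid."""
    n = len(u)
    worst = 0.0
    for j in range(n):
        for i in range(n):
            du_dx = (u[j][(i + 1) % n] - u[j][(i - 1) % n]) / (2.0 * h)
            dv_dy = (v[(j + 1) % n][i] - v[(j - 1) % n][i]) / (2.0 * h)
            worst = max(worst, abs(du_dx + dv_dy))
    return worst


random.seed(0)
n, h = 16, 1.0
# Assumption: white noise as the potential; a real system would use a
# band-limited noise so that the vortex scale can be controlled.
psi = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
u, v = curl_noise_2d(psi, h)
max_div = max_divergence(u, v, h)
max_speed = max(abs(x) for row in u for x in row)
```

The divergence vanishes up to rounding because the two central-difference operators commute, so such a field can be added to a simulated velocity without violating the incompressibility condition.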
Energy injection. On the other hand, ongoing flows are also enhanced through turbulence energy injection. Vorticity confinement forces computed at all grid sites increase the rolling features of smoke [69]. Later, manually seeded vortex particles carrying an additional vorticity were used to apply similar rotational forces while the particles stream inside the flow [70]. The carried vorticity is modified through a vorticity-velocity form of the NS equations. The method requires very careful seeding, since it might mistakenly impose unnatural rotation on the flow without the guidance of a physical turbulence model. Pfaff et al. [71] propose sampling vortex particles physically on boundary layers for obstacle-induced turbulence. After seeding, the method solves energy transport equations to determine when the particles should increase or reduce their chaotic agitation, and corresponding heuristic rules are developed for particle merging and splitting. It does not handle fluid streams without objects. Another approach uses spectrally generated divergence-free noise to instigate turbulence conditionally, but without energy evolution. This method adds forces in the whole domain, so it may introduce excessive energy into the system and make it unstable.
2.4 GPU acceleration in Fluid Modeling
The computational power of the graphics processing unit (GPU) has grown very fast in recent years, driven by the demand for real-time 3D graphics. The graphics card is dedicated to speeding up the rendering of graphics data such as polygons and textures, which it processes extraordinarily fast. Compared with the CPU, there is a large gap in floating-point capability. Fig. 1 shows that the peak performance of NVIDIA's GT200 is around 0.9 TFLOPS, while that of the contemporary Intel Harpertown is around 0.12 TFLOPS. The multithreaded architecture of the GPU makes it appropriate for highly parallel and computation-intensive applications. Therefore, besides graphics applications, more and more numerical applications and general computational problems resort to this hardware for acceleration [72].
Figure 1: Floating-Point Operations per Second [1].
In fluid modeling, the explicit LBM solver involves only local information during numerical computation, so it is well suited to massive parallelization on the GPU. Many researchers have worked on accelerating LBM with graphics cards. Zhao et al. [73] implement an LBM solver with Cg on a single GPU. Mattila et al. [74] optimize the memory layout by adopting the swapping technique. With the advent of the Compute Unified Device Architecture (CUDA), programming graphics cards has become much easier. LBM can also be accelerated with MPI and CUDA on a GPU cluster [75,76]. The Finite-Time Lyapunov Exponent (FTLE) field, which extracts fluid features such as flow separation and transport barriers, has been used in visualization to better understand fluid flow. Because FTLE is computationally intensive yet parallelizable, it can be computed efficiently on the GPU through OpenGL [77]. Brunton et al. [78] reduce the redundancy of particle integrations by approximating the particle flow map to speed up the calculation. Barakat et al. [79] adaptively sample and render the FTLE field depending on the view angle, computing on the fly while optimizing memory management and computation resources.
Fluid modeling generates 3D, high-resolution, time-varying data sets. The large data size makes storing and transmitting the animations challenging, so good compression techniques are in demand. Current GPU-based compression methods [80] are not suitable for fluid data since they cannot preserve the small-scale details that play important roles in fluid data.
CHAPTER 3
Distance Field
3.1 Introduction
A discrete distance field provides an implicit representation of a geometric shape, defined by a collection of sampling points inside a domain enclosing the shape. Each sampling point stores the smallest distance from itself to the shape of interest. Usually, these sampling points are grid points on a rectangular grid in a 2D or 3D domain. The distance can be defined in terms of arbitrary metrics, while the Euclidean distance is the most popular model for graphical applications. As an implicit shape modeling scheme, the distance field is widely used in many important applications, such as image segmentation and processing, 3D shape editing, smoothing, morphing, collision detection, topology operations and volume graphics.
Therefore, distance field generation has been an essential research topic in computer vision, graphics and visualization, as well as applied mathematics. A variety of approaches have been proposed to address the problem, which can be described as solving an Eikonal equation. Most recent endeavors adopt a strategy based on the distance field transform. Instead of directly computing the closest distance from every point to the shape, only the grid points belonging to a boundary band close to the shape are computed directly, and the remaining distances are evaluated by propagating distances to the rest of the volume. The distance propagation algorithms can be categorized into two main strategies:
Domain Sweeping: Distance propagation starts from some corners of the rectangular grid and proceeds through the whole domain in a predefined sequential order related to the axial directions. For example, for a 3D rectangular domain $[0..NX] \times [0..NY] \times [0..NZ]$, distance information propagates from $(0, 0, 0)$ to $(NX-1, NY-1, NZ-1)$ by traversing all grid points in the order: first the x-direction, next the y-direction, then the z-direction. Such a distance transform does not consider the arbitrary locations of the starting bands that provide the basis of propagation. Obviously, one such propagating traversal cannot accomplish the task. Typically, several passes of traversal in different directional orders are required.
Front (contour) Propagation: Taking the initial band into consideration, the propagation starts from the grid points inside the band and transfers the known distance information to their neighbors until the whole domain is computed. This adaptive approach guarantees that the distance transform proceeds in increasing order of distance, which is implemented with a special priority data structure (e.g., a sorted list or a heap) storing a sorted active band. The priority data structure orders the grid points used for the transform by their distance values. The scheme only retrieves the grid point with the shortest distance from the band, and propagates the distance to its neighbors not in the band. Thus, it avoids backtracking over previously evaluated grid points, which enables fast marching and correct results.
The sweeping methods go through the $N = NX \times NY \times NZ$ grid points in several (a constant number of) passes, and thus achieve an asymptotic complexity of $O(N)$. In comparison, the front propagation methods have a complexity of $O(N \log N)$ due to the heap maintenance effort.
Though the former seems advantageous, the latter provides a more flexible approach when distances only need to be computed up to a particular limit. When answering a query such as "give me the points with distance smaller than $d_l$", it can simply stop computing during propagation and still provide correct results. A small $d_l$ is usually imposed in many graphical applications, where the front propagation methods can provide a faster response than the sweeping approaches, which have to complete the whole domain computation. Both strategies have been widely studied and have achieved great success; we refer interested readers to a good survey [37] for detailed analysis.
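The early-termination property of front propagation can be illustrated with a minimal sketch. It is not the fast marching method of [31]: it is a simplified heap-ordered propagation on a 2D grid with an 8-neighborhood, in which each point stores an offset vector to its closest known seed. The names `front_propagation`, `seeds`, `shape` and `d_limit` are assumptions made for this illustration only.

```python
import heapq
import math

def front_propagation(seeds, shape, d_limit):
    """Heap-ordered propagation from seed points, stopping at distance d_limit.

    Simplified illustration: seeds are grid points lying on the shape.
    Returns a dict mapping each finalized grid point to its distance.
    """
    nx, ny = shape
    vec = {s: (0, 0) for s in seeds}   # offset from the closest known seed to the point
    heap = [(0.0, s) for s in seeds]   # active band, ordered by distance
    heapq.heapify(heap)
    done = {}
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if d > d_limit:
            break                      # every remaining point is farther than the limit
        if (x, y) in done:
            continue                   # stale heap entry, point already finalized
        done[(x, y)] = d
        vx, vy = vec[(x, y)]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                q = (x + dx, y + dy)
                if (dx, dy) == (0, 0) or q in done:
                    continue
                if not (0 <= q[0] < nx and 0 <= q[1] < ny):
                    continue
                cand = (vx + dx, vy + dy)      # vector propagated to the neighbor
                cd = math.hypot(*cand)
                if q not in vec or cd < math.hypot(*vec[q]):
                    vec[q] = cand
                    heapq.heappush(heap, (cd, q))
    return done
```

Calling `front_propagation([(5, 5)], (11, 11), 2.5)` finalizes only the grid points within distance 2.5 of the seed and never visits the rest of the domain, which is the behavior a sweeping method cannot offer.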
A common disadvantage of both strategies is that neither can be easily parallelized, because both are built on an algorithmic basis of sequential processing. The sweeping methods compute the distance transform on grid points one after another according to particular traversal sequences. The front propagation methods process the active band in a sorted sequential order.
The distance transform can, however, be performed in a parallel computational scheme. A straightforward idea goes back to an approach utilizing an iteration strategy [27]: at a time step $T$, each grid point asks all its neighbors about their current distances, and then updates its own distance by finding the smallest among all the values propagated from these neighbors. The updated distance value is the new distance of each point at time $T+1$. When all the points have a converged distance value, i.e., the value does not change in consecutive time steps, the distance field of the whole domain is obtained. This strategy may take many steps to reach convergence, resulting in relatively slow computational performance. On the other hand, each grid point can be processed concurrently at each time step, which enables parallel computing and is critical for exploiting the computational power of modern multi-core CPUs and other parallel architectures. Nevertheless, this approach is not adaptive and cannot terminate early for queries with a limited distance value.
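A minimal sketch of this whole-domain iteration is given below, assuming a 2D grid, an 8-neighborhood, and seed points lying exactly on the shape; the function name `iterate_until_converged` and its arguments are hypothetical. Each step reads only the values of time $T$ and writes the values of time $T+1$ into a separate buffer, so the loop over grid points could in principle be run concurrently.

```python
import math

def length2(v):
    """Squared Euclidean length of an integer offset vector."""
    return v[0] * v[0] + v[1] * v[1]

def iterate_until_converged(seeds, shape):
    """Iterate over the whole grid until no distance vector changes.

    Returns (distances, number_of_steps). Unset points hold None.
    """
    nx, ny = shape
    vec = {(x, y): None for x in range(nx) for y in range(ny)}
    for s in seeds:
        vec[s] = (0, 0)
    steps = 0
    while True:
        new = {}                       # buffer for time T+1
        changed = False
        for (x, y), v in vec.items():
            best = v
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if (dx, dy) == (0, 0):
                        continue
                    n = vec.get((x + dx, y + dy))   # None if unset or off-grid
                    if n is None:
                        continue
                    # Neighbor's vector plus the step from the neighbor back to (x, y).
                    cand = (n[0] - dx, n[1] - dy)
                    if best is None or length2(cand) < length2(best):
                        best = cand
            new[(x, y)] = best
            if best != v:
                changed = True
        vec = new
        steps += 1
        if not changed:
            break
    return {p: math.hypot(*v) for p, v in vec.items()}, steps
```

Note that every step touches all grid points, including those whose distance is already final, which is the cost the narrow band in our approach is meant to avoid.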
In this dissertation, we propose a new approach that has (1) the programming simplicity and parallel computing capability of an iterative scheme; and (2) the ability to handle particular distance limits of the propagating fronts, which enables query-based computation.
In order to provide front propagation features, we use a similar narrow band to manage the active grid points; however, to allow a parallel computing scheme, these points do not have different priorities as in the fast marching method [31]. Our method utilizes an adaptive iteration on the active band to fulfill the requirements. Without distinguishing points inside the band by a priority-enabled data structure, the algorithm provides correct distance results only if a grid point remains active in the narrow band until its distance can no longer be updated. Therefore, each grid point is assigned a lifespan that governs how long it exists inside the band. The lifespan is determined by the structural features of the domain-decomposition grids. We examine the geometric properties of several grid structures to find the theoretical lifespan of an arbitrary grid point, and then use it to control how long a grid point should stay in the band. Moreover, we exploit a multiple-segment narrow band propagation algorithm to further reduce the complexity and improve the performance.
3.2 Distance Field Transform
3.2.1 Definition
A distance field represents surfaces or curves implicitly and has been broadly used for shape modeling in computer graphics. A distance field is defined as a scalar field that specifies a distance to a shape, where the distance is usually signed to distinguish between the inside and outside of the shape. A data set $X$ representing a distance field to a surface $S$ is defined as $X : \mathbb{R}^3 \to \mathbb{R}$, where for $p \in \mathbb{R}^3$,

$$X(p) = \mathrm{sgn}(p) \cdot \min\{\, |p - q| : q \in S \,\} \qquad (3)$$

where $\mathrm{sgn}(p) = 1$ ($-1$) if $p$ is inside (outside) of $S$, and $|\cdot|$ is the Euclidean norm.

Figure 2: Vector-based distance propagation, panels (a) and (b).

The distance transform computes the distance field of all the points, $p \in \mathbb{R}^3$, from an initial starting set with known distance values. The starting set stores the distance of points in a boundary layer of $S$, which is computed by geometric or analytical algorithms.
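For reference, Equation (3) can be evaluated directly by brute force when the surface is available as a set of sample points. The sketch below assumes a 2D circle of radius 2 sampled at 360 points and a caller-supplied inside test; `signed_distance`, `circle` and `inside_circle` are hypothetical names. It follows the sign convention above (positive inside, negative outside).

```python
import math

def signed_distance(p, surface_points, is_inside):
    """Brute-force evaluation of Eq. (3) against sampled surface points."""
    d = min(math.dist(p, q) for q in surface_points)
    return d if is_inside(p) else -d

# Assumed test shape: a circle of radius 2 centered at the origin.
RADIUS = 2.0
circle = [(RADIUS * math.cos(math.radians(a)), RADIUS * math.sin(math.radians(a)))
          for a in range(360)]

def inside_circle(p):
    return math.hypot(*p) < RADIUS

center_value = signed_distance((0.0, 0.0), circle, inside_circle)   # close to +2
outside_value = signed_distance((3.0, 0.0), circle, inside_circle)  # close to -1
```

This direct evaluation costs one pass over all surface samples per query point, which is why it is normally reserved for the starting set in the boundary layer.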
3.2.2 Vector-Based Distance Transform
We compute the distance transform from one point to its neighbors by a vector propagation method, in which the distance is represented as a vector from the closest point on the surface to a grid point. As illustrated in Figure 2a, a known distance of $P$ is represented as a vector $\overrightarrow{CP}$, where $C$ is the closest point on the surface to $P$. When $P$ propagates to its neighbor $Q$, the distance of $Q$ is computed by

$$\overrightarrow{CQ} = \overrightarrow{CP} + \overrightarrow{PQ}. \qquad (4)$$

Here, $\overrightarrow{PQ}$ is the constant vector between the two neighbors.

Note that the length of $\overrightarrow{CQ}$ is used as the distance of $Q$ propagated from $P$, though theoretically $C$ is the closest point to $P$, not $Q$. This is the assumption of all distance transform methods, which can lead to computational errors related to the grid resolution.
The grid point $Q$ can compute its distance from other neighbors in the same way as from $P$. For example, in Figure 2b, another neighbor $R$ also provides a distance for $Q$, represented by $\overrightarrow{DQ}$. The actual distance of $Q$ for the next step is the one of these distance vectors with the smallest length.
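Equation (4) and the selection among candidates can be written in a few lines. This is a sketch in 2D with integer grid coordinates; `propagate` and `best_candidate` are hypothetical helper names, and the numeric values are made up purely for illustration.

```python
import math

def propagate(vec_p, p, q):
    """Eq. (4): CQ = CP + PQ, where PQ = q - p."""
    return tuple(c + (qi - pi) for c, pi, qi in zip(vec_p, p, q))

def best_candidate(q, neighbors):
    """Pick the shortest vector propagated to q.

    neighbors maps each neighbor position to its current distance vector.
    """
    candidates = [propagate(v, p, q) for p, v in neighbors.items()]
    return min(candidates, key=lambda v: math.hypot(*v))

# Q = (2, 2) receives candidates from P = (1, 2) and R = (2, 3).
Q = (2, 2)
neighbors = {
    (1, 2): (1, 2),    # CP: closest surface point C = (0, 0)
    (2, 3): (0, -1),   # DR: closest surface point D = (2, 4)
}
chosen = best_candidate(Q, neighbors)
```

Here the candidate from $P$ is $(2, 2)$ and the candidate from $R$ is $(0, -2)$, so $Q$ keeps the latter, the shorter of the two.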
3.2.3 Our Computational Scheme
The domain sweeping methods control the directions and orders of propagation. That is, $Q$ is only allowed to compute the distance vector from particular neighbors in each pass, so that after a few different passes, every point achieves its shortest distance to the surface. In comparison, the front marching methods only allow the one existing point with the smallest distance to find its inactive neighbors and propagate. As a result, each neighboring point is guaranteed to acquire its shortest distance (see [31] for a proof). In the conventional iteration method, each point $Q$ of the whole domain computes many temporary distance vectors from all its neighbors and chooses the smallest for the next time step. At an arbitrary time step, $Q$ may not yet have achieved its final distance value. After many iteration steps (usually $O(N)$), all the grid points eventually reach their correct closest distance. Such convergence is achieved when all the points no longer change their distance value (in practice, within a very small error tolerance) in consecutive steps.
In our approach, the iterative computation is applied only to an active narrow band, a small portion of the whole domain, to improve performance. Furthermore, we propose a new algorithm that does not maintain a priority queue inside the narrow band. Our method therefore provides an adaptive iteration method for the distance transform. In the next section, we describe the details of our active band propagation algorithm and the iteration strategy based on a point's lifespan.
3.3 Active Band Scheme
3.3.1 Propagation Procedure
At the beginning, grid points in a boundary layer closely enclosing the shape of interest obtain their distance vectors by direct geometric computation. This starting set initiates the active narrow band, NB. This band stores all the grid points that the evolving distance transform front has reached and whose distance may still be updated in future steps. The band evolves over time steps by adding new points and by removing points that have achieved their final distance.
Assuming the distance $d_P^T$ of each point $P$ inside the band NB is known at a time step $T$, each of its neighbors, $Q$, is considered. Next, we use the method described in Section 3.2.2 to compute a temporary distance of each $Q$. In this way, each $Q$ will have a group of temporary distances, and it will use the one with the smallest magnitude as its distance $d_Q^{T+1}$ at the next time step $T+1$. $Q$ can be a point newly added to NB in $T+1$. It can also be a point already existing in NB before this time step. This illustrates the difference between our method and the fast marching methods.
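One time step of this band propagation can be sketched as follows, assuming a 2D grid with an 8-neighborhood. The function `band_step` and its data layout (a dict of distance vectors and a set of band points) are assumptions for illustration, not the implementation used in this dissertation, and the removal of points from NB is deliberately not modeled.

```python
def len2(v):
    """Squared Euclidean length of an integer offset vector."""
    return v[0] * v[0] + v[1] * v[1]

def band_step(dist_vec, band, shape):
    """Advance the band from time T to T+1 without removing any point.

    dist_vec maps grid points to their distance vectors at time T.
    Returns (new_dist_vec, new_band) for time T+1.
    """
    nx, ny = shape
    proposals = {}   # smallest temporary distance vector proposed to each Q
    for (x, y) in band:
        vx, vy = dist_vec[(x, y)]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                q = (x + dx, y + dy)
                if (dx, dy) == (0, 0):
                    continue
                if not (0 <= q[0] < nx and 0 <= q[1] < ny):
                    continue
                cand = (vx + dx, vy + dy)   # vector propagation to the neighbor
                if q not in proposals or len2(cand) < len2(proposals[q]):
                    proposals[q] = cand
    new_vec = dict(dist_vec)   # values of time T are only read, never overwritten
    new_band = set(band)
    for q, cand in proposals.items():
        # Q may be new to the band or already in it; both cases are handled alike.
        if q not in dist_vec or len2(cand) < len2(dist_vec[q]):
            new_vec[q] = cand
            new_band.add(q)
    return new_vec, new_band
```

Because no ordering among band points is required, the loop over `band` has no sequential dependency within a step, unlike a heap-driven front.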
The remaining challenge is how to remove a point from NB. Because of the narrow band, the iteration convergence rule, which examines whether points no longer update their distances, cannot be applied: the rule is only valid when all points in the whole domain are computed together. Instead, we seek a solution by investigating the structural property of the grid to define the lifespan of a point (i.e., how long a point should stay in NB). In other words, we want to find the time step at which a point has achieved its correct final distance with no chance of being updated again, and then remove it from NB.