Scalable Data Transformations for Low-Latency Large-Scale Data Analysis

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Steven Martin, B.S., M.S.

Graduate Program in Computer Science and Engineering

The Ohio State University

2013

Dissertation Committee:

Han-Wei Shen, Advisor
Roger Crawfis
Raghu Machiraju

Copyright by

Steven Martin

2013

Abstract

Interactive analysis of simulation results has become a mainstay of science and engineering. With continually increasing compute power, the size of simulation results continues to grow. However, network and mass storage device throughput are not increasing as quickly. This introduces difficulties in scaling analysis workflows to take advantage of these new compute resources.

This dissertation describes a body of work that increases the scalability of interactive volume analysis workflows by moving elements strongly dependent on the input data size from the interactive phase of the workflow to the data preparation phase. This reduces the overall computational complexity of the interactive phase, enabling reduced interaction latency. Two related groups of approaches are explored: salience-aware techniques, and techniques for scalable salience discovery.

Salience-aware techniques leverage the tendency for different parts of volume data to be of differing importance. In this dissertation, salience-aware techniques are proposed for load balancing of isosurfacing on clusters and for salience-aware level of detail selection.

In many cases, the salience of data may not be known a priori. Salience discovery techniques seek to facilitate the discovery of salience of different interval volumes. In this dissertation, a technique for iterative salience discovery, in the context of interactive transfer function design on large-scale volumes, is discussed. Supporting that, a technique is described for evaluating distribution range queries.

For both groups of techniques, scalable data transformations are described and target applications are explored. This work streamlines workflows for visualization of large-scale volume data.

Acknowledgments

I would like to thank Professor Han-Wei Shen for the advice and support he has given me as my advisor. I would also like to thank Professors Barb Cutler and W Randolph Franklin at Rensselaer Polytechnic Institute for giving me the chance to learn about academic research. Finally, I would like to thank Pat McCormick at Los Alamos National Laboratory and Thomas Ruge at NVIDIA Corporation for their advice and support during my internships. The work in this dissertation would not have been possible without the feedback, advice, and support of these and many other individuals. It is in that context that "we" is used to refer to the author in this dissertation.

Vita

2007: B.S. Electrical Engineering, Rensselaer Polytechnic Institute
2007: B.S. Computer and Systems Engineering, Rensselaer Polytechnic Institute
2012: M.S. Computer Science, The Ohio State University

Publications

Research Publications

B. Cutler, Y. Sheng, S. Martin, D. Glaser, “Interactive Selection of Optimal Fenestration Materials for Schematic Architectural Daylighting Design”. Automation in Construction, 17, 2008.

S. Martin, H. Shen, R. Samtaney, "Efficient Rendering of Extrudable Curvilinear Volumes". Proceedings of IEEE Pacific Visualization Symposium, 2008.

P. McCormick, E. Anderson, S. Martin, C. Brownlee, J. Inman, M. Maltrud, M. Kim, J. Ahrens, L. Nau, "Quantitatively Driven Visualization and Analysis on Emerging Architectures". SciDAC Journal of Physics, 2008.

S. Martin, H. Shen, “Load-Balanced Isosurfacing on Multi-GPU Clusters”. Proceedings of Eurographics Symposium on Parallel Graphics and Visualization, 2010.

S. Martin, H. Shen, "Histogram Spectra for Multivariate Time-Varying Volume LOD Selection". Proceedings of IEEE Symposium on Large-Scale Data Analysis and Visualization, 2011.

S. Martin, H. Shen, "Interactive Transfer Function Design on Large Multiresolution Volumes". Proceedings of IEEE Symposium on Large-Scale Data Analysis and Visualization, 2012.

S. Martin, H. Shen, "Stereo Frame Decomposition for Error-Constrained Remote Visualization". SPIE Visualization and Data Analysis, 2013.

S. Martin, H. Shen, "Transformations for Volumetric Range Distribution Queries". Proceedings of IEEE Pacific Visualization Symposium, 2013.

Fields of Study

Major Field: Computer Science and Engineering

Table of Contents

Abstract
Acknowledgments
Vita
List of Tables
List of Figures

1. Introduction
   1.1 Salience-Aware Techniques
       1.1.1 Load-Balanced Parallel Isosurfacing
       1.1.2 Level of Detail Selection
   1.2 Techniques for Scalable Salience Discovery
       1.2.1 Transformations for Volumetric Range Distribution Queries
       1.2.2 Interactive Transfer Function Design
   1.3 Contributions

2. Load-Balanced Isosurfacing on Multi-GPU Clusters
   2.1 Related Work
   2.2 Block Distribution Algorithm
       2.2.1 Isosurfacing Cost Heuristic
       2.2.2 Preprocessing
       2.2.3 Profiling
       2.2.4 Assignment
   2.3 Block Isosurfacing Algorithm
       2.3.1 Triangle Counting
       2.3.2 Triangle Creation
       2.3.3 Optimizations
   2.4 Results
       2.4.1 Triangle Counts versus Isosurfacing Time
       2.4.2 Effects of salient isovalue ranges on speedup
       2.4.3 Volume size scalability
       2.4.4 Strong scalability
   2.5 Conclusion

3. Stereo Frame Decomposition for Error-Constrained Remote Visualization
   3.1 Related Work
   3.2 Technique
       3.2.1 Reprojection
       3.2.2 Residual Decimation
       3.2.3 Decimated Residual Codec
       3.2.4 Remapping
       3.2.5 Rate Balancing
   3.3 Error Constraints
       3.3.1 Color Difference
       3.3.2 Transfer Function Distance
       3.3.3 Integrated Transfer Function Contrast
   3.4 Results
       3.4.1 Datasets
       3.4.2 Lossy Codecs
       3.4.3 Lossless Codecs
       3.4.4 Compression Performance
   3.5 Conclusion

4. Histogram Spectra for Multivariate Time-Varying Volume LOD Selection
   4.1 Related Work
   4.2 Level of Detail Selection
       4.2.1 Histogram Spectra
       4.2.2 Weighted Histogram Spectra
       4.2.3 Predicted Error Using Histogram Spectra
       4.2.4 Discretization of Histogram Spectra
       4.2.5 Integer Programming Formulations for LOD Selection
       4.2.6 Greedy Algorithm for Nonlinear Integer Programming Formulation
       4.2.7 Multivariate Considerations
   4.3 Results
       4.3.1 Test datasets
       4.3.2 Running time comparisons
       4.3.3 Visual and statistical comparisons
   4.4 Conclusion

5. Efficient Rendering of Extrudable Curvilinear Volumes
   5.1 Related Work
   5.2 Applications
   5.3 Computational Space Representation
       5.3.1 Data and spaces
       5.3.2 Positional transformations
       5.3.3 Computation of p̄(k), n̂(k), s̄(i, j), and ŷ
       5.3.4 Jacobian matrices
       5.3.5 AMR integration
   5.4 Rendering
       5.4.1 Ray Casting
       5.4.2 Correction loop
       5.4.3 Step length determination
       5.4.4 GPU implementation
   5.5 Results
   5.6 Conclusion

6. Transformations for Volumetric Range Distribution Queries
   6.1 Related Work
   6.2 Technique
       6.2.1 High level overview
       6.2.2 Integral Distribution Function
       6.2.3 Discretization
       6.2.4 Span Distributions
       6.2.5 Storage of Span Distributions
       6.2.6 Approximate Queries with Span Distributions
       6.2.7 Comparing Span Distributions
   6.3 Working Sets in Applications
       6.3.1 Application: Hovmöller diagrams
       6.3.2 Application: Transfer function design
   6.4 Extensions and Conclusion

7. Interactive Transfer Function Design on Large Multiresolution Volumes
   7.1 Related Work
   7.2 Technique
       7.2.1 Cursor Histograms
       7.2.2 Histogram Expressions
       7.2.3 Level of Detail Selection
       7.2.4 Transfer Function Construction
       7.2.5 Interaction
   7.3 Results
   7.4 Conclusion and Extensions

8. Extensions and Conclusion

Bibliography

List of Tables

3.1 Cross correlations were computed between the bitrates for many observed trials.

5.1 Set 1 blocks

5.2 Set 2 blocks

5.3 Dataset memory requirements

5.4 Set 2 rendering times for different minimum step lengths and viewport resolutions

5.5 Set 1 rendering times for different minimum step lengths and viewport resolutions

List of Figures

1.1 Salience-aware techniques can facilitate salience discovery techniques. Similarly, salience discovery techniques can facilitate salience-aware techniques.

2.1 Our approach preprocesses the volume data for a range of salient isovalues to estimate the amount of work required to perform isosurfacing for blocks of the input volume. The blocks are subsequently assigned to GPUs such that the isosurfacing work is more evenly distributed.

2.2 The time required for isosurfacing a single block of a volume varies approximately linearly with the triangle count in the isosurface. The constant factor in the fit line is reduced by applying the optimizations discussed in section 2.3.3.

2.3 The triangle counting and creation process computes vertex buffer offsets for rows of the block of cells being isosurfaced, then applies marching cubes to fill the vertex buffer.

2.4 The blue (dark) surface is isovalue -1.0 within the test volume used for the subsequent graphs and the yellow (light) surface is isovalue +3.0 within the same volume. At a volume resolution of 384x256x256 the yellow surface contains 298858 triangles and the blue surface contains 916337 triangles.

2.5 The salient isovalue ranges substantially affect the performance. In this figure it can be seen that the speedup is improved over ranges of isovalues that are specified as salient. When no cost heuristic is used, the distribution of performance over the isovalue range is not well defined because the effective cost value of the work for each block is equal. Each line has 1100 sample isovalues, computed over a 1536x1024x1024 test volume on 24 GPUs.

2.6 The performance advantage of using our cost heuristic over using no cost heuristic is maintained over the range of loadable volume sizes on a cluster of 24 GPUs. The salient isovalue range used for the cost heuristic is 2.75 to 3.25, resulting in a mean isosurfacing performance on the order of 250 million triangles per second over that range of isovalues. Using no cost heuristic over that same range yields performance on the order of 175 million triangles per second.

2.7 Using our proposed cost heuristic improves scalability, especially when the isovalues for which isosurfaces are being computed are within the salient range. In this figure, the salient range of isovalues used for the computation of the cost heuristic is 2.75 to 3.25 and the volume is 768x512x512 samples.

3.1 The difference between the ground truth and the error-constrained degraded image is wasted information that would need to be transmitted, if lossless encoding were used.

3.2 The framework decomposes the left and right frames into one depth stream, one color stream, and two residual streams in the encoder (§3.2), which are then reconstructed into left and right frames in the decoder. Because the depth stream generally takes much less space than the color stream, and the error introduced by reprojection is small, this yields better performance than encoding the streams separately. Additionally, transmission of partial residuals subject to user-defined error constraints enables fidelity guarantees for visualization applications.

3.3 The per-pixel color magnitudes of the residuals are shown for both eyes, before and after decimation subject to an error constraint, with darker colors meaning greater magnitude.

3.4 Increasing the LCE (left color encoded) bitrate decreases the LRE (left residual encoded) bitrate. Increasing the LDE (left depth encoded) bitrate decreases the RRE (right residual encoded) bitrate. The curves, from top to bottom, have ITFC error constraints of 0, 6, 12, 18, 24, and 32. More-restrictive constraints tend to require higher LCE and LDE bitrates for optimal performance.

3.5 Different types of transfer functions are appropriate for different types of error constraints.

3.6 Stereo rendering of the combustion dataset (§3.4.1) using isosurfacing with a 2D transfer function (figure 3.5b).

3.7 The benefit of using our reprojection technique or a joint coding technique over discrete coding techniques increases as the eye separation is reduced, as explained in §3.4.4. Additionally, the benefit of using the reprojection technique increases as the error constraints are loosened.

4.1 The histogram spectra generator takes a multiresolution bricked volume and generates a histogram spectrum for each subvolume ("brick") of the volume. This is done as a precomputation step in the data preparation phase. The LOD selector then uses that, with a set of user-defined parameters such as intervals of interest, to produce a LOD selection set. The LOD selection can be performed interactively.

4.2 This histogram spectrum of a single plane of a single timestep of the QVAPOR variable of the climate test data set (defined in §7.3) is typical of histogram spectra. Moving up on the vertical axis corresponds to downsampling, and each column corresponds to the potential change in the area of an isosurface as a function of sampling frequency. Columns with brighter colors in this plot correspond to values that are more sensitive to sampling. Rows with brighter colors correspond to sampling frequencies with greater overall, unweighted, error.

4.3 The weighting function is used to control the width of the interval volumes of interest in the context of the level of detail selection. In this example a weighting function was chosen to place importance on the interval of values from 0.0070 to 0.0105. The weighting function is applied over the columns of the histogram spectrum, facilitating the computation of histogram spectrum predicted error as in equation (4.4).

4.4 The RMS error is proportional to the histogram spectrum predicted error. This figure exhibits a test case on the QVAPOR variable of the climate data set (defined in §7.3), and is typical of what we have observed on other data sets. The exact scaling factor to determine the RMS error depends on the units of the data in the field and the norm of the weighting function. However, this does not need to be computed because only the relative differences between errors need to be used in the algorithm discussed in §4.2.6. Because the RMS error is linearly proportional to the histogram spectrum predicted error, the ratio between two RMS errors is the same as the ratio between their corresponding histogram spectrum predicted errors.

4.5 Directly solving the integer programming problem with a general integer programming package is impractical due to the high computational complexity involved in solving the NP-hard problem. Our greedy algorithm as described in §4.2.6 yields nearly identical results with O(N lg N) complexity, where N is linearly proportional to the number of subvolumes.

4.6 Several variables from the climate data set are rendered for a single timestep. The white, opaque parts are clouds defined by the QCLOUD variable. The magenta regions are clouds with high vertical velocities, as determined by the W variable. The yellow exhibits water vapor density as determined by the QVAPOR variable. The volume is a curvilinear volume, with the Z variable of its mesh determined by the MESHZ variable. All of these variables have their levels of detail determined by the level of detail algorithm. Figures 4.6a and 4.6b are generated from the ground truth resolution, while figures 4.6c and 4.6d have levels of detail selected for a 4GiB working set size constraint. Figure 4.6c was generated with narrow intervals of interest, while figure 4.6d was generated with wide intervals of interest. Like in figure 4.9, selecting narrow intervals of interest yields results closer to the ground truth than selecting wide intervals of interest.

4.7 In some cases, with multivariate fields, a user is interested in seeing a variable A where variable B is between B0 and B1. This interval [B0 : B1] is expressed as a weighting function for the histogram spectra of B. The choice of the best weighting function for A depends on the statistical dependence between A and B. If A is not independent of B then we can use the conditional probability density function of A given the case that B lies within [B0 : B1] as a starting point for constructing a weighting function for A. In this example it can be seen that the PDF of the vertical velocity (W) in the climate data set is different for different intervals of the cloud density (QCLOUD).

4.8 This figure shows the error for different working set size constraints, using different error estimators in the LOD selection algorithm. The E_j function in the optimization problem as referenced by equation (4.5) can be approximated using equation (4.4) instead of directly computing the RMS error (RMSE). The prediction of error using the histogram spectra predicted error (HSPE) yields results close to the direct RMS error. By using equation (4.4) with histogram spectra it is possible to avoid loading samples from the source volume when performing LOD selection, substantially improving performance.

4.9 Values of MIXFRAC from the combustion data set within the range [0.45:0.55] are rendered for a single timestep, where values less than 0.5 are blue and those greater than or equal to 0.5 are orange. Figure 4.9a is a crop of an image generated using the ground truth resolution, while figures 4.9b and 4.9c have levels of detail selected for a 250MiB working set size constraint. Figure 4.9b has a weighting function that is 1 for values in the range [0.45:0.55] and 0 elsewhere. Figure 4.9c has a weighting function that is uniformly 1. The narrower interval of interest used for figure 4.9b clearly yields a result closer to the ground truth than the wide interval of interest that was used for figure 4.9c.

4.10 For a fixed working set size constraint, increasing the width of the range of values defining the interval volumes of interest results in increased error. This figure, which was generated using the QVAPOR variable of the climate test data set defined in §7.3, is typical of what we have observed. This is to be expected because a larger interval volume will encompass more samples yet the information density is likely to remain similar. Thus, the narrower the interval volume of interest, the fewer samples are needed to reconstruct the volume with a given level of error.

5.1 Sample volume renderings of data set 1. The left column shows two views of one data component. The right column shows two different AMR level ranges for a different component, with the top image showing levels 0 through 1, the bottom image showing just level 1.

5.2 Sample renderings from data set 2. In clockwise direction from the top left corner are AMR levels 0 through 4, 2 through 4, 3 through 4, and 4.

5.3 Data set 2 volume block bounding wireframes. Each vertex corresponds to a grid-centered position on the boundary. The wireframes demonstrate the curvature and non-uniform cell sizes of the curvilinear space. Level 0 has 8 distinct blocks, level 1 has 24 distinct blocks.

5.4 Data set 1 volume block bounding wireframes. Each vertex corresponds to a grid-centered position on the boundary. The left column shows AMR levels 0 and 1, while the right column shows AMR level 1. The wireframes demonstrate the curvature and non-uniform cell sizes of the curvilinear space. Level 0 has 8 distinct blocks, level 1 has 24 distinct blocks.

5.5 Volume renderings for different minimum step lengths. Each row from left to right shows step lengths 0.001, 0.005, 0.010, 0.050, and 0.100. The top row shows data set 2 and the bottom row shows data set 1. A larger minimum step length decreases required computational time while increasing error.

5.6 Data set 1 running times

5.7 Data set 2 running times

5.8 The positional error (the difference between the original mesh position and the mesh point found with equation 5.1) is proportional to the point darkness in these images. From left to right, the images are of set 1 levels 0 to 1, set 1 level 1, set 2 levels 0 to 4, set 2 levels 3 to 4.

6.1 The preprocessing phase transforms the volume data into metadata using the transformation pipeline in equation (6.2). This requires O(N) working set complexity, for a volume with N elements. In the interactive phase, distribution range queries are evaluated by reading parts of the metadata on demand into the transformation pipeline in equation (6.3). The working set complexity for this phase depends primarily on the query result size rather than the size of the input volume.

6.2 In this example, a 1D integral distribution volume (X_i(s)) is discretized into 8 span distributions (Y_{k,i}) as described in equation (6.9). The span distribution at index 6, for example, is computed by subtracting X_i(5) from X_i(7).

6.3 Distribution range queries are executed by evaluating the integral distribution of each corner of the range using equation (6.10), then combining them using equation (6.8). In this example, the range query is evaluated using 4 span distributions, subtracting the span distributions (Y_{2,i} and Y_{3,i}) that contribute to the X_i(4) integral distribution, and adding the span distributions (Y_{4,i} and Y_{6,i}) that contribute to the X_i(7) integral distribution.

6.4 The Z-order space-filling curve maps a d-dimensional integer coordinate to a 1-dimensional integer coordinate. In this example, a 3D coordinate with 4 bits per component is mapped to a single 1D coordinate with 12 bits.

6.5 Because span distributions take advantage of the similarity between neighboring integral distributions for storage, they take considerably less space, even for lossless reconstruction. Additionally, by dropping some of the span distribution levels, the size can be further reduced at the cost of being lossy. In this case the distributions were represented by 64 bin histograms on 3D computational fluid dynamics volume data.

6.6 By dropping some levels, which results in queries being approximate, the size of the span distributions necessary can be reduced. This can reduce the working set size of an application. In this case the distributions were represented by 64 bin histograms on 3D computational fluid dynamics volume data.

6.7 Both the size of the levels, and the number of span distributions in the levels, exponentially decrease as the level number increases. The ratio between the size of the span distributions and the number of span distributions enables modeling of the entropy per span distribution.

6.8 The relationship between the error bound and the stored size for varying numbers of levels skipped.

6.9 Out-of-core data, query time, randomly positioned and sized queries, 2016MiB source data. The majority of the time spent in this test was I/O. Reducing the working set reduces the demands on storage devices, improving performance.

6.10 Storing the integral distributions directly, sampled on a uniform grid, can take considerably more space than storing compressed span distributions. Span distributions also permit the dropping of levels, which reduces the data size, at the cost of accuracy.

6.11 Out-of-core data, query time transient response, randomly positioned and sized queries, 64MiB source data. The majority of the time spent in this test was I/O for the top two lines of the graph. For the bottom two lines I/O has a substantial impact at the left end of the graph, but this effect is quickly reduced as the file cache warms. Using span distributions reduces the working set size required over performing direct queries. Reducing the number of levels used for span distributions reduces the working set as well. Reducing the working set reduces the demands on storage devices and reduces file cache rates, improving performance.

6.12 Approximate sum aggregation of 3D volumes for Hovmöller diagrams as discussed in §6.3.1. The horizontal axis is longitude and the vertical axis is time. The tolerance provides a bound on how far the approximate sums may be from the true sums, in terms of the value of the sum. The dataset is from a simulation produced by the Pacific Northwest National Laboratory to examine the Madden-Julian Oscillation [37].

6.13 Interactive transfer function design for large-scale time-varying volume data, using interactive 4D distribution range queries, as discussed in §6.3.2. The user moves a region of interest in the left pane on a projection of the volume. The distribution of the region of interest is then used to generate transfer functions in the right pane, using the technique discussed in chapter 7.

7.1 Level of detail selection and transfer function design both depend on interval salience.

7.2 An example of the technique being applied to the Flame test volume, discussed in section 7.2.5.

7.3 The performance as a function of volume size and working set size is largely a function of the working set size, rather than the volume size, facilitating scalability for large-scale data.

7.4 An example of the per-frame performance, as a function of running time, for a test run using the 62GiB Nek dataset with a 600MiB working set limit. In this case cursors are being moved around and expressions edited, yielding incremental updates to the target histogram.

Chapter 1: Introduction

Interactive analysis of simulation results has become a mainstay of science and engineering. With continually increasing compute power, the size of simulation results continues to grow. However, network and mass storage device throughput are not increasing as quickly. This introduces difficulties in scaling analysis workflows to take advantage of these new compute resources. This dissertation describes a body of work that increases the scalability of interactive volume analysis workflows by moving elements strongly dependent on the input data size from the interactive phase of the workflow to the data preparation phase.

This work addresses this problem by breaking it into two aspects: salience-aware techniques for scalability enhancement, and techniques for scalable salience discovery. For both aspects, data transformations are applied in the preprocessing phase of workflows to decrease the work that needs to be done in the interactive phase.

Chapters 2 and 4 discuss salience-aware techniques, while chapters 6 and 7 propose techniques for salience discovery. Chapter 3 proposes a technique that can be used to enable interactive remote use of the technique discussed in chapter 2. Chapter 5 describes a technique that can utilize the techniques proposed in chapters 7 and 4. Potential paths for extensions are discussed in chapter 8.

The following sections provide a brief summary of the proposed salience-aware techniques, techniques for salience discovery, and a discussion of contributions.

Figure 1.1: Salience-aware techniques can facilitate salience discovery techniques. Similarly, salience discovery techniques can facilitate salience-aware techniques.

1.1 Salience-Aware Techniques

Salience-aware techniques leverage the tendency for different parts of volume data to be of differing importance. In this dissertation, salience-aware techniques are proposed for load balancing of isosurfacing on clusters and for salience-aware level of detail selection.

1.1.1 Load-Balanced Parallel Isosurfacing

Isosurface extraction is a common technique applied in scientific visualization. Isosurfaces are often rendered to show structures indicated by surfaces over which a particular value is uniform. In many cases, a user has an idea of what isovalue ranges may be reasonable for the extraction of features of interest, but may not know exactly what isovalues should be used. As the user interacts with the visualization platform, they may change the isovalue of interest. This effectively changes the salient interval of interest.

In the case of parallel isosurfacing algorithms such as marching cubes with empty space skipping, the amount of work required to compute the isosurface for a block of cells depends on the number of triangles in the isosurface. Thus, if the surfaces are not evenly distributed through the volume and a distributed-data strategy is used for parallelization, the load may be unevenly balanced. However, direct estimation of triangle counts would require access to the data, if no metadata is available, making adaptive load balancing impractical in an interactive workflow. In chapter 2, a solution is proposed that generates metadata in a preprocessing phase to facilitate fast load balancing for a given isosurface.

The input volume is broken into subvolumes, all of which are available on all nodes of a compute cluster. During the preprocessing phase, for each subvolume, metadata is generated. This metadata stores the number of triangles in the isosurface for each of a set of isovalues.

During the interactive phase, the user is able to dynamically change which interval volumes are salient. With this metadata, it is possible to estimate the mean number of triangles within these interval volumes, without needing to directly compute the isosurfaces during the interactive phase. Because the work required to compute an isosurface is proportional to the number of triangles in the isosurface, this can be used to estimate the amount of computational work required to compute isosurfaces for a given subvolume, assuming that the isosurfaces are chosen from the salient interval volumes. This estimation is used to enable balanced assignment of subvolumes to cluster nodes to maximize performance.
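To make this estimation concrete, the following is a minimal Python sketch of how the expected triangle count of one subvolume might be derived from its precomputed metadata over user-selected salient intervals. The function and array names are illustrative assumptions, not the exact implementation described in chapter 2.

```python
import numpy as np

def expected_triangles(probe_isovalues, probe_counts, salient_ranges, samples_per_range=32):
    """Estimate the mean isosurface triangle count of one block over the
    user-selected salient isovalue ranges, using only precomputed metadata.

    probe_isovalues, probe_counts: per-block metadata from preprocessing
      (triangle counts sampled at a sorted set of probe isovalues).
    salient_ranges: list of (lo, hi) isovalue intervals marked salient.
    """
    estimates = []
    for lo, hi in salient_ranges:
        # Interpolate the sampled counts at evenly spaced isovalues in the range.
        queries = np.linspace(lo, hi, samples_per_range)
        counts = np.interp(queries, probe_isovalues, probe_counts)
        estimates.append(counts.mean())
    # The mean over all salient ranges approximates the expected work for this block.
    return float(np.mean(estimates))
```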

With these isosurfaces being computed on a cluster, and the number of triangles in the isosurfaces possibly being much larger than the number of pixels in a given rendering, remote visualization solutions can be useful. A remote visualization solution appropriate for this cluster-hosted isosurfacing technique is discussed in chapter 3. This solution applies an error-constrained stereo video compression algorithm to remote visualization.

1.1.2 Level of Detail Selection

Multiresolution volumes are commonly used to enable effective visualization of very large datasets without requiring the entire dataset to be loaded at full detail. Computing the optimal level of detail selection for a given size constraint requires the ability to estimate the effects of downsampling on the fidelity of a given level of detail. Additionally, it is common that the salience of different interval volumes may change during the interactive cycle of the visualization workflow.

Histogram Spectra, described in chapter 4, are a type of metadata designed to enable this LOD selection. By precomputing the effect of downsampling on the distributions of regions of the volume, level of detail optimization can be performed quickly, during the interactive phase of the workflow.

The input volume is broken into cubic subvolumes. For each subvolume a histogram spectrum is generated. A histogram spectrum stores the difference between a histogram of the ground truth level of detail and the histogram of each reduced level of detail. By precomputing histogram spectra during the preprocessing phase, fast estimates of the effects of downsampling on distributions within each block can be made in the interactive phase. Because this metadata is relatively small, this enables users to obtain effective level of detail selections while interactively selecting different interval volumes of interest.
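The following is a simplified sketch of this idea for a single subvolume: histogram the full-resolution block, histogram each reduced level, and record the per-bin differences, which a salience weighting then turns into a predicted error. The strided downsampling and weighting here are placeholder assumptions; chapter 4 gives the actual formulation.

```python
import numpy as np

def histogram_spectrum(block, num_levels, bins=64, value_range=None):
    """Sketch of a histogram spectrum for one cubic subvolume: per-bin
    differences between the ground-truth histogram and the histogram of
    each reduced level of detail (here, simple strided downsampling)."""
    if value_range is None:
        value_range = (float(block.min()), float(block.max()))
    ref, _ = np.histogram(block, bins=bins, range=value_range, density=True)
    spectrum = np.empty((num_levels, bins))
    for level in range(1, num_levels + 1):
        stride = 2 ** level
        coarse = block[::stride, ::stride, ::stride]
        h, _ = np.histogram(coarse, bins=bins, range=value_range, density=True)
        spectrum[level - 1] = np.abs(h - ref)
    return spectrum

def predicted_error(spectrum, weights):
    """Weighted error per level: the salience weights over value bins select
    the interval volumes of interest (cf. the weighting function in chapter 4)."""
    return spectrum @ weights
```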

1.2 Techniques for Scalable Salience Discovery

In many cases, the salience of data may not be known a priori. Salience discovery techniques seek to facilitate the discovery of salience of different interval volumes. In this dissertation, a technique for iterative salience discovery, in the context of interactive transfer function design on large-scale volumes, is discussed. Supporting that, a technique is described for evaluating distribution range queries.

1.2.1 Transformations for Volumetric Range Distribution Queries

Distribution range queries are one type of volumetric range query, aggregating the contents of subregions into distributions. Fast evaluation of distribution range queries can be used to facilitate interactive transfer function design, classification, and aggregation. However, direct evaluation of distribution range queries on out-of-core volume data without a priori knowledge of the distribution requires a working set proportional to the size of the range. For large out-of-core data this will prohibit interactivity.

Chapter 6 proposes a transformation framework to facilitate distribution range queries. A technique is proposed within this framework to transform volume data into span distributions, a form of metadata that facilitates distribution range queries. Example applications are then explored using the technique.

During the preprocessing phase of a workflow, the technique generates metadata that enables distribution range queries to be evaluated quickly during the interactive phase of a workflow. The concept of integral distributions is introduced as a way to quickly estimate distributions of ranges of a volume. Integral distributions store the distribution of the subvolume of a volume ranging from the origin to a given point within the volume. Using only a few integral distributions, it is possible to evaluate the distribution of ranges of a volume. However, direct storage of integral distributions is impractical due to their size. Thus, the metadata is stored in the form of span distributions, a decomposition of integral distributions intended to increase storage efficiency and facilitate approximate queries.
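As an illustration of the integral distribution idea in one dimension, the sketch below builds a cumulative histogram array so that the distribution of any range follows from two lookups and a subtraction. This is a deliberately simplified, assumed representation; chapter 6 defines the multidimensional formulation and the span distribution storage that replaces it.

```python
import numpy as np

def integral_distributions_1d(data, bins=64, value_range=(0.0, 1.0)):
    """Sketch: X[i] holds the histogram of data[0:i], so the distribution of
    any range [a, b) is X[b] - X[a]. Span distributions (chapter 6) replace
    this direct storage with a more compact decomposition."""
    n = data.shape[0]
    X = np.zeros((n + 1, bins))
    lo, hi = value_range
    for i in range(n):
        bin_index = min(max(int((data[i] - lo) / (hi - lo) * bins), 0), bins - 1)
        X[i + 1] = X[i]
        X[i + 1, bin_index] += 1
    return X

def range_distribution(X, a, b):
    """Histogram of data[a:b], evaluated from two integral distributions."""
    return X[b] - X[a]
```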

The technique is subsequently applied in two applications, one being the transfer function design technique discussed in chapter 7, and the other being the construction of Hovmöller diagrams, a tool used in meteorology.

1.2.2 Interactive Transfer Function Design

Direct volume rendering is widely used in the visualization of volume data. Key to the creation of high quality visualizations using DVR is the construction of effective transfer functions. Interactive, semi-automatic transfer function design seeks to leverage users' domain-specific knowledge to progressively develop value saliency. Interactive transfer function design techniques rely on iterative refinement of parameters to a transfer function generation algorithm, based upon visual feedback to the user, potentially requiring interactive direct volume rendering.

If interactive transfer function design is to be performed on large-scale multiresolution data on workstations and direct volume rendering is to be applied, then level of detail selection will be necessary during the interactive workflow. However, effective level of detail selection depends on having knowledge of data salience, while interactive transfer function design seeks to incrementally develop saliency to construct transfer functions, introducing a cyclic dependency. Chapter 7 describes an interactive transfer function design technique that enables the combination of histogram range queries, like those supported by the technique described in chapter 6, with the level of detail selection technique described in chapter 4. This enables incremental, interactive transfer function design on large-scale volumes.

This work provides a system where users select regions of interest within volume data. For a given region of interest, a histogram of the source data is computed (optionally using the technique described in chapter 6). The histograms of different regions of interest are then combined into a single target histogram using a user-defined histogram expression. These target histograms are then used to define a transfer function. Values within the volume with large counts in the target histogram receive greater opacity and contrast. Simultaneously, the target histograms are used to define salience in the level of detail selection scheme discussed in chapter 4. By interactively manipulating regions of interest and histogram expressions, a user can progressively identify which regions of a volume may or may not be salient.
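A minimal sketch of the target-histogram idea follows. The expression interface and the opacity mapping here are illustrative assumptions only; chapter 7 defines the actual histogram expression language and transfer function construction.

```python
import numpy as np

def target_histogram(region_histograms, expression):
    """Combine per-region histograms into a target histogram using a
    user-defined expression, e.g. lambda h: h["A"] - 0.5 * h["B"].
    Negative counts are clipped so the result remains a valid histogram."""
    return np.clip(expression(region_histograms), 0.0, None)

def opacity_transfer_function(target, max_opacity=0.8):
    """Map value bins with larger target-histogram counts to greater opacity."""
    peak = target.max()
    if peak == 0:
        return np.zeros_like(target, dtype=float)
    return max_opacity * target / peak
```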

An associated volume rendering technique for multiresolution curvilinear volumes is presented in chapter 5. This technique is an example of a technique that can leverage the proposed interactive transfer function design framework in the context of large data. It describes a novel volume rendering system that enables ray casting of multiresolution curvilinear volumes on GPUs. Transfer function design can be driven by the technique discussed in chapter 7, and level of detail selection can be performed using the technique described in chapter 4.

1.3 Contributions

The following is a summary of the contributions in each chapter:

Chapter 2: Load-Balanced Isosurfacing on Multi-GPU Clusters: The core contribution of this work is the proposal of a technique that enables fast salience-aware load balancing of isosurfacing on clusters.

Chapter 3: Stereo Frame Decomposition for Error-Constrained Remote Visualization: The core contribution of this work is a technique for video-based stereo remote visualization, where visualization-aware error constraints are applied to the frames.

Chapter 4: Histogram Spectra for Multivariate Time-Varying Volume LOD Selection: The primary contribution of this work is the introduction of the concept of histogram spectra, which are a form of metadata that can be used to facilitate salience-aware level of detail selection.

Chapter 5: Efficient Rendering of Extrudable Curvilinear Volumes: The core contribution of this work is a transformation and associated rendering algorithm that can be used to enable efficient rendering of extrudable curvilinear multiresolution volumes on GPUs. The technique can utilize the other level of detail selection and transfer function design techniques in this dissertation.

Chapter 6: Transformations for Volumetric Range Distribution Queries: This work has three main contributions. First, a framework is proposed that generalizes existing work to support distribution range queries. Then, a specific technique within this framework that supports distribution range queries is proposed for volume data defined on regular grids. Finally, proposals are made for how to apply integral and span distributions to reduce working set complexity in different salience discovery applications.

Chapter 7: Interactive Transfer Function Design on Large Multiresolution Volumes: The core contribution of this work is a technique that facilitates salience discovery on large-scale data by enabling interactive, incremental construction of target histograms that can simultaneously be used to support transfer function construction and salience-aware level of detail selection.

All of the work in this dissertation has been published and peer-reviewed, as listed in the vita section, in the course of my doctoral studies. The following chapters present the above contributions in detail.

Chapter 2: Load-Balanced Isosurfacing on Multi-GPU Clusters

Isosurface extraction is a common technique applied in scientific visualization. Isosurfaces are often rendered to show structures indicated by surfaces over which a particular value is uniform. Additionally, it is often of use to have the triangle data of these surfaces available for the computation of quantities such as surface area. In many cases, a user has an idea of what isovalue ranges may be reasonable for the extraction of features of interest, but may not know exactly what isovalues should be used. Thus, providing fast isosurfacing of a particular subset of potential isovalues can be of particular utility.

As scientists have sought to increase simulation accuracy, the quantity of data produced has increased commensurately. Analysis tools, including those that provide isosurfacing, must scale to support this increased volume of data.

Over recent years, a transition has been seen toward hierarchical parallelism, both in terms of memory and processors. Even single PCs often contain multiple CPUs and GPUs, with each GPU containing multiple stream processors. Clusters add an additional level within the hierarchy. Challenges are introduced not only by the hierarchical nature of the compute resources, but also by the diversity of interconnects between them.

Making the most of these compute resources requires deciding the levels within the hierarchy at which subdivision of work and data-dependent distribution of work is appropriate. Several considerations must be made:

• Hardware constraints: Limits are often imposed on the local memory available in different elements of the compute resources, and there are often substantial disparities between processor speed, available local memory, and interconnect speed.

• Required result constraints: Results from an isosurfacing algorithm should be in a format appropriate for how they will be used. For example, triangles from an isosurfacing algorithm should be stored in a buffer with an appropriate format for rendering, if rendering is required.

• Preprocessing cost: The resources consumed, both in time and space, by preprocessing must be warranted by the expected gains in usability.

• Efficient scalability: Algorithms must scale well with increased data size and compute resources, while also having reasonable absolute speeds for the range of expected target data sizes and systems.

We propose an approach, exhibited in figure 2.1, that evenly distributes isosurfacing work to multiple GPUs in a cluster, taking into consideration user-defined salient isovalue ranges. The approach then applies our efficient parallel isosurfacing algorithm on each GPU. A modest amount of preprocessing enables efficient distribution of work.

This chapter is organized as follows. Section 2.1 describes related work. Details of the isosurfacing cost heuristic are discussed in section 2.2.1. Then, details of the work distribution and isosurfacing algorithms are discussed in sections 2.2 and 2.3, respectively. Finally, results and conclusions are discussed in sections 2.4 and 2.5.


Figure 2.1: Our approach preprocesses the volume data for a range of salient isovalues to estimate the amount of work required to perform isosurfacing for blocks of the input volume. The blocks are subsequently assigned to GPUs such that the isosurfacing work is more evenly distributed.

2.1 Related Work

A commonly applied tool in scientific visualization, isosurfacing has been well explored in research literature. Two broad groups of isosurfacing techniques exist: those that explicitly generate geometric primitives for the surfaces, and those that provide for direct rendering of the surfaces without necessarily generating geometric primitives for the entire isosurface. The former has advantages in cases where the geometric primitives are necessary or when the same surface is to be viewed from many different views. The latter has advantages in situations where the surface geometry is not needed or there are a limited number of views of interest and there is significant occlusion exhibited in those views. Our technique is among those in the former category, explicitly generating geometric primitives for isosurfaces.

Among techniques in the former category, the marching cubes technique, introduced by Lorensen, et al. [66], has become the ubiquitous solution. Further improvements on the core technique have been proposed by Nielson, et al. [76]. An in-depth discussion of potential improvements to the marching cubes algorithm is given by Lopes, et al. [65].

In the original marching cubes algorithm, even cells without an isosurface in them are scanned. One approach used in avoidance of this is the use of hierarchical spatial data structures. Wilhelms, et al. [121] propose using octrees, Livnat, et al. [62] propose a kd-tree-based method, and Dyken, et al. [17] extend the concept to a hierarchy of histograms to assist in efficient isosurface extraction. Itoh, et al. [46] propose another method using contour trees to accelerate isosurfacing for unstructured volumes, skipping empty cells. Shen, et al. [100] apply an algorithm utilizing the minimum and maximum values for groups of cells, in the context of unstructured data, to reduce unnecessary empty cell scanning. Another approach is a technique introduced by Gallagher [23] in which values are bucketized to facilitate faster searching.

Several techniques have been developed to explicitly generate isosurface geometry using GPUs. Tatarchuk, et al. [107] describe a technique using GPU geometry shaders to generate triangle geometry for tetrahedral volumes and tetrahedralized hexahedral volumes. Dyken, et al. [17] apply histopyramids [132] to accelerate marching cubes isosurfacing on GPUs. Marching cubes is implemented directly in vertex shaders by Goetz, et al. [31] and further enhanced with span-space acceleration techniques by Johansson, et al. [49]. Pascucci [80] and Klein, et al. [53] propose implementations of the marching tetrahedra algorithm on GPUs.

Many techniques have been developed that do not explicitly generate isosurface geometry. One of the simplest methods is to perform volume rendering with a transfer function that exposes the isovalues. A further refinement is to apply volume ray casting, where the intersections with the surfaces in the interpolated cells are computed, then illuminated using common illumination models such as Phong's illumination model [83]. One such example of a technique using ray tracing to render isosurfaces is proposed by Parker, et al. [79]. Point splatting based techniques such as those proposed by Co, et al. [12] and Livnat, et al. [63] can also be applied. Röttger, et al. [89] describe how cell projection, a technique often used for volume rendering, can be applied to isosurfacing. Another common approach is to generate view-dependent geometry that does not necessarily include the entire isosurface, taking into account occlusion. Gao, et al. [25] propose one such technique where triangular geometry is directly generated in areas that pass a GPU-accelerated occlusion test.

While the fundamental marching cubes algorithm can easily map to data-parallel architectures under limited circumstances, a naive mapping can be very inefficient if the distribution of the isosurfaces throughout the volume is nonuniform. Additionally, many of the above techniques introduce acceleration data structures which add an additional degree of complexity to parallelization of the isosurfacing algorithms. Due to these concerns, and the ever increasing sizes of datasets to be analyzed, parallel isosurface extraction has been widely explored.

Gao, et al. [24] propose a parallel view-dependent isosurfacing algorithm using occlusion culling, combining hierarchical data structures with image space partitioning. Hansen, et al. [38] propose an algorithm that assigns individual cells in the volume to Connection Machine virtual processors, a concept that exists in a similar sense in the context of OpenCL-capable GPUs. Shen, et al. [99] extend the span space isosurfacing acceleration algorithm to a MIMD system by using a lattice-based search structure distributed to different processing elements. Zhang, et al. [130] and Chiang, et al. [10] both seek to provide an infrastructure for out-of-core rendering of isosurfaces on clusters. Gerstner, et al. [30] provide a strategy for distributing work for their hierarchical tetrahedral grid isosurfacing technique to processors in an SMP system.

Zhang, et al. [129] use a similar cost heuristic, in the context of out-of-core isosurfacing, to what is applied in our technique. However, their technique considers active cells rather than triangle counts, and it uses hard-coded coefficients while ours profiles the target system and uses linear regression to estimate the coefficients. Isosurface statistics, as discussed by Scheidegger, et al. [97], could be applied in the computation of a cost heuristic. However, we found a sampling of triangle counts to provide sufficient information for cost determination in our application while being substantially less complex.

Our technique directly generates triangular geometry for isosurfaces using marching cubes. With clusters having multiple nodes, each with multiple GPUs, and each GPU having multiple stream processors, we operate on two levels of parallelism: node-level and GPU-level. A data-parallel model is used for work distribution. To distribute load at the node level we use a cost heuristic based on profiling information to assign large blocks of cells from the volume to GPUs. To distribute load at the GPU level we apply data-parallel algorithms to each block [40], subdividing the block into rows. Our algorithm combines the simplicity of marching cubes with data-parallel algorithms to enable balanced fine-grained parallelism at the GPU level. Simultaneously, coarse-grained parallelism is applied at the node level using heuristics to provide for effective load balancing with minimal overhead.

2.2 Block Distribution Algorithm

The input data is treated as an array of cuboid blocks of cells, and it is assumed that the input data is too large to fit entirely on any one node in the cluster. The goal of a block distribution algorithm is to assign these blocks to different GPUs in the cluster such that the load will be balanced for subsequent isosurfacing operations. The blocks need to be assigned to GPUs without having to load the blocks explicitly on every node.

The block distribution algorithm consists of three phases: preprocessing, profiling, and assignment. The preprocessing phase collects data-centric information needed to compute the cost heuristic, such as the triangle counts for different isovalues in different blocks. The profiling phase collects machine-centric information needed to compute the cost heuristic for the target machine. The assignment phase assigns blocks to GPUs across the cluster, given a user-defined range of salient isovalues and the cost heuristic.

2.2.1 Isosurfacing Cost Heuristic

An isosurfacing cost heuristic is required to estimate the amount of time it will take to compute isosurfaces for a block of cells in the volume. Some critical design requirements for such a heuristic are:

• It must enable estimation of the amount of time a block will take to compute. Even blocks with very few triangles may take substantial time.

• Not all of the data can be loaded every time we want to evaluate the heuristic. Instead, the heuristic must be computable with a value extracted from a simple, compact metadata representation produced by preprocessing.

• The heuristic should reflect the hardware platforms being used. The relationship of the overhead associated with starting the isosurfacing of a block to the actual isosurfacing work for a block may vary from platform to platform.

• It must be well-conditioned. We cannot have a heuristic that produces unreasonably large changes in its estimates for relatively small changes in the input metadata.

In our experiments we found a linear correlation between triangle count and isosurfacing time, as exhibited in figure 2.2. Preprocessing can easily be performed to estimate triangle counts for different isovalues in different blocks of an input volume, which can then be stored as metadata. Requiring only the generated metadata rather than the entire volume, this enables fast and accurate estimation of a cost heuristic for isosurfacing a given block for a given isovalue.

[Figure 2.2 plot: isosurfacing time in milliseconds versus thousands of triangles in the isosurface.]

Figure 2.2: The time required for isosurfacing a single block of a volume varies approximately linearly with the triangle count in the isosurface. The constant factor in the fit line is reduced by applying the optimizations discussed in section 2.3.3.

2.2.2 Preprocessing

Preprocessing is performed once per data set, in a standalone cluster-aware program.

The preprocessing phase determines the triangle count for a range of isovalues for each block of cells. The probe isovalues used for determining the triangle counts should be chosen so that they evenly cover the histogram of data values. This provides a representative sampling of potential isovalues. Our approach is to uniformly distribute the blocks across the cluster, with one process per CPU. For each block, the data values in the block are sorted in ascending order. To find M different probe isovalues for N sample values, we choose an isovalue to be the value at every N/(M − 1)'th value in the sorted list.

For each of these probe isovalues we iterate through the data cells, on the CPU, to find the number of triangles that would be returned from the marching cubes algorithm. This is accomplished by classifying the cells into different marching cubes cases, then using those classifications, per cell, to look up triangle counts from a table. It is not necessary to explicitly compute the isosurfaces, as the triangle counts are sufficient. The resulting mapping of isovalues to isosurface triangle counts is aggregated and then written to a file, with a set of M entries for each block.
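A minimal sketch of this preprocessing step follows. The count_triangles callable is a hypothetical stand-in for the per-cell marching cubes case classification and table lookup; it is not the implementation described above, only an assumed interface for illustration.

```python
import numpy as np

def probe_isovalues(block, M):
    """Choose M probe isovalues that evenly cover the block's histogram of
    values: sort the samples and take evenly spaced ranks, i.e. every
    (N/(M-1))'th value in the sorted list."""
    values = np.sort(block.ravel())
    indices = np.linspace(0, values.size - 1, M).astype(int)
    return values[indices]

def block_metadata(block, M, count_triangles):
    """Per-block metadata: (isovalue, triangle count) pairs. count_triangles
    stands in for classifying each cell against the marching cubes case table
    and summing the table's per-case triangle counts; no geometry is built."""
    return [(float(v), int(count_triangles(block, v))) for v in probe_isovalues(block, M)]
```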

2.2.3 Profiling

The goal of the profiling phase is to determine the unknowns in the linear function mapping triangle count to the cost. The approach to do this needs to be reasonably inexpensive, but at the same time able to come up with reasonably confident estimates for the heuristic. Additionally, the approach must be appropriate for the block sizes used when subdividing the data for distribution to GPUs.

Our system generates a test volume of a size similar to that of a block. In our test cases a block size of 128³ was used, but others could be used subject to the compromise discussed in §2.4.1. This synthetic test volume is sufficient so long as it provides for a diversity of triangle counts for different isovalues. The volume samples are generated by superimposing sinusoidal waves with random frequencies, directions, and amplitudes. This results in a reasonably complex volume for isosurfacing. We then compute the isosurfaces for isovalues ranging from the minimum to the maximum value in this generated field, estimating the time it takes for each. A linear least squares fitting is used to fit a linear function to these results, mapping triangle counts to expected times.
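The following sketch illustrates this profiling step: generate a synthetic block from superimposed random sinusoids, time an isosurfacing routine over a sweep of isovalues, and fit the linear cost model by least squares. The isosurface callable is an assumed stand-in for the GPU isosurfacer; the sizes and wave parameters are illustrative defaults, not the values used in the dissertation.

```python
import time
import numpy as np

def synthetic_block(size=128, num_waves=16, seed=0):
    """Superimpose sinusoidal waves with random frequencies, directions, and
    amplitudes to produce a reasonably complex test volume."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, size)
    z, y, x = np.meshgrid(axis, axis, axis, indexing="ij")
    field = np.zeros_like(x)
    for _ in range(num_waves):
        direction = rng.normal(size=3)
        direction /= np.linalg.norm(direction)
        freq = rng.uniform(1.0, 16.0) * 2.0 * np.pi
        amp = rng.uniform(0.2, 1.0)
        phase = rng.uniform(0.0, 2.0 * np.pi)
        field += amp * np.sin(freq * (direction[0] * x + direction[1] * y + direction[2] * z) + phase)
    return field

def fit_cost_model(block, isosurface, num_probes=50):
    """Time isosurfacing over a sweep of isovalues and fit time = a*triangles + b.
    isosurface(block, isovalue) is assumed to return the triangle count."""
    counts, times = [], []
    for v in np.linspace(block.min(), block.max(), num_probes):
        start = time.perf_counter()
        tris = isosurface(block, v)
        times.append(time.perf_counter() - start)
        counts.append(tris)
    a, b = np.polyfit(counts, times, 1)  # linear least squares fit
    return a, b  # cost(triangles) = a * triangles + b
```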

The resulting expected time from this equation, when evaluated for a particular triangle count, is the cost heuristic value for that triangle count. For clusters with more than one kind of GPU, the cost heuristic can be computed independently on the different kinds of GPUs. This provides a consistent basis for comparison of potential costs for isosurfacing across the different GPUs.

2.2.4 Assignment

Blocks are assigned to GPUs when the isosurfacing program is started, or when the user changes the set of salient isovalue ranges. From the preprocessing stage we have a table, one for each block, mapping a set of sample isovalues to triangle counts. From the profiling stage we have an equation mapping triangle counts to a cost heuristic. Using these tables, blocks need to be assigned to GPUs such that the variance is minimized between the sums of the cost heuristics of the blocks assigned to each GPU.

For every block, the cost heuristic is estimated using the set of salient isovalue ranges

defined by the user. Because these ranges will not, in general, match the exact sample

isovalues from the preprocessing stage, linear interpolation is applied between sample iso-

values as necessary. The mean of the triangle count within the ranges specified by the user

is computed to find the expected triangle count for a given block. With this triangle count,

the cost heuristic can be evaluated, resulting in a single cost heuristic value for each block.
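A sketch of this per-block evaluation is shown below. The ProbeEntry table and the fitted (a, b) coefficients come from the previous steps; the Range type, the per-range sampling density, and all names are illustrative assumptions rather than the system's exact interface.

```cpp
#include <vector>

struct ProbeEntry { float isovalue; double triangleCount; };
struct Range { float lo, hi; };   // one user-specified salient isovalue range

// Linearly interpolate the preprocessed isovalue -> triangle count table.
double trianglesAt(const std::vector<ProbeEntry>& table, float iso)
{
    if (iso <= table.front().isovalue) return table.front().triangleCount;
    for (std::size_t i = 1; i < table.size(); ++i)
        if (iso <= table[i].isovalue) {
            const ProbeEntry& p = table[i - 1];
            const ProbeEntry& q = table[i];
            double t = (iso - p.isovalue) / (q.isovalue - p.isovalue);
            return p.triangleCount + t * (q.triangleCount - p.triangleCount);
        }
    return table.back().triangleCount;
}

// Cost heuristic for a block: mean expected triangle count over the salient
// ranges, pushed through the fitted linear time model (a * t + b).
double blockCost(const std::vector<ProbeEntry>& table,
                 const std::vector<Range>& salient,
                 double a, double b, int samplesPerRange = 16)
{
    double sum = 0; int n = 0;
    for (const Range& r : salient)
        for (int i = 0; i < samplesPerRange; ++i, ++n)
            sum += trianglesAt(table, r.lo + (r.hi - r.lo) * i / (samplesPerRange - 1));
    return a * (sum / n) + b;
}
```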

The blocks are then sorted in order of descending cost heuristic value. With this list, the

blocks are then assigned to GPUs in a round-robin fashion. This results in an assignment

of blocks to GPUs that is not necessarily optimal, but still is a good starting point.

To further refine the block assignments, they are randomly exchanged between GPUs,

subject to the constraint that all exchanges must decrease the variance of the sums of the

cost heuristic values assigned to each GPU. This is accomplished in a three step iterative

process:

1. Pick a random pair of block assignments, with each element of the pair on a different

GPU. This pair defines a potential exchange of block assignments.

2. If the variance is decreased by performing this exchange, the exchange is said to be

successful. If the exchange is successful then we apply the exchange and return to

step 1. Otherwise, we continue through this process.

3. If the number of unsuccessful exchanges since the last successful exchange exceeds a

limit or the variance decreases below a threshold, break from this process, else return

to step 1.
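A minimal sketch of the full assignment procedure (round-robin by descending cost, then the variance-reducing random exchanges) is given below; the function name, the failure limit, the variance target, and the RNG seed are illustrative choices, not values taken from the system.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Assign blocks (by their cost heuristic values) to GPUs, then refine with
// random pairwise exchanges that are kept only when they reduce the variance
// of the per-GPU cost sums.
std::vector<std::vector<int>> assignBlocks(const std::vector<double>& cost,
                                           int numGpus,
                                           int failureLimit = 10000,
                                           double varianceTarget = 0.0)
{
    // Sort block ids by descending cost heuristic value.
    std::vector<int> order(cost.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return cost[a] > cost[b]; });

    // Deal the sorted blocks out to the GPUs round-robin.
    std::vector<std::vector<int>> assign(numGpus);
    std::vector<double> sum(numGpus, 0.0);
    for (std::size_t i = 0; i < order.size(); ++i) {
        assign[i % numGpus].push_back(order[i]);
        sum[i % numGpus] += cost[order[i]];
    }

    auto variance = [&] {
        double mean = std::accumulate(sum.begin(), sum.end(), 0.0) / numGpus;
        double v = 0.0;
        for (double s : sum) v += (s - mean) * (s - mean);
        return v / numGpus;
    };

    // Refinement: propose exchanges, keep only those that decrease the variance.
    std::mt19937 rng(12345);
    std::uniform_int_distribution<int> pickGpu(0, numGpus - 1);
    int failures = 0;
    while (failures < failureLimit && variance() > varianceTarget) {
        int g0 = pickGpu(rng), g1 = pickGpu(rng);
        if (g0 == g1 || assign[g0].empty() || assign[g1].empty()) { ++failures; continue; }
        int i0 = std::uniform_int_distribution<int>(0, int(assign[g0].size()) - 1)(rng);
        int i1 = std::uniform_int_distribution<int>(0, int(assign[g1].size()) - 1)(rng);
        double before = variance();
        double delta = cost[assign[g1][i1]] - cost[assign[g0][i0]];
        sum[g0] += delta; sum[g1] -= delta;              // tentatively apply the exchange
        if (variance() < before) {
            std::swap(assign[g0][i0], assign[g1][i1]);   // successful: keep it
            failures = 0;
        } else {
            sum[g0] -= delta; sum[g1] += delta;          // unsuccessful: revert it
            ++failures;
        }
    }
    return assign;
}
```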

After this process is complete, each GPU has a list of blocks assigned to it. The block isosurfacing algorithm can then be applied independently on each GPU, where each GPU is responsible for processing the blocks assigned to it.

2.3 Block Isosurfacing Algorithm

When the block isosurfacing algorithm is applied, each GPU will have been assigned a set of blocks of the volume and the user will have selected a particular isovalue that they would like to visualize. The block isosurfacing algorithm needs to generate triangles for the isosurfaces, populating vertex buffers on the GPU. An algorithm for this needs to produce packed triangle buffers without wasted space in a format amenable to GPU rendering. Because the number of triangles produced for isosurfaces will vary substantially within and between blocks, pre-allocating buffers to store triangles may be unacceptably wasteful in terms of memory consumption. Additionally, because GPUs are fundamentally parallel, such an algorithm needs to map well to the GPU parallel programming model.

One CPU thread controls each GPU, keeping each GPU busy processing the blocks assigned to it, resulting in one triangle buffer per block. We perform a marching cubes algorithm in two passes. The first pass counts the number of triangles and the offsets of the triangles into the vertex buffers. It does not directly compute the spatial positions of the triangles. The second pass creates the triangles, writing their spatial positions and normals into the vertex buffers according to the vertex buffer offsets found in the first pass. Figure

2.3 exhibits this process.

2.3.1 Triangle Counting

The triangle counting phase takes a block of cells as input, and generates two outputs: a count of the total number of triangles in the isosurface in the block, and the offset of triangles within the vertex buffer for each X row of cells in the input volume. The triangle counting algorithm is local to each GPU, with one GPU operating on one block at a time.

Our approach applies exclusive prefix sums to compute the exact indices within the output vertex buffer for output triangles associated with each row of cells, resulting in a packed vertex buffer. The prefix sums could be implemented in parallel using techniques similar to those introduced by Harris [39]. However, because we are computing prefix sums over many small distinct lists of numbers rather than one large list of numbers, it is more efficient to simply perform the many independent serial prefix sums in parallel. This maps well to GPUs because the individual sums are of nearly uniform length.
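One possible arrangement of these scans, following the two-level structure shown in figure 2.3, is sketched below in plain C++; in the actual OpenCL implementation each small per-plane row scan would be executed by its own work-item, and the outer loops here stand in for that parallelism. Names and the exact layering are illustrative.

```cpp
#include <vector>

// Two-level exclusive prefix sums over the per-row triangle counts of a block:
// first the per-X-Y-plane totals are scanned, then each plane's rows are
// scanned independently starting from that plane's offset. The result gives
// each X row's starting offset into the packed vertex buffer.
std::vector<unsigned> rowOffsets(const std::vector<unsigned>& perRowTris, // ny*nz entries
                                 int ny, int nz, unsigned& totalTris)
{
    // Exclusive scan over per-plane triangle totals.
    std::vector<unsigned> planeOffset(nz);
    unsigned running = 0;
    for (int z = 0; z < nz; ++z) {
        planeOffset[z] = running;
        for (int y = 0; y < ny; ++y) running += perRowTris[z * ny + y];
    }
    totalTris = running; // size of the packed vertex buffer, in triangles

    // Independent exclusive scans over the rows of each plane (parallel on the GPU).
    std::vector<unsigned> rowOffset(perRowTris.size());
    for (int z = 0; z < nz; ++z) {
        unsigned off = planeOffset[z];
        for (int y = 0; y < ny; ++y) {
            rowOffset[z * ny + y] = off;
            off += perRowTris[z * ny + y];
        }
    }
    return rowOffset;
}
```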

2.3.2 Triangle Creation

With the buffers resulting from the triangle counting pass, we now have the information needed to know where to store the triangles created by the marching cubes algorithm. The triangle creation phase computes these triangles and their normals.

Each X row of cells is assigned to a GPU thread. Each GPU thread then computes the isosurface triangles for its assigned row of cells. The resulting triangles for each row are placed into the target vertex buffer using the offsets computed in the triangle counting phase. This results in a packed vertex buffer on each GPU.

The packed vertex buffer contains positions of the vertices of the triangles. With these positions, the normals for each vertex of the triangles can be computed by using finite differences to compute the gradient at each vertex. To obtain consistent normals, ghost cells are required around blocks. We found that using texture hardware and finite differences was substantially more efficient than attempting to compute normals directly using triangle connectivity and triangle geometry.

2.3.3 Optimizations

Some elements of the computation within the triangle counting and triangle creation phases are redundant. With a naive implementation, the blocks of cells will be sampled twice. Optimizations can be made to reduce the amount of redundant work. We apply two such optimizations: a minimum-maximum table for empty space skipping, and an isosurface crossing table to cache results from the triangle counting phase for use in the triangle creation phase.

Minimum-Maximum Table

Minimum and maximum values of the set of values within X rows of cells are computed at load time. Each row is subdivided into contiguous spans, with the minimum and maximum values being computed and stored for each span. This data lets the triangle counting and triangle creation phases skip spans of cells that do not contain the isovalue, thus potentially reducing the number of required memory reads.
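A sketch of building the span table at load time and of the skip test is given below; the span length and names are hypothetical, and for brevity the sketch operates on spans of samples (a production version would need to account for cells straddling span boundaries, e.g. by overlapping spans by one sample).

```cpp
#include <algorithm>
#include <vector>

struct Span { float lo, hi; };   // min and max of the samples covered by the span

// Build the minimum-maximum table for one X row of samples at load time.
std::vector<Span> buildSpans(const float* row, int nx, int spanLen /* e.g. 16 */)
{
    std::vector<Span> spans;
    for (int x0 = 0; x0 < nx; x0 += spanLen) {
        int x1 = std::min(nx, x0 + spanLen);
        Span s{ row[x0], row[x0] };
        for (int x = x0 + 1; x < x1; ++x) {
            s.lo = std::min(s.lo, row[x]);
            s.hi = std::max(s.hi, row[x]);
        }
        spans.push_back(s);
    }
    return spans;
}

// Empty-space skip test used by both the counting and creation passes: a span
// whose [lo, hi] interval excludes the isovalue is never scanned cell by cell.
inline bool spanActive(const Span& s, float isovalue)
{
    return s.lo <= isovalue && isovalue <= s.hi;
}
```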

Trade-offs are present in terms of how large the spans in the minimum-maximum ta-

ble should be. If the spans are too large, then it may be that fewer opportunities will be

encountered to skip spans that do not contain isovalues. If spans are too small, then too

much memory may be required to store the tables. Additionally, the minimum-maximum

table needs to be read once per span to determine if the span contains the isovalue. This

implies that, in addition to high memory consumption, span lengths that are too small may

also result in excessive memory reads. We found span lengths in the range of 10 to 20 cells

to be reasonable for the test datasets.

Isosurface Crossing Table

When we perform triangle counting, we are identifying the active cells. Rather than scanning all cells a second time in the triangle creation phase, we can record the indices of active cells within each X row in an isosurface crossing table. Then, when we perform the triangle creation we can iterate through this table instead of data values to apply marching cubes only to the cells that are active.

As with the minimum-maximum table, a compromise is present between performance and memory consumption. Large tables supporting a large number of isosurface crossings per X row can permit greater performance in cases where a large number of isosurface crossings per X row occur. Also, because only one of these tables needs to be stored per-

GPU, rather than per-block with the minimum-maximum table, memory limitations are less restrictive. We implement this table as a byte per cell, at the full resolution of a block, because our block sizes are reasonably small.

2.4 Results

The test platform was a cluster of 12 nodes. Each node had 16GiB of memory, two NVIDIA Quadro FX5600s each with 1.5GiB of memory, two quad core AMD Opteron

2350 CPUs at 2GHz, and an Infiniband interface. The algorithm was implemented using

OpenCL for the GPU elements, MPI for inter-node communication, and Intel Threading

Building Blocks [87] for CPU multithreading. An important aspect of this configuration is the hierarchical nature of the parallelism – load must be balanced between nodes, amongst

CPUs, and amongst GPUs.

We conducted four experiments on our test platform to explore:

• the relationship between isosurface triangle counts and isosurfacing time

• strong scalability: speedup in terms of a varying number of GPUs for a fixed data

size

• volume size scalability: performance in terms of varying data size for a fixed number

of GPUs

• the relationship between salient isovalue ranges, isovalues, and speedup

It was found that our cost heuristic yielded substantial performance improvements over a naive round robin distribution of blocks without a cost heuristic.

The test dataset was constructed from a sum of sine waves with random amplitude, frequency, and phase. The isosurfaces for isovalues -1.00 and 3.00 are exhibited in figure 2.4. This dataset was chosen because it offers sufficient complexity and variation to be interesting, and is easy to reproduce at any resolution. The base dataset size we use is

1536x1024x1024, resulting in 6 gigabytes of IEEE754 single precision floating point samples. To maintain consistency, the dataset was downscaled from this base size as necessary for the different experiments.

2.4.1 Triangle Counts versus Isosurfacing Time

The time required to isosurface each block of cells within a volume was recorded, along with the number of triangles in the blocks, resulting in a mapping of triangle counts to times as in figure 2.2. This experiment directly examines the performance of the block isosurfacing algorithm from section 2.3. For a fixed block size, a linear relationship was found between the isosurfacing time for a single block, and the number of triangles within the isosurface in the block. Different elements contribute to the constant and linear factors.

Constant factor

Several elements contribute to the constant term in the linear relationship. Fundamentally, they are of two types: those that are related to the size of the block being isosurfaced and those that are not. In our algorithm, GPU kernel execution startup times and OpenCL

API overhead are independent of the size of the volume blocks being considered. Additionally, the GPU to CPU and CPU to GPU transfer times from within the triangle counting algorithm are dominated by the startup cost of the transfers rather than the size of the transfers, because the transfer sizes are intentionally small, on the order of 128 bytes for a 64³ block and 512 bytes for a 128³ block.

However, other elements of the triangle counting algorithm that contribute to the constant factor do exhibit dependence on the size of the block. Time is required to perform the exclusive prefix sums on the tables for the block as seen in section 2.3.1. Additionally, time is required to perform marching cubes table lookups and volume lookups to count the number of triangles in each cell. The minimum-maximum table optimization from section

2.3.3 seeks to reduce these contributors to the constant factor by reducing the number of cells whose triangle counts must be checked, at the cost of requiring some additional table lookups.

Typical constant time factors seen on our test platform for a 128³ block on a single GPU were around 1.2ms. This time is dominated by the API and kernel startup time overhead.

Further exploration may be worth consideration when NVIDIA Fermi-class GPUs become available, which may reduce kernel context switching time. Additionally, the drivers for

OpenCL are still relatively new, so additional optimizations should be expected in the future to reduce API-related overhead. Such hardware and software improvements would further increase the benefits seen from our algorithm by reducing this constant factor.

Linear factor

The linear term in the triangle count to time relationship also has multiple contributing factors. Marching cubes triangle construction requires interpolations and table lookups, per triangle, with up to five triangles per cell. For the vertices of each triangle, finite differencing using the GPU texturing hardware is used to compute gradients that are normalized to form triangle vertex normals. This requires 18 texture lookups per triangle for central differencing, hence its contribution to the linear factor. While the isosurface crossing table from section 2.3.3 can reduce the constant factor substantially by eliminating the need for a second scan of the volume cells for triangles in the triangle creation phase, it does introduce a linear factor because it requires a write for each non-empty cell in the triangle counting phase (§2.3.1) and a read for each non-empty cell in the triangle creation phase

(§2.3.2). The write is of lesser consequence from a performance standpoint because there are no read-after-write hazards associated with it in the triangle counting phase.

Typical times seen on our test platform for the linear factor were around 40ns per triangle, on a single GPU. Faster GPU memory and better GPU caches would reduce this factor substantially, so it is expected that with new NVIDIA GPUs such as Fermi this linear factor may see a substantial improvement, though not to the same extent that would be expected of the constant factor.

Block size compromise

With some of the constant factor contributors depending on the number of cells in the block, and some not depending on the number of cells in the block, it is clear that choosing an appropriate block size is a trade-off. As the block size is made smaller, the overall performance for isosurfacing in terms of single blocks will decrease because the net

overhead for isosurfacing will be higher, but the load balancing between different GPUs

may be more accurate because of decreased load balancing data granularity. We found

block sizes of around 128³ to be a good compromise, with larger blocks offering insufficient

flexibility for load balancing thus reducing multi-GPU speedup, and smaller blocks having too much overhead.

In our algorithm the triangle counting phase (§2.3.1) contributes primarily to the constant factor while the triangle creation phase (§2.3.2) contributes primarily to the linear factor. As architectures change, the algorithm can be adapted by moving complexity from one phase to the other.

The linear relationship between single block isosurfacing time and the number of triangles enables the transformation of predicted triangle counts into a cost heuristic as discussed in section 2.2.1.

2.4.2 Effects of salient isovalue ranges on speedup

This experiment was conducted to explore how the user selected salient isovalue range

and the isovalue being isosurfaced affect the speedup. Fixed size (1536x1024x1024) data

was broken into roughly uniformly sized 128³ cell blocks, with some variation at the edges

of the volume. Blocks were distributed to the different GPUs using the algorithm discussed

in section 2.2. Isosurfaces were then computed using the algorithm in section 2.3.

Three different runs were performed, each sweeping 2000 isovalues from -3.00 to 7.00,

with the 1100 isovalues ranging from -1.50 to 4.00 exhibited in figure 2.5 with a different

line for each run:

• red line (line L1): uses our cost heuristic in distributing the blocks, with a salient

isovalue range of -1.25 to -0.75 selected.

• green line (line L2): uses our cost heuristic in distributing the blocks, with a salient

isovalue range of 2.75 to 3.25 selected.

• blue line (line L3): uses no cost heuristic, distributing the blocks in an arbitrary order.

Speedup varies based on how evenly isosurfacing work is distributed across the nodes.

The uniformity of isosurfacing work distribution at the node level is a function of both the isovalue and the dataset, because the work is linearly proportional to the number of triangles in the isosurface. Our algorithm assigns blocks to nodes to minimize the variance between the sums of the work assigned to each node, analogous to the variance of the sums of the cost heuristic values of the blocks assigned to each node. The salient isovalue range determines the range of isovalues that are considered when computing the cost heuristic.

The green line (line L2) in figure 2.5 exhibits a strong peak in the range of 2.75 to

3.25 because that is the salient isovalue range that was selected for that run, which implies that the work was assigned to nodes to maximize work uniformity only for those ranges of isovalues. However, other peaks are visible in locations like 1.5 because there is likely a similar spatial distribution of the isosurfaces for values around 1.5 as there is for isovalues around 3.0, in this data. Similarly to the green line, the red line (line L1) exhibits the same phenomenon for a different salient range, -1.25 to -0.75, with different similar regions for the same reason.

The blue line (line L3) is drawn for the naive no-cost-heuristic method. It shows varied performance because the arbitrary block assignments can create different work distribution uniformities, and thus different speedups, for different isovalues. Both the red line and the green line demonstrate substantially improved speedup over the naive method in their salient ranges.

These results show that selecting salient isovalue ranges does offer the potential for improved speedup within those regions. At the same time, isovalues outside of those ranges do not suffer unacceptable penalties in speedup, sometimes even receiving improved speedup.

2.4.3 Volume size scalability

This experiment examined volume size scalability of our algorithm; that is, it ran trials for varying data sizes, with a fixed number of processing elements. The data was scaled from 1536x1024x1024 down to the appropriate sizes. Data was divided into roughly uniformly sized 128³ cell blocks. 32 blocks were assigned per GPU for the largest resolution and 2 blocks were assigned per GPU for the smallest resolution.

Two different runs were performed, one with our proposed cost heuristic, with a salient isovalue range of 2.75 to 3.25, the other with no cost heuristic and arbitrary block assignments. From each of those runs, data was collected for two different ranges of isovalues:

• 100 isovalues in 2.75 to 3.25, the salient range

• 2000 isovalues in -3.00 to 7.00, the entire range

Mean and maximum times were recorded per isovalue, yielding the four lines in figure 2.6.

Small scale variation occurs within the lines of figure 2.6 primarily because there is a small degree of noise present in the isosurfacing times and in the cost heuristic accuracy, and that noise can manifest itself in the results. In the case of the run done with no cost heuristic there is a second source of variation. With no cost heuristic, the block assignments are arbitrary, thus changing the data size completely rearranges the block assignments, which results in substantial variation in the resulting times. In the case of our

cost heuristic based algorithms, such variation does not occur to the same extent because

block assignment is done based upon the cost heuristic.

The red line (the top line in the key) shows the performance of our method when the salient range matches the range being isosurfaced. It substantially outperforms the other test cases, with triangle rates on the order of 250 million per second. Comparing the other lines it can be seen that the performance can still be better than using no heuristic at all even when the isosurfacing is done in ranges outside of the salient isovalue range.

All four lines exhibit good volume size scalability, with the performance not substantially decreasing for an increasing data size on the same set of processing elements. In the next section it will be shown that the algorithm delivers strong scalability as well.

2.4.4 Strong scalability

This experiment was conducted on a fixed size 768x512x512 test data set, breaking the

data into blocks of approximately 128³ cells. The experiment was run on 4, 6, 8, 12, 16,

and 24 GPUs to examine strong scalability; that is performance scaling for a fixed data

size and varying numbers of processing elements. The time to isosurface every block was

recorded, and the results for every block were stored, including triangles with normal data.

Two different runs were performed for each GPU count:

• using our cost heuristic, with a salient isovalue range of 2.75 to 3.25 selected.

• using no cost heuristic

From each of those runs, we took the mean and maximum times for isosurfacing two ranges

of isovalues, 2.75 to 3.25 (the salient range), and -3.00 to 7.00 (the entire range.) For the

former range 100 isovalues were sampled and for the latter range 2000 isovalues were

sampled. This resulted in the 4 lines in figure 2.7.

The dependence on triangle counts for isosurfacing on the GPUs means that load may

be distributed unevenly between nodes depending on the isovalue and the data. Our block

distribution technique seeks to reduce this disparity, and the results in figure 2.7 exhibit its

success in achieving this.

Over the salient isovalue range of 2.75 to 3.25, our technique has substantially better speedup (21x on 24 GPUs) versus the naive technique with no cost heuristic over the same range (13x on 24 GPUs). Even over the entire range of values, -3.00 to 7.00, a modest benefit in speedup was seen versus the naive technique. When scaled to a larger number of GPUs, with appropriately larger data, we expect that scalability would continue similar trends.

Making the block size smaller could, in principle, further improve the load balance. However, this would be unlikely to increase overall performance, because decreasing the block size would decrease the absolute performance per GPU. If interconnect, bus, and/or disk speeds were higher relative to the speed of the GPUs, an attempt could be made to dynamically load blocks on demand. However, we already attain 86% efficiency over the salient range of isovalues with 24 GPUs, so it is unlikely that such an approach could yield further performance improvement.

2.5 Conclusion

We have presented an efficient, load-balanced multi-node, multi-CPU, multi-GPU method for computing triangular isosurfaces on volume data. A preprocessing stage computes metadata, permitting the efficient computation of a cost heuristic. The cost heuristic is computed for blocks using the preprocessed data and user-specified hints on isosurface saliency, then blocks are distributed to GPUs to maximize work uniformity. An efficient

parallel isosurfacing algorithm is then applied on each GPU with the assistance of the CPUs

to produce triangles in packed arrays that may subsequently be used for rendering or other

computations.

Our implementation is able to deliver isosurfacing performance in excess of 250 million triangles per second on 24 GPUs. Strong scalability is exhibited with 90% utilization with 8 GPUs and 86% utilization with 24 GPUs. Our algorithm enables the leveraging of contemporary hybrid-architecture clusters with CPU and GPU resources for more efficient exploration of large scale volume data.

[Figure 2.3 flow diagram: volume data → per-cell triangle counts via marching cubes case lookups → per-row and per-X-Y-plane triangle count sums → exclusive prefix sums (per plane, and per row with plane offsets) → per-row triangle offsets → triangle creation with marching cubes → packed triangle data]

Figure 2.3: The triangle counting and creation process computes vertex buffer offsets for rows of the block of cells being isosurfaced then applies marching cubes to fill the vertex buffer.

Figure 2.4: The blue (dark) surface is isovalue -1.0 within the test volume used for the subsequent graphs and the yellow (light) surface is isovalue +3.0 within the same volume. At a volume resolution of 384x256x256 the yellow surface contains 298858 triangles and the blue surface contains 916337 triangles.

[Figure 2.5 plot, "Effects of Salient Isovalue Ranges on Speedup": 24 GPU speedup versus isovalue; lines: proposed cost heuristic with salient isovalue range -1.25 to -0.75 (L1), proposed cost heuristic with salient isovalue range 2.75 to 3.25 (L2), no cost heuristic (L3)]

Figure 2.5: The salient isovalue ranges substantially affect the performance. In this figure it can be seen that the speedup is improved over ranges of isovalues that are specified as salient. When no cost heuristic is used, the distribution of performance over the isovalue range is not well defined because the effective cost value of the work for each block is equal. Each line has 1100 sample isovalues, computed over a 1536x1024x1024 test volume on 24 GPUs.

[Figure 2.6 plot, "Scaling for Varying Volume Sizes on 24 GPUs": millions of isosurface triangles per second (with normals) versus gigabytes of single precision floating point volume data; lines: our proposed cost heuristic (mean of isovalues 2.75 to 3.25), our proposed cost heuristic (mean of isovalues -3.00 to 7.00), no cost heuristic (mean of isovalues 2.75 to 3.25), no cost heuristic (mean of isovalues -3.00 to 7.00)]

Figure 2.6: The performance advantage of using our cost heuristic over using no cost heuristic is maintained over the range of loadable volume sizes on a cluster of 24 GPUs. The salient isovalue range used for the cost heuristic is 2.75 to 3.25, resulting in a mean isosurfacing performance on the order of 250 million triangles per second over that range of isovalues. Using no cost heuristic over that same range yields performance on the order of 175 million triangles per second.

[Figure 2.7 plot, "Scaling for Varying Numbers of GPUs on a Fixed-Size Volume": speedup versus number of GPUs (4 to 24); lines: our proposed cost heuristic (mean of isovalues 2.75 to 3.25), our proposed cost heuristic (mean of isovalues -3.00 to 7.00), no cost heuristic (mean of isovalues 2.75 to 3.25), no cost heuristic (mean of isovalues -3.00 to 7.00)]

Figure 2.7: Using our proposed cost heuristic improves scalability, especially when the isovalues for which isosurfaces are being computed are within the salient range. In this figure, the salient range of isovalues used for the computation of the cost heuristic is 2.75 to 3.25 and the volume is 768x512x512 samples.

Chapter 3: Stereo Frame Decomposition for Error-Constrained Remote Visualization

Continued growth of dataset sizes relative to bandwidth availability continues to pro-

vide challenges for visualization systems. Simultaneously, trends in computing continue to

move applications into the cloud, with workstations being replaced by thin clients with lim-

ited bandwidth for cloud access. Additionally, stereo video solutions have become lower-

cost and more common, with devices such as NVIDIA 3D Vision increasing the potential

for their use by a wider range of visualization users.

Consider the case of an engineer seeking to interactively visualize the results of a simulation using a thin client with very limited memory and compute resources, while the simulation compute resources and their associated storage solutions are remotely located. Even for modestly sized datasets, the volume of data to analyze will be considerably larger than the number of pixels in one frame. Additionally, the size of the data will likely be larger than the memory available on the client. In this case, video-based remote visualization techniques are likely to be more effective than techniques that seek to move the simulation result data directly to the client.

Interactive visualization requires reasonably high framerates of at least 10 FPS [133].

This means that even for resolutions as low as 720p, with 24 bits per pixel per eye, greater than 52MiB/second is required for uncompressed transmission. Compression is clearly

needed, but currently available lossy video codecs do not support visualization-specific

error constraints that consider aspects such as what transfer functions are being used. Loss-

less compression could be an option, except that it wastes space by transmitting the infor-

mation needed for lossless reconstruction rather than the minimal information needed to

satisfy less restrictive error constraints. This can be observed in figures 3.1 and 3.3.

We propose a solution for video-based remote visualization that enables lossy coding

of stereo video streams subject to user-provided error constraints. The stereo color and

depth frame streams are decomposed into one depth, one color, and two residual streams.

A novel video+depth coding algorithm is used to take advantage of coherence between

the eyes and a novel residual coding technique is used to enable the use of off-the-shelf

lossy codecs for color and depth stream transmission while adhering to error constraints. A

novel framework is proposed that enables integration with existing remoting solutions and

the utilization of multiple CPUs and GPUs. Visualization techniques that can benefit from

the enhanced depth perception permitted by stereo, such as maximum intensity projection

and shaded isosurfaces, are then used in experiments to demonstrate the efficacy of the

technique.

This chapter is organized as follows. Related work is reviewed in §3.1, the technique itself is described in detail in §3.2, and some suggested error constraints that may be used with it are described in §3.3. Finally, the results are discussed in §3.4.

3.1 Related Work

Fundamentally, our technique seeks to adapt remoting techniques used for tasks other than visualization, developed in much larger markets that have funded considerable innovation, to work well for visualization. Of close relation to our technique are stereo

[Figure 3.1 panels: (a) Degraded, (b) Ground truth, (c) Color difference]

Figure 3.1: The difference between the ground truth and the error-constrained degraded image is wasted information that would need to be transmitted, if lossless encoding were used.

reprojection and view synthesis, residual coding, stereo video coding, and other remote visualization schemes.

We consider two different categories of remote visualization tools: those specifically designed for visualization, and those designed for more general use. While the visualization-specific tools may offer better performance for a limited set of applications, the general purpose tools may be more accessible on a wider variety of platforms and be lower cost to implement. Our technique seeks to bridge the gap between more general remoting and stereo visualization.

TightVNC is a commonly used general remoting solution that offers a lossy JPEG codec, not supported by standard VNC, as an encoding option in addition to the standard lossless and simple lossy VNC codecs. A more advanced solution, demonstrated at

SIGGRAPH 2011, is the NVIDIA Monterey Reference Platform, which leverages GPUs to

enable low-latency streaming of H.264 [106] video while supporting Android and Windows clients.

Within the context of visualization, MPEG has been used [18] to stream images to workstations. It was found to improve the temporal resolution of the models visible for the bitrates tested, so it is reasonable to think that H.264 may also perform well in that regard.

ParaView offers a range of options for remote visualization, though none are similar to our technique [7]. More similar to our technique is the work using Chromium by Lamberti et al. [59] except that they do not offer error constraints on the results, stereo support, or the ability to apply remapping (§3.2.4) to improve compression performance.

Stereo and multiview video coding has been explored in contexts outside of visualization. The current H.264/MPEG-4 AVC Standard offers standard extensions to support stereo and multiview video coding [111]. These extensions define ways of packing multiple views of a scene into a single encoded image stream. However, they do not directly consider depth information, though it is mentioned as a possibility for future research.

Smolic et al. [103] provide a good overview of techniques for stereo video coding that are relevant to the context in which our technique operates. Three general categories of approaches are discussed: conventional stereo, video + depth stereo, and layered depth video.

Conventional stereo methods transmit color information separately for each eye. Özbek et al. [78] apply this using lossy codecs, adaptively choosing different bitrates for the two views. Rate balancing is also important in the context of our technique, and is addressed in §3.2.5. We compare our technique to a conventional (discrete) stereo technique, using off-the-shelf codecs, in section 3.4.4.

Layered depth video methods bear some similarity to the conventional stereo methods, except that they separate foreground and background objects, encoding them separately, possibly using video + depth encoding. Other techniques proposed by Moellenhoff et al.

[73], Jiang et al. [48], Yan et al. [126], and Yang et al. [128] are similar to the joint coding technique we use for comparison. They encode one or more primary views (in the case of multiview) then encode other views differentially with respect to the primary views.

Video + depth stereo techniques, such as those proposed by Smolic et al. [103], Smolic

et al. [104], and Merkle et al. [72] transmit the color and depth information for one view,

then use that information to reconstruct the image for both views. In contexts such as live

action broadcast video, the depth buffer is not known. Thus, it must be estimated. This has

been a long-standing image processing problem, and has been addressed by many works.

Some techniques of particular relevance to video + depth encoding are proposed by Roy et

al. [90], Saxena et al. [93], Yang et al. [127], and Saxena et al. [94]. However, it is still a

fundamental source of error in depth frames.

However, for many visualization applications, depth information is available from the rendering process. For example, for isosurface rendering techniques, the depth information for the surface is known. Given this, we take a video + depth approach in our system, synthesizing the right frame from the color and depth buffers of the left frame.

For a video + depth solution to work, we must be able to synthesize one view from the other view. The depth information combined with the camera transformation associates each pixel in the source view with a world space position. These pixels can then be projected to the destination view using the camera transformation of the destination view.

Conceptually, this is simple, but there are some challenges. Firstly, what may be a dense sampling in one view may be a sparse sampling in another, leaving gaps. Secondly, with the

typical asymmetric frustum parallel axis projection used in stereo rendering, occlusion will

create gaps because all points visible from the right eye are not necessarily visible from the

left eye. Pixels that have multiple world space positions contributing to their colors due to

transparency introduce additional complexity which is handled via residual coding (§3.2.3)

in our system.

Sample reprojection for view synthesis has been looked at in previous works in multiview video compression. Martinian et al. [70] apply a technique considering camera information to reproject point samples. While the technique does have some similarity to ours in that they transform the point samples with matrix multiplications and use H.264, we use a different technique for filling the gaps (§3.2.1), apply a different camera transformation

because we know the exact camera in our case (§3.2.1), propose a GPU implementation,

and correct the results using a residual (§3.2.2) to enable its use for visualization.

In our system, residuals are required both because we use lossy codecs for the color

streams and because view reprojection cannot, in general, reconstruct views completely.

Residual coding has been applied previously in many widely used predictor-corrector based

techniques. Examples of mainstream codecs include PNG (Portable Network Graphics),

FFV1 [75], and LJPEG [95]. In techniques like these, a predictor operates to reconstruct

samples from previously reconstructed samples, then a corrector stores the resulting resid-

ual from a comparison versus the ground truth. Conceptually, this is similar to our tech-

nique except that our residuals are lossy (controlled by the error constraint) and the predic-

tor is a lossy video compression technique like H.264.

Another way of looking at residual coding in the context of image compression is to

look at lossy wavelet-based methods such as lossy JPEG 2000 as predictor-corrector based

methods. Effectively, the low pass filter of each filter stage acts as a predictor and the high

pass filter acts as a corrector [98]. The choice of what wavelet coefficients to zero for the

purposes of compression is analogous to the choice of residual entries to zero in the context

of our residual decimation technique (§3.2.2.)

We are not aware of any techniques like ours that apply lossy residual coding, with visualization-centric error constraints (§3.3), as a corrector on top of mainstream video codecs.

3.2 Technique

The goal of the technique is to reduce the bandwidth needed for remoting while adhering to user-provided error constraints. The approach we take accomplishes this, in addition to enabling support for the use of existing off-the-shelf lossy codecs that can take advantage of temporal coherence.

The stereo input frame, containing left color (LC), left depth (LD), and right color streams (RC), is decomposed into an encoded left depth primary stream (LDE), an encoded left color primary stream (LCE), an encoded left residual stream (LRE), and an encoded right residual stream (RRE) as in figure 3.2a. Different components are encoded using different techniques to improve performance. By reconstructing (§3.2.1) the output right color stream using the LCE, LDE, and camera transformation (CX), coherence between the left and right images can be utilized.

The LCE and LDE can be encoded with off-the-shelf codecs, such as H.264, that are hardware accelerated on mobile devices. This allows for the technique to take advantage of motion compensation and other features offered by contemporary video codecs to further improve performance. Additionally, this enables piggybacking of our technique onto existing remoting solutions such as the NVIDIA Monterey Reference Platform.

Because lossy video codecs, in general, will be incapable of meeting the user-provided

error constraints proposed in §3.3, the primary streams are augmented with residual streams

as in figure 3.2a. Each of these residual streams contains an approximately minimal amount

of information to correct the lossily encoded frames (generated using the techniques de-

scribed in §3.2.2 and §3.2.3) to adhere to user-defined error constraints. With the algorithm described in section 3.2.4, colors are remapped using known properties of the transfer functions (XF) being used. This further improves compression by reducing the amount of information that needs to be stored in the residual.

Details about the technique are in the following sections.

3.2.1 Reprojection

Substantial redundancy exists between sample values of each eye [103]. For example, consider looking at a surface from both eyes, as in figure 3.6b. If the frames for the two eyes are transmitted separately, then the color information for a point that is visible in both eyes is transmitted twice. The goal of reprojection is to minimize this kind of redundancy by only transmitting the color information for one eye while using camera and depth information to reconstruct the image for the other eye.

The reprojection algorithm uses the color image and depth image for one eye, combined with the camera transformations for both eyes, to synthesize the view for the other eye. In the data flows shown in figures 3.2b and 3.2c, this is used to reconstruct the right eye using information from the left.

We propose an algorithm similar in goal and overall approach to Martinian et al. [70], in that we first synthesize a new view from existing views, then fill any gaps left by the view synthesis. However, the specifics of the approach we take are substantially different.

In contrast to their approach, we are reconstructing a view from only one source view,

with knowledge of the depth and camera projection information. Additionally, to avoid an

unnecessary pass of color remapping (§3.2.4), we do not synthesize any new colors in the

gap filling algorithm. Finally, we are more focused on efficiency as decoding needs to be

lightweight enough for thin clients and needs to be easy to implement with a data-parallel

programming paradigm such as that offered by CUDA.

View synthesis and Filtering

Each pixel in the source eye image has a camera space position defined by its depth buffer value and image position. The inverse camera transformation for the left eye is applied to pixels' camera space positions to produce world space positions. These world space positions are then projected to the destination eye using the camera transformation of the destination eye. The color of the source pixel is written to the single pixel at the projected destination position. Depth testing is applied so that the write closest to the camera will be used if the same pixel is written multiple times. Pixels with the background color are not projected, as both framebuffers have been previously cleared with the background color. This algorithm is easily implemented with CUDA on platforms supporting the CUDA 1.1 atomicMin operation.
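The per-pixel math is essentially two matrix transforms followed by a depth-tested scatter. The sketch below shows the scalar computation for one source pixel and is only an illustration: GLM-style math types are assumed, depths are assumed to be stored in [0, 1] with standard NDC conventions, and the depth-tested (atomicMin) scatter of the color into the destination frame is omitted.

```cpp
#include <glm/glm.hpp>   // assumed: GLM-style mat4/vec4 math types

// Reproject one source (left-eye) pixel into the destination (right-eye) image.
// srcDepth is the left depth buffer value in [0, 1]; invLeftVP is the inverse of
// the left eye's view-projection matrix; rightVP is the right eye's
// view-projection matrix. Returns false if the pixel lands outside the frame.
bool reproject(int x, int y, float srcDepth, int width, int height,
               const glm::mat4& invLeftVP, const glm::mat4& rightVP,
               int& dstX, int& dstY, float& dstDepth)
{
    // Pixel + depth -> left-eye normalized device coordinates -> world space.
    glm::vec4 ndc(2.0f * (x + 0.5f) / width  - 1.0f,
                  2.0f * (y + 0.5f) / height - 1.0f,
                  2.0f * srcDepth - 1.0f,
                  1.0f);
    glm::vec4 world = invLeftVP * ndc;
    world /= world.w;

    // World space -> right-eye clip space -> right-eye pixel coordinates.
    glm::vec4 clip = rightVP * world;
    if (clip.w <= 0.0f) return false;
    glm::vec4 r = clip / clip.w;
    dstX = static_cast<int>((r.x * 0.5f + 0.5f) * width);
    dstY = static_cast<int>((r.y * 0.5f + 0.5f) * height);
    dstDepth = r.z * 0.5f + 0.5f;
    return dstX >= 0 && dstX < width && dstY >= 0 && dstY < height;
}
```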

While this algorithm was found to produce images of reasonable quality, some gaps are present in the results due to occlusion and varying sampling rates of the world space from the slightly different perspective views. The residual codec could be applied to correct any artifacts, but this would increase the bitrate required. Instead, a gap filling filter is applied.

Each pass of the gap filling filter iterates over all of the pixels in the destination. Each pixel in the destination view that is still set to the background color, has an uninitialized depth value, and has at least one initialized neighbor will have its color and depth set from

the neighbor with the minimum depth value. Passes are applied until the gaps have been

sufficiently reduced. Like the reprojection algorithm, this gap filling algorithm is also easily

implemented with CUDA.

The resulting destination eye image from this process will be an approximation of the view from the destination eye. Any artifacts remaining that violate the error constraint will be corrected by residual coding.

Depth encoding

Depth buffers are typically dominated by low spatial frequency regions with a few high spatial frequency regions at boundaries, similarly to many natural images, as in figure

3.6a. Because of this, lossy video codecs such as H.264 can be used to encode depth

[103]. We observed that the bitrate required to encode typical depth buffers while still providing good quality reprojections, as shown in graph 3.4b, was substantially lower than the color bitrate required to provide color reconstructions, as in graph 3.4a. Additionally, storing the depth buffer at 8 bits per pixel was still found to be very effective for our test cases, given that small variations in depth value have very small effects on the spatial position for reprojection.

3.2.2 Residual Decimation

The lossy codec (such as H.264) used for the primary streams (LCE and LDE) produces frames different from the ground truth. The difference between the ground truth and the decoded lossy frame is the residual.

Because the goal is to perform lossy coding subject to an error constraint rather than to perform lossless coding, the entire residual is typically not needed. The residual can be

simplified, producing a decimated residual, to reduce the amount of wasted information, as seen in figure 3.1.

Residuals are decimated by replacing every nonzero value with a zero, when the replacement will not result in a user-specified error constraint (discussed in §3.3) being violated. This is similar in motivation to how JPEG applies quantization to produce repeated zero coefficients [114]. Figure 3.3 shows decimated residuals in comparison to undecimated residuals. Because the decimated residuals are less complex, they are easier to compress.
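A sketch of the decimation loop for one frame follows; the RGB type, the residual representation (per-channel difference), and the names are illustrative assumptions, and the error constraint is supplied as a predicate T(c, c′) in the sense of §3.3.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

// Decimate a per-pixel residual: wherever the degraded pixel is already an
// acceptable replacement for the ground truth under the error constraint
// T(groundTruth, degraded), the residual entry is set to zero. The decoder is
// oblivious to this choice: it always just adds whatever residual it receives.
void decimateResidual(const std::vector<RGB>& groundTruth,
                      const std::vector<RGB>& degraded,
                      std::vector<RGB>& residual,   // per-channel difference image
                      const std::function<bool(const RGB&, const RGB&)>& T)
{
    for (std::size_t i = 0; i < residual.size(); ++i)
        if (T(groundTruth[i], degraded[i]))
            residual[i] = RGB{0, 0, 0};   // no correction needed for this pixel
}
```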

The decoder does not need to be aware of the decimation strategy, as it simply adds the decimated residual to the degraded image to correct the image. This means that other strategies could easily be applied for residual decimation. For example, a decimation strategy could be tuned to improve performance for the specific codecs being used, taking advantage of hardware support.

3.2.3 Decimated Residual Codec

Once a decimated residual has been computed as in §3.2.2, it must be losslessly compressed using a technique that is well-suited to the characteristics of decimated residuals.

Typical decimated residuals look like figures 3.3c and 3.3d. They have sparse arrangements of single pixel differences, with narrow contiguous regions of pixels near boundaries.

The eye whose image is reconstructed using the reprojection algorithm (§3.2.1) generally has more boundary artifacts. These are due to differences in visibility between the two eyes, as determined by eye separation.

In general, the codecs used for encoding the decimated residuals must be good at encoding sparse single pixels and curves of pixels, rather than continuous regions with gradients.

This means that DCT-based codecs such as JPEG, which were designed for photographic

image encoding [114], are not well suited to the task.

While many different codecs could be used for encoding the residual, we found the combination of Zero-Run-Length Encoding (ZRE) with Lempel-Ziv-Oberhumer (LZO [77]) to offer good performance in terms of encoding speed, compression ratio, and simplicity of implementation. For encoding, ZRE is first applied then LZO is applied. For decoding, the reverse is done.

Other configurations such as ZRE alone, LZO alone, applying a Hilbert Curve reordering of samples, and applying multi-frame temporal-differential coding were tested, but the

ZRE+LZO scheme was found to be the most effective. Temporal-differential coding can work well on images that are mostly static, but these are not encountered in our test suite.

Automatically switching between temporal-differential and non-differential based on scene dynamicity would be trivial.

ZRE collapses sequences of zeroes into a single value. This is similar in approach to run length encoding (RLE), except that ZRE does not need to store the symbol type for a repeating sequence, by assuming that it is zero. ZRE is well-suited to encoding residuals because our decimation technique produces long runs of zeros with very little continuous repetition of nonzero values.
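As an illustration, a byte-oriented ZRE pass could look like the sketch below; the exact on-wire format used by the system is not specified here, so this layout (a zero marker byte followed by a run length) is an assumption chosen purely for the example. The output would then be handed to LZO, and decoding applies the steps in reverse.

```cpp
#include <cstdint>
#include <vector>

// Zero-run-length encode a residual byte stream: a run of zeros is emitted as
// the pair (0, runLength) with runLength in 1..255 (longer runs are split), and
// all nonzero bytes are emitted literally. Unambiguous because a literal zero
// byte never appears outside a (0, runLength) pair.
std::vector<std::uint8_t> zreEncode(const std::vector<std::uint8_t>& in)
{
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < in.size(); ) {
        if (in[i] != 0) { out.push_back(in[i++]); continue; }
        std::size_t run = 0;
        while (i < in.size() && in[i] == 0 && run < 255) { ++i; ++run; }
        out.push_back(0);
        out.push_back(static_cast<std::uint8_t>(run));
    }
    return out;
}

std::vector<std::uint8_t> zreDecode(const std::vector<std::uint8_t>& in)
{
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < in.size(); ) {
        if (in[i] != 0) { out.push_back(in[i++]); continue; }
        std::uint8_t run = in[i + 1];      // expand the (0, runLength) pair
        out.insert(out.end(), run, 0);
        i += 2;
    }
    return out;
}
```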

LZO is a lossless sliding dictionary-based compression algorithm designed to offer fast encoding and very fast decoding while still offering competitive compression rates

[77]. This makes it ideal for our circumstances, where we need to encode and decode data in real-time on thin clients with limited compute resources. It has also been used previously in the context of remote visualization [19], though they probably could have improved performance by applying some data preparation (similar to ZRE) before the LZO

coding. Because ZRE tends to produce strings of words that have some repetition, LZO is

well-suited for encoding.

Residual encoding and decoding can both be implemented in parallel, for use on mobile devices with multiple relatively-slow cores, with an acceptable compression performance penalty. This can be done by breaking the image into multiple blocks of pixels and applying the residual codec on a block-wise basis, with one thread per block.

3.2.4 Remapping

When an image is encoded with a lossy primary codec such as H.264, the set of output colors is generally not a subset of the set of input colors. By definition, any color in the output that is not in the input is a color in error. Knowing the set of colors in the input, we can remap the set of output colors back to that set of input colors. This offers two key benefits.

First, with remapping being applied to the decoded image before the residual is com-

puted, much of the information that would otherwise be necessary to transmit as part of the

residual to map samples back to the ground truth does not need to be sent. This is because

a remapped sample is more likely to be correct in terms of the error constraints than an

unremapped one, thus improving compressibility of the residual.

Secondly, some error constraints, such as ITFC (§3.3.3) and TFD (§3.3.2), require that

there is a corresponding transfer function space position for every image space sample.

Because the samples produced by a lossy primary codec are not necessarily in the transfer

function, there must be a mapping between possible result colors from the lossy primary

codec and the transfer function space. The remapping operation maps a set of samples

whose values belong to a set B to values within a set A. We experimented with a couple of different operators:

Simple inverse transfer function Given a sample color value, the color value with the

smallest color difference (as defined in CIE 1976 [47]) to it within the transfer func-

tion is used.

Frame history For each frame, the mapping of observed samples to the ground truth sam-

ples is recorded, permitting estimation of the conditional probability that a sample

will have some remapped value y ∈ A given an observed value of x ∈ B. For remap-

ping a frame, the probabilities of the previous frame are used.

In practice, the simple inverse transfer function method was found to outperform the frame history method and was substantially simpler to implement. The frame history method was found to yield conditional probabilities that tend to produce incorrect results, because they depend on lossily encoded samples.

The inverse transfer function can be easily computed in parallel on a GPU using a map-reduce algorithm to find the nearest color in the transfer function for each color that may occur in a decoded image. This results in a 3D lookup table mapping decoded image colors to transfer function colors. In cases where some colors are not invertible due to ambiguities, errors in color remapping will be corrected by the residual if they violate the user-defined error constraint.
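A host-side sketch of building such a lookup table is shown below; the grid resolution is a hypothetical choice, and a plain Euclidean RGB distance stands in for the CIE 1976 color difference purely to keep the sketch short (the system uses the CIE 1976 metric and computes the table on the GPU).

```cpp
#include <cstdint>
#include <vector>

struct RGB { float r, g, b; };

// Build a quantized 3D lookup table mapping any decodable color to the index of
// the nearest color present in the transfer function. Euclidean RGB distance is
// used here for brevity; the system uses the CIE 1976 color difference.
std::vector<std::uint16_t> buildRemapLUT(const std::vector<RGB>& tfColors,
                                         int bins /* e.g. 32 per channel */)
{
    std::vector<std::uint16_t> lut(static_cast<std::size_t>(bins) * bins * bins);
    for (int r = 0; r < bins; ++r)
        for (int g = 0; g < bins; ++g)
            for (int b = 0; b < bins; ++b) {
                RGB c{ (r + 0.5f) / bins, (g + 0.5f) / bins, (b + 0.5f) / bins };
                float best = 1e30f; std::uint16_t bestIdx = 0;
                for (std::size_t i = 0; i < tfColors.size(); ++i) {
                    float dr = c.r - tfColors[i].r, dg = c.g - tfColors[i].g,
                          db = c.b - tfColors[i].b;
                    float d = dr * dr + dg * dg + db * db;
                    if (d < best) { best = d; bestIdx = static_cast<std::uint16_t>(i); }
                }
                lut[(static_cast<std::size_t>(r) * bins + g) * bins + b] = bestIdx;
            }
    return lut;   // indexed by the quantized decoded color; remapping is one lookup
}
```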

3.2.5 Rate Balancing

As can be seen in figure 3.2a, the technique decomposes the input into two primary streams and two residual streams. The bitrate used for the codec (such as H.264) for each primary stream is configurable and may vary over time.

The bitrates of the residuals are strongly related to the artifacts introduced by the pri-

mary codec streams, which implies that they are strongly related to the bitrates of the

primary codec streams. As shown in table 3.1 and figure 3.4a, an increase in the bitrate

of the left color primary stream (LCE) yields a decrease in the bitrate of the left resid-

ual stream (LRE). This is reasonable because the only two things that can vary in bitrate

that contribute to the left color frame are the LCE and the LRE. The left depth primary

stream (LDE) bitrate does not affect the LRE bitrate because the LDE is not an input to any

functional blocks producing the left color frame (LCD.)

Similarly, as can be seen in figure 3.4b, an increase in the bitrate of the LDE yields

a decrease in the bitrate of the right residual stream (RRE.) However, the bitrate of the

LCE is largely decoupled from the RRE because the reprojection process is done using

the post-residual-application left color frame (LCD), not the raw decoded frame from the

LCE. A circumstance under which the coupling may become substantial is if very loose

error constraints are used, but in practice we did not find any reasonable error constraints

that produce substantial coupling.

This decoupling allows for the optimal bitrate to be chosen independently for the LDE

and the LCE, which substantially simplifies the optimization problem. Graphs 3.4c and

3.4d exhibit overall bitrate as a function of LDE and LCE bitrate. As would be expected,

the optimal bitrate increases as the error constraint is tightened. The curves vary in data,

transfer function, error constraint, and primary codec-dependent ways, so a closed-form

formula to find the optimal bitrate is not practical.

Applying a simple direct optimization approach was found to work well over a range of bitrates. First, a function of the form y = a/x + b is fit to the LRE (y) and RRE (y) bitrates as a function of the LCE (x) and LDE (x) bitrates, respectively. Independent optimization problems of the form argmin_x (a/x + x + b) can then be defined for the LCE bitrate and the LDE bitrate. These can be solved directly by finding the positive zero of 1 − a/x², that is, x = √a. One challenge with this is finding the values of a and b. One approach is for the system to periodically sweep the bitrate x to find a set of (x, y) tuples to fit for the values a and b. Different curves can be applied for the fit, as needed. In general, the overall bitrate is not strongly sensitive to small changes in the LCE or LDE bitrate.
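Under this fitted model the optimum has a closed form, as the tiny sketch below illustrates; the function name and the example numbers are illustrative only.

```cpp
#include <cmath>

// With the residual bitrate modeled as y(x) = a/x + b, the total bitrate for a
// primary/residual pair is f(x) = a/x + x + b; setting f'(x) = 1 - a/x^2 to zero
// gives the optimal primary bitrate x* = sqrt(a).
double optimalPrimaryBitrate(double a)
{
    return std::sqrt(a);
}

// Example: if the sweep-based fit for one pair yielded a = 4.0e12 (bits^2/s^2),
// the optimal primary bitrate would be sqrt(4.0e12) = 2.0e6 bits/s; b shifts the
// total bitrate but does not change the location of the minimum.
```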

An alternative to direct optimization is to apply PID (proportional-integral-derivative) control for the bitrate for the left eye (LCE + LRE), and for the right eye (LDE + RRE.)

Because the system is not ill-conditioned, and has fairly limited frame latency, standard techniques like Ziegler-Nichols can be used to find the PID coefficients for the use cases of interest. If codecs are used that have substantial frame latency, such as H.264 with

B-frames, the proportional gain possible will be limited due to the potential of oscillations.

Finally, manual control of the bitrates can be a reasonable alternative, if the implementation cost of the other techniques is prohibitive, because the curves in graphs 3.4c and 3.4d have easy to observe global minima.

                  Left Color   Left Residual   Left Depth   Right Residual
Left Color           1.00         -0.96           0.00          -0.02
Left Residual       -0.96          1.00           0.00           0.02
Left Depth           0.00          0.00           1.00          -0.83
Right Residual      -0.02          0.02          -0.83           1.00

Table 3.1: Cross correlations were computed between the bitrates for many observed trials.

3.3 Error Constraints

Error constraints control how different a frame can be from the ground truth. Within the context of our system, an error constraint is defined by implementing a boolean function

T (c,c′) where c is the ground truth color and c′ is a potential replacement color for that ground truth color. If and only if this function returns true may c be allowed to be replaced

by c′.

We suggest three different error constraints that may be used in different circumstances depending on user intent and transfer function properties: color difference (CD), transfer function distance (TFD), and integrated transfer function contrast (ITFC). Other error constraints can be applied, if needed for a particular application.

The TFD and ITFC error constraints both share one requirement: there must be a mapping from remapped (§3.2.4) image space color to data-domain value. Examples of applications where this may be appropriate are transfer function-shaded isosurface rendering, maximum intensity projection volume rendering, and volume rendering where there is no semi-transparency. A related limitation is that the error constraints are most useful when there is a one-to-one mapping from image domain positions to data domain positions.

3.3.1 Color Difference

Simply considering the color difference, in terms of CIE 1976 [47], can be a viable option in cases in which the important measure of difference within a transfer function is a difference in color. It is defined as:

T_{CD}(c, c') = \begin{cases} \text{true} & |c - c'| < D_{max} \\ \text{false} & \text{otherwise} \end{cases}    (3.1)

where |c − c′| is the CIE 1976 color difference between c and c′, and D_max is the error constraint value.

This constraint may be a good choice when the color space distances within the transfer

function are substantially different from the corresponding transfer function space distances

(§3.3.3). This commonly occurs when users want to emphasize some ranges of values more

than other ranges of values. Figure 3.5a exemplifies this kind of transfer function.
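In code, the CD constraint reduces to a single distance comparison; the sketch below assumes both colors are already expressed as CIE 1976 (L*, a*, b*) triples, so the color difference is a Euclidean distance, and the function name is illustrative.

    import math

    def t_cd(c, c_prime, d_max):
        # CD error constraint (equation 3.1): allow replacement only if the
        # CIE 1976 color difference is below D_max. Inputs are assumed to be
        # (L*, a*, b*) triples; the RGB-to-Lab conversion is omitted here.
        delta_e = math.dist(c, c_prime)
        return delta_e < d_max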

3.3.2 Transfer Function Distance

Considering the distance within the transfer function space itself is another alternative

to color distance. It requires that the input colors c and c′ each correspond to a single position within the transfer function, which can be accomplished with remapping (§3.2.4).

It is defined as:

T_{TFD}(c, c') = \begin{cases} \text{true} & |M(c) - M(c')| < D_{max} \\ \text{false} & \text{otherwise} \end{cases}    (3.2)

where |M(c) − M(c′)| is the Euclidean distance between M(c) and M(c′). M(c) and M(c′) are the transfer function positions of the colors c (ground truth) and c′ (degraded), respectively, within the transfer function. M(c) and M(c′) are not defined for colors that are not within the transfer function, but this does not matter because remapping is used to map colors to the set of colors in the transfer function. D_max is the error constraint value.

This constraint may be a good choice when color space distances within the transfer function are similar to a linear scaling of the transfer function space distances in the transfer function. Figure 3.5c exemplifies this kind of transfer function. This stands in contrast to

figure 3.5a, which is likely to be inappropriate for TFD because color distances in figure

3.5a are not linearly proportional to transfer function space distances.

3.3.3 Integrated Transfer Function Contrast

In some transfer functions, the distance between two values depends not only on the values themselves, but also on the path between the instances of the values within the transfer function. We call these transfer functions context-sensitive.

For example, consider the transfer function in figure 3.5b, where A, B, C, D, and E refer to specific samples within it. If the color space distance metric (CD – §3.3.1) were used, then |A − B| would be similar to the distance |A − C|. If this were the intent of the user then this may be acceptable. However, if the intent of the user was for A to be more dissimilar from C than B is from C, then CD is not an appropriate metric. Additionally, if the intent was for A to be more dissimilar from D than D is from E, then the transfer function distance (TFD – §3.3.2) is not appropriate. Integrated transfer function contrast (§3.3.3) can resolve both of these problems inherent in context-sensitive transfer functions, resulting in the distance |D − A| being greater than |E − D| and |A − C| being greater than |B − C|.

The ITFC error constraint is defined as follows:

v = M(c') - M(c)    (3.3)

T_{ITFC}(c, c') = \begin{cases} \text{true} & \int_{0}^{1} \nabla C(M(c) + uv) \cdot \frac{v}{|v|} \, du < D_{max} \\ \text{false} & \text{otherwise} \end{cases}    (3.4)

where ∇C(M(c) + uv) · v/|v| is the directional derivative of color, in terms of CIE 1976 [47] color differences, which is effectively contrast per unit of distance in transfer function space. M(c) is the mapping of colors to transfer function positions.
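The integral in equation (3.4) can be evaluated numerically by sampling the contrast along the straight path between M(c) and M(c′) in transfer function space. The sketch below uses a midpoint rule and assumes a caller-supplied contrast(position, direction) function returning the directional derivative of color (in CIE 1976 terms) at a transfer function position; the names and sample count are illustrative.

    import numpy as np

    def t_itfc(m_c, m_c_prime, contrast, d_max, num_samples=32):
        # ITFC error constraint (equations 3.3 and 3.4), midpoint-rule approximation.
        v = np.asarray(m_c_prime, dtype=float) - np.asarray(m_c, dtype=float)
        length = np.linalg.norm(v)
        if length == 0.0:
            return True  # identical transfer function positions: zero accumulated contrast
        direction = v / length
        u = (np.arange(num_samples) + 0.5) / num_samples  # midpoints of [0, 1]
        samples = [contrast(np.asarray(m_c, dtype=float) + ui * v, direction) for ui in u]
        integral = float(np.mean(samples))  # approximates the integral over u in [0, 1]
        return integral < d_max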

3.4 Results

Experiments were conducted to examine how the technique performs with different error constraints, primary codecs, eye separations, transfer functions, and datasets.

3.4.1 Data sets

Two datasets were used for experimentation: the combustion dataset used by Akiba et al. [2] and the plume dataset used by Akiba et al. [1]. The combustion dataset was rendered using shaded isosurfacing. The plume dataset was rendered using maximum value projection.

3.4.2 Lossy Codecs

Multiple lossy primary codecs were tried for encoding the color and depth primary streams (LCE, LDE). Two different versions of H.264 [106] were tried, one being the main profile, and one being the constrained baseline profile. The constrained baseline profile (BCP) is more commonly available on mobile devices, which would be a common client target platform for our system. Additionally, the BCP does not use B-frames, which introduce more frame latency [20]. MJPEG and MPEG-4 were both tried to check whether simpler primary codecs could still yield good performance.

The H.264 main profile outperformed all of the other codecs in all of the tests. This is reasonable, considering that it offers more features for motion estimation and bidirectionally-aware temporal coding than the other codecs. However, the substantially simpler H.264

BCP offered performance very similar (within 5%, in terms of bitrate) to the H.264 main profile, so it may be a better choice for many applications, especially when latency matters.

Thus, for our technique, we apply the H.264 BCP as the codec for the color and depth primary streams (LCE, LDE) for all subsequent comparisons.

3.4.3 Lossless Codecs

Lossless codecs are required for comparison for two primary reasons:

1. Lossy codecs will tend to cause a color shift in the results. While this color shift

may be acceptable for some error metrics, such as CD (§3.3.1,) it will not enable

computation of the TFD (§3.3.2) or ITFC (§3.3.3) error constraints because there is

not a clear mapping from colorspace to transfer function space, without a remapping

operator such as that used by our technique (§3.2.4.)

2. It is not, in general, possible to directly compare the output of the lossy codecs men-

tioned in §3.4.2 versus our system using the same lossy codecs because these lossy

codecs do not offer compatible error constraints and may not be able to adhere to the

error constraints even at the highest quality available. This can be observed in figure

3.4a for the H.264 codec, where the residual bitrate required to attain lossless or even

near-lossless performance does not converge to zero, even as the maximum possible

bitrate produced by the H.264 codec is approached.

We experimented with two lossless codecs: LJPEG [95] and FFV1 [75]. Both of these are prediction-correction based methods that apply a predictive pass, then encode the resid- ual between the prediction and the actual image.

LJPEG was chosen because it is a well known codec, and FFV1 was chosen because it is a widely available [108] codec that often outperforms LJPEG. In fact, we found that

FFV1 outperformed LJPEG by such a substantial margin in our test cases that we only present results of FFV1 versus our technique.

3.4.4 Compression Performance

Experiments were performed to examine the sensitivity of the compression perfor- mance to error constraints, eye separation, and alternate datasets. Four different techniques were compared: our reprojection technique, discrete coding of frames with a lossless codec, discrete coding of frames with a lossy codec combined with residual coding, and joint cod- ing of frames.

Discrete coding is simply the case where each eye is treated as an entirely separate video stream, with no coupling between the two eyes. Effectively this means transmitting the LCE, LRE, RCE, and RRE, but not the LDE. It is implemented as two monocular codecs running in parallel. Other techniques (such as those proposed by Özbek et al. [78] and Smolic et al. [103]) also use discrete coding, though they may use different bitrates for each eye.

The joint technique is similar to the discrete technique, except that the LCE is also used as the RCE. Basically, this means that only one primary stream, LCE, is sent over the network alongside the LRE and RRE, with the difference between the two eyes being encoded entirely into the RRE. Conceptually this is similar to many other techniques that have been used for stereo video transmissions, especially when depth information is unknown or difficult to compute. Some examples of these techniques are those proposed by Moellenhoff et al. [73], Jiang et al. [48], Yan et al. [126], and Yang et al. [128].

In all of our test cases, including lossless ones, our technique substantially outper- formed discrete lossless and discrete lossy video compression. It also outperformed simple joint coding except for very loose error constraints.

Sensitivity to Error Constraint Values

As the error constraint is loosened (increased), the compressibility increases. This is because the number of entries that can be decimated (as in §3.2.2) in the residual increases, yielding greater compressibility.

Figure 3.7a exhibits this. In this case the combustion dataset was rendered with an eye separation of 0.03, producing images like figure 3.6b, for different ITFC error constraint values.

It is reasonable that the technique would outperform the discrete lossless technique because, even for an error constraint of 0, there is still wasted information sent due to the fact that the colors in the image are constrained to those in the transfer function by the remapping operation (figure 3.5b). Additionally, with the reprojection technique, we avoid duplication of interocularly-coherent color information that occurs with all discrete techniques.

Similarly, it can be seen that the reprojection technique outperforms the joint coding technique until the error constraint becomes very loose. This is reasonable because the joint technique has to encode the disparity due to the difference in the camera projection between the left and right eye, while the reprojection technique only needs to encode corrections for regions where the depth is either inaccurate, or where occlusion resulted in a lack of samples. However, with very loose error constraints, there is sufficient freedom within the constraint to ignore much of the data in the residual due to the camera projection difference, thus there is little benefit to sending the depth information over the communication channel for reprojection in this case.

Sensitivity to Eye Separation

Intuitively, one can expect that the compression performance of a technique that reprojects the pixels from one eye's camera space to the other eye's camera space should decrease as the separation between the eyes increases, because the disparity between the two eyes will increase [48]. This effect is indeed seen in graph 3.7b. For the camera configuration of our test scenes, an eye separation of 0.05 is about the maximum that can be used without causing eye strain or completely preventing stereopsis. Even for these extreme eye separations the reprojection technique outperforms the joint and discrete codecs, and for less extreme eye separations its performance is further improved.

Experiments were also run for different datasets, using different rendering techniques, different transfer functions, and different error constraints. In all cases we found the tech- nique to continue to exhibit roughly the same compression performance behavior with respect to changes in error constraints.

This is expected because most datasets and transfer functions will permit our system to improve compression performance over lossless techniques by utilizing coherence be- tween the eyes and the color information limits imposed by the transfer functions using reprojection (§3.2.1) and remapping (§3.2.4.)

Monocular Viewing

Experiments were also performed to verify that the concept of transmitting residuals in addition to a primary codec stream was also useful for monocular streams. The results were approximately the same as the discrete techniques in figure 3.7a, though with one half the bitrate. This is because the frames for both eyes are very similar, and we only need to transmit the frame for one of the two eyes in the monocular case. Even though there

is no potential for taking advantage of coherence between eyes in the monocular case, there is still the potential for utilizing the flexibility permitted by error constraints, motion estimation, and color information limits imposed by the transfer functions.

3.5 Conclusion

We have proposed a video-based remote visualization solution enabling transmission of stereo video streams using efficient lossy codecs while adhering to user-defined error constraints. A novel video+depth coding algorithm is used to take advantage of coherence between eyes and a novel residual coding technique is used to enable the use of arbitrary lossy codecs for transmitting primary streams. The novel framework proposed enables integration with existing remoting solutions such as VNC and NVIDIA Monterey, as well as the utilization of multiple CPUs and GPUs. The system enables transmission of remote stereo visualization video streams at lower bitrates than would be possible with traditional lossless techniques, while providing support for visualization-specific error constraints.

(a) Framework

(b) Reprojecting encoder

(c) Reprojecting decoder

Figure 3.2: The framework decomposes the left and right frames into one depth stream, one color stream, and two residual streams in the encoder (§3.2), which are then reconstructed into left and right frames in the decoder. Because the depth stream generally takes much less space than the color stream, and the error introduced by reprojection is small, this yields better performance than encoding the streams separately. Additionally, transmission of partial residuals subject to user-defined error constraints enables fidelity guarantees for visualization applications.

(a) Original left (b) Original right

(c) Decimated left (d) Decimated right

Figure 3.3: The per-pixel color magnitudes of the residuals are shown for both eyes, before and after decimation subject to an error constraint, with darker colors meaning greater magnitude.

(a) LCE vs. LRE (left residual bits/pixel vs. left color H.264 bits/pixel)
(b) LDE vs. RRE (right residual bits/pixel vs. left depth H.264 bits/pixel)
(c) LCE vs. LCE+LRE (left total bits/pixel vs. left color H.264 bits/pixel)
(d) LDE vs. LDE+RRE (right total bits/pixel vs. left depth H.264 bits/pixel)

Figure 3.4: Increasing the LCE (left color encoded) bitrate decreases the LRE (left residual encoded) bitrate. Increasing the LDE (left depth encoded) bitrate decreases the RRE (right residual encoded) bitrate. The curves, from top to bottom, have ITFC error constraints of 0, 6, 12, 18, 24, and 32. More-restrictive constraints tend to require higher LCE and LDE bitrates for optimal performance.

(a) Nonlinear context-free (b) Nonlinear context-sensitive (c) Linear context-free

Figure 3.5: Different types of transfer functions are appropriate for different types of error constraints

(a) Left Depth (b) Left and right color

Figure 3.6: Stereo rendering of the combustion dataset (§3.4.1) using isosurfacing with a 2D transfer function (figure 3.5b)

(a) Effects of the error constraint on bitrate (bits per pixel vs. ITFC error constraint; curves: reproj (H.264), joint (H.264), discrete (H.264), discrete (FFV1))
(b) Effects of eye separation on bitrate (bits per pixel vs. eye separation; curves: reproj (H.264), joint (H.264), discrete (FFV1))

Figure 3.7: The benefit of using our reprojection technique or a joint coding technique over discrete coding techniques increases as the eye separation is reduced, as explained in §3.4.4. Additionally, the benefit of using the reprojection technique increases as the error constraints are loosened.

Chapter 4: Histogram Spectra for Multivariate Time-Varying Volume LOD Selection

Large, time-varying, multivariate volumes are commonly encountered in scientific vi-

sualization. As the available compute power for simulation has increased, the quantity of

data produced has increased commensurately. However, storage system throughput and

latency have not improved at the same rate. Analysis tools such as volume renderers that

seek to enable interactive visual analysis must scale to support interactivity on these larger

data sets.

The ability to interactively rotate, translate, and focus in on time-varying volume sim- ulation data can increase the understanding of the data. If the data is too large to fit into memory at its highest resolution, techniques must be applied to choose the subset, or level of detail, of the data that maximizes quality subject to the working set size constraints as determined by hardware resource availability.

Not all subsets of the value domain of the data are necessarily of the same importance, and it is often possible for users to make informed guesses at which subsets are important.

However, these informed guesses often need to be part of the interactive workflow. This means that level of detail selection must be done interactively.

For example, in the context of a weather simulation, the scientist may be interested in the vertical velocity of clouds. It is clear that, if we have a wind field defined over the entire

volume and the clouds do not cover the entire volume, only a portion of the wind field is

important. Similarly, a large portion of the volume variable that determines cloud density

will also be unimportant where it is below the threshold for clouds. Thus, we can refine

the level of detail selection based on intervals of interest, for both the cloud variable and

the velocity variable, to maximize information density within the data loaded. By focusing

only on the quality of a limited interval volume of the volume, we can attain higher quality

than if levels of detail were selected in an interval-agnostic manner.

The best level of detail for a multiresolution data set is the one that minimizes the error over the intervals of interest subject to a size constraint. This introduces two challenges:

• Selection of the level of detail: General binary integer programming, which can be used for LOD selection, is NP-hard. We need to offer a more efficient alternative.

• Error estimation for intervals of interest: For level of detail selection, given intervals of interest, it is necessary to estimate the error introduced by downsampling. If no metadata is stored, the entire data set must be loaded every time the error is to be estimated for a new interval of interest. We need to generate metadata to facilitate faster error estimation.

This work provides two core contributions, one addressing each of these challenges.

First, we introduce the novel concept of histogram spectra, which are used to estimate the statistical sensitivity of time-varying volumes to sampling. Histogram spectra are stored as metadata, enabling the estimation of error without having to access the volume data di- rectly. Secondly, we introduce an efficient level of detail selection algorithm utilizing the linear relationship between histogram spectra predicted error and RMS error. Our tech- nique enables fast, interactive LOD selection with reasonable preprocessing times and low implementation complexity.

This chapter is organized as follows. §4.1 discusses previous work as related to our work.

§4.2.1 through §4.2.4 introduce the concept of histogram spectra. Our solutions to the level of detail selection problem are discussed in §4.2.5 and §4.2.6. Considerations for applying the algorithm to multivariate data are discussed in §4.2.7. Finally, the results are examined in §4.3.

4.1 Related Work

The challenge of interactively visualizing large data, both in scientific visualization and in general graphics, has inspired much previous work. Common to many of the methods is the concept of multiresolution data availability. Two critical aspects in dealing with multiresolution data are designing a multiresolution representation, and deciding which portions of the multiresolution data to load in an application.

Many approaches to multiresolution volume representation have been explored. Wavelets are a widely used method, offering multiple levels of detail with little or no space overhead.

Westermann, et al. [120] developed a method for directly rendering wavelet transformed volume data. Wang, et al. [117] propose a method for rendering very large wavelet- transformed volume data and subsequently extended [116] the method to time varying volume data using a wavelet time-space partitioning tree. While wavelets can provide for efficient storage of large multiresolution volumes, they do have a disadvantage in that to access a given level of detail of a volume multiple levels of the wavelet hierarchy must be accessed.

An alternative to directly storing a multiresolution volume is to generate metadata that can be used to skip large portions of the high resolution volume that are not needed. For example, Gregorski, et al. [32] developed a method for preprocessing tetrahedral volumes

such that diamonds of min-max values are identified. This enables fast reconstruction of isosurfaces subject to a user-specified error tolerance without having to visit the entire volume. However, this approach does not directly apply to our problem because we are performing general level of detail selection rather than computing isosurfaces. Instead, we are using interval volumes to weight the importance of different portions of a volume for the purposes of error estimation in level of detail selection.

More similar to our technique are techniques that downsample the volume into a mul- tiresolution hierarchy, resulting in some data duplication, but at the same time enabling

flexible reconstruction with minimal computational overhead and fewer reads. Gao, et al.

[26] developed a distributed architecture for volume rendering of distributed data using multiresolution hierarchies while considering visibility to reduce data movement and ap- plying prefetching in a load on need context. LaMar, et al. [58] use a multiresolution texture hierarchy to accelerate volume rendering on graphics hardware.

Rendering a multiresolution hierarchy requires the selection of a subset of blocks (level of detail) from the multiresolution hierarchy that need to be rendered and/or loaded. Con- ceptually, this can be thought of as the construction of a cut through the multiresolution hierarchy. In our technique we explicitly generate multiresolution cuts (or level of detail selections) through a 4D multiresolution volume. Boada, et al. [5] also directly generate cuts through a multiresolution hierarchy in the context of volume visualization. Gyulassy, et al. [36] also directly generate cuts through a multiresolution hierarchy but combine it with view-dependent error calculations. However, there are substantial differences between these methods and ours. First, their multiresolution volumes are 3D rather than 4D, which has implications for the complexity of the algorithms and the severity of errors introduced

by the use of interpolation with lower levels of detail. Secondly, their cut construction algo-

rithms construct the cuts in a top-down manner from the lowest level of detail. Construction

of the cut from the lowest level of detail may sometimes be less computationally expensive

than our optimization-based approach. However, especially in data sets with wide ranges of

levels of detail like those we tested, it poses the potential of missing features at the higher

levels of detail if the lowest level of detail is sufficiently undersampled and other measures

are not taken.

Constructing the level of detail selection, regardless of whether it is constructed in an optimization-oriented or bottom-up manner, requires deciding when to decimate or refine the level of detail selection. Two general optimization approaches that can be taken are error-constrained and size-constrained. In the case of error-constrained approaches the ob- jective is to minimize the size of the working set subject to the error constraint. In the case of size-constrained approaches the optimization function may seek to maximize importance or minimize error subject to a working set size constraint. Size-constrained approaches are more appropriate when hardware limits on the maximum interactive working set size are important in interaction.

Error-constrained approaches have been widely applied in visualization. Wang, et al.

[116] consider both spatial and temporal error constraints in the rendering of wavelet data.

Danskin, et al. [15] consider image-space error constraints on rendering error in volume ray tracing. Gregorski, et al. [32] consider error tolerances in the extraction of time-varying isosurfaces. These techniques contrast with ours in that ours is size-constrained rather than error-constrained. However, it is important to consider that even though our technique is not error-constrained the error can still be quantified in the results.

Size-constrained (and by extension, load time-constrained) approaches have also been

widely applied in visualization, as well as general graphics applications. Saito [91] de-

veloped a time-constrained point rendering approach for previewing volumes and argues

that constant frame rates are beneficial for interaction. Shin, et al. [101] developed a

quadtree-based approach for fixed frame rate continuous LOD terrain visualization. Lind-

strom, et al. [61] applies a height field simplification algorithm to keep a constant frame

rate. Funkhouser, et al. [21] proposed an adaptive rendering algorithm that seeks to main-

tain a constant frame rate for virtual environment visualization. Certain, et al. [8] applies

wavelets within a time-constrained context to maintain constant framerates for multiresolu-

tion surface viewing. While all of these works are from different contexts, they all consider

consistency of frame rate and working set size to be important enough for interaction to

make it a constraint on their level of detail choices.

The concept of importance (expressed via interval selection using the weighting function, in our technique) can help facilitate higher quality by allowing some subsets of the data to have higher priority over other subsets of the data. The concept of importance sampling has been widely used in visualization and rendering. Danskin, et al. [15] and

Viola, et al. [112] both apply importance sampling in the context of volume rendering.

The technique described in chapter 2 considers user-specified isovalues when computing a

fixed distribution of work in the context of distributed-data isosurface computation, though that work does not consider multivariate data and does not pose the problem directly as an optimization problem.

While level of detail selection has been widely used and explored, we are not aware of a proposal for error prediction similar to the concept of histogram spectra, nor have we

found a similar greedy solution to the level of detail selection problem that can efficiently

utilize the histogram spectra for 4D, multivariate, multiresolution volumes.

4.2 Level of Detail Selection

Assume that a time-varying volume is divided into a set of 4D subvolumes (or “bricks”.)

Each subvolume is sampled into a set of levels of detail, each with a different sampling frequency. The goal of the level of detail selection algorithm is to select the level of detail for each subvolume that maximizes quality subject to a working set size constraint. The data flow of the level of detail selection process in our technique is exhibited in fig. 4.1.

Figure 4.1: The histogram spectra generator takes a multiresolution bricked volume and generates a histogram spectrum for each subvolume (“brick”) of the volume. This will be done as a precomputation step in the data preparation phase. The LOD selector then uses that, with a set of user-defined parameters such as intervals of interest, to produce a LOD selection set. The LOD selection can be performed interactively.

4.2.1 Histogram Spectra

Let f_a(x) be the probability density function (PDF) of a subvolume sampled at sampling frequency a. The histogram spectrum of the subvolume is then a mapping R^2 → R:

h(x,a) = |f_b(x) − f_a(x)|    (4.1)

(Plot axes: Value, horizontal; Sampling Frequency, vertical; color scale from low to high.)

Figure 4.2: This histogram spectrum of a single plane of a single timestep of the QVAPOR variable of the climate test data set (defined in §7.3) is typical of histogram spectra. Moving up on the vertical axis corresponds to downsampling, and each column corresponds to the potential change in the area of an isosurface as a function of sampling frequency. Columns with brighter colors in this plot correspond to values that are more sensitive to sampling. Rows with brighter colors correspond to sampling frequencies with greater overall, un- weighted, error.

where b is the sampling frequency of the highest level of detail of the subvolume and x is the value parameter to the PDF.

For a volume comprised of multiple subvolumes, the set of histogram spectra is com- prised of the histogram spectrum of each subvolume. Each subvolume is processed inde- pendently of every other subvolume.

Evaluating h(x,a) for a given x and a yields a value proportional to the absolute differ- ence between the surface area of an isosurface with value x in the subvolume sampled with sampling frequency a and the surface area of an isosurface with value x in the subvolume sampled with frequency b. The relationship between isosurface area and histograms has been examined in depth by Scheidegger, et al. [96] and Carr, et al. [6]. If no information has been lost in value x by sampling with a frequency a versus sampling with a frequency b then h(x,a) = 0.

4.2.2 Weighted Histogram Spectra

(Plot axes: Value, horizontal; Sampling Frequency, vertical; color scale from low to high.)

Figure 4.3: The weighting function is used to control the width of the interval volumes of interest in the context of the level of detail selection. In this example a weighting function was chosen to place importance on the interval of values from 0.0070 to 0.0105. The weighting function is applied over the columns of the histogram spectrum, facilitating the computation of histogram spectrum predicted error as in equation (4.4).

A weighting function, w(x), is defined as an R → R mapping from the volume value domain to weights. Conceptually w(x) should reflect the important interval volumes (inter- vals of interest) for the current visualization task, having a higher value within the interval volumes than outside the interval volumes.

The weighted histogram spectrum, defined for a subvolume as in equation (4.1), is then:

h_w(x,a) = w(x) h(x,a)    (4.2)

Evaluating h_w(x,a) for a given x and a yields a value proportional to the weighting function w(x) and the difference in isosurface surface areas as in equation (4.1). This is significant because it enables the estimation of the error in intervals of interest from the histogram spectrum using equation (4.4).

Typically, when w(x) is defined directly by the user, it will be of the form:

w(x) = \begin{cases} 1 & x \in Y \\ 0 & \text{otherwise} \end{cases}    (4.3)

where Y is a user-defined set of important values. For example, choosing Y = {0.3} would mean that error is only considered to be important if it affects an isosurface with isovalue

0.3. However, it is not required that w(x) be in this form and indeed for multivariate data it can be useful for it to be in a different form, as can be seen in §4.2.7.

4.2.3 Predicted Error Using Histogram Spectra

The error of a scalar subvolume at a given sampling frequency a can be estimated using the histogram spectra via

E(a) = \int_{-\infty}^{+\infty} h_w(x,a) \, dx    (4.4)

Effectively this sums the difference in surface area for every isosurface in the subvolume, weighted by the user-specified weighting function w(x).
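With the discretization described in §4.2.4, the integral in equation (4.4) becomes a weighted sum over the histogram bins; a minimal sketch is shown below, with the interval [0.0070, 0.0105] from figure 4.3 used purely as an illustrative weighting, and the names being placeholders.

    import numpy as np

    def predicted_error(spectrum_row, weights, bin_width):
        # E(a) ~= sum over bins of w(x) * h(x, a) * dx for one row (one sampling
        # frequency) of a discretized, weighted histogram spectrum (equations 4.2, 4.4).
        return float(np.sum(np.asarray(weights) * np.asarray(spectrum_row)) * bin_width)

    # Illustrative weighting: 128 bins over [0, 0.014], interval of interest [0.0070, 0.0105].
    edges = np.linspace(0.0, 0.014, 129)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w = ((centers >= 0.0070) & (centers <= 0.0105)).astype(float)
    # error_estimate = predicted_error(spectrum[lod_row], w, edges[1] - edges[0])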

The RMS error of a subvolume is proportional to E(a). A linear relationship was

observed, as in fig. 4.4, on our test cases for 0.1b < a ≤ b – when the sampling frequency is greater than about one tenth the ground truth sampling frequency. In a 4D volume, one tenth the ground truth sampling frequency would equate to roughly a 10,000x reduction in size.

This is data-dependent, but does demonstrate that histogram spectra can be used to predict

RMS error resulting from a range of downsampling operations on real-world data sets.

Most importantly for the purposes of the greedy algorithm discussed in §4.2.6, it means that a ratio between two RMS errors is the same as the ratio between the corresponding two histogram spectrum predicted errors.

(Plot axes: Histogram Spectrum Predicted Error, horizontal; RMS Error, vertical.)

Figure 4.4: The RMS error is proportional to the histogram spectrum predicted error. This fig. exhibits a test case on the QVAPOR variable of the climate data set (defined in §7.3), and is typical of what we have observed on other data sets. The exact scaling factor to determine the RMS error depends on the units of the data in the field and the norm of the weighting function. However, this does not need to be computed because only the relative differences between errors need to be used in the algorithm discussed in §4.2.6. Because the RMS error is linearly proportional to the histogram spectrum predicted error, the ratio between two RMS errors is the same as the ratio between their corresponding histogram spectrum predicted errors.

4.2.4 Discretization of Histogram Spectra

In practice, for multiresolution data, only a finite number of levels of detail can be considered. Similarly, the resolution required for discrete forms of the probability density functions used in the histogram spectra is also limited.

In our implementation we store a uniformly sampled histogram spectrum for each sub- volume as a 2D array of floats. The histogram resolution (the number of columns) deter- mines the narrowest interval volume that can be considered for level of detail selection.

Too few columns will reduce the effectiveness of the algorithm, while too many will waste space. It should be chosen to reflect the minimum width of intervals that the user is likely

to be interested in for level of detail selection. Future extensions may consider alternative sampling strategies for the histograms, if we can find an application that requires them.

When there are M levels of detail, M frequencies (rows) are stored for the histogram spectra. However, if the highest sampling frequency of a level of detail is the same as the ground truth sampling frequency, it is not necessary to store the rows of the histogram spectra corresponding to that level because they can be assumed to be zero.

The resulting floating point data can be compressed losslessly with floating point image compression techniques, but in our test cases the space consumed by the histogram spectra was not found to be large enough to warrant this.

4.2.5 Integer Programming Formulations for LOD Selection

The goal of the level of detail selection problem is to compute the level of detail index

L_i for every subvolume i of N subvolumes such that the error is minimized and the size of the subvolumes to be loaded for the level of detail is below a threshold, S_max. This can be structured as a nonlinear integer programming problem

\arg\min_{L} \sum_{i=1}^{N} E_i(a_{L_i})    (4.5)

with the constraints

\sum_{i=1}^{N} S_{i,L_i} \le S_{max}; \quad 1 \le L_i \le M; \quad L_i \in \mathbb{Z}    (4.6)

where a_k is the sampling frequency for level of detail index k, S_{i,j} is the load size for LOD j of block i, and M is the number of levels of detail. This optimization problem is nonlinear because the optimization arguments are used as arguments to the nonlinear E_i(a) function within the objective function.

An alternate, equivalent, binary, linear integer programming formulation can be con-

structed by recognizing that there are a finite number of levels of detail:

\arg\min_{H} \sum_{i=1}^{N} \sum_{j=1}^{M} E_i(a_j) H_{i,j}    (4.7)

with the constraints

\sum_{i=1}^{N} \sum_{j=1}^{M} H_{i,j} S_{i,j} \le S_{max}    (4.8)

\sum_{j=1}^{M} H_{i,j} = 1; \quad 1 \le i \le N; \ i \in \mathbb{Z}    (4.9)

It follows from equation (4.9) that the solution to the binary, linear integer programming

problem is related to the nonlinear integer programming problem by:

H_{i,j} = \begin{cases} 1 & L_i = j \\ 0 & \text{otherwise} \end{cases}    (4.10)

This is linear because the only potentially nonlinear part of the objective function, Ei(a),

is now dependent on a set of constants, the possible level of detail sampling frequencies,

rather than the argument being optimized as in equation (4.5).

With equation (4.10) it can be seen that the constraint equations (4.8) and (4.6) as well

as the objective functions (4.5) and (4.7) are respectively equivalent.

General linear programming packages such as the GNU Linear Programming Kit (GLPK)

can be applied to solve the binary integer programming problem described in equation

(4.7). However, general binary integer programming is NP-hard. Binary integer program-

ming packages, such as that offered by GLPK, often integrate acceleration strategies to

more efficiently solve special cases of general binary integer programming problems. How-

ever, we found (as in the results in §4.3.2) that these strategies were generally insufficient

for attaining reasonable running times for interactive level of detail selection.
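For reference, the binary formulation in equations (4.7)-(4.9) maps directly onto a generic mixed-integer solver interface. The sketch below uses SciPy's milp function (available in SciPy 1.9 and later) as one such solver; the function name, dense constraint construction, and 1-based LOD indexing are illustrative, and this direct approach would not scale to the problem sizes discussed above.

    import numpy as np
    from scipy.optimize import Bounds, LinearConstraint, milp

    def solve_lod_ilp(E, S, s_max):
        # E[i, j]: predicted error of LOD j for subvolume i; S[i, j]: its load size.
        # Unknowns are the binary H_{i,j} of equation (4.7), flattened row-major.
        n, m = E.shape
        c = E.ravel()                                    # objective: sum E_ij * H_ij
        size_row = S.ravel()[np.newaxis, :]              # sum S_ij * H_ij <= S_max
        one_hot = np.zeros((n, n * m))
        for i in range(n):                               # sum_j H_ij = 1 for each subvolume
            one_hot[i, i * m:(i + 1) * m] = 1.0
        constraints = [LinearConstraint(size_row, -np.inf, s_max),
                       LinearConstraint(one_hot, 1.0, 1.0)]
        result = milp(c=c, constraints=constraints,
                      integrality=np.ones(n * m), bounds=Bounds(0, 1))
        return result.x.reshape(n, m).argmax(axis=1) + 1  # 1-based LOD index per subvolume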

Instead, we propose a greedy algorithm for solving the nonlinear integer programming

problem described in equation (4.5) that yields approximate solutions very close to the

optimal solution, with considerably lower computational complexity. We still present the

binary integer programming form because it provides a way to easily apply existing integer

programming optimization packages such as GLPK to the problem for performance testing,

and it provides for extensibility.

4.2.6 Greedy Algorithm for Nonlinear Integer Programming Formulation

Because the number of subvolumes multiplied by the number of levels of detail, N, in a data set may be very large, a greedy approximation to the integer programming problem is more practical. For example, with 1 MiB subvolumes (which equates to roughly 23^4 samples per univariate 4D block with 4 bytes per variable) and 8 levels of detail, a 1 TiB volume would have approximately 8 million unknowns in equation (4.7) and 1 million in equation

(4.5). Even if a nonlinear but polynomial time direct solution was possible for the integer programming problem, the performance would still be insufficient for performing LOD selection during the interactive portion of the workflow.

Our approach is to consider the set of potential levels of detail for all subvolumes, then apply them to the subvolumes in order of increasing error density until the size constraint is satisfied. We propose a three step greedy algorithm for accomplishing this:

1. Estimate the result error for every LOD for every subvolume, as described in §4.2.6. This requires O(N) time using histogram spectra.

2. Compute the error density values and sort the potential LOD assignments by them, as described in §4.2.6. This requires O(N lg N) time.

3. Assign the best LOD to every block using the sorted list, as described in §4.2.6. This requires O(N) time.

Error estimation

Every possible level of detail selection for a subvolume has an associated estimate of the sampling error that would be present due to the choice of the level of detail. Whenever the intervals are changed via the weighting function, equation (4.3), the error estimate will need to be recomputed. If RMS error is computed directly, instead of using the precom- puted histogram spectra with equation (4.4), the entire volume will need to be revisited to compute the error, which is impractical within an interactive workflow. However, if the histogram spectra are used to estimate error using equation (4.4) then only the histogram spectra need to be visited. The histogram spectra are much smaller than the entire vol- ume and have already been computed during either the data preparation or data generation phases.

Sorting and the heuristic

A list is constructed containing an entry for every potential LOD assignment, for every subvolume. Each entry has a heuristic value (error density), an LOD index, and a sub- volume index. The heuristic value used for subvolume i with LOD j, Ai, j, is defined as follows:

A_{i,j} = \frac{E_i(a_j)}{S_{i,j}}    (4.11)

where a_j is the sampling frequency of LOD j, and S_{i,j} is the size in bytes loaded for LOD j of subvolume i. E_i(a_j) is equation (4.4) evaluated for subvolume i or, alternatively, the directly computed RMS error, which was used for the performance tests in §4.3.2.

This list is then sorted in ascending order of the heuristic, A_{i,j}. This results in a list of

potential LOD assignments sorted by ascending error density.
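A sketch of this construction and sort is shown below; it assumes per-subvolume predicted errors and per-LOD load sizes have already been computed, and the entry layout and names are illustrative rather than taken from our implementation.

    def build_sorted_heuristic_list(predicted_errors, lod_sizes):
        # predicted_errors[i][j]: E_i(a_j) for subvolume i, LOD index j (0-based).
        # lod_sizes[j]: bytes loaded for LOD j (assumed uniform across subvolumes here).
        # Returns entries sorted by ascending error density A_ij = E_i(a_j) / S_ij
        # (equation 4.11).
        entries = []
        for block, errors in enumerate(predicted_errors):
            for lod, error in enumerate(errors):
                entries.append({"block": block,
                                "lod": lod,
                                "density": error / lod_sizes[lod]})
        entries.sort(key=lambda entry: entry["density"])
        return entries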

LOD assignment

Conceptually the goal is to choose levels of detail that minimize error density, subject

to a size constraint, Smax. The following algorithm is applied to assign levels of detail using

the sorted list:

Listing 4.1: LOD assignment algorithm

    L         := (list produced by sorting, in ascending order of the heuristic)
    B         := (LOD assignment for each subvolume, initialized to LOD 1)
    N_subvols := (number of subvolumes)
    S_max     := (maximum working set size)
    S_total   := N_subvols * getSubVolSizeForLOD(1)
    for (i in 1..L.length) AND S_total > S_max
        # Apply this entry only if it is coarser than the block's current assignment
        # (a higher LOD index means lower detail); this decimation step is
        # reconstructed from the description in this section.
        if B[L[i].block] < L[i].lod
            S_total := S_total - getSubVolSizeForLOD(B[L[i].block])
                               + getSubVolSizeForLOD(L[i].lod)
            B[L[i].block] := L[i].lod

When the solution is feasible, we have found this algorithm produces results close to the

optimal solution produced by directly applying general binary integer programming algo-

rithms, as can be seen in fig. 4.5. When the Smax constraint is too low for a feasible solution,

it gracefully results in the lowest detail level being specified for all blocks.

4.2.7 Multivariate Considerations

Multiple variables with different units are commonly used simultaneously in visual- izations. For example, in a weather simulation we may be interested in volume rendering clouds in the context of a water vapor field. In this case, this implies that we want high levels of detail where lower levels of detail would introduce too much error in either the

(Plot: RMS error of result vs. working set sample count; curves: Greedy, Direct.)

Figure 4.5: Directly solving the integer programming problem with a general integer programming package is impractical due to the high computational complexity involved in solving the NP-hard problem. Our greedy algorithm, as described in §4.2.6, yields nearly identical results in O(N lg N) time, where N is linearly proportional to the number of subvolumes.

water vapor field, or the cloud field. These variables that are used to guide the selection of levels of detail are called guiding variables.

Optimization

Histogram spectra are generated for each guiding variable. Separate weighting func- tions are applied for each variable to produce weighted histogram spectra for each, enabling the estimation of error for each guiding variable independently using equation (4.4). Ef- fectively the nonlinear integer programming problem in equations (4.5) and (4.6) can be extended to include C variables:

\arg\min_{L} \sum_{k=1}^{C} \sum_{i=1}^{N_k} E_{k,i}(a_{k,L_{k,i}})    (4.12)

with the constraints

\sum_{k=1}^{C} \sum_{i=1}^{N_k} S_{k,i,L_{k,i}} \le S_{max}    (4.13)

1 \le L_{k,i} \le M; \quad L_{k,i} \in \mathbb{Z}

where a_{k,j} is the sampling frequency for level of detail index j of variable k, S_{k,i,j} is the load size of LOD j of block i of variable k, N_k is the number of subvolumes in variable k of the volume, M is the number of levels of detail, and C is the number of variables. The binary linear integer programming formulation in equations (4.7), (4.8), and (4.9) can be similarly extended:

\arg\min_{H} \sum_{k=1}^{C} \sum_{i=1}^{N_k} \sum_{j=1}^{M} E_{k,i}(a_{k,j}) H_{k,i,j}    (4.14)

with the constraints

\sum_{k=1}^{C} \sum_{i=1}^{N_k} \sum_{j=1}^{M} H_{k,i,j} S_{k,i,j} \le S_{max}    (4.15)

\sum_{j=1}^{M} H_{k,i,j} = 1; \quad 1 \le i \le N_k; \ i \in \mathbb{Z}; \quad 1 \le k \le C; \ k \in \mathbb{Z}

The optimization solutions presented in §4.2.6 apply identically to this multivariate case as they do to the univariate case. The multivariate forms’ objective functions are defined as the sum of multiple univariate objective functions. Similarly, the multivariate forms’ constraints are the logical conjunction of multiple univariate constraint sets. When C is 1

the multivariate optimization equations reduce to the univariate optimization equations.

Conditional importance

Sometimes there are variables that are only important where the guiding variables are within a particular interval. We refer to these variables as following variables. For example, consider the case where a user wants to see the vertical velocity of clouds in the context of a volume rendering where the cloud density defines the opacity and the cloud color is defined by the vertical velocity. In this case the guiding variable is the cloud density and the vertical velocity is the following variable.

From the standpoint of optimization, the guiding variables and following variables are treated identically. However, the weighting functions for following variables should take into account conditioning by the guiding variables. In many circumstances, such as in the above cloud velocity example, the probability distribution of the following variable within the interval of interest of the guiding variable is different from the probability distribution of the following variable over the entire volume. This has been observed in our test data sets, as can be seen in fig. 4.7.

While it is not absolutely required, the weighting function for following variables should be chosen to assign increased weight where the conditional probability density of the following variable is high. This will reduce the relative importance, with respect to level of detail selection, assigned to values that fall outside of the guiding variable interval volumes of interest. We have found that this works effectively, as can be seen in the climate data test case in the results.
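One simple way to construct such a weighting function is to estimate the conditional PDF of the following variable from the samples where the guiding variable lies in its interval of interest, then normalize it to [0, 1]; the sketch below does this with a histogram estimate, and the variable names (following the cloud example) are illustrative.

    import numpy as np

    def conditional_weighting(following, guiding, interval, bins=128, value_range=None):
        # Weight for a following variable (e.g., W) proportional to its conditional
        # PDF given that the guiding variable (e.g., QCLOUD) lies in interval = (lo, hi).
        following = np.asarray(following)
        guiding = np.asarray(guiding)
        lo, hi = interval
        selected = following[(guiding >= lo) & (guiding <= hi)]
        if value_range is None:
            value_range = (float(following.min()), float(following.max()))
        pdf, edges = np.histogram(selected, bins=bins, range=value_range, density=True)
        weights = pdf / pdf.max() if pdf.max() > 0 else pdf
        centers = 0.5 * (edges[:-1] + edges[1:])
        return centers, weights  # bin centers and weights in [0, 1]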

4.3 Results

Fundamentally, the goal of our technique is to permit interactive selection of levels of detail on data sets much larger than can be fit in-core or loaded interactively. This enables users to interactively select levels of detail that focus on the intervals of interest within time-varying volumes, which offer increased quality for a given sample size constraint. We performed experiments to look at three aspects of this: running times, visual quality, and statistical quality.

4.3.1 Test data sets

Two data sets were used for experiments, one being from a climate simulation at Pacific

Northwest National Laboratory and the other being from a turbulent combustion simulation at Sandia National Laboratory. Both were time-varying volume data sets.

Climate

The climate data set is a set of multivariate volume timestep snapshots from a long- term weather simulation of the region around Indonesia in the context of climate change research. The 4D multiresolution data set used for the purposes of the experiments was

117GiB with 8 levels of detail, sampled on a time-varying geopotentially-defined curvilinear mesh with 41 timesteps broken into 18,944 4D subvolumes. The data set contained variables for geopotentially-defined elevation (MESHZ), cloud density (QCLOUD), water vapor density (QVAPOR), and vertical velocity (W).

The MESHZ variable was comprised of 1,184 4D subvolumes with 8 levels of detail ranging from 5.8MiB per subvolume to 64 bytes per subvolume. The QCLOUD, QVA-

POR, and W variables were each comprised of 5,920 4D subvolumes with 8 levels of detail ranging from 1.3MiB per subvolume to 64 bytes per subvolume. The total size of the his- togram spectra, discretized into 128 histogram bins and 7 frequencies per subvolume, for the entire data set, was 67MiB, 0.056% of the size of the multiresolution volume. Addi- tional static 2D variables included for the purposes of producing renderings were the land elevation, vegetation fraction, and surface normals.

Combustion

The combustion data set is a set of volume timestep snapshots from a simulation of the injection of fuel into two countercurrent air streams in which combustion occurs. A

single variable, the mixing fraction (referred to as MIXFRAC), was used for the purposes

of testing. The 4D multiresolution data set, defined on a regular grid, was 69GiB with 8

levels of detail and 121 timesteps broken into 17,010 subvolumes. Levels of detail ranged

from 1.1MiB per block to 64 bytes per block. The total size of the histogram spectra,

discretized into 128 histogram bins and 7 frequencies per subvolume, for the entire data set

was 60MiB, 0.085% of the size of the multiresolution volume.

4.3.2 Running time comparisons

The two major bottlenecks to interactive LOD selection for varying intervals of interest are the load time for error estimation, and the computation time for solving the optimization problem to minimize error for a given size constraint.

The test platform was a Linux PC with an Intel Core 2 6600 dual core CPU, 4GiB of main memory, and a hard drive with the IBM JFS filesystem capable of approximately

30MiB/s with 5-10ms latency for data storage.

Error estimation

For the error estimation aspect, we compare using the histogram spectra versus di- rectly estimating the RMS error from the data. Estimating the error with the histogram spectra predicted error (HSPE) only requires loading the substantially smaller discretized histogram spectrum for each subvolume. Estimating the error directly with RMS error

(RMSE) requires loading the entire data set every time the LOD changes.

In the following tests LOD selection was performed for a size constraint and a target interval on the climate QVAPOR and combustion MIXFRAC variables, in conjunction with our greedy algorithm for the LOD selection. The size constraint choice and target interval choice do not affect the timing results.

  Heuristic    Data set   LOD Selection Time
  RMSE/size    QVAPOR     1782.0s
  HSPE/size    QVAPOR     0.1s
  RMSE/size    MIXFRAC    3900.3s
  HSPE/size    MIXFRAC    0.2s

Using HSPE to compute the error in the heuristic, A_{i,j}, clearly outperforms using RMSE on both data sets. This is because using HSPE only depends on the histogram spectra, which have already been precomputed in the non-interactive portion of the workflow. In contrast, RMSE requires reading the entire volume data set every time the list described in

§4.2.6 is constructed. Figure 4.8 shows the typical relationship between the error perfor-

mance of the HSPE and RMSE error estimators.

Optimization

For the optimization aspect, we compare our greedy approximation to a direct integer programming approach. While general binary integer programming is an NP-hard problem, packages like GLPK apply some techniques to improve performance. Further information about the techniques GLPK applies can be found in the GLPK source code.

For GLPK with the QVAPOR data set, binary linear integer programming was per- formed with 47,360 variables. The greedy algorithm required only 5,920 entries to be sorted because only one is needed per block, rather than one per block per LOD. For the

MIXFRAC data set, binary linear integer programming was performed with 136,080 vari- ables, and the greedy algorithm required only 17,010 entries to be sorted.

The running time for GLPK is sensitive to the target interval of interest, while the run- ning time for the greedy algorithm is unaffected by the target interval of interest. This is because the techniques GLPK can apply depend on the coefficients in the linear program- ming problem, while the greedy approximation simply sorts by the heuristic then assigns the levels of detail. In all cases the GLPK performance was slower.

  Solver   Data set   Solving time
  GLPK     QVAPOR     7625.6s
  Greedy   QVAPOR     0.1s
  GLPK     MIXFRAC    142.1s
  Greedy   MIXFRAC    0.2s

The greedy algorithm substantially outperforms GLPK. The greedy algorithm is much faster because its computational complexity is O(N lg N), as discussed

in §4.2.6. This stands in contrast to the general integer programming methods applied by

GLPK which, for nontrivial inputs, are much worse than polynomial time and exponential in the worst case. The result of the greedy algorithm was also found to be consistently very close to that of GLPK, as can be seen in figure 4.5.

Histogram Spectra Computation

Computation of the discrete histogram spectrum for a single subvolume requires the computation of a histogram for different sampling frequencies for the subvolume. Each subvolume can be processed independently, yielding an embarrassingly parallel streaming algorithm that can be easily implemented on GPUs, multi-core, and/or multi-node plat- forms. Because of this, the computation of histogram spectra is likely to be read-bound, rather than compute-bound, on most system configurations. However, placement within the visualization workflow will determine the true cost of this operation.

If the histogram spectra computation is done in-situ, during the data generation phase, no additional reads are required because the histogram spectra can be computed as the data is written out to disk. If the histogram spectra computation is done as a separate pass during the data preparation phase then the volume needs to be streamed in from disk storage once, in its entirety. It is likely that software engineering considerations specific to each application will dictate which approach is appropriate. In either case, the computation process scales linearly with the number of data samples.
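Because each brick is independent, the precomputation is a straightforward parallel map over the subvolumes; the sketch below uses Python's multiprocessing as one possible realization, with compute_spectrum standing in for a user-supplied routine (for example, one that loads a brick and applies the histogram_spectrum sketch above).

    from multiprocessing import Pool

    def precompute_spectra(brick_ids, compute_spectrum, workers=8):
        # Map an independent per-brick histogram spectrum computation across
        # worker processes; compute_spectrum must be a picklable, top-level
        # function taking a brick identifier and returning its spectrum.
        with Pool(workers) as pool:
            spectra = pool.map(compute_spectrum, brick_ids)
        return dict(zip(brick_ids, spectra))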

4.3.3 Visual and statistical comparisons

Both from a statistical and visual standpoint, choosing narrow, salient intervals to focus on for error reduction yields improved quality. The univariate case was tested using the combustion data set, and the multivariate case was tested using the climate data set. Narrow interval widths are the minimal interval widths needed to cover the non-transparent portions of the color-opacity transfer functions used for producing the figures, while wide interval widths cover the entire value domain.

Combustion

Level of detail selection operations were performed for different interval widths and centers on the MIXFRAC variable of the combustion data set. Figure 4.10 exhibits the typ- ical dependence observed of the RMS error on the width of the interval volume of interest as defined by a user via the weighting function. For a fixed working set size constraint, increasing the width tends to result in increased error. This is reasonable, because a larger interval volume will encompass more samples yet the information density is likely to re- main similar.

The implications of this increased error can be observed in figure 4.9. Artifacts are typ- ical of block-wise downsampling, with a smoothing effect on the data and discontinuities at block (or subvolume) boundaries. Animations of the time series exhibit the improvement of the narrow interval over the wide interval more dramatically than the images. Choosing narrower intervals clearly yields images closer to the ground truth.

Climate

Similarly to the combustion data set, level of detail selection operations were performed for different interval widths. However, multiple variables were considered simultaneously.

The QCLOUD, QVAPOR, and MESHZ were guiding variables while the W was a following variable, as described in §4.2.7. Using the guidelines in §4.2.7, the weighting function

for the following variable, W, was conditioned by QCLOUD using the conditional PDF in

figure 4.7.

Figure 4.6 exhibits the results. Like the combustion data set in figure 4.9, the quality was higher for narrower intervals. The figures produced using narrower intervals of interest are more similar to the ground truth than those produced using wider intervals of interest.

4.4 Conclusion

We have introduced the concept of histogram spectra as a new approach for efficiently estimating error due to downsampling for interval volumes of time-varying, multivariate, multiresolution volumes. A new optimization approach for level of detail selection was then introduced taking advantage of the linear relationship between the histogram spectra predicted error and RMS error. Both the optimization approach and the histogram spectra are easy to implement in software, increasing the practical applicability of our approach.

These contributions enable interactive level of detail selection on large, multivariate, multiresolution volumes for user-specified intervals of interest. By enabling the interactive selection of intervals of interest for the purposes of level of detail selection, increased visual and statistical quality can be obtained.

(a) Ground truth (b) Ground truth, zoom

(c) Narrow intervals, zoom (d) Wide intervals, zoom

Figure 4.6: Several variables from the climate data set are rendered for a single timestep. The white, opaque parts are clouds defined by the QCLOUD variable. The magenta regions are clouds with high vertical velocities, as determined by the W variable. The yellow exhibits water vapor density as determined by the QVAPOR variable. The volume is a curvilinear volume, with the Z variable of its mesh determined by the MESHZ variable. All of these variables have their levels of detail determined by the level of detail algorithm. Figures 4.6a and 4.6b are generated from the ground truth resolution, while figures 4.6c and 4.6d have levels of detail selected for a 4GiB working set size constraint. Figure 4.6c was generated with narrow intervals of interest, while fig. 4.6d was generated with wide intervals of interest. Like in fig. 4.9, selecting narrow intervals of interest yields results closer to the ground truth than selecting wide intervals of interest.


Figure 4.7: In some cases, with multivariate fields, a user is interested in seeing a variable A where variable B is between B0 and B1. This interval [B0 : B1] is expressed as a weighting function for the histogram spectra of B. The choice of the best weighting function for A depends on the statistical dependence between A and B. If A is not independent of B, then we can use the conditional probability density function of A given that B lies within [B0 : B1] as a starting point for constructing a weighting function for A. In this example, it can be seen that the PDF of the vertical velocity (W) in the climate data set is different for different intervals of the cloud density (QCLOUD).
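As an illustration of this conditioning step, the sketch below estimates the conditional PDF of a following variable given that a guiding variable lies in a user-selected interval, and rescales it for use as a weighting function. The synthetic stand-ins for W and QCLOUD and the simple histogram estimator are assumptions for illustration only, not the dissertation's implementation.

```python
import numpy as np

def conditional_weighting(a, b, b_lo, b_hi, bins=64):
    """Estimate p(A | B in [b_lo, b_hi]) from co-located samples of A and B, and rescale
    its peak to 1 so it can serve as a weighting function over the value range of A."""
    mask = (b >= b_lo) & (b <= b_hi)          # samples where the guiding variable is in range
    hist, edges = np.histogram(a[mask], bins=bins, density=True)
    peak = hist.max()
    return edges, (hist / peak if peak > 0 else hist)

# Synthetic stand-ins for W (following) and QCLOUD (guiding).
rng = np.random.default_rng(0)
qcloud = rng.random(100000)
w = rng.normal(loc=qcloud, scale=0.2)          # W loosely dependent on QCLOUD
edges, weights = conditional_weighting(w, qcloud, 0.8, 1.0)
```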

(Figure 4.8 plot: RMS error of result versus working set sample count, for the HSPE and RMSE error estimators.)

Figure 4.8: This figure shows the error for different working set size constraints, using different error estimators in the LOD selection algorithm. The E_j function in the optimization problem as referenced by equation (4.5) can be approximated using equation (4.4) instead of directly computing the RMS error (RMSE). The prediction of error using the histogram spectra predicted error (HSPE) yields results close to the direct RMS error. By using equation (4.4) with histogram spectra it is possible to avoid loading samples from the source volume when performing LOD selection, substantially improving performance.

(a) Ground truth. (b) Narrow intervals. (c) Wide intervals.

Figure 4.9: Values of MIXFRAC from the combustion data set within the range [0.45:0.55] are rendered for a single timestep, where values less than 0.5 are blue and those greater than or equal to 0.5 are orange. Figure 4.9a is a crop of an image generated using the ground truth resolution, while figures 4.9b and 4.9c have levels of detail selected for a 250MiB working set size constraint. Figure 4.9b has a weighting function that is 1 for values in the range [0.45:0.55] and 0 elsewhere. Figure 4.9c has a weighting function that is uniformly 1. The narrower interval of interest used for figure 4.9b clearly yields a result closer to the ground truth than the wide interval of interest that was used for figure 4.9c.

(Figure 4.10 plot: RMS error versus interval width.)

Figure 4.10: For a fixed working set size constraint, increasing the width of the range of values defining the interval volumes of interest results in increased error. This figure, which was generated using the QVAPOR variable of the climate test data set defined in §7.3, is typical of what we have observed. This is to be expected because a larger interval volume will encompass more samples yet the information density is likely to remain similar. Thus, the narrower the interval volume of interest, the fewer samples are needed to reconstruct the volume with a given level of error.

Chapter 5: Efficient Rendering of Extrudable Curvilinear Volumes

Ray casting through curvilinear adaptive mesh refinement volumes requires a large number of transformations between computational and physical space. While this is trivial for rectilinear computational spaces, curvilinear computational spaces present additional challenges. By exploiting the characteristics of a particular class of curvilinear spaces, we enable volume rendering at interactive frame rates with minimal preprocessing and memory overhead using commodity graphics hardware.

The core contribution of our technique is its representation of a curvilinear space as an extrusion of a profile surface along a curve, permitting memory- and time-efficient transformations between physical space and computational space. Our technique renders the data in blocks, where each block is a curvilinear grid of cells (for example, a block may be 64x64x64 cells). Blocks can have different sampling resolutions, and information is passed to the renderer to permit block hierarchies. The borders of each block are rendered to the framebuffer using triangles with vertices specified in computational space. A vertex program is utilized to transform the vertices from computational space to physical space such that, for a given computational space position in any given level, the physical space position produced is consistent. Ray casting is performed within the fragment program for every fragment rendered, stepping the ray simultaneously through both computational space and physical space. The step size is found by computing intersections with cell boundaries in computational space and then applying a minimum step length constraint. The physical space step vector is trivially determined by the direction from the camera origin to the fragment position, and the computational space step vector is found by applying a Jacobian matrix, which is easy to derive using our specialized representation, to the physical space step vector.

Several constraints were considered in the design of this method. Memory efficiency is of great importance to performance, both because of limited memory capacity and limited memory bandwidth. Additionally, any representation of computational space must be easily traversable without a significant loss in accuracy. Finally, the inexpensive computational power available on GPUs today must be exploited, despite its limitations, to be competitive with other techniques.

As data sets from simulations become larger, memory efficiency of rendering techniques becomes more important. Additionally, as computational power becomes more densely packaged in devices such as GPUs, the disparity between computation and memory performance becomes greater, necessitating the use of techniques that facilitate efficient cache utilization. Given these constraints, direct rendering of curvilinear data makes more practical sense than resampling the data into rectilinear meshes or decomposing it into tetrahedral meshes.

A potential application of curvilinear adaptive mesh refinement volume rendering is exhibited in section 5.2. Section 5.3 describes the proposed specialized computational space representation. Section 5.4 then provides details about the rendering process and volume data structure, and the ray casting algorithm is described in section 5.4.1. Finally, results are examined in section 5.5.

Figure 5.1: Sample volume renderings of data set 1. The left column shows two views of one data component. The right column shows two different AMR level ranges for a different component, with the top image showing levels 0 through 1, the bottom image showing just level 1.

5.1 Related Work

Many methods exist for volume rendering of curvilinear data. Many of them can be adapted to support adaptive mesh refinement data sets, and some can be easily implemented such that they take advantage of GPU capabilities. Four potential methods are resampling to rectilinear space, decomposition of curvilinear data into unstructured tetrahedral data, direct cell projection, and direct ray casting. A good survey of techniques for volume rendering is provided in [69].

Techniques for volume rendering of rectilinear data are the most well-developed and tend to be the most straightforward due to the simplicity of the mesh. [105] presents a GPU ray casting implementation for rectilinear data. [113] proposes GPU data structures for efficient volume rendering of rectilinear AMR data. [57] presents acceleration structures for supporting empty space skipping and early ray termination for GPU rendering of rectilinear data. Resampling of the curvilinear data into a rectilinear mesh offers the obvious advantage of enabling these well-developed techniques at the cost of introducing extra sampling error, increasing memory consumption, and reducing potential performance due to memory bandwidth requirements. Additionally, the level of preprocessing required may be unacceptable for large time-varying datasets.

Decomposition of the curvilinear grid into an unstructured tetrahedral mesh is straightforward and enables the usage of the large body of work on unstructured tetrahedral mesh rendering. [131] proposes a point-based approach for rendering unstructured meshes. [29] and [102] propose rendering techniques for tetrahedral elements. [86] presents a technique for rendering unstructured grids using graphics hardware-assisted incremental slicing. However, in this process of decomposition, potentially useful data for accelerating rendering may be lost and, if the decomposition is done as a preprocessing step, excessive memory consumption may result. Additionally, unstructured tetrahedral meshes introduce additional challenges for the evaluation of depth-order-dependent transfer functions.

Curvilinear direct cell projection offers the potential of avoiding preprocessing and utilizing the curvilinear structure of the data to accelerate sorting and rendering. [28] presents a technique utilizing hardware-accelerated polygon rendering and supporting depth-order-dependent transfer functions. However, direct cell projection may require a significant amount of non-localized overdraw which, when implemented on modern GPUs, can reduce performance due to memory bandwidth limitations. Additionally, implementation of direct cell projection requires a significant amount of vertex data to be manipulated, further increasing the required memory bandwidth and vertex processing requirements for rendering.

Figure 5.2: Sample renderings from data set 2. In clockwise direction from the top left corner are AMR levels 0 through 4, 2 through 4, 3 through 4, and 4.

Rendering of AMR data has been well explored in the context of rectilinear data sets. [119] presents a technique for rendering AMR data with cell projection using stitch meshes between the levels. [50] converts a sparse rectilinear data set into an AMR hierarchy which is then rendered using texture based rendering. Curvilinear AMR presents additional challenges in that the cells in general have non-planar faces and blocks of data might not be convex.

Curvilinear direct ray casting offers the same potential as curvilinear direct cell projection for reduced preprocessing and data loss, while being more adaptable for implementation on modern GPUs. [122] presents a ray casting technique for direct curvilinear volume rendering and compares it to resampling to a rectilinear mesh. [42] and [41] present additional methods utilizing ray casting. [9] proposes using textures for a transformation from physical space to computational space. Our technique provides a compact representation of the mesh for transformations from physical space to computational space as well as from computational space to physical space while greatly reducing the memory required for mesh specification. Sorting, for single block non-AMR data sets, is implied, and image quality can be smoothly changed to permit a user-driven compromise between speed and quality.

5.2 Applications

The primary target application for the curvilinear volume rendering technique presented in this chapter is analysis of magnetic confinement fusion data. Thus, in developing this technique, consultations were made with a domain expert in that field, Ravi Samtaney of the Princeton Plasma Physics Laboratory. The remainder of the text of this applications section is his description of the application, and why volume rendering is important to it.

Therefore, it is quoted in italics:

ITER (“The Way” in Latin), a joint international research and development project that aims to demonstrate the scientific and technical feasibility of fusion power, is now under construction at Cadarache, France. Refueling of ITER is a practical necessity due to the burning plasma nature of the experiment, and longer pulse durations (100 - 1000 seconds). An experimentally proven method of refueling tokamaks is by pellet injection. Pellet injection is currently seen as the most likely refueling technique for ITER. Thus it is imperative that pellet injection phenomena be understood via simulations before very expensive experiments are undertaken in ITER. The emphasis of the present work is to understand the large-scale macroscopic processes involved in the redistribution of mass into a tokamak during pellet injection. In particular, it was experimentally established that high-field-side (HFS, or inside) pellet launches are more effective than low-field-side (LFS, or outside) pellet launches. Arguably, such large scale processes are best understood using magnetohydrodynamics (MHD) as the mathematical model.

There is a large disparity between the pellet size and device size. Naive estimates indicate that the number of space-time points required to resolve the region around the pellet for simulation of ITER-size parameters can exceed 10^19. The large range of spatial scales and the need to resolve the region around the pellet is somewhat mitigated by the use of adaptive mesh refinement (AMR). Our approach is to employ block structured hierarchical meshes using the Chombo library for AMR developed by the APDEC SciDAC Center at LBNL.

We use data from simulations performed with an adaptive upwind conservative mesh MHD code in generalized curvilinear coordinates. A critical component is the modeling of the highly anisotropic energy transfer from the background hot plasma to the pellet ablation cloud via long mean-free-path electrons along magnetic field lines. Further details on the approach can be found in [92].

A primary scientific question is establishing the MHD mechanisms responsible for the differences in HFS and LFS pellet launches. Visualizations of the density field help identify the extent of the migration of the ablated pellet mass along the magnetic field lines and, more importantly, the transport across magnetic flux surfaces in the direction of increasing major radius. In particular, volume rendering of the density field is an effective method to visualize the global mass distribution in the tokamak during pellet injection.

5.3 Computational Space Representation

Our technique supports rendering of volumes that can be represented via an extrusion of a planar profile surface along a curve to create the physical space grid as a function of computational space. For example, a torus can be represented as a planar radially sampled circle extruded around another circle. A cylinder can be represented by a radially sampled circle extruded along a line. The tokamak shape used in the MHD simulation data presented in section 5.2 is another potential application.

5.3.1 Data and spaces

The input data is an AMR hierarchy of blocks. Each block consists of a 3D array of dimensionality N_i × N_j × N_k. The function d_phypos(i, j, k) maps a given i, j, k computational space point within a block to a physical space point. Additionally, each i, j, k computational space point has one or more scalar or vector field data components that may be rendered.

Physical space is defined as the world space through which linear camera rays are cast to produce images for the user. For curvilinear data sets, a linear component in physical space will generally correspond to a curved component in computational space. Thus, for ray casting to be performed for a camera in physical space, transformations between computational and physical space are required for positions and vectors.

5.3.2 Positional transformations

Equation 5.1 transforms a point from computational space to physical space. This transformation is needed to derive the Jacobian, as well as to compute the distance between a given point in physical space and a point in physical space corresponding to a given point in computational space.

\bar{q}(i,j,k) = \bar{p}(k) + \bar{s}(i,j)_u\,[\hat{n}(k) \times \hat{y}] + \bar{s}(i,j)_v\,[\hat{y}] \quad (5.1)

\bar{s}(i,j) = \begin{bmatrix} [\hat{n}(k) \times \hat{y}] \cdot [d_{\mathrm{phypos}}(i,j,k) - \bar{p}(k)] \\ \hat{y} \cdot [d_{\mathrm{phypos}}(i,j,k) - \bar{p}(k)] \end{bmatrix} \quad (5.2)

where

q̄(i, j, k) is the physical space 3D position of a point i, j, k in computational space.

p̄(k) is a physical space position as a function of k in computational space that is the origin of each slice plane of the extruded volume. Effectively this defines an extrusion path.

n̂(k) is a physical space unit vector as a function of k in computational space that represents the normal to each “slice” of the extruded volume.

s̄(i, j) is the planar profile surface, defined relative to n̂(k) and ŷ as in equation 5.2.

ŷ is a physical space unit vector against which all slice normals n̂(k) must be orthogonal.

Effectively, equation 5.1 is a more compact representation of d_phypos(i, j, k) that takes advantage of the characteristics of d_phypos(i, j, k) that permit it to be represented as an extrusion of a profile surface along a curve rather than being represented as a simple 3D array.
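As a concrete illustration, the sketch below evaluates equation 5.1 for sampled p̄(k), n̂(k), and s̄(i, j) arrays on the CPU. The array-based sampling and the cylinder example are assumptions for illustration; the renderer itself evaluates this transformation in GLSL vertex and fragment programs as described in section 5.4.4.

```python
import numpy as np

def comp_to_phys(i, j, k, p_bar, n_hat, s_bar, y_hat):
    """Evaluate equation 5.1: map computational-space indices (i, j, k) to physical space.

    p_bar : (Nk, 3) sampled extrusion path p(k)
    n_hat : (Nk, 3) sampled slice normals n(k), each orthogonal to y_hat
    s_bar : (Ni, Nj, 2) sampled profile surface s(i, j), (u, v) components
    y_hat : (3,) fixed physical-space unit vector
    """
    u, v = s_bar[i, j]
    return p_bar[k] + u * np.cross(n_hat[k], y_hat) + v * y_hat

# Example: a cylinder, i.e. a radially sampled circle extruded along the z-axis.
Ni, Nj, Nk = 8, 16, 32
radius, angle = np.meshgrid(np.linspace(0.1, 1.0, Ni),
                            np.linspace(0.0, 2.0 * np.pi, Nj, endpoint=False),
                            indexing="ij")
s_bar = np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=-1)
p_bar = np.stack([np.zeros(Nk), np.zeros(Nk), np.linspace(0.0, 4.0, Nk)], axis=1)
n_hat = np.tile([0.0, 0.0, 1.0], (Nk, 1))
y_hat = np.array([0.0, 1.0, 0.0])
q = comp_to_phys(3, 5, 10, p_bar, n_hat, s_bar, y_hat)   # physical-space position
```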

5.3.3 Computation of p̄(k), n̂(k), s̄(i, j), and ŷ

The p̄(k), n̂(k), and s̄(i, j) functions can be either derived from a given mesh or user-specified separately. The p̄(k) and n̂(k) functions are represented as one-dimensional sampled data, while s̄(i, j) is represented as two-dimensional sampled data. If it is necessary to derive these functions from a given mesh, the following process is used (a code sketch of the first two steps appears below):

1. Find p̄(k) as in equation 5.3. This is the mean of all given sample positions in physical space in slice k. It does not matter if the point is in the exact center of the slice, as it is just a reference point that must lie in the plane of the slice.

2. Find n̂(k) via numerical methods. This is the normal to the planar slice. Random triplets of sample points are chosen in each slice, and two vectors (sharing one of the three points as a common origin) are formed for each triplet. A cross product is applied on those vectors, and the sign of the resulting vector is adjusted such that it points forward from the slice (using a simple forward or backward difference reference vector). For each slice, many of these triplets are found and added into a per-slice accumulator vector until the change in the direction of the accumulator vector falls below a threshold. The resulting per-slice accumulator vectors are normalized to produce accurate slice normals.

3. Pick ŷ. All n̂(k) must be orthogonal to ŷ. In our test data, ŷ was simply the y-axis. If ŷ is not known initially, it can be found in a manner similar to n̂(k), but using the n̂(k) values instead of slice sample positions.

4. Find s̄(i, j). Any k slice within the volume can be picked to form this function, and in practice k = 0 is used. A planar coordinate system defined by p̄(k), ŷ, and n̂(k) is defined at each slice, and the sample points are projected onto the axes of that coordinate system to find s̄(i, j) positions, as in equation 5.2.

\bar{p}(k) = \frac{1}{N_i N_j} \sum_{i=0}^{N_i} \sum_{j=0}^{N_j} d_{\mathrm{phypos}}(i, j, k) \quad (5.3)

Note that while p̄(k) does not need to be planar, n̂(k) · ŷ must be zero. Thus, highly irregular p̄(k) functions may not yield useful results because the profile surface can only be rotated about the ŷ axis. Additionally, the curvature of the extrusion must be small enough relative to the size of the slices such that no two slices intersect. The algorithm could be easily extended to support varying ŷ as a function of k, which would permit more than one degree of rotational freedom for slices, but many data sets will not require this and it increases the expense of Jacobian computation.
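A minimal sketch of steps 1 and 2 of the derivation above, assuming the mesh is given as a dense (Ni, Nj, Nk, 3) array of physical space positions; the fixed randomized triplet count is a simplification of the convergence test described in step 2.

```python
import numpy as np

def derive_path_and_normals(phypos, n_triplets=64, rng=None):
    """Derive p_bar(k) and n_hat(k) from a (Ni, Nj, Nk, 3) array of physical positions."""
    rng = np.random.default_rng() if rng is None else rng
    Ni, Nj, Nk, _ = phypos.shape
    p_bar = phypos.reshape(Ni * Nj, Nk, 3).mean(axis=0)       # step 1: per-slice mean position
    n_hat = np.zeros((Nk, 3))
    for k in range(Nk):
        slice_pts = phypos[:, :, k, :].reshape(-1, 3)
        ref = p_bar[min(k + 1, Nk - 1)] - p_bar[max(k - 1, 0)]  # forward/backward difference reference
        acc = np.zeros(3)
        for _ in range(n_triplets):                            # step 2: accumulate triplet normals
            a, b, c = slice_pts[rng.choice(len(slice_pts), 3, replace=False)]
            n = np.cross(b - a, c - a)
            if np.dot(n, ref) < 0.0:                           # flip so the normal points forward
                n = -n
            acc += n
        n_hat[k] = acc / np.linalg.norm(acc)
    return p_bar, n_hat
```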

While the computational to physical space transformation is very straightforward to implement with this specialized representation, the physical space to computational space transformation is not. An analytical inverse to the above function is often impractical, and producing a 3D sampled volume in physical space mapping the positions to computational space would impart excessive memory requirements while potentially introducing inconsistencies at the edges of the valid data. However, [9] did implement the physical space to computational space transformation using 3D textures.

The proposed rendering algorithm provides a high-accuracy starting computational space position for a ray, while requiring only incremental updates to the computational space ray position as the physical space ray position is stepped during ray casting. To avoid the accumulation of error in an incrementally updated computational space ray position, small corrections are required at each step. Equation 5.1 can be applied to find a distance between a given physical space point and a physical space point corresponding to a given computational space point, which can then be used as a convergence test for a gradient descent algorithm, obviating the need for a full physical space to computational space positional transformation.

Figure 5.3: Data set 2 volume block bounding wireframes. Each vertex corresponds to a grid-centered position on the boundary. The wireframes demonstrate the curvature and non-uniform cell sizes of the curvilinear space. Level 0 has 8 distinct blocks, level 1 has 24 distinct blocks.

5.3.4 Jacobian matrices

As a ray is stepped through physical space, a corresponding ray must be stepped through computational space. Because computational space is curvilinear, a straight ray in physical space will, in general, correspond to a curved ray in computational space. While gradient descent could be applied with the computational space to physical space transformation to compute corresponding computational space points, our representation of computational space positions permits the easy computation of Jacobian matrices to transform physical space vectors to computational space vectors, which can be used to directly transform steps.

Though we need the inverse Jacobian matrix J⁻¹ to transform a vector from physical space to computational space, that matrix is hard to directly compute given our representation of computational space. However, it is easy to compute J and then invert that 3x3 matrix. Equations 5.4, 5.5, and 5.6 form the i, j, and k columns (respectively) of the Jacobian matrix J, as in equation 5.8. The i and j columns effectively deal with changes within a single slice, whereas the k column deals with changes between multiple slices. Because ŷ is constant in equation 5.6, equation 5.6 can be simplified to 5.7.

J_{pci}(i,j,k) = [\hat{n}(k) \times \hat{y}]\,\frac{\partial \bar{s}(i,j)_u}{\partial i} + \hat{y}\,\frac{\partial \bar{s}(i,j)_v}{\partial i} \quad (5.4)

J_{pcj}(i,j,k) = [\hat{n}(k) \times \hat{y}]\,\frac{\partial \bar{s}(i,j)_u}{\partial j} + \hat{y}\,\frac{\partial \bar{s}(i,j)_v}{\partial j} \quad (5.5)

J_{pck}(i,j,k) = \frac{\partial \bar{p}(k)}{\partial k} + \bar{s}(i,j)_u\,\frac{\partial [\hat{n}(k) \times \hat{y}]}{\partial k} + \bar{s}(i,j)_v\,\frac{\partial \hat{y}}{\partial k} \quad (5.6)

J_{pck}(i,j,k) = \frac{\partial \bar{p}(k)}{\partial k} + \bar{s}(i,j)_u\,\frac{\partial [\hat{n}(k) \times \hat{y}]}{\partial k} \quad (5.7)

J(i,j,k) = \begin{bmatrix} J_{pci} & J_{pcj} & J_{pck} \end{bmatrix} \quad (5.8)

J^{-1}(i,j,k) = J(i,j,k)^{-1} \quad (5.9)

Because the matrix is a full 3x3 matrix, a general determinant-based matrix inversion is used to find J⁻¹. Note that the input data should have well-formed positions, with no two sample points in computational space lying at the same physical space position, to guarantee that this matrix is always invertible.
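A sketch of the Jacobian assembly under the same array-based sampling assumptions as the earlier sketch; finite differences stand in here for the precomputed derivative textures described in section 5.4.4.

```python
import numpy as np

def jacobian(i, j, k, p_bar, n_hat, s_bar, y_hat):
    """Assemble J(i, j, k) column by column (equations 5.4, 5.5, 5.7, 5.8) and invert it
    (equation 5.9). Finite differences approximate the partial derivatives."""
    n_cross_y = np.cross(n_hat, y_hat)                 # (Nk, 3) in-slice axes
    ds_di = np.gradient(s_bar, axis=0)[i, j]           # (d s_u/di, d s_v/di)
    ds_dj = np.gradient(s_bar, axis=1)[i, j]           # (d s_u/dj, d s_v/dj)
    dp_dk = np.gradient(p_bar, axis=0)[k]              # d p / dk
    dncy_dk = np.gradient(n_cross_y, axis=0)[k]        # d [n x y] / dk
    u = s_bar[i, j, 0]
    J_i = n_cross_y[k] * ds_di[0] + y_hat * ds_di[1]   # equation 5.4
    J_j = n_cross_y[k] * ds_dj[0] + y_hat * ds_dj[1]   # equation 5.5
    J_k = dp_dk + u * dncy_dk                          # equation 5.7 (y_hat is constant)
    J = np.column_stack([J_i, J_j, J_k])               # equation 5.8
    return J, np.linalg.inv(J)                         # equation 5.9
```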

5.3.5 AMR integration

Adaptive mesh refinement is supported by defining, for a given level of detail, axis-aligned cuboid regions in computational space that are said to be owned by a lower level. This enables the ray caster to easily determine whether a particular sample cell should be accumulated or not. Because consistency is required in the positional transformations between levels, the p̄(k), n̂(k), and s̄(i, j) functions must be defined in a way such that their domain contains the union of all levels of detail. Because the resolution needs to be uniform for each of the functions, p̄(k) and n̂(k) will need to be specified at the lowest level of detail in the k direction, and s̄(i, j) needs to be defined at the highest available level of detail.
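The ownership test this implies is simple; a sketch assuming each block carries a list of child regions given as (min, max) corner pairs in computational space:

```python
def owned_by_child(pos, child_regions):
    """True if a computational-space position lies inside any child-owned cuboid, in which
    case the sample at this level should not be accumulated."""
    return any(all(lo <= p <= hi for p, lo, hi in zip(pos, rmin, rmax))
               for rmin, rmax in child_regions)

# Example: a single child region covering the middle of a block.
print(owned_by_child((10.0, 10.0, 10.0), [((8, 8, 8), (16, 16, 16))]))   # True
```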

5.4 Rendering

The data is stored as a hierarchy of blocks, as can be seen in figures 5.4 and 5.3. Each block is an axis-aligned cuboid in computational space with a 3D uniform grid of sample points. In physical space, the blocks will tend to be curved as shown in the figures. Associated with each block is a set of child blocks, which define their bounds in computational space. Each one of these blocks can have the currently selected data component of interest stored into a 3D texture, with single-component voxels.

Rendering is performed recursively on a per-block basis. The faces constituting the borders of the block are rendered in computational space via two triangles for each grid-centered boundary cell, using a vertex program to evaluate the computational space to physical space transformation for the vertices, and a fragment program to perform the ray casting. Back face culling eliminates the rendering of faces not facing the viewer on a per-triangle rather than per-block-side basis, because the blocks may have significant curvature in physical space. The physical space and approximate computational space positions are passed to the fragment program, providing starting points for the ray to be evaluated for every fragment. Additionally, the bounds of child blocks are passed to the fragment program as well to permit AMR rendering. The depth buffer is used for tracking evaluated field values for maximum intensity projection.

Figure 5.4: Data set 1 volume block bounding wireframes. Each vertex corresponds to a grid-centered position on the boundary. The left column shows AMR levels 0 and 1, while the right column shows AMR level 1. The wireframes demonstrate the curvature and non-uniform cell sizes of the curvilinear space. Level 0 has 8 distinct blocks, level 1 has 24 distinct blocks.

With this technique, only minimal preprocessing is required. The following are some use cases and the required processing for each:

Initial load: The textures for the p̄(k), n̂(k), s̄(i, j), ∂p̄(k)/∂k, ∂n̂(k)/∂k, ∂s̄(i, j)/∂i, and ∂s̄(i, j)/∂j functions need to be created. If these functions are not already specified as part of the source data, they will require one full iteration through all data points at the lowest level of detail, and one full iteration through a single slice of the data points at the highest level of detail. Additionally, the texture that contains the field data for the currently selected component needs to be created.

Different data component selected: The single-component 3D texture that contains the field data for each block needs to be reloaded with the new component.

Data modified, positions left intact: The single-component 3D texture that contains the field data for each block needs to be reloaded with the new component.

Camera/view change: No re-preprocessing needs to be performed – the scene can simply be re-rendered.

Changing mesh: If the volume data positions are changed in a way that cannot be represented with a simple affine transformation, but the volume data itself is not, the p̄(k), n̂(k), s̄(i, j), ∂p̄(k)/∂k, ∂n̂(k)/∂k, ∂s̄(i, j)/∂i, and ∂s̄(i, j)/∂j functions need to be rebuilt. However, the 3D volume textures storing the field data do not need to be modified if that field data is not modified.

The majority of time during rendering is spent executing the fragment programs that perform ray casting. This creates potential for effective image-space parallelization.

Figure 5.5: Volume renderings for different minimum step lengths. Each row, from left to right, shows step lengths 0.001, 0.005, 0.010, 0.050, and 0.100. The top row shows data set 2 and the bottom row shows data set 1. A larger minimum step length decreases required computational time while increasing error.

5.4.1 Ray Casting

Ray casting is performed for every fragment generated by the border triangle rasterization. Each of those fragments has associated physical space and approximate computational space positions, interpolated between the vertices by the rasterizer, that form the starting point for a ray within a given block. Each fragment program execution performs ray casting for a single ray through a block, as outlined in the following steps (a CPU reference sketch appears after the list).

1. Compute the block-local computational space position.

   \bar{p}_{loccom} = \frac{\bar{p}_{com} - \bar{p}_{blkmin}}{\bar{p}_{blkmax} - \bar{p}_{blkmin}}

2. Compute the global unscaled computational space step using the Jacobian.

   \bar{v}_{comstep} = J^{-1}\,\bar{v}_{phystep}

3. Compute the block-local computational space unscaled step.

   \bar{v}_{loccomstep} = \frac{\bar{v}_{comstep}}{\bar{p}_{blkmax} - \bar{p}_{blkmin}}

4. Compute the scaled computational space and physical space steps (section 5.4.3).

5. Check whether the computational space position of the ray lies within a child volume. If it does not, sample the field texture and accumulate the sampled value using a maximum intensity projection rule. A simple list of volumes that are axis-aligned cuboids in computational space, each defined by a minimum and maximum point, defines the child volumes of a given block.

6. Increment the computational space and physical space position vectors by the scaled steps.

7. Compute the block-local computational space position.

8. Apply the correction loop (section 5.4.2) to the computational space position.

9. Check if that position is within the bounds of the block. If it is not, terminate the ray loop and write the resulting color value and maximum field value to the fragment program color and depth results, respectively.

10. Go to step 2.
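For clarity, a minimal CPU reference of this per-ray loop is sketched below. The helper callables (inverse Jacobian evaluation, position correction, field sampling) and the block record format are assumptions standing in for fragment-program state, and the scale placeholder marks where the step length of section 5.4.3 would be applied.

```python
import numpy as np

def cast_ray(p_phys, p_com, v_phystep, block, jacobian_inv, correct, sample_field,
             max_steps=512):
    """CPU reference of the per-fragment ray loop (steps 2-10 above), accumulating a
    maximum intensity projection through a single block.

    jacobian_inv(p_com) -> 3x3 inverse Jacobian at a computational space position
    correct(p_com, p_phys) -> corrected computational space position (section 5.4.2)
    sample_field(p_loc) -> scalar sample at a block-local computational position
    block: dict with 'cmin', 'cmax' (computational bounds) and 'children' (child cuboids)
    """
    max_value = -np.inf
    for _ in range(max_steps):
        v_comstep = jacobian_inv(p_com) @ v_phystep                        # step 2
        p_loc = (p_com - block["cmin"]) / (block["cmax"] - block["cmin"])  # steps 1 and 3
        scale = 1.0                                                        # step 4: section 5.4.3
        in_child = any(np.all(p_com >= lo) and np.all(p_com <= hi)
                       for lo, hi in block["children"])
        if not in_child:                                                   # step 5: MIP accumulate
            max_value = max(max_value, sample_field(p_loc))
        p_com = p_com + scale * v_comstep                                  # step 6
        p_phys = p_phys + scale * v_phystep
        p_com = correct(p_com, p_phys)                                     # steps 7-8
        if np.any(p_com < block["cmin"]) or np.any(p_com > block["cmax"]):
            break                                                          # step 9: left the block
    return max_value
```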

5.4.2 Correction loop

Because the current computational space position was computed using a linear approximation (the Jacobian) of the step in computational space for the step in physical space, an error is inherent in it. The correction loop corrects this error by iteratively transforming the reverse error vectors in physical space to computational space then accumulating them with the current computational space position. This is very similar to applying gradient descent for minimization of the distance between a given physical space point and a physical space point that corresponds to a varying computational space position, except the gradient is not being computed numerically in computational space, reducing the number of texture fetches required.

1. Transform the current computational space position (p̄_com) into a physical space position (p̄_truephy, the current “true” physical space position), using equation 5.1.

2. Compute the physical space correction vector using the current physical space position and the current “true” physical space position.

   \bar{v}_{phycor} = \bar{p}_{truephy} - \bar{p}_{phy}

3. If the magnitude of v̄_phycor is below a specified tolerance, break from the correction loop.

4. Transform the physical space correction vector into a computational space correction vector using the Jacobian.

   \bar{v}_{comcor} = J^{-1}\,\bar{v}_{phycor}

5. Accumulate the computational space correction vector v̄_comcor with the current computational space position p̄_com. A scaling factor (γ) is applied to the correction vector to improve the rate of convergence.

To improve performance for previewing of volumes, a hard iteration limit can be applied to this correction loop. This will have some implications in accuracy, but for the purposes of previewing may be acceptable.
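A sketch of this loop, assuming callables for equation 5.1 and for the inverse Jacobian; the tolerance, the damping factor standing in for γ, and the hard iteration limit are illustrative values.

```python
import numpy as np

def correct_position(p_com, p_phys, to_phys, jacobian_inv,
                     tol=1e-5, gamma=0.8, max_iters=8):
    """Correction loop of section 5.4.2: nudge a computational-space position until its
    physical-space image (equation 5.1) matches p_phys.

    to_phys(p_com) -> physical-space position via equation 5.1
    jacobian_inv(p_com) -> 3x3 inverse Jacobian at p_com
    """
    for _ in range(max_iters):                      # hard iteration limit (previewing)
        v_phycor = p_phys - to_phys(p_com)          # steps 1-2: reverse error in physical space
        if np.linalg.norm(v_phycor) < tol:          # step 3: converged
            break
        v_comcor = jacobian_inv(p_com) @ v_phycor   # step 4: map the correction
        p_com = p_com + gamma * v_comcor            # step 5: damped accumulation
    return p_com
```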

5.4.3 Step length determination

While a uniform step length can be used successfully, variable step lengths can provide for greater performance. With curvilinear data, cells may vary in size greatly, so the proper step size through the data should also vary. Our technique computes the approximate intersection points in computational space with the borders of a given cell to find the necessary step to the next cell. Performing the intersection in computational space greatly simplifies the operation, because the cells are unit cubes in computational space rather than six-faced curved volumes in physical space.

1. Compute the cell ceiling and floor for that step using the following equations. These are needed to find the intersection with neighboring cells.

   \bar{p}_{comcellceil} = \begin{bmatrix} \lceil N_i\,\bar{p}_{loccom_i} \rceil \\ \lceil N_j\,\bar{p}_{loccom_j} \rceil \\ \lceil N_k\,\bar{p}_{loccom_k} \rceil \end{bmatrix}
   \qquad
   \bar{p}_{comcellfloor} = \begin{bmatrix} \lfloor N_i\,\bar{p}_{loccom_i} \rfloor \\ \lfloor N_j\,\bar{p}_{loccom_j} \rfloor \\ \lfloor N_k\,\bar{p}_{loccom_k} \rfloor \end{bmatrix}

   These two functions, p̄_comcellfloor and p̄_comcellceil, define the bounds of the cell in computational space that contains a given computational space position p̄_loccom.

2. Compute the intersection with the neighboring cells in computational space to find the proper step size, using the computational space step vector, p̄_comcellceil, and p̄_comcellfloor. This is done by computing the intersection of the v̄_comstep vector with each face plane, then using the intersection with the lowest positive intersection parameter. Some special cases must be handled with the intersections. If a ray is traveling parallel to or even within a given face, intersections should not be computed against that face. Also, if a ray intersection with a face would result in an intersection parameter of zero, the ray should not be intersected with that face. Additionally, a scaling factor greater than but close to 1 needs to be applied to the resulting intersection parameter to reduce the likelihood that an intersection point will lie exactly on a face.

3. Apply the minimum step length constraint to the intersection parameter. If the intersection parameter is less than the minimum step length, then it is set to the minimum step length. This is to permit user configurability of quality. Larger minimum step lengths result in larger steps.

4. Using that intersection parameter, scale the computational space and physical space step vectors (v̄_comstep and v̄_phystep) by the resulting intersection parameter. This will result in the ray being stepped out of the current cell into the next cell, as determined by the intersection and minimum step length constraint.

In practice, it was found that the linear approximation to a ray within a given cell implied by computing the intersection using a linear component in computational space did not introduce noticeable error. An extension to this step length determination method could use a maximum step length that is a function of the curvature of the space in that cell to reduce the amount of error. Additionally, because the step length is chosen based on cell boundary intersections, only a test for whether the origin of a given ray step is within a child volume is required to support adaptive mesh refinement.
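A sketch of this cell-boundary step computation in block-local computational space; the face-epsilon nudge and the default minimum step length are illustrative values rather than ones taken from the dissertation.

```python
import numpy as np

def step_parameter(p_loc, v_loc, resolution, min_step=0.005, eps=1e-4):
    """Scale factor that advances the ray from p_loc to the next cell boundary
    (section 5.4.3). p_loc and v_loc are block-local; resolution is (Ni, Nj, Nk)."""
    n = np.asarray(resolution, dtype=float)
    boundaries = (np.floor(n * p_loc) / n, np.ceil(n * p_loc) / n)   # cell floor and ceiling
    t_best = np.inf
    for bound in boundaries:
        with np.errstate(divide="ignore", invalid="ignore"):
            t = (bound - p_loc) / v_loc              # parameter of each face-plane intersection
        t = t[np.isfinite(t) & (t > 0.0)]            # skip parallel faces and zero-parameter hits
        if t.size:
            t_best = min(t_best, t.min())
    if np.isfinite(t_best):
        t_best *= 1.0 + eps                          # nudge past the face to avoid landing on it
    else:
        t_best = min_step
    return max(t_best, min_step)                     # minimum step length constraint
```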

5.4.4 GPU implementation

Block boundaries are rasterized with OpenGL using GLSL vertex and fragment programs. The depth buffer is used for compositing the different blocks with maximum intensity projection. Equations 5.1, 5.8, and 5.9 are implemented within fragment and vertex programs. In total, 6 small textures are used to represent the parameters for defining the transformations between computational and physical space, and the volume samples for each block are stored in a 3D volume texture. Each block has an associated list of child block bounding regions in computational space which is passed to the vertex and fragment programs.

The positional functions p̄(k), n̂(k), and s̄(i, j) can each be defined by a texture. p̄(k) is a one-dimensional texture with resolution N_k and 3 components per texel. n̂(k) is a one-dimensional texture with resolution N_k and 3 components per texel. It is possible to reduce n̂(k) to two components per texel given that n̂(k) is a unit vector, but on current graphics hardware this would not yield any performance improvement because the memory requirements wouldn't be significantly changed, yet additional computation would be required to renormalize the values. s̄(i, j) is a two-dimensional texture with resolution N_i x N_j, with two components per texel.

While the derivatives for the Jacobian matrices can be derived within the fragment program, a significant performance penalty was found to be incurred by the required number of texture fetches and conditionals required for handling boundary cases. Instead, derivative samples are computed on the CPU then stored in textures. The derivatives ∂p̄(k)/∂k and ∂n̂(k)/∂k are each stored in their own one-dimensional, three-component textures. ∂s̄(i, j)/∂i and ∂s̄(i, j)/∂j are combined into a single two-dimensional, four-component texture. These textures can be built directly from the p̄(k), n̂(k), and s̄(i, j) textures.
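A sketch of how these derivative samples might be precomputed on the CPU with central differences; the packing of the two s̄ derivatives into one four-component array mirrors the texture layout described above, but the helper itself is illustrative.

```python
import numpy as np

def build_derivative_tables(p_bar, n_hat, s_bar):
    """Precompute derivative samples for the Jacobian textures: dp/dk and dn/dk as
    (Nk, 3) arrays, plus ds/di and ds/dj packed into one (Ni, Nj, 4) array."""
    dp_dk = np.gradient(p_bar, axis=0)                   # central differences along k
    dn_dk = np.gradient(n_hat, axis=0)                   # d[n x y]/dk then follows as dn_dk x y
    ds_di = np.gradient(s_bar, axis=0)                   # (Ni, Nj, 2)
    ds_dj = np.gradient(s_bar, axis=1)
    ds_packed = np.concatenate([ds_di, ds_dj], axis=-1)  # four components per (i, j) texel
    return dp_dk, dn_dk, ds_packed
```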

Space dimensions   128x128x128
Total samples      819200
Level 0 Blocks     8x(32x32x32)
Level 1 Blocks     8x(32x32x32), 8x(16x32x32), 8x(20x32x32)

Table 5.1: Set 1 blocks

Space dimensions   512x512x512
Total samples      189440
Level 0 Blocks     1x(32x32x32)
Level 1 Blocks     1x(24x24x24)
Level 2 Blocks     1x(24x24x24)
Level 3 Blocks     1x(36x32x32)
Level 4 Blocks     1x(48x40x48)

Table 5.2: Set 2 blocks

                            Set 1     Set 2
Data Component (L) Voxels   906048    205921
Profile (UV) Texels         4225      1089
Curve/Normal (XYZ) Texels   130       66

Table 5.3: Data set memory requirements

5.5 Results

Two data sets were used for testing, both from the context of MHD simulations as discussed in section 5.2. Both data sets are curvilinear, utilize adaptive mesh refinement, and contain cell-centered samples.

Set 1 has two levels with several blocks within each level. Table 5.1 lists the blocks and their dimensions, and figure 5.4 exhibits the block boundaries. This set is a good test case for the usage of several blocks within a single level, as well as shallow AMR.

Set 2 has one block per level, and five levels. Table 5.2 lists the blocks and their dimensions, and figure 5.3 exhibits the block boundaries. This set is a good test case for deep AMR data sets.

Due to the curvilinear AMR representation used as well as the representation for the computational space to physical space transformation, memory requirements are very reasonable. As can be seen from table 5.3, very little overhead is needed to represent the data, thus permitting increased scalability.

Tables 5.5 and 5.4 show volume rendering times for data sets 1 and 2, respectively.

Set 1 requires more rendering time than set 2 due to the increased number of samples and blocks in set 1.

As can be seen in figures 5.6 and 5.7, the rendering time does not vary linearly with the resolution. It was found that the majority of time within the fragment programs, where the ray casting occurs, is being spent on texture fetches. This indicates that at lower resolutions, the caches on the GPU are not as effective at caching the texture data as they are at higher resolutions.

Also, as expected, as the minimum step length increases, the required computational time decreases due to the decreased number of steps. Figure 5.5 exhibits a range of step lengths for both of the test data sets.

The proposed compact representation of position greatly reduces the amount of memory bandwidth required between GPU caches and the external texture memory. Additionally, it reduces the number of texture fetches required to compute Jacobians for each ray step.

MSL     1024x741   768x549   512x357
0.1     0.032s     0.025s    0.020s
0.05    0.049s     0.038s    0.029s
0.01    0.155s     0.112s    0.076s
0.005   0.229s     0.161s    0.106s
0.001   0.396s     0.275s    0.177s

Table 5.4: Set 2 rendering times for different minimum step lengths and viewport resolutions.

MSL     1024x741   768x549   512x357
0.1     0.095s     0.083s    0.079s
0.05    0.151s     0.127s    0.110s
0.01    0.60s      0.47s     0.338s
0.005   0.90s      0.70s     0.528s
0.001   1.85s      1.46s     1.01s

Table 5.5: Set 1 rendering times for different minimum step lengths and viewport resolutions.

Some positional error results from the profile extrusion on AMR data. On the test data sets, this error remained much smaller than the size of a given cell within the data. Figure 5.8 exhibits the positional error in the data sets.

For a 1024x1024x1024 volume that fits the above constraints to be fully defined for transformations between computational space and physical space, only 4096 three-component, 2^20 two-component, and 2^20 four-component texels would be required – a significant improvement over a direct 3D representation, which would require thousands of times more memory.

Figure 5.6: Data set 1 running times

5.6 Conclusion

We have presented a technique for memory-efficient and time-efficient volume rendering of curvilinear adaptive mesh refinement data within extrudable computational spaces. The volume is represented as a planar two-dimensional profile surface that is extruded along a curve. The Jacobian for points within the volume can also be easily computed using partial derivatives of these functions. This provides significant memory savings over using a uniformly sampled volume texture to represent the transformation, in addition to reduced memory bandwidth requirements due to more localized texture lookups. With this technique, curvilinear adaptive mesh refinement data sets defined in extrudable meshes can be more efficiently visualized and manipulated.

Figure 5.7: Data set 2 running times

Figure 5.8: The positional error (the difference between the original mesh position and the mesh point found with equation 5.1) is proportional to the point darkness in these images. From left to right, the images are of set 1 levels 0 to 1, set 1 level 1, set 2 levels 0 to 4, set 2 levels 3 to 4.

Chapter 6: Transformations for Volumetric Range Distribution Queries

Volumetric datasets continue to grow in size, and there is continued demand for interactive analysis on these datasets. Storage capacities and compute capabilities have also increased in workstation environments, but the storage throughputs and core memory sizes available have not increased at a similar rate. This means that an increasing number of analysis applications are becoming limited by the size of the data required by the algorithm, rather than by the computation speed or out-of-core storage device capacities available.

Many analysis applications perform data reduction – reducing a subset of data from a large-scale dataset to a much smaller dataset. For example, in volume rendering, a 3D volume is reduced to a 2D image, where the size of the image is typically considerably smaller than the size of the volume. An ideally scalable algorithm, for large-scale data, would have an asymptotic working set complexity in terms of the image size, rather than in terms of the volume size.

The working set of an algorithm is the set of data elements required for its execution during a time interval [16]. Assuming that all of the data (with N elements) has a contribution to the solution of an analysis application, it is unrealistic to expect that, in general, we can change the asymptotic working set complexity, for the entire time span of the workflow, to be less than O(N). However, it may be possible to change the working set complexity of the interactive analysis portion of the workflow by applying data transformations in the preprocessing phase. This can facilitate scalable interactivity by making the working set complexity of the interactive portion primarily depend on the result size, rather than the size of the input data.

One approach to tackling this challenge is to perform a preprocessing pass that reduces the complexity for traditional analysis methods applied in the interactive phase. This is the single-ended transformation approach. For example, a volume could be downsampled to contain only as many samples as contained by the images being rendered. It could then be directly rendered using traditional volume rendering algorithms. The advantage to this approach is that the interactive portion of the workflow will not require any changes to have reduced working set complexity. However, the major downside is the amount of sampling error that will be introduced for most volumes.

Another approach is to introduce specialized analysis algorithms for the interactive phase of the workflow, in addition to a preprocessing phase. This is the dual-ended transformation approach. In keeping with the volume rendering example, one example of this approach is Fourier Volume Rendering (FVR) [68]. Using FVR, the working set complexity of the direct volume rendering algorithm is reduced¹ from O(N^3) to O(N^2) during the interactive phase, with an O(N^3) preprocessing pass. In this example, several points can be illustrated.

Firstly, the overall computational complexity is greater than the working set complexity in this case (O(N^3 log N^3) vs. O(N^3)). But, for the interactive portion of the workflow, the working set complexity and computational complexity both decrease from O(N^3) to O(N^2) and O(N^3) to O(N^2 log N^2), respectively. Secondly, some flexibility (such as it being reduced to a simple summation with a parallel projection) in volume rendering has been sacrificed for the sake of interactivity, but this may be acceptable for many users. Thirdly, the overall working set complexity remains the same, as can be expected, but the working set complexity of the interactive portion of the workflow has decreased considerably. Finally, this data transformation facilitates only a fairly limited set of data analysis approaches. In summary, some flexibility and preprocessing time has been sacrificed for increased interactivity by adapting the application algorithm as well as introducing preprocessing. This exemplifies the approach we take.

¹ For an NxNxN volume to be rendered into an NxN image.

An increasing number of analysis applications are considering histograms, or other summary statistics, as their input, rather than just the input volume. For example, instead of computing isosurfaces for every cell in the volume, a lower resolution volume characterizing the density of isosurfaces (“fuzzy isosurfaces”) can be computed using histograms [109]. Another example is Histogram Spectra, where the differences between histograms are used to characterize error in level of detail selections (described in chapter 4). These applications depend on distribution range queries. A distribution range query evaluates an estimate of the probability density function (PDF) of the values contained within a rectangular cuboid region of a volume.

The primary goal of this technique is to enable efficient evaluation of distribution range queries on volume data by reducing the working set required. The design is motivated by two observations:

• Approximate queries can be used to reduce metadata storage sizes and reduce the size of the working set required for a set of query requests (discussed in section 6.2.6).

• Similarity between overlapping integral distributions can be utilized to reduce metadata storage sizes and the working set required for a set of query requests (discussed in sections 6.2.4 and 6.2.5).

We propose an approach that enables efficient evaluation of distribution range queries. This is accomplished by generating metadata during the preprocessing phase, then loading it on-demand for queries in the interactive phase. For multiple applications this enables the working set complexity to be primarily a function of the analysis result size, rather than the size of the input data.

This core contribution has three parts. The first, discussed in §6.2.2, is a generalization of integral histograms to the continuous domain and to multivariate volumes, integral distributions. The second, discussed in §6.2.4, is a decomposition of these integral distributions into a hierarchical structure, span distributions, that facilitates effective storage as metadata. The third is a proposal, in §6.3, for how to apply the technique for improved working set complexity in a few different applications with accompanying analyses.

Results are presented to validate two claims:

• Span distributions reduce the size of data sets, enabling reduced working set sizes, which improves performance over directly storing integral distributions. This is shown in table 6.5 and graphs 6.10, 6.11, and 6.9. Discussions specific to the storage of span distributions are in section 6.2.5.

• Approximate span distributions can further improve performance, at the cost of accuracy. This is shown in table 6.6 and graph 6.10. Approximate span distributions are also discussed in section 6.2.6.

Algorithms are provided both for the construction of metadata in the preprocessing phase, and for the servicing of queries using this metadata in the interactive analysis phase. To show the generality of the benefits of the approach, a working set complexity analysis is provided for two applications using this metadata. We believe that this work provides a good foundation on which to build scalable analysis applications.

6.1 Related Work

Range queries have been widely explored in the field of Online Analytical Processing (OLAP) and are beginning to be explored in more detail in the field of Visualization. This section will overview some of the higher level techniques that have been applied in OLAP to answer range queries in general, as well as techniques that focus more on computing distributions of ranges. Additionally, techniques in visualization and graphics that facilitate distribution range queries will also be discussed.

In OLAP, typical range queries are simple, such as a summation or maximum. However, these can be viewed merely as specific types of aggregation operators. Similarly, distribution range queries are also a type of aggregation operator. Thus, while most of these techniques seek to solve range sum queries, some of them can be adapted to distribution range queries. One group of techniques [33] [124] applies wavelet decomposition to the space to approximate sums. Hou et al. [43] proposes cosine transforms as an alternative. These techniques work well for scalar data. The authors also introduce the concept of approximate reconstructions that sacrifice spatial accuracy.

Another group of techniques [22] [34] generate histograms to approximate scalar data for the purposes of range queries. These techniques greatly depend on the bounds they choose for the regions they approximate with histograms. Koudas et al. [56], Karras [51], and Poosala et al. [84] focus more on the aspect of the problem involving the choice of bounding volumes. While these techniques can be motivating applications for the use of histogram range queries, as discussed in §6.3.1, they would be difficult to directly apply to answering distribution queries and may consume a considerable amount of space. Hixels [109] are a simple case, where the volume is broken into blocks over which histograms are computed. This has a disadvantage in that it must have a high resolution (and consequently a large working set) to support range queries with varying spatial positions and scales.

Prefix sum-based techniques in OLAP [27] [3] motivate the approach we have taken to answering distribution range queries. Fundamentally, these techniques compute a prefix sum then perform a series of additions and subtractions between prefix sum values to compute a range sum. Summed-area tables, in computer graphics, apply the same basic concept to texture mapping [14]. Integral histograms [85] extend this summed area table approach to supporting histogram range queries in the context of images. Much of the work in OLAP in this area focuses on facilitating fast updating of the prefix sum data, rather than just fast queries, which introduces a design compromise that we do not necessarily need to make in the visualization context. However, those same works [11] [27] [60] do introduce concepts that support subdivision of a volume into subdomains for the purposes of improving space and time complexity, as well as increasing parallelism.

Integral distributions generalize existing techniques to multivariate volumes to support reductions in working set complexity for the interactive portion of multiple applications. Span distributions leverage spatial coherence in integral distributions to further reduce working set sizes, as well as supporting hierarchical, multiresolution approximate queries. The next section explains the details of both of these techniques.

Figure 6.1: The preprocessing phase transforms the volume data into metadata using the transformation pipeline in equation (6.2). This requires O(N) working set complexity, for a volume with N elements. In the interactive phase, distribution range queries are evaluated by reading parts of the metadata on demand into the transformation pipeline in equation (6.3). The working set complexity for this phase depends primarily on the query result size rather than the size of the input volume.

6.2 Technique

The goal of the technique is to facilitate working set-efficient distribution range queries of volumetric data. Q(\vec{s}_0, \vec{s}_1, \vec{t}) : \mathbb{R}^{2d+m} \rightarrow \mathbb{R} is the probability density for the vector value \vec{t} within the region of the vector field V : \mathbb{R}^d \rightarrow \mathbb{R}^m, with m-component vectors, bounded by the points \vec{s}_0 and \vec{s}_1:

Q(\vec{s}_0, \vec{s}_1, \vec{t}) = \frac{\int_{\vec{s}_0}^{\vec{s}_1} h(V(\vec{s}), \vec{t})\, d\vec{s}}{\int_{\vec{s}_0}^{\vec{s}_1} 1\, d\vec{s}},
\qquad
h(\vec{u}, \vec{t}) = \begin{cases} 0 & \vec{u} \neq \vec{t} \\ 1 & \vec{u} = \vec{t} \end{cases}
\quad (6.1)

where the integrals are volume integrals and d is the dimensionality of the volume. In the context of scientific volume data, the vector field V typically contains the contents of the m dependent variables.

Direct evaluation of equation (6.1) for a discretized volume requires a working set of size O(N), assuming only the number of samples can change. For interactive queries on large-scale data this is impractical. Our technique reduces this working set for each query by transforming V into span distributions, which are a hierarchical representation (defined in section 6.2.4) of integral distributions, in the preprocessing phase. This enables efficient evaluation of an integral distribution field W(\vec{s}, \vec{t}) : \mathbb{R}^{d+m} \rightarrow \mathbb{R}. Then, W is used, instead of V, in an alternative formulation of equation (6.1), to evaluate Q. Because evaluating Q using W instead of V requires far fewer values, the working set size is reduced. The high level process is shown in figure 6.1.

The integral distribution field is a mapping from \mathbb{R}^{d+m} to \mathbb{R}, rather than being defined in terms of a discrete domain. Additionally, the above equations are formulated in terms of general probability density functions rather than probability mass functions (which can be represented by histograms). Discretization and storage strategies must be considered for both.

6.2.1 High level overview

The two phases of the proposed framework, the preprocessing phase and the interactive phase, are shown in figure 6.1. Metadata is generated in the preprocessing phase using a series of transformations:

V(\vec{s}) \xrightarrow{\;I\;} W(\vec{s},\vec{t}) \xrightarrow{\;D\;} X_i(\vec{s}) \xrightarrow{\;S\;} Y_{k,i} \quad (6.2)

where D is a distribution value discretization function, introduced in §6.2.3. S is a spatial discretization function, such as span distributions, introduced in §6.2.3. X is a value-discrete, spatially-continuous representation of W and Y is a value-discrete, spatially-discrete representation of X. The integral distribution function, W, is introduced in section 6.2.2.

This metadata is loaded on-demand to evaluate queries. A sequence of transformations is applied for each query:

\dot{Y}_{k,i} \xrightarrow{\;S^{-1}\;} \dot{X}_i(\vec{s}) \xrightarrow{\;D^{-1}\;} \dot{W}(\vec{s},\vec{t}) \longrightarrow \dot{Q}(\vec{s}_0,\vec{s}_1,\vec{t}) \quad (6.3)

where the dotted functions depend on only a subset of the metadata, rather than the original Q function in equation (6.1). The inverse discretization functions needed for evaluating queries are introduced in sections 6.2.3 and 6.2.3.

6.2.2 Integral Distribution Function

The integral distribution function maps a point in a multivariate volume to the distribution of the volume between that point and the origin of the vector field. This is an extension of integral histograms [85], which themselves are an extension of the use of 2D prefix sums in graphics (summed area tables [14]) and multidimensional prefix sums in OLAP [27] [3].

The integral distribution function I is defined as:

I(\vec{s},\vec{t}) = Q(\vec{0}, \vec{s}, \vec{t}) \quad (6.4)

where Q is from equation (6.1). This can be used to transform a vector field V : \mathbb{R}^d \rightarrow \mathbb{R}^m into an integral distribution field, W:

W(\vec{s},\vec{t}) = I(\vec{s},\vec{t}) \quad (\forall \vec{t} \in \mathbb{R}^m) \wedge (\forall \vec{s} \in U) \quad (6.5)

where U is the set of positions in the domain of V, the input volume.

This is the intermediate representation that our metadata seeks to represent, though without directly storing it. For the sake of clarity, let Ẇ be equivalent to W, but computed using the metadata, rather than by directly evaluating equation (6.5). In other words, any evaluation of W produces the same value as Ẇ, but Ẇ depends only on the metadata, while W depends on evaluating Q.

With the integral distribution field, an alternative to equation (6.1) can be constructed

to produce Q. For a one dimensional volume, where the domain of V is R, this is simply:

Q˙(s0,s1,~t) = W˙ (s1,~t) −W˙ (s0,~t) (6.6)

Because W˙ is known a priori from the metadata, Q can effectively be evaluated simply by

evaluating W˙ at two ~s positions, rather than by evaluating an integral over V. This can

trivially be extended to the 2D case, where the domain of V is R2:

\dot{Q}(\vec{s}_{00}, \vec{s}_{11}, \vec{t}) = \dot{W}(\vec{s}_{00}, \vec{t}) + \dot{W}(\vec{s}_{11}, \vec{t}) - \dot{W}(\vec{s}_{10}, \vec{t}) - \dot{W}(\vec{s}_{01}, \vec{t}) \quad (6.7)

Generalizing to vector fields with Rd domains, this becomes:

\dot{Q}(\vec{s}_{\{0\}^d}, \vec{s}_{\{1\}^d}, \vec{t}) = \sum_{i \in \{0,1\}^d} (-1)^{d - \|i\|_1}\, \dot{W}(\vec{s}_i, \vec{t}) \quad (6.8)

where {S}^d raises the set {S} to the dth Cartesian power. For example, {0,1}^2 is

{{0,0},{0,1},{1,0},{1,1}}. For an evaluation of Q˙ in d spatial dimensions, 2^d lookups

into the W field are required. This formulation of Q˙ is similar to the formulation used for

integral histograms [85], but generalized to a continuous domain.
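To make the corner-lookup formulation concrete, the following C++ sketch evaluates a range query by combining the 2^d corner evaluations of W with alternating signs, as in equation (6.8). It is an illustration only: the callable W stands in for whatever mechanism reconstructs the integral distribution value (for example, a single histogram bin) at a corner position.

// Illustrative sketch: evaluate Q over an axis-aligned box [lo, hi] in d
// dimensions by combining the 2^d corner evaluations of W with alternating
// signs, as in equation (6.8).
#include <functional>
#include <vector>

using Point = std::vector<double>;

double rangeQuery(const std::function<double(const Point&)>& W,
                  const Point& lo, const Point& hi) {
    const int d = static_cast<int>(lo.size());
    double q = 0.0;
    for (int mask = 0; mask < (1 << d); ++mask) {        // one iteration per box corner
        Point corner(d);
        int ones = 0;                                    // ||i||_1 in equation (6.8)
        for (int axis = 0; axis < d; ++axis) {
            const bool useHi = (mask >> axis) & 1;
            corner[axis] = useHi ? hi[axis] : lo[axis];
            if (useHi) ++ones;
        }
        const double sign = ((d - ones) % 2 == 0) ? 1.0 : -1.0;   // (-1)^(d - ||i||_1)
        q += sign * W(corner);
    }
    return q;
}

For d = 1 this reduces to equation (6.6), and for d = 2 to equation (6.7).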

In this work we do not seek to store the integral distribution field directly. Rather, we

discretize it then decompose it into a more appropriate format for large-scale data. This in-

volves two related parts: discretization of the field in terms of the spatial component(~s) and

a discretization of the field in terms of the value component(~t). The next section addresses

both aspects.

Figure 6.2: In this example, a 1D integral distribution volume (Xi(s)) is discretized into 8 span distributions (Yk,i) as described in equation (6.9). The span distribution at index 6, for example, is computed by subtracting Xi(5) from Xi(7).

6.2.3 Discretization

The integral distribution field is an R^{d+m} → R mapping with two parameters: an R^d

vector field spatial position parameter, and an R^m vector field value parameter. Different

discretization strategies can be taken for these two different components. The former is

discussed under the Spatial Discretization heading and the latter under the Distribution Discretization heading, both within section 6.2.3.

Distribution Discretization

The distribution discretization problem seeks to provide two functions: a discretization function and a reconstruction function. The discretization function, D, maps an input m-dimensional probability density function (PDF) to a finite number of real values. The reconstruction function, D^{-1}, enables the reconstruction of a PDF from the values produced by the discretization function.

For example, a simple discretization function could be a mean combined with a vari- ance. The associated reconstruction function would then be a normal distribution. This is not appropriate for many real cases, but it serves as a simple example.

In effect, the goal is to provide compact models of probability density functions that are appropriate for the underlying distributions. Many approaches exist for solving this problem [4]. The following are a few approaches that are appropriate for use with span histograms, and are of relatively low implementation complexity.

For cases where m is small (where there are few variables in the multivariate volume), histograms can be an effective tool for discretization. When m is greater than 1, this dis- cretization takes the form of joint histograms. Due to the curse of dimensionality, joint histograms may not be effective in handling cases where m is large [4]. Gaussian mixture

models or polynomial fits may be acceptable alternatives to histograms in cases where m is large.

In our end goal of evaluating queries in the form of Q in equation (6.1), the~t value may actually take multiple values for a single query position. This is because most applications will be interested in the PDF for more than one value. Thus, any discretization model applied should consider this. Unless only very few values of~t are needed, it will generally make more sense to provide the user application the entire distribution model as the result of Q, rather than a single value at a time, due to the overhead associated with repeatedly evaluating Q at a point.

When histograms are used, choices must be made on what binning strategy may be used. When two histograms are to be added or subtracted, the operation reduces to a simple vector addition or subtraction, when the bin bounds are the same for the operands.

If they do not line up, error will be introduced during the addition because in the cases of partial bin overlap, it is not clear how to distribute the values between overlapping bins.

This implies that the same histogram binning should be used for all histograms stored in the metadata. However, it does not apply any restrictions on what the specific binning strategy used should be, other than that it should be suitable for the entire dataset. This is a well-explored problem [115] [55], but there is still substantial room for work in exploring it in the context of large-scale data.

For the purposes of exploring the applications in this paper, a histogram discretization is used and globally-uniform equal-width binning is assumed. However, for the purposes of the definition of span distributions, no assumptions are made about the discretization other than that they can be added and subtracted.
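As a small illustration of why a shared binning matters, the following C++ sketch adds and subtracts histograms as plain per-bin vectors and maps a value to a bin under a globally-uniform equal-width binning; the bin count and value range are placeholders rather than values prescribed by this work.

// Illustrative sketch: histograms with identical bin edges can be added and
// subtracted bin-wise, which is the property the span distribution transform
// relies on.
#include <cstddef>
#include <vector>

using Histogram = std::vector<double>;

Histogram add(const Histogram& a, const Histogram& b) {
    Histogram r(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) r[k] = a[k] + b[k];
    return r;
}

Histogram subtract(const Histogram& a, const Histogram& b) {
    Histogram r(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) r[k] = a[k] - b[k];
    return r;
}

// Globally-uniform equal-width binning shared by every histogram in the
// metadata: all histograms use the same [minValue, maxValue) range and bin count.
std::size_t binIndex(double v, double minValue, double maxValue, std::size_t bins) {
    double t = (v - minValue) / (maxValue - minValue);
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;
    std::size_t b = static_cast<std::size_t>(t * bins);
    return (b < bins) ? b : bins - 1;                    // clamp the top edge into the last bin
}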

Spatial Discretization

Similarly to the distribution discretization problem, the spatial discretization problem seeks to provide two functions: a discretization function, and a reconstruction function.

The discretization function, S, maps the discretized distribution at every point in the input spatial domain, as produced by the distribution discretization function (D), to a finite num- ber of real values. The reconstruction function, S−1, maps the resulting values from the discretization function back to the input domain.

For example, a simple discretization function would be nearest neighbor sampling onto a uniform grid. The associated reconstruction function for a spatial position would re- turn the value at the nearest sample. Combining this spatial discretization function with a histogram distribution discretization function is the approach taken by integral histograms

[85].

Span distributions are an alternative spatial discretization function, designed to be more working set-efficient by supporting multiresolution approximate reconstruction of the W

field (which is assumed to be defined on a uniform grid) and by taking advantage of spatial coherence between nearby regions.

6.2.4 Span Distributions

The volumes affecting each integral distribution tend to have considerable overlap, as shown in figure 6.2. This implies that their respective distributions will be similar. For example, consider the case in a 3D volume where an integral distribution is defined for the point < 1,1,1 >. If an integral distribution is defined for another point, < 1,1,1.01 >,

99.01% of the contributing volume will overlap, placing an upper bound on the possible difference between the distributions.

Two key observations follow from this. First, a hierarchical spatial discretization can

be used effectively for facilitating approximate, lossy, integral distribution reconstruction

(as discussed in section 6.2.6). Secondly, the information entropy of the difference between

two nearby integral distributions will tend to be considerably smaller than the information

entropy of each of those two integral distributions, individually. Span distributions are

designed to take advantage of both of these observations.

Span distributions are a spatial discretization strategy, mapping Xi(~s) to Yk,i, taking the place of S in equation (6.2). In the case of d = 1, where V has only one spatial dimension, the span distribution discretization function is defined as:

Y_{G(\vec{s}),i} = X_i(s) - X_i\!\left(s - G^{-1}(B_{G(s)})\right), \qquad B_k = 2^{L_k}, \qquad L_k = (\text{least significant nonzero bit index in } k) \quad (6.9)

where G(s) and G^{-1}_k are nearest neighbor mappings from R to Z and from Z to R, respectively. X_i(s) : R → R is value i of the integral distribution at the point s.

The inverse transform (S^{-1} in equation (6.3)) maps Y_{k,i} to X_i(\vec{s}). In the case of d = 1, where V has only one spatial dimension, the span distribution reconstruction function is defined as:

X_i(s) = \begin{cases} Y_{G(s),i} + X_i\!\left(G^{-1}_{G(s) - B_{G(s)}}\right) & : G(s) \neq 0 \\ 0 & : G(s) = 0 \end{cases} \quad (6.10)

Intuitively, the forward transformation discretizes the spatial positions to a uniform grid,

then stores a distribution for each nonzero bit in the discretized grid coordinate index. This

decomposes the input integral distributions into one or more span distributions. The inverse

transform performs the reverse of this, fetching one span distribution for each nonzero bit

in the discretized grid coordinate index. Figure 6.3 shows an example of this being used

for range queries.

Figure 6.3: Distribution range queries are executed by evaluating the integral distribution of each corner of the range using equation (6.10), then combining them using equation (6.8). In this example, the range query is evaluated using 4 span distributions, subtracting the span distributions (Y2,i and Y3,i) that contribute to the Xi(4) integral distribution, and adding the span distributions (Y4,i and Y6,i) that contribute to the Xi(7) integral distribution.

The following table is a concrete example of this being applied to a one-dimensional

input function, V(s) = s, where the domain of V is [0,1]:

s      k  X(s)        Lk   Bk   Yk          X(s) from Yk
0.000  0  (0,0,0,0)   n/a  n/a  (0,0,0,0)   Y0
0.125  1  (1,0,0,0)   0    1    (1,0,0,0)   Y1
0.250  2  (2,0,0,0)   1    2    (2,0,0,0)   Y2
0.375  3  (2,1,0,0)   0    1    (0,1,0,0)   Y3 + Y2
0.500  4  (2,2,0,0)   2    4    (2,2,0,0)   Y4
0.625  5  (2,2,1,0)   0    1    (0,0,1,0)   Y5 + Y4
0.750  6  (2,2,2,0)   1    2    (0,0,2,0)   Y6 + Y4
0.875  7  (2,2,2,1)   0    1    (0,0,0,1)   Y7 + Y6 + Y4
1.000  8  (2,2,2,2)   3    8    (2,2,2,2)   Y8

In this example, the Xi(s) function is the histogram of the values within the range of

0 to s in V(s), with 4 evenly-spaced bins. The G(s) function is defined with 9 uniformly- spaced grid-centered sample points from s = 0 to s = 1.

The s column is the spatial continuous-domain position corresponding to the spatial discrete domain position. The X(s) values are the integral distributions (histograms, in this example) at the positions. The Lk is the level for the span distributions and the Bk is the length of the span in terms of indices. The Yk is the span distribution for index k. The “X(s) from Yk” column shows how X(s), for each row, can be reconstructed from Yk values.
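The following C++ sketch illustrates the 1D forward and inverse transforms of equations (6.9) and (6.10); it is a simplified illustration rather than the implementation used in this work. Applied to the integral distributions in the table above, it reproduces the Y_k column and the listed reconstructions (for example, X(7) = Y_7 + Y_6 + Y_4).

// Illustrative sketch of the 1D span-distribution transform. X[k] holds the
// integral distribution (here: a histogram) at grid index k, with X[0] empty.
// The forward pass produces Y[k] = X[k] - X[k - B_k] with B_k = 2^(L_k) and
// L_k the index of the least significant set bit of k; the inverse pass
// reconstructs X[k] by summing one span distribution per set bit of k.
#include <cstddef>
#include <vector>

using Dist = std::vector<double>;   // any distribution model that can be added and subtracted

static Dist sub(const Dist& a, const Dist& b) {
    Dist r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = a[i] - b[i];
    return r;
}

static unsigned spanLength(unsigned k) { return k & (~k + 1u); }   // B_k: lowest set bit of k

std::vector<Dist> forwardTransform(const std::vector<Dist>& X) {
    std::vector<Dist> Y(X.size());
    Y[0] = X[0];                                         // index 0 carries the empty distribution
    for (unsigned k = 1; k < X.size(); ++k)
        Y[k] = sub(X[k], X[k - spanLength(k)]);
    return Y;
}

Dist reconstruct(const std::vector<Dist>& Y, unsigned k) {
    Dist x(Y[0].size(), 0.0);                            // X(0) = 0 in equation (6.10)
    while (k != 0) {                                     // one fetch per set bit of k
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += Y[k][i];
        k -= spanLength(k);
    }
    return x;
}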

Extending the above equations from one dimension to d dimensions, to enable evaluation of Xi(~s) in equation (6.2), only requires a modification to the G(~s) and G^{-1}_k nearest neighbor mappings. The real-valued spatial positions are mapped to integer positions on a uniform grid. Then, a single integer is produced from these coordinates' integer positions by using the Z-order space-filling curve [74]:

G(\vec{s}) = Z\!\left(\left\lfloor \operatorname{diag}(\vec{N})\,\vec{s} + \tfrac{1}{2} \right\rfloor\right) \quad (6.11)

G^{-1}_k = \operatorname{diag}\!\left(\tfrac{1}{\vec{N}}\right) Z^{-1}(k) \quad (6.12)

Figure 6.4: The Z-order space-filling curve maps a d-dimensional integer coordinate to a 1-dimensional integer coordinate. In this example, a 3D coordinate with 4 bits per component is mapped to a single 1D coordinate with 12 bits.

where Z is the Z-order space-filling curve and diag(~v) produces a diagonal matrix from vec- tor~v. Figure 6.4 shows an example of a Z-order curve encoding of a 3 dimensional integer vector. Use of Z-order space filling curves for hierarchical representations has been applied before [81], due to their favorable storage locality properties and simplicity. However, they have not been used in the context of representing data structures similar in purpose or structure to integral distributions.
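For illustration, the following C++ sketch implements the nearest neighbor Z-order mapping G of equation (6.11) for the 3D case, using standard Morton-code bit interleaving. The grid resolutions and the 21-bit-per-axis limit are assumptions of the sketch, not of the technique.

// Illustrative sketch: map a continuous 3D position s in [0,1]^3 to a single
// Z-order (Morton) index on a grid with N[a]+1 samples along axis a, as in
// equation (6.11). Each axis is limited to 21 bits so the result fits in 64 bits.
#include <array>
#include <cmath>
#include <cstdint>

// Spread the lower 21 bits of v so there are two zero bits between consecutive
// input bits (standard Morton-code bit interleaving constants).
static std::uint64_t spreadBits3(std::uint64_t v) {
    v &= 0x1FFFFF;
    v = (v | (v << 32)) & 0x1F00000000FFFFULL;
    v = (v | (v << 16)) & 0x1F0000FF0000FFULL;
    v = (v | (v << 8))  & 0x100F00F00F00F00FULL;
    v = (v | (v << 4))  & 0x10C30C30C30C30C3ULL;
    v = (v | (v << 2))  & 0x1249249249249249ULL;
    return v;
}

std::uint64_t zOrderIndex(const std::array<double, 3>& s,
                          const std::array<std::uint64_t, 3>& N) {
    std::array<std::uint64_t, 3> g;
    for (int a = 0; a < 3; ++a)                          // nearest-neighbor rounding, as in (6.11)
        g[a] = static_cast<std::uint64_t>(std::floor(s[a] * static_cast<double>(N[a]) + 0.5));
    return spreadBits3(g[0]) | (spreadBits3(g[1]) << 1) | (spreadBits3(g[2]) << 2);
}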

This section has provided a definition of span distributions, in terms of the transforma- tions (discretization and reconstruction) between Yk,i and Xi(~s). The next section discusses

how Yk,i, the span distributions, are stored.

6.2.5 Storage of Span Distributions

The Yk,i field, produced by the transformation discussed in the previous section and shown in equation (6.2) enables lossless reconstruction of integral distributions. A loss- less, compressed encoding of Yk,i is the metadata that is to be stored on disk. Multiple considerations must be made to facilitate efficient storage and use.

Because the goal is to leverage the hierarchical representation of span distributions to

facilitate approximate queries, it makes sense to store the elements for each level contigu-

ously, rather than interleaving the elements from different levels. This is taken advantage

of in section 6.3.1. Additionally, because user applications are likely to be interested in

the PDF evaluated at ranges of values, rather than a single value, values that contribute to

the same PDF discretization should be stored contiguously. Finally, working sets can be

further reduced by applying entropy coding.

To construct the hierarchical storage model, Yk,i is separated into levels. The level to which a span distribution is assigned is Lk, from equation (6.9), which is simply the index of the rightmost nonzero bit in k. In the case of k = 0, the level index is log2 N where N is the number of span distributions. For multidimensional indices, k is the Z-order index of the grid coordinate, for a uniform grid superimposed on the volume. The result of this is that the number of span distributions in each level decreases as the level index increases.

Additionally, because the width of each span distribution is a function of the level number

(as can be seen in equation (6.9)), the volume of the space contributing to a span distribution will tend to increase as the level index increases.

Each level is stored as a set of chunks, with each chunk storing a sequence of entropy coded span distributions. To further improve entropy coding performance, each span dis- tribution within a chunk (other than the first span distribution) is stored differentially with respect to the previous span distribution in the chunk. Each chunk stores an entropy coding model and a set of entropy codes. Because the information entropy of each span distribu- tion can vary per-level, the number of span distributions that are stored per chunk should also vary per-level. This is necessary to maintain a favorable ratio between entropy coding model sizes and entropy code array sizes.
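The following C++ sketch shows the differential step in isolation: the first span distribution of a chunk is kept verbatim and each subsequent one is replaced by its per-bin difference from its predecessor before entropy coding, with decoding being the corresponding prefix sum. The entropy coding itself is omitted from the sketch.

// Illustrative sketch: differential (delta) encoding of the span distributions
// within one chunk, applied before entropy coding. Encoding runs back to front
// so each element is differenced against the original value of its predecessor.
#include <cstddef>
#include <vector>

using Dist = std::vector<double>;

void deltaEncodeChunk(std::vector<Dist>& chunk) {
    for (std::size_t s = chunk.size(); s-- > 1; )
        for (std::size_t k = 0; k < chunk[s].size(); ++k)
            chunk[s][k] -= chunk[s - 1][k];
}

void deltaDecodeChunk(std::vector<Dist>& chunk) {
    for (std::size_t s = 1; s < chunk.size(); ++s)       // prefix sum restores the originals
        for (std::size_t k = 0; k < chunk[s].size(); ++k)
            chunk[s][k] += chunk[s - 1][k];
}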

In our experiments, we found that the size of levels, in terms of total information entropy, varies exponentially with respect to the level number. This can be seen in figure

6.7. Similarly, the number of span distributions per level also varies exponentially. Thus, because the total information entropy of a level is equal to the number of span distributions times the information entropy of each span distribution, the information entropy of the span distributions can also be modeled as an exponential. Additionally, using compressed span distributions can take considerably less space than storing integral distributions directly, as can be seen in figure 6.10 and table 6.5.

Input Volume Cells   Uncompressed Integral Distributions   Span Distributions (no skipped levels)   Span Distributions (6 skipped levels)
0.26x10^6            128MB                                  6MB                                       1MB
1.05x10^6            512MB                                  28MB                                      2MB
3.15x10^6            1536MB                                 94MB                                      7MB
12.6x10^6            6144MB                                 467MB                                     29MB
39.3x10^6            16000MB                                1383MB                                    94MB
66.1x10^6            32256MB                                3061MB                                    161MB
101x10^6             49152MB                                4953MB                                    246MB

Figure 6.5: Because span distributions take advantage of the similarity between neighbor- ing integral distributions for storage, they take considerably less space, even for lossless reconstruction. Additionally, by dropping some of the span distribution levels, the size can be further reduced at the cost of being lossy. In this case the distributions were represented by 64 bin histograms on 3D computational fluid dynamics volume data.

If the level index is ℓ, then the total size for levels can be modeled as:

H_\ell = \alpha_1 e^{\ell} + \alpha_0 \quad (6.13)

Additionally, the number of span distributions in a level can be modeled as:

N_\ell = 2^{\log_2 N^d - \ell} \quad (6.14)

Levels Dropped   Span Distribution Size
none             4953MB
1                2989MB
2                1459MB
3                1033MB
6                246MB

Figure 6.6: By dropping some levels, which results in queries being approximate, the size of the span distributions necessary can be reduced. This can reduce the working set size of an application. In this case the distributions were represented by 64 bin histograms on 3D computational fluid dynamics volume data.

With this, the entropy per span distribution can then be modeled as:

M_\ell = \frac{H_\ell}{N_\ell} = \beta_1 e^{\ell} + \beta_0 \quad (6.15)

Assuming that the size of the entropy coding model for a given chunk is constant and equal to F, and that the ideal ratio between the size of the entropy code sequence and the size of the model is γ, the ideal number of span distributions, \mathcal{L}, for a given level ℓ is:

\mathcal{L} = \frac{\gamma F}{M_\ell} \quad (6.16)

The optimal value for γ depends on the latency to throughput ratio of the storage devices being used. If latency is high, then γ should be high. Similarly, if latency is low, then γ

should be low. At the extreme, if γ is too large, then the cost to perform a fetch of a span

distribution can be excessively large. In practice, for solid-state drives with static Huffman

coding, a γ value of around 30 was found to be reasonable.
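As a concrete illustration of equation (6.16), the following C++ sketch computes a per-level chunk size from an assumed fixed model size F, the modeled per-span-distribution entropy M_ℓ, and a chosen γ. The concrete numbers in the usage comment are illustrative only.

// Illustrative sketch: number of span distributions per chunk for one level,
// following L = gamma * F / M_l from equation (6.16).
#include <algorithm>
#include <cmath>
#include <cstddef>

std::size_t spanDistributionsPerChunk(double modelSizeBytes,      // F
                                      double entropyPerSpanDist,  // M_l, in bytes
                                      double gamma) {             // latency-dependent ratio
    const double L = gamma * modelSizeBytes / entropyPerSpanDist;
    return std::max<std::size_t>(1, static_cast<std::size_t>(std::floor(L)));
}

// Example: with a 4096-byte Huffman model, 64 bytes of entropy per span
// distribution, and gamma = 30, a chunk would hold about 1920 span distributions.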

Because Lk determines the level number, each successively lower level number has

higher spatial precision. If spatial precision can be sacrificed, then span distributions do

[Plot for figure 6.7: bytes per level and bytes per span distribution as a function of level number.]

Figure 6.7: Both the size of the levels and the number of span distributions in the levels decrease exponentially as the level number increases. The ratio between the size of a level and the number of span distributions in it enables modeling of the entropy per span distribution.

not necessarily need to be loaded for all levels, nor do they need to be stored for all lev-

els. In other words, span distributions can be selectively loaded to facilitate approximate

queries.

6.2.6 Approximate Queries with Span Distributions

Approximate queries can be performed simply by not storing (or not using) some of the lower numbered levels. This can be accomplished by modifying equation (6.10) to not include levels whose index is less than a threshold. For example, in figure 6.2, dropping the highest detail level would be equivalent to not storing the row of span distributions for

Lk = 0.
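The following C++ sketch illustrates one way such an approximate reconstruction can be expressed: the terms of equation (6.10) whose level falls below a threshold are skipped, which is equivalent to snapping the query position to a coarser grid point. It is a simplified illustration rather than the storage-aware implementation used in this work.

// Illustrative sketch: approximate reconstruction of X at grid index k that
// ignores span-distribution levels below minLevel.
#include <cstddef>
#include <vector>

using Dist = std::vector<double>;

static unsigned lowestSetBit(unsigned k) { return k & (~k + 1u); }
static unsigned levelOf(unsigned k) {                    // L_k: index of the lowest set bit
    unsigned l = 0;
    while (((k >> l) & 1u) == 0u) ++l;
    return l;
}

Dist reconstructApprox(const std::vector<Dist>& Y, unsigned k, unsigned minLevel) {
    Dist x(Y[0].size(), 0.0);
    while (k != 0) {
        if (levelOf(k) >= minLevel)                      // keep only the coarse levels
            for (std::size_t i = 0; i < x.size(); ++i) x[i] += Y[k][i];
        k -= lowestSetBit(k);                            // move on to the next set bit
    }
    return x;
}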

Each span distribution has a corresponding region of space from which its distribution is drawn. This region is implied by equation (6.9). The corresponding volume of a span distribution is the intersection of the two contributing integral distributions in the equation

(6.9) subtracted from the union of the same two contributing integral distributions.

[Plot for figure 6.8: mean error bounds per bin and relative size as a function of the number of dropped levels.]

Figure 6.8: The relationship between the error bound and the stored size for varying numbers of levels skipped.

For one dimensional volumes of size N, the mean volume of a corresponding region

for a span distribution within a level ℓ is N^{-1} 2^{\ell}. For d-dimensional N^d volumes, this

generalizes to:

2^{(1 - d + \ell d)}\,(N + 1)^{(d - 1)}\,N^{-d} \quad (6.17)

Intuitively, this means that as the level number is increased, the mean volume increases

exponentially. This implies that the potential error that can be introduced by dropping

a low numbered level will tend to be considerably lower than the potential error that can be

introduced by dropping a high numbered level.

In addition to this, as discussed in the previous section, the total size of the span distributions stored for a level decreases exponentially as the level number increases. Combining these two observations, we can see that low levels of the span distribution data contribute the least amount of potential error, yet cost the most amount of space on disk. This enhances the effectiveness of a technique to reduce query cost and metadata size by dropping low numbered levels.

6.2.7 Comparing Span Distributions

Span distributions provide a spatial discretization that enables approximate queries, and generalize support for distribution range queries to arbitrary distributions, rather than just histograms.

[Plot for figure 6.9: time per query (seconds) versus running time (seconds) for span distributions with 0, 3, and 6 dropped levels and for direct evaluation, 2016MiB source data.]

Figure 6.9: Out-of-core data, query time, randomly positioned and sized queries, 2016MiB source data. The majority of the time spent in this test was I/O. Reducing the working set reduces the demands on storage devices, improving performance.

In supporting approximate queries, span distributions introduce a hierarchical compo-

nent to the metadata structure, as discussed in the previous section. If the working set is

considered for the time interval of a single query (rather than the time interval for an entire

workflow), this will introduce an O(logN) factor for spatial discretizations with N samples.

However, if the number of levels stored is held fixed, regardless of the data size, this re-

duces the working set to O(1). In either case, the method is considerably faster than O(N)

methods. Figure 6.9 shows a typical result on a dataset several times larger than the core

memory available.

The following table summarizes different aspects of some alternative methods for evaluating distribution range queries:

                              Span distrib.      Integ. hist. [85]   Hixels [109]
Working set                   O(log N), O(1)     O(1)                O(N)
Spatial discretization        hierarchical       uniform             uniform
Distribution discretization   general            histograms          histograms
Compression                   entropy coding     raw                 raw

The working set in this table refers to the working set required for a single query of random location and size in a volume, where N is the number of discrete elements stored and the number of dimensions is assumed to be constant. While the asymptotic complexity of the working set for Integral Histograms [85] is less than that of Span Distributions for a single query, the size of the stored metadata is considerably larger. Figure 6.10 exhibits this difference in metadata sizes. Larger metadata sizes will affect cache performance, which can be observed in figure 6.11.

The above considers the working set only in terms of the time intervals associated with individual queries. The next section discusses working sets in the context of the time intervals covering entire application workflows.

6.3 Working Sets in Applications

In the context of visualization workflows, working sets are having an increasing impact.

Working set complexity for a time interval, by definition, also places a lower bound on the compute time during that workflow. In fact, in many cases, when the working set is out- of-core, the time due to storage operations will be much larger than the time spent on computation.

[Plot for figure 6.10: output size (bytes) versus input size (bytes) for uncompressed integral distributions and for span distributions with no dropped levels and with 6 dropped levels.]

Figure 6.10: Storing the integral distributions directly, sampled on a uniform grid, can take considerably more space than storing compressed span distributions. Span distributions also permit the dropping of levels, which reduces the data size, at the cost of accuracy

We concentrate on the case where the size of the entire dataset is considerably larger than the in-core memory limit of the system, but the working set fits in-core. However, we assume that the working set cannot be adequately predicted a priori. Thus, cache warming cannot be used to pre-load the data outside of the interactive portion of the workflow. In this situation, the application will tend to be throughput-bound in the interactive portion of the workflow. Because of this, the working set over the entire time interval of the interactive portion of the workflow can be used to identify the performance characteristics of the workflow.

We take three steps in looking at performing working set analysis for a given applica- tion. First, the application algorithm is characterized. Next, the application query patterns are identified. Then, knowing the query patterns, the working set is analyzed in the context of the workflow.

The following sections apply these steps for a couple different applications. The first application (§6.3.1) exemplifies a class of applications where distribution range queries

can be used for error-bounded data reduction. The second application (§6.3.2) exemplifies a class of applications where distribution range queries can be used for interactive data summarization. The two classes have different working set characteristics.

[Plot for figure 6.11: time per query (seconds) versus running time (seconds) for span distributions with 0, 3, and 6 dropped levels and for direct evaluation, 64MiB source data.]

Figure 6.11: Out-of-core data, query time transient response, randomly positioned and sized queries, 64MiB source data. The majority of the time spent in this test was I/O for the top two lines of the graph. For the bottom two lines I/O has a substantial impact at the left end of the graph, but this effect is quickly reduced as the file cache warms. Using span distributions reduces the working set size required over performing direct queries. Reducing the number of levels used for span distributions reduces the working set as well. Reducing the working set reduces the demands on storage devices and reduces file cache miss rates, improving performance.

6.3.1 Application: Hovmöller diagrams

Hovmöller diagrams are used in meteorology to highlight wave phenomena [44]. They are 2D plots where one axis typically shows longitude or latitude and the other axis shows time. Each point in the plot shows the aggregation, or sum, over the remaining axes. Effectively, these diagrams are sum aggregation queries, aggregating sequences of samples along

one axis into sums on a per-element basis in the other axes. Similar aggregation queries

have also been performed in higher dimensional visualization contexts [110]. We propose

a method using histograms to estimate these diagrams subject to interactively-chosen error

constraints.

Sum aggregation queries of this form can be evaluated using histograms. Suppose we pick a bounding box within the volume, across which we want to evaluate sums down one axis. In 3D this will produce a 2D image. To estimate the range of each of these sums we can take the histogram of the bounding box region. If the axis along which we want to sum values has n entries within the bounding box, then the sum of the top n events in the

histogram is the upper bound of the sum for each entry on the other two axes. Similarly, the

sum of the bottom n events in the histogram is the lower bound of the sum for each entry on the other axes. To compute the complete image, quadtree subdivision can be performed subject to a constraint on error.
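The following C++ sketch illustrates this bounding step for a single query region. The histogram layout (per-bin counts with lower and upper bin edges) is an assumption of the sketch; using bin edges rather than bin centers keeps the bounds conservative.

// Illustrative sketch: bound the per-pixel sum along the aggregation axis from
// a region histogram. counts[b] is the number of samples in bin b for the query
// region, loEdge/hiEdge are the bin value bounds, and n is the number of samples
// summed per output pixel.
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

std::pair<double, double> sumBounds(const std::vector<double>& counts,
                                    const std::vector<double>& loEdge,
                                    const std::vector<double>& hiEdge,
                                    double n) {
    double upper = 0.0;
    double remaining = n;                                // upper bound: take the n largest events
    for (int b = static_cast<int>(counts.size()) - 1; b >= 0 && remaining > 0.0; --b) {
        const double take = std::min(counts[b], remaining);
        upper += take * hiEdge[b];
        remaining -= take;
    }

    double lower = 0.0;
    remaining = n;                                       // lower bound: take the n smallest events
    for (std::size_t b = 0; b < counts.size() && remaining > 0.0; ++b) {
        const double take = std::min(counts[b], remaining);
        lower += take * loEdge[b];
        remaining -= take;
    }
    return {lower, upper};
}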

The algorithm applied to produce this effectively performs a breadth-first search within the quadtree of the 2D projection, looking for squares of the image space whose sums can be approximated with a single distribution range query. Because the quadtree subdivision is in image space, the maximum number of distribution range queries that can be executed is O(M log M), where M is the number of image space pixels. Note that the number of samples in the volume, N, is not present.

Span distributions and integral histograms can both yield O(M) performance in this case. This is substantially better than the O(NM) performance that will result from the

use of uniform hixels or directly computing the histograms off the volume data. However,

there is a substantial difference between the actual access patterns of span distributions

and integral histograms. Because span distributions are hierarchical, and this algorithm

152 is a breadth-first search, the accesses to span distributions will be concentrated into more localized regions of the metadata, potentially yielding improved utilization of read-ahead.

Figure 6.11 shows this behavior, where decreasing the working set size can increase the performance due to reduced file cache misses.

6.3.2 Application: Transfer function design

Performing transfer function design on time-varying data has long been a challenge, with the dynamic range of values of interest being unclear for the entire time series if they are not known a priori [123]. For static datasets, histograms have been used in the context of interactive workflows to generate transfer functions [88] [67] [13]. Using span distributions enables the use of interactively computed transfer functions using regions covering the entire time domain.

Figure 6.13 exhibits an application where this is applied to the 4D NCAR dataset², the result of a fluid dynamics simulation. In the left pane of the application window, an aggregation of the 4D volume onto a 2D plane, computed with the same method discussed in §6.3.1, is shown. The user can drag and resize the query region box in the left pane to select a 4D range of interest for the transfer function. The cumulative distribution function of the histogram is used as a lookup table to warp the color portion of the transfer function such that contrast is maximized for values that have a high frequency of occurrence in the region of interest selected. The opacity of the transfer function is determined by the values of the histogram bins. Figure 6.13 shows different regions of interest selected for the transfer function with the same timestep.

²This is the same dataset used by Akiba et al. [1].

Queries in the left pane are performed, in real-time, on the entire time series. The

volume rendered in the right pane is for a single time step. This enables the user to inter-

actively select a transfer function that is generated in a way consistent with the entire time

series, without needing to have the entire dataset resident in memory.

Additionally, with fast interactive queries, techniques that depend on cursors for his-

togram determination can be supported on large data. In chapter 7, a technique is explored

that uses the interactive manipulation of cursors on slice planes to incrementally construct

transfer functions, which requires fast distribution range queries. For multidimensional

transfer function construction, the technique introduced by Kniss et al. [54] could be ex-

tended to use distribution range queries of subvolumes, in addition to the entire volume.

In this application, the algorithm itself performs single range queries for histograms in a 4D volume, then uses the histograms to compute the transfer function. Every time the user moves the cursor for a region of interest, a new query is performed. If L unique queries are performed, then using integral histograms will require a working set of O(L).

Span distributions, however, can take advantage of other properties of this workflow.

Generally, query regions will overlap and be performed over subvolumes of the input

4D volume large enough such that some levels of input span histograms can be dropped without a large impact on error. Because of this, approximate span distributions can be used effectively and the number of levels used can be chosen to be appropriate for the granularity of the queries that the user wishes to perform. This results in the span distribution method also requiring O(L), but with a considerably smaller actual working set size due to the approximate queries.

6.4 Extensions and Conclusion

Working set reduction, in the context of interactive workflows, will continue to be of great interest for many applications in visual analysis. In this work we proposed a general framework within which this problem can be approached. A transformation is performed in the preprocessing phase to facilitate distribution range queries, then the application is adapted to utilize approximation algorithms that can make use of these range queries. The general pipeline is broken into two transformations and their inverses: a distribution dis- cretization, and a spatial discretization. Both of these transformations are used to facilitate an effective representation of integral distributions, a continuous multivariate generaliza- tion of integral histograms.

Building on this framework, we focus on a spatial discretization strategy: span distri- butions. Span distributions facilitate efficient, potentially-approximate, distribution range queries through a hierarchical decomposition of integral distributions. We show that this approach can be used to construct scalable algorithms for analysis applications – algorithms whose time complexity varies in terms of the analysis result size, not the input data size.

Future work could include extending this approach to more applications. For example, the same range queries applied for transfer function design and Hovmöller diagrams could also be used to enable new volume rendering algorithms whose working sets depend primarily on the target image resolution rather than the volume size. Other applications could include fuzzy isosurfaces, classification, and feature detection. With the proposed framework, a single pass of preprocessing can produce metadata that enables algorithms with scalable working set characteristics. For a range of applications, the working set complexity can be changed to depend primarily on the result size, rather than the input data size.

(a) Tolerance of ±10

(b) Tolerance of ±5

Figure 6.12: Approximate sum aggregation of 3D volumes for Hovmöller diagrams as discussed in §6.3.1. The horizontal axis is longitude and the vertical axis is time. The tolerance provides a bound on how far the approximate sums may be from the true sums, in terms of the value of the sum. The dataset is from a simulation produced by the Pacific Northwest National Laboratory to examine the Madden-Julian Oscillation [37].

Figure 6.13: Interactive transfer function design for large-scale time-varying volume data, using interactive 4D distribution range queries, as discussed in §6.3.2. The user moves a region of interest in the left pane on a projection of the volume. The distribution of the region of interest is then used to generate transfer functions in the right pane, using the technique discussed in chapter 7

Chapter 7: Interactive Transfer Function Design on Large Multiresolution Volumes

Direct volume rendering (DVR) is widely used in the visualization of volume data. Key to the creation of high quality visualizations using DVR is the construction of effective transfer functions [67]. Effective transfer functions emphasize salient information while deemphasizing unimportant information. Interactive, semi-automatic, transfer function de- sign seeks to leverage users’ domain-specific knowledge [88] to progressively develop per- interval volume salience.

Interactive transfer function design techniques rely on iterative refinement, by users considering visual feedback, to guide a transfer function generation algorithm. In the case of DVR using optical models that consider opacity [71], modifications to the transfer func- tion require re-rendering of the volume, placing DVR into the interactive portion of the workflow. Level of detail techniques are commonly applied to enable interactive DVR of large-scale data, seeking to take advantage of the typically nonuniform salience of volumes.

The use of interactive transfer function design with salience-dependent level of detail selection creates a cyclic dependency. Interactive transfer function design techniques seek to enable discovery of interval volume salience, but also depend on interactive volume

rendering. At the same time, interactive volume rendering on large-scale data using workstations depends on level of detail selection, which, to be effective, depends on knowledge of the salience of different parts of the volume. Figure 7.1a depicts this cycle.

The core contribution of this work is a technique that reduces the impact of this cyclic dependency by enabling interactive, incremental construction of target histograms that can simultaneously be used to support transfer function construction and level of detail selection. Target histograms are used to drive the construction of both the transfer functions and the level of detail selection. This is accomplished by using histogram expressions to combine multiple local histograms into a target histogram. The target histogram is then used to generate a transfer function. Using Histogram Spectra (chapter 4), the target histogram is then also used to compute the optimal levels of detail for the multiresolution input volume. This enables interactive transfer function design in the context of DVR on data considerably larger than the available system memory.

7.1 Related Work

Due to its importance in direct volume rendering, transfer function design has been a widely explored topic. Pfister et al. [82] provide a comparison of a selection of trial and error, data-driven, and image-driven techniques. Kindlmann [52] extends this discussion to include feature detection based techniques. Fundamentally, it is unlikely that one type of technique will be appropriate for all applications. This paper concentrates on an interactive, data-driven approach intended to leverage the domain-specific knowledge of users.

Most similar to our transfer function design technique are selection-based techniques, where the user selects regions of interest and the technique generates transfer functions based on the selections. Wu et al. [125] describe an image-space technique to define which

regions are important and which regions are not. They then apply a genetic algorithm to generate transfer functions that expose details in the important regions. This technique bears some similarity to our technique in that it enables salient and non-salient regions to be identified by the user, though ours operates in the data space and avoids a highly iterative technique like genetic programming in the interest of interactivity. Ropinski et al. [88] propose a technique in which users use mouse strokes to identify which regions belong to which material, making use of the ray histograms of those regions. This technique is similar to ours in that it allows for salience to be interactively specified by the user, though it does not provide a similar scheme for providing logical combinations of different regions and it relies on having some amount of pre-segmentation of the data. Similarly to our technique,

Huang et al. [45] use slice planes as a tool to help provide a context in which users can interactively select regions to guide the construction of transfer functions.

Level of detail selection has also been a long-explored problem, with many solutions attempting to maximize the amount of salient data visible (or minimize error) for the ap- plication of interest. Guthe et al. [35] and Wang et al. [118] both propose techniques that optimize LOD selections, in the context of in-core data, using screen space error metrics.

Both of these techniques consider the final image, including visibility, thus they both also take into account the salience implied by the transfer function. Ljung et al. [64] and chapter

4 both propose methods that perform LOD selection by utilizing precomputed metadata to minimize error with respect to a target distribution. The former work concentrates more on potential compression aspects of the problem, while the latter concentrates more on the optimization aspects and extensions to multivariate data. Our work focuses on a less- explored aspect: interactive transfer function design in the context of a workflow using level-of-detail selection on large-scale out-of-core data.

(a) Work flow (b) Data flow

Figure 7.1: Level of detail selection and transfer function design both depend on interval salience.

7.2 Technique

The fundamental goal of our technique is to facilitate the identification of interval salience on large-scale data. Interval salience defines how important a given interval vol- ume is. With interval salience, transfer functions can be constructed and levels of detail can be automatically selected. Figure 7.1a illustrates the basic workflow.

In our system, the user interactively changes four controls to construct a transfer func- tion: a set of cursors, a set of slice planes, an expression for combining the distributions from each cursor, and the camera. The level of detail selection and transfer function are both updated on the fly using the first three controls. The view is re-rendered for changes to any of the controls.

The data flow for the system is shown in figure 7.1b. The user interface provides cursors to the cursor sampler. The cursor sampler then, using the current LOD selection, evaluates

the histogram of the region within each cursor. These cursor histograms are subsequently

combined using histogram expressions to compute a target histogram. The target histogram is then used to generate both an LOD selection and a transfer function. The LOD selection and transfer function are then used by the renderer to generate an image that is then passed to the user interface. The entire process can be interactive, even on a workstation with considerably less memory than the size of the dataset, as exhibited in section 7.3.

The result of this data flow is that level of detail selection and transfer function design

are both directly driven by the target histogram, which is incrementally constructed by the

user using the cursors and expressions. By incrementally constructing a transfer function,

the user is also incrementally constructing a level of detail selection that produces good

quality for a given working set size constraint. This increases usability of the transfer func-

tion design algorithm on large-scale data. Additionally, incremental construction can help

users maintain a mental mapping between color and value during the interaction process.

7.2.1 Cursor Histograms

A cursor histogram is the histogram of the set of sample values for all points within a

cursor. We chose the cursors to be circular discs within slice planes due to their simplicity,

but other shapes (such as a 3D sphere or box) or a sketching interface could be used. Cur-

sors are moved and resized by clicking on the slice planes. Slice plane rotation, translation,

and visibility can be changed with other UI elements.

The method used for generating cursor histograms is important, because a poor sam-

pling pattern with a large number of bins and large gradients may yield aliased histograms,

producing ineffective transfer functions. Two potential approaches for sampling these cur-

sor regions are direct histogram computation assuming a trilinear interpolation function,

and adaptive point sampling. We found an adaptive point sampling algorithm to be ef-

fective. Sampling on a uniform grid is used within each block, with the resolution of the

sampling grid being proportional to the resolution of the block. The resolution of each

block is adaptively chosen by the level of detail selection algorithm to minimize error sub-

ject to a global size constraint.
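As an illustration of the sampling step, the following C++ sketch accumulates a cursor histogram by point-sampling a circular disc cursor lying in a slice plane. The fixed per-cursor grid resolution and the sampleVolume callable are simplifications introduced for the sketch; in the actual system the sampling resolution follows the per-block level of detail.

// Illustrative sketch: sample a disc-shaped cursor on a slice plane into a
// histogram. axisU and axisV are unit vectors spanning the slice plane.
#include <array>
#include <cstddef>
#include <functional>
#include <vector>

using Vec3 = std::array<double, 3>;

static Vec3 offset(const Vec3& o, double a, const Vec3& u, double b, const Vec3& v) {
    return { o[0] + a * u[0] + b * v[0],
             o[1] + a * u[1] + b * v[1],
             o[2] + a * u[2] + b * v[2] };
}

void accumulateCursorHistogram(const std::function<double(const Vec3&)>& sampleVolume,
                               const Vec3& center, const Vec3& axisU, const Vec3& axisV,
                               double radius, int gridRes,
                               double minValue, double maxValue,
                               std::vector<double>& histogram) {
    for (int j = 0; j < gridRes; ++j) {
        for (int i = 0; i < gridRes; ++i) {
            // Map the (i, j) grid cell to offsets in [-radius, radius]^2.
            const double du = (2.0 * (i + 0.5) / gridRes - 1.0) * radius;
            const double dv = (2.0 * (j + 0.5) / gridRes - 1.0) * radius;
            if (du * du + dv * dv > radius * radius) continue;     // outside the disc
            const double value = sampleVolume(offset(center, du, axisU, dv, axisV));
            double t = (value - minValue) / (maxValue - minValue);
            if (t < 0.0) t = 0.0;
            if (t > 1.0) t = 1.0;
            const std::size_t bin = static_cast<std::size_t>(t * (histogram.size() - 1));
            histogram[bin] += 1.0;
        }
    }
}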

7.2.2 Histogram Expressions

Histogram expressions combine multiple cursor histograms into a single target histogram. A target histogram defines the importance of different value ranges. Value ranges with high probability in the target histogram are deemed salient. Conversely, value ranges with low probability in the target histogram are deemed unimportant. In a histogram expression, three operators are defined: disjunction, conjunction, and negation.

Operator      Histogram Expression   Result histogram bin k
Conjunction   A ∧ B                  min(A_k, B_k)
Disjunction   A ∨ B                  max(A_k, B_k)
Negation      ¬A                     max_∀i(A_i) − A_k

The disjunction operator is useful for combining two cursor histograms such that both

of their histograms appear in the target histogram. The conjunction operator is used to

combine two cursor histograms to find the bins that share high values in both histograms,

implying a common importance. The negation operator is useful for expressing that the

frequent values within a cursor histogram are unimportant, but the infrequent ones are

important. An example of this is shown in figure 7.2e.

These operators can also be composed to form expressions, enabling the generation of target histograms using a combination of several cursors. For example, D =(A ∧ B) ∨ (B ∧

C) will combine three cursors into a single histogram, D. Bin values in D will be high only when they are high in B and A, or B and C. This kind of expression could be used to

select two thin shells of values around a boundary region to explore adjacent values to the

boundary.
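A minimal C++ sketch of these bin-wise operators is given below; histogram lengths are assumed equal, and composed expressions such as D = (A ∧ B) ∨ (B ∧ C) follow by nesting the calls.

// Illustrative sketch of the histogram expression operators from the table in
// this section, applied bin-wise to cursor histograms of equal length.
#include <algorithm>
#include <cstddef>
#include <vector>

using Histogram = std::vector<double>;

Histogram conjunction(const Histogram& a, const Histogram& b) {   // A AND B
    Histogram r(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) r[k] = std::min(a[k], b[k]);
    return r;
}

Histogram disjunction(const Histogram& a, const Histogram& b) {   // A OR B
    Histogram r(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) r[k] = std::max(a[k], b[k]);
    return r;
}

Histogram negation(const Histogram& a) {                          // NOT A
    const double m = *std::max_element(a.begin(), a.end());
    Histogram r(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) r[k] = m - a[k];
    return r;
}

// Example composition:  D = disjunction(conjunction(A, B), conjunction(B, C));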

7.2.3 Level of Detail Selection

Because the volume data is too large to fit in-core, level of detail selection is of critical importance. Our technique assumes the data is bricked into blocks of grid-centered cells, with each block having multiple levels of detail stored. Given a target histogram, we want to compute the level of detail that minimizes the error (maximizing the salient information available) for a given size constraint.

The technique described in chapter 4 is used to compute an LOD selection. This tech- nique stores metadata called Histogram Spectra. This metadata consists of a matrix, stored for each block, that contains the per-bin difference between the histogram of a block at a given level of detail and the ground truth histogram of a block computed at the maximum level of detail. For a given target histogram and a given LOD, this enables an estimate of the amount of salient information that has been lost as a result of downsampling. The metadata is generated during preprocessing, with the ground truth data only being needed for comparison during the preprocessing process. During the computation of level of detail selections in the interactive portion of the workflow, only this compact metadata needs to be accessed to estimate error, rather than needing to access the original volume. Using the data from this estimation algorithm, a greedy optimization algorithm is applied to com- pute the minimal error LOD selection subject to a user-defined working set size constraint.

This enables fast, salience-aware, LOD selection for large out-of-core volumes even when interval volume salience changes within the interactive portion of the workflow.
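The following C++ sketch outlines one plausible greedy selection loop of this kind. The specific error estimate (a target-histogram-weighted sum over a block's Histogram Spectra row) and the refine-the-best-gain-per-byte strategy are assumptions made for illustration; chapter 4 defines the actual metadata and optimization used in this work.

// Illustrative sketch: greedily refine the block whose LOD upgrade gives the
// largest estimated error reduction per additional byte, until the user-defined
// working set budget is reached.
#include <cstddef>
#include <vector>

struct Block {
    std::vector<std::vector<double>> spectra;   // spectra[lod][bin]: per-bin error vs. ground truth
    std::vector<std::size_t> sizeBytes;         // storage cost per LOD, coarsest first
};

static double blockError(const Block& b, std::size_t lod, const std::vector<double>& target) {
    double e = 0.0;
    for (std::size_t k = 0; k < target.size(); ++k) e += target[k] * b.spectra[lod][k];
    return e;
}

std::vector<std::size_t> selectLODs(const std::vector<Block>& blocks,
                                    const std::vector<double>& targetHist,
                                    std::size_t budgetBytes) {
    std::vector<std::size_t> lod(blocks.size(), 0);      // start every block at the coarsest LOD
    std::size_t used = 0;
    for (const Block& b : blocks) used += b.sizeBytes[0];

    bool improved = true;
    while (improved) {
        improved = false;
        double bestGainPerByte = 0.0;
        std::size_t bestBlock = 0;
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            const std::size_t l = lod[i];
            if (l + 1 >= blocks[i].sizeBytes.size()) continue;     // already at the finest LOD
            const std::size_t extra = blocks[i].sizeBytes[l + 1] - blocks[i].sizeBytes[l];
            if (used + extra > budgetBytes) continue;              // would exceed the budget
            const double gain = blockError(blocks[i], l, targetHist)
                              - blockError(blocks[i], l + 1, targetHist);
            const double gainPerByte = gain / static_cast<double>(extra == 0 ? 1 : extra);
            if (gainPerByte > bestGainPerByte) {
                bestGainPerByte = gainPerByte;
                bestBlock = i;
                improved = true;
            }
        }
        if (improved) {
            used += blocks[bestBlock].sizeBytes[lod[bestBlock] + 1]
                  - blocks[bestBlock].sizeBytes[lod[bestBlock]];
            ++lod[bestBlock];
        }
    }
    return lod;
}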

The transient response of LOD selection algorithms is important in interactive work-

flows. Two aspects of this are data flow cycle convergence and working set stability. Data

flow cycle convergence refers to the tendency for the LOD selection to converge to a single

solution given a set of parameters, despite cycles in the data flow. Working set stability

refers to the magnitude of the change in the resulting working set for a given change in the

target histogram.

In figure 7.1b, it can be seen that there are two cycles involving the level of detail selector, one completely automated, and one involving the user. In the automated cycle, the cursor sampler, histogram expression evaluator, and LOD selector are involved. When the LOD selection changes, it can affect the histograms sampled by the cursor sampler, which will change the target distribution, which may change the optimal LOD selection.

In practice it was observed that the system converges within one or two iterations. This is reasonable because the potential change introduced in the target histogram by a change in

LOD tends to be relatively small compared to the overall set of events contributing to the target histogram. In the cycle involving the user, all elements of the system are involved.

However, the same property that allows the automated cycle to converge also allows the cycle involving the user to converge. In both cases, working set stability helps to contribute to fast convergence.

Working set stability affects both the cycle convergence and the number of transfers re- quired from out-of-core to in-core for a given change in the target histogram. Consider the case that a small change is made to the target histogram. Because this is a small change, it will likely have a small influence on per-block salience, as computed with Histogram

Spectra, used in the LOD selection algorithm. This reduces the chance that a large number

of blocks will have different optimal LODs chosen. This stability is enabled by the incre-

mental transfer function construction aspect of this technique, where small changes tend to

be made to the target distribution, relative to the size of the entire target distribution.

7.2.4 Transfer Function Construction

The transfer function is constructed by using a user-provided color ramp, opacity fac- tor, and the target histogram. The general goal of the construction algorithm is to gener- ate transfer functions that emphasize values with high frequencies in the target histogram, while deemphasizing values with low frequencies in the target histogram.

The opacity component of the transfer function, Ta(u), is constructed such that it is

linearly proportional to the target histogram, H(u), for every value, u. A user-provided

opacity factor is used as the coefficient of proportionality to adjust the compromise between

clarity and occlusion. This approach is taken so that values with high frequency in the target

histogram will be more strongly visible than those with low frequency.

The color component of the transfer function, Trgb(u), is constructed such that its con-

trast is linearly proportional to the target histogram, H(u), for every value, u. This is

accomplished by warping a user-provided color ramp. The contrast, C(u), for a point, u,

in the transfer function is defined as the color difference between a color in the transfer

function at u − h and at u + h, where h is a step size.

When using a linearly interpolated texture as a transfer function, using the width of one

texel was found to be an effective step size. While many color difference metrics could

be applied, we found that using the L2 norm of the difference between the colors in the

CIE 1976 color space to be effective. We found this metric to be more effective in giving

visually-intuitive results than using the L2 norm of the difference between the colors in the

sRGB color space.
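The following C++ sketch illustrates this construction: opacity proportional to the target histogram, and a user-provided color ramp warped through the histogram's normalized cumulative distribution so that color contrast concentrates on frequently occurring values. The color-difference details (CIE 1976 versus sRGB) are omitted, and the ramp is treated as an opaque callable; this is an illustration rather than the exact implementation used here.

// Illustrative sketch: build a transfer function whose opacity is proportional
// to the target histogram H(u) and whose color ramp is warped by the normalized
// CDF of H, so that contrast follows the histogram.
#include <array>
#include <cstddef>
#include <functional>
#include <vector>

using RGB = std::array<float, 3>;
struct TFEntry { RGB rgb; float a; };

std::vector<TFEntry> buildTransferFunction(const std::vector<double>& targetHist,
                                           const std::function<RGB(double)>& colorRamp,
                                           double opacityFactor) {
    const std::size_t n = targetHist.size();
    std::vector<TFEntry> tf(n);

    // Normalized cumulative distribution of the target histogram.
    std::vector<double> cdf(n);
    double total = 0.0;
    for (double h : targetHist) total += h;
    double run = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
        run += targetHist[k];
        cdf[k] = (total > 0.0) ? run / total : (k + 1.0) / n;
    }

    for (std::size_t k = 0; k < n; ++k) {
        double a = opacityFactor * targetHist[k];        // T_a(u) proportional to H(u)
        tf[k].a = static_cast<float>(a > 1.0 ? 1.0 : a); // clamp to a valid opacity
        tf[k].rgb = colorRamp(cdf[k]);                   // warp the ramp through the CDF
    }
    return tf;
}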

7.2.5 Interaction

An example interaction sequence of this technique on the Flame volume is shown in

figure 7.2. Each cursor is assigned a letter which is subsequently used in the expressions used to compute the target histograms. Figure 7.2a shows the initial cursor the user is presented with, which provides a general overview of the dataset. In the next step (figure

7.2b), the user moves the cursor on the slice plane to a region that looks interesting, shrink- ing the cursor to focus on that region. That region is then grown by adding another cursor and applying the conjunction operator between it and the previous cursor. This exposes the common areas of interest between the two, allowing for greater opacity to be applied in the result as seen in figure 7.2c. Removing the slice planes, it can be seen in figure 7.2d that there is still some clutter in the background. This can be removed with another cursor and expression as shown in figure 7.2e, yielding the result in figure 7.2f. The entire process is interactive and allows for incremental exploration.

7.3 Results

The technique was implemented using C++ and CUDA, with CUDA being used for the volume rendering and cursor sampling. The test platform was a Linux workstation with 4 GiB of main memory, an NVIDIA GPU, and a 128 GB OCZ Vertex 2 SSD. The test implementation maintained the loaded volume data within GPU memory, while the standard Linux VM was allowed to manage the file cache. Between 100MiB and 800MiB of GPU memory was used for working set space.

Two multiresolution CFD datasets were used: the 14GiB Flame dataset shown in figure

7.2 and the 62GiB Nek dataset used to produce figures 7.3 and 7.4. Both of them were in-

teractively manipulated on our test platform. Timings were analyzed to identify scalability

as well as transient response.

Scalability was analyzed by performing an automated sequence of actions for different dataset sizes and different working set size constraints. The action sequence was similar to that used in figure 7.2, involving the movement of histogram cursors and the editing of histogram expressions. Figure 7.3 shows the results. The per-frame times depend primarily on the user-defined working set size, rather than the volume size. This is because the amount of data that can be possibly loaded for a frame, the number of samples that need to be taken for cursor histograms, and the number of samples that need to be taken for rendering all depend on the working set size rather than the volume size. The ratio of the working set size to the volume size affects the quality of the results, but does not strongly affect the frame times. Due to this relationship, it is possible for the user to apply the technique to very large data, even with fairly limited working set sizes.

Figure 7.4 shows the typical amounts of time needed to process the various aspects needed for a given frame, as a function of the time since the program has started, using the same automated test procedure used to generate figure 7.3. The system file caches were

flushed immediately prior to execution of the run, thus no volume data was resident at the start of execution and some modest initial loading is necessary. Importantly, it can be seen that the loading times required during interactions with the system are reasonable. This further reinforces the case that the technique is well conditioned – a small change in the input target histogram will tend to yield a small change in the resulting LOD selection.

This facilitates interactive, incremental construction of interval value salience for transfer

function design and level of detail selection.

7.4 Conclusion and Extensions

Histogram expressions combined with an interactive slice plane interface enable in-

cremental, interactive construction of target histograms describing salience. These target

histograms are then used to directly construct transfer functions and level of detail selec-

tions simultaneously, enabling interactive transfer function design on large-scale data.

The technique could easily be extended to support joint distributions between two vari- ables, such as gradient magnitude and value, to enable more complex transfer function design techniques. Additionally, preservation of the mental mapping between color and value could be considered in the transfer function construction algorithm. Finally, view- dependent transfer function construction and LOD selection could also be useful exten- sions.

(a) Initial overview (b) A: an initial region of interest

(c) A ∧ B: adding a region of interest (d) A ∧ B: no slices, more opacity

(e) A ∧ B ∧ ¬C: reducing clutter (f) A ∧ B ∧ ¬C: result

Figure 7.2: An example of the technique being applied to the Flame test volume, discussed in section 7.2.5

[Plot for figure 7.3: top quintile of frame time (s) versus data size (GiB) for 100MiB to 800MiB working set sizes.]

Figure 7.3: The performance as a function of volume size and working set size is largely a function of the working set size, rather than the volume size, facilitating scalability for large-scale data.

[Plot for figure 7.4: time per frame (s) for loading, cursor histogram sampling, LOD selection, and rendering versus time elapsed since program start (s).]

Figure 7.4: An example of the per-frame performance, as a function of running time, for a test run using the 62GiB Nek dataset with a 600MiB working set limit. In this case cursors are being moved around and expressions edited, yielding incremental updates to the target histogram.

Chapter 8: Extensions and Conclusion

Salience discovery techniques seek to facilitate the discovery of salience of different in-

terval volumes. Salience-aware techniques leverage the tendency for different parts of vol-

ume data to be of differing importance. In this dissertation, four techniques were proposed

in each of these categories. Two additional techniques were presented that can build on

these salience-aware and salience discovery techniques. All of these techniques can make

use of precomputed metadata to move compute-intensive components of the application

from the interactive portion of the workflow to the preprocessing phase of the workflow,

increasing scalability.

The salience discovery techniques presented in chapters 6 and 7 facilitate salience-aware techniques by enabling user identification of salient portions of data. In “Transformations for Volumetric Range Distribution Queries” (chapter 6), a framework for data transformations that enable fast queries for the distributions of subvolume ranges was presented. In “Interactive Transfer Function Design on Large Multiresolution Volumes” (chapter 7), a procedure was shown that enables level of detail selections and transfer functions to be incrementally and interactively constructed, making use of the salience-aware level of detail selection technique proposed in chapter 4. “Efficient Rendering of Extrudable Curvilinear Volumes” (chapter 5) can utilize this technique, enabling the construction of both transfer functions and appropriate level of detail selections for adaptive mesh refinement rendering.
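To make the idea behind chapter 6 concrete, the following sketch follows the classic integral (summed) histogram construction rather than the transformations actually proposed there: per-brick histograms are prefix-summed along each spatial axis so that the value distribution of any axis-aligned brick range is recovered with a constant number of table lookups, independent of the range's size.

import numpy as np

def integral_histograms(brick_hists):
    # brick_hists has shape (Bz, By, Bx, bins): one value histogram per brick.
    s = brick_hists.astype(np.int64)
    for axis in range(3):                 # prefix-sum along each spatial axis
        s = np.cumsum(s, axis=axis)
    # Pad with a zero border so inclusion-exclusion needs no bounds checks.
    return np.pad(s, ((1, 0), (1, 0), (1, 0), (0, 0)))

def range_histogram(S, lo, hi):
    # Histogram of all bricks with indices lo (inclusive) to hi (exclusive).
    (z0, y0, x0), (z1, y1, x1) = lo, hi
    return (S[z1, y1, x1] - S[z0, y1, x1] - S[z1, y0, x1] - S[z1, y1, x0]
            + S[z0, y0, x1] + S[z0, y1, x0] + S[z1, y0, x0] - S[z0, y0, x0])

# Usage: an 8x8x8 brick grid with 64-bin histograms.
bricks = np.random.default_rng(2).poisson(3.0, size=(8, 8, 8, 64))
S = integral_histograms(bricks)
h = range_histogram(S, (1, 1, 1), (5, 6, 7))
assert np.array_equal(h, bricks[1:5, 1:6, 1:7].sum(axis=(0, 1, 2)))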

The salience-aware techniques presented in chapters 2 and 4 leverage the results of salience discovery techniques to enable more efficient data analysis. In “Load-Balanced Isosurfacing on Multi-GPU Clusters” (chapter 2), metadata was generated in a preprocessing phase to enable fast interactive load balancing for isosurfacing on multi-GPU clusters in the context of interactively changing salience. “Stereo Frame Decomposition for Error-Constrained Remote Visualization” (chapter 3) can be used to enable remote interaction with this cluster-hosted load-balanced isosurfacing solution. In “Histogram Spectra for Multivariate Time-Varying Volume LOD Selection” (chapter 4), metadata was generated in a preprocessing phase to enable interactive level of detail selection in the context of interactively changing salience.
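As an illustration of how precomputed per-block metadata can drive interactive load balancing, the sketch below applies a simple greedy longest-processing-time assignment (not the algorithm of chapter 2): per-block cost estimates for the current isovalue and salience are distributed across GPUs so that no device accumulates a disproportionate share of the work.

import heapq

def balance(block_costs, num_gpus):
    # block_costs: dict block_id -> estimated cost (e.g., from precomputed
    # per-block metadata such as expected triangle counts). Returns gpu -> [block_id].
    heap = [(0.0, gpu) for gpu in range(num_gpus)]   # (accumulated cost, gpu)
    heapq.heapify(heap)
    assignment = {gpu: [] for gpu in range(num_gpus)}
    for block, cost in sorted(block_costs.items(), key=lambda kv: -kv[1]):
        load, gpu = heapq.heappop(heap)              # least-loaded GPU so far
        assignment[gpu].append(block)
        heapq.heappush(heap, (load + cost, gpu))
    return assignment

# Example with hypothetical per-block cost estimates.
costs = {"b0": 9.0, "b1": 4.0, "b2": 7.5, "b3": 1.0, "b4": 3.5}
print(balance(costs, num_gpus=2))

Because the costs are read from metadata rather than recomputed from the raw data, the assignment can be refreshed each time the salience changes without touching the full volume.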

Two possible categories of extensions are support for new applications and extensions that focus on workflow optimization. The above techniques primarily focus on supporting data volumes with one to four dimensions. Further extensions could increase the scalability of these techniques to support higher-dimensional volumes, enabling their use in more applications.

Workflow optimization extensions could automatically generate metadata considering usage patterns. Additionally, for datasets originating from simulations on large compute cluster resources, metadata could be generated in-place on the clusters to reduce data movement and allow for analysis to begin before simulations end. Finally, extensions could be considered to support workflows outside the field of scientific visualization, such as in finance or business analytics.
