Parallel and Efficient Boolean on Polygonal Solids
Total Page:16
File Type:pdf, Size:1020Kb
Vis Comput (2011) 27: 507–517 DOI 10.1007/s00371-011-0571-1 ORIGINAL ARTICLE Parallel and efficient Boolean on polygonal solids Hanli Zhao · Charlie C.L. Wang · Yong Chen · Xiaogang Jin Published online: 22 April 2011 © Springer-Verlag 2011 Abstract We present a novel framework which can effi- 1 Introduction ciently evaluate approximate Boolean set operations for B- rep models by highly parallel algorithms. This is achieved Boolean operations, such as union (∪), subtraction (−), by taking axis-aligned surfels of Layered Depth Images and intersection (∩), are useful for combining simple mod- (LDI) as a bridge and performing Boolean operations on els to create complex solid objects on a Constructive Solid the structured points. As compared with prior surfel-based Geometry (CSG) tree, which has a variety of applications approaches, this paper has much improvement. Firstly, we in computer-aided design and manufacturing (CAD/CAM), adopt key-data pairs to store LDI more compactly. Secondly, virtual reality, and computer graphics (e.g., [4, 17, 18]). robust depth peeling is investigated to overcome the bottle- Polygonal meshes are the most popular boundary repre- neck of layer-complexity. Thirdly, an out-of-core tiling tech- sentation (B-rep) for 3D models. Although many commer- cial CAD systems, such as Rhinoceros [14] and ACIS [13], nique is presented to overcome the limitation of memory. are able to compute Boolean on polygons, they have diffi- Real-time feedback is provided by streaming the proposed culty in modeling very complex models (e.g., the scaffold pipeline on the many-core graphics hardware. of Bone as shown in Fig. 1). Polygons provide the simplest way to approximate freeform solids and the computation of Keywords Boolean operations · Layered depth images · Boolean is not necessary to be exact in many areas, such Depth peeling · Out-of-core · CUDA as biomedical engineering, jewelry industry, and game in- dustry. This gives us an opportunity to overcome the robust- ness problem [11] and to speed up the computation by using an approximate method. Recent researches have shown that H. Zhao College of Physics & Electronic Information Engineering, densely oriented point sets are competent to fulfill such a Wenzhou University, Wenzhou 325035, China task. Several algorithms [1, 23, 32] have been investigated e-mail: [email protected] to perform Boolean on oriented points but they do not op- erate on B-rep models directly. Some methods [3, 31] can C.C.L. Wang () Department of Mechanical and Automation Engineering, process B-rep models using well-structured points in Lay- The Chinese University of Hong Kong, Hong Kong, China ered Depth Images (LDI). However, both the efficiency and e-mail: [email protected] the memory-usage still have much room for improvement. In this paper, we aim to develop a framework that fully Y. Chen Epstein Department of Industrial and Systems Engineering, exploits the advantages of LDI [27] and the power of University of Southern California, Los Angeles, CA, USA Graphics Processing Unit (GPU) for evaluating approxi- e-mail: [email protected] mate Boolean operations on polygonal meshes. Our algo- rithm is efficient, parallel, and scalable compared with prior X. Jin approaches. These good properties are achieved by intro- State Key Lab of CAD & CG, Zhejiang University, Hangzhou 310058, China ducing the following work. First of all, we adopt a compact e-mail: [email protected] representation for LDI which contains axis-aligned surfels 508 H. Zhao et al. Fig. 1 A high-quality osseous scaffold is modeled with our system: surfel-represented Boolean result, where surfels with different colors (left) input models (Bone, offset of Bone, and cellular-scaffold), (mid- are from different input solids, and (right) the cross-section view of dle) the layout of solids where 3 × 3 × 3 of cellular-scaffold models are the output osseous tissue. The approximation resolution is 512 × 512 tiled together, resulting in a CSG tree with a depth of 28, (mid-bottom) in this example sampled from the boundary of input solids. By utilizing the The solution to these problems is the major contribution of atomic operation in graphics global memory, we develop a our approach. programmable rasterizer to capture all layers of LDI in a sin- The rest of our paper is organized as follows. After briefly gle pass. Moreover, the problem of limited sampling resolu- reviewing some related previous work in the next section, tion is avoided by splitting the whole working envelope into Sect. 3 introduces an overview of our new Boolean evalua- multiple smaller tiles and processing them independently us- tion framework. Section 4 presents the parallelization tech- ing the out-of-core strategy. Note that the computation in the niques on the GPU in detail. Experimental results and re- whole pipeline is highly parallel, thus allowing for a GPU- lated discussions are given in Sect. 5. Finally, Sect. 6 con- based implementation. To the best of our knowledge, this cludes the advantage of our approach and suggests some fu- is the first algorithm that is capable of computing Boolean ture work. combinations of freeform polygonal shapes in real-time (see the examples shown in Fig. 11). Even for a very complex CSG tree such as the one shown in Fig. 1, we are able to ob- 2 Related work tain the result of the high-quality osseous tissue in 1.7 s on an NVIDIA Geforce GTX 480 GPU while consuming only Boolean operations of solids have been investigated in com- 600 MB of the graphics memory. puter graphics for many years. For a comprehensive survey, In summary, this paper addresses the following important we refer readers to [26]. Here, we only review the Boolean questions in geometric modeling: algorithms based on surfels [24] and oriented points. – How to support real-time approximate geometric feed- Many approaches [6–9, 25] employ the image-based back for a CSG tree with polygonal models? depth-layering or depth-peeling technique to achieve the in- – How to support highly accurate approximation that may teractive rendering of CSG models. They gradually peel can- be beyond the memory capacity of graphics hardware? didate points from the boundaries of CSG primitives and Parallel and efficient Boolean on polygonal solids 509 only display the nearest boundary points of the Boolean re- Algorithm 1 Streaming Boolean operations sult. The points can be fast sampled using the graphics hard- 1: Input Boolean tree accompanied with polygonal meshes ware [5] and encoded as pixels in LDI. However, these ap- 2: for each volumetric tile in bounding volume do proaches only solve the fast rendering problem but do not 3: for each axis in {x, y, z} do provide the ability to obtain the geometric shape of resultant 4: for each node n by depth-first tree traversal do solids. 5: if n is a leaf then Surfel is a representation that can approximate the shape 6: Sample axis-aligned surfels by rasterizing n of a solid model on some oriented sample points. Boolean 7: Shell sort per-pixel surfels operations on freeform solids represented by surfels have 8: else been developed in [1, 23]. Since most CAD/CAM applica- 9: Merge sort candidate surfels from children tions, such as CAE analysis, CNC machining and rapid pro- 10: Boolean on the sorted candidate surfels totyping, need the B-rep of solid models, additional surface 11: end if reconstruction from the surfel-represented solid is required. 12: end for Zhou et al. [32] introduced a parallel surface reconstruc- 13: Obtain boundary surfels from the root node tion algorithm by extending the marching cubes algorithm 14: Accumulate the QEF matrices for boundary cells [20] with a GPU-based octree. They also presented an inter- 15: Mark interior/exterior cells along the axis active Boolean tool for oriented point set. However, their 16: end for adaptive marching cubes algorithm tends to smooth the 17: Output vertices by solving the QEF matrices sharp features in the underlying geometry. Moreover, their 18: Output indices by connecting boundary cells in-core reconstruction can only handle octrees with a lim- 19: end for ited number of depths. 20: return merge all tiles of polygons into a closed mani- Chen and Wang [3] presented an approximate Boolean fold approach that can preserve sharp features in the resultant solids. They made use of a variant of LDI, named as Lay- ered Depth-Normal Images (LDNI), for the efficient volu- each leaf node, we convert the B-rep model into a solid metric test [29], and employed the quadratic error function bounded by surfels using axis-aligned ray-casting. For each (QEF) in dual contouring [15] for surface generation. How- inner node, we first collect all candidate surfels from its ever, only the sampling of LDNI is performed using GPU- children nodes and sort the merged samples according to based depth peeling whereas other algorithms are performed their depth values. Next, valid boundary surfels are extracted on the CPU. Although multi-threaded schemes are designed, by performing Boolean on the candidate surfels in an effi- the parallelism is limited. Wang et al. [31] further transferred cient way. After traversing the CSG tree, the resultant surfel- the LDNI-based algorithm to the GPU to improve the ef- represented solid is obtained at the root node. ficiency. Unfortunately, as the workload on each thread is Efficient surface reconstruction is conducted to output B- not well balanced, the primary algorithm still cannot pro- rep models as the result of Boolean. We develop a modified vide real-time feedback. In addition, their algorithm can- version of dual contouring in this paper. The optimal posi- not process models which contain more than 256 LDI lay- tion of each boundary cell is determined by the QEF matrix, ers because the stencil buffer used in their sampling method which can be accumulated by all the boundary surfels in the only has 8 bits in the current graphics hardware architecture.