Scalable Point Cloud Geometry Coding with Binary Tree Embedded Quadtree

Birendra Kathariya1, Li Li1, Zhu Li1, Jose Alvarez2, Jianle Chen2 ∗
1University of Missouri-Kansas City, 2Futurewei Technologies Inc.
[email protected], flil1,[email protected], fjose.roberto.alvarez,[email protected]

∗We are grateful for the generous support of a gift grant from FutureWei Technology.
978-1-5386-1737-3/18/$31.00 ©2018 IEEE

ABSTRACT

Many applications of point clouds have recently been identified in automobile navigation systems, visual communication, and so on. However, the huge data size of point clouds has been a bottleneck for practical implementations. In this paper, we present a compression scheme that supports variable-rate coding of the same point cloud data at different quality levels. The point cloud is encoded once, at a fixed rate, for its highest representation. The encoder can then present variable-rate encoded data, for any representation from lowest to highest, to the decoder, which decodes it to reconstruct the point cloud at the corresponding quality. Variable-rate encoding is achieved through the so-called binary tree quadtree (BTQT) scheme. BTQT makes the compression more effective by dividing the point cloud frame into blocks using a binary tree, then encoding flat surfaces in the blocks with a quadtree and non-flat surfaces with an octree. Simulation results show that the scalable coding solution can efficiently compress point cloud data at variable rates by trading rate against quality.

Index Terms: Binary-tree, Scalable coding, Octree, Quadtree, Point cloud.

1. INTRODUCTION

Point cloud technology has recently received much attention as its application potential has been discovered in a wide variety of fields, including immersive 3D teleconferencing, geospatial inspection, 3D modeling and printing, architectural design, and auto-navigation systems. The emergence of miniaturized and inexpensive 3D scanners has made it popular at the consumer level.

3D sensing technology has matured considerably. The data acquired from these high-speed 3D sensors consume a huge amount of storage space and transmission bandwidth. With an efficient compression scheme, point cloud technology can be made more accessible at the consumer level and thus help to explore new frontiers of its possible applications.

The size of a point cloud depends on the resolution or voxel size, the number of frames captured per second, and the surface area of the object or scene scanned. Its size also increases if the 3D data carry one or more attributes such as color, normal, reflectance, or transparency in addition to geometry. To compress a point cloud, every aspect that contributes to its size should be explored. Most works focus on compressing geometry and attributes separately. They use an octree to generate the occupancy bitstream and represent the geometry. The octree is used mostly to compress static point clouds or to intra-code frames in dynamic point clouds. For example, [1] and [2] use octree space decomposition and predictive coding to estimate the child cell configurations utilizing local surface estimation. Other works that use octrees for static point cloud compression can be found in [3] and [4].

Dynamic point cloud compression schemes use structural differences between two frames. For example, [5] encodes the structural difference between the octree structures of two point cloud frames. [6] constructs fixed-size blocks in the voxelized point cloud and computes the motion associated with these blocks in the next frame. More works related to dynamic point cloud compression can be found in [7] and [8]. The color attributes of a point cloud, on the other hand, are compressed separately. Some state-of-the-art works such as [9] and [10] use an orthogonal graph transform to decorrelate the color signal and arithmetic coding to encode the coefficients. In addition, one recent work [11] adopts the region-adaptive hierarchical transform (RAHT) to compress the color attribute. Works such as [12], [13], and [14] use mesh compression techniques, converting the point cloud into a mesh and reducing the number of vertices and edges using surface approximation. These methods, however, involve a computationally very expensive conversion of the point cloud to a mesh.

In this work, we present a scalable octree and quadtree coding scheme that can represent the geometry of the same point cloud at different quality levels, achieving different compression levels. The advantage of the octree/quadtree scheme is that once a point cloud is encoded to its highest representation level, the decoder can request any representation from the same encoded data. Here, the highest representation means the point cloud with a voxel size of one. This avoids encoding the same point cloud multiple times for different representation levels and saves storage space. After encoding, we compress the encoded bitstream with the PAQ compressor. Encoding a point cloud with a large number of points using an octree/quadtree is, however, problematic, because the tree has to go deep to reach the highest representation level, where encoding becomes slow and the number of occupancy bits (octree representation bits) grows exponentially. We therefore break the encoding problem into subproblems by dividing the point cloud into smaller chunks using a binary tree and encoding each chunk separately. The method discussed in this paper is intra-coding and thus can be used to encode static as well as dynamic point clouds.

The rest of the paper is organized as follows. In Section 2, a detailed description of the encoder is presented. This section is divided into four subsections, each dedicated to a particular step in the encoding process. Section 3 describes the PAQ compression scheme. Section 4 explains the decoding process. Section 5 discusses the results obtained. Section 6 concludes the paper.

2. EMBEDDED BTQT FOR GEOMETRY CODING

The encoder encodes both geometry and color of a point cloud frame by creating the highest representation possible. From the highest representation, the decoder can decode the point cloud with most of its voxels at size one. Thus, the reconstructed point cloud has almost the same number of points as the original, and the reconstruction is almost lossless. Although the decoder can decode colors at different lower resolutions, in this paper we are concerned with geometry compression and thus exclude the encoding and compression process for colors. BTQT works with a two-step approach. The outer layer of coding is a lossy geometry approximation from the Binary Tree (BT), which is further refined by a Quadtree (QT) if the points lie on approximately 2D planes, and by an Octree if they cannot be approximated by 2D planes.

2.1. Binary-tree Block Construction

The encoding process starts with block formation in the point cloud frame, as shown in Fig. 1. Here, we use a binary tree to create blocks in the form of its leaf nodes. The binary tree recursively divides the data by splitting an axis into two halves at its median. This binary splitting is easier to visualize in a plane, as shown in Fig. 2. If a point cloud has $N$ points, then a depth-$L$ binary tree divides $2^L - 1$ times recursively, and at the final level we have $2^L$ sub-blocks with almost $N/2^L$ points in each block. The blocks at the final level are known as binary-tree leaf nodes. A point cloud is 3D data, i.e., x, y, z, and division should take place in all three dimensions in turn. However, the order of division gives precedence to the dimension with the larger variance.

Fig. 1. Block formation using binary-tree in the 1st frame of 4 datasets from 8i

Fig. 2. Binary-tree depiction in 2D plane

The binary tree is a data partition scheme, and its leaf nodes are adaptive to the surface of the data. The leaf nodes are balanced, with almost the same number of points (differing by at most 1), and none of them are empty.

Binary-tree blocks contain the local surfaces of the point cloud. Any local surface that exhibits flat characteristics is projected onto a 2D plane and encoded with a quadtree. Surfaces that are not flat enough are encoded with an octree. While encoding, bit 0 is signaled to indicate a leaf node encoded with a quadtree and bit 1 to indicate a leaf node encoded with an octree. This information is sent to the decoder. The bit cost $R_1$ for signaling this information is

$$R_1 = 1 \times 2^{L} \tag{1}$$

To reconstruct the local surface, the decoder also requires the bounding-box information of the block within which the surface is enclosed. The two diagonal points that represent the bounding box can be predicted from the binary-tree structure given the bounding box of the point cloud frame, 2 bits for signaling the dimension of the cut ('01' - x-axis, '10' - y-axis, '11' - z-axis, '00' - reserved), and b bits for the point of cut at each level of division. The total bits spent on the bounding-box information of a binary tree of depth $L$ can be estimated as

$$R_2 = (2 + b) \times (2^{L} - 1) + (6 \times b) \tag{2}$$

where $b$ is the bit resolution of the voxelized point cloud. If a point cloud is voxelized to $n$ bits, then $b = n$.

2.2. Plane Projection with Planar Mode Decision

A local surface in a block with flat characteristics has almost the same value in one of its three dimensions. By projecting the surface onto the appropriate tangent plane, we can eliminate one whole dimension. By encoding this approximated 2D surface, we can save a large number of bits compared to encoding the same surface with an octree in
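The block construction of Section 2.1 and the signaling costs of Eqs. (1) and (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_blocks`, `mode_bits`, and `bbox_bits` are hypothetical, the median is found by sorting, ties between equal variances go to the first axis, and Eq. (2) is read as $(2 + b)(2^L - 1) + 6b$, i.e., one cut per division, which is an interpretation of the text rather than something the paper states explicitly.

```python
import random
from statistics import pvariance


def split_blocks(points, depth):
    """Recursively split (x, y, z) points into binary-tree leaf blocks.

    Each division picks the axis with the largest variance and cuts the
    points at the median along that axis (assumed tie-breaking: first axis).
    """
    if depth == 0 or len(points) < 2:
        return [points]
    # Axis with the largest population variance is divided first.
    axis = max(range(3), key=lambda a: pvariance([p[a] for p in points]))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # median cut: halves differ by at most one point
    return (split_blocks(ordered[:mid], depth - 1)
            + split_blocks(ordered[mid:], depth - 1))


def mode_bits(depth):
    """Eq. (1): one quadtree/octree flag bit per leaf node."""
    return 1 * 2 ** depth


def bbox_bits(depth, b):
    """Eq. (2), read as (2 + b) bits per division plus 6 * b bits."""
    return (2 + b) * (2 ** depth - 1) + 6 * b


# Illustrative data only: 1000 random points voxelized to 10 bits.
rng = random.Random(0)
cloud = [(rng.randrange(1024), rng.randrange(1024), rng.randrange(1024))
         for _ in range(1000)]
blocks = split_blocks(cloud, depth=3)
print([len(block) for block in blocks])  # eight leaves of 125 points each
print(mode_bits(3))       # 8
print(bbox_bits(3, 10))   # 144
```

With $L = 3$ and $b = 10$, the sketch gives 8 mode bits and $12 \times 7 + 60 = 144$ bounding-box bits, which is small compared with the occupancy bits of the leaf blocks themselves.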
