Conference Paper/Proceeding/Abstract

Multi-modal Dynamic Point Cloud Geometric Compression Based on Bidirectional Recurrent Scene Flow

Fangzhe Nan, Frederick Li, Zhuoyue Wang, Gary Tam, Zhaoyi Jiang, Dong Zheng, Bailin Yang

ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Pages: 1 - 5

Swansea University Author: Gary Tam

  • ICASSP2025_paper.pdf

    PDF | Accepted Manuscript

    Author accepted manuscript document released under the terms of a Creative Commons CC-BY licence using the Swansea University Research Publications Policy (rights retention).



Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN: 979-8-3503-6875-8; 979-8-3503-6874-1
ISSN: 1520-6149; 2379-190X
Published: IEEE 2025

URI: https://cronfa.swan.ac.uk/Record/cronfa68655
Abstract: Deep learning methods have recently shown significant promise in compressing the geometric features of point clouds. However, challenges arise when consecutive point clouds contain holes, resulting in incomplete information that complicates motion estimation. To our knowledge, most existing dynamic point cloud compression methods have largely overlooked this critical issue. Moreover, these methods typically employ a multi-scale single-pass approach for motion estimation, performing only one estimation at each scale. This limits accuracy and adversely impacts compression performance. To address these challenges, we propose a dynamic point cloud compression model called M2BR-DPCC (Multi-Modal Multi-Scale Bidirectional Recursion for Dynamic Point Cloud Compression). Our method introduces two key innovations. First, we integrate both point cloud and image data as inputs, leveraging a multi-modal feature representation completion (MFRepC) approach to align information across modalities. This addresses the issue of missing data in point clouds by using complementary information from images. Second, we implement a multi-scale bidirectional recursive (MSBR) motion estimation method. This module iteratively refines motion flows in both forward and backward directions, progressively enhancing point cloud features and improving motion estimation accuracy. Experimental results on widely used datasets, including MVUB and 8iVFB, demonstrate the effectiveness of our approach. Compared to existing methods, M2BR-DPCC achieves superior performance, with an average BD-rate improvement of 95.23% over V-PCC, 12.92% over D-DPCC, and 16.16% over patchDPCC. These results underscore the potential of leveraging multi-modal data and bidirectional refinement for dynamic point cloud compression.
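The multi-scale bidirectional recursive (MSBR) motion estimation described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the feature arrays, the residual-based `refine_flow` update, and the `msbr_motion_estimation` helper below are all hypothetical stand-ins, shown only to convey the coarse-to-fine, forward-then-backward refinement pattern, assuming per-point features at each scale and a simple translational motion model.

```python
import numpy as np

def refine_flow(flow, feat_src, feat_tgt, step=0.5):
    """One refinement step (toy): nudge the flow toward the residual
    between target features and motion-compensated source features."""
    return flow + step * (feat_tgt - (feat_src + flow))

def msbr_motion_estimation(prev_scales, curr_scales, iters=6):
    """Coarse-to-fine bidirectional recursive flow estimation (sketch).

    prev_scales / curr_scales: per-scale feature arrays, ordered coarse
    to fine, where each finer scale doubles the number of points.
    """
    flow = np.zeros_like(prev_scales[0])
    for fp, fc in zip(prev_scales, curr_scales):
        if flow.shape[0] != fp.shape[0]:
            # "Upsample" the coarser flow by repeating it per point.
            flow = np.repeat(flow, fp.shape[0] // flow.shape[0], axis=0)
        for _ in range(iters):
            flow = refine_flow(flow, fp, fc)       # forward refinement
            flow = -refine_flow(-flow, fc, fp)     # backward refinement
    return flow

# Usage: a rigid translation t applied between frames should be
# recovered as the estimated flow at the finest scale.
rng = np.random.default_rng(0)
t = np.array([0.5, -0.2, 0.1])
coarse = rng.normal(size=(4, 3))
fine = rng.normal(size=(8, 3))
flow = msbr_motion_estimation([coarse, fine], [coarse + t, fine + t])
```

The design point being sketched is that each scale's flow is refined in both directions before seeding the next finer scale, rather than a single pass per scale as in prior multi-scale schemes the abstract critiques.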
Keywords: Point cloud compression; Technological innovation; Image coding; Accuracy; Limiting; Motion estimation; Dynamics; Estimation; Filling; Speech processing
College: Faculty of Science and Engineering
Funders: National Natural Science Foundation Grant 62172366; Zhejiang Province Natural Science Foundation No. LY21F020013, LY22F020013; Royal Society grant IEC/NSFC/211159
Start Page: 1
End Page: 5