Learning Generalizable Shape Completion with SIM(3) Equivariance

Technical University of Munich, Munich Center for Machine Learning
NeurIPS 2025
Teaser figure

Three paradigms for shape completion. Explicit canonicalization, including SO(3)- and SE(3)-equivariant variants, leaks pose and scale cues and fails on non-canonical inputs. Data augmentation mitigates the alignment bias but introduces ambiguity. We present a SIM(3)-equivariant approach that generalizes to arbitrary similarity transforms.

Overview

3D shape completion methods typically assume scans are pre-aligned to a canonical frame. This leaks pose and scale cues that networks may exploit to memorize absolute positions rather than infer intrinsic geometry. When such alignment is absent in real data, performance collapses. We argue that robust generalization demands architectural equivariance to the similarity group SIM(3), so that the model remains agnostic to pose and scale. Following this principle, we introduce the first SIM(3)-equivariant shape completion network, whose modular layers successively canonicalize features, reason over similarity-invariant geometry, and restore the original frame. Under a de-biased evaluation protocol that removes the hidden cues, our model outperforms both equivariant and augmentation baselines on the PCN benchmark. It also sets new cross-domain records on real driving and indoor scans, lowering the minimal matching distance on KITTI by 17% and the Chamfer distance \(\ell_1\) on OmniObject3D by 14%. Perhaps surprisingly, our model evaluated under this stricter protocol still outperforms competitors evaluated under their own biased settings. These results establish full SIM(3) equivariance as an effective route to truly generalizable shape completion.
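
To spell out what SIM(3) equivariance requires, the sketch below checks the property \(f(sRx + t) = sR\,f(x) + t\) numerically. It is a minimal NumPy illustration, not the paper's network: a toy stand-in plays the model's role (reflecting each point through the centroid, a map that happens to commute with every similarity transform).

import numpy as np

rng = np.random.default_rng(0)

def random_sim3(rng):
    # sample a proper rotation via QR, plus a random scale and translation
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0                  # flip one axis so det(R) = +1
    return rng.uniform(0.5, 2.0), q, rng.normal(size=3)

def toy_completion(pts):
    # stand-in for the completion network: reflect each point through
    # the centroid; this map commutes with any similarity transform
    c = pts.mean(axis=0, keepdims=True)
    return 2.0 * c - pts

x = rng.normal(size=(2048, 3))           # a synthetic partial scan
s, R, t = random_sim3(rng)

lhs = toy_completion(s * x @ R.T + t)    # transform, then complete
rhs = s * toy_completion(x) @ R.T + t    # complete, then transform
assert np.allclose(lhs, rhs, atol=1e-6)  # equivariance: f(sRx+t) = sR f(x)+t

An architecture with this guarantee can never read pose or scale off the input frame, which is exactly the bias the overview describes.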


Comparison with other methods

Robustness comparison figure

Robustness to pose and scale perturbations. Under larger pose and scale changes, our SIM(3)-equivariant model maintains completion quality, whereas competing methods degrade.

PCN evaluation table

Evaluation on PCN. We compare methods supporting only SO(3) (top) or SE(3) (middle) transforms, and methods trained with SIM(3) augmentation (bottom). “Transform” indicates the train/test transform settings. Our model outperforms both competitors limited to partial transform groups and those relying on data augmentation. CD-\(\ell_1\) values are multiplied by 1000. Bold numbers indicate the best SIM(3) results.
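
For reference, here is a minimal NumPy sketch of the CD-\(\ell_1\) metric under one common convention (averaging the two directional terms; some papers sum them instead). The brute-force pairwise computation is for clarity only and is fine for a few thousand points:

import numpy as np

def chamfer_l1(p, q):
    # symmetric mean of un-squared nearest-neighbour distances
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (|P|, |Q|)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

rng = np.random.default_rng(0)
pred, gt = rng.normal(size=(2048, 3)), rng.normal(size=(2048, 3))
print(1000.0 * chamfer_l1(pred, gt))    # tables report CD-l1 x 1000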

Cross-domain results figure

Cross-domain generalization to real scans. Our PCN-trained model completes driving (KITTI) and indoor (OmniObject3D) scans, recovering more detail than the augmented baseline.


How it works

Architecture overview figure

Overview of our SIM(3)-equivariant shape completion pipeline. We extract point patch features with VN-DGCNN and feed them into a Transformer encoder-decoder. Within each layer module, we (i) canonicalize features to be translation- and scale-invariant, (ii) reason about intrinsic geometry via SIM(3)-invariant attention, and (iii) restore the original transform. This guarantees that both the intermediate features and the reconstructed shape transform consistently under SIM(3).
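
To make the three-step pattern concrete, below is a minimal PyTorch sketch, not the authors' implementation: it assumes raw per-point vector features of shape (B, N, 3), whereas the actual layers operate on Vector Neuron feature channels. Translation and scale are removed by centering and normalizing, rotation invariance comes from building attention logits out of inner products, and the stored frame is reapplied at the end.

import torch

def sim3_layer(v):
    """One hypothetical layer module over vector features v: (B, N, 3)."""
    # (i) canonicalize: subtract the centroid and divide by the mean
    #     radius, making the canonical features translation- and
    #     scale-invariant
    centroid = v.mean(dim=1, keepdim=True)                    # (B, 1, 3)
    centered = v - centroid
    scale = centered.norm(dim=-1).mean(dim=1, keepdim=True)   # (B, 1)
    canon = centered / scale.clamp_min(1e-8).unsqueeze(-1)

    # (ii) reason over similarity-invariant geometry: inner products of
    #      canonical vectors are unchanged by rotation, so the attention
    #      weights are SIM(3)-invariant; mixing vectors with invariant
    #      weights keeps the result rotation-equivariant
    weights = torch.softmax(canon @ canon.transpose(1, 2), dim=-1)
    mixed = weights @ canon                                   # (B, N, 3)

    # (iii) restore the original translation and scale so the output
    #       transforms together with the input
    return mixed * scale.unsqueeze(-1) + centroid

x = torch.randn(2, 512, 3)    # toy vector features
y = sim3_layer(x)             # same shape; SIM(3)-equivariant in x

Applying a random similarity transform to x and rerunning the layer reproduces the same transform on y, which is exactly the guarantee stated above.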

BibTeX

@article{wang2025simeco,
    title={Learning Generalizable Shape Completion with SIM(3) Equivariance}, 
    author={Yuqing Wang and Zhaiyu Chen and Xiao Xiang Zhu},
    journal={arXiv preprint arXiv:2509.26631},
    year={2025}
}