SegMASt3R: Geometry Grounded Segment Matching

Rohit Jayanti1    Swayam Agrawal1    Vansh Garg1    Siddharth Tourani2    Siddharth Tourani3    Muhammad Haris Khan3    Sourav Garg3    K Madhava Krishna1   

1 IIIT Hyderabad, India    2 Heidelberg University, Germany    3 MBZUAI, UAE   


Segment matching is an important intermediate task in computer vision that establishes correspondences between semantically or geometrically coherent regions across images. Unlike keypoint matching, which focuses on localized features, segment matching captures structured regions, offering greater robustness to occlusions, lighting variations, and viewpoint changes. We leverage the spatial understanding of 3D foundation models to tackle wide-baseline segment matching, a challenging setting involving extreme viewpoint shifts. We propose an architecture that uses the inductive bias of these 3D foundation models to match segments across image pairs with up to 180° rotation. Extensive experiments show that our approach outperforms state-of-the-art methods, including the SAM2 video propagator and local feature matching methods, by up to 30% in AUPRC on the ScanNet++ and Replica datasets. We further demonstrate the benefits of the proposed model on relevant downstream tasks, including 3D instance segmentation and object-relative navigation.