Due to the diverse geographical environments, intricate landscapes, and high-density settlements,
the automatic identification of urban village boundaries using remote sensing images remains a highly challenging task.
This paper proposes a novel and efficient neural network model called UV-Mamba for accurate boundary detection in high-resolution remote sensing images.
UV-Mamba mitigates the memory loss problem in lengthy sequence modeling, which arises in state space models with increasing image size, by incorporating deformable convolutions.
Its architecture utilizes an encoder-decoder framework and includes an encoder with four deformable state space augmentation blocks for efficient multi-level semantic extraction and a decoder to integrate the extracted semantic information.
We conducted experiments on two large datasets showing that UV-Mamba achieves state-of-the-art performance.
Specifically, our model achieves 73.3% and 78.1% IoU on the Beijing and Xi'an datasets, respectively, representing improvements of 1.2% and 3.4% IoU over the previous best model while also being 6× faster in inference speed and 40× smaller in parameter count.
Source code and pre-trained models are available at https://github.com/Devin-Egber/UV-Mamba.