Automated Text Recognition and Segmentation for Historic Map Vectorization: A Mask R-CNN and UNet Approach


Suresh Dodda, Naveen Kunchakuri, Anoop Kumar, Sukender Reddy Mallreddy


Historic maps are essential for understanding how buildings and landscapes have changed over time. Vectorization is a useful way to analyze large collections of these maps, but text that overlaps structural elements often makes the process more difficult. An automated pipeline for text recognition, pixel-level text mask creation, dataset generation, and text bounding box detection is therefore proposed. The findings show that a combination of a Mask Region-based Convolutional Neural Network (Mask R-CNN) and a UNet model can perform text segmentation, detection, and recognition: it detected 99.12% of all text occurrences in the images and attained an accuracy of 87.72% when extracting the text inside the bounding boxes. This end-to-end pipeline shows potential for a wide range of future uses, especially text removal to make historic maps easier to vectorize and analyze, which in turn will improve the understanding of historical buildings and landscapes.
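As an illustration of the text-removal use mentioned above, the following is a minimal sketch of how detected bounding boxes and a pixel-level text mask might be combined to blank out text before vectorization. It is not the authors' implementation: the function name `remove_text`, the box format `(x_min, y_min, x_max, y_max)` with exclusive maxima, the white background value, and the toy data are all assumptions made for this example, and the abstract does not state how the two model outputs are actually merged.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), maxima exclusive.
Box = Tuple[int, int, int, int]


def remove_text(
    image: List[List[int]],
    text_mask: List[List[int]],
    boxes: List[Box],
    background: int = 255,
) -> List[List[int]]:
    """Return a copy of a grayscale image in which every pixel that is
    flagged in text_mask AND lies inside a detected box is set to background.

    Pixels inside a box that the mask does not flag (for example a wall
    line crossing the label) are left untouched.
    """
    cleaned = [row[:] for row in image]  # do not modify the input image
    height = len(image)
    width = len(image[0]) if height else 0
    for x0, y0, x1, y1 in boxes:
        # Clip the box to the image so out-of-range boxes are harmless.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                if text_mask[y][x]:
                    cleaned[y][x] = background
    return cleaned


# Toy 3x4 grayscale image: 0 = dark text ink, 40 = structural line, 255 = paper.
image = [
    [255, 0, 0, 255],
    [255, 0, 40, 255],
    [255, 255, 40, 255],
]
# Toy pixel-level mask (1 = text), standing in for a segmentation output.
text_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
# One toy box, standing in for a detection output.
boxes = [(1, 0, 3, 2)]

cleaned = remove_text(image, text_mask, boxes)
```

In this toy case the three text pixels are replaced by the background value while the structural line (value 40) that passes through the box is preserved, which is the property that makes a pixel-level mask preferable to blanking the whole bounding box.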
