Urban Scene Segmentation and Cross-Dataset Transfer Learning using SegFormer
Abstract
Semantic segmentation is essential for autonomous driving applications, but state-of-the-art models are typically evaluated on large datasets like Cityscapes, leaving smaller datasets underexplored. This research gap limits our understanding of how transformer-based models generalize across diverse urban scenes with limited training data. This paper presents a comprehensive evaluation of SegFormer architectural variants (B3, B4, B5) on the CamVid dataset and investigates cross-dataset transfer learning from CamVid to KITTI. Using an optimization framework combining cross-entropy loss with class weighting and boundary-aware components, our experiments establish
new performance baselines on CamVid and demonstrate that transfer learning provides benefits when target-domain data is limited. We achieve a modest 2.57% relative mean Intersection over Union (mIoU) improvement on KITTI through knowledge transfer from CamVid, along with 61.1% faster convergence. Additionally, we observe substantial class-specific improvements of up to 30.75% for challenging categories. Our analysis provides insights into model scaling effects, cross-dataset knowledge transfer mechanisms, and practical strategies for addressing data scarcity in urban scene segmentation.
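To make the optimization framework concrete, the sketch below shows one plausible PyTorch realization of a class-weighted cross-entropy combined with a boundary-aware component. The boundary-extraction method (a max-pool/min-pool neighborhood test), the mixing coefficient lambda_boundary, and the helper names are illustrative assumptions, not the paper's exact recipe.

```python
# A minimal sketch, assuming a PyTorch training setup. The boundary detector,
# the mixing weight lambda_boundary, and all names here are hypothetical.
import torch
import torch.nn.functional as F

def boundary_weight_map(labels: torch.Tensor, kernel: int = 3) -> torch.Tensor:
    """Mark pixels whose local neighborhood contains more than one class.

    labels: (B, H, W) integer class map.
    Returns a (B, H, W) float map that is 1.0 on class boundaries, else 0.0.
    """
    lab = labels.float().unsqueeze(1)                   # (B, 1, H, W)
    pad = kernel // 2
    local_max = F.max_pool2d(lab, kernel, stride=1, padding=pad)
    local_min = -F.max_pool2d(-lab, kernel, stride=1, padding=pad)  # min-pool
    return (local_max != local_min).squeeze(1).float()

def combined_loss(logits, labels, class_weights,
                  lambda_boundary=1.0, ignore_index=255):
    """Class-weighted cross-entropy plus a boundary-weighted CE term."""
    per_pixel = F.cross_entropy(logits, labels, weight=class_weights,
                                ignore_index=ignore_index, reduction="none")
    base = per_pixel.mean()
    boundary = boundary_weight_map(labels)
    # Re-average the same per-pixel loss over boundary pixels only,
    # guarding against batches with no detected boundaries.
    boundary_term = (per_pixel * boundary).sum() / boundary.sum().clamp(min=1.0)
    return base + lambda_boundary * boundary_term
```

In training, `combined_loss(model(images), masks, weights)` would replace plain cross-entropy; the class weights counteract label imbalance (e.g. rare CamVid classes), while the boundary term emphasizes pixels near object contours, where segmentation errors concentrate.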