Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation

Furkan Karaman; Fatma Gümüş

doi:10.55525/tjst.1637713

Research Article

Year 2025, Volume: 20 Issue: 1, 327 - 337, 27.03.2025

Abstract

3D semantic segmentation, the process of assigning a semantic label to every point in 3D space, is critical for numerous applications, including autonomous driving, robotics, medical imaging, and urban mapping. Despite significant progress, challenges such as data imbalance, scalability, and real-time processing constraints persist. This study addresses the real-time processing issue by comparing Tiny, Medium, and Large PointNet-inspired models on the ShapeNetCore dataset. The models incorporate the T-Net module for pose normalization to maintain robustness against geometric transformations. Class-specific segmentation is explored by training separate models for the Airplane, Motorbike, and Car classes, which allows each class to be optimized individually. The Tiny model, trained on 512 sampled points with a batch size of 16 for 40 epochs at an initial learning rate of 1×10^(-3), achieved an average training accuracy of 86.18% and an average validation accuracy of 83.50%; its fast inference speed and high accuracy make it the most suitable of the compared models for real-time applications.
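As a rough illustration of the setup summarized above, the following Python sketch shows how a shape might be reduced to a fixed number of points, aligned with a 3×3 transform of the kind a T-Net predicts, and scored with per-point accuracy. Only the values in `TINY_CONFIG` come from the abstract; the function names, the random sampling strategy, and the row-vector convention are assumptions made for this example and are not taken from the authors' implementation.

```python
import random

# Hyperparameters reported in the abstract for the Tiny model.
TINY_CONFIG = {"num_points": 512, "batch_size": 16, "epochs": 40, "initial_lr": 1e-3}


def sample_points(points, labels, num_points, seed=0):
    """Randomly pick `num_points` points (with their labels) from one shape.

    Samples without replacement when the shape has enough points, otherwise
    with replacement. The sampling strategy is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n = len(points)
    if n >= num_points:
        idx = rng.sample(range(n), num_points)
    else:
        idx = [rng.randrange(n) for _ in range(num_points)]
    return [points[i] for i in idx], [labels[i] for i in idx]


def apply_transform(points, matrix):
    """Multiply every (x, y, z) point by a 3x3 matrix (row-vector convention).

    In a PointNet-style model a T-Net would predict `matrix`; here it is
    simply passed in by the caller.
    """
    return [
        tuple(sum(p[k] * matrix[k][j] for k in range(3)) for j in range(3))
        for p in points
    ]


def point_accuracy(predicted, target):
    """Fraction of points whose predicted part label equals the target label."""
    correct = sum(1 for p, t in zip(predicted, target) if p == t)
    return correct / len(target)
```

A shape with more than 512 points would be passed through `sample_points(points, labels, TINY_CONFIG["num_points"])` before batching, and `point_accuracy` would then be averaged over shapes to obtain figures comparable in kind to the training and validation accuracies quoted above.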

Keywords

3D semantic segmentation, point cloud processing, scalability, real-time processing.

Details

Primary Language	English
Subjects	Image Processing
Journal Section	TJST
Authors	Furkan Karaman 0009-0009-1304-8103 Fatma Gümüş 0000-0001-5191-0037
Publication Date	March 27, 2025
Submission Date	February 11, 2025
Acceptance Date	March 18, 2025
Published in Issue	Year 2025 Volume: 20 Issue: 1

Cite

APA	Karaman, F., & Gümüş, F. (2025). Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation. Turkish Journal of Science and Technology, 20(1), 327-337. https://doi.org/10.55525/tjst.1637713
AMA	Karaman F, Gümüş F. Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation. TJST. March 2025;20(1):327-337. doi:10.55525/tjst.1637713
Chicago	Karaman, Furkan, and Fatma Gümüş. “Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation”. Turkish Journal of Science and Technology 20, no. 1 (March 2025): 327-37. https://doi.org/10.55525/tjst.1637713.
EndNote	Karaman F, Gümüş F (March 27, 2025) Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation. Turkish Journal of Science and Technology 20 1 327–337.
IEEE	F. Karaman and F. Gümüş, “Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation”, TJST, vol. 20, no. 1, pp. 327–337, 2025, doi: 10.55525/tjst.1637713.
ISNAD	Karaman, Furkan - Gümüş, Fatma. “Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation”. Turkish Journal of Science and Technology 20/1 (March 2025), 327-337. https://doi.org/10.55525/tjst.1637713.
JAMA	Karaman F, Gümüş F. Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation. TJST. 2025;20:327–337.
MLA	Karaman, Furkan and Fatma Gümüş. “Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation”. Turkish Journal of Science and Technology, vol. 20, no. 1, 2025, pp. 327-37, doi:10.55525/tjst.1637713.
Vancouver	Karaman F, Gümüş F. Exploring the Impact of Model Capacity and Parameter Tuning on 3D Semantic Segmentation. TJST. 2025;20(1):327-37.