Academic publishing in Europe and N. America


HYBRID DEEP MODEL FOR UZBEK LANGUAGE PUNCTUATION PREDICTION

Authors

Hushnudbek Adinaev, Maksud S. Sharipov, Shahzodbek S. Ganijonov

Rubric: Informatics

Abstract

Automatic punctuation restoration is an important task in natural language processing, and it remains particularly relevant for low-resource languages such as Uzbek. The shortage of adequately annotated corpora degrades the performance of downstream applications, including speech recognition, machine translation, and semantic text analysis. This study proposes a hybrid deep learning model for predicting punctuation marks in Uzbek texts. The approach combines BERT-based contextual vector representations, a BiLSTM layer for modeling sequential dependencies, and a rule-based post-processing stage grounded in linguistic knowledge. The architecture draws on the semantic capabilities of transformer models and the sequential modeling strengths of recurrent networks, while the rule-based correction component improves punctuation accuracy in ambiguous boundary cases. Experimental results on an annotated Uzbek corpus show that the proposed model outperforms existing neural and statistical approaches in precision, recall, and F1-score. The findings indicate that integrating deep neural architectures with linguistic rules significantly improves punctuation restoration for low-resource languages. The work offers a practical and extensible approach for advancing Uzbek language processing and improving the accuracy of applied NLP systems.

Keywords

punctuation marks; NLP; BERT model; BiLSTM model; rule-based; F1 metrics



Copyright 2013-2025 Premier Publishing s.r.o.
Praha 8 - Karlín, Lyčkovo nám. 508/7, PSČ 18600, Czech Republic pub@ppublishing.org