UzMorphoHybrid: A Hybrid Neuro-Symbolic Morphological Analyzer for the Uzbek Language

Authors

Maksud Sharipov

Rubric: Informatics

Abstract

This paper presents UzMorphoHybrid, an open-source hybrid morphological analyzer for Uzbek, an agglutinative Turkic language. Existing statistical models often struggle with low-frequency or rare word forms; UzMorphoHybrid instead adopts a neuro-symbolic approach. It combines a BERT-based part-of-speech (POS) tagger for contextual disambiguation with a rule-based finite-state machine (FSM) for deterministic morphological segmentation. A domain-specific routing mechanism sends each word through a grammatically defined chain (a "Path"), which ensures high precision for rule-governed analyses. UzMorphoHybrid is implemented in Python and provides a modular framework for lemmatization, stemming, and full morphological analysis. These capabilities make it suitable for building large-scale Uzbek language corpora and for improving the accuracy of information retrieval systems.
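The routing idea described above can be illustrated with a minimal sketch. It is not the UzMorphoHybrid implementation: the suffix tables, the tag labels, the `PATHS` dictionary, and the `analyze` function are hypothetical and heavily simplified, and the sketch assumes that the path is selected by the POS tag, which in the real system would come from the BERT-based tagger rather than being passed in by hand.

```python
from typing import Dict, List, Tuple

# A "path" is an ordered chain of suffix slots. These toy tables are
# illustrative assumptions, not the paper's actual rule set.
Slot = Tuple[str, Dict[str, str]]

NOUN_PATH: List[Slot] = [
    ("plural", {"lar": "PL"}),
    ("possessive", {"imiz": "POSS.1PL", "ingiz": "POSS.2PL",
                    "im": "POSS.1SG", "ing": "POSS.2SG"}),
    ("case", {"ning": "GEN", "dan": "ABL", "da": "LOC",
              "ga": "DAT", "ni": "ACC"}),
]

VERB_PATH: List[Slot] = [
    ("tense", {"di": "PAST"}),
    ("person", {"m": "1SG", "ng": "2SG", "k": "1PL"}),
]

# Assumed routing table: the POS tag decides which chain a word follows.
PATHS: Dict[str, List[Slot]] = {"NOUN": NOUN_PATH, "VERB": VERB_PATH}


def analyze(word: str, pos: str) -> Tuple[str, List[str]]:
    """Strip suffixes slot by slot along the path chosen by the POS tag.

    Returns the remaining stem and the tags in surface (left-to-right) order.
    """
    path = PATHS.get(pos)
    if path is None:
        # No rule chain for this POS: return the word unsegmented.
        return word, []

    stem = word
    tags: List[str] = []
    # Suffixes attach in slot order, so peel them off from the last slot back.
    for _slot_name, suffixes in reversed(path):
        # Try longer suffixes first so "ning" wins over a shorter overlap.
        for suffix in sorted(suffixes, key=len, reverse=True):
            if stem.endswith(suffix) and len(stem) > len(suffix):
                tags.append(suffixes[suffix])
                stem = stem[: -len(suffix)]
                break  # at most one suffix per slot
    tags.reverse()
    return stem, tags


if __name__ == "__main__":
    print(analyze("kitoblarimizda", "NOUN"))
    print(analyze("yozdim", "VERB"))
```

Because each slot is optional and matching is greedy, a sketch like this will over-segment stems that happen to end in a suffix-like string; a practical analyzer would need a stem lexicon or additional constraints to reject such splits.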

Keywords

BERT
NLP
POS tagging
Uzbek morphological analyzer
FSA
Hybrid Neuro-Symbolic


Copyright 2013-2025 Premier Publishing s.r.o.
Praha 8 - Karlín, Lyčkovo nám. 508/7, PSČ 18600, Czech Republic pub@ppublishing.org