Named Entity Recognition for Confidential and Military Texts in the Uzbek Language
Keywords:
Named Entity Recognition, Uzbek NLP, Military Text Processing, Information Security
Abstract
This thesis presents a specialized Named Entity Recognition (NER) system for identifying sensitive entities in Uzbek confidential and military texts. By fine-tuning the bertbek-ner-uznews transformer model with domain-specific synthetic data, we achieved an F1-score of 89.6% across seven custom entity categories. The system addresses critical gaps in Uzbek NLP for security applications and provides a foundation for Data Loss Prevention (DLP) integration.
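The abstract does not say how the reported F1-score was computed. As a rough illustration only, the sketch below assumes the common convention for NER evaluation: entity-level, exact-match, micro-averaged F1 over BIO-tagged sequences. The label names `MIL_UNIT` and `LOC` are hypothetical placeholders, not the thesis's seven categories, and the helper names are likewise invented for this example.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one sentence of BIO tags.

    An I- tag that does not continue an open span of the same type is ignored.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 where a prediction counts only on exact span and type match."""
    tp = fp = fn = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)   # spans present in both
        fp += len(p - g)   # predicted but not in gold
        fn += len(g - p)   # in gold but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: one of two gold entities is recovered, with no false positives.
gold = [["B-MIL_UNIT", "I-MIL_UNIT", "O", "B-LOC"]]
pred = [["B-MIL_UNIT", "I-MIL_UNIT", "O", "O"]]
score = entity_f1(gold, pred)
```

Under this convention a partially matched entity earns no credit, which makes the metric stricter than token-level accuracy and arguably better suited to a DLP setting where a half-detected sensitive entity still leaks.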
References
1. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4171–4186.
2. Huang, Z., Xu, W., & Yu, K. (2015). Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv preprint arXiv:1508.01991.
3. Mengliev, D., Barakhnin, V., Abdurakhmonova, N., & Eshkulov, M. (2024). Developing named entity recognition algorithms for Uzbek: Dataset insights and implementation. Data in Brief, 54, 110413.
4. Mengliev, D., Barakhnin, V., Eshkulov, M., Ibragimov, B., & Madirimov, S. (2024). A comprehensive dataset and neural network approach for named entity recognition in the Uzbek language. Data in Brief, 58, 111249.
5. elmurod1202. (2023). bertbek-ner-uznews: BERT-based Named Entity Recognition for Uzbek News. Hugging Face Model Hub. Available at: https://huggingface.co/elmurod1202/bertbek-ner-uznews
6. Li, X., Li, D., Yang, Z., Zhao, H., Cai, W., & Lin, X. (2023). ND-NER: A Named Entity Recognition Dataset for OSINT Towards the National Defense Domain. In Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1792. Springer.
7. Nitzl, C., Cyran, A., Krstanovic, S., & Borghoff, U. (2025). The Application of Named Entity Recognition in Military Intelligence. In Computer Aided Systems Theory – EUROCAST 2024. Lecture Notes in Computer Science, vol 15172. Springer.
8. Wang, X. R., Xiong, Z. H., & Du, X. Y. (2020). NER in threat intelligence domain with TSFL. Proceedings of the 9th International Conference on Natural Language Processing and Chinese Computing, 157–169.