A Deep Learning Based Offline Optical Character Recognition Model for Printed Ottoman Turkish


Ahmed AL-KHAFFAF
Ümit ATİLA

Abstract

Developing efficient optical character recognition (OCR) systems for printed Ottoman text remains an open problem, because existing OCR models built for Arabic have limitations that make them difficult to apply to it. These models have been shown to perform poorly when used to recognize Ottoman text, and even when trained specifically on Ottoman text they have produced insufficient results. This study proposes a deep learning model for recognizing printed Ottoman Turkish documents in the Matbu font. The model is an end-to-end trainable architecture that integrates convolutional neural networks (CNNs) with bidirectional long short-term memory (BiLSTM) units. Experimental results show that the proposed model achieved overall accuracy, sensitivity, and precision of 99.6%, 87.1%, and 93.3%, respectively, on the test dataset.
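The three reported metrics can be illustrated with a minimal sketch. The abstract does not state how the scores were averaged, so the code below assumes a one-vs-rest computation per character class followed by a macro average; the function name `macro_metrics` and the toy labels are hypothetical and are not taken from the article. Under this assumption, accuracy can sit well above sensitivity because each class's true negatives count toward its accuracy.

```python
def macro_metrics(y_true, y_pred):
    """Macro-averaged one-vs-rest accuracy, sensitivity (recall), and precision.

    Assumption: each class is scored against all others and the per-class
    scores are averaged. The article may have used a different scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    acc, sens, prec = [], [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        tn = n - tp - fp - fn
        acc.append((tp + tn) / n)
        # Guard against classes with no true or no predicted samples.
        sens.append(tp / (tp + fn) if tp + fn else 0.0)
        prec.append(tp / (tp + fp) if tp + fp else 0.0)
    k = len(labels)
    return sum(acc) / k, sum(sens) / k, sum(prec) / k


# Toy example with three hypothetical character classes.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
accuracy, sensitivity, precision = macro_metrics(y_true, y_pred)
```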


Article Details

How to Cite
AL-KHAFFAF, A., & ATİLA, Ümit. (2023). A Deep Learning Based Offline Optical Character Recognition Model for Printed Ottoman Turkish. Technium: Romanian Journal of Applied Sciences and Technology, 18, 47–64. https://doi.org/10.47577/technium.v18i.10252

