Performance evaluation of Tesseract OCR-based KTM validation using Adaptive Thresholding and Levenshtein Distance

Authors

  • Muhammad Faizul Ulum, Informatics Study Program, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jawa Timur
  • Retno Mumpuni, Informatics Study Program, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jawa Timur
  • Afina Lina Nurlaili, Informatics Study Program, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jawa Timur

Keywords:

Levenshtein Distance, OCR, Tesseract, Identity Validation, Adaptive Thresholding

Abstract

Manual validation of student ID cards (Kartu Tanda Mahasiswa, KTM) in the inventory rental system of UKKI UPN "Veteran" Jawa Timur is prone to human error and puts assets at risk. This study evaluates the performance of an automated validation system based on Tesseract OCR that integrates Adaptive Gaussian Thresholding for image preprocessing and Levenshtein Distance for verifying the extracted text. The validation approach combines string matching with structure-based logical rules applied to the student identification number (Nomor Pokok Mahasiswa, NPM). Test results show that the image preprocessing stage improves text legibility under uneven lighting and arbitrary orientation. The multi-level validation mechanism handles OCR misreads (noise) effectively, including recovery of distorted data through fallback logic. Performance evaluation with a Confusion Matrix on 20 sample images yielded an Accuracy of 95%, a Precision of 100%, and a Recall of 93.3%. This level of accuracy indicates that the system can effectively prevent the acceptance of invalid documents, making it a suitable solution for improving the security and efficiency of identity verification in inventory rental.
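As an illustration of the two quantitative ideas in the abstract, the sketch below computes a Levenshtein distance between an expected string and an OCR reading, and derives Accuracy, Precision, and Recall from confusion-matrix counts. It is a minimal sketch, not the authors' implementation: the function names, the sample name string, and the normalized similarity score are assumptions made for the example. The counts TP = 14, FP = 0, FN = 1, TN = 5 are not reported in the abstract; they are one combination over 20 samples that reproduces the stated 95%, 100%, and 93.3%.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalized score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up name with two typical OCR confusions (I -> 1, O -> 0).
    expected, ocr_text = "BUDI SANTOSO", "BUD1 SANT0SO"
    print(levenshtein(expected, ocr_text))           # 2
    print(round(similarity(expected, ocr_text), 3))  # 0.833

    # Assumed counts, chosen only to match the reported percentages.
    print(confusion_metrics(tp=14, fp=0, fn=1, tn=5))
```

A validation step could accept an OCR reading when the similarity exceeds a chosen threshold; the threshold actually used in the study is not given in the abstract.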



Published

2026-03-31

How to Cite

Ulum, M. F., Mumpuni, R., & Nurlaili, A. L. (2026). Evaluasi kinerja validasi KTM berbasis Tesseract OCR menggunakan metode Adaptive Thresholding dan Levenshtein Distance. Prosiding Seminar Nasional Penelitian Dan Pengabdian Kepada Masyarakat LPPM Universitas ’Aisyiyah Yogyakarta, 4, 892–901. Retrieved from https://proceeding.unisayogya.ac.id/index.php/prosemnaslppm/article/view/2214

Section

Research