Performance evaluation of Tesseract OCR-based student ID card (KTM) validation using Adaptive Thresholding and Levenshtein Distance
Keywords:
Levenshtein Distance, OCR, Tesseract, Identity Validation, Adaptive Thresholding
Abstract
Manual validation of the student ID card (Kartu Tanda Mahasiswa, KTM) in the inventory rental system of UKKI UPN "Veteran" Jawa Timur is prone to human error and exposes assets to security risks. This study evaluates the performance of an automated validation system based on Tesseract OCR that integrates Adaptive Gaussian Thresholding for image preprocessing and Levenshtein Distance for verifying the extracted text. The validation approach combines string matching with structure-based logic rules applied to the student registration number (Nomor Pokok Mahasiswa, NPM). Test results show that the image preprocessing stage improves text legibility under uneven lighting and arbitrary orientation. The multi-stage validation mechanism handles OCR misreads (noise) well, including recovery of distorted data through fallback logic. Evaluating the system with a Confusion Matrix on 20 sample images yields an Accuracy of 95%, a Precision of 100%, and a Recall of 93.3%. These results indicate that the system can prevent the acceptance of invalid documents, making it a suitable solution for improving the security and efficiency of identity verification in inventory rental.
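To make the two quantitative ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows a standard Levenshtein Distance with a normalized similarity score, and the usual accuracy, precision, and recall formulas derived from a confusion matrix. The function names, the sample NPM strings, and the normalization by the longer string are assumptions made for illustration. The counts TP=14, FP=0, FN=1, TN=5 are not reported in the abstract; they are inferred here as the only split of 20 samples consistent with the stated 95%, 100%, and 93.3%.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer string
    is an assumption, not a detail taken from the paper."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and recall from confusion matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical NPM and an OCR misread (letter O instead of digit 0).
registered = "22081010001"
ocr_output = "2208101O001"
distance = levenshtein(registered, ocr_output)
score = similarity(registered, ocr_output)

# Counts inferred from the reported percentages, not stated in the source.
metrics = confusion_metrics(tp=14, fp=0, fn=1, tn=5)
```

In a validation flow of this kind, a similarity score would typically be compared against an acceptance threshold; the abstract does not state which threshold the system uses, so none is assumed here.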