TY - JOUR
ID - 
TI - A new approach for finding duplicated words in scanned Arabic documents based on OCR and SURF
AU - Asmaa Alsimry
AU - asmamahdi44@gmail.com
AU - Enas Wahab Abood
PY - 2021
VL - 47
IS - 1
SP - 201
EP - 215
JO - Journal of Basrah Researches (Sciences) مجلة ابحاث البصرة ( العلميات)
SN - 18172695 2411524X
AB - Copy-move forgery detection (CMFD) is an important challenge today, especially with the growth of electronic dealings in documents (e.g., images, pay lists, contracts, invoices, or any other informational documents), whose availability in the cloud exposes them to hackers and deceivers. Many image forgery detection methods have been presented for natural scenes, but only a few have addressed scanned text documents. In this paper, an approach is presented to find duplicated words in scanned Arabic text documents, as part of a system to detect forgery in official documents, based on optical character recognition (OCR) and Speeded Up Robust Features (SURF). OCR is used to segment each word out of the document, while SURF is used to compute the feature descriptors of each segmented word for later matching. Matching has two stages: the first matches features using the Euclidean distance, and the second computes the ratio of the number of matched features to the number of features of the reference word. Two words are considered duplicates if this percentage is greater than or equal to a specific threshold (30%). The experimental results showed that the proposed approach was efficient and robust in finding duplicated words. For image size <= 90x90 pixels, the evaluation metrics were TPR = 50.54%, FPR = 66.66%, ACC = 25.436%, running time = 1.25 sec, and for image size > 90x90 pixels, TPR = 100%, FPR = 60%, ACC = 50.2%, running time = 1.87 sec for SURF; while for image size <= 90x90 pixels, TPR = 95.3%, FPR = 0%, ACC = 97.65%, running time = 0.15 sec, and for image size > 90x90 pixels, TPR = 100%, FPR = 0%, ACC = 100%, running time = 0.18 sec.

ER -