A Proposed Method for Detecting Text Similarity in English Documents Using Semantic Networks

Abstract

The massive increase in documents on the World Wide Web, and the ease of reaching them, has led to a serious problem: using the work of others without giving them credit. Although a number of methods have been developed to detect some forms of plagiarism, such as changing sentence structure or replacing words with their synonyms, it is still difficult to detect plagiarism when the quoted sentences have been deliberately modified. In this paper, a Semantic Similarity Algorithm System is proposed for detecting plagiarism in texts using semantic networks. A database is built to hold original texts from different fields. The system determines the degree of similarity between the original documents and the submitted documents on the basis of the largest number of similar words they share. It has been improved to determine the similarity between plagiarized documents and large source books, and to handle words with multiple meanings. It compares the plagiarized document with the texts in the database and retrieves those texts whose similarity percentage exceeds 50%. It also compares the original text with the plagiarized one and reports the percentage of similarity between them, as well as the number of nouns, verbs, adjectives, and adverbs that are similar in both spelling and meaning.
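The word-sharing comparison described in the abstract can be illustrated with a rough sketch. It is not the system's actual algorithm. The `CONCEPTS` dictionary is a hypothetical toy stand-in for a semantic network, the function names are assumptions made for illustration, and the score (the share of the suspect text's concepts found in the original) is one plausible choice among several. Only the 50% threshold comes from the text.

```python
import re

# Hypothetical toy "semantic network": each word maps to a concept label.
# A real system would consult a lexical resource instead of this table.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle",
    "quick": "fast", "fast": "fast", "rapid": "fast",
    "buy": "purchase", "purchase": "purchase",
}

def concepts(text):
    """Lowercase, tokenize, and map each word to its concept (or to itself)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {CONCEPTS.get(w, w) for w in words}

def similarity_percent(suspect, original):
    """Percentage of the suspect text's concepts that also occur in the original."""
    s, o = concepts(suspect), concepts(original)
    if not s:
        return 0.0
    return 100.0 * len(s & o) / len(s)

def matches_over_threshold(suspect, database, threshold=50.0):
    """Return (name, score) pairs for database texts scoring above the threshold."""
    scored = ((name, similarity_percent(suspect, text))
              for name, text in database.items())
    return sorted((p for p in scored if p[1] > threshold), key=lambda p: -p[1])
```

Because synonyms collapse to the same concept before comparison, a sentence rewritten with "purchase" for "buy" still matches its source. Counting matches separately for nouns, verbs, adjectives, and adverbs would additionally require a part-of-speech tagger, which this sketch omits.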