Document Type : Original Article
Authors
1 Associate Professor, Faculty of Computer Science and Engineering and Interdisciplinary Qur'anic Studies Research Institute, Shahid Beheshti University, Iran
2 Assistant Professor, Interdisciplinary Qur'anic Studies Research Institute, Shahid Beheshti University, Iran
Abstract
With the growth of textual data on the internet and the limited time individuals have for reading, automatic text summarization is more essential than ever. One application of summarization is title generation. The goal of this study, which falls within the field of digital humanities and interdisciplinary studies, is to provide a framework for title generation through extractive and abstractive summarization methods, focusing specifically on chapters (surahs) of the Qur'an. For extractive summarization, eleven different methods are examined, some of which are novel. For abstractive summarization and title generation, several models were trained in order to select the most effective one. The Persian translation of the Qur'an is used as the primary source, and a dataset was created from the first ten parts (juz) of the Qur'an, comprising extractive summaries, abstractive summaries, and titles for various sections of the chapters. The results indicate that the titles generated through summarization are close to human-generated titles, with R-1, R-2, R-L, and BERTScore values of 21.03, 6.85, 20.73, and 52.51, respectively. Note that no single fixed title exists for a document; multiple titles may be equally valid. In human evaluation, titles produced by the proposed approach received an average score of 0.59, compared with 0.44 for the best results from other approaches.
Keywords
- extractive summarization
- abstractive summarization
- title generation for Qur'anic surahs
- computational Qur'anic studies
Main Subjects