Document Type: Original Article
Authors
1 Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran
2 Interdisciplinary Studies of Quran, Shahid Beheshti University, Tehran, Iran
Abstract
Simplifying compound and complex sentences into simple sentences helps machines process text in natural language processing (NLP) tasks such as inference, machine translation, and information extraction, and can improve their accuracy. Our research is inspired by a text simplification method called "split and rephrase." We introduce a new sequence-to-sequence text generation model that transforms complex sentences into simple ones based on the conjunction "and" in Persian. By utilizing language models with millions or even billions of parameters, our approach facilitates a better understanding of text complexities and more accurate identification of breaking points. Our results show a BLEU score of 0.47 for the generated simple sentences, which are both grammatically correct and fluent.
Keywords
- Split and rephrase
- Text simplification
- Compound sentence
- Complex sentence
- Simple sentence
- Text generation
Main Subjects