ADVANCEMENTS IN TEXT SUMMARIZATION AND EXTRACTIVE QUESTION-ANSWERING: A MACHINE LEARNING APPROACH
Dr. D. Eswara Chaitanya, Nandigam Mounika, Paridala Sai Lokesh, Vudumula Gopichand, Paritala Bharath Kumar
Dept. of Electronics and Communication Engineering, RVR & JC College of Engineering, Guntur
Abstract— In the era of social media platforms, the rapid growth of data mining in information retrieval and natural language processing underscores the need for automated text summarization. Pretrained word embeddings and sequence-to-sequence models can now be effectively repurposed for social-network summarization, condensing salient information with strong encoding capability. However, handling long-range text dependencies and efficiently exploiting latent topic structure remain significant obstacles for these models. In this paper, we propose a topic-focused approach to both extractive and abstractive summarization, with integrated question-answering features, built on the BERT and Pegasus pretrained models.
We present a comprehensive analysis of the architectures of the pre-existing models underlying our system. Our Text Summarization & Question Answering model was evaluated on several datasets, including Multi-News, XSum, and CNN/Daily Mail, and the experimental results show state-of-the-art performance as measured by ROUGE scores. Human evaluation further confirms that the summaries generated by our model match human-level performance across these datasets.
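The ROUGE scores cited above measure n-gram overlap between a generated summary and a reference. As an illustration only, a minimal ROUGE-1 computation can be sketched as follows; it assumes plain whitespace tokenization and omits the stemming and bootstrapping of the official ROUGE implementation:

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict:
    """Compute ROUGE-1 precision, recall, and F1 from unigram overlap.

    A simplified sketch: lowercase whitespace tokenization, no stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each word counts toward the overlap at most min(candidate, reference) times.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
# 5 of 6 candidate unigrams overlap the reference, so P = R = F1 = 5/6.
```

In practice, published ROUGE numbers are computed with the reference toolkit rather than a hand-rolled function like this one.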
Furthermore, we integrate an extractive question-answering task into our model, proposing a BERT-based architecture and comparing it against alternative language models to gauge its efficacy.
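In extractive QA with BERT, the model emits per-token start and end logits over the passage, and the answer is the token span that maximizes their sum. The following sketch shows only that span-selection step; the function name and the maximum-length cap are illustrative, not the paper's exact decoding procedure:

```python
def best_span(start_logits, end_logits, max_answer_len=15):
    """Return the (start, end) token span with the highest combined score.

    Enforces start <= end and caps the span length, as is common in
    extractive-QA decoding. Sketch only; real systems also handle the
    "no answer" case and batch over candidate passages.
    """
    best = (0, 0)
    best_score = float("-inf")
    for i, s in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best, best_score

# Toy logits: the strongest start is token 1, the strongest end is token 2.
span, score = best_span([0.1, 2.0, 0.3, 0.2], [0.0, 0.5, 3.0, 0.1])
# span == (1, 2): the answer is the passage text spanning tokens 1-2.
```

The selected token span is then mapped back to the original character offsets to produce the answer string.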
Keywords—Natural Language Processing (NLP), Text Summarization, Question Answering (Q&A), Transformer Models, Pegasus, BERT (Bidirectional Encoder Representations from Transformers), Neural Topic Modeling, Extractive Summarization, Abstractive Summarization, Semantic Representation, Topic Embedding