Sentence Annotation based Enhanced Semantic Summary Generation from Multiple Documents
- 1 Department of Computer Science and Engineering, Kongu Engineering College, Erode, India
Problem statement: The goal of document summarization is to provide a concise summary or outline of multiple documents so that readers spend less time on them. Sentence extraction is a technique that picks out relevant and important sentences from documents and presents them as a summary. A more meaningful sentence selection strategy is therefore needed to extract the most significant sentences. Approach: This study proposes an approach for generating initial and update summaries by performing sentence-level semantic analysis. To select the necessary information from the documents, all sentences are annotated with aspects, prepositions and named entities. To detect the most dominant concepts within a document, Wikipedia is used as a resource, and the weight of each word is calculated using the Term Synonym Concept Frequency-Inverse Sentence Frequency (TSCF-ISF) measure. Sentences are ranked by their assigned scores, and the summary is formed from the highest-ranking sentences. Results: To evaluate the quality of a summary in terms of the coverage between the machine summary and the human summary, the intrinsic measures Precision and Recall are used. Precision measures exactness, whereas Recall measures the completeness of the summary. Our results are then compared with the LexRank Update summarization task and with the Semantic Summary Generation method. The ROUGE-1 measure is used to determine how well the machine-generated summary correlates with the human summary. Conclusion: The performance of update summarization relies heavily on the measurement of sentence similarity based on TSCF-ISF. The experimental results show low overlap between the initial summary and its update summary.
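The ranking and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the TSCF-ISF formula, so the sketch substitutes a plain term frequency-inverse sentence frequency weight and leaves out the synonym and Wikipedia concept components entirely. The function names (`tokenize`, `isf_weights`, `rank_sentences`, `unigram_precision_recall`) are hypothetical, and computing Precision and Recall over clipped unigram overlap, in the spirit of ROUGE-1, is an assumption rather than the authors' stated procedure.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def isf_weights(sentences):
    """Inverse sentence frequency per term: log(N / n_t), where n_t is the
    number of sentences containing term t (assumed stand-in for TSCF-ISF)."""
    n = len(sentences)
    sentence_freq = Counter()
    for sentence in sentences:
        sentence_freq.update(set(tokenize(sentence)))
    return {term: math.log(n / count) for term, count in sentence_freq.items()}


def rank_sentences(sentences, top_k=2):
    """Score each sentence as the sum of tf * isf over its terms and return
    the top_k sentences in their original document order."""
    isf = isf_weights(sentences)
    scored = []
    for idx, sentence in enumerate(sentences):
        tf = Counter(tokenize(sentence))
        score = sum(freq * isf[term] for term, freq in tf.items())
        scored.append((score, idx))
    # Highest score first; ties are broken by earlier position.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_k]
    return [sentences[idx] for _, idx in sorted(top, key=lambda pair: pair[1])]


def unigram_precision_recall(machine_summary, human_summary):
    """Precision and recall from clipped unigram overlap between a machine
    summary and a human summary."""
    machine = Counter(tokenize(machine_summary))
    human = Counter(tokenize(human_summary))
    overlap = sum((machine & human).values())
    precision = overlap / sum(machine.values()) if machine else 0.0
    recall = overlap / sum(human.values()) if human else 0.0
    return precision, recall
```

Because the score is a plain sum, longer sentences tend to rank higher; a length normalization could be added, but the abstract does not say whether the authors apply one.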
Copyright: © 2012 A. Kogilavani and P. Balasubramanie. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Term Synonym Concept Frequency-Inverse Sentence Frequency (TSCF-ISF)
- sentence annotation
- semantic element extraction
- sentence scoring
- initial summary
- update summary