

Deadwood Detection and Elimination in Text Summarization for Punjabi Language
The rapid growth of the internet has produced a large amount of information. Text summarization provides a condensed version of such information, no longer than half of the original text. This paper proposes a system for the detection and removal of Deadwood in summaries for the Punjabi language. Deadwood is any word or phrase that can be omitted without loss of meaning; removing it shortens and clarifies the summary. The first step is preprocessing, which consists of sentence segmentation and removal of Punjabi stop words. In the second step, a weight is assigned to each sentence in the source text, using five different features. In the next step, the highest-scoring sentences are selected to form the summary. In the last step, the Deadwood is detected and removed from the summary.
Keywords
Deadwood, Phrase, Summary.
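
The four steps described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the system proposed in the paper: the abstract does not name the five sentence features, so a single stand-in feature (average term frequency of non-stop words) is used here as an assumption. The function names, the `ratio` parameter, and the `stop_words` and `deadwood_phrases` collections are all hypothetical and would have to be supplied from real Punjabi resources. Sentences are assumed to end with the danda (।), a question mark, or an exclamation mark.

```python
import re
from collections import Counter


def split_sentences(text):
    # Step 1a: segment on the danda, '?' or '!' (assumed sentence terminators).
    parts = re.split(r"[।?!]", text)
    return [p.strip() for p in parts if p.strip()]


def content_words(sentence, stop_words):
    # Step 1b: drop stop words; the stop-word list is supplied by the caller.
    return [w for w in sentence.split() if w not in stop_words]


def score_sentences(sentences, stop_words):
    # Step 2: assign a weight to each sentence.
    # Assumption: one stand-in feature (mean term frequency) replaces the
    # five features used in the paper, which the abstract does not list.
    freq = Counter(w for s in sentences for w in content_words(s, stop_words))
    scores = []
    for s in sentences:
        words = content_words(s, stop_words)
        scores.append(sum(freq[w] for w in words) / len(words) if words else 0.0)
    return scores


def remove_deadwood(sentence, deadwood_phrases):
    # Step 4: delete each listed phrase when it appears as whole words.
    # Longer phrases are tried first so they are not broken up by shorter ones.
    for phrase in sorted(deadwood_phrases, key=len, reverse=True):
        pattern = r"(?<!\S)" + re.escape(phrase) + r"(?!\S)"
        sentence = re.sub(pattern, " ", sentence)
    return " ".join(sentence.split())


def summarize(text, stop_words, deadwood_phrases, ratio=0.5):
    sentences = split_sentences(text)
    if not sentences:
        return []
    scores = score_sentences(sentences, stop_words)
    # Step 3: keep the highest-scoring sentences, at most `ratio` of the
    # source (but at least one), and restore their original order.
    keep = max(1, int(len(sentences) * ratio))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return [remove_deadwood(sentences[i], deadwood_phrases) for i in sorted(top)]
```

Passing the stop words and Deadwood phrases in as arguments keeps the sketch independent of any particular word list. The default `ratio=0.5` mirrors the statement that a summary is no longer than half of the original text.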