2. Single document summarization
Proposed use for Findwise:
• Metadata for the indexing service
Unsupervised:
• No need for a training set
• Relative domain independence
• Relative language independence
4. Sentence extraction
Sentence ranking
• Real-valued ranking
• Relevance ordering
Sentence selection
• Desired summary length
Sentence ordering
• Final presentation
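The three steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function name extract_summary is hypothetical, the desired summary length is taken to be a number of sentences, and the final presentation is assumed to restore original document order. The slides do not specify any of these choices.

```python
def extract_summary(sentences, scores, max_sentences):
    """Rank, select, and order sentences (illustrative sketch only)."""
    # Sentence ranking: order sentence indices by real-valued score, best first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Sentence selection: keep as many as the desired summary length allows
    # (assumed here to be counted in sentences).
    chosen = ranked[:max_sentences]
    # Sentence ordering: present the chosen sentences in document order
    # (an assumption; other orderings are possible).
    return [sentences[i] for i in sorted(chosen)]


summary = extract_summary(["a", "b", "c", "d"], [0.1, 0.9, 0.3, 0.8], 2)
```

Any of the ranking methods on the following slides could supply the scores argument.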
5. TextRank
Graph-based
• Sentences as vertices
• Similarity as edges
Iterative ranking
• PageRank
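A minimal sketch of the iterative ranking, assuming a precomputed symmetric similarity matrix as edge weights. The function name textrank, the damping value 0.85, the fixed iteration count, and the (1 - d) / n normalization are assumptions for illustration, not details taken from the slides.

```python
def textrank(similarity, damping=0.85, iterations=50):
    """Weighted PageRank over a sentence similarity matrix (sketch)."""
    n = len(similarity)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    # Total edge weight leaving each vertex, ignoring self-loops.
    out_weight = [
        sum(similarity[j][k] for k in range(n) if k != j) for j in range(n)
    ]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                # Vertex j passes on its score in proportion to edge weight.
                if j != i and out_weight[j] > 0:
                    rank += similarity[j][i] / out_weight[j] * scores[j]
            new_scores.append((1.0 - damping) / n + damping * rank)
        scores = new_scores
    return scores


# Sentence 0 is similar to both others; sentences 1 and 2 share nothing.
example_scores = textrank([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
```

In this toy graph the sentence connected to both others receives the highest score, which is the behavior the ranking relies on.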
6. Sentence Similarity
What makes two sentences similar?
Explored variations
• Shared words
• Word importance
• Lexical filtering
• Length normalization
• Advanced analysis
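One possible combination of shared words, lexical filtering, and length normalization is sketched below. The stop-word list, the tokenizer, and the logarithmic normalization are assumptions chosen for illustration; the slides do not say which variation performed best.

```python
import math
import re

# Tiny illustrative stop-word list (lexical filtering); not a real resource.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "is", "to"}


def tokenize(sentence):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def similarity(s1, s2):
    """Shared-word count normalized by the log of both sentence lengths."""
    w1, w2 = tokenize(s1), tokenize(s2)
    if not w1 or not w2:
        return 0.0
    denominator = math.log(len(w1)) + math.log(len(w2))
    if denominator == 0:
        # Both sentences have exactly one token; avoid dividing by zero.
        return 0.0
    shared = len(set(w1) & set(w2))
    return shared / denominator


example = similarity("The cat sat on the mat", "The cat lay on the rug")
```

Word importance could be added by replacing the plain shared-word count with a sum of per-word weights, for example inverse document frequency.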
7. K-means clustering
Approach:
• Sentences as points
• Divide into clusters
• Select sentences from each cluster
• Diverse summaries
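The approach can be sketched with a small pure-Python k-means, assuming sentences have already been mapped to numeric vectors. Seeding with the first k points and picking the sentence nearest each centroid are assumptions for illustration; the slides do not state how sentences are selected from each cluster.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic seeding (first k points)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def pick_representatives(points, k):
    """Return one sentence index per non-empty cluster, in document order."""
    assignment, centroids = kmeans(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.append(
                min(members, key=lambda i: squared_distance(points[i], centroids[c]))
            )
    return sorted(chosen)


points = [(0, 0), (0, 1), (10, 10), (10, 12), (0, 0.5), (10, 11)]
representatives = pick_representatives(points, 2)
```

Taking one sentence per cluster is what makes the summary diverse: near-duplicate sentences fall into the same cluster and only one of them is chosen.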
8. Domain customization
Domain: short news articles in English
• Sentence position is important
• Use domain knowledge to improve performance
• Different boosting for other domains
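A position boost for news articles might look like the sketch below. The function name boost_by_position and the 1 / (position + 1) decay are hypothetical; the slides only state that sentence position matters in this domain, not how the boost is computed.

```python
def boost_by_position(scores, weight=1.0):
    """Scale each sentence score so that earlier sentences gain more.

    The decay formula is an assumption for illustration only.
    """
    return [s * (1.0 + weight / (i + 1)) for i, s in enumerate(scores)]


boosted = boost_by_position([1.0, 1.0, 1.0])
```

Setting weight to 0 leaves the scores unchanged, so the boost can be switched off for domains where position carries no signal.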