Which words in a corpus have the highest values and which ones have the least?
Answers
Answer:
Stop words such as "and", "this", "is" and "the" occur most frequently in a corpus, so they have the highest values. However, these words say almost nothing about what the corpus is actually about. That is why they are called stop words and are usually removed at the pre-processing stage. Rare, valuable words occur the least but add the most meaning to the corpus. Hence, once the stop words are gone, we take both the frequent and the rare words that remain into consideration when we look at the text.
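To make this concrete, here is a minimal sketch that counts word occurrences in a tiny corpus. The corpus, the stop-word list and the helper name word_counts are all made up for illustration; a real project would typically use a larger stop-word list from an NLP library.

```python
from collections import Counter
import re

# Hypothetical toy corpus and stop-word list, for illustration only.
corpus = [
    "the cat is on the mat and the dog is in the yard",
    "this is the story of the cat and the dog",
    "the photosynthesis lecture is in the morning",
]
STOP_WORDS = {"the", "is", "and", "this", "on", "in", "of"}


def word_counts(documents):
    """Count how often each lowercase word occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return counts


counts = word_counts(corpus)

# The most frequent words overall are stop words.
most_frequent = counts.most_common(2)

# Drop stop words, as is usually done during pre-processing.
content_counts = {w: c for w, c in counts.items() if w not in STOP_WORDS}

# The rarest remaining words are the ones with the lowest count.
lowest = min(content_counts.values())
rarest = sorted(w for w, c in content_counts.items() if c == lowest)

print("Most frequent:", most_frequent)
print("Rarest content words:", rarest)
```

In this toy corpus the top of the frequency list is taken up by stop words, while a topic-specific word such as "photosynthesis" appears only once, which matches the point made in the answer.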