Computer Science, asked by halfdinner4444, 1 year ago

Which preprocessing methods are used for unstructured data classification?

Answers

Answered by writersparadise

The question is vague.

Preprocessing is a critical step in text mining, natural language processing (NLP), and information retrieval (IR).

Text gathering is the step in which all the raw data is collected. All of this data is unstructured. Preprocessing segments the raw, unstructured data and represents it in a more well-defined form.

Tokenization is the preprocessing step in which the raw data is divided into terms (tokens); this is also called feature generation. Stop words are then removed, and a stemming algorithm is applied so that all the remaining terms are represented in their stemmed form.
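
The steps described above (tokenization, stop-word removal, stemming) can be sketched in Python roughly as follows. This is only a minimal illustration under stated assumptions: the function names, the tiny stop-word list, and the suffix rules are all made up for this example, and `naive_stem` is a crude stand-in for a real stemming algorithm such as Porter's.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Split raw text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip a few common English suffixes.

    Hypothetical helper: a crude stand-in for a real stemmer,
    kept only when at least three characters remain.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining term."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The cats are jumping and the dog jumped"))
# prints ['cat', 'jump', 'dog', 'jump']
```

The resulting list of stemmed terms is what would typically be turned into features (for example, term counts) for a classifier. In practice one would likely use an established library for the stop-word list and the stemmer rather than the hand-written rules shown here.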
