Computer Science, asked by kalechandrakant5978, 1 year ago

How would you treat text containing shortcut words (like bcz, u, thr, etc.) in text mining?



Answers

Answered by sakshikumarisingh27
9
✨ hey mate here is ur answer ✨

→ TXT MNNG

☃️ HOPE THIS HELPS UH ☃️
Answered by darllinghari1
1

Answer:

After a text is obtained, we start with text normalization. Text normalization includes:

Converting all letters to lower or upper case.

Converting numbers into words or removing numbers.

Removing punctuation, accent marks, etc.

Removing extra white space.

Expanding abbreviations.

Removing stop words, sparse terms and particular words.
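The steps above can be sketched in Python roughly as follows. This is a minimal illustration using only the standard library, not a definitive pipeline: the function name normalize and the tiny STOP_WORDS set are assumptions made for the example, numbers are removed rather than spelled out, and abbreviation expansion is left out.

```python
import re
import string
import unicodedata

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "in", "of", "to"}

def normalize(text, stop_words=STOP_WORDS):
    # 1. Convert all letters to lower case.
    text = text.lower()
    # 2. Remove accent marks: decompose characters, drop combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # 3. Remove numbers (the other option is converting them to words).
    text = re.sub(r"\d+", " ", text)
    # 4. Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # 5. Splitting on whitespace also discards extra white space.
    tokens = text.split()
    # 6. Remove stop words.
    return [t for t in tokens if t not in stop_words]

print(normalize("The Café has 3 tables,  in  Paris!"))
```

In practice the stop-word list would come from a library or be built for the corpus at hand.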

Shortcut words can be treated in two ways:

a. Expand the shortcut words: stemming can reduce words to their root form, but it only handles shortcut words if a stemming group (a mapping from each shortcut to its full word) is defined for them. Normalization techniques, such as a lookup dictionary (bcz → because, u → you), can be applied to expand these words.

b. Remove the shortcut words: tokenize the text in Python and filter them out, strip them with the “re” regex library, or add them to the stop words list so that they are removed along with the other stop words.
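Both options can be sketched with the standard “re” library. This is only an illustration: the SHORTCUTS table and the function names are made up for the example, and the full form chosen for each shortcut (for instance thr → there) is an assumption that depends on the corpus.

```python
import re

# Hypothetical lookup table; a real one must be built for the corpus.
SHORTCUTS = {"bcz": "because", "u": "you", "thr": "there"}

def expand_shortcuts(text, mapping=SHORTCUTS):
    # Option a: replace each whole-word shortcut with its full form.
    # \b keeps "u" from matching inside words such as "but".
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mapping[m.group(1).lower()], text)

def remove_shortcuts(text, shortcuts=SHORTCUTS):
    # Option b: tokenize, then drop shortcuts like extra stop words.
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in shortcuts]

print(expand_shortcuts("I left bcz u were thr"))
print(remove_shortcuts("I left bcz u were thr"))
```

Expanding keeps the information the shortcut carried, while removing is simpler and is reasonable when the shortcuts stand for words that would be dropped as stop words anyway.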
