Computer Science, asked by designerjahanvyas, 5 months ago

Function ___________ divides a line of text into individual words.

Answers

Answered by GarimaaPandey

Function tokenization divides a line of text into individual words.

Answered by freedarajesh2003

Answer:

Explanation:

Splitting a Sentence into Words: .split()

Below, mary is a single string. Even though it is a sentence, its words are not represented as discrete units. For that, you need a different data type: a list of strings in which each string corresponds to a word. .split() is the method to use:

 

>>> mary = 'Mary had a little lamb'

>>> mary.split()  

['Mary', 'had', 'a', 'little', 'lamb']  

.split() splits mary on whitespace, and the returned result is a list of the words in mary. This list contains 5 items, as the len() function demonstrates. len() on mary, by contrast, returns the number of characters in the string (including the spaces).

 

>>> mwords = mary.split()  

>>> mwords

['Mary', 'had', 'a', 'little', 'lamb']  

>>> len(mwords)                # number of items in mwords

5  

>>> len(mary)                  # number of characters

22  

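Whitespace characters include the space ' ', the newline character '\n', and the tab '\t', among others. With no argument, .split() treats any run of consecutive whitespace characters as a single separator and ignores whitespace at the start and end of the string: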

 

>>> chom = ' colorless     green \n\tideas\n'       # ' ', '\n', '\t' bunched up

>>> print(chom)

colorless     green  

ideas

 

>>> chom.split()

['colorless', 'green', 'ideas']  
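To check the behaviour described above, here is a minimal sketch. The string padded and the variable names are made up for illustration and are not from the original answer. It shows that .split() with no argument never returns empty strings, even when the input starts or ends with whitespace or contains nothing else.

```python
# With no argument, split() treats any run of whitespace as one separator
# and drops leading and trailing whitespace.
padded = '  \t colorless \n\n green\t\tideas  \n'   # illustrative input
words = padded.split()
print(words)                 # ['colorless', 'green', 'ideas']

# An empty string, or one made only of whitespace, gives an empty list.
empty_result = ''.split()
blank_result = ' \n\t '.split()
print(empty_result)          # []
print(blank_result)          # []
```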

Splitting on a Specific Substring

By passing an optional argument, as in .split('x'), you can split a string on a specific substring 'x'. With no argument, .split() simply splits on runs of whitespace, as seen above.

 

>>> mary = 'Mary had a little lamb'

>>> mary.split('a')                 # splits on 'a'

['M', 'ry h', 'd ', ' little l', 'mb']  

>>> hi = 'Hello mother,\nHello father.'

>>> print(hi)

Hello mother,

Hello father.  

>>> hi.split()                # no parameter given: splits on whitespace

['Hello', 'mother,', 'Hello', 'father.']  

>>> hi.split('\n')                 # splits on '\n' only

['Hello mother,', 'Hello father.']  
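One detail that is easy to miss: .split(' ') is not the same as .split(). The sketch below uses a made-up line with a doubled space to show the difference. It also shows the optional second argument of .split(), maxsplit, which limits how many splits are made. The variable names are illustrative only.

```python
line = 'Mary  had a little lamb'   # note the two spaces after 'Mary'

# No argument: runs of whitespace are collapsed into one separator.
by_whitespace = line.split()
print(by_whitespace)   # ['Mary', 'had', 'a', 'little', 'lamb']

# Explicit ' ': every single space is a separator, so the doubled space
# leaves an empty string between 'Mary' and 'had'.
by_space = line.split(' ')
print(by_space)        # ['Mary', '', 'had', 'a', 'little', 'lamb']

# maxsplit=2: split at most twice; the rest of the string stays in one piece.
first_two = 'Mary had a little lamb'.split(' ', 2)
print(first_two)       # ['Mary', 'had', 'a little lamb']
```

If the goal is simply to get the words of a line, the no-argument form is usually the safer choice, because it does not produce empty strings.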

String into a List of Characters: list()

But what if you want to split a string into a list of characters? In Python, characters are simply strings of length 1. The list() function turns a string into a list of its individual characters, spaces included:

 

>>> list('hello world')

['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']  

More generally, list() is a built-in that builds a list from any iterable Python object. When a string is given, what's returned is a list of the characters in it. When other iterable types are given, the specifics vary, but the returned type is always a list.
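A few quick examples of list() on other types, as a rough sketch. The sample values are made up for illustration. Note that a value that cannot be iterated over, such as a plain integer, raises a TypeError instead of producing a list.

```python
from_tuple = list((1, 2, 3))
from_range = list(range(4))
from_dict = list({'a': 1, 'b': 2})   # iterating a dict yields its keys

print(from_tuple)   # [1, 2, 3]
print(from_range)   # [0, 1, 2, 3]
print(from_dict)    # ['a', 'b']

# An integer is not iterable, so list() cannot convert it.
raised = False
try:
    list(5)
except TypeError as err:
    raised = True
    print('TypeError:', err)
```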

Joining a List of Strings: .join()

If you have a list of words, how do you put them back together into a single string? .join() is the method to use. Called on a "separator" string 'x', 'x'.join(y) joins the elements of the list y into one string, with 'x' placed between each pair of neighbouring elements. Below, the words in mwords are joined back into the sentence string with a space in between:

 

>>> mwords

['Mary', 'had', 'a', 'little', 'lamb']  

>>> ' '.join(mwords)

'Mary had a little lamb'  

Joining can be done on any separator string. Below, '--' and the tab character '\t' are used.

 

>>> '--'.join(mwords)

'Mary--had--a--little--lamb'  

>>> '\t'.join(mwords)

'Mary\thad\ta\tlittle\tlamb'  

>>> print('\t'.join(mwords))

Mary    had     a       little  lamb  
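One thing to watch for: .join() only accepts strings as elements. The sketch below uses a made-up list of integers to show the error and one common workaround, which is to convert each element with str() first. The names are illustrative only.

```python
numbers = [3, 1, 4]   # illustrative list of non-string items

# Joining non-strings directly raises a TypeError.
raised = False
try:
    ', '.join(numbers)
except TypeError:
    raised = True
print(raised)         # True

# Convert each item to a string first, then join.
joined = ', '.join(str(n) for n in numbers)
print(joined)         # 3, 1, 4
```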

.join() can also be called on the empty string '' as the separator. In that case the elements of the list are joined with nothing in between. Below, a list of characters is put back together into the original string:

 

>>> hi = 'hello world'

>>> hichars = list(hi)

>>> hichars

['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']  

>>> ''.join(hichars)

'hello world'
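Note that splitting and joining are not always exact opposites. As a minimal sketch (the variable names are illustrative), ' '.join(s.split()) collapses every run of whitespace into a single space, so it gives back the original string only when the words were separated by single spaces to begin with. The list() and ''.join() pair, by contrast, always restores the original.

```python
messy = ' colorless     green \n\tideas\n'

# split() discards the original whitespace, so joining with ' ' normalizes it.
normalized = ' '.join(messy.split())
print(repr(normalized))           # 'colorless green ideas'
print(normalized == messy)        # False

# list() keeps every character, so ''.join() rebuilds the string exactly.
chars_round_trip = ''.join(list(messy))
print(chars_round_trip == messy)  # True
```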
