Understanding that when duck appears with water, the word duck probably refers to an animal, whereas when duck appears with ice, the word duck probably refers to sports. With this knowledge, our similarity metric would find these documents not very similar at all.

Suppose we had a library of words that are used in multiple contexts, such as:

    string[] multicontextwords = {"duck", "crane", "book"};

Suppose also that we have a multi-dimensional array that shows the multi-context words and the common words that are used with them:

    string[][] wordcontext = {
        {"duck (animal)", "zoo", "feathers", "water"},
        {"duck (sports)", "hockey", "anaheim", "ice"},
        {"duck (politics)", "congress", "lame"},
        {"crane (animal)", "bird", "water"},
        {"crane (construction)", "building", "equipment"},
    };

Modify the createdocumentvectors() pseudocode from above to take advantage of the multicontextwords[] and wordcontext[][] arrays to create better document vectors, so that the subsequent call to documentsimilarity() will better distinguish contexts.
Answers
Answer:
sorry I am not understanding clearly
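For anyone else stuck on the wording, here is one possible reading of the question, sketched in Python. The original createdocumentvectors() and documentsimilarity() pseudocode is not shown in the post, so everything below is an assumption: the vectors are taken to be plain word counts, the similarity is taken to be cosine similarity, and the helper disambiguate() is a hypothetical name introduced only for this sketch. The idea is that whenever a token is listed in multicontextwords, it is replaced by the sense label from wordcontext whose cue words appear most often in the same document, so "duck (animal)" and "duck (sports)" end up as different vector dimensions.

```python
import math
import re
from collections import Counter

# Data from the question, transliterated to Python lists.
multicontextwords = ["duck", "crane", "book"]

wordcontext = [
    ["duck (animal)", "zoo", "feathers", "water"],
    ["duck (sports)", "hockey", "anaheim", "ice"],
    ["duck (politics)", "congress", "lame"],
    ["crane (animal)", "bird", "water"],
    ["crane (construction)", "building", "equipment"],
]


def disambiguate(word, tokens):
    """Hypothetical helper: pick the sense label whose cue words overlap
    the document the most. Falls back to the bare word when no cue word
    is present (for example "book", which has no rows in wordcontext)."""
    present = set(tokens)
    best_label, best_hits = word, 0
    for row in wordcontext:
        label, cues = row[0], row[1:]
        if not label.startswith(word + " ("):
            continue  # row belongs to a different multi-context word
        hits = sum(1 for cue in cues if cue in present)
        if hits > best_hits:  # ties keep the earlier row
            best_label, best_hits = label, hits
    return best_label


def create_document_vector(text):
    """Assumed stand-in for createdocumentvectors(): word counts for one
    document, with multi-context words replaced by their sense label."""
    tokens = re.findall(r"[a-z]+", text.lower())
    vector = Counter()
    for token in tokens:
        if token in multicontextwords:
            token = disambiguate(token, tokens)
        vector[token] += 1
    return vector


def document_similarity(a, b):
    """Assumed stand-in for documentsimilarity(): cosine similarity."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

With "The duck swam across the water" and "The duck skated across the ice", the two documents no longer share a "duck" dimension, so the cosine score under these assumptions drops from 0.75 (plain word counts) to 0.625. The sketch does no stemming or stop-word removal, so "ducks" would not be matched and common words like "the" still contribute to the score.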