Explain the terms ‘Statistical Clustering’ and ‘Induction Tree’ with suitable examples.
Answers
Decision-tree induction is a well-known technique for assigning objects to categories in a white-box fashion. Most decision-tree induction algorithms rely on a sub-optimal greedy top-down recursive strategy for growing the tree. Even though such a strategy has been quite successful in many problems, it presents several deficiencies. For instance, there are cases in which the hyper-rectangular surfaces generated by these algorithms can only fit the input space after several sequential partitions, which results in a large and incomprehensible tree. In this paper, we propose a new decision-tree induction algorithm based on clustering named Clus-DTI. Our intention is to investigate how clustering data as a part of the induction process affects the accuracy and complexity of the generated models. Our performance analysis is not based solely on the straightforward comparison of our proposed algorithm to baseline classifiers. We also perform a data-dependency analysis in order to identify scenarios in which Clus-DTI is a more suitable option for inducing decision trees.
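As a rough illustration of the two terms in the question, the sketch below puts a greedy top-down recursive tree induction with axis-parallel splits (the strategy the answer describes) next to a plain k-means clustering routine. It is not Clus-DTI. The function names (`gini`, `best_split`, `induce_tree`, `predict`, `kmeans`), the Gini impurity criterion, the `max_depth` limit, the seeding rule, and the toy data are all assumptions made for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Greedy step: the axis-parallel split with the lowest weighted Gini."""
    best = None  # (score, feature index, threshold)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def induce_tree(rows, labels, depth=0, max_depth=3):
    """Top-down recursive induction; returns a nested dict or a leaf label."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no feature has two distinct values left
        return majority
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": induce_tree([r for r, _ in left], [y for _, y in left],
                            depth + 1, max_depth),
        "right": induce_tree([r for r, _ in right], [y for _, y in right],
                             depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

def kmeans(points, k, iters=20):
    """Plain k-means (Lloyd's algorithm), seeded with the first k points."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers

# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (5.0, 5.2), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1), (4.8, 5.0)]
labels = ["low", "high", "low", "high", "low", "high"]

tree = induce_tree(points, labels)       # supervised: uses the labels
assign, centers = kmeans(points, k=2)    # unsupervised: ignores the labels
print(tree)
print(assign)
```

The contrast is the point of the example: `induce_tree` needs class labels and produces readable if-then splits, whereas `kmeans` groups the same points by distance alone. Seeding k-means with the first `k` points is a simplification that works here only because the first two points fall in different groups.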