a) Consider the dataset in Table 1. Grade, Bumpiness and Speed-limit are the features and Speed
is the label.

Table 1: Dataset for decision tree



No.  Grade  Bumpiness  Speed-limit  Speed
1    steep  bumpiness  yes          slow
2    steep  smooth     yes          slow
3    flat   bumpiness  no           fast
4    steep  smooth     no           fast

Answer the following:
i) Determine the entropy of Speed.
ii) Which attribute should be selected as the root of the decision tree?
iii) Construct the decision tree for this dataset based on information gain.
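
Parts i) to iii) can be cross-checked with a short script. The following is a minimal sketch, assuming base-2 entropy and the four rows of Table 1 copied as they appear; the names `rows`, `entropy` and `information_gain` are illustrative and not part of the question.

```python
from collections import Counter
from math import log2

# Rows of Table 1, copied as given (feature values are kept verbatim).
rows = [
    {"Grade": "steep", "Bumpiness": "bumpiness", "Speed-limit": "yes", "Speed": "slow"},
    {"Grade": "steep", "Bumpiness": "smooth",    "Speed-limit": "yes", "Speed": "slow"},
    {"Grade": "flat",  "Bumpiness": "bumpiness", "Speed-limit": "no",  "Speed": "fast"},
    {"Grade": "steep", "Bumpiness": "smooth",    "Speed-limit": "no",  "Speed": "fast"},
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum((c / total) * log2(total / c) for c in Counter(labels).values())


def information_gain(rows, feature, label="Speed"):
    """Entropy of the label minus the weighted entropy after splitting on feature."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[label] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Gain of every candidate attribute; the root is the one with the largest gain.
gains = {f: information_gain(rows, f) for f in ("Grade", "Bumpiness", "Speed-limit")}
root = max(gains, key=gains.get)

if __name__ == "__main__":
    print("H(Speed) =", entropy([r["Speed"] for r in rows]))
    for feature, gain in gains.items():
        print(f"Gain({feature}) = {gain:.4f}")
    print("root attribute:", root)
```

With two slow and two fast rows, the entropy of Speed comes out as 1 bit. Under these assumptions Speed-limit separates the labels perfectly, so it has the largest gain and the tree needs only that one split.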

b) What do you mean by clustering? Consider the following sample points: A (1, 1), B (2, -2), C (2, 3), D (3, 3). Perform k-means clustering and show the calculation of the distance matrix and the group assignment matrix for two epochs only. [Assume k = 2]
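
The two epochs in part b) can be traced with a small script. This is a sketch only: the question does not state the initial centroids or the distance measure, so it assumes that A and B are the starting centroids and that distances are Euclidean. The helper name `kmeans_epochs` is hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8

points = {"A": (1, 1), "B": (2, -2), "C": (2, 3), "D": (3, 3)}


def kmeans_epochs(points, centroids, epochs=2):
    """Run a fixed number of k-means epochs.

    Returns a list of (distance matrix, group assignment matrix) pairs,
    one per epoch, together with the centroids after the last update.
    """
    history = []
    for _ in range(epochs):
        # Distance matrix: one row per point, one column per centroid.
        distances = {
            name: [dist(p, c) for c in centroids] for name, p in points.items()
        }
        # Group matrix: 1 in the column of the nearest centroid, 0 elsewhere.
        groups = {
            name: [int(i == d.index(min(d))) for i in range(len(centroids))]
            for name, d in distances.items()
        }
        history.append((distances, groups))

        # Update step: each centroid moves to the mean of its members.
        updated = []
        for i, old in enumerate(centroids):
            members = [points[n] for n, g in groups.items() if g[i] == 1]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        centroids = updated
    return history, centroids


if __name__ == "__main__":
    # Assumption: A and B are the initial centroids (k = 2).
    history, final = kmeans_epochs(points, [points["A"], points["B"]])
    for epoch, (distances, groups) in enumerate(history, start=1):
        print(f"epoch {epoch}")
        for name in points:
            row = ", ".join(f"{d:.3f}" for d in distances[name])
            print(f"  {name}: distances [{row}]  group {groups[name]}")
    print("centroids after last update:", final)
```

A different choice of initial centroids may lead to different distance and group matrices, so the starting pair should be stated explicitly in any written answer.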

