WPC300 Quiz 7: Clustering

Started: Feb  2021

Which of the following is a step of agglomerative hierarchical clustering?

• Separating a cluster into two finer groups
• Joining two clusters that are not at a Euclidean distance
• Joining the two clusters that are closest to each other
• Joining the two clusters that are farthest away from each other

Which of the following is true about k-means clustering?

• The cluster analysis will give us an optimum value for k
• We choose the value for k before doing the clustering analysis
• A tree diagram is used to illustrate the steps in the clustering analysis
• It is a type of hierarchical clustering

Which of the following is true of hierarchical clustering?

• All clusters must have the same number of data points
• All clusters must contain more than one object
• No single cluster can contain all the data
• The data partition does not occur in a single step

In a cluster analysis, the distance between the clusters should be:

• Minimized
• Even
• Maximized
• Zero

Which of the following is not an application of clustering analysis?

• Crime prediction analysis
• Web click stream analysis
• Market segmentation analysis
• Collaborative filtering analysis

In the Target story discussed in the lecture, why did Target send the teen daughter maternity ads?

• Target's analytics model confused her with an older woman with a similar name
• Target was using a special promotion that targeted all teens in her geographical area
• Target's analytics model suggested she was pregnant based on her buying habits
• Target was sending ads to all women in a particular neighborhood

Which of the following categories of data mining would you use for spam filtering of emails?

• Supervised
• Unsupervised
• Heuristics
• Both supervised and unsupervised

Which of the following statements is false about supervised/unsupervised data analysis?

• The data is not labeled for unsupervised analysis
• The data is not labeled for supervised analysis
• For unsupervised analysis, the goal is to find cases that are similar to each other
• The data is labeled for supervised analysis

Which of the following is a false statement?

• The k-means algorithm is a method for doing partitional clustering
• Reducing the SSE (sum of squared error) within a cluster increases cohesion
• To predict sales from transactional data, one should perform clustering analysis
• In cluster analysis, the objects within a cluster should exhibit a high degree of similarity
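
For reference, the SSE named in the options above is the sum of squared distances from each point to its cluster centroid. The following is a minimal sketch of that computation, assuming Euclidean distance and the mean as the centroid; the function name cluster_sse and the two sample clusters are hypothetical and chosen only for illustration.

```python
def cluster_sse(points):
    """Sum of squared Euclidean distances from each point to the cluster centroid."""
    n = len(points)
    dims = len(points[0])
    # Centroid = coordinate-wise mean of the points (an assumption of this sketch).
    centroid = [sum(p[d] for p in points) / n for d in range(dims)]
    return sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dims))
        for p in points
    )


# Two hypothetical clusters of four points each.
tight = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
loose = [(0.0, 0.0), (0.0, 4.0), (4.0, 0.0), (4.0, 4.0)]

print(cluster_sse(tight))  # points sit close to their centroid
print(cluster_sse(loose))  # points sit farther from their centroid
```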

Which of the following defines the distance between two clusters in complete-linkage clustering?

• The sum of the squared distances between clusters
• The distance between the least distant pair of objects, one from each group
• The distance between the most distant pair of objects, one from each group
• The average distance between all pairs of objects, where each pair is made up of one object from each group
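
For reference, the three pairwise definitions in the options above can be computed side by side. This is a minimal sketch, assuming Euclidean distance and Python 3.8+ (for math.dist); the helper pairwise_distances and the two sample groups are hypothetical.

```python
import math
from itertools import product


def pairwise_distances(group_a, group_b):
    """Euclidean distance for every pair made of one object from each group."""
    return [math.dist(a, b) for a, b in product(group_a, group_b)]


# Two hypothetical groups of 2-D points.
group_a = [(0.0, 0.0), (0.0, 1.0)]
group_b = [(3.0, 0.0), (3.0, 4.0)]

distances = pairwise_distances(group_a, group_b)

least_distant_pair = min(distances)             # closest cross-group pair
most_distant_pair = max(distances)              # farthest cross-group pair
average_pair = sum(distances) / len(distances)  # mean over all cross-group pairs

print(least_distant_pair, most_distant_pair, average_pair)
```

The three values generally differ for the same two groups, so the choice of definition changes which clusters are treated as nearest.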

