Chapter 19 Clustering Analysis

Content
- Similarity coefficient
- Hierarchical clustering analysis
- Dynamic clustering analysis
- Ordered sample clustering analysis

Discriminant Analysis: given individuals known with certainty to come from two or more populations, it is a method for deriving a discriminant model that will allocate further individuals to the correct population.

Clustering Analysis: a statistical method for grouping objects of arbitrary kind into categories. It is used when there are no a priori hypotheses, and the aim is to find the most appropriate grouping using mathematical statistics and the information collected. It has become the first-choice means of uncovering a great capacity of …ic messages.

Both are methods of multivariate statistics for studying classification.

Clustering analysis is a method of exploratory statistical analysis. It can be classified into two major types according to its aims. If m denotes the number of variables (i.e. indexes) and n the number of cases (i.e. samples), the two types are:

(1) R-type clustering: also called index clustering. It groups the m indexes, with the aim of reducing the dimension of the indexes and choosing typical ones.

(2) Q-type clustering: also called sample clustering. It groups the n samples in order to find what they have in common.

The most important issue for both R-type and Q-type clustering is the definition of similarity, that is, how to quantify similarity. The first step of clustering is to define a measure of similarity between two indexes or two samples: the similarity coefficient.

§1 Similarity coefficient

1. Similarity coefficient of R-type clustering

Suppose there are m variables: X_1, X_2, …, X_m. R-type clustering usually uses the absolute value of the simple correlation coefficient to define the similarity coefficient between variables X_i and X_j, with the sums taken over the n samples:

$$ r_{ij} = \frac{\left| \sum (X_i - \bar{X}_i)(X_j - \bar{X}_j) \right|}{\sqrt{\sum (X_i - \bar{X}_i)^2 \, \sum (X_j - \bar{X}_j)^2}} $$

The two variables tend to be more similar when the