NCJ Number
84965
Date Published
1981
Length
52 pages
Annotation
A new clustering program written to use a measure of association not previously available is compared with existing programs by analyzing a single set of real data.
Abstract
Cluster analysis finds clusters (groups of objects similar to each other) in a set of objects, using qualitative or quantitative measures of those objects. A new clustering program has been written that uses q as the index for finding items that cluster together. The program is described in McCormick et al. (1980). The model allows for overlapping sets of items. It can recover a hierarchical arrangement but does not impose that structure. The method starts with each item as its own cluster. The item that has the highest pairwise q with the original item is then joined to that cluster. From this point on, an item is added if it has the highest average q value with the items already in the cluster. The average is the arithmetic mean, as in the average linkage clustering described in Sneath and Sokal (1973). Besides q, the program can compute other indices of association: KR20, gamma, and Pearson r. The new program was compared with existing programs using data consisting of adjective ratings by social workers of a group of mothers. Six different clustering programs were examined. The solution obtained with the new program adds another option to the field of cluster analysis, especially for data consisting of items distributed in a nonnormal manner. Crime data are an example of such a distribution: most people commit few or no crimes while a few people commit the majority of crimes, although frequencies of crimes may still be small. Twenty-nine references, tabular data, and mathematical equations are provided.
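The cluster-growing procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the program of McCormick et al. (1980). It assumes a precomputed symmetric matrix of association values (q or any other index), and it assumes a stopping rule, the hypothetical parameter `min_avg`, because the abstract does not say when a cluster stops growing. The function names `grow_cluster` and `overlapping_clusters` are likewise invented for illustration.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def average_association(assoc: Matrix, item: int, cluster: List[int]) -> float:
    """Arithmetic mean of the association between `item` and each cluster member."""
    return sum(assoc[item][member] for member in cluster) / len(cluster)


def grow_cluster(assoc: Matrix, seed: int, min_avg: float) -> List[int]:
    """Grow one cluster from `seed` by repeatedly adding the best outside item.

    With a single member the average equals the pairwise value, so the first
    addition is the item with the highest pairwise association with the seed.
    `min_avg` is an assumed stopping threshold, not taken from the source.
    """
    cluster = [seed]
    candidates = set(range(len(assoc))) - {seed}
    while candidates:
        # Sort first so that ties are broken by the lowest item index.
        best = max(sorted(candidates),
                   key=lambda j: average_association(assoc, j, cluster))
        if average_association(assoc, best, cluster) < min_avg:
            break
        cluster.append(best)
        candidates.remove(best)
    return cluster


def overlapping_clusters(assoc: Matrix, min_avg: float) -> List[List[int]]:
    """Start one cluster per item; an item may end up in several clusters."""
    return [grow_cluster(assoc, seed, min_avg) for seed in range(len(assoc))]


if __name__ == "__main__":
    # Made-up association values for four items, for illustration only.
    example = [
        [1.0, 0.9, 0.8, 0.1],
        [0.9, 1.0, 0.7, 0.2],
        [0.8, 0.7, 1.0, 0.1],
        [0.1, 0.2, 0.1, 1.0],
    ]
    for members in overlapping_clusters(example, min_avg=0.5):
        print(members)
```

Because every item seeds its own cluster and membership is never exclusive, the resulting clusters may overlap, which matches the abstract's statement that the model allows overlapping sets of items without imposing a hierarchy.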