CLASSIFICATION.EXAMPLE   3/10/89   JF

THIS IS A RESULTS FILE PRODUCED BY A RUN OF CA CLA WITH 50S DATA SET  COMMENTS APPEAR HERE
                                                                      ====================

SPIDER V5 (01/16/86)/(ALBANY VX750 ) ON 14-JUL-88 AT 16:42:0
SPIDER 5 *** PROJECT CODE: DE5 DATA CODE: FIA ***

WELCOME %%% TO %%% +++ THE +++ ... WORLD ... OF *** SPIDER ***

.OPERATION: B23
.OPERATION: B23
** START OF B23.DE5 **
    1 CA CLA
    2 7
    3 CLU007
    4 1-4
    5 5,3
    6 4
    7 0.000000
    8 2.000000
    9 Y
   10 DDG007
   20 EN
.OPERATION: CA CLA
.INP FILE CODE:
7 0                                                                   The file code identifying the
.CLUST FILE:                                                          CORAN output files
CLU007                                                                Cluster output file.
.FACTOR NUMBERS: 1-4                                                  Iterations = iterations of the
.# OF ITER./PART., # CENTERS/PART.:                                   K-means algorithm.
5 3                                                                   Centers = images chosen
. # OF PARTITIONS:                                                    randomly from which the K-means
4 0                                                                   algorithm starts.

STEP ** CLASSY **
------------------------------------------------------------------------------------------------------------------------------
SPECIFICATIONS FOR CLASSY
FACTORS USED : 1 2 3 4
NBASE= 4 NITER= 5 NCLAS= 3 NKLA = 100
MEMORY RESERVATION YOU HAVE RESERVED100000 YOU NEED 7752
CLUSTERING BY AGGREGATION AROUND MOBILE CENTERS
PARTITION OF 100 OBJECT CHARACTERIZED BY 4 CARTESIAN COORDINATES
------------------------------------------------------------------------------------------------------------------------------
PARTITION CONTAINS 100 CLASSES THE 99 FIRST CONTAINS THE MOST STABLE OBJECTS IN THE 4 BASIC PARTITIONS   Summary of parameters
EACH PARTITION IS GENERATED BY 5 ITERATIONS AROUND 3 SEED-OBJECTS DRAWN AT RANDOM                        specified
.ENTER SEED INTEGER (0=RANDOM DRAW): 0.000000
** RANDOM SEED ASSIGNED = 601393                                      Use this integer when you wish to precisely
CONSTRUCTION OF A PARTITION WITH SEED-OBJECTS 68 13 35                reproduce this result!
SIZE OF CLUSTERS AFTER 5 ITERATIONS                                   partition 1
66. 17. 17.
CONSTRUCTION OF A PARTITION WITH SEED-OBJECTS 27 24 10
SIZE OF CLUSTERS AFTER 5 ITERATIONS                                   partition 2
33. 35. 32.
CONSTRUCTION OF A PARTITION WITH SEED-OBJECTS 93 15 60
SIZE OF CLUSTERS AFTER 5 ITERATIONS                                   partition 3
35. 33. 32.
CONSTRUCTION OF A PARTITION WITH SEED-OBJECTS 1 20 48
SIZE OF CLUSTERS AFTER 5 ITERATIONS                                   partition 4
35. 32. 33.

SIZE OF THE 81 CLUSTERS FROM THE CROSSED-PARTITION                    81=3**4 possible clusters;
FOLLOWED BY THECUMULATIVE PERCENTAGES.
34 32 17 16 1                                                         of these, only five contain any objects! One cluster
34.0 66.0 83.0 99.0 100.0                                             has a single member only.
SIZE OF RESIDUAL CLUSTER (NUMBER 100 )= 0 PERCENTENTAGE = 0.00
------------------------------------------------------------------------------------------------------------------------------
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1                               cluster assignments of the
1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 4 4 3 3                               100 objects (20 per row,
4 5 3 4 3 4 4 3 4 3 4 3 4 3 4 3 4 3 4 3                               from 1 to 100; numbers refer
3 4 3 4 3 4 3 4 2 2 2 2 2 2 2 2 2 2 2 2                               to cluster number)
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

.PERC. FOR CLASS CUTOFF(0=NO CUTOFF): 2.000000                        At this stage a decision is made on
DESCRIPTION OF THE HIERARCHY NODES                                    which clusters to include in the
NO  SENIOR  JUNIOR  NO.  WEIGHT    INDEX                              hierarchical merging process. The cutoff
 5       3       4    2   33.00   0.0049  ****                        (2%) specified excludes the
 6       1       2    2   66.00   0.0331  *********************       single-member cluster.
 7       5       6    4   99.00   0.1457  **************************************************************************************

DO YOU WANT DENDROGRAM PLOT FILE? (Y/N): Y                            Result of hierarchical classification:
.ENTER FILE NAME FOR DENDROGRAM:                                      Since only four clusters have passed
DDG007                                                                the cutoff, the merged clusters
File already exists, O.K. to erase and overwrite? (N/Y):              (classes) are assigned numbers 5 and
Y                                                                     up. The description of the hierarchy
DELETE DDG007.FIA;*                                                   nodes contains the steps of merging
FILE OPENED: DDG007.FIA                                               each of the original clusters with the
                                                                      group of clusters already merged.
                                                                      In our example, the original clusters
                                                                      3 and 4 are found to be closest, and
                                                                      they are merged to form the new cluster #5.
                                                                      Similarly, 1 and 2 are merged to form
                                                                      #6. The last step (which is always
                                                                      trivial) merges the new clusters #5 and
                                                                      #6 to form the trivial cluster #7, which
                                                                      contains all objects.

NODE  INDEX  SENIOR  JUNIOR  SIZE   DESCRIPTION OF THE HIERARCHY CLASSES
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
                                                                      The hierarchy index gives the increase
   5  0.005       3       4     2   3 4                               in intra-cluster variance resulting from
                                                                      the merging. A small increase (as in
   6  0.033       1       2     2   1 2                               3+4) indicates that the clusters are in
                                                                      close proximity, and can practically be
   7  0.146       5       6     4   3 4 1 2                           regarded as the same cluster. A large
                                                                      increase (as in 1+2) indicates that the
                                                                      clusters are truly distinct, and should
                                                                      only be regarded as a single cluster
                                                                      relative to even more distant
                                                                      configurations.

WEIGHT   INDEX     DENDROGRAM (SCALE 0.00 0.15 )
32.000   0.033  2  .......................
                                         .
34.000   0.146  1  ..................................................................................................
                                                                                                                    .
16.000   0.005  4  ...                                                                                              .
                     .                                                                                              .
17.000  ------  3  ..................................................................................................
------------------------------------------------------------------------------------------------------------------------------
END OF STEP ** CLASSY **
------------------------------------------------------------------------------------------------------------------------------

LIST OF CLASS MEMBERS
CLASS
    1   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20     Note that a single object, #42, is
       21  22  23  24  25  26  27  28  29  30  31  32  33  34                             missing because its cluster was
    2  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88     rejected based on the 2% cutoff.
       89  90  91  92  93  94  95  96  97  98  99 100
    3  35  36  39  40  43  45  48  50  52  54  56  58  60  61  63  65  67
    4  37  38  41  44  46  47  49  51  53  55  57  59  62  64  66  68

LIST OF CLASS CENTER COORDINATES
CLASS  SIZE        1        2        3        4                       This gives a quick idea of how many factorial axes
    1    34  -0.0120   0.0227  -0.0028   0.0002                       are involved in the distinctions. In this example,
    2    32  -0.0285  -0.0188   0.0003   0.0003                       factors 1-3 are needed to get a good picture of what
    3    17   0.0424  -0.0067  -0.0096  -0.0037                       is happening.
    4    16   0.0366  -0.0050   0.0131   0.0033

RE-CLASSIFICATION LOOKUP TABLE
       ORIGINAL CLASS                                                 Only two non-trivial cuts can be made in the dendrogram
                                                                      to decide on class membership: cut #1 leads to
       1   2   3   4                                                  two classes (row #2 of table gives classification
  2    1   1   2   2                                                  of original clusters into 1 or 2); cut #2 leads to
  3    1   2   3   3                                                  3 classes (row #3 of table gives classification
                                                                      into 1, 2, and 3).

DISPERSIONS AND INTER-CLASS DISTANCES OF 10 LARGEST CLUSTERS
CLASS    DISP   NEIGHBORS       1       2       3       4       5       6       7       8       9      10
    1  0.0191    2  4  3                                              For each cluster, the dispersion and the distances
    2  0.0204    1  4  3   0.0448                                     to the other clusters are calculated. In addition,
    3  0.0178    4  1  2   0.0623  0.0727                             the three nearest neighbours (in order of
    4  0.0128    3  1  2   0.0583  0.0678  0.0244                     increasing distance) are printed out.
                                                                      Dispersion: D = (1/N) * sum (xi-xc)**2
                                                                      (sum over i from 1 to N), where
                                                                      (xi-xc)**2 is the squared Euclidean
                                                                      distance between object i and the
                                                                      cluster center c.

.OPERATION: EN
COMPLETED 14-JUL-88 AT 16:42:41
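
The dispersion and inter-class distance definitions given in the comments can be tried out with a few lines of Python. This is only a sketch and not part of SPIDER: the function names dispersion and center_distance are hypothetical, and the class centers are copied from the LIST OF CLASS CENTER COORDINATES table. It assumes that the printed inter-class distances are plain Euclidean distances between class centers on factors 1-4. The coordinates of the individual objects are not listed in this results file, so the DISP column itself cannot be recomputed here.

```python
import math


def dispersion(points, center=None):
    """D = (1/N) * sum over i of the squared Euclidean distance |xi - xc|**2.

    If no center is supplied, the centroid of the points is used as xc.
    """
    n = len(points)
    if n == 0:
        raise ValueError("a cluster needs at least one object")
    dim = len(points[0])
    if center is None:
        center = [sum(p[k] for p in points) / n for k in range(dim)]
    total = sum(
        sum((p[k] - center[k]) ** 2 for k in range(dim)) for p in points
    )
    return total / n


def center_distance(a, b):
    """Euclidean distance between two class centers (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Class centers on factors 1-4, copied from the listing (four decimals only).
CENTERS = {
    1: (-0.0120, 0.0227, -0.0028, 0.0002),
    2: (-0.0285, -0.0188, 0.0003, 0.0003),
    3: (0.0424, -0.0067, -0.0096, -0.0037),
    4: (0.0366, -0.0050, 0.0131, 0.0033),
}

if __name__ == "__main__":
    # Print the lower triangle of the distance matrix, as in the listing.
    for a in sorted(CENTERS):
        row = [center_distance(CENTERS[a], CENTERS[b]) for b in range(1, a)]
        print(a, " ".join(f"{d:.4f}" for d in row))
```

Under that assumption, the distances computed from the four-decimal centers agree with the printed table to within rounding (for example, about 0.0448 between classes 1 and 2).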