Cluster records using configurable distance metrics and linkage methods. Inspect merges, coefficients, and memberships, and download reports. Designed for clear statistical grouping, comparison, and decision support.
| Label | Revenue | Visits | Conversion |
|---|---|---|---|
| A | 120 | 20 | 4.8 |
| B | 118 | 18 | 5.1 |
| C | 300 | 47 | 6.3 |
| D | 305 | 50 | 6.1 |
| E | 82 | 12 | 3.2 |
| F | 85 | 10 | 3.0 |
| G | 220 | 32 | 5.7 |
| H | 225 | 34 | 5.5 |
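Euclidean distance: $d(i,j) = \sqrt{\sum_{k} (x_{ik} - x_{jk})^2}$
This measures straight-line separation across all variables.
Manhattan distance: $d(i,j) = \sum_{k} |x_{ik} - x_{jk}|$
This adds absolute variable differences.
Chebyshev distance: $d(i,j) = \max_{k} |x_{ik} - x_{jk}|$
This focuses on the largest variable gap.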
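The three distance formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code; the function names are chosen here for readability, and the two records are rows A and B from the sample table.

```python
import math

def euclidean(a, b):
    # Square root of the summed squared differences across variables.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute differences across variables.
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    # Largest absolute difference on any single variable.
    return max(abs(x - y) for x, y in zip(a, b))

# Rows A and B from the sample table: (Revenue, Visits, Conversion)
row_a = (120, 20, 4.8)
row_b = (118, 18, 5.1)

print(round(euclidean(row_a, row_b), 3))
print(round(manhattan(row_a, row_b), 3))
print(round(chebyshev(row_a, row_b), 3))
```

For this pair the Chebyshev value is simply the Revenue (or Visits) gap of 2, while the Manhattan value also counts the small Conversion gap.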
Single linkage: minimum distance between members of two clusters.
Complete linkage: maximum distance between members of two clusters.
Average linkage: mean distance across all member pairs.
Ward linkage: merge the pair with the smallest increase in within-cluster variance.
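As a rough sketch of the first three linkage rules, the snippet below computes the distance between two small clusters from the sample table. It assumes Euclidean distance and uses hypothetical helper names; the calculator's internal implementation is not shown on this page and may differ.

```python
import math
from statistics import mean

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise(cluster_a, cluster_b):
    # Every member-to-member distance between the two clusters.
    return [euclidean(p, q) for p in cluster_a for q in cluster_b]

def single_linkage(cluster_a, cluster_b):
    return min(pairwise(cluster_a, cluster_b))   # closest pair

def complete_linkage(cluster_a, cluster_b):
    return max(pairwise(cluster_a, cluster_b))   # farthest pair

def average_linkage(cluster_a, cluster_b):
    return mean(pairwise(cluster_a, cluster_b))  # mean over all pairs

# Clusters {A, B} and {E, F} from the sample table.
ab = [(120, 20, 4.8), (118, 18, 5.1)]
ef = [(82, 12, 3.2), (85, 10, 3.0)]

single = single_linkage(ab, ef)
complete = complete_linkage(ab, ef)
average = average_linkage(ab, ef)
print(round(single, 2), round(average, 2), round(complete, 2))
```

The average value always lies between the single and complete values, since it is a mean of the same set of pairwise distances.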
z = (x - mean) / standard deviation
This helps when variables use very different scales.
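A minimal sketch of the z-score step, applied to the Revenue column of the sample table. The formula above does not say whether the sample or population standard deviation is used; this example assumes the population version (`statistics.pstdev`), so the calculator's values may differ slightly.

```python
from statistics import mean, pstdev

def standardize(values):
    # z = (x - mean) / standard deviation, using the population
    # standard deviation (an assumption made for this sketch).
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - m) / s for v in values]

revenue = [120, 118, 300, 305, 82, 85, 220, 225]
z_revenue = standardize(revenue)
print([round(z, 2) for z in z_revenue])
```

After standardization the column has mean 0 and standard deviation 1, so it contributes to distances on the same footing as the other variables.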
Paste your data into the textarea. Put the label in the first column. Place numeric variables in the remaining columns.
Select a linkage method. Choose a distance metric. Enter the number of clusters you want to inspect.
Enable standardization if your variables use different units. This prevents large-scale variables from dominating the solution.
Press Analyze Clusters. The results appear below the header and above the form.
Review the cluster memberships, centroids, agglomeration schedule, and distance matrix. Export the merge history, cluster memberships, or the full result area when needed.
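The centroids mentioned in the last step are per-cluster means of each variable. The sketch below shows one way to compute them from a membership table; the memberships are illustrative rather than actual calculator output, and the function name is hypothetical.

```python
from statistics import mean

def centroids(points, memberships):
    # Group records by cluster id, then average each variable column.
    groups = {}
    for label, cluster_id in memberships.items():
        groups.setdefault(cluster_id, []).append(points[label])
    return {
        cluster_id: tuple(mean(column) for column in zip(*rows))
        for cluster_id, rows in groups.items()
    }

# Four rows from the sample table: (Revenue, Visits, Conversion)
points = {
    "A": (120, 20, 4.8),
    "B": (118, 18, 5.1),
    "E": (82, 12, 3.2),
    "F": (85, 10, 3.0),
}
# Illustrative memberships, assumed for this example.
memberships = {"A": 1, "B": 1, "E": 2, "F": 2}

result = centroids(points, memberships)
for cluster_id, center in result.items():
    print(cluster_id, [round(v, 2) for v in center])
```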
Hierarchical cluster analysis groups observations by similarity. It builds clusters step by step. Each merge forms a larger group. This helps analysts spot natural structure in a dataset.
The calculator is useful in statistics, market research, biology, quality control, and social science. It works well when you want to compare several records across multiple numeric variables. You can test different distance metrics and linkage rules on the same data.
Distance metrics shape how similarity is measured. Euclidean distance emphasizes straight-line separation. Manhattan distance adds absolute differences across variables. Chebyshev distance focuses on the largest gap between two records. Standardization is helpful when variables use different scales.
Linkage controls how clusters are merged. Single linkage uses the smallest distance between cluster members. Complete linkage uses the largest distance. Average linkage uses the mean of all pairwise distances. Ward linkage merges the clusters that create the smallest increase in within-cluster variance.
The agglomeration schedule shows every merge step. Small merge distances suggest close similarity. Larger jumps often signal stronger separation between groups. You can review the merge history and then choose a practical number of clusters for interpretation.
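To make the idea of an agglomeration schedule concrete, here is a naive sketch that merges the sample records step by step and records each merge. It assumes unstandardized data and Euclidean distance, recomputes every linkage from scratch (fine for a handful of rows, slow for large data), and is not the calculator's own algorithm.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, linkage=min):
    """Return merges as (step, left_members, right_members, distance).

    `linkage` reduces the member-pair distances to one number:
    min gives single linkage, max gives complete linkage.
    """
    clusters = [[label] for label in points]  # start from singletons
    schedule = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = linkage([euclidean(points[p], points[q])
                         for p in clusters[i] for q in clusters[j]])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        schedule.append((len(schedule) + 1, clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return schedule

# Sample table: (Revenue, Visits, Conversion)
points = {
    "A": (120, 20, 4.8), "B": (118, 18, 5.1),
    "C": (300, 47, 6.3), "D": (305, 50, 6.1),
    "E": (82, 12, 3.2),  "F": (85, 10, 3.0),
    "G": (220, 32, 5.7), "H": (225, 34, 5.5),
}

schedule = agglomerate(points, linkage=min)
for step, left, right, distance in schedule:
    print(step, left, right, round(distance, 2))
```

With eight records there are seven merges. On this data the first four merges pair up near-identical rows at small distances, and the later merges happen at much larger distances, which is the kind of jump the schedule is meant to reveal.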
Cluster membership output makes results easy to apply. You can see which labels belong together at the requested cluster count. This is useful for segmentation, anomaly review, and exploratory data analysis. The distance matrix also helps you verify how individual observations relate before clustering decisions.
This calculator supports pasted CSV-style data. Add a label in the first column. Put numeric variables in the remaining columns. Trying more than one metric and linkage rule is good practice, because clusters that stay stable across settings often indicate stronger patterns. Pair the output with subject knowledge and sensible validation checks.
Hierarchical clustering groups similar observations into clusters by merging them step by step. The method creates a nested structure that helps you study patterns, proximity, and possible segment boundaries in numeric data.
Standardize when variables use different units or scales. Without scaling, a large-range variable can dominate the distance calculation and distort the final cluster structure.
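A quick way to see the dominance effect is to check how much each variable contributes to the squared Euclidean distance between two unscaled records. The sketch below uses rows A and C from the sample table; the variable names are chosen for this example only.

```python
# Rows A and C from the sample table: (Revenue, Visits, Conversion)
row_a = (120, 20, 4.8)
row_c = (300, 47, 6.3)
names = ("Revenue", "Visits", "Conversion")

# Squared difference per variable, then each variable's share of the total.
squared = [(x - y) ** 2 for x, y in zip(row_a, row_c)]
total = sum(squared)
shares = [s / total for s in squared]

for name, share in zip(names, shares):
    print(f"{name}: {share:.1%}")
```

Without standardization, Revenue accounts for nearly all of the squared distance here, so Visits and Conversion barely influence which records look similar.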
Single linkage uses the closest pair across clusters. Complete linkage uses the farthest pair. Single linkage can create chains, while complete linkage usually forms tighter groups.
Each metric measures similarity differently. Euclidean emphasizes overall geometric distance, Manhattan adds absolute gaps, and Chebyshev highlights the largest variable difference. Changing the metric can therefore change the merge order and the final memberships.
Ward linkage merges the clusters that add the smallest amount of within-cluster variance. It often produces balanced and compact groups. This calculator uses Euclidean distance internally for Ward merges.
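As a sketch of the Ward criterion, the code below measures the cost of a merge as the increase in the within-cluster sum of squared Euclidean deviations, which is the usual way the criterion is computed. The calculator's exact formula is not shown on this page, so treat the function names and the scaling as assumptions. The shortcut form uses only cluster sizes and centroids and gives the same value.

```python
def centroid(cluster):
    return [sum(column) / len(cluster) for column in zip(*cluster)]

def sse(cluster):
    # Sum of squared Euclidean deviations from the cluster centroid.
    center = centroid(cluster)
    return sum((x - m) ** 2 for row in cluster for x, m in zip(row, center))

def ward_increase(cluster_a, cluster_b):
    # Direct definition: SSE after the merge minus SSE before it.
    return sse(cluster_a + cluster_b) - sse(cluster_a) - sse(cluster_b)

def ward_increase_shortcut(cluster_a, cluster_b):
    # Equivalent form: n_a * n_b / (n_a + n_b) * squared centroid gap.
    na, nb = len(cluster_a), len(cluster_b)
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * gap

# Clusters {A, B} and {E, F} from the sample table.
ab = [(120, 20, 4.8), (118, 18, 5.1)]
ef = [(82, 12, 3.2), (85, 10, 3.0)]

print(round(ward_increase(ab, ef), 3))
print(round(ward_increase_shortcut(ab, ef), 3))
```

At each step, Ward linkage would evaluate this quantity for every pair of current clusters and merge the pair with the smallest value.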
There is no single universally correct number of clusters. Review the merge distances and look for larger jumps between steps. A sharp increase often suggests a useful stopping point for interpretation.
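The jump heuristic can be sketched as follows: find the largest gap between consecutive merge distances and stop just before the merge that causes it. The distances below are illustrative values, not calculator output, and the function name is hypothetical. The result is a starting point for interpretation rather than a definitive answer.

```python
def suggest_cluster_count(merge_distances):
    # n records produce n - 1 merges; after k merges, n - k clusters remain.
    n_records = len(merge_distances) + 1
    jumps = [later - earlier
             for earlier, later in zip(merge_distances, merge_distances[1:])]
    biggest = max(range(len(jumps)), key=jumps.__getitem__)
    merges_kept = biggest + 1  # keep every merge before the largest jump
    return n_records - merges_kept

# Illustrative merge distances for 8 records (7 merges), in merge order.
distances = [1.0, 1.2, 1.5, 1.9, 6.0, 6.4, 9.0]
suggested = suggest_cluster_count(distances)
print(suggested)
```

Here the largest jump is between the fourth and fifth merges, so the sketch keeps the first four merges and suggests four clusters.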
You can paste CSV-style rows into the textarea. Use one label column first, then numeric columns. A header row is supported, and all data rows must have equal column counts.
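The input rules above (label first, numeric columns after, equal column counts) can be sketched with Python's standard `csv` module. How the calculator detects a header row is not documented here, so this example takes an explicit `has_header` flag as an assumption; `parse_rows` is a hypothetical name.

```python
import csv
import io

def parse_rows(text, has_header=True):
    # Read non-empty CSV rows from the pasted text.
    rows = [row for row in csv.reader(io.StringIO(text.strip())) if row]
    header = rows.pop(0) if has_header else None
    if not rows:
        raise ValueError("no data rows found")
    width = len(rows[0])
    labels, values = [], []
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(f"data row {number} has {len(row)} columns, expected {width}")
        labels.append(row[0].strip())                 # first column is the label
        values.append([float(cell) for cell in row[1:]])  # the rest must be numeric
    return header, labels, values

sample = """Label,Revenue,Visits,Conversion
A,120,20,4.8
B,118,18,5.1
"""

header, labels, values = parse_rows(sample)
print(header)
print(labels)
print(values)
```

A non-numeric cell raises `ValueError` from `float`, and a row with a different column count raises the explicit error above.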
You can export the merge history, cluster memberships, and centroid table as CSV files. You can also generate a printable PDF version of the full result section.