Description
Attached Files:
- Week 4 Cluster Data.xlsx (8.977 KB)
Included with this assignment is an Excel spreadsheet that contains two-dimensional data (an X and a Y value for each data point).
The purpose of this assignment is to demonstrate steps performed in a K-Means Cluster analysis.
Review the "k-MEANS CLUSTERING ALGORITHM" section in Chapter 4 of the Sharda et al. textbook for additional background.
Use Excel to perform the following data analysis.
1. Plot the data on a scatter plot.
2. Determine the ideal number of clusters.
3. Choose random center points (centroids) for each cluster. (Note: Each student will select a different random set of centroids.)
4. Using a standard distance formula, measure the distance from each data point to each center point.
5. Assign each data point to an initial cluster region based on closeness.
6. For each cluster, calculate new center points.
7. Repeat steps 4 through 6.
You will use Excel to help with calculations, but only standard functions should be used (i.e., do not use a plug-in to perform the analysis for you). You need to show your work by doing this analysis the long way. If you were to keep repeating steps 4 through 6, what would likely happen to the cluster centroids? The rubric for this assignment can be viewed by clicking on the assignment link.
Here is a link to an example spreadsheet using a smaller data set. It contains two tabs. The first tab is the raw data. The second tab contains the analysis that was performed. Make sure that you use different starting center points from those in the example.
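The steps listed above can also be sketched outside Excel. The following is a minimal Python sketch, not the Excel workflow the assignment requires. It uses the 18 data points and the guessed first-round centers that appear in the answer below; the function names `nearest`, `update`, and `kmeans` are illustrative assumptions, not part of the assignment.

```python
import math

# The 18 (X, Y) data points from the spreadsheet.
POINTS = [
    (11.4, 22), (38.6, 34.1), (42.5, 0.2), (16.4, 27.8), (41.4, 29.2),
    (45.8, 9.2), (10.1, 42.7), (16, 17.1), (2.7, 3.3), (4.2, 37.2),
    (33.8, 33.9), (2.4, 9.5), (18.6, 48.9), (34.3, 46.1), (39.9, 26.9),
    (28, 32.5), (21.3, 25.2), (24.1, 39.3),
]
# Guessed starting centers (step 3); any other starting guess would also work.
START = [(6, 20), (22, 44), (38, 21)]


def nearest(point, centroids):
    """Steps 4 and 5: index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def update(points, labels, centroids):
    """Step 6: mean of each cluster; an empty cluster keeps its old center."""
    new = []
    for i, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if members:
            new.append((sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members)))
        else:
            new.append(old)
    return new


def kmeans(points, centroids, max_rounds=100):
    """Step 7: repeat steps 4 through 6 until the centers stop moving."""
    for rounds in range(1, max_rounds + 1):
        labels = [nearest(p, centroids) for p in points]
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            return centroids, labels, rounds
        centroids = new_centroids
    return centroids, labels, max_rounds


final_centroids, final_labels, rounds_used = kmeans(POINTS, START)
print(rounds_used, final_centroids)
```

In a sketch like this, the centroids typically move less with each round; once no point changes cluster, the averages no longer change and the centroids stay where they are.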
Explanation & Answer
Hello! I have attached the answer in a Word document for you.
:) Let me know if you have any further questions.
Data Point #    X       Y
1               11.4    22
2               38.6    34.1
3               42.5    0.2
4               16.4    27.8
5               41.4    29.2
6               45.8    9.2
7               10.1    42.7
8               16      17.1
9               2.7     3.3
10              4.2     37.2
11              33.8    33.9
12              2.4     9.5
13              18.6    48.9
14              34.3    46.1
15              39.9    26.9
16              28      32.5
17              21.3    25.2
18              24.1    39.3
[Figure: Scatter Plot of the data points. Horizontal axis: X (0 to 50). Vertical axis: Y (0 to 60).]
First round (Center Points are Guessed)

Centroids       X       Y
Center 1        6       20
Center 2        22      44
Center 3        38      21

Find Distances to Center Points
(In the spreadsheet, a highlight shows the nearest points to the centers.)

Data Point #    Center 1    Center 2    Center 3
1               5.76        24.42       26.62
2               35.52       19.33       13.11
3               41.52       48.36       21.28
4               13.00       17.14       22.65
5               36.58       24.40       8.88
6               41.24       42.16       14.14
7               23.07       11.97       35.35
8               10.41       27.56       22.34
9               17.02       45.04       39.49
10              17.29       19.05       37.48
11              31.08       15.53       13.57
12              11.10       39.68       37.41
13              31.53       5.96        33.98
14              38.50       12.48       25.37
15              34.60       24.76       6.20
16              25.30       12.97       15.24
17              16.16       18.81       17.22
18              26.46       5.15        22.98
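The first-round distances are straight-line (Euclidean) distances from each data point to each guessed center. As a minimal sketch of that calculation, the Python below recomputes the three distances for data point 1; the helper name `euclidean` is an assumption chosen for illustration, and in Excel the same arithmetic can be done with the standard SQRT function.

```python
import math


def euclidean(p, c):
    """Straight-line distance between point p and center c, both (x, y)."""
    return math.sqrt((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2)


centers = [(6, 20), (22, 44), (38, 21)]   # guessed first-round centers
point_1 = (11.4, 22)                      # data point 1

# Distance from point 1 to each center, rounded to two decimals.
distances = [round(euclidean(point_1, c), 2) for c in centers]
print(distances)  # [5.76, 24.42, 26.62]
```

The smallest of the three values identifies the nearest center, so data point 1 is closest to Center 1.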
[Figure: scatter plot of the data points together with the three centroids. Legend: Data Point #, Centroids. Horizontal axis: 0 to 50. Vertical axis: 0 to 60.]
After Round One (Assign Groups to Points)

Group 1     X        Y
            11.4     22
            16.4     27.8
            16       17.1
            2.7      3.3
            4.2      37.2
            2.4      9.5
            21.3     25.2
Total       74.4     142.1
Average     10.63    20.30

Group 2     X        Y
            16.4     27.8
            10.1     42.7
            33.8     33.9
            18.6     48.9
            34.3     46.1
            28       32.5
            21.3     25.2
            24.1     39.3
Total       186.6    296.4
Average     23.33    37.05

Group 3     X        Y
            38.6     34.1
            41.4     29.2
            45.8     9.2
            33.8     33.9
            39.9     26.9
            28       32.5
            21.3     25.2
Total       248.8    191
Average     35.54    27.29
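The grouping and averaging step can be sketched in Python as follows. This is a minimal illustration under one stated assumption: each point is assigned to exactly one group, the one whose center is strictly nearest. Under that rule every point appears in a single group, so the totals and averages it produces can differ from a table in which a point is listed under more than one group. The variable names are illustrative only.

```python
import math

points = [
    (11.4, 22), (38.6, 34.1), (42.5, 0.2), (16.4, 27.8), (41.4, 29.2),
    (45.8, 9.2), (10.1, 42.7), (16, 17.1), (2.7, 3.3), (4.2, 37.2),
    (33.8, 33.9), (2.4, 9.5), (18.6, 48.9), (34.3, 46.1), (39.9, 26.9),
    (28, 32.5), (21.3, 25.2), (24.1, 39.3),
]
centers = [(6, 20), (22, 44), (38, 21)]   # guessed first-round centers

# Assign every point to the single nearest center.
groups = {i: [] for i in range(len(centers))}
for p in points:
    closest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
    groups[closest].append(p)

# New center of each group = average X and average Y of its members.
# Assumes no group is empty, which holds for this data and these centers.
new_centers = []
for i, members in groups.items():
    total_x = sum(x for x, _ in members)
    total_y = sum(y for _, y in members)
    avg = (total_x / len(members), total_y / len(members))
    new_centers.append(avg)
    print(f"Group {i + 1}: n={len(members)}, "
          f"total=({total_x:.1f}, {total_y:.1f}), "
          f"average=({avg[0]:.2f}, {avg[1]:.2f})")
```

The averages become the center points for the next round of distance calculations.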
Find Distances to Center Points
New Center Point based on Averages
Center 1
Center 2
10.63
20.30
23.33
37....