Data Analysis (Cluster Analysis)

User Generated

xnaan256

Computer Science

Description

Attached Files:

Included with this assignment is an Excel spreadsheet that contains data with two dimensions (X and Y values).

The purpose of this assignment is to demonstrate steps performed in a K-Means Cluster analysis.

Review the "k-MEANS CLUSTERING ALGORITHM" section in Chapter 4 of the Sharda et al. textbook for additional background.

Use Excel to perform the following data analysis.

  1. Plot the data on a scatter plot.
  2. Determine the ideal number of clusters.
  3. Choose random center points (centroids) for each cluster. (Note: Each student will select a different random set of centroids.)
  4. Using a standard distance formula, measure the distance from each data point to each center point.
  5. Assign each data point to an initial cluster region based on closeness.
  6. For each cluster calculate new center points.
  7. Repeat steps 4 through 6.
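
Steps 4 and 6 each rest on one formula. As a reference sketch, assuming the "standard distance formula" means Euclidean distance (the assignment does not name it), the distance from data point (x_i, y_i) to a center (c_x, c_y), and the new center of a cluster S, could be written as below. The symbols are notation chosen for this sketch, not taken from the textbook.

```latex
d_i = \sqrt{(x_i - c_x)^2 + (y_i - c_y)^2}
\qquad
\left(c_x^{\text{new}},\; c_y^{\text{new}}\right)
  = \left( \frac{1}{|S|} \sum_{i \in S} x_i,\;\; \frac{1}{|S|} \sum_{i \in S} y_i \right)
```

In Excel terms, the first formula is a square root of summed squared differences and the second is a plain average of each column within the cluster.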

You will use Excel to help with the calculations, but only standard functions should be used (i.e., do not use a plug-in to perform the analysis for you). You need to show your work by doing this analysis the long way. If you were to keep repeating steps 4 through 6, what would likely happen to the cluster centroids? The rubric for this assignment can be viewed by clicking on the assignment link.
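
To make the question about repeated rounds concrete, here is a minimal Python sketch of the repeat-until-stable loop. It is an illustration only, since the assignment itself must be done with standard Excel functions. The function names and the toy data are hypothetical, and Euclidean distance is assumed. In runs like this one the centroids typically move less each round and then stop, because once no point changes cluster the averages cannot change either.

```python
import math

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def update(points, labels, centers):
    """Step 6: move each center to the mean of the points assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:  # empty cluster: keep the old center (a choice made for this sketch)
            new_centers.append(old)
            continue
        new_centers.append((sum(x for x, _ in members) / len(members),
                            sum(y for _, y in members) / len(members)))
    return new_centers

def k_means(points, centers, max_rounds=100):
    """Repeat steps 4-6 until no point changes cluster; returns centers, labels, updates done."""
    labels = None
    for round_no in range(1, max_rounds + 1):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:  # assignments are stable, so the centroids stop moving
            return centers, labels, round_no - 1
        labels = new_labels
        centers = update(points, labels, centers)
    return centers, labels, max_rounds

# Hypothetical toy data, not the assignment's spreadsheet.
toy_points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
final_centers, final_labels, rounds = k_means(toy_points, [(0, 0), (5, 5)])
print(final_centers, final_labels, rounds)
```

On the toy data the loop stops after a single update: the second pass reproduces the same assignments, so the centers stay where the first averages put them.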

Here is a link to an example spreadsheet using a smaller data set. It contains two tabs. The first tab is the raw data. The second tab contains the analysis that was performed. Make sure that you use different starting center points from the example.

Example Excel Analysis


Explanation & Answer

Hello! I have attached the answer in a Word document for you.
:) Let me know if you have any further questions.

Data Point #   X      Y
1              11.4   22
2              38.6   34.1
3              42.5   0.2
4              16.4   27.8
5              41.4   29.2
6              45.8   9.2
7              10.1   42.7
8              16     17.1
9              2.7    3.3
10             4.2    37.2
11             33.8   33.9
12             2.4    9.5
13             18.6   48.9
14             34.3   46.1
15             39.9   26.9
16             28     32.5
17             21.3   25.2
18             24.1   39.3

Scatter Plot (X on the horizontal axis, 0 to 50; y on the vertical axis, 0 to 60)

First round (Center Points are Guessed)
Find Distances to Center Points
Highlight shows nearest points to centers

Center 1: X = 6, Y = 20
Center 2: X = 22, Y = 44
Center 3: X = 38, Y = 21

Data Point #   Center 1   Center 2   Center 3
1              5.76       24.42      26.62
2              35.52      19.33      13.11
3              41.52      48.36      21.28
4              13.00      17.14      22.65
5              36.58      24.40      8.88
6              41.24      42.16      14.14
7              23.07      11.97      35.35
8              10.41      27.56      22.34
9              17.02      45.04      39.49
10             17.29      19.05      37.48
11             31.08      15.53      13.57
12             11.10      39.68      37.41
13             31.53      5.96       33.98
14             38.50      12.48      25.37
15             34.60      24.76      6.20
16             25.30      12.97      15.24
17             16.16      18.81      17.22
18             26.46      5.15       22.98
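
The distance columns can be cross-checked outside Excel. The following is a small Python sketch, assuming Euclidean distance and using the 18 data points and the three guessed centers from this answer. The variable and function names are illustrative only.

```python
import math

# Data points (X, Y) from the spreadsheet, in order 1..18.
points = [
    (11.4, 22.0), (38.6, 34.1), (42.5, 0.2), (16.4, 27.8), (41.4, 29.2),
    (45.8, 9.2), (10.1, 42.7), (16.0, 17.1), (2.7, 3.3), (4.2, 37.2),
    (33.8, 33.9), (2.4, 9.5), (18.6, 48.9), (34.3, 46.1), (39.9, 26.9),
    (28.0, 32.5), (21.3, 25.2), (24.1, 39.3),
]
# Guessed first-round centers from the answer.
centers = [(6.0, 20.0), (22.0, 44.0), (38.0, 21.0)]

def distance_table(points, centers):
    """One row per point, one Euclidean distance per center, rounded to 2 places."""
    return [[round(math.dist(p, c), 2) for c in centers] for p in points]

table = distance_table(points, centers)
for number, row in enumerate(table, start=1):
    print(number, row)
```

For data point 1, for example, the distance to Center 1 is the square root of (11.4 - 6)^2 + (22 - 20)^2 = 33.16, which rounds to 5.76.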

Centroids   X    Y
1           6    20
2           22   44
3           38   21

Scatter plot of the data points and the centroids (series: Data Point #, Centroids; horizontal axis 0 to 50, vertical axis 0 to 60)

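After Round One (Assign Groups to Points)

Group 1
Data Point #   X       Y
1              11.4    22
4              16.4    27.8
8              16      17.1
9              2.7     3.3
10             4.2     37.2
12             2.4     9.5
17             21.3    25.2
Total          74.4    142.1
Average        10.63   20.30

Group 2
Data Point #   X       Y
4              16.4    27.8
7              10.1    42.7
11             33.8    33.9
13             18.6    48.9
14             34.3    46.1
16             28      32.5
17             21.3    25.2
18             24.1    39.3
Total          186.6   296.4
Average        23.33   37.05

Group 3
Data Point #   X       Y
2              38.6    34.1
5              41.4    29.2
6              45.8    9.2
11             33.8    33.9
15             39.9    26.9
16             28      32.5
17             21.3    25.2
Total          248.8   191
Average        35.54   27.29

Note: points 4, 11, 16 and 17 are listed under more than one group, and point 3 (42.5, 0.2) is listed under none. K-means assigns every point to exactly one group, the one whose center is nearest, so the Group 2 and Group 3 totals and averages would differ under that rule. Group 1 already matches it.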
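
Steps 5 and 6 can also be checked in a few lines of Python. This is a sketch under stated assumptions, not the spreadsheet's own method: it uses Euclidean distance and puts each point in the single group whose center is nearest. Under that rule each point lands in exactly one group, so the memberships and averages it prints can differ from a spreadsheet grouping in which a point is listed under more than one center. Function and variable names are illustrative.

```python
import math

# Data points (X, Y) from the spreadsheet, in order 1..18.
points = [
    (11.4, 22.0), (38.6, 34.1), (42.5, 0.2), (16.4, 27.8), (41.4, 29.2),
    (45.8, 9.2), (10.1, 42.7), (16.0, 17.1), (2.7, 3.3), (4.2, 37.2),
    (33.8, 33.9), (2.4, 9.5), (18.6, 48.9), (34.3, 46.1), (39.9, 26.9),
    (28.0, 32.5), (21.3, 25.2), (24.1, 39.3),
]
# Guessed first-round centers from the answer.
centers = [(6.0, 20.0), (22.0, 44.0), (38.0, 21.0)]

def assign(points, centers):
    """Step 5: label each point with the index of its single nearest center."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]

def new_centers(points, labels, k):
    """Step 6: average the X and Y values of the points in each group."""
    result = []
    for group in range(k):
        members = [p for p, lab in zip(points, labels) if lab == group]
        # This sketch assumes every group keeps at least one member.
        result.append((sum(x for x, _ in members) / len(members),
                       sum(y for _, y in members) / len(members)))
    return result

labels = assign(points, centers)
updated = new_centers(points, labels, len(centers))
for group, (cx, cy) in enumerate(updated, start=1):
    numbers = [i + 1 for i, lab in enumerate(labels) if lab == group - 1]
    print(f"Group {group}: points {numbers}, new center ({cx:.2f}, {cy:.2f})")
```

With these starting centers, the sketch places points 1, 4, 8, 9, 10, 12 and 17 in Group 1, points 7, 13, 14, 16 and 18 in Group 2, and points 2, 3, 5, 6, 11 and 15 in Group 3.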

Find Distances to Center Points
New Center Point based on Averages
Center 1
Center 2
10.63
20.30
23.33
37....

