University of Cumberlands Introduction to Data Mining Questions

Description

Answer the following questions. Please be sure to use APA (Author, YYYY) citations for any content brought into the assignment.

  1. For sparse data, discuss why considering only the presence of non-zero values might give a more accurate view of the objects than considering the actual magnitudes of values. When would such an approach not be desirable?
  2. Describe the change in the time complexity of K-means as the number of clusters to be found increases.
  3. Discuss the advantages and disadvantages of treating clustering as an optimization problem. Among other factors, consider efficiency, non-determinism, and whether an optimization-based approach captures all types of clusterings that are of interest.
  4. What is the time and space complexity of fuzzy c-means? Of SOM? How do these complexities compare to those of K-means?
  5. Explain the difference between likelihood and probability.
  6. Give an example of a set of clusters in which merging based on the closeness of clusters leads to a more natural set of clusters than merging based on the strength of connection (interconnectedness) of clusters.

No plagiarism.

Explanation & Answer

Running Head: COMPUTER SCIENCE

Computer Science
Student’s Name:
Institutional Affiliation:

Question 1

In sparse data, most entries are zero, and the large number of zeros makes the data hard to
interpret. The more zeros a data set contains, the more they can dominate a comparison between
objects and lead to misleading results. Considering only which attributes are non-zero focuses
the analysis on the few attributes each object actually has, and a graph of only those entries
is clearer and easier to read than one crowded with zeros (Tan et al., 2016). A typical example
is market basket data: by looking only at the items that appear in a transaction, one can study
the relationships among a small list of items instead of assessing the entire inventory list.
This approach is not desirable, however, when the magnitudes themselves carry information, for
instance in a clustering analysis where the exact values are needed to reveal the true number
of clusters in the data set.
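
The contrast between presence and magnitude can be illustrated with a small sketch. The baskets, item names, and quantities below are invented for illustration and are not taken from the assignment or from Tan et al.; the three measures (Jaccard, simple matching, Euclidean distance) are standard, but the helper function names are hypothetical.

```python
import math

# Hypothetical market-basket rows: one column per item, values are quantities.
ITEMS = ["milk", "bread", "eggs", "beer", "diapers", "salt", "rice", "tea"]
basket_a = [1, 1, 0, 0, 0, 0, 0, 0]    # milk, bread
basket_b = [12, 1, 0, 0, 0, 0, 0, 0]   # same items as A, far more milk
basket_c = [0, 0, 0, 1, 1, 0, 0, 0]    # beer, diapers (no item shared with A)


def jaccard(x, y):
    """Similarity based only on non-zero presence; 0-0 matches are ignored."""
    both = sum(1 for a, b in zip(x, y) if a != 0 and b != 0)
    either = sum(1 for a, b in zip(x, y) if a != 0 or b != 0)
    return both / either if either else 0.0


def simple_matching(x, y):
    """Similarity based on presence/absence; 0-0 matches count as agreement."""
    same = sum(1 for a, b in zip(x, y) if (a != 0) == (b != 0))
    return same / len(x)


def euclidean(x, y):
    """Distance based on the actual magnitudes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


print("Jaccard   A-B:", jaccard(basket_a, basket_b))          # 1.0
print("Jaccard   A-C:", jaccard(basket_a, basket_c))          # 0.0
print("Matching  A-C:", simple_matching(basket_a, basket_c))  # 0.5
print("Euclidean A-B:", euclidean(basket_a, basket_b))        # 11.0
print("Euclidean A-C:", euclidean(basket_a, basket_c))        # 2.0
```

In this toy data, Euclidean distance places A closer to C, which shares no items with it, than to B, which contains exactly the same items. Jaccard, which looks only at non-zero presence, ranks them the other way. The simple matching score of 0.5 for A and C shows how shared zeros can inflate similarity. If the quantity of milk were the point of the analysis, the magnitude-based measure would be the appropriate one.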
Question 2
The time complexity of K-means grows linearly with the number of clusters. In each iteration, there
will be distance calculation, distance ...
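
The per-iteration distance calculations can be counted directly. The following is a minimal sketch of Lloyd-style K-means in plain Python, not the answer's own method: the function name `kmeans_count`, the synthetic points, the choice of the first K points as initial centroids, and the fixed number of iterations are all assumptions made for illustration.

```python
def kmeans_count(points, k, iterations=5):
    """Run a fixed number of K-means iterations.

    Returns (centroids, number of point-to-centroid distance evaluations).
    """
    # Assumption: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    evaluations = 0
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: every point is compared with every centroid.
        for p in points:
            best, best_d = 0, float("inf")
            for j, c in enumerate(centroids):
                d = sum((a - b) ** 2 for a, b in zip(p, c))
                evaluations += 1
                if d < best_d:
                    best, best_d = j, d
            clusters[best].append(p)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, evaluations


# Synthetic 2-D data: 60 points, purely illustrative.
points = [(float(i % 7), float(i % 5)) for i in range(60)]
for k in (2, 4, 8):
    _, evals = kmeans_count(points, k)
    print(f"K={k}: {evals} distance evaluations")
# K=2: 600, K=4: 1200, K=8: 2400 (5 iterations x 60 points x K)
```

With the number of points and iterations held fixed, doubling K doubles the number of distance evaluations. In practice the number of iterations needed to converge may also change with K, which this sketch deliberately does not model.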

