Identifying Patterns and Relationships

1171428301_
Asked: Dec 3rd, 2018

Question Description

The purpose of this assignment is to perform k-Nearest Neighbor classification, interpret the results, and analyze whether or not the information generated can be used to address a specific business problem.

For this assignment, you will use the "Adult Incomes" data set from the Topic Materials.

ABC Survey Company collects data via surveys that it then sells to marketing departments. Marketing departments typically do not like missing data. Since survey takers typically do not like to answer questions regarding their salary, the one question usually missing from the survey results is, "Is your annual salary $50,000 or more?"

You are the analyst who has been tasked with finding a way to impute (i.e., fill in) the answer to the question, "Is your annual salary $50,000 or more?" The answer can best be imputed from how individuals respond to other survey questions about their marital status, educational level, occupation, and familial relationship status. If this important question can be accurately imputed, then the worth of the survey data provided by ABC Survey Company increases dramatically.

Question 1: Using only "Marital_Status," "Education," "Occupation," and "Relationship" variables, find the number of neighbors (k) that minimizes the error rate. Use a range of k between 3 and 10. Include the "k Selection Error Log" output when submitting the answer.
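The k search described in Question 1 can be sketched outside of IBM SPSS Modeler. The Python below is a minimal illustration only: the rows are made-up toy records (not the "Adult Incomes" data), the distance is a simple count of mismatched categories, and the error rate is estimated with leave-one-out validation, all of which are assumptions. Modeler's own procedure and its "k Selection Error Log" may be computed differently.

```python
from collections import Counter

# Hypothetical toy rows, NOT taken from the "Adult Incomes" data set.
# Each row: ((Marital_Status, Education, Occupation, Relationship), income)
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Sales", "Husband"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Married-civ-spouse", "Masters", "Prof-specialty", "Husband"), ">50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "Some-college", "Sales", "Not-in-family"), "<=50K"),
    (("Never-married", "Masters", "Prof-specialty", "Not-in-family"), ">50K"),
    (("Divorced", "HS-grad", "Adm-clerical", "Unmarried"), "<=50K"),
    (("Divorced", "Bachelors", "Exec-managerial", "Not-in-family"), ">50K"),
    (("Divorced", "Some-college", "Craft-repair", "Not-in-family"), "<=50K"),
]


def hamming(a, b):
    """Count the positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k):
    """Majority vote among the k training rows closest to the query."""
    nearest = sorted(train, key=lambda row: hamming(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_error(rows, k):
    """Leave-one-out error rate: predict each row from all the others."""
    wrong = 0
    for i, (features, label) in enumerate(rows):
        rest = rows[:i] + rows[i + 1:]
        if knn_predict(rest, features, k) != label:
            wrong += 1
    return wrong / len(rows)


def select_k(rows, k_values=range(3, 11)):
    """Return the k with the lowest error and the full error log."""
    errors = {k: loo_error(rows, k) for k in k_values}
    best_k = min(errors, key=errors.get)  # ties resolve to the smallest k
    return best_k, errors


best_k, errors = select_k(ROWS)
for k, err in errors.items():
    print(f"k={k}: error rate {err:.3f}")
print("selected k:", best_k)
```

The printed log plays the role of an error-by-k table; on real data the same loop would run over the full training partition rather than twelve toy rows.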

Question 2: Using the same variables and the k selected in Question 1, rerun the nearest neighbor model using the feature selection option in the IBM SPSS Modeler. What is the set of variables that minimize the error rate? Include the "Predictor Selection Error Log" output when submitting the answer.
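As a rough illustration of what a feature selection pass does, the sketch below runs a greedy forward selection in Python: starting from no predictors, it repeatedly adds the variable that lowers the error rate most and stops when no addition helps. The toy rows, the mismatch-count distance, the leave-one-out error estimate, and the placeholder K = 3 are all assumptions for illustration. This is not IBM SPSS Modeler's implementation, and its "Predictor Selection Error Log" may follow different rules.

```python
from collections import Counter

FEATURES = ["Marital_Status", "Education", "Occupation", "Relationship"]
K = 3  # placeholder; use the k chosen in Question 1

# Hypothetical toy rows, NOT taken from the "Adult Incomes" data set.
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Prof-specialty", "Wife"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Sales", "Husband"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "Masters", "Prof-specialty", "Not-in-family"), ">50K"),
    (("Divorced", "HS-grad", "Adm-clerical", "Unmarried"), "<=50K"),
    (("Divorced", "Bachelors", "Exec-managerial", "Not-in-family"), ">50K"),
    (("Divorced", "Some-college", "Craft-repair", "Not-in-family"), "<=50K"),
]


def distance(a, b, cols):
    """Count mismatches between two records over the given column indexes."""
    return sum(a[c] != b[c] for c in cols)


def loo_error(rows, k, cols):
    """Leave-one-out error rate of k-NN restricted to the given columns."""
    wrong = 0
    for i, (x, y) in enumerate(rows):
        rest = rows[:i] + rows[i + 1:]
        nearest = sorted(rest, key=lambda r: distance(r[0], x, cols))[:k]
        guess = Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
        if guess != y:
            wrong += 1
    return wrong / len(rows)


def forward_select(rows, k):
    """Greedily add the column that most reduces the error; stop when none does."""
    chosen, remaining = [], list(range(len(FEATURES)))
    best_err, log = None, []
    while remaining:
        err, col = min((loo_error(rows, k, chosen + [c]), c) for c in remaining)
        if best_err is not None and err >= best_err:
            break  # no remaining column improves the error rate
        chosen.append(col)
        remaining.remove(col)
        best_err = err
        log.append(([FEATURES[c] for c in chosen], err))
    return [FEATURES[c] for c in chosen], log


selected, log = forward_select(ROWS, K)
for names, err in log:
    print(f"{', '.join(names)}: error rate {err:.3f}")
print("selected predictors:", selected)
```

Each printed line corresponds to one step of the selection, so the log shows how the error rate changes as predictors are added.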

Question 3: Using the value of k and the set of variables that minimizes the error rate, rerun the k-Nearest Neighbor model. What is the classification table? Include the pivot table output when submitting the answer.
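A classification table cross-tabulates actual against predicted classes. The short Python sketch below shows how such a table and the overall accuracy are computed; the two label lists are hypothetical placeholders, not output from the assignment's model, and the layout of Modeler's pivot table may differ.

```python
from collections import Counter

LABELS = ["<=50K", ">50K"]


def classification_table(actual, predicted, labels=LABELS):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(table):
    """Share of cases on the diagonal (correctly classified)."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total


# Hypothetical labels for six cases, for illustration only.
actual = ["<=50K", "<=50K", ">50K", ">50K", "<=50K", ">50K"]
predicted = ["<=50K", ">50K", ">50K", "<=50K", "<=50K", ">50K"]

table = classification_table(actual, predicted)
print("actual \\ predicted", LABELS)
for label, row in zip(LABELS, table):
    print(label, row)
print("accuracy:", round(accuracy(table), 3))
```

The off-diagonal cells are the two kinds of misclassification, which matter separately when judging whether imputed salary answers are good enough to sell.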

Question 4: Consider the following individual: Marital_Status=Never-married, Education=Masters, Occupation=Sales, and Relationship=Not-in-family. Based on the k-Nearest Neighbor model from Question 3, how would this individual be classified? Provide the predicted income level (">50K" or "<=50K") and explain the process that you used to determine the income level. Include the table illustrating the data when submitting the answer.
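The scoring step in Question 4 amounts to finding the k records closest to the new individual and taking a majority vote. The Python sketch below shows that process under stated assumptions: the training rows are made-up toy records (not the "Adult Incomes" data), the distance is a count of mismatched categories, and K = 3 is a placeholder. Because the data are invented, the label it prints says nothing about the answer on the real data set.

```python
from collections import Counter

K = 3  # placeholder; use the k selected in Question 1

# Hypothetical toy rows, NOT taken from the "Adult Incomes" data set.
# Each row: ((Marital_Status, Education, Occupation, Relationship), income)
ROWS = [
    (("Married-civ-spouse", "Masters", "Exec-managerial", "Husband"), ">50K"),
    (("Married-civ-spouse", "Bachelors", "Sales", "Husband"), ">50K"),
    (("Married-civ-spouse", "HS-grad", "Craft-repair", "Husband"), "<=50K"),
    (("Never-married", "HS-grad", "Sales", "Own-child"), "<=50K"),
    (("Never-married", "Bachelors", "Adm-clerical", "Not-in-family"), "<=50K"),
    (("Never-married", "Some-college", "Sales", "Not-in-family"), "<=50K"),
    (("Never-married", "Masters", "Prof-specialty", "Not-in-family"), ">50K"),
    (("Divorced", "Bachelors", "Exec-managerial", "Not-in-family"), ">50K"),
]

# The individual described in Question 4.
QUERY = ("Never-married", "Masters", "Sales", "Not-in-family")


def hamming(a, b):
    """Count the positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbors(rows, query, k):
    """Return (distance, features, label) for the k closest rows."""
    ranked = sorted(rows, key=lambda row: hamming(row[0], query))
    return [(hamming(f, query), f, label) for f, label in ranked[:k]]


def majority_label(neighbors):
    """Most common label among the neighbors (first seen wins a tie)."""
    votes = Counter(label for _, _, label in neighbors)
    return votes.most_common(1)[0][0]


neighbors = nearest_neighbors(ROWS, QUERY, K)
predicted = majority_label(neighbors)

for dist, features, label in neighbors:
    print(dist, features, label)
print("predicted income level:", predicted)
```

Listing the neighbors with their distances and labels is one way to build the supporting table the question asks for, since it shows exactly which records drove the vote.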

Question 5: Describe the model building process you used to determine whether or not a particular survey taker earned an annual salary of $50,000 or more. Include discussion of the accuracy of the k-Nearest Neighbor model and how it can be used in practice to impute the answer to the question, "Is your annual salary $50,000 or more?"

General Requirements:

Submit the answers to Questions 1-4 and the executive summary as Word documents.

APA format is not required, but solid academic writing is expected.

Unformatted Attachment Preview

ID  Age  Age_Category
1   34   25-34
2   36   35-44
3   38   35-44
4   49   45-54
5   47   45-54
6   54   45-54
7   29   25-34
8   41   35-44
9   39   35-44
10  29   25-34
11  22
12  44
13  31
14  61
15  35
16  54
17  29
18  22
19  29
20  50
21  32
22  70
23  26
24  32
25  55
26  38
27  25
28  51
29  17
30  83
31  52
32  37
33  35
34  43
35  25
36  19
37  44
38  21
39  30
40  22
41  28
42  54
43  36
44  50
45  46
46  34
This question has not been answered.
