Asked: May 6th, 2020

Question Description

Project Description

For this project you will use Lending Club loan data. I have already cleaned up the data set for you. The ultimate goal of the project is to identify whether a given customer will default on their loan. You need to run several machine learning algorithms to perform this task. Your main challenge is to make the prediction with multiple models using SPSS Modeler and compare their performance. You are also welcome to pick your own data set, but please come to me for approval before you start on your project. Extra points will be given for choosing your own problem.

Some facts about the data set.

1. The data consist of 140 features for almost 40,000 individual loan records from the Lending Club database.

2. The target variable is the loan status. ‘Charged off’ denotes default and ‘Fully paid’ denotes no default.

3. Although I already cleaned up the data, some features (variables) are either too messy to work with or probably not required for building your model. So, use your judgment before assigning these features as inputs to the model.
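The target encoding and the feature screening described above can be sketched in Python. This is only an illustration under stated assumptions, not part of the assignment, which uses SPSS Modeler: the column names `loan_amnt` and `desc` in the usage note, the helper names, and the 50% missing-value cutoff are all hypothetical choices.

```python
from typing import Optional

# Mapping stated in the project description:
# 'Charged off' = default (1), 'Fully paid' = not default (0).
STATUS_TO_LABEL = {"charged off": 1, "fully paid": 0}


def encode_loan_status(status: str) -> Optional[int]:
    """Return 1 for default, 0 for not default, None for any other status."""
    return STATUS_TO_LABEL.get(status.strip().lower())


def missing_fraction(rows, feature):
    """Fraction of rows where the feature is absent, None, or an empty string."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(feature) in (None, ""))
    return missing / len(rows)


def screen_features(rows, features, max_missing=0.5):
    """Keep features whose missing fraction is at most max_missing.

    The 0.5 default is an assumed cutoff, chosen only for illustration.
    """
    return [f for f in features if missing_fraction(rows, f) <= max_missing]
```

For example, given three hypothetical records where `desc` is empty in two of them, `screen_features(rows, ["loan_amnt", "desc"])` keeps only `loan_amnt`. A missing-value cutoff is just one possible screen; judging whether a feature is relevant to default still requires looking at what it means.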

This is what I need to submit:

1. A report (Word/PDF file) that is 5 pages long (double-spaced, including tables and figures). The report should include:

- An introduction: problem description and definition

- Data description



- Discussion

Tip:

1. Run multiple models to find the one with the best performance. Note that this is a classification problem (supervised learning), so make sure you use the right models.

2. Go back and adjust the selection of input variables. Select or deselect some variables and see whether this gives you better performance.

3. You should try at least two to three classification techniques and report the best-performing one.

4. The performance evaluation and comparison should be discussed in full detail. You need to include the predictor importance result from the rule induction model and discuss it. Also, in the results part of your project, highlight the best accuracy you get and the corresponding model settings with which you achieved that accuracy.
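The model comparison asked for above can be illustrated with a small Python sketch. It assumes the predictions of each model have been exported as lists of ‘Charged off’ / ‘Fully paid’ labels; the function names and the model names in the usage note are hypothetical, and this is not how SPSS Modeler itself reports results.

```python
def confusion_counts(actual, predicted, positive="Charged off"):
    """Count true/false positives and negatives, treating default as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(actual, predicted, positive="Charged off"):
    """Return accuracy, precision, and recall for the positive (default) class."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def best_model(actual, predictions_by_model, metric="accuracy"):
    """Score every model and return the name of the best one plus all scores."""
    scores = {name: evaluate(actual, preds)
              for name, preds in predictions_by_model.items()}
    best = max(scores, key=lambda name: scores[name][metric])
    return best, scores
```

Calling `best_model(actual, {"tree": tree_preds, "logistic": logit_preds})` returns the winning name and a score table that can go straight into the report. Reporting recall on the ‘Charged off’ class alongside accuracy is worth considering, because a model that predicts ‘Fully paid’ for every loan can still reach a high accuracy when defaults are the minority.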

NOTE: The paper has to be good as I need to pass my class.
