## Description

Discuss the role of predictive analytics in your organization. How does it help? What are some of the issues and challenges your organization faces? Would people in your organization be hesitant to make decisions based on predictive analytics? If so, why? (Banking and finance is the organization.)

Part II

**Predicting Behavior with Logistic Regression**

Customer churn occurs when customers stop doing business with a company. Retaining existing customers is less expensive than acquiring new ones, so building a good predictive model for customer churn is important to many companies. Download the dataset *Telco.customer.csv*. With this dataset, we use logistic regression to predict churn so that customers can be retained.

Follow the steps below and create a PowerPoint presentation.

Using R, partition the dataset into training and testing sets by using the code:

(YOURDATA is the name of your dataset in R.)

*library(caret)*

*set.seed(2017)*

*intrain <- createDataPartition(YOURDATA$Churn, p=0.7, list=FALSE)*

*training <- YOURDATA[intrain,]*

*testing <- YOURDATA[-intrain,]*

Fit a logistic regression model by using the code:

*LogModel <- glm(Churn ~ ., family=binomial(link="logit"), data=training)*

*summary(LogModel)*

Examine the resulting fitted model. What are the significant factors that affect customer churn? Explain how and why they are significant.
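One way to read significance off a fitted model is sketched below. Because *Telco.customer.csv* is not reproduced here, the sketch fits the same kind of model to simulated stand-in data; the column names `tenure` and `Contract` and the effect sizes are assumptions made only for illustration, not results from the real dataset.

```r
# Simulated stand-in for the Telco data; column names and effect sizes are hypothetical.
set.seed(2017)
n <- 500
sim <- data.frame(
  tenure   = rpois(n, 30),
  Contract = factor(sample(c("Month-to-month", "Two year"), n, replace = TRUE))
)
# Assumed true log-odds of churn: lower with longer tenure and a two-year contract.
eta <- 1 - 0.05 * sim$tenure - 1.5 * (sim$Contract == "Two year")
sim$Churn <- factor(ifelse(runif(n) < plogis(eta), "Yes", "No"))

LogModel <- glm(Churn ~ ., family = binomial(link = "logit"), data = sim)

# Coefficient table: estimate, standard error, z value, p-value.
coefs <- coef(summary(LogModel))
print(coefs)

# Terms with p < 0.05 are conventionally treated as significant.
significant <- rownames(coefs)[coefs[, "Pr(>|z|)"] < 0.05]
print(significant)

# Exponentiated coefficients are odds ratios; a value below 1 lowers the odds of churn.
print(exp(coef(LogModel)))
```

On the real data, the same three steps apply to the `LogModel` fitted on `training`: the sign of each estimate gives the direction of the effect, and the p-value indicates whether the effect is distinguishable from zero.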

Now, let's examine how the model fits using the following code.

*testing$Churn <- as.character(testing$Churn)*

*testing$Churn[testing$Churn=="No"] <- "0"*

*testing$Churn[testing$Churn=="Yes"] <- "1"*

*fitted.results <- predict(LogModel,newdata=testing,type='response')*

*fitted.results <- ifelse(fitted.results > 0.5,1,0)*

*misClasificError <- mean(fitted.results != testing$Churn)*

*print(paste('Logistic Regression Accuracy',1-misClasificError))*

This provides the accuracy of the model.
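A confusion matrix shows where that accuracy figure comes from. The sketch below uses a small set of made-up labels (hypothetical, not output from the Telco model) coded the same way as above: predictions as numeric 0/1 and actual values as the characters "0"/"1". R coerces the numbers to characters for the comparison, which is why the `!=` test in the assignment code works.

```r
# Hypothetical labels for eight customers; not taken from the real dataset.
actual    <- c("0", "0", "1", "1", "0", "1", "0", "0")
predicted <- c(0, 0, 1, 0, 0, 1, 1, 0)

# Cross-tabulate predictions against the actual outcomes.
conf_matrix <- table(Predicted = predicted, Actual = actual)
print(conf_matrix)

# Accuracy is the share of cases on the diagonal of the table.
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)

# This equals 1 minus the misclassification rate used in the assignment code.
misclassification <- mean(predicted != actual)
print(c(accuracy = accuracy, one_minus_error = 1 - misclassification))
```

The off-diagonal cells separate the two kinds of error (churners predicted to stay, and loyal customers predicted to churn), which a single accuracy number hides.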

- How can you make a customer churn prediction from the model you fitted? Explain, and include the actual calculation.
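The calculation asked for here can be sketched in three steps: compute the log-odds from the fitted coefficients, convert the log-odds to a probability with the inverse logit, and apply the 0.5 cutoff. The example below is a minimal illustration on simulated data with a single hypothetical predictor, `tenure`; the real model has many more terms, but each one enters the sum in the same way.

```r
# Simulated stand-in data; the variable name and effect size are hypothetical.
set.seed(2017)
n <- 400
sim <- data.frame(tenure = rpois(n, 30))
sim$Churn <- factor(ifelse(runif(n) < plogis(1 - 0.05 * sim$tenure), "Yes", "No"))

LogModel <- glm(Churn ~ tenure, family = binomial(link = "logit"), data = sim)

# A new customer with 12 months of tenure (made-up example value).
new_customer <- data.frame(tenure = 12)

# Step 1: linear predictor (log-odds) from the fitted coefficients.
b <- coef(LogModel)
log_odds <- b[["(Intercept)"]] + b[["tenure"]] * new_customer$tenure

# Step 2: the inverse logit turns log-odds into a probability of churn.
p_churn <- 1 / (1 + exp(-log_odds))

# Step 3: apply the 0.5 cutoff used in the accuracy check.
prediction <- ifelse(p_churn > 0.5, "Yes", "No")

# The manual result should agree with predict().
p_check <- predict(LogModel, newdata = new_customer, type = "response")
print(c(manual = p_churn, predict = unname(p_check)))
print(prediction)
```

For a categorical predictor, the corresponding coefficient is added to the log-odds only when the customer falls in that category, since R codes factor levels as 0/1 dummy variables.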

## Explanation & Answer

I have attached the PowerPoint presentation and the full R code used.

Customer Churn – Logistic regression

First name Last name

School name

Date

Exploring the data

We have 21 variables:

• 18 variables are categorical (factors).

• 2 are numeric.

• The first variable only enumerates each customer, so it can be ignored.

Logistic regression

What do we want to model/predict?

• Customer churn

This is a categorical variable with only two (binary) outcomes: Yes or No.

What type of regression model is best?

• Logistic regression is used to model a binary categorical outcome (Lantz, 2019).
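For reference, the standard form of the logistic regression model is written out below. The symbols are generic notation rather than something taken from the slides: $p$ is the probability that Churn is Yes, $x_1, \dots, x_k$ are the predictors, and $\beta_0, \dots, \beta_k$ are the coefficients estimated by `glm`.

```latex
\log\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```

The left-hand form shows that the model is linear in the log-odds; the right-hand form shows that the predicted probability always lies between 0 and 1, which is what makes it suitable for a Yes/No outcome.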

Lo...