business analytics and machine learning

Question 2: Machine Learning

a) Which of unsupervised or supervised machine learning is best suited to assessing causation? Explain your choice.

b) Your analytics team presents you with two sets of results that have improved the organization’s ability to predict customer defections. The first method uses deep learning and has a precision of 85%. The second method uses decision trees and has a precision of 70%. The previous approach had a precision of 40%.

i) Make a case for using the results of the deep learning method.

ii) Make a case for using the decision tree method.

In your answers, consider aspects of customer lifetime value and managerial decision making.

c) An analytics team used two different models to predict the likelihood of an outcome. The results from two different analysts are below:

Don’s Analysis

| | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | 220 | 100 |
| Predicted Negative | 30 | 650 |

Katie’s Analysis

| | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | 170 | 10 |
| Predicted Negative | 80 | 740 |


i) Use the Confusion Matrix and Index Calculation tables below to calculate the model performance measures.

Confusion Matrix

| | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | TP | FP |
| Predicted Negative | FN | TN |

Index Calculation

| Measure | Formula | Don Calculation | Katie Calculation |
|---|---|---|---|
| Accuracy (completed as an example) | (TP + TN) / (TP + TN + FP + FN) | (220 + 650) / (220 + 650 + 100 + 30) = 0.87 | (170 + 740) / (170 + 740 + 10 + 80) = 0.91 |
| Precision | TP / (TP + FP) | | |
| Error rate | (FP + FN) / (TP + TN + FP + FN) | | |
| Recall | TP / (TP + FN) | | |
| Specificity | TN / (TN + FP) | | |
| False positive rate | FP / (TN + FP) | | |
| F-score | 2 * (Precision * Recall) / (Precision + Recall) | | |
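
The formulas listed above can be applied mechanically once the four cell counts are known. The following is a minimal Python sketch, not part of the original assignment: the `ConfusionMatrix` class and its method names are hypothetical, and it assumes every denominator is nonzero (true for both Don's and Katie's counts).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cell counts laid out as in the question: rows predicted, columns actual."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def error_rate(self) -> float:
        return (self.fp + self.fn) / self.total

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def false_positive_rate(self) -> float:
        return self.fp / (self.tn + self.fp)

    def f_score(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * (p * r) / (p + r)

    def summary(self) -> dict:
        """Return every measure from the Index Calculation table by name."""
        return {
            "accuracy": self.accuracy(),
            "precision": self.precision(),
            "error rate": self.error_rate(),
            "recall": self.recall(),
            "specificity": self.specificity(),
            "false positive rate": self.false_positive_rate(),
            "F-score": self.f_score(),
        }


# Counts taken from Don's and Katie's tables in the question.
don = ConfusionMatrix(tp=220, fp=100, fn=30, tn=650)
katie = ConfusionMatrix(tp=170, fp=10, fn=80, tn=740)

for name, cm in (("Don", don), ("Katie", katie)):
    print(name)
    for measure, value in cm.summary().items():
        print(f"  {measure}: {value:.4f}")
```

The accuracy values it prints can be checked against the worked example in the table (0.87 for Don, 0.91 for Katie). Note that error rate is 1 minus accuracy, and false positive rate is 1 minus specificity, so those rows carry no extra information.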

ii) Describe a medical or business context where you would prefer to use Don’s model. Why do you prefer Don’s model?

iii) Describe a medical or business context where you would prefer to use Katie’s model. Why do you prefer Katie’s model?

Ian is an intern with the team who claims he made a breakthrough with a model that outperforms both Don’s and Katie’s. The confusion matrix for his model is below:

Ian’s Analysis

| | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | 249 | 2 |
| Predicted Negative | 1 | 748 |

iv) What could possibly have gone wrong that would result in his results being invalid? How could this be solved? (15 marks)

Question 3: Experiments

Jennifer was given the results of an experiment that was designed to determine if a 10% reduction in price on an online shopping portal would lead to an increase in purchases. Control and treatment groups were created. These groups are described below:

| | Control Group | Treatment Group |
|---|---|---|
| Number of males | 25 | 25 |
| Number of females | 25 | 25 |
| Average age | 47 years | 37 years |
| Average spend per visit in the month BEFORE the experiment | $25.00 | $25.00 |
| Average spend per visit in the month AFTER the experiment | $25.00 | $29.00 |
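
One common way to summarize a before/after table like the one above is a difference-in-differences estimate: the change in the treatment group minus the change in the control group. The sketch below is illustrative only and is not part of the original assignment. The function name is hypothetical, and the calculation uses only the group averages, so it does not adjust for the 10-year gap in average age between the groups.

```python
def difference_in_differences(
    control_before: float,
    control_after: float,
    treatment_before: float,
    treatment_after: float,
) -> float:
    """Change in the treatment group minus change in the control group."""
    control_change = control_after - control_before
    treatment_change = treatment_after - treatment_before
    return treatment_change - control_change


# Average spend per visit, taken from the table in the question.
did = difference_in_differences(
    control_before=25.00,
    control_after=25.00,
    treatment_before=25.00,
    treatment_after=29.00,
)
print(f"Difference-in-differences estimate: ${did:.2f} per visit")
```

Subtracting the control group's change removes anything that affected both groups equally over the month, such as seasonality. It cannot separate the price effect from differences between the groups themselves.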

a) Were the control and treatment groups effectively randomized? Why or why not?

b) What are the two most likely explanations for the treatment group showing a higher average spend than the control group?

c) What type of analysis could be used to remove one of the possible explanations for the difference in average spend?

d) Experiments are useful in helping determine if people have responded due to a stimulus or if they would have responded even without the stimulus. Design an experiment that could demonstrate what proportion of people have responded to a stimulus. These people could be customers or employees within a company. Examples could be an advertising campaign to customers, or a policy of flexible work hours for employees. Requirements:

i) How would you pick the treatment and control groups? Fill in the table below to indicate the number of people and 3 important characteristics that describe each group

| | Control Group | Treatment Group |
|---|---|---|
| People | | |
| Characteristic 1: | | |
| Characteristic 2: | | |
| Characteristic 3: | | |

ii) Predict the results and state the managerial conclusion you could make from this result. Use the table below to indicate the change in behavior you expect to observe.

| | Control Group | Treatment Group |
|---|---|---|
| Observed behavior before treatment: | | |
| Observed behavior after treatment: | | |

iii) State the managerial action you could take from the results of your experiment. Briefly describe a useful follow-up experiment that would further deepen understanding of why people behaved in the manner observed.


Explanation & Answer


Running Header: BUSINESS ANALYTICS AND MACHINE LEARNING

Business Analytics and Machine Learning

Question 2: Machine Learning
a) Which of unsupervised or supervised machine learning is best suited to assessing causation? Explain your choice.
An unsupervised learning technique is best suited to assessing causation. Unsupervised techniques rely on latent variables to assess causation, and they make it possible to learn larger and more complex models than supervised learning does. In supervised learning, one tries to find the connection between two sets of observations: the causal structure of a supervised technique assumes inputs at the start of the model and outputs at the end. The difficulty of the learning task increases exponentially with the number of steps between the two sets, which is why supervised learning cannot, in practice, learn models with deep hierarchies.
b) Your analytics team presents you with two sets of results that have improved the organization’s ability to predict customer defections. The first method uses deep learning and has a precision of 85%. The second method uses decision trees and has a precision of 70%. The previous approach had a precision of 40%.
i) Make a case for using the results of the deep learning method.
Deep learning methods perform best when the data is unstructured (audio, images, text, video). Given such a data set, I would use the deep learning method to obtain better results.
ii) Make a case for using the decision tree method.
Decision trees are the building blocks of random forest ensemble methods. Decision trees work best for binary classification. Random forests perform well in classification and prediction scenarios where the number of variables is greater than the number of observations (high-dimensional data sets). Therefore when dealing with binary data sets that are high dimensional in nature...

