a) Which is better suited to assessing causation: unsupervised or supervised machine learning? Explain your choice.
b) Your analytics team presents you with two sets of results that have improved the organization’s ability to predict customer defections. The first method uses deep learning and has a precision of 85%. The second method uses decision trees and has a precision of 70%. The previous approach had a precision of 40%.
i) Make a case for using the results of the deep learning method.
ii) Make a case for using the decision tree method.
In your answers, consider aspects of customer lifetime value and managerial decision making.
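One way to connect precision to customer lifetime value is to estimate the net value of a retention campaign aimed at every customer a model flags. The following is only a minimal sketch. The precision figures come from the question, but the number of flagged customers, the lifetime value, the save rate, and the offer cost are all hypothetical values chosen for illustration.

```python
def campaign_value(flagged, precision, clv, save_rate, offer_cost):
    """Expected net value of sending a retention offer to every flagged customer.

    All inputs except precision are hypothetical illustration values.
    """
    true_defectors = flagged * precision          # flagged customers who really would defect
    value_saved = true_defectors * save_rate * clv  # lifetime value retained by the offer
    total_cost = flagged * offer_cost             # the offer is paid for every flagged customer
    return value_saved - total_cost


# Precision figures from the question; everything else is assumed.
methods = {"deep learning": 0.85, "decision tree": 0.70, "previous approach": 0.40}
for name, precision in methods.items():
    value = campaign_value(flagged=1000, precision=precision,
                           clv=500, save_rate=0.3, offer_cost=50)
    print(f"{name:18s} precision={precision:.2f} net value={value:,.0f}")
```

The sketch prices only the precision side of the argument. It assumes both models flag the same number of customers and puts no value on how easily a manager can inspect and explain a decision tree's rules.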
c) An analytics team used two different models to predict the likelihood of an outcome. The results from the two analysts, Don and Katie, are below:
i) Use the Confusion Matrix and Index Calculation tables below to calculate the model performance measures.
Accuracy (completed as an example): (TP + TN) / (TP + TN + FP + FN)
(220 + 650) / (220 + 650 + 100 + 30)
(170 + 740) / (170 + 740 + 10 + 80)
Precision: TP / (TP + FP)
Error rate: (FP + FN) / (TP + TN + FP + FN)
Recall: TP / (TP + FN)
Specificity: TN / (TN + FP)
False positive rate: FP / (TN + FP)
F1 score: 2 * (Precision * Recall) / (Precision + Recall)
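The formulas in the index calculation table can be applied mechanically once the four confusion matrix counts are known. The sketch below is one way to do this. It assumes the counts can be read off the worked accuracy example (TP = 220, TN = 650, FP = 100, FN = 30 for one model and TP = 170, TN = 740, FP = 10, FN = 80 for the other). The function name and the labels `first` and `second` are illustrative, and the sketch does not say which analyst each set of counts belongs to.

```python
def performance_measures(tp, tn, fp, fn):
    """Return the measures from the index calculation table for one confusion matrix."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "error_rate": (fp + fn) / total,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (tn + fp),
        "f1": 2 * (precision * recall) / (precision + recall),
    }


# Counts taken from the two worked accuracy lines; attribution to an analyst is not assumed.
first = performance_measures(tp=220, tn=650, fp=100, fn=30)
second = performance_measures(tp=170, tn=740, fp=10, fn=80)

for name in first:
    print(f"{name:20s} {first[name]:.3f}  {second[name]:.3f}")
```

Note that specificity and the false positive rate always sum to 1, which gives a quick check on hand calculations.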
ii) Describe a medical or business context where you would prefer to use Don’s model. Why do you prefer Don’s model?
iii) Describe a medical or business context where you would prefer to use Katie’s model. Why do you prefer Katie’s model?
Ian is an intern with the team who claims he made a breakthrough with a model that outperforms both Don’s and Katie’s. The confusion matrix for his model is below:
iv) What could have gone wrong to make his results invalid? How could this be fixed? (15 marks)
Question 3: Experiments
Jennifer was given the results of an experiment designed to determine whether a 10% price reduction on an online shopping portal would lead to an increase in purchases. A control group and a treatment group were created. These groups are described below:
Number of males
Number of females
Average spend per visit in the month BEFORE the experiment
Average spend per visit in the month AFTER the experiment
a) Were the control and treatment groups effectively randomized? Why or why not?
b) What are the two most likely explanations for the treatment group showing a higher average spend than the control group?
c) What type of analysis could be used to remove one of the possible explanations for the difference in average spend?
d) Experiments are useful in helping determine if people have responded due to a stimulus or if they would have responded even without the stimulus. Design an experiment that could demonstrate what proportion of people have responded to a stimulus. These people could be customers or employees within a company. Examples could be an advertising campaign to customers, or a policy of flexible work hours for employees. Requirements:
i) How would you pick the treatment and control groups? Fill in the table below to indicate the number of people and three important characteristics that describe each group.
ii) Predict the results and state the managerial conclusion you could make from this result. Use the table below to indicate the change in behavior you expect to observe.
Observed behavior before treatment:
Observed behavior after treatment:
iii) State the managerial action you could take from the results of your experiment. Briefly describe a useful follow-up experiment that would further deepen understanding of why people behaved in the manner observed.
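The mechanics of such an experiment can be sketched in a few lines: assign people to groups at random, then compare the before-to-after change in the treatment group with the same change in the control group. This is a minimal illustration and not a prescribed answer. The function names, the 50/50 split, the fixed seed, and the sample figures are all assumptions made for the example.

```python
import random
from statistics import mean


def assign_groups(ids, seed=0):
    """Randomly split participant ids into (treatment, control) halves."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def net_effect(treat_before, treat_after, ctrl_before, ctrl_after):
    """Change in the treatment group minus change in the control group."""
    treat_change = mean(treat_after) - mean(treat_before)
    ctrl_change = mean(ctrl_after) - mean(ctrl_before)
    return treat_change - ctrl_change


treatment, control = assign_groups(range(1, 101))

# Hypothetical observed behavior (for example, spend per visit) for illustration only.
effect = net_effect(treat_before=[10, 12], treat_after=[15, 17],
                    ctrl_before=[10, 12], ctrl_after=[11, 13])
print(len(treatment), len(control), effect)
```

Subtracting the control group's change removes whatever would have happened without the stimulus, so the remainder is the portion of the response that can be attributed to the treatment.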