Subscription-Based Service: R Coding Dataset Project
PART 1:
The Subscription.csv dataset describes the customers of a firm focused on a subscription-based service. It was collected from N=300 respondents and covers several demographic and behavioral attributes. The company has information on age, gender, income, number of kids, the segment each person is assigned to, whether the person owns a home, and whether the person is subscribed to the service. The company wants to understand the basic differences between people who have a subscription and people who do not, so that it can craft strategies to increase subscriptions.
For each answer, I want three parts submitted: (1) R code, (2) R output, and (3) a detailed interpretation (where I specifically ask for one). Please also include the question you are answering.
Q1. Please examine the nature of the dataset. In your examination, make sure that you:
(i) list the number of rows (respondents) and the number of columns (variables),
(ii) list the data type of each variable,
(iii) show the first 3 and the last 3 rows of the dataset,
(iv) check whether there are any missing values in the dataset,
(v) convert the character variables into factors, because we will need them as factors for the analysis. After the conversion, confirm with an R function that each variable is now a factor,
(vi) run summary statistics on the dataset (after removing the missing values, if there are any),
(vii) based on the results from (i)-(vi), discuss the properties of the dataset.
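As a starting point for Q1, here is a minimal sketch of the inspection steps. It runs on a small made-up data frame rather than the real file, and the column names (age, gender, income, kids, ownHome, subscribe, Segment) and the level labels other than subNo/subYes are assumptions; check them against the actual Subscription.csv.

```r
# Toy stand-in for Subscription.csv: every value and column name is made up.
# With the real file you would start from: subs <- read.csv("Subscription.csv")
subs <- data.frame(
  age       = c(47.3, 31.4, NA, 37.3, 41.2, 24.8),
  gender    = c("Male", "Male", "Female", "Female", "Female", "Male"),
  income    = c(49000, 35000, 44000, 81000, 79000, 21000),
  kids      = c(2, 1, 0, 1, 3, 0),
  ownHome   = c("ownNo", "ownYes", "ownYes", "ownNo", "ownYes", "ownNo"),
  subscribe = c("subNo", "subNo", "subYes", "subNo", "subYes", "subNo"),
  Segment   = c("A", "A", "B", "B", "A", "B"),
  stringsAsFactors = FALSE
)

dim(subs)             # (i)   number of rows and columns
str(subs)             # (ii)  type of each variable
head(subs, 3)         # (iii) first 3 rows
tail(subs, 3)         #       last 3 rows
colSums(is.na(subs))  # (iv)  missing values per column

# (v) convert every character column to a factor, then verify the result
is_chr <- sapply(subs, is.character)
subs[is_chr] <- lapply(subs[is_chr], factor)
sapply(subs, class)

# (vi) drop rows with missing values, then summarise
subs_clean <- na.omit(subs)
summary(subs_clean)
```

Dropping incomplete rows with na.omit() is only one way to handle missing values; if many rows were affected, a different treatment might be more appropriate.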
Q2. The dataset has a variable called “subscribe” (whether the person is subscribed to the service or not). Please sort the data by subscription behavior (subNo vs. subYes) and produce summary statistics for each group separately. For each group, please discuss age, gender, income, number of kids, whether they own a home or not, and which customer segment they belong to. Based on the output, do you see any differences between people who subscribe to the service and people who don't?
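One possible way to approach Q2 is sketched below: order the rows by subscription status, split the data into one data frame per status, and summarise each piece. The data frame is made up for illustration, and its column names and level labels (apart from subNo/subYes) are assumptions.

```r
# Toy stand-in for the cleaned dataset: all values are made up.
subs <- data.frame(
  age       = c(47, 31, 43, 37, 41, 25),
  gender    = c("Male", "Male", "Female", "Female", "Female", "Male"),
  income    = c(49000, 35000, 44000, 81000, 79000, 21000),
  kids      = c(2, 1, 0, 1, 3, 0),
  ownHome   = c("ownNo", "ownYes", "ownYes", "ownNo", "ownYes", "ownNo"),
  subscribe = c("subNo", "subNo", "subYes", "subNo", "subYes", "subNo"),
  Segment   = c("A", "A", "B", "B", "A", "B"),
  stringsAsFactors = TRUE
)

# Sort so that all subNo rows come before the subYes rows
subs_sorted <- subs[order(subs$subscribe), ]

# Split into a list with one data frame per subscription status
by_status <- split(subs_sorted, subs_sorted$subscribe)

summary(by_status$subNo)   # summary statistics for non-subscribers
summary(by_status$subYes)  # summary statistics for subscribers
```

by(subs, subs$subscribe, summary) would print the same two summaries in a single call.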
Q3. There are two subscription statuses: Yes or No. Please create cross-tables and proportion-based cross-tables between subscription status and (i) home ownership, (ii) gender, and (iii) segment. Please interpret the overall findings at the end.
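The cross-tables in Q3 could be built with table() and prop.table(), as in the sketch below. The data frame is made up, and the column names and level labels (apart from subNo/subYes) are assumptions. Row proportions (margin = 1) are used here so that each subscription status sums to 1; column proportions (margin = 2) would answer a different question.

```r
# Toy stand-in for the cleaned dataset: all values are made up.
subs <- data.frame(
  gender    = c("Male", "Male", "Female", "Female", "Female", "Male"),
  ownHome   = c("ownNo", "ownYes", "ownYes", "ownNo", "ownYes", "ownNo"),
  subscribe = c("subNo", "subNo", "subYes", "subNo", "subYes", "subNo"),
  Segment   = c("A", "A", "B", "B", "A", "B"),
  stringsAsFactors = TRUE
)

# Count tables: subscription status in the rows, the other variable in the columns
tab_home    <- table(subs$subscribe, subs$ownHome, dnn = c("subscribe", "ownHome"))
tab_gender  <- table(subs$subscribe, subs$gender,  dnn = c("subscribe", "gender"))
tab_segment <- table(subs$subscribe, subs$Segment, dnn = c("subscribe", "Segment"))

tab_home
tab_gender
tab_segment

# Proportion tables: margin = 1 makes each row sum to 1
prop_home    <- prop.table(tab_home, margin = 1)
prop_gender  <- prop.table(tab_gender, margin = 1)
prop_segment <- prop.table(tab_segment, margin = 1)

round(prop_home, 2)
round(prop_gender, 2)
round(prop_segment, 2)
```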
Q4. Let’s examine the data even further to understand the segments. Please find the average income, average number of kids, and average age for each segment. Please interpret the findings.
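For the per-segment averages in Q4, one option is to split the numeric columns by segment and take column means, as sketched below. The data and the segment labels A and B are made up, and the column names are assumptions.

```r
# Toy stand-in for the cleaned dataset: all values are made up.
subs <- data.frame(
  age     = c(47, 31, 43, 37, 41, 25),
  income  = c(49000, 35000, 44000, 81000, 79000, 21000),
  kids    = c(2, 1, 0, 1, 3, 0),
  Segment = factor(c("A", "A", "B", "B", "A", "B"))
)

num_cols <- c("income", "kids", "age")

# One column per segment, one row per variable
seg_means <- sapply(split(subs[, num_cols], subs$Segment), colMeans)
round(seg_means, 2)
```

The result is a matrix with the variables in the rows and the segments in the columns, which makes it easy to compare segments side by side.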
Q5. Please create a boxplot for income vs. customer segments in the same output (note: income is on the x-axis, customer segments are on the y-axis). Please interpret the findings in detail.
Q6. Please create a boxplot for age vs. customer segments in the same output (note: age is on the x-axis, customer segments are on the y-axis). Please interpret the findings in detail.
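The boxplots in Q5 and Q6 can be drawn with base R's boxplot() and horizontal = TRUE, which puts the numeric variable on the x-axis and the segments on the y-axis. The sketch below uses made-up data, and the column names and segment labels are assumptions.

```r
# Toy stand-in for the cleaned dataset: all values are made up.
subs <- data.frame(
  age     = c(47, 31, 43, 37, 41, 25),
  income  = c(49000, 35000, 44000, 81000, 79000, 21000),
  Segment = factor(c("A", "A", "B", "B", "A", "B"))
)

# Income by segment: one box per segment, all in the same plot
bp_income <- boxplot(income ~ Segment, data = subs, horizontal = TRUE,
                     xlab = "Income", ylab = "Segment",
                     main = "Income by customer segment")

# Age by segment, drawn the same way
bp_age <- boxplot(age ~ Segment, data = subs, horizontal = TRUE,
                  xlab = "Age", ylab = "Segment",
                  main = "Age by customer segment")

# boxplot() also returns the five-number summary behind each box
bp_income$stats
bp_age$stats
```

The returned stats matrix has one column per segment (lower whisker, lower hinge, median, upper hinge, upper whisker), which can help when writing the interpretation.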
Q7. Using the aggregate function, please find (i) the mean income for each segment, (ii) the mean age for each segment, (iii) the mean number of kids for each segment, (iv) the total number of homeowners in each segment, (v) the total number of subscribers in each segment, and (vi) the total number of males and females in each segment. Please interpret the findings.
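A minimal sketch of the aggregate() calls for Q7 follows. The data are made up, the column names and the level labels ownYes, Male, and Female are assumptions, and the helper columns owns and subscribed are introduced here only so that totals can be computed by summing TRUE/FALSE values.

```r
# Toy stand-in for the cleaned dataset: all values are made up.
subs <- data.frame(
  age       = c(47, 31, 43, 37, 41, 25),
  gender    = c("Male", "Male", "Female", "Female", "Female", "Male"),
  income    = c(49000, 35000, 44000, 81000, 79000, 21000),
  kids      = c(2, 1, 0, 1, 3, 0),
  ownHome   = c("ownNo", "ownYes", "ownYes", "ownNo", "ownYes", "ownNo"),
  subscribe = c("subNo", "subNo", "subYes", "subNo", "subYes", "subNo"),
  Segment   = c("A", "A", "B", "B", "A", "B"),
  stringsAsFactors = TRUE
)

# (i)-(iii) mean income, age, and kids per segment
seg_means <- aggregate(cbind(income, age, kids) ~ Segment, data = subs, FUN = mean)

# (iv)-(v) hypothetical helper columns: TRUE counts as 1 when summed
subs$owns       <- subs$ownHome == "ownYes"
subs$subscribed <- subs$subscribe == "subYes"
seg_totals <- aggregate(cbind(owns, subscribed) ~ Segment, data = subs, FUN = sum)

# (vi) number of respondents per segment and gender
gender_counts <- aggregate(cbind(count = age) ~ Segment + gender,
                           data = subs, FUN = length)

seg_means
seg_totals
gender_counts
```

In the last call, any column without missing values could stand in for age, since length() only counts the rows in each group.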
Q8. When you bring the results together, what do you recommend the company do to increase the number of customers subscribed to its service?
BONUS. Please provide your opinion on the course, its content, and its execution. [Your bonus points will be proportional to the volume of insight you provide.]
(i) Do you face any challenges or struggles in learning R? If yes, which aspect(s) do you find most challenging?
(ii) Thinking about what we have learned in class with R so far, which aspects of R do you like?
(iii) Which aspects of the course have been positive for your learning experience?
(iv) Which aspects of the course could be improved to contribute to your learning experience? [Basically, what else could be done, or done differently, to enhance your learning process?]