Data Mining
Question Description
The purpose of this assignment is to perform data cleaning tasks to identify missing values, outliers, and anomalies, and effectively communicate findings to stakeholders.
You are a business analytics analyst working for BigTel, a telecommunications provider in the American Southwest. The customer service manager has asked for assistance in identifying the geographic origin of the majority of customer service calls. In order to provide the most accurate information, your supervisor has asked you to execute data cleaning tasks to identify missing values, outliers, and anomalies and to report your findings in an executive summary.
Use the "Churn" dataset to complete the data preparation activities in Microsoft Excel. This dataset represents customer account information from BigTel.
- Using Microsoft Excel, explore whether there are missing values for any of the variables. If so, specify the variables that have missing data, the rows that have the missing values, and, using the mean imputation technique, the specific values that you enter into the dataset for the fields that have missing data. Note that the actual Excel file does not need to be updated.
- Using Microsoft Excel, compare the area code and state fields and identify any apparent abnormalities. If abnormalities are determined, explain how they should be addressed in subsequent analyses.
- Using Microsoft Excel, determine whether there are any outliers among the number of calls to customer service and identify the method used to make this determination. Use both the graphical method and the Z-score method. For the graphical method, assume a bin size of 1. For the Z-score method, treat values beyond +/- 4 Z-scores as outliers.
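The assignment itself is to be completed in Excel. As a cross-check on the spreadsheet results, the three tasks above can be sketched in Python. This is a minimal sketch under stated assumptions, not the required method: the function names are hypothetical, the inputs are plain lists rather than the actual Churn file, missing values are assumed to be represented as `None`, and the Z-score uses the sample standard deviation (comparable to Excel's `STDEV.S`), which the assignment does not specify.

```python
from collections import Counter
from statistics import mean, stdev


def mean_impute(values):
    """Fill None entries with the mean of the observed entries.

    Returns the filled list, the 0-based positions that were missing,
    and the fill value that was used.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    missing_rows = [i for i, v in enumerate(values) if v is None]
    filled = [fill if v is None else v for v in values]
    return filled, missing_rows, fill


def states_per_area_code(records):
    """Map each area code to the set of states it appears with.

    `records` is an iterable of (state, area_code) pairs. An area code
    paired with many different states is a candidate abnormality.
    """
    mapping = {}
    for state, area_code in records:
        mapping.setdefault(area_code, set()).add(state)
    return mapping


def histogram_bin1(values):
    """Count how often each integer value occurs (bin size of 1)."""
    return dict(sorted(Counter(values).items()))


def zscore_outliers(values, threshold=4.0):
    """Return (0-based position, value) pairs with |Z| above threshold.

    Assumption: Z is computed with the sample standard deviation.
    """
    mu = mean(values)
    sigma = stdev(values)
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs((v - mu) / sigma) > threshold
    ]
```

To use it, read the relevant columns of the dataset into lists and pass them to each function. For example, `zscore_outliers(calls)` with the customer service call counts flags the same rows that an Excel Z-score column would flag at the +/- 4 boundary, provided the same standard deviation formula is used in both places. Positions are 0-based, so add the header offset when mapping them back to spreadsheet rows.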
Write a 250-word executive summary of your observations and recommendations related to data preparation. Use Excel to generate relevant charts and graphs, and include the charts, graphs, and calculations with the summary.
In the executive summary, address the following:
- Summarize your findings including missing values, abnormalities, and outliers.
- Explain recommendations for fixing identified issues.
- Explain how the identified issues and the recommended fixes might affect your analysis.