he purpose of this assignment is to filter and sort data, identify
patterns and relationships within data, and use those findings to
address business needs.
For this assignment, you will use the "Adult Incomes" data set from
the Topic Materials. Most data categories are self-explanatory.
Clarifying notes are as follows.
- Sex: Gender
- Relationship: Relationship status
- Education: Highest level of education completed
- Hours _Per _Week: Number of weekly hours that the individual works
- Capital _Gain: Amount of capital gains from tax records
- Income: Income of the individual – either ">50K" or "<=50K"
The marketing department in your organization believes that customers
with more education have higher incomes than those with less education,
and the department director wants to formulate an ad campaign targeting
customers with a specific income level. While the assumption stands to
reason, having data to support this hypothesis is necessary in order for
the department to justify the new advertising campaign strategy. The
department director has asked you to determine several specific numbers
and percentages along with calculating to what degree educational level
is correlated with income. Based upon your finding, the marketing
department will determine whether or not its idea for marketing
specifically to those customers is likely to create a positive return on
Question 1: Use Excel and various pivot
tables and charts to calculate the following information. When
summarizing categorical data, sometimes it is easier to create a new
indicator (0/1) variable to obtain a percentage (the average of an
indicator is a percent). Create a new indicator variable "Income_GT50"
where the variable is equal to 1 when the income is greater than
$50,000; otherwise use 0. Using this indicator variable, calculate the
overall percent of individuals who earn more than $50,000 annually.
Question 2: You are interested in
understanding the demographics of a particular cohort in your data set:
those who have at least a "Bachelors," "Masters," or "Doctorate" degree
whose occupation is also "Exec-managerial." Filter the data to include
only these individuals, and report the count of individuals who meet the
criteria along with their average age and the percentage who earn more
than $50,000 annually.
Question 3: Many times age does not have a
purely linear relationship with the dependent variable, so it is
separated into age brackets. Create a new variable categorizing age into
brackets as follows. Include the pivot table output when submitting the
- "<25": If age less than 25
- "25-34": If age between 25 and 34 inclusive
- "35-44": If age between 35 and 44 inclusive
- "45-54": If age between 45 and 54 inclusive
- "55-64": If age between 55 and 64 inclusive
- "65+": If age is 65 or more
Question 4: Calculate the percentage of
customers who earn more than $50,000 annually for each age bracket.
Create a pivot chart showing the results and include the chart when
submitting the answer. Is the effect of age on income a linear or
Question 5: Using a pivot table, calculate
the percent of customers earning more than $50,000 annually by education
level? Include the pivot table and pivot chart outputs and refer to
them as you address whether or not the marketing department's belief
that customers with more education have higher incomes than those with
less education. Present your findings in the form of a 250-word
executive summary that includes relevant data, charts, and tables.
Submit the answers to Questions 1-4 and the executive summary as Word documents.
APA format is not required, but solid academic writing is expected.
This assignment uses a grading rubric. Please review the rubric prior
to beginning the assignment to become familiar with the expectations
for successful completion.