ALY6100 Northeastern University Netflix Efficiency Plan Proposal Paper

User Generated

nyyralhna1222

Mathematics

ALY6100

Description

(1) Email-essay scenario: Your boss has tasked you with leading the data science team effort for this project. (Or, your team for the Netflix prize has put you in charge.) Last week, you worked on defining the project's objectives and the questions that need to be answered. Now it's time to make a plan for the first two weeks of work, which will focus on defining what data is needed to answer the business questions and reach the objectives, and on gathering that data. Your boss has asked you to send a proposed plan as an email, including:

  • What datasets will be needed
  • Why these datasets? How does the information that they contain inform the decision or answer business questions?
  • Which datasets exist internally?
  • If any datasets don't already exist, specify how they will be collected.

* Use your knowledge of the cases / how businesses work to imagine what likely exists already internally at Salesforce and Netflix. This week's video "Delivering High Quality Analytics at Netflix" will give you a sense of what sorts of data exist at Netflix, and help you imagine what data may exist at Salesforce.

Requirements:

  • Minimum 300 words
  • Minimum 2 references (can use book as a reference) with in-line citations as appropriate
  • Reference list

(3) Data description exercise

For one dataset specified in your email, write up a partial data encyclopedia and dictionary. Examples of one dataset:

  • Salesforce
    • History of salaries and bonuses for each employee
  • Netflix
    • All customer ratings for each video

Include (see Bartlett 12.2 for more details):

  1. Purpose of Dataset
  2. Source of dataset
  3. Time window (that the dataset represents)
  4. Cost of data (to the company)
  5. Collection techniques (see also Bartlett Chapter 10)
  6. Collection tools
  7. Quality
  8. Completeness
  9. For each column in the dataset
    1. Name
    2. Definition
    3. Variable Classification (see Bartlett Table 12.1, p. 247)

For any details you cannot find in the cases or through research, make up a reasonable description.
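As a rough illustration only, the items listed above could be captured in a small structure like the one below. This is a minimal sketch in Python, not a format prescribed by Bartlett or by the course. The class names (`DatasetEntry`, `ColumnEntry`), the helper `missing_fields`, the column names, and every value filled in for the Netflix ratings dataset are hypothetical placeholders made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnEntry:
    """One data-dictionary row: name, definition, variable classification."""
    name: str
    definition: str
    classification: str  # e.g. "Integer", "Continuous", "Ordinal"


@dataclass
class DatasetEntry:
    """One data-encyclopedia record: items 1-8, plus the per-column entries."""
    purpose: str
    source: str
    time_window: str
    cost: str
    collection_techniques: str
    collection_tools: str
    quality: str
    completeness: str
    columns: List[ColumnEntry] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of descriptive (text) fields that were left blank."""
        return [
            name
            for name, value in vars(self).items()
            if isinstance(value, str) and not value.strip()
        ]


# Hypothetical example: all values below are invented for illustration.
ratings = DatasetEntry(
    purpose="Record all customer ratings for each video",
    source="Internal ratings logs (assumed)",
    time_window="Placeholder: fill in the period covered",
    cost="Placeholder: internal storage and processing cost",
    collection_techniques="Ratings submitted by customers in the product",
    collection_tools="Placeholder: internal logging pipeline",
    quality="Placeholder: note known biases or errors",
    completeness="Placeholder: note unrated titles and missing values",
    columns=[
        ColumnEntry("customer_id", "Identifier of the rating customer", "Nominal"),
        ColumnEntry("video_id", "Identifier of the rated video", "Nominal"),
        ColumnEntry("rating", "Star rating given by the customer", "Ordinal"),
        ColumnEntry("rating_date", "Date the rating was submitted", "Date"),
    ],
)

# Lists any descriptive field that still needs to be written up.
print(ratings.missing_fields())
```

Keeping the eight descriptive items and the per-column entries in one record makes it easy to spot which parts of the write-up are still blank before the email is sent.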

Reading:

https://www.thrillist.com/entertainment/nation/the-netflix-prize

https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429

https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-2-d9b96aa399f5


Video:

https://www.youtube.com/watch?v=nMyuCdqzpZc

Example for the exercise and last week's work below:

Data description exercise
ALY 6100 Example

For this exercise, I will consider the S&P return data, presented in the “Data Collection Methodology” video.

  1. Purpose of Dataset – Record yearly returns for the S&P index, to inform parameters for a retirement savings model
  2. Source of dataset – Downloaded from http://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/histretSP.html
  3. Time window (that the dataset represents) – 1928-2017
  4. Cost of data (to the company) – Free download from internet
  5. Collection techniques – Yearly market returns were calculated from
  6. Collection tools – See Collection techniques
  7. Quality – No quality concerns, as this is exact data, but historical changes in markets must be considered when using this data to forecast the markets
  8. Completeness – Complete, all years from 1928-2017 included

Variables

  • Year
    • Definition: Year
    • Variable Classification: Integer
  • Return
    • Definition: Annual returns on investments in S&P 500, including appreciation and dividends, in percent
    • Variable Classification: Continuous

Email Scenario

In the Netflix Prize and recommendation problem, there is a need to improve accuracy by reducing errors. Thus, there are two main objectives of this project. The first is to educate employees about the causes of missing data in the system, which reduce accuracy, and the second is to find solutions for the missing data (Amatriain & Basilico, 2012).

These objectives are fundamental for several reasons. First, employees take part in many activities that may lead to the loss of data. Loss of data leads to system errors, which lead to poor performance or sometimes failure of the systems. Some of the causes include programming errors, loss of data during transfer, failure of the user to fill in required fields, or even neglect by users due to personal beliefs about performance. In this context, an organization is likely to run into significant losses. Can you imagine if the company lost the details of all its clients due to a programming error? It might have to spend a great deal of resources to recover from the loss. Thus, under this goal, the data science team will know how their actions and knowledge about data-driven decisions affect the welfare of the organization.

For the second goal, strategy formulation is important. The data science team needs to know the best practices for keeping the data free from errors. This objective is derived from the first one. Having identified the cause, why not formulate a strategy? Knowing that a problem exists and dealing with it are two different things, which makes this objective vital to the project.

There are several questions which the data science team should be able to answer in order to make sound recommendations in this project. These questions include:

  1. From the previous errors, which kinds of incidents in the organization caused them?
  2. Are there any identified ways to improve the efficiency of the project? If so, which ones are they and how can they help?
  3. Which technologies are at our disposal to help mitigate the risks we encounter on the way?

References

Amatriain, X., & Basilico, J. (2012, April 06). Netflix Recommendations: Beyond the 5 stars (Part 1). Retrieved from https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429

Amatriain, X., & Basilico, J. (2012, April 06). Netflix Recommendations: Beyond the 5 stars (Part 2). Retrieved from https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-2-d9b96aa399f5

Short Answer

In the Netflix case, ratings improved due to the use of data-driven decisions. Thus, what data-driven decisions add to the project is accuracy. When dealing with figures, accuracy is the most important thing. Providing accurate figures is fundamental for the analysis of the business’s position (Amatriain & Basilico, 2012). For instance, the Netflix case talks of ranking, which is a concept used by most online movie-selling organizations. People need to know the level of enjoyment they will get by subscribing to given movies. In order to get this right, data-driven decisions need to be made. First, every client’s opinion is analyzed and rated on a chosen scale, for instance, a scale of 1-5 or 1-10. By including all the data involved, a true reflection of the organization is produced. It is from truthful values that an individual is able to make a successful move.

Accuracy is the greatest value that data-driven decisions add (Yamin-Ali, 2014). The best decisions are those based on facts. Most resolutions that fail to produce the expected results fail because of the inaccuracies in them: they are based on assumptions. For instance, if Netflix assumes that five hundred people out of a thousand who watched a given movie on its website loved it, then it will be operating on a wrong basis. It can run into losses because it will continue distributing the content without facts. What if just two hundred people loved it, yet the crew made an assumption based on the analysis of the first fifty people only and neglected to complete the results for the remaining nine hundred and fifty? Data-driven decisions must be taken seriously, for any omission will lead to great errors. For instance, in the Netflix case study, the organization is able to improve its ranking if accurate data is entered by the data science team.

References

Finkelstein, S., Whitehead, J., & Campbell, A. (2009). Think Again: Why Good Leaders Make Bad Decisions and How to Keep It From Happening to You. Boston: Harvard Business Review Press.

Yamin-Ali, J. (2014). Data-Driven Decision Making in Schools: Lessons from Trinidad. Basingstoke: Palgrave Macmillan Limited.

Explanation & Answer

Running Head: OUTLINE

A plan for Netflix Efficiency

Student’s Name:

Course Title:

Date:

OUTLINE

Reducing Errors for Efficiency

First Two Weeks’ Plan


Step 1: Data Set Identification
Step 2: Implementation
Step 3: Evaluation

Data Description Exercise
MovieLens Dataset
References


FIRST TWO WEEKS’ PLAN

Reducing Errors for Efficiency
First Two Weeks’ Plan

With the growing need to attract a diverse global audience for Netflix
movies and other content, the firm needs to work on increasing efficiency. This requires
a well-structured written plan. The plan will consist of several ...

