Find a Dataset and Write Pseudocode that Would Operate on the Data in a Hadoop Cluster

Content Type

User Generated

User

oboubcr69

Subject

Computer Science

School

Colorado Technical University Online

Description

Research the datasets on Kaggle.com and identify one of interest. Once you have identified a dataset, discuss the data and the goals of using it in a business scenario. Construct MapReduce pseudocode showing how the data may be processed with the MapReduce programming approach.

MapReduce is often used in a parallel processing environment such as Hadoop, where the map and reduce operations execute in parallel on the nodes of the cluster. This approach is commonly used to process Big Data. For this assignment, complete the following:

Research Kaggle.com, and identify a dataset that is suitable for MapReduce programming in a distributed environment.
Construct pseudocode that would operate on these data as if they were stored in a Hadoop cluster. This operation should be tied to a defined goal of the dataset. This pseudocode should have mappers and reducers defined.
Discuss how this form of processing is beneficial and can be used in a business setting.
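
To make the mapper and reducer roles asked for above concrete, here is a minimal in-process sketch in Python rather than a real Hadoop job. It assumes a business goal of measuring the attrition (churn) rate per card category. The sample rows, the field names `card_category` and `attrited`, and the helper names `mapper`, `shuffle`, `reducer`, and `run_job` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (card_category, attrited) pairs standing in for
# records of a customer CSV. The real dataset's columns may differ.
RECORDS = [
    ("Blue", True),
    ("Blue", False),
    ("Silver", False),
    ("Blue", False),
    ("Silver", True),
    ("Gold", False),
]


def mapper(record):
    """Emit one (key, value) pair: key = category, value = (attrited, 1)."""
    category, attrited = record
    yield category, (1 if attrited else 0, 1)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Sum attrited and total counts for one key and return the churn rate."""
    attrited = sum(v[0] for v in values)
    total = sum(v[1] for v in values)
    return key, attrited / total


def run_job(records):
    """Run map, shuffle, and reduce sequentially in a single process."""
    mapped = (pair for record in records for pair in mapper(record))
    return dict(reducer(k, vs) for k, vs in shuffle(mapped).items())


if __name__ == "__main__":
    for category, rate in sorted(run_job(RECORDS).items()):
        print(f"{category}: {rate:.2f}")
```

On a cluster, each mapper call would run on the node holding its slice of the data, and each reducer would receive all values for its keys. Emitting `(attrited, 1)` pairs rather than a precomputed rate keeps the values summable, so partial sums from different nodes can be combined safely.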

This is the dataset I have chosen: https://www.kaggle.com/sakshigoyal7/credit-card-cu...
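
For a customer dataset like the one linked above, one possible way to run the job on Hadoop is Hadoop Streaming, where the mapper and reducer are programs that read lines from standard input and write tab-separated key/value lines to standard output. The sketch below imitates that contract locally. It assumes a simplified two-column input of the form `category,attrited` with `yes`/`no` flags, which is a hypothetical layout and not the actual schema of the dataset. The function names are also illustrative.

```python
import io
from itertools import groupby


def map_lines(lines):
    """Mapper: parse 'category,attrited' lines and emit 'category<TAB>flag'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        category, attrited = line.split(",")
        yield f"{category}\t{1 if attrited == 'yes' else 0}"


def reduce_lines(sorted_lines):
    """Reducer: expects lines sorted by key, which Hadoop provides."""
    parsed = (line.rstrip("\n").split("\t") for line in sorted_lines)
    for category, group in groupby(parsed, key=lambda kv: kv[0]):
        flags = [int(flag) for _, flag in group]
        # Emit: category, attrited count, total customers
        yield f"{category}\t{sum(flags)}\t{len(flags)}"


def simulate(raw_text):
    """Stand-in for the cluster: map, sort by key, then reduce."""
    mapped = sorted(map_lines(io.StringIO(raw_text)))
    return list(reduce_lines(mapped))


if __name__ == "__main__":
    sample = "Blue,yes\nSilver,no\nBlue,no\n"
    for out_line in simulate(sample):
        print(out_line)
```

The `sorted` call stands in for the shuffle-and-sort phase. The reducer relies on that ordering because `groupby` only merges adjacent lines with the same key.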




