Testing my program (Python)
User Generated
z6nn6
Computer Science
Description
Hello!
I have a program that I need to make sure works, because I'm not good with libraries. These are the requirements: Screen Scraper (2).docx
and this is the program: scapper (1).zip
Also, if you can, please fix it and teach me how to make it work.
Thank you
Unformatted Attachment Preview
ITEC 423
A Simple Screen Scraper
In this exercise we will create a simple screen scraper in Python that extracts
links from a given web site, starting at an initial page and following links to a depth of
two (the depth can be modified later if desired). Links will be extracted in a breadth-first
fashion. The scraper will use the Beautiful Soup API, which helps separate the HTML code
from the page content and works well for parsing pages with broken HTML. See Blackboard
for more information on downloading Beautiful Soup and working with Python. I will show a
few demonstration examples in class and will distribute sample code.
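The breadth-first link extraction described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the course's "searchengine.py": it uses the standard library's html.parser as a stand-in for Beautiful Soup so it runs without extra installs, it reads "depth of two" as "the start page, the pages it links to, and the links found on those pages", and the names LinkParser, extract_links, fetch, and crawl are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.links:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def fetch(url):
    """Download a page as text; return "" if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def crawl(start_url, max_depth=2, fetch_page=fetch):
    """Breadth-first crawl; returns URLs in the order they were discovered."""
    seen = {start_url}
    order = [start_url]
    queue = deque([(start_url, 0)])  # (url, depth); the start page is depth 0
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # assumption: pages at the depth limit are not expanded
        for link in extract_links(fetch_page(url), url):
            if link not in seen:
                seen.add(link)
                order.append(link)
                queue.append((link, depth + 1))
    return order
```

The queue is what makes the traversal breadth-first: every link found on the start page is recorded before any link found one level deeper. Passing a different fetch_page function (for example, one that reads from a dictionary of saved pages) lets you test the crawl without touching the network.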
For this assignment we will modify the original “searchengine.py” file to include a function that
stores all of the extracted links in a file. We will then read each link, visit the page, and extract
only the content from each page (stripping out all HTML tags). You will need to create a
function (or modify the existing calc() function) to include a word frequency component. The
word frequencies for each page should be printed to the screen and written to a file. If you
have extra time, feel free to add other components and features to your program.
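The three pieces named above (storing links in a file, stripping HTML tags, counting word frequencies) might look something like the sketch below. It is an assumption-laden illustration rather than the expected solution: it does not touch the real calc() function, it uses the standard library's html.parser in place of Beautiful Soup, and every function name here (strip_tags, word_frequencies, save_links, load_links, report) is hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keeps page text and drops tags, scripts, and style sheets."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(html):
    """Return only the visible text of an HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def word_frequencies(html):
    """Count lowercase words in the page text (tags already removed)."""
    words = re.findall(r"[a-z0-9']+", strip_tags(html).lower())
    return Counter(words)


def save_links(links, path):
    """Write one URL per line to the file at path."""
    with open(path, "w", encoding="utf-8") as handle:
        for link in links:
            handle.write(link + "\n")


def load_links(path):
    """Read the URLs back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def report(url, counts, out):
    """Print one page's counts to the screen and write them to an open file."""
    lines = [f"== {url} =="]
    lines += [f"{word}\t{count}" for word, count in counts.most_common()]
    text = "\n".join(lines)
    print(text)
    out.write(text + "\n")
```

A driver would call save_links once after the crawl, then loop over load_links, download each page, and pass the result of word_frequencies to report with an output file opened in write mode. Joining text pieces with a space keeps words in neighboring tags apart, at the cost of splitting a word that is interrupted by an inline tag.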
Turn in:
1. Your complete Python code (you do not need to submit the code for Beautiful Soup)
2. The output file where the URLs were stored
3. The output from the program with the results of the word frequency counter
This question has not been answered.
Most Popular Content
Ozarks Technical Community College Atwood Country Club Cash Flow Projection
Use a MacBook to complete. This should take about an hour! I will attach instructions and a file. Just part one of the three parts needs to be completed.
Web Development Exercise 7-1
In this project, you will create a database to contain tables of batting
statistics for major league baseball teams. You will then create a table
named teamstats in the baseball_stats database and add records
to the new table from a file named team_stats.txt in your Projects
1. Log in to MySQL Monitor with your root account or with the
user name and password supplied by your ISP or instructor.
2. Enter the following command to create a database named
baseball_stats:
mysql> CREATE DATABASE baseball_stats;[ENTER ]
3. After you see the “Query OK” message, enter the following
command to select the baseball_stats database:
mysql> USE baseball_stats;[ENTER ]
4. After you see the “Database changed” message, type
the following command to ensure that you selected the
baseball_stats database:
mysql> SELECT DATABASE();[ENTER ]
5. Enter the following command to create the teamstats table.
The Team field uses the VARCHAR data type. Eleven of the columns
use INT data types, and the remaining two fields use
FLOAT data types. Each of the statistical field names uses common
baseball abbreviations, such as G for games, AB for at-bats,
R for runs, and HR for home runs.
mysql> CREATE TABLE teamstats (Team VARCHAR(50), FirstYear INT,[ENTER ]
-> G INT, W INT, L INT, Pennants INT, WS INT,[ENTER ]
-> R INT, AB INT, H INT, HR INT, AVG FLOAT,[ENTER ]
-> RA INT, ERA FLOAT);[ENTER ]
6. After you see the “Query OK” message, enter the following
command to display the structure of the new table:
mysql> DESCRIBE teamstats;[ENTER ]
7. Enter a LOAD DATA statement that inserts records from the
team_stats.txt file in your Projects directory for Chapter 7
into the teamstats table. Replace path_to_PHP_folders with
the full path for your PHP_Projects directory for Chapter 7.
Use the MySQL server’s directory path, not the Web URL path.
mysql> LOAD DATA INFILE 'path_to_PHP_folders/Chapter.07/Projects/team_stats.txt'[ENTER ]
-> INTO TABLE teamstats;[ENTER ]
8. After you see the “Query OK” message, enter the following
command to view all the records in the teamstats table:
mysql> SELECT * FROM teamstats;[ENTER ]
20 pages
Quantitative Qualitative Risk Assessment
Organizational assets are resources, either physical or intangible goods that a given firm or business entity owns and has an economic value to the ...
Research Paper on Software Developer
Project Title: Software Developer
Fourth Iteration: Implementation and Evaluation of Software Developing Tasks
This paper is the continuation of the second paper attached below. (Have a look at the 4th iteration definition in the 2nd paper.)
The 4th iteration is to include your Plan, Action, Observations, and Reflections:
Plan – at least one page in length, should include a description of all the planning activity that has taken place; may include agendas or other manuscripts as appropriate
Action – at least one page in length, should include a description of the actual activity
Observation – at least one page in length, should include a description of all the information collected as well as any analysis
Reflection – at least one page in length, should include a description of your thoughts about what happened, what went well, and what went not so well. If your iteration was a meeting, you may want to discuss the effectiveness of the meeting: did you have the best participants, did you miss (not invite) anyone, or did you learn during the meeting that you should have invited someone else? If so, what are your thoughts regarding mitigation, etc.
The paper should contain at least 5 pages of content, not counting the title page, table page, and reference page.
Paper Requirements:
* Table of contents and 4th iteration Introduction are mandatory
* APA Format should follow all APA rules (citations, quotations, references)
* Follow Template
* No Plagiarism
2 pages
Cs 305 Module Two Static Testing Summary Template
Replace the bracketed text with your own words. If you choose to include images or supporting One of the reasons for security vulnerability is the old ...
4 pages
Week 8 Final Project Oracle
The final project is meant to be comprehensive. It requires you to pull all your knowledge You are required to submit your scripts and screen ...
Similar Content
Create the E/R model of the database you are going to build.
Create the E/R model of the database you are going to build. The more variety the better grade. (All kinds of relationship...
LACC Megaputer WebAnalyst Discussion
"Google search" for Web Mining tools, and discuss at least 1 Web Mining tool and provide a minimum of 2 applications of th...
Cloud / Client Computing IT trend
Why will be Cloud / Client Computing the lead important IT trend in Organizations or even private...
New England College of Business and Finance Hacking Article Summary
Search the Internet and locate an article that relates to the topic of HACKING and summarize the reading in your own words...
Middle East College Decision tree and Locational Marginal Price Essay
1- Explain in detail what a decision tree is and how we use it. 2- Explain in detail what Locational Marginal Price (LMP) is. 3-Ex...
University of Arizona Global Campus Risky Business for EZTechMovie Report
Conduct a risk assessment on an IT system that is a component of the critical business function that you identified in Wee...
Deep Learning
Deep learning is a category of machine learning inspired by artificial neural networks responsible for networks that are d...
Artificial Intelligence
Here, assuming that those f -vale is same which been the last added to the Green (upper) numbers are the order and the pur...
Assignment
The risk presented by cyberterrorism has caught the eye of the broad communications, the security network, and the data in...