Python Question

Programming

University of orgone

Description

Please find all the details about the assignments attached

I will send the Exercises in a different file

I do have the exam Qs we will just need to work on some answers

All dates and times are in Mountain Time (MT).

Tasks | Due date
Assignment: JSON and MongoDB (Assignment 2) | Oct 31 by 10pm
Peer Review: JSON & MongoDB | Nov 5 by 10pm
Assignment: JSON and MongoDB (Final) | Nov 7 by 10pm
Exercise 12A: Overflow Errors (Auto Graded) | Nov 13 by 10pm
Exercise 12B: Input Validation | Nov 13 by 10pm
Assignment: Data Modeling & Secure Coding (6020 Initial) | Nov 15 by 10pm
Peer Review: Data Modeling & Secure Coding | Nov 19 by 10pm
Exercise 13: Pandas | Nov 20 by 10pm
Assignment: Data Modeling & Secure Coding (6020 Final) | Nov 29 by 10pm
Exercise 14: Numpy & Matplotlib | Dec 4 by 10pm
Assignment: Data Analysis and Visualization (6020 Initial) | Dec 6 by 10pm
Peer Review: Data Analysis & Visualization | Dec 10 by 10pm
Assignment: Data Analysis and Visualization (6020 Final) | Dec 13 by 10pm
Final Exam ISMG 6020 (Remotely) | Dec 18 by 10pm

Assignment 1 (due Nov 31)
Assignment: JSON and MongoDB

I have already submitted this assignment, so you do not have to worry about it. I am leaving it here because we may need it to do the rest of the assignments. I will share my solution for this assignment and the correct solution from the professor as well so you can understand it.

• This is the first assignment in a series of assignments that utilizes bike station data from https://data.cityofchicago.org/Transportation/Divvy-Bicycle-Stations-Historical/eq45-8inv.
  o You will use object oriented programming techniques to create a BikeStation object with a constructor and multiple methods.
  o Run a data modeling process to extract the necessary bike station utilization information from the JSON data and group the data by station.
  o You will then run an analysis to calculate some basic statistics about the data.
  o You will create different types of visualizations of the bike station data you have retrieved and processed and save some plots.
• The final program will consist of multiple modules including: BikeStation.py, downloader.py, model_data.py, stats.py and plots.py.
• In this assignment you will be creating BikeStation.py, which will read some bike station data from a file and create 5 bike_station objects based on some of the data in the file, as described in the Assignment Instructions and in the video below:

Submitting Your Work

Please upload your initial submission on the assignment due date:
• A screen shot of your running the BikeStation.py application.
• A screen shot of your BikeStation.py code, showing all code.
• Neither screen shot should show your name.
• You can use https://carbon.now.sh/ to capture a screenshot of your code even if it is longer than the length of your screen.

Review assignments for two classmates within 4 days of the original assignment submission date:
• Submitting the reviews within this assignment will be the submission for the peer review assignment.
• See Code Reviewing for instructions on conducting a Code Review.

Revise your assignment and resubmit to: Assignment: Object Oriented Programming (6020 Final)
• Your actual BikeStation.py code file as a TEXT file (NOT a Jupyter Notebook file) - PLEASE add your name to the text file.
• A screen shot of your running the BikeStation.py application.
• If you use ANY CODE you found in another student's assignment PLEASE
  o Add a comment to that piece of code indicating the source of the code e.g. "Lines 12-16 were adapted from code I reviewed"
  o Make sure not to use more than 10% of the code from another student.
• Please add a comment in the Canvas comments describing how you updated your code from the first draft (or if you did no updates).
• Please submit the entire assignment EVEN IF YOU MADE NO CHANGES FOLLOWING CODE REVIEW.
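As a rough illustration of the kind of object this assignment asks for, here is a minimal sketch of a BikeStation class with a constructor, one validated property, one calculated value, and a string representation. The attribute names follow the fields in the Divvy data, and the sample values come from the example record in the assignment instructions. The Point helper, the choice of available_bikes as the validated property, and the percent_full method are assumptions made for this sketch, not the required design.

```python
class Point:
    """Hypothetical helper that holds a station's coordinates."""

    def __init__(self, latitude, longitude):
        self.latitude = float(latitude)
        self.longitude = float(longitude)


class BikeStation:
    def __init__(self, station_id, timestamp, station_name,
                 total_docks, available_bikes, position):
        self.station_id = station_id
        self.timestamp = timestamp
        self.station_name = station_name
        self.total_docks = int(total_docks)
        self.available_bikes = available_bikes  # goes through the setter
        self.position = position

    @property
    def available_bikes(self):
        return self._available_bikes

    @available_bikes.setter
    def available_bikes(self, value):
        value = int(value)  # raises ValueError for non-numeric text
        if value < 0:
            raise ValueError("available_bikes cannot be negative")
        self._available_bikes = value

    def percent_full(self):
        # Calculated on demand instead of being stored on the object.
        return round(100 * self.available_bikes / self.total_docks)

    def __str__(self):
        # "2020-11-16T11:55:55.000" -> "2020-11-16 11:55"
        when = self.timestamp.replace("T", " ")[:16]
        return f"{self.station_name} had {self.available_bikes} bikes on {when}"


station = BikeStation("515", "2020-11-16T11:55:55.000",
                      "Paulina St & Howard St", "19", "10",
                      Point("42.019159", "-87.673573"))
print(station)
```

Storing the count under a private name and exposing it through a property keeps the validation in one place, so every assignment to available_bikes, including the one in the constructor, is checked the same way.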
Overall Project Structure

Assignment Instructions

• This assignment will utilize Bike Station data from the City of Chicago (original data source https://data.cityofchicago.org/Transportation/Divvy-Bicycle-Stations-Historical/eq45-8inv).
• This assignment will use a file (bdata.txt; you will find it attached in a different file named bdata) containing one record for each bike station (669 stations), but the complete data set has 176 million records.
• Each bike station record contains multiple pieces of data about the time, the bike station, the number of docks at the station, the number of bikes at the station, the status of the station and where the station is located. The data is stored as name value pairs as shown below (NOTE: the actual file has all of the data for one bike station on a single line).

{
  "id": "515",
  "timestamp": "2020-11-16T11:55:55.000",
  "station_name": "Paulina St & Howard St",
  "total_docks": "19",
  "docks_in_service": "19",
  "available_docks": "9",
  "available_bikes": "10",
  "percent_full": "53",
  "status": "In Service",
  "latitude": "42.019159",
  "longitude": "-87.673573",
  "location": { "type": "Point", "coordinates": [-87.673573, 42.019159] },
  "record": "51520201116115555"
}

Step 1: Select a set of Bike Stations to take data for
• Go to Assignment: Choose Set of Bike Stations to Collect Data For and follow the instructions on the page to choose the five Bike Stations you collect data for.

Step 2: Write a BikeStation class that demonstrates the following aspects of OOP in Python:
1. Create a basic class that can hold all relevant Bike data for an individual bike station (see above).
  o As with a database, you don't normally want to have calculated values stored in your objects because it could cause data integrity issues if, for example, you update the available_bikes when a bike is returned but forget to update percent_full.
  o As such, your BikeStation class should only include the data values needed to uniquely specify the current state (number of bikes available, number of slots in the station).
2. Appropriately create a constructor to set all data values.
3. Use the @property decorator to make at least one property in your BikeStation class private.
4. Use the @???.setter method to validate the private property in some way (e.g. check if it is numeric, change its data type, change its length) before setting its value.
5. Create one or more regular class methods to CALCULATE other relevant bike station values from the data attributes you are storing as a part of the class.
6. Override the __str__ method to print a string representation of a BikeStation that looks like the String below.
  o Note that the format of the date string is different than it is in the data file and requires you to use String methods to remove the T between the date & time as well as the seconds and milliseconds from the time:
  o Paulina St & Howard St had 10 bikes on 2020-11-16 11:55

Step 3: Write a second class called Point or Position to store two or more pieces of BikeStation data.
• The class could store latitude & longitude.
• You should use this class in the BikeStation class instead of storing the values as primitive data types.
You may use the Author and Article classes presented in lecture as a starting point for these classes.

Step 4: Do one thing I did not ask you to do explicitly in the assignment.
• There are many things about Object Oriented Programming that may not have been covered in the lectures.
• An important part of becoming a programmer is finding resources and better ways of writing your code.
• Find one other object oriented concept not explicitly required for the assignment in the lectures, book(s) or using Google and add it to your code.
• The change should improve the modularity of the program or the "reusability" of your code.
• Document what you did in the code comments as well as in the submission comments in Canvas.

Step 5: Test your classes using data from the data file bdata.txt
1. Prompt for a file name (bdata.txt; you will find it attached in a different file named bdata).
2. Open that file and read through the file.
3. Display a custom error message if the file does not exist.
4. Use an appropriate String method to find the lines for the BikeStations you were assigned, e.g. lines that contain: "id": "###" (where ### is the number for one of the Bike Stations you were assigned).
5. Once you have found those lines, you can pull the BikeStation data out of each line by splitting the line on ", " and then splitting each piece a second time on the colon.
6. Create a new BikeStation object and add it to a list.
7. Print the BikeStation object.
8. Your code should be efficient (you should not have to open the file multiple times NOR should you have to loop through the data more than once).
9. Print out the total number of bikes available for the five Bike Stations you are gathering data for, after the file has been completely read.
10. Print out the total number of empty bike docks available for the five Bike Stations you are gathering data for, after the file has been completely read.

Your output should look something like this:

Paulina St & Howard St had 10 bikes on 2020-11-16 11:55
Clark St & Jarvis Ave had 1 bikes on 2020-11-16 11:55
Greenview Ave & Jarvis Ave had 3 bikes on 2020-11-16 11:55
Bosworth Ave & Howard St had 1 bikes on 2020-11-16 11:55
Eastlake Ter & Rogers Ave had 2 bikes on 2020-11-16 11:55
Stations: [515, 517, 520, 522, 523] Bikes Available 17 Docks Available 58

OOP Assignment Rubric
• BikeStation class with at least 6 attributes defined in an __init__ function that correctly sets all BikeStation properties.
• BikeStation class includes a regular method that returns a calculated value.
• BikeStation class overrides the __str__ method and returns a string containing at least 2 pieces of BikeStation data.
• BikeStation class uses the @property decorator to make at least one property private. Both the @property and ???.setter methods for that property are created correctly.
• A second class called Point or Position was created and used in the BikeStation class. Must include an __init__ method.
• BikeStation data is read from a file and appropriate methods are used to open, read from and close the file.
• A Custom Error Message is displayed if the file cannot be opened.
• The lines containing the BikeStation ids that you are extracting data for are correctly identified.
• Appropriate String methods are used to extract the relevant pieces of BikeStation data from the lines from the data file.
• Relevant BikeStation data is used to create a list of BikeStation objects.
• The total number of bikes available and the total number of docks available across the five BikeStations is calculated and displayed.
• Code includes one "Extra" feature: code includes one object oriented feature not explicitly required in the assignment description. Feature is documented in the comments.
• Code runs correctly: a screenshot file is included demonstrating the program runs and displays data correctly.
• Code is clear, well commented and has good overall design quality. Watch out for: Bad variable or method names. Convoluted control flow (if and while statements) that are repetitive or that could be simplified.
Packing too much into one line of code, or too much into one method. Failing to comment obscure code. Having too many trivial comments that are simply redundant with the code. Variables used for more than one purpose.

My solution for assignment 1:

    import json


    class Bikestation:
        def __init__(self, station_id, timestamp, station_name,
                     total_docks, available_docks, available_bikes):
            self.station_id = station_id  # validated by the setter below
            self.timestamp = timestamp
            self.station_name = station_name
            # The data file stores numbers as strings, so convert them once here.
            self.total_docks = int(total_docks)
            self.available_docks = int(available_docks)
            self.available_bikes = int(available_bikes)

        @property
        def station_id(self):
            # Read the private attribute; returning self.station_id here
            # would call this property again and recurse forever.
            return self._station_id

        @station_id.setter
        def station_id(self, x):
            # int() raises ValueError if the id is not numeric.
            self._station_id = int(x)

        def available_docks_percentage(self):
            return 100 * (self.available_docks / self.total_docks)

        def __str__(self):
            # "2020-11-16T11:55:55.000" -> "2020-11-16 11:55"
            f_date = self.timestamp.replace("T", " ").split(".")[0]
            f_date = f_date[:-3]
            return f"{self.station_name} had {self.available_bikes} bikes on {f_date}"

        def __repr__(self):
            return self.__str__()


    def main():
        f_name = input("Enter data source file name: ")
        try:
            # Assumes the file holds a single JSON array of station records.
            with open(f_name, "r") as fp:
                all_stations = json.load(fp)
        except OSError:
            print(f"Unable to open {f_name}")
            return

        chosen_stations = [149, 150, 263, 335, 406]
        bike_stations = []
        for station in all_stations:
            if int(station["id"]) in chosen_stations:
                bike_stations.append(Bikestation(
                    station["id"], station["timestamp"], station["station_name"],
                    station["total_docks"], station["available_docks"],
                    station["available_bikes"]))

        all_bikes = 0
        all_docks = 0
        for bike_station in bike_stations:
            all_bikes += bike_station.available_bikes
            all_docks += bike_station.available_docks
            print(bike_station)
        print(f"Stations: {chosen_stations} Bikes Available {all_bikes} Docks Available {all_docks}")


    if __name__ == "__main__":
        main()

The professor solution:

Assignment 2
Assignment: JSON and MongoDB

For each assignment we have to do three things:
1- Solve the assignment.
2- Make some corrections or notes (Peer Review) on two students' solutions for the same assignment.
3- Correct our solution after seeing the students' notes on our solution.

• This is the second assignment in a series of assignments that utilizes bike station data from https://data.cityofchicago.org
  o You will use object oriented programming techniques to create a BikeStation object with a constructor and multiple methods.
  o Run a data modeling process to extract the necessary bike station utilization information from the JSON data and group the data by station.
  o You will then run an analysis to calculate some basic statistics about the data.
  o You will create at least three different types of visualizations of the bike station data you have retrieved and processed and save some plots.
• This assignment is the second part of a project where you will download some of the bike station data from https://data.cityofchicago.org/resource/eq45-8inv.json and store it in a MongoDB database.
• The final program will consist of multiple modules including: bike_station.py, downloader.py, model_data.py, and analysis.py.
• In this assignment you will be creating downloader.py, which downloads live bike station data from the City of Chicago and saves it to a MongoDB database running in the cloud, as described in the Assignment Instructions below.

Submitting Your Work

Please upload your initial submission on the assignment due date:
• A screen shot of your running the downloader.py application to produce the bike stations database/collection.
• A screen shot of your downloader.py code, showing all code.
• Neither screen shot should show your name.
• You can use https://carbon.now.sh/
to capture a screenshot of your code even if it is longer than the length of your screen.

Review assignments for two classmates within 4 days of the original assignment submission date:
• Submitting the reviews within this assignment will be the submission for the peer review assignment.
• See Code Reviewing for instructions on conducting a Code Review.

Revise your assignment and resubmit to: Assignment: JSON and MongoDB (6020 Final)
• Your actual downloader.py code file as a TEXT file (NOT a Jupyter Notebook file) - PLEASE add your name to the text file.
• A screen shot of your running the downloader.py application to produce the bike_stations database/collection.
• A screen shot of the bike_stations collection page in Atlas, showing the number of bike_station records downloaded.
• If you use ANY CODE you found in another student's assignment PLEASE
  o Add a comment to that piece of code indicating the source of the code e.g. "Lines 12-16 were adapted from code I reviewed"
  o Make sure not to use more than 10% of the code from another student.
• Please add a comment in the Canvas comments describing how you updated your code from the first draft (or if you did no updates).
• Please submit the entire assignment EVEN IF YOU MADE NO CHANGES FOLLOWING CODE REVIEW.

Overall Project Structure

Assignment Instructions

downloader.py
• The python code in the following lectures is a good starting point for coding this assignment.
  o Example: Reading JSON Via HTTP
  o Example: MongoDB and Atlas

Step 1: Select a set of Bike Stations to take data for
Go to Assignment: Choose Set of Bike Stations to Collect Data For and follow the instructions on the page to choose the Bike Stations you collect data for. We already did this and our Bike Stations are: 149, 150, 263, 335, 406. No need to choose a new set of Bike Stations if you already did this in the previous assignment.
Step 2: Download your first Bike Data
• You will need to edit the program to:
  o Download data for Chicago Bike Stations using the Bike Station URL instead of the Colorado Business data URL: https://data.cityofchicago.org/resource/eq45-8inv.json (don't forget to add the ? on the end so you can add parameters)
  o The API for the data can be found here: https://dev.socrata.com/foundry/data.cityofchicago.org/eq45-8inv
  o Instead of printing out data for the individual bike stations, extract the entire array from the JSON data returned - this is the array of bike station readings.
  o Use an appropriate python function to get the number of bike station readings in the list of bike station readings downloaded, to confirm you downloaded the number of readings you were expecting.
  o Ideally you should sort the Bike Station data so you get the most current bike station readings, as eventually you will want to download all of the bike station data for each of your bike stations for a period of time (a week, or a month) so you can analyze usage patterns.

Step 3: Write Data to MongoDB
• You are now ready to write your data to a MongoDB database!
  o If you haven't already done so, create your own MongoDB cluster on Atlas (see MongoDB Project Atlas) (I will give you my account and you will find the MongoDB Project Atlas attached).
  o To connect to your free tier cluster, first, you'll need to import the MongoClient class from PyMongo.
  o Next, you need to actually connect to your free tier cluster. You do that by instantiating a MongoClient object and specifying the URI for your cluster (copy the URI connection string for connecting your application from Atlas).
  o Remember that you will need to update the password in the connection string.
  o Once that is done, you should run the code to test whether or not you were successful (it works if there are no errors).
  o Before the start of your loop you will want to create a database to hold your bike data (e.g. bikesdb or stationdb), then create a collection to hold your bike station readings (e.g. bikedata).
  o Inside the loop, you will want to ADD the array you get from the JSON data to the MongoDB bikedata collection. Remember the JSON data contains multiple bike station readings, so you will want to use the command that allows you to add multiple items to a collection.

Step 4: Multiple Downloads
• You will be assigned at least 5 bike stations to download the most recent data for. This means you will need to download data at least 5 times - once for each bike station you are assigned.
• Check the total number of bike station readings downloaded for each bike station and generate an error if it is fewer than the target number (e.g. 1000).

Step 5: Expand the Assignment
• There are many things about Networked programs, JSON, Dates and MongoDB that were not covered in the lectures.
• Find one other feature or check you can do related to dates in the book(s) or using Google and add it to your code.
• Ideally the change should improve the automation, accuracy, efficiency or completeness of your code.
• Document what you did in the code comments as well as in the submission comments in Canvas.

Step 6: Downloading Data
• Once your code is working, you will want to delete your bike stations collection and then start your download for real. I suggest deleting the collection at this point because it is likely that you got duplicate data in your collection while you were debugging your code.
• Change your bike station ids to the values you were assigned.
• Change the limit to 5000 so you are downloading 5000 bike station readings at a time for the stations you were assigned.
• Rerun downloader.py until you get at least 25,000 bike station readings - 5000 for each station you were assigned.
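The download and storage steps above can be sketched roughly as follows. This is one possible arrangement, not the lecture code: the function names are hypothetical, the database and collection names reuse the examples from the text (bikesdb, bikedata), and filtering with a plain id= parameter together with $order and $limit is an assumption about how the query is built for this dataset. The MongoDB connection is only opened inside main, so the helper functions can be tried without a cluster.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://data.cityofchicago.org/resource/eq45-8inv.json"


def build_url(station_id, limit):
    """Build a query URL for the newest readings of one station."""
    params = {"id": station_id, "$order": "timestamp DESC", "$limit": limit}
    return BASE_URL + "?" + urllib.parse.urlencode(params, safe="$")


def fetch_readings(station_id, limit):
    """Download and decode the JSON array of readings (needs network access)."""
    with urllib.request.urlopen(build_url(station_id, limit)) as response:
        return json.loads(response.read().decode("utf-8"))


def save_readings(collection, readings, target):
    """Insert all readings in one call; fail if fewer than the target arrived."""
    if len(readings) < target:
        raise ValueError(f"expected {target} readings, got {len(readings)}")
    collection.insert_many(readings)
    return len(readings)


def main(uri, station_ids, limit=5000):
    # Imported here so the helpers above work without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)  # uri is the Atlas connection string
    collection = client["bikesdb"]["bikedata"]
    for station_id in station_ids:
        readings = fetch_readings(station_id, limit)
        count = save_readings(collection, readings, limit)
        print(f"Station {station_id}: saved {count} readings")
```

Calling main with the connection string and the list of assigned station ids would run one download per station. Keeping the URL construction and the insert in separate functions makes each piece easy to check on its own before touching the real collection.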
Step 7: Validate Download • • • After you download your data you will want to validate that the data has been written to MongoDB correctly. Verify that you have at least 5000 bike station readings in the dataset for each station ID using a MongoDB query with a filter. Verify that you have at least one month of data in the data set for each station ID using a MongoDB query to extract the bike station record with the earliest time. o You can sort the data to get the first document for a particular station by putting a sort statement in the parentheses of the find_one method then extract the "timestamp" value from the data and display it. Data Collection Options: • • • All students must collect 4,000 to 6,000 bike station readings for 5 different bike stations. Students will select different bike station groupings here Assignment: Choose Set of Bike Stations to Collect Data For The groups are for nearby bike stations so you can estimate the number of bikes available at any given time within a small area of Chicago. The professor solution.. we MUST change it because if he find out that we have his solution he will accuse me of cheating (we have to cahge the names some of the way to solve the assignment it is fine to made some small mistakes on purpose so he can not know that we have the answers) Peer Reviews: Peer Reviews Requirements: In this course, you will read your classmates’ code and give them comments about it. Although you can make comments about anything you think is relevant, the primary goal of this class is to learn how to write code that is safe from bugs, easy to understand, and ready for change. Read the code with those principles in mind. For each program you review, you should: • • Compare the code to the rubric to determine whether the submission meets all of the requirements set forth in the assignment and rubric (be sure to complete the rubric!). 
That is - did the student implement a program that provides all of the information it needs to incorporate? Make a comment by clicking on a line of code or selecting a range of lines, and typing your comment or by adding a text comment in the Canvas comment window. Things you might comment on include (see Code Reviewing guidelines for details): o Bugs or potential bugs: Repetitive code, Inconsistent indentation, Spelling or capitalization mistakes o Format: Format in Python is essential to program function. Format includes use of proper indenting, the use of whitespace and comments ... o Unclear, messy code: Bad variable or method names, poor comments... o Design Quality: The design chosen should be clear and concise. Is the solution chosen excellent, better than average, average or worse than other ways of approaching the given problem? Design quality problems might include: Repetitive code, Global variables, Convoluted control flow that could be simplified, etc. Follow the Guidelines outlined here: Code Reviewing In this course, you will read your classmates’ code and give them comments about it. This document describes the whys and hows of the code reviewing process. You can’t learn how to write without learning how to read – and programming is no exception. Code reviewing is widely used in software development, both in industry and in open source projects. Some companies like Google have instituted review-before-commit as a required policy. You can’t get a line of code into the Google source code repository unless another Google engineer has read it, given feedback about it, and signed off on it. The benefits of code review in practice are several. Reviewing helps find bugs, in a way that’s complementary to other techniques (like static checking, testing, assertions, and reasoning). Reviewing uncovers code that is confusing, poorly documented, unsafe, or otherwise not ready for maintenance or future change. 
Reviewing also spreads knowledge through an organization, allowing developers to learn from each other by explicit feedback and by example. So code reviewing is not only a practically important skill that you will need in the real world, but also a learning opportunity. What to Look For Although you can make comments about anything you think is relevant, the primary goal of this class is to learn how to write code that is safe from bugs, easy to understand, and ready for change. Read the code with those principles in mind. Here are some concrete examples of problems to look for: Bugs or potential bugs. • • • • • • • • • • • Repetitive code (remember DRY, Don’t Repeat Yourself). Disagreement between code and specification (your code does not do everything required). Relying on the assignment operator instead of the equality operator: (=) in stead of (==) Off-by-one errors: Remember that a loop doesn’t count the last number you specify in a range. So, if you specify the range [1:11], you actually get output for values between 1 and 10. Inconsistent indentation. Many Python features rely on indentation and getting it wrong can create a bug. Placing function calls in the wrong order when creating complex statements: Python always executes functions from left to right. So the statement MyString.strip().center(21, "*") produces a different result than MyString.center(21, "*").strip(). Misplacing punctuation: You can put punctuation in the wrong place and create an entirely different result. Remember that you must include a colon at the end of each structural statement. In addition, the placement of parentheses is critical. For example, (1 + 2) * (3 + 4), 1 + ((2 * 3) + 4), and 1 + (2 * (3 + 4)) all produce different results. Using the wrong capitalization: Python is case sensitive, so MyVar is different from myvar and MYVAR. Always check capitalization when you find that you can’t access a value you expected to access. 
Making a spelling mistake: Even seasoned developers suffer from spelling errors at times. Ensuring that you use a common approach to naming variables, classes, and functions does help. However, even a consistent naming scheme won’t always prevent you from typing MyVer when you meant to type MyVar. Optimistic, insecure programming. etc. Unclear, messy code. • • • • • • • Bad variable or method names. Convoluted control flow (if and while statements) that could be simplified. Packing too much into one line of code, or too much into one method. Failing to comment obscure code. Having too many trivial comments that are simply redundant with the code. Variables used for more than one purpose. etc. … Positive comments are also a good thing. Don’t be afraid to make comments about things you really like, for example: Unusually elegant code. Creative solutions. Great design. Process We will be using the Canvas Peer Reviewing system to allow you to review two student's code and provide helpful feedback. Here’s how the process will typically go. Only 80 point assignment code will be reviewed. • • • Initial due date: Submit a screen shot of assignment code and screen shot of code running to Canvas as separate files. 1-2 days after initial due date: you should visit Canvas, read the code that you were assigned, and make comments about them. For each piece of code you review, you should: o Compare the code to the rubric to determine whether the submission meets all of the requirements set forth in the assignment and rubric (be sure to complete the rubric!). That is - did the student implement a program that provides all of the information it needs to incorporate? o Make a comment by clicking on a line of code or selecting a range of lines, and typing your comment or by adding a text comment in the Canvas comment window. Things you might comment on include: Format: Format in Python is essential to program function. 
Format includes use of proper indenting, the use of whitespace and comments . Modularity in Design: Avoid accomplishing too many tasks in one function. Design Quality: The design chosen should be clear and concise. Is the solution chosen excellent, better than average, average or worse than other ways of approaching the given problem? Design quality problems might include: Repetitive code, Global variables, Convoluted control flow that could be simplified, Variables used for more than one purpose. o 2 days after initial due date: code reviews are due. Day after code review date: Look at your own code, and start revising your code for resubmission, and re-submit your work by the resubmission deadline, (the assignment link will disappear one week after the initial assignment was due). This approach to re-grades is sometimes referred to the “mastery approach”. You will get 50% credit for any improvements you make to your code after the initial submission. So, if your grade on the initial assignment would have been an 80% and you get 100% on the revision, you final grade will be 90%. If you got 50% on the initial submission and got 90% on the regrade, your final grade will be 70%. The code reviews can give you an idea of what your initial grade is likely to be BUT all assignments will be reviewed by a grader to determine both the initial and final grades on the assignment. Privacy & Visibility As a code author, you are anonymous. The system does not display your name with any of the code you wrote. Please don’t put your full name, username, email address, or other identifying information in your source code. As a reviewer, you are anonymous. The system does not displays your name BUT the instructor will be able to see your name when viewing the reviews. Respect Be polite. Sarcasm, insults, and belittling words have no place in a code review. It doesn’t matter whether you’re talking about a person (a fellow reviewer or a code author) or about code. 
Don’t call code “stupid,” because that transfers all too easily to the author of the code, whether you meant it that way or not. Be constructive. Don’t just criticize, but be helpful: point the way toward solutions. “Hopeless mess” is not a constructive comment; “name the variables more descriptively, e.g. tmp1 is not a great name” is much more constructive. As a code author, you should read your feedback carefully, and keep an open mind. Don’t get defensive. If your reviewers – who are fellow students, TAs and your instructor – find your code confusing, then you should consider what this says about its clarity and maintainability in the real world. If you disagree with a comment, you can indicate what you disagree with in a Canvas comment when you submit your final version of you code. FAQ for Reviewers Does reviewing affect my grade? Yes, it contributes to your Exercise/Peer review grade; The two reviews you do for a given assignment are worth 5 points (for two) just like a homework exercise. Does my reviewing affect other people’s grades? Not directly. The instructor will grade the initial and final assignment submission independent of your review. But your reviewing will hopefully help other students become better programmers and acquire a better understanding of the course, and indirectly improve their grades. You may also learn something about programming and the assignment expectations by completing a review. How many programs do I need to review, and how much do I need to do on each one? You will be assigned two reviews for the four 80 point assignments in the class. At a minimum, you must do at least two things on each program assigned to you – complete the rubric and make a comment. I don’t feel like I know anything. How can I possibly review other people’s code? You know more than you think you do. You can read for clarity, comment on places where the code is confusing, look for problems we read about or talked about in class, etc. 
What if I can’t find anything wrong?
You can write a specific positive comment about something good.
How much of the whole program do I have to try to understand?
You do not have to understand the entire program. You can look for the key things asked for in the programming rubric and click "Full Marks" if you find them and "No Marks" if you do not. This is useful because if you cannot find something and it is there, it may mean that the code you are reviewing is not clear. If you do see something you think is wrong, please comment to help the other student out.
How can I find out who wrote this code?
Code authors are anonymous. If there is a serious situation requiring the identity of the code author (e.g., a potential case of plagiarism), then bring it to the attention of the teaching staff.
FAQ for Code Authors
Can somebody use my code to cheat in the class/can I copy the code I review?
It is possible that a student could use ideas from the code they review in their final assignment submission. However, copying the work of current or past ISMG 4400/6020 students is still considered academic dishonesty, so never cut and paste another student's code into your program. If you figure out how to correct an error in your code by seeing it in another student's code, please do the following:
1. Edit your code to add the new piece of logic you got from another student
2. Add a comment before the code you added indicating "Lines 12-16 were adapted from code I reviewed"
3. Make sure not to use more than 10% of the code from another student.
4. Do not type the other student's code directly into your code.
5. Remember, partial credit on an assignment is always better than a 0.
Can I look at the reviews of my code while reviewing is still in progress?
Yes, you can see your reviews as soon as they are posted.
You will receive points for completing both reviews, including meaningful comments and correctly identifying the presence (or absence) of required program features using the assignment rubric.
Assignment 3
Assignment: Data Modeling & Secure Coding
• This is the third assignment in a series of assignments that utilizes bike station data from https://data.cityofchicago.org
o You will use object-oriented programming techniques to create a BikeStation object with a constructor and multiple methods.
o Run a data modeling process to extract the necessary bike station utilization information from the JSON data and group the data by station.
o You will then run an analysis to calculate some basic statistics about the data.
o You will create at least three different types of visualizations of the bike station data you have retrieved and processed and save some plots.
• The final program will consist of multiple modules including: bike_station.py, downloader.py, model_data.py, and analysis.py.
• In this part of the project you will be creating model_data.py, which will extract relevant data from the BikeStations stored in MongoDB and write them to a SQLite database. This assignment will also include an error-checking process to validate that the downloaded data has the correct format.
Submitting Your Work
Please upload your initial submission on the assignment due date:
• A screen shot of you running the model_data.py application to produce the index.sqlite database/collection, with some print statements to show updates are made every 100 or so records.
• A screen shot of your model_data.py code, showing all code.
• Neither screen shot should show your name.
• You can use https://carbon.now.sh/ to capture a screenshot of your code even if it is longer than the length of your screen.
Review assignments for two classmates within 4 days of the original assignment submission date:
• Submitting the reviews within this assignment will be the submission for the peer review assignment.
• See Code Reviewing for instructions on conducting a Code Review.
Revise your assignment and resubmit to: Data Modeling & Secure Coding (6020 Final)
• Your actual model_data.py code file as a TEXT file (NOT a Jupyter Notebook file). PLEASE add your name to the text file.
• A screen shot of you running the model_data.py application to produce the index.sqlite database/collection, with some print statements to show updates are made every 100 or so records.
• A screen shot of the index.sqlite database.
• If you use ANY CODE you found in another student's assignment, PLEASE
o Add a comment to that piece of code indicating the source of the code, e.g. "Lines 12-16 were adapted from code I reviewed"
o Make sure not to use more than 10% of the code from another student.
• Please add a comment in the Canvas comments describing how you updated your code from the first draft (or if you made no updates).
• Please submit the entire assignment EVEN IF YOU MADE NO CHANGES FOLLOWING CODE REVIEW.
Project Structure
model_data.py
• model_data.py reads the rough/raw data from the MongoDB downloads and produces a cleaned-up and well-modeled version of the data in the file index.sqlite.
• The file index.sqlite will be much smaller (often 10X smaller for data sets with more data attributes) than the raw data because it only contains the data you will want to use in your data analytics tasks, and should not include data that can be calculated from other data in the file.
• Each time model_data.py runs, it should completely wipe out and rebuild the index.sqlite database, allowing you to adjust its parameters and edit the mapping tables in index.sqlite to tweak the data modeling process.
• It should store the data necessary to create a BikeStation object (Assignment 1) as downloaded from the MongoDB data (Assignment 2).
• It should do some validation on the data.
• Running model_data.py can take quite a bit of time because it loops through every record and extracts relevant fields from the data. It will run faster if you do not write every item to the screen and only commit the data to the database every 50 to 100 records.
• The Python code in the following lectures is a good starting point for coding this assignment.
o Example: Python and Databases (I will attach the lecture)
o Example: MongoDB and Atlas (I will attach the lecture)
Reading and Writing Data
• You will need to update the SQL queries to create and insert bike station data into a database table called bike_stations, or some other reasonable name.
• It should store the data necessary to create a BikeStation object (Assignment 1).
• Instead of reading raw JSON data from the web as done in the example, your program will connect to your MongoDB database and collection created in Assignment 2.
• Then you can loop through all of the documents in your bike_station collection, extract the relevant data from the dictionary object for each bike station, and insert it into the database table.
Secure Coding
• It should extract the data from the MongoDB data and store it in individual variables for each piece of data you are writing to the database.
• It should cast numeric variables with int() or float() and break out of the data processing loop if the data is not in the correct format.
• It should convert the ISO-formatted date to a datetime to validate it as well, BUT since you cannot store Date/Time data natively in SQLite, you should store the data in the database as either an ISO-formatted date or a numeric timestamp.
• It should use a parameterized query to minimize the opportunity for SQL injection.
Expanding the Assignment
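The Reading and Writing Data and Secure Coding requirements above can be sketched roughly as follows. This is only a minimal illustration, not the lecture code: the field names (`id`, `station_name`, `available_bikes`, `latitude`, `longitude`, `timestamp`), the column layout, and the helper names `validate_station` and `build_database` are all assumptions made for the example. `documents` can be any iterable of dictionaries, such as the cursor returned by a pymongo `collection.find()` call.

```python
import sqlite3
from datetime import datetime

COMMIT_EVERY = 100  # assumption: commit in batches of 100, per the 50-100 guidance


def validate_station(doc):
    """Cast each field into its own variable; raises if a value is malformed.

    All dictionary keys here are hypothetical; use the keys in your own data.
    """
    station_id = int(doc["id"])
    station_name = str(doc["station_name"])
    available_bikes = int(doc["available_bikes"])
    latitude = float(doc["latitude"])
    longitude = float(doc["longitude"])
    # Parsing validates the date; note that fromisoformat() is strict about
    # the accepted layout on older Python versions (e.g. no trailing "Z").
    timestamp = datetime.fromisoformat(doc["timestamp"])
    # SQLite has no date type, so store the validated value as ISO text.
    return (station_id, station_name, available_bikes,
            latitude, longitude, timestamp.isoformat())


def build_database(documents, db_path="index.sqlite"):
    """Wipe and rebuild the table, returning the number of rows written."""
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS bike_stations")
    cur.execute(
        """CREATE TABLE bike_stations (
               station_id INTEGER, station_name TEXT, available_bikes INTEGER,
               latitude REAL, longitude REAL, timestamp TEXT)"""
    )
    count = 0
    for doc in documents:
        try:
            row = validate_station(doc)
        except (KeyError, TypeError, ValueError) as err:
            # Break out of the processing loop on the first bad record.
            print(f"Stopping after {count} rows, bad record: {err!r}")
            break
        # Parameterized query: values are bound, never pasted into the SQL.
        cur.execute("INSERT INTO bike_stations VALUES (?, ?, ?, ?, ?, ?)", row)
        count += 1
        if count % COMMIT_EVERY == 0:
            conn.commit()
            print(f"{count} records committed")
    conn.commit()
    conn.close()
    return count
```

Keeping validation in its own function keeps the loop short and makes it easy to add further checks later.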
• As with prior assignments, look for ways you can improve your code that may not have been covered in the lectures.
• Find one other feature or check you can do based on information discussed in class, in the book(s) or using Google and add it to your code.
• Ideally the change should improve the automation, accuracy, efficiency or completeness of your code.
• Document what you did in the in-code comments as well as in the submission comments in Canvas.
The Final Product
• When you are done, you will have a nicely indexed version of the bike station data in index.sqlite.
• This is the file to use to do data analysis (project 4).
The professor solution.. we MUST change it because if he find out that we have his solution he will accuse me of cheating (we have to cahge the names some of the way to solve the assignment it is fine to made some small mistakes on purpose so he can not know that we have the answers)
Assignment 4
Assignment: Data Analysis and Visualization (6020 Initial)
• This is the fourth and final assignment in a series of assignments that utilizes bike station data from https://data.cityofchicago.org
o You will use object-oriented programming techniques to create a BikeStation object with a constructor and multiple methods.
o Run a data modeling process to extract the necessary bike station utilization information from the JSON data and group the data by station.
o You will then run an analysis to calculate some basic statistics about the data.
o You will create at least three different types of visualizations of the bike station data you have retrieved and processed and save some plots.
• The final program will consist of multiple modules including: bike_station.py, downloader.py, model_data.py, and analysis.py.
• In this part of the project you will be creating analysis.py, which will extract relevant data from the BikeStations stored in an SQLite database and calculate some basic statistics on the data using either NumPy or Pandas.
• You will then create at least three different types of visualizations of the data you have retrieved and processed and save some plots. For example:
o Either a bar chart or a stacked bar chart to visualize the average number of bikes (or docks) available by bike station or by hour of the day or day of the week.
o A histogram of the numbers of bikes available in the data set.
o A scatter plot to show how the number of bikes available is changing over time.
Submitting Your Work
Please upload your initial submission on the assignment due date:
• A screen shot of you running the analysis.py application to compute basic histogram data and statistics on the messages you have retrieved.
• A .png file of each of the plots you made with analysis.py.
• A screen shot of your analysis.py code, showing all code.
• None of the screen shots should show your name.
• You can use https://carbon.now.sh/ to capture a screenshot of your code even if it is longer than the length of your screen.
Review assignments for two classmates within 4 days of the original assignment submission date:
• Submitting the reviews within this assignment will be the submission for the peer review assignment.
• See Code Reviewing for instructions on conducting a Code Review.
Revise your assignment and resubmit to: Assignment: Data Analysis and Visualization (6020 Final)
• Your actual analysis.py code file as a TEXT file (NOT a Jupyter Notebook file). PLEASE add your name to the text file.
• A screen shot of you running the analysis.py application to compute basic histogram data and statistics on the messages you have retrieved.
• A .png file of each of the plots you made with analysis.py.
• If you use ANY CODE you found in another student's assignment, PLEASE
o Add a comment to that piece of code indicating the source of the code, e.g. "Lines 12-16 were adapted from code I reviewed"
o Make sure not to use more than 10% of the code from another student.
• Please add a comment in the Canvas comments describing how you updated your code from the first draft (or if you made no updates).
• Please submit the entire assignment EVEN IF YOU MADE NO CHANGES FOLLOWING CODE REVIEW.
Project Structure
analysis.py
The Python code in the following lectures is a good starting point for coding this assignment.
• Example: Analysis and Visualization (I will attach the lecture)
• Exercise: Pandas Solution (I will attach the lecture)
Step 1:
• Add the appropriate import statements to allow you to use the SQLite database, statistics and plotting modules.
• Make the database connection and run a select query.
• Get data from the database and add it to dictionaries and/or lists so it can be used in analysis.
• You can either put data values into individual lists for each column of data in the database OR create a list of BikeStation objects using the BikeStation class created in the OOP assignment.
Step 2: Statistical Analysis
Statistical analysis can be done using either NumPy or Pandas.
• If you created individual lists/dictionaries containing data you should:
o Convert lists to NumPy arrays
o Calculate basic statistics around average bikes available and docks in service by station, using a NumPy filter to filter the NumPy arrays by station
o Print the statistics to the screen
• If you created a list of BikeStation objects you will need to convert the list of BikeStation objects to a Pandas DataFrame.
o See https://stackoverflow.com/questions/47623014/converting-a-list-of-objects-to-a-pandas-dataframe for an example of how to use a custom method to turn your object into a dictionary AND use a list comprehension to create a list of BikeStation dictionaries to create your DataFrame.
o Create a Pandas pivot table to calculate means for average bikes available and docks in service by station and display the pivot table on the screen.
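The Pandas route in Step 2 might look something like the sketch below. The `BikeStation` class here is a hypothetical stand-in with only three attributes, and `to_dict` is an assumed name for the custom method; the real class from the OOP assignment will have its own fields. The sample readings are invented purely to make the example runnable.

```python
import pandas as pd


class BikeStation:
    """Hypothetical stand-in for the BikeStation class from the OOP assignment."""

    def __init__(self, station_name, available_bikes, docks_in_service):
        self.station_name = station_name
        self.available_bikes = available_bikes
        self.docks_in_service = docks_in_service

    def to_dict(self):
        # Custom method that exposes the object's data as a plain dictionary.
        return {
            "station_name": self.station_name,
            "available_bikes": self.available_bikes,
            "docks_in_service": self.docks_in_service,
        }


# Made-up readings; in analysis.py these would come from index.sqlite.
readings = [
    BikeStation("Clark St", 4, 15),
    BikeStation("Clark St", 8, 15),
    BikeStation("State St", 10, 19),
]

# List comprehension of dictionaries -> one DataFrame row per reading.
df = pd.DataFrame([station.to_dict() for station in readings])

# Mean bikes available and docks in service, grouped by station.
pivot = pd.pivot_table(
    df,
    index="station_name",
    values=["available_bikes", "docks_in_service"],
    aggfunc="mean",
)
print(pivot)
```

Each column of `df` (for example `df["available_bikes"]`) is a Pandas Series that can later be passed to the plotting functions.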
Step 3: Generating plots
• You can use the lists of data created for your statistical analysis to generate the plots.
• If you are using a Pandas DataFrame, you can get a Pandas Series from the DataFrame for each column in the data frame, and those series can be used just like a list in matplotlib.pyplot.
• Create at least three different types of plots using the ideas below:
o Create a bar chart showing the average bikes and/or docks available by bike station (can also be made directly from the Pandas DataFrame)
o Create a bar chart showing the average bikes and/or docks available by day of the week (Monday, Tuesday ...) or hour of the day. This will require you to convert the timestamp to a datetime so you can create a new list of data for the hour/day associated with each BikeStation reading.
o Create a histogram counting the number of bikes available (or docks available) for each 10 minute measurement in the data set.
o Create a histogram counting the number of bikes available (or docks available) for each 10 minute measurement in the data set for an individual BikeStation.
o Create a scatter plot of the number of bikes available or docks in service for each BikeStation over time.
o Calculate the average bikes available by day for each BikeStation and create a line plot showing the change over time.
• Create a title, xlabel and ylabel for each of your plots.
• Make sure the scale used is appropriate for the plot.
• Call savefig for each of your plots and save as a png.
Step 4: Expanding the Assignment
• As with prior assignments, look for ways you can improve your code that may not have been covered in the lectures.
• Find one other analysis or visualization you can do based on information discussed in class, in the book(s) or using Google and add it to your code.
• Ideally the change should improve the analysis you did or the ability of the user to understand the data.
• Document what you did in the in-code comments as well as in the submission comments in Canvas.
The professor solution.. we MUST change it because if he find out that we have his solution he will accuse me of cheating (we have to cahge the names some of the way to solve the assignment it is fine to made some small mistakes on purpose so he can not know that we have the answ
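For the Step 3 idea of a bar chart by hour of the day, one possible sketch is shown below. It assumes the timestamps were stored as ISO-formatted text; the function names `average_by_hour` and `plot_hourly_average` and the default output file name are made up for this illustration.

```python
from collections import defaultdict
from datetime import datetime

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so savefig works without a display
import matplotlib.pyplot as plt


def average_by_hour(timestamps, bikes):
    """Group readings by hour of the day and return {hour: mean bikes}."""
    by_hour = defaultdict(list)
    for stamp, count in zip(timestamps, bikes):
        # Convert the stored ISO text back to a datetime to get the hour.
        by_hour[datetime.fromisoformat(stamp).hour].append(count)
    return {hour: sum(v) / len(v) for hour, v in sorted(by_hour.items())}


def plot_hourly_average(averages, out_path="bikes_by_hour.png"):
    """Draw a labelled bar chart of the hourly averages and save it as a png."""
    fig, ax = plt.subplots()
    ax.bar(list(averages.keys()), list(averages.values()))
    ax.set_title("Average bikes available by hour of day")
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Average bikes available")
    ax.set_ylim(bottom=0)  # start the scale at zero so bar heights are comparable
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```

Using `datetime.weekday()` instead of `.hour` in the grouping step would give the day-of-week variant of the same chart.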

Explanation & Answer

Please vi...

