ECS32A (UC Davis): Security Threat Analysis in Python Project

Background

This zyLab will show you some of the actual power of the Python language in analyzing real security log files. Right now, the operating system (OS) you are using to view this zyLab is creating entries in various log files, keeping track of errors that occur, hardware events, and the like. Security logs keep track of successful and failed logins for a UC Davis research server. For this zyLab, you will be using Python to analyze 2 pre-processed security log files showing failed login attempts for this system.

Here is an example of a line from the log:

<code>May 20 05:11:17 pear sshd[22277]: Invalid user prashant from 182.254.184.247</code>

Prashant is the name of a Professor in the CS department who could have mistakenly thought this was one of his systems, but the issue here is that this login traces back to a part of the world where Prof. Mohapatra definitely wasn't during the time of the attempt. These attackers are getting smarter, and your mission, should you choose to accept it, is to help us track them.

Deconstructing this line, May 20 05:11:17 is the time of the attack, prashant is the attempted username, and 182.254.184.247 is the IP address. Machines on the internet send data to and from various IP addresses. In some ways, it is the internet analogue of a mailing address. (The tricky part is that, unlike mailing addresses, IP addresses are often changed).
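
The deconstruction above can be sketched in a few lines of Python. This is only an illustration: it assumes every log entry follows the layout of the sample entry from the lab handout, and the variable names are hypothetical.

```python
# Sample entry from the lab handout; real entries are assumed to share this layout.
line = "May 20 05:11:17 pear sshd[22277]: Invalid user prashant from 182.254.184.247"

fields = line.split()              # split on whitespace into a list of strings
timestamp = " ".join(fields[:3])   # month, day, and clock time, rejoined with spaces
username = fields[-3]              # third field from the end
ip_address = fields[-1]            # last field

print(timestamp)
print(username)
print(ip_address)
```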

Building an Attacker Profile

The attackers of this system are using a variety of automation tools to "guess" a correct username and password combination. To see whether our security system (fail2ban) is working effectively, we are going to keep track of the usernames used by each attacker, the time they started attacking, and the time they stopped attacking.

Modules and Organization

logreader.py

This assignment consists of two modules. logreader.py will only contain the implementation of a single function, but has been split off because we want to keep all the log parsing in a single module. There will be no code outside of its single function definition: the function extract_attacks. This function takes a string filename as its only parameter, and returns a nested dictionary of lists of attackers, implemented as in the following example:

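<code>{'182.254.184.247': {'usernames': ['prashant', 'bob'], 'times': ['May 20 04:38:13', 'May 20 04:42:01']}}</code>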

The above is a dictionary of a dictionary of lists. Here is the reasoning behind this confusing structure:

  1. We need to index the outermost data structure by the IP address, because that is the most relevant identifier.
  2. Then, we need to store 2 lists: one stores the usernames used by the attacker, and the other stores the times of attack.

The usernames and times keys are the "wasted" keys we discussed in class; they exist only to distinguish the 2 lists for each IP address in the outer dictionary.

In summary, extract_attacks will use the .split() and .join() string methods to parse the log files into dictionaries as above. Be careful to handle the case of a new entry differently than an entry that already exists. In one case, the lists must be created, but in the other, they need only be appended to.
Note: Use negative indices on the list of strings returned by .split() to isolate the IP address and username.
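
One possible shape for extract_attacks is sketched below. It is not the official solution: it assumes each line of the pre-processed log starts with a three-part timestamp and ends with "user NAME from IP", and the blank-line check is an added precaution rather than a stated requirement.

```python
def extract_attacks(filename):
    """Return {ip: {'usernames': [...], 'times': [...]}} built from one log file."""
    attacks = {}
    with open(filename) as log_file:
        for line in log_file:
            fields = line.split()
            if not fields:                     # assumption: skip blank lines
                continue
            time = " ".join(fields[:3])        # e.g. 'May 20 05:11:17'
            username = fields[-3]              # negative indices, as the note suggests
            ip = fields[-1]
            if ip not in attacks:
                # New attacker: the two lists must be created.
                attacks[ip] = {"usernames": [username], "times": [time]}
            else:
                # Known attacker: the lists only need to be appended to.
                attacks[ip]["usernames"].append(username)
                attacks[ip]["times"].append(time)
    return attacks
```

The if/else at the end is the "new entry versus existing entry" distinction described above.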

Suppose you name the above dictionary attacks. It will function as follows:
attacks['182.254.184.247'] will return the sub-dictionary for the given IP address:

<code>{'usernames': ['prashant', 'bob'], 'times': ['May 20 04:38:13', 'May 20 04:42:01']}</code>

attacks['182.254.184.247']['usernames'] will return just the list of usernames for the given IP address:

<code>['prashant', 'bob']</code>
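
As a quick illustration of the two lookups, the snippet below rebuilds the handout's example dictionary by hand and indexes into it. The variable names profile and usernames are hypothetical.

```python
# The example dictionary from the handout, typed in by hand.
attacks = {
    "182.254.184.247": {
        "usernames": ["prashant", "bob"],
        "times": ["May 20 04:38:13", "May 20 04:42:01"],
    }
}

profile = attacks["182.254.184.247"]                  # the sub-dictionary
usernames = attacks["182.254.184.247"]["usernames"]   # just the list of usernames

print(profile["times"])
print(usernames)
```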

threat_analysis.py

This module will hold the rest of the functions that make up your program. The template file included with this assignment has the function stubs that the program will use.

As in previous assignments, the main loop of the program will go in the main() function and it will allow the user to choose options. Those options are implemented as individual functions. The functions to be implemented are as follows:

get_max_attacks(dict_par)

In all of these functions, dict_par is just the dictionary described above. This function will go through the dictionary looking for the attacker with the longest list of usernames and return both their IP address and that list of usernames to the caller. You will need a for loop, and you will have to manually implement the logic to find the max as we did in class. Keep track of the length of the longest list of usernames and the IP address with the longest list of usernames, at a minimum.
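
A minimal sketch of the manual max search follows. Returning the two values as a tuple and letting the first attacker win a tie are assumptions; the handout only says the function returns both values.

```python
def get_max_attacks(dict_par):
    """Return (ip, usernames) for the attacker with the longest usernames list."""
    max_ip = None
    max_usernames = []
    max_length = -1                      # smaller than any real list length
    for ip in dict_par:
        usernames = dict_par[ip]["usernames"]
        if len(usernames) > max_length:  # strictly greater: the first attacker wins ties
            max_length = len(usernames)
            max_ip = ip
            max_usernames = usernames
    return max_ip, max_usernames


# Hypothetical data using documentation-range IP addresses.
sample = {
    "192.0.2.1": {"usernames": ["admin"], "times": ["May 20 01:00:00"]},
    "192.0.2.2": {"usernames": ["bob", "test"], "times": ["May 20 02:00:00", "May 20 02:05:00"]},
}
print(get_max_attacks(sample))
```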

get_start_time(ip_str, dict_par), get_end_time(ip_str, dict_par)

In all of these functions except print_ips_by_username(), ip_str is just the IP address of the hacker, represented as a string. Each of these two functions can be implemented as a single line. Once you isolate the list of times for the given hacker, use the min() function to return the earliest date string (the start time) or the max() function to return the latest (the end time). (The datetime module isn't required for this because "Apr" comes alphabetically before "May," and the numbers that follow will also rank alphabetically, but in the general case, yes, you would want to use datetime.)
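
The two one-liners might look like the sketch below, which relies on plain string comparison exactly as the paragraph above describes.

```python
def get_start_time(ip_str, dict_par):
    """Earliest time string for the given IP (string comparison, not datetime)."""
    return min(dict_par[ip_str]["times"])


def get_end_time(ip_str, dict_par):
    """Latest time string for the given IP (string comparison, not datetime)."""
    return max(dict_par[ip_str]["times"])


# Hypothetical data: one April entry and two May entries.
sample = {
    "192.0.2.1": {
        "usernames": ["admin", "bob", "test"],
        "times": ["May 20 04:38:13", "Apr 30 23:59:59", "May 20 04:42:01"],
    }
}
print(get_start_time("192.0.2.1", sample))
print(get_end_time("192.0.2.1", sample))
```

String comparison happens to work for April and May, but it would sort "Jun" before "May", which is why datetime is the safer choice in general.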

print_hacker_info(ip_str, dict_par)

For this one, the list of usernames, the start time, and the end time for a given attacker should print as in the following example:

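<code>
Usernames: uuu httpd thom test lou admprod admin zeta yq to test factorio usuario ubuntu sv fu guest jaxson upload mc webuser cp tt live mou service mhr test test8 cmschef prashant sao alfresco ecomode tw pp tyler netdump jessica nin movies hadoop han user1 wildfly pc1 cemergen vd
Start Time: May 20 01:17:58
End Time: May 20 07:39:54
</code>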

This function should also return the dict_par value (the sub-dictionary) for the given IP address.
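
A rough sketch of print_hacker_info is shown below. The exact spacing and line breaks of the expected output are an assumption, so compare against the lab's tests before relying on it. To stay self-contained it calls min() and max() directly instead of get_start_time() and get_end_time().

```python
def print_hacker_info(ip_str, dict_par):
    """Print the usernames, start time, and end time; return the sub-dictionary."""
    info = dict_par[ip_str]      # raises KeyError if the IP is not in the dictionary
    print("Usernames:", " ".join(info["usernames"]))
    print("Start Time:", min(info["times"]))
    print("End Time:", max(info["times"]))
    return info


# The example dictionary from the handout.
sample = {
    "182.254.184.247": {
        "usernames": ["prashant", "bob"],
        "times": ["May 20 04:38:13", "May 20 04:42:01"],
    }
}
returned = print_hacker_info("182.254.184.247", sample)
```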

print_ips_by_username(username, dict_par)

This function just prints out all the IP addresses that had a given username in their usernames list. I recommend printing them separated by two tabs (end='\t\t'), but it's up to you. It also returns the IPs as a list. If the username is not found, it will raise a ValueError with the string Username not found. Otherwise, it's just a simple little for loop function.
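
Here is one way the function could be written, offered as a sketch rather than the reference solution. Collecting the matches first and printing afterwards is a design choice, so that nothing is printed when the ValueError is raised.

```python
def print_ips_by_username(username, dict_par):
    """Print and return every IP whose usernames list contains username."""
    matches = []
    for ip in dict_par:
        if username in dict_par[ip]["usernames"]:
            matches.append(ip)
    if not matches:
        raise ValueError("Username not found")
    for ip in matches:
        print(ip, end="\t\t")    # two tabs between addresses, as recommended
    print()                      # finish the line
    return matches


# Hypothetical data using documentation-range IP addresses.
sample = {
    "192.0.2.1": {"usernames": ["admin", "test"], "times": ["May 20 01:00:00", "May 20 01:01:00"]},
    "192.0.2.2": {"usernames": ["test"], "times": ["May 20 02:00:00"]},
}
found = print_ips_by_username("test", sample)
```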

main()

Finally, the main program function. It does the following:

  1. Creates a threats dictionary using the extract_attacks function. In order to include results from both files, use the .update() dictionary method to merge dictionaries.
  2. Presents the user with the following menu in an input-while loop:
<code>
MENU
m - Get the attacker with the max attacks
i - Get the attacker by IP address
u - Get the attackers by username
q - Quit
</code>
  3. Implements the options with function calls.

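
The merge in step 1 can be illustrated with two small hand-written dictionaries standing in for the results of calling extract_attacks on each log file. The IP addresses and values are hypothetical.

```python
# Stand-ins for the dictionaries extract_attacks would return for the two log files.
first = {
    "192.0.2.1": {"usernames": ["admin"], "times": ["May 19 10:00:00"]},
    "192.0.2.2": {"usernames": ["bob"], "times": ["May 19 11:00:00"]},
}
second = {
    "192.0.2.1": {"usernames": ["root"], "times": ["May 20 09:00:00"]},
    "192.0.2.3": {"usernames": ["test"], "times": ["May 20 09:30:00"]},
}

threats = {}
threats.update(first)     # copy in every entry from the first file
threats.update(second)    # add the second file's entries

print(sorted(threats))
print(threats["192.0.2.1"]["usernames"])
```

Note that .update() replaces the whole sub-dictionary when the same IP address appears in both dictionaries, so in this example the entry for 192.0.2.1 ends up holding only the second file's lists.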
The m option:

This option just prints the IP address and usernames from the get_max_attacks() function, separated by tabs.

The i option:

This option will prompt the user for the IP address with:

<code>Enter the IP address: </code>

and then it just calls print_hacker_info() with the given IP address.

The u option:

This option will prompt the user for the username with:

<code>Enter the username: </code>

and then just calls print_ips_by_username() with that username.

Error Handling

For the i option, you might get a KeyError on the call to print_hacker_info(), so put that call in a try-except construct that handles the KeyError by printing IP not found. You do not need to raise the KeyError.

For the u option, first edit print_ips_by_username() to raise a ValueError with the message Username not found if, indeed, the username does not occur in any of the username lists. Then, back in the main() function, use another try-except construct to handle the error by printing its message.
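
The two try-except constructs might be arranged as in the sketch below. The helper handle_option and its ask parameter are hypothetical, introduced only so the example can run without keyboard input; in the assignment this logic would sit inside the input-while loop of main(). The two functions at the top are compact stand-ins for the real ones.

```python
def print_hacker_info(ip_str, dict_par):
    """Compact stand-in: indexing raises KeyError for an unknown IP."""
    info = dict_par[ip_str]
    print("Usernames:", " ".join(info["usernames"]))
    return info


def print_ips_by_username(username, dict_par):
    """Compact stand-in: raises ValueError when no attacker used the username."""
    ips = [ip for ip in dict_par if username in dict_par[ip]["usernames"]]
    if not ips:
        raise ValueError("Username not found")
    print("\t\t".join(ips))
    return ips


def handle_option(choice, threats, ask=input):
    """Run the i or u menu option once (hypothetical helper)."""
    if choice == "i":
        ip_str = ask("Enter the IP address: ")
        try:
            print_hacker_info(ip_str, threats)
        except KeyError:
            print("IP not found")
    elif choice == "u":
        username = ask("Enter the username: ")
        try:
            print_ips_by_username(username, threats)
        except ValueError as error:
            print(error)         # prints the message carried by the exception


# Hypothetical data; the lambdas stand in for what a user would type.
threats = {"192.0.2.1": {"usernames": ["admin"], "times": ["May 20 01:00:00"]}}
handle_option("i", threats, ask=lambda prompt: "203.0.113.9")
handle_option("u", threats, ask=lambda prompt: "nobody")
```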

Unformatted Attachment Preview

LAB ACTIVITY 12.8.1: Security Threat Analysis in Python (0 / 23 points)
Due: 06/10/2019, 1:00 PM
Downloadable files: threat_analysis.py, logreader.py, hackers1.log, and hackers2.log