PHYSICS151 UC Davis Python Linear Regression Problem

User Generated

ZNGGEBLPR

Programming

PHYSICS151

Description

Solve the entire HW problem. Using python to solve the linear regression problem. And functions are given,

Unformatted Attachment Preview

CS410P SUMMER 2019 DUE: 6/16/19 ASSIGNMENT 4P Purpose The purpose of this assignment is to have you work on a mathematical problem/analysis with lists and functions and file input, along with working with multiple files for creating your program/source code. Scenario When the relationship between two related quantities appears linear, a linear regression equation will give the best fit to that data. The context of this assignment is that you will compute a linear regression equation to fit a straight line to load deflection data for a mechanical coil spring. The computed equation will satisfy the least squares criterion. That is, it will minimize the sum of the squares of the deviations of the observed deflections from those predicted by the equation. We start with a set of experimental data which plots the relationship between two quantities, the weight of a load on a spring and the corresponding compression of the spring, Fitting a curve to known data enables estimation of spring deflections for other loads not in the data. This curve can be the basis for calibrating a spring scale. The data consists of a collection of experimentally obtained pairs (xi, yi) with i = 0, 1, 2, …., n-1. The xi s (independent variables) are the weights; the yi s (dependent variables) are the corresponding deflections. We want to find a (linear) function f such that f(xi) f(yi). A) Least Squares Method The method of least squares, a technique used for determining the equation of a straight line for a set of data pairs, will calculate coefficients m and b in the following linear regression equation: f(x) = mx + b When the original xi values are substitued into the equation, estimate f(xi) values are obtained. For each xi, the residual is defined to be the difference beween the observed yi and the estimated f(xi). Thus in the least squares method the sum of the square of the residuals will be minimized. Algorithm For the n data pairs (xi,yi) with i = 0,, 1, …., n-1 the slope m and the intercept b, for the least squares linear approximation f(x) = mx + b are calculated in the following way: Page 1 of 5 After printing the information about the data file and number of data points, your program will compute the m and b values and print these coefficients followed by the original xi, yi, estimated f(xi) values, residuals. Next we want to compute and print the sum of squared residuals, the total sum of squares and the coefficient of determination as determined by the following formula: B) Pearson Method The Pearson Coefficient will be a number between -1 and 1, where a value close to 1 indicates good correlation, values close to 0 will indicate no correlation and values close to -1 indicates negative correlation. The Pearson method is not used for line fitting but just to indicate what kind of correlation exists between the pairs of (x,y) values. Algorithm For the n data pairs (xi,yi) with i = 0,, 1, …., n-1 the Pearson Correlation Coefficient r is found by using the following formula: The Pearson Coefficient will be a number between -1 and 1, where a value close to 1 indicates good correlation, values close to 0 will indicate no correlation and values close to -1 indicates negative correlation. For the Pearson method all we need to is to print the above “r” value which is the Pearson correlation coefficient which will be a number between -1 and 1. Input The input is assumed to be coming from a file. The input consists of a number of pairs of values, where each pair contains two floating point values which represent a load on the spring and the resulting compression distance, respectively. You should read the data into two lists one for x and one for y. These are “parallel” lists meaning that you have exactly the same number of items in both lists, and the corresponding element of each lists make up an (x, y) pair. Output Output from the Least Squares method is printed first followed by the Pearson method. While your output does not have to look exactly like mine but it should be as complete and as easy to read. For the Pearson method we are simply interested in seeing the final Pearson Correlation Coefficient. Requirements    Using the two lists containing the input (x, y) pairs, your program must eventually create two other lists – one which stores the corresponding f(x) value for each (x, y) pair, and another which stores the corresponding residual values for each (x, y) pair. In other words you end up having 4 parallel lists in your program. You cannot use any library (or numpy) functions for computing the regression coefficients for any methods. You will need to compute the equations in your program. You must use at least the following functions: o o o o o    A function that reads in the data to the list (argument: data file name, return values are the two lists storing x and y values) A function that computes both the y-intercept and slope (arguments: lists containing x and y values; return values: y-intercept and slope) A function that computes two lists: f(x) and residuals (arguments: lists containing x and y values, slope and y-intercept (computed from the function above); return values: list containing f(x), list containing residuals, a numeric value for the regression coefficient which is the sum of squared residuals) and a numeric value for the total sum of squares. (Optionally you can create separate functions for computing and returning the sum of squared residuals and total sum of squares and return them appropriately) A function that computes the Pearson correlation coefficient (arguments: lists containing x and y values; return value: Pearson coefficient r) All function definitions must have at least have a block comment describing what they do. Printing (output) must be done only in your main function Other than your main function and a function that reads data from your input file into your lists, all your other functions must be written in a separate Python file. o Let’s say your file that contains all the functions is called functions.py o To be able to include all your functions from this functions.py file in your main program/file you can use the import statement as follows: import functions as f o The “f” at the end allows you to abbreviate your functions module to be specified easily while calling any function from that file. For example if compute_sxy() is a function defined in your functions.py file then you can call this function from your main program file as follows (assuming this function returns one value back) result = f.compute_sxy(xvalues, yvalues)   The file containing your main() function where you control the flow of your program by seeking input from the user (for filename), calling the function to read data from the input file, followed by calling appropriate functions to compute the results. All input and output must be done only in this file. Work on the program using the modularity of the functions As always test your program carefully and follow good programming style. Sample Output Grade Key: A 4 parallel lists created one each for x, y, fx, residual 8 B Minimum of 4 functions created with comments in a separate file (4 points for creating a separate file with all the functions related to regression computation) Printing done only in main() function, output format reasonable 12 40 E Accuracy of results: slope, intercept, fx, residual – 5 points each; sum of squared residuals; total sum of squares, coefficient of determination – 3 points each – 3 points each, Pearson coefficient – 11 points (-25 if library functions are used to compute any coefficients) Program works correctly for each input file F G Programming style, modularity (arguments, return values of functions) Late Penalty 8 C D 20 12
Purchase answer to see full attachment
User generated content is uploaded by users for the purposes of learning and should be used following Studypool's honor code & terms of service.

Explanation & Answer

I updated the c...

Similar Content

Related Tags