Linear Regression Functions Using Two Models Python Coding Project

Description

  • We have a few data files, each containing a set of values for two variables (say x and y), one pair per line. We want to determine whether there is a linear relationship between them.
  • You are given a set of regression functions in a module. Apply the algorithms and, for the data in each file, plot the linear relationship, if any. Also print the results of your numerical computations as you go along, so that the graphical results can be checked against the numerical results.
  • The program uses two techniques: (1) the Method of Least Squares, which gives a Coefficient of Determination measuring how linear the relationship is and also lets us fit a line along a possible linear path; (2) the Pearson technique, which gives a Correlation Coefficient measuring the linearity of the relationship.
  • The four data files are in1, in2, in3 and in4. Run the program against a data file and observe the results.
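
The regression module supplied with the assignment is not shown here, so the following is only a minimal sketch of what the two techniques compute, written in plain Python. The function names (`least_squares`, `pearson`, `report`) and the demo data are assumptions made for illustration; the report labels are modeled on the sample output in the attachment.

```python
import math


def least_squares(xs, ys):
    """Fit f(x) = m*x + b by ordinary least squares.

    Returns (m, b, ssr, sst, r2), where r2 = 1 - ssr/sst is the
    coefficient of determination.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept
    # Sum of squared residuals: distance of each point from the fitted line.
    ssr = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: distance of each point from the mean of y.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return m, b, ssr, sst, 1.0 - ssr / sst


def pearson(xs, ys):
    """Pearson correlation coefficient r, a value between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def report(name, xs, ys):
    """Print the numerical results for one data set (hypothetical layout)."""
    m, b, ssr, sst, r2 = least_squares(xs, ys)
    print(f"Input File: {name}")
    print(f"Data points: {len(xs)}")
    print(f"Coefficients: m = {m:.6f} b = {b:.6f}")
    print(f"Sum of Squared Residuals: {ssr:.6f}")
    print(f"Total Sum of Squares: {sst:.6f}")
    print(f"Coefficient of determination: {r2:.6f}")
    print(f"Pearson Correlation Coefficient: {pearson(xs, ys):.6f}")


# Made-up demo data, not taken from in1..in4.
report("demo", [1, 2, 3, 4], [2, 1, 4, 3])
```

For a straight-line fit with one predictor, the coefficient of determination equals the square of the Pearson coefficient, and the sign of the Pearson coefficient matches the sign of the slope m. That gives a quick way to check that the two printed results agree.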

  • To plot the relationship between x and y, do the following:

  • (1) Plot x vs. y as a scatter plot. (2) Plot x vs. f(x) as a line plot.
  • Add appropriate text to the plot for the axes and title, as well as the coefficient of determination (from Least Squares) and the Pearson coefficient (from Pearson). Your plots should also show me all meaningful information needed to infer whether or not the visualization and the numerical results agree with each other.
  • When you are done with one data file, repeat the process so that all four files appear in the same figure window as subplots.
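
The plotting steps above could be sketched roughly as follows with matplotlib. This is an illustration under assumptions, not the expected solution: the helper names (`parse_pairs`, `read_pairs`, `stats_label`, `plot_files`) are hypothetical, the data files are assumed to hold two numbers per line separated by whitespace or a comma, and `analyze` stands in for whatever the provided regression module exposes, assumed here to return `(m, b, r2, r)`.

```python
def parse_pairs(lines):
    """Turn text lines holding one x, y pair each into two lists of floats."""
    xs, ys = [], []
    for line in lines:
        parts = line.replace(",", " ").split()
        if len(parts) >= 2:  # skip blank or incomplete lines
            xs.append(float(parts[0]))
            ys.append(float(parts[1]))
    return xs, ys


def read_pairs(path):
    """Read the x, y pairs from one data file (separator is an assumption)."""
    with open(path) as fh:
        return parse_pairs(fh)


def stats_label(m, b, r2, r):
    """Text placed on the plot so it can be compared with the printed numbers."""
    return f"f(x) = {m:.4f}x + {b:.4f}\nR^2 = {r2:.4f}\nPearson r = {r:.4f}"


def plot_files(filenames, analyze):
    """Draw one subplot per file; analyze(xs, ys) must return (m, b, r2, r)."""
    # Imported here so the helpers above work without a plotting backend.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    for ax, name in zip(axes.flat, filenames):
        xs, ys = read_pairs(name)
        m, b, r2, r = analyze(xs, ys)
        ax.scatter(xs, ys, s=12, label="data (x, y)")            # (1) x vs. y
        x_line = sorted(xs)
        ax.plot(x_line, [m * x + b for x in x_line],
                color="red", label="f(x) = mx + b")              # (2) x vs. f(x)
        ax.text(0.03, 0.03, stats_label(m, b, r2, r),
                transform=ax.transAxes, va="bottom")
        ax.set_title(f"{name} (n = {len(xs)})")
        ax.set_xlabel("x")
        ax.set_ylabel("y")
        ax.legend(loc="best")
    fig.suptitle("Least-squares fit and Pearson correlation per input file")
    fig.tight_layout()
    return fig
```

A call such as `plot_files(["in1", "in2", "in3", "in4"], analyze)` followed by `matplotlib.pyplot.show()` would then put all four subplots in one figure window. Printing the same values that go into `stats_label` keeps the console output and the plot annotations in step.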

Unformatted Attachment Preview

Assignment A7, Problem 1: Linear Regression using two models

To get all four plots into the same figure window as subplots, you can supply all 4 file names at the beginning of the program and have the main() function loop to repeat the task for each file.

Sample output

The sample output shown below does not contain all the details that you need to have in your plots. It is only meant to give you an idea of what each plot should look like for the input data files (more customization is required).
Grade Key

  • A. Numerical results for each input data file are accurate: 8
  • B. Scatter and line plots accurately represent the relationship between the two sets of variables for each input data file: 8
  • C. 4 sets of plots in one figure window as subplots, one for each input file: 4
  • D. Customizing plots using text/labels/legends

Graphical Results

[Sample plots for in1, in2, in3 and in4]

Numerical Results

Input File: in1
Data points: 8
Least Squares Method
--------------------
Coefficients: m = -2.930300 b = 7.059781
Sum of Squared Residuals: 0.884022
Total Sum of Squares: 20.972400
Coefficient of determination: 0.957848
Pearson Method
--------------
Pearson Correlation Coefficient: -0.978697

Input File: in2
Data points: 34
Least Squares Method
--------------------
Coefficients: m = -37.765848 b = 432.147160
Sum of Squared Residuals: 90865.187272
Total Sum of Squares: 102703.558824
Coefficient of determination: 0.115267
Pearson Method
--------------
Pearson Correlation Coefficient: -0.339511

Input File: in3
Data points: 588
Least Squares Method
--------------------
Coefficients: m = 0.463425
Sum of Squared Residuals:
Total Sum of Squares: 0.528158
Coefficient of determination: 0.231073
Pearson Method
--------------
Pearson Correlation Coefficient: 0.480700

Input File: in4
Data points: 45
Least Squares Method
--------------------
Coefficients: m = 0.287612
Sum of Squared Residuals:
Total Sum of Squares: 0.040196
Coefficient of determination: 0.776891
Pearson Method
--------------
Pearson Correlation Coefficient: 0.881414

Problem 2: Scikit-Learn is used to work with several clustering techniques, including Hierarchical Clustering. Here is an example that explains this in detail: https://stackabuse.com/hierarchical-clustering-with-python-and-scikit-learn/ This assignment is to understand how this all works with scikit-learn by following Example 2, which shows how to segment customers into different groups based on their shopping trends.
You can create the whole program by copying or downloading the code and data set and working out how it actually works. The next step is to solve the same problem that we solved with our dog breeds example using our Python program, but this time using scikit-learn. Your program output should show a dendrogram and a scatter plot. The dendrogram is drawn with matplotlib (the linked example uses SciPy's scipy.cluster.hierarchy module for this, alongside scikit-learn for the clustering itself), and the two plots might look like this (showing you samples without details on data points).
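
As a rough sketch of how the second step might look, the following clusters a small table of dog breeds with scikit-learn and builds the linkage matrix for the dendrogram with SciPy. Everything specific here is an assumption: the `DOGS` rows are illustrative placeholder values rather than the course data set, the function names are hypothetical, and standardizing the columns before clustering is a design choice that the assignment does not prescribe.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering

# Placeholder rows (breed, height in inches, weight in pounds); illustrative only.
DOGS = [
    ("Chihuahua", 8, 8),
    ("Yorkshire Terrier", 6, 7),
    ("Boston Terrier", 16, 20),
    ("Border Collie", 20, 45),
    ("Labrador Retriever", 23, 70),
    ("German Shepherd", 25, 78),
    ("Great Dane", 32, 160),
]


def cluster_breeds(rows, n_clusters=3):
    """Return (names, features, labels, linkage_matrix) for the given rows."""
    names = [row[0] for row in rows]
    X = np.array([row[1:] for row in rows], dtype=float)
    # Standardize each column so that weight does not dominate height.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Ward linkage with Euclidean distance, as in the linked example.
    model = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    labels = model.fit_predict(X)
    # SciPy's linkage matrix is what the dendrogram is drawn from.
    Z = linkage(X, method="ward")
    return names, X, labels, Z


def draw_plots(names, X, labels, Z):
    """Draw the dendrogram and a scatter plot colored by cluster label."""
    import matplotlib.pyplot as plt  # only needed when plotting

    fig, (ax_tree, ax_scatter) = plt.subplots(1, 2, figsize=(12, 5))
    dendrogram(Z, labels=names, ax=ax_tree, leaf_rotation=90)
    ax_tree.set_title("Dendrogram (Ward linkage)")
    ax_scatter.scatter(X[:, 0], X[:, 1], c=labels, cmap="rainbow")
    for name, (x, y) in zip(names, X):
        ax_scatter.annotate(name, (x, y))
    ax_scatter.set_xlabel("height (standardized)")
    ax_scatter.set_ylabel("weight (standardized)")
    ax_scatter.set_title("Clusters")
    fig.tight_layout()
    return fig
```

Calling `draw_plots(*cluster_breeds(DOGS))` and then `matplotlib.pyplot.show()` would display both plots. The number of clusters is normally chosen by reading the dendrogram, so `n_clusters=3` is just a starting value.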