The data file FINNISH2.docx contains measurements of fish from three species, all caught in the same lake in Finland. The variables are:
* X1: observation number
* X2: fish species (1 = Bream, 4 = Parkki, 5 = Smelt)
* Y: weight of the fish (in grams)
* X3: length from the nose to the beginning of the tail (in cm)
* X4: length from the nose to the notch of the tail (in cm)
* X5: length from the nose to the end of the tail (in cm)
* X6: maximal height as % of X5
* X7: maximal width as % of X5

Construct a suitable linear model to predict weight (Y) from the other variables and use it to answer the following questions:
(a) Which variables are useful for predicting weight, and what is their relationship to weight?
(b) Are there differences between the different types of fish? Are any of the species similar?
The answer to each question should include diagnostic checks (such as analysis of residuals, identification of influential observations, etc.). You should use R. All computer output should be included in the text of your answer and commented upon.
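The diagnostic checks mentioned above could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data frame `fish` is filled with synthetic placeholder values (the real FINNISH2 data are not reproduced here), the model formula is an arbitrary starting point, and the cut-offs `2 * p / n` and `4 / n` are common rules of thumb rather than requirements of the assignment.

```r
# SYNTHETIC placeholder data standing in for FINNISH2 (assumption:
# the real data would be loaded into a data frame with these column names).
set.seed(1)
n <- 60
fish <- data.frame(
  X5 = runif(n, 10, 45),   # total length (cm)
  X6 = runif(n, 15, 45),   # maximal height as % of X5
  X7 = runif(n, 9, 17)     # maximal width as % of X5
)
fish$Y <- 20 * fish$X5 + 5 * fish$X6 + rnorm(n, sd = 30)

# An arbitrary illustrative model, not the final answer to (a) or (b).
fit <- lm(Y ~ X5 + X6 + X7, data = fish)

# Residual analysis: standardized residuals should show no pattern
# against fitted values and should be roughly normal.
res <- rstandard(fit)
if (interactive()) {
  par(mfrow = c(1, 2))
  plot(fit, which = 1:2)   # residuals vs fitted, normal Q-Q plot
}

# Influential observations: leverage and Cook's distance.
lev  <- hatvalues(fit)
cook <- cooks.distance(fit)
p    <- length(coef(fit))

high_leverage <- which(lev > 2 * p / n)   # rule-of-thumb cut-off
high_cook     <- which(cook > 4 / n)      # rule-of-thumb cut-off

print(high_leverage)
print(high_cook)
```

Any observation flagged here would still need to be inspected and discussed in the written answer; the flags only indicate where to look.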
* For question (a), you should use stepwise regression.
* For question (b), you should use categorical regression.
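One possible way to set up the two analyses in R is sketched below. It is a hedged outline, not the expected solution: the data frame `fish` holds synthetic placeholder values with no species effect built in, so the output says nothing about the real fish. The names `Species`, `step_fit`, `cat_fit`, and `fish_ref` are made up for illustration, and "categorical regression" is taken here to mean a linear model with species entered as a factor.

```r
# SYNTHETIC placeholder data using the variable names from the assignment.
set.seed(42)
n <- 90
fish <- data.frame(
  X2 = rep(c(1, 4, 5), each = n / 3),
  X3 = runif(n, 10, 40)
)
fish$X4 <- 1.08 * fish$X3 + rnorm(n, sd = 0.3)
fish$X5 <- 1.20 * fish$X3 + rnorm(n, sd = 0.3)
fish$X6 <- runif(n, 15, 45)
fish$X7 <- runif(n, 9, 17)
fish$Y  <- 20 * fish$X5 + 5 * fish$X6 + rnorm(n, sd = 30)

# The species codes are labels, not quantities, so convert them to a factor.
fish$Species <- factor(fish$X2, levels = c(1, 4, 5),
                       labels = c("Bream", "Parkki", "Smelt"))

# (a) Stepwise regression by AIC on the numeric predictors.
# X1 (observation number) is an identifier and is left out.
null_fit <- lm(Y ~ 1, data = fish)
step_fit <- step(null_fit,
                 scope = list(lower = ~ 1,
                              upper = ~ X3 + X4 + X5 + X6 + X7),
                 direction = "both", trace = 0)
print(summary(step_fit))

# (b) Add species as a categorical predictor; Bream is the baseline level.
cat_fit <- update(step_fit, . ~ . + Species)
species_test <- anova(step_fit, cat_fit)   # F-test for an overall species effect
print(species_test)
print(summary(cat_fit)$coefficients)

# Refit with Parkki as baseline to compare Parkki and Smelt directly.
fish_ref    <- transform(fish, Species = relevel(Species, ref = "Parkki"))
cat_fit_ref <- update(cat_fit, data = fish_ref)
print(summary(cat_fit_ref)$coefficients)
```

In this setup, a species coefficient that is not clearly different from zero suggests that species is similar to the baseline species after adjusting for the selected size variables. Because X3, X4, and X5 are closely related length measures, stepwise selection may keep only one of them.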