All instructions are given in the document and the reference is given in the attachment; the content should not be plagiarized.

Chapter Review - Paper 1
Assignment Due Date: April 11, 2022 at 11:59PM CST

Chapter 4: Training Models
1. What Linear Regression training algorithm can you use if you have a training set with millions of features?
2. Suppose the features in your training set have very different scales. What algorithms might suffer from this, and how? What can you do about it?
3. Can Gradient Descent get stuck in a local minimum when training a Logistic Regression model?
4. Do all Gradient Descent algorithms lead to the same model provided you let them run long enough?
5. Suppose you use Batch Gradient Descent and you plot the validation error at every epoch. If you notice that the validation error consistently goes up, what is likely going on? How can you fix this?
6. Is it a good idea to stop Mini-batch Gradient Descent immediately when the validation error goes up?
7. Which Gradient Descent algorithm (among those we discussed) will reach the vicinity of the optimal solution the fastest? Which will actually converge? How can you make the others converge as well?
8. Suppose you are using Polynomial Regression. You plot the learning curves and you notice that there is a large gap between the training error and the validation error. What is happening? What are three ways to solve this?
9. Suppose you are using Ridge Regression and you notice that the training error and the validation error are almost equal and fairly high. Would you say that the model suffers from high bias or high variance? Should you increase the regularization hyperparameter α or reduce it?
10. Why would you want to use:
• Ridge Regression instead of plain Linear Regression (i.e., without any regularization)?
• Lasso instead of Ridge Regression?
• Elastic Net instead of Lasso?
11. Suppose you want to classify pictures as outdoor/indoor and daytime/nighttime. Should you implement two Logistic Regression classifiers or one Softmax Regression classifier?
Grading Rubrics:

Exceptional
• Documentation: The documentation is well written and clearly explains what is accomplishing and how. Also clearly addressed the assignment question.
• Response to Questions: Demonstrates full knowledge of topic; explains and elaborates on all questions.
• Content: Demonstrates substance and depth; is comprehensive; shows mastery of material.

Acceptable
• Documentation: The documentation is written and clearly explains what is accomplishing and how. Also addressed the assignment question.
• Response to Questions: Demonstrates ease in answering questions but does not elaborate.
• Content: Covers topic sufficiently; uses appropriate type and number of sources.

Amateur
• Documentation: The documentation is simply explaining. Very lightly addressed the assignment questions.
• Response to Questions: Demonstrates a barely sufficient level of both delivery in and knowledge of answers.
• Content: Covers major points of topic; needs additional coverage and sources.

Unsatisfactory
• Documentation: The documentation is kind of explaining. Hardly addressed the assignment questions.
• Response to Questions: Demonstrates little grasp of information; has undeveloped or unclear answers to questions.
• Content: Clearly provides inadequate coverage of topic; lacks sufficient sources.

Explanation & Answer

View attached explanation and answer. Let me know if you have any questions. Just wrote up my answers. Pretty confident in all of them. Sorry for the imperfect grammar, but my answers should be sound. Let me know if any of them don't sit well with you.

Chapter Review - Paper 1
Assignment Due Date: April 11, 2022 at 11:59PM CST

Chapter 4: Training Models
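1. What Linear Regression training algorithm can you use if you have a training set with
millions of features?

For a training set with millions of features, you could use Stochastic
Gradient Descent or Mini-batch Gradient Descent (or Batch Gradient Descent
if the training set fits in memory), because the cost of each step grows
only linearly with the number of features. The Normal Equation and the SVD
approach are poor choices here, since their cost grows much faster than
linearly with the number of features. Adding LASSO regularization can
additionally help you represent only the most important features while
maintaining accuracy on the training set.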
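
To illustrate why a gradient-based method suits very wide training sets, here is a minimal sketch of Stochastic Gradient Descent for Linear Regression using only the Python standard library. The function name, the learning-rate schedule, and the default hyperparameters are assumptions chosen for illustration; they do not come from the assignment. Each update touches every feature once, so the cost per step is linear in the number of features.

```python
import random

def sgd_linear_regression(X, y, n_epochs=50, eta0=0.1, seed=0):
    """Fit y ~ bias + sum(theta[j] * x[j]) with plain Stochastic Gradient Descent.

    X is a list of feature rows, y a list of targets. The decaying
    learning-rate schedule below is an illustrative assumption.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(X[0])
    bias = 0.0
    step = 0
    for _ in range(n_epochs):
        order = list(range(len(X)))
        rng.shuffle(order)  # visit instances in a fresh random order each epoch
        for i in order:
            step += 1
            eta = eta0 / (1 + 0.01 * step)  # shrink the step size over time
            pred = bias + sum(t * x for t, x in zip(theta, X[i]))
            err = pred - y[i]
            # Gradient of the squared error for this single instance
            bias -= eta * err
            theta = [t - eta * err * x for t, x in zip(theta, X[i])]
    return theta, bias
```

With real data at this scale you would normally reach for a vectorized library implementation rather than Python lists; the sketch only shows the shape of the update.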
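2. Suppose the features in your training set have very different scales. What algorithms might
suffer from this, and how? What can you do about it?

Gradient Descent will suffer because the cost function becomes an elongated
bowl, so it takes much longer to converge. Ridge and LASSO regularized
regressions will also suffer from significant scale disparities, because
the penalty treats all weights alike and features with small values (which
need larger weights) get penalized more heavily. You should standardize the
features before modeling, and convert the fitted coefficients back to the
original units when interpreting the results.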
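
As a rough sketch of the scaling step and of mapping results back to the original units, the helpers below standardize each feature column and then convert coefficients fitted on the scaled data into original-unit coefficients. The function names `standardize` and `unscale_coefficients` are hypothetical, and the sketch assumes a plain linear model with a bias term.

```python
from statistics import mean, pstdev

def standardize(X):
    """Return a column-wise standardized copy of X, plus the means and std devs used."""
    columns = list(zip(*X))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    X_scaled = [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in X]
    return X_scaled, means, stds

def unscale_coefficients(theta_scaled, bias_scaled, means, stds):
    """Map coefficients fitted on standardized features back to original units."""
    theta = [t / s for t, s in zip(theta_scaled, stds)]
    bias = bias_scaled - sum(t * m for t, m in zip(theta, means))
    return theta, bias
```

The means and standard deviations should be computed on the training set only and then reused unchanged for validation and test data.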
3. Can Gradient Descent get stuck in a local minimum when training a Logistic Regression model?

No. The Logistic Regression cost function (log loss) is convex, so it has
no local minima for Gradient Descent to get stuck in. Provided the learning
rate is not too high, Gradient Descent will approach the global optimum.
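
One way to see the convexity argument in practice is to run Batch Gradient Descent from two very different starting points and check that both runs end up at essentially the same parameters. The sketch below does this for a one-feature Logistic Regression; the function name, the learning rate, the iteration count, and the tiny dataset in the usage lines are all assumptions made for illustration.

```python
import math

def fit_logistic_gd(xs, ys, w0, b0, eta=0.5, n_iters=5000):
    """Batch Gradient Descent on the mean log loss for a one-feature model."""
    w, b = w0, b0
    m = len(xs)
    for _ in range(n_iters):
        # Predicted probabilities under the current parameters
        probs = [1.0 / (1.0 + math.exp(-(w * x + b))) for x in xs]
        # Gradient of the mean log loss with respect to w and b
        grad_w = sum((p - y) * x for p, y, x in zip(probs, ys, xs)) / m
        grad_b = sum(p - y for p, y in zip(probs, ys)) / m
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b

# Illustrative data: the classes overlap, so the minimum is finite
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
run_a = fit_logistic_gd(xs, ys, w0=-5.0, b0=3.0)
run_b = fit_logistic_gd(xs, ys, w0=4.0, b0=-4.0)
```

Both runs should agree to within a small tolerance. The data is deliberately not linearly separable, since on separable data the log loss has no finite minimizer and the weights keep growing.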
4. Do all Gradient Descent algorithms lead to...

