NEC Machine Learning Case Analysis

User Generated


Computer Science

New England College


Unformatted Attachment Preview

Case Analysis 3: MNIST Machine Learning The third case is related to a popular subfield of Machine Learning called computer vision. The central idea is to use ML algorithms to recognize images, both static and dynamic, without specifically providing any guidance to the computer. This task is achieved by training algorithms to recognize various images while also mapping the feature space of those images to a categorical label or name. This task was first successfully achieved in 1990s when computers were trained to recognize handwritten digits. The trained algorithm was used by postal services to sort mail and efficiently distribute to the right destination. We will use the same dataset and problem for this case. The dataset is called MNIST (Modified National Institute of Standards and Technology). More information on MNIST can be found here: Data Details: The above is example of the images of the handwritten digits. You are looking at 160 such images put together. The dataset consists of 42000 images, where each image is a 28 x 28 pixel resolution. Each image is broken down into 784 columns (28 x 280 = 784) where a column represents the grayscale within color within each pixel. The grayscale varies from 0 to 255. Let’s analyze one image from the above and understand the transformation to the data form. The above is an image of the digit 0. The horizontal and vertical scales depict the number of pixels. If you focus on the first pixel, (0,0) you would not notice any grayscale. On the other hand, if you focus in the (10,10) pixel, there is a strong grayscale within the image. The following are the way data are structured: a) The first 28 pixels, namely (0,0), (0,1)…(0,27) are the first 28 columns Pixel0, Pixel1,…Pixel27 etc. b) The next 28 pixels, namely (1,0), (1,1)…(1,27) are the next 28 columns Pixel28, Pixel29,…Pixel55 etc. And so on. The following is an excerpt from the dataset: The column label identifies the image as digit 0. The first column after label Pixel0 is the (0,0) pixel in the image with value 0. Notice that all of the first few columns have values 0. Later, Pixel176 has a value of 179, Pixel177 has a value of 254 etc. The main objective of this case is to reduce dimensions from 784 using various dimensions reduction algorithms. These reduced dimensions can further OPTIONALLY be used to identify (classify) digits 0 through 9 in the train dataset. Please make sure to only use the train.csv file for this task.
Purchase answer to see full attachment
User generated content is uploaded by users for the purposes of learning and should be used following Studypool's honor code & terms of service.

Explanation & Answer


Great study resource, helped me a lot.


Related Tags