
Data Quality and Preprocessing Lab



Lab Expt 5: Data Quality and Preprocessing

Objective:
1) To fill missing values in a data set
2) To remove redundant data in a data set
3) To highlight the parts of a data set that have inconsistent values

Description/Theory:
The main problems affecting data quality are missing values, inconsistency, redundancy, noise and outliers in a data set.

1) Missing values: Many data analysis techniques were not designed to deal with missing values, so a data set that contains them must be pre-processed. Several alternative approaches are:
• Ignore missing values:
– Use for each object only the attributes with values, without paying attention to missing values. This does not require any change in the modeling algorithm used, but the distance function should skip any attribute for which at least one of the compared objects has a missing value;
– Modify a learning algorithm to allow it to a ...
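The first "ignore missing values" option above, a distance function that skips attributes with a missing value, can be sketched as follows. This is only an illustration and not part of the original worksheet: the function name `distance_ignoring_missing`, the use of `None` to mark a missing value, and the choice of Euclidean distance are all assumptions.

```python
import math
from typing import Optional, Sequence


def distance_ignoring_missing(
    a: Sequence[Optional[float]], b: Sequence[Optional[float]]
) -> float:
    """Euclidean distance over the attributes present in both objects.

    Assumption: a missing value is represented by None.
    """
    # Keep only the attribute pairs where neither object has a missing value.
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no attribute has a value in both objects")
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs))


# The second attribute is missing in `a`, so it is skipped for this pair.
a = [1.0, None, 3.0]
b = [4.0, 2.0, 7.0]
print(distance_ignoring_missing(a, b))  # 5.0
```

This sketch does not rescale the result for the number of skipped attributes, so distances computed over different numbers of attributes are not directly comparable.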