Data Quality and Preprocessing Lab

Content type
User Generated
Subject
Computer Science
Type
Worksheet
Lab Expt 5: Data Quality and Preprocessing

Objective:
1) To fill missing values in a data set
2) To remove redundant data in a data set
3) To highlight a data set with inconsistent values for some of the data

Description/Theory:
The main problems affecting data quality are associated with missing values, and with inconsistency, redundancy, noise and outliers in a data set.

1) Missing values: Since many data analysis techniques were not designed to deal with a data set with missing values, the data set must be pre-processed. Several alternative approaches are:
• Ignore missing values:
– Use for each object only the attributes with values, without paying attention to missing values. This does not require any change in the modeling algorithm used, but the distance function should ignore the values of attributes with at least one missing value;
– Modify a learning algorithm to allow it to a ...
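The first "ignore missing values" option above, where the distance function skips any attribute that is missing in at least one of the two objects, could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `distance_ignoring_missing`, the choice of Euclidean distance, and the use of `None` or NaN to mark a missing value are not taken from the worksheet.

```python
import math


def distance_ignoring_missing(a, b):
    """Euclidean distance over the attributes present in both objects.

    Assumption: a missing value is represented as None or float NaN.
    An attribute is skipped if it is missing in either object.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    squared_diffs = [
        (x - y) ** 2
        for x, y in zip(a, b)
        if not is_missing(x) and not is_missing(y)
    ]
    if not squared_diffs:
        # No attribute is present in both objects, so no distance is defined.
        return math.nan
    return math.sqrt(sum(squared_diffs))


# The second attribute is missing in p, so only attributes 1 and 3 are compared.
p = [1.0, None, 3.0]
q = [4.0, 2.0, 7.0]
print(distance_ignoring_missing(p, q))  # 5.0
```

Note that this sketch does not rescale the result, so distances computed over different numbers of attributes are not directly comparable; whether and how to compensate for that is a separate design decision.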