Running head: DATA QUALITY, DATA MINING, AND TEXT MINING
Data Quality, Data Mining, and Text Mining
Name
Institutional Affiliation
Date
Data quality
Data quality refers to the ability of data to serve its purpose in a given context (Hazen,
Boone, Ezell, & Jones-Farmer, 2014, p. 72). It is determined by several factors, including
reliability, accuracy, completeness, and relevance. Proof of data quality is important because
it promotes customers' trust in an organization (Kwon, Lee, & Shin, 2014). Such proof,
however, also brings additional costs and risks. One risk is substantial variation, in which
one enterprise benefits while others do not. The additional costs arise because proof of data
quality may indicate that some areas require more attention, which results in the neglect of
other areas (Cai & Zhu, 2015). Cai and Zhu (2015) also assert that producing quality data
requires a highly skilled workforce, which may be expensive to acquire. In addition to
substantial variation and cost, proof of data quality may carry the risk of incorrect
predictions. Managers may base decisions about the future of their organizations on the
available data, and those decisions risk being invalidated should the variables they depend on
change (Cai & Zhu, 2015). Finally, proof of data quality may undermine decisions by
experienced managers that differ from the proof, which is a problem because those decisions
may prove to be the correct ones in the long run (Cai & Zhu, 2015).
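Two of the factors named above, completeness and accuracy, can be expressed as simple ratios. The sketch below is only an illustration under stated assumptions: the function names, the record layout, and the idea of comparing against a trusted reference list are hypothetical and do not come from the cited authors.

```python
from typing import Any


def completeness(records: list[dict[str, Any]], fields: list[str]) -> float:
    """Share of required field values that are present (not None or empty string)."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0  # nothing was required, so nothing is missing
    filled = sum(
        1
        for record in records
        for field in fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def accuracy(observed: list[Any], reference: list[Any]) -> float:
    """Share of observed values that match a trusted reference, position by position."""
    if len(observed) != len(reference):
        raise ValueError("observed and reference must have the same length")
    if not observed:
        return 1.0
    matches = sum(1 for o, r in zip(observed, reference) if o == r)
    return matches / len(observed)


# Hypothetical customer records used only for illustration.
customers = [
    {"id": 1, "email": "a@example.com", "city": "Lagos"},
    {"id": 2, "email": "", "city": "Accra"},
    {"id": 3, "email": None, "city": None},
]

print(completeness(customers, ["email", "city"]))
print(accuracy(["Lagos", "Accra", "Nairobi"], ["Lagos", "Accra", "Kampala"]))
```

Scores of this kind are one possible form of proof of data quality. Reliability and relevance depend on context and are harder to reduce to a single ratio.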
Data mining
Data mining is the extraction of non-trivial, implicit, previously unknown, and potentially
useful patterns or knowledge from vast amounts of data (Aggarwal, 2015). According to Aggarwal
(2015), data mining is the analysis step in Knowledge Discovery in Databases (KDD). According
to Zaki, Meira Jr, and Meira (2014), the term data mining was introduced in the 1990s, with
earlier methods of identifying patterns in data including Bayes' theorem of the 1700s
and regression analysis in the 1800s. This section highlights the main research areas of data
mining and the kinds of data on which it is performed, explains its main functionality and
processes, and states the major issues in data mining.
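The two early pattern-finding methods named above, Bayes' theorem and regression analysis, can each be shown in a few lines. This is a minimal sketch for illustration only: the function names and the example numbers are assumptions, not material from the cited sources.

```python
def bayes_posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """P(H | E) from P(H), P(E | H), and P(E | not H), using Bayes' theorem."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical fraud-screening numbers: 1% of transactions are fraudulent,
# the check flags 90% of fraud and 5% of legitimate transactions.
print(bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05))

# Hypothetical points lying on the line y = 2x + 1.
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
```

Both functions use only the standard library, which keeps the arithmetic behind each method visible.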
According to Aggarwal (2015), data mining is widely applicable and can be used for market
analysis and management, risk analysis and management, fraud detection, and the detection of
unusual patterns. Given these broad applications, Aggarwal (2015) further asserts that the
main research areas of data mining are in medicine and manufacturing engineering. Data mining
is usually performed on data stored in relational databases, data warehouses, and transactional
databases, as well as on data in advanced databases and information repositories such as
time-series data, spatial and temporal data, and stream data (Roiger, 2017).
Data mining has many functionalities, including classification and prediction, which construct
models that describe and distinguish classes and concepts for future projections; association,
which involves correlation and causality; cluster analysis; outlier analysis; trend
analysis and...