MODULE 05
DEEP LEARNING, MINING,
ANALYTICS
Deep learning neural networks
Convolutional neural networks
Long short-term memory
Cognitive computing
Text mining – Web mining – Social mining
Streaming and location analytics
Arun Kumar Bhaskara-Baba / Blair Williams – For Class discussion purposes only
Give your audience the information they need, in the order they need it,
in words designed to be clear, concise, and winsome.*
What is the Lysol demand for the next 12 months?
The demand for the next 12 months is unknown because the
pandemic is a major disruptor to the industry. One thing to
consider is the distribution of the vaccine and its timeline, which
may increase consumer confidence and decrease demand for Lysol.
At the same time, a decrease may not necessarily mean that demand
returns to pre-pandemic levels. Another
scenario may be that the vaccine has not yet been distributed to the majority of
the population, so demand will remain at pandemic levels.
Another possibility is that, despite widespread vaccine
distribution, the public continues to consume Lysol at pandemic
levels. The future “normal levels” are still unknown, but they are
important for the business to understand so that it can adjust the
supply chain and ensure the needed resources are available and being
utilized.
How can I retain customers that have
historically been dependent on gas cars?
This is particularly relevant to marketing. I want to find
which “groupings” of customers will buy an electric vehicle.
But the dynamic part about it is that I can market the car in
any way I want. This could mean releasing commercials in
only certain parts of the world, sending information to
businesses, and advertising expensive cars only to rich
customers. This is not exhaustive, of course.
https://magazine.wharton.upenn.edu/digital/6-tips-for-clear-and-concise-business-communication/
*USC Marshall Business school
https://www.marshall.usc.edu/sites/default/files/2020-01/Communicating-Clearly-Concisely-Persuasively.pdf
One business question that can be improved using
predictive analysis is “How much discount can I give
on each product?”.
As previously determined, this is a classification-type
question, so it can be answered using
classification techniques such as decision tree
analysis, rough sets, and case-based reasoning.
Predictive analysis is a magic solution to
help us solve one of the questions raised
before, and it plays a very effective role.
The profit of the product depends upon the sales of
the product and the cost to manufacture and
market the product. A hyperplane concept can be
used to separate a profitable product
TEAM PROJECT
- STATUS
Deep Learning – Mining – Analytics – Learning Objectives
❑ Learn what deep learning is and how it is changing the world of computing
❑ Know the underlying concept and methods for deep neural networks
❑ Understand how convolutional neural networks (CNN), recurrent neural networks (RNN), and long
short-term memory networks (LSTM) work
❑ Know the foundational details about cognitive computing and IBM Watson
❑ Describe text mining and understand the need for text mining and differentiate among text analytics, text
mining and data mining
❑ Describe sentiment analysis, and develop familiarity with popular applications of sentiment analysis
❑ Become familiar with speech analytics as it relates to sentiment analysis
❑ Learn three facets of Web analytics—content, structure, and usage mining
❑ Know social analytics including social media and social network analyses
❑ Understand the need for and appreciate the capabilities of stream analytics and learn about the
applications of stream analytics
❑ Describe how geospatial and location-based analytics are assisting organizations
ANALYTICS, DATA SCIENCE AND AI: SYSTEMS
FOR DECISION SUPPORT
Eleventh Edition
Chapter 6
Deep Learning and Cognitive
Computing
Chapter 7
Text Mining, Sentiment Analysis, and Social Analytics
Chapter 9 (2 sections)
Streaming Analytics and Location Analytics
Copyright © 2020, 2015, 2011 Pearson Education, Inc. All Rights Reserved
ACCURACY OF MODELS
In classification problems, the primary source for accuracy
estimation is the confusion matrix
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{True Positive Rate (Sensitivity)} = \frac{TP}{TP + FN}$$
$$\text{True Negative Rate (Specificity)} = \frac{TN}{TN + FP}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
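The formulas above can be checked with a few lines of Python. This is a minimal sketch: the function name `classification_metrics` and the counts are hypothetical, chosen only for illustration, and the code does not guard against a zero denominator.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the confusion-matrix metrics from raw counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),       # same formula as sensitivity
    }

# Hypothetical counts, for illustration only
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that recall and sensitivity are the same quantity under two names, which is why the two formulas coincide.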
ESTIMATION METHODOLOGIES :
SINGLE/SIMPLE SPLIT
Simple split (or holdout or test sample estimation)
Split the data into 2 mutually exclusive sets: training (~70%)
and testing (30%)
For Neural Networks, the data is split into three sub-sets
(training [~60%], validation [~20%], testing [~20%])
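A simple split can be sketched as follows. The helper `holdout_split` is a hypothetical name, the fixed seed is an assumption made for reproducibility, and the data is a stand-in list of 100 instance IDs.

```python
import random

def holdout_split(rows, fractions=(0.7, 0.3), seed=42):
    """Shuffle rows and cut them into mutually exclusive subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    subsets, start = [], 0
    for i, frac in enumerate(fractions):
        # The last subset takes whatever remains, so no row is dropped
        if i == len(fractions) - 1:
            end = len(shuffled)
        else:
            end = start + round(frac * len(shuffled))
        subsets.append(shuffled[start:end])
        start = end
    return subsets

data = list(range(100))  # stand-in for 100 labeled instances
train, test = holdout_split(data)
train_nn, valid_nn, test_nn = holdout_split(data, fractions=(0.6, 0.2, 0.2))
print(len(train), len(test))
print(len(train_nn), len(valid_nn), len(test_nn))
```

Shuffling before cutting matters when the source data is ordered (for example by date or by class), because an unshuffled cut would give training and testing sets with different distributions.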
WHAT IS A MODELER?
A modeler is a mathematical/algorithmic approach to generalize
from instances so it can make predictions about
instances that it has not seen before
Its output is called a model
[Figure: instances feed the modeler, which produces a model (classifier); the classifier assigns a class to a new instance]
Introduction to Deep Learning
• Imaginative things in sci-fi movies are turning into
realities, thanks to AI and Machine Learning
– Siri, Google assistant, Alexa, Google home, …
• Deep learning is the newest member of the AI/Machine
Learning family
– Learn better than ever before
• The reason for Deep Learning superiority
– Automatic feature extraction and representation
Introduction to Deep Learning
• Differences between Classic Machine-Learning Methods
and Representation Learning/Deep Learning
Technology Insight 6.1
Elements of an Artificial Neural Network
• Processing element (PE)
• Network structure
– Hidden layer(s)
• Input
• Output
• Connection weights
• Summation function
• Transfer function
Basics of “Shallow” Learning
• Artificial Neural Networks – abstractions of human brain
and its complex biological network of neurons
• Neurons = Processing Elements (PEs)
• Single-input and single-output neuron/PE
Basics of “Shallow” Learning
• Typical multiple-input neuron with R individual inputs
$$n = w_{1,1} p_1 + w_{1,2} p_2 + w_{1,3} p_3 + \dots + w_{1,R} p_R + b$$
$$n = \mathbf{W}\mathbf{p} + b$$
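The summation above can be sketched in Python for a single neuron with R = 3 inputs. The weights, inputs, and bias are hypothetical values, and the logistic sigmoid is an assumed choice of transfer function (the slide does not name one).

```python
import math

def sigmoid(n):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-n))

def neuron_output(weights, inputs, bias, transfer=sigmoid):
    """Summation function n = Wp + b, followed by the transfer function."""
    n = sum(w * p for w, p in zip(weights, inputs)) + bias
    return n, transfer(n)

W = [0.5, -0.2, 0.1]   # hypothetical connection weights w1,1 .. w1,3
p = [1.0, 2.0, 3.0]    # hypothetical inputs p1 .. p3
b = 0.1                # hypothetical bias
n, a = neuron_output(W, p, b)
print(f"net input n = {n:.3f}, output a = {a:.3f}")
```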
Technology Insight 6.1
Elements of an Artificial Neural Network
• Neural Network with One Hidden Layer
Technology Insight 6.1
Elements of an Artificial Neural Network
Summation Functions
Transfer Function
Process of Developing Neural-Network Based
Systems
• A process with constant feedback for changes and improvements!
1. Compute temporary
outputs.
2. Compare outputs with
desired targets.
3. Adjust the weights and
repeat the process.
Backpropagation for ANN Training
1. Initialize weights with random values
2. Read in the input vector and the desired output
3. Compute the actual output via the calculations
4. Compute the error.
5. Change the weights by working backward
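The five steps above can be sketched for a tiny network with one hidden layer. Everything specific here is an assumption made for illustration: the sigmoid transfer function, the squared-error measure, the learning rate, the hidden-layer size, and the toy logical-AND data set. The function name `train` is hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, hidden=3, epochs=1000, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: random initial weights (last entry of each row is the bias)
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for x, target in samples:          # Step 2: input vector and desired output
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_o, hb)))  # Step 3: actual output
            err = target - y               # Step 4: error
            total_error += 0.5 * err ** 2
            # Step 5: work backward from the output layer to the hidden layer
            delta_o = err * y * (1 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            w_o = [w + lr * delta_o * v for w, v in zip(w_o, hb)]
            for j in range(hidden):
                w_h[j] = [w + lr * delta_h[j] * v for w, v in zip(w_h[j], xb)]
        history.append(total_error)
    return history

# Toy data: logical AND
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
history = train(samples)
print(f"first epoch error: {history[0]:.4f}, last epoch error: {history[-1]:.4f}")
```

The hidden-layer deltas are computed before the output weights are updated, so each backward pass uses the same weights that produced the forward pass.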
Backpropagation for ANN Training
• Illustration of the Overfitting in ANN
Illuminating the Black Box of ANN
• ANN are typically known as black boxes
• Sensitivity analysis can shed light on the black box
Application Case 6.4
Sensitivity Analysis Reveals Injury Severity
Factors in Traffic Accidents
• Graphical representation of the sensitivity analysis results
for the eight binary ANN model configurations
Deep Neural Networks
• Deep: more hidden layers
• In addition to CPU, it also uses GPU
– With programming languages like CUDA by NVIDIA
• Needs large datasets
• Deep learning uses tensors as inputs
– Tensor: N-dimensional arrays
– Image representation with 3-D tensors
• There are different types and capabilities of Deep Neural
Networks for different tasks/purposes
Deep Neural Networks
Feedforward Multilayer Perceptron (MLP)-Type Deep
Networks
• Most common type of deep networks
• Vector Representation of the First Three Layers in a
Typical MLP Network.
Deep Neural Networks
• Impact of Random Weights in Deep MLP
• The Effect of Pretraining Network Parameters on Improving Results of a Classification-Type Deep Neural Network.
• More hidden layers versus more neurons?
Application Case 6.5
Georgia DOT Variable Speed Limit Analytics
Help Solve Traffic Congestions
Convolutional “Deep” Neural
Networks
• Most popular MLP-based DL method
• Used for image/video processing, text recognition
• Has at least one convolution weight function
– Convolutional layer
• Convolutional layer → Pooling (sub-sampling)
– Consolidating the large tensors into one with a smaller
size and reducing the number of model parameters
while keeping only the important features
– There can be different types of pooling layers
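As a minimal sketch of the sub-sampling step, the code below applies max pooling, one common pooling type, with non-overlapping 2 x 2 windows. The function name `max_pool2d` and the feature-map values are hypothetical.

```python
def max_pool2d(matrix, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows = len(matrix) // size
    cols = len(matrix[0]) // size
    return [
        [
            max(matrix[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Hypothetical 4 x 4 feature map
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(feature_map)
print(pooled)  # 2 x 2 result: [[6, 2], [2, 9]]
```

The 16 input values shrink to 4 while the strongest response in each region survives, which is the consolidation the bullet describes.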
Convolution Function
• Typical Convolutional Network Unit
• Convolution of a 2 x 2 Kernel by a 3 x 6 Input Matrix
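To make the 2 x 2 kernel over a 3 x 6 input concrete, here is a small sketch with hypothetical values (the figure's actual numbers are not reproduced here). It assumes stride 1 and no padding, and it does not flip the kernel, which is the convention most deep learning libraries use under the name "convolution".

```python
def convolve2d(matrix, kernel):
    """Slide the kernel over the input (stride 1, no padding) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(matrix) - k_rows + 1
    out_cols = len(matrix[0]) - k_cols + 1
    return [
        [
            sum(matrix[r + i][c + j] * kernel[i][j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]

# Hypothetical 3 x 6 input and 2 x 2 kernel
matrix = [
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
output = convolve2d(matrix, kernel)
for row in output:
    print(row)
```

A 3 x 6 input and a 2 x 2 kernel give a 2 x 5 output, since each dimension shrinks by the kernel size minus one.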
Image Processing Using CNN
• ImageNet (http://www.image-net.org)
• Architecture of AlexNet, a CNN for Image Classification
Image Processing Using CNN
• Conceptual Representation of the Inception Feature in
GoogLeNet
Image Processing Using CNN
• Examples of Using the Google Lens
Figure 6.28 Two Examples of Using the Google Lens, a Service Based
on Convolutional Deep Networks for Image Recognition.
Source: ©2018 Google LLC, used with permission. Google and the Google logo are
registered trademarks of Google LLC.
Consider learning an image:
⚫ Some patterns are much smaller than the whole image
Can represent a small region with fewer parameters (“beak” detector)
⚫ Same pattern appears in different places
What about training a lot of such “small” detectors, where each detector must “move around”?
The “upper-left beak” detector and the “middle beak” detector can be compressed to the same parameters.
Why Pooling
⚫ Subsampling pixels will not change the object (a subsampled bird is still a bird)
We can subsample the pixels to make the image smaller, so fewer
parameters are needed to characterize the image
https://people.cs.pitt.edu/~xianeizhang/notes/NN/NN.html
Copyright © 2020 by Pearson Education, Inc. All Rights Reserved
Text Processing Using CNN
• Google word2vec project
– Word embeddings
• Typical Vector Representation of Word Embeddings in a
Two-Dimensional Space
Text Processing Using CNN
• CNN Architecture for Relation Extraction Task in Text
Mining
Recurrent Neural Networks (RNN)
• RNN designed to process sequential inputs
• Typical recurrent unit
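A recurrent unit can be sketched as a loop that feeds its previous output back in with the next input. The scalar weights and the tanh transfer function below are assumptions made for illustration; real RNNs use weight matrices and learn them from data.

```python
import math

def rnn_forward(inputs, w_x=0.8, w_h=0.5, b=0.0):
    """Process a sequence one element at a time, carrying a hidden state."""
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

states = rnn_forward([1.0, 0.0, 0.0])
for t, h in enumerate(states):
    print(f"t={t}: h={h:.4f}")
```

Only the first input is non-zero, yet the later states are also non-zero: the feedback connection carries information forward through the sequence.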
The referee blew his _________
(whistle)
I went to a carwash yesterday.
It was raining a lot.
It cost $5 to wash _______
(my car)
Long Short-Term Memory (LSTM)
• LSTM is a variant of RNN
– In a dynamic network, the weights are called the long-term memory, while the feedback's role is the short-term memory
Figure: Typical Long Short-Term Memory (LSTM) Network Architecture
Recurrent Neural Networks (RNN) &
Long Short-Term Memory (LSTM)
• LSTM Network Applications
Figure: Example Indicating the Close-to-Human Performance of the Google Neural Machine Translator (GNMT)
Application Case 6.7
Deliver Innovation by Understanding Customer
Sentiments
Computer Frameworks for
Implementation of Deep Learning
• Torch (http://www.torch.ch)
– ML with GPU
• Caffe (caffe.berkeleyvision.org)
– Facebook’s improved version (www.caffe2.ai)
– PyTorch (pytorch.org)
• TensorFlow (www.tensorflow.org)
– Google - Tensor Processing Units (TPUs)
• Theano (deeplearning.net/software/theano)
– Deep Learning Group at the University of Montreal
• Keras (keras.io)
– Application Programming Interface
Cognitive Computing
• Systems that use mathematical models to emulate (or
partially simulate) the human cognition process to find
solutions to complex problems and situations where the
potential answers can be imprecise
• IBM Watson on Jeopardy!
• How does cognitive computing work?
– Adaptive
– Interactive
– Iterative and stateful
– Contextual
• Data mining,
• Pattern recognition,
• Deep learning, and
• NLP
– Mimic the way the human brain works
Cognitive Computing
• How does cognitive computing differ from AI?
Table 6.3 Cognitive Computing versus Artificial Intelligence (AI).
| Characteristic | Cognitive Computing | Artificial Intelligence (AI) |
|---|---|---|
| Technologies used | Machine learning; natural language processing; neural networks; deep learning; text mining; sentiment analysis | Machine learning; natural language processing; neural networks; deep learning |
| Capabilities offered | Simulate human thought processes to assist humans in finding solutions to complex problems | Find hidden patterns in a variety of data sources to identify problems and provide potential solutions |
| Purpose | Augment human capability | Automate complex processes by acting like a human in certain situations |
| Industries | Customer service, marketing, healthcare, entertainment, service sector | Manufacturing, finance, healthcare, banking, securities, retail, government |
Cognitive Search
• Can handle a variety of data types
• Can contextualize the search space
• Employ advanced AI technologies.
• Enable developers to build enterprise-specific search
applications
Text Analytics and Text Mining
Figure 7.2 Text Analytics, Related Application Areas, and
Enabling Disciplines.
• Text Analytics versus Text Mining
• Text Analytics =
– Information Retrieval +
– Information Extraction +
– Data Mining +
– Web Mining
or simply
– Text Analytics = Information
Retrieval + Text Mining
Data Mining versus Text Mining
• Both seek novel and useful patterns
• Both are semi-automated processes
• Difference is the nature of the data:
– Structured versus unstructured data
– Structured data: in databases
– Unstructured data: Word documents, PDF files, text
excerpts, XML files, and so on
• To perform text mining – first, impose structure on the data,
then mine the structured data.
Text Mining Application Areas
• Information extraction
• Topic tracking
• Summarization
• Categorization
• Clustering
• Concept linking
• Question answering
Text Mining Terminology
• Unstructured or semistructured data
• Term dictionary
• Corpus (and corpora)
• Word frequency
• Terms
• Part-of-speech tagging
• Concepts
• Morphology
• Stemming
• Term-by-document matrix
– Occurrence matrix
• Stop words (and include words)
• Synonyms (and polysemes)
• Tokenizing
• Singular value decomposition
– Latent semantic indexing
Natural Language Processing (NLP)
• Structuring a collection of text
– Old approach: bag-of-words
– New approach: natural language processing
• NLP is …
– a very important concept in text mining
– a subfield of artificial intelligence and computational
linguistics
– the study of "understanding" the natural human
language
• Syntax- versus semantics-based text mining
Natural Language Processing (NLP)
• Challenges in NLP
– Part-of-speech tagging
– Text segmentation
– Word sense disambiguation
– Syntax ambiguity
– Imperfect or irregular input
– Speech acts
• Dream of AI community
– to have algorithms that are capable of automatically
reading and obtaining knowledge from text
NLP Task Categories
• Question answering
• Automatic summarization
• Natural language generation & understanding
• Machine translation
• Foreign language reading & writing
• Speech recognition
• Text proofing
• Optical character recognition
Text Mining Process
• A Context Diagram for Text Mining Process
Text Mining Process
Figure 7.6 The Three-Step/Task Text Mining Process.
Text Mining Process
• Step 1: Establish the corpus
– Collect all relevant unstructured data
(e.g., textual documents, XML files, emails, Web
pages, short notes, voice recordings…)
– Digitize, standardize the collection
(e.g., all in ASCII text files)
– Place the collection in a common place
(e.g., in a flat file, or in a directory as separate files)
Text Mining Process
• Step 2: Create the Term–by–Document Matrix
Text Mining Process
• Step 2: Create the Term–by–Document Matrix (TDM)
(Cont.)
– Should all terms be included?
▪ Stop words, include words
▪ Synonyms, homonyms
▪ Stemming
– What is the best representation of the indices (values
in cells)?
▪ Raw counts; binary frequencies; log frequencies;
▪ Inverse document frequency
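The questions above can be explored with a small sketch that builds a term-by-document matrix and compares cell representations. The three-document corpus, the stop-word list, and the helper names are all hypothetical; terms are placed in rows and documents in columns as an assumed layout, and `1 + log(count)` is one common form of log frequency. Punctuation handling and stemming are left out.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "is", "of", "and", "to"}  # tiny illustrative list

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def term_document_matrix(corpus):
    """Return sorted terms and a dict: term -> raw counts per document."""
    counts = [Counter(tokenize(doc)) for doc in corpus]
    terms = sorted(set().union(*counts))
    return terms, {t: [c[t] for c in counts] for t in terms}

def binary(row):
    return [1 if v > 0 else 0 for v in row]

def log_frequency(row):
    return [1 + math.log(v) if v > 0 else 0.0 for v in row]

def idf(row):
    """Inverse document frequency: log(N / number of documents containing the term)."""
    df = sum(1 for v in row if v > 0)
    return math.log(len(row) / df)

corpus = [
    "the project risk is high",
    "risk of delay and risk of cost",
    "the cost is high",
]
terms, raw = term_document_matrix(corpus)
for t in terms:
    print(f"{t:8s} raw={raw[t]} binary={binary(raw[t])} idf={idf(raw[t]):.3f}")
```

A term that appears in fewer documents gets a larger inverse document frequency, so it contributes more to telling documents apart than a term that appears almost everywhere.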
Text Mining Process
• Step 2: Create the Term–by–Document Matrix (TDM)
(Cont.)
– TDM is a sparse matrix. How can we reduce the
dimensionality of the TDM?
▪ Manual - a domain expert goes through it
▪ Eliminate terms with very few occurrences in very
few documents (?)
▪ Transform the matrix using singular value
decomposition (SVD)
▪ SVD is similar to principal component analysis
Text Mining Process
• Step 3: Extract patterns/knowledge
– Classification (text categorization)
– Clustering (natural groupings of text)
▪ Improve search recall
▪ Improve search precision
▪ Scatter/gather
▪ Query-specific clustering
– Association
– Trend Analysis (…)
Sentiment Analysis
• Sentiment → belief, view, opinion, and conviction
• Sentiment analysis is trying to answer the question “What
do people feel about a certain topic?”
• By analyzing data related to opinions of many using a
variety of automated tools
• Used in a variety of domains, but its applications in CRM are
especially noteworthy (as they relate to
customers’/consumers’ opinions)
Sentiment Analysis Applications
• Voice of the customer (VOC)
• Voice of the Market (VOM)
• Voice of the Employee (VOE)
• Brand Management
• Financial Markets
• Politics
• Government Intelligence
• … others
Sentiment Analysis Process
Sentiment Analysis Process
• Step 1 – Sentiment Detection
– Comes right after the retrieval and preparation of the
text documents
– It is also called detection of objectivity
▪ Fact [= objectivity] versus Opinion [= subjectivity]
• Step 2 – N-P Polarity Classification
– Given an opinionated piece of text, the goal is to
classify the opinion as falling under one of two
opposing sentiment polarities
▪ N [= negative] versus P [= positive]
Sentiment Analysis Process
• Step 3 – Target Identification
– The goal of this step is to accurately identify the target
of the expressed sentiment (e.g., a person, a product,
an event, etc.)
▪ Level of difficulty → the application domain
• Step 4 – Collection and Aggregation
– Once the sentiments of all text data points in the
document are identified and calculated, they are to be
aggregated
▪ Word → Statement → Paragraph → Document
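The detection, polarity, and aggregation steps can be illustrated with a toy lexicon-based sketch. This is only one possible approach and an assumption of this example: the word lists are hypothetical, a statement with no lexicon words is treated as objective (a crude stand-in for Step 1), and target identification (Step 3) is not attempted.

```python
POSITIVE = {"great", "love", "excellent", "good"}   # hypothetical lexicon
NEGATIVE = {"bad", "poor", "terrible", "hate"}      # hypothetical lexicon

def statement_polarity(statement):
    """Word level -> statement level: +1 per positive word, -1 per negative word."""
    words = [w.strip(".,!?").lower() for w in statement.split()]
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

def document_sentiment(document):
    """Statement level -> document level: aggregate the opinionated statements."""
    statements = [s for s in document.split(".") if s.strip()]
    scores = [statement_polarity(s) for s in statements]
    opinionated = [s for s in scores if s != 0]  # drop statements treated as facts
    total = sum(opinionated)
    label = "P" if total > 0 else "N" if total < 0 else "neutral"
    return label, scores

review = "The battery is great. It arrived on Tuesday. The screen is poor. I love the camera."
label, scores = document_sentiment(review)
print(label, scores)
```

A lexicon lookup like this misses negation ("not good") and sarcasm, which is part of why the application domain drives the level of difficulty.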
Web Mining Overview
• Web is the largest repository of data
• Data is in HTML, XML, text format
• Challenges (of processing Web data)
– The Web is too big for effective data mining
– The Web is too complex
– The Web is too dynamic
– The Web is not specific to a domain
– The Web has everything
• Opportunities and challenges are great!
Web Mining
Web mining (or Web data mining) is the process of
discovering intrinsic relationships from Web data (textual,
linkage, or usage)
Web Usage Mining
• Extraction of information from data generated through
Web page visits and transactions…
– data stored in server access logs, referrer logs, agent
logs, and client-side cookies
– user characteristics and usage profiles
– metadata, such as page attributes, content attributes,
and usage data
• Clickstream data
• Clickstream analysis
Web Usage Mining
• Web usage mining applications
– Determine the lifetime value of clients
– Design cross-marketing strategies across products.
– Evaluate promotional campaigns
– Target electronic ads and coupons at user groups
based on user access patterns
– Predict user behavior based on previously learned
rules and users’ profiles
– Present dynamic information to users based on their
interests and profiles
–…
Search Engines
• Google, Bing, Yahoo, …
• For what reason do you use search engines?
• Search engine is a software program that searches for
documents (Internet sites or files) based on the keywords
(individual words, multi-word terms, or a complete
sentence) that users have provided that have to do with
the subject of their inquiry
• They are the workhorses of the Internet
Structure of a Typical Internet Search
Engine
Anatomy of a Search Engine
1.
Development Cycle
– Web Crawler
– Document Indexer
2.
Response Cycle
– Query Analyzer
– Document Matcher/Ranker
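The two cycles can be sketched with an inverted index. The page contents and URLs are hypothetical stand-ins for what a Web crawler would fetch, and ranking by summed term counts is a deliberately simple assumption, not how any production search engine ranks results.

```python
from collections import defaultdict

def build_index(pages):
    """Development cycle: the document indexer maps each word to the pages containing it."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Response cycle: analyze the query, match documents, and rank them."""
    terms = query.lower().split()                 # query analyzer
    scores = defaultdict(int)
    for term in terms:                            # document matcher
        for url, count in index.get(term, {}).items():
            scores[url] += count
    # Ranker: highest score first, ties broken alphabetically by URL
    return sorted(scores, key=lambda u: (-scores[u], u))

# Hypothetical crawler output
pages = {
    "a.example/deep": "deep learning with deep networks",
    "b.example/text": "text mining and deep analytics",
    "c.example/geo": "location analytics",
}
index = build_index(pages)
results = search(index, "deep analytics")
print(results)
```

The index is built once, ahead of time, so answering a query only needs dictionary lookups rather than a scan of every page.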
Search Engine Optimization
• It is the intentional activity of affecting the visibility of an e-commerce site or a Web site in a search engine’s natural
(unpaid or organic) search results
• Part of an Internet marketing strategy
• Based on knowing how a Search Engine works
– Content, HTML, keywords, external links, …
• Indexing based on …
– Webmaster submission of URL
– Proactively and continuously crawling the Web
Top 15 Most Popular Search Engines
(by eBizMBA, August 2016)
| Rank | Name | Estimated Unique Monthly Visitors |
|---|---|---|
| 1 | Google | 1,600,000,000 |
| 2 | Bing | 400,000,000 |
| 3 | Yahoo! Search | 300,000,000 |
| 4 | Ask | 245,000,000 |
| 5 | AOL Search | 125,000,000 |
| 6 | Wow | 100,000,000 |
| 7 | WebCrawler | 65,000,000 |
| 8 | MyWebSearch | 60,000,000 |
| 9 | Infospace | 24,000,000 |
| 10 | Info | 13,500,000 |
| 11 | DuckDuckGo | 11,000,000 |
| 12 | Contenko | 10,500,000 |
| 13 | Dogpile | 7,500,000 |
| 14 | Alhea | 4,000,000 |
| 15 | ixQuick | 1,000,000 |
Web Analytics Metrics
• Web site usability
– How were the visitors using my Web site?
• Traffic sources
– Where did they come from?
• Visitor profiles
– What do my visitors look like?
• Conversion statistics
– What does it all mean for the business?
Web Analytics Metrics
Web Site Usability
• Page views
• Time on site
• Downloads
• Click map
• Click paths
Traffic Source
• Referral Web sites
• Search engines
• Direct
• Offline campaigns
• Online campaigns
Web Analytics Metrics
Visitor Profiles
• Keywords
• Content groupings
• Geography
• Time of day
• Landing page profiles
Conversion Statistics
• New visitors
• Returning visitors
• Leads
• Sales/conversions
• Abandonment/exit rate
A Sample Web Analytics Dashboard
Social Analytics Social Network
Analysis
• Social Network - social structure composed of individuals
linked to each other
• Analysis of social dynamics
• Interdisciplinary field
– Social psychology
– Sociology
– Statistics
– Graph theory
Social Analytics Social Network
Analysis
• Social Networks help study relationships between
individuals, groups, organizations, societies
– Self organizing
– Emergent
– Complex
• Typical social network types
– Communication networks, community networks,
criminal networks, innovation networks, …
Social Analytics Social Network
Analysis Metrics
• Connections
– Homophily
– Multiplexity
– Mutuality/reciprocity
– Network closure
– Propinquity
• Distribution
– Bridge
– Centrality
– Density
– Distance
– Structural holes
• Segmentation
– Cliques and social
circles
– Clustering coefficient
– Cohesion
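A few of these metrics can be computed directly from an edge list. The sketch below covers density, degree centrality (one of several centrality measures), and the local clustering coefficient, using their standard graph-theory definitions for an undirected network; the four-person network is hypothetical.

```python
from itertools import combinations

def neighbors(edges):
    """Build an adjacency map for an undirected network."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def density(adj):
    """Actual ties divided by the number of possible ties."""
    n = len(adj)
    m = sum(len(v) for v in adj.values()) / 2
    return m / (n * (n - 1) / 2)

def degree_centrality(adj):
    """Share of the other nodes that each node is directly tied to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def clustering_coefficient(adj, node):
    """Share of a node's neighbor pairs that are themselves tied."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (len(nbrs) * (len(nbrs) - 1) / 2)

# Hypothetical network: Ann, Bob, and Cy form a clique; Dee knows only Cy
edges = [("Ann", "Bob"), ("Ann", "Cy"), ("Bob", "Cy"), ("Cy", "Dee")]
adj = neighbors(edges)
centrality = degree_centrality(adj)
print(f"density: {density(adj):.3f}")
print(f"centrality of Cy: {centrality['Cy']:.3f}")
print(f"clustering of Cy: {clustering_coefficient(adj, 'Cy'):.3f}")
```

In this network Cy is the bridge to Dee: removing Cy would disconnect Dee from everyone else.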
Social Media Analytics
• It refers to the systematic and scientific ways to consume the
vast amount of content created by Web-based social
media outlets, tools, and techniques for the betterment of
an organization’s competitiveness
• Tools to measure social media impact:
– Descriptive analytics
– Social network analysis
– Advanced analytics
Best Practices in Social Media
Analytics
• Think of measurement as a guidance system, not a rating
system
• Track the elusive sentiment
• Continuously improve the accuracy of text analysis
• Look at the ripple effect
• Look beyond the brand
• Identify your most powerful influencers
• Look closely at the accuracy of your analytic tool
• Incorporate social media intelligence into planning
Stream Analytics
• Data-in-motion analytics and real-time data analytics
– One of the Vs in Big Data = Velocity
• Analytic process of extracting actionable information from
continuously flowing data
• Why Stream Analytics?
– It may not be feasible to store the data, or the data may
lose its value
• Stream Analytics Versus Perpetual Analytics
• Critical Event Processing?
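The idea of analyzing data in motion can be sketched with a generator that keeps only a small sliding window of recent values instead of storing the whole stream. The window size, threshold, and sensor readings are hypothetical, and flagging a window whose average exceeds a threshold is just one simple way to define a critical event.

```python
from collections import deque

def detect_critical_events(stream, window=3, threshold=100.0):
    """Yield (position, window average) whenever the recent average exceeds the threshold."""
    recent = deque(maxlen=window)  # older readings fall out automatically
    for t, value in enumerate(stream):
        recent.append(value)
        avg = sum(recent) / len(recent)
        if len(recent) == window and avg > threshold:
            yield t, avg

# Hypothetical sensor readings arriving one at a time
readings = [90, 95, 98, 105, 110, 120, 80, 70]
alerts = list(detect_critical_events(readings))
for t, avg in alerts:
    print(f"reading #{t}: window average {avg:.2f} exceeds threshold")
```

Because the function is a generator, it can consume an unbounded source and raise each alert as soon as the triggering reading arrives.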
Stream Analytics
A Use Case in Energy Industry
Stream Analytics Applications
• e-Commerce
• Telecommunication
• Law Enforcement and Cyber Security
• Power Industry
• Financial Services
• Health Services
• Government
https://www.linkedin.com/pulse/fascinating-examplesshow-why-streaming-data-real-time-bernard-marr/
https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/iot-predictive-maintenance
https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/defect-prevention-with-predictive-maintenance
https://www.emersonautomationexperts.com/2013/industry/life-sciences-medical/establishing-a-process-analytical-technology-program/
https://gmpua.com/World/Ma/07/j.htm
https://ibsen.com/applications/spectroscopy/spectroscopy-for-process-analytical-technology-pat/
Location-Based Analytics
• Geospatial analytics / GIS
• Agricultural, crime, disease spread applications
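A basic building block of geospatial analytics is the distance between two latitude/longitude points. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; the site names and coordinates are hypothetical, and the `within_radius` helper is an illustrative name.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def within_radius(center, points, max_km):
    """Return the names of points no farther than max_km from the center."""
    lat0, lon0 = center
    return [name for name, (lat, lon) in points.items()
            if haversine_km(lat0, lon0, lat, lon) <= max_km]

# Hypothetical incident locations (latitude, longitude)
sites = {
    "site_a": (34.05, -118.24),
    "site_b": (34.10, -118.30),
    "site_c": (32.72, -117.16),
}
nearby = within_radius((34.05, -118.24), sites, max_km=25)
print(nearby)
```

The same distance check underlies questions such as which crime reports, disease cases, or fields fall within a given radius of a point of interest.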
https://blogs.oracle.com/analytics/have-you-ever-considered-location-analytics-to-optimize-your-business-space
DEMONSTRATE A COMMITMENT TO SOCIAL DISTANCING
While it’s likely there is a widely variable tolerance for risk among the various members of your on-campus community, it makes sense to consider strategies that maximize the number of people willing to return for the fall.
Thinking creatively about ways to build on the foundational steps outlined above can help inspire peace of mind for the greatest number of students, their parents, and staff.
Strategic Partnership with Degree Analytics
Deployed into your existing campus WiFi network by Apogee, the location analytics module from Degree Analytics is a software platform that uses anonymized data and machine learning to log WiFi access by zones you
designate. The tool generates visualizations – heat maps – of zone utilization and dwell time and can be used to gauge social distancing compliance on a zone-by-zone basis.
https://www.apogee.us/location-analytics/
Location-Based Analytics
• A Multimedia Exercise in Analytics Employing Geospatial
Analytics
– www.teradatauniversitynetwork.com/Library/Samples/BSI-The-Case-of-the-Dropped-Mobile-Calls
• Real-Time Location Intelligence
• Analytics Applications for Consumers
– Waze
– Yelp
– ParkPGH
– …
Leadership
Copyright © 2020, 2017, 2014 Pearson Education, Inc.
https://www.entrepreneur.com/article/274831
CITATIONS & BACKUP SLIDES
Citations
Art of Analytics: Safety Cloud - YouTube
Success.com
Mareana.com
Overview of Mareana COVID-19 app (Enterprise version) on Vimeo
https://www.entrepreneur.com/article/274831
https://yecommunity.com/en/blog/fail-fast-fail-often-fail-forward
www.teradatauniversitynetwork.com/Library/Samples/BSI-The-Case-of-the-Dropped-Mobile-Calls
https://www.apogee.us/location-analytics/
https://blogs.oracle.com/analytics/have-you-ever-considered-location-analytics-to-optimize-your-business-space
https://ibsen.com/applications/spectroscopy/spectroscopy-for-process-analytical-technology-pat/
https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/iot-predictive-maintenance
https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/defect-prevention-with-predictive-maintenance
https://www.emersonautomationexperts.com/2013/industry/life-sciences-medical/establishing-a-process-analytical-technology-program/
https://cs.uwaterloo.ca/~mli/Deep-Learning-2017-Lecture5CNN.ppt
https://gmpua.com/World/Ma/07/j.htm
What is Text Mining? – YouTube
Watson and the Jeopardy! Challenge - YouTube
http://www.torch.ch
PyTorch (pytorch.org)
TensorFlow (www.tensorflow.org)
Theano (deeplearning.net/software/theano)
Keras (keras.io)
Copyright
This work is protected by United States copyright laws and is
provided solely for the use of instructors in teaching their
courses and assessing student learning. Dissemination or sale of
any part of this work (including on the World Wide Web) will
destroy the integrity of the work and is not permitted. The work
and materials from it should never be made available to students
except by instructors using the accompanying text in their
classes. All recipients of this work are expected to abide by these
restrictions and to honor the intended pedagogical purposes and
the needs of other instructors who rely on these materials.
Copyright © 2020, 2015, 2011 Pearson Education, Inc. All Rights Reserved
CNN/RNN/LSTM Models
Today's business environment is dynamic and challenging, and organizations constantly face a
wide array of business problems. One of the most common, for businesses across the globe, is
poor financial management. It remains a severe issue in the PVC plumbing pipe industry, where
organizations continue to struggle to manage their finances effectively because some of the
available models are not suited to analyzing certain types of financial data. Over the years,
more organizations have turned to CNN, RNN, and LSTM models for this purpose. These models can
analyze complex, noisy financial data and so help organizations improve the accuracy of their
analysis. They help solve the business problem by extracting features from large volumes of raw
data with little manual effort.
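Sequence models such as RNNs and LSTMs are usually fed fixed-length windows of past observations. As a rough sketch of how raw financial figures might be prepared for such a model, the Python below builds (window, next value) pairs. The function name `make_windows`, the window length, and the sample figures are hypothetical, chosen only for illustration; the model itself is not shown.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a time series into (past window, next value) training pairs."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for i in range(len(series) - window):
        inputs.append(list(series[i:i + window]))  # the last `window` observations
        targets.append(series[i + window])         # the value the model should predict
    return inputs, targets


# Hypothetical monthly sales figures (illustrative values only).
sales = [10.0, 12.0, 11.0, 13.0, 15.0]
inputs, targets = make_windows(sales, window=3)
```

Each input row is a short history and each target is the figure that followed it, which is the shape of data a sequence model would learn from.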
Several decisions can be made based on the model's output. For instance, the parties involved
can make decisions about finances and about how to increase profitability. Both qualitative and
quantitative data will be needed, in particular the industry's financial data: sales,
expenditures, and profitability. Structured questionnaires will be used to collect this data.
One of the steps I have to take to make sure the model is relevant is training the organizations
within this industry. After that, the model will be tested and validated to confirm that it is
relevant.
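The testing and validation steps can be sketched for time-ordered financial data. This is a minimal illustration rather than the author's procedure: the function name `chronological_split`, the 70/15/15 proportions, and the sample figures are assumptions. The split keeps records in time order so that the model is never evaluated on data older than what it was fitted on.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    records: Sequence[float],
    train_pct: int = 70,  # assumed proportion, not from the source
    val_pct: int = 15,    # assumed proportion, not from the source
) -> Tuple[List[float], List[float], List[float]]:
    """Split time-ordered records into training, validation, and test parts."""
    if train_pct <= 0 or val_pct < 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must leave room for a test set")
    n = len(records)
    train_end = n * train_pct // 100           # integer math avoids float rounding
    val_end = train_end + n * val_pct // 100
    return (
        list(records[:train_end]),
        list(records[train_end:val_end]),
        list(records[val_end:]),
    )


# Hypothetical monthly sales figures, oldest first (illustrative values only).
monthly_sales = [100.0 + 5.0 * i for i in range(20)]
train, val, test = chronological_split(monthly_sales)
```

With 20 records this yields 14 for training, 3 for validation, and 3 held back for the final test.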
Text Mining
The inability to predict how customers perceive a particular company or industry, and how they
will behave when purchasing, is another common problem in the present business environment.
Predicting consumer perception and purchasing behavior continues to challenge the PVC plumbing
pipe industry despite efforts to better understand its customers. Text mining can help solve
this business problem, and enhance cross-selling and up-selling, by examining call center data.
The data will give the industry detailed information on customers' perceptions of its products
and services. Decisions that can be made based on the model's output include which products to
offer, which products the industry should stop producing, and how to raise product quality to
improve the customer experience.
The data needed includes demographic data to help understand the industry's products, how they
are selling, and who likes them. The data will be collected through structured questionnaires.
The steps to be followed include training the involved parties and then testing and validating
the model to make sure that it remains relevant.
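A first pass at mining call center text is often a simple term count. The sketch below is only an illustration under stated assumptions: the call notes are invented, the stop-word list is deliberately tiny, and the helper names `tokenize` and `term_frequencies` are hypothetical. A real project would add sentiment scoring or topic modeling on top of counts like these.

```python
import re
from collections import Counter
from typing import Iterable, List

# Small illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "is", "and", "to", "my", "i", "it", "was", "for", "of"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def term_frequencies(transcripts: Iterable[str]) -> Counter:
    """Count how often each term appears across all transcripts."""
    counts: Counter = Counter()
    for transcript in transcripts:
        counts.update(tokenize(transcript))
    return counts


# Hypothetical call center notes (invented for illustration).
calls = [
    "The pipe fitting was leaking and the delivery was late.",
    "Delivery was fast, and the pipe quality is great.",
    "I need a quote for fittings and pipe cement.",
]
top_terms = term_frequencies(calls).most_common(2)
```

Terms that recur across calls, here "pipe" and "delivery", point to the topics customers raise most and are a starting place for cross-selling and up-selling analysis.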
Web Mining / Web Analytics
Beyond the problems described above, organizations worldwide, and the PVC plumbing pipe
industry in particular, are still coming to terms with how quickly technology is accelerating
the pace of change. Industries are increasingly concerned with how technology is transforming
their operations, and they consider it a serious problem because they are uncertain how to adapt
to such a rapidly changing environment. Web mining, or web analytics, can address this problem
by allowing those in the industry to extract relevant data from the web and so develop a better
understanding of the changes taking place in the business environment.
Decisions that can be made based on this model's output include how to compete effectively in
the present business environment. The data required concerns the technological changes taking
place and the extent to which they affect this industry. The data will be collected through
online surveys. The first step I will take to make sure the model is relevant is training all
the participants. After that, the model will be tested and then validated.
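Extracting data from the web typically starts with parsing pages. As a minimal sketch, the Python below pulls hyperlinks out of an HTML document using only the standard library. The class name `LinkExtractor`, the sample page, and its example.com URLs are hypothetical; fetching live pages and respecting site terms are outside this sketch.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag encountered while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(
        self, tag: str, attrs: List[Tuple[str, Optional[str]]]
    ) -> None:
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content (invented for illustration).
SAMPLE_HTML = """<html><body>
<a href="https://example.com/pvc-standards">Standards</a>
<a href="https://example.com/smart-pipes">Smart pipes</a>
<p>No link here</p>
</body></html>"""

parser = LinkExtractor()
parser.feed(SAMPLE_HTML)
links = parser.links
```

The collected links could then be followed or counted to see which technology topics competitors and suppliers are publishing about.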
Streaming Analytics
Another serious business problem, particularly in the PVC plumbing pipe industry, is the
inability to predict demand for its products and services accurately and to plan production in
a timely manner. Streaming analytics helps address this problem by allowing organizations in
the industry to apply transaction-level logic to real-time observations. The decisions that can
be made based on the model's output concern how the industry can make its offers creative and
how to price them.
The data needed concerns the industry's price offers, its products, the products customers look
for most often, and how often customers shop for them. The steps to be followed include
training, testing, and validation of the model to make sure that it is relevant and effective.
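Applying transaction-level logic to real-time observations can be pictured as updating a running figure each time a transaction arrives. The sketch below keeps a sliding-window demand total per product. The class name `RollingDemand`, the 60-second window, and the sample transactions are assumptions for illustration, and the code assumes events arrive in timestamp order.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class RollingDemand:
    """Track units sold per product over a sliding time window."""

    def __init__(self, window_seconds: int) -> None:
        self.window_seconds = window_seconds
        # Each event is (timestamp in seconds, product, units sold).
        self._events: Deque[Tuple[int, str, int]] = deque()
        self._totals: Dict[str, int] = {}

    def observe(self, timestamp: int, product: str, units: int) -> int:
        """Record one transaction and return that product's total in the window."""
        self._events.append((timestamp, product, units))
        self._totals[product] = self._totals.get(product, 0) + units
        # Expire events that are a full window old or older.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_product, old_units = self._events.popleft()
            self._totals[old_product] -= old_units
        return self._totals[product]


# Hypothetical transaction stream (illustrative values only).
stream = [(0, "elbow", 5), (30, "elbow", 2), (61, "elbow", 1), (100, "tee", 4)]
demand = RollingDemand(window_seconds=60)
readings = [demand.observe(t, product, units) for t, product, units in stream]
```

Because each reading is available the moment a transaction is observed, pricing or offer logic could react to a spike without waiting for a batch report.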
Location Analytics
The rapid growth of data is also considered a severe business problem, specifically in the PVC
plumbing pipe industry. It has been reported that over 85% of the world's data was created in
the past few years, and managing it, securing it, and extracting useful insights from it is
becoming a major problem for organizations in this industry. Location analytics addresses this
problem by adding a layer of geolocation information to existing business data. Decisions that
can be made based on the model's output include which data, transactions, and events to extract.
The data needed concerns customers, their needs and preferences, and the industry's sales. The
data will be collected through online surveys, and the steps to be followed include training,
testing, and validation.
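Adding a geolocation layer to existing business data amounts to joining each record to a location and then aggregating. The sketch below is a minimal example under stated assumptions: the customer IDs, coordinates, region labels, sale amounts, and the function `sales_by_region` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical lookup from customer ID to (latitude, longitude, region).
CUSTOMER_LOCATIONS: Dict[str, Tuple[float, float, str]] = {
    "C001": (40.44, -79.99, "Northeast"),
    "C002": (29.76, -95.37, "South"),
    "C003": (40.71, -74.01, "Northeast"),
}

# Hypothetical existing business data: (customer ID, sale amount in dollars).
SALES: List[Tuple[str, float]] = [
    ("C001", 1200.0),
    ("C002", 800.0),
    ("C003", 500.0),
    ("C001", 300.0),
    ("C999", 50.0),  # customer with no known location
]


def sales_by_region(
    sales: List[Tuple[str, float]],
    locations: Dict[str, Tuple[float, float, str]],
) -> Dict[str, float]:
    """Attach a region to each sale and total the amounts per region."""
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, amount in sales:
        location = locations.get(customer_id)
        region = location[2] if location else "Unknown"
        totals[region] += amount
    return dict(totals)


regional_totals = sales_by_region(SALES, CUSTOMER_LOCATIONS)
```

Keeping an explicit "Unknown" bucket shows how much of the business data still lacks a location, which is itself useful when deciding what to collect next.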