University of Kentucky NLTK Natural Language Processing Python Code Report

fcbbegul

Description

Guidelines

  • Share screenshots in your response
  • Share the code and the plots
  • Include your name and ID number
  • Clearly mark each question number
  • Upload a Word document
  • Insert a cover page listing the questions attempted

HW NLTK Natural Language Processing

Q1 Review the Python script in the Q1 folder: NLTK_Text_Analysis.py

Use the text below and apply the same process to it

Text = """Backgammon is one of the oldest known board games. Its history can be traced back nearly 5,000 years to archeological discoveries in the Middle East. It is a two-player game where each player has fifteen checkers which move between twenty-four points according to the roll of two dice."""

a. Text Analysis Operations using NLTK

b. Tokenization

c. Stopwords removal

d. Lexicon Normalization such as Stemming and Lemmatization

e. POS Tagging

Q2 NLTK Corpus on Movie Reviews

Using the dataset, write a paper on movie reviews

http://www.cs.cornell.edu/people/pabo/movie-review-data/

https://www.kaggle.com/nltkdata/movie-review

https://www.nltk.org/book/ch06.html

Use the following reference to perform sentiment analysis on the movie reviews (“Movie Reviews.py”)

http://blog.chapagain.com.np/python-nltk-sentiment-analysis-on-movie-reviews-natural-language-processing-nlp/

Explanation & Answer

Please ignore the first file and use the file named HWII.

HWII
University
Name
July 05, 2020

Question 1:
Part a:
We assign the text above to a variable, text, as a triple-quoted string, and split it into sentences:
sentences = nltk.sent_tokenize(text)
Each sentence is then printed, followed by print() to add a blank line.
The output is the three sentences of the text, each printed separately.
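The sentence-splitting step can be sketched as follows, assuming NLTK is installed. The helper name split_sentences and the regex fallback are illustrative assumptions and are not part of the original script; the fallback only runs when the Punkt sentence model (downloaded with nltk.download, under the name "punkt" or "punkt_tab" depending on the NLTK version) is not available.

```python
import re

import nltk

text = """Backgammon is one of the oldest known board games. Its history can be traced back nearly 5,000 years to archeological discoveries in the Middle East. It is a two-player game where each player has fifteen checkers which move between twenty-four points according to the roll of two dice."""


def split_sentences(raw_text):
    """Split text into sentences with NLTK, or with a naive regex if Punkt is missing."""
    try:
        return nltk.sent_tokenize(raw_text)
    except LookupError:
        # Assumption: the Punkt data is not installed on this machine.
        # Fall back to splitting after '.', '!' or '?' followed by whitespace.
        return re.split(r"(?<=[.!?])\s+", raw_text.strip())


sentences = split_sentences(text)
for sentence in sentences:
    print(sentence)
    print()
```

The regex fallback is much cruder than Punkt (it would split on abbreviations such as "Dr."), so it is only a stand-in for this short text.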
Part b:
words = nltk.word_tokenize(sentence)
print(words)
print()
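A minimal sketch of word tokenization on the first sentence. It uses TreebankWordTokenizer rather than nltk.word_tokenize so that it runs without any downloaded model; this substitution is an assumption made for the example, and on a single simple sentence like this one the two should produce the same tokens.

```python
from nltk.tokenize import TreebankWordTokenizer

sentence = "Backgammon is one of the oldest known board games."

# TreebankWordTokenizer is rule-based, so no NLTK data download is required.
tokenizer = TreebankWordTokenizer()
words = tokenizer.tokenize(sentence)
print(words)
```

Note that the final period becomes its own token, which is why it still appears in the list after stopword removal.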
Output:

Part c:

Output:

['Backgammon', 'one', 'oldest', 'known', 'board', 'games', '.']
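The filtering that produces a list like the one above can be sketched as follows. The token list is typed in by hand for the first sentence, and the small fallback set is a hypothetical stand-in used only when the NLTK stopwords corpus (nltk.download("stopwords")) is not installed.

```python
from nltk.corpus import stopwords

words = ["Backgammon", "is", "one", "of", "the", "oldest", "known", "board", "games", "."]

try:
    stop_words = set(stopwords.words("english"))
except LookupError:
    # Assumption: the stopwords corpus is not installed.
    # This tiny hand-picked set is for illustration only.
    stop_words = {"is", "of", "the", "a", "an", "it", "its", "to", "in"}

# Compare in lowercase so capitalized stopwords (e.g. "It") are also removed.
filtered_words = [w for w in words if w.lower() not in stop_words]
print(filtered_words)
```

Punctuation is not a stopword, so the "." token is kept unless it is filtered separately.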
Part d & e:

Output:
Stemmer: seen
Lemmatizer: see
Stemmer: drove
Lemmatizer: drive
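A sketch of stemming, lemmatization, and POS tagging that should reproduce output of the form shown above. The input words "seen" and "drove", the verb setting pos="v", and the choice of PorterStemmer are assumptions inferred from the printed output, not confirmed by the original script. The lemmatizer needs the WordNet data and the tagger needs the averaged perceptron tagger data; if either is missing, the sketch prints a hint instead of failing.

```python
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

stems = {}
for word in ["seen", "drove"]:
    # Stemming strips suffixes by rule; irregular verb forms are left unchanged.
    stems[word] = stemmer.stem(word)
    print("Stemmer:", stems[word])
    try:
        # Lemmatization looks the word up in WordNet; pos="v" treats it as a verb.
        print("Lemmatizer:", lemmatizer.lemmatize(word, pos="v"))
    except LookupError:
        print("Lemmatizer: WordNet data missing, try nltk.download('wordnet')")

tokens = ["Backgammon", "is", "one", "of", "the", "oldest", "known", "board", "games", "."]
try:
    # pos_tag returns a list of (token, tag) pairs using Penn Treebank tags.
    print(nltk.pos_tag(tokens))
except LookupError:
    print("POS tagger data missing, try nltk.download('averaged_perceptron_tagger')")
```

The contrast between the two outputs is the point of the exercise: the stemmer leaves "seen" and "drove" as they are, while the lemmatizer maps them to the dictionary forms "see" and "drive".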

Question 2:
The movie review analysis uses the Kaggle dataset, which is mainly used for sentiment analysis of reviews. It has und...
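As a rough illustration of the kind of sentiment analysis the assignment refers to, the sketch below trains NLTK's NaiveBayesClassifier on bag-of-words features. The four training reviews are invented toy examples and the helper bag_of_words is a hypothetical name; neither comes from the dataset or from the original answer.

```python
from nltk.classify import NaiveBayesClassifier


def bag_of_words(text):
    """Turn a review into a feature dict marking each word as present."""
    return {word: True for word in text.lower().split()}


# Toy training data, made up for this sketch (not taken from the real corpus).
train_reviews = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a boring and terrible film", "neg"),
    ("awful plot and bad acting", "neg"),
]
train_set = [(bag_of_words(review), label) for review, label in train_reviews]

classifier = NaiveBayesClassifier.train(train_set)

prediction_pos = classifier.classify(bag_of_words("great story"))
prediction_neg = classifier.classify(bag_of_words("boring and terrible"))
print(prediction_pos, prediction_neg)
```

To work with the real data, the toy list could be replaced by reviews read from nltk.corpus.movie_reviews (after nltk.download("movie_reviews")), with part of the data held out to measure accuracy.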
