Program requirements (220 points)
Unless otherwise specified, you should
always assume that every function you
are asked to implement has to work for
all sizes and variations of the data.
This assignment has 2 main parts:
• Part 1: Allow the user to repeatedly enter
pairs of words to compute the edit
distance for. The code that reads the
words and calls the edit distance is
provided in spell_checker.c. It will work
once you implement the edit distance
function.
The edit distance function will build and
print the table for the edit distance and
also the distance itself.
It should be CASE SENSITIVE.
EditDistance("dog","DOG") = 3
(I consider it important to be able to print the
data from your program in a formatted, easily
readable way that allows you to easily check and
verify that the program does what you want it to
do. Printing the table for the edit distance along
with the indices and corresponding letters from
the strings allows you to check that the program
generates the same table as we did in class or for
any other test case you develop on your own on
paper.)
• Part 2: Simulate a simple version of a
spell checker program.
• You will write all your code in the file
spell.c (provided). A client file,
spell_checker.c, is also provided. It
implements the high level behavior of
the program and calls specific functions
that you must implement in spell.c.
Details:
1. Implement the Edit Distance between 2
strings as shown in class. It must be the
BOTTOM-UP DYNAMIC PROGRAMMING
method (i.e. the one that has NO
recursion). Simply write the loop(s) to
populate the 2D table. - 20 points
Dist(0,0) = 0
Dist(0,j) = j
Dist(i,0) = i
Dist(i,j) = Dist(i-1,j-1) if X[i-1] = Y[j-1]
Dist(i,j) = 1 + min { Dist(i-1,j), Dist(i,j-1), Dist(i-1,j-1) }
if X[i-1] ≠ Y[j-1]
It should be CASE SENSITIVE.
EditDistance("dog","DOG") = 3
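The recurrence above can be sketched as a pair of nested loops. This is a minimal sketch under assumptions: the name edit_distance_sketch, its signature, and the single heap-allocated table are illustrative choices, not the function declared in spell.c, and the table printing required by the assignment is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Smallest of three ints. */
static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Bottom-up edit distance between x and y (no recursion).
 * Returns -1 if the table cannot be allocated.
 * Hypothetical name and signature, chosen for illustration only. */
int edit_distance_sketch(const char *x, const char *y) {
    assert(x != NULL && y != NULL);
    size_t n = strlen(x), m = strlen(y);
    size_t cols = m + 1;

    /* (n+1) x (m+1) table stored row by row in one block. */
    int *dist = malloc((n + 1) * cols * sizeof *dist);
    if (dist == NULL) return -1;

    for (size_t j = 0; j <= m; j++) dist[j] = (int)j;   /* Dist(0,j) = j */
    for (size_t i = 1; i <= n; i++) {
        dist[i * cols] = (int)i;                         /* Dist(i,0) = i */
        for (size_t j = 1; j <= m; j++) {
            if (x[i - 1] == y[j - 1]) {
                /* Same character (case sensitive): copy the diagonal. */
                dist[i * cols + j] = dist[(i - 1) * cols + (j - 1)];
            } else {
                dist[i * cols + j] = 1 + min3(dist[(i - 1) * cols + j],
                                              dist[i * cols + (j - 1)],
                                              dist[(i - 1) * cols + (j - 1)]);
            }
        }
    }

    int result = dist[n * cols + m];
    free(dist);
    return result;
}
```

Because the comparison uses == on plain chars, uppercase and lowercase letters count as different characters, which is what makes the function case sensitive.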
2. Print the distance matrix as a formatted
table. - 25 points
3. Allow the user to repeatedly compute the
edit distance between pairs of words
given as input. It stops when the user
enters -1 -1 . (Implementation already
provided)
4. Spell check.
1. If user selects verbose mode, print the
dictionary words before and after
sorting and also the words touched-on
during binary search. It should match
the sample output perfectly: index
number and word.
2. load a dictionary file. Sort the data in
the file in alphabetical order. (if
verbose mode, print the dictionary
before and after sorting.) You can
assume all the words in the dictionary
are in lowercase.
I STRONGLY encourage you to use the
qsort function from the C library. E.g.
read
http://www.cplusplus.com/reference/cstdlib/qsort/.
Note that the compar function that it
uses (and you need to write) takes
POINTERS to whatever type of data is
in your array. That means that if your
array already holds pointers, that
function takes pointers to pointers. It
may take a bit of careful trial and error,
but it is worth the price to learn to use
the qsort function. The compar
argument is a function pointer. (If you
are not familiar with it you can read
here
https://www.geeksforgeeks.org/function-pointer-in-c). Based on how you store
the dictionary words (array of pointers
or 2D array of chars), it may be a bit
tricky to set up the compar function, or
to give the correct size of the elements
for the qsort function.
If you write a good function to be passed for
the compar argument, qsort will work and you
do not need to implement a sorting function.
This is a great opportunity to learn how to use
a library function, as opposed to writing
everything ourselves. Function pointers are
also so cool...
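To make the two storage options concrete, here is a hedged sketch of a compar function for each one. The names cmp_ptr_words, cmp_row_words, sort_ptr_words, sort_row_words and the row width WORD_CAP are assumptions made for this example; only qsort and strcmp come from the C library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define WORD_CAP 16  /* hypothetical row width for the 2D-array layout */

/* Array of char*: qsort passes pointers to the elements, and the
 * elements are already pointers, so each argument is really a char**. */
static int cmp_ptr_words(const void *a, const void *b) {
    const char *wa = *(const char *const *)a;
    const char *wb = *(const char *const *)b;
    return strcmp(wa, wb);
}

/* 2D array of chars: each element is a whole row, so the argument
 * already points at the first character of a word. */
static int cmp_row_words(const void *a, const void *b) {
    return strcmp((const char *)a, (const char *)b);
}

/* Element size is the size of one pointer. */
void sort_ptr_words(char **words, size_t count) {
    assert(words != NULL);
    qsort(words, count, sizeof words[0], cmp_ptr_words);
}

/* Element size is the size of one full row (WORD_CAP bytes). */
void sort_row_words(char rows[][WORD_CAP], size_t count) {
    assert(rows != NULL);
    qsort(rows, count, sizeof rows[0], cmp_row_words);
}
```

The two differences to watch are the extra dereference in the comparison function and the element size given to qsort: sizeof of one pointer in the first layout, sizeof of one row in the second.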
3. open a file to write the corrected text
to. That is the output file. It will have
the same name as the text file, but with
the prefix "out_" added to it. E.g. if
processing text file "text1.txt" a new file
with name "out_text1.txt" will be
created and have the spell-checked
version of the paragraph from file
"text1.txt".
4. open the text file (e.g. text1.txt) and
process it as follows:
1. any separator is just copied in the
output file. List of symbols to be
recognized as separators: space (one
white space), comma, dot,
exclamation mark, question mark (,
.!?). You do not need to worry about
other separators. You can assume that
the file has only English letters and
the supported separators.
Make sure all separators are copied,
even if there are several consecutive
separators.
You can assume that there is NO new
2. extract a word. You can assume that
the file starts with a word (and never
with a separator). The last symbol in
the file may be a separator or a letter.
Make sure you extract the last word
correctly even when it does not have a
separator after it.
3. for each extracted word do:
- print it to the screen. Put two
vertical bars around it to be able to
tell if you read any extra space or
not with the word. E.g. print |Can|,
not just Can.
- search for it in the sorted dictionary
using binary search. Keep the count
of how many words were touched-
on during binary search (or how
many times the loop for binary
search executed) and print it.
If in verbose mode, print the
dictionary words that were used
during binary search.
- If the word is found, it means the
spelling is correct. Write it to the
output file.
- If the word is not found, identify
the most similar words in the
dictionary and offer the user these
options for the correction to use for
this word in the output file:
-1 - user will type the correct spelling
0 - leave the word as is (do not apply any correction)
a list of the most similar words from the dictionary. The user will select a word from this list.
Print the corrected word in the
output file.
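The word and separator handling described above can be sketched as a small tokenizer. This is only one possible approach, working on text already in memory; the names next_token, is_separator and the token_kind enum are hypothetical and do not come from spell.c or spell_checker.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum token_kind { TOK_END, TOK_SEPARATOR, TOK_WORD };

/* True for the five supported separators: space , . ! ? */
static int is_separator(char c) {
    return c != '\0' && strchr(" ,.!?", c) != NULL;
}

/* Reads one token starting at text[*pos] and advances *pos past it.
 * A token is either a single separator or a maximal run of letters.
 * tok receives the token as a string; cap is its capacity in bytes. */
enum token_kind next_token(const char *text, size_t *pos,
                           char *tok, size_t cap) {
    assert(text != NULL && pos != NULL && tok != NULL && cap >= 2);
    char c = text[*pos];

    if (c == '\0') {              /* nothing left to read */
        tok[0] = '\0';
        return TOK_END;
    }
    if (is_separator(c)) {        /* each separator is its own token */
        tok[0] = c;
        tok[1] = '\0';
        (*pos)++;
        return TOK_SEPARATOR;
    }

    size_t len = 0;
    while (text[*pos] != '\0' && !is_separator(text[*pos])) {
        if (len < cap - 1) tok[len++] = text[*pos];  /* extra letters dropped */
        (*pos)++;
    }
    tok[len] = '\0';              /* also ends a final word with no separator */
    return TOK_WORD;
}
```

A caller would loop until TOK_END, copying every TOK_SEPARATOR token straight to the output file and handling every TOK_WORD token with something like printf("|%s|", tok) followed by the dictionary lookup. Because each separator is returned on its own, consecutive separators are all preserved.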
5. In order to find the most similar words
in the dictionary file do:
1. compute the edit distance between
the misspelled word and all the
dictionary words. You can store all
these distances. Here you can make an
improvement and not compute the
distance to all words, but you do NOT
have to. It is fine to compute the
distance to all words in the dictionary
even though some may clearly be too
different (because they are too big or
too small).
2. find the smallest distance
3. print all the dictionary words that
have that edit distance. Print an
index as well to allow the user to
easily select the correct word.
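Steps 2 and 3 above amount to two passes over the stored distances. The sketch below assumes the distances were already computed into an int array, one entry per dictionary word; the name closest_indices and its parameters are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Finds the smallest value in dist[0..count-1] and stores in picked
 * the index of every entry equal to it, in increasing order.
 * picked must have room for count entries.
 * Returns how many indices were stored (0 if count is 0). */
size_t closest_indices(const int *dist, size_t count, size_t *picked) {
    assert(dist != NULL && picked != NULL);
    if (count == 0) return 0;

    /* Pass 1: smallest distance. */
    int best = dist[0];
    for (size_t i = 1; i < count; i++) {
        if (dist[i] < best) best = dist[i];
    }

    /* Pass 2: collect every word at that distance. */
    size_t found = 0;
    for (size_t i = 0; i < count; i++) {
        if (dist[i] == best) picked[found++] = i;
    }
    return found;
}
```

The caller can then print each candidate as a menu number followed by the dictionary word at picked[k]. The exact numbering and layout should follow the sample runs, which this sketch does not try to reproduce.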
6. Get the user's choice. If the choice is (-1)
the program will also allow the user to
type in a word. See sample runs.
7. Calculate the worst case time
complexity to find the most similar
words in the dictionary in case of
misspelled words. Assume there are T
misspelled words in the text file, D
words in the dictionary and that each
word can be at most MAX_LEN chars.
What is the time complexity to
compute the edit distance from each
test word to each dictionary word?
Since the word length can vary, you
should assume the worst case, that is,
assume that every test word and every
dictionary word is size MAX_LEN. What
is the big-O for this worst case scenario?
Give your answer as a function of T,D
and MAX_LEN. For example if T = 10
extracted words and D = 222 dictionary
words, and MAX_LEN =100 you would
assume that each of those (10+222)
words has 100 characters. Write the
time complexity at the top of your file
as a comment. (You do not need to
worry about the time to read the words
from files. Assume they are already in
memory for this calculation.)
8. Calculate the time complexity to search
for a word in the dictionary (to see if it
is correctly spelled) using binary
search. Assume the worst case: the
word is not found, and all the words
have MAX_LEN.
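For reference, the binary search with a count of touched words described in the spell-check steps can be sketched as below. The name search_counted, its signature, and the half-open range convention are assumptions; which words get touched depends on how the midpoint is chosen, so the convention must be adjusted to match the sample output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks for target in the sorted array words[0..count-1].
 * Returns its index, or -1 if it is not there.
 * *touched receives the number of dictionary words compared. */
int search_counted(char *const *words, size_t count,
                   const char *target, size_t *touched) {
    assert(words != NULL && target != NULL && touched != NULL);
    size_t lo = 0, hi = count;        /* search the range [lo, hi) */
    *touched = 0;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(target, words[mid]);
        (*touched)++;                 /* one more word looked at */
        if (cmp == 0) return (int)mid;
        if (cmp < 0) hi = mid;        /* target sorts before words[mid] */
        else lo = mid + 1;            /* target sorts after words[mid] */
    }
    return -1;                        /* not found: candidate misspelling */
}
```

In verbose mode, the line that increments the counter is also the natural place to print mid and words[mid].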