CSCI 3110 Dalhousie University Algorithms And Data Structures Coding Task

User Generated

FM1998

Computer Science

CSCI 3110

Dalhousie University

CSCI

Description

Explanation & Answer length: 1000 words

Unformatted Attachment Preview

Midterm 2 posted: 9 am, 05.07.2021 due: midnight, 09.07.2021 You are not allowed to work in groups for the midterm. You can look in books and online, but you must not discuss the exam problems with anyone. If you don’t understand a question, contact Travis or one of the TAs for an explanation (but no hints). All problems are weighted the same and, for problems broken down into subproblems, all subproblems are weighted the same. As with assignments, answers the markers consider hard to read will get 0! 1. Suppose your class is on a field trip to an island with n towns on it, connected by m townto-town buses (which run in both directions), run by c companies. Each company’s buses are a different colour, and there can be buses from two or more companies running between two towns. You have a map showing which companies run buses between which towns. The drivers have a relaxed attitude to schedules and the buses run often, so there’s no telling which buses will be arriving and leaving next. Your classmate Ryan has wandered off and got lost and you’re (somewhat reluctantly) trying to find him. You’d told him which buses the class was supposed to take during the day, and given him tickets from the appropriate companies, the same colours as the buses and stapled together in the right order. Ryan didn’t remember which towns the class was going to visit, however, so he always took the first bus he saw of the colour of the next ticket, tearing off that ticket and giving it to the driver. Design a polynomial-time dynamic-programming algorithm that, given the map of the bus routes, Ryan’s starting point and the colours of the buses he took, calculates the probability Ryan is in each of the n towns. For example, if the map is as shown below and Ryan started in town A and took a red bus, a green bus, a blue bus and another green bus, then he could have gone A-B-C-D-B, A-B-D-A-C, A-B-D-C-A, A-B-D-C-B, A-C-A-B-C, A-C-A-B-D, A-C-A-D-B, A-C-B-A-C . The probability Ryan went first from A to B is 0.5, and the probability he went first from A to C is 0.5. Therefore, after one trip, the probability he was in B is 0.5 and the probability he was in C is 0.5. The probability Ryan’s second trip took him from B to C is 0.5 times the probability he was in B, or 0.25. The probability it took him from B to D is also 0.5 times the probability he was in B, or 0.25. The probability it took him from C to A is 0.5 times the probability he was in C, or 0.25. The probability it took him from C to B is 0.5 times the probability he 1 was in C, or 0.25. Therefore, after two trips, the probability is 0.25 he was in any particular town. The probability Ryan’s third trip took him from A to B is 0.5 times the probability he was in A, or 0.125. The probability it took him from A to D is 0.5 times the probability he was in A, or 0.125. The probability it took him from B to A is the probability he was in A, or 0.25. The probability it took him from C to D is the probability he was in C, or 0.25. The probability it took him from D to A is 0.5 times the probability he was in D, or 0.125. The probability it took him from D to C is 0.5 times the probability he was in D, or 0.125. Therefore, after three trips, the probabilities Ryan was in A, B, C, D are, respectively, 0.25 + 0.125 = 0.375 (the probability his third trip took him from B to A plus the probability it took him from D to A), 0.125, 0.125 and 0.125 + 0.25 = 0.375 (the probability his third trip took him from A to D plus the probability it took him from C to D). You can compute the probability of Ryan being in a particular town after four trips and five trips similarly. Isn’t it lucky you’re on an island, so you can’t accidentally lose Ryan forever? (Hint: first design an algorithm that computes the number of ways Ryan could have ended up in a town, and then modify it to compute the probability.) B A D C 2. Your professor Travis told your TA Sarah that he was going to ask you to modify the code in the lecture notes for computing edit distance, to compute an optimal alignment using only one pass through the matrix.* (The question on Assignment 5 allows filling in the matrix and then walking back from the bottom right corner to the top left corner to compute the alignment.) Travis claimed it was possible by keeping two more arrays, B[0..m, 0..n] and C[0..m, 0..n], where B[i, j] is a pointer to a string of length O(m + n) containing the top line in an optimal alignment of S[1..i] to T [1..j], and C[i, j] is a pointer to a string of length O(m+n) containing the bottom line in that alignment. To compute B[i, j], we sprintf into an empty string either B[i − 1, j] or B[i − 1, j − 1] or B[i, j − 1], followed by either S[i] or ‘-’. To compute C[i, j], we sprintf into an empty string either C[i − 1, j] or C[i − 1, j − 1] or C[i, j − 1], followed by either T [j] or ‘-’. Sarah correctly pointed out that sprintfing a string of length O(m + n) takes O(m + n) time, so Travis’s solution takes cubic time. Help Travis by figuring out how to modify his solution to use quadratic time again. * Yes, this question is based on a true story. 2 (Hint: pointers are your friends!) Bonus (worth 10% of the midterm): Can you reduce the space usage to O((m + n)k), assuming you’re given the edit distance k between S and T ? (Hint: banded dynamic programming and garbage collections are your friends!) 3. For the problem Longest Kind-Of Increasing Subsequence (LKOIS), we’re given a sequence S[1..n] of integers and asked to find the longest subsequence S 0 of S such that S 0 [i − 1] − 3 ≤ S 0 [i] for 1 < i ≤ |S 0 |. Give an O(n log n) algorithm for LKOIS. 4. For the problem Partition, we’re given a set S of positive integers that sum to 2t and asked if there is a subset of S that sums to exactly t. Prove Partition is NP-complete by ˆ showing Partition is in NP, ˆ reducing one of the NP-complete problems we’ve seen in class to Partition. 5. Write a program that, given a list of the edges in a connected graph G on the vertices 1, . . . , n, in polynomial time outputs a Boolean formula F that is satisfiable if and only if G has a Hamiltonian path. You can assume the list of edges looks something like (1, (1, (4, (6, (5, 2) 3) 2) 5) 3) with one pair per line, and your output should consist of a single line containing copies of space, (, ), AND, OR, NOT and variables that look something like x1, x2, etc. 3 CSCI 3110: Design and Analysis of Algorithms May 4th, 2020 Lecture 1a Lecturer: Travis Gagie Scribe: Travis Gagie We start the course with 5 key ideas: • computer scientists study computations, not necessarily computers; • you can do computations and computer science without a computer; • two problems can seem similar while one is easy and the other is hard; • two problems can seem unrelated while being somehow equivalent; • “basic” techniques — such as divide and conquer — can be applied to hard problems. Computers perform computations and computations are performed by computers, but when Turing wrote about computers in the 1930s, he meant people. These days people use computers to do lots of things without knowing or caring what computations are being carried out, so let us consider a computational problem that predates electronic computers. 1 Eulerian cycles In the 1700s, the citizens of Königsberg in Prussia wrote to Leonhard Euler, asking him if someone could walk across each of the bridges in the city exactly once and arrive back where they started. In the process of showing it was impossible, Euler invented graph theory. If we model the city as a graph, with the shores and islands as vertices and the bridges as edges as shown in Figure 1, then such a walk corresponds to a cycle in the graph that crosses each edge once. Obviously the graph must be connected, which it is in this example; also, for every arrival at a vertex there must be a matching departure, and vice versa, so the degree of every vertex must be even, which is not true here. It is easy to see that the two conditions mentioned above are necessary, and Euler showed that, at least for finite graphs, they are also sufficient. Definition 1 An Eulerian cycle of a graph is a cycle that contains each edge exactly once. A graph is Eulerian if it contains an Eulerian cycle. Theorem 2 A finite graph is Eulerian if and only if it is connected and all its vertices’ degrees are even. Proof Suppose a finite graph meets the conditions and we start at a vertex of the graph and start walking, never crossing the same edge twice. Either we eventually arrive back where we started, or we get stuck at some other vertex with no way to leave. To see that the second case is impossible, suppose that whenever we cross an edge, we remove it from the graph, reducing the degree of the vertex we just left by 1 and the degree of the vertex we just arrived at by 1. Since all the degrees are initially even, after we leave the vertex 1a-1 Figure 1: A map of Königsberg with the bridges shown in green (left) and the corresponding graph (right). (Both images from https://en.wikipedia.org/wiki/Seven_Bridges_ of_Konigsberg.) where we start and until we return to it, its degree is odd and the degree of the vertex we are currently visiting is odd. Since 0 is even, it follows that we always have an edge to leave across, at least until we return to where we started. When we return to where we started the graph may be disconnected but, as long as there are edges, we can again choose a starting point and walk until we return to it. Repeating this until there are no edges left in the graph, we decompose the graph into edge-disjoint cycles. We can take any of the cycles in our current collection, walk around it until we reach a vertex it shares with another cycle, walk around that other cycle completely, and then continue walking around the first cycle from the shared vertex. This gives us a single cycle containing all the edges of those two cycles, so we can replace them with it. Repeating this, eventually we obtain a single cycle, which crosses each edge in the graph exactly once. Figure 2 shows an example. By essentially the same arguments, Theorem 2 can be extended to directed cycles in directed graphs: Theorem 3 A finite directed graph is Eulerian if and only the underlying undirected graph is Eulerian and each vertex’s in-degree and out-degree are equal. Eulerian cycles are not just a mathematical curiousity: for example, bioinformaticians build Eulerian cycles of de Bruijn for de novo assembly of genomes. Early sequencing and assembly of the COVID-19 genome was essential in the development of tests for infection, without which the world would be in an even worse situation right now. To sequence a genome, a sample is put through a sequencing machine, which produces fragments of DNA called reads. Most sequencing machines produce short reads, which are about 150 or 200 characters long, while some now produce long reads, which can be thousands of characters long but are much less accurate. In comparison, the COVID-19 genome is about 30 thousand characters long and the human genome is about 3 billion characters long. 1a-2 To help you see how building Eulerian cycles can help bioinformaticians piece the reads together to get the whole genome, your first homework problem is “Tanya and Password” from Codeforces (https://codeforces.com/problemset/problem/508/D): While dad was at work, a little girl Tanya decided to play with dad’s password to his secret database. Dad’s password is a string consisting of n + 2 characters. She has written all the possible n three-letter continuous substrings of the password on pieces of paper, one for each piece of paper, and threw the password out. Each three-letter substring was written the number of times it occurred in the password. Thus, Tanya ended up with n pieces of paper. Then Tanya realized that dad will be upset to learn about her game and decided to restore the password or at least any string corresponding to the final set of three-letter strings. You have to help her in this difficult task. We know that dad’s password consisted of lowercase and uppercase letters of the Latin alphabet and digits. Uppercase and lowercase letters of the Latin alphabet are considered distinct. Actually, because Dad’s password need not end with the same three-letter substring it starts with, what we are looking for in this problem is an Eulerian path or Eulerian tour — a walk that crosses each edge once but may end at a different vertex that where it started — not necessarily and Eulerian cycle. In particular, we are looking for an Eulerian path in a de Bruijn graph, which is a graph in which the vertices are tuples of characters and there can be an edge from one vertex to another only if it is possible to delete the leftmost character from the tuple for the first vertex, add a character on the right, and obtain the tuple for the second vertex. If a directed graph has an Eulerian path then we need add at most one edge such that it contains an Eulerian cycle: either each vertex’s in-degree is equal to its out-degree, or there is one vertex with 1 more in-edge than out-edge and one vertex with 1 more out-edge than in-edge; we add an edge from the first vertex to the second one. Figure 3 shows the graph for the first example instance on the Codeforces page, abacaba, together with the edge we must add to make the graph Eulerian. Note that the vertices are pairs, because the three-letter substrings in the problem correspond to edges that move us from pair to pair! Once we find an Eulerian cycle in the graph with the extra edge inserted, we can delete that edge and obtain an Eulerian path. A reasonably efficient implementation of Euler’s construction will solve the Codeforces problem in linear time, but if we change the problem just a little — asking that the cycle visit each vertex exactly once instead of cross each edge exactly once — then no one knows a polynomial-time algorithm for it, and we strongly suspect no such algorithm exists. Generally we consider an algorithm to be efficient if it takes time polynomial in the size of its input (even if sometimes that definition is too loose to be useful in practice). 2 Hamiltonian cycles A Hamiltonian cycle of a graph is a cycle that visits each vertex exactly once: for example, see http://dm.compsciclub.ru/app/quiz-hamiltonian-cycle. The problem of finding 1a-3 b c a d Euler cycle: g abcdecgefbgfa f e b b c a c d g g e f cdec abgfa e f bcgefb abcgefbgfa abcdecgefbgfa Figure 2: A decomposition of a graph into edge-disjoint cycles, and how they can be combined into an Euler cycle. AB BA CA AC Figure 3: The graph for the first example from the Codeforces page, with the dashed line at the top showing the edge we add to make the graph Eulerian. The Eulerian path visits AB, BA, AC, CA, AB again, and BA again, and corresponds to the strings ABACABA. 1a-4 a Hamiltonian cycle is not a mathematical curiousity either, since it is a special case of the Travelling Salesperson (TSP) problem, which is of practical interest to logisticians. For TSP, we are given a map of some cities and roads, modelled as an edge-weighted graph, and asked to plan a route for a salesperson to visit each city and return home while travelling the minimum distance, by finding a cycle that visits each vertex exactly once and has the minimum possible total edge-weight. If all the edge-weights are 1 then finding a cycle with total edge-weight equal to the number of cities is equivalent to finding a Hamiltonian cycle. Therefore, if we cannot find a Hamiltonian cycle, then we cannot solve TSP optimally in general. Later in the course we will see that the reverse holds as well. Just because we cannot find Hamiltonian cycles in general, does not mean there is no reason to study the problem of doing so. Sometimes it is easy for us to tell in polynomial if a Hamiltonian cycle does not does not exists — when the graph is a tree, for example — and sometimes, even if we cannot find a polynomial-time algorithm, we can find one that is better than just trying all the possible cycles. For example, we believe there is no polynomial-time algorithm to determine if there is a Hamiltonian cycle in a graph even when the graph is restricted to be planar — meaning it can be drawn in the plane with no edges crossing each other — and the maximum degree is 3. m The obvious algorithm is to try each of the n possible subsets of n of the m edges, where n is the number of vertices, and see if it is a Hamiltonian cycle. Since each edge has 2 endpoints and there are at most 3n endpoints in the graph, 2m ≤ 3n and so the there are at most 23n/2 ≤ 2.83n subsets of the edges in total. However, since the graph is planar, there is a theorem that tells us that we can find a planar separator in quadratic time: that √ is, a subset of at most 2 n vertices such that removing them breaks the graph into pieces each of size at most 2n/3. If we take such a separator and consider all the subsets of edges incident to its vertices which might not be in the Hamiltonian cycle, and recurse on the pieces — not looking for Hamiltonian cycles of the pieces but for subsets of edges that visit all their vertices exactly once and join up correctly with the edges we have chosen√incident to the separator — then we get a divide-and-conquer algorithm that takes 2O( n) time, which is asymptotically faster than an algorithm taking something like 2.83n time. 3 Summary So far, we have discussed the first three of the key ideas we started with, and touched on the fifth: • computer scientists study computations, not necessarily computers; • you can do computations and computer science without a computer; • two problems can seem similar while one is easy and the other is hard; • two problems can seem unrelated while being somehow equivalent; • “basic” techniques — such as divide and conquer — can be applied to hard problems. In the next half-lecture, we will talk more about the divide-and-conquer approach to algorithm design, and see how to use it to solve the problem of finding the maximum-sum 1a-5 subarray in an array, which we can do efficiently (in polynomial time). We will talk about reductions between that problem and a variant of it, and them return to discussing hard problems, time time graph colouring: http://dm.compsciclub.ru/app/quiz-map-coloring. To drive home the idea that figuring out what makes problems hard is itself a hard and interesting problem, consider Figure 4. Actually, though, the difference between finding an Eulerian cycle and finding a Hamiltonian cycle is somehow more subtle than what is considered in that cartoon. Determining whether a photo is of a bird is a “messy” problem, in that there may not be enough information to judge, people may differ on what they accept as a bird, etc. Finding cycles in graphs is clean and well-defined, and the division between easy and hard here is something deep and interesting. Figure 4: We will consider more interesting separations between “easy” and “hard”. (Image from https://xkcd.com/1425/.) 1a-6 CSCI 3110: Design and Analysis of Algorithms May 6th, 2020 Lecture 1b: The Maximum Sub Subarray Problem Lecturer: Travis Gagie Scribes: A. Ahmed, B. Zhao, F. Fei, K. McDonald-Landry, M. Rajendran, O. Dechant 1 Introduction to the Problem Initially, the maximum subarray problem was surveyed by Ulf Grenander around 1980s in digital images for estimation of some patterns. To make it simple, there is a one-dimension array that may include either negative or positive numbers. This problem needs to find at least one contiguous subarray or a singleton subarray which contains the maximum sum. Hence, there are 3 conditions we may consider: 1. If the numbers of the array a...
Purchase answer to see full attachment
User generated content is uploaded by users for the purposes of learning and should be used following Studypool's honor code & terms of service.

Explanation & Answer

View attached explanation and answer. Let me know if you have any questions.Let me know if you need any edits :)

DATA STRUCTURE AND ALGORITHMS MIDTERM
1. Suppose your class is on a field trip to an island with n towns on it, connected by m town- totown buses (which run in both directions), run by c companies. Each company’s buses are a
different colour, and there can be buses from two or more companies running between two towns.
You have a map showing which companies run buses between which towns. The drivers have a
relaxed attitude to schedules and the buses run often, so there’s no telling which buses will be
arriving and leaving next.
Your classmate Ryan has wandered off and got lost and you’re (somewhat reluctantly) trying to find
him. You’d told him which buses the class was supposed to take during the day, and given him
tickets from the appropriate companies, the same colours as the buses and stapled together in the
right order. Ryan didn’t remember which towns the class was going to ...


Anonymous
Really great stuff, couldn't ask for more.

Studypool
4.7
Trustpilot
4.5
Sitejabber
4.4
Similar Questions
Related Tags