# **Data Structures and Algorithms (Fall 2024)** --- **Instructor:** [郑朝栋](https://chaodong.me) **Class meeting:** Monday 10:10-12:00 at 仙Ⅱ-404, and Thursday 10:10-12:00 at 仙Ⅱ-404 --- ## <b style="color:navy">--==News==--</b> --- * <b style="font-family:monospace;color:black">2023/09/02:</b> First class meeting, the semester begins! * <b style="font-family:monospace;color:black">2023/11/11:</b> The midterm exam will be on Nov 14, from 10:10 to 12:10. The exam will be closed-book. The venue will be 仙1-109. * <b style="font-family:monospace;color:purple">2023/12/23:</b> The final exam will be on Dec 30, from 10:30 to 12:30. The exam will be closed-book. The venue will be 仙1-109. --- ## **Course Overview** --- ### Description This course introduces to you the basic data structures and algorithms that lie at the core of computer science, along with the various techniques used in the design and analysis of these data structures and algorithms. Topics covered include basic data structures (list, stack, queue, heap, ...), sorting and order statistics, searching (search trees, hashing, ...), graph algorithms (graph traversal, minimum spanning tree, shortest path, network flow, ...), amortized analysis, algorithm design techniques (divide and conquer, greedy, dynamic programming, ...). Depending on time and pace, some advanced topics will also be covered. ### Textbook and References The required textbook for this course is [*Introduction to Algorithms (Third Edition)*](https://mitpress.mit.edu/books/introduction-algorithms-third-edition) by **C**ormen, **L**eiserson, **R**ivest and **S**tein. (Now you can see why people often call this book **CLRS**.) I strongly encourage you to read the [original English version](https://book.douban.com/subject/3904676/) instead of the [translated Chinese version](https://book.douban.com/subject/20432061/). Most of the topics and examples covered in this course will be adapted from the above textbook. However, there are certain topics and/or examples that are not included in CLRS, or cases where other materials provide better presentation. In such scenario, I will point out additional references in the slides. ### Grading Grades in this course will be based on problem sets, programming assignments, a midterm exam, and a final exam. Problem sets (henceforth called PS) will be given out regularly: most likely after the second class of each week. Each PS dues in one week, then we will grade it and return it to you another week later. We will also try to provide a sample solution for each problem set. Programming assignments (henceforth called PA) will be given to you every so often, each associated with a deadline. A programming assignment usually asks you to design and implement algorithms and/or data structures to solve some well-defined problems. Your code will be checked (specifically, compiled and executed) by an online judge (henceforth called OJ), and you get all credits so long as you pass all the tests the OJ system throws at your code. Notice, however, besides correctness, the OJ system will also pose requirements on efficiency (for example, runtime and memory usage). The two exams will most likely be closed-book, but this might change. The final grade will be a weighted sum: 20% from PS, 10% from PA, 25% from midterm exam, and 45% from final exam. Again, these percentages might change. --- ## **Course Logistics** --- ### Office Hour and Communication Regular office hours will be held from 15:30 to 17:00 on Thursday in my office (计算机楼623). You can also email me at [chaodong@nju.edu.cn](mailto:chaodong@nju.edu.cn). There is a QQ group (838684360) where you can discuss amongst yourselves. The teaching assistants and I will also visit it regularly. So, go join it! ### Teaching Assistant There will be several teaching assistants (TA) for our class, see below for their details (listed in alphabetical order). They will primarily be responsible for PS and PA grading. If there is enough demand, they will also do tutorial sessions to discuss PS and PA. * 白思睿, [srbai@smail.nju.edu.cn](mailto:srbai@smail.nju.edu.cn). * 陈铭, [221300059@smail.nju.edu.cn](mailto:221300059@smail.nju.edu.cn). * 傅心语, [xyfu@smail.nju.edu.cn](mailto:xyfu@smail.nju.edu.cn). * 谢云天, [1249761356@qq.com](mailto:1249761356@qq.com). ### Online Judge As mentioned previously, we will use an OJ system to grade PA. The URL of the OJ system is [oj.chaodong.me](https://oj.chaodong.me). Please visit the site and register an account. **During registration, choose "NJU AI DS&Alg Class - Fall 2024" as the organization. You should register only one account and use your NJU ID (学号) as username.** Extra accounts will be removed. Once registered, go join the sample contest and solve the simple "A Plus B" problem to get yourself familiar the OJ system. The OJ system currently supports C, C++, and Python. Nontheless, we encourage (and often enforce) you to only use C or C++. <figure style="padding-top: 5px; padding-bottom: 5px"> <img src="./pics/oj-reg.jpg" style="width:30%; display:block; margin-left:auto; margin-right:auto; border:1px solid black"> <figcaption style="text-align:center; font-style:italic; font-size:small">Registration page of the OJ system.</figcaption> </figure> ### Typeset Your PS Solutions Instead of handwriting the solutions for PS, we encourage you to use [$\rm\LaTeX$](https://www.latex-project.org/). (You are free to write in Chinese or in English though.) LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation. LaTeX is the de facto standard for the communication and publication of scientific documents. You can follow instructions on [this page](https://www.latex-project.org/get/) to get LaTeX installed onto your system. For example, if you are using a Windows PC, then you can get the [MiKTeX](https://miktex.org/) distribution. I also recommand using [TeXmaker](https://www.xm1math.net/texmaker/) or [TeXstudio](https://www.texstudio.org/) as an integrated writing environment for creating LaTeX documents. There are also online LaTeX services like [Overleaf](https://www.overleaf.com/). It takes some time to learn to use LaTeX, but it is not that hard. For example, "The Not So Short Introduction to $\rm\LaTeX$ $2\varepsilon$", or "lshort" in short, is a popular tutorial to get you started. You can find it [here](https://www.ctan.org/tex-archive/info/lshort/), it is in [English](http://mirrors.ctan.org/info/lshort/english/lshort.pdf), [Chinese](http://mirrors.ctan.org/info/lshort/chinese/lshort-zh-cn.pdf), and many other languages. If you need to write pseudocode in LaTeX documents, consider using the [algorithmicx](https://ctan.org/pkg/algorithmicx) package or the [algorithm2e](https://ctan.org/pkg/algorithm2e) package. <figure style="padding-top: 5px; padding-bottom: 5px"> <img src="./pics/texstudio-screenshot.jpg" style="width:50%; display:block; margin-left:auto; margin-right:auto; border:1px solid black"> <figcaption style="text-align:center; font-style:italic; font-size:small">A screenshot of TeXstudio. I'm using LaTeX to typeset the sample solution of a PS.</figcaption> </figure> --- ## **Academic Integrity** --- I take academic integrity seriously. You should always try to solve PS and PA independently. You may discuss with me or the TAs. You may also discuss with your classmates if you really need to, but you **must** list their names when you hand in your solutions. You **may not** search and/or copy-paste existing solutions, regardless of the source of the solutions (including AI such as ChatGPT). Violating these rules will lead to a zero on the corresponding assignment and potential reporting to the university. When in doubt, ask me what is allowed. --- ## **Schedule** --- <b style="color:purple">This section will be frequently updated as the semester progresses. Slides, PS, and PA will all be posted/announced here. So, remember checking back regularly!</b> * <b style="font-variant:small-caps;font-size:1.2em">Sep 02:</b> course overview [\[slides ver.0903\]](./slides/00-intro-fall24.pptx), algorithm analysis basics (correctness of algorithms, complexity of algorithms). * <b style="font-variant:small-caps;font-size:1.2em">Sep 05:</b> algorithm analysis basics (asymptotic notations) [\[slides ver.0907\]](./slides/01-alg-analysis-basics-fall24.pptx), basic data structures (ADT, various queue ADT, list ADT, array based implementation, linked-list based implementation). - <u style="color:darkcyan">Problem Set 1</u>, due September 12 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS1-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps1.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Sep 09:</b> applications of stack, recursion versus iteration (tail recursion) [\[slides ver.0913\]](./slides/02-basic-ds-fall24.pptx), divide-and-conquer (basic idea, correctness, mergesort). * <b style="font-variant:small-caps;font-size:1.2em">Sep 12:</b> divide-and-conquer (integer multiplication, matrix multiplication, recurrence-tree method, substitution method, the master theorem) [\[slides ver.0913\]](./slides/03-divide-and-conquer-fall24.pptx), reduce-and-conquer (binary search). - <u style="color:darkcyan">Problem Set 2</u>, due September 19 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS2-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps2.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Sep 14:</b> reduce-and-conquer (1D peak finding, 2D peak finding) [\[slides ver.0916\]](./slides/03-divide-and-conquer-part2-fall24.pptx), heaps (binary heap, priority queue). * <b style="font-variant:small-caps;font-size:1.2em">Sep 19:</b> heaps (heapsort) [\[slides ver.0919\]](./slides/04-heaps-fall24.pptx), sorting (the problem, characteristics of sorting algorithms, selection sort, bubble sort, shell sort, quick sort). - <u style="color:darkcyan">Problem Set 3</u>, due September 26 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS3-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps3.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Sep 23:</b> sorting (quick sort, randomized quicksort) [\[slides ver.0923\]](./slides/05-sorting-fall24.pptx). * <b style="font-variant:small-caps;font-size:1.2em">Sep 26:</b> computational complexity basics (notion of upper bound and lower bound, adversary argument, decision tree model, lower bound for comparsion-based sorting), sorting {part2} (linear time sorting algorithms, bucket sort, radix sort) [\[slides ver.0927\]](./slides/05-sorting-part2-fall24.pptx). * <b style="font-variant:small-caps;font-size:1.2em">Sep 30:</b> selection (lower bounds for finding min and/or max, randomized selection in expected linear time, deterministic selection in worst-case linear time) [\[slides ver.0930\]](./slides/06-selection-fall24.pptx), trees (terminology, representing trees in computers). - <u style="color:darkcyan">Problem Set 4</u>, due October 10 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS4-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps4.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Oct 10:</b> Tutorial session. * <b style="font-variant:small-caps;font-size:1.2em">Oct 12:</b> trees (various tree traversals) [\[slides ver.1012\]](./slides/07-trees-fall24.pptx), search trees (Set/OrderedSet ADT, definition of binary-search-tree, various operations on binary-search-tree). * <b style="font-variant:small-caps;font-size:1.2em">Oct 14:</b> search trees (treaps) [\[slides ver.1020\]](./slides/08-search-trees-fall24.pptx), search trees {part2} (red-black trees, insertion and deletion in red-black trees). * <b style="font-variant:small-caps;font-size:1.2em">Oct 17:</b> search trees {part2} (skiplists) [\[slides ver.1020\]](./slides/08-search-trees-part2-fall24.pptx), hashing (direct-address tables, notion of hashing, collisions in hashing, hashing with chaining). - <u style="color:darkcyan">Problem Set 5</u>, due October 31 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS5-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps5.pdf) - <u style="color:darkgreen">Programming Assignment 1 (4 Problems)</u>: [\[link\]](https://oj.chaodong.me/contest/pa1fall24dsalg) The submission deadline is November 10 23:59:59 (UTC+8). *(Notice: you need to join the "NJU AI DS&Alg Class - Fall 2024" organization to see the problems.)* * <b style="font-variant:small-caps;font-size:1.2em">Oct 21:</b> hashing (hash function design, universal hashing) [\[slides ver.1023\]](./slides/09-hashing-fall24.pptx), hashing {part2} (hashing with open addressing). * <b style="font-variant:small-caps;font-size:1.2em">Oct 24:</b> hashing {part2} (hashing with open addressing, perfect hashing) [\[slides ver.1024\]](./slides/09-hashing-part2-fall24.pptx), amortized analysis (notion of amortized analysis). * <b style="font-variant:small-caps;font-size:1.2em">Oct 27:</b> amortized analysis (accounting method, potential method) [\[slides ver.1027\]](./slides/10-amortized-analysis-fall24.pptx), disjoint sets (DisjointSet ADT, linked-list based implementation, rooted-tree based implementation). - <u style="color:darkcyan">Problem Set 6</u>, due November 07 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS6-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps6.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Nov 04:</b> disjoint sets (rooted-tree based implementation, union-by-size, union-by-height, union-by-rank, path-compression, performance analysis with union-by-rank and path-compression) [\[slides ver.1109\]](./slides/11-disjoint-sets-fall24.pptx). * <b style="font-variant:small-caps;font-size:1.2em">Nov 07:</b> graph representation and graph traversal (adjacency matrix, adjacency list, BFS, DFS). * <b style="font-variant:small-caps;font-size:1.2em">Nov 11:</b> graph representation and graph traversal (DFS) [\[slides ver.1111\]](./slides/12-graph-representation-and-traversal-fall24.pptx), applications of DFS (topological sort). - <u style="color:darkcyan">Problem Set 7</u>, due November 21 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS7-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps7.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Nov 14:</b> <b style="color:red">Midterm Exam</b>. The exam will be closed-book, from 10:10 to 12:10. The venue will be 仙1-109. * <b style="font-variant:small-caps;font-size:1.2em">Nov 18:</b> Midterm exam review. * <b style="font-variant:small-caps;font-size:1.2em">Nov 21:</b> applications of DFS (strongly connected components, Tarjan's SCC algorithm) [\[slides ver.1122\]](./slides/13-dfs-application-fall24.pptx). - <u style="color:darkcyan">Problem Set 8</u>, due November 28 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS8-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps8.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Nov 25:</b> minimum spanning trees (the cut property, Kruskal's algorithm, Prim's algorithm, Borůvka's algorithm) [\[slides ver.1125\]](./slides/14-mst-fall24.pptx), greedy algorithms (elements of the greedy strategy, activity selection problem). * <b style="font-variant:small-caps;font-size:1.2em">Nov 28:</b> greedy algorithms (activity selection problem, fractional knapsack and 0-1 knapsack, Huffman code). - <u style="color:darkcyan">Problem Set 9</u>, due December 05 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS9-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps9.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Dec 02:</b> greedy algorithms (Huffman code, set cover) [\[slides ver.1202\]](./slides/15-greedy-alg-fall24.pptx), single-source shortest path (the SSSP problem, BFS for unit weight graphs, Dijkstra's algorithm for positive weight graphs). * <b style="font-variant:small-caps;font-size:1.2em">Dec 05:</b> single-source shortest path (Bellman-Ford algorithm for arbitrary weight graphs, topological sort variant for DAG, A* algorithm for pathfinding) [\[slides ver.1207\]](./slides/16-sssp-fall24.pptx). - <u style="color:darkcyan">Problem Set 10</u>, due December 12 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS10-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps10.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Dec 09:</b> all-pairs shortest path (Johnson's algorithm, APSP through recursion/dynamic-programming, Floyd-Warshall algorithm, transitive closure) [\[slides ver.1210\]](./slides/17-apsp-fall24.pptx), dynamic programming (rod-cutting problem). * <b style="font-variant:small-caps;font-size:1.2em">Dec 12:</b> dynamic programming (rod-cutting problem, elements of dynamic programming, matrix-chain multiplication, edit distance, maximum independent set of trees). - <u style="color:darkcyan">Problem Set 11</u>, due December 22 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS11-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps11.pdf) * <b style="font-variant:small-caps;font-size:1.2em">Dec 16:</b> dynamic programming (maximum independent set of trees, discussion, subset sum) [\[slides ver.1216\]](./slides/18-dp-fall24.pptx), computability and complexity (computability/decidability, the halting problem). * <b style="font-variant:small-caps;font-size:1.2em">Dec 19:</b> computability and complexity (non-deterministic Turing machine, P, NP) [\[slides ver.1220\]](./slides/19-complexity-fall24.pptx). - <u style="color:darkcyan">Problem Set 12</u>, due December 29 23:59:59 (UTC+8). Email your solution to [ai-dsalg-ps@chaodong.me](mailto:ai-dsalg-ps@chaodong.me). The subject of your email should be PS12-XXXXXXXXX, where XXXXXXXXX should be replaced with your actual NJU ID. [\[file\]](./assignments/ps12.pdf) - <u style="color:darkgreen">Programming Assignment 2 (5 Problems)</u>: [\[link\]](https://oj.chaodong.me/contest/pa2fall24dsalg) The submission deadline is January 12 23:59:59 (UTC+8). *(Notice: you need to join the "NJU AI DS&Alg Class - Fall 2024" organization to see the problems.)* * <b style="font-variant:small-caps;font-size:1.2em">Dec 23:</b> NP completeness (polynomial reduction, NP-hard, NPC, SAT, 3-SAT, Clique) [\[slides ver.1223\]](./slides/20-npc-fall24.pptx), network flow [\[slides ver.1223\]](./slides/21-network-flow-fall24.pptx). * <b style="font-variant:small-caps;font-size:1.2em">Dec 26:</b> Tutorial session. * <b style="font-variant:small-caps;font-size:1.2em">Dec 30:</b> <b style="color:red">Final Exam</b>. The exam will be closed-book, from 10:30 to 12:30. The venue will be 仙1-109. Good luck and happy new year!