Algorithmic Foundation of Data Science (AFDS) is an optimal course that I designed for the 3rd- and 4th-year undergraduates in computer science and mathematics. The overall set of materials is presented through 15 lectures, each of which lasts for about 1 hour. I taught AFDS at the University of Edinburgh in 2018 and 2019, and this webpage archives most materials that I used when teaching AFDS.
Data Science and Big Data Analysis is the research field that studies efficient processing and analysing massive datasets, and has received considerable attention from different research fields, including algorithms, machine learning, network sciences, probability theory and statistics. This line of research has also motivated a sequence of novel techniques for designing efficient algorithms for massive graphs, and has enormous industrial impact.
This course aims to introduce algorithmic techniques that form the foundations of processing and analysing massive data sets of various forms. In particular, the course discusses how to process massive datasets, effectively store massive datasets, design fast algorithms for massive datasets, and analyse the performance of the designed algorithm. Through various examples as well as the coursework, the students will see applications of the topics discussed in class in other research field of computer science, e.g., machine learning, and network optimisation.
Lecture 1: Introduction Slides
Lecture 2: High-Dimensional Spaces (1) Lecture Notes
Lecture 3: High-Dimensional Spaces (2) Lecture Notes
Lecture 4: High-Dimensional Spaces (3) Lecture Notes
Lecture 5: Best-Fit Subspaces and Singular Value Decomposition (1) Lecture Notes
Lecture 6: Best-Fit Subspaces and Singular Value Decomposition (2) Lecture Notes
Lecture 7: Best-Fit Subspaces and Singular Value Decomposition (3) Lecture Notes
Lecture 8: Hashing Functions Slides
Lecture 9: Data Streaming Algorithms (1) Slides
Lecture 10: Data Streaming Algorithms (2) Slides
Lecture 11: Graphs versus Matrices Lecture Notes
Lecture 12: The Cheeger Inequality Lecture Notes
Lecture 13: Spectral Clustering Slides
Lecture 14: Spectral Sparsification of Graphs (1) Lecture Notes
Lecture 15: Spectral Sparsification of Graphs (2) Lecture Notes