LSA Institute 2015: Intro to Computational Linguistics


Lab 3 is now posted. Please check that you can run the RD parser app before the next lab session.

The assignment is now posted, with an updated due date of Mon 27 July.

Course staff

Instructor: Sharon Goldwater, University of Edinburgh
You can email me (sgwater) here:
Office hours: Wed 2-4pm at Plein Air Cafe

Teaching assistant: Jackson Lee, University of Chicago
Available as (jsllee) here:
Office hours: Mon and Wed 10:30-12:00 in the Karen Landahl Linguistics Research Center in the basement of Social Sciences Research Building (next to Harper).

Course Information

This course provides an overview of the main methods and algorithms used in computational linguistics, motivated by some examples of questions they can be used to investigate. We will cover the basics of: information theory (entropy and mutual information), n-gram models (for computing the probabilities of phone or word sequences), finite-state automata and hidden Markov models, parsing algorithms, and distributional semantic models. In addition to lectures, we will include some hands-on labs in Python to help students gain practical experience with some of these concepts.

Required textbook:

Speech and Language Processing, 2nd edition, by D. Jurafsky and J.H. Martin.

If purchasing your own copy, you may want to look for the paperback International Edition from Pearson.

There is also a copy of the textbook on 2-hour reserve at Regenstein Library (the main library of the university). You can request it under the course number LING 70000 or the call number P98.J87 2009.


Students should have some previous programming experience in Python and be familiar with the basics of probability theory, more or less what is covered by these notes.

Labs and required software:

Lab questions and support code will be available from the class schedule page.

Please see the software page to make sure you have installed appropriate versions of Python and modules needed for this course.

Assignments and grades:

Grades will be determined based on class attendance (20%), lab participation (30%), and one homework assignment due on Mon 27 July (50%).

You are free (even encouraged) to work with other students for the labs; you may also work with a partner on the assignment provided you state who did what.

Meeting times and locations:

Mondays 8:30-10:20am in Harper 140.
Thursdays 8:30-10:20am in Rosenwald 405.


Please contact me if you are interested in auditing the class.

Class Schedule and Materials

Readings, lecture slides, labs, and assignments will be posted on the class schedule page as they become available.