The upcoming NLP Masterclass will involve regular lab sessions in which you will learn to use a range of NLP techniques and tools. This document tells you how to prepare your laptop or personal computer for these lab sessions by installing the necessary software and resources, and points you to some learning materials you should work through before arrival.
We recommend reading over the next section once before you start. The complete install process takes around 15 minutes assuming a reasonably good network connection (home broadband should be OK, office may be better). Please do this before the first day of the course, so we have time to help you if you have difficulties.
Note that even if you already have Python installed, we recommend that you follow the process below for the Masterclass, to avoid the possibility of hard-to-detect dependency and versioning problems between what you have pre-installed and what we expect.
Finally, unless you are already quite experienced with Python and Jupyter, and have some experience with natural language data handling and the mathematics that goes with it, please do go on to use your new installation to work through the first notebook and review the introductory materials linked from it, as summarised in the final section below.
These instructions should work on either a Windows PC (tested with Windows 10) or a Mac (tested with OS X). In a few places the process differs slightly between the two; these places are marked below with a [W] or an [M] respectively. Linux users will want to use their own distro's package manager to download/install, but should follow the Mac instructions for activating the 'class' environment.
Following these instructions involves entering commands in a terminal window, what Windows calls a "Command Prompt" and Mac OS X a "Terminal", available via the Launchpad. If you're not familiar with doing this, in particular with using the cd and mkdir commands, please pause and have a look at my quick introduction to directories and paths.
If you are refreshing an existing installation, the relevant steps are marked with an asterisk, but be sure to follow the specific instructions in the email asking you to do the refresh.
Skip this step if you already have a Python 3.6 Anaconda (or Miniconda) installed.
$ cd Downloads
$ bash Miniconda3-latest-MacOSX-X86_64.sh
Skim through the license as required, and accept it by typing yes when asked, then accept the default location for the install and the path update as offered. Finally close the Terminal and open a new one.
The > or $␣ shown before each command is not meant to be part of what you type: it just represents the terminal prompt, which usually ends with an angle bracket on Windows and a dollar sign and a space on a Mac.
[W] Start > Programs > Anaconda3 > Anaconda Prompt
([M] You should just stay in the Terminal window you were in at the end of the previous step)
[W] >conda update conda
>conda list
[M] $ conda update conda
$ conda list
You should see that conda is now at version 4.3.23.
[W] >conda create -n class python=3.6
[M] $ conda create -n class python=3.6
Conda will offer to install pip and python 3.6.[something], along with 3 or 4 others. Confirm and wait for the environment to be built, which takes a minute or so.
You will need to do the following any time you launch a new Anaconda Prompt/Terminal:
[W] >activate class
[M] $ source activate class
You will see (class) at the left margin, to let you know you're in the right environment.
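If you would like a second check beyond the (class) prompt, the following minimal Python sketch reports which interpreter is actually running. It is an illustration only: the function names is_expected_python and describe_interpreter are our own, and the use of the CONDA_DEFAULT_ENV variable assumes that conda sets it when an environment is active. Save it to a file of your choosing and run it with python from the activated prompt.

```python
import os
import sys


def is_expected_python(version_info, major=3, minor=6):
    """Return True if version_info (e.g. sys.version_info) is major.minor."""
    return tuple(version_info[:2]) == (major, minor)


def describe_interpreter():
    """Collect a few facts about the Python that is running this code."""
    return {
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        # Install location of the running interpreter.
        "prefix": sys.prefix,
        # Assumption: conda sets this variable while an environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    print(describe_interpreter())
    print("Python 3.6 as expected:", is_expected_python(sys.version_info))
```

If the reported version is not 3.6, or the prefix does not point inside your Miniconda install, the most likely cause is that the environment was not activated in this particular window.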
[W] >conda install numpy scipy matplotlib nltk jupyter pandas seaborn tqdm scikit-learn
>conda install -c conda-forge keras tensorflow
[M] $ conda install numpy scipy matplotlib nltk jupyter pandas seaborn tqdm scikit-learn
$ conda install -c conda-forge keras tensorflow
[W] >conda clean -t
[M] $ conda clean -t
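Once the installs have finished, you may want to confirm that Python can locate each package. The sketch below is one hedged way to do that, not part of the official setup: the IMPORT_NAMES table and the find_missing function are our own illustrative names. Note that scikit-learn is imported under the name sklearn, which is why the table maps package names to import names.

```python
import importlib.util

# conda package name -> name used in a Python import statement.
# Only scikit-learn differs from its import name.
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "nltk": "nltk",
    "jupyter": "jupyter",
    "pandas": "pandas",
    "seaborn": "seaborn",
    "tqdm": "tqdm",
    "scikit-learn": "sklearn",
    "keras": "keras",
    "tensorflow": "tensorflow",
}


def find_missing(import_names):
    """Return the sorted package names whose module cannot be located."""
    missing = []
    for package, module in import_names.items():
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return sorted(missing)


if __name__ == "__main__":
    missing = find_missing(IMPORT_NAMES)
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

This only checks that each module can be found, not that it loads correctly, so an empty result is encouraging rather than conclusive.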
If you don't already have Mercurial installed (try hg --version in a terminal window if you're not sure), download and then install the appropriate version for your platform from https://www.mercurial-scm.org/downloads. For Windows you'll want one of the "Mercurial 4.2.2 Inno Setup installer"s, either 'x64' for 64-bit Windows or 'x86' for 32-bit, whereas for MacOS it's "Mercurial 4.3-rc for MacOS X 10.12".
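The hg --version check can also be done from Python, which may be handy if you are unsure whether a tool is on your path. This is a minimal sketch under our own naming (tool_version is a hypothetical helper, not something shipped with the course materials); it relies only on the standard library's shutil.which and subprocess.run.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the first line of `<command> --version`, or None if not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some tools print their version on stderr
        universal_newlines=True,   # decode output as text (works on Python 3.6)
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    version = tool_version("hg")
    if version is None:
        print("hg was not found on your path.")
    else:
        print(version)
```

A result of None means the command was not found on your path, which after a fresh install is often fixed by opening a new terminal window.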
class environment.
[W] >cd c:\Users\[you]\Documents
>mkdir HMRCourse
>cd HMRCourse
[M] $ cd Documents
$ mkdir HMRCourse
$ cd HMRCourse
class directory:
[W] >hg clone http://homepages.inf.ed.ac.uk/ht/nlp/hg -b default class
>cd class
[M] $ hg clone http://homepages.inf.ed.ac.uk/ht/nlp/hg -b default class
$ cd class
As the name suggests, you only need to do this once:
[W] >runMeOnce.bat
[M] $ source runMeOnce.sh
[W] >jupyter notebook
[M] $ jupyter notebook
- Copy/paste this URL into your browser when you connect for the first time, to login with a token:
- http://localhost:8888/?token=************
notebooks folder, launch the 01_Introduction.ipynb notebook there.
<ctrl>+<enter>).
When you have time, you should work through the first lab's notebook (01_Introduction.ipynb), and as necessary given your background, follow the relevant links therein to the various background resources we've provided or shared, which are summarised below, in the recommended order for reading:
- python_intro notebook
- percent webpage linked from the 'More on controlling format' hovertext in Exercise 2 in the 01_Introduction notebook
- numpy_intro notebook (not needed for the first lab, but you will need it by the third).