This experimental page presents a three-topic LDA analysis of around 9000 PDFs published on homepages.inf.ed.ac.uk. Each topic is a probability distribution over words. Each document is represented as a mixture of the three topics; the mixture weights serve as barycentric coordinates that place the document, as a dot, in the triangle below. The topics arise from the data, as a best-effort attempt to account for the variation in word frequencies between documents under this topic model.
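To make the barycentric placement concrete, here is a minimal sketch of how a three-topic mixture could be turned into a point in a triangle. The corner positions and the function name are assumptions for illustration; this is not the code behind the page.

```python
import math

# Hypothetical corner positions (an equilateral triangle with unit side).
# The layout actually used on the page is not specified here.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def barycentric_to_xy(weights, corners=CORNERS):
    """Map three non-negative topic weights to a 2D point.

    The point is the weighted average of the corners, so a document that
    belongs entirely to one topic lands exactly on that topic's corner.
    """
    if len(weights) != 3:
        raise ValueError("expected exactly three topic weights")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Dividing by the total normalises weights that do not sum to 1.
    x = sum(w * cx for w, (cx, _) in zip(weights, corners)) / total
    y = sum(w * cy for w, (_, cy) in zip(weights, corners)) / total
    return x, y
```

A document split evenly between the three topics lands at the centroid of the triangle, and one dominated by a single topic lands near that topic's corner.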
Enter space-separated lists of UUNs (or, more precisely, the basenames of your directories in /public/homepages/ - e.g. rbf bundy wadler) in the text boxes to highlight documents published by these people or groups; each box has its own colour. Changes take effect on the change event, i.e. once you click outside the text box so that it loses focus.
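As a rough sketch of the highlighting logic described above: split each box on whitespace, then colour every document whose top-level directory matches one of the names. The path layout (owner directory as the first path component), the colour names, and all function names are assumptions for illustration, not the page's actual implementation.

```python
def parse_uuns(text):
    """Split the contents of one text box into a set of directory basenames."""
    return set(text.split())


def owner_of(path):
    """Return the first non-empty path component, e.g. 'rbf' for '/rbf/a.pdf'.

    Assumes each document lives under a directory named after its owner.
    """
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else ""


def assign_colours(box_texts, colours, doc_paths):
    """Return {path: colour} for every highlighted document.

    If a name appears in several boxes, the earliest box wins (an
    arbitrary choice made for this sketch).
    """
    result = {}
    for text, colour in zip(box_texts, colours):
        uuns = parse_uuns(text)
        for path in doc_paths:
            if path not in result and owner_of(path) in uuns:
                result[path] = colour
    return result
```

Documents whose owner appears in no box are simply absent from the result, which corresponds to leaving them unhighlighted.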
Not everything published on homepages is our work - but it probably represents our interests. Not everyone publishes on homepages, so for some UUNs you will find no entries. I plan to extend coverage - suggestions of other data sources we should mine are welcome (HTML pages and the library repository are already on my list).
The pure topics are represented by the three corners of the triangle. I've named them systems, interaction, and theory, but these are just names; suggestions for alternatives are welcome. The modal words for each topic are grouped near its corner, with font-size proportional to relative weight within that topic (note that this exaggerates differences, since the area a word occupies grows roughly with the square of its font-size; font-size should be proportional to sqrt(weight) - maybe later ...).
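The difference between the current linear scaling and the sqrt(weight) scaling mentioned above can be sketched as follows. The function name and the maximum size of 32 are hypothetical; the page's real sizes are not given here.

```python
import math


def font_sizes(weights, max_size=32.0, sqrt_scale=True):
    """Map {word: weight} to {word: font size}.

    Weights are taken relative to the heaviest word in the topic. With
    sqrt_scale=True the size follows sqrt(relative weight), so the area of
    a word is roughly proportional to its weight; with sqrt_scale=False
    the size is linear in weight, as on the current page.
    """
    top = max(weights.values())
    sizes = {}
    for word, weight in weights.items():
        rel = weight / top
        if sqrt_scale:
            rel = math.sqrt(rel)
        sizes[word] = max_size * rel
    return sizes
```

Under linear scaling a word with a quarter of the top weight is drawn at a quarter of the maximum size; under sqrt scaling it is drawn at half the maximum size, which is a gentler contrast.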
The LDA analysis was done using MALLET.