Kristian Woodsend

Contact:

Informatics Forum 3.46
10 Crichton Street
Edinburgh, EH8 9AB
United Kingdom

Email:

k.woodsend at ed.ac.uk

I am currently a Research Associate at ILCC, in the School of Informatics, University of Edinburgh.

Research

I am currently part of the Readers project, which aims to develop new unsupervised computational models that automatically extract background knowledge from large amounts of unstructured text. This semantic knowledge will then be integrated with structured relational databases using graph algorithms. Together, these innovations should enable computers to produce better-informed summaries and semantic searches.

Previously I worked on natural language summarization using integer linear programming (ILP). I developed novel models for text generation tasks such as producing summaries, highlights, captions or simplifications of existing articles.

My PhD was in large-scale numerical optimization, under the supervision of Prof Jacek Gondzio at ERGO. I researched methods for training support vector machines (SVM) using the interior point method of continuous optimization. The techniques I developed were particularly efficient on multicore parallel computing platforms, and the software did pretty well in the Pascal Large Scale Learning Challenge (2008). You can download the software — it is free for academic use.

Online demos

I have put some demonstrations of our NLP research on this website:
  1. Multiple aspect approach to summarization
  2. Sentence simplification, learning from Simple Wikipedia
Let me know how they work for you!

Data sets

Here are data sets and other material we have used in papers:
  1. CNN highlights dataset used in Woodsend and Lapata (2010, ACL); contains alignments of CNN highlights with document sentences.
  2. Simple English Wikipedia revisions dataset used in Woodsend and Lapata (2011, EMNLP); this contains diff-ed revisions of Simple English Wikipedia that were marked by the editors as simplifications.

Selected papers

  1. Michael Roth and Kristian Woodsend. 2014. Composition of Word Representations Improves Semantic Role Labelling. To appear at EMNLP 2014, Doha, Qatar.

  2. Kristian Woodsend and Mirella Lapata. 2014. Text rewriting improves semantic role labeling. To appear in Journal of Artificial Intelligence Research.

  3. Kristian Woodsend and Mirella Lapata. 2012. Multiple Aspect Summarization Using Integer Linear Programming. EMNLP 2012, Jeju, Korea.

  4. Kristian Woodsend and Mirella Lapata. 2011. Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming. EMNLP 2011, Edinburgh, UK.

  5. Kristian Woodsend and Mirella Lapata. 2011. WikiSimple: Automatic Simplification of Wikipedia Articles. AAAI 2011, San Francisco, USA.

  6. Kristian Woodsend and Mirella Lapata. 2010. Automatic Generation of Story Highlights. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 565–574. Uppsala, Sweden.

  7. Kristian Woodsend, Yansong Feng and Mirella Lapata. 2010. Title Generation with Quasi-Synchronous Grammar. To appear in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 513–523. Cambridge, MA.

  8. Kristian Woodsend. 2009. Using Interior Point Methods for Large-scale Support Vector Machine training. PhD thesis, University of Edinburgh.

  9. Kristian Woodsend and Jacek Gondzio. 2009. Hybrid MPI/OpenMP parallel linear support vector machine training. Journal of Machine Learning Research, 10:1937–1953.

  10. Kristian Woodsend and Jacek Gondzio. 2009. Exploiting separability in large-scale linear support vector machine training. To appear in Computational Optimization and Applications.

  11. Marco Colombo, Andreas Grothey, Jonathan Hogg, Kristian Woodsend, and Jacek Gondzio. 2009. A structure-conveying modelling language for mathematical and stochastic programming. Mathematical Programming Computation, 1(4):223–247.

  12. Kristian Woodsend and Jacek Gondzio. 2009. High-performance parallel support vector machine training. In R. Ciegis, D. Henty, B. Kagstrom, and J. Zilinskas, editors, Parallel Scientific Computing and Optimization: Advances and Applications, volume 27 of Springer Optimization and Its Applications, pages 83–92. Springer-Verlag, Berlin.

  13. Andreas Grothey, Jonathan Hogg, Kristian Woodsend, Marco Colombo, and Jacek Gondzio. 2009. A structure-conveying parallelisable modelling language for mathematical programming. In R. Ciegis, D. Henty, B. Kagstrom, and J. Zilinskas, editors, Parallel Scientific Computing and Optimization: Advances and Applications, volume 27 of Springer Optimization and Its Applications, pages 147–158. Springer-Verlag, Berlin.