Working with terabytes of Twitter data and historical corpora
that can fit on a floppy disk, I investigate how language is used,
how it varies between people and how this all changes across time.
I'll be spending the summer at Bell Labs in Cambridge, working on NLP for dreams, integrative complexity, and health and well-being.
My paper "Self-Representation on Twitter Using Emoji Skin Color Modifiers" was accepted to ICWSM. It was covered by the BBC, the Telegraph and National Geographic, amongst others.
My paper "Evaluating historical text normalization systems: How well do they generalize?" was accepted to NAACL.
I spent a summer at TAB, working on categorisation and recommendation systems.
I'm interested in anything to do with historical spelling variation, the psychology of language processing, and language use and variation.
A mostly up-to-date copy of my CV can be found here.