Barry Haddow

Barry Haddow on Stuc a'Chroin    

Contact information:

School of Informatics
The Informatics Forum
10 Crichton Street
Edinburgh EH8 9AB
email: bhaddow at inf dot ed dot ac dot uk

About Me

I am a research fellow in the ILCC at the University of Edinburgh.

Projects

Sept 2022 - Aug 2025 HPLT
HPLT is a Horizon Europe project which aims to build an open large-scale data extraction and model building pipeline, ingesting petabytes of web data from the Internet Archive, and creating high-performance translation and language models.

Oct 2022 - Sep 2025 Utter
The aim of this Horizon Europe project is to apply large multilingual, multimodal models to applications in speech and dialogue translation, and meeting assistance.

Jan 2019 - Mar 2022 Gourmet
This is an H2020 project aimed at improving the translation of under-resourced languages, with applications to journalism.

Jan 2019 - Mar 2022 Elitr
In this H2020 project we are researching spoken language translation, multilingual machine translation and automatic minuting.

Apr 2018 - July 2021 Material
An IARPA-funded project to do speech recognition, machine translation, information retrieval and summarisation on under-resourced languages.

Feb 2015 - Jan 2018 HimL
Developing semantically accurate machine translation for the medical domain, and applying it to morphologically complex languages in central and eastern Europe. I am coordinating this €3M Horizon 2020 project.

Feb 2015 - Jan 2018 Cracker
A coordination action covering European MT research, organisation of shared tasks, workshops and industrial outreach.

Jan 2015 - Dec 2017 QT21
A broad research project aimed at creating improved statistical models for translating European languages, especially those for which MT currently performs poorly.

Jan 2015 - Dec 2017 MMT
Aiming to create the next generation of machine translation infrastructure: scalable, adaptable and open-source.

Feb 2012 - Jan 2015 MosesCore
Supporting open source machine translation through MT marathons, shared tasks and workshops, and continued Moses development, as well as industrial outreach.

Jan 2012 - Dec 2014 Accept
Improving the translation of user-generated content through pre- and post-editing, and advances in the underlying machine translation.

Jan 2011 - Dec 2013 MLTMLV
Automatic translation from standard German into dialects, particularly Viennese.

Feb 2009 - Feb 2012 EUROMATRIXPLUS
Continuing the work of Euromatrix, but focussing on bringing translation to the user, with interactive tools, translation of news stories, and improvements to the core system.

Jan 2008 - Feb 2009 EUROMATRIX
A statistical machine translation project, working on tools and evaluations for the translation of all EU language pairs.

Sep 2005 - Dec 2007 TXM
A biomedical information extraction project aiming to produce natural language processing tools to assist curators in their work.

Publications

[1] Vilém Zouhar, Pinzhen Chen, Tsz Kin Lam, Nikita Moghe, and Barry Haddow. Pitfalls and outlooks in using COMET, 2024. [ bib | arXiv | http ]
[2] Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Benjamin Marie, Kenton Murray, Masaaki Nagata, Martin Popel, Maja Popović, Mariya Shmatova, Steinþór Steingrímsson, and Vilém Zouhar. Preliminary WMT24 ranking of general MT systems and LLMs, 2024. [ bib | arXiv | http ]
[3] Vivek Iyer, Bhavitvya Malik, Pavel Stepachev, Pinzhen Chen, Barry Haddow, and Alexandra Birch. Quality or quantity? On data scale and diversity in adapting large language models for low-resource translation, 2024. [ bib | arXiv | http ]
[4] Weixuan Wang, Barry Haddow, Wei Peng, and Alexandra Birch. Sharing matters: Analysing neurons across languages and tasks in LLMs, 2024. [ bib | arXiv | http ]
[5] Pinzhen Chen, Simon Yu, Zhicheng Guo, and Barry Haddow. Is it good data for multilingual instruction tuning or just bad multilingual evaluation for large language models?, 2024. [ bib | arXiv | http ]
[6] Dawei Zhu, Pinzhen Chen, Miaoran Zhang, Barry Haddow, Xiaoyu Shen, and Dietrich Klakow. Fine-tuning large language models to translate: Will a touch of noisy data in misaligned languages suffice?, 2024. [ bib | arXiv | http ]
[7] Weixuan Wang, Barry Haddow, and Alexandra Birch. Retrieval-augmented multilingual knowledge editing. In Lun-Wei Ku, Andre Martins, and Vivek Srikumar, editors, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 335--354, Bangkok, Thailand, August 2024. Association for Computational Linguistics. [ bib | http ]
[8] Tsz Kin Lam, Alexandra Birch, and Barry Haddow. Compact speech translation models via discrete speech units pretraining. In Proceedings of IWSLT, 2024. [ bib | http ]
[9] Pinzhen Chen, Zhicheng Guo, Barry Haddow, and Kenneth Heafield. Iterative translation refinement with large language models. In Proceedings of EAMT, 2024. [ bib | http ]
[10] Weixuan Wang, Barry Haddow, Alexandra Birch, and Wei Peng. Assessing factual reliability of large language model knowledge. In Kevin Duh, Helena Gomez, and Steven Bethard, editors, Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 805--819, Mexico City, Mexico, June 2024. Association for Computational Linguistics. [ bib | http ]
[11] Christos Baziotis, Biao Zhang, Alexandra Birch, and Barry Haddow. When does monolingual data help multilingual translation: The role of domain and model scale. In Kevin Duh, Helena Gomez, and Steven Bethard, editors, Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 6297--6324, Mexico City, Mexico, June 2024. Association for Computational Linguistics. [ bib | http ]
[12] Vivek Iyer, Bhavitvya Malik, Wenhao Zhu, Pavel Stepachev, Pinzhen Chen, Barry Haddow, and Alexandra Birch. Exploring very low-resource translation with LLMs: The University of Edinburgh's submission to AmericasNLP 2024 translation task. In Manuel Mager, Abteen Ebrahimi, Shruti Rijhwani, Arturo Oncevay, Luis Chiruzzo, Robert Pugh, and Katharina von der Wense, editors, Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), pages 209--220, Mexico City, Mexico, June 2024. Association for Computational Linguistics. [ bib | DOI | http ]
[13] Nikolay Bogoychev, Pinzhen Chen, Barry Haddow, and Alexandra Birch. The ups and downs of large language model inference with vocabulary trimming by language heuristics. In Proceedings of the Workshop in Insight from Negative results, 2024. [ bib | http ]
[14] Matthias Sperber, Ondřej Bojar, Barry Haddow, Dávid Javorský, Xutai Ma, Matteo Negri, Jan Niehues, Peter Polák, Elizabeth Salesky, Katsuhito Sudoh, and Marco Turchi. Evaluating the IWSLT2023 speech translation tasks: Human annotations, automatic metrics, and segmentation. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 6484--6495, Torino, Italia, May 2024. ELRA and ICCL. [ bib | http ]
[15] Jonas Waldendorf, Barry Haddow, and Alexandra Birch. Contrastive decoding reduces hallucinations in large multilingual machine translation models. In Yvette Graham and Matthew Purver, editors, Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2526--2539, St. Julian's, Malta, March 2024. Association for Computational Linguistics. [ bib | http ]
[16] Giulio Zhou, Tsz Kin Lam, Alexandra Birch, and Barry Haddow. Prosody in cascade and direct speech-to-text translation: a case study on Korean wh-phrases. In Yvette Graham and Matthew Purver, editors, Findings of the Association for Computational Linguistics: EACL 2024, pages 674--683, St. Julian's, Malta, March 2024. Association for Computational Linguistics. [ bib | http ]
[17] Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, and Kenneth Heafield. Monolingual or multilingual instruction tuning: Which makes a better Alpaca. In Yvette Graham and Matthew Purver, editors, Findings of the Association for Computational Linguistics: EACL 2024, pages 1347--1356, St. Julian's, Malta, March 2024. Association for Computational Linguistics. [ bib | http ]
[18] Nikolay Bogoychev, Jelmer van der Linde, Graeme Nail, Barry Haddow, Jaume Zaragoza-Bernabeu, Gema Ramírez-Sánchez, Lukas Weymann, Tudor Nicolae Mateiu, Jindřich Helcl, and Mikko Aulamo. OpusCleaner and OpusTrainer, open source toolkits for training machine translation and large language models, 2023. [ bib | arXiv | http ]
[19] Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Philipp Koehn, Benjamin Marie, Christof Monz, Makoto Morishita, Kenton Murray, Makoto Nagata, Toshiaki Nakazawa, Martin Popel, Maja Popović, and Mariya Shmatova. Findings of the 2023 conference on machine translation (WMT23): LLMs are here but not quite there yet. In Philipp Koehn, Barry Haddow, Tom Kocmi, and Christof Monz, editors, Proceedings of the Eighth Conference on Machine Translation, pages 1--42, Singapore, December 2023. Association for Computational Linguistics. [ bib | DOI | http ]
[20] Ashok Urlana, Pinzhen Chen, Zheng Zhao, Shay Cohen, Manish Shrivastava, and Barry Haddow. PMIndiaSum: Multilingual and cross-lingual headline summarization for languages in India. In Houda Bouamor, Juan Pino, and Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, pages 11606--11628, Singapore, December 2023. Association for Computational Linguistics. [ bib | DOI | http ]
[21] Milind Agarwal, Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny Matusov, Paul McNamee, John P. McCrae, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Ha Nguyen, Jan Niehues, Xing Niu, Atul Kr. Ojha, John E. Ortega, Proyag Pal, Juan Pino, Lonneke van der Plas, Peter Polák, Elijah Rippeth, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Yun Tang, Brian Thompson, Kevin Tran, Marco Turchi, Alex Waibel, Mingxuan Wang, Shinji Watanabe, and Rodolfo Zevallos. Findings of the IWSLT 2023 evaluation campaign. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 1--61, Toronto, Canada (in-person and online), July 2023. Association for Computational Linguistics. [ bib | DOI | http ]
[22] Biao Zhang, Barry Haddow, and Rico Sennrich. Efficient CTC regularization via coarse labels for end-to-end speech translation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2256--2268, Dubrovnik, Croatia, May 2023. Association for Computational Linguistics. [ bib | http ]
[23] Sukanta Sen, Rico Sennrich, Biao Zhang, and Barry Haddow. Self-training reduces flicker in retranslation-based simultaneous translation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 3716--3726, Dubrovnik, Croatia, May 2023. Association for Computational Linguistics. [ bib | http ]
[24] Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf, Barry Haddow, Alexandra Birch, Pierre Colombo, and André F. T. Martins. Hallucinations in Large Multilingual Translation Models. Transactions of the Association for Computational Linguistics, 11, December 2023. [ bib | http ]
[25] Biao Zhang, Barry Haddow, and Alexandra Birch. Prompting large language model for machine translation: A case study. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 41092--41110. PMLR, 23--29 Jul 2023. [ bib | .html | .pdf ]
[26] Sukanta Sen, Ondřej Bojar, and Barry Haddow. Simultaneous translation for unsegmented input: A sliding window approach, 2022. [ bib | DOI | http ]
[27] Tom Kocmi, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Thamme Gowda, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Rebecca Knowles, Philipp Koehn, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Michal Novák, Martin Popel, Maja Popović, and Mariya Shmatova. Findings of the 2022 conference on machine translation (WMT22). In Proceedings of the Seventh Conference on Machine Translation, pages 1--45, Abu Dhabi, December 2022. Association for Computational Linguistics. [ bib | .pdf ]
[28] Chantal Amrhein and Barry Haddow. Don't discard fixed-window audio segmentation in speech-to-text translation. In Proceedings of the Seventh Conference on Machine Translation, pages 203--219, Abu Dhabi, December 2022. Association for Computational Linguistics. [ bib | .pdf ]
[29] Jonas Waldendorf, Alexandra Birch, Barry Haddow, and Antonio Valerio Miceli Barone. Improving translation of out of vocabulary words using bilingual lexicon induction in low-resource machine translation. In Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 144--156, Orlando, USA, September 2022. Association for Machine Translation in the Americas. [ bib | http ]
[30] Biao Zhang, Barry Haddow, and Rico Sennrich. Revisiting end-to-end speech-to-text translation from scratch. In Proceedings of ICML, 2022. [ bib | http ]
[31] Jindřich Helcl, Barry Haddow, and Alexandra Birch. Non-autoregressive machine translation: It's not as fast as it seems. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1780--1790, Seattle, United States, July 2022. Association for Computational Linguistics. [ bib | DOI | http ]
[32] Arturo Oncevay, Duygu Ataman, Niels Van Berkel, Barry Haddow, Alexandra Birch, and Johannes Bjerva. Quantifying synthesis and fusion and their impact on machine translation. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1308--1321, Seattle, United States, July 2022. Association for Computational Linguistics. [ bib | DOI | http ]
[33] Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, and Shinji Watanabe. Findings of the IWSLT 2022 evaluation campaign. In Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022), pages 98--157, Dublin, Ireland (in-person and online), May 2022. Association for Computational Linguistics. [ bib | DOI | http ]
[34] Barry Haddow, Rachel Bawden, Antonio Valerio Miceli Barone, Jindřich Helcl, and Alexandra Birch. Survey of low-resource machine translation. Computational Linguistics, 48(3):673--732, September 2022. [ bib | DOI | http ]
[35] Philip Williams and Barry Haddow. The ELITR ECA corpus. CoRR, abs/2109.07351, 2021. [ bib | arXiv | http ]
[36] Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-jussà, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, and Marcos Zampieri. Findings of the 2021 conference on machine translation (WMT21). In Proceedings of the Sixth Conference on Machine Translation, pages 1--88, Online, November 2021. Association for Computational Linguistics. [ bib | http ]
[37] Alexandra Birch, Barry Haddow, Antonio Valerio Miceli Barone, Jindřich Helcl, Jonas Waldendorf, Felipe Sánchez Martínez, Mikel Forcada, Víctor Sánchez Cartagena, Juan Antonio Pérez-Ortiz, Miquel Esplà-Gomis, Wilker Aziz, Lina Murady, Sevi Sariisik, Peggy van der Kreeft, and Kay Macquarrie. Surprise language challenge: Developing a neural machine translation system between Pashto and English in two months. In Proceedings of the 18th Biennial Machine Translation Summit (Volume 1: Research Track), pages 92--102, Virtual, August 2021. Association for Machine Translation in the Americas. [ bib | http ]
[38] Ondřej Bojar, Vojtěch Srdečný, Rishu Kumar, Otakar Smrž, Felix Schneider, Barry Haddow, Phil Williams, and Chiara Canton. Operating a complex SLT system with speakers and human interpreters. In Proceedings of the 1st Workshop on Automatic Spoken Language Translation in Real-World Settings (ASLTRW), pages 23--34, Virtual, August 2021. Association for Machine Translation in the Americas. [ bib | http ]
[39] Philip Williams and Barry Haddow. The University of Edinburgh's Submission to the First Shared Task on Automatic Minuting. In Proceedings of The First Shared Task on Automatic Minuting (AutoMin), 2021. [ bib ]
[40] Sukanta Sen, Ulrich Germann, and Barry Haddow. The University of Edinburgh's submission to the IWSLT21 simultaneous translation task. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 46--51, Bangkok, Thailand (online), August 2021. Association for Computational Linguistics. [ bib | DOI | http ]
[41] Biao Zhang, Ivan Titov, Barry Haddow, and Rico Sennrich. Beyond sentence-level end-to-end speech translation: Context helps. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2566--2578, Online, August 2021. Association for Computational Linguistics. [ bib | DOI | http ]
[42] Christos Baziotis, Ivan Titov, Alexandra Birch, and Barry Haddow. Exploring unsupervised pretraining objectives for machine translation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2956--2971, Online, August 2021. Association for Computational Linguistics. [ bib | DOI | http ]
[43] Ebrahim Ansari, Ondřej Bojar, Barry Haddow, and Mohammad Mahmoudi. SLTEV: Comprehensive evaluation of spoken language translation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 71--79, Online, April 2021. Association for Computational Linguistics. [ bib | http ]
[44] Ondřej Bojar, Dominik Macháček, Sangeet Sagar, Otakar Smrž, Jonáš Kratochvíl, Peter Polák, Ebrahim Ansari, Mohammad Mahmoudi, Rishu Kumar, Dario Franceschini, Chiara Canton, Ivan Simonini, Thai-Son Nguyen, Felix Schneider, Sebastian Stüker, Alex Waibel, Barry Haddow, Rico Sennrich, and Philip Williams. ELITR multilingual live subtitling: Demo and strategy. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 271--277, Online, April 2021. Association for Computational Linguistics. [ bib | http ]
[45] Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, and Marcos Zampieri. Findings of the 2020 conference on machine translation (WMT20). In Proceedings of the Fifth Conference on Machine Translation, pages 1--55, Online, November 2020. Association for Computational Linguistics. [ bib | http ]
[46] Yvette Graham, Barry Haddow, and Philipp Koehn. Statistical power and translationese in machine translation evaluation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 72--81, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[47] Arturo Oncevay, Barry Haddow, and Alexandra Birch. Bridging linguistic typology and multilingual machine translation with multi-view language representations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2391--2406, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[48] Christos Baziotis, Barry Haddow, and Alexandra Birch. Language model prior for low-resource neural machine translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7622--7634, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[49] Biao Zhang, Ivan Titov, Barry Haddow, and Rico Sennrich. Adaptive feature selection for end-to-end speech translation. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2533--2544, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[50] Yvette Graham, Christian Federmann, Maria Eskevich, and Barry Haddow. Assessing human-parity in machine translation on the segment level. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4199--4207, Online, November 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[51] Yuekun Yao and Barry Haddow. Dynamic masking for improved stability in online spoken language translation. In Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 123--136, Virtual, October 2020. Association for Machine Translation in the Americas. [ bib | http ]
[52] Marta Bañón, Pinzhen Chen, Barry Haddow, Kenneth Heafield, Hieu Hoang, Miquel Esplà-Gomis, Mikel L. Forcada, Amir Kamran, Faheem Kirefu, Philipp Koehn, Sergio Ortiz Rojas, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Elsa Sarrías, Marek Strelec, Brian Thompson, William Waites, Dion Wiggins, and Jaume Zaragoza. ParaCrawl: Web-scale acquisition of parallel corpora. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4555--4567, Online, July 2020. Association for Computational Linguistics. [ bib | DOI | http ]
[53] Susie Coleman, Andrew Secker, Rachel Bawden, Barry Haddow, and Alexandra Birch. Architecture of a scalable, secure and resilient translation platform for multilingual news media. In Proceedings of the 1st International Workshop on Language Technology Platforms, pages 16--21, Marseille, France, May 2020. European Language Resources Association. [ bib | http ]
[54] Dario Franceschini, Chiara Canton, Ivan Simonini, Armin Schweinfurth, Adelheid Glott, Sebastian Stüker, Thai-Son Nguyen, Felix Schneider, Thanh-Le Ha, Alex Waibel, Barry Haddow, Philip Williams, Rico Sennrich, Ondřej Bojar, Sangeet Sagar, Dominik Macháček, and Otakar Smrž. Removing European language barriers with innovative machine translation technology. In Proceedings of the 1st International Workshop on Language Technology Platforms, pages 44--49, Marseille, France, May 2020. European Language Resources Association. [ bib | http ]
[55] Barry Haddow and Faheem Kirefu. PMIndia -- A Collection of Parallel Corpora of Languages of India. arXiv e-prints, page arXiv:2001.09907, January 2020. [ bib | arXiv | http ]
[56] Joanna Wetesko, Marcin Chochowski, Pawel Przybysz, Philip Williams, Roman Grundkiewicz, Rico Sennrich, Barry Haddow, Antonio Valerio Miceli Barone, and Alexandra Birch. Samsung and University of Edinburgh's system for the IWSLT 2019. In Proceedings of IWSLT, 2019. [ bib | .pdf ]
[57] Arturo Oncevay, Barry Haddow, and Alexandra Birch. Towards a multi-view language representation: A shared space of discrete and continuous language features. In Extended abstract at First Workshop on Typology for Polyglot NLP, 2019. [ bib ]
[58] Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, and Marcos Zampieri. Findings of the 2019 conference on machine translation (WMT19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 1--61, Florence, Italy, August 2019. Association for Computational Linguistics. [ bib | DOI | http ]
[59] Philip Williams, Marcin Chochowski, Pawel Przybysz, Rico Sennrich, Barry Haddow, and Alexandra Birch. Samsung and University of Edinburgh's system for the IWSLT 2018 low resource MT task. In Proceedings of IWSLT, 2018. [ bib | .pdf ]
[60] Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Philipp Koehn, and Christof Monz. Findings of the 2018 conference on machine translation (WMT18). In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 272--303. Association for Computational Linguistics, 2018. [ bib | http ]
[61] Barry Haddow, Nikolay Bogoychev, Denis Emelin, Ulrich Germann, Roman Grundkiewicz, Kenneth Heafield, Antonio Valerio Miceli Barone, and Rico Sennrich. The University of Edinburgh's submissions to the WMT18 news translation task. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 399--409. Association for Computational Linguistics, 2018. [ bib | http ]
[62] Mikel L. Forcada, Carolina Scarton, Lucia Specia, Barry Haddow, and Alexandra Birch. Exploring gap filling as a cheaper alternative to reading comprehension questionnaires when evaluating machine translation for gisting. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 192--203. Association for Computational Linguistics, 2018. [ bib | http ]
[63] Rachel Bawden, Rico Sennrich, Alexandra Birch, and Barry Haddow. Evaluating discourse phenomena in neural machine translation. In Proceedings of NAACL, 2018. [ bib | http ]
[64] Pawel Przybysz, Marcin Chochowski, Rico Sennrich, Barry Haddow, and Alexandra Birch. The Samsung and University of Edinburgh's submission to IWSLT17. In Proceedings of IWSLT, 2017. [ bib | .pdf ]
[65] Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, and Rico Sennrich. Regularization techniques for fine-tuning in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1490--1495, Copenhagen, Denmark, September 2017. Association for Computational Linguistics. [ bib | http ]
[66] Rico Sennrich, Alexandra Birch, Anna Currey, Ulrich Germann, Barry Haddow, Kenneth Heafield, Antonio Valerio Miceli Barone, and Philip Williams. The University of Edinburgh's neural MT systems for WMT17. In Proceedings of the Second Conference on Machine Translation, pages 389--399, Copenhagen, Denmark, September 2017. Association for Computational Linguistics. [ bib | http ]
[67] Antonio Jimeno Yepes, Aurelie Neveol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, and Saskia Trescher. Findings of the WMT 2017 biomedical translation shared task. In Proceedings of the Second Conference on Machine Translation, pages 234--247, Copenhagen, Denmark, September 2017. Association for Computational Linguistics. [ bib | http ]
[68] Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shujian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, and Marco Turchi. Findings of the 2017 conference on machine translation (WMT17). In Proceedings of the Second Conference on Machine Translation, pages 169--214, Copenhagen, Denmark, September 2017. Association for Computational Linguistics. [ bib | http ]
[69] Antonio Valerio Miceli Barone, Jindřich Helcl, Rico Sennrich, Barry Haddow, and Alexandra Birch. Deep architectures for neural machine translation. In Proceedings of the Second Conference on Machine Translation, pages 99--107, Copenhagen, Denmark, September 2017. Association for Computational Linguistics. [ bib | http ]
[70] Rico Sennrich, Orhan Firat, Kyunghyun Cho, Alexandra Birch, Barry Haddow, Julian Hitschler, Marcin Junczys-Dowmunt, Samuel Läubli, Antonio Valerio Miceli Barone, Jozef Mokry, and Maria Nadejde. Nematus: a toolkit for neural machine translation. In Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pages 65--68, Valencia, Spain, April 2017. Association for Computational Linguistics. [ bib | http ]
[71] Alexandra Birch, Omri Abend, Ondřej Bojar, and Barry Haddow. HUME: Human UCCA-based evaluation of machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1264--1274, Austin, Texas, November 2016. Association for Computational Linguistics. [ bib | http ]
[72] Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurelie Neveol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, and Marcos Zampieri. Findings of the 2016 conference on machine translation. In Proceedings of the First Conference on Machine Translation, pages 131--198, Berlin, Germany, August 2016. Association for Computational Linguistics. [ bib | .pdf ]
[73] Rico Sennrich and Barry Haddow. Linguistic Input Features Improve Neural Machine Translation. In Proceedings of First Conference on Machine Translation (WMT2016), 2016. [ bib ]
[74] Rico Sennrich, Barry Haddow, and Alexandra Birch. Edinburgh Neural Machine Translation Systems for WMT 16. In Proceedings of First Conference on Machine Translation (WMT2016), 2016. [ bib ]
[75] Matthias Huck, Alexander Fraser, and Barry Haddow. The Edinburgh/LMU Hierarchical Machine Translation System for WMT 2016. In Proceedings of First Conference on Machine Translation (WMT2016), 2016. [ bib ]
[76] Jan-Thorsten Peter, Tamer Alkhouli, Hermann Ney, Matthias Huck, Fabienne Braune, Alexander Fraser, Aleš Tamchyna, Ondřej Bojar, Barry Haddow, Rico Sennrich, Frédéric Blain, Lucia Specia, Jan Niehues, Alex Waibel, Alexandre Allauzen, Lauriane Aufrant, Franck Burlot, Elena Knyazeva, Thomas Lavergne, François Yvon, Joachim Daiber, and Mārcis Pinnis. The QT21/HimL Combined Machine Translation System. In Proceedings of First Conference on Machine Translation (WMT2016), 2016. [ bib ]
[77] Philip Williams, Rico Sennrich, Maria Nadejde, Matthias Huck, Barry Haddow, and Ondřej Bojar. Edinburgh's Statistical Machine Translation Systems for WMT16. In Proceedings of First Conference on Machine Translation (WMT2016), 2016. [ bib ]
[78] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural Machine Translation of Rare Words with Subword Units. In Proceedings of ACL, 2016. [ bib | http ]
[79] Rico Sennrich, Barry Haddow, and Alexandra Birch. Improving neural machine translation models with monolingual data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 86--96, Berlin, Germany, August 2016. Association for Computational Linguistics. [ bib | DOI | http ]
[80] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715--1725, Berlin, Germany, August 2016. Association for Computational Linguistics. [ bib | DOI | http ]
[81] Rico Sennrich, Barry Haddow, and Alexandra Birch. Controlling Politeness in Neural Machine Translation via Side Constraints. In Proceedings of NAACL, 2016. [ bib ]
[82] Ondřej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post, and Lucia Specia. Ten Years of WMT Evaluation Campaigns: Lessons Learnt. In Proceedings of the LREC 2016 Workshop “Translation Evaluation - From Fragmented Tools and Data Sets to an Integrated Ecosystem”, 2016. [ bib | .pdf ]
[83] Matthias Huck, Alexandra Birch, and Barry Haddow. Mixed-domain vs. multi-domain statistical machine translation. In Proceedings of MT Summit, 2015. [ bib | .pdf ]
[84] Rico Sennrich and Barry Haddow. A joint dependency model of morphological and syntactic structure for statistical machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 2081--2087, Lisbon, Portugal, September 2015. Association for Computational Linguistics. [ bib | .pdf ]
[85] Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, and Marco Turchi. Findings of the 2015 workshop on statistical machine translation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 1--46, Lisbon, Portugal, September 2015. Association for Computational Linguistics. [ bib | .pdf ]
[86] Barry Haddow, Matthias Huck, Alexandra Birch, Nikolay Bogoychev, and Philipp Koehn. The Edinburgh/JHU phrase-based machine translation systems for WMT 2015. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 126--133, Lisbon, Portugal, September 2015. Association for Computational Linguistics. [ bib | .pdf ]
[87] Eva Hasler, Barry Haddow, and Philipp Koehn. Combining Domain and Topic Adaptation for SMT. In Proceedings of AMTA, 2014. [ bib | .pdf ]
[88] Eva Hasler, Barry Haddow, and Philipp Koehn. Dynamic Topic Adaptation for SMT using Distributional Profiles. In Proceedings of the Ninth Workshop on Statistical Machine Translation, Baltimore, USA, June 2014. Association for Computational Linguistics. [ bib | .pdf ]
[89] Nadir Durrani, Barry Haddow, Kenneth Heafield, and Philipp Koehn. Edinburgh's Phrase-based Machine Translation Systems for WMT-14. In Proceedings of the Ninth Workshop on Statistical Machine Translation, Baltimore, USA, June 2014. Association for Computational Linguistics. [ bib | .pdf ]
[90] Ondřej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, and Aleš Tamchyna. Findings of the 2014 Workshop on Statistical Machine Translation. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pages 12--58, Baltimore, USA, June 2014. Association for Computational Linguistics. [ bib | .pdf ]
[91] Eva Hasler, Phil Blunsom, Philipp Koehn, and Barry Haddow. Dynamic Topic Adaptation for Phrase-based MT. In Proceedings of EACL, 2014. [ bib | .pdf ]
[92] Friedrich Neubarth, Barry Haddow, Adolfo Hernandez, and Harald Trost. A hybrid approach to statistical machine translation between standard and dialectal varieties. In Proceedings of the Sixth Language and Technology Conference, 2013. [ bib | .pdf ]
[93] Barry Haddow, Adolfo Hernandez, Friedrich Neubarth, and Harald Trost. Corpus development for machine translation between standard and dialectal varieties. In Proceedings of Workshop on Adaptation of language resources and tools for closely related languages and language variants, 2013. [ bib | .pdf ]
[94] Alexandra Birch, Barry Haddow, Ulrich Germann, Maria Nadejde, Christian Buck, and Philipp Koehn. The feasibility of HMEANT as a human MT evaluation metric. In Proceedings of the Eighth Workshop on Statistical Machine Translation, pages 52--61, Sofia, Bulgaria, August 2013. Association for Computational Linguistics. [ bib | .pdf ]
[95] Ondřej Bojar, Christian Buck, Chris Callison-Burch, Christian Federmann, Barry Haddow, Philipp Koehn, Christof Monz, Matt Post, Radu Soricut, and Lucia Specia. Findings of the 2013 Workshop on Statistical Machine Translation. In Proceedings of the Eighth Workshop on Statistical Machine Translation, pages 1--44, Sofia, Bulgaria, August 2013. Association for Computational Linguistics. [ bib | .pdf ]
[96] Nadir Durrani, Barry Haddow, Kenneth Heafield, and Philipp Koehn. Edinburgh's machine translation systems for European language pairs. In Proceedings of the Eighth Workshop on Statistical Machine Translation, pages 114--121, Sofia, Bulgaria, August 2013. Association for Computational Linguistics. [ bib | .pdf ]
[97] Pierrette Bouillon, Johanna Gerlach, Ulrich Germann, Barry Haddow, and Manny Rayner. Two Approaches to Correcting Homophone Confusions in a Hybrid Machine Translation System. In Proceedings of Second Workshop on Hybrid Approaches to Translation (HyTra), 2013. [ bib | .pdf ]
[98] Barry Haddow. Applying pairwise ranked optimisation to improve the interpolation of translation models. In Proceedings of NAACL, 2013. [ bib | .pdf ]
[99] Eva Hasler, Barry Haddow, and Philipp Koehn. Sparse lexicalised features and topic adaptation for SMT. In Proceedings of IWSLT, 2012. [ bib | .pdf ]
[100] Eva Hasler, Peter Bell, Arnab Ghoshal, Barry Haddow, Philipp Koehn, Fergus McInnes, Steve Renals, and Pawel Swietojanski. The UEDIN systems for the IWSLT 2012 evaluation. In Proceedings of IWSLT, 2012. [ bib | .pdf ]
[101] Manny Rayner, Pierrette Bouillon, and Barry Haddow. Using source-language transformations to address register mismatches in SMT. In Proceedings of AMTA, 2012. [ bib | .pdf ]
[102] Philipp Koehn and Barry Haddow. Interpolated backoff for factored translation models. In Proceedings of AMTA, 2012. [ bib | .pdf ]
[103] Barry Haddow and Philipp Koehn. Analysing the effect of out-of-domain data on SMT systems. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Montreal, Canada, June 2012. Association for Computational Linguistics. [ bib | .pdf ]
[104] Philipp Koehn and Barry Haddow. Towards effective use of training data in statistical machine translation. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Montreal, Canada, June 2012. Association for Computational Linguistics. [ bib | .pdf ]
[105] Eva Hasler, Barry Haddow, and Philipp Koehn. Margin infused relaxed algorithm for Moses. Prague Bulletin of Mathematical Linguistics, No. 96:69--78, 2011. [ bib | .pdf ]
[106] Barry Haddow, Abhishek Arun, and Philipp Koehn. SampleRank training for phrase-based machine translation. In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, Scotland, July 2011. Association for Computational Linguistics. (Contains minor corrections to published version). [ bib | .pdf ]
[107] Philipp Koehn, Barry Haddow, Philip Williams, and Hieu Hoang. More linguistic annotation for statistical machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 115--120, Uppsala, Sweden, July 2010. Association for Computational Linguistics. [ bib | .pdf ]
[108] Abhishek Arun, Barry Haddow, and Philipp Koehn. A unified approach to minimum risk training and decoding. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 365--374, Uppsala, Sweden, July 2010. Association for Computational Linguistics. [ bib | slides | .pdf ]
[109] Abhishek Arun, Barry Haddow, Philipp Koehn, Adam Lopez, Chris Dyer, and Phil Blunsom. Monte Carlo techniques for phrase-based translation. Machine Translation, 2010. [ bib | DOI | .pdf ]
[110] Barry Haddow. Adding multi-threaded decoding to Moses. Prague Bulletin of Mathematical Linguistics, No. 93:57--66, 2010. [ bib | slides | .pdf ]
[111] Philipp Koehn and Barry Haddow. Interactive assistance to human translators using statistical machine translation methods. In Proceedings of MT Summit XII, Ottawa, Canada, 2009. [ bib ]
[112] Abhishek Arun, Chris Dyer, Barry Haddow, Phil Blunsom, Adam Lopez, and Philipp Koehn. Monte Carlo inference and maximization for phrase-based translation. In Proceedings of the Conference on Natural Language Learning, Boulder, USA, June 2009. Association for Computational Linguistics. [ bib | .pdf ]
[113] Philipp Koehn and Barry Haddow. Edinburgh's submission to all tracks of the WMT 2009 shared task with reordering and speed improvements to Moses. In Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 160--164, Athens, Greece, March 2009. Association for Computational Linguistics. [ bib | .pdf ]
[114] Nicola Bertoldi, Barry Haddow, and Jean-Baptiste Fouet. Improved minimum error rate training in Moses. Prague Bulletin of Mathematical Linguistics, No. 91:7--16, 2009. [ bib | slides | .pdf ]
[115] Barry Haddow. Using automated feature optimisation to create an adaptable relation extraction system. In Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, pages 19--27, Columbus, Ohio, June 2008. Association for Computational Linguistics. [ bib | .pdf ]
[116] Bea Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Stuart Roebuck, Richard Tobin, and Xinglong Wang. The ITI TXM Corpora: Tissue Expressions and Protein-Protein Interactions. In Proceedings of Workshop on Building and evaluating resources for biomedical text mining, LREC 2008, Marrakesh, Morocco, 2008. [ bib | .pdf ]
[117] Barry Haddow and Beatrice Alex. Exploiting multiply annotated corpora in biomedical information extraction tasks. In Proceedings of LREC, Marrakesh, Morocco, 2008. [ bib | .pdf ]
[118] Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Richard Tobin, and Xinglong Wang. Automating curation using a natural language processing pipeline. Genome Biology, 9(Suppl 2):S10, 2008. [ bib | DOI | http ]
[119] Larry Smith et al. Overview of BioCreative II gene mention recognition. Genome Biology, 9(Suppl 2):S2, 2008. [ bib | http ]
[120] Florian Leitner et al. Introducing meta-services for biomedical information extraction. Genome Biology, 9(Suppl 2):S6, 2008. [ bib | http ]
[121] Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Stuart Roebuck, Richard Tobin, and Xinglong Wang. Assisted curation: Does text mining really help? In Proceedings of Pacific Symposium on Biocomputing, Hawaii, USA, January 2008. [ bib | .pdf ]
[122] Barry Haddow and Michael Matthews. The extraction of enriched protein-protein interactions from biomedical text. In Biological, translational, and clinical language processing, pages 145--152, Prague, Czech Republic, June 2007. Association for Computational Linguistics. [ bib | .pdf ]
[123] Beatrice Alex, Barry Haddow, and Claire Grover. Recognising nested named entities in biomedical text. In Biological, translational, and clinical language processing, pages 65--72, Prague, Czech Republic, June 2007. Association for Computational Linguistics. [ bib | .pdf ]
[124] Claire Grover, Barry Haddow, Ewan Klein, Michael Matthews, Leif Nielsen, Richard Tobin, and Xinglong Wang. Adapting a Relation Extraction Pipeline for the BioCreAtIvE II Tasks. In Proceedings of the Second BioCreative Challenge Workshop, pages 273--286, Madrid, Spain, 2007. CNIO. [ bib | .pdf ]

What I Did Before

In 2005 I completed an MSc in Artificial Intelligence at Edinburgh University, specialising in Natural Language. Prior to that I worked for several years at Orbism, a Dublin-based IT consultancy. Going back even further, I gained a PhD in Mathematics from Aberdeen University in 1994, supervised by Graham Hall.