Errata for "Local String Transduction as Sequence Labeling", Joana Ribeiro, Shashi Narayan, Shay B. Cohen and Xavier Carreras, COLING 2018
==========================================================================================================================================

In our paper, we report results for the Finnish OCR task and compare our "accuracy" results with those of Silfverberg et al. (2016). However, as Simon Clematide pointed out to us at COLING, the comparison is not valid: Silfverberg et al. did not use a strict accuracy metric (the percentage of correctly output words). Instead, they used a metric called "correction rate", calculated as

    correction_rate = (tp - fp) / (tp + fn)

where

  tp is the number of input words that contained a mistake and were output by the system in the correct form,
  fp is the number of input words that did not contain a mistake but were changed by the system,
  fn is the number of input words that contained a mistake and were not changed to the correct form.

We include below the results with the correction-rate values. As can be seen, our results are significantly lower than those of Silfverberg et al.; to match their reported correction rates, our accuracy would have to be significantly higher than the values in the 80s that we obtain. (Please note that for EM we only averaged the results over folds 0, 2, 4 and 8, as we lost the original files for the other folds. The results are stable across folds.)

bilstm:
  Average accuracy across folds:        0.83183
  Average correction rate across folds: 0.11413

em:
  Average accuracy across folds:        0.79922
  Average correction rate across folds: -0.06471

spectral:
  Average accuracy across folds:        0.79443
  Average correction rate across folds: -0.08268

crf:
  Average accuracy across folds:        0.83337
  Average correction rate across folds: 0.12225

(This correction is noted in Joana Ribeiro's MPhil thesis, which is based on this paper.)
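For concreteness, the correction rate defined above can be computed as in the following sketch. This is only our reading of the metric's definitions, not the evaluation script of Silfverberg et al.; the function name and the (input, system output, gold) triple format are our own illustrative choices.

```python
def correction_rate(triples):
    """Compute the correction rate from (input_word, system_output, gold_word)
    triples, following the tp/fp/fn definitions given in the errata above.
    Illustrative sketch only; the original evaluation may count differently."""
    tp = fp = fn = 0
    for src, hyp, gold in triples:
        if src != gold:        # input word contained a mistake
            if hyp == gold:    # system output the correct form
                tp += 1
            else:              # mistake not changed to the correct form
                fn += 1
        elif hyp != src:       # no mistake in input, but system changed it
            fp += 1
    return (tp - fp) / (tp + fn)

# Toy example: two fixed errors, one missed error, one spurious change.
triples = [
    ("hel1o", "hello", "hello"),  # tp: error fixed
    ("w0rld", "world", "world"),  # tp: error fixed
    ("c4t",   "c4t",   "cat"),    # fn: error left in place
    ("dog",   "dogg",  "dog"),    # fp: correct word changed
]
print(correction_rate(triples))   # (2 - 1) / (2 + 1) = 0.333...
```

Note that, unlike accuracy, the metric can go negative (as in our EM and spectral results) whenever a system introduces more errors into correct words than it fixes.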