Corpus-based measures discriminate inflection and derivation cross-linguistically. Coleman Haley, Edoardo Ponti and Sharon Goldwater.
Journal of Language Modelling.
In press.
Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets. Amr Keleg, Sharon Goldwater and Walid Magdy.
In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. 2024.
A predictive learning model can simulate temporal dynamics and context effects found in neural representations of continuous speech. Oli Danyi Liu, Hao Tang, Naomi H. Feldman and Sharon Goldwater.
In Proceedings of the 46th Annual Conference of the Cognitive Science Society. 2024.
(Winner of the Computational Modeling Prize for Perception & Action)
Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations. Mukhtar Mohamed, Oli Danyi Liu, Hao Tang and Sharon Goldwater.
In Proceedings of Interspeech. 2024.
Cross-linguistically Consistent Semantic and Syntactic Annotation of Child-directed Speech. Ida Szubert, Omri Abend, Nathan Schneider, Samuel Gibbon, Louis Mahon, Sharon Goldwater and Mark Steedman.
Language Resources and Evaluation.
2024.
ALDi: Quantifying the Arabic Level of Dialectness of Text. Amr Keleg, Sharon Goldwater and Walid Magdy.
In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
Self-supervised Predictive Coding Models Encode Speaker and Phonetic Information in Orthogonal Subspaces. Oli Danyi Liu, Hao Tang and Sharon Goldwater.
In Proceedings of Interspeech. 2023.
Infant Phonetic Learning as Perceptual Space Learning: A Crosslinguistic Evaluation of Computational Models. Yevgen Matusevych, Thomas Schatz, Herman Kamper, Naomi Feldman and Sharon Goldwater.
Cognitive Science 47 (7), pp. e13314.
2023.
Prosodic features improve sentence segmentation and parsing in English. Elizabeth Nielsen, Mark Steedman and Sharon Goldwater.
In Proceedings of Interspeech. 2023.
Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling. Ramon Sanabria, Hao Tang and Sharon Goldwater.
In Proceedings of Interspeech. 2023.
Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models. Ramon Sanabria, Hao Tang and Sharon Goldwater.
In Proceedings of the 48th IEEE International Conference on Acoustics, Speech and Signal Processing. 2023.
Regularization or lexical probability-matching? How German speakers generalize plural morphology. Kate McCurdy, Sharon Goldwater and Adam Lopez.
In Proceedings of the 44th Annual Conference of the Cognitive Science Society. 2022.
Do infants really learn phonetic categories? Naomi H. Feldman, Sharon Goldwater, Emmanuel Dupoux and Thomas Schatz.
Open Mind: Discoveries in Cognitive Science.
2021.
Multilingual and unsupervised subword modeling for zero-resource languages. Enno Hermann, Herman Kamper and Sharon Goldwater.
Computer Speech and Language 65.
2021.
Improved acoustic word embeddings for zero-resource languages using multilingual transfer. Herman Kamper, Yevgen Matusevych and Sharon Goldwater.
IEEE Transactions on Audio, Speech and Language Processing 29.
2021.
A phonetic model of non-native spoken word processing. Yevgen Matusevych, Herman Kamper, Thomas Schatz, Naomi Feldman and Sharon Goldwater.
In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. 2021.
(Received an Honourable Mention for the Best Long Paper award.)
Black or White but never neutral: How readers perceive identity from yellow or skin-toned emoji. Alexander Robertson, Walid Magdy and Sharon Goldwater.
Proceedings of the ACM on Human-Computer Interaction 5 (CSCW2), pp. 1–23.
2021.
Identity Signals in Emoji Do not Influence Perception of Factual Truth on Twitter. Alexander Robertson, Walid Magdy and Sharon Goldwater.
In Proceedings of the Fourth International Workshop on Emoji Understanding and Applications in Social Media. 2021.
On the Difficulty of Segmenting Words with Attention. Ramon Sanabria, Hao Tang and Sharon Goldwater.
In Proceedings of the Workshop on Insights from Negative Results in NLP. 2021.
Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input. Thomas Schatz, Naomi H. Feldman, Sharon Goldwater, Xuan-Nga Cao and Emmanuel Dupoux.
Proceedings of the National Academy of Sciences 118 (7).
2021.
Cross-lingual topic prediction for speech using translations. Sameer Bansal, Herman Kamper, Adam Lopez and Sharon Goldwater.
In Proceedings of the 45th IEEE International Conference on Acoustics, Speech and Signal Processing. 2020.
Multilingual acoustic word embedding models for processing zero-resource languages. Herman Kamper, Yevgen Matusevych and Sharon Goldwater.
In Proceedings of the 45th IEEE International Conference on Acoustics, Speech and Signal Processing. 2020.
Input matters in the modeling of early phonetic learning. Ruolan Li, Thomas Schatz, Yevgen Matusevych, Sharon Goldwater and Naomi Feldman.
In Proceedings of the 42nd Annual Conference of the Cognitive Science Society. 2020.
Analyzing autoencoder-based acoustic word embeddings. Yevgen Matusevych, Herman Kamper and Sharon Goldwater.
In Workshop on Bridging AI and Cognitive Science at ICLR. 2020.
Evaluating computational models of infant phonetic learning across languages. Yevgen Matusevych, Thomas Schatz, Herman Kamper, Naomi Feldman and Sharon Goldwater.
In Proceedings of the 42nd Annual Conference of the Cognitive Science Society. 2020.
Inflecting when there’s no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals. Kate McCurdy, Sharon Goldwater and Adam Lopez.
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.
Conditioning, but on Which Distribution? Grammatical Gender in German Plural Inflection. Kate McCurdy, Adam Lopez and Sharon Goldwater.
In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pp. 59–65. 2020.
The role of context in neural pitch accent detection in English. Elizabeth Nielsen, Mark Steedman and Sharon Goldwater.
In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pp. 7994–8000. 2020.
Emoji Skin Tone Modifiers: Analyzing Variation in Usage on Social Media. Alexander Robertson, Walid Magdy and Sharon Goldwater.
ACM Transactions on Social Computing 3 (2), pp. 1–25.
2020.
Analyzing ASR pretraining for low-resource speech-to-text translation. Mihaela C. Stoian, Sameer Bansal and Sharon Goldwater.
In Proceedings of the 45th IEEE International Conference on Acoustics, Speech and Signal Processing. 2020.
Pre-training on high-resource speech recognition improves low-resource speech-to-text translation. Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez and Sharon Goldwater.
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019.
Data Augmentation for Context-Sensitive Neural Lemmatization Using Inflection Tables and Raw Text. Toms Bergmanis and Sharon Goldwater.
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019.
(This is an updated version that fixes a cross-reference and a consequential typo: we used macro-averaged type accuracy, not micro-averaged as stated in the official published version.)
Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection. Maria Corkery, Yevgen Matusevych and Sharon Goldwater.
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
Low-Resource Speech-to-Text Translation. Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez and Sharon Goldwater.
In Proceedings of Interspeech. 2018.
Context Sensitive Neural Lemmatization with Lematus. Toms Bergmanis and Sharon Goldwater.
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018.
Multilingual bottleneck features for subword modeling in zero-resource languages. Enno Hermann and Sharon Goldwater.
In Proceedings of Interspeech. 2018.
Evaluating historical text normalization systems: How well do they generalize? Alexander Robertson and Sharon Goldwater.
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018.
Self-Representation on Twitter Using Emoji Skin Color Modifiers. Alexander Robertson, Walid Magdy and Sharon Goldwater.
In Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM). 2018.
Inducing a lexicon of sociolinguistic variables from code-mixed text. Philippa Shoemark, James Kirby and Sharon Goldwater.
In Workshop on Noisy User-Generated Text at EMNLP. 2018.
(Received Best Paper award. Includes Supplementary Information.)
Bootstrapping Language Acquisition. Omri Abend, Tom Kwiatkowski, Nathaniel J. Smith, Sharon Goldwater and Mark Steedman.
Cognition 164, pp. 116–143.
2017.
Spoken Term Discovery for Language Documentation using Translations. Antonios Anastasopoulos, Sameer Bansal, Sharon Goldwater, Adam Lopez and David Chiang.
In Workshop on Speech-Centric Natural Language Processing at EMNLP. 2017.
Towards speech-to-text translation without speech recognition. Sameer Bansal, Herman Kamper, Adam Lopez and Sharon Goldwater.
In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. 2017.
Weakly supervised spoken term discovery using cross-lingual side information. Sameer Bansal, Herman Kamper, Sharon Goldwater and Adam Lopez.
In Proceedings of the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing. 2017.
(Copyright 2017 IEEE.)
From segmentation to analyses: a probabilistic model for unsupervised morphology induction. Toms Bergmanis and Sharon Goldwater.
In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. 2017.
Training Data Augmentation for Low-Resource Morphological Inflection. Toms Bergmanis, Katharina Kann, Hinrich Schütze and Sharon Goldwater.
In Proceedings of the CoNLL-SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection. 2017.
A segmental framework for fully-unsupervised large-vocabulary speech recognition. Herman Kamper, Aren Jansen and Sharon Goldwater.
Computer Speech and Language 46, pp. 154–174.
2017.
(Received the 2021 Best Research Paper award, given for a paper published in the journal during the previous five years.)
An embedded segmental K-means model for unsupervised segmentation and clustering of speech. Herman Kamper, Karen Livescu and Sharon Goldwater.
In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). 2017.
(Nominated for Best Paper award. Copyright 2017 IEEE.)
Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data. Philippa Shoemark, James Kirby and Sharon Goldwater.
In Workshop on Stylistic Variation at EMNLP 2017. 2017.
Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media. Philippa Shoemark, Debnil Sur, Luke Shrimpton, Iain Murray and Sharon Goldwater.
In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. 2017.
Unsupervised Word Segmentation and Lexicon Discovery using Acoustic Word Embeddings. Herman Kamper, Aren Jansen and Sharon Goldwater.
IEEE Transactions on Audio, Speech and Language Processing 24 (4), pp. 669–679.
2016.
(Copyright 2016 IEEE.)
Statistical Learning, Inductive Bias, and Bayesian Inference in Language Acquisition. Lisa Pearl and Sharon Goldwater.
In Lidz, Jeffrey and Snyder, William and Pater, Joe, editors. Oxford Handbook of Developmental Linguistics, chapter 28, pp. 664–695. Oxford University Press. 2016.
Towards robust cross-linguistic comparisons of phonological networks. Philippa Shoemark, Sharon Goldwater, James Kirby and Rik Sarkar.
In Proceedings of the 14th ACL SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pp. 110–120. 2016.
Unsupervised neural network based feature extraction using weak top-down constraints. Herman Kamper, Micha Elsner, Aren Jansen and Sharon Goldwater.
In Proceedings of the 40th IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5818–5822. 2015.
(Copyright 2015 IEEE.)
Fully Unsupervised Small-Vocabulary Speech Recognition Using a Segmental Bayesian Model. Herman Kamper, Aren Jansen and Sharon Goldwater.
In Proceedings of Interspeech. 2015.
Talkers account for listener and channel characteristics to communicate efficiently. John K. Pate and Sharon Goldwater.
Journal of Memory and Language 78, pp. 1–17.
2015.
A Comparison of Neural Network Methods for Unsupervised Representation Learning on the Zero Resource Speech Challenge. Daniel Renshaw, Herman Kamper, Aren Jansen and Sharon Goldwater.
In Proceedings of Interspeech. 2015.
Weak semantic context helps phonetic learning in a model of infant language acquisition. Stella Frank, Naomi Feldman and Sharon Goldwater.
In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pp. 1073–1083. 2014.
Unsupervised lexical clustering of speech segments using fixed dimensional acoustic embeddings. Herman Kamper, Aren Jansen, Simon King and Sharon Goldwater.
In Proceedings of the IEEE Spoken Language Technology Workshop. 2014.
(Copyright 2014 IEEE.)
POS induction with distributional and morphological information using a distance-dependent Chinese restaurant process. Kairit Sirts, Jacob Eisenstein, Micha Elsner and Sharon Goldwater.
In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 2: Short Papers, pp. 265–271. 2014.
A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability. Micha Elsner, Sharon Goldwater, Naomi Feldman and Frank Wood.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 42–54. 2013.
A role for the developing lexicon in phonetic category acquisition. Naomi H. Feldman, Thomas L. Griffiths, Sharon Goldwater and James L. Morgan.
Psychological Review 120 (4), pp. 751–778.
2013.
Adding sentence types to a model of syntactic category acquisition. Stella Frank, Sharon Goldwater and Frank Keller.
TopiCS in Cognitive Science 5 (3), pp. 495–521.
2013.
Exploring the utility of joint morphological and syntactic learning from child-directed speech. Stella Frank, Frank Keller and Sharon Goldwater.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 30–41. 2013.
A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition. Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur, Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard Rose, Mike Seltzer, Pascal Clark, Ian McGraw, Balakrishnan Varadarajan, Erin Bennett, Benjamin Borschinger, Justin Chiu, Ewan Dunbar, Abdallah Fourtassi, David Harwath, Chia-ying Lee, Keith Levin, Atta Norouzian, Vijay Peddinti, Rachel Richardson, Thomas Schatz and Samuel Thomas.
In Proceedings of the 38th IEEE International Conference on Acoustics, Speech and Signal Processing. 2013.
(Copyright 2013 IEEE.)
Modeling Graph Languages with Grammars Extracted via Tree Decompositions. Bevan K. Jones, Mark Johnson and Sharon Goldwater.
In Proceedings of the 11th Conference on Finite-State Methods and Natural Language Processing. 2013.
Unsupervised dependency parsing with acoustic cues. John K. Pate and Sharon Goldwater.
Transactions of the Association for Computational Linguistics 1(Mar), pp. 63–74.
2013.
Minimally-Supervised Morphological Segmentation using Adaptor Grammars. Kairit Sirts and Sharon Goldwater.
Transactions of the Association for Computational Linguistics 1(May), pp. 231–242.
2013.
Turning the pipeline into a loop: Iterated unsupervised dependency parsing and PoS induction. Christos Christodoulopoulos, Sharon Goldwater and Mark Steedman.
In Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure. 2012.
Bootstrapping a Unified Model of Lexical and Phonetic Acquisition. Micha Elsner, Sharon Goldwater and Jacob Eisenstein.
In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp. 184–193. 2012.
Semantic Parsing with Bayesian Tree Transducers. Bevan K. Jones, Mark Johnson and Sharon Goldwater.
In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp. 488–496. 2012.
A Probabilistic Model of Syntactic and Semantic Acquisition from Child-Directed Utterances and their Meanings. Tom Kwiatkowski, Sharon Goldwater, Luke Zettelmoyer and Mark Steedman.
In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 234–244. 2012.
A Bayesian mixture model for part-of-speech induction using multiple features. Christos Christodoulopoulos, Sharon Goldwater and Mark Steedman.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 638–647. 2011.
Producing power-law distributions and damping word frequencies with two-stage language models. Sharon Goldwater, Thomas L. Griffiths and Mark Johnson.
Journal of Machine Learning Research 12(Jul), pp. 2335–2382.
2011.
Formalizing Semantic Parsing with Tree Transducers. Bevan K. Jones, Mark Johnson and Sharon Goldwater.
In Proceedings of the Australasian Language Technology Workshop. 2011.
Lexical generalization in CCG grammar induction for semantic parsing. Tom Kwiatkowski, Luke Zettelmoyer, Sharon Goldwater and Mark Steedman.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1512–1523. 2011.
Unsupervised extraction of recurring words from infant-directed speech. Fergus R. McInnes and Sharon Goldwater.
In Proceedings of the 33rd Annual Conference of the Cognitive Science Society. 2011.
Predictability effects in adult-directed and infant-directed speech: Does the listener matter? John K. Pate and Sharon Goldwater.
In Proceedings of the 33rd Annual Conference of the Cognitive Science Society. 2011.
Unsupervised syntactic chunking with acoustic cues: Computational models for prosodic bootstrapping. John K. Pate and Sharon Goldwater.
In Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics. 2011.
(Received Best Student Paper award.)
Two decades of unsupervised POS induction: How far have we come? Christos Christodoulopoulos, Sharon Goldwater and Mark Steedman.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 575–584. 2010.
(This is an updated version that corrects a minor bug in the computation of the vmb measure in Figure 1 and Table 1. Thanks to Andreas Zollmann for pointing this out.)
Inducing tree substitution grammars. Trevor Cohn, Phil Blunsom and Sharon Goldwater.
Journal of Machine Learning Research 11(Nov), pp. 3053–3096.
2010.
Beyond transitional probabilities: Human learners impose a parsimony bias in statistical word segmentation. Michael C. Frank, Inbal Arnon, Harry Tily and Sharon Goldwater.
In Proceedings of the 32nd Annual Conference of the Cognitive Science Society. 2010.
Modeling human performance in statistical word segmentation. Michael C. Frank, Sharon Goldwater, Thomas L. Griffiths and Joshua B. Tenenbaum.
Cognition 117 (2), pp. 107–125.
2010.
Using sentence type information for syntactic category acquisition. Stella Frank, Sharon Goldwater and Frank Keller.
In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics at ACL. 2010.
Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates. Sharon Goldwater, Daniel Jurafsky and Christopher D. Manning.
Speech Communication 52 (3), pp. 181–200.
2010.
Inducing probabilistic CCG grammars from logical form with higher-order unification. Tom Kwiatkowski, Luke Zettelmoyer, Sharon Goldwater and Mark Steedman.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1223–1233. 2010.
How ideal are we? Incorporating human limitations into Bayesian models of word segmentation. Lisa Pearl, Sharon Goldwater and Mark Steyvers.
In Proceedings of the 34th annual Boston University Conference on Child Language Development, pp. 315–326. 2010.
Online Learning Mechanisms for Bayesian Models of Word Segmentation. Lisa Pearl, Sharon Goldwater and Mark Steyvers.
Research on Language and Computation 8 (2), pp. 107–132.
2010.
A note on the implementation of Hierarchical Dirichlet Processes. Phil Blunsom, Trevor Cohn, Sharon Goldwater and Mark Johnson.
In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics, pp. 337–340. 2009.
(This is an updated version with corrected pseudocode for Algorithm 1: line 16 was previously missing. Thanks to Weng Wei for pointing this out.)
Inducing compact but accurate tree-substitution grammars. Trevor Cohn, Sharon Goldwater and Phil Blunsom.
In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 548–556. 2009.
Evaluating models of syntactic category acquisition without using a gold standard. Stella Frank, Sharon Goldwater and Frank Keller.
In Proceedings of the 31st Annual Conference of the Cognitive Science Society. 2009.
A Bayesian framework for word segmentation: Exploring the effects of context. Sharon Goldwater, Thomas L. Griffiths and Mark Johnson.
Cognition 112 (1), pp. 21–54.
2009.
(Results in this paper are based on a newer version of the code used in the ACL06 and BUCLD07 word segmentation papers and chapter 5 of my thesis. The new version corrects a small bug in the implementation of the bigram (HDP) model. Please cite results from this paper in future publications.)
Improving nonparametric Bayesian inference: Experiments on unsupervised word segmentation with adaptor grammars. Mark Johnson and Sharon Goldwater.
In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 317–325. 2009.
Improving morphology induction by learning spelling rules. Jason Naradowsky and Sharon Goldwater.
In Proceedings of IJCAI. 2009.
Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase ASR error rates. Sharon Goldwater, Daniel Jurafsky and Christopher D. Manning.
In Proceedings of ACL-08: HLT, pp. 380–388. 2008.
Modeling Human Performance in Statistical Word Segmentation. Michael C. Frank, Sharon Goldwater, Vikash Mansinghka, Thomas L. Griffiths and Joshua B. Tenenbaum.
In Proceedings of the 29th Annual Conference of the Cognitive Science Society. 2007.
Distributional cues to word segmentation: Context is important. Sharon Goldwater, Thomas L. Griffiths and Mark Johnson.
In Proceedings of the 31st Boston University Conference on Language Development. 2007.
(If you plan to cite results from this paper, see this note.)
A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging. Sharon Goldwater and Thomas L. Griffiths.
In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 744–751. 2007.
Adaptor Grammars: a Framework for Specifying Compositional Nonparametric Bayesian Models. Mark Johnson, Thomas L. Griffiths and Sharon Goldwater.
In Advances in Neural Information Processing Systems 19, pp. 641–648. 2007.
(This is an updated version that fixes a typo in equation 4. Thanks to Julia Hockenmaier for pointing this out.)
Bayesian Inference for PCFGs via Markov chain Monte Carlo. Mark Johnson, Thomas L. Griffiths and Sharon Goldwater.
In Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, pp. 139–146. 2007.
Contextual Dependencies in Unsupervised Word Segmentation. Sharon Goldwater, Thomas L. Griffiths and Mark Johnson.
In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pp. 673–680. 2006.
Interpolating Between Types and Tokens by Estimating Power-Law Generators. Sharon Goldwater, Thomas L. Griffiths and Mark Johnson.
In Advances in Neural Information Processing Systems 18, pp. 459–466. 2006.
(This is a corrected version.)
Nonparametric Bayesian Models of Lexical Acquisition. Sharon Goldwater.
Ph.D. Dissertation, Brown University, 2006.
(This is the tree-saving version of my thesis, single-spaced with minimal front matter. The official version is double-spaced and contains more front matter. If you plan to cite results on word segmentation, see this note.)
A Non-Parametric Bayesian Approach to Spike Sorting. Frank Wood, Sharon Goldwater and Michael Black.
In Proceedings of the 28th IEEE Conference on Engineering in Medicine and Biological Systems. 2006.
Representational Bias in Unsupervised Learning of Syllable Structure. Sharon Goldwater and Mark Johnson.
In Proceedings of the Ninth Conference on Computational Natural Language Learning (CONLL ’05), pp. 112–119. 2005.
Improving Statistical MT Through Morphological Analysis. Sharon Goldwater and David McClosky.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 676–683. 2005.
Priors in Bayesian Learning of Phonological Rules. Sharon Goldwater and Mark Johnson.
In Proceedings of the Seventh Meeting of the ACL Special Interest Group in Computational Phonology (SIGPHON ’04). 2004.
Statically finding errors in spreadsheets. Yanif Ahmad, Tudor Antoniu, Sharon Goldwater and Shriram Krishnamurthi.
In Proceedings of the IEEE International Conference on Software Engineering. 2003.
Learning OT Constraint Rankings Using a Maximum Entropy Model. Sharon Goldwater and Mark Johnson.
In Proceedings of the Workshop on Variation within Optimality Theory, pp. 113–122. 2003.
Building a Robust Dialogue System with Limited Data. Sharon Goldwater, Elizabeth Owen Bratt, Jean Mark Gawron and John Dowding.
In Proceedings of the NAACL Workshop on Conversational Systems. 2000.
Compiling language models from a linguistically motivated unification grammar. Manny Rayner, Beth Ann Hockey, Frankie James, Elizabeth Owen Bratt, Sharon Goldwater and Jean Mark Gawron.
In Proceedings of the 18th Conference on Computational Linguistics, Volume 2, pp. 670–676. 2000.
Interpreting language in context in CommandTalk. John Dowding, Elizabeth Owen Bratt and Sharon Goldwater.
In Proceedings of Communicative Agents: The Use of Natural Language in Embodied Systems. 1999.
Edge-based best-first chart parsing. Eugene Charniak, Sharon Goldwater and Mark Johnson.
In Proceedings of the Sixth Workshop on Very Large Corpora at COLING-ACL. 1998.
Thanks to Charles Sutton for the scripts used to generate this page automatically from my BibTeX file. You too can download the scripts here.