Publications by Year

Alternatively, see my publications by topic.

2023

  1. Can Large Language Models Reason about Program Invariants? Kexin Pei, David Bieber, Kensen Shi, Charles Sutton and Pengcheng Yin. In International Conference on Machine Learning. 2023.

    [ .pdf | bib | abstract ]

  2. Any-scale Balanced Samplers for Discrete Space. Haoran Sun, Bo Dai, Charles Sutton, Dale Schuurmans and Hanjun Dai. In International Conference on Learning Representations. 2023.

    [ .pdf | bib | abstract ]

  3. Natural Language to Code Generation in Interactive Data Science Notebooks. Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov and Charles Sutton. In Proceedings of the Association for Computational Linguistics (ACL). 2023.

    [ arXiv | bib | abstract | source code ]

2022

  1. PaLM: Scaling Language Modeling with Pathways. Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov and Noah Fiedel. arXiv:2204.02311. 2022.

    [ arXiv | bib ]

  2. CrossBeam: Learning to Search in Bottom-Up Program Synthesis. Kensen Shi, Hanjun Dai, Kevin Ellis and Charles Sutton. In International Conference on Learning Representations (ICLR). 2022.

    [ arXiv | bib ]

  3. Compositional generalization and decomposition in neural program synthesis. Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin and Charles Sutton. In ICLR Workshop on Deep Learning for Code (DL4C). 2022.

    [ arXiv | bib | abstract ]

2021

  1. SpreadsheetCoder: Formula Prediction from Semi-structured Context. Xinyun Chen, Petros Maniatis, Rishabh Singh, Charles Sutton, Hanjun Dai, Max Lin and Denny Zhou. In International Conference on Machine Learning (ICML). 2021.

    [ to appear | bib ]

  2. Latent Programmer: Discrete Latent Codes for Program Synthesis. Joey Hong, David Dohan, Rishabh Singh, Charles Sutton and Manzil Zaheer. In International Conference on Machine Learning (ICML). 2021.

    [ to appear | bib ]

  3. BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration. Augustus Odena, Kensen Shi, David Bieber, Rishabh Singh, Charles Sutton and Hanjun Dai. In International Conference on Learning Representations. 2021.

    [ .pdf | bib ]

  4. Program Synthesis with Large Language Models. Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le and Charles Sutton. arXiv:2108.07732. 2021.

    [ arXiv | bib ]

  5. The IDEAL household energy dataset, electricity, gas, contextual sensor data and survey data for 255 UK homes. Martin Pullinger, Jonathan Kilgour, Nigel Goddard, Niklas Berliner, Lynda Webb, Myroslava Dzikovska, Heather Lovell, Janek Mann, Charles Sutton, Janette Webb and Mingjun Zhong. Scientific Data 8 (1). 2021.

    [ .pdf | bib | abstract | data ]

  6. Show your work: Scratchpads for intermediate computation with language models. Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton and Augustus Odena. 2021.

    [ to appear | bib | abstract ]

  7. Learning Semantic Representations to Verify Hardware Designs. Shobha Vasudevan, Wenjie Jiang, David Bieber, Rishabh Singh, Hamid Shojaei, Richard Ho and Charles Sutton. In Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS). 2021.

    [ to appear | bib ]

  8. Couplings for Multinomial Hamiltonian Monte Carlo. Kai Xu, Tor Erlend Fjelde, Charles Sutton and Hong Ge. In International Conference on Artificial Intelligence and Statistics (AISTATS). 2021.

    [ arXiv | bib | source code ]

2020

  1. Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks. David Bieber, Charles Sutton, Hugo Larochelle and Daniel Tarlow. In Advances in Neural Information Processing Systems (NeurIPS). 2020.

    [ .pdf | bib | abstract ]

  2. Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration. Hanjun Dai, Rishabh Singh, Bo Dai, Charles Sutton and Dale Schuurmans. In Advances in Neural Information Processing Systems (NeurIPS). 2020.

    [ .pdf | bib | abstract ]

  3. Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data. Simao Eduardo, Alfredo Nazabal, Christopher K. I. Williams and Charles Sutton. In Conference on Artificial Intelligence and Statistics (AISTATS). 2020.

    [ arXiv | bib ]

  4. Global Relational Models of Source Code. Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis and David Bieber. In International Conference on Learning Representations. 2020.

    [ .pdf | bib ]

  5. How Often Do Single-Statement Bugs Occur? The ManySStuBs4J Dataset. Rafael-Michael Karampatsis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR; Data Showcase). 2020.

    [ arXiv | bib ]

  6. Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code. Rafael-Michael Karampatsis, Hlib Babii, Romain Robbes, Charles Sutton and Andrea Janes. In International Conference on Software Engineering (ICSE). 2020.

    [ .pdf | bib ]

  7. Where should I comment my code? A dataset and model for predicting locations that need comments. Annie Louis, Santanu Kumar Dash, Earl T Barr, Michael D Ernst and Charles Sutton. In International Conference on Software Engineering (ICSE; NIER track). 2020.

    [ .pdf | bib | source code ]

  8. Learning to Represent Programs with Property Signatures. Augustus Odena and Charles Sutton. In International Conference on Learning Representations. 2020.

    [ .pdf | bib ]

  9. Incremental Sampling Without Replacement for Sequence Models. Kensen Shi, David Bieber and Charles Sutton. In International Conference on Machine Learning (ICML). 2020.

    [ arXiv | bib ]

  10. Generative Ratio Matching Networks. Akash Srivastava, Kai Xu, Michael U. Gutmann and Charles Sutton. In International Conference on Learning Representations. 2020.

    [ .pdf | bib ]

  11. Learning to Fix Build Errors with Graph2Diff Neural Networks. Daniel Tarlow, Subhodeep Moitra, Andrew Rice, Zimin Chen, Pierre-Antoine Manzagol, Charles Sutton and Edward Aftandilian. In ICSE Workshop on Automated Program Repair. 2020.

    [ arXiv | bib ]

2019

  1. ColNet: Embedding the Semantics of Web Tables for Column Type Prediction. Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks and Charles Sutton. In National Conference on Artificial Intelligence (AAAI). 2019.

    [ arXiv | bib ]

  2. Probabilistic Programming with Densities in SlicStan: Efficient, Flexible and Deterministic. Maria I. Gorinova, Andrew D. Gordon and Charles Sutton. In ACM SIGPLAN Symposium on Principles of Programming Languages (POPL). 2019.

    [ arXiv | bib ]

  3. Learning to Fix Build Errors with Graph2Diff Neural Networks. Daniel Tarlow, Subhodeep Moitra, Andrew Rice, Zimin Chen, Pierre-Antoine Manzagol, Charles Sutton and Edward Aftandilian. arXiv preprint. 2019.

    [ arXiv | bib ]

  4. Wrangling messy CSV files by detecting row and type patterns. Gertjan J. J. van den Burg, Alfredo Nazábal and Charles Sutton. Data Mining and Knowledge Discovery 33 (6). 2019.

    [ arXiv | bib | abstract ]

2018

  1. A Survey of Machine Learning for Big Code and Naturalness. Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu and Charles Sutton. ACM Computing Surveys 51 (4). 2018.

    [ arXiv | bib ]

  2. Interpreting Deep Classifier by Visual Distillation of Dark Knowledge. Kai Xu, Dae Hoon Park, Yi Chang and Charles Sutton. ArXiv e-prints. 2018.

    [ .pdf | bib ]

  3. Data Diff: Interpretable, Executable Summaries of Changes in Distributions for Data Wrangling. Charles Sutton, Timothy Hobson, James Geddes and Rich Caruana. In Conference on Knowledge Discovery and Data Mining (KDD). 2018.

    [ .pdf | bib ]

  4. SlicStan: Improving Probabilistic Programming using Information Flow Analysis. Maria I. Gorinova, Andrew D. Gordon and Charles Sutton. In Probabilistic Programming Languages, Semantics, and Systems Workshop at the Symposium on Principles of Programming Languages (PPS 2018). 2018.

    [ .pdf | bib ]

  5. SlicStan: A Blockless Stan-like Language. Maria I. Gorinova, Andrew D. Gordon and Charles Sutton. In StanCon. 2018.

    [ .pdf | bib ]

  6. Synthesis of Differentiable Functional Programs for Lifelong Learning. Lazar Valkov, Dipak Chaudhari, Akash Srivastava, Charles Sutton and Swarat Chaudhuri. In Neural Information Processing Systems. 2018.

    [ .pdf | bib ]

  7. Summarizing Software API Usage Examples using Clustering Techniques. Nikolaos Katirtzis, Themistoklis Diamantopoulos and Charles Sutton. In International Conference on Fundamental Approaches to Software Engineering (FASE). 2018.

    [ .pdf | bib | source code ]

  8. Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts. Annie Louis and Charles Sutton. In North American Chapter of the Association for Computational Linguistics (NAACL). 2018.

    [ .pdf | bib | data ]

  9. Deep Learning to Detect Redundant Method Comments. Annie Louis, Santanu K. Dash, Earl T. Barr and Charles Sutton. ArXiv e-prints. 2018.

    [ .pdf | bib | source code ]

  10. Wrattler: Reproducible, live and polyglot notebooks. Tomas Petricek, James Geddes and Charles Sutton. In USENIX Workshop on the Theory and Practice of Provenance (TaPP). 2018.

    [ .pdf | bib ]

  11. GEMSEC: Graph Embedding with Self Clustering. Benedek Rozemberczki, Ryan Davies, Rik Sarkar and Charles Sutton. ArXiv e-prints. 2018.

    [ arXiv | bib | source code | data ]

  12. Ratio Matching MMD Nets: Low dimensional projections for effective deep generative models. Akash Srivastava, Kai Xu, Michael U. Gutmann and Charles Sutton. ArXiv e-prints. 2018.

    [ .pdf | bib ]

  13. Mining Semantic Loop Idioms. Miltiadis Allamanis, Earl T. Barr, Christian Bird, Premkumar Devanbu, Mark Marron and Charles Sutton. IEEE Transactions on Software Engineering 44 (7). 2018.

    [ .pdf | bib ]

  14. Amortized Inference for Latent Feature Models Using Variational Russian Roulette. Kai Xu, Akash Srivastava and Charles Sutton. In NeurIPS Workshop on All of Bayesian Nonparametrics. 2018.

    [ to appear | bib ]

  15. Sequence-to-Point Learning with Neural Networks for Non-intrusive Load Monitoring. Chaoyun Zhang, Mingjun Zhong, Zongzuo Wang, Nigel Goddard and Charles Sutton. In National Conference on Artificial Intelligence (AAAI). 2018.

    [ .pdf | bib ]

2017

  1. Learning Continuous Semantic Representations of Symbolic Expressions. Miltiadis Allamanis, Pankajan Chanthirasegaran, Pushmeet Kohli and Charles Sutton. In International Conference on Machine Learning (ICML). 2017.

    [ .pdf | bib | code and data ]

  2. Autofolding for Source Code Summarization. Jaroslav Fowkes, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata and Charles Sutton. IEEE Transactions on Software Engineering 43 (12). 2017.

    [ .pdf | bib ]

  3. Autoencoding Variational Inference for Topic Models. Akash Srivastava and Charles Sutton. In International Conference on Learning Representations (ICLR). 2017.

    [ .pdf | arXiv | bib | discussion | source code ]

  4. VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning. Akash Srivastava, Lazar Valkov, Chris Russell, Michael Gutmann and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2017.

    [ .pdf | bib | abstract | code and data ]

2016

  1. A Convolutional Attention Network for Extreme Summarization of Source Code. Miltiadis Allamanis, Hao Peng and Charles Sutton. In International Conference on Machine Learning (ICML). 2016.

    [ .pdf | bib ]

  2. Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation. Akash Srivastava, James Zou, Ryan P. Adams and Charles Sutton. In Workshop on Human Interpretability in Machine Learning (co-located with ICML). 2016.

    [ .pdf | bib ]

  3. Learning and Verifying Unwanted Behaviours. Wei Chen, David Aspinall, Andrew Gordon, Charles Sutton and Igor Muttik. In Workshop on Hot Issues in Security Principles and Trust (HotSpot 2016). 2016.

    [ .pdf | bib ]

  4. On Robust Malware Classifiers by Verifying Unwanted Behaviours. Wei Chen, David Aspinall, Andrew Gordon, Charles Sutton and Igor Muttik. In International Conference on Integrated Formal Methods. 2016.

    [ to appear | bib ]

  5. More Semantics More Robust: Improving Android Malware Classifiers. Wei Chen, David Aspinall, Andrew D Gordon, Charles Sutton and Igor Muttik. In ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec). 2016.

    [ to appear | bib ]

  6. Tailored Mutants Fit Bugs Better. Miltiadis Allamanis, Earl T. Barr, René Just and Charles Sutton. CoRR abs/1611.02516. 2016.

    [ .pdf | bib ]

  7. Parameter-Free Probabilistic API Mining across GitHub. Jaroslav Fowkes and Charles Sutton. In Foundations of Software Engineering (FSE). 2016.

    [ .pdf | bib | code and data ]

  8. A Subsequence Interleaving Model for Sequential Pattern Mining. Jaroslav Fowkes and Charles Sutton. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016.

    [ .pdf | bib | code and data ]

  9. TASSAL: Autofolding for Source Code Summarization. Jaroslav Fowkes, Pankajan Chanthirasegaran, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata and Charles Sutton. In International Conference on Software Engineering (ICSE). 2016.

    (Demo track)

    [ .pdf | bib | source code ]

  10. A Bayesian Network Model for Interesting Itemsets. Jaroslav Fowkes and Charles Sutton. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD). 2016.

    [ .pdf | bib | source code ]

  11. Blending LSTMs into CNNs. Krzysztof J. Geras, Abdel-rahman Mohamed, Rich Caruana, Gregor Urban, Shengjie Wang, Ozlem Aslan, Matthai Philipose, Matthew Richardson and Charles Sutton. In International Conference on Learning Representations (ICLR Workshop). 2016.

    [ .pdf | bib ]

  12. Composite denoising autoencoders. Krzysztof Geras and Charles Sutton. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD). 2016.

    [ .pdf | bib ]

  13. Mining Semantic Loop Idioms from Big Code. Miltiadis Allamanis, Earl T. Barr, Christian Bird, Premkumar Devanbu, Mark Marron and Charles Sutton. Microsoft Research Technical Report, MSR-TR-2016-1116, 2016.

    [ .pdf | bib | abstract ]

  14. A Bayesian Approach to Parameter Inference in Queueing Networks. Weikun Wang, Giuliano Casale and Charles Sutton. ACM Transactions on Modeling and Computer Simulation 27 (1). 2016.

    [ .pdf | bib ]

2015

  1. Compact Explanations of Why Malware is Bad. Wei Chen, Charles Sutton, Andrew Gordon, David Aspinall, Igor Muttik and Qi Shen. In International Workshop on the Use of AI in Formal Methods (AI4FM). 2015.

    [ to appear | bib ]

  2. Verifying Anti-Security Policies Learnt from Android Malware Families. Wei Chen, Charles Sutton, David Aspinall, Andrew Gordon, Qi Shen and Igor Muttik. In International Seminar on Program Verification, Automated Debugging and Symbolic Computation. 2015.

    [ to appear | bib ]

  3. Suggesting Accurate Method and Class Names. Miltiadis Allamanis, Earl T. Barr, Christian Bird and Charles Sutton. In Foundations of Software Engineering (FSE). 2015. (Neural network model that can suggest a name for a method or class, given the method’s body and signature.)

    [ .pdf | bib | abstract | source code ]

  4. Scheduled Denoising Autoencoders. Krzysztof Geras and Charles Sutton. In International Conference on Learning Representations (ICLR). 2015.

    [ .pdf | bib ]

  5. Latent Bayesian melding for integrating individual and population models. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2015.

    [ .pdf | bib ]

2014

  1. Word Storms: Multiples of Word Clouds for Visual Comparison of Documents. Quim Castella and Charles Sutton. In International World Wide Web Conference (WWW). 2014.

    [ .pdf | bib ]

  2. Mining idioms from source code. Miltos Allamanis and Charles Sutton. In Symposium on the Foundations of Software Engineering (FSE). 2014.

    [ .pdf | bib ]

  3. Learning Natural Coding Conventions. Miltiadis Allamanis, Earl T Barr, Christian Bird and Charles Sutton. In Symposium on the Foundations of Software Engineering (FSE). 2014.

    (Winner, ACM SIGSOFT Distinguished Paper Award.)

    [ .pdf | bib | source code ]

  4. Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

  5. Signal Aggregate Constraints in Additive Factorial HMMs, with Application to Energy Disaggregation. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

2013

  1. Mining Source Code Repositories at Massive Scale using Language Modeling. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

  2. Why, When, and What: Analyzing Stack Overflow Questions by Topic, Type, and Code. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

  3. Multiple-source Cross Validation. Krzysztof Geras and Charles Sutton. In International Conference on Machine Learning (ICML). 2013.

    [ .pdf | bib ]

  4. Supporting User-Defined Functions on Uncertain Data. Thanh T. L. Tran, Yanlei Diao, Charles Sutton and Anna Liu. Proceedings of the VLDB Endowment (PVLDB). 2013.

    [ .pdf | bib ]

2012

  1. An Introduction to Conditional Random Fields. Charles Sutton and Andrew McCallum. Foundations and Trends in Machine Learning 4 (4). 2012.

    [ .pdf | bib | abstract ]

  2. Continuous Relaxations for Discrete Hamiltonian Monte Carlo. Yichuan Zhang, Charles Sutton, Amos Storkey and Zoubin Ghahramani. In Advances in Neural Information Processing Systems (NIPS). 2012.

    [ .pdf | bib ]

2011

  1. Distributed Inference and Query Processing for RFID Tracking and Monitoring. Zhao Cao, Charles Sutton, Yanlei Diao and Prashant Shenoy. Proceedings of the VLDB Endowment (PVLDB) 4 (5). 2011.

    [ .pdf | bib ]

  2. Bayesian Inference in Queueing Networks. Charles Sutton and Michael I. Jordan. Annals of Applied Statistics 5 (1). 2011.

    [ .pdf | bib | source code ]

  3. Quasi-Newton Markov chain Monte Carlo. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2011.

    [ .pdf | bib ]

2010

  1. Learning and Inference in Queueing Networks. Charles Sutton and Michael I. Jordan. In Conference on Artificial Intelligence and Statistics (AISTATS). 2010. (Conference version of the longer paper "Bayesian Inference in Queueing Networks".)

    [ .pdf | bib ]

2009

  1. Automatic Exploration of Datacenter Performance Regimes. Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan and David A. Patterson. In First Workshop on Automated Control for Datacenters and Clouds (ACDC ’09). 2009.

    [ .pdf | bib ]

  2. Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters. Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan and David A. Patterson. In Workshop on Hot Topics in Cloud Computing (HotCloud ’09). 2009.

    [ .pdf | bib ]

  3. Capturing Data Uncertainty in High-Volume Stream Processing. Yanlei Diao, Boduo Li, Anna Liu, Liping Peng, Charles Sutton, Thanh Tran and Michael Zink. In Conference on Innovative Data Systems Research (CIDR). 2009.

    [ .pdf | bib ]

  4. Misleading learners: Co-opting your spam filter. Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar and Kai Xia. In Tsai, Jeffrey J. P. and Yu, Philip S., editors. Machine Learning in Cyber Trust: Security, Privacy, Reliability. Springer. 2009.

    [ .pdf | bib ]

  5. Piecewise Training for Structured Prediction. Charles Sutton and Andrew McCallum. Machine Learning 77 (2–3). 2009. (Train an undirected graphical model by splitting it into overlapping parts that are trained independently. Connections to pseudolikelihood and Bethe free energy. Journal version of UAI and ICML papers below.)

    [ .pdf | bib | abstract ]

  6. Probabilistic Inference over RFID Streams in Mobile Environments. Thanh Tran, Charles Sutton, Richard Cocci, Yanming Nie, Yanlei Diao and Prashant Shenoy. In International Conference on Data Engineering (ICDE). 2009.

    [ .pdf | bib ]

2008

  1. Unsupervised Deduplication using Cross-field Dependencies. Robert Hall, Charles Sutton and Andrew McCallum. In Conference on Knowledge Discovery and Data Mining (KDD). 2008. (Hierarchical DP model that jointly clusters citation venue strings based on both string-edit distance and title information.)

    [ .pdf | bib | abstract ]

  2. Exploiting Machine Learning to Subvert your Spam Filter. Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar and Kai Xia. In Proceedings of the First USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET). 2008. (Send crafted email to a spam filter to cause it to misclassify your normal email as spam. Initial experiments on defenses to this attack.)

    [ .pdf | bib ]

  3. Probabilistic inference in queueing networks. Charles Sutton and Michael I. Jordan. In Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SYSML). 2008.

    [ .pdf | bib ]

  4. Efficient Training Methods for Conditional Random Fields. Charles Sutton. Ph.D. Dissertation, University of Massachusetts, 2008.

    [ .pdf | bib ]

  5. Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors. Hanna Wallach, Charles Sutton and Andrew McCallum. In ICML Workshop on Prior Knowledge for Text and Language Processing. 2008. (Two Bayesian dependency parsing models: 1. Model with Pitman-Yor prior that significantly improves Eisner’s classic model; 2. Latent-variable model that learns "syntactic" topics.)

    [ .pdf | bib ]

2007

  1. Response-Time Modeling for Resource Allocation and Energy-Informed SLAs. Peter Bodik, Charles Sutton, Armando Fox, David Patterson and Michael I. Jordan. In NIPS Workshop on Statistical Learning Techniques for Solving Systems Problems (MLSys 07). 2007. (Quantile regression (both parametric and nonparametric) for predicting the performance of a web service as a function of workload and power consumption. Much better for voltage control than built-in frequency scaling.)

    [ .pdf | bib ]

  2. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. Charles Sutton, Andrew McCallum and Khashayar Rohanimanesh. Journal of Machine Learning Research 8. 2007. (Combination of dynamic Bayesian networks and conditional random fields. Also considers latent-variable model and cascaded training. Journal version of ICML and EMNLP papers below.)

    [ .pdf | bib | abstract ]

  3. An Introduction to Conditional Random Fields for Relational Learning. Charles Sutton and Andrew McCallum. In Getoor, Lise and Taskar, Ben, editors. Introduction to Statistical Relational Learning. MIT Press. 2007. (Detailed tutorial on conditional random fields. Includes motivation, background, mathematical foundations, linear-chain form, general-structure form, inference, parameter estimation, and tips and tricks. NOTE: In Equation (1.22), there is a small error. There should not be a summation over k in the final term, just lambda_k / sigma^2.)

    [ .pdf | bib ]

  4. Piecewise Pseudolikelihood for Efficient CRF Training. Charles Sutton and Andrew McCallum. In International Conference on Machine Learning (ICML). 2007. (Train a large CRF five times faster by dividing it into separate pieces and reducing the number of predicted variable combinations with pseudolikelihood. Analysis in terms of belief propagation and Bethe energy.)

    [ .pdf | bib | abstract ]

  5. Improved Dynamic Schedules for Belief Propagation. Charles Sutton and Andrew McCallum. In Conference on Uncertainty in Artificial Intelligence (UAI). 2007. (Significantly faster version of loopy BP by selecting which messages to send based on an approximation to their residual.)

    [ .pdf | bib | abstract ]

2006

  1. Sparse Forward-Backward using Minimum Divergence Beams for Fast Training of Conditional Random Fields. Chris Pal, Charles Sutton and Andrew McCallum. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2006. (New criterion for adaptive beam size within forward-backward, suggested by a variational perspective. Works well within CRF training.)

    [ .pdf | bib | abstract ]

  2. Local Training and Belief Propagation. Charles Sutton and Tom Minka. Microsoft Research Technical Report, TR-2006-121, 2006.

    [ .pdf | bib ]

  3. Reducing Weight Undertraining in Structured Discriminative Learning. Charles Sutton, Michael Sindelar and Andrew McCallum. In Conference on Human Language Technology and North American Association for Computational Linguistics (HLT-NAACL). 2006. (Trains multiple linear-chain CRFs with different subsets of features, forcing dependent sets of features to separately model the class label.)

    (This is the published version. An early version had an error in Section 4, under Per-Sequence Mixtures.)

    [ .pdf | bib | abstract ]

2005

  1. Fast, Piecewise Training for Discriminative Finite-state and Parsing Models. Charles Sutton and Andrew McCallum. Center for Intelligent Information Retrieval Technical Report, IR-403, 2005.

    [ .pdf | bib ]

  2. Joint Parsing and Semantic Role Labeling. Charles Sutton and Andrew McCallum. In Conference on Natural Language Learning (CoNLL). 2005.

    [ .pdf | bib ]

  3. Piecewise Training of Undirected Models. Charles Sutton and Andrew McCallum. In Conference on Uncertainty in Artificial Intelligence (UAI). 2005. (Train a large CRF by dividing it into pieces and training them independently. The explanation in this paper for why it works is somewhat unsatisfying; consult the journal version (2009) for a better story.)

    [ .pdf | bib | abstract ]

  4. Composition of Conditional Random Fields for Transfer Learning. Charles Sutton and Andrew McCallum. In Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT-EMNLP). 2005.

    [ .pdf | bib ]

  5. Learning in Markov Random Fields with Contrastive Free Energies. Max Welling and Charles Sutton. In Conference on Artificial Intelligence and Statistics (AISTATS). 2005.

    [ .pdf | bib | abstract ]

2004

  1. Piecewise Training with Parameter Independence Diagrams: Comparing Globally- and Locally-trained Linear-chain CRFs. Andrew McCallum and Charles Sutton. In NIPS Workshop on Learning with Structured Outputs. 2004.

    [ .pdf | bib | abstract ]

  2. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. Charles Sutton, Khashayar Rohanimanesh and Andrew McCallum. In International Conference on Machine Learning (ICML). 2004. (Combination of dynamic Bayesian networks and conditional random fields, with experiments in noun-phrase chunking.)

    [ .pdf | bib | abstract ]

  3. Conditional probabilistic context-free grammars. Charles Sutton. Synthesis project (Required for Ph.D. candidacy), University of Massachusetts, 2004.

    [ .pdf | bib ]

  4. Collective Segmentation and Labeling of Distant Entities in Information Extraction. Charles Sutton and Andrew McCallum. In ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields. 2004.

    [ .pdf | bib ]

2003

  1. Information Theory and Representation in Associative Word Learning. Brendan Burns, Charles Sutton, Clayton Morrison and Paul R. Cohen. In Third International Workshop on Epigenetic Robotics. 2003.

    [ .pdf | bib ]

  2. Very Predictive N-grams for Space-Limited Probabilistic Models. Paul R. Cohen and Charles Sutton. In International Symposium on Intelligent Data Analysis. 2003.

    [ .pdf | bib | abstract ]

  3. Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences. Andrew McCallum, Khashayar Rohanimanesh and Charles Sutton. In NIPS Workshop on Syntax, Semantics, and Statistics. 2003.

    [ .pdf | bib | abstract ]

  4. Guided Incremental Construction of Belief Networks. Charles Sutton, Brendan Burns, Clayton Morrison and Paul R. Cohen. In International Symposium on Intelligent Data Analysis. 2003.

    [ .pdf | bib | abstract ]

2002

  1. Learning Effects of Robot Actions Using Temporal Associations. Paul R. Cohen, Charles Sutton and Brendan Burns. In International Conference on Development and Learning (ICDL). 2002.

    [ .pdf | bib | abstract ]

  2. Computers and Octi: Report from the 2001 Tournament. Charles Sutton. ICGA Journal 25 (2). 2002.

    [ .pdf | bib | abstract ]