Publications by Year

Alternatively, see my publications by topic.

2017

  1. Learning Continuous Semantic Representations of Symbolic Expressions. Miltiadis Allamanis, Pankajan Chanthirasegaran, Pushmeet Kohli and Charles Sutton. In International Conference on Machine Learning (ICML). 2017.

    [ .pdf | bib | code and data ]

  2. Autofolding for Source Code Summarization. Jaroslav Fowkes, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata and Charles Sutton. Transactions on Software Engineering. In press. 2017.

    [ .pdf | bib ]

  3. Autoencoding Variational Inference for Topic Models. Akash Srivastava and Charles Sutton. In International Conference on Learning Representations (ICLR). 2017.

    [ .pdf | bib | arXiv | discussion | source code ]

  4. VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning. Akash Srivastava, Lazar Valkov, Chris Russell, Michael Gutmann and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2017.

    [ .pdf | bib | abstract | code and data ]

2016

  1. A Convolutional Attention Network for Extreme Summarization of Source Code. Miltiadis Allamanis, Hao Peng and Charles Sutton. In International Conference on Machine Learning (ICML). 2016.

    [ .pdf | bib ]

  2. Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation. Akash Srivastava, James Zou, Ryan P. Adams and Charles Sutton. In Workshop on Human Interpretability in Machine Learning (co-located with ICML). 2016.

    [ .pdf | bib ]

  3. Learning and Verifying Unwanted Behaviours. Wei Chen, David Aspinall, Andrew Gordon, Charles Sutton and Igor Muttik. In Workshop on Hot Issues in Security Principles and Trust (HotSpot 2016). 2016.

    [ .pdf | bib ]

  4. On Robust Malware Classifiers by Verifying Unwanted Behaviours. Wei Chen, David Aspinall, Andrew Gordon, Charles Sutton and Igor Muttik. In International Conference on Integrated Formal Methods. 2016.

    [ to appear | bib ]

  5. More Semantics More Robust: Improving Android Malware Classifiers. Wei Chen, David Aspinall, Andrew D. Gordon, Charles Sutton and Igor Muttik. In ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec). 2016.

    [ to appear | bib ]

  6. Tailored Mutants Fit Bugs Better. Miltiadis Allamanis, Earl T. Barr, René Just and Charles Sutton. CoRR abs/1611.02516. 2016.

    [ .pdf | bib ]

  7. Parameter-Free Probabilistic API Mining across GitHub. Jaroslav Fowkes and Charles Sutton. In Foundations of Software Engineering (FSE). 2016.

    [ .pdf | bib | code and data ]

  8. A Subsequence Interleaving Model for Sequential Pattern Mining. Jaroslav Fowkes and Charles Sutton. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016.

    [ .pdf | bib | code and data ]

  9. TASSAL: Autofolding for Source Code Summarization. Jaroslav Fowkes, Pankajan Chanthirasegaran, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata and Charles Sutton. In International Conference on Software Engineering (ICSE). 2016.

    (Demo track)

    [ .pdf | bib | source code ]

  10. A Bayesian Network Model for Interesting Itemsets. Jaroslav Fowkes and Charles Sutton. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD). 2016.

    [ .pdf | bib | source code ]

  11. Blending LSTMs into CNNs. Krzysztof J. Geras, Abdel-rahman Mohamed, Rich Caruana, Gregor Urban, Shengjie Wang, Ozlem Aslan, Matthai Philipose, Matthew Richardson and Charles Sutton. In International Conference on Learning Representations (ICLR Workshop). 2016.

    [ .pdf | bib ]

  12. Composite denoising autoencoders. Krzysztof Geras and Charles Sutton. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD). 2016.

    [ .pdf | bib ]

  13. Mining Semantic Loop Idioms from Big Code. Miltiadis Allamanis, Earl T. Barr, Christian Bird, Premkumar Devanbu, Mark Marron and Charles Sutton. Microsoft Research Technical Report, MSR-TR-2016-1116, 2016.

    [ .pdf | bib | abstract ]

  14. A Bayesian Approach to Parameter Inference in Queueing Networks. Weikun Wang, Giuliano Casale and Charles Sutton. ACM Transactions on Modeling and Computer Simulation 27 (1). 2016.

    [ .pdf | bib ]

2015

  1. Compact Explanations of Why Malware is Bad. Wei Chen, Charles Sutton, Andrew Gordon, David Aspinall, Igor Muttik and Qi Shen. In International Workshop on the Use of AI in Formal Methods (AI4FM). 2015.

    [ to appear | bib ]

  2. Verifying Anti-Security Policies Learnt from Android Malware Families. Wei Chen, Charles Sutton, David Aspinall, Andrew Gordon, Qi Shen and Igor Muttik. In International Seminar on Program Verification, Automated Debugging and Symbolic Computation. 2015.

    [ to appear | bib ]

  3. Suggesting Accurate Method and Class Names. Miltiadis Allamanis, Earl T. Barr, Christian Bird and Charles Sutton. In Foundations of Software Engineering (FSE). 2015. (Neural network model that can suggest a name for a method or class, given the method’s body and signature.)

    [ .pdf | bib | abstract | source code ]

  4. Scheduled Denoising Autoencoders. Krzysztof Geras and Charles Sutton. In International Conference on Learning Representations (ICLR). 2015.

    [ .pdf | bib ]

  5. Latent Bayesian melding for integrating individual and population models. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2015.

    [ .pdf | bib ]

2014

  1. Word Storms: Multiples of Word Clouds for Visual Comparison of Documents. Quim Castella and Charles Sutton. In International World Wide Web Conference (WWW). 2014.

    [ .pdf | bib ]

  2. Mining idioms from source code. Miltos Allamanis and Charles Sutton. In Symposium on the Foundations of Software Engineering (FSE). 2014.

    [ .pdf | bib ]

  3. Learning Natural Coding Conventions. Miltiadis Allamanis, Earl T. Barr, Christian Bird and Charles Sutton. In Symposium on the Foundations of Software Engineering (FSE). 2014.

    (Winner, ACM SIGSOFT Distinguished Paper Award.)

    [ .pdf | bib | source code ]

  4. Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

  5. Signal Aggregate Constraints in Additive Factorial HMMs, with Application to Energy Disaggregation. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

2013

  1. Mining Source Code Repositories at Massive Scale using Language Modeling. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

  2. Why, When, and What: Analyzing Stack Overflow Questions by Topic, Type, and Code. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

  3. Multiple-source Cross Validation. Krzysztof Geras and Charles Sutton. In International Conference on Machine Learning (ICML). 2013.

    [ .pdf | bib ]

  4. Supporting User-Defined Functions on Uncertain Data. Thanh T. L. Tran, Yanlei Diao, Charles Sutton and Anna Liu. Proceedings of the VLDB Endowment (PVLDB). 2013.

    [ .pdf | bib ]

2012

  1. An Introduction to Conditional Random Fields. Charles Sutton and Andrew McCallum. Foundations and Trends in Machine Learning 4 (4). 2012.

    [ .pdf | bib | abstract ]

  2. Continuous Relaxations for Discrete Hamiltonian Monte Carlo. Yichuan Zhang, Charles Sutton, Amos Storkey and Zoubin Ghahramani. In Advances in Neural Information Processing Systems (NIPS). 2012.

    [ .pdf | bib ]

2011

  1. Distributed Inference and Query Processing for RFID Tracking and Monitoring. Zhao Cao, Charles Sutton, Yanlei Diao and Prashant Shenoy. Proceedings of the VLDB Endowment (PVLDB) 4 (5). 2011.

    [ .pdf | bib ]

  2. Bayesian Inference in Queueing Networks. Charles Sutton and Michael I. Jordan. Annals of Applied Statistics 5 (1). 2011.

    [ .pdf | bib | source code ]

  3. Quasi-Newton Markov chain Monte Carlo. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2011.

    [ .pdf | bib ]

2010

  1. Learning and Inference in Queueing Networks. Charles Sutton and Michael I. Jordan. In Conference on Artificial Intelligence and Statistics (AISTATS). 2010. (Conference version of the longer paper "Bayesian Inference in Queueing Networks".)

    [ .pdf | bib ]

2009

  1. Automatic Exploration of Datacenter Performance Regimes. Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan and David A. Patterson. In First Workshop on Automated Control for Datacenters and Clouds (ACDC ’09). 2009.

    [ .pdf | bib ]

  2. Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters. Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan and David A. Patterson. In Workshop on Hot Topics in Cloud Computing (HotCloud ’09). 2009.

    [ .pdf | bib ]

  3. Capturing Data Uncertainty in High-Volume Stream Processing. Yanlei Diao, Boduo Li, Anna Liu, Liping Peng, Charles Sutton, Thanh Tran and Michael Zink. In Conference on Innovative Data Systems Research (CIDR). 2009.

    [ .pdf | bib ]

  4. Misleading learners: Co-opting your spam filter. Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar and Kai Xia. In Tsai, Jeffrey J. P. and Yu, Philip S., editors. Machine Learning in Cyber Trust: Security, Privacy, Reliability. Springer. 2009.

    [ .pdf | bib ]

  5. Piecewise Training for Structured Prediction. Charles Sutton and Andrew McCallum. Machine Learning 77 (2–3). 2009. (Train undirected graphical model by splitting into overlapping parts that are trained independently. Connections to pseudolikelihood and Bethe free energy. Journal version of UAI and ICML papers below.)

    [ .pdf | bib | abstract ]

  6. Probabilistic Inference over RFID Streams in Mobile Environments. Thanh Tran, Charles Sutton, Richard Cocci, Yanming Nie, Yanlei Diao and Prashant Shenoy. In International Conference on Data Engineering (ICDE). 2009.

    [ .pdf | bib ]

2008

  1. Unsupervised Deduplication using Cross-field Dependencies. Robert Hall, Charles Sutton and Andrew McCallum. In Conference on Knowledge Discovery and Data Mining (KDD). 2008. (Hierarchical DP model that jointly clusters citation venue strings based on both string-edit distance and title information.)

    [ .pdf | bib | abstract ]

  2. Exploiting Machine Learning to Subvert your Spam Filter. Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar and Kai Xia. In Proceedings of the First USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET). 2008. (Send crafted email to a spam filter to cause it to misclassify your normal email as spam. Initial experiments on defenses to this attack.)

    [ .pdf | bib ]

  3. Probabilistic inference in queueing networks. Charles Sutton and Michael I. Jordan. In Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SYSML). 2008.

    [ .pdf | bib ]

  4. Efficient Training Methods for Conditional Random Fields. Charles Sutton. Ph.D. Dissertation, University of Massachusetts, 2008.

    [ .pdf | bib ]

  5. Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors. Hanna Wallach, Charles Sutton and Andrew McCallum. In ICML Workshop on Prior Knowledge for Text and Language Processing. 2008. (Two Bayesian dependency parsing models: 1. Model with Pitman-Yor prior that significantly improves Eisner’s classic model; 2. Latent-variable model that learns "syntactic" topics.)

    [ .pdf | bib ]

2007

  1. Response-Time Modeling for Resource Allocation and Energy-Informed SLAs. Peter Bodik, Charles Sutton, Armando Fox, David Patterson and Michael I. Jordan. In NIPS Workshop on Statistical Learning Techniques for Solving Systems Problems (MLSys 07). 2007. (Quantile regression (both parametric and non-) for predicting the performance of a web service as a function of workload and power consumption. Much better for voltage control than built-in frequency scaling.)

    [ .pdf | bib ]

  2. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. Charles Sutton, Andrew McCallum and Khashayar Rohanimanesh. Journal of Machine Learning Research 8. 2007. (Combination of dynamic Bayesian networks and conditional random fields. Also considers latent-variable model and cascaded training. Journal version of ICML and EMNLP papers below.)

    [ .pdf | bib | abstract ]

  3. An Introduction to Conditional Random Fields for Relational Learning. Charles Sutton and Andrew McCallum. In Getoor, Lise and Taskar, Ben, editors. Introduction to Statistical Relational Learning. MIT Press. 2007. (Detailed tutorial on conditional random fields. Includes motivation, background, mathematical foundations, linear-chain form, general-structure form, inference, parameter estimation, and tips and tricks. NOTE: In Equation (1.22), there is a small error. There should not be a summation over k in the final term, just lambda_k / sigma^2.)

    [ .pdf | bib ]

  4. Piecewise Pseudolikelihood for Efficient CRF Training. Charles Sutton and Andrew McCallum. In International Conference on Machine Learning (ICML). 2007. (Train a large CRF five times faster by dividing it into separate pieces and reducing the number of predicted variable combinations with pseudolikelihood. Analysis in terms of belief propagation and Bethe energy.)

    [ .pdf | bib | abstract ]

  5. Improved Dynamic Schedules for Belief Propagation. Charles Sutton and Andrew McCallum. In Conference on Uncertainty in Artificial Intelligence (UAI). 2007. (Significantly faster version of loopy BP by selecting which messages to send based on an approximation to their residual.)

    [ .pdf | bib | abstract ]

2006

  1. Sparse Forward-Backward using Minimum Divergence Beams for Fast Training of Conditional Random Fields. Chris Pal, Charles Sutton and Andrew McCallum. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2006. (New criterion for adaptive beam size within forward-backward, suggested by a variational perspective. Works well within CRF training.)

    [ .pdf | bib | abstract ]

  2. Local Training and Belief Propagation. Charles Sutton and Tom Minka. Microsoft Research Technical Report, TR-2006-121, 2006.

    [ .pdf | bib ]

  3. Reducing Weight Undertraining in Structured Discriminative Learning. Charles Sutton, Michael Sindelar and Andrew McCallum. In Conference on Human Language Technology and North American Association for Computational Linguistics (HLT-NAACL). 2006. (Trains multiple linear-chain CRFs with different subsets of features, in order to force dependent sets of features to be able to separately model the class label.)

    (This is the published version. An early version had an error in Section 4, under Per-Sequence Mixtures.)

    [ .pdf | bib | abstract ]

2005

  1. Fast, Piecewise Training for Discriminative Finite-state and Parsing Models. Charles Sutton and Andrew McCallum. Center for Intelligent Information Retrieval Technical Report, IR-403, 2005.

    [ .pdf | bib ]

  2. Joint Parsing and Semantic Role Labeling. Charles Sutton and Andrew McCallum. In Conference on Natural Language Learning (CoNLL). 2005.

    [ .pdf | bib ]

  3. Piecewise Training of Undirected Models. Charles Sutton and Andrew McCallum. In Conference on Uncertainty in Artificial Intelligence (UAI). 2005. (Train large CRF by dividing into pieces and training independently. The explanation in this paper for why it works is somewhat unsatisfying. Consult journal version (2008) for a better story.)

    [ .pdf | bib | abstract ]

  4. Composition of Conditional Random Fields for Transfer Learning. Charles Sutton and Andrew McCallum. In Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT-EMNLP). 2005.

    [ .pdf | bib ]

  5. Learning in Markov Random Fields with Contrastive Free Energies. Max Welling and Charles Sutton. In Conference on Artificial Intelligence and Statistics (AISTATS). 2005.

    [ .pdf | bib | abstract ]

2004

  1. Piecewise Training with Parameter Independence Diagrams: Comparing Globally- and Locally-trained Linear-chain CRFs. Andrew McCallum and Charles Sutton. In NIPS Workshop on Learning with Structured Outputs. 2004.

    [ .pdf | bib | abstract ]

  2. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. Charles Sutton, Khashayar Rohanimanesh and Andrew McCallum. In International Conference on Machine Learning (ICML). 2004. (Combination of dynamic Bayesian networks and conditional random fields, with experiments in noun-phrase chunking.)

    [ .pdf | bib | abstract ]

  3. Conditional probabilistic context-free grammars. Charles Sutton. Synthesis project (Required for Ph.D. candidacy), University of Massachusetts, 2004.

    [ .pdf | bib ]

  4. Collective Segmentation and Labeling of Distant Entities in Information Extraction. Charles Sutton and Andrew McCallum. In ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields. 2004.

    [ .pdf | bib ]

2003

  1. Information Theory and Representation in Associative Word Learning. Brendan Burns, Charles Sutton, Clayton Morrison and Paul R. Cohen. In Third International Workshop on Epigenetic Robotics. 2003.

    [ .pdf | bib ]

  2. Very Predictive N-grams for Space-Limited Probabilistic Models. Paul R. Cohen and Charles Sutton. In International Symposium on Intelligent Data Analysis. 2003.

    [ .pdf | bib | abstract ]

  3. Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences. Andrew McCallum, Khashayar Rohanimanesh and Charles Sutton. In NIPS Workshop on Syntax, Semantics, and Statistics. 2003.

    [ .pdf | bib | abstract ]

  4. Guided Incremental Construction of Belief Networks. Charles Sutton, Brendan Burns, Clayton Morrison and Paul R. Cohen. In International Symposium on Intelligent Data Analysis. 2003.

    [ .pdf | bib | abstract ]

2002

  1. Learning Effects of Robot Actions Using Temporal Associations. Paul R. Cohen, Charles Sutton and Brendan Burns. In International Conference on Development and Learning (ICDL). 2002.

    [ .pdf | bib | abstract ]

  2. Computers and Octi: Report from the 2001 Tournament. Charles Sutton. ICGA Journal 25 (2). 2002.

    [ .pdf | bib | abstract ]