Publications by Topic

Alternatively, see my publications by year.

Deep Learning

  1. Sequence-to-Point Learning with Neural Networks for Non-intrusive Load Monitoring. Chaoyun Zhang, Mingjun Zhong, Zongzuo Wang, Nigel Goddard and Charles Sutton. In AAAI Conference on Artificial Intelligence (AAAI). 2018.

    [ .pdf | bib ]

  2. Autoencoding Variational Inference for Topic Models. Akash Srivastava and Charles Sutton. In International Conference on Learning Representations (ICLR). 2017.

    [ .pdf | bib | arXiv | discussion | source code ]

  3. Blending LSTMs into CNNs. Krzysztof J. Geras, Abdel-rahman Mohamed, Rich Caruana, Gregor Urban, Shengjie Wang, Ozlem Aslan, Matthai Philipose, Matthew Richardson and Charles Sutton. In International Conference on Learning Representations (ICLR Workshop). 2016.

    [ .pdf | bib ]

  4. Composite denoising autoencoders. Krzysztof Geras and Charles Sutton. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD). 2016.

    [ .pdf | bib ]

  5. A Convolutional Attention Network for Extreme Summarization of Source Code. Miltiadis Allamanis, Hao Peng and Charles Sutton. In International Conference on Machine Learning (ICML). 2016.

    [ .pdf | bib ]

  6. Suggesting Accurate Method and Class Names. Miltiadis Allamanis, Earl T. Barr, Christian Bird and Charles Sutton. In Foundations of Software Engineering (FSE). 2015. (Neural network model that can suggest a name for a method or class, given the method’s body and signature.)

    [ .pdf | bib | abstract | source code ]

  7. Scheduled Denoising Autoencoders. Krzysztof Geras and Charles Sutton. In International Conference on Learning Representations (ICLR). 2015.

    [ .pdf | bib ]

Sustainable Energy

  1. Sequence-to-Point Learning with Neural Networks for Non-intrusive Load Monitoring. Chaoyun Zhang, Mingjun Zhong, Zongzuo Wang, Nigel Goddard and Charles Sutton. In AAAI Conference on Artificial Intelligence (AAAI). 2018.

    [ .pdf | bib ]

  2. Latent Bayesian melding for integrating individual and population models. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2015.

    [ .pdf | bib ]

  3. Signal Aggregate Constraints in Additive Factorial HMMs, with Application to Energy Disaggregation. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]


Software Engineering

  1. Autofolding for Source Code Summarization. Jaroslav Fowkes, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata and Charles Sutton. IEEE Transactions on Software Engineering. In press, 2017.

    [ .pdf | bib ]

  2. Parameter-Free Probabilistic API Mining across GitHub. Jaroslav Fowkes and Charles Sutton. In Foundations of Software Engineering (FSE). 2016.

    [ .pdf | bib | code and data ]

  3. A Convolutional Attention Network for Extreme Summarization of Source Code. Miltiadis Allamanis, Hao Peng and Charles Sutton. In International Conference on Machine Learning (ICML). 2016.

    [ .pdf | bib ]

  4. Suggesting Accurate Method and Class Names. Miltiadis Allamanis, Earl T. Barr, Christian Bird and Charles Sutton. In Foundations of Software Engineering (FSE). 2015. (Neural network model that can suggest a name for a method or class, given the method’s body and signature.)

    [ .pdf | bib | abstract | source code ]

  5. Mining idioms from source code. Miltos Allamanis and Charles Sutton. In Foundations of Software Engineering (FSE). 2014.

    [ .pdf | bib ]

  6. Learning Natural Coding Conventions. Miltiadis Allamanis, Earl T. Barr, Christian Bird and Charles Sutton. In Foundations of Software Engineering (FSE). 2014.

    (Winner, ACM SIGSOFT Distinguished Paper Award.)

    [ .pdf | bib | source code ]

  7. Why, When, and What: Analyzing Stack Overflow Questions by Topic, Type, and Code. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

  8. Mining Source Code Repositories at Massive Scale using Language Modeling. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

Topic Models

  1. Autoencoding Variational Inference for Topic Models. Akash Srivastava and Charles Sutton. In International Conference on Learning Representations (ICLR). 2017.

    [ .pdf | bib | arXiv | discussion | source code ]

  2. Why, When, and What: Analyzing Stack Overflow Questions by Topic, Type, and Code. Miltos Allamanis and Charles Sutton. In Working Conference on Mining Software Repositories (MSR). 2013.

    [ .pdf | bib ]

  3. Unsupervised Deduplication using Cross-field Dependencies. Robert Hall, Charles Sutton and Andrew McCallum. In Conference on Knowledge Discovery and Data Mining (KDD). 2008. (Hierarchical DP model that jointly clusters citation venue strings based on both string-edit distance and title information.)

    [ .pdf | bib | abstract ]

  4. Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors. Hanna Wallach, Charles Sutton and Andrew McCallum. In ICML Workshop on Prior Knowledge for Text and Language Processing. 2008. (Two Bayesian dependency parsing models: 1. Model with Pitman-Yor prior that significantly improves Eisner’s classic model; 2. Latent-variable model that learns "syntactic" topics.)

    [ .pdf | bib ]

Queueing Networks

  1. A Bayesian Approach to Parameter Inference in Queueing Networks. Weikun Wang, Giuliano Casale and Charles Sutton. ACM Transactions on Modeling and Computer Simulation 27 (1). 2016.

    [ .pdf | bib ]

  2. Bayesian Inference in Queueing Networks. Charles Sutton and Michael I. Jordan. Annals of Applied Statistics 5 (1). 2011.

    [ .pdf | bib | source code ]

  3. Learning and Inference in Queueing Networks. Charles Sutton and Michael I. Jordan. In Conference on Artificial Intelligence and Statistics (AISTATS). 2010. (Conference version of the longer paper "Bayesian Inference in Queueing Networks".)

    [ .pdf | bib ]

  4. Probabilistic inference in queueing networks. Charles Sutton and Michael I. Jordan. In Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SYSML). 2008.

    [ .pdf | bib ]

Computer Security

  1. Learning and Verifying Unwanted Behaviours. Wei Chen, David Aspinall, Andrew Gordon, Charles Sutton and Igor Muttik. In Workshop on Hot Issues in Security Principles and Trust (HotSpot). 2016.

    [ .pdf | bib ]

  2. More Semantics More Robust: Improving Android Malware Classifiers. Wei Chen, David Aspinall, Andrew D Gordon, Charles Sutton and Igor Muttik. In ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec). 2016.

    [ to appear | bib ]

  3. Misleading learners: Co-opting your spam filter. Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar and Kai Xia. In Tsai, Jeffrey J. P. and Yu, Philip S., editors. Machine Learning in Cyber Trust: Security, Privacy, Reliability. Springer. 2009.

    [ .pdf | bib ]

  4. Exploiting Machine Learning to Subvert your Spam Filter. Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar and Kai Xia. In Proceedings of the First USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET). 2008. (Send crafted email to a spam filter to cause it to misclassify your normal email as spam. Initial experiments on defenses against this attack.)

    [ .pdf | bib ]

Data Mining

  1. A Subsequence Interleaving Model for Sequential Pattern Mining. Jaroslav Fowkes and Charles Sutton. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 2016.

    [ .pdf | bib | code and data ]

  2. A Bayesian Network Model for Interesting Itemsets. Jaroslav Fowkes and Charles Sutton. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD). 2016.

    [ .pdf | bib | source code ]


Interactive Machine Learning

  1. Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation. Akash Srivastava, James Zou, Ryan P. Adams and Charles Sutton. In Workshop on Human Interpretability in Machine Learning (co-located with ICML). 2016.

    [ .pdf | bib ]

Programming Languages

  1. Mining Semantic Loop Idioms from Big Code. Miltiadis Allamanis, Earl T. Barr, Christian Bird, Premkumar Devanbu, Mark Marron and Charles Sutton. Microsoft Research Technical Report, MSR-TR-2016-1116, 2016.

    [ .pdf | bib | abstract ]

Big Code

  1. Mining Semantic Loop Idioms from Big Code. Miltiadis Allamanis, Earl T. Barr, Christian Bird, Premkumar Devanbu, Mark Marron and Charles Sutton. Microsoft Research Technical Report, MSR-TR-2016-1116, 2016.

    [ .pdf | bib | abstract ]

Weak Supervision

  1. Latent Bayesian melding for integrating individual and population models. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2015.

    [ .pdf | bib ]

  2. Signal Aggregate Constraints in Additive Factorial HMMs, with Application to Energy Disaggregation. Mingjun Zhong, Nigel Goddard and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

Approximate Inference

  1. Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

  2. Continuous Relaxations for Discrete Hamiltonian Monte Carlo. Yichuan Zhang, Charles Sutton, Amos Storkey and Zoubin Ghahramani. In Advances in Neural Information Processing Systems (NIPS). 2012.

    [ .pdf | bib ]

  3. Quasi-Newton Markov chain Monte Carlo. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2011.

    [ .pdf | bib ]

  4. Improved Dynamic Schedules for Belief Propagation. Charles Sutton and Andrew McCallum. In Conference on Uncertainty in Artificial Intelligence (UAI). 2007. (Significantly faster version of loopy BP by selecting which messages to send based on an approximation to their residual.)

    [ .pdf | bib | abstract ]

  5. Sparse Forward-Backward using Minimum Divergence Beams for Fast Training of Conditional Random Fields. Chris Pal, Charles Sutton and Andrew McCallum. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2006. (New criterion for adaptive beam size within forward-backward, suggested by a variational perspective. Works well within CRF training.)

    [ .pdf | bib | abstract ]

Markov Chain Monte Carlo

  1. Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2014.

    [ .pdf | bib ]

  2. Continuous Relaxations for Discrete Hamiltonian Monte Carlo. Yichuan Zhang, Charles Sutton, Amos Storkey and Zoubin Ghahramani. In Advances in Neural Information Processing Systems (NIPS). 2012.

    [ .pdf | bib ]

  3. Quasi-Newton Markov chain Monte Carlo. Yichuan Zhang and Charles Sutton. In Advances in Neural Information Processing Systems (NIPS). 2011.

    [ .pdf | bib ]

Visualization

  1. Word Storms: Multiples of Word Clouds for Visual Comparison of Documents. Quim Castella and Charles Sutton. In International World Wide Web Conference (WWW). 2014.

    [ .pdf | bib ]

Databases

  1. Supporting User-Defined Functions on Uncertain Data. Thanh T. L. Tran, Yanlei Diao, Charles Sutton and Anna Liu. Proceedings of the VLDB Endowment (PVLDB). 2013.

    [ .pdf | bib ]

  2. Distributed Inference and Query Processing for RFID Tracking and Monitoring. Zhao Cao, Charles Sutton, Yanlei Diao and Prashant Shenoy. Proceedings of the VLDB Endowment (PVLDB) 4 (5). 2011.

    [ .pdf | bib ]

  3. Capturing Data Uncertainty in High-Volume Stream Processing. Yanlei Diao, Boduo Li, Anna Liu, Liping Peng, Charles Sutton, Thanh Tran and Michael Zink. In Conference on Innovative Data Systems Research (CIDR). 2009.

    [ .pdf | bib ]

  4. Probabilistic Inference over RFID Streams in Mobile Environments. Thanh Tran, Charles Sutton, Richard Cocci, Yanming Nie, Yanlei Diao and Prashant Shenoy. In International Conference on Data Engineering (ICDE). 2009.

    [ .pdf | bib ]

Evaluation of Machine Learning

  1. Multiple-source Cross Validation. Krzysztof Geras and Charles Sutton. In International Conference on Machine Learning (ICML). 2013.

    [ .pdf | bib ]

Conditional Random Fields

  1. An Introduction to Conditional Random Fields. Charles Sutton and Andrew McCallum. Foundations and Trends in Machine Learning 4 (4). 2012.

    [ .pdf | bib | abstract ]

  2. Piecewise Training for Structured Prediction. Charles Sutton and Andrew McCallum. Machine Learning 77 (2–3). 2009. (Train undirected graphical model by splitting into overlapping parts that are trained independently. Connections to pseudolikelihood and Bethe free energy. Journal version of UAI and ICML papers below.)

    [ .pdf | bib | abstract ]

  3. Efficient Training Methods for Conditional Random Fields. Charles Sutton. Ph.D. Dissertation, University of Massachusetts, 2008.

    [ .pdf | bib ]

  4. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. Charles Sutton, Andrew McCallum and Khashayar Rohanimanesh. Journal of Machine Learning Research 8. 2007. (Combination of dynamic Bayesian networks and conditional random fields. Also considers latent-variable model and cascaded training. Journal version of ICML and EMNLP papers below.)

    [ .pdf | bib | abstract ]

  5. An Introduction to Conditional Random Fields for Relational Learning. Charles Sutton and Andrew McCallum. In Getoor, Lise and Taskar, Ben, editors. Introduction to Statistical Relational Learning. MIT Press. 2007. (Detailed tutorial on conditional random fields. Includes motivation, background, mathematical foundations, linear-chain form, general-structure form, inference, parameter estimation, and tips and tricks. NOTE: Equation (1.22) contains a small error: the final term should not include a summation over k; it should read lambda_k / sigma^2.)

    [ .pdf | bib ]

  6. Piecewise Pseudolikelihood for Efficient CRF Training. Charles Sutton and Andrew McCallum. In International Conference on Machine Learning (ICML). 2007. (Train a large CRF five times faster by dividing it into separate pieces and reducing the number of predicted-variable combinations with pseudolikelihood. Analysis in terms of belief propagation and the Bethe energy.)

    [ .pdf | bib | abstract ]

  7. Sparse Forward-Backward using Minimum Divergence Beams for Fast Training of Conditional Random Fields. Chris Pal, Charles Sutton and Andrew McCallum. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2006. (New criterion for adaptive beam size within forward-backward, suggested by a variational perspective. Works well within CRF training.)

    [ .pdf | bib | abstract ]

  8. Reducing Weight Undertraining in Structured Discriminative Learning. Charles Sutton, Michael Sindelar and Andrew McCallum. In Conference on Human Language Technology and North American Association for Computational Linguistics (HLT-NAACL). 2006. (Trains multiple linear-chain CRFs with different subsets of features, in order to force dependent sets of features to be able to separately model the class label.)

    (This is the published version. An early version had an error in Section 4, under Per-Sequence Mixtures.)

    [ .pdf | bib | abstract ]

  9. Local Training and Belief Propagation. Charles Sutton and Tom Minka. Microsoft Research Technical Report, TR-2006-121, 2006.

    [ .pdf | bib ]

  10. Learning in Markov Random Fields with Contrastive Free Energies. Max Welling and Charles Sutton. In Conference on Artificial Intelligence and Statistics (AISTATS). 2005.

    [ .pdf | bib | abstract ]

  11. Fast, Piecewise Training for Discriminative Finite-state and Parsing Models. Charles Sutton and Andrew McCallum. Center for Intelligent Information Retrieval Technical Report, IR-403, 2005.

    [ .pdf | bib ]

  12. Piecewise Training of Undirected Models. Charles Sutton and Andrew McCallum. In Conference on Uncertainty in Artificial Intelligence (UAI). 2005. (Train a large CRF by dividing it into pieces that are trained independently. The explanation in this paper for why it works is somewhat unsatisfying; consult the journal version (2009) for a better story.)

    [ .pdf | bib | abstract ]

  13. Composition of Conditional Random Fields for Transfer Learning. Charles Sutton and Andrew McCallum. In Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT-EMNLP). 2005.

    [ .pdf | bib ]

  14. Collective Segmentation and Labeling of Distant Entities in Information Extraction. Charles Sutton and Andrew McCallum. In ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields. 2004.

    [ .pdf | bib ]

  15. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. Charles Sutton, Khashayar Rohanimanesh and Andrew McCallum. In International Conference on Machine Learning (ICML). 2004. (Combination of dynamic Bayesian networks and conditional random fields, with experiments in noun-phrase chunking.)

    [ .pdf | bib | abstract ]

  16. Piecewise Training with Parameter Independence Diagrams: Comparing Globally- and Locally-trained Linear-chain CRFs. Andrew McCallum and Charles Sutton. In NIPS Workshop on Learning with Structured Outputs. 2004.

    [ .pdf | bib | abstract ]

  17. Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences. Andrew McCallum, Khashayar Rohanimanesh and Charles Sutton. In NIPS Workshop on Syntax, Semantics, and Statistics. 2003.

    [ .pdf | bib | abstract ]

Systems / Machine Learning

  1. Automatic Exploration of Datacenter Performance Regimes. Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan and David A. Patterson. In First Workshop on Automated Control for Datacenters and Clouds (ACDC ’09). 2009.

    [ .pdf | bib ]

  2. Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters. Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan and David A. Patterson. In Workshop on Hot Topics in Cloud Computing (HotCloud ’09). 2009.

    [ .pdf | bib ]

  3. Response-Time Modeling for Resource Allocation and Energy-Informed SLAs. Peter Bodik, Charles Sutton, Armando Fox, David Patterson and Michael I. Jordan. In NIPS Workshop on Statistical Learning Techniques for Solving Systems Problems (MLSys ’07). 2007. (Quantile regression (both parametric and nonparametric) for predicting the performance of a web service as a function of workload and power consumption. Much better for voltage control than the built-in frequency scaling.)

    [ .pdf | bib ]

Natural Language Processing

  1. Unsupervised Deduplication using Cross-field Dependencies. Robert Hall, Charles Sutton and Andrew McCallum. In Conference on Knowledge Discovery and Data Mining (KDD). 2008. (Hierarchical DP model that jointly clusters citation venue strings based on both string-edit distance and title information.)

    [ .pdf | bib | abstract ]

  2. Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors. Hanna Wallach, Charles Sutton and Andrew McCallum. In ICML Workshop on Prior Knowledge for Text and Language Processing. 2008. (Two Bayesian dependency parsing models: 1. Model with Pitman-Yor prior that significantly improves Eisner’s classic model; 2. Latent-variable model that learns "syntactic" topics.)

    [ .pdf | bib ]

  3. Joint Parsing and Semantic Role Labeling. Charles Sutton and Andrew McCallum. In Conference on Natural Language Learning (CoNLL). 2005.

    [ .pdf | bib ]

  4. Composition of Conditional Random Fields for Transfer Learning. Charles Sutton and Andrew McCallum. In Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT-EMNLP). 2005.

    [ .pdf | bib ]

  5. Collective Segmentation and Labeling of Distant Entities in Information Extraction. Charles Sutton and Andrew McCallum. In ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields. 2004.

    [ .pdf | bib ]

Nonparametric Bayesian Models

  1. Unsupervised Deduplication using Cross-field Dependencies. Robert Hall, Charles Sutton and Andrew McCallum. In Conference on Knowledge Discovery and Data Mining (KDD). 2008. (Hierarchical DP model that jointly clusters citation venue strings based on both string-edit distance and title information.)

    [ .pdf | bib | abstract ]

  2. Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors. Hanna Wallach, Charles Sutton and Andrew McCallum. In ICML Workshop on Prior Knowledge for Text and Language Processing. 2008. (Two Bayesian dependency parsing models: 1. Model with Pitman-Yor prior that significantly improves Eisner’s classic model; 2. Latent-variable model that learns "syntactic" topics.)

    [ .pdf | bib ]

Information Extraction

  1. Unsupervised Deduplication using Cross-field Dependencies. Robert Hall, Charles Sutton and Andrew McCallum. In Conference on Knowledge Discovery and Data Mining (KDD). 2008. (Hierarchical DP model that jointly clusters citation venue strings based on both string-edit distance and title information.)

    [ .pdf | bib | abstract ]

  2. Collective Segmentation and Labeling of Distant Entities in Information Extraction. Charles Sutton and Andrew McCallum. In ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields. 2004.

    [ .pdf | bib ]