Maximum Entropy Modeling

This page is dedicated to a general-purpose machine learning technique called Maximum Entropy Modeling (MaxEnt for short). On this page you will find:

What is Maximum Entropy Modeling

In his famous 1957 paper, E. T. Jaynes wrote:
Information theory provides a constructive criterion for setting up probability distributions on the basis of partial knowledge, and leads to a type of statistical inference which is called the maximum entropy estimate. It is the least biased estimate possible on the given information; i.e., it is maximally noncommittal with regard to missing information.
That is to say, when characterizing unknown events with a statistical model, we should always choose the distribution that has maximum entropy among all those consistent with what we already know.
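This principle can be illustrated with a small sketch. When nothing is known beyond the fact that probabilities must sum to one, the maximally noncommittal choice is the uniform distribution, which has the highest entropy; the candidate distributions below are arbitrary examples chosen for illustration.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Three candidate distributions over the same three outcomes.
# Absent any further constraints, the uniform one is the
# maximum entropy choice.
uniform = [1 / 3, 1 / 3, 1 / 3]
skewed = [0.5, 0.3, 0.2]
peaked = [0.9, 0.05, 0.05]

assert entropy(uniform) > entropy(skewed) > entropy(peaked)
```

With additional constraints (e.g. known expected values of feature functions), the maximum entropy distribution is no longer uniform but takes an exponential form, which is what MaxEnt models exploit.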

Maximum Entropy Modeling has been successfully applied to Computer Vision, Spatial Physics, Natural Language Processing and many other fields. This page will focus on applying MaxEnt to Natural Language Processing (NLP).

The concept of Maximum Entropy can be traced back along multiple threads to Biblical times. However, not until the late 20th century did computers become powerful enough to handle complex problems with statistical modeling techniques like MaxEnt.

Maximum Entropy was first introduced to the NLP field by Berger et al. (1996) and Della Pietra et al. (1997). Since then, the Maximum Entropy technique (and the more general framework of Random Fields) has enjoyed intensive research in the NLP community.
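The conditional MaxEnt model popularized by Berger et al. (1996) has the form p(y | x) = exp(Σ_i λ_i f_i(x, y)) / Z(x), where the f_i are feature functions and the λ_i are learned weights. The sketch below evaluates this form directly; the part-of-speech feature and the hand-picked weights are made-up illustrations, not taken from any real trained model.

```python
import math

def p_y_given_x(x, y, labels, features, weights):
    """Evaluate p(y | x) = exp(sum_i w_i * f_i(x, y)) / Z(x),
    the conditional maximum entropy (log-linear) model form."""
    def score(label):
        return math.exp(sum(w * f(x, label) for f, w in zip(features, weights)))
    return score(y) / sum(score(l) for l in labels)

# Toy NLP example: two binary features firing on the suffix "-ing".
# The weights are hypothetical, hand-picked values for illustration;
# in practice they are estimated from data (e.g. by GIS or L-BFGS).
features = [
    lambda x, y: 1.0 if x.endswith("ing") and y == "VERB" else 0.0,
    lambda x, y: 1.0 if x.endswith("ing") and y == "NOUN" else 0.0,
]
weights = [1.5, 0.5]
labels = ["VERB", "NOUN"]

p_verb = p_y_given_x("running", "VERB", labels, features, weights)
p_noun = p_y_given_x("running", "NOUN", labels, features, weights)
assert p_verb > p_noun          # the higher-weighted feature wins
assert abs(p_verb + p_noun - 1.0) < 1e-12
```

The normalizer Z(x) sums the exponentiated scores over all labels, which guarantees that p(· | x) is a proper probability distribution for every input x.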

Tutorials for Maximum Entropy Modeling

Here is an (incomplete) list of tutorials and introductions to Maximum Entropy Modeling.

Maxent related software

Here is an incomplete list of software found on the net that is related to Maximum Entropy Modeling.

Annotated papers on Maximum Entropy Modeling in NLP

Here is a list of recommended papers on Maximum Entropy Modeling, with brief annotations. Other recommended papers:

Other MaxEnt related resources on the web