Large sets of Linear Equations

Excerpt from: Numerical Methods that (usually) Work, by Forman S. Acton, Spectrum Paperback 1990 (original edition 1970), p253.

Whenever a person eagerly inquires if my computer can solve a set of 300 equations in 300 unknowns, I must quickly suppress the temptation to retort, “Yes, but why bother?” There are, indeed, legitimate sets of equations that large. They arise from replacing a partial differential equation on a set of grid points, and the person who knows enough to tackle this type of problem also usually knows what kind of computer he needs. The odds are all too high that our inquiring friend is suffering from a quite different problem: he probably has collected a set of experimental data and is now attempting to fit a 300-parameter model to it—by Least Squares! The sooner this guy can be eased out of your office, the sooner you will be able to get back to useful work—but these chaps are persistent. They have fitted three-parameter models on desk machines with no serious difficulties and now the electronic computer permits them more grandiose visions. They leap from the known to the unknown with a terrifying innocence and the perennial self-confidence that every parameter is totally justified. It does no good to point out that several parameters are nearly certain to be competing to “explain” the same variations in the data and hence the equation system will be nearly indeterminate. It does no good to point out that all large least-squares matrices are striving mightily to be proper subsets of the Hilbert matrix—which is virtually indeterminate and uninvertible—and so even if all 300 parameters were beautifully independent, the fitting equations would still be violently unstable. All of this, I repeat, does no good—and you end up by getting angry and throwing the guy out of your office.
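
The remark about the Hilbert matrix can be made concrete with a small sketch. The code below is not from the book: it assumes the usual definition H[i][j] = 1/(i+j+1) (zero-based indices) and uses the infinity-norm condition number as the measure of instability, which is one choice among several. Exact rational arithmetic is used so that the inverse itself is not corrupted by rounding. The function names are illustrative only.

```python
from fractions import Fraction


def hilbert(n):
    """Return the n-by-n Hilbert matrix with exact entries 1/(i+j+1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def invert(a):
    """Invert a square matrix of Fractions by exact Gauss-Jordan elimination."""
    n = len(a)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Bring a row with a nonzero pivot into position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def inf_norm(a):
    """Infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)


def condition_number(n):
    """Infinity-norm condition number of the n-by-n Hilbert matrix (exact)."""
    h = hilbert(n)
    return inf_norm(h) * inf_norm(invert(h))


# Print how quickly the condition number grows with the matrix size.
for size in (3, 5, 8, 10):
    print(size, f"{float(condition_number(size)):.3e}")
```

A condition number of this size means that relative errors in the data can be magnified by roughly that factor in the solution, so even a modest number of parameters can exhaust the precision of ordinary floating-point arithmetic.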

Most of this instability is unnecessary, for there is usually a reasonable procedure. Unfortunately, it is undramatic, laborious, and requires thought—which most of these charlatans avoid like the plague. They should merely fit a five-parameter model, then a six-parameter one. If all goes well and there is a statistically valid reduction of the residual variability, then a somewhat more elaborate model may be tried. Somewhere along the line—and it will be much closer to 15 parameters than to 300—the significant improvement will cease and the fitting operation is over. There is no system of 300 equations, no 300 parameters, and no glamor. But a person has to know some statistics, he has to have a clear idea about the mechanisms by which the variability has entered his data, and he has to know the intended use for his fitted formula. It is infinitely easier to let a computer try to solve 300 equations and hope to put some sort of interpretation on the numbers, assuming one gets any, and be safe from criticism because the computer did it all. It is difficult to choose mathematical models to represent material phenomena and, while it is not easy to evaluate the parameters, the real difficulties are statistical, not computational. The computer center’s director must prevent the looting of valuable computer time by these would-be fitters of many parameters. The task is not a pleasant one, but the legitimate computer users have rights, too. The alternative commits everybody to a miserable two weeks of sloshing around in great quantities of “Results” that are manifestly impossible, with no visible way of finding the trouble. The trouble, of course, arises from looking for a good answer to a poorly posed problem, but a computer director seldom knows enough about the subject matter to win any of those arguments with the problem’s proposer, and the impasse finally has to be broken by violence—which therefore might as well be used in the very beginning.
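
The incremental procedure described above can be sketched in a few lines. Everything specific here is an assumption made for illustration and is not taken from the book: the model family (polynomials in one variable), the starting point (a single constant term rather than five parameters), the synthetic data, and the stopping rule (an F statistic for the newly added term, compared against a threshold of 4.0, which is roughly the 5% level when a few dozen residual degrees of freedom remain). The name `stepwise_fit` is hypothetical. Each new term is orthogonalised against the earlier ones, so the fit never requires solving an ill-conditioned system.

```python
import random


def stepwise_fit(x, y, max_params=15, f_threshold=4.0):
    """Add one polynomial term at a time until the improvement is insignificant.

    Returns (number of parameters kept, residual sum of squares after each
    model tried, including the last one that was rejected).
    """
    n = len(x)
    # First basis vector: the constant term, scaled to unit length.
    q = [1.0 / n ** 0.5] * n
    basis = [q]
    c = sum(qi * yi for qi, yi in zip(q, y))
    resid = [yi - c * qi for qi, yi in zip(q, y)]
    rss = [sum(r * r for r in resid)]
    for p in range(2, max_params + 1):
        # Candidate term: x times the previous basis vector, made orthogonal
        # to every earlier basis vector (modified Gram-Schmidt).
        v = [xi * qi for xi, qi in zip(x, basis[-1])]
        for b in basis:
            dot = sum(bi * vi for bi, vi in zip(b, v))
            v = [vi - dot * bi for bi, vi in zip(b, v)]
        norm = sum(vi * vi for vi in v) ** 0.5
        q = [vi / norm for vi in v]
        # Project the current residual onto the new term.
        c = sum(qi * ri for qi, ri in zip(q, resid))
        new_resid = [ri - c * qi for qi, ri in zip(q, resid)]
        new_rss = sum(r * r for r in new_resid)
        # F statistic for one extra parameter with n - p residual degrees
        # of freedom (assumes noisy data, so new_rss is not zero).
        f_stat = (rss[-1] - new_rss) / (new_rss / (n - p))
        rss.append(new_rss)
        if f_stat < f_threshold:
            return p - 1, rss  # the new term did not earn its place
        basis.append(q)
        resid = new_resid
    return max_params, rss


# Synthetic data: a cubic plus small Gaussian noise (illustrative only).
rng = random.Random(0)
n_points = 60
xs = [-1.0 + 2.0 * i / (n_points - 1) for i in range(n_points)]
ys = [1.0 + 2.0 * xi + 1.5 * xi ** 2 + xi ** 3 + rng.gauss(0.0, 0.05)
      for xi in xs]

kept, rss_history = stepwise_fit(xs, ys)
print("parameters kept:", kept)
print("residual sums of squares:", [f"{r:.4g}" for r in rss_history])
```

A fixed threshold is a simplification. A careful analysis would look up the critical value for the actual degrees of freedom and, as the text stresses, would weigh how the variability entered the data and what the fitted formula is for.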