Natural Resources Research. 10(4):241–286, DECEMBER 2001

ISSN Print: 1520-7439

Publication Date: December 2001

# A History of Regression and Related Model-Fitting in the Earth Sciences (1636?-2000)

Richard Howarth


¹ Department of Geological Sciences, University College London, Gower Street, London, WC1E 6BT, UK

² The substance of this paper formed the keynote lecture of the 2000 Krumbein Medallist, delivered as part of the IAMG program at the 31st International Geological Congress, Rio de Janeiro, Brazil, 15 August 2000.

### Abstract

The (statistical) modeling of the behavior of a dependent variate as a function of one or more predictors provides examples of model-fitting which span the development of the earth sciences from the 17th Century to the present. The historical development of these methods and their subsequent application is reviewed. Bond's predictions (c. 1636 and 1668) of change in the magnetic declination at London may be the earliest attempt to fit such models to geophysical data. Following publication of Newton's theory of gravitation in 1726, analysis of data on the length of a 1° meridian arc, and the length of a pendulum beating seconds, as a function of sin²(latitude), was used to determine the ellipticity of the oblate spheroid defining the Figure of the Earth. The pioneering computational methods of Mayer in 1750, Boscovich in 1755, and Lambert in 1765, and the subsequent independent discoveries of the principle of ‘least squares’ by Gauss in 1799, Legendre in 1805, and Adrain in 1808, and its later substantiation on the basis of probability theory by Gauss in 1809 were all applied to the analysis of such geodetic and geophysical data. Notable later applications include: the geomagnetic survey of Ireland by Lloyd, Sabine, and Ross in 1836, Gauss's model of the terrestrial magnetic field in 1838, and Airy's 1845 analysis of the residuals from a fit to pendulum lengths, from which he recognized the anomalous character of measurements of gravitational force which had been made on islands. In the early 20th Century applications to geological topics proliferated, but the computational burden effectively held back applications of multivariate analysis. Following World War II, the arrival of digital computers in universities in the 1950s facilitated computation, and fitting linear or polynomial models as a function of geographic coordinates, ‘trend surface analysis,’ became popular during the 1950–60s.
The inception of ‘geostatistics’ in France at this time by Matheron had its roots in meeting the evident need for improved estimators in spatial interpolation. Technical advances in regression analysis during the 1970s embraced the development of regression diagnostics and consequent attention to outliers; the recognition of problems caused by correlated predictors, and the subsequent introduction of ‘ridge regression’ to overcome them; and techniques for fitting errors-in-variables and mixture models. Improvements in computational power have enabled ever more computer-intensive methods to be applied. These include algorithms which are ‘robust’ in the presence of outliers, for example Rousseeuw's 1984 Least Median of Squares; nonparametric smoothing methods, such as kernel functions, splines, and Cleveland's 1979 LOcally WEighted Scatterplot Smoother (LOWESS); and the Classification and Regression Tree (CART) technique of Breiman and others in 1984. Although the rate of technology transfer from the statistical to the earth-science community has continued to improve, with the time-lag dropping abruptly to about 10 years following the introduction of digital computers, these more recent developments are only just beginning to penetrate beyond the research community of earth scientists. Examples of applications to problem-solving in the earth sciences are given.