# A brief history of Model II regression analysis

- Karl Pearson [1] was the “first” to address the problem of fitting a line when both the X and Y variables had measurement error. He called his solution to the problem the “major axis” of the data ellipse. It describes how X and Y co-vary.
- Kermack and Haldane [2] later showed that the major axis is not uniquely determined when the units of the X and Y variables are changed: the slope and intercept vary (even after correction for the new axes) when the scales change. They proposed the use of a “reduced major axis”, in which both X and Y are converted to standardized variables (mean = 0 and standard deviation = 1).
- York [3] developed a method of weighting the data points in both X and Y for those cases where one wants to find the major axis but the uncertainties of the two measurements differ. He called his method the least squares cubic because it requires the solution of a cubic equation to find the slope of the regression, not because it fits a cubic curve.
- Ricker [4] showed that the geometric mean regression is identical to the reduced major axis but far easier to compute.
- Jolicoeur [5] took great exception to some of Ricker’s comments; most notably:
- he pointed out that “*l’axe majeur des variables réduites*” is more accurately translated as *standard major axis* rather than *reduced major axis*, which Kermack and Haldane [2] had translated literally from the French;
- that formulae for the asymmetrical confidence limits for the slope of the geometric mean regression already existed;
- and that, because of the lack of sensitivity of the slope of the standard major axis to the strength of the relationship, the use of the bivariate structural relationship was preferred.

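Whatever it is called, the quantity under dispute is simple to compute: the reduced (standard) major axis slope is sign(r) · s_y/s_x, and the fitted line passes through the point of means. A minimal Python sketch, with invented data for illustration:

```python
import math

def geometric_mean_regression(x, y):
    """Geometric mean (reduced/standard major axis) regression.

    slope = sign(r) * (s_y / s_x); the line passes through the
    point of means (x_bar, y_bar).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)   # sum of squares in x
    syy = sum((b - my) ** 2 for b in y)   # sum of squares in y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # s_y / s_x = sqrt(syy / sxx); the sign is the sign of r (i.e. of sxy)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical example data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b, a = geometric_mean_regression(x, y)
```

Equivalently, the slope is the geometric mean of the two model-I slopes, sqrt(b_yx · b_xy), which is why Ricker’s geometric mean regression and Kermack and Haldane’s reduced major axis coincide.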
- In his reply, Ricker [6] took great pains to address each of Jolicoeur’s complaints:
- he presented the formulae for the asymmetrical confidence limits for the slope of the geometrical mean regression and agreed that they provided better limits than his approximate symmetrical limits. However, because of their computational complexity, I have not included them here.
- and, he robustly defended the geometric mean regression as always suitable for naturally variable data.

- Subsequently, Sprent and Dolby [7] took exception to the *ad hoc* use of the geometric mean regression in model-II cases. They argue that an equally strong case can be made for the line that bisects the minor angle between the two model-I regressions: Y-on-X and X-on-Y. (Let’s call this line the least squares bisector.) While the differences in slope between the geometric mean regression and the least squares bisector are small and probably not statistically significant, this new “regression” line is included here for the sake of completeness.
- Even so, Ricker [4] has been extensively cited, including:
- Laws and Archie’s [8] presentation of a very illustrative (biological) example of the pitfalls of using a model-I regression when a model-II regression is required.
- Sokal and Rohlf’s [9] textbook *Biometry*, where the issues of model-I *vs.* model-II regression are discussed in great detail.
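Sprent and Dolby’s [7] least squares bisector is also straightforward to compute: both model-I lines pass through the point of means, so converting their slopes to angles and averaging bisects the angle between them. A sketch under that construction, with invented data for illustration:

```python
import math

def least_squares_bisector(x, y):
    """Line bisecting the minor angle between the two model-I
    regressions (Y-on-X and X-on-Y); both pass through the means.

    Assumes x and y are actually correlated (sxy != 0).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b_yx = sxy / sxx    # Y-on-X slope
    b_xy = syy / sxy    # X-on-Y regression, expressed as y vs. x
    # average the two line angles to bisect the angle between them
    theta = (math.atan(b_yx) + math.atan(b_xy)) / 2.0
    slope = math.tan(theta)
    intercept = my - slope * mx
    return slope, intercept
```

For well-correlated data the bisector slope lies between the two model-I slopes and agrees with the geometric mean slope to several decimal places, consistent with the remark above that the differences are small.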

- Recently, Laws [10] has written a book which contains a collection of mathematical and statistical methods commonly used by oceanographers. It includes an extensive chapter where various aspects of model-II regression techniques are presented.
- Bevington and Robinson [11] have an excellent chapter in their book that describes the derivation of several of the model-I regression lines and the statistics used to calculate the slope and y-intercept plus their standard deviations. In a later chapter, they also describe the derivation of the correlation coefficient, r, and how it is calculated.
- And, finally, York et al. [12] have derived unified equations for the slope, intercept and standard errors of the best straight line for model-II cases showing that the least-squares estimation (LSE) and maximum likelihood estimation (MLE) methods yield identical results. Furthermore, they show that all known correct regression solutions in the literature can be derived from the original York equations [3].

### References

1. Pearson (1901). On lines and planes of closest fit to systems of points in space. *Phil. Mag.* 2(6): 559-572.
2. Kermack & Haldane (1950). Organic correlation and allometry. *Biometrika* 37: 30-41.
3. York (1966). Least-squares fitting of a straight line. *Canad. J. Phys.* 44: 1079-1086.
4. Ricker (1973). Linear regressions in fishery research. *J. Fish. Res. Board Can.* 30: 409-434.
5. Jolicoeur (1975). Linear regressions in fishery research: some comments. *J. Fish. Res. Board Can.* 32: 1491-1494.
6. Ricker (1975). A note concerning Professor Jolicoeur’s comments. *J. Fish. Res. Board Can.* 32: 1494-1498.
7. Sprent and Dolby (1980). The geometric mean functional relationship. *Biometrics* 36: 547-550.
8. Laws and Archie (1981). Appropriate use of regression analysis in marine biology. *Mar. Biol.* 65: 13-16.
9. Sokal and Rohlf (1995). *Biometry*, 3rd edition. W. H. Freeman and Company, San Francisco, CA.
10. Laws (1997). *Mathematical Methods for Oceanographers*. John Wiley and Sons, Inc., New York, NY.
11. Bevington and Robinson (2003). *Data Reduction and Error Analysis for the Physical Sciences*, 3rd edition. McGraw-Hill, New York, NY.
12. York et al. (2004). Unified equations for the slope, intercept, and standard errors of the best straight line. *Am. J. Phys.* 72(3): 367-375.