A brief history of Model II regression analysis

  • Karl Pearson [1] was the “first” to address the problem of fitting a line when both the X and Y variables had measurement error. He called his solution to the problem the “major axis” of the data ellipse. It describes how X and Y co-vary.
  • Kermack and Haldane [2] later showed that when the units of the X and Y variables were changed the major axis was not uniquely determined: the slope and intercept would vary (even after correction for the new axes) when the scales were changed. They proposed the use of a “reduced major axis” where both X and Y were converted to standardized variables. For standardized variables, the mean = 0 and the standard deviation = 1.
  • York [3] developed a method of weighting the data points in both X and Y for those cases where one wants to find the major axis but the uncertainties of the two measurements are different. He called his method the “least squares cubic” because it requires the solution of a cubic equation to find the slope of the regression, not because it gives a cubic equation.
  • Ricker [4] showed that the geometric mean regression is identical to the reduced major axis but far easier to compute.
  • Jolicoeur [5] took great exception to some of Ricker’s comments; most notably:
    • he pointed out that “l’axe majeur des variables reduites” is more accurately translated as “standard major axis” rather than “reduced major axis”, which Kermack and Haldane [2] had translated literally from the French;
    • that formulae for the asymmetrical confidence limits for the slope of the geometrical mean regression already existed;
    • and, that because of the lack of sensitivity of the slope of the standard major axis to the strength of the relationship, the use of the bivariate structural relationship was preferred.
  • In his reply, Ricker [6] took great pains to address each of Jolicoeur’s complaints:
    • he presented the formulae for the asymmetrical confidence limits for the slope of the geometrical mean regression and agreed that they provided better limits than his approximate symmetrical limits. However, because of their computational complexity, I have not included them here.
    • and, he robustly defended the geometric mean regression as always suitable for naturally variable data.
  • Subsequently, Sprent and Dolby [7] took exception to the ad hoc use of the geometric mean regression in model II cases. They argued that an equally strong case can be made for the line that bisects the minor angle between the two model I regressions: Y-on-X and X-on-Y. (Let’s call this line the least squares bisector.) While the differences in slope between the geometric mean regression and the least squares bisector are small and probably not statistically significant, this new “regression” line is included here for the sake of completeness.
  • Even so, Ricker [4] has been extensively cited, including:
    • Laws and Archie’s [8] presentation of a very illustrative (biological) example of the pitfalls of using the model-I regression when a model-II regression is required.
    • Sokal and Rohlf’s [9] textbook “Biometry”, where the issues of model-I vs model-II regression are discussed in great detail.
  • Recently, Laws [10] has written a book which contains a collection of mathematical and statistical methods commonly used by oceanographers. It includes an extensive chapter where various aspects of model-II regression techniques are presented.
  • Bevington and Robinson [11] have an excellent chapter in their book that describes the derivation of several of the model I regression lines and the statistics used to calculate the slope and y-intercept plus their standard deviations. In a later chapter, they also describe the derivation of the correlation coefficient, r, and how it is calculated.
  • And, finally, York et al. [12] have derived unified equations for the slope, intercept and standard errors of the best straight line for model-II cases showing that the least-squares estimation (LSE) and maximum likelihood estimation (MLE) methods yield identical results. Furthermore, they show that all known correct regression solutions in the literature can be derived from the original York equations [3].
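
The slope definitions mentioned in this history can be compared side by side in a short sketch. The Python below is an illustration only, not the code distributed with this page: the function names (sums_of_squares, model2_slopes, standardize) and the sample data are hypothetical, the formulas are the usual textbook expressions for each line, and York's weighted fit [3] is not included. All of these lines pass through the centroid of the data, so only the slopes are computed.

```python
import math


def sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy): centered sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxx, syy, sxy


def standardize(v):
    """Convert v to mean 0 and (sample) standard deviation 1."""
    n = len(v)
    m = sum(v) / n
    sd = math.sqrt(sum((vi - m) ** 2 for vi in v) / (n - 1))
    return [(vi - m) / sd for vi in v]


def model2_slopes(x, y):
    """Slopes (dY/dX) of several candidate lines for the same data."""
    sxx, syy, sxy = sums_of_squares(x, y)
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("need non-constant, correlated x and y")
    b_yx = sxy / sxx  # model I, Y-on-X
    b_xy = syy / sxy  # model I, X-on-Y, expressed as dY/dX
    # Major axis: direction of the first principal axis of the data ellipse.
    major = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    # Geometric mean of the two model I slopes, carrying the sign of Sxy.
    geometric = math.copysign(math.sqrt(syy / sxx), sxy)
    # Bisector of the angle between the two model I lines.
    bisector = math.tan((math.atan(b_yx) + math.atan(b_xy)) / 2)
    return {
        "y_on_x": b_yx,
        "x_on_y": b_xy,
        "major_axis": major,
        "geometric_mean": geometric,
        "bisector": bisector,
    }


# Hypothetical sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.5, 3.9, 4.2, 8.1, 8.0, 12.5]

for name, slope in model2_slopes(x, y).items():
    print(f"{name:15s} {slope:.4f}")
```

With this sketch, rescaling X (for example, multiplying every x by 10) changes the major axis slope by something other than the scale factor, which is the problem Kermack and Haldane [2] raised. The major axis of the standardized variables has slope ±1, which converts back to the geometric mean slope in the original units, in line with Ricker's observation [4].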

References

  1. Pearson (1901). On lines and planes of closest fit to systems of points in space. Phil. Mag. v2(6): 559-572.
  2. Kermack & Haldane (1950). Organic correlation and allometry. Biometrika v37: 30-41.
  3. York (1966). Least-squares fitting of a straight line. Canad. J. Phys. 44: 1079-1086.
  4. Ricker (1973). Linear regressions in Fishery Research. J. Fish. Res. Board Can. 30: 409-434.
  5. Jolicoeur (1975). Linear Regressions in Fishery Research: Some Comments. J. Fish. Res. Board Can. 32: 1491-1494.
  6. Ricker (1975). A Note Concerning Professor Jolicoeur’s Comments. J. Fish. Res. Board Can. 32: 1494-1498.
  7. Sprent and Dolby (1980). The Geometric Mean Functional Relationship. Biometrics 36: 547-550.
  8. Laws and Archie (1981). Appropriate use of regression analysis in marine biology. Mar. Biol. 65: 13-16.
  9. Sokal and Rohlf (1995). Biometry, 3rd edition. W. H. Freeman and Company, San Francisco, CA.
  10. Laws (1997). Mathematical Methods for Oceanographers. John Wiley and Sons, Inc., New York, NY.
  11. Bevington and Robinson (2003). Data reduction and error analysis for the physical sciences. 3rd edition. McGraw-Hill, New York, NY.
  12. York et al. (2004). Unified equations for the slope, intercept, and standard errors of the best straight line. Am. J. Phys. 72(3): 367-375.
