Null Distribution of the Test Statistic for Model Selection via Marginal Screening: Implications for Multivariate Regression Analysis
Keywords:
linear models, multiple comparisons, Freedman's paradox, extreme value theory, order statistics, genetic risk score
Abstract
Marginal screening (MS) is a computationally simple and commonly used dimension reduction procedure. In it, a linear model is constructed for several top predictors chosen according to the absolute value of their marginal correlations with the dependent variable. Importantly, when k predictors out of m primary covariates are selected, standard regression analysis may yield false-positive results if m ≫ k (Freedman's paradox). In this work, we provide analytical expressions describing the null distribution of the test statistics for model selection via MS. Using the theory of order statistics, we show that under MS the common F-statistic is distributed as the mean of the k top variables out of m independent random variables having a χ² distribution with 1 degree of freedom (χ²₁). Based on this finding, we estimated critical p-values for multiple regression models after MS; comparing these with the p-values obtained in real studies will help researchers avoid false-positive results. The analytical solutions obtained in this work are implemented in a free Excel spreadsheet program.
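The distributional claim in the abstract can be illustrated with a short Monte Carlo sketch. This is not the authors' spreadsheet or their analytical solution; it only simulates the stated null statistic (the mean of the k largest of m independent χ²₁ variables) and reads off an empirical critical value. The function names `simulate_ms_null` and `critical_value`, the simulation size, and the choice of m = 50, k = 5 are illustrative assumptions. The "naive" comparison uses the case m = k, where no screening takes place and the statistic reduces to χ²ₖ/k, the large-sample limit of the usual F-statistic.

```python
import numpy as np

def simulate_ms_null(m, k, n_sim=20000, seed=0):
    """Simulate the mean of the k largest of m independent chi-square(1) draws.

    Hypothetical helper for illustration; assumes independent predictors
    and a large sample, as in the null model described in the abstract.
    """
    rng = np.random.default_rng(seed)
    # Squared standard normals are chi-square distributed with 1 degree of freedom.
    chi2 = rng.standard_normal((n_sim, m)) ** 2
    # After partitioning at index m - k, the last k entries of each row
    # are the k largest values (in arbitrary order).
    top_k = np.partition(chi2, m - k, axis=1)[:, m - k:]
    return top_k.mean(axis=1)

def critical_value(m, k, alpha=0.05, **kwargs):
    """Empirical (1 - alpha) quantile of the simulated null statistic."""
    return float(np.quantile(simulate_ms_null(m, k, **kwargs), 1 - alpha))

# No screening (m == k) versus screening the top 5 of 50 covariates.
naive = critical_value(m=5, k=5)
screened = critical_value(m=50, k=5)
print(f"critical value without screening: {naive:.2f}")
print(f"critical value after screening 5 of 50: {screened:.2f}")
```

The screened critical value comes out well above the naive one, which is the mechanism behind Freedman's paradox: a statistic that looks significant against the standard reference distribution may be unremarkable once the selection step is taken into account.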
Published
2021-01-15
License
Copyright (c) 2021 Authors and Global Journals Private Limited
This work is licensed under a Creative Commons Attribution 4.0 International License.