Currently I use the following variable-selection technique for a regression prediction model: I run a univariate regression on each candidate variable, select the variables with p ≤ 0.2, and then fit the final model by entering those selected variables together in a multiple regression.
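To make the screening step concrete, here is a minimal sketch of the univariate p ≤ 0.2 filter on toy simulated data (the data, variable names, and the normal approximation used for the slope's p-value are all my own illustrative assumptions, not from any real dataset):

```python
import math
import random

def simple_ols_pvalue(x, y):
    """Two-sided p-value for the slope b in y ~ a + b*x.

    Uses a normal approximation to the t distribution for simplicity,
    so it is only a rough sketch for moderately large n.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    se_b = math.sqrt(sigma2 / sxx)
    z = abs(b / se_b)
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def screen_variables(X, y, threshold=0.2):
    """Return column indices whose univariate p-value is <= threshold."""
    keep = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        if simple_ols_pvalue(col, y) <= threshold:
            keep.append(j)
    return keep

# toy data: column 0 is truly related to y, column 1 is pure noise
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [2.0 * row[0] + random.gauss(0, 1) for row in X]
selected = screen_variables(X, y)
print(selected)  # the truly associated column 0 should survive the filter
```

The surviving columns would then be entered together into a multiple regression to fit the final model, as described above.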
However, I'm open to other methods to verify my results. I've read about k-fold cross-validation, in which I would, for example in 2-fold cross-validation, split my data into two halves, run a regression with all relevant variables on each half (would "relevant" mean p ≤ 0.2, as above?), and then average the p-values from the two runs.
So if my first partition of the data produces a p-value of 0.1 for a variable and my second partition produces a p-value of 0.05, I would arrive at an averaged result of p = 0.075.
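The split-and-average procedure I have in mind can be sketched as follows, again on toy simulated data (the data, the shuffle-based split, and the normal-approximation p-value are my own illustrative assumptions):

```python
import math
import random

def slope_pvalue(x, y):
    """Two-sided p-value for the slope in y ~ a + b*x (normal approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    resid = [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]
    se = math.sqrt(sum(r * r for r in resid) / (n - 2) / sxx)
    z = abs(b / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

random.seed(1)
x = [random.gauss(0, 1) for _ in range(100)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]

# shuffle the row indices, then split the data into two halves (the 2 "folds")
idx = list(range(len(x)))
random.shuffle(idx)
half = len(idx) // 2
folds = [idx[:half], idx[half:]]

# fit the same regression on each half and average the two p-values
pvals = [slope_pvalue([x[i] for i in f], [y[i] for i in f]) for f in folds]
avg_p = sum(pvals) / len(pvals)
print(pvals, avg_p)
```

This is only the mechanics of the procedure as I understand it; whether averaging p-values across folds is a valid way to verify the model is exactly what I'm asking.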
Is this correctly understood? My field is medicine; would this method be a good way to verify my results in a predictive model with about 30-40 independent variables and a single outcome y?