[Q] Coding ordinal predictors in regression

Hi all,

I’m running a logistic regression to predict an outcome of interest, and I have a question about the best way to handle one variable: the number of times a relevant event has occurred. The number of instances is a count, and I’m dissatisfied with the way I’ve modeled it as a continuous predictor, so I’m thinking of collapsing it into an ordinal one (I have some evidence that 0-2 instances don’t matter much, 3 matters a bit, 4 matters a lot, and 5+ is basically a guarantee that the outcome of interest will occur).

What’s the optimal way to code ordinal predictors? Is it standard dummy-coding, so that, e.g., I have flags events_3, events_4, events_5plus, which would take values 1, 0, 0 respectively when the number of events is 3, or values 0, 1, 0 respectively when the number of events is 4?
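
To make the first option concrete, here is a minimal sketch of that dummy coding in plain Python. The function name `dummy_code` is made up for illustration, and it assumes 0-2 events is the reference level (all flags 0):

```python
def dummy_code(n_events):
    """Return (events_3, events_4, events_5plus) for a non-negative count."""
    # 0-2 events is the assumed reference level, so all three flags stay 0.
    return (
        int(n_events == 3),   # events_3
        int(n_events == 4),   # events_4
        int(n_events >= 5),   # events_5plus
    )


# Show the coding for counts 0 through 6.
for n in range(7):
    print(n, dummy_code(n))
```

At most one flag is 1 for any given count, so each coefficient would compare that level against the 0-2 baseline.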

Or should I do it in a way that reflects the ordinality, so that, e.g., I have flags events_3ormore, events_4ormore, events_5ormore, which take values 1, 0, 0 when the number of events is 3, or values 1, 1, 0 when the number of events is 4, or values 1, 1, 1 when the number of events is 5 or more?
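
The second option (cumulative "or more" flags) could be sketched like this. Again, `threshold_code` is a hypothetical name, and 0-2 events is assumed to be the baseline with all flags 0:

```python
def threshold_code(n_events):
    """Return (events_3ormore, events_4ormore, events_5ormore) for a count."""
    # Each flag switches on once the count reaches its threshold and stays on.
    return (
        int(n_events >= 3),   # events_3ormore
        int(n_events >= 4),   # events_4ormore
        int(n_events >= 5),   # events_5ormore
    )


# Show the coding for counts 0 through 6.
for n in range(7):
    print(n, threshold_code(n))
```

Here each coefficient would describe the step from one level to the next rather than a comparison with the baseline. Since these three columns are sums of the usual 0/1 dummy columns for 3, 4, and 5+ (and vice versa), an unpenalized logistic regression with an intercept should give the same fitted probabilities under either coding; only the interpretation of the coefficients differs.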

Thanks in advance. For the record, I’m doing all this in Python.

submitted by /u/sw85

Published by

Nevin Manimala

Nevin Manimala is interested in blogging and finding new blogs https://nevinmanimala.com
