[Q] At what point is your dataset “too imbalanced” for classification?

So 1 example in the positive class and 100 in the negative class is clearly shit, but what if the negative class is double the size of the positive class? What about a 20/80 ratio, or 30/70? Are there any rules of thumb for how badly imbalanced a dataset can be for ML classification?

Edit: undersampled the majority class and all my problems went away. Lesson learned.
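
For anyone landing here later, here is a minimal sketch of what "undersampling" means in this context: randomly dropping majority-class rows until both classes are the same size. The function name, variable names, and the 80/20 toy data are illustrative assumptions, not from the original post; libraries like imbalanced-learn offer ready-made samplers, but plain NumPy is enough to show the idea.

```python
import numpy as np

def undersample_majority(X, y, seed=0):
    """Randomly drop majority-class rows until every class has as many
    rows as the smallest (minority) class. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority_count = counts.min()

    keep_idx = []
    for cls in classes:
        cls_idx = np.flatnonzero(y == cls)
        # Keep all minority rows; sample the majority down to the minority count.
        keep_idx.append(rng.choice(cls_idx, size=minority_count, replace=False))

    keep_idx = np.concatenate(keep_idx)
    rng.shuffle(keep_idx)
    return X[keep_idx], y[keep_idx]

# Example: an 80/20 imbalance undersampled to 50/50.
X = np.random.randn(1000, 5)
y = np.array([0] * 800 + [1] * 200)
X_bal, y_bal = undersample_majority(X, y)
print(np.bincount(y_bal))  # -> [200 200]
```

Note the trade-off: undersampling throws away majority-class data, so it is usually done only on the training split, with evaluation kept on the original distribution.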

submitted by /u/pretysmitty
