So 1 exemplar in the positive class and 100 in the negative class is clearly shit, but what if the positive class is double the size of the negative class? What about a 20/80 ratio, or 30/70? Are there any rules of thumb for how badly imbalanced a dataset can be for ML classification?
Edit: I undersampled and all my problems went away. Lesson learned.
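For anyone landing here later, this is a minimal sketch of what random undersampling looks like in plain NumPy (the function name `undersample` and the arrays `X`/`y` are my own illustration, not from the original post) — it just throws away majority-class rows until every class is the size of the smallest one:

```python
import numpy as np

def undersample(X, y, seed=0):
    """Randomly undersample every class down to the minority class size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    # for each class, keep a random sample of n_min indices (no replacement)
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]

# tiny demo: 100 negatives vs 10 positives -> balanced 10/10
X = np.arange(110).reshape(-1, 1)
y = np.array([0] * 100 + [1] * 10)
Xb, yb = undersample(X, y)
print(np.bincount(yb))  # [10 10]
```

The catch, of course, is that you're discarding data; with a large majority class that's often fine, but alternatives like class weights or oversampling keep all the samples.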