I’m relatively familiar with principal component analysis (PCA) itself but have a targeted question regarding its application.
Suppose I have 25 rows of data representing 25 successful product lots, characterized by six columns, representing various metrics.
I also have 5 rows representing failed product lots.
In order to “cluster” or separate the successful from the failed product lots, should I:
(1) apply PCA to the entire data set at once and investigate a score plot
(2) first apply PCA to the 25 successful lots only to build a model, then calculate scores for the failed lots using that model and observe where they fall relative to the successful lots?
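
To make the two options concrete, here is a minimal sketch of what I mean, using NumPy only. The data are random placeholders standing in for my real lots, and the names (`good`, `bad`, `fit_pca`, `project`) are hypothetical. Autoscaling each column and keeping two components are assumptions for illustration, not something I have settled on.

```python
import numpy as np

# Placeholder data (assumption): 25 successful lots and 5 failed lots, 6 metrics each.
rng = np.random.default_rng(0)
good = rng.normal(size=(25, 6))
bad = rng.normal(loc=1.5, size=(5, 6))

def fit_pca(X, n_components=2):
    """Fit PCA on autoscaled X; return the centering, scaling, and loadings."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T

def project(X, model):
    """Compute scores for X using the centering/scaling/loadings of a fitted model."""
    mean, std, loadings = model
    return ((X - mean) / std) @ loadings

# Option (1): fit on all 30 lots at once, then inspect one combined score plot.
all_lots = np.vstack([good, bad])
model_all = fit_pca(all_lots)
scores_all = project(all_lots, model_all)

# Option (2): fit on the 25 successful lots only, then project the failed lots
# onto that model using the successful lots' mean, scale, and loadings.
model_good = fit_pca(good)
scores_good = project(good, model_good)
scores_bad = project(bad, model_good)
```

In option (1) the failed lots influence the mean, the scaling, and the loadings; in option (2) they do not, so their scores only describe where they land in the space defined by the successful lots.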