Binary classification metrics are calculated incorrectly via the API
The probabilities-to-labels conversion produces incorrect results from the ROC AUC metric in the binary case.
Example:
All predicted labels are zero.
It can be reproduced in test_classification_quality_improvement (even in the baseline_metrics step).
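For context, here is a minimal sketch (with made-up values, not taken from the test) of why thresholding probabilities before computing ROC AUC distorts the metric, using scikit-learn's roc_auc_score:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1])
# Predicted probabilities of the positive class (illustrative values)
probs = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.6])

# ROC AUC on raw probabilities -- the correct usage
print(roc_auc_score(y_true, probs))    # ~0.889

# ROC AUC on thresholded labels -- ranking information is lost
labels = (probs >= 0.5).astype(int)
print(roc_auc_score(y_true, labels))   # ~0.833 (distorted)

# Degenerate case from this report: all converted labels are zero,
# so ROC AUC collapses to the uninformative 0.5
all_zero = np.zeros_like(y_true)
print(roc_auc_score(y_true, all_zero))  # 0.5
```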
The best solution is to run pipeline.predict in label-based mode instead of directly converting probabilities.
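A rough sketch of what the fix could look like; `pipeline`, `test_data`, and the `output_mode` values are placeholders assumed to match FEDOT's Pipeline.predict interface and should be verified against the actual API:

```python
# Label-dependent metrics (accuracy, F1, ...) should use predictions
# produced directly in label mode rather than thresholded probabilities
predicted_labels = pipeline.predict(test_data, output_mode='labels')

# ROC AUC should keep operating on class probabilities
predicted_probs = pipeline.predict(test_data, output_mode='probs')
```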