llm auroc

July 24, 2025


how can you calculate AUROC from LLMs? i will be presenting my research in a few days and i had a big bug in my code; the solution in the end was to use token log probs. my professor suggested this package: EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models

here's the procedure

  1. call the endpoint with logprobs=True, top_logprobs >= # classes, max_tokens=1, temperature=0 (see the request sketch after this list)
    1. ensure choices[0].logprobs.content[0].top_logprobs is present; if it's None -> NaN
    2. constrain output to a single token per class: prompt it to return a labeled option (e.g. A: apples, B: bananas) and instruct "answer with a single capital letter"
  2. extract token log probs -> class probabilities (see the extraction sketch below)
    1. from the top_logprobs of the first generated token, build a map for the option letters {A, B, C, D}
    2. clean tokens (strip leading space/byte-pair artifacts)
    3. if any option letter is missing, mark the sample's probabilities as invalid
    4. convert logprobs to probabilities with softmax (normalize to sum=1)
  3. compute AUROC (see the scoring sketch below)
    1. binary tasks: roc_auc_score(y_true, P[:, pos_idx]) (use the probability of the positive class)
    2. multiclass: roc_auc_score(y_true, P, multi_class='ovr', average='macro')
    3. compute only on rows with valid probabilities
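a minimal sketch of the step 1 request, assuming the OpenAI python SDK and a chat-completions-style endpoint; the model name, the prompt wording, and the query_with_logprobs helper are just placeholders for illustration, not part of the original recipe.

```python
# sketch of step 1, assuming the OpenAI python SDK (openai >= 1.x);
# model name and prompt wording are placeholders
from openai import OpenAI

client = OpenAI()

def query_with_logprobs(question: str, options: dict[str, str], model: str = "gpt-4o-mini"):
    """Ask for a single-letter answer and return the top logprobs of the first generated token."""
    option_text = "\n".join(f"{letter}: {text}" for letter, text in options.items())
    prompt = f"{question}\n{option_text}\nAnswer with a single capital letter."
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        logprobs=True,
        top_logprobs=min(20, len(options) + 2),  # >= # classes; the API caps this at 20
        max_tokens=1,
        temperature=0,
    )
    content = resp.choices[0].logprobs.content if resp.choices[0].logprobs else None
    if not content or content[0].top_logprobs is None:
        return None  # caller marks this sample as NaN / invalid
    # list of entries with .token and .logprob for the first generated token
    return content[0].top_logprobs
```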
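and a sketch of step 2's logprob -> probability conversion; it assumes the entries have .token and .logprob attributes (like the SDK objects returned above) and that the option letters are A-D.

```python
# sketch of step 2: map the first token's top logprobs to per-class probabilities;
# assumes entries with .token and .logprob attributes, option letters A-D
import numpy as np

def logprobs_to_class_probs(top_logprobs, letters=("A", "B", "C", "D")):
    """Return normalized class probabilities, or None if any option letter is missing."""
    cleaned = {}
    for entry in top_logprobs:
        # clean tokens: strip leading spaces / byte-pair artifacts like "ĠA"
        token = entry.token.strip().lstrip("Ġ▁")
        # keep the highest logprob if a letter shows up more than once
        if token in letters and (token not in cleaned or entry.logprob > cleaned[token]):
            cleaned[token] = entry.logprob
    if any(letter not in cleaned for letter in letters):
        return None  # invalid sample: at least one option letter missing
    logps = np.array([cleaned[letter] for letter in letters])
    # softmax over the option letters only, so the probabilities sum to 1
    probs = np.exp(logps - logps.max())
    return probs / probs.sum()
```

the softmax here just renormalizes over the option letters, since the model also puts some probability mass on tokens that aren't one of the letters.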
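finally a sketch of step 3; it assumes invalid samples were stored as rows of NaN in P and that y_true holds integer class indices.

```python
# sketch of step 3: AUROC over only the rows with valid probabilities;
# P is an (n_samples, n_classes) array, y_true holds integer class labels
import numpy as np
from sklearn.metrics import roc_auc_score

def compute_auroc(y_true, P, pos_idx=1):
    y_true = np.asarray(y_true)
    P = np.asarray(P, dtype=float)
    # keep only rows where extraction succeeded (invalid samples are rows of NaN)
    valid = ~np.isnan(P).any(axis=1)
    y_valid, P_valid = y_true[valid], P[valid]
    if P_valid.shape[1] == 2:
        # binary: use the probability of the positive class
        return roc_auc_score(y_valid, P_valid[:, pos_idx])
    # multiclass: one-vs-rest, macro-averaged
    return roc_auc_score(y_valid, P_valid, multi_class="ovr", average="macro")
```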

references