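Currently, VW cannot report AUC and, worse, it cannot optimize for AUC directly. AUC is a ranking metric defined over pairs of examples rather than a sum of per-example losses, so optimizing it exactly is not compatible with online learning. There are, however, approximations of AUC that are suitable for optimization.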
Concerning your question, you don't need to store the intermediate file with raw predictions on disk. You can pipe it directly to the external evaluation tool (perf in this case):
vw -d test.data -t -i model.vw -r /dev/stdout | perf -roc -files gold /dev/stdin
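If perf is not available, the same number could be computed with a few lines of Python. This is only a sketch: the function name `auc` is made up for illustration, and it assumes you have already parsed the gold labels (+1 / -1) and the raw predictions into two lists of equal length.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) statistic.

    labels: sequence of +1 / -1 gold labels
    scores: sequence of raw predictions, higher means "more positive"
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    n_pos = 0
    i = 0
    while i < n:
        # Find the run of tied scores pairs[i:j] and give them their average rank.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            if pairs[k][1] > 0:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

Calling `auc(gold_labels, raw_predictions)` gives the fraction of positive/negative pairs that the model ranks in the right order, with ties counted as half.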
Edit:
John Langford confirmed that AUC can generally be optimized by changing the ratio of the false positive loss to the false negative loss. In VW, this means setting different importance weights for positive and negative examples. You need to tune the weight on a held-out set (or with cross-validation, or with progressive validation loss for one-pass learning).
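As a rough illustration of the importance-weight idea, here is a sketch of a helper that rewrites training lines. In VW's input format the importance weight is an optional number right after the label. The function name `add_importance` and its parameters are hypothetical, and it assumes each line has the simple form `<label> | features` with a label of 1 or -1, no tag and no existing weight.

```python
def add_importance(line, pos_weight=1.0, neg_weight=1.0):
    """Insert an importance weight after the label of a VW-format example.

    Assumes the line looks like '<label> | features...' with label 1 or -1
    and nothing else (no weight, no tag) before the bar.
    """
    head, sep, features = line.partition("|")
    label = head.split()[0]
    # Positive examples get pos_weight, negative examples get neg_weight.
    weight = pos_weight if float(label) > 0 else neg_weight
    return f"{label} {weight:g} {sep}{features}"
```

You would then train one model per candidate weight ratio on the rewritten data and keep the ratio that gives the best AUC on the held-out set.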