Statistics for Policy Extraction via Online Q-Value Distillation