Statistics for Self-supervised Video Representation Learning by Exploiting Video Speed Changes