Statistics for Multi-Resolution and Asymmetric Implementation of Attention in Transformers