Which activation function is most commonly used in hidden layers of deep neural networks to mitigate the vanishing gradient problem?