A test developer wants to assess the reliability of a new anxiety scale.
They administer 'Form A' of the scale to a group of participants.
One week later, they administer 'Form B,' a different set of questions designed to measure the same construct, to the same group.
They then correlate the scores from Form A and Form B.
This procedure is designed to evaluate which type of reliability?
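
The correlation step in the procedure above can be sketched as follows. This is only a minimal illustration: the scores for five participants are invented, and the names `form_a`, `form_b`, and `pearson_r` are hypothetical and not part of the scenario. It assumes the Pearson product-moment correlation is the coefficient used, which the scenario does not specify.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each form's mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each form
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a form has no variance")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: one entry per participant, in the same order on both forms
form_a = [12, 18, 25, 31, 40]  # scores at the first administration
form_b = [14, 17, 27, 29, 41]  # scores one week later on the other form

r = pearson_r(form_a, form_b)
print(f"Correlation between Form A and Form B: {r:.3f}")
```

A coefficient close to 1 would indicate that participants keep roughly the same rank order across the two forms and the one-week interval.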