Measurement provides structure, removes chaos, reduces waste, ensures open and fair markets, supports precision where required and saves lives, money and time.
Some highly accurate balances can give false results if they are not placed upon a completely level surface, so this calibration process is the best way to avoid such errors.
In the non-physical sciences, the definition of an instrument is much broader, encompassing everything from a set of survey questions to an intelligence test. A survey to measure reading ability in children must produce reliable and consistent results if it is to be taken seriously.
Political opinion polls, on the other hand, are notorious for producing inaccurate results and delivering a nearly unworkable margin of error.
In the physical sciences, it is possible to isolate a measuring instrument from external factors, such as environmental conditions and temporal factors. In the social sciences, this is much more difficult, so any instrument must be tested to confirm that its reliability falls within a reasonable range.
TEST OF STABILITY
Any test of instrument reliability must establish how stable the instrument is over time, ensuring that the same test performed upon the same individual gives exactly the same results.
The test-retest method is one way of ensuring that any instrument is stable over time.
Of course, there is no such thing as perfection and there will always be some disparity and potential for regression, so statistical methods are used to determine whether the stability of the instrument is within acceptable limits.
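One common statistical check for the test-retest method is the correlation between the scores from the first and second sittings. The sketch below is illustrative only: the scores are invented, the function name pearson_r is hypothetical, and the 0.8 cut-off is an assumed convention rather than a figure taken from this article.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: this divides by zero if either list has no variation at all.
    return cov / sqrt(var_x * var_y)


# Invented scores for the same five people, tested on two occasions.
first_sitting = [12, 15, 9, 20, 17]
second_sitting = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(first_sitting, second_sitting)

ACCEPTABLE_R = 0.8  # assumed threshold; the acceptable limit depends on the field
is_stable = test_retest_r >= ACCEPTABLE_R
print(f"test-retest r = {test_retest_r:.3f}, stable: {is_stable}")
```

A correlation close to 1 suggests that people keep roughly the same rank order between sittings. It does not show that their scores are identical, so a systematic improvement on the second sitting would need a separate check.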
TEST OF EQUIVALENCE
Testing equivalence involves ensuring that a test administered to two people, or similar tests administered at the same time, give similar results.
Split-testing is one way of ensuring this, especially in tests or observations where the results are expected to change over time. In a school exam, for example, giving the same test to the same subjects will generally produce better results the second time around, so testing stability is not practical.
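One common reading of split-testing is the split-half approach: a single test is divided into two halves, each half is scored separately, and the two sets of scores are correlated. The sketch below assumes that reading, splits the items into odd and even positions, and applies the Spearman-Brown correction to estimate the reliability of the full-length test. The response data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Estimate full-test reliability from an odd/even split of the items.

    item_scores holds one row per respondent and one column per item.
    """
    # Total each respondent's score on the 1st, 3rd, 5th... items
    # and, separately, on the 2nd, 4th, 6th... items.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown correction: each half is only half as long as the
    # full test, so the raw correlation understates full-test reliability.
    return 2 * r_half / (1 + r_half)


# Invented data: five respondents, six items, 1 = correct and 0 = incorrect.
responses = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1],
]

reliability = split_half_reliability(responses)
print(f"split-half reliability = {reliability:.3f}")
```

Because both halves are taken in one sitting, the practice effect described above does not come into play.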
Checking that two researchers observe similar results also falls within the remit of the test of equivalence.
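Agreement between two observers is often summarised with Cohen's kappa, which compares the observed agreement with the agreement expected by chance alone. The sketch below is a minimal illustration of that statistic, not a method prescribed by this article. The observations and the category labels are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each category, multiply the two raters'
    # proportions of use, then sum over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented data: two researchers record whether a child is on or off task
# during the same ten observation periods.
researcher_1 = ["on", "on", "off", "on", "off", "off", "on", "off", "on", "on"]
researcher_2 = ["on", "on", "off", "off", "off", "off", "on", "off", "on", "off"]

kappa = cohens_kappa(researcher_1, researcher_2)
print(f"Cohen's kappa = {kappa:.3f}")
```

A kappa of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. Which values count as acceptable is a matter of convention in each field.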