Our linkage quality
The CHeReL uses a method called probabilistic linkage, which is designed to keep the number of incorrect links small. The linkage models are designed so that no more than about 5 in every 1,000 records are linked by mistake. For example, in a dataset of 100,000 people, around 500 records may contain linkage errors. The CHeReL also aims to miss as few links as possible. However, missing or incomplete information, such as names or dates of birth, can increase the number of missed links.
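The arithmetic behind the example above can be checked with a short calculation. This is only an illustrative sketch: the function name and its parameters are hypothetical, and the rate of 5 per 1,000 is the design target quoted above, not a measured result for any particular dataset.

```python
def expected_linkage_errors(n_records: int, errors_per_thousand: int = 5) -> float:
    """Estimate how many records may contain a linkage error.

    Assumes the design target of 5 incorrect links per 1,000 records
    applies evenly across the whole dataset.
    """
    return n_records * errors_per_thousand / 1000


# The dataset size used in the example above.
print(expected_linkage_errors(100_000))
```

For 100,000 records this gives 500, matching the figure quoted above.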
Our linkage and quality assurance process
Probabilistic linkage software works by giving a linkage weight to pairs of records. This weight shows how likely it is that two records belong to the same person.
Records that match closely on details such as first name, last name, date of birth, and address receive a high linkage weight. Records that match on fewer details receive a lower linkage weight. A high weight means the records are likely to be a true match, while a low weight means they are unlikely to match.
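As a rough illustration of how a linkage weight can be built up from field comparisons, the sketch below uses the common textbook approach of adding a positive score for each field that agrees and a negative score for each field that disagrees. It is not the CHeReL's actual software or model. The field names, the probabilities in FIELD_PROBS, and the function linkage_weight are all assumptions made for this example.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = chance the field agrees when two records belong to the same person
#   u = chance the field agrees when they belong to different people
FIELD_PROBS = {
    "first_name": (0.95, 0.01),
    "last_name": (0.95, 0.005),
    "date_of_birth": (0.97, 0.001),
    "address": (0.80, 0.002),
}


def linkage_weight(record_a: dict, record_b: dict) -> float:
    """Sum agreement and disagreement scores across the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        value_a = record_a.get(field)
        value_b = record_b.get(field)
        if value_a is None or value_b is None:
            # Simplifying assumption: a missing value adds no evidence either way.
            continue
        if value_a == value_b:
            total += math.log2(m / u)              # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


person = {"first_name": "Ana", "last_name": "Lee",
          "date_of_birth": "1980-02-01", "address": "1 High St"}
same_details = dict(person)
moved_house = dict(person, address="9 Low Rd")

# The pair that agrees on every field gets the higher weight.
print(linkage_weight(person, same_details) > linkage_weight(person, moved_house))
```

In this sketch a record with a missing field simply has less evidence to draw on, which is one way to see why incomplete information makes true matches harder to identify.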
Some record pairs have linkage weights that fall in the middle. These pairs are neither clearly true matches nor clearly false matches.
To manage this, the CHeReL uses two cut-off points, a lower one and an upper one.
Records with weights between the two cut-offs are checked manually by trained staff. This process is called clerical review.
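The role of the two cut-offs can be sketched as a simple three-way decision. This is an illustration under stated assumptions, not the CHeReL's implementation: the function classify_pair, the cut-off values 5.0 and 15.0, and the labels are hypothetical. The sketch also assumes that pairs at or above the upper cut-off are accepted as links and pairs at or below the lower cut-off are treated as non-links without manual checking.

```python
def classify_pair(weight: float, lower_cutoff: float, upper_cutoff: float) -> str:
    """Sort a record pair into one of three groups based on its linkage weight."""
    if lower_cutoff > upper_cutoff:
        raise ValueError("lower_cutoff must not exceed upper_cutoff")
    if weight >= upper_cutoff:
        return "link"             # assumed: accepted without manual checking
    if weight <= lower_cutoff:
        return "non-link"         # assumed: rejected without manual checking
    return "clerical review"      # between the cut-offs: checked by trained staff


# Hypothetical cut-off values, chosen only for this example.
LOWER, UPPER = 5.0, 15.0

for weight in (22.4, 9.1, -3.0):
    print(weight, classify_pair(weight, LOWER, UPPER))
```

Moving the cut-offs closer together sends fewer pairs to clerical review but leaves more uncertain pairs to be decided automatically. Moving them apart does the opposite.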