fuzzylink: Probabilistic Record Linkage Using Pretrained Text Embeddings
Links datasets through fuzzy string matching using pretrained text embeddings. Produces more accurate record linkage when lexical string distance metrics are a poor guide to match quality (e.g., "Patricia" is more lexically similar to "Patrick" than it is to "Trish"). Capable of performing multilingual record linkage. Methods are described in Ornstein (2025) <doi:10.1017/pan.2025.10016>.
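The "Patricia" example can be checked with a minimal sketch in base R. It does not call fuzzylink itself; it uses `adist()` from the `utils` package as a stand-in for a lexical string distance metric, and the variable names are illustrative only.

```r
# Illustration only: base R edit distances, not the fuzzylink package itself.
# utils::adist() computes the generalized Levenshtein (edit) distance.
target <- "Patricia"
candidates <- c("Patrick", "Trish")

# Number of single-character insertions, deletions, and substitutions
# needed to turn `target` into each candidate (case-sensitive by default).
edit_dist <- as.vector(adist(target, candidates))
names(edit_dist) <- candidates
print(edit_dist)

# The candidate a purely lexical matcher would prefer.
lexical_best <- candidates[which.min(edit_dist)]
print(lexical_best)
```

Under this metric "Patrick" is 2 edits from "Patricia" while "Trish" is 6, so a matcher relying on edit distance alone would pick the wrong name. This is the kind of case the description says embedding-based matching is meant to handle.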
| Version: | 0.2.5 | 
| Depends: | R (≥ 4.1.0) | 
| Imports: | stats, utils, dplyr, Rfast, reshape2, stringdist, stringr, httr, jsonlite, httr2, ranger | 
| Published: | 2025-08-29 | 
| DOI: | 10.32614/CRAN.package.fuzzylink | 
| Author: | Joe Ornstein [aut, cre, cph] | 
| Maintainer: | Joe Ornstein <jornstein at uga.edu> | 
| BugReports: | https://github.com/joeornstein/fuzzylink/issues | 
| License: | MIT + file LICENSE | 
| URL: | https://github.com/joeornstein/fuzzylink | 
| NeedsCompilation: | no | 
| Materials: | README, NEWS | 
| CRAN checks: | fuzzylink results | 
Linking:
Please use the canonical form https://CRAN.R-project.org/package=fuzzylink to link to this page.