Fuzzy string matching

http://www.r-bloggers.com/fuzzy-string-matching-a-survival-skill-to-tackle-unstructured-information/

Fuzzy string matching basically rephrases the YES/NO question “Are string A and string B the same?” as “How similar are string A and string B?” To quantify that similarity (usually expressed as a “distance”, where a smaller value means the strings are more alike), the research community has been proposing new methods for decades. One of the earliest and most popular is the Levenshtein distance, which is also the one R implements natively in the utils package (adist).
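To make the YES/NO versus "how similar" contrast concrete, here is a minimal sketch using base R's adist. The two sample strings and the rescaling to a 0-1 similarity score are illustrative assumptions, not something taken from the original post.

```r
# adist() lives in the utils package, which is attached by default.
a <- "kitten"
b <- "sitting"

# Exact comparison: a plain YES/NO answer.
same <- identical(a, b)

# Fuzzy comparison: the Levenshtein distance, i.e. the minimal number of
# single-character insertions, deletions and substitutions needed to
# turn a into b. adist() returns a matrix, so take its single cell.
d <- adist(a, b)[1, 1]

# Assumption: one common (but not the only) way to rescale the distance
# into a similarity score between 0 (nothing in common) and 1 (identical).
similarity <- 1 - d / max(nchar(a), nchar(b))

print(same)
print(d)
print(similarity)
```

A distance of 0 means the strings are identical; the larger the value, the more edits separate them.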

Mark van der Loo released a package called stringdist, which adds other popular fuzzy string matching methods and which we are going to use in our example below...
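As a rough idea of what calling the package looks like, here is a minimal sketch. It is not the original post's example: the sample strings are made up, and it assumes stringdist has been installed from CRAN.

```r
# install.packages("stringdist")  # once, if the package is not installed
library(stringdist)

# Hypothetical sample strings: the second swaps the last two characters.
a <- "fuzzy"
b <- "fuzyz"

# Levenshtein distance: a swap of adjacent characters costs two edits.
lv <- stringdist(a, b, method = "lv")

# Optimal string alignment (the package default): an adjacent
# transposition counts as a single edit.
osa <- stringdist(a, b, method = "osa")

# Jaro-Winkler family; with the default p = 0 this is the plain Jaro
# distance, scaled between 0 (identical) and 1 (completely different).
jw <- stringdist(a, b, method = "jw")

print(c(lv = lv, osa = osa, jw = jw))
```

Which method fits best depends on the data: edit-based distances such as "lv" and "osa" suit typos, while "jw" is often used for short strings such as names.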

La donnée intelligente
