Python - Using SequenceMatcher.ratio() to find similarity between two strings


Python tip:

You can use difflib.SequenceMatcher.ratio() to measure the similarity between two strings:

  • T - total number of elements in both strings (len(first_string) + len(second_string))
  • M - number of matching elements

Similarity = 2.0 * M / T -> between 0.0 and 1.0 (1.0 if the sequences are identical, and 0.0 if they have nothing in common)

https://docs.python.org/3/library/difflib.html#sequencematcher-objects

For example:

from difflib import SequenceMatcher

first = "Jane"
second = "John"

print(SequenceMatcher(a=first, b=second).ratio())
# => 0.5
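
To see where that number comes from, the formula can be reproduced by hand. This is a small sketch, not how you would normally call the library: it assumes that M can be obtained by summing the sizes of the blocks returned by get_matching_blocks(), and the variable names (matcher, matches, total, manual_ratio) are made up for illustration.

```python
from difflib import SequenceMatcher

first = "Jane"
second = "John"

matcher = SequenceMatcher(a=first, b=second)

# get_matching_blocks() returns (a, b, size) triples; the last one is a
# zero-size sentinel, so it adds nothing to the sum.
blocks = matcher.get_matching_blocks()
matches = sum(block.size for block in blocks)  # M: "J" and "n" match
total = len(first) + len(second)               # T: 4 + 4

manual_ratio = 2.0 * matches / total
print(matches, total, manual_ratio)
# => 2 8 0.5
print(manual_ratio == matcher.ratio())
# => True
```

Only "J" and "n" appear in both strings in the same order, so M = 2 and T = 8, which gives 2.0 * 2 / 8 = 0.5.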