EDIT DISTANCE AND LEVENSHTEIN EDIT DISTANCE

EDIT DISTANCE AND LEVENSHTEIN EDIT DISTANCE







What is Edit Distance ?

The minimum edit distance between two string is the minimum number of editing operations needed to transform one string into another string. In edit distance each operation has cost of 1. There are two strings given. The first string is the source string and the second string is the target string.

Editing Operations are :

  • Insert a character
  • Delete a character
  • Replace a character
The time complexity for all three of these operations is O(1).

Different types of edit distance

  • Levenshtein Distance
    • Operations Allowed : Deletion, Insertion, Substitution
  • Longest Common Subsequence
    • Operations Allowed : Deletion, Insertion
  • Demerau Levenshtein Distance 
    • Operations Allowed : Insertion, Deletion, Substitution, Transportation
  • Hamming Distance
    • Operation Allowed : Substitution
  • Jaro Distance 
    • Operation Allowed : Transportation

SOURCE CODE

Edit Distance

def editdistance(str1, str2, m, n):
if m==0:
return n
if n==0:
return m
if str1[m-1] == str2[n-1]:
return editdistance(str1, str2, m-1, n-1)
return 1+ min(editdistance(str1, str2, m, n-1),
editdistance(str1, str2, m-1, n),
editdistance(str1, str2, m-1, n-1)
)

str1 = input("Enter string1: ")
str2 = input("Enter string2: ")
print("Edit Distance is: ",editdistance(str1, str2, len(str1), len(str2)))

OUTPUT




Levenshtein Edit Distance

import numpy as np
def levenshteinDistance(seq1, seq2):
size_x = len(seq1) + 1
size_y = len(seq2) + 1
matrix = np.zeros((size_x, size_y))

for x in range(size_x):
matrix[x, 0] = x
for y in range(size_y):
matrix[0, y] = y
for x in range(1, size_x):
for y in range(1, size_y):
if seq1[x-1]==seq2[y-1]:
matrix[x, y]=min(
matrix[x-1, y] + 1,
matrix[x-1, y-1],
matrix[x, y-1] + 1
)
else:
matrix[x, y]=min(
matrix[x-1, y] + 1,
matrix[x-1, y-1] + 1,
matrix[x, y-1] + 1
)
print(matrix)
return (matrix[size_x - 1, size_y - 1])
levenshteinDistance("cat","dog")

OUTPUT





Comments