jump to navigation

Levenshtein distance – An optimized version 15 Gennaio 2008

Posted by fripp in Algoritmi, C, Informatica, Matematica, Programmazione.
Tags:
1 comment so far

We can adapt the algorithm to use less space, O(m) instead of O(mn), since it only requires that the previous row and current row be stored at any one time.
This is the second version of the algorithm


int levenshtein_distance(char *x, char *y){
	int m=strlen(x);
	int n=strlen(y);

	register int i,j;
        int distance;
	int *prev=malloc((n+1)*sizeof(int));
	int *curr=malloc((n+1)*sizeof(int));
	int *tmp=0;

	for(i=0;i<=n;i++)
		prev[i]=i;

	for(i=1;i<=m;i++){
		curr[0]=i;
		for(j=1;j<=n;j++){
			if(x[i-1]!=y[j-1]){
				int k=min(curr[j-1],prev[j-1],prev[j]);
				curr[j]=k+1;
			}else{
				curr[j]=prev[j-1];
			}
		}

		tmp=prev;

		prev=curr;

		curr=tmp;

		memset((void*)curr,0,sizeof(int)*(n+1));

	}

	distance=prev[n];

	free(curr);
	free(prev);

	return distance;
}