String matching-1

String matching

Introduction 

String matching is used to fined a pattern inside a text. The pattern and text should always be like this, |text| > |pattern| .

There are some algorithms that we can use to do string matching.

  • Naive (brute-force) algorithm
  • Knuth Morris Pratt (KMP)
  • Boyer Moor algorithm 
  • Rabin-Karp algorithm

1.Naive algorithm

 This algorithm simply test all the possible placements of Pattern relative to Text
and Consists of two nested loops.

n  = size of the Text
m = size of the pattern

one loop in this algorithm runs (n-m) times and other one m times.


T[] = a b b  c d e
P[] = b b c

first match T[0] and P[0] 
in above example first items do not match. because of that shift the pattern by 1 and match again.

T[] = a b b c d e
P[] =    b b c

after shifting T[1] = P[0] 
because of that next try to match second item in pattern 
T[2] = P[1]
because that two elements mach too next compare last item in the pattern.
T[3] = P[2]
because that matches return the matching point

Like this until the end of the Text we can fined the occurrences of the pattern.

Time complexity 

In the worst case scenario 

                  T[] = a a a a a a a a a
                  P[] = a a a

in this example we get a matching pair every time.because of that we have to search each and every item with the pattern.

for this              O((n-m)(m))
                         O(nm-m.m)
                         O(nm)           (n >> m)

we can also take following example as worst case scenario 

                 T[] = a a a a a a a a a a
                 P[] = a a b

In best case scenario 

                 T[] = a a a a a a a b
                 P[] = b a a

in this case there is no matches what so ever

for this              O(n-m)
                         O(n)               (n >> m)

pseudo code

for i in range (0 , n-m-1) {
   matchcount = 0;
   for j in range (0 , m-1) {
       if T[i+j] == P[j]{
           matchcount++ ;
       else 
           break ;
       }
   }
}


implementation in C

#include<stdio.h>
#include<string.h>

int main()
{
int count = 0, m,n,i,j;
char pattern[256], text[256];

printf("\nEnter the pattern: ");
scanf("%s", pattern);
printf("\nEnter the text: ");
scanf("%s", text);

m = strlen(pattern);
n = strlen(text);

for(i=0; i< n-m+1; i++)
{
   for(j=0; j<m; j++)
   {
      if(text[i+j] == pattern[j])
      count++ ;
   }
   if(count == m)
   {
      printf("\nMatching point: %d",i);
   }
   count = 0;
}
return 0;
}       

No comments:

Post a Comment

String matching-2

2.Rabin Karp algorithm The Rabin-Karp algorithm is a string searching algorithm that uses hashing to find a substring in a text.in this a...