Python Program for Rabin-Karp Algorithm

Rabin-Karp is a string searching algorithm that uses hashing to find patterns in a text efficiently.

It is especially useful when searching for multiple patterns or repeated queries in large texts.

1. Understanding the Problem

Given a text and a pattern, find all occurrences of the pattern in the text.

Input: text = "abcabcabc", pattern = "abc"
Output: 0, 3, 6

2. Method 1: Basic Rabin-Karp Implementation

Python
Basic hashing approach
def rabin_karp(text, pattern):
    n = len(text)
    m = len(pattern)
    pattern_hash = hash(pattern)
    
    for i in range(n - m + 1):
        substring = text[i:i+m]
        if hash(substring) == pattern_hash:
            if substring == pattern:
                print("Pattern found at index", i)

text = input("Enter text: ")
pattern = input("Enter pattern: ")
rabin_karp(text, pattern)

This method uses Python's built-in hash function to compare substrings.

3. Method 2: Rolling Hash (Optimized)

Python
Rolling hash method
def rabin_karp(text, pattern):
    d = 256
    q = 101
    n = len(text)
    m = len(pattern)
    h = pow(d, m-1) % q
    p = 0
    t = 0

    for i in range(m):
        p = (d*p + ord(pattern[i])) % q
        t = (d*t + ord(text[i])) % q

    for i in range(n - m + 1):
        if p == t:
            if text[i:i+m] == pattern:
                print("Pattern found at index", i)

        if i < n - m:
            t = (d*(t - ord(text[i])*h) + ord(text[i+m])) % q
            if t < 0:
                t += q

text = input()
pattern = input()
rabin_karp(text, pattern)

Rolling hash avoids recomputing hash from scratch, improving efficiency.

4. Method 3: Returning Indices

Python
Return positions
def find_pattern(text, pattern):
    result = []
    n = len(text)
    m = len(pattern)

    for i in range(n - m + 1):
        if text[i:i+m] == pattern:
            result.append(i)

    return result

print(find_pattern("abcabcabc", "abc"))

This version returns all matching indices.

5. Method 4: Multiple Pattern Matching

Python
Multiple patterns
def multi_pattern_search(text, patterns):
    for pattern in patterns:
        print(f"Searching for {pattern}:")
        for i in range(len(text) - len(pattern) + 1):
            if text[i:i+len(pattern)] == pattern:
                print("Found at", i)

multi_pattern_search("abcabcabc", ["abc", "bca"])

Rabin-Karp can be extended to handle multiple patterns efficiently.

6. Method 5: Handling Edge Cases

Python
Edge cases
text = input().strip()
pattern = input().strip()

if not pattern:
    print("Empty pattern")
elif len(pattern) > len(text):
    print("No match")
else:
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i+len(pattern)] == pattern:
            print(i)

Handles empty patterns and invalid cases safely.

7. Algorithm

1. Compute hash of pattern.

2. Compute hash of first window of text.

3. Slide window over text.

4. Compare hash values.

5. If match, verify substring.

6. Repeat until end.

8. Common Mistakes

1. Ignoring hash collisions.

2. Incorrect rolling hash update.

3. Not verifying substring after hash match.

4. Choosing poor hash base or modulus.

9. Applications

1. Plagiarism detection.

2. DNA sequence matching.

3. Search engines.

4. Pattern recognition systems.

Conclusion

Rabin-Karp is a powerful string matching algorithm that uses hashing for efficient searching.

The rolling hash technique makes it much faster than naive approaches for large inputs.