Python Program for Rabin-Karp Algorithm
Rabin-Karp is a string searching algorithm that uses hashing to find patterns in a text efficiently.
It is especially useful when searching for multiple patterns or repeated queries in large texts.
1. Understanding the Problem
Given a text and a pattern, find all occurrences of the pattern in the text.
Input: text = "abcabcabc", pattern = "abc" Output: 0, 3, 6
2. Method 1: Basic Rabin-Karp Implementation
def rabin_karp(text, pattern):
n = len(text)
m = len(pattern)
pattern_hash = hash(pattern)
for i in range(n - m + 1):
substring = text[i:i+m]
if hash(substring) == pattern_hash:
if substring == pattern:
print("Pattern found at index", i)
text = input("Enter text: ")
pattern = input("Enter pattern: ")
rabin_karp(text, pattern)
This method uses Python's built-in hash function to compare substrings.
3. Method 2: Rolling Hash (Optimized)
def rabin_karp(text, pattern):
d = 256
q = 101
n = len(text)
m = len(pattern)
h = pow(d, m-1) % q
p = 0
t = 0
for i in range(m):
p = (d*p + ord(pattern[i])) % q
t = (d*t + ord(text[i])) % q
for i in range(n - m + 1):
if p == t:
if text[i:i+m] == pattern:
print("Pattern found at index", i)
if i < n - m:
t = (d*(t - ord(text[i])*h) + ord(text[i+m])) % q
if t < 0:
t += q
text = input()
pattern = input()
rabin_karp(text, pattern)
Rolling hash avoids recomputing hash from scratch, improving efficiency.
4. Method 3: Returning Indices
def find_pattern(text, pattern):
result = []
n = len(text)
m = len(pattern)
for i in range(n - m + 1):
if text[i:i+m] == pattern:
result.append(i)
return result
print(find_pattern("abcabcabc", "abc"))
This version returns all matching indices.
5. Method 4: Multiple Pattern Matching
def multi_pattern_search(text, patterns):
for pattern in patterns:
print(f"Searching for {pattern}:")
for i in range(len(text) - len(pattern) + 1):
if text[i:i+len(pattern)] == pattern:
print("Found at", i)
multi_pattern_search("abcabcabc", ["abc", "bca"])
Rabin-Karp can be extended to handle multiple patterns efficiently.
6. Method 5: Handling Edge Cases
text = input().strip()
pattern = input().strip()
if not pattern:
print("Empty pattern")
elif len(pattern) > len(text):
print("No match")
else:
for i in range(len(text) - len(pattern) + 1):
if text[i:i+len(pattern)] == pattern:
print(i)
Handles empty patterns and invalid cases safely.
7. Algorithm
1. Compute hash of pattern.
2. Compute hash of first window of text.
3. Slide window over text.
4. Compare hash values.
5. If match, verify substring.
6. Repeat until end.
8. Common Mistakes
1. Ignoring hash collisions.
2. Incorrect rolling hash update.
3. Not verifying substring after hash match.
4. Choosing poor hash base or modulus.
9. Applications
1. Plagiarism detection.
2. DNA sequence matching.
3. Search engines.
4. Pattern recognition systems.
Conclusion
Rabin-Karp is a powerful string matching algorithm that uses hashing for efficient searching.
The rolling hash technique makes it much faster than naive approaches for large inputs.
Codecrown