Python Implementation of Burrows-Wheeler Transform (BWT)
The Burrows-Wheeler Transform (BWT) is a string transformation technique that rearranges characters to improve compression efficiency.
It is commonly used in compression systems, genome sequencing, and search indexing.
1. Problem Overview
Transform a string into its BWT form and reconstruct it back to the original string.
Input: banana$ Output: annb$aa
2. Rotation-Based Approach
def bwt_transform(text):
rotations = [text[i:] + text[:i] for i in range(len(text))]
rotations.sort()
return ''.join(row[-1] for row in rotations)
print(bwt_transform("banana$"))
This method generates all rotations, sorts them, and extracts the last column.
3. Suffix Array Approach
def bwt_with_suffix_array(text):
suffixes = sorted(range(len(text)), key=lambda i: text[i:])
return ''.join(text[i - 1] for i in suffixes)
print(bwt_with_suffix_array("banana$"))
This approach avoids full rotation storage and is more memory-efficient.
4. Inverse Burrows-Wheeler Transform
def reverse_bwt(bwt):
table = [""] * len(bwt)
for _ in range(len(bwt)):
table = sorted([bwt[i] + table[i] for i in range(len(bwt))])
for row in table:
if row.endswith('$'):
return row
print(reverse_bwt("annb$aa"))
This method rebuilds the original string by iteratively sorting partial results.
5. Combined Transformation and Recovery
def bwt_pipeline(text):
transformed = bwt_transform(text)
restored = reverse_bwt(transformed)
return transformed, restored
print(bwt_pipeline("banana$"))
6. Handling Edge Cases
text = input().strip()
if not text:
print("")
else:
if '$' not in text:
text += '$'
print(bwt_transform(text))
Ensures correct input format and safe execution.
7. Algorithm Steps
Forward BWT:
1. Generate cyclic rotations
2. Sort them lexicographically
3. Extract last column
Inverse BWT:
1. Initialize empty table
2. Prepend characters iteratively
3. Sort at each step
4. Identify row ending with '$'
8. Common Errors
1. Missing end marker ($)
2. Incorrect rotation logic
3. Sorting mistakes
4. Errors in inverse reconstruction
9. Applications
1. File compression systems
2. Genome sequencing
3. Text indexing
4. Pattern searching
Conclusion
The Burrows-Wheeler Transform is a foundational technique that improves compression efficiency when combined with other algorithms.
Mastering BWT strengthens understanding of advanced string processing concepts.
Codecrown