Learn how to compare two text files (or strings) in Python using the built-in difflib module to generate human-readable differences, similar to the Unix diff command.
Sample Text Files
file1.txt (original):
Line one
Line two
Line three
Line four
Line five
file2.txt (modified):
Line one
Line SECOND
Line three
Line FOURTH
Line five
Line six
result:
| Line | Original | Line | Modified |
|---|---|---|---|
| 1 | Line one | 1 | Line one |
| 2 | Line two | 2 | Line SECOND |
| 3 | Line three | 3 | Line three |
| 4 | Line four | 4 | Line FOURTH |
| 5 | Line five | 5 | Line five |
| | | 6 | Line six |
1. Using difflib.HtmlDiff for Visual HTML Output
Generate a side-by-side HTML diff — great for reports or web display.
import difflib
# Read both files as lists of lines
with open('file1.txt') as f1:
    text1 = f1.readlines()
with open('file2.txt') as f2:
    text2 = f2.readlines()
# Create a complete HTML page containing the side-by-side diff
differ = difflib.HtmlDiff()
html_diff = differ.make_file(text1, text2, fromdesc='Original', todesc='Modified')
# Save to an HTML file; make_file declares utf-8 as the page charset by default
with open('diff.html', 'w', encoding='utf-8') as f:
    f.write(html_diff)
print("HTML diff saved to diff.html")
Result: diff.html, when opened in a browser, shows a colored, side-by-side comparison table highlighting insertions, deletions, and changes.
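When the diff should be embedded in an existing page, or the inputs are long, a few HtmlDiff options can help. The sketch below is illustrative only: the sample lines are inlined so it runs without the files, and the values chosen for wrapcolumn and numlines are arbitrary.

```python
import difflib

# Sample lines inlined so the sketch runs without file1.txt / file2.txt.
old_lines = ["Line one", "Line two", "Line three", "Line four", "Line five"]
new_lines = ["Line one", "Line SECOND", "Line three", "Line FOURTH",
             "Line five", "Line six"]

# wrapcolumn wraps long lines inside the table cells.
differ = difflib.HtmlDiff(wrapcolumn=60)

# make_table returns only the <table> markup, without the surrounding
# HTML page that make_file generates.
# context=True hides unchanged regions; numlines sets how many
# unchanged lines are kept around each change.
table_html = differ.make_table(
    old_lines, new_lines,
    fromdesc="Original", todesc="Modified",
    context=True, numlines=1,
)
```

The table relies on CSS classes defined in the page that make_file produces, so a page embedding table_html needs to supply its own styles for the highlighting to appear.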
2. Using difflib.unified_diff for Textual Diff
Produces a standard unified diff format (like git diff).
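import difflib
# splitlines() drops the trailing newlines, which pairs with lineterm=''
with open('file1.txt') as f1:
    text1 = f1.read().splitlines()
with open('file2.txt') as f2:
    text2 = f2.read().splitlines()
diff = difflib.unified_diff(
    text1, text2,
    fromfile='file1.txt',
    tofile='file2.txt',
    lineterm=''
)
# Every diff line is newline-free, so join them with '\n'
print('\n'.join(diff))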
Output:
--- file1.txt
+++ file2.txt
@@ -1,5 +1,6 @@
 Line one
-Line two
+Line SECOND
 Line three
-Line four
+Line FOURTH
 Line five
+Line six
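The same call works on in-memory strings. As a minimal sketch, the hypothetical helper unified_text below (not part of difflib) wraps unified_diff and exposes its n parameter, which sets how many unchanged context lines surround each change:

```python
import difflib


def unified_text(old: str, new: str, fromfile: str = "old",
                 tofile: str = "new", n: int = 3) -> str:
    """Return a unified diff of two strings as one string ('' if equal)."""
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=fromfile, tofile=tofile,
        n=n,          # number of unchanged context lines around each change
        lineterm="",  # inputs carry no newlines, so emit none either
    )
    return "\n".join(diff)


old = "Line one\nLine two\nLine three\nLine four\nLine five\n"
new = "Line one\nLine SECOND\nLine three\nLine FOURTH\nLine five\nLine six\n"

patch = unified_text(old, new, "file1.txt", "file2.txt")
print(patch)
```

An empty return value means the inputs are identical, which makes the helper usable as a quick equality check in scripts.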
3. Using difflib.context_diff
Shows context around changes (older format).
import difflib
# splitlines() drops the trailing newlines, which pairs with lineterm=''
with open('file1.txt') as f1:
    text1 = f1.read().splitlines()
with open('file2.txt') as f2:
    text2 = f2.read().splitlines()
diff = difflib.context_diff(
    text1, text2,
    fromfile='file1.txt',
    tofile='file2.txt',
    lineterm=''
)
print('\n'.join(diff))
Output:
*** file1.txt
--- file2.txt
***************
*** 1,5 ****
  Line one
! Line two
  Line three
! Line four
  Line five
--- 1,6 ----
  Line one
! Line SECOND
  Line three
! Line FOURTH
  Line five
+ Line six
4. Quick String Comparison with Differ
For in-memory strings or quick checks:
import difflib
text1 = "Hello world\nThis is a test\n"
text2 = "Hello world\nThis is a TEST\nExtra line\n"
d = difflib.Differ()
diff = list(d.compare(text1.splitlines(keepends=True), text2.splitlines(keepends=True)))
print(''.join(diff))
Output:
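  Hello world
- This is a test
+ This is a TEST
+ Extra line
The changed pair is reported as a plain removal and addition, with no "? " guide lines, because the two lines fall just below the similarity level at which Differ marks intraline changes.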
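Differ can also point at the changed characters: when a removed line and an added line are similar enough, it emits "? " guide lines beneath them. A minimal sketch, using illustrative strings that are not taken from the sample files:

```python
import difflib

d = difflib.Differ()

# These two lines share most of their characters, so Differ pairs them
# and emits "? " guide lines under the characters that changed.
close = list(d.compare(["The quick brown fox\n"], ["The quick brown cat\n"]))
print("".join(close))

# Lines starting with "? " are hints only; filter them out when
# post-processing the diff.
changes = [line for line in close if not line.startswith("? ")]
```

Filtering on the two-character prefix works because each line Differ yields starts with one of "- ", "+ ", "  ", or "? ".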
Performance Notes
- difflib is part of the standard library, so there are no external dependencies.
- The functions above compare line by line; for character-level diffs within a line, consider SequenceMatcher.
- unified_diff and context_diff return generators, so their output is produced lazily; the input lines still have to be loaded into memory first.
- HtmlDiff is well suited to visual reporting but generates much larger output.
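For the character-level case mentioned above, a minimal sketch with SequenceMatcher, applied to one changed line from the sample files:

```python
import difflib

old_line = "Line four"
new_line = "Line FOURTH"

# The first argument is an optional "junk" predicate; None disables it.
matcher = difflib.SequenceMatcher(None, old_line, new_line)

# Each opcode is (tag, i1, i2, j1, j2): old_line[i1:i2] relates to
# new_line[j1:j2], with tag one of 'equal', 'replace', 'delete', 'insert'.
ops = matcher.get_opcodes()
for tag, i1, i2, j1, j2 in ops:
    print(f"{tag:8} {old_line[i1:i2]!r} -> {new_line[j1:j2]!r}")

# ratio() is a similarity score between 0.0 and 1.0.
similarity = matcher.ratio()
```

Here the shared prefix "Line " is reported as equal and the rest as a single replace, since the comparison is case-sensitive.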
Use HtmlDiff for presentations, unified_diff for scripts/logs, and Differ for custom processing. These tools make file comparison simple and readable!