How to Match Either the End of a String or a Specific Character in Python Regex

To match either the end of a string ($) or a specific character with a regular expression in Python, use a pattern such as (?:char|$). This is useful when parsing structured text, extracting substrings, or validating input formats.

Regex Pattern: `(?:char|$)`

To match either a specific character or the end of the string, use the alternation (|) operator inside a non-capturing group ((?:...)):

(?:X|$)

Here, X is the specific character you want to match, and $ ensures that we also match the end of the string.
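As a minimal sketch of how the two alternatives behave, the snippet below uses a comma as X. The sample strings are hypothetical and only serve to show which branch of the group matches.

```python
import re

# Hypothetical inputs: one contains the delimiter, one does not.
pattern = re.compile(r"(?:,|$)")

with_comma = pattern.search("alpha,beta")
without_comma = pattern.search("alpha")

# The comma branch matches one character at index 5.
print(with_comma.span())     # (5, 6)

# No comma is present, so $ matches the empty position at the end.
print(without_comma.span())  # (5, 5)
```

The second match is zero-width: $ asserts a position and consumes no characters.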

Explanation

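The breakdown below describes the pattern .*?(?:&|$) with the g (global) and m (multi line) flags, written in regex-tester notation:

/.*?(?:&|$)/ gm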

. matches any character (except for line terminators)
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Non-capturing group (?:&|$)
- 1st Alternative &
  & matches the character & with index 38 in decimal (26 in hexadecimal, 46 in octal) literally (case sensitive)
- 2nd Alternative $
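  $ asserts position at the end of a line (with the m flag; without it, Python matches $ at the end of the string or just before a trailing newline)
Global pattern flags
g modifier: global. All matches (don't return after first match). In Python, re.findall and re.finditer return all matches.
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string). In Python, pass re.MULTILINE.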
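To illustrate how the m flag maps to Python, here is a small sketch comparing the same pattern with and without re.MULTILINE. The two-line input is hypothetical, and the pattern uses [^&\n]+ instead of .*? (an assumption made here to avoid empty matches in the output).

```python
import re

text = "a=1&b=2\nc=3"  # hypothetical two-line input
pattern = r"[^&\n]+(?:&|$)"

# Default: $ matches only at the end of the string (or before a final newline),
# so "b=2" is not matched because it is followed by a newline mid-string.
default = re.findall(pattern, text)

# re.MULTILINE: $ also matches at the end of every line.
multiline = re.findall(pattern, text, flags=re.MULTILINE)

print(default)    # ['a=1&', 'c=3']
print(multiline)  # ['a=1&', 'b=2', 'c=3']
```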

1: Split a String at `&` or the End

If we want to split a string at each ampersand (&) or at the end, this regex works:

import re

text1 = "https://example.com/search?q=test&b=1&c=2&d=3"
text2 = "https://example.com/search?q=test"
text3 = "https://example.com/search?q=test&b=1&c=2&d=3"

reg = r".*?(?:&|$)"
matches1 = re.findall(reg, text1)
matches2 = re.findall(reg, text2)
matches3 = re.match(reg, text3)

print(matches1)
print(matches2)
print(matches3)

result:

['https://example.com/search?q=test&', 'b=1&', 'c=2&', 'd=3', '']
['https://example.com/search?q=test', '']
<re.Match object; span=(0, 34), match='https://example.com/search?q=test&'>

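In the example above we extract different parts of the URL based on the matching condition: an ampersand or the end of the string. Note that re.findall also returns a trailing empty string, because the pattern can match zero characters at the end of the input, and that re.match returns only the first match.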
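If the delimiter and the empty trailing match are not wanted, one possible variation (not part of the pattern above) is to capture the text before the delimiter and require at least one character:

```python
import re

text = "https://example.com/search?q=test&b=1&c=2&d=3"

# ([^&]+) captures one or more non-& characters; the non-capturing group
# then consumes the & or matches the end of the string.
# With a single capturing group, re.findall returns only the captured text.
parts = re.findall(r"([^&]+)(?:&|$)", text)

print(parts)
# ['https://example.com/search?q=test', 'b=1', 'c=2', 'd=3']
```

Because [^&]+ needs at least one character, no empty match is produced at the end of the string.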

2: Extract Words Before `.` or End

To extract each segment that ends with a period (.) or with the end of the string, use:

import re

text1 = "hello.world.example"
text2 = "hello world"
matches1 = re.findall(r"[^.]+(?:\.|$)", text1)
matches2 = re.findall(r"[^.]+(?:\.|$)", text2)

print(matches1)
print(matches2)

result:

['hello.', 'world.', 'example']
['hello world']
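When it matters which alternative ended a segment, a possible sketch is to turn the non-capturing group into a capturing one and inspect it with re.finditer. The variable names here are illustrative only.

```python
import re

# Group 1 is the segment, group 2 is the terminator ("." or "" for end of string).
pattern = re.compile(r"([^.]+)(\.|$)")

results = []
for m in pattern.finditer("hello.world.example"):
    word, terminator = m.group(1), m.group(2)
    ended_by = "period" if terminator == "." else "end of string"
    results.append((word, ended_by))

print(results)
# [('hello', 'period'), ('world', 'period'), ('example', 'end of string')]
```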

3: Match Until `:` or End

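To match a substring up to, but not including, a colon (:) or the end of the string, use a lookahead (?=:|$) instead of the non-capturing group, so that the colon is checked but not consumed:

import re

text1 = "username:password"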
text2 = "username-password"
match1 = re.search(r"^.*?(?=:|$)", text1)
match2 = re.search(r"^.*?(?=:|$)", text2)

print(match1.group())
print(match2.group())

result:

username
username-password
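The difference between the consuming group (?:...) and the lookahead (?=...) can be seen side by side in this short sketch, which reuses the same input string:

```python
import re

text = "username:password"

# (?::|$) is a non-capturing group whose first alternative is a literal colon;
# the colon becomes part of the match.
consuming = re.search(r"^.*?(?::|$)", text)

# (?=:|$) only asserts that a colon or the end follows; nothing is consumed.
lookahead = re.search(r"^.*?(?=:|$)", text)

print(consuming.group())  # username:
print(lookahead.group())  # username
```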

4: Extract Everything Between HTML Tags

For example, we can extract everything between the <a> and </a> tags. The regex below matches the opening and closing tags of a link and captures the text between them in group 1:

<a\b[^>]*>(.*?)</a>
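A minimal usage sketch in Python follows. The HTML fragment is hypothetical and kept on one line for simplicity.

```python
import re

# Hypothetical HTML fragment used only for illustration.
html = (
    '<p>See <a href="/docs">the docs</a> or '
    '<a class="ext" href="https://example.com">this site</a>.</p>'
)

# \b keeps <a ...> from matching other tags that start with "a", such as <abbr>;
# (.*?) lazily captures the text up to the nearest </a>.
link_texts = re.findall(r"<a\b[^>]*>(.*?)</a>", html)

print(link_texts)  # ['the docs', 'this site']
```

If the link text can span several lines, passing flags=re.DOTALL lets . match newlines as well. For anything beyond simple, well-formed fragments, an HTML parser is generally a more robust choice than a regex.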

Conclusion

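Using (?:char|$), we can match either a specific character or the end of a string with a single pattern, which makes regex patterns more flexible for text processing. When the delimiter should not be part of the match, use the lookahead form (?=char|$) instead.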
