Python 3 offers several ways to remove spaces from a string, such as:
- words.strip()
- words.replace(" ", "")
- re.sub(r"\s+", "", words, flags=re.UNICODE)
Which one to use depends on what you need:
- remove leading and ending spaces
- remove all spaces
- remove duplicated spaces
- dealing with encoding
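The encoding point mostly comes down to which characters count as whitespace. The following is a minimal sketch, assuming the input contains a non-breaking space (U+00A0); the sample string is made up for illustration:

```python
import re

# Hypothetical sample: a non-breaking space (U+00A0) on each side of the text.
text = "\u00a0Python is fast\u00a0"

# str.strip() with no arguments removes Unicode whitespace, including U+00A0.
stripped = text.strip()

# str.replace(" ", "") only touches the ASCII space character.
replaced = text.replace(" ", "")

# For str patterns, \s already matches Unicode whitespace in Python 3,
# so flags=re.UNICODE is redundant; re.ASCII narrows \s to ASCII whitespace.
unicode_sub = re.sub(r"\s+", "", text)
ascii_sub = re.sub(r"\s+", "", text, flags=re.ASCII)

print(repr(stripped))
print(repr(replaced))
print(repr(unicode_sub))
print(repr(ascii_sub))
```

Printing with repr() makes the surviving non-breaking spaces visible as \xa0.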
Python 3 Remove spaces at the start and end of a string
You have two options if you want to get rid of the leading and trailing spaces. The first is the built-in string method strip:
words = ' Python is powerful... and fast; '
print(words.strip())
result:
Python is powerful... and fast;
You can achieve the same result with a regular expression (use a raw string so that \s is not treated as a string escape):
import re
new_words = re.sub(r"^\s+|\s+$", "", words, flags=re.UNICODE)
print(new_words)
result:
Python is powerful... and fast;
If you want to remove only the spaces at the beginning you can do:
import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"^\s+", "", words, flags=re.UNICODE)
print(new_words)
result:
Python is powerful... and fast;
Or only at the end of the string:
import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"\s+$", "", words, flags=re.UNICODE)
print(new_words)
result:
Python is powerful... and fast;
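The one-sided cases also have non-regex counterparts: the built-in string methods lstrip and rstrip. A short sketch, using a sample string with two spaces on each side so the effect is visible:

```python
# Sample input with two leading and two trailing spaces.
words = "  Python is powerful... and fast;  "

left_trimmed = words.lstrip()   # removes leading whitespace only
right_trimmed = words.rstrip()  # removes trailing whitespace only

# repr() shows the quotes, so the remaining spaces are easy to see.
print(repr(left_trimmed))
print(repr(right_trimmed))
```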
Python 3 Remove all spaces from a string
Again you have two options if you want to remove every space from a string. The first is the built-in string method replace:
words = ' Python is powerful... and fast; '
print(words.replace(' ', ''))
result:
Pythonispowerful...andfast;
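The same can also be achieved with a regex. Note that \s matches any whitespace character (tabs and newlines too), not only the space character: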
import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"\s+", "", words, flags=re.UNICODE)
print(new_words)
result:
Pythonispowerful...andfast;
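The two approaches only differ when the string contains whitespace other than plain spaces. A small sketch, assuming an input with a tab and a newline (the sample string is illustrative), which also shows a non-regex way to drop all whitespace:

```python
import re

# Illustrative input with a tab and a newline in addition to plain spaces.
words = " Python\tis powerful...\nand fast; "

# replace(" ", "") removes only the space character; tab and newline survive.
only_spaces = words.replace(" ", "")

# split() with no argument splits on any run of whitespace,
# so joining the pieces with "" removes all of it.
all_whitespace = "".join(words.split())

# The regex version gives the same result as split/join here.
regex_version = re.sub(r"\s+", "", words)

print(repr(only_spaces))
print(repr(all_whitespace))
print(repr(regex_version))
```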
Python 3 Remove consecutive spaces
The last example shows how to collapse runs of consecutive spaces into a single space. Again you have two options. The first splits the string on whitespace and joins the words back together with single spaces:
words = ' Python is powerful... and fast; '
print(" ".join(words.split()))
result:
Python is powerful... and fast;
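The second uses a regular expression and works slightly differently: it splits on runs of whitespace and joins the pieces with a single space. Because re.split returns an empty string at each end when the input starts or ends with whitespace, the difference between the two is that one leading and one trailing space are preserved in the second case: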
import re
words = ' Python is powerful... and fast; '
new_words = " ".join(re.split(r"\s+", words, flags=re.UNICODE))
print(new_words)
result:
Python is powerful... and fast;
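The difference at the ends of the string is easier to see with repr(). A minimal sketch comparing the two approaches, plus re.sub as a third variant that is not in the examples above; the sample string uses several consecutive spaces for illustration:

```python
import re

# Sample input with runs of spaces inside and at both ends.
words = "  Python   is  powerful...  and fast;  "

# str.split() discards leading and trailing whitespace entirely.
split_join = " ".join(words.split())

# re.split() yields '' at each end, so one space is kept on each side.
regex_split = " ".join(re.split(r"\s+", words))

# re.sub() replaces every run of whitespace with a single space.
regex_sub = re.sub(r"\s+", " ", words)

print(repr(split_join))
print(repr(regex_split))
print(repr(regex_sub))
```

Calling strip() on either regex result gives the same string as the split/join version.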
If you care about performance in the regex versus non-regex case, you can check the following result:
import cProfile
import re

# Non-regex version: split on whitespace, join with single spaces.
def before():
    for i in range(1, 1000000):
        words = ' Python is powerful... and fast; '
        " ".join(words.split())

# Regex version: split on runs of whitespace, join with single spaces.
def after():
    for i in range(1, 1000000):
        words = ' Python is powerful... and fast; '
        new_words = " ".join(re.split(r"\s+", words, flags=re.UNICODE))

cProfile.run('before()')
cProfile.run('after()')
result:
- non regex - 0.555 seconds
- regex - 2.890 seconds
The full profiler output for both versions is shown below. The regex is much more customizable, but at a cost in performance (roughly five times slower in this run):
2000002 function calls in 0.555 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.555    0.555 <string>:1(<module>)
        1    0.247    0.247    0.555    0.555 ProfilingSimple.py:3(before)
        1    0.000    0.000    0.555    0.555 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   999999    0.103    0.000    0.103    0.000 {method 'join' of 'str' objects}
   999999    0.205    0.000    0.205    0.000 {method 'split' of 'str' objects}

4000205 function calls (4000202 primitive calls) in 2.890 seconds
Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.890    2.890 <string>:1(<module>)
        1    0.398    0.398    2.890    2.890 ProfilingSimple.py:9(after)
       34    0.000    0.000    0.000    0.000 enum.py:267(__call__)
       34    0.000    0.000    0.000    0.000 enum.py:517(__new__)
        3    0.000    0.000    0.000    0.000 enum.py:797(__or__)
       14    0.000    0.000    0.000    0.000 enum.py:803(__and__)
   999999    0.292    0.000    2.366    0.000 re.py:204(split)
   999999    0.261    0.000    0.261    0.000 re.py:286(_compile)
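The profile shows that part of the regex cost comes from the pattern lookup (re.py _compile) on every call. The sketch below is one way to re-measure, assuming you want to compile the pattern once and use timeit instead of cProfile, since a profiler adds overhead to every function call. The names WORDS, WHITESPACE, with_split_join and with_compiled_regex are made up for this example, and no timing results are claimed here:

```python
import re
import timeit

WORDS = "  Python   is  powerful...  and fast;  "
WHITESPACE = re.compile(r"\s+")  # compiled once, outside the timed code

def with_split_join():
    # Non-regex version: split on whitespace, join with single spaces.
    return " ".join(WORDS.split())

def with_compiled_regex():
    # Regex version: collapse whitespace runs, then trim both ends.
    return WHITESPACE.sub(" ", WORDS).strip()

# number is kept small here; raise it for steadier measurements.
split_seconds = timeit.timeit(with_split_join, number=100_000)
regex_seconds = timeit.timeit(with_compiled_regex, number=100_000)

print(f"split/join:     {split_seconds:.3f} s")
print(f"compiled regex: {regex_seconds:.3f} s")
```

Both functions return the same string, so the comparison is between equivalent operations; the actual numbers will vary by machine and Python version.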