This tutorial shows how to find and validate phone numbers in Python using simple examples. We will review the most popular phone number formats.
A practical option for searching and validating data such as phone numbers, zip codes, and identifiers is a regular expression (Regex).
Next, we'll see examples that find, extract, or validate phone numbers in a given text or string. The article starts with easy examples and finishes with advanced ones.
Step 1: Find Simple Phone Number
Let's suppose that we need to validate simple phone number formats which don't change. For example:
- 000-000-000
re.findall(r"[\d]{3}-[\d]{3}-[\d]{3}", text)
- 000 000 000
re.findall(r"[\d]{3} [\d]{3} [\d]{3}", text)
The goal is to find all matches for the pattern.
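As a minimal sketch of the two patterns above in use, the snippet below runs them against a sample sentence. The text and the variable names are invented for illustration only.

```python
import re

# Sample text made up for this illustration.
text = "Office: 123-456-789, mobile: 987 654 321, fax: n/a"

# Same patterns as above: three groups of three digits,
# separated by hyphens or by single spaces.
dashed = re.findall(r"[\d]{3}-[\d]{3}-[\d]{3}", text)
spaced = re.findall(r"[\d]{3} [\d]{3} [\d]{3}", text)

print(dashed)  # ['123-456-789']
print(spaced)  # ['987 654 321']
```

re.findall returns a list of all non-overlapping matches, so an empty list means that no number in the given format was found.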
This solution is very basic and does not work for numbers like:
- 9000-000-0009 - it will find only 000-000-000
- 000 000 000 - not matched by the hyphen pattern
- (000)000-000 - not matched by either pattern
To deal with those cases, the Regex should be improved.
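The limitations of the hyphen pattern can be checked directly. This is a small sketch; the variable names are chosen here for illustration.

```python
import re

pattern = r"[\d]{3}-[\d]{3}-[\d]{3}"

# A longer number still produces a partial match.
partial = re.findall(pattern, "9000-000-0009")
# Spaces instead of hyphens: nothing is found.
spaces = re.findall(pattern, "000 000 000")
# Parentheses around the first group: nothing is found.
parens = re.findall(pattern, "(000)000-000")

print(partial)  # ['000-000-000']
print(spaces)   # []
print(parens)   # []
```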
Step 2: Regex for Phone Numbers with a Plus
Phone numbers are often displayed with a plus sign, like:
- +000-000-000
This format is matched by the following regular expression:
re.findall(r"\+?[\d]{3}-[\d]{3}-[\d]{3}", text)
Note that this will catch:
- 000-000-000
- +000-000-000
but also 5000-000-0004, which will be extracted as 000-000-000. In the next step we will solve this problem.
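A short sketch of this behavior, using a sample sentence invented for the example:

```python
import re

# \+? makes the leading plus sign optional.
pattern = r"\+?[\d]{3}-[\d]{3}-[\d]{3}"
text = "Call 111-222-333 or +444-555-666; ticket 5000-000-0004"

found = re.findall(pattern, text)
print(found)  # ['111-222-333', '+444-555-666', '000-000-000']
```

The last item is the unwanted partial match taken from the middle of 5000-000-0004.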
Step 3: Validate Phone Numbers for Exact Match
If the format is important and only exact matches are needed, like 000-000-000 and +000-000-000 but not 5000-000-0004 or +000-000-0004, then we need to add word boundaries to our Regex by adding \b at the start and the end:
re.findall(r"\+?\b[\d]{3}-[\d]{3}-[\d]{3}\b", text)
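To see the effect of the word boundaries, the sketch below runs the pattern on a sentence that mixes valid and invalid numbers. The sample text is an assumption made for this example.

```python
import re

# \b requires a transition between a word character (digit, letter, underscore)
# and a non-word character, so digits glued to the number block the match.
pattern = r"\+?\b[\d]{3}-[\d]{3}-[\d]{3}\b"
text = "ok: 000-000-000 and +111-222-333; bad: 5000-000-0004 and +000-000-0004"

exact = re.findall(pattern, text)
print(exact)  # ['000-000-000', '+111-222-333']
```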
Next let's see a more generic example which covers international and country phone numbers.
Step 4: Validate International Phone Number
It's difficult to find and test international numbers with 100% accuracy. Analysis of the data might be needed in order to check what formats are present in the text.
One possible solution for validation of international numbers is:
re.match(r"^[\+\(]?\d+(?:[- \)\(]+\d+)+$", phone)
Another regular expression is:
re.match(r"^[\+\d]?(?:[\d\-.\s()]*)$", phone)
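The two expressions behave quite differently, which the following sketch tries to show. The names STRICT, LOOSE and check are hypothetical, and the hyphen in the second pattern is escaped here because Python's re module rejects a range that starts with \d.

```python
import re

# First pattern: needs at least one separator between groups of digits.
STRICT = re.compile(r"^[\+\(]?\d+(?:[- \)\(]+\d+)+$")
# Second pattern, with the hyphen escaped inside the character class
# (an adjustment made for this sketch so that the pattern compiles).
LOOSE = re.compile(r"^[\+\d]?(?:[\d\-.\s()]*)$")

def check(phone):
    """Hypothetical helper: tell which of the two patterns accept the string."""
    return STRICT.match(phone) is not None, LOOSE.match(phone) is not None

print(check("+44 7222 000 000"))  # (True, True)
print(check("+447222000000"))     # (False, True)
print(check(""))                  # (False, True)
print(check("call me"))           # (False, False)
```

The first pattern rejects numbers written without separators, while the second one is so permissive that it accepts an empty string. This is one reason to analyze the real data before choosing a pattern.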
Step 5: Validate US, UK, and French Phone Numbers
Let's start with US phone numbers. Formats such as:
- (000)000-0000
- 000-000-0000
- (000) 000-0000
can be validated with the following Regex:
re.match(r"^(\([0-9]{3}\) ?|[0-9]{3}-)[0-9]{3}-[0-9]{4}$", phone)
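A minimal sketch that applies the US pattern to a few strings. The list of samples and the names US and us_results are chosen for illustration.

```python
import re

US = re.compile(r"^(\([0-9]{3}\) ?|[0-9]{3}-)[0-9]{3}-[0-9]{4}$")

samples = ["(000)000-0000", "000-000-0000", "(000) 000-0000",
           "000 000 0000", "0000000000"]

# True when the whole string is in one of the three supported formats.
us_results = [US.match(s) is not None for s in samples]
print(us_results)  # [True, True, True, False, False]
```

Numbers separated only by spaces, or written without separators, are rejected by this pattern.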
UK or GB numbers like:
- +447222000000
- +44 7222 000 000
can be searched and validated by:
^(?:0|\+?44)\s?(?:\d\s?){9,11}$
Another possible solution for UK numbers is:
^(\+44\s?7\d{3}|\(?07\d{3}\)?)\s?\d{3}\s?\d{3}(\s?\#(\d{4}|\d{3}))?$
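As an illustration, the first UK pattern can be tried on several inputs. The sample values and the names UK and uk_results are assumptions for this sketch.

```python
import re

# Prefix 0, 44 or +44, followed by 9 to 11 digits with optional spaces.
UK = re.compile(r"^(?:0|\+?44)\s?(?:\d\s?){9,11}$")

samples = ["+447222000000", "+44 7222 000 000", "07222 000000",
           "+44 7222", "+1 202 555 0100"]

uk_results = {s: UK.match(s) is not None for s in samples}
for number, ok in uk_results.items():
    print(number, ok)
```

The first three samples are accepted. The fourth is too short, and the last one has a different country code, so both are rejected.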
The following simple Regex will work for French numbers:
^(?:(?:\+|00)33|0)\s*[\d](?:[\s.-]*\d{2}){4}$
It matches numbers such as:
- 00 00 00 00 00
- +33 0 00 00 00 00
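The French pattern can be checked in the same way. This is a sketch with made-up sample numbers; the names FR and fr_results are hypothetical.

```python
import re

# Prefix +33, 0033 or 0, then one digit and four pairs of digits,
# optionally separated by spaces, dots or hyphens.
FR = re.compile(r"^(?:(?:\+|00)33|0)\s*[\d](?:[\s.-]*\d{2}){4}$")

samples = ["01 23 45 67 89", "+33 1 23 45 67 89", "0033123456789",
           "01.23.45.67.89", "01 23 45 67"]

fr_results = [FR.match(s) is not None for s in samples]
print(fr_results)  # [True, True, True, True, False]
```

The last sample is rejected because it has only three pairs of digits after the leading digit.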
Step 6: Find Phone Numbers in Different Formats
If you would like to build a Regex that finds various formats, you can try the following one:
[\+\d]?(\d{2,3}[-\.\s]??\d{2,3}[-\.\s]??\d{4}|\(\d{3}\)\s*\d{3}[-\.\s]??\d{4}|\d{3}[-\.\s]??\d{4})
The one above will cover most phone numbers but will not work for all.
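One detail to keep in mind is that this pattern contains a capturing group, so re.findall would return only the text of that group and could drop a leading plus sign or digit. The sketch below uses re.finditer and group(0) to get the full matches instead. The sample sentence is invented for the example.

```python
import re

pattern = re.compile(
    r"[\+\d]?(\d{2,3}[-\.\s]??\d{2,3}[-\.\s]??\d{4}"
    r"|\(\d{3}\)\s*\d{3}[-\.\s]??\d{4}"
    r"|\d{3}[-\.\s]??\d{4})"
)

text = "Call (555) 123-4567 or 555-987-6543, ext. 555-0199."

# group(0) is the whole match, including the optional first character.
numbers = [m.group(0) for m in pattern.finditer(text)]
print(numbers)  # ['(555) 123-4567', '555-987-6543', '555-0199']
```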
If validation is important, or additional features are needed, such as:
- updates for new formats/countries/regions
- geographical information related to a phone number
- timezone information
then we recommend using a mature library. A good example is libphonenumber, Google's Java, C++ and JavaScript library for parsing, formatting, and validating international phone numbers. It is available in Python through the phonenumbers port.
Conclusion
Keep in mind that Regex are powerful, but complex ones may cause performance issues. Try to use simple and understandable Regex. Sometimes you may need to play with flags, such as multiline, in order to make a pattern work properly:
/^[\(]?0(\d{9})$/mg
Another important note is about using:
- start and end anchors - ^ and $
- word boundaries - \b
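The difference between the two can be sketched as follows. In Python the multiline flag is re.MULTILINE, and the sample strings are invented for this example.

```python
import re

lines = "000-000-000\nnot a phone\n+111-222-333"
anchored = r"^\+?\d{3}-\d{3}-\d{3}$"

# Without a flag, ^ and $ refer to the start and end of the whole string.
no_flag = re.findall(anchored, lines)
# With re.MULTILINE, ^ and $ also match at the start and end of each line.
per_line = re.findall(anchored, lines, flags=re.MULTILINE)

# Word boundaries find a number in the middle of running text.
in_text = re.findall(r"\b\d{3}-\d{3}-\d{3}\b", "id 000-000-000 here")

print(no_flag)   # []
print(per_line)  # ['000-000-000', '+111-222-333']
print(in_text)   # ['000-000-000']
```

Anchors suit validation of a whole string or line, while word boundaries suit extraction from a larger text.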
We covered most cases of phone validation using Python and Regex.