Java regex match abbreviations: 5 examples
In this tutorial, you will see several regex examples, such as ([A-Z]{2,})+,
for extracting abbreviations in Java.
The patterns themselves can also be used in other languages such as Python; only the string escaping differs.
Java regex match: HP, IBM
Matching abbreviations made of two or more capital letters:
[A-Z] - match one capital letter
{2,} - repeat two or more times
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "Extract only abbreviations like HP and IBM";
Pattern reString = Pattern.compile("([A-Z]{2,})+"); // match runs of two or more capital letters
Matcher matchString = reString.matcher(str);
while (matchString.find()) {
    System.out.println(matchString.group()); // print each match found in the text
}
result:
HP
IBM
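The loop above can be wrapped in a small reusable method. The sketch below is only an illustration: the class name UppercaseAbbreviations and the method findAll are hypothetical and not part of the original examples. It also suggests that, for the matched text, the outer group and + in ([A-Z]{2,})+ are not needed, since [A-Z]{2,} already covers any run of two or more capital letters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class UppercaseAbbreviations {

    // Returns every match of the given regex in the text, in order of appearance.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Extract only abbreviations like HP and IBM";
        // Pattern used in the example above.
        System.out.println(findAll("([A-Z]{2,})+", text)); // prints [HP, IBM]
        // Simplified pattern without the outer group.
        System.out.println(findAll("[A-Z]{2,}", text));    // prints [HP, IBM]
    }
}
```

Returning a list instead of printing makes the result easier to reuse, for example to count or deduplicate the abbreviations.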
Java regular expression matching: HP, IBM with a word boundary
You can match only capital letters that end at a word boundary with:
[A-Z]+ - match a sequence of one or more capital letters
\\b - require a word boundary at the end of the match
Note that + also accepts a single capital letter, so a standalone "I" would match as well.
String strA = "Extract only abbreviations like HP and IBM";
Pattern reStringA = Pattern.compile("[A-Z]+\\b"); // match capital letters followed by a word boundary
Matcher matchStringA = reStringA.matcher(strA);
while (matchStringA.find()) {
    System.out.println(matchStringA.group()); // print each match found in the text
}
result:
HP
IBM
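To see what the boundary does and does not prevent, the following sketch compares [A-Z]+\\b with a stricter variant, \\b[A-Z]{2,}\\b. The stricter pattern, the sample sentence, and the class name BoundaryComparison are assumptions added for illustration; they do not come from the original examples.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing two boundary patterns.
class BoundaryComparison {

    // Collects all matches of the regex in the text.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "I met the CEO of eBAY";
        // Boundary only at the end: single letters and word tails also match.
        System.out.println(findAll("[A-Z]+\\b", text));       // prints [I, CEO, BAY]
        // Boundary on both sides and at least two letters.
        System.out.println(findAll("\\b[A-Z]{2,}\\b", text)); // prints [CEO]
    }
}
```

Which variant is preferable depends on the text: the stricter one skips single capitals and capitals inside mixed-case words, but it also rejects names such as "eBAY" entirely.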
Java regex match: Hp, Ibm
Matching abbreviations that start with a capital letter followed by a few lowercase letters:
\\b - word boundary (used at both ends of the pattern)
[A-Z] - match one capital letter
[a-z]{1,3} - match from 1 to 3 lowercase letters
String strB = "Extract only abbreviations like Hp and Ibm";
Pattern reStringB = Pattern.compile("\\b[A-Z][a-z]{1,3}\\b"); // match a capital letter followed by 1 to 3 lowercase letters
Matcher matchStringB = reStringB.matcher(strB);
while (matchStringB.find()) {
    System.out.println(matchStringB.group()); // print each match found in the text
}
result:
Hp
Ibm
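This pattern cannot tell an abbreviation from any other short capitalized word. The sketch below shows that on a different sentence and then narrows the result with a list of known abbreviations. The sample sentence, the allow-list, and the class name MixedCaseAbbreviations are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample data are illustrative only.
class MixedCaseAbbreviations {

    private static final Pattern SHORT_WORD = Pattern.compile("\\b[A-Z][a-z]{1,3}\\b");

    // Returns all capitalized words of 2 to 4 letters.
    static List<String> findCandidates(String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = SHORT_WORD.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    // Keeps only the candidates that appear in the allow-list.
    static List<String> findKnown(String text, Set<String> known) {
        List<String> matches = findCandidates(text);
        matches.retainAll(known);
        return matches;
    }

    public static void main(String[] args) {
        String text = "The Dept of Hp and Ibm";
        System.out.println(findCandidates(text)); // prints [The, Dept, Hp, Ibm]

        // Assumed allow-list of abbreviations we actually care about.
        Set<String> known = new HashSet<>(Arrays.asList("Hp", "Ibm"));
        System.out.println(findKnown(text, known)); // prints [Hp, Ibm]
    }
}
```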
Java regular expression: finding abbreviations ending with a dot (Hp., Ibm.)
If you want to find all abbreviations in a text that end with a dot, you can use:
([A-Z][a-z]+\\.){1,}
String strC = "Extract only abbreviations like Hp. and Ibm.";
Pattern reStringC = Pattern.compile("([A-Z][a-z]+\\.){1,}"); // match abbreviations ending with a dot
Matcher matchStringC = reStringC.matcher(strC);
while (matchStringC.find()) {
    System.out.println(matchStringC.group()); // print each match found in the text
}
result:
Hp.
Ibm.
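The {1,} quantifier (equivalent to +) matters only when several dotted abbreviations follow each other without a space. The following sketch illustrates this with a sample string that is an assumption, not taken from the original examples; the class name DottedAbbreviations is likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing the effect of the {1,} repetition.
class DottedAbbreviations {

    // Collects all matches of the regex in the text.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Contact Dr.Prof. Smith";
        // With repetition: chained abbreviations are returned as one match.
        System.out.println(findAll("([A-Z][a-z]+\\.){1,}", text)); // prints [Dr.Prof.]
        // Without repetition: each abbreviation is returned separately.
        System.out.println(findAll("[A-Z][a-z]+\\.", text));       // prints [Dr., Prof.]
    }
}
```

Keep in mind that both patterns also match a capitalized word at the end of a sentence, such as "Rome.", because it has the same shape as an abbreviation.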
Regex to get words containing dots: H.P., I.B.M.
Words made of single letters separated by dots can be matched with:
\\b(?:[a-zA-Z]\\.){2,}
(?: ... ) - non-capturing group: it groups the letter and the dot for repetition without storing them as a capture group
String strD = "Extract only abbreviations like H.P. and I.B.M.";
Pattern reStringD = Pattern.compile("\\b(?:[a-zA-Z]\\.){2,}"); // match dotted words such as H.P.
Matcher matchStringD = reStringD.matcher(strD);
while (matchStringD.find()) {
    System.out.println(matchStringD.group()); // print each match found in the text
}
result:
H.P.
I.B.M.
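If a text mixes several of these styles, the patterns can be combined with alternation (|). The sketch below joins the dotted pattern with an uppercase pattern; the combined expression, the sample sentence, and the class name CombinedAbbreviations are assumptions for illustration and should be adapted to the actual data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class combining two of the patterns from this tutorial.
class CombinedAbbreviations {

    // Dotted letters (H.P.) first, then plain runs of capitals (IBM).
    private static final Pattern ABBREVIATION =
            Pattern.compile("\\b(?:[a-zA-Z]\\.){2,}|\\b[A-Z]{2,}\\b");

    // Returns all abbreviations of either style, in order of appearance.
    static List<String> find(String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = ABBREVIATION.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(find("Contact H.P. or IBM or I.B.M.")); // prints [H.P., IBM, I.B.M.]
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which helps when many texts are scanned.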