Caret (^) usage in regex with Java examples

The caret anchor (^) matches at the beginning of a string. Therefore ^abc matches abc only when it appears at the beginning of the string.

Example 1 - ^ to match the start of a string

The program below matches ‘th’ at the beginning of a string. It also matches ‘Th’ because the pattern is compiled with the case-insensitive flag.

package devsought;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex5 {

    public static void main(String... args) {
        String str = "The chef is in the kitchen preparing a meal.";
        // ^ anchors the match to the start of the string;
        // CASE_INSENSITIVE lets "th" also match "Th".
        Pattern pattern = Pattern.compile("^th", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            System.out.println("Matched::" + matcher.group());
        } else {
            System.out.println("NO MATCH");
        }
    }
}

The above program outputs

Matched::Th

Explanation: Th in ‘The’ is matched since Th is at the beginning of our string. The second ‘the’ does not give a match since it is not at the start of our string. We can observe this if our string were ‘chef is in the kitchen preparing a meal.’, in which case the output would be NO MATCH.
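As a quick check of the explanation above, the sketch below runs the same pattern against both strings. The class name CaretStartDemo and the helper firstMatch are illustrative names chosen here, not part of the original program.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaretStartDemo {

    // Hypothetical helper: returns the first match of ^th (case insensitive),
    // or "NO MATCH" if the pattern is not found.
    static String firstMatch(String input) {
        Pattern pattern = Pattern.compile("^th", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group() : "NO MATCH";
    }

    public static void main(String... args) {
        // "Th" sits at the very start of the string, so ^th matches it.
        System.out.println(firstMatch("The chef is in the kitchen preparing a meal.")); // prints Th
        // Here "the" only appears mid-string, so the anchor prevents a match.
        System.out.println(firstMatch("chef is in the kitchen preparing a meal."));     // prints NO MATCH
    }
}
```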

Example 2 - ^ used outside the brackets, at the start of the regex, e.g. ^[...]+

In this scenario the regex tries to match, at the start of the string, one of the characters inside the brackets. For example, ^[abc]+ gets a match in abefgj, aacblkja, cbegh and b12c, but not in dabgh, gccb or 1abcc.

The table below shows inputs and matches for the regex ^[abc]+, which matches, at the start of the string, one or more occurrences (because of the + quantifier) of a, b or c.

Input string | Matched result | Explanation
abefgj | ab | The first a and b match at the start of the string. Matching stops at e, which is not defined in our regex.
aacblkja | aacb | The first two a’s match, followed by c and b. Matching stops at l, which is not defined in our regex.
cbegh | cb | c and b match as per our regex. The rest of the string does not match.
b12c | b | Only b matches.
dabgh | NO MATCH | d is not one of the characters we expect at the start of our string.
gccb | NO MATCH | g is not one of the characters we expect at the start of our string.
1abcc | NO MATCH | 1 is not one of the characters we expect at the start of our string.
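The rows of the table above can be reproduced with a short program. This is a minimal sketch; the class name CaretBeforeBracketsDemo and the helper firstMatch are illustrative names, not part of the original tutorial code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaretBeforeBracketsDemo {

    // ^ outside the brackets anchors the character class to the start of the string.
    private static final Pattern PATTERN = Pattern.compile("^[abc]+");

    // Hypothetical helper: returns the match at the start of the input, or "NO MATCH".
    static String firstMatch(String input) {
        Matcher matcher = PATTERN.matcher(input);
        return matcher.find() ? matcher.group() : "NO MATCH";
    }

    public static void main(String... args) {
        String[] inputs = {"abefgj", "aacblkja", "cbegh", "b12c", "dabgh", "gccb", "1abcc"};
        for (String input : inputs) {
            // Prints, for example, "abefgj -> ab" and "dabgh -> NO MATCH".
            System.out.println(input + " -> " + firstMatch(input));
        }
    }
}
```

Because of the anchor, find() can succeed at most once per input here, so a single call is enough.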

 

Example 3 - ^ used inside brackets as the first character, e.g. [ad23][^bc][e-z]+

In this scenario the ^ negates the character class, that is, it excludes the characters after it inside the brackets from being matched. The matched outputs in the table below illustrate this.

The regex [ad23][^bc][e-z]+ matches a substring that starts with a, d, 2 or 3, followed by a character that is not b or c, followed by one or more characters in the range e to z. Because there is no ^ anchor outside the brackets, a match can start anywhere in the string.

Input string | Matched result | Explanation
aafe2bh | aafe | The first a matches, the second a matches as it is not b or c, then f and e match since they are in the range e-z. Matching stops at 2 since 2 is not within the range e-z.
cad22 | NO MATCH | The first character c is not one of the expected starting characters. Attempts starting at a, d or either 2 also fail because they never reach a character in the range e-z.
dadiseatingbreadatthelounge | adise, ating and datthelounge | adise: the attempt starting at the first d fails because, although d and a fit the first two positions, the third character d is not in the range e-z. Starting at the second character, a matches the first position, d matches the second position since it is neither b nor c, and i, s and e match as they are in the range e-z. ating: the a after adise was not included in the previous match since it is not in the range e-z, but it can start a new match since a is one of the expected first characters. The next character t matches the second position since it is neither b nor c, and i, n and g match as they are in the range e-z. datthelounge: the previous match stops after the first g because the next character b is not in the range e-z. b is also not one of the starting characters, so it is skipped, as are the r and e that follow. The next a is skipped too: the d after it fits the second position, but the following a is not in the range e-z. That d (the third d) is an expected starting character, so it starts a new match; the following a passes as it is neither b nor c, and t and all the characters after it are in the range e-z, giving the third match.
23dibjilm | 3di | You might have expected the match to start at 2, but it does not. 2 matches as one of the expected first characters and 3 matches as it is neither b nor c. However, the third character d is not in the range e-z, so this attempt fails. We then move to the second character 3, which meets the expectations for the first position. The next character d matches as it is neither b nor c, and i matches as it is in the range e-z. Matching stops at b, which is not in the range e-z.
2dcfg2ak | 2ak | Exercise: using the logic acquired so far, try to walk through how our matcher arrived at the output 2ak.

 

Sample code below with output

package devsought;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex6 {

    public static void main(String... args) {
        String str = "aafe2bh";
        // [^bc] is a negated class: any single character except b or c.
        Pattern pattern = Pattern.compile("[ad23][^bc][e-z]+");
        Matcher matcher = pattern.matcher(str);
        boolean matchFound = false;
        // Print every non-overlapping match found in the input.
        while (matcher.find()) {
            System.out.println(matcher.group());
            matchFound = true;
        }
        if (!matchFound) {
            System.out.println("NO MATCH");
        }
    }
}

 

The output of the above is

aafe
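To check the other rows of the Example 3 table in one run, the matches can be collected into a list per input. This is a minimal sketch; the class name NegatedClassDemo and the helper allMatches are illustrative names, not part of the original sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {

    // ^ as the first character inside brackets negates the class: [^bc] is "not b and not c".
    private static final Pattern PATTERN = Pattern.compile("[ad23][^bc][e-z]+");

    // Hypothetical helper: collects every non-overlapping match, in order of appearance.
    static List<String> allMatches(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String... args) {
        String[] inputs = {"aafe2bh", "cad22", "dadiseatingbreadatthelounge", "23dibjilm", "2dcfg2ak"};
        for (String input : inputs) {
            List<String> matches = allMatches(input);
            // Prints, for example, "23dibjilm -> 3di" and "cad22 -> NO MATCH".
            System.out.println(input + " -> "
                    + (matches.isEmpty() ? "NO MATCH" : String.join(", ", matches)));
        }
    }
}
```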

Example 4 - ^ used inside brackets but not as the first character, e.g. [abc][@^12][\w]+

In this scenario the ^ loses its special meaning and is just a character to be matched like any other. The above regex matches any substring that starts with a, b or c, followed by @, ^, 1 or 2, and then one or more word characters.

 

 

Input string | Matched result | Explanation
ab^ccd | b^ccd | The first character a fits the first position, but the second character b fails because it is not in our list of expected second-position characters. We then start again at b, which meets the regex’s expectation for the first position. The next character ^ meets the expectation for the second position. From c to the end of the string, all characters match because we expect one or more word characters.
qd^2abc@ | NO MATCH | The first character q does not meet our expectations for the first position, so it fails. No later starting position satisfies the regex either (the final c@ fits the first two positions, but no word character follows), so we don’t get a match.
c^2abc@ | c^2abc | The first character c matches our regex, and so does the second character ^. The word characters 2, a, b and c then satisfy the last part. Matching stops at @ since it is not a word character.
abcab^12 | b^12 | Every attempt starting within abca fails because the character that follows is not @, ^, 1 or 2. The second b matches as our first character, followed by ^ as our second, as per the regex. 1 and 2 follow as they are word characters.
cb@k^bb | b@k | Exercise: using the logic acquired so far, try to walk through how our matcher arrived at the output b@k.
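The Example 4 table can be checked with a program along the same lines as the earlier samples. This is a minimal sketch; the class name LiteralCaretDemo and the helper allMatches are illustrative names introduced here. Note that in a Java string literal the backslash in \w has to be written as \\w.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralCaretDemo {

    // ^ is not the first character inside [@^12], so it is matched literally.
    private static final Pattern PATTERN = Pattern.compile("[abc][@^12][\\w]+");

    // Hypothetical helper: collects every non-overlapping match, in order of appearance.
    static List<String> allMatches(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String... args) {
        String[] inputs = {"ab^ccd", "qd^2abc@", "c^2abc@", "abcab^12", "cb@k^bb"};
        for (String input : inputs) {
            List<String> matches = allMatches(input);
            // Prints, for example, "ab^ccd -> b^ccd" and "qd^2abc@ -> NO MATCH".
            System.out.println(input + " -> "
                    + (matches.isEmpty() ? "NO MATCH" : String.join(", ", matches)));
        }
    }
}
```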

 

 

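Example 5 - ^ used as \^ inside brackets, e.g. [abc][@\^12][\w]+

This regex gives the same outcome as in Example 4 above. We have just added \ before the ^. In a Java string literal the backslash itself must be escaped, so the pattern is written as [abc][@\\^12][\\w]+ in the code.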

Sample code is shown below with the output generated

package devsought;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Named Regex7 here so it does not clash with the Regex6 class from Example 3.
public class Regex7 {

    public static void main(String... args) {
        String str = "ab^ccd";
        // \\^ in the Java literal is \^ in the regex: an escaped, literal caret.
        Pattern pattern = Pattern.compile("[abc][@\\^12][\\w]+");
        Matcher matcher = pattern.matcher(str);
        boolean matchFound = false;
        // Print every non-overlapping match found in the input.
        while (matcher.find()) {
            System.out.println(matcher.group());
            matchFound = true;
        }
        if (!matchFound) {
            System.out.println("NO MATCH");
        }
    }
}

The output generated is

b^ccd
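To confirm that the escaped and unescaped forms behave the same, both patterns can be run over the Example 4 inputs and their matches compared. This is a minimal sketch; the class name EscapedCaretDemo and its members are illustrative names, not part of the original sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedCaretDemo {

    // Same regex twice: once with a bare ^ in the class, once with an escaped \^.
    static final Pattern UNESCAPED = Pattern.compile("[abc][@^12][\\w]+");
    static final Pattern ESCAPED = Pattern.compile("[abc][@\\^12][\\w]+");

    // Hypothetical helper: collects every non-overlapping match of the given pattern.
    static List<String> allMatches(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String... args) {
        String[] inputs = {"ab^ccd", "qd^2abc@", "c^2abc@", "abcab^12", "cb@k^bb"};
        for (String input : inputs) {
            boolean same = allMatches(UNESCAPED, input).equals(allMatches(ESCAPED, input));
            // Prints "<input> -> same result: true" for each of these inputs.
            System.out.println(input + " -> same result: " + same);
        }
    }
}
```

Escaping the caret is optional when it is not the first character in the brackets, but it can make the intent clearer to readers.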

About the Author - John Kyalo Mbindyo (BSc Computer Science) is a Senior Application Developer currently working at NCBA Bank Group, Nairobi, Kenya. He is passionate about making programming tutorials and sharing his knowledge with other software engineers across the globe. You can learn more about him and follow him on GitHub.