Word (\w) boundary in regular expressions
In most regex flavors, a word boundary (written \b) is a zero-width position between a word character (\w) and a non-word character (\W). Word characters are in the range [a-zA-Z0-9_], i.e. the alphanumeric characters and the underscore (_).
A word boundary can match at the following positions.
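As a rough illustration of which characters count as word characters, here is a minimal Python sketch. The helper name is_word_char is hypothetical, and the re.ASCII flag is an assumption made so that \w is limited to [a-zA-Z0-9_]; without that flag, Python's \w also accepts Unicode letters and digits.

```python
import re

# re.ASCII restricts \w to [a-zA-Z0-9_]. Without it, Python's \w also
# matches Unicode letters and digits (an engine-specific detail).
WORD_CHAR = re.compile(r"\w", re.ASCII)

def is_word_char(ch):
    """Hypothetical helper: True if the single character ch is a word character."""
    return WORD_CHAR.fullmatch(ch) is not None

# Letters, digits and the underscore are word characters; the rest are not.
for ch in ["a", "Z", "7", "_", "@", " ", "%"]:
    print(repr(ch), is_word_char(ch))
```

Other regex engines may define \w slightly differently, so it is worth checking the documentation of the one in use.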
(a) Before the first character in the string, if the first character is a word character
Example 1 - in the string ‘cup’, a word boundary matches before ‘c’, since ‘c’ is the first character and is a word character. The word matched from that boundary is ‘cup’.
Example 2 - in the string ‘@myhandle’, the first word boundary is before ‘m’, not before ‘@’, since ‘@’ is not a word character. Here the boundary falls between ‘@’ and ‘m’ rather than at the very start of the string. The matched word is ‘myhandle’, since ‘m’ is the first word character.
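Examples 1 and 2 can be checked with a short Python sketch. The helper names boundary_positions and first_word are hypothetical, and the pattern \b\w+ is only one possible way to pick up the word that starts at the first boundary.

```python
import re

def boundary_positions(text):
    """Return every index in text at which a word boundary matches."""
    # r"\b" is zero-width, so each match has the same start and end index.
    return [m.start() for m in re.finditer(r"\b", text)]

def first_word(text):
    """Return the run of word characters starting at the first word boundary."""
    m = re.search(r"\b\w+", text)
    return m.group() if m else None

print(boundary_positions("cup"))        # [0, 3]
print(first_word("cup"))                # cup
print(boundary_positions("@myhandle"))  # [1, 9]
print(first_word("@myhandle"))          # myhandle
```

Index 0 is a boundary in ‘cup’ but not in ‘@myhandle’, where the first boundary is at index 1, just before ‘m’.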
(b) After the last character in the string, if the last character is a word character
Example 3 - in the string ‘cup’, a word boundary matches after ‘p’, since ‘p’ is the last character and is a word character. The matched word is ‘cup’.
Example 4 - in the string ‘cup_’, a word boundary matches after ‘_’, since ‘_’ is the last character and is a word character. The matched word is ‘cup_’.
Example 5 - in the string ‘cup24’, a word boundary matches after ‘4’, since ‘4’ is the last character and is a word character. The matched word is ‘cup24’.
Example 6 - in the string ‘cup24%’, a word boundary matches after ‘4’, since ‘4’ is the last word character and ‘%’ is not a word character. There is no boundary after ‘%’. The matched word is ‘cup24’.
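Examples 3 to 6 can be reproduced with the following Python sketch. The helper name word_ending_at_boundary is hypothetical, and \w+\b is just one pattern that stops at the closing boundary of the first word.

```python
import re

def word_ending_at_boundary(text):
    """Return (word, end_index) for the first word that ends at a word boundary."""
    # \w+ consumes word characters greedily; \b then requires a boundary there.
    m = re.search(r"\w+\b", text)
    return (m.group(), m.end()) if m else None

for s in ["cup", "cup_", "cup24", "cup24%"]:
    print(s, word_ending_at_boundary(s))
# cup ('cup', 3)
# cup_ ('cup_', 4)
# cup24 ('cup24', 5)
# cup24% ('cup24', 5)
```

For ‘cup24%’ the end index is 5, not 6: the boundary sits between ‘4’ and ‘%’, and there is none after ‘%’.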
(c) Between two characters in the string, where one is a word character and the other is a non-word character
Example 7 - in the string ‘John@home’, word boundaries fall before ‘J’ and after ‘n’, giving the first word ‘John’, and then before ‘h’ and after ‘e’, giving the second word ‘home’. Matching breaks at the ‘@’ character since it is a non-word (\W) character.
Example 8 - in the string ‘Jane in_the_kitchen’, word boundaries fall before ‘J’ and after ‘e’, giving the first word ‘Jane’, and then before ‘i’ and after the final ‘n’, giving the second word ‘in_the_kitchen’. Matching breaks at the ‘ ’ (whitespace) character since whitespace is a non-word (\W) character. The second word contains ‘_’ since ‘_’ is a word character.
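Examples 7 and 8 can be sketched in Python as follows. The helper name words is hypothetical, and \b\w+\b is one common way to extract every run of word characters enclosed by two boundaries.

```python
import re

def words(text):
    """Return each run of word characters delimited by word boundaries."""
    return re.findall(r"\b\w+\b", text)

print(words("John@home"))            # ['John', 'home']
print(words("Jane in_the_kitchen"))  # ['Jane', 'in_the_kitchen']
```

The ‘@’ and the space split the input because they are non-word characters, while the underscores stay inside ‘in_the_kitchen’ because ‘_’ is a word character.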