Word boundary (\b) in regular expressions

In most regex contexts, a word boundary (usually written \b) is a position between a word (\w) character and a non-word (\W) character. It matches a position in the string and does not consume a character. Word characters are those in the range [a-zA-Z0-9_], i.e. the alphanumeric characters and the underscore (_) character.
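The definition of a word character can be checked directly. The following is a minimal sketch in Python, chosen here only for illustration; the helper name is_word_char is hypothetical. It assumes the re.ASCII flag, because without it Python's \w on text strings also accepts Unicode letters and digits, which is wider than the range described above.

```python
import re

# re.ASCII limits \w to [a-zA-Z0-9_]. Without this flag, Python's \w on str
# patterns also matches Unicode letters and digits.
WORD_CHAR = re.compile(r"\w", re.ASCII)


def is_word_char(ch):
    """Return True if the single character ch is a word character."""
    return WORD_CHAR.fullmatch(ch) is not None


# Letters, digits and the underscore are word characters; the rest are not.
for ch in ["a", "Z", "7", "_", "@", " ", "%"]:
    print(repr(ch), is_word_char(ch))
```

The first four characters print True and the last three print False.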

A word boundary can match at the following positions.

(a) Before the first character in the string, if the first character is a word character

Example 1 - in the string ‘cup’, the word boundary matches before ‘c’ since ‘c’ is the first word character. The matched string is ‘cup’.

Example 2 - in the string ‘@myhandle’, the word boundary matches before the ‘m’ character since ‘@’ is not a word character. The matched string is ‘myhandle’ since ‘m’ is the first word character.
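Examples 1 and 2 can be reproduced with a short sketch. It assumes Python's re module and uses two hypothetical helpers: boundary_positions lists every index where \b matches, and first_word returns the first run of word characters that starts at a boundary.

```python
import re


def boundary_positions(text):
    # Collect every index in text where the word-boundary assertion matches.
    return [m.start() for m in re.finditer(r"\b", text)]


def first_word(text):
    # Find the first run of word characters that begins at a word boundary.
    m = re.search(r"\b\w+", text)
    return m.group() if m else None


print(boundary_positions("cup"))        # [0, 3]
print(first_word("cup"))                # cup
print(boundary_positions("@myhandle"))  # [1, 9]
print(first_word("@myhandle"))          # myhandle
```

Index 0 in ‘cup’ is the position before ‘c’. In ‘@myhandle’ the first boundary is at index 1, between ‘@’ and ‘m’, so there is no boundary at the very start of the string.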

 

(b) After the last character in the string, if the last character is a word character

Example 3 - in the string ‘cup’, the word boundary matches after ‘p’ since ‘p’ is the last word character. The matched string is ‘cup’.

Example 4 - in the string ‘cup_’, the word boundary matches after ‘_’ since ‘_’ is the last word character. The matched string is ‘cup_’.

Example 5 - in the string ‘cup24’, the word boundary matches after ‘4’ since ‘4’ is the last word character. The matched string is ‘cup24’.

Example 6 - in the string ‘cup24%’, the word boundary matches after ‘4’ since ‘4’ is the last word character. ‘%’ is not a word character. The matched string is ‘cup24’.
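Examples 3 to 6 can be checked the same way. This is a sketch under the assumption that the matching pattern is \b\w+\b in Python's re module; the helper last_word_and_end is hypothetical and returns the last word together with the index of its closing boundary.

```python
import re

# One or more word characters enclosed by word boundaries.
WORD = re.compile(r"\b\w+\b")


def last_word_and_end(text):
    """Return (word, end_index) for the last word in text, or None."""
    matches = list(WORD.finditer(text))
    if not matches:
        return None
    last = matches[-1]
    return last.group(), last.end()


print(last_word_and_end("cup"))     # ('cup', 3)
print(last_word_and_end("cup_"))    # ('cup_', 4)
print(last_word_and_end("cup24"))   # ('cup24', 5)
print(last_word_and_end("cup24%"))  # ('cup24', 5)
```

For ‘cup24%’ the closing boundary sits at index 5, between ‘4’ and ‘%’, and not at the end of the string.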
 

(c) Between two characters in the string, where one is a word character and the other is a non-word character

Example 7 - in the string ‘John@home’, matching starts at the boundary before ‘J’ and runs to the boundary after ‘n’, giving us the first word ‘John’. It then continues from the boundary before ‘h’ to the boundary after ‘e’, giving us the second word ‘home’. Matching breaks at the ‘@’ character since it is a non-word (\W) character.

Example 8 - in the string ‘Jane in_the_kitchen’, matching starts at the boundary before ‘J’ and runs to the boundary after ‘e’, giving us the first word ‘Jane’. It then continues from the boundary before ‘i’ to the boundary after the final ‘n’, giving us the second word ‘in_the_kitchen’. Matching breaks at the ‘ ’ (whitespace) character since whitespace is a non-word (\W) character. The second word contains ‘_’ since ‘_’ is a word character.
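Examples 7 and 8 can be sketched as follows, again assuming Python's re module and the pattern \b\w+\b. The helper words_with_spans is hypothetical; it returns each word with its (start, end) indices, which are the boundary positions on either side of the word.

```python
import re


def words_with_spans(text):
    # Each match is a run of word characters enclosed by two word boundaries.
    return [(m.group(), m.span()) for m in re.finditer(r"\b\w+\b", text)]


print(words_with_spans("John@home"))
# [('John', (0, 4)), ('home', (5, 9))]

print(words_with_spans("Jane in_the_kitchen"))
# [('Jane', (0, 4)), ('in_the_kitchen', (5, 19))]
```

The ‘@’ and the space each occupy index 4, which falls outside every span. The underscores do not split ‘in_the_kitchen’ because there is no boundary between two word characters.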
 

About the Author - John Kyalo Mbindyo (BSc Computer Science) is a Senior Application Developer currently working at NCBA Bank Group, Nairobi, Kenya. He is passionate about making programming tutorials and sharing his knowledge with other software engineers across the globe. You can learn more about him and follow him on GitHub.