Regular Expressions

7.15.2015

A regular expression is a sequence of characters that forms a search pattern. It is often used with the string methods search() and replace(). Regular expressions differ from one programming language to another, so be wary of idiosyncrasies. In this article I focus on their use in JavaScript unless otherwise specified.

You can declare a new regular expression in two ways:

    var re1 = new RegExp("abc"); // creates the reg exp /abc/ using a constructor
    var re2 = /abc/i;            // creates a reg exp using the more common literal syntax (the i flag makes it case-insensitive)

Because RegExp literals are delimited by forward slashes, the escape character is the backslash \. A backslash in front of a character that has a special meaning in a RegExp (such as / or +) makes that character match literally. If the character after the backslash is not one that needs escaping, nothing is escaped: the backslash and that character are interpreted together as part of the pattern, and for certain letters they form one of the special sequences listed below, such as \d.

Special behavior:

[] -- anything between brackets is treated as a search for any one of the listed characters. A - between two characters inside the brackets indicates a range: any character whose Unicode code point lies between those two.
^ -- a ^ right after the opening bracket inverts the set, that is, it matches anything but the characters that follow.
{} -- putting braces after an element indicates that it should occur exactly {x} times. You can also specify {x,}, which matches only if there are at least x occurrences, and {x,y}, which matches if there are at least x and at most y occurrences.
/a{2,}/ will not find a match in candy but finds aaaaa in caaaaandy
/a{1,3}/ will find a in candy and only the first aaa in caaaaaandy

? -- a ? following an element means that it may occur zero or one time

    var neighbor = /neighbou?r/;
    console.log(neighbor.test("neighbour")); // → true
    console.log(neighbor.test("neighbor"));  // → true

+ -- the element may occur one or more times
* -- the element may occur zero or more times
() -- if you need to apply something like the + or * operator to a whole pattern, surround the pattern with parentheses:

    var cartoonCrying = /boo+(hoo+)+/i;
    console.log(cartoonCrying.test("Boohoooohoohooo")); // → true

| -- matches the expression on either the left or the right of the pipe
x(?=y) -- only a match if x is followed by y, where x and y are any pattern
x(?!y) -- only a match if x is not followed by y. For example, /\d+(?!\.)/ will match only a number that is not followed by a decimal point: in 3.141 it matches 141 but not 3
\d -- any digit
\w -- any "word" character: a-z, A-Z, 0-9, and the underscore
\s -- any whitespace character (space, tab, newline, and so on)
\D -- anything that is not a digit
\W -- anything that is not a "word" character
\S -- a non-whitespace character
. -- any character except for newline
\0 -- matches a null character
^ -- matches the beginning of the input. If the multiline flag is set, it also matches right after a newline
$ -- matches the end of the input. If the multiline flag is set, it also matches right before a newline

FOR RUBY ONLY

[[:upper:]] - Uppercase alphabetical character
[[:graph:]] - Non-blank character (excludes spaces, control characters, and similar)
[[:lower:]] - Lowercase alphabetical character
[[:print:]] - Like [:graph:], but includes the space character
[[:punct:]] - Punctuation character
[[:space:]] - Whitespace character ([:blank:], newline, carriage return, etc.)
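
Returning to JavaScript, the pieces described in this article can be tried out together. What follows is a minimal sketch, assuming a JavaScript environment such as Node.js or a browser console. The variable names and sample strings are made up for illustration and are not part of any library.

```javascript
// Two ways to build the same pattern. With the constructor the pattern is a
// string, so each backslash must be doubled to reach the RegExp intact.
var literalDigits = /\d+/;
var constructedDigits = new RegExp("\\d+");
console.log(literalDigits.source === constructedDigits.source); // → true

// Sets: a range, and an inverted set.
var hexDigit = /[0-9a-f]/i;
var notVowel = /[^aeiou]/;
console.log(hexDigit.test("G"));       // → false
console.log("audio".search(notVowel)); // → 2 (the "d")

// Quantifiers: {x,}, {x,y}, + and *.
console.log("candy".match(/a{2,}/));          // → null
console.log("caaaaandy".match(/a{2,}/)[0]);   // → aaaaa
console.log("caaaaaandy".match(/a{1,3}/)[0]); // → aaa
console.log(/bo+/.test("b"));                 // → false (+ needs at least one o)
console.log(/bo*/.test("b"));                 // → true (* accepts zero o's)

// Grouping and alternation.
var pets = /^(cat|dog)s?$/;
console.log(pets.test("dogs")); // → true
console.log(pets.test("cows")); // → false

// Lookahead: x(?=y) and x(?!y).
console.log("save 20% of 300".match(/\d+(?=%)/)[0]); // → 20
console.log("3.141".match(/\d+(?!\.)/)[0]);          // → 141

// \w covers letters, digits and the underscore.
console.log(/^\w+$/.test("snake_case_2")); // → true

// Anchors without and with the multiline flag (m).
var lines = "one\ntwo";
console.log(lines.match(/^\w+/g).join(","));  // → one
console.log(lines.match(/^\w+/gm).join(",")); // → one,two
```

Note that the lookahead examples return only the x part of the match: the text inside (?=...) or (?!...) is checked but never included in the result.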