RegEx Crosswords

Regular expressions, usually shortened to RegEx, are a tool programmers use to search text for a general pattern rather than one exact string. For example, maybe you want to find all of the people in your database with the surname Stephen or its derivatives. Some have a ph, while others have a v, and either may or may not have an s on the end. In RegEx we could search for Ste((ph)|v)ens? where (a|b) means a or b and s? means that s happens 0 or 1 times. This pattern matches all four variations of Stephen, but no other word fits it.
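To see the surname pattern in action, here is a minimal sketch using Python's built-in re module. The list of names is made up for illustration, and fullmatch is used so that the whole name has to fit the pattern rather than just part of it.

```python
import re

# Pattern from the text: "ph" or "v", then an optional trailing "s".
SURNAME = re.compile(r"Ste((ph)|v)ens?")

# Hypothetical sample data, not from any real database.
names = ["Stephen", "Stephens", "Steven", "Stevens", "Stefan", "Stevenson"]

# fullmatch requires the entire name to fit the pattern.
matches = [name for name in names if SURNAME.fullmatch(name)]
print(matches)
```

Only the four variations survive; Stefan and Stevenson are rejected.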

There are all sorts of different symbols that RegEx uses, but some of the more common ones are . which is the wildcard meaning any single symbol, [ABC] which matches any one of the symbols listed, A* which means any number of A in a row (including none at all), such as AAAAAA, and A+ which is the same, but means that A happens at least once (so not zero times).
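The symbols above can be tried out one at a time. This is a small sketch in Python; the helper name fits and the test strings are my own choices for illustration.

```python
import re

def fits(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# (pattern, text, expected result)
examples = [
    (r"C.T", "CUT", True),      # . stands for any single symbol
    (r"[ABC]", "B", True),      # one symbol from the listed set
    (r"[ABC]", "D", False),     # D is not in the set
    (r"A*", "", True),          # * allows zero repeats
    (r"A*", "AAAAAA", True),    # ...or many
    (r"A+", "", False),         # + needs at least one
    (r"A+", "AAA", True),
]

for pattern, text, expected in examples:
    print(repr(pattern), repr(text), fits(pattern, text))
```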

All of this is very practical if you work with large data sets, although this XKCD hits the nail on the head:

regular_expressions.png

However, I came across RegEx not from programming, but from puzzling. There is a genre of puzzle called the RegEx Crossword. Let's work through one to get a taste:

Screenshot_20170907-171619.jpg

The clue for the top row offers a choice of three-letter words, including CAT and FOR, which means the last letter is definitely either T or R. Looking at the third column, the first letter is one of R, A or M. The only letter in common is R, so we can fill in FOR:

Screenshot_20170907-172200.jpg

The middle column has some number of wildcards followed by at least one letter from the set WAY. This means that the last square is definitely one of those letters. But the third row says we have letters from TOWEL some number of times, and the only letter the two sets share is W, so the bottom middle square is W. While we are looking at that column we can fill in the middle square with Y, because the second row is either RY- or TY-.

Screenshot_20170907-173118.jpg

The first column is any letter, then any letter inside a group, and then \1, which means repeat whatever the first group matched (a group is something with a bracket around it), so we know that the 2nd and 3rd squares are the same as each other. Since the second row starts with either an R or a T, yet the bottom left box has to be a letter from TOWEL, they must both be Ts. Finally the bottom right box is O.
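The back-reference is the trickiest of these symbols, so here is a quick sketch of how it behaves in Python. The pattern .(.)\1 is my reconstruction of the first-column clue from the description above, so treat it as an assumption rather than the puzzle's exact text.

```python
import re

# Assumed first-column clue: any symbol, then a captured symbol,
# then \1 repeating whatever that captured symbol was.
FIRST_COLUMN = re.compile(r".(.)\1")

# Candidate columns: the last two letters must be identical to match.
results = {word: bool(FIRST_COLUMN.fullmatch(word)) for word in ["FTT", "FRR", "FRT"]}
print(results)
```

FTT and FRR both fit the clue on its own; it is the TOWEL restriction from the third row that rules out R.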

Screenshot_20170907-173705.jpg

The meaning of life, the universe and everything.
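If you want to double-check a finished grid, one option is to read off every row and column and test each against its clue. The sketch below does this in Python for the clues that the walkthrough describes fully. The patterns are my reconstructions from the prose (assumptions, not copied from the puzzle), and the two clues that are only partly described are left as None and skipped.

```python
import re

grid = ["FOR", "TY-", "TWO"]

# Clues reconstructed from the walkthrough; these are assumptions.
# None marks a clue the text does not describe fully enough to rebuild.
row_clues = [None, r"(R|T)Y-", r"[TOWEL]*"]
col_clues = [r".(.)\1", r".*[WAY]+", None]

# Rows are the grid strings themselves; columns are read top to bottom.
rows = grid
cols = ["".join(row[i] for row in grid) for i in range(3)]

def satisfied(words, clues):
    """Check each word against its clue, skipping unknown clues."""
    return all(
        clue is None or re.fullmatch(clue, word) is not None
        for word, clue in zip(words, clues)
    )

solved = satisfied(rows, row_clues) and satisfied(cols, col_clues)
print(cols, solved)
```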

If you want to have a go, there is a brilliant website which introduces the symbols one at a time and builds up to much more difficult puzzles. The help button in the top right brings up a dictionary of regular expressions if you need to look one up. There is usually quite a lot of humour hidden in the clues. Have fun!

(PS: you may want to get good at these now. I have something big planned for this site for next week.)
