Another Online bookmark

Just so I’ll know where to find it next time…

Steve Ramsay’s Guide to Regular Expressions


If you’ve ever typed “cp *.html ../” at the UNIX command prompt, or entered “garden?” into a web-based search engine, you’ve already used a simple regular expression. Regular expressions (“regex’s” for short) are sets of symbols and syntactic elements used to match patterns of text.

Even these simple examples testify to the power of regular expressions. In the first instance, you’ve copied all the files which end in “.html” (as opposed to copying them one by one); in the second, you’ve conducted a search not only for “garden,” but for “garden, gardening, gardens, and gardeners” all at once.

4 thoughts on “Another Online bookmark”

  1. Sorry Jim, but I can’t let this slide…

    In your first example, “cp *.html ../”, that’s not actually a regular expression but a glob pattern. As described at foldoc (http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?regular+expression), regular expressions are similar to but more elaborate than globs, and they are interpreted in different ways.

    For example, as a glob, “*.html” means all files starting with anything and ending with “.html”. But as a regular expression, “*.html” does not mean that: “*” means “0 or more of the preceding regexp” and there is no preceding regexp in that example, so depending on the tool it is either rejected as an error or the “*” is taken as a literal asterisk. The regular expression for all files ending in “.html” is “.*\.html$”, where the “$” anchors the match to the end of the name.

    In your second example, “garden?” also has different interpretations depending on whether it’s viewed as a glob pattern or a regexp, but unfortunately neither interpretation matches how you’ve described it. As a regular expression, “garden?” matches both “garde” and “garden”, since the ‘?’ means “0 or 1 of the preceding character”. As a glob pattern, “garden?” matches “gardena”, “garden3”, etc., since the ‘?’ means “any single character”. For the interpretation you gave, “garden” followed by anything, you’d write “garden.*” as a regexp, or “garden*” as a glob.

    Sorry to pick nits, but regexps are my life blood! :-)
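
The glob-versus-regexp distinction in the comment above can be checked with a small Python sketch. This is only an illustration: the file names and word list are made up, `fnmatch` stands in for the shell's globbing, and `re.fullmatch` is used so that each pattern has to cover the whole string, which is how a glob behaves.

```python
import fnmatch
import re

# Made-up file names for illustration.
names = ["index.html", "notes.txt", "index.html.bak"]

# Glob: "*" stands for any run of characters.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.html")]

# Regexp: ".*" is "any character, zero or more times"; "\." is a literal dot.
regex_hits = [n for n in names if re.fullmatch(r".*\.html", n)]

# "*.html" as a regexp gives "*" nothing to repeat, so Python's re rejects it.
try:
    re.compile("*.html")
    star_is_valid = True
except re.error:
    star_is_valid = False

# Made-up words for illustration.
words = ["garde", "garden", "gardena", "garden3", "gardening"]

# Regexp "garden?": the final "n" is optional.
regex_q = [w for w in words if re.fullmatch(r"garden?", w)]

# Glob "garden?": "garden" plus exactly one more character.
glob_q = [w for w in words if fnmatch.fnmatchcase(w, "garden?")]

# "garden" followed by anything, written both ways.
regex_any = [w for w in words if re.fullmatch(r"garden.*", w)]
glob_any = [w for w in words if fnmatch.fnmatchcase(w, "garden*")]
```

Other tools differ on the invalid case: Python raises an error for a leading “*”, while some grep implementations quietly treat it as a literal asterisk.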

  2. (I wish bookmarklets added BLOCKQUOTE by default – the text below the link was from the page.)

    Maybe I need to find a better tutorial on Regexps…

    I’ve been pretty lazy about learning them – for example, if I wanted to match both Foobar and foobar using grep, I would generally type:
    “grep oobar”
    (Which is useful for 90% of what I need…)

  3. Sorry, I didn’t realise you were quoting directly from the web page you linked.

    The best reference on regular expressions I have found is Jeffrey Friedl’s book “Mastering Regular Expressions” (http://regex.info/) published by O’Reilly & Associates.

    BTW, in your grep example you can just say “grep -i foobar” for case-insensitive matching.
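
For comparison, here is a rough Python analogue of the two grep approaches from this thread. It is a sketch rather than a description of how grep works internally: the sample lines are invented, and the `[Ff]oobar` character class is an extra alternative that nobody in the thread mentioned.

```python
import re

# Invented sample input for illustration.
lines = ["Foobar is here", "foobar too", "nothing to see", "FOOBAR shouting"]

# Like "grep oobar": works because both spellings share this substring.
substring_hits = [ln for ln in lines if re.search(r"oobar", ln)]

# Like "grep -i foobar": the full word, ignoring case everywhere.
nocase_hits = [ln for ln in lines if re.search(r"foobar", ln, re.IGNORECASE)]

# A character class accepts only "F" or "f" in the first position.
bracket_hits = [ln for ln in lines if re.search(r"[Ff]oobar", ln)]
```

The case-insensitive search also picks up “FOOBAR”, which the other two skip, so the right choice depends on how loose the match is meant to be.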

  4. Jim, if you do decide to buy the O’Reilly book, Bill gets a discount from O’Reilly (because we re-sell them as part of the Perl training class). So don’t pay list price, I can mail you one. We may have an extra one around the house, actually.
