Regular expressions can be very useful for validating and manipulating data. Unfortunately there’s no built in support for regular expressions in Excel. You can use VBA but I think that’s a bit too messy.

In my Excel plugin SeoTools, I’ve added four methods:

=RegexpIsMatch(string input; string expr)

Matches an input string with a regular expression and returns true if there’s a match.

=RegexpFind(string input; string expr; int group)

Takes a regular expression and retrieves the matched group from an input string.

=RegexpFindOnUrl(string url; string regexp; int group)

Same as RegexpFind but the input string is instead the contents of a webpage.

=RegexpReplace(string input; string expr; string replacement)

Within a specified input string, replaces all strings that match a specified regular expression with a specified replacement string.

All examples below assume that you have downloaded and added SeoTools to the document you’re working on.

Example1 Validating email addresses with RegexpIsMatch

Let’s say you have a column with a bunch of email addresses and you want to find all invalid addresses. The following formula will return FALSE for all invalid email addresses:

=RegexpIsMatch(“niels@bosmainteractive.se”;”^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$”


Example2 Find a particular part in a text

Now let’s say you have a column with text where you’d like to extract a particular part from, i.e. “Sports (id:4)”, where we like to extract the “4″ into a new column.

=RegexpFind(A1;”\(id:(\d+)\)”;1)


Example3 Get the number of pages indexed in Google for a set of keywords

For the last example we try some simple web scraping using the regular expression support in SeoTools. In our example we have a set of keywords and want to have the formula return the number of pages a search returns.

=RegexpFindOnUrl(“http://www.google.com/search?q=”&UrlEncode(A1);”About ([0-9,]*) results”;1)

With functions in SeoTools that are marked with “(cached)” (such RegexpFindOnUrl) the URL is only fetched once per “session”. So if you have several pieces you want to extract from a webpage the source of the page is only downloaded once even if you reference the URL several times in different formulas.

Resources for working with regular expressions

Tagged with:
 

9 Responses to Working with regular expressions in Excel

  1. ylxi123 says:

    hi,i’m jethro,i tried the seotools u shared,i wanna say ,great job !
    it works perfect with windows xp and office2007 except the GoogleIndexcount function,the message is “Input string format is not correct “?
    Could you please reply me ?
    Thanks in advance !

  2. MOGmartin says:

    Hi Niels,

    as per the commenter above, fantastic job on the plugins – they are fantastic looking, and Im sure they will save me a tonne of time.

    My issue: Can’t get them to appear in the userdefined functions (ie. I cant use them at all!) Running windows XP, Office 2007.

    the URLencode() functions appear, and work – but nothing else.

    Do you have any suggestions?

    thanks and best regards

    Martin.

  3. nielsbosma says:

    @MOGmartin: That’s very weird. Have you tried to reboot Excel and add the plugin again? You need to add the plugin to the document for every “session”. Is it only UrlEncode that pops up?

  4. nielsbosma says:

    @ylxi123: Where are you from? What’s your local Google url?

  5. MOGmartin says:

    Hi Niels,

    thanks for the reply, Ive tried the plugin on my home machine now, Office 2007 + windows 7 – and it works perfectly.

    Must be something wrong with my office machine, but its the first time an XLL or an add-in has failed.

    Must say, I really love the feature set, if you don’t mind I will feature it on a post on my site (seoforums.org/seo-blog).

    Also, while I greatly appreciate it being a free add download, I’d happily pay a couple of hundred dollars for it, perhaps you should rethink your pricing ;)

    Best regards

    MOGmartin
    (Martin Macdonald)

  6. MOGmartin says:

    UPDATE: worked out why the xll didnt work on my office pc.

    My default downloads folder is stored on a network drive – excel will open the file from there but it doesnt work fully.

    copying it onto my C drive, and it now works fine.

    thanks Niels!

  7. Vince says:

    Hi Neil,

    I cannot get any of them to work as well, is there anything I need to change to I add to excel with respect to the formula!

    Thanks,

    Vince.

  8. nielsbosma says:

    @Vince: .Net 2.0 Installed?

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>