I'm working on a commercial (not open source) C++ project that runs on a Linux-based system. I need to do some regex matching within the C++ code. (I know: I now have 2 problems.)

QUESTION: What libraries do people who regularly do regex from C/C++ recommend I look into? A quick search has brought the following to my attention:

1) Boost.Regex (I need to go read the Boost Software License, but this question is not about software licenses)

2) C (not C++) POSIX regex (#include <regex.h>, regcomp, regexec, etc.)

3) http://freshmeat.net/projects/cpp_regex/ (I know nothing about this one; seems to be GPL, therefore not usable on this project)

Solution 1

Boost.Regex is very good and is slated to become part of the C++0x standard (it's already in TR1).

Personally, I find Boost.Xpressive much nicer to work with. It is a header-only library and it has some nice features such as static regexes (regexes compiled at compile time).

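Update: If you're using a C++11-compliant compiler and standard library, use std::regex unless you have a good reason to use something else. (GCC 4.8 does NOT qualify: its <regex> is incomplete, and working support only arrived in GCC 4.9.)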
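For a feel of what std::regex looks like in practice, here is a minimal sketch. The helper name parse_key_value and the "key=value" input format are made up for illustration; only std::regex, std::smatch and std::regex_match come from the standard library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: splits a "key = value" line into its two parts.
// Returns false when the line does not have that shape.
bool parse_key_value(const std::string& line, std::string& key, std::string& value) {
    // Raw string literal avoids double-escaping the backslashes.
    // Group 1 is the key, group 2 the value with surrounding spaces trimmed.
    static const std::regex pattern(R"(\s*(\w+)\s*=\s*(.*?)\s*)");
    std::smatch m;
    // regex_match requires the whole line to match, so no ^ or $ is needed.
    if (!std::regex_match(line, m, pattern)) {
        return false;
    }
    key = m[1].str();
    value = m[2].str();
    return true;
}
```

Building the std::regex once (here as a function-local static) matters, since constructing the pattern is usually far more expensive than running a match.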

Solution 2

Thanks for all the suggestions.

I tried out a few things today, and with the stuff we're trying to do, I opted for the simplest solution where I don't have to download any other 3rd-party library. In the end, I #include <regex.h> and used the standard C POSIX calls regcomp() and regexec(). Not C++, but in a pinch this proved to be the easiest.
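Roughly, the calls look like the sketch below. The wrapper name posix_matches is my own for illustration; regcomp(), regexec() and regfree() and the REG_EXTENDED / REG_NOSUB flags are the POSIX ones from <regex.h>.

```cpp
#include <regex.h>
#include <string>

// Hypothetical wrapper: true if `text` matches the POSIX extended regex `pattern`.
bool posix_matches(const std::string& pattern, const std::string& text) {
    regex_t re;
    // REG_EXTENDED selects ERE syntax; REG_NOSUB says we only want yes/no,
    // not the positions of sub-matches.
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0) {
        return false;  // the pattern itself failed to compile
    }
    // regexec returns 0 on a match and REG_NOMATCH otherwise.
    int rc = regexec(&re, text.c_str(), 0, nullptr, 0);
    regfree(&re);  // always release what regcomp allocated
    return rc == 0;
}
```

For example, posix_matches("^[0-9]+\\.[0-9]+$", "3.14") is true. If you match the same pattern many times, keep the compiled regex_t around instead of recompiling on every call as this sketch does.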

Solution 3

In C++ projects past, I have used PCRE with good success. It's very complete and well-tested since it's used in many high profile projects. And I see that Google has contributed a set of C++ wrappers for PCRE recently, too.

Solution 4

C++ has had a built-in regex library since TR1 (std::tr1::regex, later standardized as std::regex in C++11). AFAIK Boost's regex library is very compatible with it and can be used as a replacement if your standard library doesn't provide one.
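One way to take advantage of that compatibility is a namespace alias that picks the implementation at build time. This is only a sketch: USE_BOOST_REGEX is a hypothetical flag you would define yourself, first_number is an illustrative helper, and the code sticks to the common subset (regex, smatch, regex_search) that both libraries provide. The two are not identical in every detail, so test your real patterns against whichever one you ship.

```cpp
#include <string>

// USE_BOOST_REGEX is a hypothetical build flag; neither library defines it.
#ifdef USE_BOOST_REGEX
#include <boost/regex.hpp>
namespace rx = boost;
#else
#include <regex>
namespace rx = std;
#endif

// Returns the first run of digits in `text`, or an empty string if there is none.
std::string first_number(const std::string& text) {
    static const rx::regex digits("[0-9]+");
    rx::smatch m;
    return rx::regex_search(text, m, digits) ? m[0].str() : std::string();
}
```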

Solution 5

Boost has regex in it.

That should fill the bill.

Solution 6

Two more options:

If you can write it in C++11, work through this tutorial: http://www.codeguru.com/cpp/cpp/cpp_mfc/stl/article.php/c15339

Note: At the time of writing, the only C++11 regex implementation that I know works is the clang/LLVM one (libc++), and I have only seen it work on Mac. GNU libstdc++ doesn't implement regex yet. I don't know about Visual Studio. Most people still use the Boost regex implementation.

Or you can use ragel to generate a finite state machine to do the parsing for you, and generate the C/C++ code implementation: http://www.complang.org/ragel/

I used it a little to generate code to parse JSON. This Ragel file: https://github.com/matiu2/yajp/blob/master/parser/number.rl is used to generate this code: https://github.com/matiu2/yajp/blob/master/parser/json.hpp#L254 and a finite state machine diagram.

Update 1:

LLVM's libc++ regex works on Ubuntu 14.04: libc++-dev - LLVM C++ Standard library (development files). When compiling: clang++ -std=c++11 -lc++ -I/usr/include/c++/v1 ...

Update 2:

I'm currently enjoying Boost Spirit 3. I like it more than regex because it has BNF-style rules and is well thought out. (The older Spirit Qi libraries are better documented.)

Solution 7

You can also look at the fast regex library that was developed at the Yandex search engine for matching thousands of patterns against huge amounts of data.

Solution 8

I've personally always used Boost.Regex (although I don't have much need for regex in C++). Microsoft Research has a regex library too, called GRETA: http://research.microsoft.com/projects/greta/. Apparently it's very fast and supports the full Perl 5 syntax. I haven't used it, but you may want to test it out.

Solution 9

I faced a similar situation and ended up using Henry Spencer's regexp engine: http://www.codeproject.com/KB/string/spencerregexp.aspx

Solution 10

No one here has said anything about the one that comes with C++0x. If you are using a compiler and standard library that support C++0x, you could just use that instead of adding another lib to your project.