wxRegEx woes

Volker Bartheld dr_versaeg at freenet.de
Thu Sep 14 04:29:27 PDT 2006


Hi!

[In the following examples, I use C-notation/escaping for strings, i. e.
"\\" for a string containing a _single_ backslash.]

I want to extract single-character commands with an optional argument from
a string using regular expressions. Command arguments are enclosed
in single quotes, so e.g. "\\1" and "\\h'Foobar'" are valid.

I use this code

 // Callback type: receives the matched text plus a user pointer and returns the replacement.
 typedef const wxChar* (LPFNFILTER)(const wxChar*, void*);
 // Example callback: prints the match and replaces it with an empty string.
 const wxChar* FilterFn(const wxChar* Source, void*) { _tprintf(wxT("%s"), Source); return wxT(""); }
 bool RegExFilter(wxString& Str, const wxString& RegExString, LPFNFILTER FnFilter, void* pv, size_t MatchCount=0)
 {
   wxString NewString;
   wxRegEx RegEx(RegExString);
   size_t start, len;
   bool bHasReplaced=false;
   if(!RegEx.IsValid()) return false;
   // MatchCount selects the subexpression to replace (0 = the whole match).
   if(MatchCount>=RegEx.GetMatchCount()) return false;
   while(RegEx.Matches(Str.c_str()))
   {
     bHasReplaced=true;
     if(!RegEx.GetMatch(&start, &len, MatchCount)) return false;
     // Keep the text in front of the match, then let the callback supply the replacement.
     NewString.Append(Str.Left(start));
     NewString.Append(FnFilter(wxString(Str, start, len), pv));
     // Continue searching in the remainder of the string.
     Str.Remove(0, start+len);
   } // while(RegEx.Matches(Str.c_str()))
   NewString.Append(Str);
   Str=NewString;
   return bHasReplaced;
 }

to pass each regex match to a callback function of type LPFNFILTER. So,
 wxString Str(wxT("Hello \\h'HelpMe'"));
 RegExFilter(Str, wxT("(\\\\[h])('[^']*')?"), FilterFn, 0);
works perfectly for catching and cutting the \h(elp) command and its argument
from the second example.
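
For comparison, the same match-and-callback loop can be sketched with
std::regex from the C++ standard library instead of wxRegEx. This is only an
illustration under assumptions: FilterMatches and the std::function callback
are hypothetical names, std::string replaces wxString, and the ECMAScript
grammar used by std::regex is not guaranteed to behave like the wxRegEx
default in every detail.

```cpp
#include <functional>
#include <regex>
#include <string>

// Hypothetical stand-in for RegExFilter (not wxWidgets API): for every match
// of `re`, capture group `group` is replaced by the callback's return value.
std::string FilterMatches(const std::string& input, const std::regex& re,
                          const std::function<std::string(const std::string&)>& filter,
                          std::size_t group = 0)
{
    std::string out;
    std::string::const_iterator last = input.begin();
    const std::sregex_iterator end;
    for (std::sregex_iterator it(input.begin(), input.end(), re); it != end; ++it) {
        const std::ssub_match g = (*it)[group];
        if (!g.matched) continue;      // group missing or not part of this match
        out.append(last, g.first);     // copy the text in front of the group
        out += filter(g.str());        // callback supplies the replacement
        last = g.second;               // resume copying after the group
    }
    out.append(last, input.end());     // copy whatever follows the last match
    return out;
}
```

Called with the pattern "(\\\\[h])('[^']*')?", the string "Hello \\h'HelpMe'"
and a callback that returns an empty string, this sketch hands
"\\h'HelpMe'" to the callback and returns "Hello ".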

However, the backslash can itself be "escaped" by doubling it, so
"\\\\HelpMe" shouldn't be recognized as a command. I came up with
a RegExString of wxT("([^\\\\]|^)(\\\\[?])('[^']*')?" );
This introduces a subexpression for the preceding (non-backslash) character,
which I don't want to be part of the match. Since the command and its
argument are two further, separate subexpressions, no single MatchCount
value selects just the command together with its argument.
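
To make the numbering problem concrete, here is a small sketch that lists
every subexpression of the first match. It uses std::regex rather than
wxRegEx, ListCaptures is a hypothetical helper, and [h] stands in for the
command character class, so treat it as an illustration only.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns group 0 (the whole match) followed by all
// subexpressions of the first match, or an empty vector if nothing matches.
std::vector<std::string> ListCaptures(const std::string& pattern, const std::string& text)
{
    std::vector<std::string> groups;
    const std::regex re(pattern);
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        for (std::size_t i = 0; i < m.size(); ++i)
            groups.push_back(m[i].str());   // groups that did not match yield ""
    }
    return groups;
}
```

For "([^\\\\]|^)(\\\\[h])('[^']*')?" applied to "Hello \\h'HelpMe'" the
groups are " \\h'HelpMe'", " ", "\\h" and "'HelpMe'": index 0 includes the
unwanted leading blank, while the command and its argument sit in the
separate slots 2 and 3. The doubled backslash in "\\\\h'x'" produces no
match at all.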

So I ended up grouping the last two subexpressions, as in
wxT("([^\\\\]|^)((\\\\[?])('[^']*')?)") with MatchCount==2, since
non-capturing parentheses (see
http://www.regular-expressions.info/brackets.html) as in
wxT("(?:[^\\\\]|^)((\\\\[?])('[^']*')?)") didn't work.
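
The two groupings can be compared in a sketch based on std::regex, whose
default ECMAScript grammar does accept (?:...). GroupText and GroupCount are
hypothetical helpers and [h] again stands in for the command character.
Whether wxRegEx accepts the same pattern may depend on the syntax flags it
is compiled with (wxRE_ADVANCED, for instance); that is an unverified
assumption, not a diagnosis.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns capture group `group` of the first match of
// `pattern` in `text`, or an empty string if there is no such match or group.
std::string GroupText(const std::string& pattern, const std::string& text,
                      std::size_t group)
{
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re) || group >= m.size())
        return std::string();
    return m[group].str();
}

// Number of capturing groups, i.e. the highest usable group index.
std::size_t GroupCount(const std::string& pattern)
{
    return std::regex(pattern).mark_count();
}
```

With the nested pattern, the command plus argument is group 2 of 4; with the
non-capturing prefix it becomes group 1 of 3, so the leading character no
longer takes up an index.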

Do you see an easier way to do this in wxW?

Thanks for reading this rather longish writeup and for your opinion.


Volker
__
Mail replies to/an V B A R T H E L D at G M X dot D E





