[wx-dev] wxRegEx url matching

Robin Dunn robin at alldunn.com
Sun Feb 3 18:08:31 PST 2008


Vadim Zeitlin wrote:
> On Fri, 01 Feb 2008 11:45:55 -0800 Robin Dunn <robin at alldunn.com> wrote:
> 
> RD> Does anybody have a wxRegEx expression handy that can be used for 
> RD> matching URLs?
> 
>  This all depends on how accurate you can be but my copy of "Mastering
> Regular Expressions" (wholeheartedly recommended) gives this simple
> example (in egrep syntax) for a URI to an HTML file:
> 
> 	\<http://[-a-z0-9_.:]+/[-a-z0-9_:@&?=+,.!/~*%$]*\.html?\>


Thanks.  For the record I ended up with this:

	"(file|http|ftp|https)://([-0-9a-zA-Z\\._]*)(:[0-9]+)?([-/\\.a-zA-Z0-9_#~:.?+=&%!@]*)"


which handles URLs like all of these, and others:

	http://example.com
	http://example.com/dirname
	http://example.com/dirname/file.ext
	http://example.com:8080/dirname

	etc.
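
For anyone who wants to try the expression quickly, here is a minimal sketch that runs it through Python's re module rather than wxRegEx, so treat it as an approximation of the wxRegEx behavior and not a confirmation of it. The helper name split_url and the sample list are made up for illustration; only the pattern string comes from the message above.

```python
import re

# Pattern copied verbatim from the message. The doubled backslashes are
# string-literal escaping, so the regex engine sees "\." inside the brackets.
URL_PATTERN = "(file|http|ftp|https)://([-0-9a-zA-Z\\._]*)(:[0-9]+)?([-/\\.a-zA-Z0-9_#~:.?+=&%!@]*)"
URL_RE = re.compile(URL_PATTERN)


def split_url(text):
    """Return (scheme, host, port_with_colon, path) for the first URL found.

    Hypothetical helper for illustration. Returns None when nothing matches.
    The port group keeps its leading colon and is None when absent.
    """
    match = URL_RE.search(text)
    if match is None:
        return None
    return match.groups()


if __name__ == "__main__":
    samples = [
        "http://example.com",
        "http://example.com/dirname",
        "http://example.com/dirname/file.ext",
        "http://example.com:8080/dirname",
    ]
    for sample in samples:
        print(sample, "->", split_url(sample))
```

Note that the port group, when present, includes the colon (":8080"), and the path group is an empty string for a bare host.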

I got a little confused along the way because apparently wxRegEx doesn't 
support the common \w character class, which I had used when experimenting 
with regexes in Python.  I kept getting non-matches when I thought I had 
done everything right, so I began to wonder whether my regex skills were 
a lot rustier than I had assumed...
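
One way around the missing shorthand is to spell the class out, which is what the expression above does. The sketch below, again in Python's re module and with a made-up sample string and helper name, only shows that the explicit bracket class matches the same ASCII text as \w. It assumes nothing about which escapes a given wxRegEx build accepts.

```python
import re

# Shorthand form, restricted to ASCII so it is comparable to the explicit class.
WITH_SHORTHAND = re.compile(r"\w+", re.ASCII)
# Explicit class listing the same ASCII characters, with no shorthand escape.
WITH_EXPLICIT = re.compile(r"[A-Za-z0-9_]+")


def words(regex, text):
    """Return every non-overlapping match of regex in text (illustrative helper)."""
    return regex.findall(text)


if __name__ == "__main__":
    sample = "wx_dev list, 2008: wxRegEx-test"
    print(words(WITH_SHORTHAND, sample))
    print(words(WITH_EXPLICIT, sample))
```

Both calls should print the same list, so a pattern written with the explicit class does not depend on \w being available.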

-- 
Robin Dunn
Software Craftsman
http://wxPython.org  Java give you jitters?  Relax with wxPython!




