Tuesday, March 1, 2011

Regex to find labels

I'm trying to find all labels that have both an ID and a Text attribute, but my regex doesn't seem to work.

With the following regex:

<asp:[a-z]+.*? ID="(?<id>.*?)".*? Text="(?<text>.*?)".*?/>

and the following sample text:

<asp:Label ID="SomeID" Text="SomeText" />
<asp:Label Text="SomeText" />
<asp:Label ID="SomeID" />
<asp:Label ID="SomeOtherID" Text="Some Other Text" />

I get the following matches:

   1. "<asp:Label ID="SomeID" Text="SomeText" />" has 2 groups:
         1. "SomeID"
         2. "SomeText"
   2. "<asp:Label Text="SomeText" /> <asp:Label ID="SomeID" /> <asp:Label ID="SomeOtherID" Text="Some Other Text" />" has 2 groups:
         1. "SomeID"
         2. "Some Other Text"

The first one is obviously correct, but I'm not sure why #2 shows up.

And the following regex only finds the first label ("SomeID") but not the fourth one ("SomeOtherID"):

<asp:[a-z]+ (?!.*<[a-z]).*? ID="(?<id>.*?)".*? Text="(?<text>.*?)".*?/>

From Stack Overflow
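  • Try replacing each .* in your expression with [^>]*, so that a match cannot cross an HTML tag boundary. The problem is that a lazy .*? still expands as far as it must for the rest of the pattern to match: in your second match, the .*? in the middle of your expression matches /> <asp:Label ID="SomeOtherID".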

    Perhaps something like this:

    <asp:[a-z]+\s*ID="(?<id>[^"]*)"\s*Text="(?<text>[^"]*)"[^/]*/>
    
    Theo Lenndorff : Just as an additional help, look for the ">>>> <<<<": <asp:[a-z]+.*? ID="(?<id>.*?)">>>>.*?<<<< Text="(?<text>.*?)".*?/>
    Sadhana : Thanks - your regex provided a good starting point. :-)
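
The behaviour described in the answer can be reproduced with a short script. This is only a sketch in Python rather than .NET, so the named groups are written (?P<id>...) instead of (?<id>...). The IGNORECASE and DOTALL flags are assumptions about the options the asker was using (they are what make [a-z]+ match "Label" and let . run across line breaks). The names ORIGINAL, BOUNDED, ANSWER and labels are made up for this illustration.

```python
import re

# Sample markup from the question.
SAMPLE = '''<asp:Label ID="SomeID" Text="SomeText" />
<asp:Label Text="SomeText" />
<asp:Label ID="SomeID" />
<asp:Label ID="SomeOtherID" Text="Some Other Text" />'''

# The question's first pattern. The flags are assumed, not stated in the question.
ORIGINAL = re.compile(
    r'<asp:[a-z]+.*? ID="(?P<id>.*?)".*? Text="(?P<text>.*?)".*?/>',
    re.IGNORECASE | re.DOTALL,
)

# Same pattern with every .*? swapped for a class that excludes ">",
# so no part of a match can run past the end of the current tag.
BOUNDED = re.compile(
    r'<asp:[a-z]+[^>]*? ID="(?P<id>[^">]*)"[^>]*? Text="(?P<text>[^">]*)"[^>]*?/>',
    re.IGNORECASE,
)

# The pattern suggested in the answer. It expects ID directly after the
# tag name and Text directly after ID.
ANSWER = re.compile(
    r'<asp:[a-z]+\s*ID="(?P<id>[^"]*)"\s*Text="(?P<text>[^"]*)"[^/]*/>',
    re.IGNORECASE,
)


def labels(pattern, text):
    """Return (id, text) pairs for every match of pattern in text."""
    return [(m.group("id"), m.group("text")) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(labels(ORIGINAL, SAMPLE))
    # [('SomeID', 'SomeText'), ('SomeID', 'Some Other Text')]
    print(labels(BOUNDED, SAMPLE))
    # [('SomeID', 'SomeText'), ('SomeOtherID', 'Some Other Text')]
    print(labels(ANSWER, SAMPLE))
    # [('SomeID', 'SomeText'), ('SomeOtherID', 'Some Other Text')]
```

The first result shows the unwanted second match, which starts at the label without an ID and runs on to the last label. The other two patterns stay inside one tag each and pick up only the first and fourth labels.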
