Thursday, March 31, 2011

Python Regular Expression Substitution

I have a block of text, and for every regex match, I want to substitute that match with the return value from another function. The argument to this function is of course the matched text.

I have been having trouble trying to come up with a one pass solution to this problem. It feels like it should be pretty simple.

From stackoverflow
  • Right from the documentation:

    >>> def dashrepl(matchobj):
    ...     if matchobj.group(0) == '-': return ' '
    ...     else: return '-'
    >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
    'pro--gram files'
    
    esiegel : Damn, I just assumed that it could be a string in that function
  • Python-agnostic: Match everything before and everything after your text to replace.

    /^(.*?)(your regexp to match)(.*)$/
    

    Then you have the next before and after the text you're going to replace. The rest is easy -- just insert the result of your function between the two strings.

    ΤΖΩΤΖΙΟΥ : The OP said "for every match", so you described only half of the algorithm. In any case, the appropriate answer has been given and chosen.

0 comments:

Post a Comment