Monday, April 11, 2011

How can I efficiently trim selected characters from the middle of a string?

Can anyone think of an efficient way (time wise) to trim a few selected characters from the middle of a string?

The best I have come up with is:

public static string Trim(this string word, IEnumerable<char> selectedChars)
{
    string result = word;
    // Each Replace call scans the whole string and allocates a new one
    foreach (char c in selectedChars)
        result = result.Replace(c.ToString(), "");
    return result;
}

But it is still too slow.

From Stack Overflow
  • Two options spring to mind:

    • Use a StringBuilder
    • Use a regular expression

    Here's the StringBuilder version:

    public static string Trim(this string word, IEnumerable<char> selectedChars)
    {
        // The best form for this will depend largely on the size of selectedChars
        // If you can change how you call the method, there are optimisations you
        // could do here
        HashSet<char> charSet = new HashSet<char>(selectedChars);
    
        // Give enough capacity for the whole word. Could be too much,
        // but definitely won't be too little
        StringBuilder builder = new StringBuilder(word.Length);
    
        foreach (char c in word)
        {
            if (!charSet.Contains(c))
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
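
    As one possible reading of the comment about changing how the method is called: if the same characters are trimmed from many words, the caller could build the HashSet once and pass it in, so it is not rebuilt on every call. This is only a sketch; the class name StringTrimExtensions and the method name RemoveChars are made up for illustration.

    ```csharp
    using System.Collections.Generic;
    using System.Text;

    public static class StringTrimExtensions
    {
        // Hypothetical variant: takes a set the caller has already built,
        // so no HashSet is allocated per call.
        public static string RemoveChars(this string word, HashSet<char> charSet)
        {
            // Capacity for the whole word, as in the version above
            StringBuilder builder = new StringBuilder(word.Length);

            foreach (char c in word)
            {
                if (!charSet.Contains(c))
                {
                    builder.Append(c);
                }
            }
            return builder.ToString();
        }
    }
    ```

    Usage would then look something like this, with the set created once and reused:

        HashSet<char> unwanted = new HashSet<char> { 'd', 'e', 'f' };
        string result = "abcdefgh".RemoveChars(unwanted); // "abcgh"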
    

    The regular expression option could be very efficient if you have a fixed set of chars you want to trim, and can build the regex once.

    Something like:

    // Keep this in a static field so the regex is only built once
    private static readonly Regex unwantedChars = new Regex("[def]", RegexOptions.Compiled);
    
    // Then do this every time you need to use it:
    word = unwantedChars.Replace(word, "");
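
    If the characters are only known at run time, the character class could be built from them once and then reused. A rough sketch follows; the class UnwantedCharsRegex and its Build method are hypothetical names, not part of any library. Each character is written as a \uXXXX escape so that ']', '-', '^' and '\' need no special handling inside the brackets.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Text.RegularExpressions;

    public static class UnwantedCharsRegex
    {
        // Hypothetical helper: call once, keep the returned Regex around.
        public static Regex Build(IEnumerable<char> selectedChars)
        {
            StringBuilder pattern = new StringBuilder("[");
            foreach (char c in selectedChars)
            {
                // Four hex digits per char, e.g. 'd' becomes \u0064
                pattern.Append("\\u").Append(((int)c).ToString("X4"));
            }

            // "[]" is not a valid pattern, so reject an empty set
            if (pattern.Length == 1)
            {
                throw new ArgumentException("At least one character is required.", "selectedChars");
            }

            pattern.Append("]");
            return new Regex(pattern.ToString(), RegexOptions.Compiled);
        }
    }
    ```

    For example:

        Regex unwanted = UnwantedCharsRegex.Build("]-^");
        string cleaned = unwanted.Replace("a]b-c^d", ""); // "abcd"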
    
  • Start by using a StringBuilder rather than a string for your replacements.
    See http://blogs.msdn.com/charlie/archive/2006/10/11/Optimizing-C_2300_-String-Performance.aspx
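
    Read literally, that suggestion might look like the sketch below: the original loop is kept, but the replacements are made on a single StringBuilder instead of producing a new string for each character. The class name StringBuilderReplaceExtensions and the method name TrimWithBuilder are invented for this example, and whether it is actually faster than the other options would need measuring.

    ```csharp
    using System.Collections.Generic;
    using System.Text;

    public static class StringBuilderReplaceExtensions
    {
        // Hypothetical variant of the question's method using StringBuilder.Replace
        public static string TrimWithBuilder(this string word, IEnumerable<char> selectedChars)
        {
            StringBuilder builder = new StringBuilder(word);
            foreach (char c in selectedChars)
            {
                // Replacing with an empty string removes every occurrence of c
                builder.Replace(c.ToString(), "");
            }
            return builder.ToString();
        }
    }
    ```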
