Hi,
I have a text file which has a lot of character entries, one line after another.
I want to find all lines which start with ::
and delete all of those lines.
What is the regular expression to do this?
-AD
-
Simple as:
^::
-
^::.*[\r\n]*
If you're reading the file line-by-line you won't need the [\r\n]* part.
-
Regular expressions don't "do" anything. They only match text.
What you want is a tool that uses regular expressions to identify lines and then applies some command to those lines.
One such tool is sed (there's also awk and many others). You'd use it like this:
sed -e "/^::/d" < input.txt > output.txt
The part "/^::/" tells sed to apply the following command to all lines that start with "::", and "d" simply means "delete that line".
Or the simplest solution (which my brain didn't produce for some strange reason):
grep -v "^::" input.txt > output.txt
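For readers without sed or grep at hand, the same line-by-line filtering can be sketched in Python. This is only an illustration of the idea, not part of the original answer: the function name drop_comment_lines and the sample text are made up, and an in-memory stream stands in for input.txt.

```python
import io
import re

# Same pattern as in the sed and grep commands: a line starting with "::".
COMMENT_LINE = re.compile(r"^::")

def drop_comment_lines(lines):
    """Yield every line that does not start with '::' (like grep -v)."""
    for line in lines:
        if not COMMENT_LINE.match(line):
            yield line

# Hypothetical sample input standing in for input.txt.
sample = io.StringIO("keep me\n:: drop me\nkeep :: me too\n::also dropped\n")
result = "".join(drop_comment_lines(sample))
print(result, end="")
```

Because each line is tested on its own, the pattern needs nothing after the initial ^:: and a "::" in the middle of a line is left alone.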
Dscoduc : I think you have forgotten the Regex.Replace function... That actually "does" something, doesn't it?
Joachim Sauer : @Dscoduc: as you said: the function does something (it's one of the tools I mentioned). The regular expression itself still only matches some text. It's the semantics of the function that define what is to be done with the matched text.
Dscoduc : Thanks for the clarification... I stand corrected...
-
sed -i -e '/^::/d' yourfile.txt
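Where an in-place option is not available, the same edit can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: the function name delete_comment_lines_in_place is made up for illustration, the file is read with the default text encoding, and the original file's permissions are not preserved.

```python
import os
import tempfile

def delete_comment_lines_in_place(path):
    """Rewrite `path` without the lines that start with '::'."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it in,
    # so a failure midway does not leave a half-written original.
    # newline="" keeps the existing line endings untouched.
    with open(path, "r", newline="") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False, newline=""
    ) as dst:
        for line in src:
            if not line.startswith("::"):
                dst.write(line)
    os.replace(dst.name, path)
```

Calling delete_comment_lines_in_place("yourfile.txt") would then have roughly the effect of the sed command above.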
oylenshpeegul : I think this is perhaps the best answer, but it might be worth mentioning that not all versions of sed have a -i option.
-
If you don't have sed or grep, find this and replace it with an empty string:
^::.*[\r\n]
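As a hedged illustration of this find-and-replace approach on a whole file's contents, here is a Python sketch. It uses a slight variation of the pattern (an optional line break, so a last line without a trailing newline is matched too), and the sample text is invented.

```python
import re

# (?m) makes ^ match at the start of every line, not only at the start of
# the text. In Python '.' matches every character except '\n', so '.*'
# also swallows the '\r' of a '\r\n' ending; the optional '\n' then
# removes the rest of the line break.
COMMENT_LINE = re.compile(r"(?m)^::.*\n?")

# Hypothetical sample with mixed line endings and no final newline.
text = "first\r\n:: a comment\r\nsecond\n::another\nlast\n:: trailing, no newline"
cleaned = COMMENT_LINE.sub("", text)
print(repr(cleaned))
```

Without the multi-line flag, ^ would only match at the very beginning of the text and at most one line would be removed.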
-
Thanks for the pointers.
The following worked for me. Any character could follow "::" in the text file, so I used:
^::[a-zA-Z0-9 I put all punctuation symbols here]*$
-AD
Manu : You don't need to match anything after the initial ^::. In your example you are forced to "account for" all the characters because you put a $ at the end.
Alan Moore : If he's using a line-oriented tool like grep you're right. But he still hasn't said.
Alan Moore : @goldenmean, what's preventing you from using .* instead of that monster character class?
Dscoduc : I agree, it would probably be better to use a singleline option and add the .* to the expression.
Alan Moore : Single-line? Why would you want the dot to match newline characters? If you read one line at a time, there won't be any newlines to match, and if you read the whole file into memory before processing, the dot-star will consume the rest of the file the first time it's applied.
-
Here's my contribution in C#:
Text stream:
string stream = ":: This is a comment line";
Syntax:
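// Multiline makes ^ match at the start of each line; with Singleline, .* would run to the end of the input.
Regex commentsExp = new Regex(@"^::.*\n?", RegexOptions.Multiline);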
Usage:
Console.WriteLine(commentsExp.Replace(stream, string.Empty));
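The choice of regex option matters once the input holds more than one line. The following Python sketch compares the two behaviours; for this purpose Python's re.DOTALL and re.MULTILINE act much like .NET's RegexOptions.Singleline and RegexOptions.Multiline. The sample text is made up for illustration.

```python
import re

# Hypothetical sample with two comment lines; not from the original post.
text = ":: header comment\nreal data\n:: another comment\nmore data\n"

# re.DOTALL lets '.' match newlines, and without re.MULTILINE '^' only
# matches at the very start of the text: '.*' then runs to the end of
# the input and everything is removed.
dotall = re.sub(r"^::.*", "", text, flags=re.DOTALL)

# re.MULTILINE lets '^' match at the start of each line while '.' still
# stops at '\n', so each comment line is handled on its own.
multiline = re.sub(r"^::.*\n?", "", text, flags=re.MULTILINE)

print(repr(dotall))     # ''
print(repr(multiline))  # 'real data\nmore data\n'
```

With a single-line input such as the one above the two options give the same result; the difference only shows up when the whole file is processed as one string.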
Alternatively, if I simply wanted to take a text file that includes comments and produce an exact duplicate without the comment lines, I could use a simple but effective combination of the type and findstr command-line tools:
type commented.txt | findstr /v /R "^::" > uncommented.txt