This is a quick tip on how to remove empty lines from a large text file using a regular expression in Visual Studio Code.
When I process data and need to do it quickly, I find the best way is to clean it in a simple text editor before loading it into Excel or importing it into a database. My favorite tool for cleaning and normalizing data is Visual Studio Code, thanks to the regular expression support in its Search tab.
What I normally do is start Visual Studio Code and begin removing unneeded lines or words from the text. Recently, I had a few hundred lines of text that I needed to convert to CSV so I could load it into Excel for further processing.
After removing the unneeded data, I ended up with a rather large text file with hundreds of empty lines, like the one in the screenshot. Removing that many empty lines by hand is tedious, unproductive work, which I do not like.
How to remove empty lines using a regular expression
To find the empty lines, do the following:
- Click on the Search icon on the left or use the shortcut Ctrl-Shift-F.
- Enter the regular expression below as the search term.
- Make sure that the Use Regular Expression option is selected next to the Search Term entry.
- Make sure that the Replace entry remains empty.
- Click Replace All to remove the empty lines. This button is next to the Replace field; alternatively, use the shortcut Ctrl-Alt-Enter.
- If you have multiple files open, make sure that you remove the empty lines only from the file you intend to change.
Use this regular expression to find empty lines in your text.
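For readers who would rather script the same cleanup, here is a minimal Python sketch of the idea. The pattern `^[ \t]*\r?\n` is an assumption chosen for illustration (it matches a line holding only spaces or tabs, plus its line break) and is not necessarily the exact expression used in the editor; the `remove_empty_lines` helper name is hypothetical as well.

```python
import re

# Assumed pattern: optional spaces/tabs at the start of a line, followed by
# the line break (with or without a carriage return).
EMPTY_LINE = re.compile(r"^[ \t]*\r?\n", flags=re.MULTILINE)


def remove_empty_lines(text: str) -> str:
    """Return text with empty and whitespace-only lines removed."""
    return EMPTY_LINE.sub("", text)


sample = "a\n\nb\n   \n\nc\n"
cleaned = remove_empty_lines(sample)
print(cleaned)
```

Replacing each match with an empty string mirrors leaving the Replace entry empty in the editor. A final whitespace-only line without a trailing line break is left untouched by this sketch.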
That's it. Enjoy! :)
More tips with regular expressions
How do I use a regular expression to match lines that end with a word?
If your text has lines containing a certain word, such as someline, and you want to remove only the lines that end with this word while keeping lines where the word appears elsewhere (in a URL, for example), use the regular expression someline$ to match them.
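The effect of the `$` anchor can be sketched in Python. The sample lines and the example.com URL are made up for illustration; only the pattern `someline$` comes from the tip itself.

```python
import re

# Hypothetical sample data: the word appears inside a URL, at the end of a
# line, and at the start of a line.
lines = [
    "keep https://example.com/someline/page",
    "remove this someline",
    "someline at the start stays",
]

# "$" anchors the match to the end of the line.
ends_with_word = re.compile(r"someline$")

# Keep every line that does NOT end with the word.
kept_after_end_filter = [line for line in lines if not ends_with_word.search(line)]
print(kept_after_end_filter)
```

Note that the pattern matches only the word itself at the end of the line. To delete the whole line in an editor, the pattern would likely need to be widened, for example to something like `^.*someline$`, which is a suggestion rather than part of the original tip.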
How do I use a regular expression to match lines that begin with a word?
If your text has lines containing a certain word, such as someline, and you want to remove only the lines that begin with this word while keeping lines where the word appears elsewhere (in a URL, for example), use the regular expression ^someline to match them.
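The `^` anchor works the same way from the other side. Below is a small Python sketch with made-up sample lines; only the pattern `^someline` comes from the tip.

```python
import re

# Hypothetical sample data for illustration.
sample_lines = [
    "someline starts this line",
    "see https://example.com/someline for details",
    "this one ends with someline",
]

# "^" anchors the match to the start of the line.
starts_with_word = re.compile(r"^someline")

# Keep every line that does NOT begin with the word.
kept_after_start_filter = [
    line for line in sample_lines if not starts_with_word.search(line)
]
print(kept_after_start_filter)
```

Only the first sample line is dropped; the lines where the word sits in a URL or at the end are kept.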