Celeb Glow
news | March 18, 2026

How do I remove all lines in a file that are less than 6 characters?

I have a file containing approximately 10 million lines.

I want to remove all lines in the file that are less than six characters.

How do I do this?


5 Answers

There are many ways to do this.

Using grep:

grep -E '^.{6,}$' file.txt >out.txt

Now out.txt will contain lines having six or more characters.

Reverse way:

grep -vE '^.{,5}$' file.txt >out.txt
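To see that the two grep commands select the same lines, here is a small Python sketch that checks both patterns against a few sample strings. The sample data and variable names are made up for illustration; Python's re module happens to accept the same {6,} and {,5} syntax.

```python
import re

keep = re.compile(r"^.{6,}$")   # same pattern as grep -E '^.{6,}$'
drop = re.compile(r"^.{,5}$")   # same pattern as grep -vE '^.{,5}$'

# Hypothetical sample lines (without their trailing newlines).
lines = ["", "a", "eeeee", "ffffff", "ggggggg"]

for s in lines:
    # Every line matches exactly one of the two patterns.
    assert bool(keep.search(s)) != bool(drop.search(s))

kept_direct = [s for s in lines if keep.search(s)]        # grep -E
kept_inverted = [s for s in lines if not drop.search(s)]  # grep -vE
print(kept_direct == kept_inverted)
```

Keeping what matches the first pattern and dropping what matches the second give the same result, so the choice between them is a matter of taste.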

Using sed, removing lines of length 5 or less:

sed -r '/^.{,5}$/d' file.txt

Reverse way, printing lines of length six or more:

sed -nr '/^.{6,}$/p' file.txt 

As with grep, you can save the output to a different file using the > operator, or edit the file in place using sed's -i option:

sed -ri.bak '/^.{,5}$/d' file.txt

The original file will be backed up as file.txt.bak and the modified file will be file.txt.

If you do not want to keep a backup:

sed -ri '/^.{,5}$/d' file.txt
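For comparison, the same idea of an in-place edit with a backup can be sketched in Python with the standard fileinput module. This is only an illustration, not a replacement for sed; the function name drop_short_lines_in_place and the throwaway demo file are made up for the example.

```python
import fileinput
import os
import tempfile

def drop_short_lines_in_place(path, min_len=6, backup=".bak"):
    """Rewrite path, keeping only lines with at least min_len characters."""
    # inplace=True sends print() output into the file being read;
    # backup=".bak" keeps the original next to it, much like sed -i.bak.
    with fileinput.input(files=[path], inplace=True, backup=backup) as lines:
        for line in lines:
            if len(line.rstrip("\n")) >= min_len:
                print(line, end="")

# Demonstration on a temporary file (hypothetical contents).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "file.txt")
with open(demo_path, "w") as f:
    f.write("a\nbb\nffffff\nggggggg\n")

drop_short_lines_in_place(demo_path)
```

After the call, file.txt in the temporary directory holds only the two long lines and file.txt.bak holds the original four.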

Using the shell. This is much slower, so don't do this on a large file; it is shown only for the sake of another method:

while IFS= read -r line; do [ "${#line}" -ge 6 ] && printf '%s\n' "$line"; done <file.txt

Using Python, which is even slower than grep and sed:

#!/usr/bin/env python2
with open('file.txt') as f:
    for line in f:
        if len(line.rstrip('\n')) >= 6:
            print line.rstrip('\n')

A list comprehension is more Pythonic, though it builds the whole output in memory before printing:

#!/usr/bin/env python2
with open('file.txt') as f:
    strip = str.rstrip
    print ''.join([line for line in f if len(strip(line, '\n')) >= 6]).rstrip('\n')
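With 10 million lines, a generator may be the gentler option, since it hands lines on one at a time. Below is a minimal Python 3 sketch of that idea; the function name long_lines and the in-memory sample are assumptions for the example, not part of the scripts above.

```python
import io

def long_lines(lines, min_len=6):
    """Yield lines whose content (newline excluded) has at least min_len characters."""
    for line in lines:
        if len(line.rstrip("\n")) >= min_len:
            yield line

# io.StringIO stands in for an open file here; with a real file one could
# write: sys.stdout.writelines(long_lines(open("file.txt")))
sample = io.StringIO("a\nbb\nccc\ndddd\neeeee\nffffff\nggggggg\n")
result = "".join(long_lines(sample))
print(result, end="")
```

Because the generator never holds more than one line, memory use should stay flat regardless of file size.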

It's very simple:

grep ...... inputfile > resultfile #There are 6 dots

This is efficient because grep does not need to parse more than necessary or interpret the characters in any way: as soon as it has seen 6 characters on a line (in a regexp, . matches any single character), it sends the whole line to stdout, which the shell redirects to resultfile.

So grep outputs only lines with 6 or more characters; shorter lines are not printed, so they never reach resultfile.
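The pattern needs no ^ or $ because an unanchored search for six characters succeeds on exactly the lines that have at least six. A quick Python sketch of that equivalence, using re.search as a stand-in for grep and made-up sample lines:

```python
import re

unanchored = re.compile(r"......")   # six dots, as in: grep ......
anchored = re.compile(r"^.{6,}$")    # as in: grep -E '^.{6,}$'

# Hypothetical sample lines (without their trailing newlines).
lines = ["a", "bb", "ccc", "dddd", "eeeee", "ffffff", "ggggggg"]

kept_unanchored = [s for s in lines if unanchored.search(s)]
kept_anchored = [s for s in lines if anchored.search(s)]
print(kept_unanchored)  # ['ffffff', 'ggggggg']
```

Both lists come out the same; the unanchored form simply lets the matcher stop after the sixth character.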

Solution #1: using C

Fastest way: compile and run this C program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BUFFER_SIZE 1000000

int main(int argc, char *argv[])
{
    int length;

    /* Expect two arguments: the file path and the minimum line length. */
    if (argc == 3)
        length = atoi(argv[2]);
    else
        return 1;

    FILE *file = fopen(argv[1], "r");
    if (file != NULL) {
        char line[MAX_BUFFER_SIZE];
        while (fgets(line, sizeof line, file) != NULL) {
            char *pos;
            /* Replace the trailing newline, if any, with a terminator. */
            if ((pos = strchr(line, '\n')) != NULL)
                *pos = '\0';
            if (strlen(line) >= (size_t)length)
                printf("%s\n", line);
        }
        fclose(file);
    } else {
        perror(argv[1]);
        return 1;
    }
    return 0;
}

Compile with gcc program.c -o program and run with ./program file line_length, where file is the path to the file and line_length is the minimum line length (6 in your case). The maximum line length is limited to about 1000000 characters per line; you can change this by changing the value of MAX_BUFFER_SIZE.

(Trick to substitute \n with \0 found here.)

Comparison with all the other solutions proposed to this question except the shell solution (test run on a ~91MB file with 10M lines with an average length of 8 characters):

time ./foo file 6
real 0m1.592s
user 0m0.712s
sys 0m0.160s
time grep ...... file
real 0m1.945s
user 0m0.912s
sys 0m0.176s
time grep -E '^.{6,}$' file
real 0m2.178s
user 0m1.124s
sys 0m0.152s
time awk 'length>=6' file
real 0m2.261s
user 0m1.228s
sys 0m0.160s
time perl -lne 'length>=6&&print' file
real 0m4.252s
user 0m3.220s
sys 0m0.164s
time sed -r '/^.{,5}$/d' file >out
real 0m7.947s
user 0m7.064s
sys 0m0.120s
time ./script.py >out
real 0m8.154s
user 0m7.184s
sys 0m0.164s
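These timings come from one machine and one file, so it may be worth repeating them on your own data. As a rough sketch, the Python 3 snippet below builds a scaled-down test set with an average line length of 8 and times one filter. The helper names make_lines, timed and keep_long are invented for this example, and the uniform 1 to 15 length distribution is an assumption, not the distribution of the file used above.

```python
import random
import string
import time

def make_lines(count, max_len=15, seed=0):
    """Return count random lowercase lines, lengths uniform in 1..max_len."""
    rng = random.Random(seed)
    return [
        "".join(rng.choices(string.ascii_lowercase, k=rng.randint(1, max_len))) + "\n"
        for _ in range(count)
    ]

def timed(func, *args):
    """Call func(*args) and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args)
    return time.perf_counter() - start, result

def keep_long(lines, min_len=6):
    """Keep lines with at least min_len characters, not counting the newline."""
    return [line for line in lines if len(line) - 1 >= min_len]

lines = make_lines(100_000)           # scaled down from 10M lines
elapsed, kept = timed(keep_long, lines)
print(f"kept {len(kept)} of {len(lines)} lines in {elapsed:.3f}s")
```

Writing the generated lines to a file would let the same data be fed to grep, awk, sed and the others under time.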

Solution #2: using AWK:

awk 'length>=6' file
  • length>=6: if length>=6 returns TRUE, prints the current record.

Solution #3: using Perl:

perl -lne 'length>=6&&print' file
  • If length>=6 returns TRUE, prints the current record.

% cat file
a
bb
ccc
dddd
eeeee
ffffff
ggggggg
% ./foo file 6
ffffff
ggggggg
% awk 'length>=6' file
ffffff
ggggggg
% perl -lne 'length>=6&&print' file
ffffff
ggggggg
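The sample file is plain ASCII, where characters and bytes are the same thing. With non-ASCII text the two counts differ, and the tools may not agree on which one they measure: strlen in the C program counts bytes, for instance. A short Python 3 sketch of the difference, using a made-up word:

```python
line = "h\u00e9llo"                   # "héllo": 5 characters
chars = len(line)                     # counts characters
nbytes = len(line.encode("utf-8"))    # counts UTF-8 bytes; é takes two
print(chars, nbytes)                  # 5 6
```

A byte-counting filter with a minimum of 6 would keep this line, while a character-counting one would drop it, so it is worth checking what your file contains.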

You can use Vim in Ex mode:

ex -sc 'v/\v.{6}/d' -cx file
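  1. \v turns on "very magic" mode, so {6} needs no backslash
  2. .{6} matches lines containing at least 6 characters
  3. v inverts the selection, acting on the lines that do not match
  4. d deletes those lines
  5. x saves and closes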

Ruby solution:

$ cat input.txt
abcdef
abc
abcdefghijk
$ ruby -ne 'puts $_ if $_.chomp.length() >= 6 ' < input.txt
abcdef
abcdefghijk

Simple idea: redirect the file into ruby's stdin, and print a line from stdin only if its length is greater than or equal to 6.
