Is there a better way to do this command to find strings in a file excluding special characters?


Problem :

Is there a better way to do this command to find strings in a file excluding special characters?

Currently I’m doing:

strings file.abc | grep -v '=' | grep -v ']' | grep -v ')' | more

I’d like to add more special characters so I’m only getting a-z and A-Z in the results.

Solution :

If you just want to exclude these special characters, you can use a regular expression (e.g., PCRE) like this:

strings file.abc | grep -Pv "[=\])]"
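The same exclusion can probably be done without -P as well. As a minimal sketch, assuming a POSIX-style grep: a ] placed first inside a bracket expression is taken literally, so no backslash is needed. The sample lines here are made up for illustration and stand in for the output of strings.

```shell
# Made-up sample lines standing in for the output of `strings file.abc`.
# In the bracket expression []=)], the leading ] is literal, so the class
# means "any one of ] = )". -v drops every line that contains one of them.
filtered=$(printf '%s\n' "hello" "a=b" "arr]" "f(x)" "world" | grep -v '[]=)]')
printf '%s\n' "$filtered"
```

More characters can be added inside the brackets, as long as ] stays first.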

If you want to display only strings containing some specific characters, you could use grep instead of strings.

The command

grep -Poa "[A-Za-z]{4,}" file.abc

shows all runs of at least four consecutive letters.

Here:

  • The -o switch makes grep show only the match (rather than the entire line).
  • The -a switch forces treating a binary file as a text file.
  • The PCRE [A-Za-z]{4,} matches four or more consecutive letters.

    Four is the default number that strings uses. Adjust as needed.
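To see the switches above in action, here is a minimal sketch on a throwaway file. It assumes that -E (extended regular expressions) is an acceptable stand-in for -P, since this particular pattern uses nothing PCRE-specific; the temporary file and its contents are made up for illustration.

```shell
# Create a small throwaway file that mixes letters, digits, punctuation,
# and a NUL byte (so grep would normally treat it as binary).
sample=$(mktemp)
printf 'abc\000Hello=World]12345 test)ok\n' > "$sample"

# -a: treat the binary file as text
# -o: print only the matching parts, one per line
# -E: extended regex, enough for [A-Za-z]{4,}
# LC_ALL=C keeps the A-Z and a-z ranges limited to ASCII letters.
words=$(LC_ALL=C grep -aoE '[A-Za-z]{4,}' "$sample")
printf '%s\n' "$words"

rm -f "$sample"
```

Runs shorter than four letters (abc and ok here) are dropped, like strings would do with its default minimum length.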

How about

strings file.abc | grep '^[A-Za-z]*$'

?

That will give you only the lines that consist entirely of letters.

Actually, you probably want lines containing only one or more sequences of letters;
i.e., lines containing letters and spaces.  If that’s what you want, do

strings file.abc | grep '^[A-Za-z ]*$'

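with a space after the z.
If you decide that you want to include any other characters, put them inside the brackets.
(Warning: some characters will be tricky,
such as the quote character itself, ', and the right bracket, ].
A ] has to come first inside the brackets to be taken literally,
and a ' cannot appear inside a single-quoted shell string, so the pattern then needs double quotes.)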
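As a minimal sketch of one way to handle those two tricky characters: put ] first inside the brackets and switch the shell quoting to double quotes so that ' can appear in the pattern. The function name filter_lines and the sample lines are made up for illustration.

```shell
# Hypothetical helper: keep only lines made of letters, spaces, ' and ].
# - ] comes first inside the brackets, so it is taken literally.
# - The pattern is in double quotes so it can contain ';
#   the final $ is escaped so the shell leaves it alone.
filter_lines() {
  LC_ALL=C grep "^[]A-Za-z' ]*\$"
}

# Made-up sample lines standing in for the output of `strings file.abc`.
kept=$(printf '%s\n' "plain words" "it's fine" "index]" "a=b" "f(x)" | filter_lines)
printf '%s\n' "$kept"
```

Lines containing = or parentheses are rejected because those characters are not listed inside the brackets.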
