
#22, 08-17-2008
B. R. 'BeAr' Ederson
Re: Windows freeware unique sort technique for large text files (hosts)

On Sun, 17 Aug 2008 12:07:23 -0700 (PDT), Anand Hariharan wrote:

>> The following command line should contain all commands in a one liner:
>>
>> sed "/127\.0\.0\.1/d" hosts | tr '[A-Z]' '[a-z]' | sort -u | sed "1i127.0.0.1 localhost" > hosts


> Bad idea. My guess of how the OP is using the hosts file is to set the
> IP address of known malicious sites as 127.0.0.1. You'd at least want
> to append your sed's search expression with '[:space:]*localhost' before
> deleting *ALL* lines that contain 127.0.0.1.


You are absolutely right. :-( Actually, it should have been:

sed "/^127\.0\.0\.1/d"...

I thought about adding a filter for possible leading whitespace, since
some hosts files are formatted this way. But the already long command
line became too unreadable. While deleting the whitespace class
operator I must have killed the leading caret by accident... :-(
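
For what it's worth, a whitespace-tolerant variant might look roughly like
the sketch below. It assumes the goal is to drop only the localhost line
(so it can be re-added on top later), not every entry pointing at
127.0.0.1. The file names hosts.sample and hosts.filtered and the sample
entries are made up for illustration, and the [[:space:]] class needs a
POSIX-style sed such as the GNU port.

```shell
# Hypothetical sample data; the real hosts file is not shown in this thread.
workdir=$(mktemp -d)
printf '%s\n' \
  '127.0.0.1 localhost' \
  '   127.0.0.1   localhost' \
  '127.0.0.1 ads.example.com' \
  '10.0.0.5 fileserver' > "$workdir/hosts.sample"

# Delete only lines that map 127.0.0.1 to localhost, with optional
# leading whitespace. Other 127.0.0.1 entries are kept.
sed '/^[[:space:]]*127\.0\.0\.1[[:space:]][[:space:]]*localhost[[:space:]]*$/d' \
  "$workdir/hosts.sample" > "$workdir/hosts.filtered"

cat "$workdir/hosts.filtered"
```

The sketch writes to a separate file on purpose: redirecting the output
onto the very file being read truncates it before sed gets to see it.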

> sort has a -f option, so the tr is not required.


I used tr because it had already been suggested in this thread and
produces nicer-looking output. The mixed-case result of the sort
process using the -f option is probably harder to look through, if
the need arises.
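
To illustrate the difference, here is a small sketch comparing both
approaches. The three sample entries are invented, and LC_ALL=C is set
only to keep the ordering predictable; neither is taken from the OP's
setup.

```shell
# Hypothetical sample entries, two of which differ only in letter case.
sample=$(mktemp)
printf '%s\n' \
  '127.0.0.1 Ads.Example.com' \
  '127.0.0.1 ads.example.com' \
  '127.0.0.1 TRACKER.example.net' > "$sample"

# Variant 1: fold case for comparison only; the surviving lines keep
# whatever spelling they had in the input.
folded=$(LC_ALL=C sort -f -u "$sample")

# Variant 2: lowercase everything first, then sort and drop duplicates.
lowered=$(LC_ALL=C tr 'A-Z' 'a-z' < "$sample" | LC_ALL=C sort -u)

printf 'sort -f -u:\n%s\n\ntr | sort -u:\n%s\n' "$folded" "$lowered"
```

Both variants end up with two lines here, but only the tr variant yields
uniformly lowercase entries; with sort -f -u the result stays mixed case.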

> Had the OP used your above command line, he'd have lost all entries that
> corresponded to malicious web-sites in his hosts file.


At least she would have a more manageable hosts file size. ;-)

BeAr
--
===========================================================================
= What do you mean with: "Perfection is always an illusion"? =
===============================================================--(Oops!)===