Fantastic Unix Forums  


Comparing two text files with non-adjacent lines for unique entries

  #1 (permalink)  
Old 06-27-2008
tntelle@yahoo.com

I am trying to find an easy and fast way to compare two files, each
with several thousand lines and only one column, and spit out what is
unique to only one of the files.
So, compare file A and file B, and only lines that are unique to file A
are spit out to a new file. comm, diff, and sort | uniq do not
work because in this case the two files will have non-adjacent lines.

Any help is GREATLY appreciated. Thank you in advance!
-TT

  #2 (permalink)  
Old 06-27-2008
John L


<tntelle@yahoo.com> wrote in message news:1187142111.587829.251580@q3g2000prf.googlegroups.com...
> I am trying to find an easy and fast way to compare two files, each
> with several thousand lines - only one column and spit out what is
> unique only to one of the files.
> So, compare file A and file B, and only lines that re unique to file A
> are spit out to a new file.. comm and diff / sort and Uniq do not
> work because in this case the two files will have non-adjacent lines.
>


cat A A B | sort |uniq -u
awk 'FNR==NR{Seen[$0]++} FNR!=NR && !Seen[$0]' A B

I am not sure what you mean by "non-adjacent lines".
And note that the two solutions above give different results for
lines that appear more than once in B: it is not clear what you want.
Surely solutions based on diff or comm will work if you first sort
A and B?

--
John.
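
For readers following along, here is a minimal sketch of the "sort first, then comm" idea raised above, assuming the goal is the lines present in A but absent from B. The file names A and B come from the thread; the function name `lines_only_in_a` and the temp-file handling are illustrative assumptions, and `mktemp` is assumed to be available.

```shell
#!/bin/sh
# Print lines that occur in file $1 but not in file $2.
# comm needs sorted input, so both files are sorted into temp copies first.
# Assumption: mktemp exists on this system (it is common but not in POSIX).
lines_only_in_a() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 1; }
    # -u also drops repeated lines, so each result is reported once.
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # -2 hides lines found only in the second file, -3 hides common lines.
    comm -23 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Something like `lines_only_in_a A B > only_in_A.txt` would then write the result to a new file. The output comes back sorted and de-duplicated, so the original line order of A is not preserved.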


  #3 (permalink)  
Old 06-27-2008
tntelle@yahoo.com

On Aug 15, 2:37 am, William James <w_a_x_...@yahoo.com> wrote:
> John L wrote:
> > <tnte...@yahoo.com> wrote in message news:1187142111.587829.251580@q3g2000prf.googlegroups.com...
> > > I am trying to find an easy and fast way to compare two files, each
> > > with several thousand lines - only one column and spit out what is
> > > unique only to one of the files.
> > > So, compare file A and file B, and only lines that re unique to file A
> > > are spit out to a new file.. comm and diff / sort and Uniq do not
> > > work because in this case the two files will have non-adjacent lines.

>
> > cat A A B | sort |uniq -u
> > awk 'FNR==NR{Seen[$0]++} FNR!=NR && !Seen[$0]' A B

>
> > I am not sure what you mean by "non-adjacent lines".
> > And note that the two solutions above give different results for
> > lines that appear more than once in B: it is not clear what you want.
> > Surely solutions based on diff or comm will work if you first sort
> > A and B?

>
> > --
> > John.

>
> Since he wants the lines that are in A but not in B,
> I think the order of the files should be reversed.
>
> awk 'NR==FNR{seen[$0]++; next} !seen[$0]' B A
>
> Another way:
>
> awk 'NR==FNR{seen[$0]; next} !($0 in seen)' B A
>
> One Ruby solution:
>
> ruby -e 'def lines;gets(nil).split("\n") end; puts lines - lines' A B


Thank you all.
When I attempt awk 'NR==FNR{seen[$0]++; next} !seen[$0]' B A, I get:
awk: syntax error near line 1
awk: bailing out near line 1

=/
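
That pair of messages is what some legacy awk implementations print, for example the old default /usr/bin/awk on Solaris. The thread does not say which system is in use, so this is only a guess, but if that is the cause, running the same program under a newer awk may be enough. A hedged sketch follows; the function name `only_in_first` and the list of candidate awk binaries are assumptions, not something stated in the thread.

```shell
#!/bin/sh
# Print lines of file $1 that never appear in file $2, keeping the order of $1.
# Tries newer awk implementations first and falls back to plain "awk".
only_in_first() {
    # With an empty second file the NR==FNR test would match the first
    # file instead, so handle that case up front: nothing to exclude.
    [ -s "$2" ] || { cat "$1"; return; }

    awk_bin=awk
    for candidate in /usr/xpg4/bin/awk nawk gawk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            awk_bin=$candidate
            break
        fi
    done

    # While reading $2 (NR == FNR): remember each line.
    # While reading $1: print lines that were never remembered.
    "$awk_bin" 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' "$2" "$1"
}
```

Usage would look like `only_in_first A B > only_in_A.txt`. Unlike a sort-based approach, this keeps A's original order and also keeps lines that are repeated within A.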
