Unix Shell Programming. Post here for discussion in the comp.unix.shell newsgroup.
I am trying to find an easy and fast way to compare two files, each with several thousand lines and only one column, and spit out what is unique to only one of the files. So, compare file A and file B, and only the lines that are unique to file A are spit out to a new file. comm, diff, and sort with uniq do not work because in this case the two files will have non-adjacent lines.

Any help is GREATLY appreciated. Thank you in advance!

-TT
<tntelle@yahoo.com> wrote in message news:1187142111.587829.251580@q3g2000prf.googlegroups.com...
> I am trying to find an easy and fast way to compare two files, each
> with several thousand lines - only one column and spit out what is
> unique only to one of the files.
> So, compare file A and file B, and only lines that are unique to file A
> are spit out to a new file.. comm and diff / sort and uniq do not
> work because in this case the two files will have non-adjacent lines.

    cat A A B | sort | uniq -u

    awk 'FNR==NR{Seen[$0]++} FNR!=NR && !Seen[$0]' A B

I am not sure what you mean by "non-adjacent lines".
And note that the two solutions above give different results for
lines that appear more than once in B: it is not clear what you want.

Surely solutions based on diff or comm will work if you first sort
A and B?

--
John.
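The sort-then-comm route suggested in the reply above can be sketched roughly as follows. This is only an illustration under a few assumptions: the function name `only_in_first` is made up for the example, `mktemp` is assumed to be available, and `sort -u` is used so that repeated lines are treated as a single entry (which may or may not be what the original poster wants).

```shell
# only_in_first FILE_A FILE_B
# Print the lines that occur in FILE_A but not in FILE_B.
# (Hypothetical helper name; not something from the thread.)
only_in_first() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }

    # comm expects both inputs sorted in the same collation order,
    # so sort and comm are all run under LC_ALL=C.
    # -u collapses duplicate lines (an assumption about the desired result).
    LC_ALL=C sort -u "$1" > "$tmp_a"
    LC_ALL=C sort -u "$2" > "$tmp_b"

    # -2 drops lines only in the second file, -3 drops lines common to both,
    # leaving the column of lines found only in the first file.
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"

    rm -f "$tmp_a" "$tmp_b"
}
```

Something like `only_in_first A B > only_in_A.txt` would then write the result to a new file. The output comes back in sorted order, not in the original order of A.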
On Aug 15, 2:37 am, William James <w_a_x_...@yahoo.com> wrote:
> John L wrote:
> > <tnte...@yahoo.com> wrote in message news:1187142111.587829.251580@q3g2000prf.googlegroups.com...
> > > I am trying to find an easy and fast way to compare two files, each
> > > with several thousand lines - only one column and spit out what is
> > > unique only to one of the files.
> > > So, compare file A and file B, and only lines that are unique to file A
> > > are spit out to a new file.. comm and diff / sort and uniq do not
> > > work because in this case the two files will have non-adjacent lines.
> >
> >     cat A A B | sort | uniq -u
> >
> >     awk 'FNR==NR{Seen[$0]++} FNR!=NR && !Seen[$0]' A B
> >
> > I am not sure what you mean by "non-adjacent lines".
> > And note that the two solutions above give different results for
> > lines that appear more than once in B: it is not clear what you want.
> >
> > Surely solutions based on diff or comm will work if you first sort
> > A and B?
> >
> > --
> > John.
>
> Since he wants the lines that are in A but not in B,
> I think the order of the files should be reversed.
>
>     awk 'NR==FNR{seen[$0]++; next} !seen[$0]' B A
>
> Another way:
>
>     awk 'NR==FNR{seen[$0]; next} !($0 in seen)' B A
>
> One Ruby solution:
>
>     ruby -e 'def lines;gets(nil).split("\n") end; puts lines - lines' A B

Thank you all. When I attempt

    awk 'NR==FNR{seen[$0]++; next} !seen[$0]' B A

I get:

    awk: syntax error near line 1
    awk: bailing out near line 1

=/
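A "syntax error near line 1 / bailing out near line 1" message is what the legacy `awk` on some systems (Solaris is the usual example) prints when it meets newer awk syntax. That is a guess about the poster's system, not something the thread confirms. Assuming that is the cause, a sketch like the one below tries `nawk` or `/usr/xpg4/bin/awk` before falling back to plain `awk`. The function name `unique_to_first` is invented for the example.

```shell
# unique_to_first FILE_A FILE_B
# Print the lines of FILE_A that do not appear in FILE_B, keeping A's order.
# (Hypothetical helper name; the awk program is the one posted in the thread.)
unique_to_first() {
    # Assumption: on systems with a legacy awk, nawk or the XPG4 awk
    # understands this syntax. Elsewhere plain awk is normally fine.
    if command -v nawk >/dev/null 2>&1; then
        awk_cmd=nawk
    elif [ -x /usr/xpg4/bin/awk ]; then
        awk_cmd=/usr/xpg4/bin/awk
    else
        awk_cmd=awk
    fi

    # While reading the first file named (B), NR equals FNR: remember each line.
    # While reading the second file (A), print lines that were never remembered.
    "$awk_cmd" 'NR == FNR { seen[$0]; next } !($0 in seen)' "$2" "$1"
}
```

Called as `unique_to_first A B > only_in_A.txt`, this keeps duplicates and the original line order of A. One caveat with the `NR == FNR` idiom: if B is empty, the condition also holds while reading A, so nothing is printed.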