
Re: Compare two large directories?

  • Subject: Re: Compare two large directories?
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Fri, 11 Nov 2005 12:14:05 +0000
  • Newsgroups: alt.os.linux.suse
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <3tj2gkFt2lo5U1@individual.net> <cm5e43-rka.ln1@penne.houghi> <3tjbhmFt5lnnU1@individual.net>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ac] on Friday 11 November 2005 10:53 \__

> houghi wrote:
>> ac wrote:
>>> I want to compare two large (50GB each) directories and do not know good
>>> ways of doing this.
>>> They are almost the same from reports from Properties (KDE), but I want
>>> to know more precisely.
>> What do you want to compare? Install dirdiff and try that.
> Thanks I will try.
> The directories are nominally identical - one should be a copy of the
> other - but they report slightly different numbers of files and bytes.
> Some are very old files from earlier machines, the stone ages, and
> other OS's. I have found that on a few occasions a few files 'cannot
> be read', including the occasional JPEG. I am now trying to find out
> more about what is happening.
> A while back I was still using Samba, and I concluded at the time that
> I was somehow occasionally getting into trouble copying across the LAN.
> One PC is a PIII (with around 390MB of RAM) and the files are on a
> recent Seagate HD. Samba seems to have had trouble reading some of the
> files, so the copy process stops and asks to skip the file. The
> original file can (often) still be read anyway, but is probably
> somehow corrupted (?)
> I guess there are a lot of possibilities, including a problematic HD -
> I had a similar, newer Seagate drive fail recently. I copied the
> directory recently and there are 3 files fewer on the copy; I used NFS
> for the copy to the modern PC.
> There are a large number of files, some not big, and although I might
> not care about the apparently 'lost' ones, I do care about knowing
> what is going on. With hundreds of thousands of files it is tedious to
> look at them individually..... :-)
> Comments appreciated.

I personally do this when I make backups and want a 'sanity check'. One thing
I do is:

du -a data1 > new; du -a data2 > old

diff new old > diff

It isn't great at flagging the differences themselves (typically you
just hope for an empty diff file, and renamed files will show up as
noise). You could also do a file count, if that's good enough in your
situation:

alias nfiles='find . -type f | wc -l'

nfiles data1; nfiles data2
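Since the worry here is silent corruption rather than just missing
names, two further checks may help. This is only a sketch, assuming
GNU coreutils/diffutils, with `data1` and `data2` standing in for your
two trees: `diff -rq` names files that differ or exist on one side
only, and a checksum pass catches files that copied over with the same
size but different bytes.

```shell
# Name every file that differs or is missing on one side,
# without printing the differing contents themselves.
diff -rq data1 data2

# Checksum every file under each tree, with the paths sorted so
# the two lists line up, then compare the lists. Any surviving
# line is a file that differs, is missing, or could not be read.
( cd data1 && find . -type f -print0 | sort -z | xargs -0 md5sum ) > sums1
( cd data2 && find . -type f -print0 | sort -z | xargs -0 md5sum ) > sums2
diff sums1 sums2
```

Note that `diff -rq` reads both trees in full, so on 50GB it will take
a while; the checksum lists have the advantage that you can keep them
and re-verify a tree later with `md5sum -c`.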

Best wishes,

