__/ [ac] on Friday 11 November 2005 10:53 \__
> houghi wrote:
>> ac wrote:
>>> I want to compare two large (50GB each) directories and do not know good
>>> ways of doing this.
>>> They are almost the same from reports from Properties (KDE), but I want
>>> to know more precisely.
>> What do you want to compare? Install dirdiff and try that.
> Thanks I will try.
> The directories are nominally identical - one should be a copy of the
> other - but they report as having slightly different numbers of files
> and bytes. Some are very old files from earlier machines, the stone ages,
> and other OSes. I have found that on a few occasions, a few files
> 'cannot be read', including the occasional jpeg. I am now trying to find
> out more about what is happening.
> A while back I was still using samba, and I then concluded that somehow
> I was getting trouble in some way sometimes copying across the LAN.
> One pc is a PIII (390MB ram or near) and the files are on a recent
> seagate HD. Samba seems to have had some trouble not being able to read
> some of the files, so the copy process stops and asks to skip the file.
> The original file can (often) be read ok anyway, but is probably somehow
> corrupted (?)
> I guess there are a lot of possibilities, including a problematic HD - I
> had a similar newer seagate drive fail recently. I copied the directory
> recently and there are 3 files fewer on the copy, I used nfs for the copy
> to the modern pc.
> There are a large number of files, some are not big, and although I
> might not care about the apparently 'lost' ones, I do care about knowing
> what is going on. With 100's of 1000's of files it is tedious to look
> individually..... :-)
> comments appreciated.
I personally do this when I make backups and want a 'sanity check'. One thing
I do is:
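# run du from inside each tree so the paths line up, and sort by path
(cd data1 && du -a . | sort -k 2) >new
(cd data2 && du -a . | sort -k 2) >old
diff new old >diff.txt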
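It isn't great at pinpointing what actually differs. It is most useful when
you expect the diff to come out empty, or when you are looking for missing or
renamed files. Bear in mind that du reports disk usage, which can differ
between filesystems even when the contents are identical. You could also just
count the files, if that is good enough in your situation: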
# a function rather than an alias, since an alias cannot take the directory as an argument
nfiles() { find "$1" -type f | wc -l; }
nfiles data1; nfiles data2
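
Since some of your files apparently 'cannot be read', comparing checksums may tell you more than sizes or counts. Here is a rough sketch, assuming GNU md5sum and mktemp are installed and that no filename contains a newline. The function names manifest and compare_trees are made up for illustration:

```shell
#!/bin/sh
# manifest DIR: print "checksum  ./relative/path" for every regular
# file under DIR, sorted by path so two trees can be compared line by line.
manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}

# compare_trees DIR1 DIR2: diff the two manifests.
# Returns 0 if every file matches, non-zero otherwise.
compare_trees() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    manifest "$1" > "$tmp_a"
    manifest "$2" > "$tmp_b"
    diff "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return $status
}
```

Run it as `compare_trees data1 data2`. Lines starting with '<' are files that exist only in data1 or differ there, lines starting with '>' are the same for data2, and a file whose contents differ shows up once on each side. Files that cannot be read should appear as md5sum errors on stderr. Expect it to take a while on 50GB, since every file gets read in full.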