The web made sales games a lot easier, in particular the bait and switch.

I just resuscitated an old hard drive, bought an external enclosure for it.  Hooked it up.  Nice.  Now I want to clean it up.  Does it have stuff I may be missing from other drives?  Can’t tell, so many files and many duplicates.

So, the first step is to remove the duplicate files.  Do a web search for Win7 duplicate file utilities.  What a mess.  It's hard to figure out what is a legitimate offering.   I opened a forum where the question is asked: what is a good duplicate file finder?  One response was that “XYZ Duplicate File Finder” (I changed the name of the actual tool) was easy and free.  Did the developer of the software post it?  Is it really free?

I visited the site.  xyz_site.  Well, it's a ‘com’ site, but that can mean anything.  For example, they can sell their product, but give away crippled versions or older products for free, like winzip.com.   The thing is, nowhere on the site does it state the price.  If you dig into the site there is a page that indicates you have to pay for it:  the help-register-xxx.html   Yet, even here there is no price!  In fact, the license expires, but on the license FAQ page it says you only lose the ability to receive free updates.

I don’t mean to single out the makers of this software.  Perhaps I missed the price somewhere or misunderstood the site itself.  This type of site is very common in the Windows world.   I’m all for people making an honest buck, but the key here is honest.

Let's look at the winzip site.  Right on the first page, it gives the price.  That’s nice.  What I don’t like about it though is that there is a download button.  What, I can download a trial or free version?  Nope, click the button and way down at the bottom of the resulting download page they tell you it's for registered users only.  You probably won’t even see that unless you scroll the window.  Huh?  Why not call the “Download” button the “Upgrade” button instead?

I don’t think you see this kind of stuff in the Linux or Unix world.  Probably because people in the *nix world expect every useful utility to be free or eventually built into the OS itself (OpenSolaris has dedupe via ZFS)?

I guess in the Windows world free means free not for the user, but for the sellers, who are free to do whatever they want.

Now how do I remove duplicate files?  Maybe I can just write a script to do it.  I could create a database and then query for duplicates; that must be easy in SQL?  I wonder how it's done in Linux, probably a one-line Perl script.
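
Here is a rough sketch of that database idea in Python, just to see what it might look like.  It is not taken from any of the tools mentioned here.  It walks a folder, stores each file's path, size, and SHA-256 digest in an in-memory SQLite table, and then asks SQL for the digests that show up more than once.  The names `file_digest` and `find_duplicates`, the `files` table, and the choice of SHA-256 are all my own assumptions for illustration.

```python
import hashlib
import os
import sqlite3


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group lists paths with identical content."""
    db = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per regular file found under root.
    db.execute("CREATE TABLE files (path TEXT, size INTEGER, digest TEXT)")

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                row = (path, os.path.getsize(path), file_digest(path))
            except OSError:
                continue  # unreadable file: skip it rather than abort
            db.execute("INSERT INTO files VALUES (?, ?, ?)", row)

    # The SQL part: digests that occur more than once are duplicates.
    dup_digests = db.execute(
        "SELECT digest FROM files "
        "GROUP BY digest HAVING COUNT(*) > 1 ORDER BY digest"
    ).fetchall()

    groups = []
    for (digest,) in dup_digests:
        rows = db.execute(
            "SELECT path FROM files WHERE digest = ? ORDER BY path",
            (digest,),
        ).fetchall()
        groups.append([r[0] for r in rows])

    db.close()
    return groups
```

Calling `find_duplicates` on the root folder of the drive returns the groups of identical files.  It only reports them and deletes nothing, which seems the safer default.  Hashing every file is slow on a big drive; comparing sizes first and hashing only the files whose sizes match would probably cut the work down a lot.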

By the way, I had some luck with Duplicate File Finder 0.8.0 by Matthias Boehm.  Also, in the Linux world this can be done with very clever bash scripting.  Here is a sample scripting approach.

What I don’t like about the tools I’ve seen so far is that they are very bad in terms of usability.  They should look at how diff and merge tools, like KDiff3, do things.  Another example to look at is the graphical approach found in D-Dupe.

