Tag Archives: perl

File renaming tools

Long ago, I wrote a utility (brename) that renames a set of files based on a supplied pattern. (Imagine you had an arbitrary set of JPEGs and you wanted to pretend they all came from a digital camera with names like IMG_0001, IMG_0002, etc. – that’s my favorite use case for brename. It’s really more of a re-numbering than a renaming tool.)
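For anyone curious what that kind of re-numbering amounts to, here is a minimal Python sketch. It is not brename itself (whose source isn't shown here); the function names `renumber_plan` and `apply_plan`, the `IMG_` prefix default, and the four-digit width are all assumptions chosen to match the example above.

```python
from pathlib import Path


def renumber_plan(names, prefix="IMG_", width=4, start=1):
    """Return (old, new) pairs that number the names in sorted order.

    Each file keeps its own extension; only the stem is replaced.
    """
    plan = []
    for offset, name in enumerate(sorted(names)):
        suffix = Path(name).suffix  # e.g. ".jpg"
        plan.append((name, f"{prefix}{start + offset:0{width}d}{suffix}"))
    return plan


def apply_plan(directory, plan):
    """Carry out the renames inside `directory`, refusing to overwrite anything."""
    directory = Path(directory)
    for old, new in plan:
        target = directory / new
        if target.exists():
            raise FileExistsError(target)
        (directory / old).rename(target)
```

Calling `renumber_plan(["beach.jpg", "cat.JPG", "apple.jpg"])` gives `apple.jpg` the name `IMG_0001.jpg`, `beach.jpg` the name `IMG_0002.jpg`, and `cat.JPG` the name `IMG_0003.JPG`. Building the plan separately from applying it makes it easy to look the list over before anything is touched.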

I also have a tool I call pmv, an alias for Larry Wall’s perl rename tool. (Mine is version 3.0.1.2 from August of 1990, which was something like this “fork” of version 4.2 I found here.) I use pmv when what I want to do is more complicated than brename will permit. (Interestingly, the version of perl rename I use will force filename case changes on Macs, which like to pretend that ABC.txt and abc.txt aren’t different names, while the newer version won’t.)

But I also recently stumbled across mmv. It’s like the perl rename tool but with error checking beforehand. The downside is that you can’t (easily?) limit the application of your pattern to some set of files. It’s like coming up with a rename expression s/before/after/ and applying it to *. (Not only that, but reading the man page leads me to think it’s over-eager to apply that pattern not just to * but to **.)
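To make "error checking beforehand" concrete, here is a rough Python sketch of the idea: compute every new name first, look for trouble, and only then rename. This is my own illustration, not how mmv is implemented, and its real checks may well differ. The names `plan_renames` and `check_plan` are made up for the example. Because the caller supplies the list of names, the pattern applies only to the files you choose.

```python
import re


def plan_renames(names, pattern, replacement):
    """Apply a regex substitution to each name; keep only the names that change."""
    regex = re.compile(pattern)
    plan = []
    for name in names:
        new = regex.sub(replacement, name)
        if new != name:
            plan.append((name, new))
    return plan


def check_plan(plan, existing):
    """Return a list of problems found before any file is touched.

    `existing` is the set of names already present in the directory.
    A target that is itself being renamed away is not flagged here;
    applying such a plan safely would need careful ordering, which
    this sketch does not attempt.
    """
    problems = []
    sources = {old for old, _ in plan}
    seen = {}
    for old, new in plan:
        if new in seen:
            problems.append(f"{old} and {seen[new]} both map to {new}")
        else:
            seen[new] = old
        if new in existing and new not in sources:
            problems.append(f"{new} already exists")
    return problems
```

For example, stripping the digits from `a1.txt` and `a2.txt` would send both to `a.txt`, and `check_plan` reports that collision instead of letting one file clobber the other.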

And what about renameutils? I have something like its qmv. The idea is you print a list of filenames and bring it into an editor for a human to fix there. Way back in the 1980s I used an awk command to do this; it was something like this:

$ ls -1 *.c | awk '{printf("mv %20s %s\n",$1,$1);}' > list ; $EDITOR list

The only problem with qmv is that the “plan” you create isn’t saved. Typically, the files I want to rename (especially when there are more than a few, when qmv should shine) are backed up somewhere else, and I’d prefer to apply the same plan to the backup folder, rather than copying the (same) files there and then deleting the originals.
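What I have in mind is something like the following Python sketch: write the old/new pairs to a file, then replay that file against any folder holding the same names. Nothing here comes from qmv or renameutils. The tab-separated plan file and the function names (`save_plan`, `load_plan`, `replay_plan`, `demo`) are assumptions made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def save_plan(plan, plan_file):
    """Write (old, new) pairs as tab-separated lines."""
    with open(plan_file, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh, delimiter="\t").writerows(plan)


def load_plan(plan_file):
    """Read the pairs back, skipping blank lines."""
    with open(plan_file, newline="", encoding="utf-8") as fh:
        return [tuple(row) for row in csv.reader(fh, delimiter="\t") if row]


def replay_plan(directory, plan):
    """Apply the plan in `directory`; skip missing sources and occupied targets."""
    directory = Path(directory)
    done = []
    for old, new in plan:
        source, target = directory / old, directory / new
        if not source.exists() or target.exists():
            continue
        source.rename(target)
        done.append((old, new))
    return done


def demo():
    """Rename in a working folder, save the plan, then replay it on a backup."""
    with tempfile.TemporaryDirectory() as tmp:
        work, backup = Path(tmp, "work"), Path(tmp, "backup")
        for folder in (work, backup):
            folder.mkdir()
            (folder / "draft.txt").write_text("x")
        plan = [("draft.txt", "final.txt")]
        replay_plan(work, plan)
        save_plan(plan, Path(tmp, "plan.tsv"))
        replay_plan(backup, load_plan(Path(tmp, "plan.tsv")))
        return sorted(p.name for p in backup.iterdir())
```

Since `replay_plan` returns the pairs it actually renamed, comparing that list with the full plan shows which files were missing from the backup.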

Google Refine

A while ago, Google bought Metaweb, the company behind Freebase and Freebase Gridworks, a tool for making sense of messy data. Earlier this week, they released a 2.0 version of that software, now renamed Google Refine. Watch the videos to see what that does.

This looks pretty darned impressive. For great chunks of my career, I’ve been doing work like that the hard way. In the 1980s, I started my career by doing data reduction in Fortran, but quickly graduated to sed and awk, and in the 2000s I used perl and ruby. Of course, when I say “the hard way,” that is in hindsight. Each of those was an improvement over what I used before, and this looks like it could be a similar type of improvement.

(I still do some of that kind of work even now. It’s been a couple of years, but I probably spent at least a week, spread across too many evenings and weekends, massaging the church directory from a text format Word document into tabular spreadsheet data.)