Pandoc for Word Document conversion

I just discovered pandoc. Well, I first bookmarked it in 2008, and again in 2016, so I guess I rediscovered it. But what I mean is that I finally discovered what to use it for: converting Word files to Markdown. It’s dead easy:

$ pandoc -f docx -t markdown sample.docx > sample.md

I’ve been using Antiword for years to convert Word 97-2003 (DOC) files to text, but it doesn’t do DOCX, and, instead of producing Markdown or something more neutral, it tries to recreate the DOC experience in text by centering lines, etc. Not complaining: it gets me plain text and I can take it from there, but Markdown is a big improvement. DOCX support is even better, since, apart from pandoc, the only way I knew to read those files at the command line was via Libre/OpenOffice:

$ libreoffice --headless --convert-to "txt:Text (encoded):UTF8" sample.docx > sample.txt

(I see, now that it is too late, that there is also code to do this in Ruby: antiword-xp-rb. I hope that’s an awesome tool, but it took me 9 years to figure out what to do with pandoc so don’t wait for me to tell you.)
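
The pandoc one-liner above extends naturally to a whole folder of Word files. What follows is only a minimal sketch: the function name docx_to_md and the PANDOC variable are made up for illustration, and it assumes pandoc is on your PATH. It uses pandoc's -o option to name the output file instead of redirecting stdout.

```shell
#!/bin/sh
# Sketch: convert every .docx file in a directory to Markdown.
# PANDOC and docx_to_md are illustrative names, not part of pandoc itself.
PANDOC="${PANDOC:-pandoc}"

docx_to_md() {
    dir="${1:-.}"
    for src in "$dir"/*.docx; do
        # When nothing matches, the glob is left as a literal string; skip it.
        [ -e "$src" ] || continue
        out="${src%.docx}.md"
        # Same -f/-t options as the one-liner; -o writes straight to the file.
        "$PANDOC" -f docx -t markdown "$src" -o "$out" || echo "failed: $src" >&2
    done
}
```

Calling `docx_to_md .` leaves a sample.md next to each sample.docx in the current directory; the original Word files are not touched.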

Google Refine

A while ago, Google bought the company that made Freebase Gridworks, a tool for making sense of messy data. Earlier this week, they released a 2.0 version of that software, now renamed Google Refine. Watch the videos to see what it does.

This looks pretty darned impressive. For great chunks of my career, I’ve been doing work like that the hard way. In the 1980s, I started my career by doing data reduction in Fortran, but quickly graduated to sed and awk, and in the 2000s I used Perl and Ruby. Of course, when I say “the hard way,” that is in hindsight. Each of those was an improvement over what I used before, and this looks like it could be a similar type of improvement.

(I still do some of that kind of work even now. It’s been a couple of years, but I probably spent at least a week, spread across too many evenings and weekends, massaging the church directory from a text format Word document into tabular spreadsheet data.)

Cool Software: PDFClerk Pro

Until a couple of hours ago, I’d never heard of PDFClerk Pro. But some website or other (dealmac?) alerted me to a bargain price for it on MacUpdate. I downloaded it, tried it out, and sprang for the $25 price after about 20 minutes’ worth of fiddling.

Why? After all, I’m a Mac fanboi. And one of the many benefits of working on a Mac is that it comes with Preview, which allows you to do 95% of what you might want to do with PDFs: reorder pages, combine pages from multiple files, etc. I use Preview’s PDF-editing features 10-20 times a week, if not more. So why do I need PDFClerk Pro?

Blue Screens

Man, I’m sick of Windows. The secretary’s machine at church got infected with something a couple of weeks ago. I was only able to get rid of it by reinstalling Windows. I got an antivirus solution set up and spent, well, a couple of hours, but it seemed like a month, uninstalling all the crap-ware and getting everything down to the bare minimum. My next project was to make a Ghost-type image, to avoid all that work the next time. But I don’t know how to make a Ghost image on Windows, so I put it off until I had a couple of hours to figure out what to do.

That was a bad decision. Today, we got this:

[Image: Crash 1 - Windows PC. The infamous Blue Screen of Death]

And we got it every time we rebooted, early in the boot process. So early, I don’t know any way past it. So now I need to come up with some kind of recovery media and boot off that, and save all her data.

Then I need to migrate us away from Quicken and replace it with some kind of cloud-based Web 2.0 service.

And, honestly, if I get that far, then we’re replacing Windows with Linux, because Quicken is the last Windows-only app we use.

One less Linksys WRT-54G in the world

Well, actually, no. We still have it. But it’s unplugged. At the next garage sale, we’ll get rid of the carcass.

I replaced my Linksys WRT-54G wireless router with an ASUS WL-520GU. Out of the box, the ASUS is a better deal with far superior features. These include using static IPs, so I can permit individual machines rather than allowing all my neighbors to crack my WPA key at their leisure. Another useful feature is meaningful logging. The Linksys and my Westell DSL modem didn’t work well together; I had to reboot the pair of them about 2-3 times a week. If the Linksys had kept logs, I could have told whether the problem was in it or in the Westell modem. Now I can find out.

But that’s the out-of-box firmware. The ASUS router also works with Tomato, an aftermarket firmware upgrade that provides a slew of additional features. There are similar projects for Linksys routers, but all the Linksys routers I’ve ever found are cost-reduced, emasculated versions too lacking in RAM or Flash memory to work with any of the replacement firmware.

So. I’m a happy ASUS customer for three reasons: better out-of-box features, the potential for even more features when or if I get around to upgrading the firmware, and (best of all) I get to retire a blue Linksys router. What’s not to like?

Reinstalling Vista

I managed to find a buyer for my Inspiron 1525 laptop. (No thanks to eBay and the Nigerian crooks who have made it useless for selling computers.)

But then my buyer tried to install software on it. And he ran into two problems. The first is that the battery seems not to hold a charge for very long. That one is news to me, but, then, I rarely used the battery except as a UPS; mostly I ran the computer off wall-current. Anyway, the buyer (we’ll call him Mr. X) was installing some software into his new computer, when it powered down because the battery went south.

That’s when problem two occurred. It’s called “Vista”. Somehow the crash (I’m told) clobbered the system so he got the NTLDR.SYS message. That means the HD is corrupted. I don’t know if the OS is susceptible to corruption when it crashes due to a power failure. (Poor design, if so.) Or possibly Mr. X was installing some virus-ridden L337 W4REZ and the virus clobbered NTLDR.SYS. I don’t know.

So here I am now, with a laptop I’d allowed myself to hope I was done with, and the task of reinstalling Vista. (So I can figure out what to do about the battery.) What fun that is.

MacSpeech Dictate (Review)

We just installed MacSpeech Dictate on the iMac. I’m using it right now to type this review. I have to say, having used it just a bit, it is an awesome program.

When I first installed the software, I had to train it by reading a few sentences. That only took perhaps five minutes. Then I was ready to go. I goofed around for a few moments, trying to think of sentences to throw at it. I was curious how well it understood church jargon, so I made up a few sentences about the Trinitarian controversy and about Tertullian. It was funny that the speech software knew how to spell Tertullian, which the Mac’s spellcheck service underlined in red.

I was trying to think of a real world test. Here’s what I decided to do.

I recently finished a book called Visioneering, by Andy Stanley. Whenever I read a book, I underline things that catch my interest as I go along. That way, the next time I read the book, the highlights will jump out at me. What occurred to me was that I could read those highlights into the computer and then I would be able to grep on them.

So I did it. I flipped through the whole book, reading everything I’d underlined. There were 97 quotes totaling 1808 words. Reading them in took me about 30 minutes. (Mind you, I’d never used this software before, nor any other voice recognition software.)

But the truly amazing thing is that one of my children was in the next room playing “Lego Star Wars” the whole time, with the volume turned up to 11.

Maybe the reason I’ve had such good luck is my Princeton-trained voice. After all, I’m a preacher: I use my voice every Sunday. Maybe that’s why the software was able to understand me so well. But considering the technical vocabulary I was using and the noisy environment, I’m still impressed with how good this software is. I look forward to getting a copy to use at work.
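
The grep idea mentioned above could look something like this. It is a minimal sketch, assuming the dictated highlights were saved one quote per line in a plain-text file; the file name and the search_quotes function are hypothetical.

```shell
#!/bin/sh
# Hypothetical file holding the dictated highlights, one quote per line.
QUOTES_FILE="${QUOTES_FILE:-visioneering-highlights.txt}"

# Print every quote containing the given pattern, ignoring case,
# prefixed with its line number in the file.
search_quotes() {
    grep -i -n -- "$1" "$QUOTES_FILE"
}
```

Running `search_quotes vision` would then list each matching highlight with its line number, which makes it easy to find the quote again in the file.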

Standup Desk at Church

I got rid of the behemoth.

[Photo: stand-up desk at church]

When I moved my day off from Friday to Monday, I still wrote my sermons on Friday, but no longer on my day off. As a result, I no longer wrote them at home. Which in turn meant I came home with all kinds of bursitis and odd aches and pains from trying to type a few thousand words in a couple of hours at a non-ergonomic workstation. Ergo, the stand-up desk (“bar table”) I use at home must be more ergonomic than the gigantic desk at church.

So for my birthday, more or less, Mrs. Mess of Pottage bought me a new desk. My arms feel better, but my feet are sore. (The blue shock-absorbing mat at the bottom of the desk is a late-afternoon addition.)