Moving Evernote notes into WordPress

proprietary insecurity

I’ve accumulated many notes (2000+) in Evernote over the years, and love that it can store binary attachments such as images or other media files. My favorite feature is the Evernote Web Clipper browser extension; it does a fantastic job of saving just the parts of an article I want while keeping the styling intact.

Evernote has a free plan which I’ve enjoyed for a long time, but recently the financial status of the company has come into question, and they restricted syncing to only two devices. Also, the last thing I want is another Google Reader-style shutdown fiasco. I doubt that a shutdown would make my existing notes disappear, but it’s better to be prepared ahead of time. To that end, I’ve been looking for a viable way to migrate my notes to another platform.

Apache, Fastcgi, PHP 7 on Debian Wheezy & Ubuntu 14.04

Intro: The Tyranny of Prefork

There are a lot of tutorials out there that go through the rote instructions on upgrading your Debian or Ubuntu system to use PHP 7. While I’m sure most of them are fine, they assume you’d want to use either the prefork process model or the event/threaded model via CGI (through the proxy and fcgi modules). While prefork is certainly battle-tested, it uses a ton more memory than it needs to, so I’m going to document how to upgrade an existing FastCGI install to PHP 7.

distribution: histograms in the terminal

My new favorite tool is a Python program called distribution that can easily show histograms in your terminal:

I used Homebrew to install it, but you can see some usage examples and a few other tools on this Stack Overflow page. I eagerly anticipate showing off some histograms to people.

Debian server DNS bogosity

Note: I’m running my Raspberry Pi as a server, and NetworkManager is not installed.

I discovered that if you want to manually assign search and nameserver entries in your /etc/resolv.conf file, you can’t just add the relevant entries to the static entry in /etc/network/interfaces:

For some unknown reason, the resolvconf utility will still attempt to query an upstream DHCP server to get additional name service data. I don’t know why it works this way; I believe it should be hands-off if you’ve specified static in your interfaces file. I finally found that dhcpcd was being called to get the info, and added the following line to /etc/dhcpcd.conf to disable actions relating to eth0:

I suppose if I wanted additional interfaces to work properly using DHCP, I’d have to get rid of all this and configure each interface manually via NetworkManager or wicd.

GNU xargs is missing the -J option. WHY!?!

I find that using an idiom like

is so useful. It replaces the replstr (“%” in this example) with all the arguments at once, or as many as can fit without going over the system’s limit. I couldn’t believe it when I learned that the GNU version of xargs lacks this flag. Yes, it’s only on the BSD xargs as far as I can tell.

Every time I’ve searched, someone suggests using the -I flag on GNU xargs instead, but they are not quite the same. The -I flag substitutes the replstr one argument at a time, so that in the earlier example, instead of executing

only once, with the -I flag it will instead do

I’ve also tried using the -n and -L flags, but they are mutually exclusive with each other and with -I. OK, so we need some kind of klugey workaround.

This adds the “bar/” suffix to the standard input before adding it to the end of the mv command. “But,” you say, “those strings are supposed to be null-terminated!” True, but we’re providing a suffix rather than an extra replacement argument, so the EOF signaled from the input stream is really all we need.

There’s another way, more intuitive but harder to get right, which is to take the argument list from the output of a subshell command:

But this doesn’t handle weird file names correctly. Instead one could do:

This actually works better for file names, but lacks the flexibility of find.
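One more workaround that should behave the same on GNU and BSD xargs is to hand the batch to a tiny sh -c wrapper, so the destination can come after the file list. This is only a sketch and not something from the commands above: the scratch directory, the file names, and the dest variable are made up for the demo. It also shows the one-at-a-time behavior of -I for contrast.

```shell
# Scratch area so the demo touches nothing real (all names are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/bar"
touch "$demo/src/plain.foo" "$demo/src/with space.foo"

# With -I the command runs once per argument, so this prints one "mv" line per file.
per_arg=$(find "$demo/src" -name '*.foo' -print0 | xargs -0 -I % echo mv % bar/ | wc -l)

# With sh -c, xargs appends the whole batch after the fixed arguments.
# "sh" fills $0, the destination arrives as $1, and the files are what is left after shift.
find "$demo/src" -name '*.foo' -print0 |
  xargs -0 sh -c 'dest=$1; shift; mv -- "$@" "$dest"' sh "$demo/bar"
```

If the input is too long for a single command line, xargs runs the wrapper more than once, and each run still gets the destination as its first argument. GNU xargs also runs the command once on empty input unless you pass -r, so mv would complain if find matched nothing.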

Is this stuff really what we ought to do? Just give us the -J, GNU. If you know a different way to deal with this, tweet me @realgeek and I’ll update this post.

Unison dependency hell

I would really like to rid myself of Dropbox, but all the alternatives I’ve tried are too bloated, alpha- or beta-quality, too complicated to set up, or just plain don’t do what Dropbox does (minus the sharing stuff, which I don’t care about). I don’t want btsync; it’s closed-source. Seafile is too complicated, and makes dubious security claims. ownCloud is a cool project, but their file sync is slow, error-prone, and has other limitations. There are some good services, but they don’t run on all the platforms I need, including Mac OS X, Linux x86 (32 and 64-bit), Linux ARMv6 (my Raspberry Pi B) and Android. I ran Syncthing for a while, but the continuous memory usage is pretty steep for the Pi, and I’ve experienced random silent file truncation in my shared directories with it. So I needed something else.

WordPress performance problem with many posts

If you have a ton of posts in your WordPress blog (we have over 35K in one site at work), it turns out that the Previous and Next links on each post may be running a tough query on your database.

I wanted to know why MySQL was using so much CPU and wrongly assumed it was due to a bad tuning effort (it usually is). I googled “SELECT p.ID FROM wp_posts AS p INNER JOIN wp_term_relationships AS tr ON p.ID = tr.object_id INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id” which was in my output of MySQL’s show full processlist command. It led me to this StackExchange page which showed an alternative, more efficient version of the WP function calls that produce those previous and next links.

In our case, we just didn’t need those links and our theme let us turn them off from the admin. MySQL’s CPU usage dropped instantly and dramatically.

Allow webapps to make outgoing requests

I was experiencing a pretty bad slowdown while trying to use the admin pages of a WordPress site recently. The load on the machine was quite low, so I began to suspect that it was trying to call out to external services (Facebook, Pinterest, etc.) that might have been blocked by CSF (ConfigServer Firewall).

I started playing around with tcpdump and friends and then realized that the information I was looking for (blocked outgoing requests) was already being logged in /var/log/kern.log on our Ubuntu system (same on Debian).

Prepare a PDF file for OCR

If you need to OCR some text from a PDF or image file, you may want to use a tool like tesseract to do the job. But it won’t take just any old input file; you’ll probably need to convert it first.

The first error I got from tesseract was

The Googles indicated that I can’t pass a PDF to it directly. Then I found that one format it will take is TIFF.

Discard first column without AWK

UPDATE: Major derp moment on my part, thinking that you needed a loop in AWK to print all but one field. Commandlinefu just caused a forehead-slapping moment when I saw this in my feed:

So, it seems AWK wins again. Carry on.

If you’re trying to print one or more particular columns from some input, it is quite straightforward with AWK. You’d simply specify the field variable(s) you know exist in the input. However, it’s pretty AWKward (sorry) to omit one column of data and print the rest, particularly if you don’t know exactly how many columns of input to expect on each line. Then you’d need to actually program a loop in AWK. Ugh.
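For illustration, here’s a minimal non-AWK sketch using cut. The rest of the post isn’t shown here, so treat this as one possible approach rather than necessarily the one it goes on to describe; the sample rows and variable names are made up.

```shell
# Made-up sample input: rows with different numbers of columns.
rows=$(printf '%s\n' 'a b c' '1 2 3 4')

# -d ' ' makes a single space the delimiter; -f 2- keeps field 2 through the last field,
# however many fields the line has.
rest=$(printf '%s\n' "$rows" | cut -d ' ' -f 2-)
printf '%s\n' "$rest"
```

One caveat: cut treats every single delimiter character as a separator, so runs of spaces produce empty fields, and a line with no delimiter at all is passed through whole unless you add -s.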

Raspberry Pi can do fast video encoding

Yes, the Raspberry Pi can do fast video encoding. Of course you normally wouldn’t want to re-encode any video with an ARM processor, but that’s not what we’re going to do here. We’re going to leverage the GPU. I should point out before proceeding that the input formats for re-encoding are limited with this method; more about that below.

In order to do this, I’m using a proof-of-concept tool called omxtx, which I think is supposed to be a shortened form of “OpenMAX Transcoding”. Off the top of my head, here are the prerequisites for building the binary from source:

  • Raspbian. It will probably work on other RPi distros, but I haven’t tried them.
  • The build-essential package installed, which you normally need to build anything.
  • Memory split of 64MB for video. I previously had this all the way down to 16 since I don’t use a display on my Pi, but bumping it to only 32MB caused runtime errors from the omxtx binary. You need to give the GPU some breathing room to encode video.
  • There are probably some libraries, which you may or may not have installed, that the build wants to link against. When I run ldd on my finished binary, it loads all kinds of media libs like libav, libvorbis, libvpx, etc. YMMV.
