Debian server DNS bogosity

Note: I’m running my Raspberry Pi as a server, and NetworkManager is not installed.

I discovered that if you want to manually assign search and nameserver entries in your /etc/resolv.conf file, you can’t just add the relevant entries to the static stanza for the interface in /etc/network/interfaces.

For some unknown reason, the resolvconf utility will still attempt to query an upstream DHCP server to get additional name service data. I don’t know why it works this way; I believe it should be hands-off if you’ve specified static in your interfaces file. I finally found that dhcpcd was being called to get the info, and added a line to /etc/dhcpcd.conf to disable its actions relating to eth0.
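The exact line is not reproduced in this post. As a sketch, assuming the directive in question is dhcpcd's denyinterfaces option (which tells dhcpcd to ignore any interface matching the given name), the addition to /etc/dhcpcd.conf would look something like this:

```
# /etc/dhcpcd.conf
# Assumption: denyinterfaces is the directive used; it makes dhcpcd skip eth0.
denyinterfaces eth0
```

dhcpcd needs a restart (or a reboot) before the change takes effect.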

I suppose if I wanted additional interfaces to work properly using DHCP, I’d have to get rid of all this and configure each interface manually via NetworkManager or wicd.

Sometimes, you don’t have any inodes left

You notice something is wrong with your system. I’ll just put this error message here for the sake of the Googles:

That stinks.

GNU xargs is missing the -J option. WHY!?!

I find the -J flag of xargs so useful, in an idiom like piping a list of files into xargs -J % mv % bar/. It replaces the replstr (“%” in this example) with all the arguments at once, or as many as can fit without going over the system’s limit. I couldn’t believe it when I learned that the GNU version of xargs lacks this flag. As far as I can tell, it exists only in BSD xargs.

Every time I’ve searched, someone suggests using the -I flag on GNU xargs instead, but the two are not quite the same. The -I flag substitutes the replstr one argument at a time, so instead of executing the command only once with all the arguments, xargs runs it once for every input item.
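To make the difference concrete, here is a small sketch that uses echo in place of a real command, so nothing actually gets moved. The names a, b, c and bar/ are placeholders.

```shell
# Feed three arguments to xargs; echo stands in for the real command.
per_arg=$(printf 'a\nb\nc\n' | xargs -I % echo mv % bar/)

# With -I, xargs runs the command once per input line, so this prints:
#   mv a bar/
#   mv b bar/
#   mv c bar/
printf '%s\n' "$per_arg"

# On BSD xargs, -J would instead build a single command with every argument:
#   printf 'a\nb\nc\n' | xargs -J % echo mv % bar/
# It is left commented out here because GNU xargs rejects -J.
```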

I’ve also tried using the -n and -L flags, but they are mutually exclusive with each other and with -I. OK, so we need some kind of klugey workaround.

The workaround appends the “bar/” destination to the standard input, so that it lands at the end of the mv command as the final argument. “But,” you say, “those strings are supposed to be null-terminated!” True, but we’re providing a suffix rather than an extra replacement argument, so the EOF signaled from the input stream is really all we need.
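The suffix trick can be sketched roughly as follows. This is not necessarily the original command: the scratch directory, the file names, and the *.txt pattern are all made up for illustration.

```shell
# Work in a scratch directory so nothing real gets moved.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir bar
touch 'a.txt' 'b c.txt'

# Emit the NUL-terminated file names, then the destination as the final item.
# xargs -0 treats the unterminated trailing "bar/" as the last argument,
# so mv ends up being called with both files followed by bar/.
{ find . -maxdepth 1 -type f -name '*.txt' -print0; printf 'bar/'; } | xargs -0 mv

ls bar
```

One caveat: if the list is long enough for xargs to split it across several invocations, only the last one receives the destination, so this only suits modest batches.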

There’s another way that is more intuitive but harder to get right: build the argument list from the output of a subshell command.

But this suffers from not handling weird file names the right way. Instead one could do:

This actually works better for file names, but lacks the flexibility of find.
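The file name problem is easy to reproduce. The sketch below counts how many words each approach hands to a command. It assumes the alternative is a plain shell glob, which the post does not spell out, and the file names are invented.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'plain.txt' 'with space.txt'

# Unquoted command substitution is split on whitespace, so the second
# name arrives as two separate words.
set -- $(find . -maxdepth 1 -type f -name '*.txt')
split_count=$#

# A glob expands to exactly one word per file, whatever the name contains.
set -- ./*.txt
glob_count=$#

echo "command substitution: $split_count words, glob: $glob_count words"
```

Two files become three words in the first case and stay two in the second, which is why the glob is safer but cannot express the tests that find can.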

Is this stuff really what we ought to do? Just give us the -J, GNU. If you know a different way to deal with this, tweet me @realgeek and I’ll update this post.

Unison dependency hell

I would really like to rid myself of Dropbox, but all the alternatives I’ve tried are too bloated, still alpha or beta quality, too complicated to set up, or just plain don’t do what Dropbox does (minus the sharing stuff, which I don’t care about). I don’t want btsync; it’s closed-source. Seafile is too complicated and makes dubious security claims. Owncloud is a cool project, but its file sync is slow, error-prone, and has other limitations. There are some good services, but they don’t run on all the platforms I need: Mac OS X, Linux x86 (32- and 64-bit), Linux ARMv6 (my Raspberry Pi B), and Android. I ran Syncthing for a while, but the continuous memory usage is pretty steep for the Pi, and I’ve experienced random silent file truncation in my shared directories with it. So I needed something else.

WordPress performance problem with many posts

If you have a ton of posts in your WordPress blog (we have over 35K in one site at work), it turns out that the Previous and Next links on each post may be running a tough query on your database.

I wanted to know why MySQL was using so much CPU and wrongly assumed it was due to a bad tuning effort (it usually is). I googled “SELECT p.ID FROM wp_posts AS p INNER JOIN wp_term_relationships AS tr ON p.ID = tr.object_id INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id” which was in my output of MySQL’s show full processlist command. It led me to this StackExchange page which showed an alternative, more efficient version of the WP function calls that produce those previous and next links.

In our case, we just didn’t need those links and our theme let us turn them off from the admin. An instant and dramatic drop in MySQL’s CPU usage ensued.

Allow webapps to make outgoing requests

I was experiencing a pretty bad slowdown while trying to use the admin pages of a WordPress site recently. The load on the machine was quite low, so I began to suspect that it was trying to call out to external services (Facebook, Pinterest, etc.) that might have been blocked by CSF (ConfigServer Security & Firewall).

I started playing around with tcpdump and friends and then realized that the information I was looking for (blocked outgoing requests) was already being logged in /var/log/kern.log on our Ubuntu system (same on Debian).

Prepare a PDF file for OCR

If you need to OCR some text from a PDF or image file, you may want to use a tool like tesseract to do the job. But it won’t take just any old input file; you’ll probably need to convert it first.

The first error I got from tesseract was

The Googles indicated that I can’t pass a PDF to it directly. Then I found that one format it will take is TIFF.

Discard first column without AWK

UPDATE: Major derp moment on my part, thinking that you needed a loop in AWK to print all but one field. Commandlinefu just caused a forehead-slapping moment when I saw a loop-free AWK one-liner for this in my feed.

So, it seems AWK wins again. Carry on.

If you’re trying to print one or more particular columns from some input, it is quite straightforward with AWK: you simply name the field variable(s) you know exist in the input (e.g., $1). However, it’s pretty AWKward (sorry) to omit one column of data and print the rest, particularly if you don’t know exactly how many columns of input to expect on each line. Then you’d need to actually program a loop in AWK. Ugh.
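Neither one-liner survives in this excerpt, so the following is only a sketch of the two approaches. It assumes cut as the non-AWK tool and uses a common loop-free AWK form, and the sample line is invented.

```shell
line='one two three four'

# cut: keep field 2 through the end, using a single space as the delimiter.
with_cut=$(printf '%s\n' "$line" | cut -d ' ' -f 2-)

# awk: blank the first field, then strip the separator left behind.
with_awk=$(printf '%s\n' "$line" | awk '{ $1 = ""; sub(/^ /, ""); print }')

# Both print: two three four
printf '%s\n%s\n' "$with_cut" "$with_awk"
```

The two are not interchangeable on messy input: cut treats every single space as a delimiter, while awk collapses runs of whitespace when it rebuilds the line.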

Raspberry Pi can do fast video encoding

Yes, the Raspberry Pi can do fast video encoding. Of course you normally wouldn’t want to re-encode any video with an ARM processor, but that’s not what we’re going to do here. We’re going to leverage the GPU. I should point out before proceeding that this method only accepts a limited set of input formats for re-encoding; more about that below.

In order to do this, I’m using a proof-of-concept tool called omxtx, which I think is supposed to be a shortened form of “OpenMAX Transcoding”. Off the top of my head, here are the prerequisites for building the binary from source:

  • Raspbian. It will probably work on other RPi distros, but I haven’t tried them.
  • The build-essential package installed, which you normally need to build anything.
  • Memory split of 64MB for video. I previously had this all the way down to 16 since I don’t use a display on my Pi, but bumping it to only 32MB caused runtime errors from the omxtx binary. You need to give the GPU some breathing room to encode video.
  • There are probably some libraries, which you may or may not have installed, that the build wants to link in. When I run ldd on my finished binary, it loads all kinds of media libs like libav, libvorbis, libvpx, etc. YMMV.
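For the memory split, here is a sketch assuming a Raspbian install where the firmware settings live in /boot/config.txt (raspi-config can change the same value):

```
# /boot/config.txt
# Reserve 64 MB of RAM for the GPU; takes effect after a reboot.
gpu_mem=64
```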


Clone hard disk with rsync

I recently wanted to move a system over to a faster, larger SSD. I didn’t want to have to re-install an OS, figure out which old files to transfer over, and then re-configure everything. That’s not a fun time in my book.

Here’s what I did (on a live system, yeah!) to clone my disk. Note that this may cause data loss, don’t blame me, keep backups, blah blah…

First, use a partition tool like GNU parted to create a nice big partition on the new drive and mark it as bootable. Leave some space for other partitions or swap space. If you use a separate /boot partition, then I think that needs the bootable flag instead. I’m only using a single root partition and swap. For the purposes of this tutorial, I’ll call my new root partition /dev/sdb1. YMMV.

Wait a while.

Take note of the UUID listed for /dev/sdb1.

Or use whatever editor you like and put the UUID for /dev/sdb1 in place of the existing UUID for /.
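The UUID swap can also be scripted. The sketch below works on a scratch copy of an fstab with made-up UUIDs, so it is safe to run as-is. On a real system the new value would come from something like blkid -s UUID -o value /dev/sdb1, and the file to edit would be the etc/fstab on the new drive.

```shell
# Hypothetical UUID; substitute the one blkid reports for the new partition.
new_uuid='1111-2222'

# Scratch stand-in for the cloned drive's etc/fstab.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
# / was on /dev/sda1 during installation
UUID=aaaa-bbbb  /     ext4  errors=remount-ro  0  1
UUID=cccc-dddd  none  swap  sw                 0  0
EOF

# Rewrite only the non-comment line whose mount point (second field) is "/".
awk -v uuid="$new_uuid" \
    '$1 !~ /^#/ && $2 == "/" { $1 = "UUID=" uuid } { print }' \
    "$fstab" > "$fstab.new"

cat "$fstab.new"
```

Writing to a separate file and reviewing it before replacing the original keeps a typo from leaving the new drive unbootable.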

Now you should just need to swap out the drives.

Self-hosted open source RSS readers

I think I’ve tried pretty much all of them. After the Google Reader-pocalypse, one of the primary requirements was that I could host it myself. Bonus points go to apps that have configurable keyboard navigation (“j” to open the next item must be distinct from “space” to just scroll down in the browser), as well as decent integration on mobile. Here’s a roundup of the ones I’ve tried.


Newsblur

Awesome platform, but way too big for someone looking to host their own personal solution. I tried upgrading it once and broke it. No idea what I did wrong or even how to figure out why it wasn’t working. Seems very well designed for a massive multi-user operation, though, if you’ve got the Python chops to figure everything out.


Commafeed

Commafeed is also a larger piece of software, but requires many fewer components than Newsblur. You need Java, some Java tools like Maven, a DB, and of course more than a little bit of RAM.

TT-RSS (Tiny Tiny RSS)

Nice, but not as configurable as I’d like. This and the rest of the readers listed are written in PHP. There are three larger downsides to tt-rss:

  • I had quite a bit of trouble trying to get it to run from a subdirectory on Nginx. This is not necessarily specific to tt-rss; many apps are hard to configure this way.
  • The primary developer is not friendly. He seems to take pleasure in ridiculing people in the support forums.
  • Although it’s supposed to be tiny, and the application part is, it requires Postgres or MySQL with InnoDB support. I would prefer something that uses less memory on the DB side, either MyISAM tables or better yet SQLite.


SelfOSS

I ran SelfOSS for a while and liked it. However, I didn’t like the Android experience (what, no swipe?) so I went looking for something else.


FreshRSS

I’m currently running FreshRSS and it’s really, really good. But I’m starting to get discouraged by a few nagging bugs and the lack of recent updates to the GitHub repo.


Miniflux

I ran Miniflux for a short time a while back and my memory is a bit hazy on the experience (after a while RSS reader experiences tend to blend in with one another). I think I’m going to give it another shot. On the developer’s site, reading down the list of what Miniflux is not versus what it is makes me take heart. The developer is clearly trying to convey a no-BS attitude with his intentions for this app. One thing that gives me a spark of hope is that there was a new point release this month. I will update this post with any news about Miniflux.

Finding call-time pass by references in PHP.

While trying to move an older code base to a newer system and thus a newer version of PHP (5.3 -> 5.5), I knew that some of the code would need to be changed to avoid using some removed features. Specifically, I mean call-time pass by references. For those who don’t know, this is kind of a weird feature of earlier versions of PHP that allows one to call a function and pass any of the arguments by reference rather than the usual call by value if the caller prepends an argument variable with the “reference to” operator &.

So, to illustrate, normally this code won’t have side effects because of call by value:

However there will be side effects if the caller chooses pass by reference:

I thought a regex might be in order to find these guys and fix them, but it was a naive idea, and the regex devolved (heh) into something unwieldy before I realized I could just use the built-in linter (php -l) to find the problem spots.