Reduce PNG sizes from Mac Finder

When I save a screenshot on my systems, nothing automatically crushes/minifies the PNG. Here’s how I added a context menu item on my Mac that crushes image files. This can help save storage space or make web pages load faster.

  1. Ensure you have pngquant installed. Other tools may work for you, but this post is pngquant-specific. I got it via Homebrew.
  2. Open Automator.
  3. Choose Service.
  4. In the Service Receives drop-down, select image files.
  5. In the left pane, select Utilities under Library.
  6. From the middle pane, drag Run Shell Script to the right pane.
  7. At the top of the box that was just created, set the shell to /bin/bash and set “Pass input” to “as arguments”.
  8. Paste in a shell script along these lines; mine is a minimal sketch that just loops pngquant over each selected file:
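for f in "$@"; do
    # Overwrite each selected PNG in place with a quantized copy.
    # --skip-if-larger keeps the original when quantizing wouldn't shrink it.
    # Automator doesn't inherit your shell's PATH, hence the absolute path
    # (Homebrew's default on Intel Macs; /opt/homebrew/bin on Apple Silicon).
    /usr/local/bin/pngquant --ext .png --force --skip-if-larger "$f"
done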
  9. Choose a name to save the script as; this is the name you’ll see when right-clicking to bring up the context menu. I named mine png_crush.

Now when you right-click a PNG in Finder, you can go to Services and click the png_crush entry. It works on multiple selected files as well.

MemcacheD is your friend

MemcacheD Is Your Friend is an object caching plugin for WordPress that offers faster access to cached objects, especially if your database resides on a different host. The problem this time around is that it scopes some of its members as private, which makes it incompatible with some plugins. I’ve run into this before when trying to import data into WordPress.

This time, I was trying to get ElasticPress working on a site where the Memcached object cache was already set up. Thankfully, someone on a GitHub support thread helpfully pointed to a fork they made of the long-neglected MemcacheD Is Your Friend plugin. I haven’t had a chance to try it yet, but I wanted at least a placeholder here in case it comes up again.

Prolong the life of the SD card in your Raspberry Pi

The web is littered with stories of people who love their Raspberry Pis but are disappointed to learn that the Pi often eats its SD card. I’ve recovered a card once; a few others were destroyed beyond recovery. I’ll lay out how I use This One Weird Trick(tm), ahem, to try to prolong the life of the SD card.

First I should point out that my Pi storage layout is not typical. I basically followed this guide to boot from SD card, but run the root filesystem on a flash drive. While the stated purpose of the guide is to help reduce activity on the SD card (and improve storage performance somewhat), I come at the SD card corruption issue from a different perspective.

In my view, the corruption is most likely caused by a timing bug somewhere low in the design or implementation of the hardware itself. Writing to the card less often probably reduces the chances of corruption, but my feeling is that once a Pi has been powered on for a certain amount of time, you can’t really predict when the bug will manifest. I don’t believe most instances of SD card corruption happen in the first hours or days after a Pi boots, so my goal was to write to the card only within that initial window, if possible.

After following the guide linked above, the SD card is now only hosting the /boot partition. After init has started on / (the external storage), we really don’t need /boot any longer. In the middle of my /etc/rc.local file, I’ve added
mount -o ro,remount /boot
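
For context, a stripped-down rc.local might look like this (a sketch; yours will likely have other startup tasks before the remount):

#!/bin/sh -e
# /etc/rc.local (sketch) -- runs at the end of boot
# ... other startup tasks here ...
# Root is on the flash drive now, so /boot on the SD card can go read-only.
mount -o ro,remount /boot
exit 0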

In typical usage of a running system, /boot doesn’t really need to be mounted read-write. Of course, if you forget it’s mounted read-only, things like apt-get upgrade or rpi-update can fail. Now when I want to run those commands, I first reboot the Pi (getting back into that fresh-boot window), then remount the /boot partition with
sudo mount -o remount,rw /boot

Once the updating is done, I reboot again and leave /boot read-only.
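
The whole update routine therefore boils down to something like this (a sketch; the reboots are manual steps rather than one script):

sudo reboot
# ...after logging back in, within the fresh-boot window:
sudo mount -o remount,rw /boot
sudo apt-get update && sudo apt-get upgrade
sudo reboot
# rc.local flips /boot back to read-only on the way up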

Finding how much time Apache requests take

When a request is logged in Apache’s common or combined format, it doesn’t actually show you how much time each request took to complete. To make reading logs a bit more confusing, each request is logged only once it completes, so a long-running request may have an earlier start time but appear later in the log than quicker requests.

To get some timing info without going deep enough to need a debugger, I decided that step one was to use a custom log format that records the total request time. After adding usec:%D to the end of my Apache custom log format, we can see how long various requests take to complete.
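
For example, a combined-style log format with the timing field tacked on might look like this (the format name and log path are my own placeholders):

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" usec:%D" combined_usec
CustomLog /var/log/apache2/access.log combined_usec

With that in place, here’s a quick-and-dirty ranking of the slowest of the last thousand requests per log: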

tail -q -n 1000 *access.log | mawk -F '(:|GET |POST | HTTP/1.1|")' '{print $NF" "$6}' | sort -nr | head -n 100 > /tmp/heavy2

I’m using the “%D” format for compatibility with older Apache releases; it reports the response time in microseconds. I would prefer milliseconds, but when I tried “%{ms}T” on a server running 2.4.7, it didn’t work; that form needs 2.4.13 or later. The output is a bit hard to read when the numbers get long, so we can add a little visual aid with commas as thousands separators (printf’s %'d needs a locale that defines grouping, such as en_US.UTF-8).

cat /tmp/heavy2 | xargs -L 1 printf "%'d %s\n" | less

Note that because we are measuring the total request time, some of the numbers may be high due to remote network latency or a slow client. I recommend correlating several samples before blaming some piece of local application code.

Hope this helps you find your long-running requests!

Speed of the sort command

GNU sort is normally crazy fast at what it does. Recently, though, I was trying to sort and unique several huge files and it seemed to be taking way too long. A little googling revealed that sorting the full range of Unicode characters takes a lot longer because sort has to decode one or more bytes per character (UTF-8) and apply collation rules before deciding where each line should be placed. There’s an easy way to speed up the sort command, given a few caveats.

I’m not sure how I hadn’t run into this already, but I love stumbling onto one of these little gems. The solution is pretty simple: force the C locale.
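
Something like this does it (the file names are placeholders):

LC_ALL=C sort -u huge1.txt huge2.txt > combined.sorted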

The C locale simply compares byte values, so non-ASCII characters may end up in the wrong place. If you don’t need a strict lexicographic sort, just a consistent one, this seems to be the way to go.

Apache, Fastcgi, PHP 7 on Debian Wheezy & Ubuntu 14.04

Intro: The Tyranny of Prefork

There are a lot of tutorials out there that walk through rote instructions for upgrading your Debian or Ubuntu system to PHP 7. While I’m sure most of them are fine, they assume you want either the prefork process model or the event/threaded model with FastCGI through the proxy and fcgi modules. Prefork is certainly battle-tested, but it uses far more memory than it needs to, so I’m going to document how to upgrade an existing Fastcgi install to PHP 7.

WordPress performance problem with many posts

If you have a ton of posts in your WordPress blog (we have over 35K on one site at work), it turns out that the Previous and Next links on each post may be running a tough query against your database.

I wanted to know why MySQL was using so much CPU and wrongly assumed it was due to a bad tuning effort (it usually is). I googled “SELECT p.ID FROM wp_posts AS p INNER JOIN wp_term_relationships AS tr ON p.ID = tr.object_id INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id”, which appeared in the output of MySQL’s SHOW FULL PROCESSLIST command. That led me to this StackExchange page, which showed an alternative, more efficient version of the WP function calls that produce those previous and next links.

In our case, we just didn’t need those links, and our theme let us turn them off from the admin. An instant and dramatic drop in MySQL’s CPU usage ensued.

Raspberry Pi SSH cipher speed

I was curious to see how quickly I could transfer files to my Pi using SSH rather than FTP. Plain FTP is obviously faster than almost any other method since it skips encryption, but I still wanted to see how fast I could push data over SSH.

Here’s the time it took to transfer a 50 MB file to my Pi using different SSH ciphers.
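
If you want to repeat the experiment, a loop along these lines works (the host name and test file are placeholders):

for c in 3des-cbc aes128-cbc aes128-ctr aes256-ctr arcfour arcfour128 arcfour256; do
    echo "== $c =="
    # scp -c picks the cipher; copying to /dev/null on the Pi keeps
    # SD card writes out of the measurement
    time scp -c "$c" test50mb.bin pi@raspberrypi:/dev/null
done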

I later re-tested the aes128-ctr cipher and it took about a second less than what I’d recorded initially. This boils down to:

  • Don’t use triple-DES ever, for both performance and security reasons
  • Most other ciphers give about the same performance, and are generally considered secure
  • arcfour is the fastest class of ciphers, but the crypto community places less trust in it. If you’re going to use it, avoid the base arcfour cipher and use the 128 or 256 variant instead, which discards some of the initial keystream as a precaution

APC is dead, long live APCu & ZendOpcache

So far, the site seems slightly snappier now that I’ve replaced the venerable (but old and unmaintained) APC with APCu for user-space object caching and ZendOpcache for opcode caching. Various people report 10-30% speed improvements with the new opcode cache/optimizer, which will be the default in PHP 5.5. APCu is also nice because it looks exactly the same as APC to existing apps and requires no configuration. One thing I recall reading, however, is that APCu doesn’t provide the upload progress feature APC had. There are alternatives for that, though.
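
For reference, wiring the pair up is just a matter of loading the two extensions in php.ini or a conf.d snippet (a sketch; module locations vary by distro, and older PHP may need the full path to the .so):

; user-space cache, a drop-in replacement for APC's apc_* functions
extension=apcu.so
; the opcode cache must be loaded as a Zend extension
zend_extension=opcache.so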

Faster webserver on Raspberry Pi

The Raspberry Pi has a 32-bit ARM CPU running at 700MHz by default, although you can usually overclock them somewhat and still enjoy stable behavior. I’m running Raspbian, a Debian-based distribution built for the Pi.

One thing that’s mildly annoying is that running WordPress on the Pi using Nginx and php-fpm has been dog slow, more so than the relatively low clock speed of the CPU alone would explain. I finally figured out why. I noticed entries like this in the Nginx virtual host error log: “an upstream response is buffered to a temporary file /var/lib/nginx/fastcgi/3/07/0000000073 while reading upstream”. The responses from PHP were being buffered to disk (in this case, an SD card, which is also the boot device). I googled and found the option I needed to set. Once that was set to 0 and Nginx was restarted, I immediately noticed an improvement in response time.
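
Given the “set to 0” description, the directive was presumably fastcgi_max_temp_file_size. A sketch of the relevant PHP location block (the socket path is a placeholder):

location ~ \.php$ {
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    # 0 disables spilling oversized upstream responses to temp files on disk
    fastcgi_max_temp_file_size 0;
}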

Before finding this, I’d tried setting various things in the W3 Total Cache plugin for WordPress that didn’t make much of a difference. Now some of those caching options produce a noticeable boost in performance on top of the gain from avoiding disk buffering. Needless to say, I’m much happier with the performance.

Why no alternate compression algorithms in rsync?

I was thinking the other day: gzip is all fine and good, but why doesn’t rsync support other compression methods? There are a few use cases where LZO (a very low-latency compression algorithm) would be a better choice.

One such case would be when operating with a relatively slow CPU, such as on an old system or an embedded device. LZO should still get the job done quickly.

Another case is where you’re transferring files over a fast network, but moving so much data that you still want some compression in effect. In this case, gzip -1, while fast, may still be slowing down the transfer too much to take full advantage of the available bandwidth. On anything but the slowest CPUs, LZO should give near-line speed performance.
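
rsync has no knob for this, but as a point of comparison you can approximate an LZO-compressed transfer with tar and lzop over ssh (the host and paths are placeholders, and you give up rsync’s delta transfers):

# push /data to backuphost, compressing with LZO on the wire
tar -cf - /data | lzop -c | ssh backuphost 'lzop -dc | tar -xf - -C /backup'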

gzip by default

In my last post on gzip, I discovered that gzip can compress data in a more sync-friendly way. This totally unrelated blog entry from nginx discusses a new gunzip filter that decompresses compressed data for clients that don’t support gzip.

I was thinking about this the other day. Why not store all your content compressed? Then you can quickly use sendfile() or some other fast method to deliver the data directly to clients that support gzip, and decompress it only for the clients that don’t. The benefits:

  • Decompressing is always faster than compressing (apples to apples).
  • You get to save storage space.
  • You could potentially reduce your IO by a large margin (over the network obviously, but also inside the box).
  • Since nearly every web browser in use today supports compression, you’d use it almost all the time. It’s the default case now, not the edge case.
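
This is essentially what the nginx gunzip filter enables. A sketch of serving pre-compressed files (assuming nginx is built with the gzip_static and gunzip modules):

location /static/ {
    # hand out foo.css.gz directly when it exists, even to clients
    # that didn't ask for gzip...
    gzip_static always;
    # ...and decompress on the fly for those clients
    gunzip on;
}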

There you have it. Compress to impress. Maybe we’ll see a return to the days of compressed filesystems, but with multiple entry points depending on whether you want the data compressed or uncompressed: mount /uncomp to read decompressed files, and /comp to get them in their native compressed form.