Poor Man’s Continuous Integration

I created the following script, which saves me a few seconds of switching to a terminal and typing grunt each time one of my source files changes.
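A minimal sketch of such a watch loop; the src/ directory name and the poll interval are assumptions, so adjust them for your project:

```shell
#!/bin/sh
# Poll the source tree and re-run grunt whenever anything changes.
# Uses a checksum of the recursive directory listing as a cheap
# change detector; the first pass always triggers a build.
OLD=""
while true; do
  NEW=$(ls -lR src/ 2>/dev/null | md5sum)
  if [ "$NEW" != "$OLD" ]; then
    OLD=$NEW
    grunt
  fi
  sleep 2
done
```

Tools like inotifywait or fswatch would avoid the polling, but the dumb loop is portable and good enough for glancing at a terminal.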

I can just leave this running in a terminal window and glance at the output to make sure there were no errors.

Reduce PNG sizes from Mac Finder

When I save a screenshot on my systems, nothing automatically crushes/minifies the PNG. Here’s how I’ve added a Finder context-menu item on a Mac to crush images. This can help save storage space or make web pages load faster.

  1. Ensure you have pngquant installed. Other tools may work for you, but this post is pngquant-specific. I got it via Homebrew.
  2. Open Automator.
  3. Choose Service.
  4. In the Service Receives drop-down, select image files.
  5. In the left pane, select Utilities under Library.
  6. From the middle pane, drag Run Shell Script to the right pane.
  7. At the top of the box that was just created, set the shell to /bin/bash and set “Pass input” to “as arguments”.
  8. Paste in the following code for the shell script:
  9. Save the service under the name you’d like to see when right-clicking to get the context menu. I named mine png_crush.
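The shell-script body for step 8 might look something like this; the pngquant path is an assumption (Homebrew installs to /usr/local/bin on Intel Macs, /opt/homebrew/bin on Apple Silicon):

```shell
# Automator "Run Shell Script" body; selected files arrive as arguments.
# Overwrites each selected PNG in place with a crushed version.
for f in "$@"; do
  /usr/local/bin/pngquant --ext .png --force -- "$f"
done
```

The --ext .png --force combination tells pngquant to write the output over the original file instead of creating a -fs8.png copy.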

Now you can right-click a PNG in Finder, go to Services, and click the png_crush entry. This works on multiple selected files as well.

Filtering: In or Out, Exclusive or Inclusive and Regular Expressions

My current preferred open source, self-hosted & simple RSS reader is Selfoss. I compared several of these types of RSS readers in a prior post and came to a different conclusion, but I’ve been using Selfoss for at least a year now. My switch may have coincided with improvements on the mobile reading side, which used to be… not good.

I have at least one aggregated feed in my lineup which pulls in feeds from multiple sources based on a common topic. But a lot of the posts are about things I care not at all about. I’ll give an example. I really like posts about programming and exploring concepts in certain computer languages, but I really don’t want to know anything about commercial products related to programming or professional conferences. How can I keep that stuff out of my reader? By using Selfoss’s filter feature, of course.

Defend against fake Google bots

I can think of some reasons why folks might use the Googlebot user agent on their non-Google bots, but I can’t think of any good, upstanding reasons to do it.

Here’s how one might find some fine folks who would do such a thing.

As of right now (May 2018), all valid Googlebot source IPs start with the same prefix, 66.249. This may change in the future, so if you’re having problems getting crawled by Google, make sure you’re not blocking a new range they may have started using. OK, here’s the nitty-gritty.
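A sketch of the kind of pipeline I mean; the log path and rotation scheme are assumptions, so adjust for your setup:

```shell
# Find hits claiming a Googlebot user agent whose source IP falls
# outside Google's 66.249 prefix, ranked by hit count.
zcat /var/log/apache2/access.log.*.gz \
  | grep -i 'Googlebot' \
  | awk '{print $1}' \
  | grep -v '^66\.249\.' \
  | sort | uniq -c | sort -rn | head
```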

Interestingly, a few of the IPs from my logs indicated that Facebook in Ireland is using the Google user agent. Naughty! Anyway, if you want to verify that you’re not blocking a valid Google address, do an IP lookup on some of these groups of addresses. And of course you can modify the above to scan the current log files instead of the archived gzipped files.

Here’s how I’m blocking the baddies (this isn’t original, I searched and found a version of this). This goes in your Apache config or .htaccess file:
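One version of such a block, assuming Apache 2.4 with mod_setenvif and mod_authz_core:

```apache
# Flag requests claiming to be Googlebot, then clear the flag for
# IPs in Google's real 66.249 range; deny whatever is still flagged.
SetEnvIfNoCase User-Agent "Googlebot" fake_googlebot
SetEnvIf Remote_Addr "^66\.249\." !fake_googlebot
<RequireAll>
    Require all granted
    Require not env fake_googlebot
</RequireAll>
```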

Quickly show which directories are serving host names on a multiple Vhost Apache system

Does this look familiar? Maybe you need more fiber in your diet. Or maybe you need THIS:
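Something like this one-liner; the Debian-style config path is an assumption, so point it at wherever your vhost files live:

```shell
# Dump ServerName / DocumentRoot pairs from every enabled vhost file.
grep -RhiE '^[[:space:]]*(ServerName|DocumentRoot)' /etc/apache2/sites-enabled/ \
  | awk '{print $1, $2}'
```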

You’re welcome.

Finding the most persistent, pernicious baddies by processing log files

Logwatch is a great utility for emailing me a summary of system logs over the last 24 hours. One of the things it shows is unsuccessful login attempts and their source IP addresses. But the default unsorted output is hard to analyze and act on, since a single IP may appear many times in the output at random locations.

It looks kind of like this (I’ve obscured the full IP to protect the guilty).

So, here we go. Create a shell script or alias with the following:
pbpaste | ggrep -Po '\b((?:\d{1,3}\.){3}\d{1,3})\s' | distribution

Once you’ve got the sections from the logwatch email copied to the clipboard, run this to see which source IPs are the top offenders. Since I’m using pbpaste and ggrep, it should be clear I’m on a Mac. This works on Linux using xsel --clipboard --output and grep, respectively.

And if you haven’t checked out distribution, you should. Super useful.

MemcacheD is your friend

MemcacheD Is Your Friend is an object caching plugin for WordPress that offers faster access to cached objects, especially if your database happens to reside on a different host. The problem this time around is that it scopes some of its members as private, and as a result, is incompatible with some plugins. I’ve run into this before when trying to import data into WordPress.

This time, I was trying to get Elasticpress working on a site where the Memcached object cache was already set up. Thankfully, someone on a Github support thread helpfully pointed to a fork they made of the long-neglected Memcached Is Your Friend plugin. I haven’t had a chance to try it yet, but I wanted at least a placeholder link here in case it comes up again.

MacBook SD card reader always in read-only mode

Most of the time when I insert an SD card into the slot on my mid-2014 MacBook Pro, it mounts in read-only mode, and I don’t notice until I try to delete a file. When that happens, the option to “Move to Trash” is missing. After checking that the tiny sliding switch on the side is, in fact, set to the unlocked position, I consulted the Googs.

I found a page suggesting that when inserting the card, pressing on the side of the card closest to me rather than the display helped. But even after several tries I couldn’t get it to work. One part of that possible solution piqued my interest, though: the phrase “bad tolerances.”

On a whim, I pushed the card’s lock slider only one or two millimeters down towards the locked position and tried again. It worked! Wooo!

Monitor load in the terminal

Just a small one: when I have a terminal open into a remote machine, I’d like to notice right away when the load rises. Of course, one could just use echo -e "\a" if you have the visual bell enabled, or if you want to annoy the people around you and have the audio bell enabled.
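A sketch of such a loop, assuming a Linux remote host (it reads /proc/loadavg); the threshold and interval are assumptions you’d tune yourself:

```shell
# Print the 1-minute load average every few seconds and ring the
# terminal bell when it crosses a threshold.
THRESHOLD=2.0
while true; do
  LOAD=$(cut -d ' ' -f 1 /proc/loadavg)
  if awk -v l="$LOAD" -v t="$THRESHOLD" 'BEGIN { exit !(l > t) }'; then
    printf '\a'   # bell: visual or audible, per your terminal settings
  fi
  echo "load: $LOAD"
  sleep 5
done
```

The awk invocation is just a portable way to compare floating-point numbers in shell.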

Prolong the life of the SD card in your Raspberry Pi

The web is littered with stories of people who love their Raspberry Pis but are disappointed to learn that the Pi often eats the SD card. I’ve recovered a card once, but otherwise had a few that have been destroyed and were not recoverable. I’ll lay out how I use This One Weird Trick(tm), ahem, to try and prolong the life of the SD card.

First I should point out that my Pi storage layout is not typical. I basically followed this guide to boot from SD card, but run the root filesystem on a flash drive. While the stated purpose of the guide is to help reduce activity on the SD card (and improve storage performance somewhat), I come at the SD card corruption issue from a different perspective.

In my view, the corruption is most likely caused by a timing bug which could be rather low-level in the design or implementation of the hardware itself. Writing to the card less often probably reduces the chances of corruption, but my personal feeling is that after a Pi has been powered on for a certain amount of time, you can’t really predict if the bug is going to manifest. I don’t believe that most instances of SD card corruption happen in the first hours or days of a Pi booting up, so my goal was to only write to it within that initial period of time, if possible.

After following the guide linked above, the SD card is now only hosting the /boot partition. After init has started on / (the external storage), we really don’t need /boot any longer. In the middle of my /etc/rc.local file, I’ve added
mount -o ro,remount /boot

In typical usage of a running system, /boot doesn’t really need to be mounted read-write. Of course, if you forget it’s mounted read-only, then things like apt-get upgrade or rpi-update may well fail. Now when I want to run those commands, I first reboot the Pi and remount the /boot partition with
sudo mount -o remount,rw /boot

Once the updating is done, I reboot again and leave /boot read-only.

kworker using CPU on an otherwise idle system

I have an old thin client that I upgraded to a home server by adding some additional RAM and storage. I noticed after a recent kernel upgrade that the system seemed sluggish at times, despite doing nothing in particular at the time. top showed that a kworker process was using CPU, not all of it, but perhaps 25 to 50% of the total CPU.

I did a lot of searching to try and track down the offender. I used tools such as perf and iotop, read about various tunables under /proc related to power management. Finally, I ran Intel’s powertop command. It showed that “Audio codec alsa…” was hammering on some event loop.

I looked at the loaded kernel modules, and on a whim, I did sudo rmmod snd_hda_intel and that fixed the issue for me.

Others may find that a kworker is running in a tight loop for some other reason. It could be some other misbehaving driver or an I/O problem.

Finding how much time Apache requests take

When a request is logged in Apache’s common or combined format, it doesn’t actually show you how much time each request took to complete. To make reading logs a bit more confusing, each request is logged only once it’s completed. So a long-running request may have an earlier start time but appear later in the log than quicker requests.

To help look at some timing info without going deep enough to need a debugger, I decided that step one was to use a custom log format that saved the total request time. After adding usec:%D to the end of my Apache custom log format, we can now see how long various requests are taking to complete.
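For reference, a combined-style LogFormat with the timing field appended might look like this (the format name is an assumption):

```apache
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" usec:%D" combined_usec
```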

tail -q -n 1000 *access.log | mawk -F '(:|GET |POST | HTTP/1.1|")' '{print $NF" "$6}' | sort -nr | head -100 > /tmp/heavy2

I’m using the “%D” format for compatibility with older Apache releases; it reports the response time in microseconds. I would prefer milliseconds, but when I tried “%{ms}T” on a server running 2.4.7, it didn’t work; too old. The raw numbers are a bit hard to read, so we can add a little visual aid with commas as thousands separators.

cat /tmp/heavy2 | xargs -L 1 printf "%'d %s\n" | less

Note that because we are measuring the total request time, some of the numbers may be high due to remote network latency or a slow client. I recommend correlating several samples before blaming some piece of local application code.

Hope this helps finding your long-running requests!

Default route via VPN while keeping LAN & services available

OpenVPN is working great and all, but I was having trouble getting my other LAN hosts to connect to the OpenVPN client system (a Raspberry Pi) while also keeping the services I normally run on it available from the internet. On the remote server, I was using redirect-gateway def1, which works but makes some assumptions about how you intend to use it.

After a lot of frustration and perusal of almost-but-not-quite posts on OpenVPN troubleshooting, I came across an article which didn’t mention OpenVPN but instead discussed how to set default routes for multiple interfaces.

Here’s what I took away. Extra lines in /etc/openvpn/client.conf:
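Roughly, the client.conf additions stop OpenVPN from installing routes itself and hand that job to a script (the script path is an assumption):

```
# Don't let OpenVPN install routes itself; run our script instead.
script-security 2
route-noexec
route-up /etc/openvpn/multiple_gateways.sh
```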

and in multiple_gateways.sh:
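And a sketch of what such a script can do with policy routing; the interface names, addresses, and table number are all assumptions for illustration:

```shell
#!/bin/sh
# Policy routing: traffic sourced from this host's LAN address (i.e.
# replies to LAN clients and inbound internet connections) leaves via
# the LAN gateway, while everything else defaults to the VPN tunnel.
LAN_IF=eth0
LAN_GW=192.168.1.1
LAN_IP=192.168.1.50

# Dedicated routing table for traffic sourced from the LAN address
ip route add default via "$LAN_GW" dev "$LAN_IF" table 100
ip rule add from "$LAN_IP" table 100

# Main table default route goes over the VPN
ip route replace default dev tun0
```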

One caveat: I haven’t done a ton of testing, and after rebooting my Pi it didn’t come up cleanly, so a down.sh script may be needed to tear down the extra config when OpenVPN disconnects. That said, I have services available from the internet, connections from the LAN to the Pi working, and the default route for outgoing connections still going over the VPN.