I’ve accumulated many notes (2000+) in Evernote over the years, and love that it can store binary attachments such as images or other media files. My favorite feature is the Evernote Web Clipper browser extension; it does a fantastic job at saving the parts of an article I want to save while keeping the styling intact.
Evernote has a free plan which I’ve enjoyed for a long time, but recently the financial status of the company has come into question, and they restricted syncing to only two devices. Also, the last thing I want is another Google Reader-style shutdown fiasco. I doubt that a shutdown would make my existing notes disappear, but it’s better to be prepared ahead of time. To that end, I’ve been looking for a viable way to migrate my notes to another platform.
Obviously it’s not in Evernote’s best interest to help people migrate away from them, but nevertheless the desktop app provides two flavors of export: one big Evernote XML file with base64-encoded attachments; or a directory of HTML files, one for each note, with a correspondingly named .resources directory holding the attachments and web assets for each note.
The technologically savvy searcher can find several projects designed to ETL data from Evernote’s servers and put it into another system. If you want to populate a Hugo site with Evernote data, you could use enwrite. If you’re already on WordPress, you could try out the Evernote Sync plugin. But with as many notes as I have, pulling the notes from the central server is slow, error-prone, and likely to hit some usage limit that Evernote enforces. The best approach for me is to use the desktop app’s export feature and then transform the export into something digestible by WordPress.
Following this track, I first thought of writing a converter for the giant Enex (Evernote XML) file to make it into one or more RSS / WXR files, then using the associated native importer for WordPress. But I wasn’t sure how to keep all the base64 attachments, or if I would be able to keep the metadata I wanted. After many fits and starts, I saw the WordPress plugin HTML Import 2. This is the solution that worked for me, but there’s no free lunch, and it took me at least five tries with it to get what I wanted. In fact it says plainly in the documentation that you won’t get it right the first time. :-)
Making it work consists of going through about 5 panes of plugin settings, transforming the Evernote HTML files a bit, and putting all the files in the right place.
From this point forward, when I use ABSROOT, I mean the absolute path to the document root where WordPress is running.
mkdir -p ABSROOT/ehtml/evernote
Then copy all the .html files and .resources directories to ehtml/. Next we want to separate the resources from the HTML files.
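The copy step can be sketched as a small shell function. This is only an illustration: the function name stage_export and the export path passed to it are hypothetical, and it assumes the export directory contains at least one .html file and one .resources directory.

```shell
# Copy an Evernote HTML export into the staging directory.
# $1: directory the desktop app exported to (hypothetical path, use your own)
# $2: staging directory, ABSROOT/ehtml in this article
stage_export() {
    # Make sure the staging area and the evernote subdirectory exist.
    mkdir -p "$2/evernote"
    # Copy the note pages and their per-note resource directories.
    cp -R "$1"/*.html "$1"/*.resources "$2"/
}
```

It would be called as stage_export ~/evernote-export ABSROOT/ehtml, with both paths replaced by your own.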
Note that in this article, I’m not going to be importing the images and attachments into the WordPress media library. You can do this if you want though: the plugin author recommends using the Add from Server plugin to accomplish this. I’ve used this before as well and if you do it right it works well.
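# run this from inside the evernote directory so the links land there
cd ABSROOT/ehtml/evernote
for i in ../*.resources/*; do ln "$i"; done
# now you have hard links to all your resources here
cd ..
mv evernote ../wp-content/uploads/
# get rid of the resources directories (only the top-level ones, never "." itself)
find . -mindepth 1 -maxdepth 1 -type d -name '*.resources' -print0 | xargs -0 rm -r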
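One thing to watch for: every attachment is linked into a single evernote directory, so two notes whose attachments share a file name will collide, and ln will refuse to create the second link. Below is a minimal sketch of a check to run before the linking step. The function name find_duplicate_resources is mine, and it assumes file names contain no newlines.

```shell
# List attachment file names that occur in more than one .resources directory.
# $1 is the directory holding the exported *.resources directories
# (defaults to the current directory). Prints nothing when there are no collisions.
find_duplicate_resources() {
    find "${1:-.}" -mindepth 2 -maxdepth 2 -path '*.resources/*' -type f \
        | sed 's|.*/||' \
        | sort \
        | uniq -d
}
```

Run from ABSROOT/ehtml, any name it prints belongs to attachments that need renaming (and their links in the HTML updating) before they can share one directory.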
The plugin author stated in the support forum or FAQ that at some point, she’d like to modify the custom fields pane to use XPath selectors for greater flexibility, but until that time comes, the plugin can’t read metadata from the meta tags in the head of each HTML file. That was a deal-breaker for me, so I decided the easiest thing was to do a bit of Perl scripting to modify each HTML file.
# now we can run a small script to fix up the html
for i in *; do perl -i -pe '
s|meta name="([^"]+)" content="([^"]+)"/>|meta name="$1" content="$2"/><$1>$2</$1>|g;
' "$i"; done
This Perl substitution keeps the existing meta tags but adds new tags named after the name attribute of each meta tag. So for example,
<meta name="keywords" content="tech"/>
becomes
<meta name="keywords" content="tech"/><keywords>tech</keywords>
This is potentially dangerous: the name attribute may contain characters that are not legal in an element name, and the content attribute may contain text that needs escaping once it becomes element content. For the few meta tags that Evernote creates, it should be fine.
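If you would rather check than trust, here is a rough sketch that lists any meta names that would not make safe element names. The function name list_unsafe_meta_names is hypothetical, and the pattern is an ASCII-only approximation of the XML naming rule (a letter or underscore, followed by letters, digits, dots, underscores or hyphens), not the full specification.

```shell
# Print meta "name" values that would not make safe element names.
# Arguments: one or more HTML files to scan.
list_unsafe_meta_names() {
    # Pull out every name="..." on a meta tag, strip the wrapper,
    # de-duplicate, then keep only the values that fail the pattern.
    grep -oh 'meta name="[^"]*"' "$@" \
        | sed 's/^meta name="//; s/"$//' \
        | sort -u \
        | grep -Ev '^[A-Za-z_][A-Za-z0-9._-]*$'
}
```

Running list_unsafe_meta_names *.html in the directory of exported pages prints nothing if every name is safe to turn into a tag.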
Here are some screenshots of my settings in the tab order that they appear.
Warning: I almost gave up on this plugin because every time I tried to save the settings page, it wouldn’t save. After a lot of searching, I found no one else was having this issue, so I tried changing and disabling various settings and plugins. There is an incompatibility with the Memcached Object Cache plugin. You need to delete or move the /wp-content/object-cache.php file out of the way for the settings page to save.
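Moving the file is safer than deleting it, since it can be put back once the import is done. A minimal sketch follows; the function names and the .disabled suffix are my own choices, not anything the plugins define.

```shell
# Move the object cache drop-in aside so WordPress stops loading it.
# $1 is the WordPress document root (ABSROOT in this article).
disable_object_cache() {
    f="$1/wp-content/object-cache.php"
    if [ -f "$f" ]; then
        mv "$f" "$f.disabled"
    fi
}

# Put the drop-in back after the import has finished.
restore_object_cache() {
    f="$1/wp-content/object-cache.php"
    if [ -f "$f.disabled" ]; then
        mv "$f.disabled" "$f"
    fi
}
```

Call disable_object_cache ABSROOT before opening the settings page, and restore_object_cache ABSROOT afterwards.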
When I run the importer, I get a browser message after about 30 seconds saying the connection was reset, but the import keeps running on the web server in the background. You may need to increase the memory_limit setting in PHP for the import to complete. For the huge number of files I needed to process, the import took a little less than three minutes. This is much less time than other importers that pull from an Evernote server.
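PHP's memory ceiling is controlled by the memory_limit directive in php.ini. Here is a small sketch for checking what is currently set; the function name show_php_limits is hypothetical. It also prints max_execution_time, whose default is 30 seconds, though whether that has anything to do with the connection reset is only an assumption on my part.

```shell
# Show the memory and execution-time limits set in a php.ini file.
# $1 is the path to php.ini. Its location varies by system, and the web
# server may read a different file than the command-line PHP does.
show_php_limits() {
    # Lines starting with ';' are comments in php.ini and are skipped.
    grep -E '^[[:space:]]*(memory_limit|max_execution_time)[[:space:]]*=' "$1"
}
```

Running php --ini shows which configuration files the command-line PHP loads, which is a reasonable starting point for finding the right file.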
So far, what I’ve found is that the HTML Import 2 plugin is the best way to get a pile of static pages into WordPress. My images and file links are all linking correctly from the posts, and while the images aren’t in the media library, I really don’t need them there. And as stated above, there are options to migrate or copy those images later with another tool.
I look forward to further updates to this great piece of software (like the XPath selectors for meta tags), but it’s good enough and has been useful enough to me that I donated a few dollars via PayPal.
I hope you found my tips for importing from Evernote helpful.