{"id":752,"date":"2016-08-17T18:52:36","date_gmt":"2016-08-17T22:52:36","guid":{"rendered":"https:\/\/www.devolve.net\/blog\/?p=752"},"modified":"2018-07-13T11:11:06","modified_gmt":"2018-07-13T15:11:06","slug":"moving-evernote-notes-wordpress","status":"publish","type":"post","link":"https:\/\/www.devolve.local\/moving-evernote-notes-wordpress\/","title":{"rendered":"Moving Evernote notes into WordPress"},"content":{"rendered":"

proprietary insecurity<\/h3>\n

I’ve accumulated many notes (2000+) in Evernote over the years, and I love that it can store binary attachments such as images and other media files. My favorite feature is the Evernote Web Clipper browser extension; it does a fantastic job of saving just the parts of an article I want while keeping the styling intact.<\/p>\n

Evernote has a free plan which I’ve enjoyed for a long time, but recently the financial status of the company has come into question, and they restricted syncing to only two devices. The last thing I want is another Google Reader-style shutdown fiasco. I doubt that a shutdown would make my existing notes disappear, but it’s better to be prepared ahead of time. To that end, I’ve been looking for a viable way to migrate my notes to another platform.<\/p>\n

Obviously it’s not in Evernote’s best interest to help people migrate away from them, but nevertheless the desktop app provides two flavors of export: one big Evernote XML file with base64-encoded attachments; or a directory of HTML files, one for each note, with a correspondingly named .resources directory holding the attachments and web assets for each note.<\/p>\n

The technologically savvy searcher can find several projects designed to ETL data from Evernote’s servers and put it into another system. If you wanted to populate a Hugo site with Evernote data, you could use enwrite<\/a>. If you’re already on WordPress you could try out the Evernote Sync plugin. But with as many notes as I have, pulling the notes from the central server is slow, error-prone, and likely to hit some usage limit that Evernote enforces. The best approach for me is to use the desktop app’s export feature and then transform the result into something digestible by WordPress.<\/p>\n

my solution<\/h3>\n

Following this track, I first thought of writing a converter for the giant Enex (Evernote XML) file to make it into one or more RSS \/ WXR files, then using the associated native importer for WordPress. But I wasn’t sure how to keep all the base64 attachments, or if I would be able to keep the metadata I wanted. After many fits and starts, I saw the WordPress plugin HTML Import 2<\/a>. This is the solution that worked for me, but there’s no free lunch, and it took me at least five tries with it to get what I wanted. In fact it says plainly in the documentation that you won’t get it right the first time. :-)<\/p>\n

Making it work consists of going through about 5 panes of plugin settings, transforming the Evernote HTML files a bit, and putting all the files in the right place.<\/p>\n

the details<\/h3>\n

From this point forward, when I use ABSROOT<\/code>, I mean the absolute path to the document root where WordPress is running.<\/p>\n

mkdir -p ABSROOT\/ehtml\/evernote<\/span> <\/p>\n

Then copy all the .html files and .resources directories to ehtml\/<\/code>. Next we want to separate the resources from the HTML files.<\/p>\n
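
The copy step might look something like the sketch below. The helper name `copy_export` and its two arguments are my own placeholders: the first is wherever the desktop app wrote the export, the second stands in for `ABSROOT/ehtml`.

```shell
# Hypothetical helper (not part of any tool): copy an Evernote HTML export
# into the staging directory.
#   $1 = directory the desktop app exported to
#   $2 = staging directory (ABSROOT/ehtml in this article)
copy_export() {
    src="$1"
    dest="$2"
    # Same effect as the mkdir -p above: make sure dest/evernote exists.
    mkdir -p "$dest/evernote"
    # Copy each note's HTML file and its .resources directory side by side.
    # Assumes the export contains at least one of each.
    cp -R "$src"/*.html "$src"/*.resources "$dest"/
}
```

For example, `copy_export ~/evernote-export ABSROOT/ehtml`, where `~/evernote-export` is just an assumed location.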

Note that in this article, I’m not going to be importing the images and attachments into the WordPress media library. You can do this if you want though: the plugin author recommends using the Add from Server<\/a> plugin to accomplish this. I’ve used this before as well and if you do it right it works well.<\/p>\n

cd ABSROOT\/ehtml\/evernote\r\n# hard links only work within a single filesystem\r\nfor i in ..\/*.resources\/*; do ln \"$i\"; done\r\n# now you have hard links to all your resources here\r\ncd ..\r\nmv evernote ..\/wp-content\/uploads\/\r\n# get rid of the resources directories (and only those, not . itself)\r\nfind . -mindepth 1 -maxdepth 1 -type d -name '*.resources' -print0 | xargs -0 rm -r<\/pre>\n
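
One thing to watch for when flattening: every attachment ends up in a single directory, so two notes whose attachments share a file name will collide, and `ln` will refuse to create the second link. Here is a minimal sketch for spotting such clashes before running the loop. The function name `list_duplicate_basenames` is mine, and it assumes file names without embedded newlines.

```shell
# Hypothetical helper: print attachment file names that occur in more than
# one .resources directory under the given export directory (default: ..).
list_duplicate_basenames() {
    dir="${1:-..}"
    for f in "$dir"/*.resources/*; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        basename "$f"
    done | sort | uniq -d         # uniq -d keeps only names seen more than once
}

# Usage, from ABSROOT/ehtml/evernote:
#   list_duplicate_basenames ..
```

If it prints anything, those attachments need renaming (and their references in the HTML updating) before the resources can safely share one directory.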

The plugin author stated in the support forum or FAQ that at some point, she’d like to modify the custom fields pane to use XPath selectors for greater flexibility, but until that time comes, the plugin can’t read metadata from the meta<\/code> tags in the head of each HTML file. That was a deal-breaker for me, so I decided the easiest thing was to do a bit of Perl scripting to modify each HTML file.<\/p>\n

# now we can run a small script to fix up the html\r\nfor i in *; do perl -i -pe '\r\ns|src=\".*?\\.resources|src=\"\/wp-content\/uploads\/evernote|g;\r\ns|href=\".*?\\.resources|href=\"\/wp-content\/uploads\/evernote|g;\r\ns|meta name=\"([^\"]+)\" content=\"([^\"]+)\"\/>|meta name=\"$1\" content=\"$2\"\/><$1>$2<\/$1>|g;\r\n' \"$i\"; done<\/pre>\n

The last Perl substitution keeps the existing meta tags but adds new tags named after the name<\/code> attribute of meta tags. So for example,<\/p>\n

<meta name=\"keywords\" content=\"tech\"\/><\/pre>\n

becomes<\/p>\n

<meta name=\"keywords\" content=\"tech\"\/><keywords>tech<\/keywords><\/pre>\n

This is potentially dangerous, as the name attribute may contain characters that are not legal in an element name, and the content attribute may contain characters that would need escaping inside an element. For the few meta tags that Evernote creates, it should be fine.<\/p>\n
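
If you would rather check than trust, a rough sketch like the following lists every distinct meta name in the export and flags any that would make a questionable element name. Both function names are my own, and the "safe" pattern is deliberately stricter than what XML actually permits.

```shell
# Hypothetical helper: print each distinct meta name found in the given files.
list_meta_names() {
    grep -oh 'meta name="[^"]*"' "$@" | sed 's/^meta name="//; s/"$//' | sort -u
}

# Hypothetical helper: read names on stdin and print the ones that do NOT
# match a conservative pattern (letter or underscore first, then letters,
# digits, hyphens, underscores). Narrower than the real XML name rules.
flag_unsafe_names() {
    grep -Ev '^[A-Za-z_][A-Za-z0-9_-]*$'
}

# Usage, from the directory holding the exported .html files:
#   list_meta_names *.html | flag_unsafe_names
```

No output from the pipeline means every name passed the conservative check.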

the settings<\/h3>\n

Here are some screenshots of my settings, in the order the tabs appear.
\n