{"id":449,"date":"2014-01-08T12:48:44","date_gmt":"2014-01-08T17:48:44","guid":{"rendered":"http:\/\/devolve.net\/blog\/?p=449"},"modified":"2015-05-21T10:09:21","modified_gmt":"2015-05-21T14:09:21","slug":"awk-blows-me-away","status":"publish","type":"post","link":"https:\/\/www.devolve.local\/awk-blows-me-away\/","title":{"rendered":"AWK blows me away"},"content":{"rendered":"

How did I not know this about awk<\/code>!? Don’t get me wrong, I’m no awk<\/code> expert; I mostly use its simplest, most obvious features. But I almost always use the -F<\/code> option to set the field separator. Until today, I thought it accepted only a single character or a string literal, when in fact it also takes a regular expression. Go ahead, laugh, I’ll wait.<\/p>\n

From StackOverflow<\/a>:<\/p>\n

$ echo '\"School\",\"College\",\"City\"'|awk -F'\",\"|^\"|\"$' '{for(i=1;i<=NF;i++) {if($i)print $i}}'\r\nSchool\r\nCollege\r\nCity<\/pre>\n
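The if($i) guard in that one-liner matters because a separator that matches at the very start or end of a line leaves an empty field on that side. A minimal sketch of the same effect, using a pipe-delimited sample line and the helper names count_fields and nonempty_fields, all of which are made up for illustration:

```shell
# Hypothetical helpers: split on a literal pipe, written as the regex [|].
count_fields() {
  awk -F'[|]' '{ print NF }'
}

nonempty_fields() {
  # Print the index and text of every field that is not empty.
  awk -F'[|]' '{ for (i = 1; i <= NF; i++) if (length($i)) print i, $i }'
}

# The leading and trailing pipes each produce an empty field.
echo '|School|College|City|' | count_fields      # prints 5

# Fields 1 and 5 are empty, so only 2, 3 and 4 are printed.
echo '|School|College|City|' | nonempty_fields
# prints:
# 2 School
# 3 College
# 4 City
```

The sketch tests length($i) rather than $i on its own, because a field holding just 0 counts as false in awk and would be skipped by the shorter form.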

I just used this to help parse the output of drush pml<\/code>. Since it separates its columns with runs of two or more spaces, I used<\/p>\n

awk -F'  +' '{print $2\"\\t\"$4\"\\t\"$5}'<\/pre>\n

(that’s two spaces and a plus sign for the field separator). I initially tried using <\/p>\n

' {2,}'<\/pre>\n

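but apparently that’s some kind of super advanced regex that my awk doesn’t understand. The likely culprit: interval expressions are off by default in gawk before version 4 (they need --re-interval or --posix) and are missing from older mawk releases.<\/p>\n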
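The two-space separator described above can be tried on made-up input. This is only a sketch: the sample rows are hypothetical and do not reproduce the real column layout of drush pml, and split_on_runs is a name invented here.

```shell
# Hypothetical helper: treat a run of two or more spaces as the field separator.
# The regex is one space followed by one or more spaces.
split_on_runs() {
  awk -F'  +' -v OFS='|' '{ print $1, $2, $3 }'
}

# Made-up rows; single spaces inside a column are left alone.
split_on_runs <<'EOF'
Views  Views UI (views_ui)  Enabled
Chaos tools  Page manager (page_manager)  Not installed
EOF
# prints:
# Views|Views UI (views_ui)|Enabled
# Chaos tools|Page manager (page_manager)|Not installed
```

Setting OFS to a pipe just makes the field boundaries visible in the output. Note that if a line starts with spaces, the first field comes out empty, since the separator matches at the start of the line.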

Another cool thing about the StackOverflow page is that one person shows how you can parse a CSV with commas embedded in quoted fields if you have gawk<\/code> 4 installed:<\/p>\n

% cat infile \r\n\"School\",College: \"My College\",\"City, I\"\r\n\r\n% awk '{    \r\n  for (i = 0; ++i <= NF;)\r\n    print i, substr($i, 1, 1) == \"\\042\" ?\r\n      substr($i, 2, length($i) - 2) : $i\r\n  }' FPAT='([^,]+)|(\\\"[^\\\"]+\\\")' infile  \r\n1 School\r\n2 College: \"My College\"\r\n3 City, I<\/pre>\n","protected":false},"excerpt":{"rendered":"

How did I not know this about awk!? Don’t get me wrong, I’m no awk expert; I’m always using some of the most simple and obvious features it has. But I almost always use the -F option to specify the field separator. Until today, I thought you could only give it either a single character […]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[6],"tags":[34,41],"_links":{"self":[{"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/posts\/449"}],"collection":[{"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/comments?post=449"}],"version-history":[{"count":3,"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/posts\/449\/revisions"}],"predecessor-version":[{"id":452,"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/posts\/449\/revisions\/452"}],"wp:attachment":[{"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/media?parent=449"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/categories?post=449"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devolve.local\/wp-json\/wp\/v2\/tags?post=449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}