Thomas Nybergh’s pages


Batch convert Irssi logs or other text files to UTF-8 using recode

Sunday, February 8th 2009, 14:27 UTC. Published in: English, internet, link tips, linux/unix, notes, politics, software, technology

When I recently rented a VPS running a fresh install of Debian, I thought it was about time to stick with the now-default Unicode locale, UTF-8. Making this switch sensibly includes converting frequently used text files, such as chat logs, from the older, more widely compatible but limited ISO 8859-15 charset.

(By the way: Linode, my VPS host, seems to be awesome.)

My chat client Irssi, combined with OpenSSH, GNU Screen and Bitlbee, provides me with a hugely powerful social infrastructure in the form of continuous conversations that can be reached with any SSH client. Add logging and basic Unix tools to the mix and you have a silly fast and simple way of finding stuff you’ve discussed. In other words: my IRC/Live/Jabber logs are important works of reference and must be kept in the same encoding as the system locale.

I failed to find any directly suitable or functional shell one-liners for this operation, until Thomas handed me something that worked for me.

The conversion command later on this page converts all files in Irssi’s default log location, ~/irclogs, and its subdirectories from ISO-8859-15 to UTF-8. recode performs the conversion in place, overwriting each file where it is. Don’t run with scissors: do yourself a favor and make a backup copy of your precious logs first. The most obvious tool for that is perhaps:

cp -r ~/irclogs ~/backup_irclogs

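A slightly more careful take on the backup step could be sketched like this. The function name backup_logs and its two optional arguments are made up for illustration; the only assumption about your setup is that the logs live in ~/irclogs as described above.

```shell
#!/bin/sh
# backup_logs [SRC] [DEST]: copy SRC to DEST and confirm the copy matches.
# Hypothetical helper; defaults follow the paths used in this post.
backup_logs() {
    src="${1:-$HOME/irclogs}"
    dest="${2:-$HOME/backup_irclogs}"

    # If DEST already exists, cp would nest SRC inside it, so stop instead.
    if [ -e "$dest" ]; then
        echo "Refusing to overwrite existing $dest" >&2
        return 1
    fi

    # -R copies recursively, -p keeps timestamps and permissions.
    # diff -r exits non-zero if any file differs between the two trees.
    cp -Rp "$src" "$dest" && diff -r "$src" "$dest" >/dev/null
}
```

Running backup_logs with no arguments copies ~/irclogs to ~/backup_irclogs and returns success only if the two trees are identical.
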
This is Thomas’ conversion command:

find ~/irclogs -type f | while IFS= read -r i; do echo "Converting $i"; recode ISO-8859-15..UTF-8 "$i"; done

Stig later informed me about find having exec capabilities, but since I’m lazy and all that, you’ll have to optimize the above version yourself.
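
For anyone less lazy, one way that optimization might look is sketched below. It is not Stig's actual suggestion, just a guess at it: the function name convert_logs and its optional directory argument are invented for illustration, and the sketch assumes recode is installed and that every file under the directory really is ISO-8859-15. Running it on files that are already UTF-8 would double-encode them.

```shell
#!/bin/sh
# convert_logs [DIR]: recode every regular file under DIR in place.
# Hypothetical helper; DIR defaults to Irssi's default log location.
convert_logs() {
    dir="${1:-$HOME/irclogs}"

    # -type f skips directories, which recode cannot convert.
    # "{} +" hands many file names to each recode invocation instead of
    # starting one process per file.
    find "$dir" -type f -exec recode ISO-8859-15..UTF-8 {} +
}
```

Calling convert_logs with no argument targets ~/irclogs; passing a path, as in convert_logs ~/backup_irclogs, lets you try it on a copy first. Letting find start recode directly also avoids the shell loop's trouble with unusual file names.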

©2010 Thomas Nybergh