Thomas Nybergh’s pages


Batch convert Irssi logs or other text files to UTF-8 using recode

Sunday, February 8th 2009, 14:27 UTC. Published in English, internet, link tips, linux/unix, notes, politics, software, technology

When I recently rented a VPS running a fresh install of Debian, I thought it was about time to switch to the now default Unicode locale, UTF-8. Making this switch in a sensible fashion includes converting frequently used text files, such as chat logs, from the older, more widely compatible but limited ISO 8859-15 charset.

(By the way: Linode, my VPS host, seems to be awesome.)

My chat client Irssi combined with OpenSSH, GNU Screen and Bitlbee provides me with hugely powerful social infrastructure in the form of continuous conversations that can be reached with any SSH client. Add logging and basic Unix tools to the mix and you have a silly fast and simple way of finding stuff you’ve discussed. In other words: my IRC/Live/Jabber logs are important works of reference and must be kept up to date with the system locale.

I failed to find any directly suitable or functional shell one-liners for this operation, until Thomas handed me something that worked for me.

The conversion command later on this page converts all files in Irssi’s default log location, ~/irclogs, and its subdirectories from ISO-8859-15 to UTF-8. recode performs the conversion in place, overwriting each file in its current location. Don’t run with scissors: do yourself a favor and make a backup copy of your precious logs first. The most obvious tool for that is perhaps:

"cp -r ~/irclogs ~/backup_irclogs"

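Since the conversion overwrites files in place, it may be worth checking which logs actually need it before touching anything. Running the conversion on a file that is already UTF-8 would garble its non-ASCII characters. Below is a rough sketch of such a check, not part of the original recipe: it assumes iconv is installed, and the function name list_non_utf8 is made up for illustration. Pure ASCII files count as valid UTF-8, so they are not listed.

```shell
# list_non_utf8 DIR (hypothetical helper): print every regular file under DIR
# that is NOT valid UTF-8, i.e. a likely candidate for conversion.
# Assumption: iconv is available; it exits non-zero on invalid UTF-8 input.
list_non_utf8() {
    find "$1" -type f -exec sh -c '
        for f do
            # Decoding UTF-8 as UTF-8 only succeeds if the file is already valid.
            iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}
```

Something like "list_non_utf8 ~/irclogs" then prints the files that still look like legacy-encoded text. Note that it cannot tell ISO-8859-15 apart from other 8-bit charsets.
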
This is Thomas’ conversion command:

"find ~/irclogs/* | while read i; do echo "Converting $i"; recode ISO-8859-15..UTF-8 "$i"; done"

Stig later informed me about find having exec capabilities, but since I’m lazy and all that, you’ll have to optimize the above version yourself.
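For the less lazy, here is one way the find -exec variant Stig hinted at might look. This is a sketch rather than anything Stig or Thomas actually wrote, and the function name convert_logs is invented for this example. Compared to the loop above, it passes only regular files to recode (-type f skips directories) and does not depend on how the shell's read treats backslashes or leading spaces in file names.

```shell
# convert_logs DIR (hypothetical helper): recode every regular file under DIR
# from ISO-8859-15 to UTF-8, in place. Make a backup first.
convert_logs() {
    # -type f skips directories; -print echoes each file name
    # just before recode is run on that file.
    find "$1" -type f -print -exec recode ISO-8859-15..UTF-8 {} \;
}
```

Usage would be "convert_logs ~/irclogs". Like the original command, it should be run only once per file, since a second run would re-encode text that is already UTF-8.
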

©2010 Thomas Nybergh