Random musings from my awakening dementia...
06.09.2004  
Applescript and Perl and Unicode ... Oh my!
 

I've been a computer geek since I was a boy, and thoughts related to computers and software engineering get dropped here for the benefit of humanity and my own hubris.

© 2004-2005, Howard Abrams



Except where otherwise noted, all original content is licensed under a Creative Commons License.

You may have noticed that on the right side of my web site is a blurb about what I am listening to (yeah, I mentioned this before). Anyway, playing anything from another country would result in garbage on the screen. Thought I would mention what the hell I go through for stoopid scripts.

First of all, Perl 5.8 does support Unicode … but the problem is the plethora of Unicode-related modules on CPAN. Picking and choosing between them is quite interesting, especially since I didn’t really know what I was dealing with… I mean, the text I was getting from my Applescript call was a string of ordinary text with some “other values” thrown in. (BTW: I highly recommend grabbing Unicode Checker, a program that displays lots of great information about each Unicode character.)

Second, it is good to finally know that iTunes returns text to Applescripts as UTF-8 (that’s one of several encodings of Unicode … yes, the lovely thing about standards is there are so many to choose from, including Unicode).
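To make the "garbage on the screen" concrete, here is a minimal sketch of what UTF-8 does to a single accented character. It uses the core Encode module, and the sample character is my own choice for illustration, not something taken from the iTunes output.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative sample: the single Unicode character U+00F6 (o with umlaut).
my $char = "\x{F6}";

# Encoded as UTF-8 it becomes two bytes.
my $bytes = encode('UTF-8', $char);
my $hex   = join ' ', map { sprintf '%02X', ord } split //, $bytes;

print "UTF-8 bytes: $hex\n";
printf "characters: %d, bytes: %d\n", length($char), length($bytes);
```

A program that reads those two bytes as two separate Latin-1 characters shows "Ã¶" instead of "ö", which is the kind of junk I was seeing.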

But Perl does not treat incoming UTF-8 bytes as characters on its own … it keeps Unicode strings in its own internal form. In order to convert things so that Perl can deal with them, we need to use the following module:

use Unicode::Transform;
# Convert the UTF-8 bytes from iTunes into a Perl Unicode string
my $uni = utf8_to_unicode($original_text);
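If Unicode::Transform is not installed, the core Encode module (bundled with Perl since 5.8) should be able to do the same decoding step. This is a sketch of that alternative, not what my script actually runs; the hard-coded `$original_text` is a hypothetical stand-in for whatever the Applescript call returns.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: UTF-8 bytes for a five-letter name with an o-umlaut,
# standing in for the text returned by the Applescript call.
my $original_text = "Bj\xC3\xB6rk";

# Decode the raw bytes into a Perl character string.
my $uni = decode('UTF-8', $original_text);

printf "%d bytes in, %d characters out\n", length($original_text), length($uni);
printf "third character is U+%04X\n", ord(substr($uni, 2, 1));
```

The byte count and the character count differ because the two-byte sequence collapses into one character once Perl knows it is looking at UTF-8.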

But what I wanted was to convert the text over to HTML escape sequences. Let’s grab yet-another-perl-module:

use HTML::Entities;
# Replace markup and non-ASCII characters with HTML entities
my $esc = encode_entities($uni);
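For a feel of what the whole bytes-to-HTML trip does, here is a rough, core-modules-only sketch of both steps together. The helper name `utf8_to_html` and the sample string are hypothetical, and the regex is a simplified stand-in for HTML::Entities: it always writes decimal numeric references, whereas HTML::Entities typically uses named entities where they exist.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn UTF-8 bytes into ASCII-only HTML text.
sub utf8_to_html {
    my ($bytes) = @_;

    # Step 1: bytes -> Perl character string.
    my $text = decode('UTF-8', $bytes);

    # Step 2: replace markup characters and anything outside printable
    # ASCII with a decimal numeric character reference.
    $text =~ s/([<>&"]|[^\x20-\x7E])/'&#' . ord($1) . ';'/ge;

    return $text;
}

# Illustrative input: two o-slash characters plus some markup characters.
print utf8_to_html("Sm\xC3\xB8rrebr\xC3\xB8d & <b>"), "\n";
```

Since the result is plain ASCII, it no longer matters which character set the web page declares.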

Now things look and work fabulously … as you can see when I listen to some of my Nordic Punk.