Monday, November 1, 2010

Be part of the problem, or the solution

Well, I'm still alive.

I've just decided that unless I'm actually going to write an opinionated rant about something, it's not going to be very interesting, so I'm not really posting updates about whatever book I've just read or am reading (though "The Elements of Computing Systems" is awesome so far!).

Anyway, today, a little bit of a rant.

I *hate* Shift-JIS.

We're in the era of Unicode now, or at least we should be. It can represent pretty much any character you can think of, its UTF-8 encoding is fully ASCII compatible, and it's well supported in all the major development APIs, and yet so many places are still using ancient artifacts from the past.

I work at a Japanese company, so unfortunately I have to deal with Shift-JIS and EUC-JP a lot, generally in the context of turning them into UTF-8. Whether it's taking search queries and figuring out their encoding, or dealing with buggy source code because someone thought it would be a good idea to mix encodings across source files, encodings other than Unicode upset me a lot.
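For the "figure out the encoding" part, here's a minimal sketch of the usual trial-and-error approach, in Python. The function name `decode_japanese` and the candidate order are my own assumptions for illustration, not anything from our actual codebase. It just tries strict decoding with each candidate and keeps the first one that works.

```python
# Candidate encodings, tried in order. UTF-8 goes first because it is the
# strictest: random Shift-JIS or EUC-JP bytes rarely form valid UTF-8.
# "cp932" is Python's name for the Windows flavour of Shift-JIS.
CANDIDATE_ENCODINGS = ("utf-8", "euc_jp", "cp932")


def decode_japanese(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_name) for the first candidate that decodes cleanly.

    This is a heuristic, not real detection: some byte sequences are valid
    in more than one encoding, and then the earlier candidate wins.
    """
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("input is not valid in any candidate encoding")


def to_utf8(raw: bytes) -> bytes:
    """Re-encode input of unknown Japanese encoding as UTF-8."""
    text, _ = decode_japanese(raw)
    return text.encode("utf-8")
```

The weak spot is ambiguity. For example, a run of half-width katakana in Shift-JIS can also look like valid EUC-JP, so short inputs like search queries will sometimes be guessed wrong no matter what order you pick.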

Apparently some big shops still haven't entered the modern era either. Try loading a UTF-8 encoded CSV file with Japanese in it into Excel.

Garbage!
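To see roughly what that garbage looks like, here's a small Python sketch. It assumes the reader of the file interprets the bytes as Shift-JIS (Python's `cp932`), which is my guess at what happens on a Japanese Windows setup rather than something I've verified inside Excel.

```python
text = "日本語"

# The bytes as they sit in a UTF-8 encoded CSV file.
utf8_bytes = text.encode("utf-8")

# What you get when those same bytes are read as Shift-JIS instead.
# errors="replace" stands in for bytes that are not valid Shift-JIS.
mojibake = utf8_bytes.decode("cp932", errors="replace")

print(repr(text), "->", repr(mojibake))
```

Nothing is wrong with the bytes themselves; the same file decodes perfectly if you read it as UTF-8.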

Thanks, Microsoft. It's kind of sad, because overall Excel is a good product. Anyway, OpenOffice (or LibreOffice, now that a lot of the devs ran from Oracle) loads them just fine, and it gives you an option to select the encoding if you do decide to use one of these dinosaur-age encodings. (I'm sure Excel has a setting for this somewhere, but I don't use Excel anymore, and the people who reported the problem to me couldn't find one, which means neither will the vast majority of users.)

So, when our software puts out CSV files for customers, how do we encode them?

In Shift-JIS. It makes me hurt. But what can we do when the vast majority of people use Excel? What does one do in a situation like this?
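For what it's worth, here's a rough Python sketch of the two options I know of. The helper name `csv_bytes` is made up for illustration. The first call produces the Shift-JIS output described above (using `cp932`, the Windows variant). The second writes UTF-8 with a byte order mark via Python's `utf-8-sig` codec. As far as I know, Excel on Windows uses that mark to recognize the file as UTF-8, but treat that as an assumption to check against the Excel versions your customers actually run.

```python
import csv
import io


def csv_bytes(rows, encoding="cp932"):
    """Serialize rows to CSV and encode the result as bytes.

    With the default "cp932", characters that Shift-JIS cannot represent
    are replaced by "?" instead of raising, which silently loses data.
    """
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode(encoding, errors="replace")


rows = [["名前", "金額"], ["田中", "1000"]]

# Option 1: the dinosaur-age encoding that Excel opens without complaint.
sjis_csv = csv_bytes(rows, "cp932")

# Option 2: UTF-8 with a leading byte order mark (EF BB BF).
bom_csv = csv_bytes(rows, "utf-8-sig")
```

The trade-off is that the Shift-JIS file can't hold anything outside its character set, while the BOM can trip up other tools that expect plain UTF-8.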
