[Ruby] Ruby & i18n

Mon Sep 30 11:24:29 MSD 2002

Hi all,

I think Rider's topic is very relevant.

I would also like to ask:

How is Unicode support (UTF-8 in particular) implemented in Ruby?
Is there any documentation on using Unicode in Ruby?
If anyone has hands-on experience, please share some examples.

All I have managed to find is an email with an example and a
conversion module: http://www.yoshidam.net/uconv-0.4.10.tar.gz .

Subject: [ruby-talk:27350] Re: Unicode string for the standard library ?
From: TAKAHASHI Masayoshi <maki at open-news.com>
Date: Mon, 3 Dec 2001 23:43:33 +0900
References: * * *
In-reply-to: *

Rik Hemsley <rik at kde.org> wrote:
> > the utf-8 representation works fine for me, as regexes do support them.
>
> Is UTF-8 really ok ?

A Ruby string is a byte sequence, but Ruby's regex engine supports
UTF-8 strings. For example,  /./u  matches one UTF-8
character (one code point, which may span several bytes).

###------------------------------------------------------------
# Ruby UTF-8 handling sample

## Notice: In this sample, 'character' means 'code point'.

## split by character
str = "soci\303\251t\303\251"  # societe; 'e' is with acute
p str.split(//u)
#=> ["s", "o", "c", "i", "\303\251", "t", "\303\251"]

## character match
str.scan(/.(.)./u){|c|
  p c[0]
}
#=> "o"         ## 2nd character
#=> "\303\251"  ## 5th character

## length
p str.split(//u).length # => 7   ## character length
p str.split(//n).length # => 9   ## byte length
p str.length            # => 9   ## byte length

## replace
str2 = str.gsub(/\xC3\xA9/u,"e")
p str2 #=> "societe"

## other replace
str3 = str2.gsub(/e/u,"\xC3\xA9")
p str3 #=> "soci\303\251t\303\251"

###------------------------------------------------------------
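
For comparison, here is a rough sketch of the same operations assuming
Ruby 1.9 or later, where every string carries an encoding and the /u
flag is no longer needed for UTF-8 data. This is an assumption about the
reader's interpreter, not part of the quoted email, and the variable
names are purely illustrative.

```ruby
# Sketch for Ruby 1.9+ (assumption; the sample above targets older Ruby).
# \u escapes always yield a UTF-8 string: same text as the sample's str.
str = "soci\u00E9t\u00E9"

chars      = str.chars                 # split by character (code point)
char_count = str.length                # => 7, length in characters
byte_count = str.bytesize              # => 9, length in bytes
plain      = str.gsub("\u00E9", "e")   # replace e-acute with plain "e"
middle     = str.scan(/.(.)./).flatten # middle character of each 3-char match

p chars, char_count, byte_count, plain, middle
```

Here String#length counts characters and String#bytesize counts bytes,
so the split(//u) and split(//n) tricks are not needed to tell the two
lengths apart.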

Bye,
Sergey