Hi Chris,
Thanks for sharing your comment. Here is the link to the public comment:
https://www.icann.org/public-comments/ird-draft-final-2015-03-09-en
For Japanese, I can confirm the difficulties of automated
transliteration: my name was mangled in many ways when dealing with
Japanese administration, banks, etc.
I think the same may happen in the case of Arabic too.
Best,
Rafik
2015-04-07 17:36 GMT+09:00 Dillon, Chris:
> Dear colleagues,
>
> Just to let you know that I have in a private capacity sent the following
> comment to the Public Comment for the Draft final report from the EWG on
> Internationalized Registration Data:
>
> ==
>
> Subj.: each form may be derived from the other
>
> I am Co-Chair of the Translation & Transliteration of Contact Information
> PDP WG, but am making this comment in a private capacity.
>
> It is a solid and practical report and obviously the result of a huge
> amount of work.
>
> I would like to make some points about Table 4 on p.11:
>
> • I feel it is important to stress that Table 4 is an ideal of clean
> data, that currently may only be produced manually and that it
> contains aspects of both transliteration (e.g. 千代田 -> Chiyoda) and
> translation (e.g. ビル -> Bldg.).
>
> • Moreover, the relationships between the original and the
> transformed records are complex and it is not possible to move
> automatically in either direction, e.g. “first” is usually 第一 /daiichi/ but
> in this case is ファースト /faasuto/ from the English, where // represents a
> transliteration. In fact there are other Japanese possibilities, but I’ll
> omit those in the interests of brevity. Unfortunately ファースト may be either
> “first” or “fast”. (One wonders whether the Japanese fast food chain
> ファースト・キッチン /faasuto kitchin/, also advertised in Japan as the translated
> form, “First Kitchen”, may have originally been a mistranslation of “Fast
> Kitchen”.)
>
> • This may affect text such as “original data could have been in
> either form” and, especially, “each form can be derived from the other”.
>
> Scripts where letters can be read in more than one way or which do not use
> spaces to define word boundaries (Japanese falls into both of these
> categories) will be the most resistant to automated
> transliteration/translation.
>
> Chris Dillon.
>
> --
>
> Research Associate in Linguistic Computing, Centre for Digital Humanities,
> UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599)
> www.ucl.ac.uk/dis/people/chrisdillon