Thu May 15th, 2008 at 11:51:04 AM EST
I have from time to time thought about the problem of creating a multi-lingual community blog, and have some
ideas on what some requirements of such a platform ought to be. First, I
think a hybrid machine/person translation system is a must. We cannot
expect to hand translate every comment, and one can often get a sense
of what it should be from the machine version. A machine version
should initially be generated for all contributions, and improvements
should then be user submitted for diaries and substantial
comments. The key, I think, is a true multi-lingual community blog. As
in, users with different languages should be reading the same
material, and responding to each other's comments. If all we manage to
create are parallel language communities with the occasional cross
over diary, it is in my opinion a failure. Given this, how might a
multi-lingual interface work?
This requires an interface that allows for the translations of a
submission to be editable by any user. This editable content must be
tracked in a database where changes can be viewed and rolled back. I
foresee two parallel rating systems, one for translations, and
another for content as is now implemented. A machine translation would
be labeled as such, and have a rating of '0', to be improved by user
editing and subsequent up-rating.
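A minimal sketch of how such an editable, roll-back-capable translation record might look. All names here (`Revision`, `TranslationHistory`, `roll_back`) are hypothetical, and an in-memory list stands in for the database; the only details taken from the text are that machine output is labeled as such and starts with a rating of 0.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    text: str
    author: Optional[str]   # None marks machine-generated output (assumption)
    rating: float = 0.0     # machine translations start at 0

    @property
    def is_machine(self) -> bool:
        return self.author is None


@dataclass
class TranslationHistory:
    revisions: List[Revision] = field(default_factory=list)

    def current(self) -> Revision:
        """The revision readers currently see."""
        return self.revisions[-1]

    def edit(self, text: str, author: str, rating: float = 0.0) -> Revision:
        """Any user may submit an improved translation as a new revision."""
        rev = Revision(text, author, rating)
        self.revisions.append(rev)
        return rev

    def roll_back(self, index: int) -> Revision:
        """Re-append an earlier revision, so the rollback itself stays visible."""
        old = self.revisions[index]
        restored = Revision(old.text, old.author, old.rating)
        self.revisions.append(restored)
        return restored


# Usage: machine output first, then a user edit, then a rollback.
history = TranslationHistory([Revision("There are bout 2500 soldiers.", None)])
history.edit("There are about 2,500 soldiers.", "someone", rating=2.0)
history.roll_back(0)
```

Appending on rollback, rather than deleting later revisions, is one possible design choice; it keeps every change viewable, as the text asks.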
Something like a per-user translation score should be kept, i.e. users
who reliably (by community voting) generate good translations would be
able to edit a piece of text and then self-rate the output at a higher
score, even before voting. Also, vote weighting should be tied to user
skill in the relevant language pairs, i.e. someone with good knowledge
of two languages should have their vote count for more in determining
the translation score for that piece. This skill score would be determined
by the translation contributions of that user. Ideally there should be
some way to mark sections of a text, as in, "text mostly good, but
there is a problem here." Markup (color-coded?) of translated texts
should thus also be displayable, either by default or with a single
click and dynamic update.
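The vote weighting described above could be sketched roughly as follows. The formula (a weighted average in which each vote weighs a base amount plus the voter's skill in the language pair) is an assumption chosen for illustration, as are the names `weighted_translation_score`, `skill` and `base_weight`; the text only says that skilled users' votes should count for more.

```python
from typing import Dict, List, Tuple

LanguagePair = Tuple[str, str]  # e.g. ("it", "en")


def weighted_translation_score(
    votes: List[Tuple[str, float]],
    skill: Dict[str, Dict[LanguagePair, float]],
    pair: LanguagePair,
    base_weight: float = 1.0,
) -> float:
    """Average the votes, weighting each by the voter's skill in `pair`."""
    total = 0.0
    weight_sum = 0.0
    for user, rating in votes:
        # Users with no recorded skill in this pair still get the base weight.
        weight = base_weight + skill.get(user, {}).get(pair, 0.0)
        total += weight * rating
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Usage: "anna" has a track record in Italian-English, "bob" has none.
skill = {"anna": {("it", "en"): 3.0}}
votes = [("anna", 4.0), ("bob", 2.0)]
score = weighted_translation_score(votes, skill, ("it", "en"))
```

Here anna's vote weighs 4 and bob's weighs 1, so the piece scores 3.6 rather than the unweighted 3.0.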
All this should be seamless. One should not have to open a separate
interface to view/edit translations. The user account must accommodate
options for both content display and translation activity. As in, I
may wish to see the site primarily in English, but if the original
content is in French, I would like this displayed too if the
translation is rated below X, as I can read some French and maybe help
improve the translation. Side by side display in a format similar to
current ET translation columns is probably good.
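As a rough sketch, the display rule in this paragraph might reduce to a small function over the user's preferences. The names `DisplayPrefs`, `also_reads` and `columns_to_show` are hypothetical, and `threshold` plays the role of the "X" above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class DisplayPrefs:
    primary: str                 # language the user mainly reads the site in
    also_reads: FrozenSet[str]   # languages the user can read somewhat
    threshold: float             # show the original if the rating is below this


def columns_to_show(prefs: DisplayPrefs, original_lang: str,
                    translation_rating: float) -> List[str]:
    """Decide which text columns to render for one piece of content."""
    if original_lang == prefs.primary:
        return ["original"]
    if original_lang in prefs.also_reads and translation_rating < prefs.threshold:
        # Poorly rated translation in a language the user can read: show both
        # side by side, so the user can help improve it.
        return ["original", "translation"]
    return ["translation"]


# Usage: an English reader with some French and a threshold of 3.0.
prefs = DisplayPrefs(primary="en", also_reads=frozenset({"fr"}), threshold=3.0)
view = columns_to_show(prefs, "fr", 2.0)
```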
Even users without knowledge of a language can help with translations,
as machine-translated output is generally quite comprehensible, but
clearly broken in some way. Fixing it usually involves switching the word
order here and there, adjusting some verb tenses, correcting personal
pronouns, and changing awkward phrasing where the meaning still seems
clear. Thus, there should be a way for the translation rating to
indicate that a piece of text has been edited for grammar and
readability, but without knowledge of the original language. Maybe
with a dual translation score, for correctness and for readability?
Some careful thought would have to be put into the operation of the
rating system to have properly scored translations. A user may, for
example, make some minor improvements to a text, leaving it better off
than before, but still not perfect. In this case we would not like the
user's translation score to be brought down by low ratings. Thus, any
improvement in the score after an edit should be seen as positive.
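One way the dual score and the "improvement counts as positive" rule could fit together is sketched below. `DualScore` and `credit_for_edit` are hypothetical names, and crediting only the per-axis gain (with no penalty when a rating falls) is an assumption; the text does not settle how a worsening edit should be treated.

```python
from dataclasses import dataclass


@dataclass
class DualScore:
    correctness: float = 0.0   # judged by users who know both languages
    readability: float = 0.0   # can be judged from the translated text alone


def credit_for_edit(before: DualScore, after: DualScore) -> DualScore:
    """Credit the editor with the gain on each axis, never with a loss."""
    return DualScore(
        correctness=max(0.0, after.correctness - before.correctness),
        readability=max(0.0, after.readability - before.readability),
    )


# Usage: a readability-only cleanup of raw machine output.
machine = DualScore(0.0, 0.0)
cleaned = DualScore(0.0, 2.0)
credit = credit_for_edit(machine, cleaned)
```

Under this rule an edit that lifts a text from 0 to a still-mediocre 2.0 earns the editor positive credit instead of dragging their score down.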
A mock-up example
I chose an Italian article. I speak no Italian. Thus, the only
improvements I can make to the text are ones of readability of the
English output. This is what someone on the "English only" version of
the site would see. (or on some version that does not include Italian
as a language)
(Mouse-over ratings to see what they mean...)
> Italian (original)-->English, rated: (0/0 ; 2.0/0)
ANSA.it - Afghanistan, 2,500 Italians between Kabul and Herat
ROME - There are bout 2500 Italian soldiers in Afghanistan. The two main contingents as are equally divided between the capital Kabul and Herat, in the western part of the country. Both incorporated in the NATO mission ISAF. For Eupol, the European Union mission to rebuild the local civil police, Italy participate with a dozen Carabinieri, while a core group of border police is responsible for the training of the customs police. But the Italian contingent in Afghanistan is scheduled for changes: in August, in fact, there could be a drastic reduction of troops deployed in Kabul and a corresponding increase of those in Herat, where there is now already a brigade type structure. All this in view of greater responsibility for the Afghan forces in the capital.
Rate ∨/Comment readability of translation
Request Improved Translation
Edit translation
To comment/rate a particular sentence or section, highlight it and press the relevant rate button.
The translation-only text comes with a drop-down menu to rate the
readability of the translation. The user cannot judge the correctness
of the translation as the original text is not shown. The "Comment" button here is for commenting on the translation, rather than for submitting a comment on the content. The "Request
Improved Translation" button would mark this comment/diary in a way
that is viewable to those interested in helping out. It would also send the user an
email when the translation improves, or deliver the content via the
user's personalized RSS feed.
Now, clicking the arrow next to the translation block heading should replace this content with something like the following. The same content would be displayed by default to users with English/Italian language pairs in their preferences.
|∨ Italian (original)-->English, rated: (0/0 ; 2.0/0)|
∨ Translation mark-up guide √ Show markup
- someone (0/0;3/0) -- Did some changes to improve readability of the English
- other translators...
eventual notes on mouseover
red: to be changed
orange: inferior quality
|Italian (Original) ∨||English (translation) ∨|
|ANSA.it - Afghanistan, 2.500 italiani tra Kabul e Herat|| ANSA.it - Afghanistan, 2,500 Italians between Kabul and Herat |
|ROMA - I militari italiani in Afghanistan sono circa 2.500. Due i contingenti principali in cui sono equamente divisi, nella capitale Kabul e a Herat, nell'ovest del Paese, entrambi inseriti nella missione Isaf della Nato. Ad Eupol, la missione dell'Unione europea per la ricostruzione della polizia civile locale, partecipano invece una decina di carabinieri, mentre un nucleo di finanzieri si occupa della formazione della polizia doganale. Ma per il contingente italiano in Afghanistan sono previste novità: da agosto, infatti, potrebbe esserci una drastica riduzione dei soldati schierati a Kabul e un corrispondente incremento di quelli ad Herat, dove già adesso esiste una struttura di tipo brigata. Tutto ciò anche in vista di una maggiore assunzione di responsabilità, nell'area della capitale, da parte delle forze afgane.||ROME - There are bout 2500 Italian soldiers in Afghanistan. The two main contingents as are equally divided between the capital Kabul and Herat, in the western part of the country. Both incorporated in the NATO mission ISAF. For Eupol, the European Union mission to rebuild the local civil police, Italy participate with a dozen Carabinieri, while a core group of border police is responsible for the training of the customs police. But the Italian contingent in Afghanistan is scheduled for changes: in August, in fact, there could be a drastic reduction of troops deployed in Kabul and a corresponding increase of those in Herat, where there is now already a brigade type structure. All this in view of greater responsibility for the Afghan forces in the capital.|
|Rate ∨/Comment correctness of translation|
Rate ∨/Comment readability of translation
To comment/rate the translation of a particular sentence or section, highlight it and press the relevant rate button.
The two-column text comes with more options, such as the ability to rate the correctness of the translation for those who know both languages. The numbers next to my name in the list of translators are my hypothetical scores as a translator. The first is zero as I don't know Italian; the second is higher as I am capable of producing a readable text from machine-translated output. The translation rating of the comment is the starting score that I self-rated it at. (mouse-over the various parts for meaning...)
How to implement this? Well, I have not had a look at exactly what is
out there. I doubt there is anything which would work "out of the box". The
site I envision would be Scoop-like in its Diary/Comments structure,
but with the added translation support. I don't know if it would be
easier to start from some product like Scoop and build on top, or if one would be
better off starting from just a database/webserver setup. (I suspect the latter...) This is clearly a
quite substantial bit of coding. But since the multilingual ET has
been brought up in a few places recently, I thought I'd get some ideas
on functionality out there.