by someone
From time to time I have thought about the problem of creating a multi-lingual community blog, and I have some
ideas about what such a platform would require. First, I
think a hybrid machine/person translation system is a must. We cannot
expect to hand-translate every comment, and one can often get a sense
of what a text says from the machine version. A machine version
should initially be generated for all contributions, and improvements
should then be user-submitted for diaries and substantial
comments. The key, I think, is a true multi-lingual community blog. As
in, users with different languages should be reading the same
material and responding to each other's comments. If all we manage to
create are parallel language communities with the occasional crossover
diary, it is in my opinion a failure. Given this, how might a
multi-lingual interface work?
This requires an interface that allows the translations of a
submission to be edited by any user. These edits must be
tracked in a database where changes can be viewed and rolled back. I
foresee two parallel rating systems, one for translations and
another for content, as is now implemented. A machine translation would
be labeled as such and have a rating of '0', to be improved by user
editing and subsequent up-rating.
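As a rough illustration of the tracking idea, here is a minimal in-memory sketch of a revision log with rollback. The class and field names (`Revision`, `TranslationHistory`, `editor=None` for machine output) are assumptions made for this example, and a real site would keep this in a database rather than a Python list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    text: str
    editor: Optional[str]  # None marks machine output (hypothetical convention)
    rating: float = 0.0    # machine translations start at 0, as proposed above


@dataclass
class TranslationHistory:
    """Append-only log of every version of one translated submission."""
    revisions: List[Revision] = field(default_factory=list)

    def current(self) -> Revision:
        """Return the version currently shown to readers."""
        return self.revisions[-1]

    def edit(self, text: str, editor: str) -> Revision:
        """Record a user edit as a new revision, keeping all earlier ones."""
        rev = Revision(text=text, editor=editor)
        self.revisions.append(rev)
        return rev

    def roll_back(self, index: int) -> Revision:
        """Restore an earlier revision by appending a copy of it.

        Nothing is deleted, so the rollback itself can be viewed and undone.
        """
        old = self.revisions[index]
        restored = Revision(text=old.text, editor=old.editor, rating=old.rating)
        self.revisions.append(restored)
        return restored


history = TranslationHistory([Revision("machine translated text", editor=None)])
history.edit("text improved by a user", editor="alice")
history.edit("vandalised text", editor="mallory")
history.roll_back(1)  # go back to alice's version
```

Rolling back by appending, rather than deleting, keeps the full change list available for viewing, which is what the proposal asks for.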
Something like a per-user translation score should be kept. That is, users who reliably (by community voting) produce good translations would be able to edit a piece of text and then self-rate the output at a higher score, even before any voting. Also, vote weighting should be tied to user skill in the relevant language pair: someone with good knowledge of both languages should have their vote count for more in determining the translation score for that piece. The user's score would be determined by that user's translation contributions.

Ideally there should be some way to mark sections of a text, as in "text mostly good, but there is a problem here." Markup (color-coded?) of translated texts should thus also be displayable, either by default or with a single click and a dynamic update.

All this should be seamless. One should not have to open a separate interface to view or edit translations. The user account must accommodate options for both content display and translation activity. For example, I may wish to see the site primarily in English, but if the original content is in French, I would like it displayed too whenever the translation is rated below X, as I can read some French and may be able to help improve the translation. Side-by-side display in a format similar to the current ET translation columns is probably good.

Even users without knowledge of a language can help with translations, as machine-translated output is generally quite comprehensible but clearly broken in some way. Fixing it usually involves switching the word order here and there, adjusting some verb tenses, correcting personal pronouns, and changing awkward phrasing where the meaning still seems clear. Thus, there should be a way for the translation rating to indicate that a piece of text has been edited for grammar and readability, but without knowledge of the original language. Maybe with a dual translation score, for correctness and for readability?
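Two of the rules above are concrete enough to sketch: skill-weighted voting on a translation, and showing the original when the translation is rated below X. This is only one possible reading. The weighting formula, the `base_weight` parameter, and all names here are assumptions for illustration, not a worked-out design.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Vote:
    value: float        # the rating the voter gave the translation
    voter_skill: float  # voter's translator score for this language pair (>= 0)


def weighted_rating(votes: List[Vote], base_weight: float = 1.0) -> float:
    """Average the votes, weighting each one by base_weight + voter skill.

    base_weight is a hypothetical knob: it lets users with no translation
    record still count for something. With no votes the result is 0.0,
    matching the '0' given to unrated machine output.
    """
    total_weight = sum(base_weight + v.voter_skill for v in votes)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum((base_weight + v.voter_skill) * v.value for v in votes)
    return weighted_sum / total_weight


def show_original(rating: float, threshold: float,
                  source_lang: str, reader_langs: Set[str]) -> bool:
    """Show the original next to the translation when the reader knows the
    source language and the translation is rated below the threshold (X)."""
    return source_lang in reader_langs and rating < threshold


votes = [Vote(value=4.0, voter_skill=3.0), Vote(value=1.0, voter_skill=0.0)]
score = weighted_rating(votes)  # the skilled voter's 4.0 dominates
side_by_side = show_original(score, threshold=3.5,
                             source_lang="fr", reader_langs={"en", "fr"})
```

A dual correctness/readability score could reuse `weighted_rating` twice, once per dimension, each with its own skill value for the voter.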
Some careful thought would have to be put into the operation of the rating system to have properly scored translations. A user may, for example, make some minor improvements to a text, leaving it better off than before but still not perfect. In this case we would not want the user's translation score to be brought down by low ratings. Thus, any improvement in the score after an edit should be counted as positive.

A mock-up example
I chose an Italian article. I speak no Italian. Thus, the only
improvements I can make to the text are to the readability of the
English output. This is what someone on the "English only" version of
the site would see (or on some version that does not include Italian
as a language).

> Italian (original)-->English, rated: (0/0 ; 2.0/0)
ANSA.it - Afghanistan, 2,500 Italians between Kabul and Herat

ROME - There are bout 2500 Italian soldiers in Afghanistan. The two main contingents as are equally divided between the capital Kabul and Herat, in the western part of the country. Both incorporated in the NATO mission ISAF. For Eupol, the European Union mission to rebuild the local civil police, Italy participate with a dozen Carabinieri, while a core group of border police is responsible for the training of the customs police. But the Italian contingent in Afghanistan is scheduled for changes: in August, in fact, there could be a drastic reduction of troops deployed in Kabul and a corresponding increase of those in Herat, where there is now already a brigade type structure. All this in view of greater responsibility for the Afghan forces in the capital.

Rate ∨ / Comment readability of translation | Request Improved Translation | Edit translation

To comment on or rate a particular sentence or section, highlight it and press the relevant rate button.

The translation-only text comes with a drop-down menu to rate the readability of the translation. The user cannot judge the correctness of the translation, as the original text is not shown. The "Comment" button here is for commenting on the translation, rather than submitting a comment on the content. The "Request Improved Translation" button would mark this comment/diary in a way that is visible to those interested in helping out. It would also send the user an email when the translation improves, or deliver the content via the user's personalized RSS feed.

Now, clicking the arrow next to the translation block heading should replace this content with something like the following. The same content would be displayed by default to users with English/Italian language pairs in their preferences.
The two-column text comes with more options, like the ability to rate the correctness of the translation for those who know both languages. The numbers next to my name in the list of translators are my hypothetical scores as a translator. The first is zero, as I don't know Italian; the second is higher, as I am capable of producing a readable text from machine-translated output. The translation rating of the comment is the starting score that I self-rated it at. (mouse-over the various parts for meaning...)

How to implement this? Well, I have not had a look at exactly what is out there. I doubt there is anything that would work "out of the box". The site I envision would be Scoop-like in its Diary/Comments structure, but with the added translation support. I don't know if it would be easier to start from some product like Scoop and build on top, or if one would be better off starting from just a database/webserver setup. (I suspect the latter...) This is clearly a quite substantial bit of coding. But since the multilingual ET has been brought up in a few places recently, I thought I'd get some ideas on functionality out there.
Multi-lingual ET? | 36 comments (36 topical, 0 editorial, 0 hidden)