Machine translation: Difference between revisions
::Yes, I have seen the results of Japanese and Chinese deconversions. They're actually OK (in the case of Chinese; the Japanese ones are grammatically OK but most of the vocabulary seems to be missing), but the emphasis here is that UNL is a ''work-in-progress'' and should not be judged in its present form. It may be at least semi-sucky at the stage it is in right now, but hopefully it will improve. Also, I've noticed lots of users on this page say stuff about how MT produces sucky results. Well, if you think about the advance of MT technology like the advance of computers, then it makes more sense. If you just look up each word in a dictionary, well, that is the original method of machine translation. When you start to add some grammar, that is what comes next. When you start to add more grammar and even some context-sensitivity, that's even better. Then there come the more advanced things: the "reverso method" (trying to get a translation so that the back translation matches the original as closely as possible), neural networks, UNL, etc... These methods produce much better results than those before them, and those before them produce much better results than those before THEM... ASASF... [[w:en:User:Node_ue|Node]]
:::My project www.babelcode.org has the same goal as UNL, but has a better underlying theory and a better development strategy. [[w:en:User:bootedcat|bootedcat]]
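The "reverso method" described in this thread (prefer the translation whose back translation is closest to the original) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the candidate list, the toy `BACK_TABLE` standing in for a real back-translation engine, and the use of a character-level similarity ratio from Python's standard `difflib` as the closeness measure. It is not how any particular MT system works.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Character-level similarity in [0, 1] between two strings (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pick_by_back_translation(original, candidates, back_translate):
    """Return the candidate whose back translation is most similar to the original."""
    return max(candidates, key=lambda c: similarity(original, back_translate(c)))

# Hypothetical toy data: two Portuguese candidates for an ambiguous English
# phrase, and a lookup table playing the role of the back-translation step.
BACK_TABLE = {
    "o banco do rio": "the river bench",
    "a margem do rio": "the river bank",
}

original = "the river bank"
candidates = list(BACK_TABLE)
best = pick_by_back_translation(original, candidates, BACK_TABLE.get)
print(best)
```

With this toy table the second candidate is chosen, because its back translation matches the original exactly. A real system would need an actual translation engine in both directions and probably a better similarity measure than raw character overlap.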
Revision as of 00:30, 25 August 2004
Help us develop tools for translating Wikipedia.
The purpose of the Wikipedia Machine Translation Project is to develop ideas, methods and tools that can help translate Wikipedia articles from one language to another (particularly out of English and into languages with small numbers of fluent speakers).
Motivation
TradWiki/WikiTran
License
All code and data should be released under a free licence (GFDL).
Advantages
TradWiki/WikiTran - Translation memory approach
Lexical, syntactic and semantic analysis of Wikipedia content
Information about the most popular sentences and expressions can be used to create a translation database of such expressions so translators don't need to repeat a translation.
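As a rough illustration of the translation-database idea, the sketch below counts how often each sentence occurs across a set of articles and then reuses stored translations for exact matches. The sentence splitter, the `TranslationMemory` class, and the sample articles are all hypothetical and deliberately naive; a real translation memory would also need fuzzy matching and proper sentence segmentation.

```python
import re
from collections import Counter

def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def most_common_sentences(articles, top_n=3):
    """Count exact sentence repeats across articles and return the most frequent."""
    counts = Counter()
    for text in articles:
        counts.update(split_sentences(text))
    return counts.most_common(top_n)

class TranslationMemory:
    """Hypothetical exact-match store of source sentence -> translation."""

    def __init__(self):
        self._store = {}

    def add(self, source, target):
        self._store[source] = target

    def translate(self, text):
        """Return (sentence, stored translation or None) for each sentence."""
        return [(s, self._store.get(s)) for s in split_sentences(text)]

# Made-up sample articles that share one sentence.
articles = [
    "Lisbon is the capital. It is a city in Portugal.",
    "Porto lies on the Douro. It is a city in Portugal.",
]
frequent = most_common_sentences(articles, top_n=1)

memory = TranslationMemory()
memory.add("It is a city in Portugal.", "É uma cidade em Portugal.")
result = memory.translate("Braga is old. It is a city in Portugal.")
print(frequent)
print(result)
```

Sentences that come back with `None` are the ones a human translator still has to handle; the repeated sentence is translated once and reused.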
Resources:
Links
References:
Discussion
Translate or Write from scratch?
Are you sure the other language wikipedias would rather translate text than write it themselves? It seems to me that it's almost more effort to translate text than to write an article yourself. For instance, I run the Esperanto wikipedia or eo: and I think we appreciate the international nature of our articles. I was wondering if other second-language wikipedias would feel the same way. I mean do the other language wikipedias want it?
T14N, I18N, L10N
It's true that the effort to write articles is almost the same as the effort to translate. But there are some exceptions. If you are not an expert in the topic, it's easier to translate than to write. On the other hand, it's clear to me that the number of contributors to the Portuguese encyclopedia is very small. You must consider the fact that Portuguese is not a second language, but the first language of millions of people. Unfortunately, very few of those millions have access to the internet and/or have an education. A free encyclopedia would be an extraordinary resource for those people, so every effort to speed the creation of the Portuguese version is welcome. Of course, people who write for a second-language wikipedia like Vikipedio have different purposes, do it for fun, and are not interested in Machine Translation. PS: You may be interested in knowing that the Traduki project uses Esperanto for the deeper word representation to achieve machine translation. user:joao.

Point well made. It would be especially good for the minority languages. I was aware of the Traduki project and it looks interesting, although it looks like nothing has happened on the project lately... maybe I'm wrong. I actually looked at the pages again yesterday. ...and since I'm going to start learning Portuguese soon (I plan to visit Brazil next August), I'll probably take a closer look at it later. I now know Sim, Não and Obrigado. :)

Make Auto-translations available?
Now that I think more about it, I'd like to see auto-translation so I can get rough translations of encyclopedias in non-English, non-Esperanto language wikipedias. Seems like in a future version we could have a drop-down list on each page that could translate a page for us and also give a link to the article on another language wikipedia if it exists. I think there are already free services that do this, does anyone know? --Chuck Smith

There are some links to such services above under "Free translations on the web" Joao

Has anyone seen Google Translate at http://translate.google.com/translate_t ?

Would an automatic translation script be run only once for each article, multiple times at an interval, immediately when changes are made or immediately on demand by a reader? If only once or by an interval, how would article conflicts be handled? 24.198.63.192 03:52 Oct 18, 2002 (UTC)

Machine translation can give the best of both worlds:
...or Not
Yeah, I'm not so big on the idea anymore either. I do think it's interesting as an extremely long term project, though. I've noticed that machine translation is adequate for getting the general meaning across but isn't very pleasing to the eye. If it has to be used, it'd probably best be used to populate a blank page so that native speakers of the language can clean it up in normal wikiwiki style. -- Daniel Thomas

I only regret that the whole discussion is in English only, and that the participants start from the assumption that the English-language Wikipedia is the main cultural source. It would indeed be worthwhile to translate already existing articles, but in all directions (not necessarily only from English). And so far my experience has been that automatic translators give awful results. Arno Lagrange

Fixing Auto-translators
When I use automated translation, I usually observe two problems:
Both seem to be caused by ambiguities. So, my idea is:
This would require each language to add two additional wikis to the 'presentable' version: one for disambiguated texts, and one as a collection pool for raw translations. Sloyment 12:47, 22 Oct 2003 (UTC)

Some examples of how the above procedure could work:
The assumption behind this idea is that it would be easier to disambiguate a text than to translate it, and that it is easier to correct an automated translation that has only a few mistakes in it than to correct the rubbish that current translation programs produce. Sloyment 14:59, 22 Oct 2003 (UTC)

There are other problems. Some languages may not have words or phrases for certain technical concepts because no native speaker has ever needed them before. This is particularly true of languages with small numbers of native speakers in rural settings. It may be difficult to automatically translate an article on co-routines, for instance, because ideas like subroutine, co-routine, time-sharing and multi-tasking have never been put into words in that particular language before. A human translator can normally use a bit of imagination to invent a new term or reuse a term previously used for an analogous existing concept and, if the translator is any good, the result will fit into the language reasonably well. However, a machine can do little better than to leave the untranslatable term untranslated and mark it for human attention. -- Derek Ross 16:05, 26 Mar 2004 (UTC)

Other wondering
Three main things I'm wondering about.
So essentially, if I knew any programming language other than HTML (hey, I'm only 14, though I am going to begin taking CC courses in C or some crap like that over the summer) and I were to make MT software, it would incorporate all 3 of these. I think that a lot of the programming behind neural networks is available for free online to plug into whatever you want, so that (afaik) wouldn't be very hard, except maybe the customization part. UNL, at its best, claims a 99% accuracy rate. I have seen UNL at work. The English deconversions are fantastic, though they do leave something to be desired. As far as I can tell from what others have told me, though, the deconversions for languages such as Russian and Italian are - though one can get what they say - totally ungrammatical. --Node_ue 03:11, 7 Apr 2004 (UTC)