Machine translation: Difference between revisions
Revision as of 16:12, 15 November 2001
The purpose of the Wikipedia Machine Translation Project is to develop ideas, methods and tools that can help translate Wikipedia into non-English languages.
Motivation
Wikipedias in small languages can't produce articles as fast as the English Wikipedia because they have too few Wikipedians. One solution to this problem is to translate the English Wikipedia, but some languages will not have enough translators. Machine translation can improve the productivity of the community.
TradWiki
TradWiki (WikipediaTranslator/WikiTranslator/BabelWiki) is a wiki, yet to be coded, that helps Wikipedians translate articles from English into other languages.
License
All code and data should be released under a free license.
Advantages
- faster translation of Wikipedia
- generation of large amounts of useful data (corpora)
- creation of a useful tool
Lexical, syntactic and semantic analysis of Wikipedia content
The first step in translating Wikipedia is to analyze its content. This analysis will determine:
- Number of words and sentences
- Word distribution
- Frequency of the most popular sentences and expressions
- Semantic relations between words and between sentences
- Syntactic analysis of all sentences
Information about the most popular sentences and expressions can be used to build a translation database of those expressions, so that translators don't need to translate the same text more than once.
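The counting part of this analysis can be sketched in a few lines of Python. This is only an illustration, not part of any existing project code: the function names (`split_sentences`, `tokenize`, `analyze`) are hypothetical, the sentence splitter is deliberately naive, and syntactic and semantic analysis are not attempted.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): a sentence ends at '.', '!' or '?'
    followed by whitespace. Abbreviations will be split incorrectly."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text):
    """Lowercase the text and return runs of letters, digits and underscores."""
    return re.findall(r"\w+", text.lower())


def analyze(text, top=3):
    """Count words and sentences and list the most frequent of each."""
    sentences = split_sentences(text)
    words = tokenize(text)
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "top_words": Counter(words).most_common(top),
        "top_sentences": Counter(s.lower() for s in sentences).most_common(top),
    }


if __name__ == "__main__":
    sample = "Lisbon is a city. Porto is a city. Lisbon is a city."
    for key, value in analyze(sample).items():
        print(key, value)
```

Sentences that appear many times in the `top_sentences` list would be the first candidates for the translation database described above.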
Resources:
- Dictionaries
- Ergane (free dictionary, several languages)
- Translation rules
WikipediaTranslator - Translation memory approach
A translation memory is a computer program that uses a database of previous translations to help a human translator. If this approach is followed, WikipediaTranslator will need the following features:
- display of the translated and original versions
- splitting of the original version into several parts for individual translation
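A minimal sketch of the translation memory idea, in Python, might look like the following. Everything here is an assumption made for illustration: the class and function names are hypothetical, segments are taken to be paragraphs separated by blank lines, the database is an in-memory dictionary, and similarity is measured with the standard library's `difflib.SequenceMatcher` rather than any method chosen by the project.

```python
import re
from difflib import SequenceMatcher


def split_segments(article):
    """Split an article into parts for individual translation.
    Assumption: parts are paragraphs separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", article) if p.strip()]


class TranslationMemory:
    """Stores previous translations and suggests them for similar text."""

    def __init__(self):
        self._entries = {}  # source segment -> stored translation

    def add(self, source, translation):
        self._entries[source] = translation

    def lookup(self, segment, threshold=0.8):
        """Return (stored_source, translation, score) for the most similar
        stored segment scoring at least `threshold`, or None."""
        best = None
        for source, translation in self._entries.items():
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (source, translation, score)
        return best


if __name__ == "__main__":
    tm = TranslationMemory()
    tm.add("Lisbon is the capital of Portugal.", "Lisboa é a capital de Portugal.")
    article = "Lisbon is the capital city of Portugal.\n\nQuantum physics"
    for part in split_segments(article):
        # Show the original part next to any suggestion from the memory.
        print(part, "->", tm.lookup(part))
```

A suggestion returned by `lookup` is only a starting point: the human translator still reviews and corrects it, and the corrected translation can then be stored with `add`.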
Links
- general
- Links on Machine Translation (MT): http://www.ife.dk/url-mt.htm
- Machine translation (MT), and the future of the translation industry: http://accurapid.com/journal/15mt.htm
- Machine Translation: an Introductory Guide: http://clwww.essex.ac.uk/MTbook/
- Visual Interactive Syntax Learning: http://visl.sdu.dk/visl/
- Wikipedia articles
- Free translations on the web
- Neural nets
- Machine translation
- Translation memories
- Wired magazine
- Portuguese
- Processamento Computacional do Português: http://www.portugues.mct.pt/index.html