Machine translation: Difference between revisions

From Niidae Wiki
traduki version 0.2 released
question about purpose
Line 49:
***Python-based project, uses Esperanto as a metalanguage
***Website hasn't been updated in a while
***http://traduki.sourceforge.net (version 0.2 released, and translates "The dog eats the apple" to esperanto: "La  hundo mangxas la pomon")
***http://traduki.sourceforge.net (version 0.2 released, and translates "The dog eats the apple" to Esperanto: "La  hundo mangxas la pomon")
**http://www.link.cs.cmu.edu/link/ -- Link Grammar
*Databases
**http://www.cogsci.princeton.edu/~wn/links/ -- <nowiki>WordNet</nowiki>, a lexical database for the english language.
**http://www.cogsci.princeton.edu/~wn/links/ -- <nowiki>WordNet</nowiki>, a lexical database for the English language.




Line 93:
*The World Wide Translator (The Tragedy of the Anticommons of translations memories)
**http://www.technologyreview.com/web/leo/leo092101.asp?nt=le921t
----
Are you sure the other language wikipedias would rather translate text than write it themselves?  It seems to me that it's almost more effort to translate text than to write an article yourself.  For instance, I run the [http://eo.wikipedia.com/ Esperanto wikipedia] and I think we appreciate the international nature of our articles.  I was wondering if other second-language wikipedias would feel the same way.  I mean do the other language wikipedias want it?
--[[ChuckSmith]]

Revision as of 19:31, 16 December 2001

The purpose of the Wikipedia Machine Translation Project is to develop ideas, methods and tools that can help translate Wikipedia into non-English languages.

Motivation
Wikipedias in small languages can't produce articles as fast as the English Wikipedia because they have too few Wikipedians. One solution is to translate the English Wikipedia, but some languages will not have enough translators. Machine translation can improve the productivity of the community.

TradWiki/WikiTran
TradWiki/WikiTran (WikipediaTranslator/WikiTranslator/BabelWiki) is a yet-to-be-coded wiki that would help Wikipedians translate articles from English into other languages.


License: All code and data should be released under a free license.

Advantages

  • faster translation of Wikipedia
  • generation of large amounts of useful data (corpora)
  • creation of a useful tool


Lexical, syntactic and semantic analysis of Wikipedia content
The first step in translating Wikipedia is to analyze its content. This analysis will determine:

  • Number of words and sentences
  • Word distribution
  • Frequency of the most popular sentences and expressions
  • Semantic relations between words and between sentences
  • Syntactic analysis of all sentences

Information about the most popular sentences and expressions can be used to create a translation database of such expressions so translators don't need to repeat a translation.
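
The simpler measurements listed above (word and sentence counts, word distribution, most frequent sentences) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names are hypothetical, sentences are split naively on ., ! and ?, and words are lowercased alphabetic runs. Real Wikipedia text would need wiki markup stripped first, and semantic or syntactic analysis is not attempted here.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): a sentence ends at ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Lowercased alphabetic tokens, apostrophes kept; not a linguistic tokenizer.
    return re.findall(r"[a-z']+", sentence.lower())


def analyze(text, top=3):
    # Collect the basic statistics named in the list above.
    sentences = split_sentences(text)
    words = [w for s in sentences for w in tokenize(s)]
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "top_words": Counter(words).most_common(top),
        "top_sentences": Counter(s.lower() for s in sentences).most_common(top),
    }


# Toy input; a real run would read article text instead.
sample = "The dog eats the apple. The dog sleeps. The dog eats the apple."
stats = analyze(sample)
print(stats)
```

The entries in "top_sentences" are the candidates for a database of ready-made translations, since each one would otherwise be translated more than once.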

Resources:


TradWiki/WikiTran - Translation memory approach
A Translation Memory is a computer program that uses a database of old translations to help a human translator. If this approach is followed, WikipediaTranslator will need the following features:

  • visualization of translated and original versions
  • splitting of original versions into several parts for individual translation
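
To make the idea concrete, here is a minimal Python sketch of a translation memory. It is not the design of TradWiki/WikiTran, which has not been coded. The class and function names are hypothetical, the store is an in-memory dictionary rather than a real database, segments are sentences split naively on punctuation, and fuzzy matching uses the standard library's difflib with an arbitrary 0.8 similarity cutoff.

```python
import difflib
import re


def split_segments(text):
    # Split an original text into sentence-sized parts for individual translation.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class TranslationMemory:
    """Hypothetical in-memory store of earlier translations."""

    def __init__(self):
        self._store = {}  # source segment -> stored translation

    def add(self, source, target):
        self._store[source] = target

    def lookup(self, source, cutoff=0.8):
        # Return (matched_source, translation, score), or None if nothing is close.
        if source in self._store:
            return source, self._store[source], 1.0
        close = difflib.get_close_matches(
            source, list(self._store), n=1, cutoff=cutoff
        )
        if not close:
            return None
        match = close[0]
        score = difflib.SequenceMatcher(None, match, source).ratio()
        return match, self._store[match], score


def pretranslate(tm, text):
    # Pair every segment with the memory's suggestion (None means "translate by hand").
    return [(seg, tm.lookup(seg)) for seg in split_segments(text)]


tm = TranslationMemory()
tm.add("The dog eats the apple.", "La hundo mangxas la pomon.")

exact = tm.lookup("The dog eats the apple.")
fuzzy = tm.lookup("The dog eats an apple.")
suggestions = pretranslate(
    tm, "The dog eats the apple. Machine translation is hard."
)
print(exact, fuzzy, suggestions, sep="\n")
```

A fuzzy hit is only a suggestion: the human translator still has to adapt the stored translation to the new sentence, which is why the score is returned alongside it.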


Links


Are you sure the other language wikipedias would rather translate text than write it themselves? It seems to me that it's almost more effort to translate text than to write an article yourself. For instance, I run the Esperanto wikipedia and I think we appreciate the international nature of our articles. I was wondering if other second-language wikipedias would feel the same way. I mean do the other language wikipedias want it?

--ChuckSmith