by Samuel Chong
Pasadena City College
(Last update: February 6, 2015)
What is WordNet?
WordNet is a lexical database that groups words into sets of synonyms called synsets, providing short definitions and usage examples, and records a number of relations among these synonym sets or their members. WordNet can thus be seen as a combination of dictionary and thesaurus. While it is accessible to human users via a web browser, its primary use is in automatic text analysis and artificial intelligence applications. Both the lexicographic data (lexicographer files) and the compiler (called grind) for producing the distributed database are available.
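To make the "dictionary plus thesaurus" idea concrete, here is a minimal Python sketch of how synsets and one relation among them might be represented. The Synset class, its field names, the identifiers, and the two sample entries are illustrative assumptions written for this example; they are not WordNet's actual file format or API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A toy synset: a set of synonymous lemmas sharing one meaning."""
    name: str                                      # hypothetical identifier
    lemmas: list                                   # synonyms grouped in this synset
    definition: str                                # short definition (gloss)
    examples: list = field(default_factory=list)   # usage examples
    hypernyms: list = field(default_factory=list)  # names of more general synsets


# Two hand-written sample entries (illustrative, not copied from WordNet).
SYNSETS = {
    "motor_vehicle.n.01": Synset(
        name="motor_vehicle.n.01",
        lemmas=["motor vehicle", "automotive vehicle"],
        definition="a self-propelled wheeled vehicle",
    ),
    "car.n.01": Synset(
        name="car.n.01",
        lemmas=["car", "auto", "automobile"],
        definition="a motor vehicle with four wheels",
        examples=["he needs a car to get to work"],
        hypernyms=["motor_vehicle.n.01"],
    ),
}


def synonyms(word):
    """Thesaurus-style lookup: all lemmas that share a synset with `word`."""
    found = set()
    for synset in SYNSETS.values():
        if word in synset.lemmas:
            found.update(lemma for lemma in synset.lemmas if lemma != word)
    return sorted(found)


def definitions(word):
    """Dictionary-style lookup: the gloss of every synset containing `word`."""
    return [s.definition for s in SYNSETS.values() if word in s.lemmas]


print(synonyms("car"))      # ['auto', 'automobile']
print(definitions("auto"))  # ['a motor vehicle with four wheels']
```

The hypernyms field stands in for the relations among synsets mentioned above: following it from the "car" entry leads to the more general "motor vehicle" entry.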
Multilingual WordNet by Language and Their Licenses
Below is a table of multilingual WordNet by language and their licenses, as well as other pertinent information. Please contact me if you have any additions.

Language | Multilingual WordNet Name | License
Afrikaans | Afrikaans WordNet | Afrikaans WordNet License: For educational or non-commercial research only
Albanian | AlbaNet | GPL 3.0
Arabic | Arabic WordNet | CC by SA 3.0
Armenian | N/A | N/A
Azerbaijani | N/A | N/A
Basque | EusWordNet (Multilingual Central Repository) | CC BY-NC-SA 3.0
Belarusian | N/A | N/A
Bengali | Bengali WordNet | Browse Online, CC By 3.0
Bosnian | N/A | N/A
Bulgarian | BulNet | Browse Online, CC By 3.0
Catalan | Catalan WordNet-LMF | CC by SA 3.0
Cebuano | N/A | N/A
Chichewa | N/A | N/A
Chinese | Chinese WordNet (Taiwan) | WordNet
Chinese | Chinese Open WordNet | WordNet
Croatian | Croatian WordNet | CC - BY - NC - SA 3.0
Czech | Czech WordNet | CC BY-NC-SA 3.0 and WordNet
Danish | DanNet | WordNet
Dutch | Dutch WordNet | Dutch WordNet License: Free for research only.
Dutch | Open Source Dutch WordNet | CC SA BY 4.0
English | Princeton WordNet of English | WordNet
Esperanto | N/A | N/A
Estonian | Estonian Wordnet (EstWN) | CC BY-NC-SA 3.0
Filipino | N/A (currently in development) | N/A
Finnish | FinnWordNet | CC BY 3.0 and WordNet
French | WOLF | Cecill-C license (LGPL compatible)
French | WoNeF | CC-By SA 3.0
Galician | Galician WordNet-LMF (Multilingual Central Repository) | CC BY 3.0
Georgian | N/A | N/A
German | GermaNet | Free for academic research
Greek | Greek Wordnet | CC by SA 3.0
Gujarati | Gujarati WordNet | Browse Online
Haitian Creole | N/A | N/A
Hausa | N/A | N/A
Hebrew | Hebrew WordNet | WordNet
Hindi | Hindi WordNet | GNU FDL
Hmong | N/A | N/A
Hungarian | Hungarian WordNet | OPEN FOR ACADEMIC USE MetaShare Commons NonCommercial-NoRedistribution (MSCommons_NoCOM-NC-NR)
Icelandic | IceWordNet | CC BY 3.0
Igbo | N/A | N/A
Indonesian | Wordnet Bahasa | MIT License
Irish | Irish Language Semantic Network | GNU Free Documentation License
Italian | ItalWordNet | ELRA END USER
Italian | MultiWordNet | CC BY 3.0
Japanese | Japanese WordNet | WordNet
Javanese | N/A | N/A
Kannada | Indo Wordnet | Browse Online
Kazakh | N/A | N/A
Khmer | N/A | N/A
Korean | Korlex (Korean WordNet) | Browse Online (currently not working)
Lao | N/A | N/A
Latin | MultiWordNet Latin | CC By 3.0
Latvian | N/A | N/A
Lithuanian | Lithuanian WordNet | CC BY SA 3.0
Macedonian | Non Public | N/A
Malagasy | N/A | N/A
Malay | Wordnet Bahasa | MIT License
Malayalam | N/A | N/A
Maltese | N/A | N/A
Maori | N/A | N/A
Marathi | Marathi WordNet | Browse Online
Mongolian | N/A | N/A
Myanmar (Burmese) | Under development | N/A
Nepali | N/A | N/A
Norwegian | Norwegian Wordnet - Bokmål | CC - ZERO, WordNet
Persian | Persian Wordnet | Free to use
Polish | plWordNet | WordNet
Polish | PolNet | Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License
Portuguese | Onto.PT Portuguese WordNet | CC BY 3.0
Portuguese | Open Wordnet Portuguese | CC BY 4.0
Portuguese | WordNet.PT | Browse Online
Punjabi | N/A | N/A
Romanian | Romanian WordNet 3.0 | MS Commons - BY - NC - ND
Russian | RussianWordnet | WordNet
Serbian | Serbian Wordnet | CC BY NC
Sesotho | N/A | N/A
Sinhala | Sinhala Wordnet | Wordnet
Slovak | Slovak WordNet | CC BY SA
Slovenian | sloWTool | Online Browse
Somali | N/A | N/A
Spanish | MultiWorldNet Spanish | CC by 3.0
Sundanese | N/A | N/A
Swahili | N/A | N/A
Swedish | Swedish WordNet-SALDO | CC BY 3.0
Tajik | N/A | N/A
Tamil | Tamil WordNet | N/A
Telugu | Indo Wordnet | Browse Online
Thai | Thai WordNet | WordNet
Turkish | Turkish WordNet | WordNet
Ukrainian | N/A | N/A
Urdu | Urdu WordNet | Browse Online
Uzbek | N/A | N/A
Vietnamese | VietSentiWordNet | WordNet
Welsh | Gweiadur | Gwerin
Yiddish | N/A | N/A
Yoruba | N/A | N/A
Zulu | N/A | N/A
Wordnets and Translation Applications
For translation, wordnets can be used both directly, as an auxiliary tool, and as a lexical basis for machine translation. Distinguishing between the possible senses of a word is an important subtask of most NLP applications, machine translation among them.
In fact, machine translation is one of the most direct applications of word-sense disambiguation: if we are able to identify the correct semantic meaning of each word in the source language, this will allow us to determine with more accuracy the appropriate words that lexicalize it in the target language.
Currently, Google Translate uses wordnet data along with statistical machine translation methods to improve its translation results.
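The link between word-sense disambiguation and translation described above can be sketched in a few lines of Python. This is a toy illustration only: the sense inventory, glosses, Spanish equivalents, stopword list, and function names are hand-written assumptions, and the disambiguation step is a simplified Lesk-style gloss overlap chosen for brevity. It is not how any particular wordnet or translation system works.

```python
# Toy sense inventory for the English word "bank" (hand-written, hypothetical).
# Each sense carries a gloss and the Spanish word that lexicalizes it.
SENSES = {
    "bank": [
        {"gloss": "a financial institution that accepts deposits and lends money",
         "es": "banco"},
        {"gloss": "sloping land beside a river or lake",
         "es": "orilla"},
    ],
}

# A small, arbitrary stopword list so function words do not count as overlap.
STOPWORDS = {"a", "an", "and", "that", "the", "of", "or", "to", "on", "in",
             "she", "he", "we"}


def tokens(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def disambiguate(word, sentence):
    """Pick the sense whose gloss shares the most words with the sentence."""
    context = tokens(sentence) - {word}
    return max(SENSES[word], key=lambda s: len(context & tokens(s["gloss"])))


def translate_word(word, sentence):
    """Translate `word` by first choosing its sense in this sentence."""
    return disambiguate(word, sentence)["es"]


print(translate_word("bank", "She took her money to the bank."))    # banco
print(translate_word("bank", "We sat on the bank of the river."))   # orilla
```

The same English word yields two different Spanish words because the sense is chosen first and the target-language word is read off the chosen sense. When no gloss overlaps with the sentence, this sketch simply falls back to the first listed sense.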
Copyright 2014-2017. Samuel Chong, Pasadena City College