Q

Match a language code to the countries where that language is official or commonly spoken

2010-04-21 92 views 4 likes

4

Is there a Python library that can return the list of countries for a given language code, where that language is either an official or a commonly spoken language?

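For example, the language code "fr" is associated with 29 countries where French is an official language and 8 more where it is commonly spoken.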

2010-04-21 jack

A

Answers

0

pycountry (seriously). You can get it from the Package Index.

2010-04-21 06:02:13 doug

+1

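I just looked at its documentation, and it doesn't seem that you can pass in a language code and get back a list of all the countries that use that language – 2010-04-21 06:08:25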

+0

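It may be worth checking again. I say that because I've used this package for a similar purpose (currencies), *but* I couldn't use its interface. Instead, I had to work directly with the five XML databases that ship with the package. – doug 2010-04-21 06:25:10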

+1

@a_m0d: You may need to write some code yourself. – 2010-06-05 23:39:11

3

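Look for the Babel package. It provides a pickle file for each supported locale. See the list() function in the localedata module for a list of all locales. Then write some code to split each locale into (language, country), and so on.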
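The splitting step described above can be sketched as follows. This is only an illustration: the helper names `split_locale` and `territories_by_language` are made up for this example, and the sample identifiers are hand-written. With Babel installed, the identifiers would presumably come from the localedata module (`babel.localedata.locale_identifiers()` in recent releases, as far as I know). Note that a locale existing for a territory does not tell you whether the language is official there.

```python
import re
from collections import defaultdict

# Territory subtags are two uppercase letters (ISO 3166-1) or three digits (UN M.49).
_TERRITORY = re.compile(r'^(?:[A-Z]{2}|\d{3})$')


def split_locale(identifier):
    """Return (language, territory or None) for an identifier such as 'fr_CA'."""
    parts = identifier.split('_')
    # Skip script subtags such as 'Hant'; take the first territory-shaped part.
    territory = next((p for p in parts[1:] if _TERRITORY.match(p)), None)
    return parts[0], territory


def territories_by_language(identifiers):
    """Group territory codes by language code, skipping bare-language locales."""
    grouped = defaultdict(set)
    for identifier in identifiers:
        language, territory = split_locale(identifier)
        if territory is not None:
            grouped[language].add(territory)
    return {lang: sorted(terrs) for lang, terrs in grouped.items()}


# Hand-written sample identifiers (hypothetical input, not read from Babel)
sample = ['fr', 'fr_FR', 'fr_CA', 'zh_Hant_TW', 'es_419', 'en_US_POSIX']
print(territories_by_language(sample))
```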

2010-06-05 23:38:01

+0

Use 'babel.languages.get_territory_language_info()' – Rmatt 2017-01-03 17:41:21

+0

@Rmatt Really easy. It's amazing how much easier to use a package can become in six years :-) – 2017-01-03 21:47:18

+0

Of course, that's why I upvoted your answer too! You laid out a decent path; this just makes it more precise for newcomers ;) – Rmatt 2017-01-06 14:31:29
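For readers following the comments above, here is a rough sketch of how a per-territory lookup can be turned around to answer the original question (language code in, territories out). The function `territories_for_language` and the `SAMPLE` data are hypothetical and exist only for illustration; the figures in `SAMPLE` are made up. The sketch assumes the lookup returns a dict keyed by language code whose values carry an 'official_status' entry, which is how Babel's documentation describes `get_territory_language_info()`.

```python
def territories_for_language(language, territories, lookup):
    """Split the territories that list `language` into (official, other).

    `lookup(territory)` must return a dict keyed by language code whose values
    carry an 'official_status' entry (None when the language is not official).
    """
    official, other = [], []
    for territory in sorted(territories):
        info = lookup(territory).get(language)
        if info is None:
            continue  # language not listed for this territory
        (official if info.get('official_status') else other).append(territory)
    return official, other


# Hand-written stand-in data so the sketch runs without Babel installed.
# The percentages are invented for illustration only.
SAMPLE = {
    'FR': {'fr': {'population_percent': 100, 'official_status': 'official'}},
    'CA': {'en': {'population_percent': 80, 'official_status': 'official'},
           'fr': {'population_percent': 30, 'official_status': 'official'}},
    'DZ': {'ar': {'population_percent': 70, 'official_status': 'official'},
           'fr': {'population_percent': 20, 'official_status': None}},
}

official, other = territories_for_language(
    'fr', SAMPLE, lambda territory: SAMPLE.get(territory, {}))
print(official, other)

# With Babel installed (assumption: a release that ships babel.languages):
# from babel.core import get_global
# from babel.languages import get_territory_language_info
# territories = get_global('territory_languages')
# territories_for_language('fr', territories, get_territory_language_info)
```

Passing the lookup in as a parameter keeps the sketch independent of any particular Babel version.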

0

Check out Ethnologue

Be careful, though...

India has a lot of official languages.

2010-07-23 19:27:17 NinjaCat

12

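Despite the accepted answer, as far as I can tell none of the XML files under pycountry contains a way to map languages to countries. It contains a list of languages and their ISO codes, a list of countries and their ISO codes, and other useful things, but not that.

Likewise, the Babel package is great, but after digging for a while I couldn't find any way to list all the languages for a particular country. The best you can do is the "most likely" language: https://stackoverflow.com/a/22199367/202168

So I had to put it together myself...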

def get_territory_languages():
    # Python 3 imports; lxml is a third-party package
    from urllib.request import urlopen
    from lxml import etree

    langxml = urlopen('http://unicode.org/repos/cldr/trunk/common/supplemental/supplementalData.xml')
    langtree = etree.XML(langxml.read())

    # territory code -> {language code -> {'percent': ..., 'official': ...}}
    territory_languages = {}
    for t in langtree.find('territoryInfo').findall('territory'):
        langs = {}
        for l in t.findall('languagePopulation'):
            langs[l.get('type')] = {
                'percent': float(l.get('populationPercent')),
                # True whenever an officialStatus attribute is present
                'official': bool(l.get('officialStatus')),
            }
        territory_languages[t.get('type')] = langs
    return territory_languages

You'll probably want to save the result to a file rather than fetching it over the web every time you need it.
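One way to do that is a small JSON cache, sketched below. The helper name `load_cached` and the file name in the usage comment are hypothetical; the only assumption is that the fetched data is JSON-serialisable, which holds for a dict of strings, floats, and booleans like the one built above.

```python
import json
import os


def load_cached(path, fetch):
    """Return the data stored at `path`, calling `fetch()` only on a cache miss."""
    if os.path.exists(path):
        with open(path, encoding='utf-8') as fh:
            return json.load(fh)
    data = fetch()
    # Write the freshly fetched data so later calls skip the download.
    with open(path, 'w', encoding='utf-8') as fh:
        json.dump(data, fh)
    return data


# Usage (hypothetical file name):
# TERRITORY_LANGUAGES = load_cached('territory_languages.json',
#                                   get_territory_languages)
```

Delete the cache file whenever you want to pick up a newer version of the data.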

This dataset also contains "unofficial" languages, which you may not want to include. Here's some example code:

TERRITORY_LANGUAGES = get_territory_languages()

def get_official_locale_ids(country_code):
    country_code = country_code.upper()
    # most widely-spoken first (sorted() works on dict views in Python 3):
    langs = sorted(
        TERRITORY_LANGUAGES[country_code].items(),
        key=lambda l: l[1]['percent'],
        reverse=True,
    )
    return [
        '{lang}_{terr}'.format(lang=lang, terr=country_code)
        for lang, spec in langs if spec['official']
    ]

>>> get_official_locale_ids('es')
['es_ES', 'ca_ES', 'gl_ES', 'eu_ES', 'ast_ES']

2014-03-05 15:59:18 Anentropic
