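Calculate mean absolute percentage error between 2 dictionary values in Python

I have a dictionary of locations, each mapping to property-value pairs, like so: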

{"Russia": 
    {"/location/statistical_region/size_of_armed_forces": 65700.0, 
    "/location/statistical_region/gni_per_capita_in_ppp_dollars": 42530.0, 
    "/location/statistical_region/gdp_nominal": 1736050505050.0, 
    "/location/statistical_region/foreign_direct_investment_net_inflows": 8683048195.0, 
    "/location/statistical_region/life_expectancy": 80.929, ...

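And so on, for every country.

I then have a dictionary containing a single array, where each element of the array is a dictionary with 3 keys: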

{ 
    "sentences": [ 
     { 
      "location-value-pair": { 
       "Russia": 6.1 
      }, 
      "parsedSentence": "On Tuesday , the Federal State Statistics Service -LRB- Rosstat -RRB- reported that consumer price inflation in LOCATION_SLOT hit a historic post-Soviet period low of NUMBER_SLOT percent in 2011 , citing final data .", 
      "sentence": "On Tuesday , the Federal State Statistics Service -LRB- Rosstat -RRB- reported that consumer price inflation in Russia hit a historic post-Soviet period low of 6.1 percent in 2011 , citing final data ." 
     }, 
     { 
      "location-value-pair": { 
       "Russia": 8.8 
      }, 
      "parsedSentence": "In 2010 , annual inflation in LOCATION_SLOT hit NUMBER_SLOT percent due to the summer drought , exceeding forecasts and equalling the figure for 2009 , the year of the global financial meltdown .", 
      "sentence": "In 2010 , annual inflation in Russia hit 8.8 percent due to the summer drought , exceeding forecasts and equalling the figure for 2009 , the year of the global financial meltdown ." 
     }, ...

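What I want to do, for each sentence, is compare its location and value against the location-value pairs in the first dictionary, find the closest matching value, return the corresponding top statistical property, and add it to the sentence dictionary as a new key.

For example:

For sentence 1, I see that we are looking at Russia and a value of 6.1. I want to index into the first dictionary, find "Russia", and look at all the values present there, e.g. 65700.0, 42530.0, 1736050505050.0, 8683048195.0. Then I want to find the mean absolute error against each property and end up with something like: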

{ 
       "location-value-pair": { 
        "Russia": 6.1 
       }, 
       "predictedRegion": "/location/statistical_region/gni_in_ppp_dollars", 
       "meanabserror": "2%", 
       "parsedSentence": "On Tuesday , the Federal State Statistics Service -LRB- Rosstat -RRB- reported that consumer price inflation in LOCATION_SLOT hit a historic post-Soviet period low of NUMBER_SLOT percent in 2011 , citing final data .", 
       "sentence": "On Tuesday , the Federal State Statistics Service -LRB- Rosstat -RRB- reported that consumer price inflation in Russia hit a historic post-Soviet period low of 6.1 percent in 2011 , citing final data ." 
      },

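For instance, if the error were 23% for the size_of_armed_forces value, 10% for the gni_per_capita property, and so on, I would pick the smallest one (10% in that hypothetical) and add it as a key to the second dictionary. My confusion is simply how to write this: how do I access the key-values of one dictionary as the condition for another? My current thinking is: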

def predictRegion(sentenceArray,trueDict):

    absPercentageErrors = {}

    for location, property2value in trueDict.items():
        print location
        absPercentageErrors['location'] = {}
        for property,trueValue in property2value.iteritems():
            print property
            absError = abs(sentenceArray['sentences']['location-value-pair'].key() - trueValue)
            absPercentageErrors['location']['property'] = absError/numpy.abs(trueValue)

    for index, dataTriples in enumerate(sentenceArray["sentences"]):
        for location, trueValue in dataTriples['location-value-pair'].items():
            print location

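But obviously I can't access sentenceArray['sentences']['location-value-pair'].key() in this line: absError = abs(sentenceArray['sentences']['location-value-pair'].key() - trueValue), because it sits outside the loop over the sentences.

How can I get at a key like this from a loop that refers to a completely different variable?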
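To make the failure concrete, here is a minimal sketch of the lookup described above, written for Python 3 (the code in the question is Python 2). The `sentence_array` literal is a cut-down stand-in for the real data, not the full structure. It shows that `"sentences"` holds a list, so it cannot be indexed with a string, and that each location-value pair only becomes reachable while iterating over that list.

```python
# Cut-down stand-in for the question's second dictionary (assumed shape).
sentence_array = {
    "sentences": [
        {"location-value-pair": {"Russia": 6.1}},
        {"location-value-pair": {"Russia": 8.8}},
    ]
}

# "sentences" is a list, so indexing it with a string raises TypeError.
failed = False
try:
    sentence_array["sentences"]["location-value-pair"]
except TypeError as exc:
    failed = True
    print("lookup failed:", exc)

# Each pair is only available per element, inside a loop over the list.
pairs = []
for entry in sentence_array["sentences"]:
    for location, value in entry["location-value-pair"].items():
        pairs.append((location, value))

print(pairs)
```

This suggests putting the sentence loop on the outside and using each sentence's location to look up the matching country in the first dictionary.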

Source: 2016-06-13, Dhruv Ghulati

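You're missing the *Minimal* part of a [Minimal, Complete, and Verifiable](http://stackoverflow.com/help/mcve) example. You posted such a large dictionary, and all of the values for "Russia" are cut off, so it is impossible to fully understand what you're trying to do.

The first dictionary I posted is an example (only one country), but I have changed it to Russia instead of Canada to make it clearer.

Please modify it further so that you show **the actual numbers you use in your example.** You use `[23, 421, 24, 412]`, but I don't see those anywhere.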

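In future, please read up on how to put together a good question: https://stackoverflow.com/help/mcve

Minimal, Complete, and Verifiable.

I think this is what you're looking for.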

countries = {'Canada': {'a': 10, 'b': 150, 'c': 1000},
             'Russia': {'d': 9, 'e': 5, 'f': 1e5}}
sentences = [
    {"location-value-pair": {"Russia": 6.1},
     "parsedSentence": "bob loblaw",
     "sentence": "lobs law bomb"
     },
    {"location-value-pair": {"Russia": 8.8},
     "parsedSentence": "some sentence",
     "sentence": "lorem ipsum test"
     }]


def absError(numer, denom):
    # Absolute error of numer relative to denom, as a fraction (not a percentage).
    # abs() on the denominator keeps the result non-negative for negative values.
    return abs(numer - denom) / abs(float(denom))

def findMatch(target, country):
    # Key of the property whose value is closest to target in relative terms.
    return min(country, key=lambda x: absError(target, country.get(x)))

def update(sentence):
    # Each "location-value-pair" holds exactly one (country, value) item.
    (c, target), = sentence.get("location-value-pair").items()
    country = countries[c]
    matched = findMatch(target, country)
    error = absError(target, country.get(matched))
    res = sentence.copy()  # leave the input dictionary untouched
    res.update({'predictedRegion': matched, 'meanabserror': "{:.2f}%".format(100*error)})
    return res

updated = [update(sentence) for sentence in sentences]
updated

Output:

[{'location-value-pair': {'Russia': 6.1}, 
    'meanabserror': '22.00%', 
    'parsedSentence': 'bob loblaw', 
    'predictedRegion': 'e', 
    'sentence': 'lobs law bomb'}, 
{'location-value-pair': {'Russia': 8.8}, 
    'meanabserror': '2.22%', 
    'parsedSentence': 'some sentence', 
    'predictedRegion': 'd', 
    'sentence': 'lorem ipsum test'}]
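The question's real data wraps the sentences in a dictionary under a `"sentences"` key and may mention countries or values the toy example does not cover. The following is a sketch, not a definitive implementation, of how the same idea might be adapted to that shape in Python 3. The names `abs_pct_error` and `predict_region` are hypothetical. Skipping zero-valued properties and leaving unknown countries unannotated are assumptions, not something the question specifies.

```python
def abs_pct_error(predicted, true_value):
    """Absolute error of `predicted` relative to `true_value`, as a fraction."""
    return abs(predicted - true_value) / abs(true_value)


def predict_region(sentence_array, true_dict):
    """Return copies of the sentence dicts, annotated with the closest property."""
    results = []
    for entry in sentence_array["sentences"]:
        annotated = dict(entry)  # shallow copy, the input stays untouched
        for location, value in entry["location-value-pair"].items():
            # Assumption: a zero true value is skipped, since the
            # percentage error is undefined there.
            errors = {
                prop: abs_pct_error(value, true_value)
                for prop, true_value in true_dict.get(location, {}).items()
                if true_value != 0
            }
            if not errors:
                continue  # unknown country, or nothing comparable
            best = min(errors, key=errors.get)
            annotated["predictedRegion"] = best
            annotated["meanabserror"] = "{:.2f}%".format(100 * errors[best])
        results.append(annotated)
    return results


# Same toy numbers as the answer above.
true_dict = {"Russia": {"d": 9, "e": 5, "f": 1e5}}
sentence_array = {
    "sentences": [
        {"location-value-pair": {"Russia": 6.1}},
        {"location-value-pair": {"Russia": 8.8}},
    ]
}

for row in predict_region(sentence_array, true_dict):
    print(row["predictedRegion"], row["meanabserror"])
```

With the toy numbers this picks `e` at 22.00% and `d` at 2.22%, matching the output above. Building the `errors` dictionary first means the winning error can be read back without being recomputed.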

Source: 2016-06-13 20:13:01

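You, sir, are a real hero. I have now also learned to avoid putting '...' in my input sentences like the plague. The use of `lambda` and so on is new to me, thanks for showing me this!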
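For readers who are also new to `lambda`: `min` accepts a `key` function and returns the element for which that function yields the smallest result. A `lambda` is an inline way to write that function. This small Python 3 sketch uses the toy Russia values from the answer to show a named function and an equivalent `lambda` side by side. The name `relative_error` is made up for illustration.

```python
russia = {"d": 9, "e": 5, "f": 1e5}
target = 6.1


def relative_error(key):
    # Relative distance between target and the value stored under key.
    return abs(target - russia[key]) / abs(russia[key])


# Iterating a dict yields its keys, so min() returns the best key.
closest_named = min(russia, key=relative_error)

# The same selection with the key function written inline.
closest_lambda = min(russia, key=lambda k: abs(target - russia[k]) / abs(russia[k]))

print(closest_named, closest_lambda)
```

Both calls select `e`, because 5 is the value closest to 6.1 in relative terms.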
