2016-11-17

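I'm struggling to find the Faraday incantation that scans a whole DDB table. The call below produces output, but it returns far fewer results than the 18M records I know are in the table. How do I get far/scan to walk the entire DynamoDB table?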

(far/scan 
    common/client-opts 
    v2-index/layer-table-name 
    {:return #{:layer-key :range-key}}) 
=> 
[{:range-key "soil&2015-07-22T15:13:09.101Z&ssurgo&v1", :layer-key "886985&886985"} 
{:range-key "soil&2015-07-29T19:20:09.973Z&ssurgo&v1", :layer-key "886985&886985"} 
    ... 
{:range-key "veg&2014-05-29T16:16:31.000Z&true-color&v1", :layer-key "1674603&1674603"} 
{:range-key "veg&2014-06-14T16:16:39.000Z&abs&v1", :layer-key "1674603&1674603"}] 

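What can I do to get Faraday to process all of the records? The source code suggests there is a :last-prim-kvs option, but it isn't clear what goes there. The primary key on this DDB table is a composite primary key made up of :layer-key and :range-key.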

Answer

If the result will fit in memory, this works...

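The key to the whole scheme is setting up the opts map with a :limit 99 entry along with a :span-reqs {:max 1} entry. The :span-reqs entry is entirely opaque to me, but it appears to be the real driving force behind the notion of "page size". I've set up a 10-element table like...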

;; This only works on the whole table because the table is small!!!! 
(far/scan 
    common/client-opts 
    "users.robert.kuhar.wtf_far" 
    {:return #{:part_key :sort_key :note}}) 
=> 
[{:part_key "456", :sort_key "fha.abs", :note "\"456\",\"fha.abs\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "fha.rank", :note "\"456\",\"fha.rank\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "fha.raw", :note "\"456\",\"fha.raw\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "fha.true-color", :note "\"456\",\"fha.true-color\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "soil.ssurgo", :note "\"456\",\"soil.ssurgo\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "123", :sort_key "fha.abs", :note "\"123\",\"fha.abs\" created 2016-12-08T21:24:30.139Z."} 
{:part_key "123", :sort_key "fha.rank", :note "\"123\",\"fha.rank\" created 2016-12-08T21:24:30.139Z"} 
{:part_key "123", :sort_key "fha.raw", :note "\"123\",\"fha.raw\" created 2016-12-08T21:24:30.139Z."} 
{:part_key "123", :sort_key "fha.true-color", :note "\"123\",\"fha.true-color\" created 2016-12-08T21:24:30.139Z."} 
{:part_key "123", :sort_key "soil.ssurgo", :note "\"123\",\"soil.ssurgo\" created 2016-12-08T21:24:30.139Z."}] 

If I want to move through this table 4 elements at a time, the initial call is...

(far/scan 
    common/client-opts 
    "users.robert.kuhar.wtf_far" 
    {:return #{:part_key :sort_key :note}
     :limit 4
     :span-reqs {:max 1}})
=> 
[{:part_key "456", :sort_key "fha.abs", :note "\"456\",\"fha.abs\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "fha.rank", :note "\"456\",\"fha.rank\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "fha.raw", :note "\"456\",\"fha.raw\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "456", :sort_key "fha.true-color", :note "\"456\",\"fha.true-color\" created 2016-12-08T21:32:20.789Z."}] 

All subsequent calls need :last-prim-kvs {:part_key "xxx" :sort_key "yyy"} set in the opts map to tell Faraday where to pick up. The call for page 2 looks like...

(far/scan 
    common/client-opts 
    "users.robert.kuhar.wtf_far" 
    {:return #{:part_key :sort_key :note}
     :limit 4
     :span-reqs {:max 1}
     :last-prim-kvs {:part_key "456" :sort_key "fha.true-color"}})
=> 
[{:part_key "456", :sort_key "soil.ssurgo", :note "\"456\",\"soil.ssurgo\" created 2016-12-08T21:32:20.789Z."} 
{:part_key "123", :sort_key "fha.abs", :note "\"123\",\"fha.abs\" created 2016-12-08T21:24:30.139Z."} 
{:part_key "123", :sort_key "fha.rank", :note "\"123\",\"fha.rank\" created 2016-12-08T21:24:30.139Z"} 
{:part_key "123", :sort_key "fha.raw", :note "\"123\",\"fha.raw\" created 2016-12-08T21:24:30.139Z."}] 
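
The :last-prim-kvs value in the page-2 call is just the key attributes of the last item on the previous page. A minimal sketch of deriving it, assuming a hypothetical helper named next-page-opts (my own name, not part of Faraday) and a non-empty previous page:

```clojure
;; Hypothetical helper, not a Faraday function.
(defn next-page-opts
  "Returns `opts` with :last-prim-kvs set to the primary-key attributes
  of the last item in `page`. `key-attrs` lists the table's key
  attribute names. Call it only with a non-empty page."
  [opts key-attrs page]
  (assoc opts :last-prim-kvs (select-keys (last page) key-attrs)))

(next-page-opts {:limit 4 :span-reqs {:max 1}}
                [:part_key :sort_key]
                [{:part_key "456" :sort_key "fha.raw" :note "n"}
                 {:part_key "456" :sort_key "fha.true-color" :note "n"}])
;; => {:limit 4, :span-reqs {:max 1},
;;     :last-prim-kvs {:part_key "456", :sort_key "fha.true-color"}}
```

The returned map can be passed as the third argument to far/scan to fetch the following page.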

The last page of my 10-element table is...

(far/scan 
    common/client-opts 
    "users.robert.kuhar.wtf_far" 
    {:return #{:part_key :sort_key :note}
     :limit 4
     :span-reqs {:max 1}
     :last-prim-kvs {:part_key "123" :sort_key "fha.raw"}})
=> 
[{:part_key "123", :sort_key "fha.true-color", :note "\"123\",\"fha.true-color\" created 2016-12-08T21:24:30.139Z."} 
{:part_key "123", :sort_key "soil.ssurgo", :note "\"123\",\"soil.ssurgo\" created 2016-12-08T21:24:30.139Z."}] 

Only 2 elements come back even though I asked for 4. Attempts to far/scan beyond this point always come back empty.

(far/scan 
    common/client-opts 
    "users.robert.kuhar.wtf_far" 
    {:return #{:part_key :sort_key :note}
     :limit 4
     :span-reqs {:max 1}
     :last-prim-kvs {:part_key "123" :sort_key "soil.ssurgo"}})
=> [] 

So, this is it end to end, as long as everything fits in memory.

(loop [accum []
       page  (far/scan
               client-opts
               "users.robert.kuhar.wtf_far"
               {:limit 4
                :span-reqs {:max 1}})]
  (if (empty? page)
    accum
    (let [last-on-page  (last page)
          last-part-key (:part_key last-on-page)
          last-sort-key (:sort_key last-on-page)]
      (recur
        (into accum page)
        (far/scan
          client-opts
          "users.robert.kuhar.wtf_far"
          {:limit 4
           :span-reqs {:max 1}
           :last-prim-kvs {:part_key last-part-key
                           :sort_key last-sort-key}})))))
=> 
[{:part_key "456", :sort_key "fha.abs", :note "\"456\",\"fha.abs\" created 2016-12-08T21:32:20.789Z."} 
... 
{:part_key "123", :sort_key "soil.ssurgo", :note "\"123\",\"soil.ssurgo\" created 2016-12-08T21:24:30.139Z."}] 

I think the answer to "How do I get Faraday/scan to walk the entire DynamoDB table?" is that it can't. You need to build the paging by hand.
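
If the table will not fit in memory (as with the 18M-record table in the question), the same paging idea can probably be expressed as a lazy sequence, so that pages are fetched only as they are consumed. The sketch below is my own assumption of how that might look: paged-seq, fake-table and fake-fetch are hypothetical names, and the in-memory fake-fetch stands in for far/scan so the example runs without AWS. Like the loop above, it treats an empty page as the end of the table.

```clojure
;; Hypothetical helper, not part of Faraday.
(defn paged-seq
  "Lazily concatenates pages. `fetch-page` takes the :last-prim-kvs map
  (nil for the first page) and returns a vector of items. `key-attrs`
  lists the table's key attribute names. Stops at the first empty page."
  [fetch-page key-attrs]
  (letfn [(step [last-kvs]
            (lazy-seq
              (let [page (fetch-page last-kvs)]
                (when (seq page)
                  (concat page
                          (step (select-keys (last page) key-attrs)))))))]
    (step nil)))

;; In-memory stand-in for a 10-item table, paged 4 items at a time.
(def fake-table
  (mapv (fn [i] {:part_key "p" :sort_key (str i)}) (range 10)))

(defn fake-fetch [last-kvs]
  (let [key-of    #(select-keys % [:part_key :sort_key])
        remaining (if last-kvs
                    (rest (drop-while #(not= (key-of %) last-kvs) fake-table))
                    fake-table)]
    (vec (take 4 remaining))))

(count (paged-seq fake-fetch [:part_key :sort_key]))
;; => 10

;; Against a real table, fetch-page would wrap far/scan with the same
;; options used in the loop above (untested sketch):
;; (paged-seq
;;   (fn [last-kvs]
;;     (far/scan client-opts "users.robert.kuhar.wtf_far"
;;               (cond-> {:limit 4 :span-reqs {:max 1}}
;;                 last-kvs (assoc :last-prim-kvs last-kvs))))
;;   [:part_key :sort_key])
```

Consuming the sequence with doseq or reduce processes one page at a time, provided nothing holds on to the head of the sequence.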