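Recursion between different methods of the same multimethod
I'm writing a Clojure library to parse Mac OS X's XML-based property list files. The code works fine unless you give it a large input file, at which point you get java.lang.OutOfMemoryError: Java heap space.
Here's an example input file (small enough to work fine):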
<plist version="1.0">
  <dict>
    <key>Integer example</key>
    <integer>5</integer>
    <key>Array example</key>
    <array>
      <integer>2</integer>
      <real>3.14159</real>
    </array>
    <key>Dictionary example</key>
    <dict>
      <key>Number</key>
      <integer>8675309</integer>
    </dict>
  </dict>
</plist>
clojure.xml/parse turns this into:
{:tag :plist, :attrs {:version "1.0"}, :content [
  {:tag :dict, :attrs nil, :content [
    {:tag :key, :attrs nil, :content ["Integer example"]}
    {:tag :integer, :attrs nil, :content ["5"]}
    {:tag :key, :attrs nil, :content ["Array example"]}
    {:tag :array, :attrs nil, :content [
      {:tag :integer, :attrs nil, :content ["2"]}
      {:tag :real, :attrs nil, :content ["3.14159"]}
    ]}
    {:tag :key, :attrs nil, :content ["Dictionary example"]}
    {:tag :dict, :attrs nil, :content [
      {:tag :key, :attrs nil, :content ["Number"]}
      {:tag :integer, :attrs nil, :content ["8675309"]}
    ]}
  ]}
]}
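A parsed tree of this shape can also be produced at a REPL from a string instead of a file. The following is only a minimal sketch: `sample` and `parsed` are hypothetical names, and the plist is trimmed down to a single key/value pair for illustration.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; `sample` is a hypothetical, trimmed-down plist used only for illustration.
(def sample
  (str "<plist version=\"1.0\"><dict>"
       "<key>Integer example</key><integer>5</integer>"
       "</dict></plist>"))

;; clojure.xml/parse accepts an InputStream, so wrap the string's bytes in one.
(def parsed
  (xml/parse (ByteArrayInputStream. (.getBytes sample "UTF-8"))))

(:tag parsed)                        ; tag of the root element
(-> parsed :content first :content)  ; child elements of the <dict>
```

Each element comes back as a map with :tag, :attrs and :content keys, which is the structure the rest of the code relies on.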
My code turns the parsed tree into this Clojure data structure:
{"Dictionary example" {"Number" 8675309},
"Array example" [2 3.14159],
"Integer example" 5}
The relevant part of my code looks like this:
;; extract the content contained within e.g. <integer>...</integer>
(defn- first-content
  [c]
  (first (c :content)))

;; return a parsed version of the given tag
(defmulti content (fn [c] (c :tag)))

(defmethod content :array
  [c]
  (apply vector (for [item (c :content)] (content item))))

(defmethod content :dict
  [c]
  (apply hash-map (for [item (c :content)] (content item))))

(defmethod content :integer
  [c]
  (Long. (first-content c)))

(defmethod content :key
  [c]
  (first-content c))

(defmethod content :real
  [c]
  (Double. (first-content c)))

;; take a java.io.File (or similar) and return the parsed version
(defn parse-plist
  [source]
  (content (first-content (clojure.xml/parse source))))
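The meat of the code is the content function, a multimethod that dispatches on :tag (the name of the XML tag). I'm wondering whether there is something different I should be doing to make this recursion work better. I tried replacing all three calls to content with trampoline content, but that didn't work. What should I do to get this mutual recursion to work more efficiently? Or am I taking a fundamentally wrong approach?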
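For context on the trampoline attempt, here is a minimal sketch of how clojure.core/trampoline behaves. The `count-down` and `plain` functions are hypothetical and unrelated to the plist code. trampoline keeps calling the result only while that result is a function, so wrapping a call to a function that returns an ordinary value behaves like calling it directly.

```clojure
;; Hypothetical example: returns a thunk instead of recursing directly,
;; so trampoline can run the steps in a loop without growing the stack.
(defn count-down [n]
  (if (zero? n)
    :done
    #(count-down (dec n))))

(def bounced (trampoline count-down 100000))

;; Hypothetical example: returns a plain value, so trampoline has
;; nothing to bounce on and simply returns the result of one call.
(defn plain [n]
  (inc n))

(def direct (trampoline plain 1))
```

Since the content methods return vectors, maps, numbers and strings rather than functions, swapping in trampoline content would presumably leave the recursion unchanged. It may also be worth noting that the reported error is heap exhaustion rather than a stack overflow, which is the problem trampolining is meant to address.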
Edit: By the way, this code is available on GitHub, where it might be easier to play around with.
I hadn't heard of xml-zip, but I'll look into it. Thanks! – bdesham 2011-02-03 19:18:36