HXT: Surprising behavior when reading and writing HTML to String in pure code

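I want to read HTML from a String, process it, and return the result as a String using HXT. Since this operation does not require IO, I would rather run the arrow with runLA than with runX.

The code looks like this (processing omitted for brevity):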

runLA (hread >>> writeDocumentToString [withOutputHTML, withIndent yes]) html

However, the surrounding html tag is missing from the result:

["\n <head>\n <title>Bogus</title>\n </head>\n <body>\n  Some trivial bogus text.\n </body>\n",""]

When I use runX instead, like this:

runX (readString [] html >>> writeDocumentToString [withOutputHTML, withIndent yes])

I get the expected result:

["<html>\n <head>\n <title>Bogus</title>\n </head>\n <body>\n  Some trivial bogus text.\n </body>\n</html>\n"]

Why is that, and how can I fix it?

Source

2011-08-26 jgre

If you look at the XmlTrees for both, you will see that readString adds a top-level "/" element. For the non-IO runLA version:

> putStr . formatTree show . head $ runLA xread html 
---XTag "html" [] 
    | 
    +---XText "\n " 
    | 
    +---XTag "head" [] 
    ...

And with runX:

> putStr . formatTree show . head =<< runX (readString [] html) 
---XTag "/" [NTree (XAttr "transfer-Status") [NTree (XText "200")... 
    | 
    +---XTag "html" [] 
        | 
        +---XText "\n " 
        | 
        +---XTag "head" [] 
        ...

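writeDocumentToString uses getChildren to strip off that root element. When the input has no "/" root, as with the runLA version, it is the html element itself that gets stripped.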
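The effect of that getChildren step can be looked at in isolation. The following is a minimal sketch, assuming a small sample document (sampleHtml) and a helper named childNames, both of which are made up for illustration and do not come from the question. It lists the element names that survive getChildren, once for the bare xread tree and once for a tree wrapped in a "/" root.

```haskell
import Text.XML.HXT.Core

-- Hypothetical sample input, not the asker's original document.
sampleHtml :: String
sampleHtml =
  "<html><head><title>Bogus</title></head>"
    ++ "<body>Some trivial bogus text.</body></html>"

-- Names of the element children of each node the parser arrow yields.
-- getName fails on text nodes, so those are silently dropped.
childNames :: LA String XmlTree -> String -> [String]
childNames parse = runLA (parse >>> getChildren >>> getName)

main :: IO ()
main = do
  -- Bare xread: the top node is <html>, so getChildren descends past it.
  print (childNames xread sampleHtml)
  -- Wrapped in a "/" root: getChildren only removes the wrapper.
  print (childNames (selem "/" [xread]) sampleHtml)
```

In the first case the html element is already gone by the time anything is serialized, which matches the missing tag in the question.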

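A simple way around this is to use something like selem to wrap the output of xread in a similar root element, so that it looks like the kind of input writeDocumentToString expects: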

> runLA (selem "/" [xread] >>> writeDocumentToString [withOutputHTML, withIndent yes]) html 
["<html>\n <head>\n <title>Bogus</title>\n </head>\n <body>\n  Some trivial bogus text.\n </body>\n</html>\n"]

This produces the desired output.

Source
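For use outside GHCi, the same workaround can be packaged as a pure function. This is only a sketch under a few assumptions: the names renderHtmlPure, process and sampleHtml are invented here, process is a placeholder for the processing step the question left out, and the results of runLA are simply concatenated into one String.

```haskell
import Text.XML.HXT.Core

-- Hypothetical sample input, not the asker's original document.
sampleHtml :: String
sampleHtml =
  "<html><head><title>Bogus</title></head>"
    ++ "<body>Some trivial bogus text.</body></html>"

-- Placeholder for the omitted processing step; `this` is the identity arrow.
process :: LA XmlTree XmlTree
process = this

-- Parse, wrap the content in a "/" root, process, and serialize, all without IO.
-- The "/" root is what writeDocumentToString strips, so <html> is preserved.
renderHtmlPure :: String -> String
renderHtmlPure =
  concat . runLA
    ( selem "/" [xread]
        >>> process
        >>> writeDocumentToString [withOutputHTML, withIndent yes]
    )

main :: IO ()
main = putStr (renderHtmlPure sampleHtml)
```

Because process runs after the wrapping step, it sees the "/" root as its input node, much as it would after readString; a transformation written for the runX pipeline should therefore carry over with little change, though that has not been verified here.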

2011-08-26 19:57:59
