The wiki for Google's Diff-Match-Patch project shares some ideas. From http://code.google.com/p/google-diff-match-patch/wiki/Plaintext:
One method is to strip the tags from the HTML using a simple regex or node-walker. Then diff the HTML content against the text content. Don't perform any diff cleanups. This diff enables one to map character positions from one version to the other (see the diff_xIndex function). After this, one can apply all the patches one wants against the plain text, then safely map the changes back to the HTML. The catch with this technique is that although text may be freely edited, HTML tags are immutable.
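As a rough illustration of the position mapping this first method relies on, here is a minimal Python sketch. It is not the library's code: `x_index` is a simplified re-implementation of what `diff_xIndex` does, and `text_to_html_diff` is a hypothetical helper that builds the diff by tokenizing the HTML with a naive regex instead of running a real diff. The op codes (-1, 0, 1) follow the convention diff-match-patch uses for its diff tuples.

```python
import re

DELETE, EQUAL, INSERT = -1, 0, 1  # op codes in the style of diff-match-patch

TAG_RE = re.compile(r"<[^>]+>")  # naive tag matcher, an assumption for this sketch


def text_to_html_diff(html):
    """Build a diff from the stripped text to the HTML.

    Tags become insertions and everything else is an equality, which is
    what an uncleaned diff looks like when tags are the only difference.
    """
    diffs = []
    for piece in re.split(r"(<[^>]+>)", html):
        if piece:
            op = INSERT if TAG_RE.fullmatch(piece) else EQUAL
            diffs.append((op, piece))
    return diffs


def x_index(diffs, loc):
    """Map position `loc` in the source text to its position in the target text."""
    chars1 = chars2 = 0
    for op, text in diffs:
        prev1, prev2 = chars1, chars2
        if op != INSERT:  # equality or deletion consumes source characters
            chars1 += len(text)
        if op != DELETE:  # equality or insertion produces target characters
            chars2 += len(text)
        if chars1 > loc:  # this chunk contains loc
            if op == DELETE:
                return prev2  # the character at loc was deleted
            return prev2 + (loc - prev1)
    return chars2 + (loc - chars1)


html = "<p>Hi <b>you</b></p>"
diffs = text_to_html_diff(html)
position_in_html = x_index(diffs, 3)  # index 3 of "Hi you" is the "y"
```

With a mapping like this, an edit located at some offset in the plain text can be translated to the offset in the HTML where it should be applied, leaving the tags themselves untouched.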
Another method is to walk the HTML and replace every opening and closing tag with a Unicode character. Check the Unicode spec for a range that is not in use. During the process, create a hash table of Unicode characters to the original tags. The result is a block of text which can be patched without fear of inserting text inside a tag or breaking the syntax of a tag. One just has to be careful when reconverting the content back to HTML that no closing tags are lost.
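A minimal sketch of this placeholder idea in Python, under a few assumptions of mine: tags are found with a naive regex (it will misfire on attributes containing `>`), the placeholders come from the Unicode Private Use Area starting at U+E000, and the input does not already contain characters from that range. The names `encode_tags` and `decode_tags` are hypothetical.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # naive tag matcher, an assumption for this sketch
PUA_START = 0xE000  # first code point of the Basic Multilingual Plane Private Use Area
PUA_END = 0xF8FF    # last code point of that area


def encode_tags(html):
    """Replace each distinct tag with one placeholder character.

    Returns the encoded text and a table mapping placeholder -> original tag.
    """
    tag_to_char = {}
    char_to_tag = {}

    def substitute(match):
        tag = match.group(0)
        if tag not in tag_to_char:
            code = PUA_START + len(tag_to_char)
            if code > PUA_END:
                raise ValueError("too many distinct tags for the placeholder range")
            tag_to_char[tag] = chr(code)
            char_to_tag[chr(code)] = tag
        return tag_to_char[tag]

    return TAG_RE.sub(substitute, html), char_to_tag


def decode_tags(text, char_to_tag):
    """Turn placeholder characters back into their original tags."""
    return "".join(char_to_tag.get(ch, ch) for ch in text)


html = "<p>Hello <b>bold</b> world</p>"
encoded, table = encode_tags(html)
edited = encoded.replace("world", "there")  # stands in for applying a patch
restored = decode_tags(edited, table)
```

Because each tag is a single character, a patch can move or delete a tag but cannot split one. The sketch does not check that opening and closing tags still pair up after patching, which is the caveat the wiki raises.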
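I have a hunch that the second idea, mapping HTML tags to Unicode placeholders, might work better than one would expect, especially if your HTML tags come from a reduced set and you can do a little open/close fix-up when displaying interleaved (strikethrough/underline) diff markup.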
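Another method that might work for simple styling is to strip the HTML tags but remember the character indexes they affected, for example "positions 8-15 are bold". Then perform a plaintext diff. Finally, using the diff_xIndex position-mapping idea from the wiki's first method, intelligently re-insert the HTML tags to re-apply the styling to the ranges that survived or were added. (That is, if old positions 8-13 survived but shifted to 20-25, insert the B tags around them there.)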
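Here is a minimal Python sketch of that last idea, limited to `<b>` tags. To stay self-contained it swaps in `difflib.SequenceMatcher` from the standard library for the position mapping, instead of diff-match-patch and diff_xIndex. The function names are hypothetical, and nested or overlapping styles are not handled.

```python
import re
from difflib import SequenceMatcher


def strip_bold(html):
    """Remove <b>/</b> tags; return the plain text and [start, end) bold ranges."""
    plain, ranges, start, pos = [], [], None, 0
    for piece in re.split(r"(</?b>)", html):
        if piece == "<b>":
            start = pos
        elif piece == "</b>":
            if start is not None:
                ranges.append((start, pos))
                start = None
        else:
            plain.append(piece)
            pos += len(piece)
    return "".join(plain), ranges


def map_ranges(old, new, ranges):
    """Move the surviving part of each range to its position in `new`."""
    blocks = SequenceMatcher(None, old, new, autojunk=False).get_matching_blocks()
    mapped = []
    for start, end in ranges:
        for a, b, size in blocks:
            # Intersect the styled range with each block of unchanged text.
            lo, hi = max(start, a), min(end, a + size)
            if lo < hi:
                mapped.append((lo - a + b, hi - a + b))
    return mapped


def reapply_bold(text, ranges):
    """Wrap each range of `text` in <b>...</b>."""
    out, pos = [], 0
    for start, end in sorted(ranges):
        out += [text[pos:start], "<b>", text[start:end], "</b>"]
        pos = end
    out.append(text[pos:])
    return "".join(out)


old_plain, bold = strip_bold("A quick <b>brown</b> fox")  # bold covers 8-13
new_plain = "In the end, " + old_plain                    # a 12-character insertion
new_bold = map_ranges(old_plain, new_plain, bold)
new_html = reapply_bold(new_plain, new_bold)
```

In this example the bold range 8-13 survives the edit and shifts to 20-25, mirroring the case described above. A range that was partly deleted would come back as one or more shorter ranges, each wrapped in its own pair of tags.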
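Gamers2000, thanks for your comment. I did try SynchroEdit, but neither the sandbox nor the development version worked. By the way, I also asked a question under your original "OT library question": are you using google-diff-match-patch as well? How do you use it with rich-format HTML strings? Thanks for any input. – Steve 2010-01-27 02:17:10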
Hi Steve, I am using diff-match-patch, but only to synchronize plain text. Also, I'm actually using MobWrite (http://code.google.com/p/google-mobwrite), which is built on diff-match-patch. Sorry I can't be of much help! – gamers2000 2010-01-27 03:38:06
Thanks for the quick reply. – Steve 2010-01-27 05:01:26