Get plain text from a control in WinForms


The output of htmlEditor1.Html in my WinForms application is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> 
<META content="text/html; charset=unicode" http-equiv=Content-Type>.......

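I am new to this, and I don't know what the format above is.

I need the output in the format below (plain text or HTML) so that I can save it in a database table: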

"some text checking\r\n"

Any suggestions on how to proceed?

Source

2015-11-04 Digambar Malla

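The output is HTML, and it looks like a Microsoft Word document. Which WinForms control are you using?

Dave R, htmlEditor is a third-party control: https://yarte.codeplex.com/. Since I have to display rich text with editing features...

Apparently, your third-party control does not support retrieving anything other than the raw HTML.

If you need to parse this to retrieve the values of specific elements, then I suggest using the HTML Agility Pack. You can add it to your solution with the NuGet Package Manager (right-click your solution in Solution Explorer, select "Manage NuGet Packages...", then search for and add the HtmlAgilityPack package).

Once you have done that, you can process the HTML in code. For example, if you want to retrieve the text of each paragraph, you can do this: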

// Requires the HtmlAgilityPack NuGet package
using System;
using System.Collections.Generic;
using System.Linq;

// Create an HTML Document to parse
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// Load in the third party control's HTML output
doc.LoadHtml(htmlEditor1.Html);
// Retrieve the paragraph (p) nodes of the document
List<HtmlAgilityPack.HtmlNode> paragraphNodes = doc.DocumentNode.DescendantNodes()
    .Where(node => node.Name == "p")
    .ToList();

// Process each of the paragraph nodes in turn
foreach (var node in paragraphNodes)
{
    // Output the paragraph text
    // TODO: save the text in the database...
    Console.WriteLine(node.InnerText);
}

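Note: if the HTML really does represent a Word document, the nodes may well have names different from the above, possibly with a namespace prefix and a colon. To handle these, you would need to change the node.Name == "p" check in the example above to node.Name == "<prefix>:<nodename>", for example node.Name == "w:p".

Source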
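To get from the paragraph loop above to the single string the question asks for ("some text checking\r\n"), something like the following might work. It is only a sketch: HtmlPlainText and FromHtml are made-up names, not part of the control or of the HTML Agility Pack, and it assumes the text you care about sits in plain p elements. It also decodes entities such as &nbsp; with HtmlEntity.DeEntitize, since InnerText leaves them encoded.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlPlainText
{
    // Hypothetical helper: returns the text of every <p> element,
    // one paragraph per line, each line terminated by "\r\n".
    public static string FromHtml(string html)
    {
        var doc = new HtmlDocument();
        // LoadHtml rejects null, so fall back to an empty document
        doc.LoadHtml(html ?? string.Empty);

        var lines = doc.DocumentNode
            .Descendants("p")
            // Decode entities (&nbsp;, &amp;, ...) and trim surrounding whitespace
            .Select(p => HtmlEntity.DeEntitize(p.InnerText).Trim());

        // Assumption: an empty paragraph should become an empty line
        return string.Concat(lines.Select(line => line + "\r\n"));
    }
}
```

You would then call it as string plainText = HtmlPlainText.FromHtml(htmlEditor1.Html); and save plainText to the database. If the control emits Word-style markup, the "p" filter may need adjusting as described in the note above.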

2015-11-04 14:58:43
