2016-08-24 153 views
2

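Download a web page's HTML as a UTF-8 string

I want to download the inner HTML of a page on the web, but when I do, characters such as šđčćž are replaced by ć¡ and the like.

The code I am using: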

Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage") 
TextBox1.Text = sourceString 

Answers

2

You could download the raw bytes and then use the Encoding class to decode them as UTF-8:

Async Function GetHtmlString(address As String) As Task(Of String) 
    Using client As New WebClient 
     Dim bytes = Await client.DownloadDataTaskAsync(address) 
     Dim s = Encoding.UTF8.GetString(bytes) 
     return s 
    End Using 
End Function 
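
To illustrate why the explicit UTF-8 decode matters, here is a minimal console sketch. The module name `MojibakeSketch` is made up for illustration, and ISO-8859-1 is only an assumed stand-in for whichever wrong encoding was applied to the page. It encodes the question's sample characters as UTF-8 and then decodes the same bytes two ways:

```vbnet
Imports System
Imports System.Text

Module MojibakeSketch
    Sub Main()
        ' Sample characters from the question (save this source file as UTF-8).
        Dim original As String = "šđčćž"

        ' What a UTF-8 page would send over the wire: two bytes per character here.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Assumption: a single-byte encoding is wrongly used to decode the bytes.
        Dim wrong As String = Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes)

        ' Decoding with the encoding the bytes were written in restores the text.
        Dim right As String = Encoding.UTF8.GetString(utf8Bytes)

        Console.WriteLine(wrong.Length)      ' 10: every byte became its own character
        Console.WriteLine(right = original)  ' True
    End Sub
End Module
```

The wrongly decoded string is twice as long because each two-byte character is split into two unrelated characters, which is the kind of garbling described in the question.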

A simpler way, thanks to @dave's comment:

Async Function GetHtmlString(address As String) As Task(Of String) 
    Using client As New WebClient 
     client.Encoding = Encoding.UTF8 
     Dim s = Await client.DownloadStringTaskAsync(address) 
     return s 
    End Using 
End Function 
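
If you would rather keep the synchronous `DownloadString` call from the question, the same property should apply there too. A minimal sketch, where `SyncDownloadSketch` and `DownloadUtf8` are hypothetical names chosen for illustration:

```vbnet
Imports System.Net
Imports System.Text

Module SyncDownloadSketch
    ' Hypothetical helper: synchronous counterpart of GetHtmlString above.
    Function DownloadUtf8(address As String) As String
        Using client As New WebClient()
            ' Tell WebClient how to turn the downloaded bytes into a String.
            client.Encoding = Encoding.UTF8
            Return client.DownloadString(address)
        End Using
    End Function
End Module
```

With that in place, the question's code would become `TextBox1.Text = DownloadUtf8("SomeWebPage")`. Keep in mind that the synchronous call blocks the UI thread while the page downloads, which is one reason the async versions may be preferable in a form.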

Usage example:

Imports System.Net 
Imports System.Text 

Public Class Form1 
    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load 
     Dim s = Await GetHtmlString("http://www.radiomerkury.pl/") 
    End Sub 

    Async Function GetHtmlString(address As String) As Task(Of String) 
     Using client As New WebClient 
      client.Encoding = Encoding.UTF8 
      Dim s = Await client.DownloadStringTaskAsync(address) 
      Return s 
     End Using 
    End Function 
End Class 
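
Forcing UTF-8 assumes the page really is UTF-8. If that is not guaranteed, one possible variation is to download the bytes and pick the encoding from the charset in the response's Content-Type header, falling back to UTF-8. This is only a sketch under that assumption: `CharsetSketch`, `EncodingFromContentType` and `GetHtmlStringByHeader` are hypothetical names, and it ignores charsets declared only inside the HTML itself.

```vbnet
Imports System
Imports System.Net
Imports System.Net.Mime
Imports System.Text
Imports System.Threading.Tasks

Module CharsetSketch
    ' Hypothetical helper: returns the encoding named in a Content-Type header value,
    ' or UTF-8 when the header is missing, malformed, or names an unknown charset.
    Function EncodingFromContentType(headerValue As String) As Encoding
        If String.IsNullOrEmpty(headerValue) Then Return Encoding.UTF8
        Try
            Dim parsed As New ContentType(headerValue)
            If String.IsNullOrEmpty(parsed.CharSet) Then Return Encoding.UTF8
            Return Encoding.GetEncoding(parsed.CharSet)
        Catch ex As FormatException
            ' The header could not be parsed as a content type.
            Return Encoding.UTF8
        Catch ex As ArgumentException
            ' The charset name is not a known encoding.
            Return Encoding.UTF8
        End Try
    End Function

    Async Function GetHtmlStringByHeader(address As String) As Task(Of String)
        Using client As New WebClient
            Dim bytes = Await client.DownloadDataTaskAsync(address)

            ' ResponseHeaders is filled in once the download has completed.
            Dim headers = client.ResponseHeaders
            Dim headerValue As String = Nothing
            If headers IsNot Nothing Then headerValue = headers("Content-Type")

            Return EncodingFromContentType(headerValue).GetString(bytes)
        End Using
    End Function
End Module
```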
+1

Or just set the [Encoding](https://msdn.microsoft.com/en-us/library/system.net.webclient.encoding(v=vs.110).aspx) property of the WebClient. – dave

+0

I think this is the answer I'm looking for, but I don't know C# well enough to translate it to VB.Net ... –

+0

Sorry, I've pasted the VB code now :) I guess @dave is right! – Aybe

0

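Kibi, I think you are way, way, way off here. I don't see how VB.NET helps you with this kind of thing. Below is a simple and intuitive Excel VBA solution. I hope it helps you achieve your goal.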

Sub DumpData() 

Set IE = CreateObject("InternetExplorer.Application") 
IE.Visible = True 

URL = "http://finance.yahoo.com/q?s=sbux&ql=1" 

IE.Navigate2 URL
'Wait for the site to fully load (readyState 4 = READYSTATE_COMPLETE)
Do While IE.Busy Or IE.readyState <> 4
    DoEvents
Loop

RowCount = 1 

With Sheets("Sheet1") 
    .Cells.ClearContents 
    RowCount = 1 
    For Each itm In IE.document.all 
     .Range("A" & RowCount) = itm.tagname 
     .Range("B" & RowCount) = itm.ID 
     .Range("C" & RowCount) = itm.classname 
     .Range("D" & RowCount) = Left(itm.innertext, 1024) 

     RowCount = RowCount + 1 
    Next itm 
End With 
End Sub