2016-01-13 106 views

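Excel: Get an attribute value from an HTML heading

I want to use Excel VBA to extract an attribute value from a heading element in a web page. The page I want to scrape is structured as follows: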

<div class="index-detail">
    <h5><a href="/indices/equity/dow-jones-sustainability-chile-index-clp" title="DJSI Chile" contentIdentifier="2e9cb165-0cbf-4070-a5ef-dc20bf6219ba" contentType="web-page" contentTitle="Dow Jones Sustainability™ Chile Index (CLP)">DJSI Chile</a></h5>
    <span class="return-value">917.08 </span>
    <span class="daily-change down ">-0.1% ▼ </span>
</div>

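Using getElementsByClassName and getElementsByTagName I have extracted the <h5> heading. When I print the heading's innerText I get DJSI Chile, but what I want is the text of the contentTitle attribute: Dow Jones Sustainability™ Chile Index (CLP).

How can I do this?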

UPDATE

The code I am using is as follows:

Sub myConSP()

    ' Declare variables
    Dim oHtmlSP As HTMLDocument
    Dim tSPIndex As HTMLDivElement
    Dim tSPIdx As HTMLDivElement

    ' Load page inside HTMLDocument
    Set oHtmlSP = New HTMLDocument
    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "http://www.espanol.spindices.com", False
        .send
        oHtmlSP.body.innerHTML = .responseText
    End With

    ' Get indices
    Set tSPIndex = oHtmlSP.getElementById("all-indices-slider")

    Set objTitleTag = tSPIndex.getElementsByClassName("index-detail")(0).getElementsByTagName("h5")(0)
    MsgBox objTitleTag.getAttribute("contentTitle").innerText

End Sub

`objTitleTag.getAttribute("contentTitle")` –

How do I define objTitleTag? – capm

It's whatever you are calling `innerText` on. It's always best to show your actual code: that makes it much easier to suggest what to add. –

Answer

The attribute is attached to the <a>, not the <h5> (sorry, that was my mistake in the comment above):

Sub TT()

    ' Early binding: requires a reference to the Microsoft HTML Object Library
    Dim html As String, d As New HTMLDocument, el

    ' Sample markup copied from the question
    html = "<div class='index-detail'>" & _
    "<h5><a href='/indices/equity/dow-jones-sustainability-chile-index-clp' " & _
    "title='DJSI Chile' contentIdentifier='2e9cb165-0cbf-4070-a5ef-dc20bf6219ba' " & _
    "contentType = 'web-page' " & _
    "contentTitle='Dow Jones Sustainability™ Chile Index (CLP)'>DJSI Chile</a></h5> " & _
    "<span class='return-value'>917.08 </span> " & _
    "<span class='daily-change down '>-0.1% ? </span></div>"

    d.body.innerHTML = html

    ' The attribute lives on the <a> element, so select that rather than the <h5>
    Set el = d.getElementsByClassName("index-detail")(0).getElementsByTagName("a")(0)

    ' getAttribute returns the value directly; there is no .innerText to read from it
    Debug.Print el.getAttribute("contentTitle")
    ' >>> Dow Jones Sustainability™ Chile Index (CLP)

End Sub
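
Applied to the Sub from the question, the same change might look roughly like the sketch below. It is untested against the live site and assumes the page still returns an element with the id all-indices-slider that contains index-detail blocks shaped like the sample markup. The names myConSPFixed and objLink are made up for this illustration.

```vba
Sub myConSPFixed()   ' hypothetical name, not from the question

    ' Early binding: requires a reference to the Microsoft HTML Object Library
    Dim oHtmlSP As HTMLDocument
    Dim tSPIndex As HTMLDivElement
    Dim objLink As Object          ' hypothetical name for the <a> element

    ' Load the page into an HTMLDocument, as in the question
    Set oHtmlSP = New HTMLDocument
    With CreateObject("WinHttp.WinHttpRequest.5.1")
        .Open "GET", "http://www.espanol.spindices.com", False
        .send
        oHtmlSP.body.innerHTML = .responseText
    End With

    ' Assumption: the container id is unchanged on the live page
    Set tSPIndex = oHtmlSP.getElementById("all-indices-slider")
    If tSPIndex Is Nothing Then
        MsgBox "Element 'all-indices-slider' was not found"
        Exit Sub
    End If

    ' Select the <a> inside the first index-detail block instead of the <h5>
    Set objLink = tSPIndex.getElementsByClassName("index-detail")(0).getElementsByTagName("a")(0)

    ' getAttribute already returns the attribute value, so no .innerText
    MsgBox objLink.getAttribute("contentTitle")

End Sub
```

If the first link has no contentTitle attribute, getAttribute returns Null and MsgBox will raise an error, so a real version would probably want to check for that with IsNull first.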
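I see. When I extract the element by the class 'index-detail' and then isolate the heading '<h5>', the attribute belongs not to the heading but to the <a> element. – capm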