2015-11-08 106 views
-1

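C#: trying to read a page using XmlNode

I am trying to read the Steam market listings sorted from lowest to highest price. I have the URL I need, and I wrote some code that used to work but no longer does. I have spent some time trying to fix it, but I cannot find the problem.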

Link I am trying to read.

Here is the code.

//List of items from the Steam market from lowest to highest 
    private void priceFromMarket(int StartPage) 
    { 
     if (valueList.Count != 0) 
     { 
      valueList.Clear(); 
      numList.Clear(); 
      nameList.Clear(); 
     } 
     string pageContent = null; 
     string results_html = null; 
     try 
     { 
      HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://steamcommunity.com/market/search/render/?query=appid:730&start=" + StartPage.ToString() + "&sort_column=price&sort_dir=asc&count=100&currency=1&l=english"); 
      HttpWebResponse myRes = (HttpWebResponse)myReq.GetResponse(); 
      using (StreamReader sr = new StreamReader(myRes.GetResponseStream())) 
      { 
       pageContent = sr.ReadToEnd(); 
      } 
     } 
     //Return after each retry so this call does not go on to parse a null pageContent 
     catch { Thread.Sleep(30000); priceFromMarket(StartPage); return; } 
     if (pageContent == null) { priceFromMarket(StartPage); return; } 
     try 
     { 
      JObject user = JObject.Parse(pageContent); 
      bool success = (bool)user["success"]; 
      if (success) 
      { 
       results_html = (string)user["results_html"]; 
       string data = results_html; 
       data = "<root>" + data + "</root>"; 
       XmlDocument document = new XmlDocument(); 
       document.LoadXml(System.Net.WebUtility.HtmlDecode(data)); 
       XmlNode rootnode = document.SelectSingleNode("root"); 
       XmlNodeList items = rootnode.SelectNodes("./a/div"); 
       foreach (XmlNode node in items) 
       { 
        //This does not work anymore! 
        //The try fails here at line 574! 
        string value = node.SelectSingleNode("./div[contains(concat(' ', @class, ' '), ' market_listing_their_price ')]/span/span").InnerText; 
        string num = node.SelectSingleNode("./div[contains(concat(' ', @class, ' '), ' market_listing_num_listings ')]/span/span").InnerText; 
        string name = node.SelectSingleNode("./div/span[contains(concat(' ', @class, ' '), ' market_listing_item_name ')]").InnerText; 
        valueList.Add(value); //Lowest price for the item 
        numList.Add(num); //Volume of that item 
        nameList.Add(name); //Name of that item 
       } 
      } 
      else { Thread.Sleep(60000); priceFromMarket(StartPage); } 
     } 
     catch { Thread.Sleep(60000); priceFromMarket(StartPage); } 
    } 

Answers

3

Parsing HTML as XML has never been reliable, because HTML is often not well-formed enough for an XML parser to accept it...
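To illustrate the point, here is a minimal sketch using only `System.Xml`. The two markup strings are made-up samples, not the actual Steam response; the second contains an unclosed `<img>` and `<br>`, which is valid HTML but not well-formed XML.

```csharp
using System;
using System.Xml;

public static class HtmlIsNotXmlDemo
{
    // Returns true if the markup loads as XML, false if the XML parser rejects it.
    public static bool TryLoadAsXml(string markup)
    {
        try
        {
            new XmlDocument().LoadXml(markup);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical sample fragments for illustration only.
        string wellFormed = "<root><a href='#'><div>ok</div></a></root>";
        string typicalHtml = "<root><a href='#'><img src='x.png'><br></a></root>";

        Console.WriteLine(TryLoadAsXml(wellFormed));  // True
        Console.WriteLine(TryLoadAsXml(typicalHtml)); // False
    }
}
```

A single unclosed tag or undeclared entity such as `&nbsp;` anywhere in the page is enough to make `LoadXml` throw, which is one plausible reason code like the question's can work for a while and then stop.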

For parsing HTML in C#, I prefer CsQuery: https://www.nuget.org/packages/CsQuery/

It lets you query HTML in C# much as you would with jQuery.
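As a rough sketch of what that might look like, the snippet below collects item names with a jQuery-style class selector. It assumes the CsQuery NuGet package is referenced; the `ItemNames` helper and the sample fragment are hypothetical, and only the `market_listing_item_name` class is taken from the question's code.

```csharp
using System;
using System.Collections.Generic;
using CsQuery; // NuGet package "CsQuery"

public static class CsQueryDemo
{
    // Hypothetical helper: returns the text of every element with the
    // market_listing_item_name class found in the given HTML.
    public static List<string> ItemNames(string resultsHtml)
    {
        CQ dom = CQ.Create(resultsHtml);
        CQ nameSpans = dom[".market_listing_item_name"];

        var names = new List<string>();
        for (int i = 0; i < nameSpans.Length; i++)
        {
            // Eq(i) narrows the selection to one element, as in jQuery.
            names.Add(nameSpans.Eq(i).Text().Trim());
        }
        return names;
    }

    public static void Main()
    {
        // Made-up fragment; the unclosed <br> would break XmlDocument.LoadXml.
        string sample =
            "<a href='#'><div>" +
            "<div><span class='market_listing_item_name'>Sample Case</span><br></div>" +
            "</div></a>";

        foreach (string name in ItemNames(sample))
        {
            Console.WriteLine(name);
        }
    }
}
```

Selecting by class rather than by exact position in the tree also makes the code less sensitive to wrapper elements being added or removed in the page.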

Another option is the HTML Agility Pack, which you may be able to use without changing much of your code. It works much like the System.Xml.XmlDocument classes.
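A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced, is shown below. It reuses the XPath expressions from the question; the `MarketListingParser` class, its helpers, and the sample fragment are hypothetical, and the real `results_html` may be structured differently.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class MarketListingParser
{
    // Builds an XPath predicate that matches one class name within @class.
    static string HasClass(string cls)
    {
        return "contains(concat(' ', normalize-space(@class), ' '), ' " + cls + " ')";
    }

    // Returns the trimmed text of the first match, or null if nothing matches.
    static string TextOf(HtmlNode row, string xpath)
    {
        HtmlNode node = row.SelectSingleNode(xpath);
        return node == null ? null : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    // Each entry is { name, price, count }; rows missing a field are skipped.
    public static List<string[]> Parse(string resultsHtml)
    {
        var listings = new List<string[]>();
        var doc = new HtmlDocument();
        doc.LoadHtml(resultsHtml); // tolerant of unclosed tags, unlike LoadXml

        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//a/div");
        if (rows == null) return listings; // SelectNodes returns null when nothing matches

        foreach (HtmlNode row in rows)
        {
            string price = TextOf(row, "./div[" + HasClass("market_listing_their_price") + "]/span/span");
            string count = TextOf(row, "./div[" + HasClass("market_listing_num_listings") + "]/span/span");
            string name = TextOf(row, "./div/span[" + HasClass("market_listing_item_name") + "]");
            if (price == null || count == null || name == null) continue;
            listings.Add(new[] { name, price, count });
        }
        return listings;
    }

    public static void Main()
    {
        // Made-up fragment shaped to match the XPath in the question.
        string sample =
            "<a href='#'><div class='market_listing_row'>" +
            "<div class='market_listing_their_price'><span><span>$0.03</span></span></div>" +
            "<div class='market_listing_num_listings'><span><span>12</span></span></div>" +
            "<div><span class='market_listing_item_name'>Sample Case</span><br></div>" +
            "</div></a>";

        foreach (string[] item in Parse(sample))
        {
            Console.WriteLine(item[0] + " | " + item[1] + " | " + item[2]);
        }
    }
}
```

Because `SelectSingleNode` returns null when an expression matches nothing, checking the result before reading `InnerText` avoids the `NullReferenceException` that the question's loop would hit if the page layout changed.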