2017-02-13 60 views
0

I would like to use IndexOf and Substring first. How can I extract the numbers from this part of the text using IndexOf and Substring, or HtmlAgilityPack?

The HTML file I downloaded contains this part of text:

var arrayImageTimes = []; 
arrayImageTimes.push('201702130145');arrayImageTimes.push('201702130200');arrayImageTimes.push('201702130215');arrayImageTimes.push('201702130230');arrayImageTimes.push('201702130245');arrayImageTimes.push('201702130300');arrayImageTimes.push('201702130315');arrayImageTimes.push('201702130330');arrayImageTimes.push('201702130345');arrayImageTimes.push('201702130400'); 

I want to extract only the numbers into a List or an array, so that in the end I have, for example, a list of strings:

201702130145 
201702130200 
201702130215 

That is, every number that appears between the single quotes ''.

I tried:

public void ExtractDateAndTimes(string f)
{
    string startTag = "var arrayImageTimes = [];";
    string endTag = "</script>";
    int startTagWidth = startTag.Length;
    int endTagWidth = endTag.Length;
    int index = 0;
    while (true)
    {
        index = f.IndexOf(startTag, index);
        if (index == -1)
        {
            break;
        }
        // else more to do - index now is positioned at first character of startTag
        int start = index + startTagWidth;
        index = f.IndexOf(endTag, start + 1);
        if (index == -1)
        {
            break;
        }
        // found the endTag
        string g = f.Substring(start, index - start);
    }
}

And in the constructor:

string text = File.ReadAllText(@"c:\Temp\testinghtml.html"); 
ExtractDateAndTimes(text); 

But all I get is the whole block of text after var arrayImageTimes that I showed above, not the individual numbers.

+1

Why are some of the numbers, such as 201702130230, missing from your result? – CodingYoshi

+0

@CodingYoshi You are right. I only listed a few numbers as an example. It should parse all of them, not just the ones I showed in the result. –

Answers

1

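Use Regex to find all the matches, capturing each value into a named capture group (see "Named matched subexpression" in the .NET regular expression documentation):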

// Don't forget to escape full stops! 
// Capture quoted values inside round braces into imageTime capturing group 
Regex regex = new Regex(@"arrayImageTimes\.push\('(?<imageTime>\d+)'\)", RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase | RegexOptions.Singleline); 

MatchCollection matches = regex.Matches(myString); 

List<string> timestamps = new List<string>(); 

foreach (Match m in matches) 
{ 
    timestamps.Add(m.Groups["imageTime"].Value); 
} 
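
Since the question asked about IndexOf and Substring, here is a rough sketch of the same extraction without Regex. It is not part of the original answer: the class name ImageTimeExtractor and the method name ExtractImageTimes are made up for illustration, and it assumes every value is written exactly as arrayImageTimes.push('...') with single quotes, as in the snippet from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, named here only for illustration.
public static class ImageTimeExtractor
{
    // Collects every value found between "arrayImageTimes.push('" and the next single quote.
    public static List<string> ExtractImageTimes(string html)
    {
        const string startTag = "arrayImageTimes.push('";
        const string endTag = "'";
        var timestamps = new List<string>();
        int index = 0;

        while (true)
        {
            // Find the next push call, searching from where the previous one ended.
            index = html.IndexOf(startTag, index, StringComparison.Ordinal);
            if (index == -1)
            {
                break;
            }

            // The value starts right after the opening quote...
            int start = index + startTag.Length;
            // ...and ends at the closing quote.
            int end = html.IndexOf(endTag, start, StringComparison.Ordinal);
            if (end == -1)
            {
                break;
            }

            timestamps.Add(html.Substring(start, end - start));
            index = end + endTag.Length;
        }

        return timestamps;
    }
}
```

It could be called the same way as in the question, for example ImageTimeExtractor.ExtractImageTimes(File.ReadAllText(@"c:\Temp\testinghtml.html")). The difference from the attempt in the question is that the start and end markers wrap a single value rather than the whole script block, and each Substring result is added to a list instead of being discarded. Unlike the Regex version, this sketch does not check that the quoted value consists only of digits.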