Extract text between multiple occurrences

-1

I have sample text data like this:

1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017

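Given the two inputs abc and 11-nov-2017, I want to extract the string between the two, i.e. 222.

How can I get this result using a regex? Is there any other way to achieve the same thing?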

The actual data looks like this:

113434;Axis Gold ETF;2651.2868;2651.2868;2651.2868;20-Nov-2017 113434;Axis Gold ETF;2627.6778;2627.6778;2627.6778;21-Nov-2017 113434;Axis Gold ETF;2624.1880;2624.1880;2624.1880;22-Nov-2017

Any help is highly appreciated. Thanks!

Source

2017-12-02 Imran

And what have you tried? – revo

I tried data[/#{'abc;'}(.*?)#{';11-nov-2017'}/m, 1] but it returns ;111;10-nov-2017 2;abc;222; i.e. the data between the first abc and 11-nov-2017 – Imran

Here are two ways to extract the desired substring (if it is present). We are given the following.

str = "1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017" 
before_str = "abc;" 
date_str = ";11-nov-2017"

I assume that the value of date_str appears in str at most once.

#1 Use a regular expression

r =/
    .*   # match any number of characters greedily 
    #{before_str} # match the content of the variable 'before_str' 
    (.*)   # match any number of characters greedily, in capture group 1 
    #{date_str} # match the content of the variable 'date_str' 
    /x   # free-spacing regex definition mode 
    #=> /.*abc;(.*);11-nov-2017/x 

str[r,1] 
    #=> "222"

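The key here is the .* at the beginning of the regex. Being a greedy match, it consumes as much of the string as it can, so the "abc;" that gets matched next (the value of before_str) is the last instance that precedes the value of date_str.

#2 Determine the beginning and ending indices of the desired substring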
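To make the effect of the leading .* concrete, here is a small sketch that compares the lazy pattern from the comments with the greedy-prefix pattern. The method name extract_between is hypothetical, and the use of Regexp.escape is an added precaution (not part of the answer) so that any regex metacharacters in the inputs are matched literally.

```ruby
# Hypothetical helper; the name extract_between is not from the answer.
def extract_between(str, before_str, date_str)
  # Regexp.escape is an assumption added here so that characters such as "."
  # in the inputs are treated as literal text rather than regex syntax.
  r = /.*#{Regexp.escape(before_str)}(.*)#{Regexp.escape(date_str)}/
  str[r, 1]
end

str = "1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017"

# Lazy attempt: the match starts at the FIRST "abc;" in the string.
lazy = str[/abc;(.*?);11-nov-2017/, 1]

# Greedy prefix: the leading .* pushes the match to the LAST "abc;"
# that still precedes the date.
greedy = extract_between(str, "abc;", ";11-nov-2017")

puts lazy    # 111;10-nov-2017 2;abc;222
puts greedy  # 222
```

A lazy quantifier only shortens the capture on its right-hand side; it does not move the starting point of the match, which is why the attempt in the comments begins at the first "abc;".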

idx_date = str.index(date_str) 
    #=> str.index(";11-nov-2017") => 31 
idx_before = str.rindex(before_str, idx_date-before_str.size) 
    #=> str.rindex("abc;", 27) => 24 
str[idx_before + before_str.size..idx_date-1] 
    #=> str[24+4..31-1] => str[28..30] => "222"

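If either idx_date or idx_before is nil, nil should be returned and the last expression should not be evaluated. The statements above do not perform that check themselves: as written, a nil index raises NoMethodError in the arithmetic that follows.

See String#rindex, in particular the function of the optional second argument.

(One could write str[idx_before + before_str.size...idx_date], but I find that using three dots in ranges is a potential source of bugs, so I always use two dots.)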
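One way to get the nil behaviour described here is to wrap the index arithmetic in a method with early returns. This is only a sketch: the method name substring_before_date is hypothetical, and the extra check for a negative search position is an assumption added because String#rindex interprets a negative position as an offset from the end of the string.

```ruby
# Hypothetical wrapper around approach #2; the method name is illustrative.
def substring_before_date(str, before_str, date_str)
  idx_date = str.index(date_str)
  return nil if idx_date.nil?          # date_str not present

  start_limit = idx_date - before_str.size
  return nil if start_limit < 0        # no room for before_str ahead of date_str

  idx_before = str.rindex(before_str, start_limit)
  return nil if idx_before.nil?        # before_str does not precede date_str

  str[idx_before + before_str.size..idx_date - 1]
end

str = "1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017"

p substring_before_date(str, "abc;", ";11-nov-2017")  # "222"
p substring_before_date(str, "abc;", ";13-nov-2017")  # nil
p substring_before_date(str, "xyz;", ";11-nov-2017")  # nil
```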
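A short illustration of that second argument, using the same sample string: it gives the highest index at which a match is allowed to start, and the search proceeds backwards from there.

```ruby
str = "1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017"

last_overall = str.rindex("abc;")      # no limit: last occurrence in the string
up_to_27     = str.rindex("abc;", 27)  # last occurrence starting at or before index 27
up_to_23     = str.rindex("abc;", 23)  # last occurrence starting at or before index 23
up_to_1      = str.rindex("abc;", 1)   # nothing starts at or before index 1

p last_overall  # 46
p up_to_27      # 24
p up_to_23      # 2
p up_to_1       # nil
```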
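For comparison, the sketch below evaluates both range forms on the sample string. A two-dot range includes its end index, while a three-dot range excludes it, so the three-dot form uses idx_date directly and the two-dot form uses idx_date - 1.

```ruby
str        = "1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017"
before_str = "abc;"
date_str   = ";11-nov-2017"

idx_date   = str.index(date_str)
idx_before = str.rindex(before_str, idx_date - before_str.size)
start      = idx_before + before_str.size

two_dots   = str[start..idx_date - 1]  # inclusive end
three_dots = str[start...idx_date]     # exclusive end

p two_dots    # "222"
p three_dots  # "222"
```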

Source

2017-12-02 23:48:11

Perfect! Makes sense. Thanks. – Imran

You can look at the result of: /abc(.*?)10-nov-2017/g.exec("1;abc;111;10-nov-2017 2;abc;222;11-nov-2017 3;abc;333;12-nov-2017")[1]

Source

2017-12-02 21:12:13
