2013-02-28

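I'm stuck on a problem in my project and can't get past it. I would appreciate help with a solution to this problem: getting the token blocks out of a string.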

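I have a string that contains some token text, and I want to pull the tokens out and put them into an ArrayList of strings. The final result could be two ArrayLists: one for the normal text and one for the token text. Below is a sample string containing several tokens, each enclosed by the opening tag "[[" and the closing tag "]]".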


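The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible.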


After processing the string, I should end up with two ArrayLists.

Result:

Normal Text ArrayList { "The first step, where the wort is prepared by mixing the starch source with hot water, is known as ", ". Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around ", ", during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as ", ". This washing allows the brewer to gather ", " the fermentable liquid from the grains as possible." } 

Token Text ArrayList { "[[Textarea]]", "[[CheckBox]]", "[[Radio]]", "[[DropDownList]]" } 

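So there are two ArrayLists: the normal-text ArrayList has 5 elements, which are the pieces of text before or after each token, and the token-text ArrayList has 4 elements, which are the tokens found inside the string.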

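This could be done by cutting the string up with substring operations, but with a long text that is hard to get right and error-prone, and sometimes I don't get the result I want. If you can help with this problem, please post in C#, because that is what I'm using for this task.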

Answer

This seems to do the job (but note that, at the moment, my tokens array contains the bare tokens rather than wrapping them in [[ and ]]):

var inp = @"The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible."; 

// Requires "using System;" and "using System.Linq;" at the top of the file.
var step1 = inp.Split(new string[] { "[[" }, StringSplitOptions.None);
// step1 now contains one string that goes straight into normal, followed by n strings that need to be split further.
var step2 = step1.Skip(1).Select(a => a.Split(new string[] { "]]" }, StringSplitOptions.None));
// step2 now contains pairs of strings: the first of each pair is a token, the second is normal text.

var normal = step1.Take(1).Concat(step2.Select(a => a[1])).ToArray();
var tokens = step2.Select(a => a[0]).ToArray();
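The question asks for tokens that still carry their delimiters, while the tokens array above holds only the bare names. A minimal sketch of putting the delimiters back, assuming a hypothetical helper named `WrapTokens` (not part of the original answer):

```csharp
using System;
using System.Linq;

static class TokenWrapping
{
    // Hypothetical helper: re-adds the "[[" and "]]" delimiters that Split removed.
    public static string[] WrapTokens(string[] bareTokens)
    {
        return bareTokens.Select(t => "[[" + t + "]]").ToArray();
    }

    static void Main()
    {
        var tokens = new[] { "Textarea", "CheckBox", "Radio", "DropDownList" };
        // Prints: [[Textarea]], [[CheckBox]], [[Radio]], [[DropDownList]]
        Console.WriteLine(string.Join(", ", WrapTokens(tokens)));
    }
}
```

If an ArrayList is really needed rather than an array, the resulting array can be passed to the `ArrayList` constructor, although `List<string>` is usually the more idiomatic choice in current C#.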

This also assumes that the input contains no unbalanced [[ or ]] sequences.
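With the split-based code, a "[[" that has no matching "]]" produces a one-element array in step2, so a[1] throws an IndexOutOfRangeException. If that matters, one possible alternative is a regular expression. This is a different technique from the answer's, sketched here under the assumption that tokens never span multiple lines; the class and method names are hypothetical:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexTokenizer
{
    // Matches "[[", then as few characters as possible, then "]]".
    // By default "." does not match a newline, so tokens must sit on one line.
    private static readonly Regex TokenPattern = new Regex(@"\[\[.*?\]\]");

    // Returns the tokens with their delimiters still attached.
    public static string[] GetTokens(string input)
    {
        return TokenPattern.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();
    }

    // Returns the text between tokens; an unmatched "[[" stays in the normal text.
    public static string[] GetNormalText(string input)
    {
        return TokenPattern.Split(input);
    }

    static void Main()
    {
        var input = "x [[A]] y [[B";
        Console.WriteLine(string.Join("|", GetTokens(input)));     // [[A]]
        Console.WriteLine(string.Join("|", GetNormalText(input))); // x | y [[B
    }
}
```

Here the dangling "[[B" is simply left in the normal text instead of raising an exception. Whether that is the right behavior, or whether malformed input should be rejected outright, depends on the project.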

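The thinking that went into this solution: if you first split the original string around each [[ pair, then the first output string is already finished. Furthermore, every string after the first consists of a token, a ]] pair, and a piece of normal text. For example, the second result in step1 is: "Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around "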

So, if you then split each of these results around the ]] pair, the first result is a token and the second result is a normal string.
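To make the reasoning above concrete, here is a small walk-through on a short input. It is only a sketch: the wrapper name `SplitTokens` and the sample string are assumptions for illustration, and the code relies on the delimiters being balanced.

```csharp
using System;
using System.Linq;

static class SplitWalkthrough
{
    // Hypothetical wrapper around the two-step split; assumes balanced delimiters.
    public static void SplitTokens(string input, out string[] normal, out string[] tokens)
    {
        // Step 1: the first piece is normal text; each later piece is "token]]normal".
        var step1 = input.Split(new[] { "[[" }, StringSplitOptions.None);
        // Step 2: split each later piece into the pair { token, normal }.
        var step2 = step1.Skip(1)
                         .Select(s => s.Split(new[] { "]]" }, StringSplitOptions.None))
                         .ToArray();
        normal = step1.Take(1).Concat(step2.Select(p => p[1])).ToArray();
        tokens = step2.Select(p => p[0]).ToArray();
    }

    static void Main()
    {
        string[] normal, tokens;
        SplitTokens("A [[X]] B [[Y]] C", out normal, out tokens);
        // step1 is { "A ", "X]] B ", "Y]] C" }; step2 is { { "X", " B " }, { "Y", " C" } }.
        Console.WriteLine(string.Join("|", normal)); // A | B | C
        Console.WriteLine(string.Join("|", tokens)); // X|Y
    }
}
```

With n tokens the normal array always has n + 1 elements, which matches the 5 and 4 counts in the question.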

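Yes, this is amazing. This is exactly what I needed to know. It is short and well planned, with little room for errors. I have tested it and got the final result I needed. Thank you very much for your help. – 2013-02-28 07:32:06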

Sorry, I posted my own answer before noticing that there was already one. But thank you very much for your help. – 2013-02-28 07:33:10