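Hi everyone, I have this wiki-formatting algorithm that I use on Stacked to create HTML from "wiki syntax". I'm not sure whether the algorithm I'm currently using is good enough, optimal, or contains bugs, since I'm not really a "regex guru". This is what I'm using at the moment. What is the perfect regular expression for wiki formatting (.NET)?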
// Body is wiki content...
// Escaping HTML ('&' first, so the entities added afterwards are not escaped again)...
string tmp = Body.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
// Sanitizing carriage returns...
tmp = tmp.Replace("\r\n", "\n");
// Replacing dummy links...
tmp = Regex.Replace(
    " " + tmp,
    "(?<spaceChar>\\s+)(?<linkType>http://|https://)(?<link>\\S+)",
    "${spaceChar}<a href=\"${linkType}${link}\"" + nofollow + ">${link}</a>",
    RegexOptions.Compiled).Trim();
// Replacing wiki links
tmp = Regex.Replace(tmp,
    "(?<begin>\\[{1})(?<linkType>http://|https://)(?<link>\\S+)\\s+(?<content>[^\\]]+)(?<end>[\\]]{1})",
    "<a href=\"${linkType}${link}\"" + nofollow + ">${content}</a>",
    RegexOptions.Compiled);
// Replacing bolds
tmp = Regex.Replace(tmp,
    "(?<begin>\\*{1})(?<content>.+?)(?<end>\\*{1})",
    "<strong>${content}</strong>",
    RegexOptions.Compiled);
// Replacing italics
tmp = Regex.Replace(tmp,
    "(?<begin>_{1})(?<content>.+?)(?<end>_{1})",
    "<em>${content}</em>",
    RegexOptions.Compiled);
// Replacing lists
tmp = Regex.Replace(tmp,
    "(?<begin>\\*{1}[ ]{1})(?<content>.+)(?<end>[^*])",
    "<li>${content}</li>",
    RegexOptions.Compiled);
tmp = Regex.Replace(tmp,
    "(?<content>\\<li\\>{1}.+\\<\\/li\\>)",
    "<ul>${content}</ul>",
    RegexOptions.Compiled);
// Quoting ('>' has already been escaped to "&gt;" at this point)
tmp = Regex.Replace(tmp,
    "(?<content>^&gt;.+$)",
    "<blockquote>${content}</blockquote>",
    RegexOptions.Compiled | RegexOptions.Multiline).Replace("</blockquote>\n<blockquote>", "\n");
// Paragraphs
tmp = Regex.Replace(tmp,
    "(?<content>)\\n{2}",
    "${content}</p><p>",
    RegexOptions.Compiled);
// Breaks
tmp = Regex.Replace(tmp,
    "(?<content>)\\n{1}",
    "${content}<br />",
    RegexOptions.Compiled);
// Code
tmp = Regex.Replace(tmp,
    "(?<begin>\\[code\\])(?<content>[^$]+)(?<end>\\[/code\\])",
    "<pre class=\"code\">${content}</pre>",
    RegexOptions.Compiled);
// Now hopefully tmp will contain perfect HTML
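The snippet refers to `Body` and `nofollow` without showing where they come from, so it cannot be run on its own. For anyone who wants to experiment, here is a minimal sketch that wraps only the escaping step and the two link rules. The class name `WikiSketch`, the method name `FormatLinks`, and the value ` rel="nofollow"` are assumptions made for illustration; the actual value of `nofollow` is not shown in the post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WikiSketch
{
    // Assumption: the post does not show what nofollow contains.
    private const string Nofollow = " rel=\"nofollow\"";

    // Hypothetical helper: escapes HTML, normalizes line endings,
    // then applies only the two link rules from the post.
    public static string FormatLinks(string body)
    {
        string tmp = body.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
        tmp = tmp.Replace("\r\n", "\n");

        // Bare links: a URL preceded by whitespace.
        tmp = Regex.Replace(
            " " + tmp,
            "(?<spaceChar>\\s+)(?<linkType>http://|https://)(?<link>\\S+)",
            "${spaceChar}<a href=\"${linkType}${link}\"" + Nofollow + ">${link}</a>").Trim();

        // Wiki links: [http://x.com text]
        tmp = Regex.Replace(tmp,
            "(?<begin>\\[{1})(?<linkType>http://|https://)(?<link>\\S+)\\s+(?<content>[^\\]]+)(?<end>[\\]]{1})",
            "<a href=\"${linkType}${link}\"" + Nofollow + ">${content}</a>");
        return tmp;
    }

    public static void Main()
    {
        // One bare link and one wiki link in the same input.
        Console.WriteLine(FormatLinks("See http://x.com and [http://x.com text]"));
    }
}
```

Under these assumptions, the bare link is rendered with the URL minus its scheme as the link text, and the bracketed link with `text` as the link text.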
For those who find the code hard to read here, you can also check it out here ...
Below is the complete "wiki syntax";
The syntax is as follows:
Link; [http://x.com text]
*bold* (asterisk on both sides)
_italic_ (underscores on both sides)
* Listitem 1
* Listitem 2
* Listitem 3
(the above uses asterisks, but so.com also creates lists from them)
2 x Carriage Return opens a new paragraph
1 x Carriage Return is a break (br)
[code]
if(YouDoThis)
YouCanWriteCode();
[/code]
> quote (less-than operator)
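To make the inline part of the syntax concrete, here is a small sketch that runs only the bold and italic rules on sample lines. The names `InlineSketch` and `FormatInline` are hypothetical, and the second input is my own test case, not part of the syntax list; it is there to check how the bold rule treats a list item that also contains bold text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineSketch
{
    // Hypothetical helper applying only the bold and italic rules from the post.
    public static string FormatInline(string text)
    {
        string tmp = Regex.Replace(text,
            "(?<begin>\\*{1})(?<content>.+?)(?<end>\\*{1})",
            "<strong>${content}</strong>");
        return Regex.Replace(tmp,
            "(?<begin>_{1})(?<content>.+?)(?<end>_{1})",
            "<em>${content}</em>");
    }

    public static void Main()
    {
        // prints: <strong>bold</strong> and <em>italic</em>
        Console.WriteLine(FormatInline("*bold* and _italic_"));

        // The leading list asterisk pairs with the first bold asterisk.
        // prints: <strong> Listitem with </strong>bold*
        Console.WriteLine(FormatInline("* Listitem with *bold*"));
    }
}
```

Because the bold rule runs before the list rule, a list item containing bold text seems to lose its list marker in this sketch, which may be worth checking against the full pipeline.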
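If some "regex gurus" would like to review the logic of these regular expressions, I would appreciate it a lot :)
Did you find an alternative, or are you still using regex for the time being? – Tomalak 2008-12-09 08:04:58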