2011-06-16 90 views
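Help me parse a text file and extract specific values

I have a file called lijst.txt. The file is the output of a printmessage event log file. All lines have the same format. From every line I want to extract the username, which sits between the words `owned by` and `was`. I also want to extract the page count, which sits between `pages printed:` and `.`. I want to put these values in a new text file.

Regards,

Dennis (new to F#)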

Sample input data would be a great help here. – khachik 2011-06-16 10:39:54

Answers

I would suggest using a regular expression for this, something like:

open System.IO
open System.Text.RegularExpressions

// Verbatim string (@"...") so that \s reaches the regex engine unchanged.
// The lazy .*? stops at the first " was" instead of the last one on the line.
let usernameRegex = new Regex(@".*owned by\s+(?<username>.*?)\s+was.*")

/// Tries to extract the username from a given line of text. Returns None if the line is malformed.
// Note: instead of returning None you could also raise an exception (e.g. with failwith) in the else branch.
let extractUsername (line: string) =
    let regexMatch = usernameRegex.Match(line)
    if regexMatch.Success then Some regexMatch.Groups.["username"].Value else None

// In practice you would load these lines from the file using File.ReadAllLines
let sampleLines =
    ["Some text some text owned by DESIRED USERNAME was some text some text";
    "Some text line not containing the pattern";
    "Another line owned by ANOTHER USER was"]

// Seq.choose keeps the Some values and drops the lines that did not match.
let extractUsernames lines =
    lines
    |> Seq.choose extractUsername

// You can now save the usernames to a file using
// File.WriteAllLines("FileName", extractUsernames sampleLines)
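The same regular-expression approach can be stretched to pick up the page count as well. The following is only a sketch: the line shape, the sample lines, and the names `lineRegex`, `tryParseLine` and `hypotheticalLines` are assumptions made for illustration, since the thread never shows a real line from lijst.txt. Matching is case-insensitive because the thread spells the marker both as `pages printed:` and `Pages printed:`.

```fsharp
open System.Text.RegularExpressions

// Assumed line shape: "... owned by <user> was ... pages printed: <n>."
// The sample lines below are invented for illustration, not taken from a real event log.
let lineRegex =
    Regex(@"owned by\s+(?<username>.+?)\s+was\b.*?pages printed:\s*(?<pages>\d+)",
          RegexOptions.IgnoreCase)

/// Returns Some (username, pageCount) when both fields are present, otherwise None.
let tryParseLine (line: string) =
    let m = lineRegex.Match(line)
    if m.Success then
        Some (m.Groups.["username"].Value, int m.Groups.["pages"].Value)
    else
        None

let hypotheticalLines =
    [ "Document 12 owned by dennis was printed on HP01. Pages printed: 3."
      "A line without the expected markers"
      "Document 13 owned by anna was printed on HP01. Pages printed: 10." ]

// List.choose drops the lines for which tryParseLine returned None.
let parsed = hypotheticalLines |> List.choose tryParseLine

parsed |> List.iter (fun (user, pages) -> printfn "%s\t%d" user pages)
```

Returning an option per line means one malformed line is skipped rather than aborting the whole run. If the real log format differs from the assumed shape, only the pattern in `lineRegex` should need adjusting.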
Hi, thanks for the answer. It is part of the solution, but I think I can manage to sort the usernames and get the pages printed per user (I think). Regards – Coolzero1974 2011-07-04 07:50:08

You can do something like this:

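open System
open System.IO

// Returns the trimmed text between the first occurrence of `a` and the next occurrence of `b`.
// Splitting on string arrays treats `a` and `b` as whole markers; a char array would split
// on every individual character of the marker instead.
// Still throws if a line does not contain `a`.
let getBetween (a:string) (b:string) (str:string) =
     str.Split([|a|], StringSplitOptions.None).[1].Split([|b|], StringSplitOptions.None).[0].Trim()

// Sums the page counts (given as strings) and returns the total as a string.
let total (a:string seq) =
    (a |> Seq.map Int32.Parse |> Seq.sum).ToString()

File.ReadAllLines("inFile") |> Seq.map (fun l -> (getBetween "owned by" "was" l , getBetween "Pages printed:" "." l))
|> Seq.groupBy (fun (user,count) -> user)
|> Seq.map (fun (user,counts) -> user + "\t" + (counts |> Seq.map snd |> total))
|> (fun s -> File.WriteAllLines("outFile",s))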
How do I put the total number of pages per user into a txt file? – Coolzero1974 2011-06-16 19:29:48

I have updated my answer, please try it. It writes one user and count per line, with the user and the count separated by a tab. – Ankur 2011-06-17 04:33:06

Hi, thanks for the code, but I get a lot of errors: System.FormatException: Input string was not in a correct format. – Coolzero1974 2011-06-17 10:12:53
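A `FormatException` with this message is what `Int32.Parse` raises when the text it receives is not an integer. That can happen when the extracted "page count" is really some other fragment of the line, for example because `String.Split` was given a char array and therefore split at every single character of the marker. A more defensive variant might locate the markers with `IndexOf` and parse with `Int32.TryParse`, skipping lines that do not fit. This is a sketch under assumptions: the helper names (`tryGetBetween`, `tryParseInt`, `tryUserAndPages`) and the sample lines are hypothetical, and the real log format is not shown in the thread.

```fsharp
open System

// Text between the first `a` and the following `b`, or None if either marker is missing.
// Markers are matched as whole strings, ignoring case.
let tryGetBetween (a: string) (b: string) (str: string) =
    let startIdx = str.IndexOf(a, StringComparison.OrdinalIgnoreCase)
    if startIdx < 0 then None
    else
        let afterA = startIdx + a.Length
        let endIdx = str.IndexOf(b, afterA, StringComparison.OrdinalIgnoreCase)
        if endIdx < 0 then None
        else Some (str.Substring(afterA, endIdx - afterA).Trim())

// Parses a page count without throwing on malformed input.
let tryParseInt (s: string) =
    match Int32.TryParse(s) with
    | true, n -> Some n
    | _ -> None

// Some (user, pages) only when both fields are found and the count is an integer.
let tryUserAndPages (line: string) =
    match tryGetBetween "owned by" "was" line, tryGetBetween "pages printed:" "." line with
    | Some user, Some pages -> tryParseInt pages |> Option.map (fun n -> user, n)
    | _ -> None

// Hypothetical lines, invented for illustration; real code would use File.ReadAllLines.
let sampleLines =
    [ "Document 12 owned by dennis was printed on HP01. Pages printed: 3."
      "Document 13 owned by anna was printed on HP01. Pages printed: 10."
      "Document 14 owned by dennis was printed on HP01. Pages printed: 4."
      "Document 15 owned by anna was printed on HP01. Pages printed: n/a."
      "garbage" ]

// Total pages per user; lines that cannot be parsed are skipped.
let totals =
    sampleLines
    |> List.choose tryUserAndPages
    |> List.groupBy fst
    |> List.map (fun (user, rows) -> user, rows |> List.sumBy snd)

// One "user<TAB>total" line per user, ready for File.WriteAllLines("outFile", outputLines).
let outputLines = totals |> List.map (fun (user, pages) -> sprintf "%s\t%d" user pages)
```

Silently skipping bad lines is a design choice; collecting them into a separate list instead would make it easier to see which lines of the log do not follow the expected format.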
