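Processing a list with intermediate state

I am processing a list of strings; you can think of them as the lines of a book. When a line is empty, it must be discarded. When it is a title, it is "saved" as the current title. Every "normal" line must generate an object containing its text and the current title. In the end you have a list of lines, each with its corresponding title.

Example: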

- Chapter 1 

Lorem ipsum dolor sit amet 
consectetur adipisicing elit 

- Chapter 2 

sed do eiusmod tempor 
incididunt u

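The first line is a title, the second line must be discarded, then the next two lines are kept as paragraphs, each with "Chapter 1" as its title. And so on. You end up with a collection similar to: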

{"Lorem ipsum...", "Chapter 1"}, 
{"consectetur...", "Chapter 1"}, 
{"sed do...", "Chapter 2"}, 
{"incididunt ...", "Chapter 2"}

I know the title/paragraph model doesn't make 100% sense, but I simplified it to illustrate the problem.

This is my iterative solution:

let parseText allLines = 
    let mutable currentTitle = String.Empty 
    seq { 
     for line in allLines do 
      match parseLine line with 
      | Empty -> 0 |> ignore 
      | Title caption -> 
       currentTitle <- caption 
      | Body text -> 
        yield new Paragraph(currentTitle, text) 
    }

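The first problem is that I have to discard empty lines. I do this with 0 |> ignore, but it looks quite bad to me. What is the proper way to do this (without pre-filtering the list)?

A tail-recursive version of this function is simple: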

let rec parseText allLines currentTitle paragraphs = 
    match allLines with 
    | [] -> paragraphs 
    | head :: tail -> 
     match head with 
     | Empty -> parseText tail currentTitle paragraphs 
     | Title caption -> parseText tail caption paragraphs 
     | Body text -> parseText tail currentTitle (new Paragraph(currentTitle, text) :: paragraphs)

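Question(s):

Are there significant differences between the two versions (style, performance, etc.)?
Is there a better way to solve this problem? Can it be done with a single List.map?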

2011-09-22 Francesco De Vittori

Although it is not a single List.map, here is the solution I came up with:

let parseText allLines = 
    allLines 
    |> Seq.fold (fun (currentTitle,paragraphs) line -> 
     match parseLine line with 
     | Empty -> currentTitle,paragraphs 
     | Title caption -> caption,paragraphs 
     | Body text -> currentTitle,Paragraph(currentTitle, text)::paragraphs // keep the title for the following body lines
     ) (String.Empty,[]) 
    |> snd

I am using a tuple (currentTitle,paragraphs) as the state. snd is used to extract the result (it is the second element of the state tuple).
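To make the state-passing idea concrete, here is a minimal self-contained sketch of the same fold. The `Line` type is a hypothetical stand-in for whatever `parseLine` returns (the thread never shows it), the folder is pulled out into a named function `step`, and paragraphs are plain (title, text) tuples to keep things short. Because the fold prepends each paragraph, the sketch adds a `List.rev` at the end to restore input order; that step is an addition, not part of the code above.

```fsharp
// Hypothetical stand-in for the result of parseLine; not shown in the thread.
type Line =
    | Empty
    | Title of string
    | Body of string

// One fold step: takes the state (currentTitle, paragraphs) and the next line,
// and returns the new state.
let step (currentTitle, paragraphs) line =
    match line with
    | Empty -> currentTitle, paragraphs
    | Title caption -> caption, paragraphs
    | Body text -> currentTitle, (currentTitle, text) :: paragraphs

let collectParagraphs (lines: seq<Line>) =
    lines
    |> Seq.fold step ("", [])
    |> snd        // drop the title, keep the accumulated paragraphs
    |> List.rev   // the fold prepends, so reverse to restore input order

let result =
    collectParagraphs
        [ Title "Chapter 1"; Empty; Body "Lorem ipsum"; Body "consectetur"
          Title "Chapter 2"; Body "sed do" ]
// result = [("Chapter 1", "Lorem ipsum"); ("Chapter 1", "consectetur"); ("Chapter 2", "sed do")]
```

Unlike a sequence expression, the fold consumes the whole input before returning anything.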

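When you do most of your processing in F#, it is tempting to use lists for everything, but other data structures, even plain sequences, have their uses.

By the way, does your sequence code compile? I had to replace mutable currentTitle = String.Empty with currentTitle = ref String.Empty.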

2011-09-23 21:35:28 Mark

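Now this is very nice! –

You can replace 0 |> ignore with () (unit), which is a no-op. The biggest difference between your two implementations is that the first one is lazy, which may be useful for large inputs.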
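As a minimal sketch of the lazy version with `()` in place of `0 |> ignore`: the `Line` type, the `Paragraph` class and `parseLine` below are hypothetical, since the question does not show them, and the "- " title prefix is only an assumption taken from the sample input. The running title lives in a `ref` cell, which sidesteps the question of whether a captured `let mutable` compiles.

```fsharp
open System

// Hypothetical definitions; the thread never shows parseLine or Paragraph.
type Line =
    | Empty
    | Title of string
    | Body of string

type Paragraph(title: string, text: string) =
    member this.Title = title
    member this.Text = text

// Assumption taken from the sample input: a title line starts with "- ".
let parseLine (line: string) =
    let trimmed = line.Trim()
    if trimmed.Length = 0 then Empty
    elif trimmed.StartsWith("- ") then Title(trimmed.Substring(2))
    else Body trimmed

let parseText (allLines: seq<string>) =
    seq {
        // A ref cell holds the running title for this enumeration.
        let currentTitle = ref String.Empty
        for line in allLines do
            match parseLine line with
            | Empty -> ()   // unit: nothing to do for an empty line
            | Title caption -> currentTitle.Value <- caption
            | Body text -> yield Paragraph(currentTitle.Value, text)
    }

let result =
    [ "- Chapter 1"; ""; "Lorem ipsum dolor sit amet"; "consectetur adipisicing elit"
      ""; "- Chapter 2"; ""; "sed do eiusmod tempor" ]
    |> parseText
    |> Seq.map (fun p -> (p.Title, p.Text))
    |> List.ofSeq
```

Nothing is parsed until the sequence is enumerated (here by `List.ofSeq`), which is what makes this version lazy.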

The following might also work for you (it is the simplest solution I can think of):

let parseText (lines:seq<string>) = 
    lines 
    |> Seq.filter (fun line -> line.Trim().Length > 0) 
    |> Seq.pairwise // takes no function argument; it yields (previous, next) pairs
    |> Seq.map (fun (title, body) -> Paragraph(title, body))

If not, maybe this will work:

let parseText (lines:seq<string>) = 
    lines 
    |> Seq.choose (fun line -> 
        match line.Trim() with 
        | "" | null -> None 
        | Title title -> Some title 
        | Body text -> Some text) 
    |> Seq.pairwise // takes no function argument; it yields (previous, next) pairs
    |> Seq.map (fun (title, body) -> Paragraph(title, body))

2011-09-22 16:03:55 Daniel

I think Seq.pairwise won't do, because I can have n lines of text before the next title. Sorry if I didn't make that clear in the question. –
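A small sketch of why pairing falls short here: `Seq.pairwise` pairs each element with its immediate successor, so once a title is followed by more than one body line, a body line ends up in the title position. The sample strings are made up for illustration.

```fsharp
// Two body lines follow "Chapter 1", so pairing adjacent lines misattributes them.
let lines = [ "Chapter 1"; "Lorem ipsum"; "consectetur"; "Chapter 2"; "sed do" ]

let pairs = lines |> Seq.pairwise |> List.ofSeq
// pairs =
//   [ ("Chapter 1", "Lorem ipsum")
//     ("Lorem ipsum", "consectetur")   // a body line treated as a title
//     ("consectetur", "Chapter 2")     // a title treated as a body line
//     ("Chapter 2", "sed do") ]
```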

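+1 for unit instead of 0 |> ignore. Thanks! –

@Francesco: In that case, I think your mutable solution is as good as it gets (for sequences). A functional solution would be longer and less readable. – Daniel

Here is one such implementation (not tested, but I hope it gives you the idea):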

let isNotEmpty l = match l with 
        | Empty -> false 
        | _ -> true 

let parseText allLines = 
    allLines |> Seq.map parseLine |> Seq.filter isNotEmpty 
    |> Seq.scan (fun (c,t,b) i -> match i with 
            | Title tl -> (0,tl,"") 
            | Body bb -> (1,t,bb) 
            | _ -> (0,t,b)) (0,"","") 
    |> Seq.filter (fun (c,_,_) -> c > 0) 
    |> Seq.map (fun (_,t,b) -> Paragraph(t,b))

2011-09-22 16:57:40 Ankur

Nice! I find this version less readable, but it is still interesting for learning. –
