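Java - Writing a syllable counter based on specifications

Specification for a syllable:

Each group of adjacent vowels (a, e, i, o, u, y) counts as one syllable (for example, the "ea" in "real" contributes one syllable, but the "e...a" in "regal" counts as two syllables). However, an "e" at the end of a word does not count as a syllable. Also, each word has at least one syllable, even if the previous rules give a count of zero.

My countSyllables method: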

public int countSyllables(String word) {
    int count = 0;
    word = word.toLowerCase();
    for (int i = 0; i < word.length(); i++) {
        if (word.charAt(i) == '\"' || word.charAt(i) == '\'' || word.charAt(i) == '-' || word.charAt(i) == ',' || word.charAt(i) == ')' || word.charAt(i) == '(') {
            word = word.substring(0,i)+word.substring(i+1, word.length());
        }
    }
    boolean isPrevVowel = false;
    for (int j = 0; j < word.length(); j++) {
        if (word.contains("a") || word.contains("e") || word.contains("i") || word.contains("o") || word.contains("u")) {
            if (isVowel(word.charAt(j)) && !((word.charAt(j) == 'e') && (j == word.length()-1))) {
                if (isPrevVowel == false) {
                    count++;
                    isPrevVowel = true;
                }
            } else {
                isPrevVowel = false;
            }
        } else {
            count++;
            break;
        }
    }
    return count;
}

The isVowel method, which determines whether a letter is a vowel:

public boolean isVowel(char c) {
    if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u') {
        return true;
    } else {
        return false;
    }
}

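According to a colleague, this should result in 528 syllables when used on this text, but I can't seem to get that number, and I don't know which of us is correct. Please help me develop my method into the correct algorithm, or help show that it is already correct. Thank you.

Source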

2012-02-05 mino

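One problem is that strings are immutable. Try changing word.toLowerCase(); to word = word.toLowerCase(); and see if that changes anything. – 2012-02-05 23:31:21

You also seem to be doing a lot of work to determine word boundaries. Have a look at String's split() method here: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29 as it may simplify things for you. – 2012-02-05 23:34:56

That does give me a different result of 508 syllables (perhaps more correct!). Still not 528, though. Is my solution correct now, or is my colleague's result of 528 correct and there is still a bug in my code? – mino 2012-02-05 23:35:49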

One of the problems might be that you call the toLowerCase method on the input but do not assign the result.

So changing

word.toLowerCase();

to

word = word.toLowerCase();

will certainly help.
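
To illustrate why the assignment matters, here is a minimal sketch. Java strings are immutable, so toLowerCase() returns a new String and leaves the original untouched. The class name LowerCaseDemo and the helper normalize are made up for this example and do not come from the question.

```java
// Hypothetical demo class, for illustration only.
final class LowerCaseDemo {

    // Returns the lower-cased form by using the value toLowerCase() returns.
    static String normalize(String word) {
        return word.toLowerCase();
    }

    public static void main(String[] args) {
        String word = "Regal";

        word.toLowerCase();          // result is discarded, word is unchanged
        System.out.println(word);    // prints "Regal"

        word = word.toLowerCase();   // result is assigned back to the variable
        System.out.println(word);    // prints "regal"
    }
}
```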

Source

2012-02-05 23:36:50

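I strongly suggest that you use Java's String API to its full power. For example, consider String.split(String regex):

http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29

This takes a String and a regular expression, and returns an array of all the substrings, using the regular expression as a delimiter. If you make your regular expression match all consonants or whitespace, you will end up with an array of Strings that are either empty (and therefore do not represent a syllable) or a sequence of vowels (which do represent a syllable). Count up the latter, and you will have a solution.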
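
A rough sketch of the split() idea, applied to one word at a time rather than to the whole text. The class name SplitSyllableCounter is hypothetical, and dropping a final "e" before splitting and stripping non-letters up front are my own assumptions about how to apply the specification.

```java
import java.util.Locale;

// Hypothetical helper class: a sketch, not a definitive implementation.
final class SplitSyllableCounter {

    static int countSyllables(String word) {
        // Lower-case the word and drop anything that is not a plain letter.
        String w = word.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");

        // Per the specification, an 'e' at the end of a word does not count.
        if (w.endsWith("e")) {
            w = w.substring(0, w.length() - 1);
        }

        // Split on runs of non-vowels; what remains are vowel groups
        // plus possibly an empty leading string.
        int count = 0;
        for (String group : w.split("[^aeiouy]+")) {
            if (!group.isEmpty()) {
                count++;
            }
        }

        // Every word has at least one syllable.
        return Math.max(count, 1);
    }
}
```

With this sketch, "real" gives 1 and "regal" gives 2, matching the examples in the specification.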

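Another option that also takes advantage of the String API and regular expressions is replaceAll:

http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29

In this case, you want a regular expression of the form [optional anything that is not a vowel] [one or more vowels] [optional anything that is not a vowel]. Run this regex on your String and replace each match with a single character (e.g. "1"). The end result is that each syllable is replaced by a single character. Then all you need to do is call String.length() and you will know how many syllables you had.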
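
A minimal sketch of the replaceAll() idea for a single word. The class name ReplaceAllSyllableCounter is hypothetical. The handling of a final "e" and of words without any vowel is my own addition, since the regex only matches where there is at least one vowel.

```java
import java.util.Locale;

// Hypothetical helper class: a sketch under the assumptions stated above.
final class ReplaceAllSyllableCounter {

    static int countSyllables(String word) {
        // Lower-case the word and keep only plain letters.
        String w = word.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");

        // An 'e' at the end of a word does not count as a syllable.
        if (w.endsWith("e")) {
            w = w.substring(0, w.length() - 1);
        }

        // [non-vowels]* [vowels]+ [non-vowels]* becomes a single "1".
        String marked = w.replaceAll("[^aeiouy]*[aeiouy]+[^aeiouy]*", "1");

        // If the word had no vowel group, nothing was replaced;
        // the specification still assigns it one syllable.
        if (!marked.matches("1+")) {
            return 1;
        }
        return marked.length();
    }
}
```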

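Depending on the requirements of your solution, these may not work. If this is a homework question about algorithm design, this is almost certainly not the preferred answer, but it does have the benefit of being concise and of making good use of the built-in (and therefore highly optimized) Java APIs.

Source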

2012-02-06 00:01:38 Erica

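This should be easily doable with some regex:

import java.util.regex.Pattern;

// Note the doubled backslash: "\w" on its own is not a valid Java string escape.
Pattern p = Pattern.compile("[aeiouy]+?\\w*?[^e]");
String[] result = p.split(WHAT_EVER_THE_INPUT_IS);
int syllables = result.length;

Please note that this is untested.

Source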

2012-02-06 01:47:56 devsnd

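Not a direct answer (I would give you one if I thought it was constructive; my count was about 538 in the last try), but I will give you a few hints that will be fundamental to creating the answer:

Divide up your problem: read the lines, then split the lines into words, then count the syllables for each word. Afterwards, total them up over all lines.
Think about the order of things: first find all the syllables, counting each one by "walking" through the word. Factor in the special cases afterwards.
During design, use a debugger to step through your code. Chances are pretty high that you will make common mistakes like the one with the toUpperCase() method. Better to find those errors; nobody creates perfect code the first time.
Print to the console (advanced users use a logger and keep the silenced log lines in the final program). Make sure you mark the printlns with comments and remove them from the final implementation. Print things like line numbers and syllable counts so you can visually compare them with the text.
Once you have advanced a bit, you may use Matcher.find (regular expressions) with a Pattern to find the syllables. Regular expressions are difficult beasts to master. A common mistake is to have them do too much.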
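
The hints above can be sketched roughly as follows: text to lines, lines to words, words to syllables, with a per-line printout for checking by eye. All names here (LineSyllableReport, countWord, countLine, countText) are hypothetical. Splitting on whitespace and stripping non-letters is an assumption about what counts as a word, and the regex is deliberately limited to finding vowel groups.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class: a sketch of the divide-and-count approach.
final class LineSyllableReport {

    // One run of adjacent vowels is one syllable.
    private static final Pattern VOWEL_GROUP = Pattern.compile("[aeiouy]+");

    // Counts the syllables of a single, already cleaned word.
    static int countWord(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        // Special case: an 'e' at the end of the word does not count.
        if (w.endsWith("e")) {
            w = w.substring(0, w.length() - 1);
        }
        Matcher m = VOWEL_GROUP.matcher(w);
        int count = 0;
        while (m.find()) {
            count++;
        }
        // Special case: every word has at least one syllable.
        return Math.max(count, 1);
    }

    // Splits a line into words and sums their syllable counts.
    static int countLine(String line) {
        int total = 0;
        for (String token : line.trim().split("\\s+")) {
            String word = token.replaceAll("[^A-Za-z]", "");
            if (!word.isEmpty()) {
                total += countWord(word);
            }
        }
        return total;
    }

    // Splits the text into lines, prints "lineNumber count" for each, returns the total.
    static int countText(String text) {
        String[] lines = text.split("\\R");
        int total = 0;
        for (int i = 0; i < lines.length; i++) {
            int n = countLine(lines[i]);
            System.out.println((i + 1) + " " + n); // debug output, remove in the final version
            total += n;
        }
        return total;
    }
}
```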

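This way you can quickly scan the text. One thing you will soon find out is that you have to deal with the numbers in the text. So you need to check whether a word is actually a word; otherwise, by your rules, it will have at least one syllable.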
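
One possible form of that check, as a rough sketch: treat a token as a word only if it contains at least one letter. The class and method names are hypothetical, and whether a token such as "3rd" should count as a word is a judgment call that the specification does not settle.

```java
// Hypothetical helper class, for illustration only.
final class WordCheck {

    // Returns true if the token contains at least one letter,
    // so that pure numbers and pure punctuation are skipped.
    static boolean isWord(String token) {
        return token.chars().anyMatch(Character::isLetter);
    }
}
```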

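If you feel that you are repeating things, such as the isVowel and String.contains() calls that use the same set of characters, you are probably doing something wrong. Repetition in source code is a code smell.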
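
As a small sketch of removing that repetition, the vowel set can be defined once and reused by both checks. The class name VowelSet and the method containsVowel are made up for this example, and including 'y' follows the specification in the question rather than the original isVowel method.

```java
// Hypothetical helper class: one definition of the vowel set, used everywhere.
final class VowelSet {

    private static final String VOWELS = "aeiouy";

    // True if the character is one of the vowels, ignoring case.
    static boolean isVowel(char c) {
        return VOWELS.indexOf(Character.toLowerCase(c)) >= 0;
    }

    // True if the word contains at least one vowel; built on isVowel,
    // so the character set is not written out a second time.
    static boolean containsVowel(String word) {
        for (int i = 0; i < word.length(); i++) {
            if (isVowel(word.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```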

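Using regular expressions, I counted about 538 (on the fourth go), but I have not really checked each syllable (of course). The counts per line (line number, then syllables) were: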

1 14 
2 17 
3 17 
4 15 
5 15 
6 14 
7 16 
8 19 
9 17 
10 17 
11 16 
12 19 
13 18 
14 15 
15 18 
16 15 
17 16 
18 17 
19 16 
20 17 
21 17 
22 19 
23 17 
24 16 
25 17 
26 17 
27 16 
28 17 
29 15 
30 17 
31 19 
32 23 
33 0 

--- total --- 
538

Source

2012-02-06 23:19:43

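I have just invented a new way to count syllables in Java.

My new library, The Lawrence Style Checker, can be viewed here: https://github.com/troywatson/Lawrence-Style-Checker

I counted the syllables of every word in your text with my program and posted the results here: http://pastebin.com/LyiBTcbb

With my dictionary method of counting syllables I got: 528 syllables in total.

This is exactly the number the questioner gave as the correct syllable count. But I still question this number, for the following reasons:

Strike rate: 99.4% correct

Words wrong: 2 / 337 words

Words wrong and their wrong syllable counts: {resinous: 4, aardwolf: 3}

Here is my code: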

Lawrence lawrence = new Lawrence();

// Turn the text into an array of sentences.
String sentences = "";
String[] sentences2 = sentences.split("(?<=[a-z])\\.\\s+");

int count = 0;

for (String sentence : sentences2) {
    sentence = sentence.replace("-", " "); // split double words
    for (String word : sentence.split(" ")) {

        // Get rid of punctuation marks and spaces.
        word = lawrence.cleanWord(word);

        // If the word is empty, skip it.
        if (word.length() < 1)
            continue;

        // Print out the word and its syllable count on one line.
        System.out.print(word + ",");
        System.out.println(lawrence.getSyllable(word));
        count += lawrence.getSyllable(word);
    }
}
System.out.println(count);

Boom!

Source

2015-09-25 14:06:30 troy

Lawrence is keyword-based, not rule-based. The question asks for a specification-based checker, not a keyword-based one. – 2017-05-09 19:29:45

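Here is my implementation for counting syllables:

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

protected int countSyllables(String word)
{
    // getNumSyllables method in BasicDocument (module 1) and
    // EfficientDocument (module 2).
    int syllables = 0;
    word = word.toLowerCase();
    // Presumably compensates for "the", whose only vowel is removed by the split below.
    if (word.contains("the ")) {
        syllables++;
    }
    // Split on an 'e' that ends a word (before punctuation, a space, or the end of input).
    String[] split = word.split("e!$|e[?]$|e,|e |e[),]|e$");

    ArrayList<String> tokens = new ArrayList<String>();
    // Each run of adjacent vowels is collected as one token (one syllable).
    Pattern tokSplitter = Pattern.compile("[aeiouy]+");

    for (int i = 0; i < split.length; i++) {
        String s = split[i];
        Matcher m = tokSplitter.matcher(s);

        while (m.find()) {
            tokens.add(m.group());
        }
    }

    syllables += tokens.size();
    return syllables;
}

It works fine for me.

Source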

2016-05-27 04:24:15

private static int countSyllables(String word)
{
    //System.out.print("Counting syllables in " + word + "...");
    int numSyllables = 0;
    boolean newSyllable = true;
    String vowels = "aeiouy";
    char[] cArray = word.toCharArray();
    for (int i = 0; i < cArray.length; i++)
    {
        if (i == cArray.length-1 && Character.toLowerCase(cArray[i]) == 'e'
                && newSyllable && numSyllables > 0) {
            numSyllables--;
        }
        if (newSyllable && vowels.indexOf(Character.toLowerCase(cArray[i])) >= 0) {
            newSyllable = false;
            numSyllables++;
        }
        else if (vowels.indexOf(Character.toLowerCase(cArray[i])) < 0) {
            newSyllable = true;
        }
    }
    //System.out.println("found " + numSyllables);
    return numSyllables;
}

Another implementation can be found at the following pastebin link: https://pastebin.com/q6rdyaEd

Source

2017-04-16 17:39:52 DISHA
