Tokenize method: splitting a string into an array

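I have been struggling to write a program. Basically, we have to write a program that translates an English sentence into Pig Latin. The first method we need tokenizes the string, and we are not allowed to use the split method that is normally used in Java. I have had no luck with this over the past 2 days; here is what I have so far: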

public class PigLatin
{
    public static void main(String[] args)
    {
        String s = "Hello there my name is John";
        Tokenize(s);
    }

    public static String[] Tokenize(String english)
    {
        String[] tokenized = new String[english.length()];
        for (int i = 0; i < english.length(); i++)
        {
            int j = 0;
            while (english.charAt(i) != ' ')
            {
                String m = "";
                m = m + english.charAt(i);
                if (english.charAt(i) == ' ')
                {
                    j++;
                }
                else
                {
                    break;
                }
            }
            for (int l = 0; l < tokenized.length; l++) {
                System.out.print(tokenized[l] + ", ");
            }
        }
        return tokenized;
    }
}

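All this really does is print an extremely long array of "null"s. If anyone can offer any advice, I would really appreciate it!

Thanks in advance. Update: we are supposed to assume there will be no punctuation or extra spaces, so basically wherever there is a space, a new word begins.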

2015-02-23 IH9522

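If you are allowed to use StringTokenizer, it does the same thing as split, but lets you iterate over the tokens it creates. – 2015-02-23 03:04:20

Is it possible that you misunderstood the assignment? It sounds like you are supposed to create a 'StringTokenizer' object. – Dando18 2015-02-23 03:06:34

m looks like it is declared at the wrong level; it should be outside the while loop. You also never do anything with it once it has been filled. – 2015-02-23 03:07:31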

If I understand your question, and what your Tokenize is intended to do, then I would start by writing a function to split the String

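import java.util.ArrayList;
import java.util.List;

// Splits str on whitespace; runs of whitespace never produce empty tokens.
static String[] splitOnWhiteSpace(String str) {
    List<String> al = new ArrayList<>();
    StringBuilder sb = new StringBuilder();
    for (char ch : str.toCharArray()) {
        if (Character.isWhitespace(ch)) {
            // A whitespace character ends the current word, if there is one.
            if (sb.length() > 0) {
                al.add(sb.toString());
                sb.setLength(0);
            }
        } else {
            sb.append(ch);
        }
    }
    // The last word is not followed by whitespace, so add it here.
    if (sb.length() > 0) {
        al.add(sb.toString());
    }
    String[] ret = new String[al.size()];
    return al.toArray(ret);
}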

and then print the result using Arrays.toString(Object[]), like

import java.util.Arrays;

public static void main(String[] args) {
    String s = "Hello there my name is John";
    String[] words = splitOnWhiteSpace(s);
    System.out.println(Arrays.toString(words));
}

2015-02-23 03:07:47

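If you are allowed to use a StringTokenizer object (which I think is what the assignment is asking for), it would look like this: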

import java.util.StringTokenizer;

StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}

This will produce the output:

this 
is 
a 
test

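Taken from here.

The tokenizer splits the string into tokens and hands them out one at a time. The while loop iterates over the tokens, and that is where you can apply your Pig Latin logic.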
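Since the question asks for an array rather than printed output, here is a minimal sketch of how the same loop could fill a String[]. The class name TokenizerToArray and the method name tokenize are my own, not part of the assignment, and this assumes StringTokenizer is allowed. It relies on countTokens(), which reports how many tokens are still left, to size the array.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

// Hypothetical helper class, named here only for illustration.
class TokenizerToArray {

    // Collects the whitespace-separated tokens of a sentence into an array.
    static String[] tokenize(String english) {
        StringTokenizer st = new StringTokenizer(english);
        // countTokens() returns the number of tokens still available,
        // so the array can be sized exactly before any token is consumed.
        String[] words = new String[st.countTokens()];
        for (int i = 0; i < words.length; i++) {
            words[i] = st.nextToken();
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Hello, there, my, name, is, John]
        System.out.println(Arrays.toString(tokenize("Hello there my name is John")));
    }
}
```

The Pig Latin step would then work on each element of the returned array.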

2015-02-23 03:11:08 Dando18

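Here are some hints for getting the "manual split" to work.

There is a method String#indexOf(int ch, int fromIndex) that helps you find the next occurrence of a character.
There is a method String#substring(int beginIndex, int endIndex) that extracts a part of the string.

Below is some pseudocode showing how to split it (you will need some more safety handling, which I leave to you)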

List<String> results = ...;
int startIndex = 0;
int endIndex = 0;

while (startIndex < inputString.length) {
    endIndex = index of the next space at or after startIndex
    if no space found {
        endIndex = inputString.length
    }
    String result = substring of inputString from startIndex up to, but not including, endIndex
    results.add(result)
    startIndex = endIndex + 1 // move startIndex to the position after the space
}

// here, results contains all the split words
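For reference, the pseudocode above could be turned into Java roughly as follows. This is only a sketch: the class name ManualSplit and the method name splitOnSpaces are made up for illustration, and the check that skips empty words is one possible form of the "safety handling" mentioned above. Note that the end index passed to substring is exclusive, so the index of the space itself is passed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class ManualSplit {

    // Splits input on single space characters using indexOf and substring.
    static List<String> splitOnSpaces(String input) {
        List<String> results = new ArrayList<>();
        int startIndex = 0;
        while (startIndex < input.length()) {
            // indexOf returns -1 when there is no further space.
            int endIndex = input.indexOf(' ', startIndex);
            if (endIndex == -1) {
                endIndex = input.length();
            }
            // Skip empty words caused by leading or repeated spaces.
            if (endIndex > startIndex) {
                // The end index of substring is exclusive, so the space is left out.
                results.add(input.substring(startIndex, endIndex));
            }
            startIndex = endIndex + 1; // continue after the space
        }
        return results;
    }

    public static void main(String[] args) {
        // prints [Hello, there, my, name, is, John]
        System.out.println(splitOnSpaces("Hello there my name is John"));
    }
}
```

If the assignment insists on an array, results.toArray(new String[0]) converts the list at the end.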

2015-02-23 03:18:29

import java.util.ArrayList;

String english = "hello my fellow friend";
ArrayList<String> tokenized = new ArrayList<String>();
String m = "";
for (int i = 0; i < english.length(); i++)
{
    // The order of the conditions matters here: if you swap them,
    // english.charAt(i) can throw an index-out-of-bounds exception.
    while (i < english.length() && english.charAt(i) != ' ')
    {
        m = m + english.charAt(i);
        i++;
    }
    // Add to the array list only if a word was collected;
    // if the character was just ' ', m is empty and nothing is added.
    if (m.length() > 0)
    {
        tokenized.add(m);
        m = "";
    }
}
// print the array list
for (int l = 0; l < tokenized.size(); l++) {
    System.out.print(tokenized.get(l) + ", ");
}

This prints "hello, my, fellow, friend, ". I used an array list because the length of the array is not known up front.
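If collections are off limits as well, the array length can be worked out beforehand. The sketch below leans on the assumption stated in the question (no punctuation and no extra spaces), so the number of words is the number of spaces plus one. The class name CountThenFill is hypothetical, and the code does not handle leading, trailing, or doubled spaces correctly.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this sketch.
class CountThenFill {

    // Assumes words are separated by exactly one space, with none at the ends.
    static String[] tokenize(String english) {
        if (english.isEmpty()) {
            return new String[0];
        }
        // First pass: count the spaces to learn how many words there are.
        int count = 1;
        for (int i = 0; i < english.length(); i++) {
            if (english.charAt(i) == ' ') {
                count++;
            }
        }
        // Second pass: build each word and store it when a space is reached.
        String[] tokenized = new String[count];
        StringBuilder word = new StringBuilder();
        int j = 0; // index of the next free slot in tokenized
        for (int i = 0; i < english.length(); i++) {
            char c = english.charAt(i);
            if (c == ' ') {
                tokenized[j++] = word.toString();
                word.setLength(0);
            } else {
                word.append(c);
            }
        }
        tokenized[j] = word.toString(); // the last word has no space after it
        return tokenized;
    }

    public static void main(String[] args) {
        // prints [hello, my, fellow, friend]
        System.out.println(Arrays.toString(tokenize("hello my fellow friend")));
    }
}
```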

2015-02-23 03:24:43 epipav
