Sorting string occurrences from a text file

I have stored the strings from a file in an ArrayList and used a HashSet to count the number of occurrences of each string.

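I want to list the top 5 words and their occurrence counts. I should be able to do this without implementing a hash table, tree map, etc. How do I go about it?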

Here is my ArrayList:

List<String> word_list = new ArrayList<String>();

while (INPUT_TEXT1.hasNext()) {
    String input_word = INPUT_TEXT1.next();
    word_list.add(input_word);
}

INPUT_TEXT1.close();

int word_list_length = word_list.size();

System.out.println("There are " + word_list_length + " words in the .txt file");
System.out.println("\n\n");

System.out.println("word_list's elements are: ");

for (int i = 0; i < word_list.size(); i++) {
    System.out.print(word_list.get(i) + " ");
}

System.out.println("\n\n");

Here is my HashSet:

Set<String> unique_word = new HashSet<String>(word_list);

int number_of_unique = unique_word.size();

System.out.println("unique words are: ");

for (String e : unique_word) {
    System.out.print(e + " ");
}

System.out.println("\n\n");

// Parallel arrays: word[i] occurs freq[i] times
String[] word = new String[number_of_unique];
int[] freq = new int[number_of_unique];

int count = 0;

System.out.println("Frequency counts : ");

for (String e : unique_word) {
    word[count] = e;
    freq[count] = Collections.frequency(word_list, e);

    System.out.println(word[count] + " : " + freq[count] + " time(s)");
    count++;
}

Am I taking one step too many? Thanks in advance.

Source

2016-11-20 codeREXO

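Create an inner class, say Z, with two fields (word and count) that implements 'Comparable' and overrides the 'hashCode()' and 'equals()' methods. Create a Set of instances of that class; if the set already contains the object, get it and increment its count. Use 'Collections.sort()' to sort it. There you go. – GurV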

That said, a HashMap is probably the better approach. – GurV
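The inner-class approach described in the comment above could be sketched roughly as follows. This is only one possible reading of it: the names `WordCountSketch`, `WordCount`, and `topWords` are made up for illustration, and because `Set` has no `get` method, the sketch keeps the instances in a `List` and looks them up with `indexOf` (which relies on `equals()`) instead of using a Set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class WordCountSketch {

    // Hypothetical helper class: pairs a word with its running count.
    static class WordCount implements Comparable<WordCount> {
        final String word;
        int count;

        WordCount(String word) {
            this.word = word;
            this.count = 1;
        }

        // Equality is based on the word only, so lookups ignore the count.
        @Override
        public boolean equals(Object other) {
            return other instanceof WordCount && word.equals(((WordCount) other).word);
        }

        @Override
        public int hashCode() {
            return word.hashCode();
        }

        // Higher counts sort first; ties are broken alphabetically.
        @Override
        public int compareTo(WordCount other) {
            int byCount = Integer.compare(other.count, count);
            return byCount != 0 ? byCount : word.compareTo(other.word);
        }

        @Override
        public String toString() {
            return word + " : " + count + " time(s)";
        }
    }

    // Returns up to `limit` of the most frequent words, most frequent first.
    static List<WordCount> topWords(List<String> words, int limit) {
        List<WordCount> counts = new ArrayList<>();
        for (String w : words) {
            int index = counts.indexOf(new WordCount(w)); // linear search via equals()
            if (index >= 0) {
                counts.get(index).count++;
            } else {
                counts.add(new WordCount(w));
            }
        }
        Collections.sort(counts); // uses compareTo()
        return new ArrayList<>(counts.subList(0, Math.min(limit, counts.size())));
    }

    public static void main(String[] args) {
        // Sample data standing in for the words read from the file.
        List<String> wordList = Arrays.asList("a", "b", "a", "c", "b", "a");
        for (WordCount wc : topWords(wordList, 5)) {
            System.out.println(wc);
        }
    }
}
```

The linear `indexOf` lookup makes this quadratic in the number of distinct words, which is one reason a map-based count is usually preferred for larger files.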

Apache Commons has a simple implementation of this, using 'HashBag'. – ifly6

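You can do this with a HashMap (holding each unique word as the key and its frequency as the value) and then sort by value in reverse order, as described in the steps below: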

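(1) Load word_list with the words

(2) Find the unique words in word_list

(3) Store the unique words in a HashMap, with the unique word as the key and its frequency as the value

(4) Sort the HashMap entries by value (frequency)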

You can refer to the code below:

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Stream;

public class TopWords { // class name chosen for illustration

    public static void main(String[] args) {

        List<String> word_list = new ArrayList<>();
        // Load your words into word_list here

        // Find the unique words in the list
        String[] uniqueWords = word_list.stream().distinct()
                .toArray(String[]::new);
        Map<String, Integer> wordsMap = new HashMap<>();

        // Put each unique word into the map as key, with its frequency as value
        for (String uniqueWord : uniqueWords) {
            int frequency = Collections.frequency(word_list, uniqueWord);
            System.out.println(uniqueWord + " occurred " + frequency + " times");
            wordsMap.put(uniqueWord, frequency);
        }

        // Sort the entries by frequency (the map's values) in descending order
        Stream<Entry<String, Integer>> topWords = wordsMap.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(5);

        // Print the top 5 words to the console
        System.out.println("Top 5 Words:::");
        topWords.forEach(System.out::println);
    }
}
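A possible variation on the code above: calling `Collections.frequency` once per unique word rescans the whole list each time, so the counts can instead be gathered in a single pass with `Map.merge`. The sketch below assumes Java 8 or later; the class and method names (`SinglePassCount`, `countWords`, `top`) are hypothetical, and the alphabetical tie-break is an added choice, not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SinglePassCount {

    // Counts every word in one pass over the list.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            // Inserts 1 for a new word, otherwise adds 1 to the existing count.
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Returns up to `limit` entries, highest count first, ties in alphabetical order.
    static List<Map.Entry<String, Integer>> top(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries.subList(0, Math.min(limit, entries.size()));
    }

    public static void main(String[] args) {
        // Sample data standing in for the words read from the file.
        List<String> wordList = Arrays.asList("b", "a", "b", "c", "a", "b");
        for (Map.Entry<String, Integer> e : top(countWords(wordList), 5)) {
            System.out.println(e.getKey() + " : " + e.getValue() + " time(s)");
        }
    }
}
```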

Source

2016-11-20 05:35:39 developer

Using Java 8, with all the code in a single block:

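// Requires these static imports:
//   import static java.util.Comparator.reverseOrder;
//   import static java.util.function.Function.identity;
//   import static java.util.stream.Collectors.counting;
//   import static java.util.stream.Collectors.groupingBy;
// `words` is assumed to be the List<String> of words read from the file.
Stream<Map.Entry<String, Long>> topWords =
        words.stream()
             .map(String::toLowerCase)
             .collect(groupingBy(identity(), counting()))
             .entrySet().stream()
             .sorted(Map.Entry.<String, Long>comparingByValue(reverseOrder())
                     .thenComparing(Map.Entry.comparingByKey()))
             .limit(5);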

Iterate over the stream:

topWords.forEach(m -> {
    // println puts each word on its own line
    System.out.println(m.getKey() + " : " + m.getValue() + " time(s)");
});

Source

2016-11-20 06:51:56
