2015-04-22 116 views
2

Calculating an average from a CSV file

I have a CSV file in the following format:

City,Job,Salary
Delhi,Doctors,500
Delhi,Lawyers,400
Delhi,Plumbers,100
London,Doctors,800
London,Lawyers,700
London,Plumbers,300
Tokyo,Doctors,900
Tokyo,Lawyers,800
Tokyo,Plumbers,400
Lawyers,Doctors,300
Lawyers,Lawyers,400
Lawyers,Plumbers,500
Hong Kong,Doctors,1800
Hong Kong,Lawyers,1100
Hong Kong,Plumbers,1000
Moscow,Doctors,300
Moscow,Lawyers,200
Moscow,Plumbers,100
Berlin,Doctors,800
Berlin,Plumbers,900
Paris,Doctors,900
Paris,Lawyers,800
Paris,Plumbers,500
Paris,Dog catchers,400

I want to find the average of all the salaries.

This is my code:

import java.io.*;

public class A {

    public static void main(String args[])
    {
        A a = new A();
        a.run();
    }

    public void run()
    {
        String csv = "C:\\Users\\Dipayan\\Desktop\\salaries.csv";
        BufferedReader br = null;
        String line = "";
        int sum = 0;
        int count = 0;
        //String a=new String();

        try {
            br = new BufferedReader(new FileReader(csv));
            try {
                while ((line = br.readLine()) != null) {
                    // use comma as separator
                    String[] country = line.split(",");
                    int sal = Integer.parseInt(country[2]);
                    sum = sum + sal;
                    count++;
                    //System.out.println("Salary [job= " + country[0]
                    //  + " , salary=" + country[2] + "]");
                }
            } catch (NumberFormatException | IOException e) {
                System.out.println("NA");
                e.printStackTrace();
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        System.out.println(sum/count);

        System.out.println("Done");
    }
}

However, it fails with this error:

java.lang.NumberFormatException: For input string: "Salary"
    at java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at A.run(A.java:30)
    at A.main(A.java:9)
Exception in thread "main" java.lang.ArithmeticException: / by zero
    at A.run(A.java:46)
    at A.main(A.java:9)

Is there a better or shorter way to parse a CSV file?

+0

You need to treat the first line, which is your header, differently. "Salary" is obviously not a number. – Leon

Answers

1

The first line contains the word "Salary" in the third position. Put a br.readLine() before the loop and everything should be fine.

You have:

br = new BufferedReader(new FileReader(csv)); 
try { 
    while ((line = br.readLine()) != null) { 

Change it to:

br = new BufferedReader(new FileReader(csv)); 
try { 
    br.readLine(); // skip the header row; kept inside the try because readLine() can throw IOException 
    while ((line = br.readLine()) != null) { 
+0

Please write the code for that line. I didn't get you! –

+1

Added the code. –

+0

Thanks a lot! –

1

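Skip the first line of the CSV file. Do an extra

br.readLine() 

before the loop.

You may also want to add some format checks to make sure the file you are reading is well-formed.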
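
One way such a format check could look is sketched below. It is only an illustration: the class name SalaryLineCheck and the helper parseSalary are made up for this example, and it assumes that a valid row has exactly three comma-separated fields with an integer in the third.

```java
import java.util.OptionalInt;

// Hypothetical helper for illustration; not part of the question's code.
class SalaryLineCheck {

    /**
     * Returns the salary from one CSV row, or an empty OptionalInt when the
     * row does not have exactly three fields or the salary is not an integer.
     */
    static OptionalInt parseSalary(String line) {
        if (line == null) {
            return OptionalInt.empty();
        }
        // A limit of -1 keeps trailing empty fields, so "Delhi,Doctors," still has 3 parts.
        String[] fields = line.split(",", -1);
        if (fields.length != 3) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(fields[2].trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseSalary("Delhi,Doctors,500")); // a valid row
        System.out.println(parseSalary("City,Job,Salary"));   // the header row
        System.out.println(parseSalary("Berlin,Plumbers"));   // too few fields
    }
}
```

With a check like this the header row is rejected as well, because "Salary" is not an integer. Skipping the header explicitly is still clearer to the next reader.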

0

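A br.readLine() before the while loop avoids the header-row problem, but if your data is malformed you will get the same exception again. To make this safer, you can change this line:

int sal=Integer.parseInt(country[2]); 

to a try-catch block, so that the loop keeps iterating through the whole file even when a value is not a valid number: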

int sal; 
try { 
    sal=Integer.parseInt(country[2]); 
} catch (NumberFormatException e) { 
    // if you want, you can show an error message here 
    // to tell the user that the value is not a valid number 
    continue; // skip this row, so sal is never used unassigned 
} 
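
Putting the pieces together, the whole loop might look like the sketch below. The class SalaryAverage and the method averageSalary are names invented for this example. It reads from any Reader so that it can be tried without the file from the question, and it returns an OptionalDouble as one possible way to avoid the division by zero.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.OptionalDouble;

// Hypothetical class for illustration; names are not from the question.
class SalaryAverage {

    /** Averages the third column, skipping the header and any malformed rows. */
    static OptionalDouble averageSalary(Reader source) throws IOException {
        long sum = 0;
        int count = 0;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(source)) {
            br.readLine(); // discard the header row
            String line;
            while ((line = br.readLine()) != null) {
                String[] fields = line.split(",");
                if (fields.length < 3) {
                    continue; // not enough columns
                }
                try {
                    sum += Integer.parseInt(fields[2].trim());
                    count++;
                } catch (NumberFormatException e) {
                    // not a number: ignore this row and keep reading
                }
            }
        }
        // An empty result avoids the "/ by zero" seen in the question.
        return count == 0 ? OptionalDouble.empty() : OptionalDouble.of((double) sum / count);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample with one deliberately bad salary value.
        String sample = "City,Job,Salary\nDelhi,Doctors,500\nDelhi,Lawyers,400\nDelhi,Plumbers,n/a\n";
        System.out.println(averageSalary(new StringReader(sample)));
    }
}
```

Note that sum/count in the question is an integer division and truncates the result. The cast to double here keeps the fractional part.
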
0

First, use a CSV parser. I will use OpenCSV in this example. I have no affiliation with OpenCSV; it is simply what I currently have in my POM.

Start by creating a class:

public class Salary { 
    private String city; 
    private String job; 
    private long salary; 

    public String getCity() { 
        return city; 
    } 

    public void setCity(String city) { 
        this.city = city; 
    } 

    public String getJob() { 
        return job; 
    } 

    public void setJob(String job) { 
        this.job = job; 
    } 

    public long getSalary() { 
        return salary; 
    } 

    public void setSalary(long salary) { 
        this.salary = salary; 
    } 
} 

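Your CSV has three columns, and the CSV header matches the property names of our bean, so we can simply use a HeaderColumnNameMappingStrategy to determine which properties to set on the bean: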

final HeaderColumnNameMappingStrategy<Salary> mappingStrategy = new HeaderColumnNameMappingStrategy<>(); 
mappingStrategy.setType(Salary.class); 

Now we just need to parse the CSV file into a List of our beans:

final CsvToBean<Salary> csvToBean = new CsvToBean<>(); 
try (final Reader reader = ...) { 
    final List<Salary> salaries = csvToBean.parse(mappingStrategy, reader); 
} 

OK.

Now, how do you get the average salary out of this mess? Just use a Java 8 Stream on the result:

final LongSummaryStatistics statistics = salaries.stream() 
      .mapToLong(Salary::getSalary) 
      .summaryStatistics(); 

Now we can get all sorts of useful information:

final long min = statistics.getMin(); 
final double average = statistics.getAverage(); 
final long max = statistics.getMax();
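
To try the statistics step on its own, without OpenCSV on the classpath, a small sketch like the following may help. The class SalaryStatsDemo and the method summarize are made-up names, and the list is filled by hand with the three Delhi salaries from the question instead of being parsed from the file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical demo class; it only exercises the JDK's LongSummaryStatistics.
class SalaryStatsDemo {

    /** Collects count, sum, min, max and average in a single pass. */
    static LongSummaryStatistics summarize(List<Long> salaries) {
        return salaries.stream()
                .mapToLong(Long::longValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        // The three Delhi salaries from the question.
        List<Long> delhi = Arrays.asList(500L, 400L, 100L);
        LongSummaryStatistics stats = summarize(delhi);
        System.out.println("min=" + stats.getMin());
        System.out.println("max=" + stats.getMax());
        System.out.println("average=" + stats.getAverage());
    }
}
```

If the list is empty, getAverage() returns 0 and getMin() returns Long.MAX_VALUE, so it is worth checking getCount() before trusting the other values.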