2016-07-07

Answer


According to the spark-user mailing list archives, in Scala:

sc.hadoopConfiguration.setInt("dfs.blocksize", some_value) 
sc.hadoopConfiguration.setInt("parquet.block.size", some_value) 

So in PySpark, the same Hadoop configuration is reachable through the underlying JavaSparkContext:

sc._jsc.hadoopConfiguration().setInt("dfs.blocksize", some_value) 
sc._jsc.hadoopConfiguration().setInt("parquet.block.size", some_value)
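As a minimal sketch of how this might be used in practice: the helper below sets both block sizes before writing a DataFrame out as Parquet. The function name, parameters, and the 128 MB default are illustrative assumptions, not part of the original answer; `_jsc` is an internal PySpark attribute and may change between versions.

```python
def write_parquet_with_block_size(spark, df, path,
                                  block_size=128 * 1024 * 1024):
    """Write `df` as Parquet after setting HDFS and Parquet block sizes.

    Hypothetical helper: `spark` is an active SparkSession, `path` is the
    output location, and `block_size` is in bytes (default 128 MB).
    Accesses the JVM Hadoop configuration via the internal `_jsc` handle,
    as in the answer above.
    """
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
    hadoop_conf.setInt("dfs.blocksize", block_size)
    hadoop_conf.setInt("parquet.block.size", block_size)
    df.write.parquet(path)
```

Matching `dfs.blocksize` and `parquet.block.size` is deliberate: a Parquet row group that equals the HDFS block size can be read from a single block without remote reads.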