2015-10-05

I read the metrics section on the spark website. I want to try it out with the wordcount example, but I cannot make it work. (Spark metrics on wordcount example)

spark/conf/metrics.properties:

# Enable CsvSink for all instances 
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink 

# Polling period for CsvSink 
*.sink.csv.period=1 

*.sink.csv.unit=seconds 

# Polling directory for CsvSink 
*.sink.csv.directory=/home/spark/Documents/test/ 

# Worker instance overlap polling period 
worker.sink.csv.period=1 

worker.sink.csv.unit=seconds 

# Enable jvm source for instance master, worker, driver and executor 
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource 

worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource 

driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource 

executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource 

I run my application locally, as in the documentation:

$SPARK_HOME/bin/spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar 

I checked /home/spark/Documents/test/ and it is empty.

What am I missing?


Shell:

$SPARK_HOME/bin/spark-submit --class "SimpleApp" --master local[4] --conf spark.metrics.conf=/home/spark/development/spark/conf/metrics.properties target/scala-2.10/simple-project_2.10-1.0.jar 
Spark assembly has been built with Hive, including Datanucleus jars on classpath 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
INFO SparkContext: Running Spark version 1.3.0 
WARN Utils: Your hostname, cv-local resolves to a loopback address: 127.0.1.1; using 192.168.1.64 instead (on interface eth0) 
WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
INFO SecurityManager: Changing view acls to: spark 
INFO SecurityManager: Changing modify acls to: spark 
INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); users with modify permissions: Set(spark) 
INFO Slf4jLogger: Slf4jLogger started 
INFO Remoting: Starting remoting 
INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:35895] 
INFO Utils: Successfully started service 'sparkDriver' on port 35895. 
INFO SparkEnv: Registering MapOutputTracker 
INFO SparkEnv: Registering BlockManagerMaster 
INFO DiskBlockManager: Created local directory at /tmp/spark-447d56c9-cfe5-4f9d-9e0a-6bb476ddede6/blockmgr-4eaa04f4-b4b2-4b05-ba0e-fd1aeb92b289 
INFO MemoryStore: MemoryStore started with capacity 265.4 MB 
INFO HttpFileServer: HTTP File server directory is /tmp/spark-fae11cd2-937e-4be3-a273-be8b4c4847df/httpd-ca163445-6fff-45e4-9c69-35edcea83b68 
INFO HttpServer: Starting HTTP Server 
INFO Utils: Successfully started service 'HTTP file server' on port 52828. 
INFO SparkEnv: Registering OutputCommitCoordinator 
INFO Utils: Successfully started service 'SparkUI' on port 4040. 
INFO SparkUI: Started SparkUI at http://cv-local.local:4040 
INFO SparkContext: Added JAR file:/home/spark/workspace/IdeaProjects/wordcount/target/scala-2.10/simple-project_2.10-1.0.jar at http://192.168.1.64:52828/jars/simple-project_2.10-1.0.jar with timestamp 1444049152348 
INFO Executor: Starting executor ID <driver> on host localhost 
INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://[email protected]:35895/user/HeartbeatReceiver 
INFO NettyBlockTransferService: Server created on 60320 
INFO BlockManagerMaster: Trying to register BlockManager 
INFO BlockManagerMasterActor: Registering block manager localhost:60320 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 60320) 
INFO BlockManagerMaster: Registered BlockManager 
INFO MemoryStore: ensureFreeSpace(34046) called with curMem=0, maxMem=278302556 
INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 33.2 KB, free 265.4 MB) 
INFO MemoryStore: ensureFreeSpace(5221) called with curMem=34046, maxMem=278302556 
INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.1 KB, free 265.4 MB) 
INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:60320 (size: 5.1 KB, free: 265.4 MB) 
INFO BlockManagerMaster: Updated info of block broadcast_0_piece0 
INFO SparkContext: Created broadcast 0 from textFile at SimpleApp.scala:11 
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
WARN LoadSnappy: Snappy native library not loaded 
INFO FileInputFormat: Total input paths to process : 1 
INFO SparkContext: Starting job: count at SimpleApp.scala:12 
INFO DAGScheduler: Got job 0 (count at SimpleApp.scala:12) with 2 output partitions (allowLocal=false) 
INFO DAGScheduler: Final stage: Stage 0(count at SimpleApp.scala:12) 
INFO DAGScheduler: Parents of final stage: List() 
INFO DAGScheduler: Missing parents: List() 
INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[2] at filter at SimpleApp.scala:12), which has no missing parents 
INFO MemoryStore: ensureFreeSpace(2848) called with curMem=39267, maxMem=278302556 
INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.8 KB, free 265.4 MB) 
INFO MemoryStore: ensureFreeSpace(2056) called with curMem=42115, maxMem=278302556 
INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 265.4 MB) 
INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:60320 (size: 2.0 KB, free: 265.4 MB) 
INFO BlockManagerMaster: Updated info of block broadcast_1_piece0 
INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839 
INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MapPartitionsRDD[2] at filter at SimpleApp.scala:12) 
INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks 
INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1391 bytes) 
INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1391 bytes) 
INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 
INFO Executor: Running task 1.0 in stage 0.0 (TID 1) 
INFO Executor: Fetching http://192.168.1.64:52828/jars/simple-project_2.10-1.0.jar with timestamp 1444049152348 
INFO Utils: Fetching http://192.168.1.64:52828/jars/simple-project_2.10-1.0.jar to /tmp/spark-cab5a940-e2a4-4caf-8549-71e1518271f1/userFiles-c73172c2-7af6-4861-a945-b183edbbafa1/fetchFileTemp4229868141058449157.tmp 
INFO Executor: Adding file:/tmp/spark-cab5a940-e2a4-4caf-8549-71e1518271f1/userFiles-c73172c2-7af6-4861-a945-b183edbbafa1/simple-project_2.10-1.0.jar to class loader 
INFO CacheManager: Partition rdd_1_1 not found, computing it 
INFO CacheManager: Partition rdd_1_0 not found, computing it 
INFO HadoopRDD: Input split: file:/home/spark/development/spark/conf/metrics.properties:2659+2659 
INFO HadoopRDD: Input split: file:/home/spark/development/spark/conf/metrics.properties:0+2659 
INFO MemoryStore: ensureFreeSpace(7840) called with curMem=44171, maxMem=278302556 
INFO MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 7.7 KB, free 265.4 MB) 
INFO BlockManagerInfo: Added rdd_1_0 in memory on localhost:60320 (size: 7.7 KB, free: 265.4 MB) 
INFO BlockManagerMaster: Updated info of block rdd_1_0 
INFO MemoryStore: ensureFreeSpace(8648) called with curMem=52011, maxMem=278302556 
INFO MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 8.4 KB, free 265.4 MB) 
INFO BlockManagerInfo: Added rdd_1_1 in memory on localhost:60320 (size: 8.4 KB, free: 265.4 MB) 
INFO BlockManagerMaster: Updated info of block rdd_1_1 
INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 2399 bytes result sent to driver 
INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 2399 bytes result sent to driver 
INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 139 ms on localhost (1/2) 
INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 133 ms on localhost (2/2) 
INFO DAGScheduler: Stage 0 (count at SimpleApp.scala:12) finished in 0.151 s 
INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
INFO DAGScheduler: Job 0 finished: count at SimpleApp.scala:12, took 0.225939 s 
INFO SparkContext: Starting job: count at SimpleApp.scala:13 
INFO DAGScheduler: Got job 1 (count at SimpleApp.scala:13) with 2 output partitions (allowLocal=false) 
INFO DAGScheduler: Final stage: Stage 1(count at SimpleApp.scala:13) 
INFO DAGScheduler: Parents of final stage: List() 
INFO DAGScheduler: Missing parents: List() 
INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[3] at filter at SimpleApp.scala:13), which has no missing parents 
INFO MemoryStore: ensureFreeSpace(2848) called with curMem=60659, maxMem=278302556 
INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.8 KB, free 265.3 MB) 
INFO MemoryStore: ensureFreeSpace(2056) called with curMem=63507, maxMem=278302556 
INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.0 KB, free 265.3 MB) 
INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:60320 (size: 2.0 KB, free: 265.4 MB) 
INFO BlockManagerMaster: Updated info of block broadcast_2_piece0 
INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:839 
INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MapPartitionsRDD[3] at filter at SimpleApp.scala:13) 
INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks 
INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, localhost, PROCESS_LOCAL, 1391 bytes) 
INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, localhost, PROCESS_LOCAL, 1391 bytes) 
INFO Executor: Running task 0.0 in stage 1.0 (TID 2) 
INFO Executor: Running task 1.0 in stage 1.0 (TID 3) 
INFO BlockManager: Found block rdd_1_0 locally 
INFO Executor: Finished task 0.0 in stage 1.0 (TID 2). 1830 bytes result sent to driver 
INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 9 ms on localhost (1/2) 
INFO BlockManager: Found block rdd_1_1 locally 
INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1830 bytes result sent to driver 
INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 10 ms on localhost (2/2) 
INFO DAGScheduler: Stage 1 (count at SimpleApp.scala:13) finished in 0.011 s 
INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
INFO DAGScheduler: Job 1 finished: count at SimpleApp.scala:13, took 0.024084 s 
Lines with a: 5, Lines with b: 12 
Is it producing any error logs? – GameOfThrows

I just edited my post to add the logs that show up in the shell when I run my application, and there is no error. Is there another log file I should look at? I don't know where to check whether the metrics.properties file is actually loaded when the application runs. – GermainGum

In your Spark code, did you specify where to write the results? I think that is /home/spark/Documents/test/? When you assemble your jar, you should be able to see whether metrics.properties is included in your package. – GameOfThrows
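One quick way to act on this suggestion: a jar is just a zip archive, so its entries can be listed and searched. The block below builds a stand-in jar under /tmp (hypothetical paths; with the real build you would point the listing at target/scala-2.10/simple-project_2.10-1.0.jar):

```shell
# Build a stand-in jar (a jar is just a zip) bundling a metrics.properties.
mkdir -p /tmp/demo
echo '*.sink.csv.period=1' > /tmp/demo/metrics.properties
(cd /tmp/demo && python3 -m zipfile -c app.jar metrics.properties)

# List the archive entries and count matches for the properties file;
# the same listing works on a jar produced by sbt package/assembly.
python3 -m zipfile -l /tmp/demo/app.jar | grep -c 'metrics.properties'
```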

Answer

I made it work by specifying in spark-submit the path to the metrics file:

--files=/yourPath/metrics.properties --conf spark.metrics.conf=./metrics.properties
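Putting it together, the fix might look like the sketch below. The CsvSink settings follow the question; the /tmp paths are stand-ins. The actual spark-submit invocation is shown commented out, since it needs a Spark installation: `--files` ships the properties file into each driver/executor working directory, which is why `spark.metrics.conf` can then use the relative path `./metrics.properties`.

```shell
# Write a minimal metrics.properties enabling the CsvSink (stand-in paths).
mkdir -p /tmp/spark-metrics
cat > /tmp/metrics.properties <<'EOF'
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=1
*.sink.csv.unit=seconds
*.sink.csv.directory=/tmp/spark-metrics/
EOF

# --files distributes the file to the driver/executor working directories,
# so spark.metrics.conf can point at the shipped copy with a relative path:
# $SPARK_HOME/bin/spark-submit --class "SimpleApp" --master local[4] \
#   --files=/tmp/metrics.properties \
#   --conf spark.metrics.conf=./metrics.properties \
#   target/scala-2.10/simple-project_2.10-1.0.jar

# Sanity-check the sink configuration just written: four sink.csv lines.
grep -c 'sink.csv' /tmp/metrics.properties
```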