2017-04-17 76 views

Answers

5

Here is how:

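from pyspark.sql import SparkSession
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

# reuse the active session (already available as `spark` in the pyspark shell)
spark = SparkSession.builder.getOrCreate()

cSchema = StructType([StructField("WordList", ArrayType(StringType()))])

# notice the extra square brackets around each element of the list:
# each row is itself a list holding a single array value
test_list = [[['Hello', 'world']], [['I', 'am', 'fine']]]

df = spark.createDataFrame(test_list, schema=cSchema)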
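
The extra brackets are needed because each element of the outer list is one row, and each row must carry one value per schema field. A minimal sketch in plain Python, assuming the input starts out as an ordinary list of word lists (the names `word_lists` and `rows` are illustrative and not part of the answer above):

```python
# Plain list of lists: each inner list is meant to become one array cell.
word_lists = [['Hello', 'world'], ['I', 'am', 'fine']]

# Wrap each inner list in a one-element tuple so that it is treated as
# a single row with a single array column, not as several columns.
rows = [(words,) for words in word_lists]

print(rows)
# The wrapped rows could then be passed on, for example:
# df = spark.createDataFrame(rows, schema=cSchema)
```

Without the wrapping, a row such as `['I', 'am', 'fine']` would be read as three separate field values, which does not match a one-field schema.
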
-2
You can first create an RDD from the input and then convert that RDD to a DataFrame.
    <code> 
    import sqlContext.implicits._
    val testList = Array(Array("Hello", "world"), Array("I", "am", "fine"))
    // Create an RDD from the local array
    val testListRDD = sc.parallelize(testList)
    // flatMap flattens the inner arrays, so each word becomes its own element
    val flatTestListRDD = testListRDD.flatMap(entry => entry)
    // Convert the RDD to a DataFrame
    val testListDF = flatTestListRDD.toDF
    testListDF.show
    </code> 
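
Because of the `flatMap` step, this approach yields one word per row and not one list per row. A rough plain-Python analogue of that difference, for illustration only (it does not use Spark, and the names `flat_words` and `rows_of_lists` are hypothetical):

```python
from itertools import chain

test_list = [["Hello", "world"], ["I", "am", "fine"]]

# Analogue of flatMap(entry => entry): the inner lists are concatenated,
# so the list boundaries are lost.
flat_words = list(chain.from_iterable(test_list))

# Analogue of keeping each inner list intact: one list per row.
rows_of_lists = [(words,) for words in test_list]

print(len(flat_words), len(rows_of_lists))
```

If the goal is a column of word lists, the second shape is the one to aim for.
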