2016-11-15 142 views

Spark single quote error

I have a DataFrame (Apache Spark 1.5). I want to use the Spark SQL context to add a new column in which every row contains a single quote character.

My code:

df.registerTempTable("tempdf"); 
df = df.sqlContext().sql("SELECT *, \" \\\" \" as quoteCol FROM tempdf"); 

When I run it, Spark throws an exception:

Exception in thread "main" java.lang.RuntimeException: [1.44] failure: ``union'' expected but ErrorToken(end of input) found 

SELECT *, " \" " as quoteCol FROM tempdf 
             ^
    at scala.sys.package$.error(package.scala:27) 
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36) 
    at org.apache.spark.sql.catalyst.DefaultParserDialect.parse(ParserDialect.scala:67) 
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211) 
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211) 
    at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114) 
    at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:113) 
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:137) 
    ... 

The following code works fine and adds a new column containing a single character:

df.registerTempTable("tempdf"); 
df = df.sqlContext().sql("SELECT *, \" q \" as quoteCol FROM tempdf"); 

What am I doing wrong?

Answers


SQL string literals should use single quotes:

sqlContext().sql("SELECT *, '\"' AS quoteCol FROM tempdf"); 
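Two quoting layers are involved here: the Java compiler consumes the backslash escapes first, and only the resulting text reaches the SQL parser. The sketch below is plain Java with no Spark dependency; the class name `QuoteLayers` and its method names are hypothetical. It prints the SQL text produced by the literal from the question and by the one from this answer.

```java
// Plain Java, no Spark needed: shows the SQL text that each Java literal produces.
// QuoteLayers and its method names are hypothetical, chosen for illustration.
final class QuoteLayers {
    // Java literal from the question; \" and \\ are consumed by the Java compiler.
    static String questionSql() {
        return "SELECT *, \" \\\" \" as quoteCol FROM tempdf";
    }

    // Java literal from this answer: a SQL literal delimited by single quotes.
    static String answerSql() {
        return "SELECT *, '\"' AS quoteCol FROM tempdf";
    }

    public static void main(String[] args) {
        // First line still carries a backslash inside the SQL text.
        System.out.println(questionSql());
        // Second line has no backslash: the double quote sits between single quotes.
        System.out.println(answerSql());
    }
}
```

The first printed line matches the query echoed in the exception message, backslash included, so the SQL parser has to deal with that backslash itself. The second line contains no backslash at all.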

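The following expression causes the same exception: 'SELECT *, '\'' as quoteCol FROM tempdf'. Am I right that I can't use a quote character inside a literal delimited by the same quote character?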


Works for me. – 2016-11-15 18:38:13
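On the comment's question about nesting a quote inside a literal delimited by the same quote: the two failures reported in this thread are consistent with a lexer that ends a literal at the next matching quote character and treats a backslash as an ordinary character. That behaviour is an assumption about the Spark 1.5 parser, not something confirmed here. The toy scanner below (hypothetical class `NaiveQuoteScanner`, not Spark code) applies that rule to the queries discussed above.

```java
// Toy scanner, NOT Spark's parser. Assumption: a literal ends at the next
// matching quote character, and backslash is an ordinary character.
final class NaiveQuoteScanner {
    // Returns true if every string literal in sql is closed under that rule.
    static boolean literalsClosed(String sql) {
        char open = 0; // 0 means "not inside a literal"
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (open == 0) {
                if (c == '\'' || c == '"') {
                    open = c; // a literal starts with either quote character
                }
            } else if (c == open) {
                open = 0; // the same quote character ends the literal
            }
        }
        return open == 0;
    }

    public static void main(String[] args) {
        String[] queries = {
            "SELECT *, \" \\\" \" as quoteCol FROM tempdf", // question: " \" "
            "SELECT *, '\\'' as quoteCol FROM tempdf",       // comment:  '\''
            "SELECT *, '\"' AS quoteCol FROM tempdf",        // answer:   '"'
            "SELECT *, \"'\" AS quoteCol FROM tempdf"        // variant:  "'"
        };
        for (String q : queries) {
            System.out.println(literalsClosed(q) + "  " + q);
        }
    }
}
```

Under this rule the first two queries leave a literal open, while the last two are balanced because the inner quote differs from the delimiter. The last variant, for an apostrophe, has not been run against Spark in this thread. If only a constant column is needed, `df.withColumn("quoteCol", functions.lit("'"))` from the DataFrame API may avoid SQL quoting altogether.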