Writing a generic function that calls a generic function in Scala

I'm using Spark Datasets to read CSV files, and I want a polymorphic function that does this for several files. Here is the function:

def loadFile[M](file: String):Dataset[M] = { 
    import spark.implicits._ 
    val schema = Encoders.product[M].schema 
    spark.read 
     .option("header","false") 
     .schema(schema) 
     .csv(file) 
     .as[M] 
}

The errors I get are:

[error] <myfile>.scala:45: type arguments [M] do not conform to method product's type parameter bounds [T <: Product] 
[error]  val schema = Encoders.product[M].schema 
[error]        ^
[error] <myfile>.scala:50: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. 
[error]  .as[M] 
[error]  ^
[error] two errors found

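I don't know what to do about the first error. I tried adding the same bound as in the definition of product (M <: Product), but then I get the error "No TypeTag available for M".

If I pass in a schema that has already been produced from an encoder, I get the error: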

[error] Unable to find encoder for type stored in a Dataset


2017-07-26 kim

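You need to require that anyone calling loadFile[M] provides evidence that an encoder for M exists. You can do this with a context bound on M, which requires an Encoder[M]: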

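// The context bound `M : Encoder` makes the caller supply an implicit Encoder[M].
def loadFile[M : Encoder](file: String): Dataset[M] = {
    import spark.implicits._
    // Take the schema from the encoder the caller provided.
    val schema = implicitly[Encoder[M]].schema
    spark.read
      .option("header", "false")
      .schema(schema)
      .csv(file)
      .as[M]
}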
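To show how this looks from the caller's side, here is a minimal sketch, assuming a local SparkSession and a hypothetical `Person` case class and `people.csv` path (none of these come from the question). It writes the context bound in its desugared form, as an implicit `Encoder[M]` parameter, and adds a second variant, the hypothetical `loadProductFile`, that follows the route tried in the question: `Encoders.product` needs both the `Product` bound and a `TypeTag`, which is why adding only `M <: Product` led to the "No TypeTag available" error.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}
import scala.reflect.runtime.universe.TypeTag

// Hypothetical record type. Defined at the top level so Spark can derive an encoder for it.
case class Person(name: String, age: Int)

object LoadFileExample {
  // Assumption: a local session, for illustration only.
  val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("loadFile-example")
    .getOrCreate()

  // Desugared equivalent of `def loadFile[M : Encoder](file: String)`.
  def loadFile[M](file: String)(implicit enc: Encoder[M]): Dataset[M] =
    spark.read
      .option("header", "false")
      .schema(enc.schema)
      .csv(file)
      .as[M]

  // Alternative for case classes only: build the encoder inside the function.
  // Encoders.product needs the Product bound and a TypeTag together.
  def loadProductFile[M <: Product : TypeTag](file: String): Dataset[M] = {
    implicit val enc: Encoder[M] = Encoders.product[M]
    spark.read
      .option("header", "false")
      .schema(enc.schema)
      .csv(file)
      .as[M]
  }

  def main(args: Array[String]): Unit = {
    // At the call site M is concrete, so spark.implicits._ can supply Encoder[Person].
    import spark.implicits._
    val path = args.headOption.getOrElse("people.csv") // hypothetical default path
    val people: Dataset[Person] = loadFile[Person](path)
    people.show()
    spark.stop()
  }
}
```

The first form accepts any type that has an encoder in scope, including primitives such as `String`. The second is limited to `Product` types but does not need `spark.implicits._` at the call site.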


2017-07-26 09:27:26

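Thanks! This definitely compiles, but I have some access problems and out-of-memory problems when running my program, even when I don't call the function. I suppose I could make my case class extend Encoder and it should work, if I didn't have these other runtime problems? – kim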

@kim This is a compile-time requirement and shouldn't affect runtime at all. Maybe something else is causing your code to OOM. –

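I decided to sidestep the whole Encoder problem by not using Spark, but I did find this question, which discusses [encoders for custom objects](https://stackoverflow.com/questions/36648128/how-to-store-custom-objects-in-dataset). I'll come back and figure it out when I have some time. Since it put me on the right track, I'll mark it as my answer. – kim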
