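I have multiple JSON files that I have to parse with Apache Spark. They contain nested keys, and I have to print all columns, including the nested keys. How can I create nested columns from JSON files using Apache Spark in Java?
The files have nested keys as well. I want to get all the column names together with the nested column names. How can I get them?
This is what I am trying: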
String jsonFilePath = "/home/vipin/workspace/Smarten/jsonParsing/Employee/Employee-01.json,/home/vipin/workspace/Smarten/jsonParsing/Employee/Employee-02.json";
String[] jsonFiles = jsonFilePath.split(",");
Dataset<Row> people = sparkSession.read().json(jsonFiles);
JSON structure:
{
"Name":"Vipin Suman",
"Email":"[email protected]",
"Designation":"Programmer",
"Age":22 ,
"location":
{
"City":"Ahmedabad",
"State":"Gujarat"
}
}
The result I get:
people.show(50, false);
Age | Designation | Email | Name | Location
------------------------------------------------------------
22 |Programmer |[email protected] | Vipin Suman|[Ahmedabad,Gujarat]
I want the data to look like this:
Age | Designation | Email | Name | City | State
------------------------------------------------------------
22 |Programmer |[email protected] | Vipin Suman| Ahmedabad |Gujarat
Or like this:
Age | Designation | Email | Name | Location
---------------------------------------------------------------
22 |Programmer |[email protected] | Vipin Suman| Ahmedabad,Gujarat
If the schema looks like this:
root
|-- Age: long (nullable = true)
|-- Company: struct (nullable = true)
| |-- Company Name: string (nullable = true)
| |-- Domain: string (nullable = true)
|-- Designation: string (nullable = true)
|-- Email: string (nullable = true)
|-- Name: string (nullable = true)
|-- Test: array (nullable = true)
| |-- element: string (containsNull = true)
|-- location: struct (nullable = true)
| |-- City: struct (nullable = true)
| | |-- City Name: string (nullable = true)
| | |-- Pin: long (nullable = true)
| |-- State: string (nullable = true)
and the JSON structure is:
{
"Name":"Vipin Suman",
"Email":"[email protected]",
"Designation":"Trainee Programmer",
"Age":22 ,
"location":
{"City":
{
"Pin":324009,
"City Name":"Ahmedabad"
},
"State":"Gujarat"
},
"Company":
{
"Company Name":"Elegant",
"Domain":"Java"
},
"Test":["Test1","Test2"]
}
then how can I find the nested keys and display them in a properly formatted table?
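To make the goal concrete, here is a minimal sketch of the kind of recursive walk I have in mind, written in plain Java over nested maps rather than over Spark types. The class `NestedKeys` and the methods `collect`, `flatten`, and `sample` are hypothetical names made up for illustration; `sample()` is only a hand-built stand-in for the second JSON document above, not something Spark produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedKeys {

    // Recursively collects every leaf key as a dotted path, e.g. "location.City.Pin".
    // A value that is itself a Map is treated as a nested object; anything else
    // (string, number, list) is treated as a leaf.
    static void collect(String prefix, Map<String, Object> node, List<String> out) {
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                collect(path, child, out);
            } else {
                out.add(path);
            }
        }
    }

    // Returns all leaf paths of the tree, in insertion order.
    static List<String> flatten(Map<String, Object> root) {
        List<String> out = new ArrayList<>();
        collect("", root, out);
        return out;
    }

    // Hypothetical stand-in for the second JSON document above.
    static Map<String, Object> sample() {
        Map<String, Object> city = new LinkedHashMap<>();
        city.put("Pin", 324009);
        city.put("City Name", "Ahmedabad");

        Map<String, Object> location = new LinkedHashMap<>();
        location.put("City", city);
        location.put("State", "Gujarat");

        Map<String, Object> company = new LinkedHashMap<>();
        company.put("Company Name", "Elegant");
        company.put("Domain", "Java");

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("Name", "Vipin Suman");
        root.put("Email", "[email protected]");
        root.put("Designation", "Trainee Programmer");
        root.put("Age", 22);
        root.put("location", location);
        root.put("Company", company);
        root.put("Test", Arrays.asList("Test1", "Test2"));
        return root;
    }

    public static void main(String[] args) {
        // Prints one dotted path per line.
        for (String path : flatten(sample())) {
            System.out.println(path);
        }
    }
}
```

With Spark, I assume the same recursion would run over `people.schema().fields()` instead of a map, descending whenever a field's `dataType()` is a `StructType`. The resulting dotted names could then presumably be passed to `select`, with backticks around names that contain spaces, such as `Company Name`.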
Please provide: a sample of the input data, what you have tried, and what the problem is.