No rows inserted into the table when importing from CSV in Cassandra

2015-08-28

I'm trying to import a CSV file into a Cassandra table, but I've run into a problem. The insert succeeds, at least that's what Cassandra reports, yet I still can't see any records. Here are the details:

cqlsh:recommendation_engine> COPY row_historical_game_outcome_data FROM '/home/adelin/workspace/docs/re_raw_data2.csv' WITH DELIMITER='|'; 

2 rows imported in 0.216 seconds. 
cqlsh:recommendation_engine> select * from row_historical_game_outcome_data; 

customer_id | game_id | time | channel | currency_code | game_code | game_name | game_type | game_vendor | progressive_winnings | stake_amount | win_amount 
-------------+---------+------+---------+---------------+-----------+-----------+-----------+-------------+----------------------+--------------+------------ 

(0 rows) 
cqlsh:recommendation_engine> 

This is what my data looks like:

'SomeName'|673|'SomeName'|'SomeName'|'TYPE'|'M'|123123|0.20000000000000001|0.0|'GBP'|2015-07-01 00:01:42.19700|0.0| 
'SomeName'|673|'SomeName'|'SomeName'|'TYPE'|'M'|456456|0.20000000000000001|0.0|'GBP'|2015-07-01 00:01:42.19700|0.0| 

The Cassandra version is apache-cassandra-2.2.0.

EDIT:

CREATE TABLE row_historical_game_outcome_data (
    customer_id int, 
    game_id int, 
    time timestamp, 
    channel text, 
    currency_code text, 
    game_code text, 
    game_name text, 
    game_type text, 
    game_vendor text, 
    progressive_winnings double, 
    stake_amount double, 
    win_amount double, 
    PRIMARY KEY ((customer_id, game_id, time)) 
) WITH bloom_filter_fp_chance = 0.01 
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' 
    AND comment = '' 
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} 
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} 
    AND dclocal_read_repair_chance = 0.1 
    AND default_time_to_live = 0 
    AND gc_grace_seconds = 864000 
    AND max_index_interval = 2048 
    AND memtable_flush_period_in_ms = 0 
    AND min_index_interval = 128 
    AND read_repair_chance = 0.0 
    AND speculative_retry = '99.0PERCENTILE'; 

I also tried the suggestion below from uri2x, but still nothing:

select * from row_historical_game_outcome_data; 

customer_id | game_id | time | channel | currency_code | game_code | game_name | game_type | game_vendor | progressive_winnings | stake_amount | win_amount 
-------------+---------+------+---------+---------------+-----------+-----------+-----------+-------------+----------------------+--------------+------------ 

(0 rows) 
cqlsh:recommendation_engine> COPY row_historical_game_outcome_data ("game_vendor","game_id","game_code","game_name","game_type","channel","customer_id","stake_amount","win_amount","currency_code","time","progressive_winnings") FROM '/home/adelin/workspace/docs/re_raw_data2.csv' WITH DELIMITER='|'; 

2 rows imported in 0.192 seconds. 
cqlsh:recommendation_engine> select * from row_historical_game_outcome_data; 

customer_id | game_id | time | channel | currency_code | game_code | game_name | game_type | game_vendor | progressive_winnings | stake_amount | win_amount 
-------------+---------+------+---------+---------------+-----------+-----------+-----------+-------------+----------------------+--------------+------------ 

(0 rows) 
Can you show us your DESCRIBE TABLE? – uri2x

Here you go, I've added the table description. – Adelin

It seems the order of your columns differs from the column order in the CSV file (the first column isn't an int, the third column isn't a date, etc.). Try using COPY with column names to match the order of the CSV file. – uri2x

Answers

Ok, I had to change a few things in your data file to make this work:

SomeName|673|SomeName|SomeName|TYPE|M|123123|0.20000000000000001|0.0|GBP|2015-07-01 00:01:42|0.0 
SomeName|673|SomeName|SomeName|TYPE|M|456456|0.20000000000000001|0.0|GBP|2015-07-01 00:01:42|0.0 
  • Removed the trailing pipes.
  • Truncated the times down to seconds.
  • Removed all single quotes.
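Those three fixes can be sketched as a small Python helper (the file is processed line by line; the sample row is taken from the question):

```python
# Normalize a raw export line so cqlsh COPY can parse it:
#  - strip the trailing pipe delimiter
#  - remove all single quotes
#  - truncate timestamps to whole seconds
import re

def clean_line(line: str) -> str:
    line = line.rstrip("\n").rstrip("|")        # drop the trailing "|"
    line = line.replace("'", "")                # remove single quotes
    # turn e.g. "00:01:42.19700" into "00:01:42"
    return re.sub(r"(\d{2}:\d{2}:\d{2})\.\d+", r"\1", line)

raw = ("'SomeName'|673|'SomeName'|'SomeName'|'TYPE'|'M'|123123|"
      "0.20000000000000001|0.0|'GBP'|2015-07-01 00:01:42.19700|0.0|")
print(clean_line(raw))
# SomeName|673|SomeName|SomeName|TYPE|M|123123|0.20000000000000001|0.0|GBP|2015-07-01 00:01:42|0.0
```

Applying `clean_line` to every line of the export produces a file in the shape shown above.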

Once I did that, I ran:

aploetz@cqlsh:stackoverflow> COPY row_historical_game_outcome_data 
(game_vendor,game_id,game_code,game_name,game_type,channel,customer_id,stake_amount, 
win_amount,currency_code , time , progressive_winnings) 
FROM '/home/aploetz/cassandra_stack/re_raw_data3.csv' WITH DELIMITER='|'; 

Improper COPY command. 

This one was a bit tricky. I finally figured out that COPY didn't like the column name time. I altered the table to use the name game_time instead, and re-ran the COPY:

aploetz@cqlsh:stackoverflow> DROP TABLE row_historical_game_outcome_data ; 
aploetz@cqlsh:stackoverflow> CREATE TABLE row_historical_game_outcome_data (
      ...  customer_id int, 
      ...  game_id int, 
      ...  game_time timestamp, 
      ...  channel text, 
      ...  currency_code text, 
      ...  game_code text, 
      ...  game_name text, 
      ...  game_type text, 
      ...  game_vendor text, 
      ...  progressive_winnings double, 
      ...  stake_amount double, 
      ...  win_amount double, 
      ...  PRIMARY KEY ((customer_id, game_id, game_time)) 
      ...); 

aploetz@cqlsh:stackoverflow> COPY row_historical_game_outcome_data 
(game_vendor,game_id,game_code,game_name,game_type,channel,customer_id,stake_amount, 
win_amount,currency_code , game_time , progressive_winnings) 
FROM '/home/aploetz/cassandra_stack/re_raw_data3.csv' WITH DELIMITER='|'; 

3 rows imported in 0.738 seconds. 
aploetz@cqlsh:stackoverflow> SELECT * FROM row_historical_game_outcome_data ; 

customer_id | game_id | game_time    | channel | currency_code | game_code | game_name | game_type | game_vendor | progressive_winnings | stake_amount | win_amount 
-------------+---------+--------------------------+---------+---------------+-----------+-----------+-----------+-------------+----------------------+--------------+------------ 
     123123 |  673 | 2015-07-01 00:01:42-0500 |  M |   GBP | SomeName | SomeName |  TYPE | SomeName |     0 |   0.2 |   0 
     456456 |  673 | 2015-07-01 00:01:42-0500 |  M |   GBP | SomeName | SomeName |  TYPE | SomeName |     0 |   0.2 |   0 

(2 rows) 
  • I'm not sure why it said "3 rows imported," so my guess is that it is counting the header line.
  • Your keys are all partition keys. Not sure if you're aware of that. I only point it out because I can't think of a reason to specify multiple partition keys without also specifying one or more clustering keys.
  • I couldn't find anything in the DataStax docs indicating that "time" is a reserved word. It might be a bug in cqlsh. But seriously, you should probably name your time-based columns something other than "time."
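For contrast, a version of the schema with a composite partition key plus a clustering key might look like this (a sketch, not taken from the original answer; columns trimmed for brevity):

```sql
-- customer_id and game_id together form the partition key;
-- game_time is a clustering column, so all outcomes for one
-- customer/game pair live in the same partition, ordered by time.
CREATE TABLE row_historical_game_outcome_data (
    customer_id int,
    game_id int,
    game_time timestamp,
    win_amount double,
    PRIMARY KEY ((customer_id, game_id), game_time)
);
```

With this layout, a query restricted to customer_id and game_id returns that customer's outcomes for the game in time order, which the all-partition-key version cannot do.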
What you found is true, the problem was with the CSV generated by Informix DB, but Cassandra should be more verbose with its errors. – Adelin

There are two things in your CSV file that trip up cqlsh:

  1. Remove the trailing | at the end of each CSV line.
  2. Remove the microseconds from your time values (precision should be at most milliseconds).

One other comment. COPY in CQL accepts WITH HEADER = TRUE, which causes the header (first) line of the CSV file to be ignored. "time" is not a reserved word in CQL (trust me on this one, as I just updated the CQL reserved words in the DataStax docs myself). However, you do show spaces around the column name "time" in your COPY command's column list, and I believe that's the problem. No spaces, just commas; and do the same for all the lines in the CSV file. (http://docs.datastax.com/en/cql/3.3/cql/cql_reference/keywords_r.html)
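Putting those suggestions together, the COPY command would look something like this (a sketch using the file path from the question; HEADER=TRUE is only needed if the CSV actually begins with a header line):

```sql
COPY row_historical_game_outcome_data
  (game_vendor,game_id,game_code,game_name,game_type,channel,customer_id,
   stake_amount,win_amount,currency_code,time,progressive_winnings)
  FROM '/home/adelin/workspace/docs/re_raw_data2.csv'
  WITH DELIMITER='|' AND HEADER=TRUE;
```

Note that the column list contains no spaces around any name, including time.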

Good point, the COPY command in cqlsh can definitely be tricky. – Aaron