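String ordering in Cassandra CQL

When querying on a text primary key in Cassandra CQL, string comparison seems to work in the opposite direction from what one would expect, i.e.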

 
cqlsh:test> select * from sl; 

name      | data 
--------------------------+------ 
000000020000000000000003 | null 
000000010000000000000005 | null 
000000010000000000000003 | null 
000000010000000000000002 | null 
000000010000000000000001 | null 

cqlsh:test> select name from sl where token(name) < token('000000010000000000000005'); 
name 
-------------------------- 
000000020000000000000003 

(1 rows) 

cqlsh:test> select name from sl where token(name) > token('000000010000000000000005'); 
name 
-------------------------- 
000000010000000000000003 
000000010000000000000002 
000000010000000000000001 

(3 rows)

In contrast, this is what I get from string comparison in Python (and, I believe, in most other languages):

>>> '000000020000000000000003' < '000000010000000000000005'
False

If I query without the token function, I get the following error:

 
cqlsh:test> select name from sl where name < '000000010000000000000005'; 
Bad Request: Only EQ and IN relation are supported on the partition key (unless you use the token() function)

The table description is:

CREATE TABLE sl (
    name text, 
    data blob, 
    PRIMARY KEY (name) 
) WITH 
    bloom_filter_fp_chance=0.010000 AND 
    caching='KEYS_ONLY' AND 
    comment='' AND 
    dclocal_read_repair_chance=0.000000 AND 
    gc_grace_seconds=864000 AND 
    index_interval=128 AND 
    read_repair_chance=0.100000 AND 
    replicate_on_write='true' AND 
    populate_io_cache_on_flush='false' AND 
    default_time_to_live=0 AND 
    speculative_retry='99.0PERCENTILE' AND 
    memtable_flush_period_in_ms=0 AND 
    compaction={'class': 'SizeTieredCompactionStrategy'} AND 
    compression={'sstable_compression': 'LZ4Compressor'};

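Is there an explanation somewhere in the documentation that I have missed of why such a strange string comparison order was chosen? Or does the string comparison operation do something other than what I expect (i.e. return some unrelated order, such as the order in which the rows were written to the database)? I am using the Murmur3Partitioner, in case it matters.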


2014-09-30 alexk

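In Cassandra, rows are ordered by the hash of their partition key value. With the Random and Murmur3 partitioners, the hashed values are effectively random with respect to the key, so the order is A) meaningless and B) designed to be evenly distributed around the ring.

So querying for tokens less than token('000000010000000000000005') does not compare against the string value '000000010000000000000005'. It compares the hashed token values. Based on the results you are seeing, the token value of the string '000000020000000000000003' is less than the token value of '000000010000000000000005'.

For more information, check this document from DataStax: Paging Through Unordered Partitioner Results.

Assuming you want to be able to query your data by the value of "name", you could build a table somewhat like this: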
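To make the difference between string order and token order concrete, here is a minimal Python sketch. It does not compute Cassandra's actual Murmur3 tokens: `stand_in_token` is a hypothetical helper that uses MD5 from the standard library as a stand-in hash, so the numbers will not match what `token(name)` returns in cqlsh. It only illustrates that sorting by a hash of the key generally gives an order unrelated to the order of the keys themselves.

```python
import hashlib

def stand_in_token(key: str) -> int:
    """Hypothetical stand-in for a partitioner token.

    This is NOT Cassandra's Murmur3 hash; it takes the first 8 bytes of an
    MD5 digest and reads them as a signed 64-bit integer, purely to mimic
    the idea of "a hash of the key that looks random".
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)

# The keys from the question.
names = [
    "000000020000000000000003",
    "000000010000000000000005",
    "000000010000000000000003",
    "000000010000000000000002",
    "000000010000000000000001",
]

by_string = sorted(names)                      # plain lexicographic order
by_token = sorted(names, key=stand_in_token)   # order by hashed value

print("string order:")
for name in by_string:
    print("  ", name)

print("stand-in token order:")
for name in by_token:
    print("  ", stand_in_token(name), name)
```

A range predicate on `token(name)` selects rows by their position in the second ordering, not the first, which is why the results look arbitrary when read as strings.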

CREATE TABLE sl (
    type text, 
    name text, 
    data blob, 
    PRIMARY KEY (type, name) 
)

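I created type as the partition key. I am not sure whether it makes sense for your data to be partitioned by "type" (or by anything else), so this is more for the sake of example than anything else. In any case, with name as the clustering key (which determines the sort order on disk within a partition), this query will work: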

select * from sl where type='sometype' AND name < '000000010000000000000005';

Again, it is only an example, but I hope it helps point you in the right direction.
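As a rough mental model of why the range query works once name is a clustering key, here is a small Python sketch. The `ClusteredTable` class and its method names are hypothetical and exist only for illustration; it is not how Cassandra stores data internally. It assumes the simplest possible picture: each partition holds its rows sorted by clustering key, so a `name < ...` predicate inside one partition is an ordinary string comparison.

```python
from bisect import bisect_left
from collections import defaultdict

class ClusteredTable:
    """Toy model of PRIMARY KEY (type, name): one sorted list per partition."""

    def __init__(self):
        # partition key (type) -> clustering keys (name) kept in sorted order
        self._partitions = defaultdict(list)

    def insert(self, type_, name):
        rows = self._partitions[type_]
        i = bisect_left(rows, name)
        # Primary keys are unique, so skip a name that is already present.
        if i == len(rows) or rows[i] != name:
            rows.insert(i, name)

    def names_less_than(self, type_, upper):
        """Mimics: SELECT name FROM sl WHERE type = ? AND name < ?"""
        rows = self._partitions.get(type_, [])
        return rows[:bisect_left(rows, upper)]

table = ClusteredTable()
for name in [
    "000000020000000000000003",
    "000000010000000000000005",
    "000000010000000000000003",
    "000000010000000000000002",
    "000000010000000000000001",
]:
    table.insert("sometype", name)

result = table.names_less_than("sometype", "000000010000000000000005")
for name in result:
    print(name)
```

The partition key is still matched by equality only; the ordering guarantee applies to the clustering key inside the chosen partition.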


2014-09-30 14:51:54 Aaron

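Thank you. I was confused because the rows seemed to be sorted in DESC order, but it looks like that was pure coincidence. The way the project is going I don't need much partitioning, so I will probably use an ordered partitioner, or do sorting and comparison entirely at the application level. – alexk 2014-09-30 15:14:43

@alexk Just a warning: the Byte Ordered Partitioner has been deprecated and should *not* be used. http://www.datastax.com/documentation/cassandra/2.1/cassandra/architecture/architecturePartitionerBOP_c.html – Aaron 2014-09-30 15:22:05

Here are some links to documentation about the token function and related paging. Apologies for the breadth of topics; I don't know exactly which ones might help:

http://www.datastax.com/documentation/cql/3.1/cql/cql_using/paging_c.html Paging through unordered partitioner results implies that using the Murmur3Partitioner does matter.
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__paging-through-unordered-results This section says that paging with the RandomPartitioner will not give you meaningful results. The RandomPartitioner is in the same boat as the Murmur3Partitioner in this case. The docs should mention both.
http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0 See automatic paging.
http://datastax.github.io/python-driver/query_paging.html
http://www.datastax.com/drivers/java/2.0/index.html See ResultSet.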
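For a feel of what paging by token looks like, here is a self-contained Python simulation. It does not talk to Cassandra and does not use any driver API: `stand_in_token`, `fetch_page` and `page_through` are hypothetical helpers, and the hash is an MD5-based stand-in rather than the real Murmur3 token. The idea it sketches is to fetch a page, remember the token of the last row, and then ask for rows whose token is greater than that. Every row is visited exactly once, but in hash order rather than string order.

```python
import hashlib

def stand_in_token(key: str) -> int:
    """Hypothetical stand-in for token(name); NOT Cassandra's Murmur3 hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)

def fetch_page(rows, after_token, page_size):
    """Roughly mimics: SELECT name FROM sl WHERE token(name) > ? LIMIT ?"""
    ordered = sorted(rows, key=stand_in_token)
    matching = [
        row for row in ordered
        if after_token is None or stand_in_token(row) > after_token
    ]
    return matching[:page_size]

def page_through(rows, page_size):
    """Yield successive pages, resuming after the last token seen."""
    last_token = None
    while True:
        page = fetch_page(rows, last_token, page_size)
        if not page:
            return
        yield page
        last_token = stand_in_token(page[-1])

names = [
    "000000020000000000000003",
    "000000010000000000000005",
    "000000010000000000000003",
    "000000010000000000000002",
    "000000010000000000000001",
]

pages = list(page_through(names, page_size=2))
for number, page in enumerate(pages, start=1):
    print("page", number, page)
```

In practice the drivers linked above offer automatic paging, so a hand-written loop like this is mainly useful for understanding what the token predicate does.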


2014-09-30 14:44:14 catpaws
