I have this code, which builds a huge filter in Pig:
large = LOAD 'a super large file';
CC = FILTER large BY $19 == 'abc' OR $20 == 'abc'
OR $19 == 'def' OR $20 == 'def' ....;
The number of OR conditions could grow to 100 or even thousands.
Is there a better way to do this?
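Yes. Put those values in a separate file, load it into a relation, and join the two relations on that column. If you have to filter on multiple columns, create one filter file for each column you filter on. Here is an example for 2 columns: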
large = LOAD 'a super large file';
filter1 = LOAD 'file with values needed to compare with $19';
filter2 = LOAD 'file with values needed to compare with $20';
-- An inner join keeps only the rows of large whose key appears in the filter file
f1 = JOIN large BY $19, filter1 BY $0;
f2 = JOIN large BY $20, filter2 BY $0;
final = UNION f1, f2;
DUMP final;
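One thing to watch with the join-then-union approach above: each JOIN appends the filter column to the output, and a row that matches on both $19 and $20 shows up twice after the UNION. A possible variation, offered only as a sketch, is to do a semi-join with COGROUP so that just the rows of large come back, and then apply DISTINCT. The aliases g1, m1, g2, m2 and merged are made up for illustration, and the file names are the same placeholders as above (one value per line is assumed).

```pig
-- Sketch only: file names are placeholders; each filter file is assumed
-- to hold one value per line.
large   = LOAD 'a super large file';
filter1 = LOAD 'file with values needed to compare with $19';
filter2 = LOAD 'file with values needed to compare with $20';

-- Semi-join on $19: group both relations by key, keep the groups that
-- contain at least one filter value, then emit only the rows of large.
g1 = COGROUP large BY $19, filter1 BY $0;
m1 = FILTER g1 BY NOT IsEmpty(filter1);
f1 = FOREACH m1 GENERATE FLATTEN(large);

-- Same idea for $20.
g2 = COGROUP large BY $20, filter2 BY $0;
m2 = FILTER g2 BY NOT IsEmpty(filter2);
f2 = FOREACH m2 GENERATE FLATTEN(large);

-- A row matching on both columns is in f1 and f2; DISTINCT drops the copy.
merged = UNION f1, f2;
final  = DISTINCT merged;
DUMP final;
```

Note that DISTINCT also collapses rows that were already exact duplicates in the input file, so skip that step if such duplicates have to be preserved.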
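Alternatively, you could use a single filter file with multiple columns, join on each of those columns to get the different filters, and then UNION the resulting relations.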
large = LOAD 'a super large file';
filter_file = LOAD 'file with values in different columns';
-- Column $0 of the filter file holds the values for $19, column $1 those for $20
f1 = JOIN large BY $19, filter_file BY $0;
f2 = JOIN large BY $20, filter_file BY $1;
final = UNION f1, f2;
DUMP final;
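Since the filter file only has a few hundred or a few thousand values, it should be small enough to fit in memory, so it may be worth trying Pig's fragment-replicate join, which avoids shuffling the large file for each JOIN. The snippet below is a sketch under that assumption. It also assumes the filter file is comma-separated, which is why a delimiter is passed to PigStorage (the default delimiter is a tab); adjust or drop that part to match the real file.

```pig
-- Sketch only: assumes a small, comma-separated filter file.
large       = LOAD 'a super large file';
filter_file = LOAD 'file with values in different columns' USING PigStorage(',');

-- With USING 'replicated' the last relation listed is held in memory,
-- so the small filter relation must come after the large one.
f1 = JOIN large BY $19, filter_file BY $0 USING 'replicated';
f2 = JOIN large BY $20, filter_file BY $1 USING 'replicated';

final = UNION f1, f2;
DUMP final;
```

If the filter relation turns out not to fit in memory, the job will fail, and the plain JOIN shown earlier is the fallback.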