2017-03-06 79 views
2

CPU utilization of a Python script when using a BPF filter

I got the code below from here.

from binascii import hexlify 
from ctypes import create_string_buffer, addressof 
from socket import socket, AF_PACKET, SOCK_RAW, SOL_SOCKET 
from struct import pack, unpack 

sniff_interval=120 
# A subset of Berkeley Packet Filter constants and macros, as defined in 
# linux/filter.h. 

# Instruction classes 
BPF_LD = 0x00 
BPF_JMP = 0x05 
BPF_RET = 0x06 

# ld/ldx fields 
BPF_H = 0x08 
BPF_B = 0x10 
BPF_ABS = 0x20 

# alu/jmp fields 
BPF_JEQ = 0x10 
BPF_K = 0x00 

def bpf_jump(code, k, jt, jf): 
    return pack('HBBI', code, jt, jf, k) 

def bpf_stmt(code, k): 
    return bpf_jump(code, k, 0, 0) 


# Ordering of the filters is backwards of what would be intuitive for 
# performance reasons: the check that is most likely to fail is first. 
filters_list = [ 
    # Must have dst port 5201. Load (BPF_LD) a half word value (BPF_H) in
    # ethernet frame at absolute byte offset 36 (BPF_ABS). If value is equal to
    # 5201 then do not jump, else jump 5 statements.
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 36), 
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 5201, 0, 5), 

    # Must be TCP (protocol field at byte offset 23 must equal 0x06)
    bpf_stmt(BPF_LD | BPF_B | BPF_ABS, 23), 
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x06, 0, 3), 

    # Must be IPv4 (check ethertype field at byte offset 12) 
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12), 
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1), 

    bpf_stmt(BPF_RET | BPF_K, 0x0fffffff), # pass 
    bpf_stmt(BPF_RET | BPF_K, 0), # reject 
] 

# Create filters struct and fprog struct to be used by SO_ATTACH_FILTER, as 
# defined in linux/filter.h. 
filters = ''.join(filters_list) 
b = create_string_buffer(filters) 
mem_addr_of_filters = addressof(b) 
fprog = pack('HL', len(filters_list), mem_addr_of_filters) 

# As defined in asm/socket.h 
SO_ATTACH_FILTER = 26 

# Create listening socket with filters 
s = socket(AF_PACKET, SOCK_RAW, 0x0800) 
s.setsockopt(SOL_SOCKET, SO_ATTACH_FILTER, fprog) 
s.bind(('eth0', 0x0800)) 

while True: 
    data, addr = s.recvfrom(65565) 
    #print "*****" 
    print 'got data from', addr, ':', hexlify(data)  # CPU usage is only about 2% while this print is enabled
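
The filter layout can be checked without root and without opening a socket. The sketch below is written for Python 3 (an assumption, since the script above is Python 2) and rebuilds the same eight instructions. It confirms that each one packs to 8 bytes and that every failed comparison jumps to the final reject statement. The helper name false_branch_targets is made up for this illustration.

```python
from struct import calcsize, pack, unpack

# Same constants as in the script above (from linux/filter.h).
BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_H, BPF_B, BPF_ABS = 0x08, 0x10, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00


def bpf_jump(code, k, jt, jf):
    # One instruction: u16 code, u8 jt, u8 jf, u32 k.
    return pack('HBBI', code, jt, jf, k)


def bpf_stmt(code, k):
    return bpf_jump(code, k, 0, 0)


program = [
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 36),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 5201, 0, 5),
    bpf_stmt(BPF_LD | BPF_B | BPF_ABS, 23),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x06, 0, 3),
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),
    bpf_stmt(BPF_RET | BPF_K, 0x0fffffff),  # pass
    bpf_stmt(BPF_RET | BPF_K, 0),           # reject
]


def false_branch_targets(prog):
    """Return the instruction index reached when each comparison fails."""
    targets = []
    for index, instruction in enumerate(prog):
        code, jt, jf, k = unpack('HBBI', instruction)
        if code & 0x07 == BPF_JMP:  # the low three bits hold the instruction class
            # Jump offsets are relative to the next instruction.
            targets.append(index + 1 + jf)
    return targets


if __name__ == '__main__':
    print('bytes per instruction:', calcsize('HBBI'))
    print('false-branch targets:', false_branch_targets(program))
```

In Python 3 the packed instructions are bytes, so the join in the script would have to be b''.join(filters_list) rather than ''.join(filters_list).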

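I tested with iperf3, generating traffic from another laptop to my laptop over an Ethernet cable. The server (my laptop) listens on port 5201, the port the filter matches, and the client (the other laptop) sends the data.

  • If I comment out print 'got data from', addr, ':', hexlify(data) and run the script, the script's CPU utilization rises to 30% to 40% with 100MB of traffic.
  • If I uncomment print 'got data from', addr, ':', hexlify(data) and run it again, CPU drops to 2% with the same traffic. I checked this in htop.

So, what is going on here?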

Answers

0

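I'd bet that either hexlify() or, far more likely, print (because it synchronizes with STDOUT) is giving your main thread a much-needed break, some breathing space, instead of an infinite while loop that does nothing but hammer the socket read.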

Try adding time.sleep(0.05) (after importing time first, of course) in place of the print statement and check the CPU usage again.
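
The experiment suggested above could look roughly like this. It is only a sketch, written for Python 3: receive_with_pause is a hypothetical helper, and a local AF_UNIX datagram socketpair stands in for the AF_PACKET socket so that it runs without root. It returns both CPU time and wall-clock time, so the loop can be compared with and without the pause.

```python
import socket
import time


def receive_with_pause(sock, packets, pause=0.05):
    """Receive `packets` datagrams, sleeping `pause` seconds after each one.

    Returns (cpu_seconds, wall_seconds) spent in the loop.
    """
    cpu_start = time.process_time()
    wall_start = time.monotonic()
    for _ in range(packets):
        sock.recvfrom(65565)   # blocks until a datagram is available
        if pause:
            time.sleep(pause)  # the pause proposed in the answer
    cpu_used = time.process_time() - cpu_start
    wall_used = time.monotonic() - wall_start
    return cpu_used, wall_used


if __name__ == '__main__':
    # Stand-in for the sniffing socket: a local datagram pair.
    receiver, sender = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    for _ in range(5):
        sender.send(b'x' * 64)
    cpu_used, wall_used = receive_with_pause(receiver, 5, pause=0.05)
    print('cpu seconds:', cpu_used, 'wall seconds:', wall_used)
    receiver.close()
    sender.close()
```

Calling it with pause=0 on the real socket gives the baseline to compare against.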

+0

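Hey, thanks for the answer. I don't want to use 'sleep'; data/traffic may arrive on the interface at a high rate. What happens if we lose some traffic or something? – Veerendra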

+0

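If your socket provides a large enough buffer, you won't lose anything; you just give other threads/processes some room to breathe. The same thing probably happens when you call print/hexlify(), you just don't control it directly. – zwer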
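
For the buffer size mentioned in this comment, a minimal sketch (Python 3 assumed; enlarge_receive_buffer is a hypothetical helper name). It asks for a larger SO_RCVBUF and reads back what the kernel actually granted. A UDP socket is used here only so that the example runs without root; the same setsockopt call applies to the raw socket s from the question. On Linux the value read back is normally double the request, and the request is capped by net.core.rmem_max.

```python
import socket


def enlarge_receive_buffer(sock, requested_bytes):
    """Request a receive buffer of `requested_bytes` and return the granted size."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == '__main__':
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    granted = enlarge_receive_buffer(sock, 4 * 1024 * 1024)  # ask for 4 MiB
    print('receive buffer granted by the kernel:', granted, 'bytes')
    sock.close()
```

Whether the granted size is enough depends on the traffic rate and on how long the loop pauses, so it is worth checking the printed value rather than assuming the request was honored.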