2011-10-01

Struct alignment with PyOpenCL

Update: the int4 in my kernel was the mistake.

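I am using pyopencl but cannot get struct alignment to work. In the code below, which calls the kernel twice, the b value comes back correctly (as 1), but the c value is some "random" value.

In other words: I am trying to read the two members of a struct. I can read the first but not the second. Why?

The same problem occurs whether I use a numpy structured array or pack the values with struct. The __attribute__ settings shown in the comments do not help either.

I suspect I am doing something stupid elsewhere in the code but cannot see it. Any help appreciated.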

import struct as s 
import pyopencl as cl 
import numpy as n 

ctx = cl.create_some_context() 
queue = cl.CommandQueue(ctx) 

for use_struct in (True, False): 

    if use_struct:
        a = s.pack('=ii',1,2)
        print(a, len(a))
        a_dev = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, len(a))
    else:
        # a = n.array([(1,2)], dtype=n.dtype('2i4', align=True))
        a = n.array([(1,2)], dtype=n.dtype('2i4'))
        print(a, a.itemsize, a.nbytes)
        a_dev = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, a.nbytes)

    b = n.array([0], dtype='i4') 
    print(b, b.itemsize, b.nbytes) 
    b_dev = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, b.nbytes) 

    c = n.array([0], dtype='i4') 
    print(c, c.itemsize, c.nbytes) 
    c_dev = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, c.nbytes) 

    prg = cl.Program(ctx, """ 
        typedef struct s {
            int4 f0;
            int4 f1 __attribute__ ((packed));
            // int4 f1 __attribute__ ((aligned (4)));
            // int4 f1;
        } s;
        __kernel void test(__global const s *a, __global int4 *b, __global int4 *c) {
            *b = a->f0;
            *c = a->f1;
        }
     """).build() 

    cl.enqueue_copy(queue, a_dev, a) 
    event = prg.test(queue, (1,), None, a_dev, b_dev, c_dev) 
    event.wait() 
    cl.enqueue_copy(queue, b, b_dev) 
    print(b) 
    cl.enqueue_copy(queue, c, c_dev) 
    print(c) 

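Output (I had to reformat it while cutting and pasting, so the line breaks may be slightly off; I also added comments noting what each printed value is):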

# first using struct 
/home/andrew/projects/personal/kultrung/env/bin/python3.2 /home/andrew/projects/personal/kultrung/src/kultrung/test6.py 
b'\x01\x00\x00\x00\x02\x00\x00\x00' 8 # the struct packed values 
[0] 4 4        # output buffer 1 
[0] 4 4        # output buffer 2 
/home/andrew/projects/personal/kultrung/env/lib/python3.2/site-packages/pyopencl/cache.py:343: UserWarning: Build succeeded, but resulted in non-empty logs: Build on <pyopencl.Device 'Intel(R) Core(TM)2 CPU   T5600 @ 1.83GHz' at 0x1385a20> succeeded, but said: 

Build started Kernel <test> was successfully vectorized Done. warn("Build succeeded, but resulted in non-empty logs:\n"+message) 
[1]   # the first value (correct) 
[240]  # the second value (wrong) 

# next using numpy 
[[1 2]] 4 8 # the numpy struct 
[0] 4 4  # output buffer 
[0] 4 4  # output buffer 
/home/andrew/projects/personal/kultrung/env/lib/python3.2/site-packages/pyopencl/__init__.py:174: UserWarning: Build succeeded, but resulted in non-empty logs: Build on <pyopencl.Device 'Intel(R) Core(TM)2 CPU   T5600 @ 1.83GHz' at 0x1385a20> succeeded, but said: 

Build started Kernel <test> was successfully vectorized Done. warn("Build succeeded, but resulted in non-empty logs:\n"+message) 
[1]  # first value (ok) 
[67447488] # second value (wrong) 

Process finished with exit code 0 

Answers

0

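OK, I don't know where I got int4 from; I think it must be an Intel extension. After switching to AMD and using int as the kernel type, everything works as expected. Once I have cleaned things up I will post more at http://acooke.org/cute/Somesimple0.html.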
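One way to see why the second read went wrong: in OpenCL C, int4 is the built-in four-component integer vector type (16 bytes), not a 4-byte int, so the kernel's struct is larger than the 8 bytes uploaded from the host. The sketch below is plain Python with no OpenCL involved. The helper read_int and the constant names are made up for illustration, and it assumes a 32-bit int.

```python
import struct

INT_BYTES = struct.calcsize("=i")   # standard-size int: 4 bytes
INT4_BYTES = 4 * INT_BYTES          # OpenCL C int4 holds four ints: 16 bytes

# The 8 bytes the question uploads to the device.
host_payload = struct.pack("=ii", 1, 2)

# Byte offset of the second member under each kernel declaration.
f1_offset_with_int4 = INT4_BYTES    # struct { int4 f0; int4 f1; }
f1_offset_with_int = INT_BYTES      # struct { int f0; int f1; }


def read_int(buf, offset):
    """Hypothetical helper: the int at `offset`, or None if out of bounds."""
    if offset + INT_BYTES > len(buf):
        return None
    return struct.unpack_from("=i", buf, offset)[0]


print(len(host_payload), 2 * INT4_BYTES)            # host bytes vs. kernel struct bytes
print(read_int(host_payload, 0))                    # f0 starts at offset 0 either way
print(read_int(host_payload, f1_offset_with_int))   # f1 with plain int members
print(read_int(host_payload, f1_offset_with_int4))  # f1 with int4 members: past the buffer
```

On a real device an out-of-bounds read does not return None; it returns whatever happens to sit in memory, which would be consistent with the 240 and 67447488 seen in the output. Declaring the struct members and the output pointers as int makes the kernel's layout match the 8 bytes packed on the host.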

0

In the OpenCL program, try putting the packed attribute on the struct itself rather than on one of its members:

typedef struct s { 
     int4 f0; 
     int4 f1; 
} __attribute__((packed)) s; 

Because you had the packed attribute on only one member of the struct, it may not have been packing the whole struct.
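As a rough illustration of what packing does and does not change, the sketch below uses Python's struct module on the host: the "@" prefix applies the platform's native alignment and "=" applies none, which is loosely analogous to an unpacked versus a packed C struct. It is an analogy rather than OpenCL code, and it assumes a platform where int is 4 bytes with 4-byte alignment.

```python
import struct

# Mixed member sizes: native alignment pads the char so the int starts at offset 4.
aligned_mixed = struct.calcsize("@ci")
packed_mixed = struct.calcsize("=ci")

# Two members of the same type: there is no padding to remove.
aligned_pair = struct.calcsize("@ii")
packed_pair = struct.calcsize("=ii")

print(aligned_mixed, packed_mixed)
print(aligned_pair, packed_pair)
```

When both members have the same type, as in the struct above, packing leaves the layout unchanged, so moving the attribute would not be expected to alter which bytes the kernel reads.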

+0

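Thanks, I just tried that but it did not fix the problem (also, the first example under "packed" at http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/attributes-variables.html suggests the attribute belongs where I had it, I think).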