XL_TYPE_ARRAY ? #16

Open
sdementen opened this issue Nov 16, 2013 · 2 comments

Comments

@sdementen

I am using xlloop with the Python server. On the Python side, I use numpy to handle large arrays/vectors. The elements of these arrays always share a single type (often doubles, but sometimes strings).
With the current XL_TYPE_MULTI, the type is encoded for each element, so there is no easy way to process n elements in one shot. Moreover, numpy can convert an array directly to bytes, but because the type must be given before each element, we cannot send these bytes as they are.

So, would it be possible to have an XL_TYPE_ARRAY that behaves like XL_TYPE_MULTI, except that the datatype is encoded once at the beginning of the stream? We would then have
XL_TYPE_ARRAY, ROWS, COLS, XL_TYPE_of_element (NUM/INT/...), ELEMENT1, ELEMENT2, ..., ELEMENTn (with n = ROWS x COLS)
This would ease the exchange of data through the sockets for Python (and R and Java would probably benefit in the same way).
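To make the proposed layout concrete, here is a minimal sketch of what encoding and decoding could look like. It is only an illustration under stated assumptions: XL_TYPE_ARRAY does not exist in xlloop, so the tag value 10 is made up; the value of XL_TYPE_NUM is assumed to match xlloop.py; the helper names `encode_array` and `decode_array` are hypothetical; and a 1-D array is treated as a single column.

```python
import struct
import numpy as np

XL_TYPE_NUM = 1      # assumed to match the constant in xlloop.py
XL_TYPE_ARRAY = 10   # hypothetical tag, NOT part of the xlloop protocol

# tag (1 byte), rows (int32), cols (int32), element type (1 byte), big-endian
HEADER = struct.Struct('>BiiB')

def encode_array(value):
    """Encode a 1-D or 2-D array of doubles with a single element-type byte."""
    arr = np.asarray(value, dtype=np.float64)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)  # assumption: a 1-D array is sent as a column
    if arr.ndim != 2:
        raise ValueError("only 1-D or 2-D arrays are supported")
    rows, cols = arr.shape
    header = HEADER.pack(XL_TYPE_ARRAY, rows, cols, XL_TYPE_NUM)
    # one bulk conversion to big-endian, row-major bytes; no per-element tags
    return header + arr.astype('>f8').tobytes()

def decode_array(buf):
    """Decode a buffer produced by encode_array back into a 2-D array."""
    tag, rows, cols, elem_type = HEADER.unpack_from(buf, 0)
    if tag != XL_TYPE_ARRAY or elem_type != XL_TYPE_NUM:
        raise ValueError("unsupported tag or element type")
    data = np.frombuffer(buf, dtype='>f8', count=rows * cols, offset=HEADER.size)
    return data.reshape(rows, cols)
```

The whole payload is produced by one `tobytes()` call and read back by one `np.frombuffer()` call, which is the saving the request is after.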

If such functionality already exists, where can I find documentation on it?

sebastien

@sdementen

BTW, here is the code I am using to encode a numpy.array in the xlloop.py server (added to the XLCodec.encode method); it could be simplified with XL_TYPE_ARRAY:

        elif isinstance(value, np.ndarray):
            socket.send(struct.pack('B', XL_TYPE_MULTI))        # this could be an XL_TYPE_ARRAY
            sh = value.shape
            rows = sh[0]
            value = value.reshape((-1,))
            socket.send(struct.pack('>i', rows))
            if len(sh) == 1:
                # one dimensional
                socket.send(struct.pack('>i', 0)) # zero cols
            else:
                # two dimensional
                assert len(sh) == 2
                cols = sh[1]
                socket.send(struct.pack('>i', cols))
            # convert to big-endian and add the type byte before each element
            result = np.zeros(dtype=[('type', 'i1'), ('data', '>f8')], shape=len(value))
            result["type"] = XL_TYPE_NUM
            result["data"] = value
            socket.send(result.tobytes())  # tobytes() replaces the deprecated tostring()
            # with no endianness conversion and a single type byte before the stream of an XL_TYPE_ARRAY, this would simplify to
            #socket.send(struct.pack('B', XL_TYPE_NUM))
            #socket.send(value.tobytes())
            # ... saving memory and time

@mnar53

mnar53 commented Jan 18, 2016

Assuming we are only interested in arrays of doubles, I think the encoding can be streamlined to

    def rowcol(A):
        a = A.shape
        rank = len(a)
        if rank == 2:
            return (a[0], a[1])
        elif rank == 1:
            return (a[0], 1)  # a column
        else:
            return (0, 0)

    def sendDoubleArray(socket, value):
        (rows, cols) = rowcol(value)
        socket.send(struct.pack('B', XL_TYPE_MULTI))
        socket.send(struct.pack('>i', rows))
        if rows == 0:
            socket.send(struct.pack('>i', 0))  # zero cols
        else:
            socket.send(struct.pack('>i', cols))
            for x in numpy.nditer(value, order='C'):
                socket.send(struct.pack('B', XL_TYPE_NUM))
                socket.send(struct.pack('>d', x))

On the other side, the decoding of an XL_TYPE_MULTI might be

    ...
    elif type == XL_TYPE_MULTI:
        rows = decodeInt(sockt.recv(4))
        cols = decodeInt(sockt.recv(4))
        if cols == 0 or rows == 0:
            return []
        #print 'DECODING ARRAY'
        # peek at the type byte of the first element without consuming it
        type = ord(sockt.recv(1, socket.MSG_PEEK))
        if type == XL_TYPE_NUM:
            # each element is 1 type byte + 8 data bytes
            return decodeDoubleArray(rows, cols, sockt.recv(9*rows*cols))
        elif type == XL_TYPE_STR:
            return decodeStringArray(rows, cols, sockt)

with:

    def decodeDoubleArray(rows, cols, buff):
        a = numpy.zeros((rows*cols))
        k = 1  # skip the type byte of the first element
        idx = 0
        for i in range(rows):
            for j in range(k, k+9*cols, 9):
                a[idx] = struct.unpack_from('>d', buff, j)[0]
                idx += 1
            k += 9*cols
        return a.reshape(rows, cols)
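As a possible alternative to the element-by-element loop, the same 9-byte-per-element buffer can be viewed through a numpy structured dtype, mirroring the trick used on the encoding side earlier in this thread. The sketch below is not part of xlloop: `recv_exact` and `decode_double_array` are hypothetical helper names, and the value of XL_TYPE_NUM is assumed to match xlloop.py. The `recv_exact` helper is included because a single `recv(n)` call may return fewer than `n` bytes, which matters for large arrays.

```python
import socket
import struct
import numpy as np

XL_TYPE_NUM = 1  # assumed to match the constant in xlloop.py

# Each element on the wire: 1 type byte followed by a big-endian double.
ELEMENT = np.dtype([('type', 'u1'), ('data', '>f8')])

def recv_exact(sock, n):
    """Read exactly n bytes from sock, looping over short reads."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def decode_double_array(rows, cols, buff):
    """Decode rows*cols tagged doubles from buff into a (rows, cols) array."""
    records = np.frombuffer(buff, dtype=ELEMENT, count=rows * cols)
    if not np.all(records['type'] == XL_TYPE_NUM):
        raise ValueError("array contains non-numeric elements")
    # copy into a contiguous, native-endian float64 array
    return records['data'].astype(np.float64).reshape(rows, cols)
```

In the decoding branch above, this would be called as `decode_double_array(rows, cols, recv_exact(sockt, 9*rows*cols))`. Unlike the loop version, it also checks the type byte of every element instead of only the first one.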
