Sunday, March 6, 2011

long <-> str binary conversion

Is there any lib that convert very long numbers to string just copying the data?

These one-liners are too slow:

def xlong(s):
    return sum([ord(c) << e*8 for e,c in enumerate(s)])

def xstr(x):
    return chr(x&255) + xstr(x >> 8) if x else ''

print xlong('abcd'*1024) % 666
print xstr(13**666)
From stackoverflow
  • You want the struct module.

    packed = struct.pack('l', 123456)
    assert struct.unpack('l', packed)[0] == 123456
    
    J.F. Sebastian : It will not work for large numbers e.g., 13**666
  • How about

    from binascii import hexlify, unhexlify
    
    def xstr(x):
        hex = '%x' % x
        return unhexlify('0'*(len(hex)%2) + hex)[::-1]
    
    def xlong(s):
        return int(hexlify(s[::-1]), 16)
    

    I didn't time it but it should be faster and also work on larger numbers, since it doesn't use recursion.

  • I'm guessing you don't care about the string format, you just want a serialization? If so, why not use Python's built-in serializer, the cPickle module? The dumps function will convert any python object including a long integer to a string, and the loads function is its inverse. If you're doing this for saving out to a file, check out the dump and load functions, too.

    >>> import cPickle
    >>> print cPickle.loads(cPickle.dumps(13**666)) % 666
    73
    >>> print (13**666) % 666
    73
    
  • If you need fast serialization use marshal module. It's around 400x faster than your methods.

  • Performance of cPickle vs. marshal (Python 2.5.2, Windows):

    python -mtimeit -s"from cPickle import loads,dumps;d=13**666" "loads(dumps(d))"
    1000 loops, best of 3: 600 usec per loop
    
    python -mtimeit -s"from marshal import loads,dumps;d=13**666" "loads(dumps(d))"
    100000 loops, best of 3: 7.79 usec per loop
    
    python -mtimeit -s"from pickle import loads,dumps;d= 13**666" "loads(dumps(d))"
    1000 loops, best of 3: 644 usec per loop
    

    marshal is much faster.

  • In fact, I have a lack of long(s,256) . I lurk more and see that there are 2 function in Python CAPI file "longobject.h":

    PyObject * _PyLong_FromByteArray( const unsigned char* bytes, size_t n, int little_endian, int is_signed);
    int _PyLong_AsByteArray(PyLongObject* v, unsigned char* bytes, size_t n, int little_endian, int is_signed);
    

    They do the job. I don't know why there are not included in some python module, or correct me if I'am wrong.

0 comments:

Post a Comment