A record of Python pitfalls

  1. Python character-encoding problems
  2. MySQL text, longtext
  3. MySQL charset
  4. If MySQLdb throws an AttributeError again, try going into cursors.py under site-packages and adding a stray "print re" line anywhere
    … although to this day the cause of this bug is unknown.
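  5. itertools.product can surprise you on empty input. It is a strict Cartesian product, so a single empty iterable makes the whole result empty; it does not yield (elem, None) pairs. Called with no arguments at all, it yields one empty tuple.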
    http://stackoverflow.com/questions/3154301/what-should-itertools-product-yield-when-supplied-an-empty-list
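
The behaviour of itertools.product on empty input in item 5 is easy to check directly. The following is a minimal sketch using only the standard library; the sample lists are made up for illustration.

```python
from itertools import product

letters = ["a", "b"]  # illustrative sample data
empty = []

# One empty input makes the whole Cartesian product empty.
pairs_with_empty = list(product(letters, empty))

# With no inputs at all, the product is a single empty tuple.
no_args = list(product())

# Two non-empty inputs give every combination, in input order.
pairs = list(product(letters, [1, 2]))

print(pairs_with_empty)  # []
print(no_args)           # [()]
print(pairs)             # [('a', 1), ('a', 2), ('b', 1), ('b', 2)]
```

If a (elem, None) style result is what you actually want, you have to substitute a placeholder such as [None] for the empty list yourself before calling product.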
  6. The long, painful search for a way to iterate over a list twice…
  7. CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously. In short: CPython runs Python code in only one thread at a time.
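
The remark in item 7 that threading still suits I/O-bound work can be seen in a small timing sketch. It assumes Python 3 (for concurrent.futures) and uses time.sleep as a stand-in for a blocking network or disk call; the names fake_io and run_io_tasks are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(seconds):
    # time.sleep releases the GIL, much like a blocking socket or file read.
    time.sleep(seconds)
    return seconds


def run_io_tasks(delays):
    # Run one thread per task and measure the total wall-clock time.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        results = list(executor.map(fake_io, delays))
    return results, time.perf_counter() - start


results, elapsed = run_io_tasks([0.2, 0.2, 0.2, 0.2])
print(results, round(elapsed, 2))
```

Run one after another, the four waits would take about 0.8 seconds; with threads they overlap. CPU-bound work would not overlap this way, which is where multiprocessing comes in.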
  8. MySQLdb and large remote result sets:

    class CursorUseResultMixIn
    This is a MixIn class which causes the result set to be stored
    in the server and sent row-by-row to client side, i.e. it uses
    mysql_use_result(). You MUST retrieve the entire result set and
    close() the cursor before additional queries can be performed on
    the connection.

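    Pay special attention to this.
    The default cursor copies the entire remote result set into local memory, so for large results use a cursor built on this mixin, such as SSCursor.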
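
The "retrieve everything, then close()" rule from item 8 can be wrapped in a small generator. This is only a sketch: stream_rows is a hypothetical helper, and the demo uses sqlite3 as a stand-in so it runs without a MySQL server. With MySQLdb you would pass MySQLdb.cursors.SSCursor as cursorclass.

```python
import sqlite3


def stream_rows(connection, query, cursorclass=None):
    """Yield rows one at a time, then close the cursor."""
    # connection.cursor(cursorclass) is the MySQLdb form; the argument is
    # only passed when given, since other drivers may not accept it.
    if cursorclass is not None:
        cursor = connection.cursor(cursorclass)
    else:
        cursor = connection.cursor()
    try:
        cursor.execute(query)
        for row in cursor:  # rows are pulled as the caller consumes them
            yield row
    finally:
        # Runs once the rows are exhausted or the generator is closed,
        # so the connection is free for the next query.
        cursor.close()


# Stand-in demo with an in-memory sqlite3 database (not MySQL).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE t (x INTEGER)")
demo.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
rows = list(stream_rows(demo, "SELECT x FROM t ORDER BY x"))
print(rows)  # [(1,), (2,), (3,)]
```

With a server-side cursor, do not issue another query on the same connection until the loop over stream_rows has finished.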

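  9. On my Python 2.6, multiprocessing and functools.partial do not get along: partial is not picklable. At its core, this problem is about preserving a piece of state. A class is a good fit for holding local state: add a __call__ method and its instances become callable, and since self is passed in, the state is not lost. So I used a class to solve the problem.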

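    from multiprocessing import Pool

    class handler(object):
        def __init__(self, func, bindarg):
            # keep the function and the bound argument as instance state
            self.func, self.param = func, bindarg
        def __call__(self, arg):  # pool.map passes exactly one argument
            return self.func(self.param, arg)

    if __name__ == "__main__":
        pool = Pool()
        print(pool.map(handler(max, 10), range(0, 100)))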

    For another workaround, see "Hack for functools.partial and multiprocessing".

  10. Never ever use the built-in json library. Use UltraJSON (ujson) instead!
