Friday, December 5, 2014

Obfuscating Python code


WARNING!!!:
Obfuscating code is probably a bad idea. It can have unexpected consequences. Think twice before continuing.


The script:
The following shell script serves as a reference for obfuscating Python code: it compiles the original files to .pyo and then hides the original source by emptying the .py files.


#!/bin/sh
# DIR_L: directory list (module list)

DIR_L="pythonModule1 pythonModule2 pythonModule3"

CWD=`pwd`
for dirA in ${DIR_L}; do
    cd "${CWD}/${dirA}" || exit 1
    rm -f *.pyc *.pyo
    python -OO -m compileall .
    for f in *.py; do
        # Skip the unexpanded pattern when the directory has no .py files
        [ -f "$f" ] || continue
        # Create an empty "file".py with the same modification time
        # as tmp (the renamed original "file".py, which was used to
        # create the pyo file)
        mv "$f" tmp && touch -r tmp "$f" && rm tmp
    done
done



Explanation:
Whenever the matching *py file is present, Python compares its modification date with the one stored in the header of the *pyo object. If they don't match, the pyo will be recreated.
Creating a pyo file and then removing the code in the original *py file will not work, since once the code is removed (echo -n "" > myfile.py) the modification date is updated. The next run will regenerate the pyo file from the empty py file.
Removing the py file and leaving only the pyo file will not work either when the interpreter is started without -O, since Python then looks only for pyc files and ignores the pyo file.
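
The timestamp comparison described above can be observed directly. What follows is a minimal sketch, not the interpreter's own code. It assumes a CPython 3.7+ interpreter, where a compiled file starts with a magic number (4 bytes), a flags field (4 bytes), the source mtime (4 bytes) and the source size (4 bytes). The Python 2 pyc/pyo files this post deals with have no flags or size field, so there the mtime sits at offset 4 instead. The names embedded_mtime, demo and myfile.py are made up for the illustration.

```python
import os
import py_compile
import struct
import tempfile


def embedded_mtime(compiled_path):
    """Return the source mtime recorded in a compiled file's header.

    Assumes the CPython 3.7+ header layout (magic, flags, mtime, size).
    """
    with open(compiled_path, "rb") as fh:
        header = fh.read(16)
    flags = struct.unpack("<I", header[4:8])[0]
    if flags & 0x1:
        # Hash-based files store a source hash instead of a timestamp.
        raise ValueError("hash-based compiled file, no timestamp stored")
    return struct.unpack("<I", header[8:12])[0]


def demo():
    """Compile a throwaway module and return both timestamps."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "myfile.py")
        compiled = os.path.join(workdir, "myfile.pyc")
        with open(source, "w") as fh:
            fh.write("ANSWER = 42\n")
        # Force the timestamp-based format regardless of the environment.
        py_compile.compile(
            source,
            cfile=compiled,
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        # The header keeps only the low 32 bits of the integer mtime.
        source_mtime = int(os.stat(source).st_mtime) & 0xFFFFFFFF
        return embedded_mtime(compiled), source_mtime


if __name__ == "__main__":
    recorded, on_disk = demo()
    print("recorded in header:", recorded)
    print("source file mtime: ", on_disk)
    print("match:", recorded == on_disk)
```

If the source file is modified after compilation, the two values stop matching, which is the condition under which the compiled file gets regenerated.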

The quick-and-easy solution then is to use "touch" with the -r option. This combination creates an empty file and sets its access and modification timestamps to those of the reference file passed to "-r". Since moving a file doesn't change its modification timestamp, a line similar to:

  mv myfile.py tmp && touch -r tmp myfile.py && rm tmp 

will create an empty myfile.py whose modification timestamp matches the one recorded in myfile.pyo (created previously using python -OO -m compileall .).
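
The same "empty the file but keep its timestamp" step can also be sketched in Python with the standard library's os.stat and os.utime, without relying on touch. This is only an illustration of the idea, not part of the script above, and the names blank_keep_times and demo are hypothetical. Note also that Python 3 interpreters record the source size in the compiled file in addition to the timestamp, so an emptied source would not pass their check; the sketch mirrors only the timestamp handling that the Python 2 setup in this post relies on.

```python
import os
import tempfile


def blank_keep_times(path):
    """Empty the file at `path` but restore its original timestamps."""
    before = os.stat(path)
    # Truncating the file updates its modification time...
    with open(path, "w"):
        pass
    # ...so put the original access/modification times back.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


def demo():
    """Blank a throwaway file and report its mtime before and after."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "myfile.py")
        with open(source, "w") as fh:
            fh.write("SECRET = 'do not ship'\n")
        # Pretend the file was last modified an hour ago.
        old = os.stat(source).st_mtime_ns - 3600 * 10**9
        os.utime(source, ns=(old, old))
        mtime_before = os.stat(source).st_mtime_ns
        blank_keep_times(source)
        after = os.stat(source)
        return mtime_before, after.st_mtime_ns, after.st_size


if __name__ == "__main__":
    mtime_before, mtime_after, size = demo()
    print("mtime preserved:", mtime_before == mtime_after)
    print("size after blanking:", size)
```

Unlike the mv/touch/rm sequence, this version never creates a temporary file, so it cannot clash with an existing file named tmp.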

While not shown in the script, it's pretty clear that we must be working with a copy of the original sources; otherwise they will be lost after obfuscation. It is up to the reader to choose their favourite way to make the copy. As a hint, I use something like the following one-liner:

tar -C src_dir -cf - . | tar -C working_dir -xf -
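
For readers who prefer to make the working copy from Python, the following sketch uses shutil.copytree from the standard library. It is just one possible approach, and it assumes that working_dir does not exist yet. The names make_working_copy and demo, as well as the sample file a.py, are hypothetical.

```python
import os
import shutil
import tempfile


def make_working_copy(src_dir, working_dir):
    """Copy src_dir to working_dir, which must not exist yet."""
    # copytree uses shutil.copy2 by default, which also tries to
    # preserve file metadata such as modification times.
    return shutil.copytree(src_dir, working_dir)


def demo():
    """Copy a small throwaway tree and read a file back from the copy."""
    with tempfile.TemporaryDirectory() as base:
        src_dir = os.path.join(base, "src_dir")
        working_dir = os.path.join(base, "working_dir")
        os.makedirs(os.path.join(src_dir, "pythonModule1"))
        with open(os.path.join(src_dir, "pythonModule1", "a.py"), "w") as fh:
            fh.write("X = 1\n")
        make_working_copy(src_dir, working_dir)
        copied = os.path.join(working_dir, "pythonModule1", "a.py")
        with open(copied) as fh:
            return fh.read()


if __name__ == "__main__":
    print(repr(demo()))
```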

Alternatives:
pyminifier provides utilities to minify, obfuscate and compress Python code, as well as a Python API to use it from other Python code. Unfortunately, in my own tests it was buggy, with many cryptic errors about missing modules.

Notes:
touch is only available on UNIX/POSIX-friendly systems. It will probably also work on Windows with the help of Cygwin, but this is not tested at all (and it never will be!).

