
Using PyInstaller and Cython to create a Python executable

Compile selected modules with Cython and bundle your application with PyInstaller.

6 October 2021 Updated 6 October 2021

You have created a Python application and want to distribute it. You probably have it running in a Python Virtual Environment. But your customers don't have this setup, and some may not even have Python installed.

There are several programs that can convert your Python application into a single executable file. Here I am using PyInstaller. You may also want to protect some of your code and/or speed up some operations. For this we can use Cython.

Spoiler: Adding static Cython type declarations for two variables gave a 50-fold performance increase.

As always, my development machine is running Ubuntu 20.04.

PyInstaller

PyInstaller is a tool that bundles a Python application and all its dependencies into a single executable or directory. The packaged app can run without installing a Python interpreter or modules.

Currently, you can use PyInstaller to build executables for Linux, Windows and macOS. PyInstaller does not cross-compile: to create an executable for Linux you must run PyInstaller on a Linux system, to create an executable for Windows you must run it on a Windows system, and so on. PyInstaller also runs on a Raspberry Pi.

PyInstaller uses (library) files from the system on which you build, which means an executable built on Windows 10 may not run on Windows 8.

Currently, PyInstaller does not support ARM. However, it is possible to create a bootloader for ARM, see 'Building the bootloader'. Once you have done this, you can run PyInstaller on your ARM system to build the executable.

It is also possible to feed PyInstaller with some (library) files and do cross-compilation, e.g. use an ARM compiler on your development system. This would be great, but I have not researched it, and there is not much to find about it on the internet.

The generated executable file is not small. The size of a minimal executable file is about 7 MB. You should always run PyInstaller in a Virtual Environment where you have a minimum of packages.

PyInstaller can create a single executable or a directory containing your executable and library files. The directory approach is a bit messy, but has the advantage that your application starts faster because lots of files don't have to be unzipped. If you have more executables that use the same libraries you can group them together in this directory.
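The unpacking step mentioned above also decides where bundled files end up at run time. The sketch below shows one way an application could locate a data file in both modes. It relies on the `sys.frozen` and `sys._MEIPASS` attributes that PyInstaller sets in a bundled app; the helper names `bundle_dir` and `resource_path` are hypothetical, and falling back to the current working directory is an assumption for unbundled runs.

```python
import sys
from pathlib import Path


def bundle_dir() -> Path:
    """Return the directory that holds the application's data files."""
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        # Running inside a PyInstaller bundle. With --onefile this is the
        # temporary directory the files were unpacked into at startup.
        return Path(sys._MEIPASS)
    # Plain Python run. Assumption: the program is started from the
    # project directory, so data files sit in the working directory.
    return Path.cwd()


def resource_path(name: str) -> Path:
    """Return the full path of a bundled data file (hypothetical helper)."""
    return bundle_dir() / name


print(resource_path("README"))
```

With a helper like this, the same code can open a data file whether it runs from source, from a one-directory build, or from a single-file build.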

An executable generated by PyInstaller can trigger antivirus software. In this case you should whitelist it.

Cython

From the website: 'Cython is an optimizing static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself.'

I have a Python application and want to distribute it. I can do this with PyInstaller, but I also want to have some parts of my Python code protected. If you search the internet for 'pyinstaller decompile' you will see that it is not that difficult to get the code out, because .pyc bytecode is easy to decompile.
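To get a feel for how much survives in bytecode, here is a small sketch that uses only the standard library. The source string is a made-up stand-in for a private module, not code from the test application. It compiles the source the same way Python does before writing a .pyc file, then prints what is still readable.

```python
import dis

# Hypothetical stand-in for a module you would rather keep private.
source = (
    "def process():\n"
    "    secret = 'a string'\n"
    "    return secret\n"
)

# compile() produces the same kind of code object that is stored in a .pyc file.
module_code = compile(source, "some_classes.py", "exec")

# The function body is stored as a nested code object among the constants.
func_code = next(c for c in module_code.co_consts if hasattr(c, "co_code"))

print(func_code.co_name)      # the function name is kept
print(func_code.co_varnames)  # local variable names are kept
print(func_code.co_consts)    # string literals are kept
dis.dis(func_code)            # readable instruction listing
```

Names, literals and the instruction sequence are all there, which is why decompilers can rebuild something very close to the original source.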

Since we already need to build executables with PyInstaller for different target systems, we can use Cython to compile some Python files before starting PyInstaller. For Linux, the compiled files are .so files, shared library files. Everything can be reverse-engineered, but .so is more difficult than .pyc.

And now that we're using Cython, maybe we can also easily improve performance.

The test application

In a Virtual Environment we install the packages PyInstaller and Cython:

  • pip install pyinstaller
  • pip install cython

Then we create a directory 'project' with the following structure and files:

.
├── app
│   ├── factory.py
│   └── some_package
│       ├── __init__.py
│       └── some_classes.py
├── README
└── run.py

# run.py
from app.factory import run

if __name__ == '__main__':
    run()

# app/factory.py
import datetime
from app.some_package import SomeClass1

def run():
    print('Running ...')
    some_class1 = SomeClass1()
    ts = datetime.datetime.now()
    loops = some_class1.process()
    elapsed = (datetime.datetime.now() - ts).total_seconds()
    print('Ready in: {} seconds, loops = {}'.format(elapsed, loops))

# app/some_package/__init__.py
from .some_classes import SomeClass1

# app/some_package/some_classes.py

class SomeClass1:
    def process(self):
        print('Processing ...')
        a = 'a string'
        loops = 0
        for i in range(10000000):
            b = a
            loops += 1
        return loops

README is an empty file. The method 'process' in SomeClass1 contains a for loop doing an assignment and increment 10000000 times. In factory.py we measure the time it takes to execute this method.
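The measurement in factory.py uses wall-clock timestamps from datetime. As an alternative sketch, the same measurement can be taken with `time.perf_counter()`, which is meant for timing short intervals. The `timed` helper and the reduced loop count are assumptions made for this illustration and are not part of the test application.

```python
import time


def process(count):
    """Same loop as SomeClass1.process, with a configurable count."""
    a = 'a string'
    loops = 0
    for i in range(count):
        b = a
        loops += 1
    return loops


def timed(func, *args):
    """Run func once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


loops, elapsed = timed(process, 1_000_000)
print('Ready in: {:.6f} seconds, loops = {}'.format(elapsed, loops))
```

For more stable numbers you could repeat the call a few times and keep the lowest value, since a single run is sensitive to whatever else the machine is doing.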

Run with Python

To run this with Python we do:

python run.py

The result is:

Running ...
Processing ...
Ready in: 0.325362 seconds, loops = 10000000

This means it takes about 325 milliseconds to complete.

Create executable with PyInstaller

To create an executable:

pyinstaller --onefile --name="myapp" --add-data="README:README" run.py

This gives a lot of output:

23 INFO: PyInstaller: 4.5.1
23 INFO: Python: 3.8.10
29 INFO: Platform: Linux-5.11.0-37-generic-x86_64-with-glibc2.29
...
4996 INFO: Building EXE from EXE-00.toc completed successfully.

In the 'project' directory two new directories have been added:

  • build
  • dist

Our executable is in the dist directory. It is almost 7 MB:

-rwxr-xr-x 1 peter peter 6858088 okt  6 11:05 myapp

To execute it:

dist/myapp

And the result:

Running ...
Processing ...
Ready in: 0.422389 seconds, loops = 10000000

This is about 30% slower than the non-PyInstaller version (0.422 versus 0.325 seconds).

Now let PyInstaller create a directory with files, i.e. without the --onefile option:

pyinstaller --name="myapp" --add-data="README:README" run.py

Important: The format of --add-data is <SRC:DST>. The separator is ':' on Linux and ';' on Windows.
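If the build command is assembled from a Python script rather than typed by hand, the separator does not have to be hard-coded. This is a minimal sketch, assuming `os.pathsep` matches the separator your PyInstaller version expects on the current platform (':' on Linux, ';' on Windows). The function name `add_data_arg` is hypothetical.

```python
import os


def add_data_arg(src, dst):
    """Build an --add-data option with the current platform's separator."""
    return "--add-data={}{}{}".format(src, os.pathsep, dst)


# Same options as the one-file command above, as an argument list.
args = ["--onefile", "--name=myapp", add_data_arg("README", "README"), "run.py"]
print(" ".join(["pyinstaller"] + args))
```

A list like this can be passed to `subprocess.run(["pyinstaller"] + args)` on each build machine without editing the command per platform.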

Our executable is now in the dist/myapp directory. To execute it:

dist/myapp/myapp

And the result:

Running ...
Processing ...
Ready in: 0.423248 seconds, loops = 10000000

Processing time is about the same as with the --onefile option.

Add Cython

To start, we only let Cython compile one file:

app/some_package/some_classes.py

For this we create a setup.py file in the 'project' directory:

# setup.py
from setuptools import find_packages, setup
from setuptools.extension import Extension

from Cython.Build import cythonize
from Cython.Distutils import build_ext

setup(
    name="myapp",
    version='0.100',
    ext_modules = cythonize(
        [
            Extension("app.some_package.some_classes", 
                ["app/some_package/some_classes.py"]),
        ],
        build_dir="build_cythonize",
        compiler_directives={
            'language_level' : "3",
            'always_allow_keywords': True,
        }
    ),
    cmdclass=dict(
        build_ext=build_ext
    ),
)

To compile, we run:

python setup.py build_ext --inplace

This created a build_cythonize directory but, more importantly, it also created a new file in our module directory:

└── some_package
    ├── __init__.py
    ├── some_classes.cpython-38-x86_64-linux-gnu.so
    └── some_classes.py

The file 'some_classes.cpython-38-x86_64-linux-gnu.so' is the compiled version of some_classes.py! Note that the size of the compiled file is 168 KB.

-rw-rw-r-- 1 peter peter     68 okt  6 10:31 __init__.py
-rwxrwxr-x 1 peter peter 168096 okt  6 12:48 some_classes.cpython-38-x86_64-linux-gnu.so
-rw-rw-r-- 1 peter peter    293 okt  6 12:39 some_classes.py
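The long suffix in the file name depends on the Python version and the platform, so it will differ on other machines. The sketch below asks the running interpreter which suffix it uses for extension modules. The fallback to `importlib.machinery.EXTENSION_SUFFIXES` is a precaution I added in case the `EXT_SUFFIX` configuration variable is not set.

```python
import importlib.machinery
import sysconfig

# EXT_SUFFIX is the file name suffix this interpreter uses for extension
# modules, for example '.cpython-38-x86_64-linux-gnu.so' on the setup above.
suffix = (sysconfig.get_config_var("EXT_SUFFIX")
          or importlib.machinery.EXTENSION_SUFFIXES[0])

compiled_name = "some_classes" + suffix
print(compiled_name)
```

A build script could use this value instead of a hard-coded file name when it has to run under more than one Python version.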

The beauty of Cython is that we can now run our application in exactly the same way, without changing anything:

python run.py

The result is:

Running ...
Processing ...
Ready in: 0.079608 seconds, loops = 10000000

It runs 4 times faster than without Cython!
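Because some_classes.py and the compiled .so file now sit next to each other, it can be reassuring to check which one Python actually imports. The helper below is a sketch of such a check. The name `is_extension_module` is hypothetical, and the example call uses the standard library's json package only so that the snippet runs anywhere.

```python
import importlib
import importlib.machinery


def is_extension_module(module_name):
    """Return True if the named module is loaded from a compiled extension file."""
    module = importlib.import_module(module_name)
    # Built-in modules may have no __file__ attribute at all.
    path = getattr(module, "__file__", None) or ""
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


# json is a pure Python package, so this reports False.
print(is_extension_module("json"))
```

Run from the 'project' directory after the build step, `is_extension_module("app.some_package.some_classes")` should report True if the compiled file is the one being picked up.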

Create executable with PyInstaller using the compiled file

Because the build now consists of several steps, we put them in a Bash script, compile_and_bundle, that we can execute:

#!/usr/bin/bash

app_name="myapp"
echo "building for app = ${app_name}"

# cleanup
# -f keeps rm quiet when a path does not exist yet, e.g. on the first run
rm -rf dist build "${app_name}.spec"
find . | grep -E "(__pycache__|\.pyc|\.pyo$)" | xargs rm -rf

# compile
python setup.py build_ext --inplace

# bundle
pyinstaller \
    --onefile \
    --name "${app_name}" \
    --add-data="README:README" \
    --add-binary="app/some_package/some_classes.cpython-38-x86_64-linux-gnu.so:app/some_package/" \
    run.py

Make the script executable:

chmod 755 compile_and_bundle

Compile and bundle:

./compile_and_bundle

Again a lot of output ending with:

...
064 INFO: Building EXE from EXE-00.toc completed successfully.

Our executable is in the dist directory. Let's run it:

dist/myapp

The result is:

Running ...
Processing ...
Ready in: 0.128354 seconds, loops = 10000000

Without changing anything we have created an executable that is running 3 to 4 times faster than the version without Cython. It also protects our code better.

The Python version of some_classes.py is not included in the executable, only the compiled version is added. At least that is what I conclude. This shows a number of hits:

grep -r factory.py build

This does not show anything:

grep -r some_classes.py build

Much faster with static Cython type declarations

At certain locations in our code we can add static Cython type declarations. To do this we must first import the cython module. In some_classes.py we add type declarations for 'i' and 'loops'.

# app/some_package/some_classes.py
import cython

class SomeClass1:
    def process(self):
        i: cython.int # here
        loops: cython.int # and here
        print('Processing ...')
        a = 'a string'
        loops = 0
        for i in range(10000000):
            b = a
            loops += 1
        return loops
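The annotations above only have an effect once the file is compiled. Under the plain interpreter the same file still runs, just without the speed-up. As a sketch, the `cython.compiled` flag can tell the two situations apart. The try/except fallback and the function name `running_compiled` are my additions, so that the snippet also runs where Cython is not installed.

```python
try:
    import cython  # available wherever Cython is installed
except ImportError:
    cython = None  # plain interpreter without Cython


def running_compiled():
    """Return True only when this code was compiled by Cython."""
    return bool(cython is not None and cython.compiled)


print('compiled by Cython:', running_compiled())
```

Printing this flag once at startup is a cheap way to confirm that a bundle contains the compiled module and not the Python source.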

Compile and bundle:

./compile_and_bundle

Our executable is in the dist directory. Let's run it:

dist/myapp

The result is:

Running ...
Processing ...
Ready in: 0.002335 seconds, loops = 10000000

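Amazing! With a simple change, our program runs more than 50 times faster than the untyped Cython executable (0.128 versus 0.002 seconds), and about 180 times faster than the plain PyInstaller executable.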
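To put all measurements side by side, the short script below computes the ratios from the timings reported in this post. The numbers are copied from the runs above. The labels are mine.

```python
# Timings in seconds, copied from the runs in this post.
timings = {
    'python': 0.325362,
    'pyinstaller': 0.422389,
    'cython + python': 0.079608,
    'cython + pyinstaller': 0.128354,
    'typed cython + pyinstaller': 0.002335,
}

# Compare every variant with the plain PyInstaller executable.
baseline = timings['pyinstaller']
for label, seconds in timings.items():
    print('{:<28} {:>9.6f} s  {:7.1f}x'.format(
        label, seconds, baseline / seconds))
```

Keep in mind that these are single runs of a very small loop, so the ratios are indicative rather than precise.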


Compiling all files of a package

I also wanted to compile all files of my package. But I could not get this working. I tried in setup.py:

        [
            Extension("app.some_package.some_classes.*", 
                ["app/some_package/some_classes/*.py"]),
        ],

The result was:

ValueError: 'app/some_package/some_classes/*.py' doesn't match any files

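The pattern cannot match because some_classes is a file, not a directory, so 'app/some_package/some_classes/*.py' points at nothing. The only way I could get this to work was by creating an almost identical file, compile.py, in the 'app' directory.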

# app/compile.py
from setuptools import find_packages, setup
from setuptools.extension import Extension

from Cython.Build import cythonize
from Cython.Distutils import build_ext

setup(
    name="packages",
    version='0.100',
    ext_modules=cythonize(
        [
           Extension('some_package.*', ['some_package/*.py']),
        ],
        build_dir="build_cythonize",
        compiler_directives={
            'language_level' : "3",
            'always_allow_keywords': True,
        },
    ),
    cmdclass=dict(
        build_ext=build_ext
    ),
    packages=['some_package']
)

Then we run this in the 'app' directory:

python compile.py build_ext --inplace

Now both files are compiled. We can add them using --add-binary like before.
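With more than one compiled file, writing each --add-binary option by hand gets tedious. The sketch below collects the options for every compiled file in a package directory. The function name `add_binary_args` is hypothetical, and matching on the .so and .pyd suffixes is an assumption about how the compiled files are named on Linux and Windows.

```python
import os
from pathlib import Path


def add_binary_args(package_dir):
    """Return one --add-binary option per compiled file in package_dir."""
    base = Path(package_dir)
    if not base.is_dir():
        return []
    args = []
    for path in sorted(base.iterdir()):
        if path.suffix in ('.so', '.pyd'):
            # SRC is the compiled file, DST is its directory inside the bundle.
            args.append('--add-binary={}{}{}/'.format(
                path.as_posix(), os.pathsep, path.parent.as_posix()))
    return args


# Run from the 'project' directory; prints nothing if nothing is compiled yet.
for arg in add_binary_args('app/some_package'):
    print(arg)
```

The resulting list can be appended to the other PyInstaller options in the build script.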

Summary

This was an amazing journey. We learned how to build platform-dependent executables. And in the meantime we increased the performance of our program with almost no effort.

Of course, if your program is mostly IO-bound then you will not notice much change, but for CPU-bound tasks the increase in speed can be very significant (or maybe even essential).

Links / credits

Compiling Python Code with Cython
https://ron.sh/compiling-python-code-with-cython

Cython
https://en.wikipedia.org/wiki/Cython

PyInstaller Manual
https://pyinstaller.readthedocs.io/en/stable/

Using Cython to protect a Python codebase
https://bucharjan.cz/blog/using-cython-to-protect-a-python-codebase.html

Using PyInstaller to Easily Distribute Python Applications
https://realpython.com/pyinstaller-python
