I am always keeping an eye on mypyc, typed_python (an LLVM-based Python compiler), and Nuitka.
I guess that because Python is extremely dynamic, we may never have a full everything-works compiler, but I’m excited about the possibility of this becoming some kind of intermediate step where different parts of the program get compiled when possible.
We can have a compiler that does everything. It's just a matter of whether you have to stick the python interpreter in the compiled binary or not, or how much of it you have to use and whether you can only use the parts required. This is how a lot of Scheme compilers work, even though you still have `eval` and similar things.
If Tcl can be compiled (to a large degree, and without type annotations) to machine code ahead of time using TclQuadCode, there's every hope for Python!
Nuitka only removes interpreter overhead (just 30%), so it's still quite slow. To get real performance improvements, we'd need memory optimizations such as the hidden classes and shapes used by modern JITs, which store data directly on the object instead of inside a dictionary. https://mathiasbynens.be/notes/shapes-ics
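To make the dictionary-versus-fixed-layout point concrete, here is a minimal CPython sketch. It uses `__slots__` as a rough stand-in for the "data stored directly on the object" idea; it is not how a JIT's hidden classes work internally, and the class names are made up for illustration.

```python
import sys


class DictPoint:
    """Ordinary class: attributes are stored in a per-instance __dict__."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


class SlotPoint:
    """__slots__ class: attributes are stored in fixed slots on the object."""

    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


d = DictPoint(1.0, 2.0)
s = SlotPoint(1.0, 2.0)

has_dict_d = hasattr(d, "__dict__")
has_dict_s = hasattr(s, "__dict__")

# Sizes depend on the CPython version, so they are printed rather than assumed.
print("dict-backed:", sys.getsizeof(d) + sys.getsizeof(d.__dict__), "bytes")
print("slot-backed:", sys.getsizeof(s), "bytes")
```

The exact byte counts vary between interpreter versions, so treat the printed numbers as indicative only.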
I'd add Pythran to that list. It's a Python-to-C++ compiler for numerical Python. It often achieves impressive speedups with very little adjustment of the numerical code. It's highly undervalued IMO: you get speed similar to or better than highly optimized Cython or C code with little or no adjustment.
I compared it to Cython, Numba and Julia for DSP, which I wrote about here: https://jochenschroeder.com/blog/articles/DSP_with_Python2/
If you have a Python 2 codebase, Shedskin also gives excellent speedups for numerical code. The only thing that didn't see as good a speed boost was string operations, although that might have been fixed since.
Have you considered Kotlin and Graal? It's obviously not Python, but Kotlin feels syntactically like Python meets Java, and since it compiles to byte code, you can do AoT compilation in Graal.
Edit: apparently GraalPython is a thing.
Syntactically, sure. But D is semantically a better combination of Python and Java. With `mixin`, you can `eval` arbitrary strings at compile time. You can call methods and use variables that are dynamically generated using D's `opDispatch`, much like Python's `__getattr__`. You can generate code based on names and member variables using D's traits. You can use Python-like refcounting with `RefCounted!`. You can get Python-style generators/iterators using D's lazy ranges, which are just as fast as manual for loops. You can bind to Python with little effort using PyD. Just like Python, D has great C interop.
D compiles quickly and has much nicer syntax than C or C++.
The main benefit of Python is the ecosystem.
Kotlin Native also compiles to platform binaries.
Kotlin Native is going through a reboot after they realised that making a memory model incompatible with JVM semantics wasn't that great an idea after all.
Who would have guessed....
Cython is also mentioned downthread.
typed_python is new to me. I'll check it out. I too am keeping an eye on this space. I think that compiling or transpiling Python may be the solution to both of the major problems I have with Python: performance and distribution. Exciting times.
I feel like it should be the agenda of the typed python syntax to allow writing annotated python code that can be compiled into a form that is as fast as equivalent c code.
As a bit of background info, mypyc is “not really” ready for broader use yet. The devs are planning a soft-launch: https://github.com/mypyc/mypyc/issues/780
It is quite promising though, if it becomes more robust and compatible. I also believe they have still only scratched the surface of possible optimizations.
Yes, this. Actually I first shared it here because I thought it was cool and could work quite cleanly, since mypy works well. But when I actually tried compiling one of my Advent of Code solutions with it, what I got was a goto-stuffed mess. I know I can't expect nice C code, but I certainly didn't expect gotos.
As for the performance gain: 13.5 s with Python, 9 s compiled. It was a naive implementation of AoC 2020/23, so a lot of array cutting, concatenation, etc. So this isn't really math, but rather a lot of RAM I/O.
There's nothing wrong with gotos in compiled code. At the end of the day, machine code is really just a bunch of gotos with other instructions in between.
The reason goto is considered bad is that it can make code hard to follow for humans. Since this is an intermediate step in compilation, that's not an issue here.
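The same holds one level up: CPython's own bytecode, which nobody is expected to read either, is built from jumps. A small sketch using the standard `dis` module (the function `total` is just an illustrative example):

```python
import dis


def total(values):
    result = 0
    for v in values:
        if v > 0:
            result += v
    return result


# Collect every bytecode instruction whose name mentions a jump.
jump_ops = [ins.opname for ins in dis.get_instructions(total) if "JUMP" in ins.opname]
print(jump_ops)
```

The opcode names differ between Python versions, but a loop with a condition in it always compiles to some form of conditional and unconditional jump.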
Cython has a similar feature: https://cython.readthedocs.io/en/latest/src/tutorial/pure.ht...
Yes and it works.
What is the difference between cython and mypyc? I think they should answer the question why anyone would want this over cython on the readme.
Not having worked with cython, the difference seems to be that cython requires using special types in its annotations as well as not supporting specializing the standard types like ‘list’.
Mypyc aims to be compatible with the standard Python type annotations and still be able to optimize them. So in theory, you don’t need to modify your existing type-annotated program. In practice I believe there are currently limitations.
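For a sense of what "standard type annotations" means here, below is a small module that uses only ordinary annotations and runs unchanged under plain CPython. The function name and the file name `primes.py` are hypothetical; assuming the file is saved under that name, the `mypyc` command that ships with mypy would be pointed at it (`mypyc primes.py`). Whether a given mypyc version compiles it cleanly has not been checked here.

```python
from typing import List


def count_primes(limit: int) -> int:
    """Count the primes below `limit` with a simple sieve."""
    if limit < 2:
        return 0
    is_prime: List[bool] = [True] * limit
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i < limit:
        if is_prime[i]:
            # Mark every multiple of i, starting at i*i, as composite.
            for j in range(i * i, limit, i):
                is_prime[j] = False
        i += 1
    return is_prime.count(True)


if __name__ == "__main__":
    print(count_primes(100))
```

Nothing in the source is mypyc-specific, which is the point being made: the annotations are the same ones mypy already checks.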
Cython has first class treatment for Numpy arrays. Can Mypyc generate machine optimized code for chomping Numpy arrays element-wise?
I don’t think I want my toolchain to have first class knowledge of specific libraries...
Python is married to Numpy for scientific computing.
Cython was around long before Python got type annotations so they kind of had to come up with their own thing. Cython will also happily compile Python WITHOUT type annotations, you just won't see much of a performance boost.
Even without types cython provides a neat way to embed your code and the interpreter into a native executable and has applications for distributing python programs on systems that are tricky for python like Android and WASM.
> Note the use of cython.int rather than int - Cython does not translate an int annotation to a C integer by default since the behaviour can be quite different with respect to overflow and division.
This seems like an important difference to me. Your regular type annotations can be used.
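The overflow and division differences mentioned in the quote can be seen without Cython at all. This sketch emulates typical 32-bit C `int` behaviour from Python; the helper names are made up, and note that signed overflow is strictly undefined behaviour in C, so the wraparound shown is only what commonly happens on two's-complement hardware.

```python
import ctypes


def c_int32_add(a: int, b: int) -> int:
    """Add two integers and wrap the result into a signed 32-bit range."""
    return ctypes.c_int32(a + b).value


def c_div(a: int, b: int) -> int:
    """C-style integer division: truncate toward zero instead of flooring."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


python_sum = 2147483647 + 1            # Python ints grow as needed
wrapped_sum = c_int32_add(2147483647, 1)

python_quot = -7 // 2                  # floor division
c_quot = c_div(-7, 2)                  # truncating division

print(python_sum, wrapped_sum, python_quot, c_quot)
```

This is why silently mapping an `int` annotation onto a C integer would change program behaviour, and why a compiler that keeps Python `int` semantics can accept unmodified annotations.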
Cython is great, but it (used to?) introduce its own language with its own type syntax.
But that's because Python didn't have type annotations. Now that it has them, cython can just use those instead of its own and developers will get the benefit of being able to compile to C using pure Python.
I am not qualified to make any technical arguments. There’s a strong security and tech-managerial argument for using the software that’s aligned to the reference implementation. Obviously Cython is currently the better choice for risk-averse organizations that need compiled Python. But I think C-ish level people have a good reason to trust the stability, longevity, and security of a product built by the “most official” Python folks. There would need to be a deeply compelling technological reason to choose Cython, not merely chasing a few wasted cycles or nifty features.
Obviously organizations that don’t manage human lives or large amounts of money can use ‘riskier’ tools without as much worry. This isn’t an argument against cython generally. But I worked at a hospital and wrote a lot of Python, and would not have been able to get the security team to support cython on their SELinux servers without a really good argument. Cython is just an unnecessary liability when your job manages identifiers and medical details on servers accessible on a fairly wide (albeit private) network.
Cython lets you use C structs to speed up memory access, and generally gives you lower-level access.
Note that GraalPython has the C structs memory layout too.
I actually spent the evening trying to compile black through mypyc. The tooling is there (black's setup.py has a thing), but the most recent revisions of mypyc with black aren’t quite working for me.
The biggest issue right now seems to be miscompiles, with the resulting errors being a bit inscrutable. It still leaves you in “am I wrong, or is the system wrong?” territory.
But overall I think the techniques are really sound and I believe this is the most promising way forward for perf in Python.
IMHO it makes little sense to compile complete Python programs vs just compiling the slow parts. Some of the best reasons to choose Python are precisely the ones that preclude compilation, including:
- "batteries included" including a massive set of libraries (any one of which won't be supported by the compiler)
- dynamism which makes it easy to wrangle syntax to your needs (including the creation of domain-specific languages), but which destroys the performance improvement of compilation, even if the compiler can handle all the crazy introspection.
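A small sketch of the kind of dynamism the second bullet refers to. The class and function names are hypothetical; the point is only that neither the attribute set nor the method behaviour is known before the program runs, so a compiler cannot turn `rec.name` into a fixed memory access.

```python
class Record:
    """Resolves unknown attributes at runtime from a dict of fields."""

    def __init__(self, **fields):
        self._fields = fields

    def __getattr__(self, name):
        # Called only when normal lookup fails, so the set of valid
        # attribute names is not known until the program runs.
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name) from None


def greet(rec):
    return "hello " + rec.name


r = Record(name="world")
before = greet(r)

# Rebinding a method on the class at runtime changes behaviour for all callers.
Record.__getattr__ = lambda self, name: name.upper()
after = greet(r)

print(before, "|", after)
```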
This isn't about compiling an entire program, this is about compiling the individual libraries that you may be consuming, if they already have type hint coverage. A "free" performance boost.
If I have a pure python, fully type hinted library I'm consuming, hats off to them, and they choose to use this, awesome.
> IMHO it makes little sense to compile complete Python programs
Which is why this compiles specified modules, which can freely call noncompiled modules, not “complete Python programs”.
> IMHO it makes little sense to compile complete Python programs vs just compiling the slow parts.
It makes sense for distribution of apps to end users, which is a particular pain point with Python.
I think this extends outside of Python. Performance and safety are trade offs, not absolutes, and the balance of needing safety or performance vs extensibility vs ease of development may result in dozens or hundreds of different trade off needs in different parts of a single application.
One consequence is that it never makes sense to use static typing or compilation as application-wide absolutes for any language or paradigm.
You should virtually never be writing whole applications in Rust, C, C++, Java, Haskell, etc. It is a huge sign of bad premature optimization and dogmatism. Compiling subsets in these languages and then exposing them through extension modules in other languages that don’t force those constant trade offs is almost always superior, and it’s very telling about poor engineering culture when this results in debates or vitriolic dismissiveness from people with prior dogmatic commitments to static typing or AOT compilation.
Somewhat related, I had a devil of a time a little bit ago trying to ship a small Python app as a fully standalone environment runnable on "any Linux" (but for practical purposes, Ubuntu 16.04, 18.04, and 20.04). It turns out that if you don't want to use pip, and you don't want to build separate bundles for different OSes and Python versions, it can be surprisingly tricky to get this right. Just bundling the whole interpreter doesn't work either because it's tied to a particular stdlib which is then linked to specific versions of a bunch of system dependencies, so if you go that route, you basically end up taking an entire rootfs/container with you.
After evaluating a number of different solutions, I ended up being quite happy with pex: https://github.com/pantsbuild/pex
It basically bundles up the wheels for whatever your workspace needs, and then ships them in an archive with a bootstrap script that can recreate that environment on your target. But critically, it natively supports the idea of targeting multiple OSes and Python versions; you just explicitly tell it which ones to include, e.g.:
  --platform=manylinux2014_x86_64-cp-35-cp35m  # 16.04
  --platform=manylinux2014_x86_64-cp-36-cp36m  # 18.04
  --platform=manylinux2014_x86_64-cp-38-cp38   # 20.04
Docs on this: https://pex.readthedocs.io/en/latest/buildingpex.html#platfo...
And you can see the tags in use for any package on PyPI which ships compiled parts, eg: https://pypi.org/project/numpy/#files
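For anyone puzzling over those platform strings, here is a rough sketch of how they decompose, assuming the four-part PLATFORM-IMPL-PYVER-ABI layout. The `PlatformTag` type and `parse_platform` helper are made up for illustration and are not part of pex.

```python
from typing import NamedTuple


class PlatformTag(NamedTuple):
    platform: str   # e.g. manylinux2014_x86_64
    impl: str       # e.g. cp for CPython
    version: str    # e.g. 38 for Python 3.8
    abi: str        # e.g. cp38


def parse_platform(tag: str) -> PlatformTag:
    """Split a PLATFORM-IMPL-PYVER-ABI string into its four parts."""
    # Split from the right so any dashes inside the platform part are preserved.
    platform, impl, version, abi = tag.rsplit("-", 3)
    return PlatformTag(platform, impl, version, abi)


tag = parse_platform("manylinux2014_x86_64-cp-38-cp38")
print(tag)
```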
I don't know that this would be suitable for something like a game, but in my case for a small utility supporting a commercial product, it was perfect.
I recently just used pyinstaller and pip on an Ubuntu 16.04 build machine. Everything works for 16, 18, 20 and even some late Redhat versions with no work. Installed it on 3000 servers with paramiko under prefect. Aside from the odd individual server issue it all worked.
> “if you don't want to use pip”
Why wouldn’t you want to use pip?
Pip is suitable for use by developers working in python, setting up python workspaces with python sources and python dependencies, but it's a UX fiasco for an end-user who just wants to run a black box application and not have to care.
In my particular case the "application" was in fact interactive bootstrap/install scripts for a large, proprietary blob which wouldn't have been suitable for publishing on PyPI, anyway. Setting up a separate, possibly authenticated PyPI instance, and then training end users how to use it, vs just shipping everything together in a single package? Total non-starter.
Interesting, sounds like a very unique use case. Is containerizing not a possible solution?
That would have worked, but it would have made the whole thing a lot bigger— even a featherweight base image would have added more than what pex was able to do. It complicates the usage side too, as then you need to be root to chroot/nspawn/docker/whatever your way into the container.
Definitely a complicating factor was that all of this was meant to be usable by non-nerds and in an environment with limited or possibly no internet access. It wouldn't have been acceptable to download your installer package at the hotel, and then get to site and invoke it only to discover that you were on the hook for a few GBs of transfer from docker.io.
This sounds a bit like a GUI application, so containers would bring their own problems. Also, you again force the end user to install Docker, etc.
It would be a dream-come-true to be able to compile Python (or some kind of very-close Python) down to a static binary. I want to run it like a Go binary.
You already could? Or are you asking about something else?
I looked into this and it seems like no, these are not static binaries at all. They dynamically load stuff, and cython seems to be embedding very specific headers, including linux specific ones (asm/errno.h). I tried to build using musl-gcc but it was too different.
When I say build static like a Go binary, I mean that the binary contains everything, and is not allowed to dynamically load anything at all. Also, preferably it doesn't need a C standard library, and makes system calls directly.
Here is one recent benchmark. Looks very promising. https://github.com/mypyc/mypyc-benchmark-results/blob/master...
> Classes are compiled into extension classes without __dict__ (much, but not quite, like if they used __slots__)
Is there any way to say "no, I really want a __dict__ class here, please"?
I think defining __dict__ explicitly should work.
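For comparison, this is how the opt-in works with plain `__slots__` in ordinary CPython. It is only an analogy: whether mypyc's extension classes honour the same trick is not something this sketch verifies, and the class names are made up.

```python
class Strict:
    __slots__ = ("x",)


class Flexible:
    # Listing "__dict__" among the slots gives instances a dict again.
    __slots__ = ("x", "__dict__")


s = Strict()
s.x = 1
try:
    s.extra = 2
    strict_allows_extra = True
except AttributeError:
    strict_allows_extra = False

f = Flexible()
f.x = 1
f.extra = 2  # not a declared slot, so it is stored in f.__dict__

print(strict_allows_extra, f.__dict__)
```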
> Is there any way to say "no, I really want a __dict__ class here, please"?
Write it in a module you aren't compiling, and import it, since this supports compiled modules using noncompiled ones.
CUDA is basically C with Fortran semantics, right? Wouldn't something like that be possible with Python?
It seems really interesting that the mypy team went to such lengths to create a binary version of their linter.
The big draw with mypyc has got to be direct integration with other source code in C.
Can anyone say whether it’s possible to replace PyPy’s VM backend with LLVM for AOT compilation? I wonder if that would result in any performance improvements.
Does the resulting code run as fast as native C?
Would love to see some benchmarks on this.
> Does the resulting code run as fast as native C?
The motivating use case is mypy, so I guess if someone wants to hand-code mypy in native C we can assess this. But I would expect that avoiding exactly that is as much of the motivation as speeding up mypy is.
There is also Pyccel https://github.com/pyccel/pyccel. When I last tried it, it worked on most small codes, but there were some bugs.
"The aim of Pyccel is to provide a simple way to generate automatically, parallel low level code. The main uses would be:
- Convert a Python code (or project) into a Fortran or C code.
- Accelerate Python functions by converting them to Fortran or C functions.
Pyccel can be viewed as:
- Python-to-Fortran/C converter
- a compiler for a Domain Specific Language with Python syntax"
The downvotes probably came from non-slavic readers. I read it as Murus too haha.
This sounds awesome.
What does it do?
It compiles type-annotated Python to C
Does the resulting code run as fast as native C?
> Mypyc is a compiler that compiles mypy-annotated, statically typed Python modules into CPython C extensions. Currently our primary focus is on making mypy faster through compilation -- the default mypy wheels are compiled with mypyc. Compiled mypy is about 4x faster than without compilation.
My wager is that it does not. It may if you have math intensive code, but if you have an algorithm that touches lots of python built in datatypes, access to those types will be the bottleneck.