Thursday, June 23, 2016

Progress summary of additional unpacking generalizations

Currently there's only so much to tell about my progress. I fixed a lot of errors and made quite a bit of progress on the unpacking task. The problems with the AST generator and handler are solved. There's still an error when trying to run PyPy though. Debugging is quite a tricky undertaking because there are so many places where something could be wrong. What I have already done is review the whole code for this task, and it should be ready as soon as that (hopefully) last error is fixed. I will probably track it down by comparing the bytecode instructions of CPython and PyPy, but I still need a bit more info.

As a short description of what I implemented: until now the order of arguments allowed in function calls was predefined. The order was: positional args, keyword args, * unpack, ** unpack. The reason for this was simplicity, because people unfamiliar with this concept might get confused otherwise. Now * and ** unpackings may appear any number of times and may be mixed with the other arguments, which sets aside that earlier concern about confusion (as described in PEP 448). So what I had to do was check the arguments for unpackings manually, first going through the positional and then the keyword arguments. Of course some sort of priority has to stay intact, so it is defined that "positional arguments precede keyword arguments and * unpacking; * unpacking precedes ** unpacking" (PEP 448). Pretty much all changes needed for this task are implemented; there's only one more fix and a bytecode instruction (map unpack with call), which is less important than the others, left to be done.
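To make the new call syntax concrete, here is a small sketch that runs on any interpreter implementing PEP 448 (CPython 3.5 or later). The function name `show` and the sample values are made up for illustration.

```python
def show(*args, **kwargs):
    # Return what the callee actually receives so it can be inspected.
    return args, kwargs

# Several * unpackings mixed with plain positional arguments, followed by
# several ** unpackings mixed with a plain keyword argument.
args, kwargs = show(*[1, 2], 3, *(4,), **{"a": 1}, b=2, **{"c": 3})

print(args)
print(kwargs)
```

The positional values arrive in the order they were written, and the keyword values are merged into one dictionary.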

As soon as it works, I will write the next entry in this blog. Next in line are asyncio coroutines with the async and await syntax.

Short Update (25.06.): Because of the changes I had to make in PyPy, pretty much all tests failed for some time, as function calls weren't handled properly. I managed to reduce the number of failing tests by about a third by fixing a lot of errors, so only a little is still missing before the whole thing works again.

Update 2 (26.06.): And the errors are all fixed! As soon as all opcodes are implemented (that's gonna be soon), I will write the promised next blog entry.

Thursday, June 9, 2016

The first two weeks working on PyPy


I could easily summarize my first two official coding weeks by saying: some things worked out while others didn’t! But let’s get more into detail.


The first visible progress!
I am happy to say that matrix multiplication works! That was pretty much the warm-up task, as it really is only an operator which no built-in Python type implements yet. It is connected to a magic method __matmul__() though, which lets a class implement it. For those who don't know, magic methods get invoked in the background while executing specific statements. In my case, using the @ operator will invoke __matmul__() (or __rmatmul__() for the reverse invocation), and @= will invoke __imatmul__(). If “0 @ x” gets evaluated for example, PyPy will now try to invoke “(0).__matmul__(x)”. If that doesn't exist for int, it automatically tries to invoke “x.__rmatmul__(0)”.
As an example, the following code already works:
>>> class A:
...     def __init__(self, val):
...         self.value = val
...     def __matmul__(self, other):
...         return self.value + other
...
>>> x = A(5)
>>> x @ 2
7
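The reverse and in-place variants mentioned above can be sketched the same way. The class name `Offset` and its addition-based behaviour are only an illustration and have nothing to do with real matrix multiplication; the snippet needs an interpreter that supports the @ operator (CPython 3.5 or later).

```python
class Offset:
    def __init__(self, value):
        self.value = value

    def __matmul__(self, other):
        # Called for: Offset(...) @ other
        return self.value + other

    def __rmatmul__(self, other):
        # Called for: other @ Offset(...), when type(other) has no
        # suitable __matmul__ (int does not define one).
        return other + self.value

    def __imatmul__(self, other):
        # Called for: x @= other; returning self keeps the same object.
        self.value += other
        return self

x = Offset(5)
forward = x @ 2   # uses Offset.__matmul__
reverse = 0 @ x   # int has no __matmul__, so Offset.__rmatmul__ is used
x @= 3            # uses Offset.__imatmul__
```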


The next thing that would be really cool is a working NumPy implementation for PyPy to make good use of this operator. But that is not part of my proposal, so it's probably better to look at that later.


The extended unpacking feature is what I am currently working on. Several things have to be changed in order for this to work.
Parameters of a function are divided into formal (or positional) arguments, as well as non-keyworded (*args) and keyworded (**kwargs) variable-length arguments. I have to make sure that the last two can be used any number of times in function calls. Until now, they were only allowed once per call.
PyPy processes arguments as args, keywords, starargs (*) and kwargs (**) individually. The solution is to compute all arguments only as args and keywords, checking the types inside of them manually instead of relying on a fixed order. I'm currently in the middle of implementing that part while fixing a lot of dependencies I overlooked. I broke a lot of test cases with my changes to an important method, but that should work again soon. The task also requires the bytecode instruction BUILD_SET_UNPACK to be implemented. That should already work, but it still needs a few tests.
I also need to get the AST generator working with the new syntax feature. I updated the grammar, but I still get errors when trying to create the AST. I already have some clues about the cause though.
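The idea of folding everything into just positional and keyword arguments can be sketched in plain Python. This is not PyPy's actual code: the helper names `collect_call_arguments` and `_add_keyword`, and the tagged-tuple encoding of the call site, are assumptions made purely for illustration.

```python
def _add_keyword(keywords, name, value):
    # A keyword may only be given once, no matter where it came from.
    if name in keywords:
        raise TypeError("got multiple values for keyword argument %r" % name)
    keywords[name] = value

def collect_call_arguments(items):
    """Fold call-site items, in source order, into (positional, keywords).

    Each item is a tagged tuple (hypothetical encoding):
    ("pos", value), ("star", iterable), ("kw", name, value), ("dstar", mapping).
    """
    positional = []
    keywords = {}
    for item in items:
        kind = item[0]
        if kind == "pos":
            positional.append(item[1])
        elif kind == "star":
            positional.extend(item[1])      # * unpacking
        elif kind == "kw":
            _add_keyword(keywords, item[1], item[2])
        elif kind == "dstar":
            for name, value in item[1].items():   # ** unpacking
                _add_keyword(keywords, name, value)
        else:
            raise ValueError("unknown argument kind: %r" % kind)
    return positional, keywords

# Roughly corresponds to the call f(*[1, 2], 3, **{"a": 1}, b=2)
positional, keywords = collect_call_arguments(
    [("star", [1, 2]), ("pos", 3), ("dstar", {"a": 1}), ("kw", "b", 2)]
)
```

Because every item is inspected individually, no fixed order of the four kinds is needed, and duplicate keywords are caught in one place.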
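For comparison, this is how CPython 3.5 and later represent such a call in the AST, using the standard `ast` module. The call `f(*a, b, **c, k=d, **e)` is just a made-up example; the point is that the call node only has `args` and `keywords`, with unpackings marked inside them.

```python
import ast

# Parse a single expression containing * and ** unpackings.
tree = ast.parse("f(*a, b, **c, k=d, **e)", mode="eval")
call = tree.body

# * unpackings show up as Starred nodes inside call.args.
arg_kinds = [type(node).__name__ for node in call.args]

# ** unpackings show up as keyword nodes whose name (arg) is None.
keyword_names = [kw.arg for kw in call.keywords]

print(arg_kinds)
print(keyword_names)
```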


Some thoughts on my progress so far
The cool thing is, whenever I get stuck on errors or on the giant structure of PyPy, the community helps out. That is very considerate as well as convenient, meaning the only thing I have to fight at the moment is time. Since I have to prepare for the final assignments and exams at university, I lose some time I really wanted to spend on developing for my GSoC proposal. That means I will probably fall about a week behind what I had planned to accomplish by now. But I am still really calm regarding my progress, because I planned for something like that to happen. That is why I will spend way more time on my proposal right after that stressful phase to compensate for the time I lost.
With all that being said, I hope that the extended unpacking feature will be ready to use within the next week or two.