Wednesday, May 18, 2016

GSoC 2016, let the project begin!


First of all, I am really excited to be a part of this year's Google Summer of Code! From the first moment I heard of this event, I gave it my best to get accepted. I am happy it all worked out :)

About me

I am a 21-year-old student at the Technical University of Vienna (TU Wien), currently working on my BSc degree in software and information engineering. I learned about GSoC through a presentation on PyPy that explained the project of a former participant. Since I am currently attending compiler construction lectures, I thought this project would greatly increase my knowledge of developing compilers and interpreters. I was also looking for an interesting project for my bachelor thesis, and those are pretty much the things that led me here.

My Proposal

The project I am (and will be) working on is PyPy (which is an alternative implementation of the Python language [2]). You can check out a short description of my work here.

Here comes the long but interesting part!

So basically, I am working on implementing Python 3.5 features in PyPy. I have already started on, and nearly completed, matrix multiplication with the @ operator. It would have been cool to implement the matmul method just like numpy does (numpy is a Python package for N-dimensional arrays and matrices that supports matrix multiplication), but sadly the core of numpy is not yet functional in PyPy.
The main advantage of @ is that you can now write:

S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

instead of:

S = dot((dot(H, beta) - r).T, dot(inv(dot(dot(H, V), H.T)), dot(H, beta) - r))

making code much more readable. [1]
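To give a feeling for what the operator does under the hood, here is a minimal sketch. In Python 3.5, `a @ b` is dispatched to the `__matmul__` method of `a`. The `Matrix` class below is purely hypothetical and only meant for illustration; it is neither numpy's implementation nor the code I wrote for PyPy.

```python
class Matrix:
    """Tiny illustrative matrix type (hypothetical, not from numpy or PyPy)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Python 3.5 calls this method for the expression `self @ other`.
        if not isinstance(other, Matrix):
            return NotImplemented
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("shape mismatch")
        columns = list(zip(*other.rows))  # transpose to iterate over columns
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])
product = a @ b
print(product.rows)  # [[2, 1], [4, 3]]
```

Returning NotImplemented for unknown operand types lets Python try the reflected method `__rmatmul__` of the right-hand operand before giving up.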

Next, I will continue with the additional unpacking generalizations. The cool thing about this extension is that it allows multiple unpackings in a single function call.
Calls like

function(*args,3,4,*args,5)

are possible now. The same unpacking can also be used in variable assignments, for example inside list, tuple, set and dict displays.
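Here is a small sketch of what the generalized unpacking looks like in Python 3.5. The function `collect` and all variable names are made up for this example.

```python
def collect(*values):
    """Hypothetical helper that simply returns its positional arguments."""
    return list(values)


args = (1, 2)

# Several unpackings mixed with plain arguments in one call.
call_result = collect(*args, 3, 4, *args, 5)
print(call_result)  # [1, 2, 3, 4, 1, 2, 5]

# The same syntax works when building lists, tuples and dicts in assignments.
merged_list = [*args, 3, *range(2)]
merged_tuple = (*args, *"ab")
merged_dict = {**{"a": 1}, "b": 2, **{"c": 3}}
print(merged_list)   # [1, 2, 3, 0, 1]
print(merged_tuple)  # (1, 2, 'a', 'b')
```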

The third part of my proposal is also the main part: asyncio and coroutines with async and await syntax. To keep this short and understandable: coroutines are functions that can hand back control before they are finished (they 'suspend', to be precise) and still remember the state they are in at that moment. When the coroutine is resumed at a later moment, it continues where it stopped, including the values of the local variables and the next instruction it has to execute.
Asyncio is a Python standard-library module built around those coroutines; it provides the event loop that schedules them. Because it is not yet working in PyPy, I will get this module working so that coroutines can be used in PyPy.
Python 3.5 also allows those coroutines to be controlled with “async” and “await” syntax. That is also a part of my proposal.
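As a rough illustration of the idea, here is a minimal sketch of two coroutines written with the Python 3.5 syntax. The names `counter`, `main` and `log` are invented for this example. Each `await asyncio.sleep(0)` suspends the coroutine, and the loop variable `i` is still there when it resumes.

```python
import asyncio


async def counter(name, steps, log):
    """Hypothetical coroutine: records each step, then suspends itself."""
    for i in range(steps):
        log.append((name, i))
        # Suspend here; local state (name, i) is kept until we are resumed.
        await asyncio.sleep(0)
    return name


async def main(log):
    # Run both coroutines concurrently and collect their return values.
    return await asyncio.gather(counter("a", 2, log), counter("b", 2, log))


log = []
loop = asyncio.new_event_loop()
try:
    results = loop.run_until_complete(main(log))
finally:
    loop.close()

print(results)  # ['a', 'b']
print(log)      # steps of "a" and "b", recorded as each coroutine ran
```

I use an explicit event loop with run_until_complete here because that is the API available in Python 3.5.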

I will explain further details as soon as it becomes necessary in order to understand my progress.

Bonding Period

The Bonding Period has been a great experience so far. I have to admit, it was a bit quiet at first, because I had to finish lots of homework for my studies. But I already got to know the community and my mentors a bit before the acceptance date of the projects, so I got the green light to focus on the development of my tasks already, which is great! That is really important for me, because it is not easy to understand the complete structure of PyPy. Luckily there is documentation available (here: http://pypy.readthedocs.io/en/latest/) and my mentors help me quite a bit.
My timeline has changed a little, but that comes with a huge advantage: I will finish my main part earlier than anticipated, allowing me to focus more on the implementation of further features.
Until the official start of the coding period, I will polish the code of my first task, the matrix multiplication, and read up on the parts of PyPy that I am still a bit uneasy with.

My next blog post will already tell you about the work of my first official coding weeks. Expect good progress!