Final GSoC status report
It's 5am and I have a headache. The perfect time for some reflection!
Not only that, but I've just had to play the part of Static Site Ungenerator, because I found out that I deleted the source of the last post and I didn't want to lose it in the upcoming publish. If your Atom feed went funky, sorry.
This document is my Final Work Submission, but is fun for all the family, including the ones who don't work at Google. Hi everyone!
# What we wanted to happen
Going into the summer, the plan was to add functionality to wlroots so that its users (generally Wayland compositors) could more easily switch to a smarter frame schedule. I've had many goes at explaining the problem and they all sucked, so here we go again: if a compositor puts some thought into when it starts its render, desktop latency as perceived by the user can decrease. The computer will feel snappier.
wlroots started the summer with no accommodations for compositors that wanted to put thought into when they start to render. It assumed exactly no thought was to be put in, and left you on your own if you were to decide otherwise. But that has all changed!
The aim of my work could have comprised three things, but I added a fourth and then didn't have time for the third:
- measurement - a way to determine how long a render job took, from start (on the CPU) to finish (on the GPU).
- scheduling - the system that chooses when to tell a compositor that it should render. this wants a way for wlroots users to dictate when they want to start rendering a new frame, relative to when this frame is due to be displayed
- prediction - some clever maths that learns from the time taken by previous renders and guesses how long the next one will take. this allows for moving the render start time closer to the frame deadline (good), carefully enough to avoid missing the deadline (very bad)
- bonus! tracing - measurement, but for humans. we're getting 60 new numbers every second, and it's going to be hard to make sense of them if they're just being printed out to the console
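To make the scheduling and prediction items a bit more concrete, here is a minimal sketch of the arithmetic involved. It is not wlroots code: `render_delay_ns` and its parameters are hypothetical names invented for this illustration, and it assumes all times are in nanoseconds and that the delay is counted from the moment the previous frame was presented.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of wlroots. Given the refresh period, a
 * predicted render duration and a safety margin (all in nanoseconds),
 * return how long to wait after the previous frame was presented before
 * starting to render the next one.
 */
static int64_t render_delay_ns(int64_t refresh_ns, int64_t predicted_ns,
		int64_t margin_ns) {
	assert(refresh_ns > 0 && predicted_ns >= 0 && margin_ns >= 0);
	int64_t delay = refresh_ns - predicted_ns - margin_ns;
	/* If the render might not fit in the period, start straight away. */
	return delay > 0 ? delay : 0;
}
```

With a 60 Hz period of roughly 16.67 ms, a predicted 4 ms render and a 1 ms margin, the render would start about 11.67 ms after the last presentation instead of immediately. The margin is what stands between "snappier" and "missed the deadline", so a real compositor would want to pick it conservatively.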
# What happened
After some flailing around trying to add a delay to the existing scheduling, I started writing patches worth landing.
First came the render timer API. Now we can measure the duration of our render passes. This MR brought an abstraction for timers, and an implementation for wlroots' gles2 renderer.
Next, the scene timer API. wlr_scene does some of its own work before setting off the render pass itself, so it needed to become aware of timers and expose a way to use them.
Meanwhile, I was having another stab at configuring a frame delay. It wasn't very good, and the design of wlroots' scheduling and the complexity of the logic underneath it turned out to take a long time to get through. With this MR, though, I had a better idea of where I was trying to go. A long thought process followed, much of which lives in this issue, and further down we'll see what came of that.
Before working on a prediction algorithm, I wanted to be able to see live feedback on how render timings behaved and which frames were missed so that I could do a good (informed) job of predicting them. I took a detour into the world of tracing.
libuserevents was spawned and so was the work to make use of it in wlroots. Linux's user_events tracing interface was appealing because it meant that GPUVis, an existing tool that can display a timeline of CPU and GPU events, would be able to show wlroots' events. Unfortunately Linux and I have so far struggled to get along and this work is still in progress - no submission yet because it's broken. Even more unfortunately, this meant that I wasn't able to get around to prediction.
Then I got tired of fighting that and, despite the words of discouragement...
“it's kind of impossible and I don't think we can do much better than !4214”
...I wrote a refactor of wlroots' frame scheduling that allows us to do much better than !4214: !4307! This hasn't quite made it past the finish line, but it's close; I can feel it in my frames. It (in my opinion) neatly extracts the hairy logic that lived in wlr_output into a helper interface, allowing users to swap out which frame scheduler they use, or to forgo the helpers and roll their own without there being bits and pieces left over in the parts of wlroots that they do care about. This is the most exciting piece of the puzzle IMO; wlr_output has grown to have its fingers in many pies, and this MR reduces that and leaves wlr_output a little bit more friendly in a way that took a lot of brain cycles but turned out clean.
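For readers who haven't met the "swappable helper" pattern in C, the general shape is a small table of function pointers that each scheduler fills in. The sketch below only illustrates that pattern and is not the interface from !4307: every name in it (`frame_scheduler`, `frame_scheduler_impl`, `budget_scheduler` and friends) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct frame_scheduler;

/* Hypothetical vtable: each scheduler decides when the next render starts. */
struct frame_scheduler_impl {
	/* Delay in ns, counted from the last presentation, before rendering. */
	int64_t (*next_render_delay_ns)(struct frame_scheduler *sched,
		int64_t refresh_ns);
};

struct frame_scheduler {
	const struct frame_scheduler_impl *impl;
};

/* Scheduler 1: render as soon as the previous frame has been presented. */
static int64_t immediate_delay(struct frame_scheduler *sched,
		int64_t refresh_ns) {
	(void)sched;
	(void)refresh_ns;
	return 0;
}

static const struct frame_scheduler_impl immediate_impl = {
	.next_render_delay_ns = immediate_delay,
};

/* Scheduler 2: embeds the base struct and reserves a fixed render budget. */
struct budget_scheduler {
	struct frame_scheduler base; /* must stay the first member */
	int64_t budget_ns;
};

static int64_t budget_delay(struct frame_scheduler *sched,
		int64_t refresh_ns) {
	/* Valid because base is the first member of budget_scheduler. */
	struct budget_scheduler *budget = (struct budget_scheduler *)sched;
	int64_t delay = refresh_ns - budget->budget_ns;
	return delay > 0 ? delay : 0;
}

static const struct frame_scheduler_impl budget_impl = {
	.next_render_delay_ns = budget_delay,
};

/* Callers only ever see the base type, so schedulers can be swapped. */
static int64_t frame_scheduler_next_delay(struct frame_scheduler *sched,
		int64_t refresh_ns) {
	assert(sched != NULL && sched->impl != NULL);
	return sched->impl->next_render_delay_ns(sched, refresh_ns);
}
```

The point of the indirection is that code driving the output only calls `frame_scheduler_next_delay`, so a compositor could plug in its own scheduler, or none at all, without the rest of the code having to know.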
This new interface doesn't come with a frame delay option for free, but an implementation of the interface that has this feature is underway: !4334. It fits nicely! We hashed it out a little on IRC because the frame delay option is a surprisingly tricky constraint on the interface, but I think the conclusion is good. It was definitely a lot easier to write this with confidence after the scheduling redesign :)
To make this scheduling design possible and clean, a couple of little changes were needed in other areas, and thankfully the case for these changes was easy to make. They're helpful to me, but also make those parts of wlroots less surprising and/or broken. There was also a discussion about the fate of wlr_output.events.needs_frame, which is an extra complexity in wlroots' frame scheduling. It turned out that while removing it is possible, it wasn't necessary for the new scheduling system, so it continues in the background.
# Loose ends
libuserevents is usable, but the wlroots integration is not ready.
There is sadly no "stock" plug-and-play prediction algorithm in wlroots.
The new scheduling infrastructure has not landed but I'm sure it will Soon™. The implementation with the frame delay option will hopefully follow shortly after. When (touch wood) it does, compositors will have to bring their own prediction algorithm, but a "good enough" algorithm can be very simple and given the current interface design can easily be swapped out for a stock one if one materialises.
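As an example of how simple "good enough" might be, one option is to predict that the next render takes as long as the slowest of the last few. The sketch below assumes that approach; it is not a wlroots API, and `render_predictor`, `PREDICTOR_WINDOW` and the function names are hypothetical. Durations are in nanoseconds, as a render timer would report them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PREDICTOR_WINDOW 8 /* how many recent renders to remember */

/* Hypothetical predictor state; zero-initialise before first use. */
struct render_predictor {
	int64_t samples_ns[PREDICTOR_WINDOW];
	size_t count; /* number of valid samples, at most PREDICTOR_WINDOW */
	size_t next;  /* slot that the next sample overwrites */
};

/* Record how long the most recent render took. */
static void predictor_push(struct render_predictor *p, int64_t duration_ns) {
	assert(p != NULL && duration_ns >= 0);
	p->samples_ns[p->next] = duration_ns;
	p->next = (p->next + 1) % PREDICTOR_WINDOW;
	if (p->count < PREDICTOR_WINDOW) {
		p->count++;
	}
}

/*
 * Guess the next render duration as the maximum of the remembered ones.
 * With no history yet, return the caller's fallback (for example a whole
 * refresh period, which means "don't delay at all").
 */
static int64_t predictor_estimate_ns(const struct render_predictor *p,
		int64_t fallback_ns) {
	assert(p != NULL);
	if (p->count == 0) {
		return fallback_ns;
	}
	int64_t max_ns = 0;
	for (size_t i = 0; i < p->count; i++) {
		if (p->samples_ns[i] > max_ns) {
			max_ns = p->samples_ns[i];
		}
	}
	return max_ns;
}
```

Taking the maximum rather than the mean errs on the side of starting early, which matches the priorities above: a slightly longer latency is much less bad than a missed frame. A slow render stops influencing the estimate once it has aged out of the window.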
And finally, the funniest one. I wrote an implementation of the timer API for wlroots' Vulkan renderer, and then put off submitting it for two months because everything else was more important. gles2 is the default renderer and supports roughly every GPU in existence. Writing the Vulkan timer was fun but landing it was less of a priority than every other task I had and nothing really depended on it, so it remains stuck on my laptop to this day. Perhaps I should get round to that.
The project didn't go how I expected it to - not even close. I even wrote up a schedule as part of my application that almost immediately turned out completely wrong. I'm not bothered, though, because it was fun, I made myself useful, and I met some cool people.
If you're considering doing something like I did, I can happily recommend Simon as a mentor, X.Org, and GSoC, in that order. Much love to Simon for making me feel comfortable when I really didn't know what I was doing, and for participating in my wildly off-topic free software rambles. I've only interacted with a small part of the X.Org community so far but it struck me from the start how welcoming everyone is; I have no doubts that the other X.Org project mentors are as lovely in their own ways. And of course, as a strong proponent of software that doesn't suck that's free, I have to appreciate that GSoC gave me a welcoming place to do my part in that and relieve my worldly pressures (did you know you have to pay for internet??).
Thanks everyone for putting up with me. If you would like to put up with me some more, click the links on the left - I'm not going anywhere, there's still work to do!