This is the second blog post about my Outreachy internship at Fractal. The project I’m working on is the integration of a video player in Fractal.
The progress I’ve made
Like any communication app based on the Matrix protocol, Fractal is structured into rooms. When a user enters a room, they can see the messages that have been sent in that room. I’ll refer to those messages as the room history. During the first weeks of my internship, I integrated a simple video player into the room history: when receiving a video file, the user can play, pause and stop the video right there.
The control box you can see in the picture, with the play/pause button, the time slider, the download button and so on, was already implemented for audio playback in Fractal. So basically, my task so far has been to get the video rendered above that box. It might seem simple, but it has been fun. I’ll share just a couple of things that I’ve learned in the process.
A pipeline in GStreamer seems to be one of those concepts whose basic idea is pretty easy to grasp, but that can get as complicated as you want. As its name suggests, a pipeline is a system of connected pieces that manipulate the media in one way or another. Those pieces are called elements. The element where the media comes from is called the source element, and the ones where it’s rendered are called sink elements. An example is shown in the drawing at https://bit.ly/2twW6Ht . As you can see there, every element in turn has its own source and/or sink connection points (called pads), which link the elements to each other. Finding the same concept both at the level of elements and at the level of the pipeline, as just described, is not uncommon. I’ll give two more examples.
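To make the source-to-sink idea a bit more concrete, here is a toy model in plain Python. It is only a sketch of the concept: the function names (`source`, `upper_filter`, `sink`, `run_pipeline`) are made up for illustration and none of this is the GStreamer API.

```python
from typing import Iterable, Iterator, List


# Toy model only: plain Python generators standing in for elements.
def source(frames: Iterable[str]) -> Iterator[str]:
    """Source element: where the media enters the pipeline."""
    yield from frames


def upper_filter(stream: Iterator[str]) -> Iterator[str]:
    """Filter element: manipulates every piece of media passing through."""
    for buf in stream:
        yield buf.upper()


def sink(stream: Iterator[str]) -> List[str]:
    """Sink element: where the media ends up (here: collected in a list)."""
    return list(stream)


def run_pipeline(frames: Iterable[str]) -> List[str]:
    # Link the elements: source -> filter -> sink.
    return sink(upper_filter(source(frames)))


if __name__ == "__main__":
    print(run_pipeline(["frame-1", "frame-2"]))
```

Each stage only knows its direct neighbour, which is the property that makes real pipelines easy to extend with further elements.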
The first example is about buffering. On the one hand, when data is pushed through the pipeline, an element gets access to the media step by step, by receiving a pointer to a small buffer in memory from the preceding element (buffers on the level of elements). Before receiving that pointer, the element cannot start working on that piece of media. On the other hand, one can add a buffer element to the pipeline. That element is responsible for storing bigger chunks of data (buffers on the level of the pipeline). Before that’s done, the pipeline cannot start the playback.
The second example concerns external and internal communication. The way a pipeline communicates internally is by sending events from one element to another. There are different kinds of events. Some of them are responsible for informing all pieces of the pipeline about an instruction that might come from outside the pipeline. An example is a seek: jumping to a certain point of the video and playing it from there. For that to happen, the application can send a seek event to the pipeline (event on the level of the pipeline). When that happens, the seek event is put on all sink elements of the pipeline and from there sent upstream, element by element (events on the level of elements), until it reaches the source element, which then pulls the requested data and sends it through the pipeline. But events are just one means of communication. To mention some more: messages the pipeline leaves on the pipeline bus for the application to listen to, state changes, and queries on elements or pads.
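The upstream journey of a seek event can also be sketched as a toy model in plain Python. Again, this is just an illustration under my own assumptions: the `Element` class and the helpers `build_chain` and `seek` are hypothetical and not part of GStreamer, and the chain has a single sink for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Toy element that only knows its upstream neighbour."""
    name: str
    upstream: Optional["Element"] = None

    def send_event(self, event: str, trace: List[str]) -> None:
        # Record that this element saw the event, then pass it upstream.
        trace.append(self.name)
        if self.upstream is not None:
            self.upstream.send_event(event, trace)


def build_chain(names: List[str]) -> Element:
    """Link elements in order; names[0] is the source, names[-1] the sink."""
    prev: Optional[Element] = None
    for name in names:
        prev = Element(name, upstream=prev)
    assert prev is not None, "need at least one element"
    return prev  # the sink element


def seek(sink: Element, position_seconds: int) -> List[str]:
    """Put a seek event on the sink and return the order of elements it visits."""
    trace: List[str] = []
    sink.send_event(f"seek:{position_seconds}", trace)
    return trace


if __name__ == "__main__":
    sink = build_chain(["source", "decoder", "sink"])
    print(seek(sink, 42))  # the event enters at the sink and ends at the source
```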
So I find the concept of pipelines quite interesting. But in practice, to get media processed the way I want, I’d have to set up a whole pipeline accordingly. Creating an adequate pipeline and communicating with it and/or its elements can get complicated. Luckily for me, the audio player in Fractal is implemented using something called GstPlayer, so that’s what I’ve also used for video. It’s an abstraction that sets up a simple pipeline for you when you create it. It also has a simple API to control certain functionalities of that pipeline. And to go beyond those functionalities, you can still extract the underlying pipeline from a GstPlayer and manipulate it manually.
In the last section, I briefly mentioned seek events, i.e. events that request to play the video from a certain point. When a source element receives such an event, it tries to pull the requested piece of media. When communicating over HTTP, it does so by sending a range request, which is a request with a header field called Range that specifies, in bytes, which part of the media is requested (see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range). In order to make sure that range requests are supported, the responses are checked for a header entry “accept-ranges”. Support for range requests is only guaranteed if that entry exists and its value is “bytes” (the other option would be “none”). Synapse, the standard server of Matrix, does not include the accept-ranges entry in the headers of its responses. Therefore, seek requests to media files on that server fail.
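As a minimal sketch of that check, assuming response headers are available as a simple name-to-value mapping: the helper names `range_header` and `supports_byte_ranges` are made up for this post, and this is not the code GStreamer's HTTP source element actually runs.

```python
from typing import Dict, Mapping, Optional


def range_header(start: int, end: Optional[int] = None) -> Dict[str, str]:
    """Build the Range header for bytes start..end (inclusive).

    With end=None the request is open-ended: "from start to the end of the file".
    """
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    last = "" if end is None else str(end)
    return {"Range": f"bytes={start}-{last}"}


def supports_byte_ranges(response_headers: Mapping[str, str]) -> bool:
    """Strict check: only an explicit 'accept-ranges: bytes' counts as support."""
    for name, value in response_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "accept-ranges":
            return value.strip().lower() == "bytes"
    # No accept-ranges entry at all: support is not guaranteed.
    return False


if __name__ == "__main__":
    print(range_header(100, 199))
    print(supports_byte_ranges({"Accept-Ranges": "bytes"}))
    print(supports_byte_ranges({"Content-Type": "video/mp4"}))
```

With a check like this, a response that lacks the accept-ranges entry is treated the same way as one that says “none”, which matches the failing seeks described above.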
At some point, I thought I could solve that problem by activating progressive buffering in the pipeline and seeking only in the buffered data. But progressive buffering itself uses seeking. So with progressive buffering activated, even playback fails. There might be other kinds of buffering that would do. But for now, our way around the problem is to download the video files and play them locally.