Performance insights into info-beamer pi
Posted Jan 10 2015 by Florian Wesch
Performance has always been a top priority while developing info-beamer for the Raspberry Pi. Since version 0.6.3 (released in November 2013) a lot has changed. It's time for some benchmarks and for some insights on how improvements were made.
Benchmark: font output
This is a node that just displays text of different sizes in various locations. It also uses random text (in the form of random numbers), so the internal text caching of info-beamer is tested.
Here's the performance (measured in frames per second, fps) observed in the different info-beamer versions starting with 0.6.3:
As you can see, font rendering was really slow in 0.6.3. Each character was rendered individually into a texture and then drawn by rendering those textures. Starting in 0.7.0 this changed: Now there is only one texture per loaded font. This texture contains all the characters required to render text. This technique is called a font atlas and gives a huge performance boost compared to rendering each character individually. Such a font atlas dynamically created by info-beamer pi looks like this:
As you can see, info-beamer tries to be smart so it can store the maximum number of characters in a single texture. If there is no room left, a bigger texture is used and the atlas is regenerated. The atlas is shared among different nodes using the same fonts, so there is no need to waste precious texture memory if the same font is used multiple times.
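The grow-and-regenerate idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `GlyphAtlas`, the simple row-by-row ("shelf") packing and the doubling of the atlas size are all made up for this example and are not taken from info-beamer's actual implementation.

```python
class GlyphAtlas:
    """Toy font atlas: packs glyph rectangles left to right in rows (shelves).

    Hypothetical sketch, not info-beamer's real packing strategy.
    """

    def __init__(self, size=64):
        self.size = size      # the atlas texture is size x size pixels
        self.glyphs = {}      # char -> (w, h), remembered so the atlas can be rebuilt
        self.slots = {}       # char -> (x, y, w, h) position inside the atlas
        self._reset_cursor()

    def _reset_cursor(self):
        self.x = self.y = self.row_height = 0

    def _place(self, char, w, h):
        """Try to place one glyph at the cursor; return False if it doesn't fit."""
        if w > self.size or h > self.size:
            return False
        if self.x + w > self.size:        # current row is full: start a new one
            self.x = 0
            self.y += self.row_height
            self.row_height = 0
        if self.y + h > self.size:        # no vertical room left
            return False
        self.slots[char] = (self.x, self.y, w, h)
        self.x += w
        self.row_height = max(self.row_height, h)
        return True

    def _rebuild(self):
        """Assumed growth policy: double the size until every known glyph fits."""
        while True:
            self.size *= 2
            self.slots.clear()
            self._reset_cursor()
            if all(self._place(c, w, h) for c, (w, h) in self.glyphs.items()):
                return

    def add(self, char, w, h):
        """Return the atlas slot for a glyph, growing the atlas if needed."""
        if char in self.slots:
            return self.slots[char]       # already in the atlas: nothing to do
        self.glyphs[char] = (w, h)
        if not self._place(char, w, h):
            self._rebuild()
        return self.slots[char]
```

With a 16x16 atlas and 8x8 glyphs, the fifth glyph no longer fits, so the sketch doubles the atlas to 32x32 and places all five glyphs again. A real atlas would also have to re-render the glyph bitmaps into the new texture at that point.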
The rendering is done by uploading a bunch of squares to the GPU that can then be rendered in one batch. The uploaded information is cached so it doesn't have to be uploaded again if the text rendered doesn't change. So rendering a text that doesn't change is very fast.
Since uploaded text also uses memory on the GPU, info-beamer tries to optimize what's cached there: Text that has not been rendered recently is removed. This ensures a good balance between performance and memory usage.
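A minimal sketch of such a cache, again in Python and again hypothetical: the names `TextGeometryCache`, `max_idle_frames` and `end_frame`, and the "not used for N frames" eviction rule are assumptions chosen for illustration. The text above only says that text which has not been rendered recently is removed.

```python
class TextGeometryCache:
    """Caches per-text geometry and drops entries that were not drawn recently.

    Hypothetical sketch; the eviction rule is an assumption.
    """

    def __init__(self, max_idle_frames=60):
        self.max_idle_frames = max_idle_frames
        self.frame = 0
        self.entries = {}     # (font, size, text) -> [geometry, last_used_frame]
        self.uploads = 0      # counts simulated uploads to the GPU

    def _build_quads(self, text, size):
        # Stand-in for real glyph layout: one (x, y, w, h) quad per character.
        return [(i * size, 0, size, size) for i in range(len(text))]

    def get(self, font, size, text):
        """Return the geometry for a text, building and 'uploading' it only once."""
        key = (font, size, text)
        entry = self.entries.get(key)
        if entry is None:
            self.uploads += 1
            entry = self.entries[key] = [self._build_quads(text, size), self.frame]
        entry[1] = self.frame             # mark as used in the current frame
        return entry[0]

    def end_frame(self):
        """Advance the frame counter and evict text that has been idle too long."""
        self.frame += 1
        stale = [key for key, (_, last_used) in self.entries.items()
                 if self.frame - last_used > self.max_idle_frames]
        for key in stale:
            del self.entries[key]
```

Drawing the same text every frame hits the cache and triggers a single upload. Random text, as used in this benchmark, creates a new entry each time, and old entries disappear once they have been idle for longer than `max_idle_frames`.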
Benchmark: A game scoreboard
This node was used at a small conference where players could compete in a programming game. The node displays the current scoreboard. The node was modified to show more spaceships that are all flying over the screen so it can be used as a benchmark.
That's quite a lot of spaceships. Let's look at the performance graph:
The small change between 0.6.3 and 0.7 can be attributed to the change in font rendering again. The huge jump in 0.8.1 is a result of the change made to the default shader used. In OpenGL several transformations are applied to anything rendered: There is a model, a view and a projection matrix that transform any object coordinates from model coordinates to world space coordinates, then to camera space and finally to screen coordinates.
You can think of that as the following steps: The model matrix takes a virtual object and places it into the world. Now this object exists somewhere. To be useful it should be visible. This is achieved by moving it in front of a virtual camera. So the view matrix projects the object in front of a camera and all its coordinates are now relative to the camera. Finally a projection matrix transforms those into a flat representation (kind of taking a picture of the object).
Each of those steps is represented by a matrix. Coordinates are then multiplied by each of those matrices. There is a trick to speed up these multiplications: You can compose these matrices once so you get a single model view projection matrix that can then be used to project coordinates directly from model space to screen coordinates. Before 0.8.1 this composition wasn't fully utilized: Only the ModelView and the Projection matrix were calculated. As a result, for each texel drawn there were two matrix multiplications instead of one. This slowed things down quite a bit. 0.8.1 fixes this mistake by precalculating the complete model view projection matrix once and then doing only a single matrix multiplication per texel. As you can see it makes quite a difference.
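The trick works because matrix multiplication is associative: P · (V · (M · v)) = (P · V · M) · v. Here's a small Python sketch that checks this with plain 4x4 lists. The concrete matrices are assumptions picked for the example, and the "projection" is just a uniform scale standing in for a real perspective or orthographic matrix. It is not info-beamer's shader code.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists (row major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    """Multiply a 4x4 matrix with a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(x, y, z):
    return [[1, 0, 0, x],
            [0, 1, 0, y],
            [0, 0, 1, z],
            [0, 0, 0, 1]]

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]

# Example matrices (assumed values, for illustration only).
model = translation(2, 0, 0)        # place the object in the world
view = translation(0, 0, -5)        # move the world in front of the camera
projection = scaling(0.5, 0.5, 1)   # stand-in for a real projection matrix

vertex = [1, 1, 0, 1]

# Slow path: one matrix-vector multiplication per matrix, for every coordinate.
step_by_step = transform(projection, transform(view, transform(model, vertex)))

# Fast path: compose the matrices once, then multiply each coordinate only once.
mvp = matmul(projection, matmul(view, model))
combined = transform(mvp, vertex)
```

Both paths produce the same coordinates, but the composed matrix only has to be calculated once per draw call instead of redoing part of the work for every coordinate.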
Benchmark: conference information
This is a node that was used at a conference. It shows information about the next talk. It uses a shader to animate the background so it displays a rotating vortex. A font is used to show some information about the next talk.
As you can see there's quite some improvement since 0.6.3. The reason for the jump between 0.6.3 and 0.7.0 is an improvement on how font rendering works (see the first benchmark). The change between 0.8.0 and 0.8.1 is a change in the way shaders are set up (see previous benchmark). Finally the change between 0.7.2 and 0.8.0 is related to a change in how output is rendered to the screen. Let's see what this means.
The Raspberry Pi can output multiple layers of visual information stacked on top of each other. Programs request a new layer to draw onto. When info-beamer starts it sets up a new layer and creates an OpenGL surface bound to it. That way output generated by OpenGL ends up on the screen.
Now there is a problem if you output content that uses a 4:3 aspect ratio on a 16:9 screen. You have to scale the output at some point so it looks correct. Previously info-beamer always rendered into a texture and then fitted this texture into the full screen output layer while preserving the aspect ratio.
In 0.8.0 this changed. Now info-beamer is smarter than that. Instead of allocating a fullscreen layer it requests a layer that corresponds to the size of the root node fitted into the available screen size and then renders directly into that layer. So there's no need anymore to render into an intermediate texture. This saves some GPU bandwidth since the root node output is directly rendered on the screen and not into a texture. The higher the resolution the root node requests (using gl.setup), the more performance you gain by that change.
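Fitting the root node into the screen while preserving its aspect ratio boils down to a little arithmetic. The following Python sketch shows one way to compute such a layer rectangle. The function name `fit_layer` and the choice to center the layer on the screen are assumptions made for this example.

```python
def fit_layer(node_w, node_h, screen_w, screen_h):
    """Return (x, y, w, h) of the largest layer with the node's aspect ratio
    that fits on the screen. Centering the layer is an assumption."""
    # Use the smaller scale factor so neither dimension overflows the screen.
    scale = min(screen_w / node_w, screen_h / node_h)
    layer_w = round(node_w * scale)
    layer_h = round(node_h * scale)
    # Split the unused space evenly between both sides.
    x = (screen_w - layer_w) // 2
    y = (screen_h - layer_h) // 2
    return x, y, layer_w, layer_h
```

For a 1024x768 (4:3) root node on a 1920x1080 (16:9) screen this gives a 1440x1080 layer with 240 pixels of unused screen on the left and on the right.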
Developing info-beamer is fun. It's an interesting challenge to find more ways to improve performance. Certainly 0.8.1, which is going to be officially stable in the next few weeks, is the fastest info-beamer ever. Enjoy. As always, let me know if you have any feedback or questions.