Optimising The Draw Pipeline

This article was originally written to help people creating games for the OUYA console optimise the draw pipeline and get the best possible performance from some (let's be honest here) low-end hardware. However, the techniques shown here are most definitely not limited to that platform! You can use the optimisations we discuss on almost all target platforms and should get an FPS boost from them, although the gains will be most noticeable on low-end mobile devices.

The Base File

To make things easy for you, we have created five different "base files" for you to download and test as we go along. These base files are very simple GMZ files that you should import into GameMaker: Studio and then run on the target platform. They simply spawn objects that move across the room when you press the mouse or the gamepad button, and have a controller that shows the real FPS. The controller is where we will be adding extra code to optimise things, so when you load each base file into GameMaker: Studio, you should always check out the Create and Draw events of this object.
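As a rough idea of what such a controller's Draw event might contain, here is a minimal sketch. It is not the code from the base files: it assumes the "Real FPS" and "Actual FPS" figures quoted in this article correspond to the built-in variables fps_real and fps, and the text positions are arbitrary.

```gml
/// Controller object - Draw event (illustrative sketch, not the base file code)
// fps_real: frames per second the game could reach if it were not capped
// fps:      frames per second actually achieved (capped by the room speed)
draw_set_colour(c_white);
draw_text(16, 16, "Instances: " + string(instance_count));
draw_text(16, 32, "Real FPS: " + string(fps_real));
draw_text(16, 48, "Actual FPS: " + string(fps));
```

Note that instance_count includes the controller itself, so the number shown will be one higher than the number of spawned instances.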

To get started, download Part 1 of the base file and run it: Optimise Drawing Part 1

Run the "game" now on your OUYA or mobile device, and tap the screen (or press the gamepad button) to spawn instances and get an idea of what the default settings are capable of.

This base file has a small 640x360 room which uses the view port settings to scale it up to 1280x720, which is a pretty standard approach to scaling a game. It is not the optimal solution, but it gives us a baseline from which to get some figures for comparison. The following table shows how it runs on the OUYA (after each optimisation in this article, you will see a similar table with figures based on the performance of the test on an OUYA device):

 Instances  100  200  500  1000  2000
 Real FPS  230  195  120  69  37
 Actual FPS  48  36  20  12  6

Alpha Testing and Blending

You may not know it, but alpha blending is actually pretty expensive when it comes to GPU processing, so our first optimisation is to remove that from the draw pipeline if possible. 

For this, we need to use a couple of functions that were added to GameMaker: Studio in recent updates:

draw_enable_alphablend();
draw_set_alpha_test();
draw_set_alpha_test_ref_value();

The first of those functions, draw_enable_alphablend(), is used to switch alpha blending on or off, and switching it off has the effect of making ALL alpha blended pixels opaque (you can see an example of this in the manual). This means that the GPU doesn't have to worry about what's "underneath" the pixels being drawn, which speeds up the drawing.

However, if we simply turn this off then run our game we get an image like this:

[Image: OptimisingAlphaBlend.png]

Oops! That's not quite what we want, is it? This is where alpha testing comes into play... Switching alpha blending on or off doesn't actually affect the alpha values of individual pixels, but rather simply forces them all to be drawn opaque. This means that we can switch on alpha testing and then set a threshold (or reference) value for testing, and the GPU will actually check and discard ("throw away") those pixels that have an alpha value below the one we set.

So, we switch on alpha testing and set the reference value to 128 (on a scale of 0 to 255), so that all those pixels with an alpha value of less than 0.5 are discarded. This happens before they are rendered to the screen, so the fact that alpha blending is now disabled doesn't matter: the pixels with an alpha value below the threshold have been discarded and just won't be drawn.

In this way we can add transparency to our sprites again, even though alpha blending is no longer enabled, and it also gives a decent boost to the frame rate.
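The setup described above could be sketched as follows. This assumes the calls are placed in the controller's Create event; the base file may organise things differently.

```gml
/// Controller object - Create event (illustrative sketch)
// Stop the GPU from blending each pixel with whatever is already underneath it.
draw_enable_alphablend(false);

// Discard pixels below the threshold before they are drawn.
draw_set_alpha_test(true);

// The reference value is on a 0-255 scale, so 128 is roughly an alpha of 0.5.
draw_set_alpha_test_ref_value(128);
```

Because the test is all-or-nothing, any semi-transparent edges in your sprites will end up either fully drawn or fully discarded, so it is worth trying a few reference values to see which looks best with your artwork.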

Download Part 2 of the base file and run it: Optimise Drawing Part 2

When testing this version of the test game on the OUYA, I could now get almost 80 real FPS for 1000 instances on the screen, and 15 actual FPS, which still isn't great but shows an improvement. We can still do better though...

 Instances  100  200  500  1000  2000
 Real FPS  250  200  130  77  42
 Actual FPS  52  40  25  15  8

NOTE: This part may not be appropriate for all games, so you should experiment with these functions and test different reference values before adding them to your game.

The Application Surface

With the 1.3 update, the application surface has been exposed to you (the programmer). This means that you can also squeeze more FPS from your game by knowing just how to draw this surface, which is what we are going to cover here.

The first thing we want to do is get rid of aspect ratio correction and scale the surface to fit our screen. This will permit us to switch off the "Clear Display Buffer With Window Colour" option in the Room Editor. Switching this off will also help boost the FPS, as it removes a fullscreen fill from the draw pipeline.

NOTE: In your game you may also want to switch off the "Draw Background Colour" option too, but only if you are drawing a fullscreen background, otherwise you'll get some pretty "interesting" effects (the base file for this part of the article has the background colour off, so you can see what happens!).

So, we switch off the buffer clear, and while we are here, we can switch off the views too, as we no longer need them to scale our game up to fit the device screen. With those options off in the room editor, we can then go to the Global Game Settings and switch to the "stretch to fit" option in the graphics tab for our target platform.

Finally, we take over the drawing of the application surface itself, and resize it to better fit our screen. It's important to note that when resizing the application surface, you really want it to be the minimum size necessary and then scale it up to fit. So, in the case of the OUYA, you do not want to create a 1280x720 surface, but rather a smaller surface with the same aspect ratio, and then draw that scaled to fit the device.
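A minimal sketch of this step, assuming it goes in the controller's Create event and that the surface should match the 640x360 room of the base file:

```gml
/// Controller object - Create event (illustrative sketch)
// Tell GameMaker: Studio not to draw the application surface automatically,
// because the controller is going to draw it itself.
application_surface_draw_enable(false);

// Keep the surface at the room's native size (640x360 here) rather than
// the 1280x720 of the display: a quarter of the pixels to fill.
surface_resize(application_surface, 640, 360);
```

640x360 and 1280x720 are both 16:9, so the surface can be stretched to the display without distortion.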

Download Part 3 of the base file and run it: Optimise Drawing Part 3

Note that when we draw the surface itself, we switch off alpha testing altogether as it's not needed, and we also switch off interpolation, as (again) it's not needed since we are essentially drawing a rectangle of pixels all with an alpha of 1.

If you test the new file, you will see that our controller object has now "taken over" the drawing of the application surface, and the boost is quite spectacular! So, switching off buffer and background clearing, switching off automatic aspect ratio correction, and drawing things yourself will give a fair bit of extra FPS on all platforms.
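Drawing the surface yourself might look something like the sketch below. It assumes the drawing is done in the controller's Draw GUI event and that the GUI layer covers the whole display; the base file may use a different event or different dimensions.

```gml
/// Controller object - Draw GUI event (illustrative sketch)
// The surface is fully opaque, so alpha testing does nothing useful here.
draw_set_alpha_test(false);

// No smoothing when the surface is scaled up.
texture_set_interpolation(false);

// Stretch the small surface over the whole GUI layer.
draw_surface_stretched(application_surface, 0, 0,
                       display_get_gui_width(), display_get_gui_height());

// Switch alpha testing back on for the sprites drawn in the next frame.
draw_set_alpha_test(true);
```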

 Instances  100  200  500  1000  2000
 Real FPS  1040  830  375  85  55
 Actual FPS  60  60  60  51  28

When testing now, we get an even greater boost in FPS... On the OUYA I tested these files on, I now get a real FPS of 85 or so for 1000 instances, which isn't too different from the previous version, but the actual FPS has jumped up to 50! The lesson here is that you really should only draw the pixels you need to draw...

Bypass The Default Shader

You may not realise it, but everything that you draw to the screen in your games made with GameMaker: Studio passes through a "default" shader. This shader is designed to cover almost all eventualities when drawing and contains a lot of extra code for doing things "just in case" you want to, which means that it is possibly not the optimal solution when drawing.

However, you can actually bypass this shader. In fact, if you are using shaders anywhere else in your game, you already are! Basically, when you call the function shader_set(), you are telling the GPU to use this new shader instead of the default one. Now, the great thing about this is that you can use a stripped-down shader when drawing everything and bypass the default shader altogether, which will help your FPS greatly and further optimise things.

Some of you may be wondering how on earth you write a shader for this? Well, you needn't worry about it because when you create a new shader in GameMaker: Studio, it creates a simple "pass through" shader, which is exactly what we need for this next optimisation.
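One possible way to route all instance drawing through your own shader is sketched below. It assumes a shader resource called shader0 and relies on the controller's Draw Begin event running before the other instances' Draw events and its Draw End event running after them; the base file may arrange this differently.

```gml
/// Controller object - Draw Begin event (illustrative sketch)
// shader0 is an assumed resource name for the stripped-down shader.
// Only switch to it if it compiled successfully on this device.
if (shader_is_compiled(shader0))
{
    shader_set(shader0);
}

/// Controller object - Draw End event (illustrative sketch)
// Hand control back to the default shader once the instances have drawn.
shader_reset();
```

Checking shader_is_compiled() first means the game simply falls back to the default shader on any device where the custom one fails to compile.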

Download Part 4 of the base file and run it: Optimise Drawing Part 4

It's worth noting that we have modified the fragment shader slightly here. Instead of simply passing through the data, it checks the alpha component of each pixel and does a discard if it's less than 0.5 (half alpha) - this is the same as our alpha test reference functions. Why do we do this? Well, by taking over the draw pipeline with your own shader, the default shader is no longer used, meaning that those functions which apply to the default shader (like alpha testing) are no longer relevant.

NOTE: You could probably remove the code for alpha-testing from the game now, but leaving it in will ensure that on those systems that use OpenGL 1.0 and above your game has maximum compatibility.

This should now give another nice boost to your FPS, with the OUYA I tested showing 150 or so real FPS, and a steady 60 actual FPS. So for our game with 1000 instances, we have finally "broken even" and got our FPS to match our room speed.

 Instances  100  200  500  1000  2000
 Real FPS  1150  900  430  150  60
 Actual FPS  60  60  60  60  35

Finishing Touches

So, we have optimised everything we can, no? Well, not quite. We want to do a couple of other things to squeeze as much as possible from our game and device...

The first thing to do is change the colour depth. This is Android specific, and so isn't applicable to other platforms, but switching from 24bit to 16bit colour depth will help with your FPS, and the difference is barely noticeable unless you are using some very smooth gradients in your game.

We can also skip the shader's alpha test when drawing the application surface, since we don't want the surface to be transparent anyway. To do this, we use a simple pass-through shader in the Draw GUI event when drawing the surface.

Download Part 5 of the base file and run it: Optimise Drawing Part 5

In this new example we have shader0 now do the alpha test in the fragment shader, and we have shader1 as a pass through shader only for drawing the surface to the screen. We also have the colour depth set to 16bit in the Global Game Settings, and these things combined should give a further boost to the FPS when drawing.
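Drawing the surface through the pass-through shader could be sketched like this. It uses the shader1 name mentioned above and assumes the GUI layer covers the whole display; the exact code in the base file may differ.

```gml
/// Controller object - Draw GUI event (illustrative sketch)
// shader1 is the plain pass-through shader, so no alpha test is run
// on the pixels of the application surface.
shader_set(shader1);
draw_surface_stretched(application_surface, 0, 0,
                       display_get_gui_width(), display_get_gui_height());
shader_reset();
```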

On my OUYA, I'm now getting a steady 180 real FPS and a solid 60 actual FPS for 1000 instances on the screen. 

 Instances  100  200  500  1000  2000
 Real FPS  1220  920  460  180  60
 Actual FPS  60  60  60  60  37

Summary

As you have seen, optimising the draw pipeline can be done using various techniques, and you should be using these where possible on all low-end platforms, or platforms where your game could be played on low-end devices. Not all of the optimisations shown above will be applicable, as it will largely depend on your game, but experiment with each of them and see what you can and cannot do. Remember, every little bit helps, especially when making games with a large instance count, or intensive drawing!

 

 

 
