Labs: iTorque2D speed improvements
I have received feedback that the numbers from my tests were totally off. Michael Perry from Garage Games took time out and ran my tests…and he measured 49fps to my 1fps! Michael was able to get a couple of the people on Garage Games IRC to reproduce his results, meaning we had multiple confirmed reports of 49fps.
Naturally I was very worried I had misled people, and I spent a bit of time determining what the big differences were between my version and his. Originally Michael thought I must be running a debug version, but this did not turn out to be the case. I cleaned, compiled, and rebuilt in every permutation I could think of, and none of it provided any improvement.
There are a lot of differences between a stock 1.4.1 iTorque2D version and ours, and as I mentioned previously, we are 'overclocking' our iTorque2D engine to run at 60fps by modifying smTickShift to be 4 instead of 5. This turned out to be the crux of the problem, literally one bit. I confirmed that setting smTickShift in iTickable back to 5 in our engine made the test run at 49fps. Conversely, when I modified the stock 1.4.1 engine so that smTickShift was 4 (so that it ran at ~60fps), I saw 1fps.
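To make the 'one bit' concrete, here is a small sketch of the arithmetic. It assumes the tick length in milliseconds is 1 shifted left by smTickShift, which matches the 32ms and 16ms figures in this post. The function names are my own for illustration and are not engine code.

```python
def tick_ms(tick_shift: int) -> int:
    """Tick length in milliseconds, assuming tick = 1 << smTickShift."""
    return 1 << tick_shift


def ticks_per_second(tick_shift: int) -> float:
    """How many ticks fit into one second at the given shift."""
    return 1000.0 / tick_ms(tick_shift)


for shift in (5, 4):
    print(f"smTickShift={shift}: {tick_ms(shift)}ms per tick, "
          f"{ticks_per_second(shift)} ticks per second")
```

Under that assumption a shift of 5 gives a 32ms tick (31.25 ticks per second) and a shift of 4 gives a 16ms tick (62.5 ticks per second), which is where the rough 30fps and 60fps figures come from.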
This raises an interesting question: how does changing the frame tick from 32ms to 16ms have such a dramatic effect?
As mentioned in Part 2, the iTickable will run a time step over and over to 'catch up', so if you oversaturate iTorque2D in any way, you see a dramatic performance decrease. I don't have much time to test this right now, but that seems to be what is happening here.
When the frame tick is 32ms, then the device has an additional 16ms to execute the TorqueScript. Our TorqueScript is still taking the same time to execute, but the window of time is now large enough for it to execute, draw to the screen, and keep moving.
When the frame tick is 16ms, the device runs out of time: executing the script and drawing to the screen takes, say, ~25ms. After two frames, iTickable says we're 'behind' and executes another tick to catch up. That tick also takes too long, so iTickable executes even more ticks while catching up, and everything grinds to a halt.
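The spiral described above can be reproduced with a toy model. This is only a sketch of a generic fixed-timestep catch-up loop, not the actual iTickable code. The costs are hypothetical: I split the ~25ms figure into 20ms of script per tick plus 5ms of drawing per frame, and `simulate` is a name I made up.

```python
def simulate(tick_ms, tick_cost_ms, render_cost_ms, frames=10):
    """Return how many ticks each frame has to process.

    Elapsed time is banked in an accumulator; every frame runs as many
    whole ticks as the accumulator covers, then draws once.
    """
    accumulator = 0
    last_frame_ms = tick_ms  # pretend the previous frame took exactly one tick
    history = []
    for _ in range(frames):
        accumulator += last_frame_ms
        ticks = accumulator // tick_ms        # ticks owed this frame
        accumulator -= ticks * tick_ms        # keep the leftover time
        last_frame_ms = ticks * tick_cost_ms + render_cost_ms
        history.append(ticks)
    return history


# Hypothetical costs: 20ms of script per tick, 5ms to draw a frame.
print("32ms tick:", simulate(32, 20, 5))
print("16ms tick:", simulate(16, 20, 5))
```

With these made-up numbers the 32ms case never owes more than one tick per frame, while the 16ms case owes 1, 1, 2, 2, 3, 4, 6, 7, 10, 12 ticks over the first ten frames. Each frame takes longer than the last, so the debt only grows.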
So what's the lesson here? First, run Torque at 30fps! It raises your ceiling to allow for more execution time. Second, just because there's more time to execute TorqueScript doesn't mean TorqueScript is any faster. You will most likely hit that 'ceiling' if things aren't ported to C.
Special thanks to Michael Perry for all his help!