Simulation 1: The Quick Draw
Suppose that, in the real world, someone suddenly appeared in front of you holding a gun aimed in your direction, and that at the moment this person appears, you also have a gun aimed in their direction. Whoever fires first will survive, and whoever has the lower reaction time (including the time it takes to squeeze the trigger) will fire first.
In online gaming, the situation is a bit different. The “real world” is actually a simulation that takes place on a computer (called the game server), and this server sends instructions to all the individual computers that the players are using (the clients) to update them with information such as the position of each player.
If something suddenly happens in the server simulation, for example, an object suddenly appears in a room, then the other players in that room won’t necessarily see this object appear at the same time on their individual displays. If a player has a high-latency internet connection to the server, it will take more time for the information to travel from the server to that player’s computer, and the object will appear on their display later than it does for players with lower-latency connections.
Likewise, if one player has a slower display, the object will take longer to render on their display. So if two players appear in front of each other in the server simulation, whoever has the lowest combined internet latency + display latency + reaction time will fire first. And in a sudden death situation, whoever fires first survives the encounter.
I’ve created a simple simulation to illustrate how this sudden death outcome is affected by display latency. The conditions are as follows.
- In the game server, two players suddenly appear in front of each other and have guns already drawn.
- Both players have the same internet latency.
- Both players are running their displays at 240 Hz, and have frame rates of at least 240 fps, and are running with VSYNC off.
- The two players are randomly sampled from a population of elite players with a mean reaction time of 170 ms and a standard deviation of 20 ms. This means that on any given “trial”, a player’s reaction time will be roughly 170 ms. If the standard deviation were 0 ms, the reaction time would be exactly 170 ms on every trial; a higher standard deviation means more variability from trial to trial. Given that this is an elite population of gamers, their performance represents a narrow slice of the whole population of gamers, so it will likely be tightly distributed. Imagine randomly sampling 10 people from the general population and having them compete in the 100 metre sprint: their times will vary wildly, perhaps by as much as 8 seconds between the fastest and the slowest. But in as narrow a slice as an Olympic final, there’s a much smaller spread in performance. In the men’s 100 metre final at the 2016 Olympics, the difference between the first- and last-placed sprinters was about a quarter of a second, and the reaction times across these 8 sprinters ranged from 128 ms to 156 ms (with an interquartile range of 14 ms). So a standard deviation of 20 ms in reaction time seems like a fairly reasonable assumption for a population of elite gamers (stats geek sidenote: my simulation assumes a normally distributed set of reaction times; however, reaction times are typically right skewed).
- There is a delay between the moment the video data is scanned out from the video cable feeding the display, and the moment the pixels render the information on the display. This is due to display processing and pixel rise and fall time. We’ll call this delay “display latency.”
- There’s also a delay due to refresh rate and frame rate. To understand this delay, consider a display with 100 rows of pixels, running at 1 Hz, with VSYNC off. If the server instructs your computer to render a permanent green line on row 50 at time = 0 ms, and at that very same moment your display is scanning out the 1st row, you’ll have to wait around half a second (500 ms) for the display to reach row 50 before you see the green line appear (and if the green line isn’t permanent, and disappears before your display renders it, you’ll never see it!). Now, if we don’t know in advance which row is being refreshed at time = 0 ms, there’s an equal chance that it could be any of the 100 rows. It could be the end of the 49th row, in which case the green line will appear almost immediately, or it could be the 51st row, in which case you’ll have to wait nearly a whole second (all of this holds so long as your frame rate is at least 1 fps). So the range of possible delays is between 0 and 1000 ms. If the refresh rate were 10 Hz, the delays would range between 0 and 100 ms (since it takes 100 ms to refresh all the rows at 10 Hz). More generally, the delay is uniformly distributed between 0 and 1/refresh rate. Since the players are running their displays at 240 Hz with at least 240 fps, the delay due to refresh rate in this simulation is uniformly distributed between 0 and 4.17 ms.
- The display latency (the delay due to display processing and pixel rise and fall time) differs between the two players by a certain amount. For example, one player may have a display with slower electronics, resulting in a delay of 5 ms relative to the other player.
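The refresh-rate delay described above can be sketched as a toy model. This is my own illustration, not code from the original simulation; it assumes rows are scanned top to bottom at a constant rate, using the 100-row, 1 Hz numbers from the example:

```python
def row_delay(current_row, target_row, rows=100, refresh_hz=1.0):
    """Seconds until target_row is next scanned out, given that the
    display is currently scanning current_row (rows are drawn top to
    bottom, then the scan wraps back to row 1)."""
    period = 1.0 / refresh_hz              # time to refresh all rows once
    rows_to_wait = (target_row - current_row) % rows
    return rows_to_wait * (period / rows)

print(row_delay(1, 50))    # scanning row 1: wait ~0.49 s for row 50
print(row_delay(51, 50))   # just missed it: wait ~0.99 s
print(1 / 240)             # worst case at 240 Hz: one period, ~4.17 ms
```

If the current row is equally likely to be any of the rows, the wait is uniform over one refresh period, which is exactly the 0-to-1/refresh-rate distribution used in the simulation.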
The simulation is fairly simple. It randomly samples two players, simulates how long it takes for each player to appear on the other’s display (based on refresh rate and display latency), and then adds the reaction time. Whoever presses the fire button first wins the round (the simulation assumes that the time between the button press and the server acknowledging it is the same for both players).
It does this for 20 differences in display latency, ranging from 1 ms to 20 ms. Importantly, each of these 20 conditions is run for 100,000 trials, and on each trial, whoever wins the round gets a point. After 100,000 trials, these points are converted into a winning percentage. Here are the results.
So with a display latency advantage of 5 ms, your winning percentage goes up to around 57%, which isn’t trivial. The key parameter in this simulation is how widely distributed reaction times are. If everyone had exactly the same reaction time, then a 5 ms display latency advantage would mean you’d win 100% of the time. So the more similar your reaction times are to your opponent, the more that other factors, like display latency, matter. If you run the same simulation with a much larger spread in reaction time (a standard deviation of 50 ms), then a 5 ms difference in display latency results in a win percentage of around 53%.
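A minimal sketch of this kind of Monte Carlo simulation follows. This is my own reconstruction under the stated assumptions, not the author’s exact code: reaction times are drawn from a normal distribution and the scanout delay from a uniform one, as described above.

```python
import random

def win_rate(latency_advantage, n_trials=100_000,
             mean_rt=0.170, sd_rt=0.020, refresh_hz=240):
    """Fraction of duels won by the player whose display latency is
    lower by latency_advantage seconds (all other factors equal)."""
    period = 1.0 / refresh_hz
    wins = 0
    for _ in range(n_trials):
        # time to fire = scanout delay + reaction time (+ extra display latency)
        t_fast = random.uniform(0, period) + random.gauss(mean_rt, sd_rt)
        t_slow = random.uniform(0, period) + random.gauss(mean_rt, sd_rt) \
                 + latency_advantage
        wins += t_fast < t_slow
    return wins / n_trials

print(win_rate(0.005))               # ≈ 0.57 with a 20 ms RT spread
print(win_rate(0.005, sd_rt=0.050))  # ≈ 0.53 with a 50 ms RT spread
```

Widening the reaction-time spread from 20 ms to 50 ms shrinks the 5 ms advantage from roughly 57% to roughly 53%, matching the numbers discussed above.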
Simulation 2: Tracking a Dodging Enemy
In first person shooters like Quake Live, a useful measure of performance is weapon accuracy. With the lightning gun (LG), this is the percentage of fired cells that hit the enemy. So the longer you are able to keep your crosshair on the enemy, the higher your accuracy. Against players who are highly skilled at dodging, anything above 35% is considered excellent (against players who move more predictably, this percentage can rise to 45% or 50%). I was interested to see how display latency would affect LG accuracy in a simple scenario. Here’s the setup.
- One player (the shooter) is attempting to track a dodging player (the target) with the LG. The target moves in a very simple pattern: left and right in a straight line, switching direction every 300 ms. The shooter has 150 cells at her disposal, and holds the fire button down until the ammo is depleted. Given the LG firing rate of 20 cells/second, this means that it takes 7.5 seconds to run through all 150 cells. During this period, the target has switched directions 25 times (the target starts at a standstill, so the first movement is considered the first direction switch).
- When the target is moving in a straight line, the shooter is able to place the crosshair directly in the center of the target. When the target begins to decelerate in order to switch direction, the shooter notices this and uses it as a cue to change crosshair direction by applying a force to the mouse. Because she is human, however, it takes her some time to react, and by the time she has reacted, the target has already slid his body far enough that the crosshair is no longer on it. The smaller the target, and the faster he’s moving, the sooner he’ll be able to slide out of the beam’s path. Let’s assume that it takes 50 ms from the moment he starts to decelerate to the moment he’s slid out of the beam’s path. We’ll also assume that the shooter has a manual tracking reflex of 120 ms, and that it takes an additional 80 ms to move the crosshair so that she has reacquired the target.
- Remember that the shooter is reacting to information on her display. This means that her display adds a further delay between the moment the target switches direction and the moment she reacts to it. As in the previous simulation, this delay is due to the finite refresh rate (she’s running 240 Hz with at least 240 fps and VSYNC off) and the delay due to display processing and pixel transition time.
- The simulation uses this information to calculate how many of the 150 cells hit their target, as a function of display latency (remember, display latency is defined here as the delay due to display processing and pixel transition time). Here are the results (note: the jitter in the graph is due to the uniform random sampling of latency due to refresh rate).
This shows about a 6.5% drop in accuracy when you add 20 ms of display latency, or roughly 1.5% for every 5 ms. Again, this isn’t trivial. A well armored enemy in Quake Live has 300 units of damage protection, and each LG cell deals 6 units of damage. If two fully armored opponents face each other with the LG, where one player has 33% LG accuracy and the other 31.5%, then the better player will survive the encounter with about 16 damage protection points to spare.
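One way to recover the “about 16 points to spare” figure is the following back-of-the-envelope arithmetic, assuming both players empty all 150 cells and land exactly their expected number of hits (an idealization, just to illustrate the calculation):

```python
PROTECTION   = 300    # damage protection of a fully armored player
CELLS        = 150    # full LG ammo load
DMG_PER_CELL = 6      # damage per cell that hits

acc_better, acc_worse = 0.33, 0.315   # the 1.5% accuracy gap

# Expected damage the better player absorbs over the full exchange
damage_taken = CELLS * acc_worse * DMG_PER_CELL
print(PROTECTION - damage_taken)      # ≈ 16.5 protection points to spare
```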
Of course, this is a very simple simulation. In reality, if a target was dodging this predictably, you wouldn’t even have to react; you could use pattern recognition to anticipate the movements. But it’s an interesting start. Perhaps we’ll do a follow-up investigation where we artificially add a given amount of display latency and see how this affects real world performance.
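The tracking model above is simple enough to sketch directly. This is my reconstruction under the stated assumptions, not the original code: per direction switch, the beam is off-target from the moment the target escapes it until the shooter has seen the switch on her display, reacted, and reacquired.

```python
import random

REFRESH_HZ  = 240
SWITCHES    = 25       # direction changes during the 7.5 s burst
TOTAL_CELLS = 150
FIRE_RATE   = 20.0     # cells per second
ESCAPE      = 0.050    # s: target slides out of the beam after starting to decelerate
REFLEX      = 0.120    # s: manual tracking reflex
REACQUIRE   = 0.080    # s: time to move the crosshair back onto the target

def lg_accuracy(display_latency, trials=5_000):
    """Expected LG accuracy for a given display processing + pixel latency (seconds)."""
    total = 0.0
    for _ in range(trials):
        missed_cells = 0.0
        for _ in range(SWITCHES):
            scanout = random.uniform(0, 1 / REFRESH_HZ)
            # beam-off-target time for this direction switch
            off_target = scanout + display_latency + REFLEX + REACQUIRE - ESCAPE
            missed_cells += off_target * FIRE_RATE
        total += (TOTAL_CELLS - missed_cells) / TOTAL_CELLS
    return total / trials

print(round(lg_accuracy(0.000) * 100, 1))   # no extra display latency
print(round(lg_accuracy(0.020) * 100, 1))   # 20 ms of extra display latency
```

In this sketch, each extra 20 ms of display latency costs 20 ms × 25 switches = 500 ms of beam-on-target time out of 7,500 ms, i.e. roughly a 6.7% accuracy drop, in line with the reported results.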
There’s another point I’d like to bring up here about these simulations. The term “input lag” can be somewhat misleading. We often think of the input lag chain as starting with a mouse movement or button press, and ending when the pixels reflect this event. But in the simulations here, we’re interested in the chain of events that occurs between the moment the client computer receives information from the server, and the moment the pixels on the client’s display reflect that information. And unlike the case where a mouse movement or button press causes a change in the display (e.g. when you move the mouse the view pans left or right), in these simulations, the cause of the pixels changing has nothing to do with the “input” of the client.
There’s another tricky nuance here. Remember how we talked about refresh rate being a limiting factor in how fast the display can update information? Well, that’s true when we’re interested in information we need to see. In order to react to an enemy appearing or switching directions, we need to see this information, and the faster we see this, the better our performance. But does having a fast display mean that our reactions themselves will be registered sooner with the server?
For example, suppose you see an enemy appear, and you click the fire button. This information is sent to the server, the server registers the event, and if you aimed well, the enemy is damaged. But what if your display is super slow, and it takes a whole second for the pixels to show your gun firing? Does this mean that it takes a whole second for the enemy to actually be hit? While I’m hardly an expert here, I think the answer is “it depends.”
Clearly, something like display processing and pixel transition time cannot affect how long it takes for the enemy to actually be hit. The client’s computer doesn’t wait until the pixels have actually rendered the event before sending information to the server. But client frame rate may play a role here. Some games may require the GPU (graphics processing unit) to have internally rendered the frame before allowing the information in that frame to be sent to the server. And if this is the case, then it means that having a slow framerate may slow down your ability to react in two ways.
First, information from the server (i.e. enemy position) will take longer to be rendered on your display. Second, your reactions to this information (i.e. shooting the enemy) will take longer to be registered with the server. Having a high frame rate in games like CSGO and Quake Live is a good thing, and even if you can’t visually detect the difference between, say, 125 fps and 250 fps, it’s still a good idea to choose 250 fps, especially when you have the hardware to achieve this in a stable manner.
One more quick point: it goes without saying that other parts of the input lag chain, such as USB latency, are important too. They’ve been ignored in these simulations, but they have essentially the same effect as reaction time. The more latency there is between user input and the display reflecting this input, and/or the server receiving this information so that the input is registered in the actual game, the greater the performance hit.
What Does All This Mean?
We now have a bit of an idea of why input lag matters. But how should this information affect decisions you make around choosing a display, or modifying a display’s settings? First, you need access to reliable information about a given display. Being able to discern how much latency in the chain is due to the display itself, rather than to processes that occur earlier in the chain, is certainly important. Also, knowing how various settings, such as G-SYNC and VSYNC, affect latency will allow users to make intelligent decisions. Trading a few milliseconds of input lag for a radically smoother visual experience may be an excellent tradeoff in many gaming situations, but unacceptable in others.
For more in-depth information on this subject, see our G-SYNC 101 series for a comprehensive investigation of how syncing methods affect input lag.
Major props if you made it through this entire thing, and hopefully some of you will have learned something useful. There are a few people I’d like to thank for helping with this:
- ‘flood’ for conversations about human reaction time, providing critical feedback for the manual tracking reflex methodology, giving awesome suggestions for future testing, and for providing the CSGO sniping video.
- Yasukazu ‘kukkii’ Kashii for patiently being a test subject for the manual tracking reflex tests.
- ‘Nucleon’ for giving valuable insight into gaming netcode, and for introducing me to the concept of the ideomotor phenomenon.
- Gian ‘myT’ Schellenbaum for offering advice on how to use his UberDemoTools API for extracting useful information directly from a Quake Live demo file.