Am I too slow?

I’ve been playing a video game called Destiny on an Xbox One game console. It’s in the category of FPS (first-person shooter) games, which place a premium on fast reaction times. I’ve played it quite a bit over the past year or so (it keeps track of exactly how many hours, but I really don’t want to know that piece of data) and while I enjoy playing it, I’m not particularly good at it, at least compared to others that I play with. It is satisfying that I have improved immensely since I started - I no longer die in the first 10 seconds - but I started to wonder if maybe there is a simple explanation: my reaction time is simply worse than theirs. I vaguely remember that reaction times may increase by a factor of two between your twenties (where most of my player companions are) and the late sixties (where I am). Maybe I can blame it all on my age.

So I set out to measure my reaction time. There are a variety of possible reaction time tasks that could be used but to start with I chose the simplest, and probably most common one: the simple reaction time. In this task there is one stimulus - an LED light turning on in my case - and one response - push a button. Simple: the light comes on, you push the button as fast as you can and measure the time delay between the two events: that is your (simple) reaction time.

The device

My first thought was that I could find a program for my computer or phone that would measure it. And indeed there are quite a few programs like that, even web browser based ones. However, I was very skeptical of these programs because there are quite a few hidden timing unknowns. They are running on computers doing complex tasks - including tasks simultaneously competing with the measurement program. The screen displays may have relatively slow refresh rates and the keyboard or mouse response must also be processed. I’ve attempted to use computers like this to measure real time events in the past and been surprised when there was an unexpected delay of a second or two while the computer dealt with another task - like a floppy disk access (I said I was old). Altogether, it was much too complicated for me to trust.

But it’s a simple task, so I made a quick Arduino-based setup to do the measurement. I only had to add a pushbutton switch on a protoboard and the Arduino computer itself (I used the Arduino Micro version). A couple of wires, and a USB connection to my laptop for programming and data output and all was set. I didn’t even have much thinking to do for the construction or programming - there are many Arduino reaction timers already documented on the web and I chose one from the Instructables web site. I made a few changes in the layout and program, of course, but relatively minor ones.

Based on: http://www.instructables.com/id/Arduino-Reaction-Time-Tester/

My reaction time

I took some data over 8 sessions and four days in a variety of circumstances, at the end of which I had a list of 400 numbers - my reaction times, in milliseconds, for each of the 400 trials. It was a simple matter to take their mean and standard deviation. My results gave a mean of 201 milliseconds with a standard deviation of 17 milliseconds.
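For anyone who wants to repeat the calculation, here is a minimal sketch of the mean and standard deviation step in Python. The list of numbers is made up for illustration (it is not my data), and the variable names are my own choice.

```python
import statistics

# Hypothetical reaction times in milliseconds (illustration only, not the real data).
reaction_times_ms = [188, 195, 201, 184, 230, 199, 192, 210, 187, 224]

mean_ms = statistics.mean(reaction_times_ms)
# stdev() is the sample standard deviation (it divides by n - 1).
sd_ms = statistics.stdev(reaction_times_ms)

print(f"n = {len(reaction_times_ms)}, mean = {mean_ms:.1f} ms, sd = {sd_ms:.1f} ms")
```

With the full list of 400 trials in place of the made-up values, the same two calls would give the summary numbers.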

So how does this compare to other people’s reaction times? A quick google and a little discrimination applied to the search results yielded an article that had the following plot of several hundred people’s reaction times as a function of age. The two different colored points are two different studies and the black line is a plot of the average reaction times as a function of age. Note that this is the average reaction time increase of this population of people - not the increase in reaction time of one individual over time. The two may or may not be similar. I drew in the red line at the value of my reaction time I measured with my setup.

From: http://journal.frontiersin.org/article/10.3389/fnhum.2015.00131/full

So there is good news and bad news. The good news is that my reaction time is actually very good compared to this population - whatever the age. I can hold my own even compared to the youngsters, at least statistically. The population shown above however is not FPS Destiny players, so there is still the possibility that my friends have even lower reaction times. A study for the future, perhaps, if I can talk some gamers into measuring theirs. The device is easily sent in the mail so maybe I can set up a little data collection project.

The bad news is that I can’t blame my performance on my simple reaction time.

magic asks the right question

There is a web service, aptly named Twitch, that allows people to watch gamers as they play their games. The page displays a window with the video of the game, usually with a picture-in-picture of the player themselves inserted into a corner. There is also a chat window where the viewers can type in their comments and see other viewers’ comments, in a style reminiscent of Twitter. It’s a fun social medium and can also be a good learning venue.

One of my favorite players on Twitch is magicauer. She is one of those that I have played with who is clearly better than I am (though she is quite gracious about it). Not too long after I took these measurements, I typed them into chat on her stream. Even distracted as she was playing Destiny with bullets flying and death imminent, her immediate question was: “Is the distribution Normal?”.

This is exactly the right question, and should always be asked when doing statistics. The Normal distribution, also known as a Gaussian distribution, is the most commonly used one, sometimes inappropriately. Wikipedia has a good introductory article about it. It’s also referred to informally as the “bell curve” or even “the curve” (flashback to school grades anyone?). It is used so ubiquitously probably because it is so convenient mathematically, though a deeper argument is that it is the limiting distribution of a whole collection of random variates. Whatever the case, let’s look at my data. Here is a histogram of the 400 measurements I made of my simple reaction times, overlaid with the best fit Normal distribution (the one specified by the mean and standard deviation calculated above):

Since it’s a finite sample, you expect fluctuations about the ideal curve. It’s not a terrible fit - at least there is a single peak and the data does taper off on either side more or less smoothly. But it’s also pretty clear that it’s not a very good fit - maybe it’s a little more peaked than expected and there is certainly a long tail on the high end. Furthermore, the data is clearly not symmetric about the peak as it should be for a Normal distribution. There are formal measures of how good a fit is, but let’s just take the empirical observations for now.
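The two things the eye picks up here, asymmetry and peakedness, can also be put into numbers. The sketch below computes the moment-based sample skewness and excess kurtosis, both of which are zero for an ideal Normal distribution. The function names and the short test list are hypothetical, and this is only a rough diagnostic, not one of the formal goodness-of-fit tests.

```python
def central_moment(values, order):
    """Average of (value - mean) ** order over the sample."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def sample_skewness(values):
    """m3 / m2**1.5: positive when the long tail is on the high side."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def sample_excess_kurtosis(values):
    """m4 / m2**2 - 3: zero for a Normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical right-tailed sample in milliseconds (illustration only).
example_ms = [185, 188, 190, 192, 195, 198, 205, 240]
print(f"skewness = {sample_skewness(example_ms):.2f}")
print(f"excess kurtosis = {sample_excess_kurtosis(example_ms):.2f}")
```

A clearly positive skewness would be the numerical counterpart of the long tail on the high end.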

The skew-normal distribution

Is there a better distribution? There is a veritable zoo of studied distributions, hundreds probably, and you are always free to make up your own. In this case however there is an obvious candidate, named appropriately enough the Skew-Normal distribution (Wikipedia it if you want more information). It is a three parameter distribution (the Normal is a two parameter distribution: mean and standard deviation) so there is more freedom to choose the distribution to fit the data better. Here is a plot of the best fit of this distribution to my data:

This is clearly a better fit than the Normal distribution. The data still appears to be a little more peaked than the distribution, but the one-sided tail is captured reasonably well, so I’m pretty happy with it as a description of the data.

The three parameter values for this fit are location = 182 (roughly corresponding to the mean), scale = 25 (roughly corresponding to the standard deviation) and shape = 4.5 (a measure of the skewness).
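The location is not itself the mean of a skew-normal distribution; the mean and standard deviation implied by the three parameters follow from the standard skew-normal formulas (the ones given in the Wikipedia article). Here is a minimal sketch of that conversion. The function name is my own, and the inputs are just the fitted values quoted above.

```python
import math

def skew_normal_moments(location, scale, shape):
    """Return (mean, standard deviation) implied by skew-normal parameters."""
    delta = shape / math.sqrt(1.0 + shape ** 2)
    mean = location + scale * delta * math.sqrt(2.0 / math.pi)
    sd = scale * math.sqrt(1.0 - 2.0 * delta ** 2 / math.pi)
    return mean, sd

# Fitted values quoted in the text.
implied_mean_ms, implied_sd_ms = skew_normal_moments(location=182.0, scale=25.0, shape=4.5)
print(f"implied mean = {implied_mean_ms:.1f} ms, implied sd = {implied_sd_ms:.1f} ms")
```

For these parameters the implied mean comes out at roughly 201 ms and the implied standard deviation at roughly 16 ms, close to the sample mean of 201 ms and standard deviation of 17 ms computed directly from the data. With shape = 0 the formulas reduce to mean = location and standard deviation = scale, which is the ordinary Normal case.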

So what do we win here? A better fit by itself isn’t necessarily useful. For example, one of the three parameters, the location, is a measure of something like the mean in a normal distribution. However, since almost all the reaction time data for various populations reported in the literature is given in terms of the mean, I can’t compare my number to others. The parameter still could be useful, of course, if I were to succeed in collecting data from other gamers, and analyze it with the same distribution.

The potential bigger win however is if the three parameters better reflect the underlying mechanisms of the reaction time. However I don't know of a good model of the mechanisms underlying simple reaction times. Certainly a large part is simple neural conduction times, but the brain gets into the act with some processing connecting input and output channels. I could imagine the long tail on the distribution may be a reflection of variations in some sort of attention process in the brain, but I don't have any good model for that.

Good days and bad days?

Maybe I can make use of my three parameters by exploring my good days and bad days. It's a universal observation of people going about any task that they seem to have good days and bad days. Sometimes everything seems to be working right and the task just flows along. Other times there is just frustration and discontent. And in my Destiny playing there certainly are good and bad days, so maybe this is just a reflection of varying reaction times. Since I can measure my reaction times I can explore this.

I spent a couple of weeks working on this, and in the end I didn't have much luck in showing any correlation that I found interesting. The main difficulty is that it's very time-consuming to get enough data to look at variability. Each trial takes maybe 15 or 20 seconds, so the time to take the 400 points used in the fit above would be over an hour and a half. If it was a good day at the start of the measurement, it would almost certainly be a bad day after 90 minutes of this boring task. My attention would certainly wander after only a few minutes and performance would suffer. I discovered that I could tolerate maybe 50 trials before I started to lose it. Unfortunately 50 trials is not really sufficient for a reliable fit to the data for the skew normal distribution.
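The time estimate is simple arithmetic, sketched here for both ends of the 15 to 20 second range. The function name is hypothetical.

```python
def session_minutes(trials, seconds_per_trial):
    """Total session length in minutes for a given number of trials."""
    return trials * seconds_per_trial / 60

for trials in (400, 50):
    low = session_minutes(trials, 15)
    high = session_minutes(trials, 20)
    print(f"{trials} trials: {low:.1f} to {high:.1f} minutes")
```

So 400 trials would take somewhere between 100 and about 133 minutes, while a tolerable 50-trial session takes roughly 12 to 17 minutes.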

Here are plots of two days, the left hand which I classified as a bad day and the right hand a good day:

They really are pretty similar. The means were both 190 ms, and the standard deviation was 16.2 ms for the first and 22.8 ms for the second. It looks like my attention might have been flagging a little bit there at the end of the second one. But in any case I couldn't distinguish the good day and the bad day from this data, and that held true for the other days for which I gathered data.

If not simple reaction time, then what?

If the simple reaction time doesn't answer the question of why I'm not as good as my companions, nor does it shed light on my own good days and bad days, then what does?

There are at least two more levels of reaction times that could be measured and that go beyond this simple reaction time statistic. Perhaps they could get closer to an explanation.

The next step would be to measure my go/no-go reaction time. This is an elaboration of the above task in that there are two stimuli - say the light is either red or green and you should push the button if it is green but not push the button if it is red. This requires a simple decision to be made in the brain, and that additional time should be reflected in a longer reaction time. But it's also more complicated to analyze, because a response can now be right or wrong. For example, there is a tradeoff between being very fast and often wrong, or slower and more often right, which needs to be added to the analysis. But it's easy enough to implement the task and the analysis isn't that bad, so this is a potential little project.
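To give an idea of what the extra analysis might look like, here is a rough sketch of scoring a go/no-go session. Everything in it is an assumption on my part: the trial format (a stimulus label plus a reaction time, or None when the button was not pushed), the function name, and the made-up trials.

```python
from statistics import mean

def score_go_no_go(trials):
    """Score (stimulus, reaction_time_ms or None) pairs.

    stimulus is "go" (green: push) or "no-go" (red: don't push);
    a reaction time of None means the button was not pushed.
    """
    go = [rt for stim, rt in trials if stim == "go"]
    no_go = [rt for stim, rt in trials if stim == "no-go"]
    hits = [rt for rt in go if rt is not None]              # correct pushes
    false_alarms = [rt for rt in no_go if rt is not None]   # wrong pushes
    return {
        "hit_rate": len(hits) / len(go) if go else None,
        "false_alarm_rate": len(false_alarms) / len(no_go) if no_go else None,
        "mean_hit_rt_ms": mean(hits) if hits else None,
    }

# Hypothetical session (illustration only).
session = [
    ("go", 260), ("no-go", None), ("go", 245), ("go", None),
    ("no-go", 230), ("go", 275), ("no-go", None), ("no-go", None),
]
result = score_go_no_go(session)
print(result)
```

The speed versus accuracy tradeoff would show up as the mean reaction time on hits going down while the false alarm rate goes up.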

The final step is really a whole realm of possibilities in which there are multiple stimuli and multiple responses - maybe two lights to indicate which of two buttons is to be pushed. Or more lights and more buttons and more decision processing. This is getting more realistic but also more difficult to implement and analyze.

A good solution would be to have a standardized test in the actual game environment with realistic (for the game) stimuli and responses and some numerical measure of success. I'm looking for this kind of standard test that I could repeat whenever I wanted to measure my effectiveness. If the task was simple, easy and quick, I could attempt it whenever I started to play Destiny and if I didn't do well, I could save my time and go out and do some gardening.

Tangled Spaces