Slot Machine Reward Psychology

admin  4/12/2022

Do you remember the slot machine example? Gambling and lottery games are examples of a reward based on a variable ratio schedule. In the classroom, an example would be rewarding students for some.

  1. Slot machines reward gamblers with money according to which reinforcement schedule? Variable ratio. Critical thinking questions: What is a Skinner box, and what is its purpose? What is the difference between negative reinforcement and punishment?
  2. The Cambridge researchers put their subjects in an fMRI machine to take images of their brains while they played a two-roll slot machine game. When the players hit a match and won money, the reward systems of the brain predictably got excited: the same areas classically associated with responses to food or sex, which I mentioned earlier.

Okay, so Lavata and I get into heated discussions about WoW’s loot system. He hates that anything random ever drops from an instance, and I think that random things HAVE to drop from instances. He’d be much happier if you always got what you wanted the first time, or if everything you wanted was on a token system.


The fact is that random loot uses well-known behavioral psychology principles to keep players engaged with an activity for longer, and it works on both rats and people. What I mean is: rats will keep pressing the levers, and people will keep mashin’ the keyboards trying to get their loot. At a certain point, random isn’t all that fun, but you’ll keep working towards that goal whether or not you are happy. Behavioral psychology has nothing to do with being happy, though. It has to do with modifying behavior (and behavior modification is a normal part of life). You say: “But I’m not like some Pavlovian dog that salivates at purple loot!” My response to that is, well, the game is more like operant than classical conditioning, so in a way you are right.

Terms (from Schwartz, Wasserman, & Robbins, 2002, Psychology of Learning and Behavior):

  • Behavioral Psychology – Behavior theorists try to understand learning (and believe that all animal species learn in basically the same way). They believe that all learning involves associations between responses and stimuli, that complex behavior can be understood by its parts, and that environmental influences play a large part in learning.
  • Operant Conditioning – Producing voluntary, goal-directed behavior, where reinforcement (either positive or negative) is used to shape behavior. Behavior is affected by its consequences, with the goal of doing things that will have positive consequences for you (i.e. a rat will learn how to press a lever to get food).
  • Positive Reinforcement – When behavior produces a reward that is considered positive. This increases the likelihood that the behavior will occur again. Keep in mind that you might think giving candy to a child is reinforcing, but if they are allergic to candy (as in, eating candy sends them to the hospital), then it’s not a positive reinforcement. It only counts when you give someone something that they actually do want.
  • Conditioned Reinforcement – While things like food are likely to be rewarding if you are hungry, things like money (both inside and outside the game) are conditioned reinforcers.
  • Token reinforcement – Instead of directly giving money or food, you can train an animal to associate tokens with their reinforcement. So, you can teach a chimpanzee that they can get their food after they have X number of tokens.
  • Continuous reinforcement – Every time you do a behavior, you get a desired reward.
  • Intermittent reinforcement – Most behavior is reinforced intermittently. For example, when you work at a job, you aren’t constantly being given pennies or dollars for producing good behaviors throughout the day; you usually get paychecks on some sort of set schedule.

Types of Intermittent reinforcement schedules (and how they show up in WoW):

  • Fixed Ratio Schedule – You get reinforced after completing a certain number of responses, which is not based on the passage of time. PvP honor rewards are fixed ratio schedules. You get loot after accruing a certain amount of honor. If you kill a lot of people quickly, then you get your reward sooner. Reputation grinds, where you get rewarded with small amounts of rep for every kill, end up being fixed ratio schedules, since there is a fixed number of mob kills you have to do to get to the next highest reputation rank. If you had to kill 20 mobs to get a drop, then that drop would be on a fixed ratio schedule, so a lot of quest rewards are fixed ratio schedules.
  • Variable Ratio Schedule – This requires a random number of responses to get your reward, and is how all the random drops in WoW (and casinos) operate. Just the passage of time alone doesn’t make the drop more likely, and every time you kill that mob, it has the same X% chance of dropping the item, which doesn’t increase with the number of kills.
  • Fixed Interval Schedule – Requires one response after a fixed amount of time. While loot tends to not be given out just based on the passage of a fixed amount of time, the boss respawn timers are on pretty fixed intervals. If you clear Naxxramas on a Tuesday night, you get no reward for going to Naxx until after the bosses have respawned. Daily quests are also more of a fixed interval schedule. You know that if you did it today, you can’t get rewarded for trying to do it again until tomorrow. True fixed interval schedules would be something like respawning an hour after you killed it, so that the interval was based on your behavior.
  • Variable Interval Schedule – Requires a random amount of time to have passed and a single response to get your reward. A great example of this is ore & herb respawn timers. Any one node takes a random amount of time to respawn, so you are forced to run/fly around a zone trying to find ones that did pop up, rather than camping on a node and coming back every X minutes. In-game fishing is on a variable interval schedule, in terms of how long it takes once you cast before you can get your fish.
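
The four schedules above can be sketched in a few lines of Python. This is only an illustration, not how WoW actually implements loot: the function names, the 20-kill quest, and the use of an exponential wait for the variable interval are all assumptions made for the example. Each function returns a `respond(now)` callable that says whether a response made at time `now` gets reinforced. The variable ratio version uses a constant per-response chance, which matches the "same X% chance every kill" description (strictly speaking, lab researchers call that a random ratio schedule).

```python
import random

def fixed_ratio(n):
    """Reward every n-th response (e.g. a hypothetical 'kill 20 mobs' quest)."""
    count = 0
    def respond(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(p, rng):
    """Reward each response independently with chance p (a random drop)."""
    def respond(now):
        return rng.random() < p
    return respond

def fixed_interval(wait):
    """Reward the first response made at least `wait` time units after the last reward."""
    ready_at = 0.0
    def respond(now):
        nonlocal ready_at
        if now >= ready_at:
            ready_at = now + wait
            return True
        return False
    return respond

def variable_interval(mean_wait, rng):
    """Like fixed_interval, but the wait is redrawn at random after every reward."""
    ready_at = rng.expovariate(1.0 / mean_wait)  # assumed distribution
    def respond(now):
        nonlocal ready_at
        if now >= ready_at:
            ready_at = now + rng.expovariate(1.0 / mean_wait)
            return True
        return False
    return respond

# Usage: a 20-kill quest pays out on exactly the 20th, 40th and 60th kill.
quest = fixed_ratio(20)
print([kill for kill in range(1, 61) if quest(kill)])  # [20, 40, 60]
```

Notice that the two ratio schedules ignore `now` entirely, while the two interval schedules ignore how many responses you make in between. That is the whole difference between "ratio" and "interval".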

But WoW rewards me constantly every time I do something: At level cap, your most common rewards are basically token reinforcement (money and token drops from instances). Loot that you want usually drops in PvE on a variable ratio schedule, off bosses that respawn on fixed interval schedules. Arenas still have some random elements, because the queue system for rankings means that sometimes you lose. But there’s a reason why so many people queue up the day before arena points are handed out: arena points are given out at the same time every week, so they are driven almost entirely by a fixed interval schedule. Boss respawn timers on the high-end raids are mostly there so that we’ll get up out of our chairs and do other things with our lives sometimes, and to control the pacing of how often we get those big rewards once we hit end-game. The great thing WoW can do is combine all of these schedule types in ways that keep us playing longer.


There are plenty of examples of WoW on variable ratio schedules. I could kill the first boss in Shadow Labs non-heroic as many times as I wanted to try and get my lifebloom idol off the boss, as long as I had people who were willing to go in and help me farm it. We ran that instance over and over and over again, until I just gave up mostly because my friends weren’t going to be my friends much longer if I kept making them run that instance with me (and I understood that going more didn’t increase the likelihood of it dropping).
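
To put a number on that, here is a minimal sketch, assuming each kill is an independent roll with the same drop chance `p`. The function name and the drop chances used are made up for the example, not actual WoW drop rates. More kills raise your overall odds of having seen the item at least once, but the odds on the next kill never change.

```python
def chance_of_at_least_one_drop(p, kills):
    """Chance that an item with per-kill drop chance p has dropped at least once."""
    # The item fails to drop on every one of `kills` independent rolls
    # with probability (1 - p) ** kills; everything else is a success.
    return 1.0 - (1.0 - p) ** kills

# A coin-flip drop: two kills give a 75% chance of having seen it.
print(chance_of_at_least_one_drop(0.5, 2))  # 0.75

# With an assumed 15% drop chance, the cumulative odds creep up with
# every run, but they never reach 100%.
for kills in (1, 5, 10, 20):
    print(kills, round(chance_of_at_least_one_drop(0.15, kills), 3))
```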

What WoW (and casinos) understand is that variable ratio schedules produce the highest rates of responding. You’ll keep playing the game longer so that you can have a chance of getting that awesome reward. While random isn’t “fair”, it keeps you playing the game longer.

Variable ratios are also a tool to keep some items as being very rare, which makes them more desirable. “Green” items are super common, so it’s not exciting when you get one. “Blue” items are worth a little bit more because they’re a little bit harder to get. Some “epic” items are likely to be rare, so we still value them highly, and in most cases, they are the best we can get. “Legendary” items are way super rare, and people will do just about anything to get one. If all it took to get the legendary was a fixed ratio of 20 kills of bosses in a raid instance and everyone got it after that number of kills, it wouldn’t have that high of a value – it would be totally common. The randomness and rarity of the items make them desirable.

If all the loot was based on a fixed ratio token drop schedule, raiding would be really, really boring. “Oh, I don’t get my loot until next Tuesday.” It didn’t work in manufacturing jobs: people created unions to organize against fixed ratio payment schedules, because while they worked hard to get products built, it meant that the entire workday was a constant grind to get things done. Having some grinding to get rewards on a fixed ratio schedule is fine to have in the game, since you’re actually just getting tokens and points on a constant schedule, which isn’t even a true fixed ratio schedule (i.e. kill this mob 20 times and then get the drop).


In this game, it’s the draw of the random exciting rewards that makes us keep coming back week after week, year after year. When the loot items in the existing content are no longer desirable, they release new and better “must have” items for us to be excited about earning again. If bosses didn’t drop random loot, we probably wouldn’t like raiding so much. The variable ratio drop schedules for the desired pieces of loot are what make us keep coming back to an instance. You don’t want to miss a night of raiding, because if the desired piece of loot drops and someone else gets it instead, then you’ll be upset. Likewise, people don’t want to get up from their slot machine at the casino, even if they are on a winning streak, because they are waiting for their payout to come. Casinos would make a lot less money on slot machines if they gave out rewards on any other reinforcement schedule than the one they use.

Understanding behavioral reward systems also helps explain why people got bored with Naxxramas and stopped logging in after they had all the loot they wanted. They no longer expected that instance to give them a reward they wanted, so they stopped playing and found something else to do that was rewarding (either in or out of the game). This is also why gearing up your tank first, before the rest of the raid, usually ends up with the tank getting bored and not wanting to run the instance long enough for everyone else to get loot. However, sometimes you keep running it after you stop getting rewards, since “extinction” of behavior (i.e. when you stop doing something) usually takes longer on a variable ratio schedule than on the other schedules. It helps explain why people will spend hours of their time trying to fish up a rare turtle mount, and why at least some loot out of instances has to be random drops (and why bosses usually drop fewer loot items than there are people in the instance). If you knew that after 20 runs in Ulduar, you could have all the gear you needed, you would run Ulduar 20 times and then stop running the instance.
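
A quick simulation shows how different a guaranteed 20 runs is from a random drop that takes 20 runs on average. This is a rough sketch under assumptions of my own: a 5% per-run drop chance (picked only so the average works out to 20 runs), independent rolls, and a made-up function name. On the fixed schedule every player stops at exactly run 20. On the random one, some players are done after a single run and others are still going long past 20.

```python
import random

def runs_until_drop(p, rng):
    """Count runs up to and including the first one where the item drops."""
    runs = 1
    while rng.random() >= p:  # no drop this run, so go again
        runs += 1
    return runs

rng = random.Random(42)  # seeded so the simulation is repeatable
samples = [runs_until_drop(0.05, rng) for _ in range(10_000)]

average = sum(samples) / len(samples)
still_farming = sum(1 for s in samples if s > 20) / len(samples)

print("shortest grind:", min(samples))
print("longest grind:", max(samples))
print("average runs:", round(average, 1))
print("share of players still farming after run 20:", round(still_farming, 2))
```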

In conclusion, having rare drops on a variable ratio schedule keeps rats pressing the lever longer, keeps people spending money in the casinos, and keeps WoW players in the game longer. It’s not about being fair, or really about what gets you the rewards the fastest: variable ratio reinforcement schedules are what keep you playing the longest. I could ramble on more, but that’s where I’ll stop for today.