Computers are difficult enough to reason about when there’s just a single thread doing one task. There are dozens of cores in today’s modern processor world, and your program might try to take advantage of using more than just one. Things happening concurrently makes the number of states and interactions explode in to a mess we as humans are likely going to have trouble understanding. So, like [Hillel], you might turn to the computer to try and model those interactions.
The model in question is a task queue. Things are added to the pile, and “workers” grab one from the pile and process it. There are two metrics used to measure the effectiveness of a task queue: throughput and latency. Throughput is the number of things you can do per second (like this maximum throughput 3d printer), while latency is the amount of time it takes to finish one thing.
Instead of writing a simulation, [Hillel] turned to a probabilistic model checker called PRISM. There are a few constraints on the model, such as each task being dependent and taking a different time to complete. This is modeled by the fact that each step a worker has a 50% chance of completing their task. For each step, there is a 50% chance a new task comes into the queue, up to a limit of N total tasks. Next, he modeled throughput by creating a reward function that gives us the total number of steps it took us to complete all tasks. Latency is another reward function, but, it is the sum of the number of items in the queue for each timestep.
With just one worker, the growth in latency looks quadratic. Just ten tasks wait for 29-time steps, while 20 tasks wait for 97-time steps. When adding in a second worker, the throughput doesn’t double but instead is about 2/3rds of what it was for one worker. But on the flipside, latency has fallen to something closer to linear.
While a simple model, the idea of a model to simulate a complex domain is there. You could easily add priority, more workers, retrying, adding items back to the queue, adding multiple items in one time step, and other things. [Hillel] provides a little python gist to help you generate the PRISM for an arbitrary number of workers.
Formal methods/verification isn’t something we talk about often on Hackaday, and if you’re curious for more, we talked about how to verify your C compiler as being trustworthy.
6 thoughts on “Reflecting On A Queueing Prism Leads To Unexpected Results”
I found the statement: “There are a few constraints on the model, such as each task being dependent and taking a different time to complete. This is modeled by the fact that (at) each step a worker has a 50% chance of completing their task.” ambiguous.  What is meant by “dependent” here? It is like a shared resource dependency (or anything that can lead to a resource lock)?  Does “different time” refer to “clock cycles”? It can’t just be “time”, especially when some things can run in parallel within the cores of a multi-core processor (but other things require shared resources like memory or bus access).  What is meant by “50% chance of completing their task”? That seems too arbitrary.
This is where hydrid quantic-digital computers would shine I think. But don´t ask me how, it´s just a gut feeling.
Did anybody else’s mind go to the dark side of the Moon album when they saw the picture? I guess I’m just old. Lol. Interesting article though. 👍
This has been studied a lot, there are a bunch of papers arguing the merits of one or more algorithmic variations.
It has practical applications in the real world, such as when standing in line: at the supermarket, or at the DMV for example.
To minimize the *average* latency, always execute the shortest task first. In the real world, that would entail processing the person at the line in the DMV with the shortest task (“just want to register one car”) while letting the longer tasks wait (“I need to register the fleet at my company”).
Of course, this can lead to a situation where someone with a long task waits forever, and that’s not appropriate in the real world.
Still, the analysis leads to optimization: you can make a separate line for people with simple requests and process those people quickly while having a 2nd line for more complicated ones, and then grab people from the complicated queue singly when you have no one with a simple request. That algorithm works out very well in most situations.
It’s also a problem in the real world when there’s a cost associated with processing tasks. We see this at airport checkin, where it costs the airlines more money to pay for 2 people at two checkin windows, so instead they only have one person and tell you to arrive 2 hours early.
This is happened to me at the grocery store. There is the relatively new self-checkout section of 4-6 scanners, all of which usually staffed by one clerk, in addition to the regular lines with one clerk per line. Sometimes when the self serve area is clear that clerk will pull someone from the regular line. Not sure if they did analysis or it was just an intuitive solution but what you said definitely plays out in the real world. cool.
You mean that this is the reason why DMV has multiple alphabet related queues (L33, M22, N77,, …) running simultaneously when waiting there?
Please be kind and respectful to help make the comments section excellent. (Comment Policy)