Many real-life problems are too complex to analyse directly. You can construct a mathematical model to understand them, but solving even the model exactly is often virtually impossible. In such cases, numerical experiments or simulations can be very useful, and SciPy includes a large number of statistical distributions to drive them.
For discrete event simulation, SimPy is a very interesting and useful package. It is not yet integrated with SciPy/NumPy but, hopefully, it may be soon.
In this article, you will simulate queueing delays in a bank, building on the excellent tutorial included with the SimPy documentation (see references below). The extension is that you will use statistical distributions from the SciPy package within the simulation framework of SimPy.
You can explore the Erlang distribution, which seems to be a good option for modelling the arrival and servicing times. It is still widely used in the telecommunications industry. Try the following code:
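A minimal sketch of such an exploration, assuming SciPy's `scipy.stats.erlang` with the scale set to mean/shape so that every curve has mean 2. The helper names `erlang_with_mean` and `plot_erlang_pdfs` are illustrative only, and the plotting step assumes matplotlib is installed.

```python
import numpy as np
from scipy import stats

MEAN = 2.0
SHAPES = (1, 2, 10)

def erlang_with_mean(shape, mean=MEAN):
    """Frozen Erlang distribution whose scale is chosen so shape * scale == mean."""
    return stats.erlang(shape, scale=mean / shape)

def plot_erlang_pdfs(shapes=SHAPES, mean=MEAN):
    """Plot the density for each shape (matplotlib is imported only when plotting)."""
    import matplotlib.pyplot as plt
    x = np.linspace(0.0, 6.0, 200)
    for shape, colour in zip(shapes, ("blue", "green", "red")):
        plt.plot(x, erlang_with_mean(shape, mean).pdf(x),
                 color=colour, label=f"shape={shape}")
    plt.legend()
    plt.show()

# Same mean for every shape, but the variance shrinks as the shape grows.
for shape in SHAPES:
    dist = erlang_with_mean(shape)
    print(shape, dist.mean(), dist.var())
```

Calling `plot_erlang_pdfs()` draws the three curves; the loop at the end prints the mean and variance of each distribution.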
The plot in Figure 1 shows you what the probability density functions look like for the Erlang distribution with the parameters chosen. With a shape parameter of 1, the Erlang distribution reduces to the exponential distribution, which seems to be a good option for the time between customer arrivals. For larger values of the shape parameter, it looks a lot like a normal distribution. The distribution with shape parameter 2, however, looks a lot like what you may have experienced at a counter: no one leaves quickly, and some seem to be at the counter for a long time. All three distributions have the same mean, but their variances differ. The output of this program will be:
Figure 1: Erlang probability density functions with mean 2: shape=1 (blue), shape=2 (green), shape=10 (red).
Your model consists of a source that generates customers randomly. Each customer waits for a counter (resource) to become available. The customer then spends a random amount of time getting their work done and leaves. Include the necessary packages and write the code for generating customers.
The class Source contains one method: generate. It takes as parameters the number of customers to be generated, the counter resources, and the probability distributions for the arrival of customers and the serving time at the counter. You create a customer and activate it, passing a method name (visit in your case) as a parameter. The simulation package will call the visit method, which will need to be defined in the Customer class. Then, you find the random time at which the next customer is expected and hold the process till then.
You may now define the Customer class.
The Customer class contains the method visit, which encompasses the logic for the interaction of the customer with the bank. You track the time at which the customer comes in. The customer waits for a counter to become available, and you compute the wait time. Then, find the random time for servicing this customer and hold the process till it gets over. Release the counter. Finally, compute the time spent by the customer in the bank and process the times.
Your main program starts the simulation by creating the needed resources and activating the source process for generating customers, using the generate method defined in the Source class. You use Erlang distributions with shape parameters 1 and 2 for the arrival and serving times, respectively.
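The two distributions can be written as frozen SciPy objects. In this sketch, the scale values are an assumption chosen to give mean times of 1.5 and 2 minutes, and `utilisation` is an illustrative helper that compares the mean service time with the mean time between arrivals.

```python
from scipy import stats

arrival_dist = stats.erlang(1, scale=1.5)   # shape 1: mean 1.5 minutes between arrivals
service_dist = stats.erlang(2, scale=1.0)   # shape 2: mean 2 minutes per customer

def utilisation(counters):
    """Mean work arriving per minute divided by the number of counters."""
    return service_dist.mean() / (arrival_dist.mean() * counters)

# A value above 1 means work arrives faster than the counters can clear it.
for counters in (1, 2):
    print(counters, utilisation(counters))
```

With these means, one counter is loaded at about 1.33 (the queue grows without bound on average), while two counters are loaded at about 0.67.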
For processing, just print the results and run the simulation with a customer arriving on average every 1.5 minutes and needing an average of 2 minutes to be served. Because customers arrive faster than a single counter can serve them, you would expect the queue to build up pretty quickly, as you can see from the results:
Here is another run and the counter is usually free:
And here is an output with 2 counters:
You will notice that the customers do not necessarily exit in the same order as they entered. Also, even with 2 counters, there are times when a queueing delay is present.
Obviously, you need to experiment with a large number of customers, and printing each result is then not particularly useful or meaningful.
You can collect the data in the process function and then use SciPy's statistical functions. SimPy includes some simple statistical methods, which will not be explored here.
You keep appending times to global Python lists during the simulation. The lists are converted to NumPy arrays and their mean, variance, median and maximum are printed. Of these, median is not a method of a NumPy array and is computed using the standard NumPy function of that name. With two counters, a customer is handled every minute on average; hence, it is also interesting to see how many customers did not have to wait. NumPy arrays have a number of useful methods, e.g. nonzero, which returns the indices of the elements that are not zero.
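The summary step described above can be sketched as follows. The list of wait times is made-up illustrative data, not output from the simulation, and the names `summarise` and `count_no_wait` are hypothetical.

```python
import numpy as np

def summarise(times):
    """Mean, variance, median and maximum of a list of times."""
    arr = np.array(times)
    return {
        "mean": arr.mean(),
        "variance": arr.var(),
        "median": np.median(arr),   # a NumPy function, not an array method
        "maximum": arr.max(),
    }

def count_no_wait(wait_times):
    """Number of customers whose wait time was exactly zero."""
    arr = np.array(wait_times)
    waited = arr.nonzero()[0]       # indices of the non-zero wait times
    return len(arr) - len(waited)

# Illustrative values only; a real run would fill this list during the simulation.
wait_times = [0.0, 0.0, 1.5, 0.0, 2.5, 0.5]
print(summarise(wait_times))
print("Customers who did not wait:", count_no_wait(wait_times))
```

`nonzero` returns a tuple with one index array per dimension, which is why the sketch takes element `[0]` for a one-dimensional array.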
The results for a run with two counters will look like:
With a sample of only a hundred, running the simulation again may give you substantially different results. You can experiment with larger sample sizes and explore the many more functions available in scipy.
With this article, we come to the end of our exploration of SciPy, with the hope that you will find it useful for solving more complex research and educational problems.
SimPy Tutorial : http://simpy.sourceforge.net/SimPyDocs/SimPy_Tutorials.html
Erlang Distribution : http://en.wikipedia.org/wiki/Erlang_distribution
SciPy Statistics : http://docs.scipy.org/doc/scipy/reference/stats.html