Is There Such a Thing as an Optimal Service Level?

Abhishek Soni

People queuing on stairs

Everybody hates waiting in a line up (that’s a queue for the Brits). So what if lines could be managed more effectively, so that both customers and the service provider benefited? Contributor Abhishek Soni explains how, using concepts from queueing theory.

Queueing theory is the quantitative study of the behavior of waiting lines, or queues. A queueing system is composed of one or more servers providing a service to clients, and a set of clients waiting in the queue for that service.

In a typical client and server model of service operations, a queue forms whenever the demand for service exceeds the capacity to provide it at that point in time.

Queues are characterized, for a given server count (c), by the arrival rate (λ) and the service rate (μ) as follows:

  • Traffic density (ρ) of the queue is defined as ρ = λ/(μc)
  • The queue grows without bound for ρ ≥ 1
  • The queue length stays finite for ρ < 1
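
As a quick illustration, the stability condition above can be checked in a few lines of Python. This is only a minimal sketch: the function names traffic_density and is_stable, and the example rates, are assumptions made for illustration rather than part of any library.

```python
def traffic_density(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Return rho = lambda / (mu * c) for c identical servers."""
    if service_rate <= 0 or servers < 1:
        raise ValueError("service_rate must be positive and servers at least 1")
    return arrival_rate / (service_rate * servers)


def is_stable(arrival_rate: float, service_rate: float, servers: int) -> bool:
    """A queue stays finite only when rho is strictly below 1."""
    return traffic_density(arrival_rate, service_rate, servers) < 1.0


# Hypothetical example: 10 arrivals per hour, each server handles 4 per hour.
print(f"{traffic_density(10, 4, 3):.2f}")  # prints 0.83
print(is_stable(10, 4, 3))                 # prints True
print(is_stable(10, 4, 2))                 # prints False (rho = 1.25)
```

With three servers the system keeps up with demand; with two, arrivals outpace capacity and the line keeps growing.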

Little’s law relates the long-run averages of a stable queue through the equation: L = λW

  • L = Average length of the queue
  • λ = Average arrival rate of customers
  • W = Average waiting time in the queue
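
A small worked example may help make Little’s law concrete. The sketch below assumes consistent time units (for example, customers per hour and hours); the function names and the numbers are hypothetical and chosen only for illustration.

```python
def average_queue_length(arrival_rate: float, avg_wait: float) -> float:
    """Little's law: L = lambda * W."""
    return arrival_rate * avg_wait


def average_wait(queue_length: float, arrival_rate: float) -> float:
    """Little's law rearranged: W = L / lambda."""
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    return queue_length / arrival_rate


# Hypothetical example: 12 customers arrive per hour and each waits
# a quarter of an hour on average, so about 3 customers are in line.
print(average_queue_length(12, 0.25))  # prints 3.0
print(average_wait(3, 12))             # prints 0.25
```

The same relation works in either direction: measuring any two of the three quantities gives the third.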

Queueing theory is an excellent tool for quantitatively analyzing the performance of service operations. It helps us develop models that predict the behavior of systems serving randomly arriving requests.

Based on the average arrival rate and the average service rate, queueing models are capable of predicting the following performance metrics of service operations:

Service Metrics


  • Average arrival rate: average number of requests or customers arriving in the system for fulfillment per unit time
  • Average service rate: average rate at which requests are fulfilled or customers are served
  • Average server utilization: average percentage of time a server is busy (the remainder is idle time)
  • Average queue length: average number of customers waiting in the queue for fulfillment of a request
  • Average number of customers in the system: average number of customers waiting in the queue plus average number of customers being served
  • Average waiting time in queue: average time spent by a customer waiting in the queue before being served
  • Average time in the system: average lead time to fulfill a customer request (average time spent waiting in the queue plus average time spent being served)
  • Probability of waiting in queue: probability that an arriving customer will have to wait in the queue
  • Probability that there will be n customers waiting in the queue
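
To show how such metrics can be derived from just λ, μ and c, here is a minimal sketch based on the standard M/M/c model (Erlang C formula). The article does not commit to a particular model, so the choice of M/M/c is an assumption: Poisson arrivals, exponential service times, c identical servers, unlimited waiting room and first-come-first-served discipline. The names QueueMetrics and mmc_metrics are hypothetical.

```python
from dataclasses import dataclass
from math import factorial


@dataclass
class QueueMetrics:
    utilization: float         # rho, average fraction of time a server is busy
    prob_wait: float           # probability an arriving customer must wait
    avg_queue_length: float    # Lq, customers waiting in the queue
    avg_in_system: float       # L, customers waiting plus being served
    avg_wait_in_queue: float   # Wq, time spent waiting before service
    avg_time_in_system: float  # W, waiting time plus service time


def mmc_metrics(arrival_rate: float, service_rate: float, servers: int) -> QueueMetrics:
    """Steady-state metrics of an M/M/c queue (assumed model, see lead-in)."""
    if arrival_rate <= 0 or service_rate <= 0 or servers < 1:
        raise ValueError("rates must be positive and servers at least 1")
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    rho = offered_load / servers                 # rho = lambda / (mu * c)
    if rho >= 1:
        raise ValueError("unstable queue: rho >= 1, no steady state exists")
    # Erlang C: probability that all servers are busy when a customer arrives.
    tail = offered_load ** servers / (factorial(servers) * (1 - rho))
    p0 = 1.0 / (sum(offered_load ** k / factorial(k) for k in range(servers)) + tail)
    prob_wait = tail * p0
    lq = prob_wait * rho / (1 - rho)
    wq = lq / arrival_rate                       # Little's law applied to the queue
    w = wq + 1.0 / service_rate
    return QueueMetrics(rho, prob_wait, lq, arrival_rate * w, wq, w)


# Hypothetical example: one server, 2 arrivals and 3 services per unit time.
m = mmc_metrics(2, 3, 1)
print(f"{m.utilization:.2f} {m.avg_queue_length:.2f} {m.avg_time_in_system:.2f}")
# prints 0.67 1.33 1.00
```

Other arrival or service patterns would call for a different model, so the numbers here should be read as illustrative only.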

Queueing models are further influenced by the following characteristics of the queueing system:

  • Arrival pattern of customers
  • Service pattern of servers
  • Queue discipline
  • System capacity
  • Number of service channels
  • Number of service stages

Economics of Wait Time

In the service industry, customers invariably spend a significant amount of time waiting for a request to be fulfilled or a product to be delivered. Customers are therefore interested in more than just the price of a service or product: they evaluate it on its economic cost, which is the price of the product or service plus the opportunity cost of the time spent waiting in a queue to receive it. Hence, it becomes imperative for service providers to deliver services at optimal levels, thereby reducing the opportunity cost of time for customers.

A strategy of low service level may be inexpensive in the short run, but it can incur high customer dissatisfaction costs, such as customer churn or loss of future cash flows. Conversely, a high service level results in lower dissatisfaction costs but requires additional investment in servers.

The service provider thus faces a trade-off between the cost of delivering a particular service level and the cost of customer dissatisfaction.

Queueing Theory and Optimal Service Level

In this setting, the goal of applying queueing theory is to minimize the total cost of the system by determining the optimal service level. Service level is a function of the number of servers deployed in the system. To determine the optimal configuration of servers, two opposing costs are evaluated: Service Delivery Cost, SC (the cost of delivering services at a particular service level), and Waiting Cost, WC (the cost of customer dissatisfaction arising from wait time in the queue).

The Total Cost (TC) of the system is the sum of the Service Delivery Cost (SC) and the Waiting Cost (WC).

i.e. Total Cost (TC) = Service Delivery Cost (SC) + Waiting Cost (WC)

While Service Delivery Cost (SC) is a function of the service level (or number of servers deployed), Waiting Cost (WC) is a function of the total time customers spend waiting in the queue at that service level.

An optimal service level is the service level (delivered by a certain number of servers) for which the total cost of the system is at its minimum. Queueing theory enables us to determine the optimal service level by simulating the Total Cost (TC) of the system for different combinations of servers and wait time and revealing the configuration with the minimum Total Cost (TC).
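
The search for the minimum-cost configuration can be sketched as follows. The cost model is an assumption made for illustration: SC is taken as a fixed cost per server per unit time, and WC as a cost per customer per unit time spent waiting, multiplied by the average number of customers in the queue (computed here with the M/M/c formula, which is itself an assumed model). All function and parameter names are hypothetical.

```python
from math import factorial


def mean_queue_length(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Average number waiting (Lq) in an M/M/c queue; infinite if rho >= 1."""
    offered_load = arrival_rate / service_rate
    rho = offered_load / servers
    if rho >= 1:
        return float("inf")
    tail = offered_load ** servers / (factorial(servers) * (1 - rho))
    p0 = 1.0 / (sum(offered_load ** k / factorial(k) for k in range(servers)) + tail)
    return tail * p0 * rho / (1 - rho)


def total_cost(arrival_rate: float, service_rate: float, servers: int,
               server_cost: float, waiting_cost: float) -> float:
    """TC = SC + WC per unit time, under the assumed linear cost model."""
    lq = mean_queue_length(arrival_rate, service_rate, servers)
    if lq == float("inf"):
        return float("inf")              # an unstable queue is never optimal
    service_delivery_cost = server_cost * servers
    waiting_cost_total = waiting_cost * lq
    return service_delivery_cost + waiting_cost_total


def optimal_servers(arrival_rate: float, service_rate: float,
                    server_cost: float, waiting_cost: float,
                    max_servers: int = 50) -> int:
    """Try 1..max_servers and return the server count with the lowest TC."""
    return min(range(1, max_servers + 1),
               key=lambda c: total_cost(arrival_rate, service_rate, c,
                                        server_cost, waiting_cost))


# Hypothetical example: lambda = 2, mu = 3, a server costs 10 per unit time,
# and each waiting customer costs 30 per unit time.
for c in (1, 2, 3):
    print(c, round(total_cost(2, 3, c, 10, 30), 2))
print(optimal_servers(2, 3, 10, 30))  # prints 2
```

In this example one server is cheap to run but expensive in waiting (TC of 50), while a second server cuts the total to 22.5; adding a third costs more than the waiting it saves.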

Queueing models are a handy tool for understanding the impact of waiting lines on service providers and clients, and for managing queues optimally in a service environment.