Queue concept is straight forward - append new item at the tail and dequeue from the head. It does not tell you how to process the requests in the queue.
You can enqueue new requests with multiple threads and processing the requests with 1 worker thread or multiple threads. Then, here comes with the processing designs:
Sequential (synchronous) processing the requests
With only 1 worker thread processing the requests in the queue, this is going to be slower as compared to the multiple worker threads but it has advantages. The advantage of sequential processing is when it comes to file I/O operation or network I/O operation. Just imagine that the exit has one door and only one person is able to pass through at one time.
Parallel (asynchronous) processing the requests
Thanks to multi-core and multi-CPU. With more than 1 worker thread processing the requests, it reduces the client wait time (on the response) and increases the server throughput.
Multi-thread has nothing to do with the "reducing the processing time for a request"! The "processing time" is depending on your codes and the how much the codes can be optimized (in the development time and also run time).
If there is a long run request and the request is handling by a worker thread (in a thread-pool), then, that worker thread is blocked. It won't be able to handle other request until it finished the current request. And if you have many of long running requests, then, all threads in the thread pool might not be able to reduce the response time. The server is now consider as "busy" even though the client is able to keep sending their requests into the queue.
Can we have unlimited threads in the pool..?!
The answer is no. If you have instantiated too many threads without using thread pool, then, the memory will be depleted... and the program will crash.
What happen to the requests in the queue if my program crash?
All requests will be disappear from the memory. Your program may be designed with auto-restart in case it crashed. But, you won't be able to recover the requests which was stored in the memory.
In case you need something better than storing the requests in memory..
1. Each thread in the thread pool must maintain individual processing statistics and these statistics will be updating to the database hourly. With this information, the system administrator will be able to tell how many worker threads were blocked, in which time zone and which request blocks it. Then, he will be able to justify whether to upgrade or replace the server.
The statistics to be keep track by each thread in the thread pool should include:
- Number of requests that has been processed
- Total processed time (ms)
- Highest processed time (ms)
- Request name for highest processed time
- Peak hour - 24 time zone in a day (just get the current hour value will do)
2. Using database as queue storage instead of memory - the main advantage is that the server will be able continue from where it's left before crashing. With this approach, there will be some overhead in storing the requests into database. But, those overhead is justifiable with the crash-proof design.
Another advantage of storing the requests into database is that the secondary server will be kick-in if all the worker threads in the thread pool (in the primary server) are blocked. The secondary server will be notify by the primary server (through WCF or socket). Then, the secondary server will query the database for the pending requests and process it accordingly.
Two ways to handle the respond to the client
- The result will be stored in the database and then notify the primary server. Then, the primary server will query the result and responds to the client.
- OR the result will be send to the primary server (through WCF or socket) and then responds to the client.