For example, with asynchronous I/O over TCP/IP (using POSIX poll/select or the more advanced epoll, kqueue, poll_set, IOCP), the network driver is triggered by interrupts on different CPU cores (hardware demultiplexer), receives the messages and dumps them into a single buffer at the kernel level (multiplexer). Then our acceptor thread uses epoll / kqueue / poll_set / IOCP to get from this single buffer a list of socket descriptors for which messages have arrived, and scatters them again (demultiplexer) across the threads of a thread pool running on different CPU cores.
In short, the scheme looks like this: hardware interrupt (hardware demultiplexer) -> network driver in kernel space (multiplexer) -> user's acceptor in user space using epoll / kqueue / poll_set / IOCP (demultiplexer)
Wouldn't it be easier and faster to get rid of the last two links and use only the "hardware demultiplexer"?
An example: when a network packet arrives, the network card interrupts the CPU. On most systems today these interrupts are distributed across cores, i.e. this mechanism acts as a hardware demultiplexer. After receiving such an interrupt, we can process the network message immediately and wait for the next interrupt. All the demultiplexing work is done at the hardware level, by means of CPU interrupts.
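To make the last link of that chain concrete, here is a minimal user-space sketch of the acceptor step: one selector collects the ready sockets, then hands each one to a thread pool. It is only an illustration under assumptions: `socket.socketpair()` stands in for real TCP connections, and the names `handle`, `dispatch_ready` and `acceptor_demo` are made up for this example.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn):
    """Worker: runs on whichever pool thread (and core) the scheduler picks."""
    data = conn.recv(4096)
    return data.upper()

def dispatch_ready(sel, pool, timeout=1.0):
    """One pass of the acceptor: collect ready sockets, scatter them to the pool."""
    futures = []
    for key, _events in sel.select(timeout):
        # Unregister first so the same socket is not dispatched twice.
        sel.unregister(key.fileobj)
        futures.append(pool.submit(handle, key.fileobj))
    return [f.result() for f in futures]

def acceptor_demo():
    # DefaultSelector picks epoll, kqueue, poll or select depending on the OS.
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (tx, rx) in enumerate(pairs):
        sel.register(rx, selectors.EVENT_READ)
        tx.sendall(f"msg{i}".encode())
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = dispatch_ready(sel, pool)
    for tx, rx in pairs:
        tx.close()
        rx.close()
    sel.close()
    return sorted(results)
```

The single `sel.select()` call is the "multiplexer" point in the scheme, and `pool.submit()` is the second, software demultiplexer that the question asks about removing.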
In Cortex-A5 MPCore: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0434b/CCHDBEBE.html
Is this approach feasible on Linux in general and on real-time *nix systems such as QNX, and are there public projects where it is used, maybe nginx?
UPDATE:
Simple answer to my question: yes, I can use hardware demultiplexing via /proc/irq/<N>/smp_affinity: http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux
But a second note: it is not such a good thing, because different parts of one packet can be handled by different cores, and keeping the caches coherent (L1(CoreX)->L3->L1(CoreY)) takes time: http://www.alexonlinux.com/why-interrupt-affinity-with-multiple-cores-is-not-such-a-good-thing
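As a rough sketch of what writing to smp_affinity involves: the file takes a hexadecimal CPU bitmask in which bit N selects CPU N. The helper names below (`cpus_to_affinity_mask`, `set_irq_affinity`) are invented for illustration, and the comma-separated 32-bit grouping for machines with more than 32 CPUs reflects my understanding of the Linux format, so check it on your kernel. Writing the file needs root.

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex bitmask string for a set of CPU indices (bit N = CPU N)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    digits = format(mask, "x")
    # Split into comma-separated 32-bit words, most significant word first.
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))

def set_irq_affinity(irq, cpus):
    """Write the mask to /proc/irq/<irq>/smp_affinity (Linux only, needs root)."""
    path = f"/proc/irq/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(cpus_to_affinity_mask(cpus))
    return path
```

For instance, `cpus_to_affinity_mask([0, 1, 2, 3])` gives `f` (any of the first four cores), while `cpus_to_affinity_mask([1])` gives `2` (core 1 only).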
SOLUTIONS:
- hard-bind different Ethernet adapters (their IRQs) to different single CPU cores
- use large packets and small messages, so that a packet usually contains a whole message
QUESTION: But maybe there are better solutions, for example using soft IRQs (without hardware IRQs), where we receive a batch of network packets from the network adapter manually?
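The first solution could be planned roughly like this: find each adapter's IRQ number in the text of /proc/interrupts, then give every IRQ a mask with exactly one core set. This is a sketch only. The function names are hypothetical, the `eth` prefix is an assumption about how the interfaces are named, and adapters with several queues (names such as eth0-rx-0) would show up as separate entries.

```python
def nic_irqs(interrupts_text, prefix="eth"):
    """Map device name -> IRQ number for lines whose last column starts with prefix."""
    result = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip the CPU header line and anything that is not "<irq>: ...".
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        irq = fields[0][:-1]
        if irq.isdigit() and fields[-1].startswith(prefix):
            result[fields[-1]] = int(irq)
    return result

def plan_binding(irqs, cores):
    """Assign each IRQ to exactly one core, round-robin; returns {irq: hex mask}."""
    plan = {}
    for i, (_name, irq) in enumerate(sorted(irqs.items())):
        core = cores[i % len(cores)]
        plan[irq] = format(1 << core, "x")
    return plan

# Made-up sample in the layout of /proc/interrupts (not real measurements).
SAMPLE_INTERRUPTS = """\
           CPU0       CPU1
 24:       1000          0   PCI-MSI 524288-edge      eth0
 25:          0       2000   PCI-MSI 524289-edge      eth1
"""
```

With the sample above, `plan_binding(nic_irqs(SAMPLE_INTERRUPTS), [0, 1])` yields one mask per IRQ, each of which would then be written to the matching /proc/irq/<N>/smp_affinity file.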
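To illustrate what "receiving a batch manually" means, here is a user-space analogy only: instead of being woken once per packet, the code polls a non-blocking socket and drains everything that is pending, up to a limit. It does not touch soft IRQs or the driver itself. `drain_batch` and `batch_demo` are names made up for this sketch, and the AF_UNIX datagram socket pair is an assumption chosen to keep the demo local (it is not available on every platform).

```python
import socket

def drain_batch(sock, max_packets=64, bufsize=2048):
    """Poll a non-blocking socket and return up to max_packets pending datagrams."""
    batch = []
    while len(batch) < max_packets:
        try:
            batch.append(sock.recv(bufsize))
        except BlockingIOError:
            break  # queue is empty, nothing more to collect in this pass
    return batch

def batch_demo():
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    rx.setblocking(False)
    for i in range(5):
        tx.send(f"pkt{i}".encode())
    first = drain_batch(rx, max_packets=3)   # capped by the batch limit
    rest = drain_batch(rx, max_packets=3)    # stops when the queue runs dry
    tx.close()
    rx.close()
    return first, rest
```

The `max_packets` cap plays the role of a per-pass budget: one pass handles a bounded batch, so a busy socket cannot starve other work.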