DABC tests, July 10
These are the first performance tests done with the DABC framework.
For the moment we are using our 4-node InfiniBand cluster, where both verbs and sockets can be tested.
Chaotic tests with verbs & sockets
This is the simplest possible test: on each node run two modules, a Sender and a Receiver. Each Sender is
connected to all remote Receivers (on a 4-node system there are 3 connections per sender). The Sender sends packets sequentially
to each node, one after another. Therefore, if one node blocks, the complete module is blocked, waiting until the connection becomes active again.
All connections are used in unidirectional mode: data is transported only from sender to receiver.
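For illustration, below is a minimal C++ sketch of this round-robin sending pattern. The Buffer and Connection types and the send() call are hypothetical placeholders, not the real DABC classes.

<verbatim>
// Hypothetical sketch of the round-robin ("chaotic") sender loop described
// above.  Connection, Buffer and send() are placeholder names, not the real
// DABC classes.
#include <cstddef>
#include <vector>

struct Buffer {
    std::size_t size;          // payload omitted in this sketch
};

struct Connection {
    // Stub: a real transport (verbs or socket) would block here until the
    // remote receiver has accepted the buffer.
    void send(const Buffer& /*buf*/) {}
};

// One Sender module per node: cycle over the remote connections (3 on a
// 4-node system) and send one packet to each in turn.  Because send()
// blocks, a single stalled receiver blocks the complete module until that
// connection becomes active again.
void senderLoop(std::vector<Connection>& remotes, std::size_t bufferSize)
{
    Buffer buf{bufferSize};
    for (;;)
        for (auto& conn : remotes)
            conn.send(buf);    // blocks if this receiver cannot accept data
}
</verbatim>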
| Buffer size | Verbs with ackn || Verbs without ackn || Socket with ackn || Socket without ackn ||
| | Bytes/us | CPU,% | Bytes/us | CPU,% | Bytes/us | CPU,% | Bytes/us | CPU,% |
| 2K | 142 | 48 | 140 | 46 | 35 | 17 | 38 | 81 |
| 4K | 260 | 45 | 200 | 45 | 52 | 16 | 49 | 72 |
| 8K | 479 | 48 | 330 | 46 | 58 | 14 | 67 | 56 |
| 16K | 758 | 48 | 780 | 48 | 61 | 17 | 66 | 34 |
| 32K | 852 | 47 | 928 | 47 | 62 | 24 | 73 | 34 |
| 64K | 900 | 37 | 944 | 37 | 66 | 26 | 74 | 30 |
| 128K | 927 | 21 | 952 | 20 | 56 | 17 | 69 | 20 |
CPU utilisation is given for both CPUs. For verbs, it seems that only one CPU is really busy; in the case of sockets, both CPUs are involved.
For sockets there is a slowdown for large packets. It can be explained by the chaotic nature of the transfer and by the fact that blocking mode is used to
receive/send data. Hopefully, better results will be achieved with non-blocking mode.
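For reference, "non-blocking mode" here means standard non-blocking socket I/O. The following generic POSIX sketch (not taken from the DABC socket transport) shows the idea: send() returns immediately instead of stalling the whole module when one peer is slow.

<verbatim>
// Generic POSIX sketch of non-blocking socket I/O, not DABC code.
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

// Switch an open socket descriptor to non-blocking mode.
bool setNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// In non-blocking mode send() returns immediately instead of stalling the
// whole module; the caller has to retry later (e.g. after poll()) and keep
// track of partially sent buffers.
ssize_t trySend(int fd, const void* data, size_t len)
{
    ssize_t n = ::send(fd, data, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;              // peer not ready yet, try again later
    return n;                  // bytes queued, or -1 on a real error
}
</verbatim>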
BNet prototype
This is a test with more components. There are 4 readout modules, all connected to a subevent combiner module. The subevent combiner produces
subevents and delivers them to a sender module. The sender is connected to all receiver modules (including the receiver on the same node). The receiver module
delivers all packets to an event builder, where the event is built; finally the event reaches an event filter, where all events are rejected.
All modules are distributed between two threads.
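As an illustration of the data flow, below is a hypothetical C++ sketch that only prints the connections of the BNet prototype chain; the connect() helper and the port names are placeholders, not the DABC configuration API.

<verbatim>
// Hypothetical sketch of the BNet prototype data flow on one node.
// connect() just prints the wiring; this is not the DABC API.
#include <iostream>
#include <string>

void connect(const std::string& out, const std::string& in)
{
    std::cout << out << " -> " << in << "\n";
}

int main()
{
    const int numNodes = 4;

    // 4 readout modules feed the subevent combiner.
    for (int r = 0; r < 4; ++r)
        connect("Readout" + std::to_string(r) + "/Output",
                "Combiner/Input" + std::to_string(r));

    // Combiner -> Sender; the sender is connected to the receivers on all
    // nodes, including the receiver running on this node.
    connect("Combiner/Output", "Sender/Input");
    for (int n = 0; n < numNodes; ++n)
        connect("Sender/Output" + std::to_string(n),
                "node" + std::to_string(n) + ":Receiver/Input");

    // Receiver -> event builder -> event filter (which rejects all events).
    connect("Receiver/Output", "Builder/Input");
    connect("Builder/Output", "Filter/Input");
    return 0;
}
</verbatim>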
| Buffer size | Verbs with ackn || Verbs without ackn || Socket with ackn || Socket without ackn ||
| | Bytes/us | CPU,% | Bytes/us | CPU,% | Bytes/us | CPU,% | Bytes/us | CPU,% |
| 2K | - | - | 69 | 48 | - | - | 28 | 26 |
| 4K | - | - | 134 | 48 | - | - | 40 | 22 |
| 8K | - | - | 269 | 48 | - | - | 52 | 20 |
| 16K | - | - | 577 | 48 | - | - | 65 | 15 |
| 32K | - | - | 926 | 49 | - | - | 70 | 15 |
| 64K | - | - | 920 | 48 | - | - | 73 | 18 |
| 128K | - | - | 937 | 27 | - | - | 71 | 28 |
The results are a little bit worse compared to the pure chaotic test. Here the back-pressure approach already plays a role: all modules block (stop reading) their inputs
when they cannot deliver data to their outputs. As a result, the transfer rate over all connections is very similar, while with the chaotic tests the deviation between nodes can be
about 10%.
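To illustrate the back-pressure mechanism, below is a small generic C++ sketch (not the DABC implementation): a bounded queue where the producer blocks as soon as the queue is full, so a slow consumer automatically throttles the complete chain.

<verbatim>
// Generic sketch of back-pressure between two modules: push() blocks the
// producer while the queue is full, pop() blocks the consumer while it is
// empty.  A module that cannot deliver data to its output therefore stops
// reading its input.  Not DABC code.
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Producer side: waits until there is room in the queue.
    void push(T item)
    {
        std::unique_lock<std::mutex> lk(m_);
        notFull_.wait(lk, [this] { return q_.size() < capacity_; });
        q_.push(std::move(item));
        notEmpty_.notify_one();
    }

    // Consumer side: waits until there is something to read.
    T pop()
    {
        std::unique_lock<std::mutex> lk(m_);
        notEmpty_.wait(lk, [this] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop();
        notFull_.notify_one();
        return item;
    }

private:
    std::size_t capacity_;
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable notEmpty_, notFull_;
};
</verbatim>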
--
LinevSergey - 10 Jul 2007