Saftlib Latency Measurements
During the startup of the accelerator in February 2021, issues have been observed with the so-called function generator (FG): Occasionally, the Master-FG within saftlib receives the CMD_FG_PREP Event too late (>10 ms). As a consequence, the buffers of the FG lm32 firmware run empty.
Introduction
The relevant frontends configure ECA rules for the following event numbers.
- CMD_SEQ_START
- CMD_BP_START
- CMD_FG_PREP
- CMD_FG_START
- CMD_GAP_START
- CMD_GAP_END
The critical messages is CMD_FG_PREP that is scheduled 8ns after CMD_BP_START (or CMD_SEQ_START). It was first suspected, that CMD_FG_PREP suffered a delay while processing the preceding CMD_SEQ_START/BP_START.
As first measure, the order of actions has been inverted by adding an offset of -16ns to the ECA rule for CMD_FG_PREP. Thus, CMD_FG_PREP is now executed 8ns ahead of CMD_SEQ_START/CMD_BP_START. Although slight improvements were observed, the problem was not cured. This triggered a few measurements that are presented in the following.
Delivery Time from the ECA Software Action Queue to a Callback Routine
A simple
single-threaded test program has been written that uses ECA rules identical to the ones used by the relevant FESA classes. The program has been compiled and run on two different SCUs.
- Enigma SCU
- scuxl0302 reconfigured
- gateware 5.0.4
- RTE timing-rte-tg-enigma-v5.0.4
- ramdisk scuxl.enigma (2019-08-12)
- Fallout SCU
- scuxl0338 with standard configuration
- gateware 5.0.4
- RTE timing-rte-tg-fallout-v6.0.1
- ramdisk scuxl.fallout (2020-11-09)
Measurement
The result of each measurement are three data sets. The first one is simple statistics. The second one provides close look on delayed messages (a 'delay' happens, if the host is not able to accept a due message at the exit of the software action queue). The third data sets shows the time required for delivery; this is the time difference between removing the message from the software action queue to the delivery in the callback function.
- simple statistics
- number of received messages
- number of late, early, conflict, delayed messages
- delay statistics (for delayed' messages only)
- average delay (standard deviation)
- minimum delay
- maximum delay
- delivery statistics from ECA to callback function
- average delivery time (standard deviation)
- minimum delivery time
- maximum delivery time
Data
The following data have been taken on two SCUs simultaneously.
Enigma
[root@scuxl0302 ~]# ./fec-analyzer -g -o 16 tr0 0xd2 1
offset for FG_PREP is -0.016 us
average TS read time via EB is 27.381 us
average system time offset is 36996876.430 us
corrected system time offset is 36996862.740 us
BP_START: n 11076, late 0, early 0, conflict 0, delayed 11076
SEQ_START: n 2215, late 0, early 0, conflict 0, delayed 2215
GAP_START: n 2215, late 0, early 0, conflict 0, delayed 0
GAP_END: n 2215, late 0, early 0, conflict 0, delayed 0
FG_PREP: n 13291, late 0, early 0, conflict 0, delayed 0
FG_START: n 2215, late 0, early 0, conflict 0, delayed 0
BP_START - delay [us]: ave(sdev) 347.211( 26.494) min 214.568 max 594.272
SEQ_START - delay [us]: ave(sdev) 337.278( 19.175) min 222.840 max 524.544
GAP_START - delay [us]: N/A
GAP_END - delay [us]: N/A
FG_PREP - delay [us]: N/A
FG_START - delay [us]: N/A
BP_START - delivery [us]: ave(sdev) 667.753( 113.655) min 375.868 max 1894.260
SEQ_START - delivery [us]: ave(sdev) 654.442( 104.149) min 406.604 max 1397.316
GAP_START - delivery [us]: ave(sdev) 622.295( 67.359) min 429.740 max 893.740
GAP_END - delivery [us]: ave(sdev) 605.052( 67.461) min 456.740 max 1731.740
FG_PREP - delivery [us]: ave(sdev) 633.091( 77.991) min 403.748 max 1957.748
FG_START - delivery [us]: ave(sdev) 594.278( 60.605) min 413.740 max 962.740
Fallout
[root@scuxl0338 ~]# ./fec-analyzer -g -o 16 tr0 0xd2 1
offset for FG_PREP is -0.016 us
average TS read time via EB is 27.749 us
average system time offset is 36999128.038 us
corrected system time offset is 36999114.164 us
BP_START: n 11080, late 0, early 0, conflict 0, delayed 11080
SEQ_START: n 2217, late 0, early 0, conflict 0, delayed 2217
GAP_START: n 2217, late 0, early 0, conflict 0, delayed 0
GAP_END: n 2217, late 0, early 0, conflict 0, delayed 0
FG_PREP: n 13297, late 0, early 0, conflict 0, delayed 0
FG_START: n 2217, late 0, early 0, conflict 0, delayed 0
BP_START - delay [us]: ave(sdev) 339.200( 20.552) min 213.768 max 452.296
SEQ_START - delay [us]: ave(sdev) 336.215( 17.153) min 225.528 max 395.152
GAP_START - delay [us]: N/A
GAP_END - delay [us]: N/A
FG_PREP - delay [us]: N/A
FG_START - delay [us]: N/A
BP_START - delivery [us]: ave(sdev) 579.145( 74.392) min 405.660 max 1215.484
SEQ_START - delivery [us]: ave(sdev) 574.808( 52.814) min 367.316 max 1071.484
GAP_START - delivery [us]: ave(sdev) 495.675( 44.856) min 361.164 max 771.164
GAP_END - delivery [us]: ave(sdev) 490.040( 41.621) min 330.164 max 665.164
FG_PREP - delivery [us]: ave(sdev) 509.402( 45.609) min 358.172 max 1359.172
FG_START - delivery [us]: ave(sdev) 482.863( 31.149) min 355.164 max 663.164
Discussion
The critical action is the one for CMD_FG_PREP. As can be seen, adding an offset of -16ns removes the issue of delay for this action. However, CMD_FG_PREP suffers from an increase of the delivery time by up to 1.4ms (2.0ms) for fallout (enigma). This is an interesting observation. However, it does not explain why the Master-FG executes by more than 10ms too late. In addition, this effect has been present already with the enigma version used in the 2020 beam time, where 'lateness' of the Master-FG was not an issue. Furthermore it should be noted, that fallout provides slightly better results than enigma.
It is interesting that the mean delivery time is in the order of 0.5ms, which is about twice the value reported in a
previous measurement (green curve). There is a simple explanation: The previous measurement is only the Time-of-Flight from the ECA to the callback routine but without the overhead of MSI handling and subsequent communication signaling a new message has been delivered by the ECA software action queue.
Delivery Time from the ECA Software Action Queue to a Callback Routine
Another measurement by the Frontend Team (DD) was done with a combination of saftlib and FESA. Two timing messages with a difference of 40us between deadlines where used with 2 FESA threads. Two long (8h) measurements where performed.
- A: FESA thread1 -> saftlib driver
- B: saftlib driver -> FESA thread1
Data
A avg 344 (200,12 | 250,12469 | 300,210867 | 350,289513 | 400,114166 | 450,22021 | 500,2656 | 550,48 | 600,21 | 650,23 | 700,2 | 750,1 | 900,1)
B avg 202 (150,168103 | 200,320046 | 250,128255 | 300,31154 | 350,3884 | 400,228 | 450,81 | 500,23 | 550,2 | 600,11 | 650,6 | 700,4 | 850,2 | 1000,1)
Data: Each tuple has two numbers: latency in microseconds and number of measurements
Figure: FESA-saftib latency of timing messages.
Discussion
The measured data includes a significant amount of data and show expected latency values. The maximum latency for all measurements is about 1ms in each direction and does not explain, why the Master-FG suffers from occasional 'lateness' exceeding 10ms.
--
DietrichBeck - 18 Feb 2021