diff options
| author | Aaron Rossetto <aaron.rossetto@ni.com> | 2022-02-17 12:15:21 -0600 | 
|---|---|---|
| committer | Aaron Rossetto <aaron.rossetto@ni.com> | 2022-03-03 14:09:27 -0600 | 
| commit | 8601d225d52b5d5d3876f633bf5d3a97ff475bda (patch) | |
| tree | d4c47b15075324c21690514fd3a7c02bcf6ddb72 /host/lib/rfnoc/replay_block_control.cpp | |
| parent | 7599789f83bcf9f9e9bdef3432c37e460bddc1e4 (diff) | |
| download | uhd-8601d225d52b5d5d3876f633bf5d3a97ff475bda.tar.gz uhd-8601d225d52b5d5d3876f633bf5d3a97ff475bda.tar.bz2 uhd-8601d225d52b5d5d3876f633bf5d3a97ff475bda.zip | |
rfnoc: Refactor ctrlport_endpoint; fix MT issues
This commit refactors ctrlport_endpoint and fixes several issues related
to multiple threads sending and receiving control transfers.
First, it refactors the change that Martin Braun implemented in
0caed5529 by adding a tracking mechanism for control requests where
clients have explicitly asked to receive an ACK when the corresponding
control response is received.
When a client wants to wait for an ACK associated with a control
request, a combination of that request's opcode, address, and sequence
number is added to a set when the request is sent. When a control
response is received, the set is consulted to see if the corresponding
request is there by matching the packet field data listed above. If so,
the control response is added to the response queue, thus notifying all
threads waiting in `wait_for_ack()` that there is a response that the
thread may be waiting on. If the request is not in the set, the request
is never added to the response queue. This prevents the initial problem
that 0caed5529 was addressing of the response queue growing infinitely
large with control responses that would never be popped from the queue.
Secondly, it addresses issues when multiple threads have sent a request
packet and are waiting in `wait_for_ack()` on the corresponding
response.
Originally, the function contained a loop which would sleep the calling
thread until the control response queue had at least one element in it.
When awakened, the thread would pop the frontmost control response off
the queue to see if it matches the corresponding control request (i.e.,
has the same sequence number, opcode, and address elements). If so, the
response would be handled appropriately, which may include signalling an
error if the response indicates an exceptional status, and the function
would return. If the response is not a matching one, the function would
return to the top of the loop. If the corresponding response is not
found within a specified period, the function would throw an op_timeout
exception.
However, there is a subtle issue with this algorithm when two different
calling threads submit control requests and end up calling
`wait_for_ack()` nearly simultaneously. Consider two threads issuing a
control request. Thread T1 issues a request with sequence number 1 and
thread T2 issues a request with sequence number 2. The two threads then
call `wait_for_ack()`. Let's assume that neither of the control reponses
have arrived yet. Both threads sleep, waiting to be notified of a
response. Now the response for sequence number 1 arrives and is pushed
to the front the response queue. This generates a signal that awakes one
of the waiting threads, but which one is awakened is completely at the
mercy of the scheduler. If T1 is awakened first, it pops the response
from the queue, finds that it matches the request, and handles it as
expected. Later, when the reponse for sequence number 2 is pushed onto
the queue, the still-sleeping T2 will be awakened. It pops the response,
finds it to be matching, and all is well.
But if the scheduler decides to wake T2 first, T2 ends up popping the
response with sequence number 1 off the front of the queue, but it
doesn't match the request that T2 sent with sequence number 2, so T2
goes back to the top of the loop. At this point, it doesn't matter if T2
or T1 is awakened next; because the control response for sequence number
1 was already popped off the queue, T1 never sees the control response
it expects, and will throw uhd::op_timeout back up the stack.
This commit modifies the `wait_for_ack()` algorithm to search the queue
for a matching response rather than indiscriminately popping the
frontmost element from the queue and throwing it away if it doesn't
match. That way, the order in which threads are awakened no longer
matters as they will be able to find the corresponding response
regardless. Furthermore, when a response is pushed onto the response
queue, all waiting threads are notified of the condition via
`notify_all()`, rather than just waking one thread at random
(`notify_one()`). This gives all waiting threads the opportunity to
check the queue for a response.
Finally, the `wait_for_ack()` loop has been modified such that the
thread waits to be signalled regardless of whether the queue has
elements in it or not. (Prior to this change, the thread would only wait
to be signalled if the queue was empty.) This effectively implements the
behavior that all threads are awakened when a new control response is
pushed into the queue, and combined with the changes above, ensures that
all threads get a chance to react and check the queue when the queue is
modified.
Diffstat (limited to 'host/lib/rfnoc/replay_block_control.cpp')
0 files changed, 0 insertions, 0 deletions
