Proton-j Reactor - Receiver

Garlapati Sreeram Kumar
Hello All!

I am using the Proton-J reactor API (version 0.12.0) to receive AMQP messages from Microsoft Azure Event Hubs: https://github.com/Azure/azure-event-hubs/blob/master/java/azure-eventhubs/src/main/java/com/microsoft/azure/servicebus/amqp/ReceiveLinkHandler.java#L124

I am using the onDelivery(Event) callback to receive messages. I would really appreciate your help with the following issue/behavior:

ISSUE: I noticed that the last few messages on the queue are not being delivered to the onDelivery(Event) callback by the Reactor.
- I enabled Proton frame tracing (PN_TRACE_FRM=1) and discovered that the Transfer frames corresponding to those messages were not even delivered to the client. I then looked at our service's Proton frame traces and can clearly see that the service is sending them. Other AMQP clients (for example, the .NET client) do see the Transfer frames.
- Is this a known behavior?
Does the Reactor code path disable Nagle on the underlying socket, and could this be related? Or is there any other configuration that we should be setting to see all Transfer frames received on the socket?

Please advise.

Thanks a lot in advance!
Sree


Re: Proton-j Reactor - Receiver

Robbie Gemmell
Administrator
On 31 March 2016 at 04:32, Garlapati Sreeram Kumar <[hidden email]> wrote:

> Does Reactor code path disable Nagle on underlying socket – could this be related? or is there any other Configuration that we should be setting to see all Transfer frames received on the Socket?

I'm not aware of anyone else reporting anything like that. I don't see
anything in the code suggesting the reactor sets TCP_NODELAY to true on
the socket, but I wouldn't think that should matter here.
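
For reference, disabling Nagle's algorithm on a Java NIO socket means setting the TCP_NODELAY option on the channel. The following is a minimal sketch using only the JDK. It is not taken from the proton-j reactor code, and the class and method names (`NoDelayExample`, `enableNoDelay`) are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

public class NoDelayExample {
    // Opens an unconnected channel, disables Nagle's algorithm on it,
    // and returns the value the channel now reports for TCP_NODELAY.
    public static boolean enableNoDelay() {
        try (SocketChannel channel = SocketChannel.open()) {
            channel.setOption(StandardSocketOptions.TCP_NODELAY, true);
            return channel.getOption(StandardSocketOptions.TCP_NODELAY);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY = " + enableNoDelay());
    }
}
```

Note that TCP_NODELAY only affects how small writes are batched on the sending side, which is consistent with the expectation that it should not explain frames going missing on a receiver.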

The frame trace logging is done after the bytes are given to the
Transport and are processed into frames, so a lack of logging could
suggest various things, such as they didn't actually get there, they
weren't processed, something went wrong before they did/were, something
went wrong decoding them, etc. It's hard to say much more without more
info.
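
To illustrate why missing trace output does not pinpoint the failure, here is a small sketch of a parser that records a trace line only once a complete frame has been assembled. It assumes the AMQP 1.0 frame layout (a 4-byte size covering the whole frame, then doff, type and a 2-byte channel). The class `FrameTraceSketch` and its trace format are hypothetical and are not proton-j's FrameParser or the real PN_TRACE_FRM output.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FrameTraceSketch {
    private static final int HEADER_SIZE = 8;
    // Bytes received so far that do not yet form a complete frame.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    // Illustrative trace lines, one per fully parsed frame.
    public final List<String> trace = new ArrayList<>();

    // Accepts raw bytes; tracing happens only after a whole frame is present.
    public void input(byte[] bytes) {
        pending.write(bytes, 0, bytes.length);
        ByteBuffer buf = ByteBuffer.wrap(pending.toByteArray());
        while (buf.remaining() >= HEADER_SIZE) {
            int size = buf.getInt(buf.position());
            if (size < HEADER_SIZE || buf.remaining() < size) {
                break; // incomplete (or malformed) frame: nothing is traced
            }
            int channel = buf.getShort(buf.position() + 6) & 0xFFFF;
            trace.add("frame size=" + size + " channel=" + channel);
            buf.position(buf.position() + size);
        }
        // Keep the unparsed remainder for the next call.
        byte[] rest = new byte[buf.remaining()];
        buf.get(rest);
        pending.reset();
        pending.write(rest, 0, rest.length);
    }

    // Builds a frame with an all-zero body of the given length.
    public static byte[] frame(int channel, int bodyLength) {
        ByteBuffer f = ByteBuffer.allocate(HEADER_SIZE + bodyLength);
        f.putInt(HEADER_SIZE + bodyLength).put((byte) 2).put((byte) 0).putShort((short) channel);
        return f.array();
    }

    public static void main(String[] args) {
        FrameTraceSketch parser = new FrameTraceSketch();
        byte[] f = frame(1, 20);
        parser.input(Arrays.copyOfRange(f, 0, 10));
        System.out.println(parser.trace.size()); // prints 0: bytes arrived, no trace yet
        parser.input(Arrays.copyOfRange(f, 10, f.length));
        System.out.println(parser.trace.size()); // prints 1
    }
}
```

In this sketch, bytes that are held back anywhere before the parser look exactly the same in the trace as bytes that never arrived at all.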

Robbie

RE: Proton-j Reactor - Receiver

Garlapati Sreeram Kumar
Hello Robbie,

We are using the proton-j client with SSL, and many of our customers are hitting this issue.
Here are my findings after debugging it:

-          When incoming bytes arrive on the SocketChannel, the proton-j client gets signaled by NIO and, as a result, unwinds the transport stack: each TransportInput implementation performs its task on the bytes read and hands off to the next layer in the stack (transport to SSL, SSL to FrameParser, etc.).

-          While unwinding that stack, SimpleSslTransportWrapper.unwrapInput reads (16k bytes) from _inputBuffer, and the decoded bytes are written to _decodedInputBuffer, an intermediate buffer.

-          It then flushes bytes from the intermediate buffer to the next layer and invokes _underlyingInput.process() to signal that it has bytes in its input buffer.

-          If the buffer of the _underlyingInput (let's say FrameParser) is small, let's say 4k, then _decodedInputBuffer will be left with 12k bytes, and over time this accrues.

The fix here is to flush _decodedInputBuffer to the next transport in the network stack and call _underlyingInput.process() until _decodedInputBuffer is empty. Here’s the pull request: https://github.com/apache/qpid-proton/pull/73
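
The behaviour described above can be reproduced in isolation with plain java.nio buffers. The following is a minimal sketch under stated assumptions, not the code from the pull request: `Downstream`, `pour`, `flushOnce` and `flushUntilEmpty` are hypothetical names standing in for the next transport layer and the two hand-off strategies, and the 16k/4k sizes are the ones used in the description.

```java
import java.nio.ByteBuffer;

public class DrainLoopSketch {
    // Hypothetical stand-in for the next layer, with a small input buffer.
    static final class Downstream {
        private final ByteBuffer input;
        private long consumed;

        Downstream(int capacity) {
            input = ByteBuffer.allocate(capacity);
        }

        // Copies as much of src as fits into the input buffer.
        int pour(ByteBuffer src) {
            int n = Math.min(src.remaining(), input.remaining());
            ByteBuffer slice = src.duplicate();
            slice.limit(slice.position() + n);
            input.put(slice);
            src.position(src.position() + n);
            return n;
        }

        // "Processes" the buffered bytes and frees the buffer again.
        void process() {
            input.flip();
            consumed += input.remaining();
            input.clear();
        }

        long consumed() {
            return consumed;
        }
    }

    // One hand-off per read event; returns the bytes left behind.
    static int flushOnce(ByteBuffer decoded, Downstream next) {
        next.pour(decoded);
        next.process();
        return decoded.remaining();
    }

    // Repeats the hand-off until the intermediate buffer is empty.
    static int flushUntilEmpty(ByteBuffer decoded, Downstream next) {
        while (decoded.hasRemaining()) {
            int moved = next.pour(decoded);
            next.process();
            if (moved == 0) {
                break; // guard: downstream accepted nothing, avoid spinning
            }
        }
        return decoded.remaining();
    }

    public static void main(String[] args) {
        // 16k of (zeroed) "decoded" bytes waiting in the intermediate buffer.
        ByteBuffer decoded = ByteBuffer.allocate(16 * 1024);
        System.out.println(flushOnce(decoded.duplicate(), new Downstream(4 * 1024)));       // prints 12288
        System.out.println(flushUntilEmpty(decoded.duplicate(), new Downstream(4 * 1024))); // prints 0
    }
}
```

With a single hand-off, 12k of the 16k stays stranded until more network input triggers another pass, which matches the symptom that the last few messages never reach onDelivery.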

Please let me know if we need to do more to fix this issue comprehensively.

Thx!
Sree

From: Robbie Gemmell<mailto:[hidden email]>
Sent: Thursday, March 31, 2016 9:19 AM
To: [hidden email]<mailto:[hidden email]>
Subject: Re: Proton-j Reactor - Receiver


Re: Proton-j Reactor - Receiver

Robbie Gemmell
Administrator
Hi Sree,

Thanks for the analysis and PR, I'll try to take a proper look soon.
It's not an area of the code I'm familiar with so I'll need to have a
bit of a dig myself to see if the change seems ok. I'd note that any
not-insignificant bug fix such as this should probably have a test
with it (and a JIRA, though I see you have since created one of those)
:)

Robbie

On 6 April 2016 at 01:23, Garlapati Sreeram Kumar <[hidden email]> wrote:

> The fix here is to flush decodedInputBuffer to the Next transport in the Network Stack & call _underlyingInput.Process() - until decodedInputBuffer is empty. Here’s the pull request - https://github.com/apache/qpid-proton/pull/73

RE: Proton-j Reactor - Receiver

Garlapati Sreeram Kumar
Thanks a lot for the response, Robbie!
Per your suggestion, I added the CIT to the pull request (and yes, as you already said, this issue is being tracked via JIRA: PROTON-1171).

Thanks a lot for the wonderful collaboration!
Sree

From: Robbie Gemmell<mailto:[hidden email]>
Sent: Thursday, April 7, 2016 3:52 AM
To: [hidden email]<mailto:[hidden email]>
Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>; [hidden email]<mailto:[hidden email]>
Subject: Re: Proton-j Reactor - Receiver


Re: Proton-j Reactor - Receiver

Robbie Gemmell
Administrator
Ah, excellent. I had actually started on testing this myself a little
earlier, so I'll take a look and see what's what before continuing
tomorrow. On taking an initial better look at things I think the
change itself may need to be augmented to account for some other
conditions too; I need to investigate further to be sure.

Robbie

On 11 April 2016 at 17:37, Garlapati Sreeram Kumar <[hidden email]> wrote:

> Thanks a lot for the Response Robbie!
> Per your suggestion, added the CIT to the Pull Request (& yes, as you already said – this issue is being tracked via JIRA - PROTON-1171).

RE: Proton-j Reactor - Receiver

Garlapati Sreeram Kumar
Awesome.

To make it easy, I added you as a collaborator on my fork of Proton, and here’s the branch from which I submitted the PR: https://github.com/SreeramGarlapati/qpid-proton/tree/sg.recvstuck

Thx!
Sree

From: Robbie Gemmell<mailto:[hidden email]>
Sent: Monday, April 11, 2016 9:52 AM
To: [hidden email]<mailto:[hidden email]>
Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>; [hidden email]<mailto:[hidden email]>
Subject: Re: Proton-j Reactor - Receiver


RE: Proton-j Reactor - Receiver

Garlapati Sreeram Kumar

Hello Robbie,

After a couple of hours of stress runs with the proton-j change, we could still reproduce the Receive Stuck problem, although the bug fix places us in a much better state now.

I attached a screenshot of object sizes on the heap, which corresponds to the code path that you fixed (SimpleSslTransportWrapper).

Please see if this rings any bells. I am more than happy to share more details (Proton traces and a dump; please suggest if you will need more details).

Thx!
Sree

 

From: [hidden email]
Sent: Monday, April 11, 2016 11:50 AM
To: [hidden email]; [hidden email]
Cc: [hidden email]; [hidden email]
Subject: RE: Proton-j Reactor - Receiver

 

Awesome.

To make it easy - added you as collaborator to my fork of Proton & here’s the branch from which I submitted the PR: https://github.com/SreeramGarlapati/qpid-proton/tree/sg.recvstuck

Thx!
Sree

From: Robbie Gemmell<[hidden email]>
Sent: Monday, April 11, 2016 9:52 AM
To: [hidden email]<mailto:[hidden email]>
Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>; [hidden email]<[hidden email]>
Subject: Re: Proton-j Reactor - Receiver

Ah, excellent. I had actually started on testing this myself a little
earlier, so I'll take a look and see whats what before continuing
tomorrow. On taking an initial better look at things I think the
change itself may need augmented to account for some other conditions
too, need to investigate further to be sure.

Robbie

On 11 April 2016 at 17:37, Garlapati Sreeram Kumar <[hidden email]> wrote:
> Thanks a lot for the Response Robbie!
> Per your suggestion, added the CIT to the Pull Request (& yes, as you already said – this issue is being tracked via JIRA - PROTON-1171).
>
> Thanks a lot for the Wonderful Collaboration!
> Sree
>
> From: Robbie Gemmell<[hidden email]>
> Sent: Thursday, April 7, 2016 3:52 AM
> To: [hidden email]<mailto:[hidden email]>
> Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>; [hidden email]<[hidden email]>
> Subject: Re: Proton-j Reactor - Receiver
>
> Hi Sree,
>
> Thanks for the analysis and PR, I'll try to take a proper look soon.
> It's not an area of the code I'm familiar with so I'll need to have a
> bit of a dig myself to see if the change seems ok. I'd note that any
> not-insignificant bug fix such as this should probably have a test
> with it (and a JIRA, though I see you have since created one of those)
> :)
>
> Robbie
>
> On 6 April 2016 at 01:23, Garlapati Sreeram Kumar <[hidden email]> wrote:
>> Hello Robbie,
>>
>> We are using proton-j client with SSL and many of our customers are hitting this issue.
>> Here are my findings after debugging through this issue:
>>
>> -          When incoming bytes arrive on the SocketChannel – proton-j client gets signaled by nio & as a result it unwinds the transport stack – as a result all the TransportInput implementations performs its task on the Read Bytes and hands off to the Next Layer in the stack (transport to ssl, ssl to frameparser etc).
>>
>> -          While unwinding that stack, SimpleSSLTransportWrapper.unwrapInput reads(16k bytes) from _inputBuffer and the result - decoded bytes are written to _decodedInputBuffer – as an intermediate buffer.
>>
>> -          It then flushes bytes from intermediate buffer to the next layer & invokes an _underlyingInput.Process() – to signal it that it has bytes in its input buffer.
>>
>> -          If the underlyingInput (lets say FrameParser) buffer size is small – lets say 4k – then decodedInputBuffer will be left with 12k bytes & Over time this accrues.
>>
>> The fix here is to flush decodedInputBuffer to the Next transport in the Network Stack & call _underlyingInput.Process() - until decodedInputBuffer is empty. Here’s the pull request - https://github.com/apache/qpid-proton/pull/73
>>
>> Pl. let me know if we need to do more to fix this issue comprehensively.
>>
>> Thx!
>> Sree
>>
>> From: Robbie Gemmell<[hidden email]>
>> Sent: Thursday, March 31, 2016 9:19 AM
>> To: [hidden email]<mailto:[hidden email]>
>> Subject: Re: Proton-j Reactor - Receiver
>>
>> On 31 March 2016 at 04:32, Garlapati Sreeram Kumar <[hidden email]> wrote:
>>> Hello All!
>>>
>>> I am using Proton-J reactor API (Version 0.12.0) for receiving AMQP Messages (from Microsoft Azure Event Hubs): https://github.com/Azure/azure-event-hubs/blob/master/java/azure-eventhubs/src/main/java/com/microsoft/azure/servicebus/amqp/ReceiveLinkHandler.java#L124
>>>
>>> Am using the onDelivery(Event) callback to receive messages. I really appreciate your help with this issue/behavior:
>>>
>>> ISSUE: I noticed that the last few messages on the Queue are not being issued to onDelivery(Event) callback by the Reactor
>>> - Then, I went ahead and enabled proton Frame tracing (PN_TRACE_FRM=1) and discovered that the Transfer frames corresponding to those messages were not even delivered to Client. Then, I looked at our Service Proton Frames and can clearly see that they are being delivered by the Service. And other AMQP clients (for ex: .net client can see the Transfer frames)
>>> - Is this a known behavior?
>>> Does Reactor code path disable Nagle on underlying socket – could this be related? or is there any other Configuration that we should be setting to see all Transfer frames received on the Socket?
>>>
>>> Please advice.
>>>
>>> Thanks a lot in Advance!
>>> Sree
>>>
>>> Sent from Mail for Windows 10
>>>
>>
>> I'm not aware of anyone else reporting anything like that. I don't see
>> anything in the code suggesting the reactor sets TCP_NODELAY true on
>> the socket, but I wouldn't think that should matter here.
>>
>> The frame trace logging is done after the bytes are given to the
>> Transport and are processed into frames, so a lack of logging could
>> suggest various things such as they didn't actually get there, they
>> weren't processed, something went wrong before they did/were, something
>> went wrong decoding them, etc. It's hard to say much more without more
>> info.
>>
>> Robbie
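
For anyone wanting to rule Nagle in or out on their own socket, TCP_NODELAY can be set and read back through the standard java.nio API. This is only a sketch on a plain, unconnected SocketChannel; the class and method names are made up for illustration, and it does not show what the proton-j reactor itself does with its sockets.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not part of proton-j.
final class NoDelaySketch {
    /** Opens an unconnected channel, enables TCP_NODELAY, and reports the resulting value. */
    static boolean openWithNoDelay() {
        try (SocketChannel channel = SocketChannel.open()) {
            // TCP_NODELAY = true disables Nagle's algorithm on this socket.
            channel.setOption(StandardSocketOptions.TCP_NODELAY, true);
            return channel.getOption(StandardSocketOptions.TCP_NODELAY);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY enabled: " + openWithNoDelay());
    }
}
```

Note that Nagle only affects how the sending side batches small writes, so this is mostly useful for eliminating it as a suspect.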

Re: Proton-j Reactor - Receiver

Robbie Gemmell
Administrator
Hi Sree,

Your attachment didn't make it; the mailing list strips almost all
attachments, particularly of any non-trivial size. Best to attach
stuff to JIRAs or provide links to them elsewhere.

I had another look at it and still don't see a way for it to do what
it did before; I'd guess it may be something slightly different. I
think I'll need more analysis and/or a reproducer from you to make any
further progress.

0.12.2 currently has the votes to pass. If it stays that way I
probably lean towards proceeding with its release unless more concrete
details become obvious, since it does seem to give a good improvement
over the previous behaviour.

Robbie

On 15 April 2016 at 01:35, Garlapati Sreeram Kumar <[hidden email]> wrote:

> Hello Robbie,
>
>
>
> After a couple of hours of Stress Run with the proton-j change – we could
> still repro the Receive Stuck problem. That said, the bug fix below places
> us in a much better state now.
>
> Attached is a screenshot of Object sizes on the Heap – which corresponds to
> the codepath that you fixed (SimpleSslTransportWrapper).
>
> Please see if this rings any bells – I am more than happy to share more
> details (proton traces & dump – please suggest what else you will need).
>
>
>
> Thx!
>
> Sree
>
>
>
> From: Garlapati Sreeram Kumar
> Sent: Monday, April 11, 2016 11:50 AM
> To: Robbie Gemmell; [hidden email]
> Cc: SeongJoon Kwak (SJ); [hidden email]
> Subject: RE: Proton-j Reactor - Receiver
>
>
>
> Awesome.
>
> To make it easy - added you as collaborator to my fork of Proton & here’s
> the branch from which I submitted the PR:
> https://github.com/SreeramGarlapati/qpid-proton/tree/sg.recvstuck
>
> Thx!
> Sree
>
> From: Robbie Gemmell<mailto:[hidden email]>
> Sent: Monday, April 11, 2016 9:52 AM
> To: [hidden email]<mailto:[hidden email]>
> Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>;
> [hidden email]<mailto:[hidden email]>
> Subject: Re: Proton-j Reactor - Receiver
>
> Ah, excellent. I had actually started on testing this myself a little
> earlier, so I'll take a look and see what's what before continuing
> tomorrow. On taking an initial better look at things I think the
> change itself may need to be augmented to account for some other
> conditions too; I need to investigate further to be sure.
>
> Robbie
>
> On 11 April 2016 at 17:37, Garlapati Sreeram Kumar <[hidden email]>
> wrote:
>> Thanks a lot for the Response Robbie!
>> Per your suggestion, added the CIT to the Pull Request (& yes, as you
>> already said – this issue is being tracked via JIRA - PROTON-1171).
>>
>> Thanks a lot for the Wonderful Collaboration!
>> Sree
>>
>> From: Robbie Gemmell<mailto:[hidden email]>
>> Sent: Thursday, April 7, 2016 3:52 AM
>> To: [hidden email]<mailto:[hidden email]>
>> Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>;
>> [hidden email]<mailto:[hidden email]>
>> Subject: Re: Proton-j Reactor - Receiver
>>
>> Hi Sree,
>>
>> Thanks for the analysis and PR, I'll try to take a proper look soon.
>> It's not an area of the code I'm familiar with so I'll need to have a
>> bit of a dig myself to see if the change seems ok. I'd note that any
>> not-insignificant bug fix such as this should probably have a test
>> with it (and a JIRA, though I see you have since created one of those)
>> :)
>>
>> Robbie
>>
>> On 6 April 2016 at 01:23, Garlapati Sreeram Kumar <[hidden email]>
>> wrote:
>>> Hello Robbie,
>>>
>>> We are using the proton-j client with SSL and many of our customers are
>>> hitting this issue.
>>> Here are my findings after debugging through this issue:
>>>
>>> - When incoming bytes arrive on the SocketChannel, the proton-j client
>>> gets signaled by nio and unwinds the transport stack: each
>>> TransportInput implementation performs its task on the bytes read and
>>> hands off to the next layer in the stack (transport to ssl, ssl to
>>> frameparser etc).
>>>
>>> - While unwinding that stack, SimpleSslTransportWrapper.unwrapInput
>>> reads (16k bytes) from _inputBuffer and writes the resulting decoded
>>> bytes to _decodedInputBuffer, an intermediate buffer.
>>>
>>> - It then flushes bytes from the intermediate buffer to the next layer
>>> and invokes _underlyingInput.process() to signal that it has bytes in
>>> its input buffer.
>>>
>>> - If the _underlyingInput (let's say FrameParser) buffer size is small,
>>> let's say 4k, then _decodedInputBuffer will be left with 12k bytes, and
>>> over time this accrues.
>>>
>>> The fix here is to flush _decodedInputBuffer to the next transport in
>>> the network stack and call _underlyingInput.process() until
>>> _decodedInputBuffer is empty. Here’s the pull request -
>>> https://github.com/apache/qpid-proton/pull/73
>>>
>>> Please let me know if we need to do more to fix this issue comprehensively.
>>>
>>> Thx!
>>> Sree
>>>
>>> From: Robbie Gemmell<mailto:[hidden email]>
>>> Sent: Thursday, March 31, 2016 9:19 AM
>>> To: [hidden email]<mailto:[hidden email]>
>>> Subject: Re: Proton-j Reactor - Receiver
>>>
>>> On 31 March 2016 at 04:32, Garlapati Sreeram Kumar <[hidden email]>
>>> wrote:
>>>> Hello All!
>>>>
>>>> I am using Proton-J reactor API (Version 0.12.0) for receiving AMQP
>>>> Messages (from Microsoft Azure Event Hubs):
>>>> https://github.com/Azure/azure-event-hubs/blob/master/java/azure-eventhubs/src/main/java/com/microsoft/azure/servicebus/amqp/ReceiveLinkHandler.java#L124
>>>>
>>>> Am using the onDelivery(Event) callback to receive messages. I really
>>>> appreciate your help with this issue/behavior:
>>>>
>>>> ISSUE: I noticed that the last few messages on the Queue are not being
>>>> issued to the onDelivery(Event) callback by the Reactor
>>>> - Then, I went ahead and enabled proton Frame tracing (PN_TRACE_FRM=1)
>>>> and discovered that the Transfer frames corresponding to those messages were
>>>> not even delivered to the Client. Then, I looked at our Service Proton Frames
>>>> and can clearly see that they are being delivered by the Service. And other
>>>> AMQP clients (for ex: the .net client) can see the Transfer frames.
>>>> - Is this a known behavior?
>>>> Does the Reactor code path disable Nagle on the underlying socket – could this
>>>> be related? Or is there any other Configuration that we should be setting to
>>>> see all Transfer frames received on the Socket?
>>>>
>>>> Please advise.
>>>>
>>>> Thanks a lot in Advance!
>>>> Sree
>>>>
>>>> Sent from Mail for Windows 10
>>>>
>>>
>>> I'm not aware of anyone else reporting anything like that. I don't see
>>> anything in the code suggesting the reactor sets TCP_NODELAY true on
>>> the socket, but I wouldn't think that should matter here.
>>>
>>> The frame trace logging is done after the bytes are given to the
>>> Transport and are processed into frames, so a lack of logging could
>>> suggest various things such as they didn't actually get there, they
>>> weren't processed, something went wrong before they did/were, something
>>> went wrong decoding them, etc. It's hard to say much more without more
>>> info.
>>>
>>> Robbie
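
The buffer behaviour described in the quoted analysis of 6 April (a 16k decoded buffer handed to a 4k downstream buffer) can be modelled roughly as below. This is a toy sketch using only java.nio.ByteBuffer; SmallSink, DrainSketch and their methods are hypothetical names for illustration and are not proton-j classes, nor the actual change in the pull request. It only contrasts a single hand-off with handing off repeatedly until the intermediate buffer is empty.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for the next layer in the stack (e.g. a frame parser). */
final class SmallSink {
    private final ByteBuffer input;
    private long consumed = 0;

    SmallSink(int capacity) { input = ByteBuffer.allocate(capacity); }

    /** Buffer that the upstream layer writes into. */
    ByteBuffer tail() { return input; }

    /** Consume whatever has been written so far and make room for more. */
    void process() {
        input.flip();
        consumed += input.remaining();
        input.clear();
    }

    long consumed() { return consumed; }
}

final class DrainSketch {
    /** Copy as many bytes as fit from src into dst; returns the number copied. */
    static int pour(ByteBuffer src, ByteBuffer dst) {
        int n = Math.min(src.remaining(), dst.remaining());
        ByteBuffer slice = src.duplicate();
        slice.limit(slice.position() + n);
        dst.put(slice);
        src.position(src.position() + n);
        return n;
    }

    /** Single hand-off: leaves bytes behind when the sink is smaller than the source. */
    static void flushOnce(ByteBuffer decoded, SmallSink sink) {
        pour(decoded, sink.tail());
        sink.process();
    }

    /** Repeated hand-off until the intermediate buffer is empty. */
    static void flushUntilEmpty(ByteBuffer decoded, SmallSink sink) {
        while (decoded.hasRemaining()) {
            int moved = pour(decoded, sink.tail());
            sink.process();
            if (moved == 0) break; // sink accepted nothing; avoid spinning forever
        }
    }

    public static void main(String[] args) {
        ByteBuffer once = ByteBuffer.allocate(16 * 1024);
        flushOnce(once, new SmallSink(4 * 1024));
        System.out.println("left after single flush: " + once.remaining());

        ByteBuffer looped = ByteBuffer.allocate(16 * 1024);
        flushUntilEmpty(looped, new SmallSink(4 * 1024));
        System.out.println("left after looped flush: " + looped.remaining());
    }
}
```

In this model the single hand-off strands 12k of the 16k in the intermediate buffer, which matches the numbers in the analysis, while the loop drains it completely.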

RE: Proton-j Reactor - Receiver

Garlapati Sreeram Kumar
In reply to this post by Garlapati Sreeram Kumar
Hello Robbie – That’s correct, please go ahead with the release 0.12.2.
As per your suggestion, I will follow up on this issue with a Simple reproducer.

Thx!
Sree

From: Robbie Gemmell<mailto:[hidden email]>
Sent: Friday, April 15, 2016 7:09 AM
To: [hidden email]<mailto:[hidden email]>
Cc: SeongJoon Kwak (SJ)<mailto:[hidden email]>; [hidden email]<mailto:[hidden email]>
Subject: Re: Proton-j Reactor - Receiver

Hi Sree,

Your attachment didn't make it; the mailing list strips almost all
attachments, particularly of any non-trivial size. Best to attach
stuff to JIRAs or provide links to them elsewhere.

I had another look at it and still don't see a way for it to do what
it did before; I'd guess it may be something slightly different. I
think I'll need more analysis and/or a reproducer from you to make any
further progress.

0.12.2 currently has the votes to pass. If it stays that way I
probably lean towards proceeding with its release unless more concrete
details become obvious, since it does seem to give a good improvement
over the previous behaviour.

Robbie