QPID performance on virtual machines

QPID performance on virtual machines

Clive Lilley
Hi all,

I have been undertaking some performance profiling of QPID version 0.14
over the last few weeks and I have found a significant performance
drop-off when running QPID in a virtual machine.

As an example, if I run qpidd on an 8 core DELL R710 with 36G RAM
(RHEL5u5) and then run qpid-perftest (on the same machine to discount
any network problems) without any command line parameters, I am seeing
about 85,000 publish transfers/sec and 80,000 consume transfers/sec. If I
run the same scenario on a VM (tried both KVM and VMware ESXi 4.3
running RHEL5u5) with 2 cores and 8G RAM, I am seeing only 45,000 publish
transfers/sec and 40,000 consume transfers/sec: a significant drop-off in
performance. Looking at the CPU and memory usage, these would not seem to
be the limiting factors, as the memory consumption of qpidd stays under
200 MBytes and its CPU usage is at about 150% (out of the 200% available
on the two core machine).
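
For scale, the size of the drop can be worked out from the figures quoted above. The following is only a minimal sketch of that arithmetic; the function and variable names are illustrative and are not part of any QPID tool. Note that the comparison is between an 8 core host and a 2 core VM, so it is not like for like.

```python
def relative_drop(baseline: float, observed: float) -> float:
    """Return the fractional drop from baseline to observed."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - observed) / baseline


# Throughput figures quoted in the post (transfers/sec).
physical = {"publish": 85_000, "consume": 80_000}
virtual = {"publish": 45_000, "consume": 40_000}

drops = {kind: relative_drop(physical[kind], virtual[kind]) for kind in physical}

for kind, drop in drops.items():
    print(f"{kind}: {drop:.0%} lower on the VM")
```

With these numbers, publish throughput is about 47% lower and consume throughput 50% lower on the VM.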

I have even run the same test on my MacBook at home using VMware Fusion
4 (2 cores, 4G RAM) and see the same 45,000/40,000 transfers/sec results.

I would expect a small drop-off in performance when running in a VM, but
not to the extent that I am seeing.

Has anyone else seen this and, if so, were they able to get to the bottom
of the issue?

Any help would be appreciated.

Clive Lilley

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: QPID performance on virtual machines

James Kirkland
What sort of messaging scenario is it?  Are the messages persisted?  How
big are they?  If they are persisted, are you using virtual disks or
physical devices?

--
James Kirkland
Principal Enterprise Solutions Architect
3340 Peachtree Road, NE,
Suite 1200
Atlanta, GA 30326 USA.
Phone (404) 254-6457 <https://www.google.com/voice#phones>
RHCE Certificate: 805009616436562

Re: QPID performance on virtual machines

Clive Lilley
James,

qpid-perftest (as supplied with the qpid-0.14 source tarball) runs a
direct queue test when executed without any parameters; there is a
command line option that enables this to be changed if required. The
message size is 1024 bytes (again the default size when not explicitly
set), and 500,000 messages are published by the test (again the default
when not explicitly set). All messages are transient, so I wouldn't
expect any file I/O overhead to interfere with the test, and this is
confirmed by the vmstat results I am seeing. The only jump in the vmstat
output is the number of context switches occurring, which jumps up into
the thousands.
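
For anyone wanting to track the context-switch figure outside vmstat, here is a rough Python sketch. It assumes a Linux guest, where /proc/stat carries a cumulative `ctxt` counter; the helper names are hypothetical and the result is system-wide, not specific to qpidd.

```python
import time


def parse_ctxt(proc_stat_text: str) -> int:
    """Extract the cumulative context-switch count from /proc/stat content."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")


def context_switch_rate(before: int, after: int, seconds: float) -> float:
    """Context switches per second between two cumulative samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (after - before) / seconds


def sample_rate(interval: float = 1.0, path: str = "/proc/stat") -> float:
    """Sample the system-wide context-switch rate over `interval` seconds (Linux only)."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return context_switch_rate(before, after, interval)
```

Calling `sample_rate()` once while the broker is idle and again while the test is running gives two figures that can be compared directly.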

Clive

On 02/05/2012 18:10, James Kirkland wrote:

> What sort of messaging scenario is it?  Are the messages persisted?
> How big are they?  If they are persisted, are you using virtual disks
> or physical devices?

RE: QPID performance on virtual machines

Steve Huston
The qpid broker learns how many CPUs are available and will run more I/O
threads when more CPUs are available (#CPUs + 1 threads). It would be
interesting to see the results if your VM gets more CPUs.
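
As an illustration of the sizing rule described above (and only that; this is not taken from the qpidd source), a short Python sketch follows. The function name is made up for the example.

```python
import os


def default_io_threads(cpu_count: int) -> int:
    """Apply the '#CPUs + 1' rule as stated in this post."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return cpu_count + 1


# The two machine sizes discussed in the thread.
for cpus in (2, 8):
    print(f"{cpus} CPUs -> {default_io_threads(cpus)} I/O threads")

# The same rule applied to whatever machine this runs on.
local_cpus = os.cpu_count() or 1
print(f"this host: {local_cpus} CPUs -> {default_io_threads(local_cpus)} I/O threads")
```

Under this rule, a 2 core VM would get 3 I/O threads against 9 on the 8 core host.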

-Steve

> -----Original Message-----
> From: CLIVE [mailto:[hidden email]]
> Sent: Wednesday, May 02, 2012 1:30 PM
> To: James Kirkland
> Cc: [hidden email]
> Subject: Re: QPID performance on virtual machines
>
> James,
>
> qpid-perftest (as supplied with the qpid-0.14 source tarball) runs a
> direct queue test when executed without any parameters; there is a
> command line option that enables this to be changed if required. The
> message size is 1024 bytes (again the default size when not explicitly
> set), and 500,000 messages are published by the test (again the default
> when not explicitly set). All messages are transient, so I wouldn't
> expect any file I/O overhead to interfere with the test, and this is
> confirmed by the vmstat results I am seeing. The only jump in the vmstat
> output is the number of context switches occurring, which jumps up into
> the thousands.
>
> Clive

Re: QPID performance on virtual machines

Clive Lilley
Steve,

I thought about this as well, so I restarted the broker on the physical
Dell R710 with the threads option set to just 4 and saw the same
throughput values (85,000 publish and 80,000 subscribe). As reducing the
thread count didn't seem to have much effect on the physical machine, I
thought that this probably wasn't the issue.

As the qpid-perftest application was only creating 1 producer and 1
consumer, I reasoned that perhaps the broker was only using two threads
to service the reads and writes from these clients. This would explain
why reducing the thread count on the broker had no effect. Would you
expect the broker to use more than two threads to service the clients
for this scenario?

I will rerun the test tomorrow based on an increased number of CPUs in
the VM(s) just to double check whether it is a number of cores issue.

I did run 'strace -c' on qpidd while the test was running to count the
number of system calls and I noted the big hitters were futex and write.
Interestingly, the reads read in 64K chunks, but the writes were only
2048 bytes at a time. As a result, the number of writes occurring was an
order of magnitude bigger than the number of reads; I left the detailed
results at work, so apologies for not quoting the actual figures.
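
To put the chunk sizes in perspective, here is a small Python sketch of the call-count arithmetic. The 1 MiB volume is an arbitrary illustrative figure, not a measurement from the test, and the names are hypothetical.

```python
READ_CHUNK = 64 * 1024   # bytes per read, as observed with strace
WRITE_CHUNK = 2048       # bytes per write, as observed with strace


def calls_needed(total_bytes: int, chunk_size: int) -> int:
    """Number of system calls needed to move total_bytes in chunk_size pieces."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return (total_bytes + chunk_size - 1) // chunk_size  # ceiling division


volume = 1024 * 1024  # illustrative: 1 MiB moved in each direction
reads = calls_needed(volume, READ_CHUNK)
writes = calls_needed(volume, WRITE_CHUNK)

print(f"reads: {reads}, writes: {writes}, ratio: {writes / reads:.0f}x")
```

For equal byte volumes, the 2048-byte writes need 32 times as many calls as the 64K reads, which is consistent with the order-of-magnitude difference seen in the strace counts.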

Clive

On 02/05/2012 20:23, Steve Huston wrote:

> The qpid broker learns how many CPUs are available and will run more I/O
> threads when more CPUs are available (#CPUs + 1 threads). It would be
> interesting to see the results if your VM gets more CPUs.
>
> -Steve

RE: QPID performance on virtual machines

Steve Huston
Hi Clive,

The broker will use threads based on load - if the broker takes longer to
process a message than qpid-perftest takes to send the next message, the
broker would need more threads.

A more pointed test for broker performance would be to run the client on
another host - then you know the non-VM vs. VM differences are just the
broker's actions. It may be a little confusing weeding out the actual vs.
virtual NIC issues, but there would be no confusion about how much the
client is taking away from resources available to the broker.

-Steve

> -----Original Message-----
> From: CLIVE [mailto:[hidden email]]
> Sent: Wednesday, May 02, 2012 5:28 PM
> To: [hidden email]
> Cc: Steve Huston; 'James Kirkland'
> Subject: Re: QPID performance on virtual machines
>
> Steve,
>
> I thought about this as well, so I restarted the broker on the physical
> Dell R710 with the threads option set to just 4 and saw the same
> throughput values (85,000 publish and 80,000 subscribe). As reducing the
> thread count didn't seem to have much effect on the physical machine, I
> thought that this probably wasn't the issue.
>
> As the qpid-perftest application was only creating 1 producer and 1
> consumer, I reasoned that perhaps the broker was only using two threads
> to service the reads and writes from these clients. This would explain
> why reducing the thread count on the broker had no effect. Would you
> expect the broker to use more than two threads to service the clients
> for this scenario?
>
> I will rerun the test tomorrow based on an increased number of CPUs in
> the VM(s) just to double check whether it is a number of cores issue.
>
> I did run 'strace -c' on qpidd while the test was running to count the
> number of system calls and I noted the big hitters were futex and write.
> Interestingly, the reads read in 64K chunks, but the writes were only
> 2048 bytes at a time. As a result, the number of writes occurring was an
> order of magnitude bigger than the number of reads; I left the detailed
> results at work, so apologies for not quoting the actual figures.
>
> Clive

Re: QPID performance on virtual machines

Clive Lilley
Steve,

Managed to run some more performance tests today using a RHEL5u4 VM on a
Dell R710. Ran qpid-perftest with default values on the same VM as
qpidd; each test was run several times, with the calculated average
shown in the table below.

CPUs    RAM    Publish    Consume
   2     4G        48K        46K
   4     4G        65K        60K
   6     4G        73K        66K
   2     8G        46K        44K
   4     8G        65K        61K
   6     8G        74K        67K

Basically it confirms your assertion about the broker using more threads
under heavy load. Changing the VM memory had no discernible effect on
performance, but increasing the number of CPUs available to the VM had
a big effect on throughput.
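
The scaling in these results can be summarised as a speed-up relative to the 2 CPU configuration. Below is a minimal Python sketch using only the averages reported in this post; the data layout and function name are illustrative.

```python
# (cpus, ram_gb) -> (publish, consume), in thousands of transfers/sec,
# copied from the averages reported in this post.
results = {
    (2, 4): (48, 46), (4, 4): (65, 60), (6, 4): (73, 66),
    (2, 8): (46, 44), (4, 8): (65, 61), (6, 8): (74, 67),
}


def speedup(ram_gb: int, cpus: int, base_cpus: int = 2, column: int = 0) -> float:
    """Throughput at `cpus` relative to `base_cpus` for the same RAM size.

    column 0 is publish, column 1 is consume.
    """
    return results[(cpus, ram_gb)][column] / results[(base_cpus, ram_gb)][column]


for ram_gb in (4, 8):
    for cpus in (4, 6):
        print(f"{ram_gb}G RAM, {cpus} CPUs: publish x{speedup(ram_gb, cpus):.2f} vs 2 CPUs")
```

On the 4G rows, publish throughput at 6 CPUs comes out at roughly 1.5 times the 2 CPU figure, so the gain is real but well short of linear.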

So when defining a VM for QPID transient usage, focus on CPU allocation!!!

Thanks for the advice and help.

Clive



On 03/05/2012 15:27, Steve Huston wrote:

> Hi Clive,
>
> The broker will use threads based on load - if the broker takes longer to
> process a message than qpid-perftest takes to send the next message, the
> broker would need more threads.
>
> A more pointed test for broker performance would be to run the client on
> another host - then you know the non-VM vs. VM differences are just the
> broker's actions. It may be a little confusing weeding out the actual vs.
> virtual NIC issues, but there would be no confusion about how much the
> client is taking away from resources available to the broker.
>
> -Steve

RE: QPID performance on virtual machines

Steve Huston
Ok, Clive - thanks very much for the follow-up! Glad you have this
situation in hand now.

-Steve

> -----Original Message-----
> From: CLIVE [mailto:[hidden email]]
> Sent: Thursday, May 03, 2012 4:53 PM
> To: Steve Huston
> Cc: [hidden email]; 'James Kirkland'
> Subject: Re: QPID performance on virtual machines
>
> Steve,
>
> Managed to run some more performance tests today using a RHEL5u4 VM on
> a Dell R710. Ran qpid-perftest with default values on the same VM as
> qpidd; each test was run several times, with the calculated average
> shown in the table below.
>
> CPUs    RAM    Publish    Consume
>    2     4G        48K        46K
>    4     4G        65K        60K
>    6     4G        73K        66K
>    2     8G        46K        44K
>    4     8G        65K        61K
>    6     8G        74K        67K
>
> Basically it confirms your assertion about the broker using more threads
> under heavy load. Changing the VM memory had no discernible effect on
> performance, but increasing the number of CPUs available to the VM had
> a big effect on throughput.
>
> So when defining a VM for QPID transient usage, focus on CPU
> allocation!!!
>
> Thanks for the advice and help.
>
> Clive
>
>
>
> On 03/05/2012 15:27, Steve Huston wrote:
> > Hi Clive,
> >
> > The broker will use threads based on load - if the broker takes longer
> > to process a message than qpid-perftest takes to send the next
> > message, the broker would need more threads.
> >
> > A more pointed test for broker performance would be to run the client
> > on another host - then you know the non-VM vs. VM differences are just
> > the broker's actions. It may be a little confusing weeding out the
actual vs.

> > virtual NIC issues, but there would be no confusion about how much the
> > client is taking away from resources available to the broker.
> >
> > -Steve
> >
> >> -----Original Message-----
> >> From: CLIVE [mailto:[hidden email]]
> >> Sent: Wednesday, May 02, 2012 5:28 PM
> >> To: [hidden email]
> >> Cc: Steve Huston; 'James Kirkland'
> >> Subject: Re: QPID performance on virtual machines
> >>
> >> Steve,
> >>
> >> I thought about this as well. So re-started the broker on the
> >> physical
> > Dell
> >> R710 with the threads option set to just 4 and saw the same
> >> throughput values (85000 publish and 80000 subscribe). As reducing
> >> the threads
> > count
> >> didn't seem to have much effect on the physical machine I thought
> >> that
> > this
> >> probably wasn't the issue.
> >>
> >> As the qpid-perftest application was only creating 1 producer and 1
> > consumer
> >> I reasoned that perhaps the broker was only using two threads too
> > service
> >> the read and writes from these clients. This was why reducing the
> >> thread count on the broker had no effect. Would you expect the broker
> >> to use
> > more
> >> than two threads to service the clients for this scenario?
> >>
> >> I will rerun the test tomorrow based on an increased number of CPU's
> >> in
> > the
> >> VM(s) just to double check whether it is a number of cores issue.
> >>
> >> I did run 'strace -c' on qpidd while the test was running to count
> >> the
> > number
> >> of system calls and I noted the big hitters were futex and write.
> >> Interestingly the reads read in 64K chunks, but the writes were only
> >> 2048 bytes at a time. As a result the number writes occurring were an
> > order
> >> of magnitude bigger than the reads; I left the detailed results at
> >> work
> > so
> >> apologies for not quoting the actual figures.
> >>
> >> Clive



Re: QPID performance on virtual machines

Clive Lilley
Steve,

Just one other thought. On other multi-threaded applications I have
usually found a significant speed-up by moving to a more thread-efficient
memory allocator, like the ones provided by Intel's Threading Building
Blocks (TBB) or Google's tcmalloc (part of google-perftools).

Is this something that you think might be worth a look, or is QPID doing
something clever already?

Clive




Re: QPID performance on virtual machines

Carl Trieloff


I was chatting to Kim about this earlier this week, and I believe we should do
something along these lines (custom memory allocator) for quite a few
reasons.

Carl.





Re: QPID performance on virtual machines

Clive Lilley
Carl,

I ran a test today on the Dell R710 physical machine with qpidd using
Google's tcmalloc, preloaded rather than linked in (I exported
LD_PRELOAD=/home/clive/libs/libtcmalloc_minimal.so before starting the
qpidd process).
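
For anyone wanting to repeat this, the preload step could be scripted roughly as follows. This is only a sketch under a few assumptions: `with_preload` and `start_broker` are hypothetical helper names, `qpidd` is assumed to be on the PATH, and the host is assumed to be a Linux system whose dynamic loader honours LD_PRELOAD.

```python
import os
import subprocess


def with_preload(lib_path, base_env=None):
    """Return a copy of the environment with lib_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The loader accepts a colon-separated list, so keep anything already set.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def start_broker(lib_path, command=("qpidd",)):
    """Launch the broker with the allocator preloaded (assumes qpidd on PATH)."""
    return subprocess.Popen(list(command), env=with_preload(lib_path))
```

Calling `start_broker("/home/clive/libs/libtcmalloc_minimal.so")` would then start the broker with tcmalloc in place of the default malloc, without touching the environment of the calling shell.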

When qpid-perftest was executed using its default values, I saw the
publish and consume rates rise from 85000/80000 to 108000/105000
transfers/sec, a significant increase.
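
To put a number on that increase, the relative gain can be worked out from the four rates quoted above. A small sketch (the name `relative_gain` is purely illustrative):

```python
def relative_gain(before, after):
    """Fractional improvement of `after` over `before` (0.25 means +25%)."""
    return (after - before) / before


# Transfers/sec on the physical R710: default allocator vs. tcmalloc preloaded.
rates = {
    "publish": (85000, 108000),
    "consume": (80000, 105000),
}

for name, (before, after) in rates.items():
    print(f"{name}: {relative_gain(before, after):.0%} faster")
```

That works out to roughly 27% on publish and 31% on consume for this one machine and this one default test, so it should be read as an indication rather than a general figure.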

Producing QPID's own thread-optimized malloc or incorporating an
existing third-party version into the build might have some merit.

Anyway, I thought you might like to know.

As an aside, I hope to try Intel's TBB next, so I will keep you informed on
how this performs.

Clive


