|
Hi all,
I have been profiling the performance of QPID version 0.14 over the last few weeks, and I have found a significant performance drop-off when running QPID in a virtual machine.

As an example, if I run qpidd on an 8-core Dell R710 with 36G RAM (RHEL5u5) and then run qpid-perftest (on the same machine, to discount any network problems) without any command line parameters, I see about 85,000 publish transfers/sec and 80,000 consume transfers/sec. If I run the same scenario on a VM (tried both KVM and VMware ESXi 4.3 running RHEL5u5) with 2 cores and 8G RAM, I see only 45,000 publish transfers/sec and 40,000 consume transfers/sec: a significant drop-off in performance. Looking at the CPU and memory usage, these would not seem to be the limiting factors, as the memory consumption of qpidd stays under 200 MBytes and its CPU usage is at about 150%; hence the two-core machine.

I have even run the same test on my MacBook at home using VMware Fusion 4 (2 cores, 4G RAM) and see the same 45,000/40,000 transfers/sec results.

I would expect a small drop-off in performance when running in a VM, but not to the extent that I am seeing.

Has anyone else seen this, and if so, were they able to get to the bottom of the issue?

Any help would be appreciated.

Clive Lilley

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email] |
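To put a number on the drop-off described in the post above, here is a minimal sketch that computes the relative loss from the quoted figures. The helper name `percent_drop` is just an illustration; note also that the comparison is between an 8-core physical host and a 2-core VM, so it is not a like-for-like measurement.

```python
def percent_drop(baseline: float, observed: float) -> float:
    """Return the throughput loss relative to the baseline, in percent."""
    return (baseline - observed) / baseline * 100.0


# Figures quoted in the post above (transfers/sec).
physical = {"publish": 85_000, "consume": 80_000}
virtual = {"publish": 45_000, "consume": 40_000}

drops = {kind: percent_drop(physical[kind], virtual[kind]) for kind in physical}

for kind, drop in drops.items():
    print(f"{kind}: {drop:.1f}% lower on the 2-core VM")
```

On these figures the VM delivers roughly half the throughput of the physical host.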
|
What sort of messaging scenario is it? Are the messages persisted? How big are they? If they are persisted, are you using virtual disks or physical devices?

--
James Kirkland
Principal Enterprise Solutions Architect |
|
James,
qpid-perftest (as supplied with the qpid-0.14 source tarball) runs a direct queue test when executed without any parameters; there is a command line option that enables this to be changed if required. The message size is 1024 bytes (again, the default size when not explicitly set), and 500,000 messages are published by the test (again, the default when not explicitly set).

All messages are transient, so I wouldn't expect any file I/O overhead to interfere with the test, and this is confirmed by the vmstat results I am seeing. The only jump in the vmstat output is the number of context switches, which jumps up into the thousands.

Clive |
|
The qpid broker learns how many CPUs are available and will run more I/O threads when more CPUs are available (#CPUs + 1 threads). It would be interesting to see the results if your VM gets more CPUs.

-Steve |
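As a rough illustration of the "#CPUs + 1" rule mentioned above, the following sketch shows how many I/O threads that rule would give for the machines in this thread. The function name `default_io_threads` is hypothetical and this is not the broker's actual code; it only restates the rule as arithmetic.

```python
import os


def default_io_threads(cpu_count: int) -> int:
    """Thread count under the '#CPUs + 1' rule quoted in the reply above."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return cpu_count + 1


# The 2-core VM and the 8-core physical host discussed in the thread.
for cpus in (2, 8):
    print(f"{cpus} CPUs -> {default_io_threads(cpus)} I/O threads")

# The same rule applied to whatever machine this script runs on.
local_cpus = os.cpu_count() or 1
print(f"this host: {local_cpus} CPUs -> {default_io_threads(local_cpus)} I/O threads")
```

Under this rule the 2-core VM would start with 3 I/O threads against 9 on the 8-core host.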
|
Steve,
I thought about this as well, so I restarted the broker on the physical Dell R710 with the threads option set to just 4 and saw the same throughput values (85,000 publish and 80,000 subscribe). As reducing the thread count didn't seem to have much effect on the physical machine, I thought that this probably wasn't the issue.

As the qpid-perftest application was only creating 1 producer and 1 consumer, I reasoned that perhaps the broker was only using two threads to service the reads and writes from these clients. This would explain why reducing the thread count on the broker had no effect. Would you expect the broker to use more than two threads to service the clients for this scenario?

I will rerun the test tomorrow with an increased number of CPUs in the VM(s), just to double check whether it is a number-of-cores issue.

I did run 'strace -c' on qpidd while the test was running to count the number of system calls, and I noted the big hitters were futex and write. Interestingly, the reads read in 64K chunks, but the writes were only 2048 bytes at a time. As a result, the number of writes occurring was an order of magnitude bigger than the number of reads; I left the detailed results at work, so apologies for not quoting the actual figures.

Clive |
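The read/write imbalance reported from strace above follows directly from the chunk sizes. A minimal sketch, assuming "64K" means 64 KiB, that the broker reads and writes the same number of bytes overall, and using an arbitrary 1 GiB volume chosen only for illustration:

```python
import math

READ_CHUNK = 64 * 1024   # bytes per read call (assumes "64K" means 64 KiB)
WRITE_CHUNK = 2048       # bytes per write call, as reported in the post


def syscalls_needed(total_bytes: int, chunk_size: int) -> int:
    """Number of calls needed to move total_bytes in chunk_size pieces."""
    return math.ceil(total_bytes / chunk_size)


# Hypothetical volume: 1 GiB in and 1 GiB out (not a measured figure).
volume = 1024 ** 3

reads = syscalls_needed(volume, READ_CHUNK)
writes = syscalls_needed(volume, WRITE_CHUNK)
ratio = writes // reads

print(f"reads: {reads}, writes: {writes}, writes per read: {ratio}")
```

With these chunk sizes each read is matched by 32 writes, which is consistent with the "order of magnitude" difference described.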
|
Hi Clive,
The broker will use threads based on load - if the broker takes longer to process a message than qpid-perftest takes to send the next message, the broker would need more threads.

A more pointed test for broker performance would be to run the client on another host - then you know the non-VM vs. VM differences are just the broker's actions. It may be a little confusing weeding out the actual vs. virtual NIC issues, but there would be no confusion about how much the client is taking away from resources available to the broker.

-Steve |
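The load-based reasoning in the reply above can be sketched with a very simple model: if messages arrive faster than one thread can process them, more threads are needed to keep up. This is only a back-of-the-envelope illustration, not how the broker actually schedules work; the function name `threads_needed` and the per-message processing times are hypothetical.

```python
import math


def threads_needed(rate_per_sec: float, service_time_us: float) -> int:
    """Minimum number of threads so that processing keeps pace with arrivals.

    rate_per_sec:    message arrival rate (messages per second)
    service_time_us: time one thread spends on one message (microseconds)
    """
    # Fraction of one thread's time consumed by the offered load.
    busy_fraction = rate_per_sec * service_time_us / 1_000_000
    return max(1, math.ceil(busy_fraction))


# 85,000 msgs/sec is the publish rate quoted earlier in the thread;
# the service times below are made-up values for illustration only.
for service_time in (10, 20, 40):
    needed = threads_needed(85_000, service_time)
    print(f"{service_time} us per message -> {needed} thread(s)")
```

In this model a single thread suffices while processing a message takes less time than the gap between arrivals (about 11.8 microseconds at 85,000 msgs/sec); beyond that, the required thread count grows with the processing time.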
|
Steve,
I managed to run some more performance tests today using a RHEL5u4 VM on a Dell R710. I ran qpid-perftest with default values on the same VM as qpidd; each test was run several times, and the calculated averages (transfers/sec) are shown in the table below.

CPUs  RAM  Publish  Consume
2     4G   48K      46K
4     4G   65K      60K
6     4G   73K      66K
2     8G   46K      44K
4     8G   65K      61K
6     8G   74K      67K

Basically it confirms your assertion about the broker using more threads under heavy load. Changing the VM memory had no discernible effect on performance, but increasing the number of CPUs available to the VM had a big effect on throughput.

So when defining a VM for QPID transient usage, focus on CPU allocation!!!

Thanks for the advice and help.

Clive |
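To see how throughput scales with CPU count in the results above, the sketch below computes the publish speedup relative to the 2-CPU case and a simple scaling efficiency (speedup divided by the increase in CPUs). The figures are copied from the table in the post; the function names are illustrative only.

```python
# Averages from the post, in thousands of transfers/sec: (publish, consume).
results = {
    "4G": {2: (48, 46), 4: (65, 60), 6: (73, 66)},
    "8G": {2: (46, 44), 4: (65, 61), 6: (74, 67)},
}


def publish_speedup(ram: str, cpus: int) -> float:
    """Publish throughput relative to the 2-CPU VM with the same RAM."""
    return results[ram][cpus][0] / results[ram][2][0]


def scaling_efficiency(ram: str, cpus: int) -> float:
    """Speedup divided by the factor by which the CPU count grew."""
    return publish_speedup(ram, cpus) / (cpus / 2)


for ram in results:
    for cpus in (4, 6):
        print(
            f"{ram}, {cpus} CPUs: speedup {publish_speedup(ram, cpus):.2f}x, "
            f"efficiency {scaling_efficiency(ram, cpus):.2f}"
        )
```

Throughput rises with each added pair of CPUs but with diminishing returns: tripling the CPU count from 2 to 6 gives roughly a 1.5x to 1.6x gain in publish rate.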
|
Ok, Clive - thanks very much for the follow-up! Glad you have this situation in hand now.

-Steve |
|
Steve,
Just one other thought. On other multi-threaded applications I have usually found a significant speed-up by moving to a more thread-efficient memory allocator, such as the one provided by Intel's Threading Building Blocks (TBB) or Google's tcmalloc (part of google-perftools).

Is this something that you think might be worth a look, or is QPID doing something clever already?

Clive

On 03/05/2012 22:04, Steve Huston wrote:
> Ok, Clive - thanks very much for the follow-up! Glad you have this
> situation in hand now.
>
> -Steve
>
>> -----Original Message-----
>> From: CLIVE [mailto:[hidden email]]
>> Sent: Thursday, May 03, 2012 4:53 PM
>> To: Steve Huston
>> Cc: [hidden email]; 'James Kirkland'
>> Subject: Re: QPID performance on virtual machines
>>
>> Steve,
>>
>> Managed to run some more performance tests today using a RHEL5u4 VM on
>> a Dell R710. Ran qpid-perftest with default values on the same VM as
>> qpidd; each test was run several times, with the calculated average
>> (transfers/sec) shown in the table below.
>>
>> CPUs  RAM  Publish  Consume
>> 2     4G   48K      46K
>> 4     4G   65K      60K
>> 6     4G   73K      66K
>> 2     8G   46K      44K
>> 4     8G   65K      61K
>> 6     8G   74K      67K
>>
>> Basically it confirms your assertion about the broker using more threads
>> under heavy load. Changing the VM memory had no discernible effect on
>> performance, but increasing the number of CPUs available to the VM had a
>> big effect on throughput.
>>
>> So when defining a VM for QPID transient usage, focus on CPU allocation!!!
>>
>> Thanks for the advice and help.
>>
>> Clive
>>
>> On 03/05/2012 15:27, Steve Huston wrote:
>>> Hi Clive,
>>>
>>> The broker will use threads based on load - if the broker takes longer
>>> to process a message than qpid-perftest takes to send the next
>>> message, the broker would need more threads.
>>>
>>> A more pointed test for broker performance would be to run the client
>>> on another host - then you know the non-VM vs. VM differences are just
>>> the broker's actions. It may be a little confusing weeding out the
>>> actual vs. virtual NIC issues, but there would be no confusion about
>>> how much the client is taking away from resources available to the
>>> broker.
>>>
>>> -Steve
>>>
>>>> -----Original Message-----
>>>> From: CLIVE [mailto:[hidden email]]
>>>> Sent: Wednesday, May 02, 2012 5:28 PM
>>>> To: [hidden email]
>>>> Cc: Steve Huston; 'James Kirkland'
>>>> Subject: Re: QPID performance on virtual machines
>>>>
>>>> Steve,
>>>>
>>>> I thought about this as well, so I restarted the broker on the
>>>> physical Dell R710 with the threads option set to just 4 and saw the
>>>> same throughput values (85000 publish and 80000 subscribe). As
>>>> reducing the thread count didn't seem to have much effect on the
>>>> physical machine, I thought that this probably wasn't the issue.
>>>>
>>>> As the qpid-perftest application was only creating 1 producer and 1
>>>> consumer, I reasoned that perhaps the broker was only using two
>>>> threads to service the reads and writes from these clients. This was
>>>> why reducing the thread count on the broker had no effect. Would you
>>>> expect the broker to use more than two threads to service the clients
>>>> for this scenario?
>>>>
>>>> I will rerun the test tomorrow with an increased number of CPUs in
>>>> the VM(s), just to double check whether it is a number-of-cores issue.
>>>>
>>>> I did run 'strace -c' on qpidd while the test was running to count
>>>> the number of system calls, and I noted the big hitters were futex
>>>> and write. Interestingly, the reads were done in 64K chunks, but the
>>>> writes were only 2048 bytes at a time. As a result the number of
>>>> writes occurring was an order of magnitude bigger than the number of
>>>> reads; I left the detailed results at work, so apologies for not
>>>> quoting the actual figures.
>>>>
>>>> Clive
>>>>
>>>> On 02/05/2012 20:23, Steve Huston wrote:
>>>>> The qpid broker learns how many CPUs are available and will run more
>>>>> I/O threads when more CPUs are available (#CPUs + 1 threads). It
>>>>> would be interesting to see the results if your VM gets more CPUs.
>>>>>
>>>>> -Steve |
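For anyone wanting to reason about the VM results quoted in this thread, here is a minimal sketch that turns the averaged table (2/4/6 CPUs, 4G/8G RAM) into relative speed-ups. The figures are copied from the quoted table; the function names `speedup` and `ram_effect` are made up for illustration and are not part of any QPID tooling.

```python
# Averaged rates in thousands of transfers/sec, copied from the quoted table:
# (vCPUs, RAM in GB) -> (publish, consume)
RESULTS = {
    (2, 4): (48, 46),
    (4, 4): (65, 60),
    (6, 4): (73, 66),
    (2, 8): (46, 44),
    (4, 8): (65, 61),
    (6, 8): (74, 67),
}


def speedup(ram_gb, cpus, baseline_cpus=2):
    """Publish-rate ratio versus the baseline vCPU count at the same RAM size."""
    return RESULTS[(cpus, ram_gb)][0] / RESULTS[(baseline_cpus, ram_gb)][0]


def ram_effect(cpus):
    """Relative change in publish rate going from 4G to 8G at a fixed vCPU count."""
    low = RESULTS[(cpus, 4)][0]
    high = RESULTS[(cpus, 8)][0]
    return (high - low) / low


if __name__ == "__main__":
    for ram in (4, 8):
        for cpus in (2, 4, 6):
            print(f"{ram}G, {cpus} vCPUs: publish speed-up x{speedup(ram, cpus):.2f}")
    for cpus in (2, 4, 6):
        print(f"{cpus} vCPUs: 4G -> 8G changes publish rate by {ram_effect(cpus):+.1%}")
```

On these numbers, tripling the vCPU count from 2 to 6 gives roughly a 1.5x to 1.6x publish rate, so the scaling is clearly sub-linear, while doubling RAM moves the rate by only a few percent in either direction.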
|
I was chatting to Kim about this, this week and I believe we should do something along these lines (custom memory allocator) for quite a few reasons.

Carl. |
|
Carl,
I ran a test today on the Dell R710 physical machine with qpidd using Google's tcmalloc (I exported LD_PRELOAD=/home/clive/libs/libtcmalloc_minimal.so before starting the qpidd process). When qpid-perftest was run with its default values, the publish and consume rates rose from 85000/80000 to 108000/105000 transfers/sec, a significant increase.

Producing QPID's own thread-optimized malloc, or incorporating an existing third-party one into the build, might have some merit. Anyway, I thought you might like to know.

As an aside, I hope to try Intel's TBB next, and will keep you informed on how it performs.

Clive |
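The LD_PRELOAD approach described in this message can be scripted. Below is a minimal sketch, assuming a Linux host; the library path is the one quoted in the message, while `preload_env`, `start_broker`, `percent_gain`, and the assumption that `qpidd` is on the PATH are illustrative and not part of QPID.

```python
import os
import subprocess

# Path quoted in the message; adjust it for your own system.
TCMALLOC = "/home/clive/libs/libtcmalloc_minimal.so"


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path placed first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def start_broker(command=("qpidd",), lib_path=TCMALLOC):
    """Start the broker with the allocator preloaded.

    Assumption: the broker binary is called qpidd and is on the PATH.
    This function is deliberately not called below.
    """
    return subprocess.Popen(list(command), env=preload_env(lib_path))


def percent_gain(before, after):
    """Percentage increase from before to after."""
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    # Rates reported in the message, in transfers/sec.
    print(f"publish gain: {percent_gain(85000, 108000):.1f}%")
    print(f"consume gain: {percent_gain(80000, 105000):.1f}%")
```

Preloading swaps the allocator without rebuilding qpidd, which makes it easy to compare runs; linking the allocator into the build, as suggested in the message, would be a separate step.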
