War of allocators: The field of Workers

Since two of our tested allocators, TCmalloc and JEmalloc, benefit from a multi-threaded environment, I ran some benchmarks that use threading effectively. I ran two instances of each popular benchmark simultaneously with the help of JavaScript workers. I benchmarked the Linux Qt port of WebKit at the official r54475 revision. All measurements ran in QtLauncher and did only minimal painting. The performance results represent the slower worker's average runtime; the memory results represent the maximum resident set size measured while the two instances of each benchmark suite ran simultaneously.
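The two reported figures can be sketched roughly as follows. This is only an illustration in Python, not the harness I used: the function names, the input format and the sample timings are made up, and the child command is a stand-in for launching the real benchmark. It assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess
import sys
from statistics import mean


def slower_worker_average(runs_by_worker):
    """Return the average runtime (ms) of the slower worker.

    runs_by_worker maps a worker name to its list of measured runtimes
    (hypothetical input format, not the real harness output).
    """
    averages = {name: mean(runs) for name, runs in runs_by_worker.items()}
    # The reported figure is the average of whichever worker was slower.
    return max(averages.values())


def peak_rss_kb(command):
    """Run `command` to completion and return the peak resident set size
    of the largest child waited for so far (kilobytes on Linux)."""
    subprocess.run(command, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Made-up timings for two simultaneously running workers (not measured data).
sample = {
    "worker-1": [550.0, 560.0, 570.0],
    "worker-2": [600.0, 610.0, 620.0],
}

if __name__ == "__main__":
    print(slower_worker_average(sample))  # 610.0
    # Stand-in child process; the real run would launch the browser instead.
    print(peak_rss_kb([sys.executable, "-c", "pass"]))
```

Taking the slower worker rather than the mean of both keeps the number honest: the test is only finished when the last worker is done.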

Performance

 

 

As you can see on the chart, in the case of SunSpider the results don't follow the tendencies that you saw in my older posts. JEmalloc overtakes System malloc, but TCmalloc still finishes in 9.3% less time than JEmalloc. An interesting thing is that System malloc beats both TCmalloc (by 2.9%) and JEmalloc when two V8 benchmarks run simultaneously, but on the other hand we'll see its enormous memory consumption. Our two WindScorpion workers show the tendency that was general in my older measurements: TCmalloc is the fastest and JEmalloc is the slowest.
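Percentages like these depend on which runtime is taken as the baseline. A small Python sketch over the runtimes from the Summary table makes the base explicit; the helper names are mine and purely illustrative.

```python
# Runtimes in ms, copied from the Summary table of this post.
runtime_ms = {
    "SunSpider": {"system": 767, "jemalloc": 616, "tcmalloc": 559},
    "V8": {"system": 5519, "jemalloc": 5741, "tcmalloc": 5678},
    "WindScorpion": {"system": 17055, "jemalloc": 18891, "tcmalloc": 15972},
}


def percent_lower(value, baseline):
    """By how many percent `value` is below `baseline`."""
    return (baseline - value) / baseline * 100


def percent_higher(value, baseline):
    """By how many percent `value` is above `baseline`."""
    return (value - baseline) / baseline * 100


def fastest(benchmark):
    """Name of the allocator with the lowest runtime for a benchmark."""
    results = runtime_ms[benchmark]
    return min(results, key=results.get)


sun = runtime_ms["SunSpider"]
v8 = runtime_ms["V8"]

# TCmalloc's runtime relative to JEmalloc's (JEmalloc is the baseline).
print(round(percent_lower(sun["tcmalloc"], sun["jemalloc"]), 1))   # 9.3
# The same gap with TCmalloc as the baseline.
print(round(percent_higher(sun["jemalloc"], sun["tcmalloc"]), 1))  # 10.2
# TCmalloc's extra runtime relative to System malloc on V8.
print(round(percent_higher(v8["tcmalloc"], v8["system"]), 1))      # 2.9

for name in runtime_ms:
    print(name, fastest(name))
```

So the 9.3% figure is measured against JEmalloc's runtime, while the 2.9% figure is measured against System malloc's.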

Memory consumption

 

 

In the case of SunSpider there is nothing surprising: TCmalloc's memory consumption is the highest and the System allocator's is the lowest. The V8 benchmark shows the most interesting values. A single V8 benchmark running on its own consumes ~140MB of memory, and two V8 workers consume ~350MB with TCmalloc and JEmalloc, but ~450MB with the System allocator. That is about 30% more memory than with JEmalloc, or put the other way around, JEmalloc needs about 23% less than the System allocator (in this case JEmalloc shows the best values)! The WindScorpion results show that the System allocator and JEmalloc produce almost the same values. TCmalloc still consumes the most memory, but only by about 4 to 5%.
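To compare the allocators on equal footing, the memory figures from the Summary table can be expressed as overhead over the leanest allocator in each benchmark. The following Python sketch does that; the function names are hypothetical and only the numbers come from the table.

```python
# Maximum resident set size in kbytes, copied from the Summary table.
peak_rss_kb = {
    "SunSpider": {"system": 47099, "jemalloc": 51626, "tcmalloc": 56348},
    "V8": {"system": 446417, "jemalloc": 344354, "tcmalloc": 356808},
    "WindScorpion": {"system": 153452, "jemalloc": 152592, "tcmalloc": 160087},
}


def overhead_vs_leanest(results):
    """Extra memory of each allocator, in percent, over the lowest value."""
    lowest = min(results.values())
    return {name: (value - lowest) / lowest * 100
            for name, value in results.items()}


def saving_vs(results, allocator, baseline):
    """Memory saved by `allocator`, in percent of `baseline`'s usage."""
    return (results[baseline] - results[allocator]) / results[baseline] * 100


overhead = {bench: overhead_vs_leanest(res)
            for bench, res in peak_rss_kb.items()}

for bench, per_allocator in overhead.items():
    line = ", ".join(f"{name} +{pct:.1f}%"
                     for name, pct in per_allocator.items())
    print(f"{bench}: {line}")

# JEmalloc's saving relative to the System allocator on V8.
print(round(saving_vs(peak_rss_kb["V8"], "jemalloc", "system"), 1))  # 22.9
```

On V8 the System allocator comes out at roughly 29.6% above JEmalloc, which is the same gap as JEmalloc using about 23% less than the System allocator.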

Summary

 

Performance (slower worker's average runtime):

               Sys. malloc      JEmalloc         TCmalloc
SunSpider      767 ms           616 ms           559 ms
V8             5 519 ms         5 741 ms         5 678 ms
WindScorpion   17 055 ms        18 891 ms        15 972 ms

Memory consumption (maximum resident set size):

               Sys. malloc      JEmalloc         TCmalloc
SunSpider      47 099 kbytes    51 626 kbytes    56 348 kbytes
V8             446 417 kbytes   344 354 kbytes   356 808 kbytes
WindScorpion   153 452 kbytes   152 592 kbytes   160 087 kbytes

 

I think I've set up and measured an uncommon situation, because workers are a relatively new technology in JavaScript and only a few JavaScript developers use them for real-life problems, but it was a good stress test for our allocators considering their multi-threading benefits. JEmalloc still doesn't convince me. Moreover, it would be a good idea to investigate why the System allocator consumes that much memory in the V8 workers test.
