In Smartphones and Tablets, Multicore Is Not Necessarily the Way to Go

The PC solution for the CPU performance wall was to throw more cores at it. That’s not necessarily the best way to improve mobile app performance.

Several years ago, desktop PCs hit a performance wall. Intel thought it could keep raising clock speeds and keep up with the cooling issues, but something happened on the way to 4GHz. That “something” was a heat wall that required a heat sink and fan almost as big as a power supply to keep the CPU cool. More extreme experiments included liquid cooling with antifreeze coolant, and even liquid nitrogen.

The trick, then, was to go multicore, and to speed up PCs by dividing tasks among cores. Windows XP handled multicore technology better than it had any right to, given its age: it saw a dual-core CPU as two CPUs and split processes accordingly. Microsoft would improve on multicore support in Windows 7.

Of Windows Vista, we shall not speak.

As smartphone hardware improved and applications demanded more performance, the situation was much the same. Manufacturers couldn’t make CPUs any faster because of heat dissipation in such a tiny space and the drain on the battery. So rather than put a 2GHz CPU in a phone, makers began shipping dual-core 1.2GHz designs. Then came quad-core CPUs, such as Nvidia’s Tegra 3 and a few in Qualcomm’s Snapdragon line.

However, the scalability that worked well (mostly) on the PC and better in the server market doesn’t work nearly as well in a smartphone, because the development environment is a completely different animal.

This was originally pointed out in a white paper (PDF) by ST-Ericsson, which notes that parallel software never really caught on for the PC, for a variety of reasons, and that those reasons apply even less on the smartphone.

The reason software developers don’t parallelize their code, the paper’s authors wrote, is that “For most of the PC applications it is simply not necessary or not economical to do so, with the exception of some niche markets for which performance is a compelling differentiating feature, such as for some multimedia applications, CAD and some specific professional areas.”

When it comes to smartphones, the white paper notes that current software doesn’t even saturate a single core, so second cores are hardly necessary. Smartphones are driven by Web browsing and HTML5-based applications, and there multicore is fairly worthless, because HTML5 itself is not multicore: a page’s JavaScript runs on a single thread unless the developer explicitly farms work out.

The conclusion, then, is that smartphone makers are putting far more processing power into phones than necessary. HTML5 will not use three of a Tegra 3’s four cores, but those cores will still draw power.

Smartphone developers working with HTML5 agree. “Adding more cores is not necessarily the best way to do it, but having enough core processing power is,” said Jason Baptiste, CEO of Onswipe, a developer of custom content apps for handheld devices. Baptiste said he would rather have tighter integration with the GPU. Cascading Style Sheets 3 (CSS3) uses hardware acceleration in the GPU, and that’s much more important to him than four CPU cores.

“Two cores make sense, four is experimental, eight is loony,” said Cameron Laird, vice president of boutique Web consultancy Phase It and frequent author on HTML5 development (including several articles here at SmartBear). He expects that smartphones will eventually be able to utilize two cores. “Current Web browsers tend to be among the hardest common loads for mobile devices, especially to the extent that they do interesting HTML5-that-goes-beyond-HTML4 things. I expect mobile browsers to improve significantly over the next year. Dual-cores will look like clear winners over single-core for Web pages.”

One thing in ARM’s favor, Laird noted, is that it has done a very good job at multicore management, so it can reduce power usage selectively. “That’s a significant win, and it’s at hand,” he said.

“Based on what I’ve read, I would agree that multicore is most likely wasted on HTML5 apps,” said Chris Minnick, president of Minnick Web Services, a mobile website designer. “I know that the biggest problem with HTML5 app performance is people writing inefficient code. It is possible to write HTML5 code that takes advantage of multiple cores, by using Web Workers.”

“Very few HTML5 apps are written to take advantage of multicore, because very few HTML5 apps are written with any consideration of the hardware they’ll run on at all. The extra layer of abstraction between a Web app and a device makes it easy to forget or neglect to do this type of performance tuning,” Minnick added.
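
To illustrate the Web Worker approach Minnick mentions, here is a minimal sketch; the file name worker.js and the busywork loop are hypothetical stand-ins for real application code:

    // main.js: page scripts run on the browser's single main thread.
    // Spawning a Worker lets the browser schedule the heavy loop on
    // another core, so scrolling and rendering stay responsive.
    var worker = new Worker('worker.js');
    worker.onmessage = function (event) {
      document.title = 'Result: ' + event.data; // display the answer
    };
    worker.postMessage(5000000); // hand the job off to the worker

    // worker.js: runs in parallel with the page. Workers have no DOM
    // access; main thread and worker communicate only through messages.
    self.onmessage = function (event) {
      var sum = 0;
      for (var i = 0; i < event.data; i++) {
        sum += i; // stand-in for real work: parsing, image filters, etc.
      }
      self.postMessage(sum);
    };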

Baptiste is more concerned with the GPU, both its implementation and integration, than with CPU cores. “Having a great GPU that can leverage HTML5 is where it can really matter. The problem is no one is taking advantage of that on a scale that really matters,” he said. “WebGL takes advantage of the graphics processor, but it’s not used on iOS except in iAds.”
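
As a rough illustration of that dependence, a page can feature-detect WebGL; on platforms that withhold it (as iOS did for ordinary pages at the time), the context request simply returns nothing. A minimal sketch:

    // Feature-detect WebGL. A null context means the browser or the
    // platform does not expose the GPU to the page this way.
    var canvas = document.createElement('canvas');
    var gl = canvas.getContext('webgl') ||
             canvas.getContext('experimental-webgl'); // older prefixed name
    if (gl) {
      console.log('WebGL available: ' + gl.getParameter(gl.VERSION));
    } else {
      console.log('No WebGL; fall back to 2D canvas or CSS3 effects.');
    }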

The hardware acceleration for CSS3 is done by the GPU, not the CPU, as are graphics processing and some JavaScript execution. So the device’s GPU matters more to him. Onswipe’s latest publication launch is a mobile version of Kiplinger’s personal finance magazine; it uses the GPU for hardware acceleration, JavaScript libraries to make it feel like a native app, and CSS3 for 3D translations.
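
For example, a CSS3 3D translation of the sort Onswipe describes typically promotes an element onto its own GPU-composited layer. A minimal sketch, assuming a hypothetical page element and the WebKit vendor prefix common on mobile browsers of the time:

    // Slide a (hypothetical) page element using a CSS3 3D transform.
    // translate3d, unlike plain left/top positioning, hints the browser
    // to composite the element on the GPU so the motion stays smooth.
    var page = document.getElementById('page'); // hypothetical element id
    page.style.webkitTransition = '-webkit-transform 0.3s ease-out';
    page.style.webkitTransform = 'translate3d(-320px, 0, 0)';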

“I’d rather have a much faster GPU and more memory than more cores,” Baptiste said.

Minnick felt the same way. “Yes, for sure, [faster cores over more cores]. A faster GPU would be much more preferable but, again, it’s a trade-off between performance and battery life. I’d rather have better battery life and browsers and Web apps that were better optimized for the hardware,” he said.

  • Anonymous

    So the next step is clear – HTML must die!

    Truly, the current crop of web technologies is an accretion of kludges.

    • Anon

      This!
      As a .NET engineer with a weakness for XAML-based technologies, I am literally horrified by the mess that is JavaScript. HTML/JavaScript is quite literally technology that got waaaaay out of hand. Neither was ever meant to be used the way it is used today. They weren’t designed for that. And it shows during development.
      I’ve literally never been on an HTML/JavaScript project that didn’t end up as total spaghetti of cryptic code EVERYWHERE, with hacks every other line – just to make it work properly in all browsers. Or just to make something happen in a browser that isn’t supposed to happen in a browser in the first place.
      Perhaps I’m spoiled by the awesomeness that is .NET combined with XAML.
      Although I have yet to meet the first team that doesn’t curse the Internet during development of a web application.

      • Anon2

        While I’d hesitate to claim that XAML’s any better, HTML/JS really is a horrible mess.

        As for the things that aren’t ‘supposed to happen in a browser’, I suspect ActiveX is at least partially to blame for setting expectations there, because you can make a browser do anything at all if you can root the machine running it.

    • http://gatesvp.blogspot.com Gaëtan Voyer-Perrault

      If HTML dies, what replaces it?

  • http://romanovskis.blogspot.com romanovskis

    So the next step is clear – hardware acceleration for HTML/JS!

    • digi_owl

      Asm.js will deliver just that.

  • digi_owl

    Nope, idle cores do not draw power, because they can be shut down in modern chip designs. Also, with dynamic clocking, a multicore setup will get more done at a lower clock because it can spread the load. And modern smartphones are multitaskers, running multiple social networking apps, a media player, and a game or web browser at the same time.

    • david

      Absolutely. I think the opposite of what the article says is true. The way to go for tablets and phones is indeed smart multicore processors whose cores are only switched on when needed.

      • digi_owl

        And that seems to have been the goal of ARM’s big.LITTLE design, though I fear Samsung has tarnished it with a bad implementation, by having the chip switch between the A15 and A7 clusters as a set rather than individually.

  • jkerby

    You seem to forget one simple thing: most people do PLAY games on their smartphones/tablets. All this CPU and GPU power is for gaming’s sake, pretty much as with the ordinary PC, where the main consumers of top-notch CPUs/GPUs are gamers, not users of productivity apps.

  • https://twitter.com/xarinatan Alexander ypema

    While I agree to some extent, I think it depends on the kind of use. Generally phones are single-tasking oriented, as opposed to PCs, and thus having one or two cores would be plenty to make a phone do the one job it’s supposed to be doing at a time. Some people hardly do anything but gaming with their phone, and sure, that’s where the money is, because those people buy fast smartphones and make a lot of app purchases.

    However, modern smartphones ARE multitasking capable, and more importantly they can run apps in the background, generally ones that poll mail, Twitter, Facebook, or some other API, and for those apps it’s nice to have at least two cores so that the foreground applications don’t suffer from the periodic polling and processing of the background apps.

    It’d be even better if they took that big.LITTLE concept of faster and slower (more power-efficient) cores combined in a single package a little further, where by default you’d only be using the lighter cores, and an app would need explicit permission to use the faster cores for heavy lifting. That way it would also be much easier for OS developers to schedule applications accordingly, since right now with big.LITTLE all apps just run distributed across both the big and the little cores at all times, rather than only at the moments they really need to.

  • Julian G.

    There’s a HUGE pitfall you’re falling into: HTML5 and derived apps run in a layer sooooo high in the sky, while hardware sits in a layer deep down on the earth. So my point is that the in-between layers play a much more important role in the performance of HTML5 applications. IMHO the pitfall is to benchmark (not that there’s any benchmarking in this article) any system at the physical layer (multicore processors, FPGAs, SoC chips, a single-threaded 4-bit processor, or an abacus if you will) using the application layer; to me HTML5 apps are in the 9th layer or so, and that doesn’t make any sense.

    On the other hand, I can’t foresee a practical, non-hellraiser HTML5 implementation on the GPU (in the GPGPU sense?), because GPU architectures don’t handle branching well, and I suspect HTML5 involves plenty of ‘if’s. Nevertheless, the raster graphics portions (like CSS) are suitable for implementation on GPUs.

    However, the article makes sense in saying that apps won’t take advantage of a multicore architecture. It’s true, the apps don’t; OSes do. It is the OS architecture that is in charge of leveraging the power of multicore hardware for applications.

    At the end of the day the message is: HTML5 is not a window into real multicore power, and it is not a proper benchmark. And of course parallelism is the way to go, in terms of power, size and performance… and let’s pray quantum computing becomes available as soon as possible.

  • joshuaesmith

    This article is wrong in so many ways that it is hard to enumerate them. It starts by saying multicore is only useful for niche apps like multimedia consumption, and then it ignores the fact that this is almost entirely what these devices are used for. Multithreading in game engines is universal. Physics, in particular, is trivial to parallelize.

    Even the “HTML5 won’t take advantage” argument is specious. Many parts of the HTML5 pipeline can be parallelized, so everything could simply be faster and smoother without the JS/HTML programmers knowing anything about how many cores there are. (JPEG decoding, for example, or pushing data between the layout engine and the rendering engine. And don’t forget the network stack and SSL decryption, which can easily run on a different core from everything else.)

    Where these devices are currently hurting, though, isn’t CPU or GPU. It’s memory bandwidth. More cores will be great, but not until the pipes in and out are sped up, so the CPU and GPU aren’t just sitting there waiting for each other to get out of the way. (The same is true of mobile desktop chipsets: once you start cranking the GPU, the CPU gets locked out from accessing memory.)

  • passerby

    Let’s get a few things straight:

    Tegra 3 (and Tegra 4 as well) by default uses only the companion core, i.e., it runs as a single core.

    A GPU is just a multicore processor designated for graphics (lower clock, a tweaked instruction set, a LOT more cores). So saying things like “we don’t care about multicore CPUs; we will do things on the GPU” is essentially saying that you need a lot more cores. A modern smartphone can have 72 GPU cores. So now can you see how meaningless this article’s claim is?

    OK, let’s talk about browsing on a smartphone (not that I believe you are even supposed to do that, but still): you either have sites that don’t have a lot on them, in which case you don’t care about performance, or sites full of multimedia content (YouTube, for instance). To process a full-HD video, even on your 480p phone, you still need a lot of performance, and that performance doesn’t have to come from a single core.

    Also, as was noted at the beginning of the article, you cannot increase frequency indefinitely, due to power and heat problems. So if we can’t increase the frequency at which the CPU does things, we need to increase the amount of work the CPU does in one cycle. One way to increase performance is to grow the instruction set, but that would defeat the purpose of a RISC architecture and you would end up with a power-hungry x86 in your phone (as complexity increases, so does power usage). The other way to make the CPU do more per cycle is parallelism, so we end up with either a multicore design (which we already know, and which is relatively simple to implement) or a MIPS processor that is hard to program for.

    So in the end, if we want more performance, there is little choice: we have to go multicore. As for the developers who don’t want to use threads: it beats developing for a MIPS, and you will get more energy-efficient performance if you start threading.
