The Benchmark King: How Futuremark Keeps Performance Score

When it comes to benchmarking, Oliver Baltuch concedes, rather mildly, that there can be “a lot of religious arguments”. That’s for sure: benchmark scores have long been powerful marketing tools and have thus been prone to accusations of skulduggery and shenanigans from vendors stung by low marks. But, as president of the highly respected benchmarking group Futuremark, Baltuch is a proponent of fair play, and when I met the Canadian in London recently he talked frankly about the issues surrounding these tests and the importance of transparency.

Struggling to be heard amid the noise of train departure announcements at our appointed meeting place of Waterloo Station, I start by asking: what are benchmarks good for? Despite an engineering background at companies like nVidia and National Semiconductor, Baltuch is disarmingly down to earth.

“When you look at what goes into a computer and how the processor is built, the computer user doesn’t care,” he says. “They’re interested in what they can do with it. It’s a tool. Whether it’s AMD, nVidia or ARM shouldn’t matter that much, but what matters is [the ability to fulfil tasks]. What a benchmark is for is to take a machine and place it into human terms, because we’re human, not machines. A benchmark will allow you to compare two machines that are almost identical, to see the differences.”

In the IBM-compatible PC sector that gave rise to an army of so-called clones, the differences may be minor indeed, but for organisations buying in volume, the variation in bang for buck can be highly significant.

“If an organisation is buying 10,000 or 100,000 machines they want to know they get the best value for the money they’re spending,” he says. “A dollar saving on one machine could be worth $100,000 in total.”

Founded in 1997, Futuremark is often touted as the leader in the obscure world (to most, anyway) of benchmarking, and much of its strength lies in its openness. Baltuch says he joined the company eight years ago because it held out the prospect of delivering a “fairer, neutral, more comprehensive” system. Changes to benchmarks, created in Finland where the company is headquartered, are commented on by a consortium and there is a transparent spec change request system. The result, Baltuch says, is a trusted meter that delivers scores within a two per cent variance, with a Futuremark benchmark being run somewhere in the world every four seconds.
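
To make the two per cent figure concrete, here is a minimal Python sketch of how repeatability of that kind could be checked across repeated runs. It is an illustration only: the function name, the sample scores and the reading of “variance” as the largest deviation from the mean are all assumptions, not a description of Futuremark’s own method.

```python
from statistics import mean


def within_variance(scores, tolerance=0.02):
    """Return True if every score lies within `tolerance` of the mean.

    `tolerance` is a fraction of the mean (0.02 = two per cent). This is a
    hypothetical reading of the figure quoted in the article.
    """
    if not scores:
        raise ValueError("need at least one score")
    centre = mean(scores)
    return all(abs(s - centre) <= tolerance * centre for s in scores)


# Made-up scores from four repeated runs on one machine.
runs = [4980, 5010, 5045, 4995]
print(within_variance(runs))
```

A buyer comparing two near-identical machines would want the gap between them to be larger than this run-to-run spread before reading anything into it.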

However fair the benchmark, scrutiny is only to be expected, hence the focus on showing that there’s a level playing field.

“At the end of a benchmark release, almost nobody likes it… so we know we built it fair. If everyone is screaming at us, that’s a good sign. We’re one of the most neutral companies in the world. All our books are open. We’re in Finland [where financial openness rules are stringent]. We have an open membership and they all pay the same.”

Attempts to game the benchmark suites have been exposed, but creating these tests is a challenging business because computer designs change quickly, as does what we use them for. Futuremark’s response has been to introduce new benchmarks: for example, 3DMark, which scores gaming performance, is 75% a GPU benchmark, while the broader PCMark is 40% CPU, with the rest of the suite testing GPU, storage, memory and bandwidth. Benchmarks are refreshed with each significant operating system release.
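
As a rough illustration of what a weighting like “40% CPU” means in practice, the Python sketch below combines component subscores into one overall number. Only the 40% CPU share comes from the article; the even 15% split across the other four components, the subscore values, the function name and the use of a simple weighted average are assumptions made for illustration, and Futuremark’s actual scoring formula may well differ.

```python
import math


def composite_score(subscores, weights):
    """Combine component subscores into one number using fractional weights."""
    if set(subscores) != set(weights):
        raise ValueError("subscores and weights must cover the same components")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    return sum(subscores[name] * weights[name] for name in weights)


# 40% CPU is from the article; the remaining split is hypothetical.
pcmark_like_weights = {
    "cpu": 0.40,
    "gpu": 0.15,
    "storage": 0.15,
    "memory": 0.15,
    "bandwidth": 0.15,
}

# Made-up subscores: a machine with a CPU twice as fast as the rest.
machine = {"cpu": 2000, "gpu": 1000, "storage": 1000, "memory": 1000, "bandwidth": 1000}
print(composite_score(machine, pcmark_like_weights))
```

Under these assumed weights, doubling the CPU subscore lifts the overall score by 40 per cent rather than doubling it, which is why the choice of weights attracts so much argument.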

The result is a score that gives a fair reflection of how a computer will perform, assuming you’re one of the large majority of users. The benchmark wouldn’t work so well for a specialist computer rig used by a video special-effects creator or CAD specialist, say, but then it’s not designed to do so.

Baltuch says Futuremark is always considering adaptations. Versions of 3DMark for iOS and Android were released recently, for example.

The benchmarking job is arguably harder than ever before. Once upon a time, CPU clock speed, measured in megahertz, was seen as a good guide. Now, processors are very different in architecture, a GPU might act as a parallel processor for the CPU, and new elements, such as how a solid-state drive is affected by a storage controller, grow in relevance.

“[In previous times], the processor frequency was a good guide but with GPU, multicore and SSD it’s very difficult for these poor guys [creating the benchmarks].”

And then of course there are the challenges and upset caused by a vendor achieving a low score. That still happens, right?

Baltuch laughs.

“Would you like to see the stripes on my back? There are fracases but we work purely on data. Unless someone shows us a darned good reason, we’re not going to change the benchmark.”

The benchmark, used the world over, is developed by a small company, stuffed to the case with engineers and with a sales team operating from Taiwan. Current headcount is about 35 people.

“All the [rival] benchmarks have something in common: they’re written by a single person or consultants and the process of creating them tends to be pretty opaque,” Baltuch says.

Rival benchmarking group BAPCo is called out as being Intel-backed; while the not-for-profit is supported by many PC vendors, it appears to have lost the faith of some in the semiconductor space.

We need experts who can measure the right things and be open and even-handed. Otherwise, we stand, as individuals and corporations, to waste vast sums: this applies to the private sector but also to government and the public sector, where tendering documents can often skew choices and lock out some vendors. That’s a situation that is seeing more national and regional groups incorporate Futuremark in tendering documents. Benchmarks can be controversial and they never stand still for long, but one thing is clear: they are important. And in any high-stakes game, you need somebody trustworthy to keep score.


Martin Veitch is Editorial Director at IDG Connect

