
How The Machine from HPE will find its way into your data centre

HPE prides itself on a track record of research and development that has delivered many cutting-edge technologies, but now the firm is trying to overhaul the underlying architecture behind today’s computer systems in order to prepare for the demands that organisations will face with future applications.

First demonstrated last year, The Machine is a research project that may never make it into production in its current form, but was intended to showcase a bundle of new technologies, some of which are likely to find their way into future HPE systems and thus into corporate data centres.


Memory-driven computing

Chief among these is a new approach that HPE calls memory-driven computing, intended to address the growing volumes of data that enterprise systems are increasingly required to process. However, this also calls for new high-speed interconnects to make the architecture work, and HPE also envisioned new non-volatile or persistent memory technology to replace DRAM, something which has yet to materialise.

“The whole idea with what we do in Hewlett-Packard Labs and advanced development is, we want to look out several years in advance, and try and disrupt our own operations and roadmaps before someone else does,” says Andrew Wheeler, HPE vice president and deputy director for Hewlett Packard Labs.

“Several years ago, we saw some of the trends that showed the rate at which data is growing, especially for enterprise mission-critical high performance computing. Not just from human generated stuff but also machine generated, and if we looked at our current systems and architectures, we saw a bottleneck that was coming in terms of our ability to ingest all of that data, process it, store it, and maybe even more importantly, secure it,” he adds.

This fed into many of the research programmes that Hewlett Packard Labs was working on at the time, until the decision was made to try and bring it all together into a real, physical system that could validate the approach instead of just existing as a laboratory simulation. The result was the first prototype, demonstrated at the HPE Discover conference in London last year.

Since then, HPE has worked on scaling up the prototype, and in May announced it had progressed to a configuration 20 times the capacity of the original, from a system comprising just a couple of nodes to one with 40 nodes, containing 160TB of shared memory. It claimed this as the world’s largest single-memory computer.


A single large memory space

Having a single large memory space is one of the key aspects of The Machine. The notion is that the memory should be large enough so that the entire data set associated with a specific workload can be processed directly from memory.

This contrasts with the way most data sets are processed today. A database, for example, may contain a large data set, but this typically resides on disk and only a small portion of the entire data is in memory at any given time. Likewise, high-performance computing (HPC) systems designed for applications such as analytics processing of large data sets typically comprise a cluster of individual nodes, each holding and processing a separate piece of the data in its memory.

While The Machine is also made up of a cluster of nodes, these are connected in such a way that their combined memory capacity functions as if it were a single large memory space.

The key to this is a high-speed memory fabric that interconnects the nodes using silicon photonics. Each individual node of The Machine contains a processor directly controlling its own small pool of local working memory, plus a larger chunk of global memory. Both the processor module and the global memory are independently connected to the memory fabric, which links them with the other nodes.


A memory fabric

The architecture of The Machine can thus be visualised as a bunch of processors and a bunch of memory blocks, all meshed together by the memory fabric so that any of the processors can access any part of the global memory pool. This should mean that HPE’s architecture can scale to meet requirements simply by connecting more processors and memory to the memory fabric.
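That mesh of processors and memory blocks can be illustrated with a toy model: any node can read or write any region of one shared global pool, without copying data between nodes or touching a disk. This is a sketch only; the class and method names (FabricPool, Node, read, write) are invented for illustration and are not HPE APIs.

```python
class FabricPool:
    """A single flat, byte-addressable pool shared by all nodes."""
    def __init__(self, size):
        self.mem = bytearray(size)

    def read(self, offset, length):
        return bytes(self.mem[offset:offset + length])

    def write(self, offset, data):
        self.mem[offset:offset + len(data)] = data


class Node:
    """A compute node: a small private working memory plus a handle on the pool."""
    def __init__(self, name, pool):
        self.name = name
        self.local = bytearray(64)   # small local working memory
        self.pool = pool             # fabric-attached global memory


# Two nodes share one 1 KiB pool; data written by one node is immediately
# visible to the other, with no copy, message, or disk I/O in between.
pool = FabricPool(1024)
a, b = Node("a", pool), Node("b", pool)
a.pool.write(0, b"sensor-batch-0001")
assert b.pool.read(0, 17) == b"sensor-batch-0001"
```

Scaling in this model means nothing more than attaching further Node and memory objects to the same pool, which is the flexibility the fabric is meant to provide.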

HPE is not alone in following this development path. Intel is aiming towards a similar end goal with its Rack Scale Design (RSD) initiative, as is the EU’s Horizon 2020 research programme with its dRedBox project. All have the goal of uncoupling memory from the processors in a system, in order to enable greater flexibility and scalability.

Such an architecture also calls for a rethink of the way software operates, and for The Machine HPE has developed a custom version of Linux, optimised to support operations using fabric-attached memory.
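HPE has not published the details of those Linux changes here, but a rough stand-in on stock Linux is memory-mapping a region so software reaches it with ordinary loads and stores rather than read/write system calls; persistent-memory devices are exposed to applications in a broadly similar way. In this sketch an ordinary temporary file plays the role of the fabric-attached region, purely as an assumption for illustration.

```python
import mmap
import os
import tempfile

# Hypothetical stand-in for a fabric- or DAX-attached memory region:
# an ordinary 4 KiB file that the process maps into its address space.
path = os.path.join(tempfile.mkdtemp(), "fam0")
with open(path, "wb") as f:
    f.truncate(4096)                    # a 4 KiB "global memory" region

with open(path, "r+b") as f:
    fam = mmap.mmap(f.fileno(), 4096)   # map it: plain loads/stores from here on
    fam[0:5] = b"hello"                 # a store into the mapping, not a write() call
    assert fam[0:5] == b"hello"
    fam.flush()                         # push the change back to durable storage
    fam.close()
```

The point of the mapping is that application code addresses the region exactly as it would DRAM, which is the programming model fabric-attached memory aims for.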

In HPE’s current prototype, the processors are Cavium’s ThunderX2, an ARM-based system-on-a-chip with 32 cores, but they could just as easily be Intel or AMD chips. They could also be something more exotic such as GPUs or specialised accelerators implemented using FPGA (field programmable gate array) chips, or even a mixture of these, according to Wheeler.

“The great benefit is that all of that processing you can attach directly to this persistent memory pool, and they can essentially work on the same data set, but using the functions they are tailored for from both performance and energy efficiency standpoints. So we really see this as a heterogeneous environment, and this memory fabric that hosts all of the memory, that itself is now implemented on top of an open industry standard,” he says.

The open standard that Wheeler refers to is Gen-Z, which is being developed by a consortium of companies including HPE, ARM, Dell EMC, IBM, Lenovo and others. This is a high bandwidth, low latency interface that supports memory-style read and write operations, and is intended to enable large distributed memory pools, such as in The Machine.


Silicon photonics

In HPE’s current prototype, the interfaces are not strictly compliant with the Gen-Z specifications, simply because the standard is still evolving. Connections within each node use copper tracks on the system board as usual, while connections between nodes use HPE’s X1 silicon photonics module to transmit over optical fibre, offering an impressive 1.2 terabits per second of bandwidth, split into 600Gbps in each direction.

“We’ve taken a lot of our learnings and essentially contributed those to the consortium, so over time, if we were to build out something we would productise, it would be based on the Gen-Z specifications,” Wheeler says.

Meanwhile, the global memory pool in The Machine is currently implemented as standard DRAM, but HPE’s original plan revolved around persistent memory, which retains its content even in the absence of power.

This capability was to have been filled by HPE’s own memristor technology, but development of this has been delayed. However, HPE said that its architecture can work with various memory technologies, such as the NVDIMMs already used in its ProLiant servers that combine DRAM with flash, and so-called storage-class memories like Intel’s 3D XPoint or the ReRAM (Resistive RAM) HPE is partnering with Western Digital on.

Even using DRAM, a system based on The Machine architecture could prove useful in HPC applications, according to Wheeler.

“It turns out that quite a lot of the problems in the HPC and analytics space can actually live with a model like that, if we also have a mechanism to quickly checkpoint [save a snapshot of] the data. So, we’ve actually been working on that and had some very good results, and that’s something we could bring to market, even as these other persistent memory technologies come online,” he says.
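The checkpointing Wheeler describes can be illustrated with a minimal sketch: because DRAM is volatile, the in-memory data set is periodically snapshotted to durable storage and restored after a failure. This is a generic illustration of the idea, not HPE's mechanism; the function names and the use of pickle are assumptions for the example.

```python
import os
import pickle
import tempfile

def checkpoint(dataset, path):
    """Atomically snapshot an in-memory data set to durable storage."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(dataset, f)
        f.flush()
        os.fsync(f.fileno())      # ensure the bytes really reach the disk
    os.replace(tmp, path)         # atomic rename: old snapshot or new, never half

def restore(path):
    """Reload the most recent snapshot."""
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "snap.pkl")
data = {"readings": list(range(1000))}
checkpoint(data, path)
data["readings"].append(-1)                    # later, the DRAM copy is lost...
assert restore(path)["readings"][-1] == 999    # ...but the snapshot survives
```

The write-to-temp-then-rename pattern matters: a crash mid-checkpoint leaves the previous snapshot intact rather than a corrupt half-written one.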

And this is not the only place that technology behind The Machine could find a use, thanks to its scalability.

“We really envision it as something that scales all the way from IoT edge devices, something we call the intelligent edge, all the way up through something that looks more like a traditional server or one of our blade systems - something very composable - up to something you could really build out at large scale,” Wheeler says.


The impact on IoT

In particular, the processing power that memory-driven computing can bring to bear should prove useful for the kind of analytics that some IoT applications call for.

“When you think of an IoT device, you think it’s just some dumb sensor, but if you see that a lot of the machine generated data will be processed at that level, and if you think about the architecture and overall solution you need, I think that’s an example of what’s going to happen,” he adds.

The same goes for autonomous vehicles, one of the hot topics of the moment. These will need a huge amount of processing power in order to analyse sensor data, and will need to do so in real time in order to avoid collisions.

All of this means that while we may not actually see The Machine itself find its way into commercial use, it is helping to develop new technologies that will find their way into enterprise computing, including Gen-Z, the memory-driven architecture itself, and the software tools to exploit that architecture to best effect.


Dan Robinson

Dan Robinson has over 20 years of experience as an IT journalist, covering everything from smartphones to IBM mainframes and supercomputers as well as the Windows PC industry. Based in the UK, Dan has a background in electronics and a BSc Hons in Information Technology.


