Cloudera's CEO builds for new world of data-driven insights

Cloudera boss Tom Reilly is betting on a platform play to get the most out of Hadoop

The Edgware Road, London hotel where Cloudera is taking part in a conference is, even by the standards of technology events, pretty geeky. Here’s a man in a T-shirt animatedly describing the nature of “dirty reads”; there’s a bearded fellow in braces carrying the largest Lego box I have ever seen. Maybe he’s making a particle collider from the pieces. The place is buzzing but then, as one of the fastest growing enterprise software companies ever, Cloudera has plenty of buzz about it too.

This is the company at the heart of the Hadoop and Big Data movements, and one with the legs to move quickly towards leading a space where architectures devised by the internet giants enter open source and enable data analysis on a higher plane than has ever been possible. A mixture of VC funding and a huge Intel stake has given Cloudera over $1bn to play with. The stakes are very high indeed and some suggest that Cloudera is already worth the thick end of $5bn. But when I meet CEO Tom Reilly he’s polite and softly spoken, the antithesis of the brash American tech leader cliché.

His company is reliant on open source and he’s a proponent of community to the extent that customers are the keynote speakers on Cloudera’s global tour and the London event is co-branded by O’Reilly, the near-namesake techie books publisher.

“That’s the way the customers like it,” Reilly says. “It’s not just the vendor pitch.”

He notes that the $740m investment by Intel to take almost a fifth of Cloudera is now over a year old. Some might see that agreement as lucrative but potentially compromising to Cloudera’s independence. Reilly bats away such notions, saying that the alliance is not exclusive.

“We think it’s all upside,” he says, noting the ability to run instructions on the world’s biggest chip maker’s silicon and winning a periscope view onto what’s happening in infrastructure. “When we entered into this partnership I was always excited but one year in I’m literally blown away.”

Growing up

He also believes that the Hadoop market is maturing.

“If you go back two-plus years ago there were a lot of proof-of-concepts and discussion of what it’s good for and not good for. In 2014 we saw data hubs and data lakes and more understanding of how this platform should be used.”

Today, we might have arrived at Gartner’s fabled “plateau of productivity”, judging by Reilly’s statement that the current focus is on capturing best practices, demonstrating business value, developing boardroom-level use cases and discovering new usage scenarios.

Financial services is Cloudera’s biggest sector so far: identifying money laundering, fraud and so on. But its appeal seems broad, extending to anywhere data needs to be parsed for insight (UK retailer Marks & Spencer has just been added to the customer roster). Seventy per cent of customers fall into the enterprise bucket, Reilly says, and the company is trending to the classical subscription model of “land and expand”, where spending increases within accounts over time as users become comfortable.

Building a platform

But a big part of what Cloudera is working on now is building an ecosystem. The company’s partner programme covers about 1,500 companies, 300 of them ISVs. Certification programmes are being put in place for business partners (Amdocs and ClickFox are namechecked), admins and system analysts.

I press Reilly for a company that he admires or might use as a template and after giving the subject some thought, he cites Microsoft as impressive.

“One of the things Microsoft has done very well is always to have an ecosystem of ISVs; this is a conversation I had with Satya Nadella recently. We want a very vibrant community building a platform on top of our programs.”

Data and science

Which brings us to the contentious area of hiring and what the new generation of data scientists able to make practical use of all this new data will look like.

Cloudera is making its presence felt in universities with its academic partner programme and Reilly says that the new data scientist will be “someone who’s an expert in [a market like] financial regulatory requirements and someone who has [the skills] to work with data… that’s why they call them the unicorns.”

He also thinks companies should bet on creating data scientists rather than buying off the shelf.

“You’re better off building your own rather than trying to hire,” he says.

One hundred up

As for competition, Hortonworks, the company that is most often compared to Cloudera, has not been shy of criticising its rival but Reilly is initially temperate, suggesting Cloudera has superior enterprise-grade capabilities.

“We have a very good competitive landscape,” Reilly says. “I don’t think this [market] is ‘winner takes all’. Our goal is to be the market leader and this is a market we feel we have created. If you’re a large enterprise that’s concerned about security of data and lowest total cost of ownership, come to us. We don’t want to be arrogant but we do have to be confident.”

If Reilly appears reluctant to rejoice about the massive financing his company has won, he’s much keener to discuss the announcement Cloudera made in February, when it said it had over $100m in annual revenues. He believes that Cloudera got to that mark faster than shooting stars of a previous generation like Oracle or, more recently, Tableau and Splunk. What’s perhaps just as surprising – especially seeing as a lot of software development is done – is that Cloudera is arguably the first open source company since Red Hat to reach that mark.

Performance indicators

Reilly views the triple crown of new customers, renewal rates and expansion within accounts as his main KPIs. Seventy per cent of customers expand their investments in Cloudera in a year and 15% do so twice. He won’t disclose stickiness levels but says they are “in that range” when I suggest that successful subscription software services tend to keep north of 90% of customers per year.

Of course we’ll find out more about Cloudera’s numbers if and when the company goes public, although that treasure chest of funding means there needn’t be any great rush to float. Reilly says Cloudera is already “meticulous” about proof points and providing predictability and clarity. And with a flick of the stiletto he contrasts that approach with an unnamed, publicly traded rival (but let’s call them Hortonworks) where “nobody understands their business model”.

Changing the world

Of course, having strong revenues and backing is important but what’s more important about Cloudera, and other firms of its ilk, is their ability to change the world and answer some of the most important questions we’re grappling with today.

Take healthcare. Reilly is himself a type 1 diabetic and believes that e-records, monitoring and alerts – such as informing medics of a toxic reaction to treatment – could change patient care in remarkable ways. For years now the Big Data analytics case study cited at conferences was about how Target was able to identify pregnant women in their second trimester. But Reilly has a better one, telling me about a children’s hospital in Atlanta where two years of data were mined to show that distress in babies was often related to the noise of staff entering rooms. Lesson learned, the hospital minimised the scope for disruptions.

Reilly also points to the actor Michael J. Fox’s Parkinson’s disease research foundation and how suitably instrumented patients can transmit 300 data points per second. Another item: data can be deployed to reduce risks of contracting sepsis infections in hospitals by predicting the probability of bloodstream infections.

Reilly recognises that there will always be some privacy concerns over data and how it is used but says that having his Tesla car suggest a smarter route home after a day in the office so he can say goodnight to his children is an example of how predictive data can make an everyday difference.

Sunny horizons

Cloudera appears to have a very bright present and future but its fast progress has perhaps indirectly led to it being involved in a couple of spats. Reilly is dismissive about the recently announced Open Data Platform Hadoop/Big Data initiative that Cloudera rejected, saying it “is not addressing any customer need”, comparing it to groupings like the OpenSocial Foundation and claiming that it runs “contrary to the principles of Apache Foundation”. When he attempted to find out more about ODP before its launch “the response was ‘we’re not sure you’ll be interested’” and he says he was only offered details a week before launch and even then on condition of an accompanying non-disclosure agreement.

Another target has been Cloudera’s business model of using proprietary software alongside open source, attracting predictable brickbats from Hortonworks (again). But Reilly says, “there would not be a company called Cloudera without the Apache Foundation”, points out that Hadoop creator Doug Cutting works for Cloudera, and notes that the Apache Spark cluster computing platform was created by two Cloudera interns. Touché.

And with that our hour is up and Reilly returns to the melee of the conference for further meetings – and more data collection for subsequent interrogation.

 
