What is open metadata and why should we care?

Attempts to set standards through organic adoption are traditionally an uphill struggle

It was a blatant plug. Mandy Chessell, IBM’s chief data officer, had just delivered a keynote talk on open metadata and she was petitioning the audience to join her, or at least the ODPi, saying that “adoption is key to standards.” She has a point. For open metadata to become the de facto standard for the big data industry, it will need volume support, and where better to get it than at a big data conference?

Dataworks 2018, held at the Estrel in Berlin, a large hotel and congress center on the edge of the Neukölln district in the south of the city, should have been fertile ground for Chessell. The room was full of data geeks. How Erich Mielke, the head of the East German Stasi, would have loved to mine their collective knowledge on handling large amounts of data. The irony is not lost, given that the Berlin Wall ran just a stone’s throw from the Estrel and the last person to be killed at the wall, Chris Gueffroy, was shot scaling a fence between Treptow Park and Neukölln.

That’s history, of course, but managing big data remains a constant problem for businesses and organizations, especially as it moves between devices, datacenters and the public cloud. As organizations evolve, their adoption of cloud strategies varies widely. Interestingly, a quick interactive poll of the Dataworks audience revealed that 35 percent had no interest in moving data to the cloud. That wasn’t expected, but then there is a lot of uncertainty about security and about how to manage data once it is in the cloud.

It’s something the event’s main sponsor, Hortonworks, has clearly tried to solve with its DataPlane Service, a management system that can give visibility of all data regardless of its location. Chessell’s push for open metadata dovetails neatly with this idea, offering a sort of filing cabinet, like one in a library, which helps users find what they are looking for quickly. The metadata is a catalogue of all available data, but not all filing cabinets are created equal. That’s Chessell’s point. If it’s not open, it just slows everything down.

“Organizations will either be locked into single vendor deployments or spend time creating their own integrations to shift and transform metadata between repositories from different vendors - this is what is happening today and what we are trying to change,” she says when asked what happens if the open standard is not widely taken up. “The effect is increased cost in managing data and less effective use of data, since it is harder to find and control it.”

Unsurprisingly, perhaps, Hortonworks’ VP of marketing, John Kreisa, agrees. Hortonworks was, after all, a founding member of the ODPi.

“Honestly, I think there will be pressure on businesses to adopt a standard,” he says. “The more the tools are connected to a common metadata layer, the more compliance, which will make GDPR that much easier. If companies are not subscribed to the open idea, it’s still possible but it will be a lot of work and will take longer to achieve.”
