Soda pours effort into building a data observatory

Belgian company seeks to automate processes that fix data disconnects.


Anyone advocating the importance of data in the modern world is pushing at an open door: we all know that companies from Amazon to Tesla and Spotify to Deliveroo couldn’t exist in their present forms without an obsessive desire to harness data, secure it, query it and act on it. But there’s data and there’s data… and that’s why Belgium-rooted Soda is being viewed as a key player in enabling data observability and collaboration.

The three-year-old company recently pushed out Soda Cloud, an end-to-end platform based on open source tools created by the company to fix issues in data products and promote “good data” across organisations. Soda helps data teams to anticipate and fix the panoply of “silent issues” that hurt organisations every day, Maarten Masschelein, CEO & Co-Founder, Soda, tells me over a video call.

Think of Soda Cloud as a data observatory that’s constantly scanning for issues caused by human error, firmware upgrades, schema changes, cross-platform integration snafus, bought data or transformation bugs. The Soda Cloud fizzes with the power of testing and monitoring tools built by data and analytics engineers operating as a community. The aim: to automate, verify and validate as far as possible the flow of data from a gamut of sources and encourage enlightened discussion among data experts and downstream business decision-makers.
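
To make that kind of scanning concrete, here is a minimal sketch in plain Python of three checks a data team might automate: a schema check, a missing-value check and a check for a shift in a column's average. It is an illustration only and not Soda's API; the function name `scan_batch`, its parameters and its thresholds are all hypothetical.

```python
from statistics import mean


def scan_batch(rows, expected_columns, column, baseline_mean,
               max_missing=0.1, max_shift=0.25):
    """Return a list of plain-language issues found in one batch of records.

    Hypothetical helper for illustration; names and thresholds are assumptions.
    """
    if not rows:
        return ["batch is empty"]

    issues = []

    # Schema check: columns dropped or added compared with what is expected.
    seen = set().union(*(row.keys() for row in rows))
    expected = set(expected_columns)
    for name in sorted(expected - seen):
        issues.append(f"missing column: {name}")
    for name in sorted(seen - expected):
        issues.append(f"unexpected column: {name}")

    # Missing-value check: share of rows with no value in the monitored column.
    present = [row[column] for row in rows if row.get(column) is not None]
    missing_rate = 1 - len(present) / len(rows)
    if missing_rate > max_missing:
        issues.append(f"{column}: {missing_rate:.0%} of values missing")

    # Shift check: relative change of the batch mean against a known baseline.
    if present and baseline_mean != 0:
        shift = abs(mean(present) - baseline_mean) / abs(baseline_mean)
        if shift > max_shift:
            issues.append(f"{column}: mean shifted by {shift:.0%}")

    return issues


if __name__ == "__main__":
    batch = [
        {"sku": "a", "price": 10.0},
        {"sku": "b", "price": None},
        {"sku": "c", "price": 30.0, "colour": "red"},
    ]
    for issue in scan_batch(batch, {"sku", "price", "currency"}, "price", 10.0):
        print(issue)
```

In practice, a list of issues like this would be routed to whoever owns the data, which is the collaborative half of the problem the company describes.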

Masschelein says that the world is benefiting from data products that act as the digital engines behind price-comparison and recommendation sites, for example. But these all need to be managed as live, dynamically changing entities. Left alone, they can be plagued by errors, disconnects, broken logic and data drift, leading to reduced accuracy and degraded model performance.

“There are more and more data products out there; they’re doing a lot more, but feed them bad data and there are huge potential damages,” he argues. “This is not just a technology problem; it’s also about collaboration and bringing people together, because the engines are becoming commoditised but data tooling isn’t.”

That’s an issue for techies but also for executives, who become mistrustful of data, slowing down insights and subsequent actions. Soda Cloud expedites data integration by making any connection or source testable, so that issues can be discovered, prioritised and resolved as a team effort. Integrations with tools such as Slack simplify the process.

Meeting of minds

Masschelein was an early employee at another data company, Collibra, but says Soda is going in a different direction, having been convinced that “the larger the company was, the more important discoverability was”. Even within Collibra, he had observed the importance of data observability as he struggled with challenges such as working out where new reps needed to be hired. Fail to address underlying issues like this and data teams are “flying blind”, says Tom Baeyens, Soda CTO and Masschelein’s co-founder, with sleeping issues continuing in the background.

Collibra had used software coded by Baeyens for data collaboration and when the two shared a Brussels-to-London train journey, Soda started to brew. By combining data monitoring and troubleshooting with collaborative workflows, the knotty process of unlocking data value could be unravelled. “It dawned on me that the challenge is huge,” Masschelein says. “You can only model good data if you know what it looks like.”

Baeyens adds: “With data roles, you need to have workflows to support that and we feel that open source was a driving factor to enable engineers to connect to data and to help the analysts.”

Similar to DevOps back in the day, we’re at a new stage where companies need to bring data and collaboration together to optimise the work of data engineers, the pair argue.

Soda recently raised $17m to build out its ambitions to lead what it feels is a discrete category where rivals tend to be niche tools and home-grown efforts. What’s the total value of the opportunity? That’s hard to gauge but Masschelein notes that hot cloud monitoring firm Datadog has a large valuation (almost $25 billion at time of writing) “and hasn’t even touched the most complicated part of the stack”.

But fancy valuations for Soda are for another day and for now there are more prosaic concerns such as hiring (“brutal”) and all the classic build-out challenges startups face.

“Why we wake up in the morning is from the joy in teams adopting software that we’ve created and built through blood, sweat and tears,” Masschelein says. “There’s an imminent need and we just want to grab the market as fast as possible.”
