Audi puts open source big data foundations in place for car usage data

Audi has been adopting a range of open source technologies to build a big data foundation for collecting growing volumes of data from its latest luxury car models, as well as from the machinery in its production facilities.

Speaking to a packed room at the DataWorks Summit in Berlin last week, Matthias Graunitz and Carsten Herbe, two big data architects at Audi, spoke about how they built the data backend that stores these new data sources, and some lessons learned along the way.

Open source stack

Audi is a big Hadoop user and has been storing data in the Hadoop Distributed File System (HDFS) since 2015. Laying out the roadmap, Graunitz said: "So we started with a small cluster by the end of 2015, started with clients to investigate how to build and run this system and if they fulfilled the business requirements. So we started with a small Hortonworks data platform (HDP) cluster and have four nodes, 96 cores and 160TB of raw capacity."

Today that has grown into a production Hadoop cluster with 1PB of storage capacity, 288 cores across 12 nodes and 6TB of RAM, as well as a production Kafka cluster with 4 nodes, 128GB of RAM and 16TB of raw capacity.

That doesn't mean the integration was seamless, though, as Herbe noted: "Introducing Hadoop into an existing enterprise environment has challenges, it has to integrate with business systems, there are security requirements."
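The pipeline described in the talk streams vehicle and production-machine data through Kafka before it lands in HDFS. A minimal sketch of what such an ingestion step might look like is below; the topic name, record fields and JSON encoding are illustrative assumptions on my part, not details Audi disclosed.

```python
import json
from datetime import datetime, timezone

def serialize_telemetry(car_id, sensor, value):
    """Encode one telemetry reading as a JSON message for a Kafka topic.

    The field names here are hypothetical; Audi's actual message schema
    was not described in the talk.
    """
    record = {
        "car_id": car_id,
        "sensor": sensor,
        "value": value,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record).encode("utf-8")

# With a broker available, a client such as kafka-python could then
# publish the message (topic name "car-telemetry" is an assumption):
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="broker:9092")
#   producer.send("car-telemetry",
#                 serialize_telemetry("WAU1234", "battery_temp", 31.5))

msg = serialize_telemetry("WAU1234", "battery_temp", 31.5)
print(msg)
```

Kafka itself is agnostic to the payload bytes, so the serialization format is a design choice made by the ingesting team rather than by the platform.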
