Shell: Cloud and Big Data delivery expertise accelerates delivery of new analytics platform
Colibri worked with Shell, one of the world's largest oil and gas companies, to deliver an Azure advanced analytics stack for its global lubricants business. Shell was struggling to design a solution capable of processing vast quantities of dirty data from various sources.
Shell, one of the world's largest oil and gas companies, had spent the previous six months trying to deliver an Azure analytics stack for its global retail lubricants business. The team was struggling with the volume, veracity and variety of the data and lacked experience delivering Azure data projects, and the project was heading towards failure. Major issues included the time taken to manually deploy and execute the main data pipeline, an unstable data science environment, and a lack of automation and control around code changes.
Colibri were approached to redesign and re-implement the analytics platform from scratch, using a DevOps-first, Cloud-first methodology to enable rapid evolution and prototyping of new use cases and ingestion of new data.
We rapidly embedded into the existing Data Engineering, Data Science and DevOps teams, instilling a culture of automation and Cloud-first thinking. This allowed us to quickly distill and simplify the architecture of the target data platform.
By working with existing business and engineering teams, we were able to develop a refined, prioritized backlog of development tasks, supported by a best-practice cloud and data architecture focussed on solving only the core business requirements, something that the project had previously failed to do.
We also sponsored the adoption of industry best-practice technologies including Azure Storage, AKS, ACR, Spark, Kubernetes, and CircleCI, enabling engineering teams to significantly improve productivity while dramatically simplifying the architecture of the platform.
In just 10 weeks we were able to reduce the time taken to deploy a new version of the data engineering or data science pipeline from weeks to minutes. Pipeline execution performance increased by more than 15x on a significantly larger data set.
We also delivered state-of-the-art analytics infrastructure supported by fully automated deployments, providing environments for data scientists to rapidly explore new use cases, and for data engineers to test out the creation of new features in a scalable, repeatable fashion.