

We empowered Tata Consumer with unified Data-Driven Insights, enabling 1500+ Users with ThoughtSpot Dashboards & Automating Data Currency through Robust AWS Lakehouse & Snowflake Architecture

Challenge

INDUSTRY OVERVIEW

Tata Consumer Products Limited (TCPL) is a prominent player in the consumer goods sector and a subsidiary of the Tata Group. With a diverse portfolio that spans beverages, foods, and essential consumables, the company has consistently demonstrated a steadfast dedication to consumer satisfaction and societal well-being.

Tata Consumer wants to establish a holistic data platform that integrates data from its various source systems (SAP, Botree, Salesforce, DMS, etc.) into a single persistent storage layer, where data is transformed as and when required to build views that streamline business-related KPI definitions.

CHALLENGE

Tata Consumer wants to move its data from SAP BW, Botree, Salesforce, the Blue Yonder application, and DMS systems to Snowflake, creating a single, standardized source of truth.

Gaps in the current system:

  • Fragmentation of data across different storage solutions results in inefficiencies when accessing or analyzing information
  • The lack of a data lake restricts scalability, especially as data volumes grow over time
  • Lack of quick access to a centralized pool of data leads to delayed decision-making
  • Security measures and compliance protocols are harder to enforce across fragmented data storage systems

Why were we brought in?

Ganit has made a significant dent in various industries using data science and analysis. Ganit partners with clients to translate their data into a tangible, insightful plan of action that delivers a measurable impact on the clients’ topline and bottom-line growth.

Our approach

Solution

For this project, Ganit built an AWS-based data lake and a warehouse on Snowflake to form the foundational lakehouse architecture.

Data was ingested from various sources, such as MySQL (SSFA, SMDMS, Botree), SFDC, and SAP, into a raw data lake layer on Amazon S3. SnapLogic was the ingestion tool, and it performed two types of load:

  • Full Load: The entire dataset was pulled from the source system, or all data from the full-load start date configured in SnapLogic
  • Delta Load: New or changed data was captured and pulled from the source system on a daily or hourly basis
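
The difference between the two load types can be illustrated with a minimal Python sketch. It is not the SnapLogic configuration used in the project: the `updated_at` column, the sample rows, and the watermark approach to detecting changes are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical source rows; "updated_at" is an assumed change-tracking column.
SOURCE_ROWS = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"id": 3, "updated_at": datetime(2024, 1, 3, 9, 0)},
]


def full_load(rows, start=None):
    """Return every row, or every row on/after an optional full-load start date."""
    if start is None:
        return list(rows)
    return [r for r in rows if r["updated_at"] >= start]


def delta_load(rows, watermark):
    """Return rows changed after the last successful load, plus the new watermark."""
    changed = [r for r in rows if r["updated_at"] > watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark
```

Under these assumptions, a scheduler would store the returned watermark after each daily or hourly run and pass it into the next `delta_load` call, so that only rows newer than the previous run are pulled.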

There were Raw, Curated, and Refined zones in Amazon S3, with AWS Glue utilized for data cleansing and transformation.

After a SnapLogic pipeline run completes, the files are stored in the raw zone. AWS Glue jobs defined for each data source copy the raw-zone data to the curated zone as a replica, where it is cleaned and stored in Parquet format. The curated data then moves to the refined bucket, where transformations and business logic are applied to meet the business requirements.
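
The raw, curated, and refined flow can be sketched in plain Python. This is an in-memory stand-in rather than the project's AWS Glue jobs: the record fields, the cleaning rules, and the per-region sales total are hypothetical, and the real pipeline reads and writes files on Amazon S3 (Parquet in the curated zone) instead of Python lists.

```python
from collections import defaultdict

# Hypothetical raw-zone records as they might land after an ingestion run.
RAW = [
    {"order_id": " A1 ", "region": "north", "amount": "100.50"},
    {"order_id": "A1", "region": "North", "amount": "100.50"},  # duplicate once trimmed
    {"order_id": "A2", "region": "south", "amount": "40"},
    {"order_id": "A3", "region": "north", "amount": ""},  # missing amount
]


def to_curated(raw_rows):
    """Clean raw rows: trim text, normalise case, cast types, drop bad and duplicate rows."""
    seen, curated = set(), []
    for row in raw_rows:
        order_id = row["order_id"].strip()
        if not row["amount"] or order_id in seen:
            continue
        seen.add(order_id)
        curated.append(
            {
                "order_id": order_id,
                "region": row["region"].strip().title(),
                "amount": float(row["amount"]),
            }
        )
    return curated


def to_refined(curated_rows):
    """Apply an example business rule: total sales amount per region."""
    totals = defaultdict(float)
    for row in curated_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)
```

Keeping cleaning (`to_curated`) separate from business logic (`to_refined`) mirrors the zone split described above: the curated output stays a clean replica of the source, while KPI-specific rules live only in the refined step.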

AWS Glue jobs are orchestrated by an AWS-managed Apache Airflow workflow engine, and a metadata store on RDS MySQL manages metadata. A Snowflake-based data model supports the sales Key Performance Indicators (KPIs) that are computed and visualized in ThoughtSpot, completing an efficient end-to-end data processing and analytics solution.
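
The ordering an orchestrator has to enforce can be sketched with Python's standard-library `graphlib`, used here as a stand-in for an Airflow DAG. The task names and the dependency chain are assumptions for illustration, not the project's actual workflow definition.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks mapped to the tasks they depend on.
DEPENDENCIES = {
    "glue_raw_to_curated": {"snaplogic_ingest"},
    "glue_curated_to_refined": {"glue_raw_to_curated"},
    "load_snowflake_model": {"glue_curated_to_refined"},
    "refresh_dashboards": {"load_snowflake_model"},
}


def execution_order(dependencies):
    """Return one valid run order in which every task follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, runner):
    """Run each task in dependency order, stopping at the first failure."""
    completed = []
    for task in execution_order(dependencies):
        if not runner(task):
            break
        completed.append(task)
    return completed
```

Here `runner` is any callable that takes a task name and returns `True` on success. Downstream tasks are skipped when an upstream task fails, which is the property that keeps a dashboard from refreshing on half-processed data.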

Features of the tool

  • Flexible handling of diverse data sources for analytics use
  • Faster availability of refreshed dashboards
  • Automated Data Transformation via PySpark in AWS Glue, orchestrated by Airflow, streamlines and accelerates transformations
  • Leveraging AWS services for a streamlined, cloud-native architecture

A valuable difference

Impact

  • 1500+ users now consume insights from ThoughtSpot dashboards that are built on top of the data platform on AWS
  • Business applications no longer need to access different source systems for fetching data and KPIs
  • Due to the automation of the process, the data on the dashboards remains current without requiring manual intervention