Tata Consumer Products Limited (TCPL) is a prominent player in the consumer goods sector and a subsidiary of the Tata Group. With a diverse portfolio that spans beverages, foods, and essential consumables, the company has consistently demonstrated a steadfast dedication to consumer satisfaction and societal well-being.
Tata Consumer wants to establish a holistic data platform that integrates data from its various source systems (SAP, Botree, Salesforce, DMS, etc.) into a single persistent storage layer. Data in that layer is transformed as required to build views that standardize the definitions of business KPIs.
Tata Consumer wants to move its data from SAP BW, Botree, Salesforce, the Blue Yonder application, and DMS systems to Snowflake, creating a single, standardized source of truth.
Gaps in the current system:
Ganit has made a significant dent in various industries using data science and analysis. Ganit partners with clients to translate their data into a tangible, insightful plan of action that delivers a measurable impact on the clients’ top-line and bottom-line growth.
For this project, Ganit built an AWS-based data lake and a warehouse on Snowflake to form the foundational Lakehouse architecture.
Data was ingested from various sources, such as MySQL (SSFA, SMDMS, Botree), SFDC, and SAP, into a raw data lake layer on Amazon S3. SnapLogic was the ingestion tool and performed the different types of loads.
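The load types are not listed here. As an illustration only, assuming they include full and incremental loads, the difference between the two can be sketched in plain Python. The function names, the `updated_at` field, and the watermark approach are hypothetical and are not taken from the SnapLogic pipelines used in the project.

```python
from datetime import datetime


def select_rows_for_load(rows, load_type, last_watermark=None, ts_field="updated_at"):
    """Return the rows to extract for a 'full' or an 'incremental' load.

    Assumption: every source row carries a last-modified timestamp in `ts_field`.
    """
    if load_type == "full":
        return list(rows)
    if load_type == "incremental":
        if last_watermark is None:
            # No previous run recorded, so fall back to a full extract.
            return list(rows)
        # Keep only rows changed after the previous run's watermark.
        return [r for r in rows if r[ts_field] > last_watermark]
    raise ValueError(f"unknown load type: {load_type}")


def next_watermark(rows, last_watermark=None, ts_field="updated_at"):
    """Newest timestamp seen so far, to be stored for the next run."""
    candidates = [r[ts_field] for r in rows]
    if last_watermark is not None:
        candidates.append(last_watermark)
    return max(candidates) if candidates else None


if __name__ == "__main__":
    source = [
        {"id": 1, "updated_at": datetime(2024, 1, 1)},
        {"id": 2, "updated_at": datetime(2024, 1, 3)},
    ]
    changed = select_rows_for_load(source, "incremental", datetime(2024, 1, 2))
    print([r["id"] for r in changed])
```

A full load re-extracts everything, while an incremental load depends on the stored watermark being updated after each successful run.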
There were Raw, Curated, and Refined zones in Amazon S3, with AWS Glue utilized for data cleansing and transformation.
Once a SnapLogic pipeline run completes, its files are stored in the raw zone. AWS Glue jobs, defined per data source, copy the raw-zone data to the curated zone as a replica, clean it, and store it in Parquet format. Data in the curated zone is then moved to the refined bucket, where transformations and business logic are applied to meet the business requirements.
AWS Glue is orchestrated by an AWS-managed Apache Airflow workflow engine. A metadata store on RDS MySQL manages metadata. A Snowflake-based data model supports sales Key Performance Indicators (KPIs), which are computed and visualized in ThoughtSpot, providing an end-to-end data processing and analytics solution.
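The raw-to-curated step can be illustrated with a minimal, dependency-free sketch. It is not the project's Glue job. The `raw/` and `curated/` key prefixes, the cleaning rules, and both function names are assumptions made for the example, and writing the Parquet file itself is left out.

```python
import posixpath


def curated_key(raw_key):
    """Map a raw-zone object key to its curated-zone Parquet key.

    Assumption: zones are key prefixes named 'raw/' and 'curated/'.
    """
    if not raw_key.startswith("raw/"):
        raise ValueError(f"not a raw-zone key: {raw_key}")
    stem, _ext = posixpath.splitext(raw_key[len("raw/"):])
    return f"curated/{stem}.parquet"


def clean_records(records, key_field):
    """Apply simple, hypothetical cleaning rules to a list of row dicts.

    - strip whitespace from strings and turn empty strings into None
    - drop rows whose key is missing
    - de-duplicate on the key, keeping the last occurrence
    """
    cleaned = {}
    for rec in records:
        row = {}
        for name, value in rec.items():
            if isinstance(value, str):
                value = value.strip() or None
            row[name] = value
        key = row.get(key_field)
        if key is None:
            continue
        cleaned[key] = row
    return list(cleaned.values())


if __name__ == "__main__":
    print(curated_key("raw/botree/orders/2024-01-01.csv"))
    print(clean_records([{"id": " 7 ", "name": " Tea "}, {"id": "", "name": "x"}], "id"))
```

In an actual Glue job the same rules would typically be expressed as Spark transformations, with the output written as Parquet to the curated location.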
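To make the orchestration order concrete, the steps described above can be modeled as a small dependency graph. This is a sketch using Python's standard `graphlib` module rather than Airflow itself. The task names are hypothetical, and whether the ingestion and Snowflake load steps are triggered by the same workflow is an assumption.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# Task names are illustrative, not the project's actual job names.
PIPELINE = {
    "snaplogic_ingest_to_raw": set(),
    "glue_raw_to_curated": {"snaplogic_ingest_to_raw"},
    "glue_curated_to_refined": {"glue_raw_to_curated"},
    "load_refined_to_snowflake": {"glue_curated_to_refined"},
}


def execution_order(graph):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step in execution_order(PIPELINE):
        print(step)
```

In a workflow engine such as Airflow, each entry would become a task in a DAG, and the engine would add scheduling, retries, and monitoring on top of this ordering.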