Career Profile
Experienced Big Data consultant who has delivered data-intensive Big Data projects for multiple clients across the Banking, Healthcare, Supply Chain, E-Commerce, and Retail domains.
Experience
Working as the lead of a five-developer ingestion framework team, maintaining two ingestion frameworks and their related services, which run thousands of daily ingestions across six environments.
- Successfully released a new generation of the metadata-driven ingestion framework, with a user interface, during the pandemic. Developed jobs and utilities to migrate from old clusters to new clusters.
- Working with the architect and project manager to scope and deliver new features and enhancements.
- Responsible for delivering stories in sprints and for backlog grooming.
Worked with one of the largest banks in APAC to build an enterprise Big Data platform using open-source tools and design principles focused on flexibility and transparency.
- Led an offshore team of 5 developers and helped project teams with ingestion issues and queries.
- Worked on a Spark-based ingestion framework. Developed and maintained Jenkins pipelines.
- Developed a test framework to automate testing of use cases.
- Worked on a Talend-based ingestion framework for loading data into the data lake, including features such as REST API ingestion and a JSON data parser.
- Worked on an IBM MQ and Kafka based streaming ingestion POC on top of Pivotal Cloud Foundry and a Hadoop cluster. As part of this, built an XML-to-text parser to convert XML messages from MQ.
Worked on a forecasting-based inventory replenishment platform for a leading automotive aftermarket parts provider in the USA.
- Developed AWS EMR based Spark modules for replenishing stores in a store family after assortment changes, route changes, and store closures.
- Developed a PySpark framework to run SQL files on the Spark cluster.
- Developed an automated script to launch the AWS EMR cluster along with the required input Hive tables.
- Worked on a project to migrate SAS data pipelines to Hadoop.
- Converted SAS jobs to Hive and Spark based data pipelines.
- Developed Sqoop-based scripts to automate data loading from databases such as SAP HANA, DB2, and SQL Server.
- Worked with the SAS Data Science team to prepare data pipelines using PySpark and Spark SQL on an AWS cluster.
- Consulted, helped, and guided peers on Big Data problems and solutions.
- Helped debug issues in Hadoop, Hive, and Spark jobs.
- Involved in the Big Data engineer recruitment panel.
Zaloni's data management and governance tool, Bedrock, helps manage data ingestion pipelines and their metadata. I worked on Bedrock's first production project and successfully deployed it for a Fortune 500 healthcare company.
- Developed a MapReduce-based data ingestion framework in Bedrock with record counting, tokenization, watermarking, schema validation, and duplicate-file checks.
- Developed a shell-script-based framework on top of Bedrock for data extraction from database sources using Sqoop, Hive, and HCatalog, supporting both full and incremental loads.
- Worked as tech lead for many data offloading projects and POCs using Bedrock.
- Worked on a project to migrate and merge all existing data ingestion pipelines into a consolidated Hadoop MapReduce workflow.
At a startup, I worked on many POCs and product concepts.
- Involved in requirements gathering and technical discussions to formulate development plans.
- Developed web applications and backend modules.
- Helped frontend developers with the necessary APIs.
- Contributed to design discussions for database models.