Data Engineer at Credit Suisse
As a Data Engineer at Credit Suisse, I establish and maintain scalable data solutions that support the company's financial operations and reporting. I use machine learning, Spark, and AWS to process, analyze, and validate large and complex data sets, ensuring high quality and accuracy.
In my previous role at Draxo Infotech, I developed and executed Big Data analytics and machine learning applications using Apache Spark with Python, leveraging Spark ML and MLlib for various use cases. I also automated and enhanced data management processes using Hive, HDFS, and AWS services, such as EMR, Redshift, and S3.
I hold a Master's degree in Data Science from The University of Texas at Arlington, where I learned and applied advanced skills in data engineering, machine learning, and cloud computing. I also have a Bachelor's degree in Computer Science and Engineering from Motilal Nehru National Institute Of Technology, and I am an AWS Certified Solutions Architect – Associate.
I am passionate about finding and solving data problems, and I am always eager to learn new technologies and tools. I enjoy working with cross-functional teams and collaborating with other data professionals to deliver innovative and impactful solutions.
*Engineered and automated ETL workflows using Apache Spark and Python, resulting in a 30% reduction in data processing time and improved data accuracy.
*Designed and implemented ETL processes using SQL to extract data from various sources, including transaction databases, online interactions, social media, and customer support systems.
*Led the development of a Comprehensive Customer Analytics Platform, utilizing SQL for data collection, integration, transformation, and analysis.
*Developed SQL queries and scripts for in-depth data analysis, calculating KPIs, and deriving actionable insights for stakeholders.
*Collaborated with teams to integrate visualization tools such as Tableau on top of SQL data sources, creating interactive dashboards that display customer behaviors, sales trends, and marketing campaign effectiveness.
*Implemented CI/CD for data pipelines using AWS Data Pipeline and Airflow, achieving a 20% reduction in deployment time.
*Integrated Python with AWS Lambda for serverless data processing, decreasing infrastructure costs by 15%.
*Leveraged AWS Glue to automate ETL processes for daily sales data feeds, reducing processing time by approximately 50%.
*Used Databricks to build a scalable data lake, ingesting and processing 2 terabytes of financial data, including historical market data, customer transactions, and risk assessments.
*Designed and automated efficient data pipelines that parsed and stored raw data into partitioned Hive tables, improving data retrieval for reporting and analysis by 15%.
Master's in Data Science from The University of Texas at Arlington, May 2023