I have more than 5 years of experience as a Data Engineer.
Experience
Hands-on experience in Python programming.
Hands-on experience in PySpark for data transformation and manipulation.
Hands-on experience with cloud platforms such as AWS, Azure, and GCP, including AWS services (EMR, EC2, S3, Glue, Lambda, Redshift) and Azure services (Databricks, Data Factory, Data Lake, and Synapse); familiar with the functionality and use cases of other AWS components as well.
Proficiency in analyzing data using Spark SQL queries through Databricks.
Proficiency in analyzing data using PySpark DataFrames through Databricks and EMR.
Proficiency in creating data pipelines using Python and integrating them with Airflow to create and schedule DAGs.
Hands-on experience in designing, creating, and implementing data pipelines, ETL (Extract, Transform, Load) / ELT (Extract, Load, Transform) processes, and data warehousing solutions.
5+ years of experience in creating data pipelines.
5+ years of experience in delivering solutions for complex business problems involving large-scale data warehousing.
Proficiency in working on data ingestion processes.
Knowledge of Shell scripting.
Strong problem-solving skills.
Experience working with GitHub, PyCharm, and Jupyter Notebook.
Ability to work effectively in cross-functional teams and communicate complex technical concepts to non-technical stakeholders.