Data Engineer (Spark, Scala, PySpark)
Technology
恩士迅信息科技(上海)有限公司
¥15,000 - ¥25,000 / year
Posted 3 weeks ago, closes 2026/6/9
Full-time
Job description
This position is sourced from Liepin.
Responsibilities
- Deliver solutions for online data extraction, data integration and data management.
- Develop and maintain high-performance, scalable utilities to support technology research and data transformation.
- Optimize data processing protocols and systems for better efficiency and maintainability.
- Contribute to the establishment and maintenance of a distributed computing platform and big data services.
- Write documents that clearly explain how algorithms should be implemented, verified and validated.
Requirements
- Experience in a field encompassing distributed computing, data analytics and data transformation.
- Hands-on data engineering experience with Hadoop, SQL, Hive/Impala, Spark (Scala/Python/Java), Parquet/Avro/HBase and HDFS.
- Ability to analyse and address capacity constraints and performance issues in SQL and Spark ETL processing.
- Strong in Linux/Unix commands and scripting (shell scripting and Python/Scala).
- Experience automating batch jobs with shell scripts and REST APIs (triggering and result parsing).
- Experience using AWS products such as S3 storage.
- Strong problem-solving skills and a can-do attitude, with the ability to devise solutions to problems across a wide technical area.
- Knowledge of professional software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
Keywords
Unix, Coding, Scala, Apache Hadoop, Apache Spark, Apache HBase, Linux, Python, SQL, Apache Parquet, Hadoop, Hive, HBase, Apache Avro, Java, Coding conventions, Maintainability, Big data, Data management, Shell script