This listing is no longer available.

Senior Data Engineer (Spark) (Remote)

Addepto

Białystok +7 more
31,920 zł/month
Remote
🐍 Python
Spark
🤖 Airflow
🐳 Docker
🌐 Remote
☁️ AWS
SQL
📊 Data Integration
Hadoop
Java
Iceberg
Scala
Hive

Requirements

Expected technologies

Python

Spark

Airflow

Docker

Optional technologies

Java

Scala

Kubeflow

MLFlow

Databricks

dbt

Kafka

Kubernetes

Iceberg

Terraform

Operating system

Windows

macOS

Our requirements

  • At least 5 years of commercial experience implementing, developing, or maintaining Big Data systems, data governance, and data management processes.
  • Strong programming skills in Python (or Java/Scala): writing clean code, OOP design.
  • Hands-on experience with Big Data technologies such as Spark, Cloudera Data Platform, Airflow, NiFi, Docker, Kubernetes, Iceberg, Hive, Trino, or Hudi.
  • Excellent understanding of dimensional data and data modeling techniques.
  • Experience implementing and deploying solutions in cloud environments.
  • Consulting experience with excellent communication and client management skills, including prior experience directly interacting with clients as a consultant.
  • Ability to work independently and take ownership of project deliverables.
  • Fluent in English (at least C1 level).
  • Bachelor’s degree in technical or mathematical studies.

Optional

  • Experience with an MLOps framework such as Kubeflow or MLFlow.
  • Familiarity with Databricks, dbt or Kafka.

Your responsibilities

  • Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
  • Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
  • Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
  • Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
  • Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
  • Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
  • Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
  • Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.
Views: 1
Published about a month ago
Expires in 12 days
Work mode: Remote
