Site Reliability Engineer / DevOps Engineer – Data Platform & Analytics Infrastructure - MindSource

MindSource is looking for a skilled professional to support a large-scale metadata discovery and analytics infrastructure platform for a client based in Austin, TX. This platform catalogs and manages metadata across millions of datasets. It centralizes metadata from a wide variety of data systems (including relational databases, distributed storage systems, and analytics platforms) so that engineering, data science, legal, and compliance teams can find and understand available datasets from a single registry.

You'll work in a dynamic environment that combines the excitement of the tech industry with the vibrant cultural scene of the city. This role will focus on operating and scaling the infrastructure and streaming pipelines that power this data discovery platform.

This is a hybrid, long-term contract role with the possibility of extension.

Key Responsibilities

Platform Reliability & Operations

• Operate and maintain large-scale streaming pipelines responsible for ingesting metadata from multiple data systems.
• Ensure reliability, availability, and performance across distributed services supporting metadata ingestion and processing.
• Troubleshoot and debug issues across distributed systems and streaming pipelines.
• Support production incidents and ensure stable operation of critical data infrastructure.

Infrastructure & Deployment

• Deploy and manage services and streaming applications in containerized environments.
• Manage deployments and infrastructure scaling using modern deployment tooling.
• Operate web services responsible for receiving metadata events from multiple connectors across the environment.
• Support infrastructure configuration and system scalability for high-volume ingestion workloads.

Streaming & Data Pipeline Support

• Support streaming pipelines processing metadata events through distributed messaging systems.
• Manage topics, event streams, and processing jobs responsible for metadata synchronization.
• Ensure reliable processing and delivery of metadata streams into centralized data lake storage systems.

Observability & Monitoring

• Implement monitoring, metrics collection, and alerting across pipeline infrastructure.
• Build dashboards and operational visibility for distributed services and streaming applications.
• Ensure early detection and resolution of reliability or performance issues.

Scalability & Performance

• Maintain infrastructure capable of handling large-scale ingestion workloads and millions of metadata events.
• Optimize system performance across services, streaming infrastructure, and storage layers.
• Support scaling of services and streaming applications as metadata volume grows.

Required Qualifications

Strong experience in Site Reliability Engineering or DevOps engineering.

Experience supporting distributed systems and data platforms.

Strong hands-on experience with Kubernetes and container orchestration.

Experience managing deployment pipelines and automated infrastructure.

Experience building and maintaining observability platforms and monitoring systems.

Strong debugging and troubleshooting skills across distributed systems.

Preferred Qualifications

Experience with streaming data platforms such as Kafka.

Familiarity with stream processing frameworks such as Flink.

Experience with modern data lake storage formats such as Iceberg.

Familiarity with analytics platforms or data infrastructure environments.

Experience supporting large-scale data pipelines or data platforms.
