Senior Backend & Data Engineer | Building Scalable, Reliable Systems for AI, SaaS, and High-Throughput Platforms
Backend-heavy Senior Software Engineer with experience building scalable backend systems, event-driven services, data pipelines, internal platforms, and full-stack applications across civic technology, connected-device platforms, and global e-commerce.
Strong hands-on experience with Python, Go, TypeScript, JavaScript, PostgreSQL, Apache Kafka, Apache Airflow, dbt, Apache Spark / PySpark, Snowflake, BigQuery, Redis, Docker, AWS, REST APIs, gRPC, Prometheus, and Grafana.
Experienced in designing production systems that support transactional workflows, streaming event pipelines, ETL/ELT processing, historical backfills, analytics engineering, data migration, warehouse modeling, operational dashboards, service-to-service communication, and production observability.
10/2024 – 06/2026
Bethesda, Maryland, United States
•Designed and implemented backend and data-processing services for municipal permitting, inspections, licensing, fee collection, document management, and citizen-service workflows using Python, Go, PostgreSQL, Apache Kafka, REST APIs, gRPC, Redis, Docker, and AWS, supporting workflow automation for multiple municipal departments including building, zoning, planning, fire, finance, and code enforcement.
•Built Kafka-based event pipelines that captured permit, inspection, licensing, payment, document, and audit events, processing 100K+ municipal workflow events per month into operational reporting models. Kafka producers and Python/Go consumers validated event payloads, enriched records with applicant and parcel metadata, and persisted normalized records into PostgreSQL reporting tables.
•Developed scheduled ETL pipelines using Apache Airflow, Python, SQL, PostgreSQL, and AWS services to automate nightly customer data loads, municipal exports, import validation reports, reconciliation checks, and dashboard refreshes, reducing manual data-processing work by approximately 8–12 hours per week for implementation and support teams.
•Built dbt-based transformation models for municipal reporting and analytics using dbt, SQL, PostgreSQL, Snowflake-style warehouse modeling, staging models, intermediate models, marts, and dbt tests. Created reporting layers for permit activity, inspection throughput, fee collection, license expiration, aging applications, and code-enforcement workload analysis.
•Implemented customer data migration pipelines for onboarding municipalities from legacy systems using Python, Pandas-style transformations, SQL, Apache Airflow, PostgreSQL, cloud storage, and PySpark for larger historical loads. Migrated datasets ranging from hundreds of thousands to over 1M historical records from CSV files, SQL exports, and customer-specific spreadsheets while checking for duplicate permit IDs, invalid parcel numbers, malformed addresses, missing applicants, and inconsistent fee codes.
•Used Apache Spark / PySpark for large historical migration and batch-processing workloads involving permit records, parcel data, inspection history, contractor licenses, and document metadata. Spark jobs cleaned, deduplicated, joined, standardized, and aggregated large datasets before loading them into PostgreSQL or warehouse staging tables, improving large import processing time by approximately 30–50% compared with single-node scripts.
•Designed reconciliation and data-quality workflows using Airflow DAGs, dbt tests, SQL validation queries, and Python reporting scripts. Built automated checks that compared source-file counts against loaded records, detected orphaned inspections, flagged permits without required parcels or applicants, and identified workflow states that violated municipal business rules.
•Improved PostgreSQL query and reporting performance across permit search, parcel lookup, inspection dashboards, and municipal exports using EXPLAIN plans, indexing, query restructuring, pagination, selective joins, materialized views, Redis caching, and Airflow-scheduled refreshes, reducing selected dashboard and search queries from 10+ seconds to 2–3 seconds.
•Built backend APIs and internal service communication layers using REST APIs and gRPC. Used REST for frontend and customer-facing workflows, and gRPC-style service contracts for internal communication between workflow services, reporting services, import services, and integration workers where typed request/response models improved maintainability.
•Implemented asynchronous processing for bulk imports, document generation, report generation, scheduled notifications, integration retries, and nightly reconciliation using Apache Kafka, Airflow, Redis queues, RabbitMQ-style workers, PostgreSQL job state tables, and Dockerized worker services. Added retry policies, idempotency keys, dead-letter handling, and detailed job status tracking to make long-running workloads safe to rerun.
•Added observability across backend services, Kafka consumers, Airflow DAGs, Spark jobs, dbt transformations, and background workers using Prometheus, Grafana, structured logging, AWS monitoring, and application health checks. Built dashboards for API latency, Kafka consumer lag, Airflow DAG failures, ETL job duration, failed import rows, Spark job runtime, database query performance, and worker retry rates, reducing production investigation time by approximately 30–40%.
•Built internal admin and data-operations tools using React, TypeScript, REST APIs, PostgreSQL, and reporting services to help support and implementation teams inspect customer configuration, imported records, failed ETL jobs, Kafka processing history, Airflow run status, validation errors, and reconciliation results without direct database access.
12/2021 – 10/2024
Rockville, Maryland, United States
•Built backend and data-ingestion services for connected-device registration, customer onboarding, product ownership, warranty tracking, device status, device activity, and customer support workflows using Python, Go, TypeScript, PostgreSQL, Redis, Apache Kafka, REST APIs, Docker, and AWS, supporting customer, product, order, and device lifecycle data across multiple internal systems.
•Designed Kafka-based device-event ingestion pipelines using Apache Kafka, Python consumers, PostgreSQL, Redis, and background workers, processing 1M+ device and customer lifecycle events per month. Normalized raw device payloads, validated identifiers, enriched records with customer and product metadata, and stored structured events for analytics, lifecycle messaging, customer support, and operational reporting.
•Built streaming-style event workflows for customer and device lifecycle events including account created, product registered, device activated, device disconnected, warranty activated, order synchronized, shipment updated, support case opened, and notification sent. Published events into Kafka topics and consumed them through downstream services for reporting, notification orchestration, operational dashboards, and warehouse ingestion.
•Developed ETL and ELT workflows using Apache Airflow, Python, SQL, PostgreSQL, cloud storage, Snowflake, and BigQuery. Created scheduled pipelines for customer analytics, device activity summaries, order behavior, product usage, support operations, notification performance, and order-to-registration conversion tracking, reducing recurring manual reporting work by 10+ hours per week.
•Built dbt transformation models to organize analytics data into reliable reporting layers using dbt, SQL, Snowflake / BigQuery, staging models, intermediate models, marts, incremental models, and dbt tests. Transformed raw device events, customer records, order events, notification events, and support activity into analytics-ready datasets used by product, operations, and support teams.
•Used Apache Spark / PySpark for high-volume device-event processing and historical backfills. Spark jobs processed large batches of raw device telemetry and operational events from cloud storage, deduplicated repeated messages, normalized timestamps, joined events with product and customer dimensions, and produced aggregated datasets for reporting and support dashboards.
•Implemented data-quality checks across ingestion and warehouse pipelines using Airflow sensors, dbt tests, SQL validation queries, and Python data checks. Validated event volume, duplicate rates, missing device IDs, invalid customer mappings, delayed Kafka messages, null-heavy fields, unexpected schema changes, and order-to-registration mismatches.
•Integrated e-commerce, fulfillment, CRM, notification, and analytics platforms using REST APIs, webhooks, Kafka producers/consumers, gRPC-style internal services, PostgreSQL synchronization tables, and retry-safe background jobs. Added idempotency handling for webhook retries, duplicate shipment updates, third-party API failures, and partial order synchronization.
•Refactored expensive synchronous workflows into asynchronous data-processing pipelines. Moved device-event enrichment, third-party synchronization, notification dispatch, analytics aggregation, and warehouse loads into Kafka consumers, Airflow DAGs, Redis queues, RabbitMQ-style workers, and Python/Go background services, reducing API timeout and retry-related issues by approximately 35–45% across selected workflows.
•Designed reporting data models for support and operations teams using PostgreSQL, Snowflake, BigQuery, dbt, and SQL. Built datasets for active devices, failed device events, daily registrations, product usage, customer engagement, notification delivery, warranty activity, order-to-device matching, and support-case context.
•Built internal support and operations dashboards using React, TypeScript, REST APIs, PostgreSQL, and reporting services. These tools allowed non-engineering teams to search customers, inspect device registration history, review event-processing status, view order synchronization state, verify notification history, and troubleshoot failed data pipeline jobs, reducing support investigation time by approximately 30%.
•Added observability across APIs, Kafka consumers, Airflow DAGs, Spark jobs, dbt transformations, data pipelines, and worker services using Prometheus, Grafana, structured logging, AWS monitoring, application health checks, and alerting rules. Built dashboards for API latency, Kafka consumer lag, device-event throughput, Airflow task failures, Spark job duration, dbt test failures, worker retry rates, failed webhook counts, and database query performance.
•Implemented automated tests for backend services, data transformations, event consumers, and frontend workflows using Pytest, Jest, Cypress, dbt tests, API contract tests, integration tests, and CI/CD pipeline checks. Added regression coverage around device registration, customer onboarding, order synchronization, Kafka event handling, Airflow pipeline logic, dbt models, and internal admin tools.
07/2016 – 12/2021
Rockville, Maryland, United States
•Developed backend and data-oriented services for high-traffic e-commerce workflows including customer accounts, product configuration, saved designs, pricing-adjacent services, domain-related products, small business tools, internal tools, and operational platforms using Node.js, JavaScript, TypeScript, Python, Go, PostgreSQL, MySQL, MongoDB, Redis, REST APIs, Docker, and cloud infrastructure.
•Built data pipelines for customer behavior, product usage, design activity, domain search behavior, marketing attribution, experimentation analysis, and operational reporting using Python, SQL, scheduled ETL jobs, Apache Airflow-style orchestration, cloud storage, Snowflake / BigQuery-style warehouse environments, and internal data platforms, supporting reporting over multi-million-row e-commerce datasets.
•Supported event-driven analytics workflows using Apache Kafka / Kafka-style event streams, RabbitMQ/message queues, backend event APIs, and batch-processing jobs. Published and consumed customer activity, product events, design project events, domain search events, checkout-adjacent activity, and campaign interaction events so downstream analytics and reporting systems had consistent behavioral data.
•Built SQL transformation logic for analytics datasets using dbt-style modeling, warehouse staging tables, incremental transformations, and data marts. Organized raw application events into cleaned staging models and reporting-ready datasets for customer engagement, product adoption, domain funnel behavior, marketing attribution, experimentation analysis, and operational reporting.
•Worked with warehouse-style analytics environments such as Snowflake and BigQuery to support large-scale e-commerce reporting. Built SQL transformations and aggregation tables for customer journeys, product usage, conversion funnels, domain search activity, campaign performance, and operational dashboards. Used partitioning, clustering, incremental loads, and pre-aggregated marts where appropriate to improve reporting performance.
•Used Apache Spark / PySpark for large-scale batch processing and historical backfills involving customer events, product activity, marketing data, and domain-related behavior. Built or supported Spark jobs that read raw event data from cloud storage, cleaned and deduplicated records, joined event streams with customer and product dimensions, and produced aggregated outputs for warehouse loading and analytics use cases.
•Implemented Airflow-style orchestration for recurring data workloads including extraction from application databases, transformation of event data, warehouse loading, validation checks, and stakeholder-facing reporting refreshes. Defined task dependencies, retries, alerting behavior, and failure handling so daily and hourly data jobs could run reliably.
•Added data-quality checks for analytics and reporting pipelines using SQL validation queries, dbt-style tests, Python checks, and Airflow task validations. Checked for missing event fields, duplicate customer events, abnormal event-volume drops, delayed source data, invalid product IDs, broken campaign mappings, inconsistent attribution records, and incomplete warehouse loads.
•Integrated backend systems using REST APIs, GraphQL, gRPC-style internal service contracts, message queues, and service adapters across commerce, catalog, pricing, identity, design storage, domain-service, order-management, analytics, and content platforms.
•Improved customer-facing and data-heavy e-commerce flows through SQL optimization, Redis caching, API response shaping, batched service calls, background processing, warehouse pre-aggregation, and event-driven data updates, reducing selected API and reporting latencies by approximately 30–50%.
•Built internal tools for support, marketing, operations, analytics, and engineering teams using React, TypeScript, Node.js, Python, REST APIs, PostgreSQL/MySQL, and reporting data services. These tools helped internal users inspect customer state, validate product configuration, review domain-service data, troubleshoot order-adjacent issues, monitor data-processing jobs, and analyze customer behavior.
•Added observability for backend services, event-processing workflows, Airflow-style jobs, Spark jobs, and batch pipelines using Prometheus, Grafana, structured logs, service metrics, CI/CD checks, and cloud monitoring tools. Tracked API latency, error rates, event-processing failures, job duration, queue depth, data freshness, warehouse load status, and database query performance.
•Supported service migration and API modernization efforts by extracting backend logic from tightly coupled systems into smaller service-oriented components. Used Node.js, Python, Go, REST APIs, Docker, CI/CD, Redis, SQL databases, Kafka/message queues, and cloud deployment patterns to separate application logic, integration logic, and data-processing workflows across multiple services.
Bachelor's Degree / Computer Science
08/2010 – 08/2014
Charlottesville, Virginia