Senior AI Engineer (LLM Systems & Agentic Infrastructure)

Company Description

Ankbyte is advancing the future of intelligent businesses with its innovative Agentic AI Factory Model. With operations in the United States and Mexico, the company specializes in digital coworkers, enterprise RAG systems, computer vision pipelines, and custom LLM-powered applications.

Overview

Design and implement large-scale AI systems involving LLM inference, retrieval architectures, and agent-based orchestration. The role focuses on building production-grade AI services, knowledge retrieval pipelines, and distributed inference systems.

The engineer will work on core infrastructure including embedding pipelines, vector retrieval systems, agent execution frameworks, and real-time AI APIs.

Responsibilities

LLM Application Systems

Implement LLM-powered services using transformer-based models.
Develop prompt pipelines and structured reasoning workflows.
Implement context management and multi-turn dialogue state handling.

Retrieval Systems

Build retrieval pipelines for unstructured and semi-structured data.
Implement embedding generation pipelines and vector indexing.
Optimize similarity search and hybrid retrieval systems.

Agent Execution Frameworks

Implement agent orchestration pipelines using frameworks such as LangGraph, LangChain, AutoGen, or custom agent runtimes.
Develop modular tools and execution chains for agent workflows.

Data Processing Pipelines

Implement data ingestion and preprocessing pipelines.
Develop batch and streaming pipelines for knowledge ingestion.
Build feature extraction and embedding generation pipelines.

API & Service Development

Build scalable AI APIs using Python-based microservices.
Implement REST/gRPC interfaces for inference services.
Optimize service latency and concurrency performance.

Observability & Evaluation

Implement logging and tracing for AI interactions.
Develop evaluation pipelines for retrieval accuracy and response quality.
Monitor performance metrics including latency, throughput, and model accuracy.

Technical Skills

Required

Strong programming skills in Python
Experience with LLM frameworks
Experience with RAG architectures
Experience with vector databases
Experience with distributed systems
Experience with REST/gRPC APIs
Experience with containerized deployment

Technologies

Python
PyTorch / TensorFlow
LangChain / LangGraph
FastAPI
Docker / Kubernetes
PostgreSQL / Redis
Vector DB systems

Preferred Experience

Large-scale ML systems
Real-time inference systems
Multi-agent AI frameworks
High-throughput distributed systems
ML infrastructure platforms

Job Description