LLM Systems & Agentic Infrastructure
ANKBYTE
Job description
Senior AI Engineer
(LLM Systems & Agentic Infrastructure)
Company Description
Ankbyte is advancing the future of intelligent businesses with its innovative Agentic AI Factory Model. With operations in the United States and Mexico, the company specializes in digital coworkers, enterprise RAG systems, computer vision pipelines, and custom LLM-powered applications.
Overview
Design and implement large-scale AI systems involving LLM inference, retrieval architectures, and agent-based orchestration. The role focuses on building production-grade AI services, knowledge retrieval pipelines, and distributed inference systems.
The engineer will work on core infrastructure including embedding pipelines, vector retrieval systems, agent execution frameworks, and real-time AI APIs.
Responsibilities
LLM Application Systems
- Implement LLM-powered services using transformer-based models.
- Develop prompt pipelines and structured reasoning workflows.
- Implement context management and multi-turn dialogue state handling.
- Build retrieval pipelines for unstructured and semi-structured data.
- Implement embedding generation pipelines and vector indexing.
- Optimize similarity search and hybrid retrieval systems.
Agent Execution Frameworks
- Implement agent orchestration pipelines using frameworks such as LangGraph, LangChain, AutoGen, or custom agent runtimes.
- Develop modular tools and execution chains for agent workflows.
- Implement data ingestion and preprocessing pipelines.
- Develop batch and streaming pipelines for knowledge ingestion.
- Build feature extraction and embedding generation pipelines.
- Build scalable AI APIs using Python-based microservices.
- Implement REST / gRPC interfaces for inference services.
- Optimize service latency and concurrency performance.
- Implement logging and tracing for AI interactions.
- Develop evaluation pipelines for retrieval accuracy and response quality.
- Monitor performance metrics including latency, throughput, and model accuracy.
Technical Skills
Required
- Strong programming skills in Python
- Experience with LLM frameworks
- Experience with RAG architectures
- Experience with vector databases
- Experience with distributed systems
- Experience with REST/gRPC APIs
- Experience with containerized deployment
- Python
- PyTorch / TensorFlow
- LangChain / LangGraph
- FastAPI
- Docker / Kubernetes
- PostgreSQL / Redis
- Vector DB systems
Preferred Experience
- Large-scale ML systems
- Real-time inference systems
- Multi-agent AI frameworks
- High-throughput distributed systems
- ML infrastructure platforms