
AI Software Engineer

Technology
DiscreteStack
Sofia, Bulgaria · Posted 1 month ago · Apply by 20.05.2026
Full-time

Job Description

DiscreteStack is a deep-tech company building a private AI operating system for enterprise environments.

We give businesses their own AI — running on their infrastructure, under their control, at predictable cost. Our platform handles everything from high-performance LLM inference with smart scheduling and caching, to identity management, observability, enterprise system connectors, and a catalog of ready-to-use AI applications.

We work across the full stack — from GPU scheduling and KV-cache optimization to control plane, identity management, and application delivery.

Please read carefully below because we truly mean what we say.

We're looking for an AI Software Engineer to join DiscreteStack in Sofia. You'll work across the full stack of our private AI operating system — from GPU-level inference optimization and model serving to control plane services, API layers, and enterprise integrations. If you're the kind of engineer who's equally comfortable debugging a CUDA kernel and wiring up an API gateway, this is your role.

The right person is technically sharp, learns fast, and doesn't need hand-holding to get things done.

Key Responsibilities

As an AI Software Engineer, you will:

  • Build and maintain core platform components — inference runtime, dynamic queue scheduling, identity and access control, observability.
  • Work with LLM serving infrastructure: vLLM / KTransformers / SGLang, model quantization, KV-cache optimization, request scheduling.
  • Develop and integrate enterprise connectors (MCP, A2A, or shell-based) to systems of record like SAP, Salesforce, and internal client platforms.
  • Leverage AI coding tools to write production-grade code in Python, Go, or other technologies — with a bias toward simplicity and reliability.
  • Work with Linux systems, networking, and containerized deployments — our platform runs on bare metal and sovereign cloud.
  • Contribute to platform packaging, installation tooling, and system update workflows for on-premise environments.
  • Debug across layers — from GPU utilization and model behavior to HTTP routing and auth flows.

Who We’re Looking For

We don’t expect you to have everything figured out yet – we’re more interested in your attitude and potential! Here’s what we value:

  • Bachelor's or Master's degree in Computer Science plus 3 years of experience, or 5 years of experience without a degree; in both cases with real production deployments, not just prototypes.
  • Strong programming skills with production experience in Python, Java, Go, or C — we care more about systems-level depth than framework familiarity. If you've built infrastructure, tooling, or backend services that run in production, that matters more than which language you used.
  • Hands-on experience with Linux systems administration — you're comfortable in a terminal and not afraid of systemd, networking, or bare metal.
  • Familiarity with LLM inference stacks (vLLM, SGLang, KTransformers), or a strong willingness to learn fast.
  • Experience with containerization and deployment tooling — Docker, systemd, CI/CD pipelines.
  • You use AI coding tools daily and know how to get reliable output from them — not just accept whatever comes out.
  • Solid understanding of APIs, networking fundamentals, and distributed systems basics.
  • English proficiency (B2, spoken and written).

Bonus points if you have:

  • Experience working directly with founders or C-level as the delivery counterpart.
  • Familiarity with engineering delivery workflows (GitHub, CI/CD, sprint boards).
  • Understanding of AI/LLM core concepts — even at a high level.

Interview process

  • Meeting the decision makers (up to two interviews)
  • Feedback within two calendar weeks

What You Can Expect From Us

  • We Work in Office. We're a small team building something hard. That requires being in the same room, not on the same Slack channel. You'll commute to the office every day — and so will everyone else.
  • Direct Impact. This is a startup, not a department. What you ship next week matters next week. Your work directly shapes how the product gets to market.
  • Founder Access. You'll work directly with the CEO — no layers, no filters. You'll know the full picture and have a say in how things move.
  • Autonomy & Ownership. You own delivery — the cadence, the communication, the outcomes. But key decisions stay close to the founder. Expect high trust and high standards in equal measure.
  • Hard Technical Problems. GPU inference optimization, multi-actor scheduling, on-premise deployment in air-gapped environments — this isn't CRUD work.

If you’re excited to advance your career as an AI engineer and think DiscreteStack could be the right place for you, we’d love to hear from you!

