AI Kernel Writer
BeBee
Design and implement high-performance compute kernels for AI primitives such as GEMM, attention, normalization, and convolution. Optimize for throughput, latency, and memory hierarchy across heterogeneous compute units (SIMD, matrix engines, DMA). Collaborate with compiler and runtime teams to integ