Hi,
I'm building ANF (Autonomous Native Forge), a cloud-free, four-agent
autonomous software production pipeline that runs on local hardware
with local LLM inference. No middleware; pure native Node.js.
It currently runs on an NVIDIA Blackwell GB10 with vLLM + DeepSeek-R1-32B,
and I'm now porting it to Apple Silicon.
Three technical questions:
1. How production-ready is mlx-lm's OpenAI-compatible API server
   for long-context generation (32K tokens)?
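For context on the server question, this is roughly how I plan to stand it up on the Mac. A sketch only: the model repo is a placeholder for whatever 4-bit DeepSeek-R1 distill conversion I end up using, and the flags should be checked against `mlx_lm.server --help` for the installed version.

```shell
# Sketch: launch mlx-lm's OpenAI-compatible server locally.
# Model path is a placeholder; verify flags with `mlx_lm.server --help`.
python -m mlx_lm.server \
  --model mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit \
  --port 8080
```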
2. What's the recommended approach to KV cache management
   under the Unified Memory architecture? Are there specific flags
   or configurations for an M4 Ultra?
3. MLX vs. GGUF (llama.cpp) for a multi-agent pipeline
   where four agents call the inference endpoint concurrently:
   which handles parallel requests better on Apple Silicon?
GitHub: github.com/trgysvc/AutonomousNativeForge
Any guidance appreciated.