xAI Brings Grok to Kilo Code — AI Dev Pulse · May 28, 2026

At a glance

- xAI rolled out native Grok support in Kilo Code on May 27, letting SuperGrok and X Premium+ users run agentic coding workflows directly in the open-source VS Code platform.
- Grok Build CLI launched May 25 with inline diffs, plan viewer, and parallel subagents for structured coding sessions.
- Qwen3-Coder-Next (80B MoE) continues gaining traction for local agentic coding, with fresh Ollama and Hugging Face distribution updates surfacing this week.
- Gemini 3.5 Flash reached general availability earlier in May, delivering 4× speed over comparable models at competitive pricing for high-volume dev workloads.

Introduction

The last 48 hours delivered targeted developer tooling updates rather than another frontier model drop. xAI’s moves into open-source coding agents and CLI workflows stand out, giving engineers direct access to Grok inside production-grade environments like Kilo Code. These releases emphasize practical integration over raw benchmark chasing: OAuth-based access, visible reasoning traces, and subagent orchestration that map cleanly to real agentic pipelines. Meanwhile, established coding-specialized models like Qwen3-Coder-Next keep maturing for local-first setups, and Google’s lighter Gemini variant solidifies its place for cost-sensitive inference. For builders, the signal is clear: the focus has shifted to reliable tool-calling loops, state management, and subscription-friendly access models that reduce friction in daily workflows.

Top Stories

xAI brings Grok models to Kilo Code
Practical dev impact: Developers can now authenticate with SuperGrok or X Premium+ subscriptions via OAuth inside the open-source Kilo Code extension and CLI, running Grok directly for agentic coding without separate API keys or per-token billing.

Grok Build CLI enters early access
Practical dev impact: The new CLI surfaces inline diffs, a plan viewer, and parallel subagent execution, enabling structured multi-step coding sessions with visible reasoning that integrates cleanly into existing Git workflows.

Qwen3-Coder-Next sees wider local distribution
Practical dev impact: The 80B MoE coding model (3B active params, 256K context) optimized for long-horizon agentic tasks is now more readily available via updated Ollama and Hugging Face channels, supporting self-hosted tool-calling agents with strong recovery from execution failures.

Gemini 3.5 Flash reaches GA
Practical dev impact: Google’s lightweight model is now generally available with frontier-level intelligence at 4× the speed of peers, 1M context, and pricing of $1.50/$9 per million tokens, making it immediately usable for high-throughput coding and retrieval tasks.
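
To make the quoted pricing concrete, here is a rough cost estimate. It assumes the $1.50/$9 figures are the input and output prices per million tokens respectively; that split is an assumption, since the item above does not spell it out. The function name and the example workload are illustrative only.

```python
from decimal import Decimal

# Assumed reading of the quoted pricing: USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("1.50")   # assumption: input-token price
OUTPUT_PRICE_PER_M = Decimal("9.00")  # assumption: output-token price
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of a workload at the assumed prices."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical daily workload: 10,000 requests, each with
    # 2,000 input tokens and 500 output tokens.
    requests = 10_000
    daily = estimate_cost_usd(requests * 2_000, requests * 500)
    print(f"Estimated daily cost: ${daily:.2f}")
```

Under these assumptions, output tokens dominate the bill even though there are four times fewer of them, which is worth keeping in mind when agents produce long diffs or verbose reasoning.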

Practical Impact Analysis

These updates lower the barrier to running capable models inside familiar dev environments. Kilo Code’s OAuth integration means teams already paying for X Premium+ or SuperGrok can experiment with Grok agents without new procurement or billing overhead. Grok Build’s emphasis on visible plans and subagents directly addresses pain points in multi-step agent reliability: developers can inspect and steer execution rather than treating the model as a black box. On the open-source side, Qwen3-Coder-Next’s continued distribution momentum shows demand for efficient local MoE models that rival closed alternatives on agentic benchmarks while running on consumer or on-prem hardware. Gemini 3.5 Flash GA reinforces the trend toward tiered inference: heavy reasoning stays on frontier models while lighter, faster variants handle volume. Collectively, the week’s news rewards engineers who prioritize observable tool use, subscription leverage, and hybrid local/cloud setups over chasing the next leaderboard jump.
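
The tiered-inference and hybrid local/cloud ideas can be sketched as a small router with fallback. This is a minimal illustration, not any vendor's API: the tier names, the prompt-length heuristic, and the backends are all hypothetical, and each backend is a plain callable so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A backend is any callable that takes a prompt and returns text.
# In practice it would wrap a cloud or local model client (not shown).
Backend = Callable[[str], str]


@dataclass
class Tier:
    name: str
    backend: Backend


def pick_tiers(prompt: str, heavy: Tier, light: Tier, local: Tier,
               heavy_threshold_chars: int = 2_000) -> list[Tier]:
    """Order tiers for a request: long prompts try the heavy tier first,
    short ones the light tier; the local tier is always the last resort.
    Prompt length is a crude stand-in for a real complexity signal."""
    if len(prompt) >= heavy_threshold_chars:
        primary, secondary = heavy, light
    else:
        primary, secondary = light, heavy
    return [primary, secondary, local]


def run_with_fallback(prompt: str, tiers: Sequence[Tier]) -> tuple[str, str]:
    """Try each tier in order; return (tier_name, output) from the first success."""
    errors = []
    for tier in tiers:
        try:
            return tier.name, tier.backend(prompt)
        except Exception as exc:  # broad on purpose: any backend failure triggers fallback
            errors.append(f"{tier.name}: {exc}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))


if __name__ == "__main__":
    def failing_cloud(prompt: str) -> str:
        raise ConnectionError("simulated outage")

    heavy = Tier("frontier", failing_cloud)
    light = Tier("flash", failing_cloud)
    local = Tier("local", lambda p: f"[local model] handled: {p}")

    prompt = "rename this variable"
    name, output = run_with_fallback(prompt, pick_tiers(prompt, heavy, light, local))
    print(name, "->", output)
```

Keeping the routing decision separate from the fallback loop makes it easy to swap the length heuristic for something better (task type, token count, past failure rate) without touching error handling.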

Recommended Tutorial Idea

Build a simple agentic coding loop with Grok in Kilo Code

1. Install the latest Kilo Code extension in VS Code.
2. Open the command palette and select “Connect AI Provider → xAI”.
3. Sign in with your SuperGrok or X Premium+ account (OAuth flow).
4. Create a new workspace and enable “Agent Mode”.
5. Prompt the agent: “Refactor the current file to add error handling around the API call, then run tests and propose a follow-up commit.”

Recommended Tutorial Implementation (Python)
# Example: minimal agent loop using xAI via the OpenAI-compatible client
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.x.ai/v1",
    # Read the key from the environment rather than hard-coding it.
    # The variable name XAI_API_KEY is an assumption (or use an OAuth session token).
    api_key=os.environ["XAI_API_KEY"],
)

response = client.chat.completions.create(
    model="grok-4",  # or grok-code-fast variant
    messages=[{"role": "user", "content": "Write a Python function to fetch user data with retries and logging."}],
    tools=[...],  # placeholder: replace with your code execution / file tool schemas before running
    tool_choice="auto",
)


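# content is None when the model answers with tool calls instead of text
message = response.choices[0].message
print(message.content or message.tool_calls)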
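
The snippet above makes a single request. An agent loop keeps calling the model, runs whatever tool it asks for, and feeds the result back until the model returns a plain-text answer. The following is a minimal offline sketch of that loop, not xAI's or Kilo Code's implementation: the model call is injected as a callable, the message shape is a simplified take on the chat format, and `fake_model` and `word_count` are hypothetical stand-ins.

```python
import json
from typing import Callable


def word_count(text: str) -> int:
    """Hypothetical tool; stands in for real code-execution or file tools."""
    return len(text.split())


# Tool registry: tool name -> Python function.
TOOLS: dict[str, Callable[..., object]] = {"word_count": word_count}


def run_agent(call_model: Callable[[list[dict]], dict], user_prompt: str,
              max_steps: int = 5) -> str:
    """Call the model repeatedly, executing any requested tool and appending
    its result to the conversation, until the model returns plain text."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append(reply)
        tool_call = reply.get("tool_call")
        if tool_call is None:
            return reply["content"]  # final answer
        try:
            args = json.loads(tool_call["arguments"])
            result = TOOLS[tool_call["name"]](**args)
        except Exception as exc:
            # Report the failure back so the model can recover or retry.
            result = f"tool error: {exc!r}"
        messages.append({"role": "tool", "name": tool_call["name"], "content": str(result)})
    raise RuntimeError(f"no final answer after {max_steps} steps")


def fake_model(messages: list[dict]) -> dict:
    """Offline stand-in for a chat-completions call: requests one tool call,
    then answers using the tool result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"The text has {last['content']} words."}
    return {"role": "assistant", "content": None,
            "tool_call": {"name": "word_count",
                          "arguments": json.dumps({"text": "retry with exponential backoff"})}}


if __name__ == "__main__":
    print(run_agent(fake_model, "How many words are in my note?"))
```

The `max_steps` cap and the error-to-message path are the two pieces most worth keeping when swapping `fake_model` for a real client: the first bounds runaway loops, the second lets the model see and react to tool failures.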

Grok Deep Dive

How are teams actually wiring Grok into production agent loops today? What patterns emerge when you combine visible subagent plans from Grok Build with local Qwen3-Coder-Next fallbacks, and which observability hooks (LangSmith, Langfuse, or custom) are proving most effective for debugging long-horizon coding agents?
