LlaDash API Documentation

The LlaDash API is an LLM routing API that requires no authentication key. For each request it combines several models, served through Groq's inference engine, to generate an answer.

[Access Restriction] Requests whose User-Agent is shorter than 10 characters, or that come from common scripting clients such as curl, python, or node, are blocked with 403 Forbidden. Set a custom User-Agent on every request.
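
The restriction above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's actual rule: it assumes blocked clients are detected by a case-insensitive substring match on the three names listed, and `isLikelyAllowedUserAgent`, `BLOCKED_TOKENS`, and `MIN_USER_AGENT_LENGTH` are hypothetical names introduced for illustration.

```javascript
// Hypothetical pre-flight check mirroring the documented restriction.
// Assumption: blocked clients are matched by case-insensitive substring;
// the real server-side rule and full blocklist are not documented.
const BLOCKED_TOKENS = ['curl', 'python', 'node'];
const MIN_USER_AGENT_LENGTH = 10;

function isLikelyAllowedUserAgent(userAgent) {
  // Reject missing values and anything shorter than 10 characters.
  if (typeof userAgent !== 'string' || userAgent.length < MIN_USER_AGENT_LENGTH) {
    return false;
  }
  const lower = userAgent.toLowerCase();
  return !BLOCKED_TOKENS.some((token) => lower.includes(token));
}

console.log(isLikelyAllowedUserAgent('MyAwesomeApp/1.0')); // true
console.log(isLikelyAllowedUserAgent('curl/8.4.0'));       // false
```

A passing check does not guarantee acceptance, since the server may apply further rules that are not described here.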

Architecture Overview

Up to 4 models work together per request:
1. Safety Guard (Llama-Prompt-Guard-2): Verifies prompt safety
2. Router (Llama-3.1-8b): Judges complexity & generates titles
3. Thinker (Qwen3-32b): Generates internal reasoning
4. Responder (Llama-4-Scout-17b): Generates final answer
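
One way to picture how the four roles listed above might fit together is sketched below. This is not the service's implementation: every stage is a caller-supplied stand-in, `runPipeline` and the `'blocked'` status are hypothetical, and the idea that the Thinker runs only when the Router judges a prompt complex is an assumption inferred from the phrase "up to 4 models".

```javascript
// Hypothetical orchestration of the four roles; all stages are caller-supplied stubs.
// Assumption: the Thinker is skipped for prompts the Router judges simple.
async function runPipeline(prompt, stages) {
  const { guard, router, thinker, responder } = stages;

  // 1. Safety Guard: stop early if the prompt is judged unsafe.
  if (!(await guard(prompt))) {
    return { status: 'blocked' }; // illustrative status, not a documented value
  }

  // 2. Router: judge complexity and, optionally, produce a title.
  const { complex, title } = await router(prompt);

  // 3. Thinker: internal reasoning, only for complex prompts (assumed).
  const thinking = complex ? await thinker(prompt) : null;

  // 4. Responder: final answer, given the reasoning if there is any.
  const answer = await responder(prompt, thinking);

  return { status: 'success', title, thinking, answer };
}
```

Passing the stages in as functions keeps the control flow visible and makes it possible to exercise the sketch with simple stubs.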

Text Generation Endpoint

Send a prompt and receive a generated answer. The endpoint accepts both GET and POST requests.

GET | POST /ai/{prompt}

Query Parameters

Name    Type     Description
prompt  String   Max length is 500 chars. Exceeding this returns 400 Bad Request.
json    Boolean  Returns results in JSON, including thinking process and title if generated.
xml     Boolean  Returns results in XML format.
stream  Boolean  Streams the response in real-time using Server-Sent Events (SSE).
title   Boolean  Generates a short, localized title alongside the answer.
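
The parameters above can be assembled into a request URL with a small helper. This is a sketch under stated assumptions: `buildRequestUrl` is a hypothetical name, the base URL is taken from the example request in this document, the 500-character limit is assumed to apply to the unencoded prompt, and flags are sent as `true` as in that example.

```javascript
// Hypothetical helper; BASE_URL comes from the example request in this document.
const BASE_URL = 'https://api.ndnx.workers.dev/ai/';
const MAX_PROMPT_LENGTH = 500;

function buildRequestUrl(prompt, options = {}) {
  // Assumption: the 500-character limit applies to the unencoded prompt.
  if (prompt.length > MAX_PROMPT_LENGTH) {
    throw new RangeError(`Prompt exceeds ${MAX_PROMPT_LENGTH} characters`);
  }
  // Only flags that are switched on are added to the query string.
  const query = new URLSearchParams();
  for (const flag of ['json', 'xml', 'stream', 'title']) {
    if (options[flag]) query.set(flag, 'true');
  }
  const queryString = query.toString();
  // Encode the prompt so spaces, "?" and "&" stay inside the path segment.
  return `${BASE_URL}${encodeURIComponent(prompt)}${queryString ? `?${queryString}` : ''}`;
}

console.log(buildRequestUrl('What is AI?', { json: true, title: true }));
// https://api.ndnx.workers.dev/ai/What%20is%20AI%3F?json=true&title=true
```

Checking the length locally avoids a round trip that would end in 400 Bad Request.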

Example Request (JSON + Title)

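// Encode the prompt so spaces and "?" are sent as part of the path segment.
const prompt = encodeURIComponent('What is AI?');
const url = `https://api.ndnx.workers.dev/ai/${prompt}?json=true&title=true`;
const response = await fetch(url, {
  // A custom User-Agent of at least 10 characters is required (see Access Restriction).
  headers: { 'User-Agent': 'MyAwesomeApp/1.0' }
});
const data = await response.json();
console.log(data);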

Example Response

{
  "status": "success",
  "model": "meta-llama/llama-4-scout-17b-16e-instruct",
  "thinking": "The user is asking for a definition of AI...",
  "title": "AIの定義",
  "answer": "人工知能(AI)とは..."
}
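
When the stream parameter is used instead of json, the response arrives as Server-Sent Events. The payload of each event is not documented here, so the sketch below only handles the generic SSE framing (events separated by a blank line, payload carried in `data:` lines) and passes each payload to a callback as raw text. `parseSseChunk` and `streamAnswer` are hypothetical names, and a runtime with `fetch` and readable response streams (for example Node 18 or later) is assumed.

```javascript
// Minimal SSE framing parser: returns the data payload of each complete event,
// plus any trailing partial event that must be prepended to the next chunk.
function parseSseChunk(buffer) {
  const parts = buffer.replace(/\r\n/g, '\n').split('\n\n');
  const rest = parts.pop(); // incomplete event, possibly empty
  const events = [];
  for (const part of parts) {
    const dataLines = part
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, '')); // drop "data:" and one leading space
    if (dataLines.length > 0) events.push(dataLines.join('\n'));
  }
  return { events, rest };
}

// Reads the streamed body chunk by chunk and hands each event payload to onData.
async function streamAnswer(url, onData) {
  const response = await fetch(url, { headers: { 'User-Agent': 'MyAwesomeApp/1.0' } });
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let pending = '';
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    const { events, rest } = parseSseChunk(pending + decoder.decode(value, { stream: true }));
    pending = rest;
    events.forEach(onData);
  }
}
```

Keeping the unparsed remainder between reads matters because a network chunk can end in the middle of an event.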

Rate Limit Information

Returns the current global queue status and the rate-limit state of each underlying model.

GET /ai/rate

Rate Limits

  • Global Queue Limit: Max 20 requests per minute
  • Llama-Prompt-Guard: 30 RPM / 15,000 TPM
  • Router (Llama-3.1-8b): 30 RPM / 131,072 TPM
  • Thinker (Qwen3-32b): 30 RPM / 6,000 TPM
  • Responder (Llama-4-Scout): 30 RPM / 131,072 TPM
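
A client can stay under the 20 requests per minute figure by throttling itself. The sketch below uses a sliding window; `SlidingWindowLimiter` is a hypothetical class, and whether the server counts requests in a sliding or a fixed window is not documented. Because the queue limit is global, other callers consume the same budget, so local throttling reduces 429 responses but cannot rule them out.

```javascript
// Sliding-window limiter: allows at most `limit` calls in any `windowMs` interval.
// Assumption: defaults mirror the documented 20 requests per minute.
class SlidingWindowLimiter {
  constructor(limit = 20, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // times of accepted calls, oldest first
  }

  // Returns true and records the call if a slot is free at time `now` (in ms).
  tryAcquire(now = Date.now()) {
    // Drop timestamps that have left the window.
    while (this.timestamps.length > 0 && now - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}

const limiter = new SlidingWindowLimiter();
if (limiter.tryAcquire()) {
  // Safe to send a request now; otherwise wait and try again later.
}
```

Accepting `now` as a parameter keeps the class independent of the system clock, which makes its behavior easy to verify.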

Error Responses

  • 400 Bad Request: Prompt too long or invalid request.
  • 403 Forbidden: Bot detected, missing User-Agent, or security block.
  • 429 Too Many Requests: Global queue limit (20/min) exceeded.
  • 500 Internal Server Error: Backend API error or timeout.
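
The status codes above suggest two kinds of failure: 400 and 403 will not succeed on retry, while 429 and 500 may be transient. The sketch below retries only the latter with exponential backoff. It is one possible client policy, not something the API prescribes; `requestWithRetry`, its option names, and the delay values are hypothetical, and `fetchImpl` and `sleep` are injectable so the logic can be exercised without network access.

```javascript
// Hypothetical wrapper: retries 429 and 500 with exponential backoff,
// and fails immediately on other error statuses such as 400 and 403.
async function requestWithRetry(url, {
  fetchImpl = fetch,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  maxAttempts = 3,
  baseDelayMs = 1000,
} = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(url, {
      headers: { 'User-Agent': 'MyAwesomeApp/1.0' },
    });
    if (response.ok) return response;

    const retryable = response.status === 429 || response.status === 500;
    if (!retryable || attempt === maxAttempts) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    // Wait 1 s, then 2 s, then 4 s, ... before the next attempt.
    await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
}
```

With the defaults, a request that keeps returning 429 is attempted three times over roughly three seconds before the error is raised.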