How to Use the Assisters API in 2026: Quick Start Guide for Devs

Updated December 26, 2025

Authentication

Assisters uses API keys for authentication. Include your key in every request via the Authorization header using the Bearer scheme.

bash

curl https://api.assisters.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Key Management

Create Keys: POST /v1/keys

json

  { "name": "dev-key-01" }

Rotate Keys: Create a replacement key, switch your services to it, then revoke the old key with DELETE /v1/keys/{key_id}.
Rate Limits: 1000 requests per minute per key. Exceeding this returns HTTP 429.

Best Practices

Store keys in environment variables (never in code).
Use separate keys for development, staging, and production.
Rotate keys every 90 days or after personnel changes.
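
As a minimal sketch of the first practice, the helper below reads the key from an environment variable and builds the Bearer header shown earlier. The variable name ASSISTERS_API_KEY and the helper build_request are assumptions made for illustration, not names defined by Assisters.

```python
import os
import urllib.request

API_BASE = "https://api.assisters.com/v1"


def build_request(path: str, env_var: str = "ASSISTERS_API_KEY") -> urllib.request.Request:
    """Build an authenticated request object without sending it.

    ASSUMPTION: the environment variable name is illustrative only.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early instead of sending an unauthenticated request
        raise RuntimeError(f"Set {env_var} before calling the API")
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing the result to urllib.request.urlopen would send it; using a different env_var per environment keeps development, staging, and production keys apart.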

Core Endpoints

Models

List available AI models and their capabilities.

Request

http

GET /v1/models

Response

json

{
  "models": [
    {
      "id": "gpt-4.1-mini",
      "name": "GPT-4.1 Mini",
      "max_tokens": 128000,
      "supports": ["chat", "embeddings", "reasoning"]
    }
  ]
}

Use Case: Select a model based on token limits or supported features.
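
One way to act on this use case is sketched below, assuming the response shape shown above. The function name pick_model and the tie-breaking rule (prefer the smallest token limit that still fits) are illustrative choices, not part of the API.

```python
from typing import Optional


def pick_model(models_response: dict, feature: str, min_tokens: int = 0) -> Optional[str]:
    """Return the ID of a model that supports `feature` and offers at least
    `min_tokens` of max_tokens, or None if nothing matches."""
    candidates = [
        m for m in models_response.get("models", [])
        if feature in m.get("supports", [])
        and m.get("max_tokens", 0) >= min_tokens
    ]
    if not candidates:
        return None
    # Design choice for this sketch: the smallest sufficient limit wins
    return min(candidates, key=lambda m: m.get("max_tokens", 0))["id"]


sample = {
    "models": [
        {
            "id": "gpt-4.1-mini",
            "name": "GPT-4.1 Mini",
            "max_tokens": 128000,
            "supports": ["chat", "embeddings", "reasoning"],
        }
    ]
}
chosen = pick_model(sample, feature="reasoning", min_tokens=32000)
```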

Chat Completions

Generate AI responses for chat interactions.

Request

http

POST /v1/chat/completions

Body

json

{
  "model": "gpt-4.1-mini",
  "messages": [
    { "role": "user", "content": "Explain quantum computing." }
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}

Response

json

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Quantum computing..."
      }
    }
  ]
}

Parameters

model: Required. Specify the model ID.
messages: Required. Array of { role, content } pairs (e.g., user, assistant).
temperature: Float (0–1). Lower = more deterministic.
max_tokens: Integer. Maximum response length.

Streaming Responses

Set stream: true to receive chunks as they’re generated.

javascript

fetch("https://api.assisters.com/v1/chat/completions", {
  method: "POST",
  headers: { "Authorization": "Bearer YOUR_KEY", "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4.1-mini", messages: [{ role: "user", content: "Hello" }], stream: true })
});
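
This guide does not specify the chunk format, so the following Python sketch rests entirely on assumptions: that chunks arrive as server-sent-event lines prefixed with data:, that each payload carries text under choices[].delta.content, and that the stream ends with a [DONE] sentinel. Check the actual wire format before relying on it.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from a stream of SSE lines.

    ASSUMPTIONS (not confirmed by this guide): "data: {json}" framing,
    a "[DONE]" end marker, and a choices[].delta.content chunk shape.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


example_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
streamed = "".join(iter_stream_text(example_lines))
```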

Embeddings

Convert text into vector embeddings for semantic search or clustering.

Request

http

POST /v1/embeddings

Body

json

{
  "model": "text-embedding-3-small",
  "input": "The quick brown fox jumps over the lazy dog."
}

Response

json

{
  "embedding": [0.0012, -0.0045, ..., 0.0078],
  "model": "text-embedding-3-small",
  "usage": { "tokens": 12 }
}

Use Cases

Semantic search in document databases.
Clustering user queries for analytics.
Input for machine learning models.
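
For the semantic search use case, a common approach is to rank stored vectors by cosine similarity to the query vector. The sketch below uses only the standard library; the function names and the tiny two-dimensional vectors are illustrative, and real embeddings would come from the endpoint above.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, best match first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_documents([1.0, 0.0], docs, top_k=2)
```

A linear scan like this is fine for small collections; larger ones usually call for a vector index.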

Advanced Features

Tools

Extend chat completions with function calling for real-world integrations.

Request

json

{
  "model": "gpt-4.1-mini",
  "messages": [{ "role": "user", "content": "What’s the weather in Paris?" }],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location.",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          }
        }
      }
    }
  ]
}

Response

json

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": { "name": "get_weather", "arguments": "{\"location\": \"Paris\"}" }
      }]
    }
  }]
}

Handling Tool Calls

Parse the tool_calls array.
Execute the named function with provided arguments.
Return results via a new message:

json

   {
     "role": "tool",
     "content": "{\"temperature\": 15, \"unit\": \"C\"}",
     "tool_call_id": "call_123"
   }
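
The three steps above can be sketched as a small dispatcher. The registry, the stand-in get_weather implementation, and the error payload returned for unknown tools are all assumptions for illustration; only the message shapes come from the examples in this section.

```python
import json
from typing import Callable, Dict, List


def get_weather(location: str) -> dict:
    # Hypothetical stand-in; a real handler would query a weather service
    return {"temperature": 15, "unit": "C"}


TOOL_REGISTRY: Dict[str, Callable[..., dict]] = {"get_weather": get_weather}


def run_tool_calls(message: dict, registry: Dict[str, Callable[..., dict]] = TOOL_REGISTRY) -> List[dict]:
    """Execute each requested tool and build the matching "tool" messages."""
    results = []
    for call in message.get("tool_calls") or []:
        spec = call["function"]
        handler = registry.get(spec["name"])
        if handler is None:
            # Choice made in this sketch: report the problem back to the model
            output = {"error": f"unknown tool: {spec['name']}"}
        else:
            args = json.loads(spec["arguments"])  # arguments arrive as a JSON string
            output = handler(**args)
        results.append({
            "role": "tool",
            "content": json.dumps(output),
            "tool_call_id": call["id"],
        })
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Paris\"}"},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

The returned messages would then be appended to the conversation and sent in a follow-up chat completion request.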

Supported Tools

web_search: Real-time web search.
code_interpreter: Execute Python code.
Custom tools via the tools parameter.

Reasoning

Enable step-by-step problem-solving for complex queries.

Request

json

{
  "model": "gpt-4.1-mini",
  "messages": [{ "role": "user", "content": "Solve 2x + 3 = 7." }],
  "reasoning": true,
  "max_tokens": 2000
}

Response

json

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Step 1: Subtract 3 from both sides → 2x = 4.\nStep 2: Divide by 2 → x = 2.",
      "reasoning": "Derived from algebraic manipulation."
    }
  }]
}

Use Cases

Debugging code.
Mathematical proofs.
Multi-step decision making.

Error Handling

Assisters uses standard HTTP status codes. Key errors:

Code	Error Type	Example
400	Bad Request	Missing `model` parameter.
401	Unauthorized	Invalid API key.
404	Not Found	Unknown model ID.
429	Too Many Requests	Rate limit exceeded.
500	Internal Server Error	Model inference failed.

Error Response Format

json

{
  "error": {
    "type": "invalid_request_error",
    "message": "Model not found.",
    "param": "model",
    "code": "model_not_found"
  }
}
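
A client might turn this envelope into a typed exception so calling code can branch on the fields. The sketch below assumes the error format shown above; the class AssistersAPIError and the helper parse_error are hypothetical names, not part of any official SDK.

```python
import json
from typing import Optional

# Status codes this guide suggests retrying (see Retry Logic)
RETRYABLE_STATUSES = {429, 500}


class AssistersAPIError(Exception):
    """Hypothetical exception type used only in this sketch."""

    def __init__(self, status: int, error_type: str, message: str,
                 param: Optional[str] = None, code: Optional[str] = None):
        super().__init__(f"HTTP {status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.param = param
        self.code = code

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> AssistersAPIError:
    """Build an exception from a response body, tolerating malformed bodies."""
    try:
        err = json.loads(body)["error"]
        if not isinstance(err, dict):
            err = {}
    except (ValueError, KeyError, TypeError):
        err = {}  # not JSON, or not the documented envelope
    return AssistersAPIError(
        status,
        err.get("type", "unknown_error"),
        err.get("message", body),
        err.get("param"),
        err.get("code"),
    )
```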

Retry Logic

For 429, implement exponential backoff (e.g., 1s, 2s, 4s).
For 500, retry up to 3 times, adding random jitter (e.g., up to 0.5s) to each delay.
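
A minimal sketch of both rules follows. It simplifies by applying the same schedule (doubling delays plus random jitter) to 429 and 500 alike, and it assumes a caller-supplied send() function returning a (status, body) tuple; both are illustrative choices rather than documented behavior.

```python
import random
import time
from typing import Callable, Iterator, Tuple


def backoff_delays(max_retries: int = 3, base: float = 1.0,
                   max_jitter: float = 0.5,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield base * 2**attempt seconds plus up to max_jitter of random jitter."""
    for attempt in range(max_retries):
        yield base * (2 ** attempt) + rng() * max_jitter


def call_with_retries(send: Callable[[], Tuple[int, str]],
                      max_retries: int = 3,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send(); on 429 or 500, wait and retry up to max_retries times."""
    status, body = send()
    for delay in backoff_delays(max_retries):
        if status not in (429, 500):
            break
        sleep(delay)
        status, body = send()
    return status, body
```

Jitter spreads out retries from many clients so they do not all hit the API again at the same instant. Injecting sleep and rng keeps the helper easy to test.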

SDKs and Libraries

Official SDKs

Python: pip install assistents

python

from assistents import Assisters

# Create a client, then request a single chat completion
client = Assisters(api_key="YOUR_KEY")
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

Node.js: npm install @assisters/sdk

javascript

  import Assisters from '@assisters/sdk';

  const client = new Assisters({ apiKey: "YOUR_KEY" });
  const response = await client.chat.completions.create({ model: "gpt-4.1-mini", messages: [{ role: "user", content: "Hello" }] });
  console.log(response.choices[0].message.content);

Community Libraries

Go: github.com/assisters/go-sdk
Ruby: gem assistents-ruby

Webhooks

Subscribe to real-time events (e.g., chat completions, errors).

Setup

Create Hook: POST /v1/webhooks

json

   {
     "url": "https://your-server.com/events",
     "events": ["chat.completion", "model.failed"]
   }

Verify Endpoint: Respond to GET /webhooks/verify with a challenge token.
Receive Events: Assisters sends HTTP POST requests with payloads like:

json

   {
     "event": "chat.completion",
     "data": { "id": "chat_123", "status": "completed" }
   }

Security

Validate webhook signatures using a shared secret.
Use HTTPS for the endpoint URL.
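
This guide does not state the signing scheme, so the sketch below assumes a common one: an HMAC-SHA256 of the raw request body, hex-encoded and delivered in a request header. The scheme and the function name are assumptions to confirm against the actual webhook documentation.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex matches HMAC-SHA256(secret, payload).

    ASSUMPTION: hex-encoded HMAC-SHA256 over the raw body; the real
    scheme may differ.
    """
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


secret = b"shared-secret"
raw_body = b'{"event": "chat.completion", "data": {"id": "chat_123", "status": "completed"}}'
good_signature = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
```

Verify against the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace and break the match.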

Performance Optimization

Caching

Cache embeddings for repeated queries:

python

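from assistents import Assisters
from cachetools import cached, TTLCache

client = Assisters(api_key="YOUR_KEY")

# Keep up to 1,000 embeddings, each for one hour
cache = TTLCache(maxsize=1000, ttl=3600)

@cached(cache)
def get_embedding(text):
    # The input text is the cache key, so a repeated text skips the API call
    response = client.embeddings.create(model="text-embedding-3-small", input=text)
    return response.embedding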

Use Redis or Memcached for distributed caching.

Batch Processing

Embed multiple texts in one request:

json

  {
    "model": "text-embedding-3-small",
    "input": ["text 1", "text 2", "text 3"]
  }
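
When there are more texts than fit comfortably in one request, they can be split into several batch payloads. The sketch below is illustrative: the default batch size of 100 is an assumption, since this guide states no per-request limit.

```python
from typing import Iterator, List, Sequence


def batched(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


def embedding_payloads(texts: Sequence[str],
                       model: str = "text-embedding-3-small",
                       batch_size: int = 100) -> List[dict]:
    """Build one request body per batch (batch_size default is an assumption)."""
    return [{"model": model, "input": batch} for batch in batched(texts, batch_size)]


payloads = embedding_payloads(["text 1", "text 2", "text 3"], batch_size=2)
```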

Model Selection

Use smaller models for low-latency tasks (e.g., gpt-4.1-mini instead of gpt-4.1-ultra).

Compliance and Security

Data Handling

GDPR/CCPA: Delete data via DELETE /v1/data/{id}.
Encryption: All data in transit uses TLS 1.3. Data at rest is encrypted.
PII Redaction: Use mask: true in requests to redact personally identifiable information.

Audit Logs

Access logs via GET /v1/audit?start=2024-01-01&end=2024-01-31.

Migration Guide

From v1 Legacy API

Update endpoints:

/v1/completions → /v1/chat/completions

Replace prompt with messages array:

diff

   - { "prompt": "Hello" }
   + { "messages": [{ "role": "user", "content": "Hello" }] }

Use new models (e.g., gpt-4.1-mini instead of gpt-3.5-turbo).
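
The body changes above can be applied mechanically. The following sketch converts a legacy request body to the chat format; the function name and the model mapping table are illustrative, built only from the example pairing in this section.

```python
from typing import Dict, Optional

# Illustrative mapping based on the example above; extend as needed
DEFAULT_MODEL_MAP = {"gpt-3.5-turbo": "gpt-4.1-mini"}


def migrate_payload(legacy: dict, model_map: Optional[Dict[str, str]] = None) -> dict:
    """Return a chat-completions body built from a legacy completions body."""
    model_map = DEFAULT_MODEL_MAP if model_map is None else model_map
    # Copy everything except the prompt, which becomes a user message
    migrated = {key: value for key, value in legacy.items() if key != "prompt"}
    if "prompt" in legacy:
        migrated["messages"] = [{"role": "user", "content": legacy["prompt"]}]
    if migrated.get("model") in model_map:
        migrated["model"] = model_map[migrated["model"]]
    return migrated


converted = migrate_payload({"model": "gpt-3.5-turbo", "prompt": "Hello", "max_tokens": 100})
```

The converted body would then be sent to /v1/chat/completions instead of /v1/completions.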

Breaking Changes in v2

temperature now defaults to 1.0 (was 0.5).
max_tokens includes response tokens (previously excluded).

Best Practices for Developers

Idempotency: Send an Idempotency-Key header so that a retried request is not processed twice:

http

  POST /v1/chat/completions
  Idempotency-Key: abc123

Monitoring: Track latency and error rates with /v1/metrics.
Fallbacks: Implement a secondary model for high-priority tasks.
Testing: Use /v1/models/{model}/test for canary deployments.
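
The fallback practice can be sketched as a loop over an ordered list of models. Here send is a hypothetical callable that performs the request for one model and raises on failure; the catch-all exception handling is a simplification that a real client would narrow to timeouts and server errors.

```python
from typing import Callable, Optional, Sequence


def complete_with_fallback(send: Callable[[str], str], models: Sequence[str]) -> str:
    """Try each model in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return send(model)
        except Exception as exc:  # simplification: narrow this in real code
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def fake_send(model: str) -> str:
    # Stand-in for a real API call; the first model simulates an outage
    if model == "gpt-4.1-ultra":
        raise TimeoutError("primary model unavailable")
    return f"ok from {model}"


answer = complete_with_fallback(fake_send, ["gpt-4.1-ultra", "gpt-4.1-mini"])
```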

Assisters’ API lets you integrate AI into your applications, whether you’re building chatbots, search engines, or automation tools. The endpoints, tools, and optimizations outlined here can shorten development time considerably while keeping your integration scalable and reliable. Start with the quickstart guide and experiment with the interactive playground to see what’s possible. The future of AI-assisted development is here, so build it today.