Skip to main content

AI Models (core.models)

The core.models runtime module provides a unified GraphQL interface for AI model operations — text embeddings and LLM completions — across multiple providers.

Quick Start

1. Register a Model

mutation {
core {
insert_data_sources(data: {
name: "my_llm"
type: "llm-openai"
prefix: "my_llm"
path: "http://localhost:1234/v1/chat/completions?model=gemma-4&timeout=120s"
}) { name }
}
}

2. Generate Text

{
function {
core {
models {
completion(model: "my_llm", prompt: "Explain GraphQL in one sentence") {
content
latency_ms
}
}
}
}
}

Functions

embedding

Generate a single embedding vector.

embedding(model: String!, input: String!): embedding_result
FieldTypeDescription
vectorVectorEmbedding vector
token_countIntTokens consumed

embeddings

Generate multiple embedding vectors in batch.

embeddings(model: String!, input: String!): embeddings_result

Input is a JSON array of strings. Returns vectors and total token count.

completion

Simple text completion.

completion(
model: String!
prompt: String!
max_tokens: Int
temperature: Float
): llm_result

chat_completion

Multi-turn chat with optional tool calling.

chat_completion(
model: String!
messages: [String!]!
tools: [String!]
tool_choice: String
max_tokens: Int
temperature: Float
): llm_result

Messages: Each element is a JSON string with role and content fields:

messages: [
"{\"role\":\"system\",\"content\":\"You are a helpful assistant.\"}",
"{\"role\":\"user\",\"content\":\"What is the weather?\"}"
]

Tools: Each element is a JSON string with name, description, and parameters (JSON Schema):

tools: [
"{\"name\":\"get_weather\",\"description\":\"Get weather\",\"parameters\":{\"type\":\"object\",\"properties\":{\"city\":{\"type\":\"string\"}}}}"
]

Tool choice: "auto" (model decides), "none" (no tools), or a specific tool name.

model_sources

List all registered AI model data sources.

model_sources: [model_source_info]

Returns name, type (llm/embedding), provider, and model for each registered source.

Response: llm_result

Normalized across all providers:

FieldTypeDescription
contentStringGenerated text
modelStringModel that responded
finish_reasonStringstop, tool_use, or length
prompt_tokensIntInput tokens
completion_tokensIntOutput tokens
total_tokensIntTotal tokens
providerStringopenai, anthropic, or gemini
latency_msIntRequest latency in milliseconds
tool_callsStringJSON: [{"id":"...","name":"...","arguments":{...}}]

Supported Providers

TypeProviderCovers
llm-openaiOpenAI-compatibleOpenAI, Ollama, LM Studio, vLLM, Mistral, Qwen, Azure OpenAI
llm-anthropicAnthropicClaude models
llm-geminiGoogle GeminiGemini models
embeddingOpenAI-compatibleAny OpenAI-compatible embedding endpoint

All LLM providers support the same completion, chat_completion, and tool calling interface. Provider-specific API differences (auth headers, request format, response format) are handled transparently.

Tool Calling

Tool calls are normalized across all providers into a unified format:

[{"id": "call_123", "name": "get_weather", "arguments": {"city": "London"}}]
  • OpenAI: tool_calls[].function.arguments (string) → parsed to object
  • Anthropic: content[].input (object) → used as-is
  • Gemini: parts[].functionCall.args (object) → used as-is

See Also