Gemini 2.5 — text & embeddings
Chat completions via gemini-2.5-pro / flash / flash-lite and embeddings via text-embedding-004. A single API key covers all four models.
text_generation parameters
Endpoint: POST /v1/generations with method="text_generation". Sync mode supported — the task completes in a single HTTP request.
| Name | Type | Required | Description |
|---|---|---|---|
| prompt | string | ✓ | User input. Non-empty. |
| system_instruction | string | — | System instruction — sets the model's role and style. |
| response_mime_type | enum | — | text/plain or application/json. For strict JSON output, set application/json together with response_schema. |
| response_schema | object | — | JSON schema (OpenAPI subset). Forces the model to emit valid JSON. |
| tools | array | — | Function calling: array of function_declarations. See example below. |
| temperature | number | — | 0.0–2.0 (default 1.0) |
| max_output_tokens | integer | — | Cap on response tokens. Up to 8192. |
| top_p | number | — | 0.0–1.0 |
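The limits in the table can be checked on the client before a request is sent. The following is a minimal Python sketch, not an official client: the helper names `build_text_generation_params` and `build_request` are hypothetical, the lower bound of 1 for max_output_tokens is an assumption, and the URL, headers, and provider value are taken from the curl examples on this page.

```python
import json
import uuid
import urllib.request

API_URL = "https://api.aigenway.com/v1/generations"  # endpoint used in the curl examples


def build_text_generation_params(prompt, **options):
    """Validate options against the documented ranges and return the params object."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    temperature = options.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be within 0.0-2.0")
    top_p = options.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be within 0.0-1.0")
    max_tokens = options.get("max_output_tokens")
    # Assumption: at least 1 token; the table only documents the 8192 upper bound.
    if max_tokens is not None and not 1 <= max_tokens <= 8192:
        raise ValueError("max_output_tokens must be within 1-8192")
    # Drop unset options so the request carries only what the caller chose.
    params = {"prompt": prompt}
    params.update({key: value for key, value in options.items() if value is not None})
    return params


def build_request(api_key, model, params, provider="google_banana"):
    """Assemble (but do not send) the POST request for method="text_generation"."""
    body = {"provider": provider, "model": model, "method": "text_generation", "params": params}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": str(uuid.uuid4()),  # fresh key per logical request
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The sketch stops before that step so it can be run without network access or a real key.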
Basic chat request
Minimal example with a system instruction. The result lives in task.result.text.
curl -X POST https://api.aigenway.com/v1/generations \
-H "Authorization: Bearer sk_live_..." \
-H "Idempotency-Key: $(uuidgen)" \
-H "Content-Type: application/json" \
-d '{
"provider": "google_banana",
"model": "gemini-2.5-flash",
"method": "text_generation",
"params": {
"system_instruction": "You are a concise assistant. Answer in one sentence.",
"prompt": "Why is the sky blue?",
"temperature": 0.4,
"max_output_tokens": 256
}
}'
Response shape:
{
"task_id": "task_01HXYZ...",
"status": "SUCCEEDED",
"result": {
"text": "The sky is blue because of Rayleigh scattering...",
"finish_reason": "STOP",
"model": "gemini-2.5-flash",
"usage": { "input_tokens": 22, "output_tokens": 31, "total_tokens": 53 },
"files": []
}
}
Structured JSON output
response_mime_type=application/json + response_schema guarantees task.result.text is a string you can JSON.parse directly.
curl -X POST https://api.aigenway.com/v1/generations \
-H "Authorization: Bearer sk_live_..." \
-H "Idempotency-Key: $(uuidgen)" \
-H "Content-Type: application/json" \
-d '{
"provider": "google_banana",
"model": "gemini-2.5-pro",
"method": "text_generation",
"params": {
"prompt": "Extract the entities from the text: \"Apple acquired Pixelmator for $2 billion in 2024\"",
"response_mime_type": "application/json",
"response_schema": {
"type": "object",
"properties": {
"buyer": { "type": "string" },
"target": { "type": "string" },
"amount_usd": { "type": "number" },
"year": { "type": "integer" }
},
"required": ["buyer", "target"]
}
}
}'
Without response_schema the model may return JSON, but field shape is not guaranteed. With response_schema, fields and types are strict.
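Reading the structured result on the client can be sketched in Python as follows. This is an illustration under assumptions: the helper `parse_structured_result` is hypothetical, the sample response is made up to match the schema in the request above, and the required-field check is a defensive extra rather than something the API requires.

```python
import json


def parse_structured_result(task, required=()):
    """Decode task.result.text as JSON and confirm the required keys are present."""
    if task.get("status") != "SUCCEEDED":
        raise RuntimeError(f"task did not succeed: {task.get('status')}")
    data = json.loads(task["result"]["text"])
    missing = [key for key in required if key not in data]
    if missing:
        raise KeyError(f"missing required fields: {missing}")
    return data


# Hypothetical response for the extraction request above; values are illustrative.
example_task = {
    "task_id": "task_example",
    "status": "SUCCEEDED",
    "result": {
        "text": '{"buyer": "Apple", "target": "Pixelmator", "amount_usd": 2000000000, "year": 2024}',
        "finish_reason": "STOP",
    },
}

entities = parse_structured_result(example_task, required=("buyer", "target"))
print(entities["buyer"], entities["target"])  # prints: Apple Pixelmator
```

Only buyer and target are listed under required in the schema, so code that reads amount_usd or year should treat them as optional.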
Function calling (tools)
Pass function_declarations — the model decides whether to call a function and returns arguments in tool_calls. Your code performs the call, then sends a follow-up with the result.
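The "your code performs the call" step can be sketched in Python as a lookup from function name to a local callable. Everything here is illustrative: `TOOL_REGISTRY`, `run_tool_calls`, and the stubbed `get_weather` are hypothetical names, and the sketch assumes the result shape shown in this section (a tool_calls list whose entries carry name and args).

```python
def get_weather(city):
    """Hypothetical local implementation; a real one would query a weather service."""
    return {"city": city, "temperature_c": None, "note": "stub"}


# Maps names declared in function_declarations to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(result, registry=TOOL_REGISTRY):
    """Execute every entry in result["tool_calls"] and collect the outputs in order."""
    outputs = []
    for call in result.get("tool_calls", []):
        name = call["name"]
        if name not in registry:
            raise KeyError(f"model requested an undeclared function: {name}")
        # args is a JSON object, so it maps directly onto keyword arguments.
        outputs.append({"name": name, "output": registry[name](**call.get("args", {}))})
    return outputs


# Result fragment in the shape this section shows for a tool call.
example_result = {
    "text": "",
    "finish_reason": "TOOL_CALL",
    "tool_calls": [{"name": "get_weather", "args": {"city": "Moscow"}}],
}
print(run_tool_calls(example_result))
```

The format of the follow-up request that returns these outputs to the model is not specified on this page, so the sketch stops at collecting them.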
curl -X POST https://api.aigenway.com/v1/generations \
-H "Authorization: Bearer sk_live_..." \
-H "Idempotency-Key: $(uuidgen)" \
-H "Content-Type: application/json" \
-d '{
"provider": "google_banana",
"model": "gemini-2.5-pro",
"method": "text_generation",
"params": {
"prompt": "What is the weather in Moscow right now?",
"tools": [{
"function_declarations": [{
"name": "get_weather",
"description": "Get the current weather in a city",
"parameters": {
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}
}]
}]
}
}'
When the model decides to call a function, text is empty and tool_calls carries the function name and args:
{
"result": {
"text": "",
"finish_reason": "TOOL_CALL",
"tool_calls": [
{ "name": "get_weather", "args": { "city": "Moscow" } }
],
"usage": { "input_tokens": 47, "output_tokens": 12, "total_tokens": 59 },
"files": []
}
}
Embeddings (text-embedding-004)
Endpoint: POST /v1/generations with method="embedding". Each call accepts up to 100 inputs and returns vectors with a native dimension of 768. Pricing is PER_REQUEST: one inputs[] array is billed as one call.
| Name | Type | Required | Description |
|---|---|---|---|
| inputs | string[] | ✓ | Array of strings (1–100). Each up to 2048 tokens. |
| task_type | enum | — | Hints the model about the downstream task. Improves retrieval quality. |
| output_dimensionality | integer | — | 8–768. Truncate embeddings to a lower dimension (Matryoshka). |
| title | string | — | Document title — improves quality for RETRIEVAL_DOCUMENT. |
task_type ∈ { RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY }
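Collections larger than 100 strings have to be split across several calls. The Python sketch below shows one way to do that while checking the documented enum and ranges. The helper `build_embedding_payloads` is hypothetical, and the 2048-token limit per input is not checked because the tokenizer is not described here.

```python
TASK_TYPES = {
    "RETRIEVAL_DOCUMENT", "RETRIEVAL_QUERY", "SEMANTIC_SIMILARITY", "CLASSIFICATION",
    "CLUSTERING", "QUESTION_ANSWERING", "FACT_VERIFICATION", "CODE_RETRIEVAL_QUERY",
}
MAX_INPUTS_PER_CALL = 100  # documented limit on inputs[] per request


def build_embedding_payloads(texts, task_type=None, output_dimensionality=None,
                             provider="google_banana"):
    """Split texts into request bodies of at most 100 inputs each."""
    if not texts:
        raise ValueError("inputs must contain at least one string")
    if task_type is not None and task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type}")
    if output_dimensionality is not None and not 8 <= output_dimensionality <= 768:
        raise ValueError("output_dimensionality must be within 8-768")
    payloads = []
    for start in range(0, len(texts), MAX_INPUTS_PER_CALL):
        params = {"inputs": list(texts[start:start + MAX_INPUTS_PER_CALL])}
        if task_type is not None:
            params["task_type"] = task_type
        if output_dimensionality is not None:
            params["output_dimensionality"] = output_dimensionality
        payloads.append({
            "provider": provider,
            "model": "text-embedding-004",
            "method": "embedding",
            "params": params,
        })
    return payloads
```

Since billing is per request, filling each batch to the limit keeps the number of billed calls as low as the 100-input cap allows. For example, 250 strings produce three request bodies.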
curl -X POST https://api.aigenway.com/v1/generations \
-H "Authorization: Bearer sk_live_..." \
-H "Idempotency-Key: $(uuidgen)" \
-H "Content-Type: application/json" \
-d '{
"provider": "google_banana",
"model": "text-embedding-004",
"method": "embedding",
"params": {
"inputs": [
"Wireless headphones with active noise cancellation",
"Men's wool winter hat",
"Automatic coffee machine with milk frother"
],
"task_type": "RETRIEVAL_DOCUMENT"
}
}'
Response shape:
{
"task_id": "task_01HXYZ...",
"status": "SUCCEEDED",
"result": {
"model": "text-embedding-004",
"count": 3,
"dimension": 768,
"embeddings": [
[0.0123, -0.0456, 0.0789, ...],
[-0.0034, 0.0212, -0.0561, ...],
[0.0418, -0.0102, 0.0337, ...]
],
"files": []
}
}
Truncated embeddings (Matryoshka)
Setting output_dimensionality returns a shorter vector (8–768 dimensions). This reduces storage in your vector DB at the cost of some retrieval quality.
curl -X POST https://api.aigenway.com/v1/generations \
-H "Authorization: Bearer sk_live_..." \
-H "Idempotency-Key: $(uuidgen)" \
-H "Content-Type: application/json" \
-d '{
"provider": "google_banana",
"model": "text-embedding-004",
"method": "embedding",
"params": {
"inputs": ["search query: buy asus laptop"],
"task_type": "RETRIEVAL_QUERY",
"output_dimensionality": 256
}
}'
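Once query and document vectors are available, ranking by cosine similarity can be sketched in Python as below. The function names are hypothetical and the 4-dimensional vectors are toy values, not API output. The sketch normalizes inside the similarity function because this page does not state that truncated vectors are unit length. Query and document vectors must be requested with the same output_dimensionality to be comparable.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for result.embeddings from two separate calls.
query = [1.0, 0.0, 0.0, 0.0]
documents = [
    [0.0, 1.0, 0.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0, 0.0],  # 45 degrees from the query
]
ranking = rank_documents(query, documents)
print([index for index, _ in ranking])  # prints: [1, 2, 0]
```

The second document ranks first even though its length differs from the query, which is the reason for normalizing rather than comparing raw dot products.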