Models
Below is the list of models from Deca:
Latest models (2.5)
Deca 2.5 Pro
Intelligence:
Cost:
- Normal Price: $1.25 / MTok in, $4 / MTok out
- Turbo Price: $2 / MTok in, $5.50 / MTok out
Reasoning:
- Sometimes
Deca dynamically chooses when to think based on the prompt.
Context Length:
- 128000 Context + 32000 Attention Context
- 160000 tokens total. Deca's architecture is designed to allocate more attention heads to the 32000 attention context tokens than to the 128000 normal context tokens. Deca chooses which tokens to attend to based on the prompt, which usually results in better performance. To disable this, set `attention_density_adjustment=0` in the request body.
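As a rough illustration of the switch described above, the sketch below builds a request body with attention density adjustment turned off. It assumes the flag is a top-level JSON field sitting next to `model` and `messages`, which the text implies but does not spell out. `build_request_body` is a hypothetical helper name, not part of any Deca SDK.

```python
import json


def build_request_body(prompt: str, disable_attention_density: bool = False) -> dict:
    """Build a chat request body for deca-2.5-pro (hypothetical helper)."""
    body = {
        "model": "deca-2.5-pro",
        "messages": [{"role": "user", "content": prompt}],
    }
    if disable_attention_density:
        # Assumption: the flag is a top-level field in the request body.
        body["attention_density_adjustment"] = 0
    return body


# Print the JSON that would be sent as the request body.
print(json.dumps(build_request_body("Summarize this report.", True), indent=2))
```

Leaving the field out entirely keeps the default behavior, so the helper only adds it when asked to.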
Deca 2.5 Pro delivers a great balance between performance and cost.
Snapshots:
- deca-2.5-pro:flex
  ? / MTok in, ? / MTok out
  Flex tier allows you to spend less by only paying for the GPU resources you use. It is generally cheaper than the normal tier. Requests might be queued for processing. Prompts will be used to improve Deca's AI models.
- deca-2.5-pro:free
  $0 / MTok in, $0 / MTok out
- deca-2.5-pro
- deca-2.5-pro-turbo
- deca-2.5-pro-beta-06032025
Deca 2.5 Ultra Preview
Intelligence:
Cost:
- Normal Price: $25 / MTok in, $50 / MTok out (not available yet)
- Preview Price: $75 / MTok in, $150 / MTok out
Reasoning:
- Always
Context Length:
- 1000000 context
Deca 2.5 Ultra is our largest and most intelligent model in the Deca 2.5 series.
BETA: Deca 2.5 is currently in beta and is subject to change.
Snapshots:
- deca-2.5-ultra
- deca-2.5-ultra-07012025
Deca 2.5 mini
Intelligence:
Cost:
- Normal Price: $0.35 / MTok in, $0.99 / MTok out
- Turbo Price: $0.99 / MTok in, $0.99 / MTok out
Reasoning:
- No
Context Length:
- 128000 context
Deca 2.5 mini is our smallest model in the Deca 2.5 series.
Snapshots:
- deca-2.5-mini:flex
  ? / MTok in, ? / MTok out
  Flex tier allows you to spend less by only paying for the GPU resources you use. It is generally cheaper than the normal tier. Requests might be queued for processing. Prompts will be used to improve Deca's AI models.
- deca-2.5-mini
- deca-2.5-mini-turbo
- deca-2.5-mini-beta-06032025
Using Deca Models
So you’ve picked your model (great choice, by the way) – now what? Below is a quick-start guide to using Deca models.
1. Get Your API Key
Sign in and head to the API Keys page (key icon in the bottom bar). Smash that green button, copy the key, and paste it into the code below. Keep it safe.
Don't have an account? Sign up for free.
2. Fire Up a Request
curl https://api.deca.ai/v1/chat/completions \
-H "Authorization: Bearer <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
"model": "deca-2.5-ultra-max",
"messages": [
{ "role": "user", "content": "Write an acrostic about Deca, a powerful AI model" }
]
}'
# Expected response:
# Dynamic intelligence, ever-learning with grace,
# Empowering innovation, leading the race.
# Connecting minds, shaping the future bright,
# Always adapting, a beacon of light.
Substitute `deca-2.5-ultra-max` with any model or snapshot from the list above.
3. Parameters Cheat-Sheet
Param | What It Does | Default |
---|---|---|
temperature | Spices up randomness. 0 = robotic, 2 = poet on espresso. | 0.7 |
max_tokens | How long the model can talk before we shush it. | Max (see models) |
top_p | Controls diversity by sampling only from the smallest set of tokens whose probabilities add up to this threshold. | 1.0 |
stream | Set to true for token-by-token response. | false |
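To make the table concrete, here is a minimal sketch that turns these parameters into a request for the endpoint used in the curl example. The helper names `build_payload` and `build_request` are hypothetical, and the temperature bounds of 0 to 2 are an assumption read off the table's description rather than a documented limit. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this guide.
API_URL = "https://api.deca.ai/v1/chat/completions"


def build_payload(model, prompt, temperature=0.7, top_p=1.0,
                  max_tokens=None, stream=False):
    """Assemble the JSON body; defaults mirror the table above."""
    # Assumption: 0..2 is the accepted range, as the table's wording suggests.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature is expected to be between 0 and 2")
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    if max_tokens is not None:
        # Omit the field to keep the model's own maximum.
        payload["max_tokens"] = max_tokens
    return payload


def build_request(api_key, payload):
    """Wrap the payload in a POST request with the headers from the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A call such as `build_request(key, build_payload("deca-2.5-mini", "Hello", temperature=0.2, max_tokens=64))` yields a request object ready to send. The shape of the response is not documented here, so parsing it is left out.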
4. Counting (and Saving) Tokens
Prices are shown as $ / MTok – that’s per million tokens. For a ten-token “What are the best hotels in Italy?”, you’ll pay roughly the cost of finding a penny on the sidewalk (about $0.0000125 of input on Deca 2.5 Pro at the normal price). Use the `:free` snapshots for testing, then move up when you feel you're ready for production.
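The per-million-token arithmetic can be sketched as follows. The prices are copied from the model list above. Mapping the `-turbo` model names to the Turbo Price rows is an assumption, and `estimate_cost` is a hypothetical helper, not an official billing calculator.

```python
# (input, output) price in dollars per million tokens, from the model list.
PRICES = {
    "deca-2.5-pro": (1.25, 4.00),
    "deca-2.5-pro-turbo": (2.00, 5.50),   # assumption: -turbo uses the Turbo Price
    "deca-2.5-mini": (0.35, 0.99),
    "deca-2.5-mini-turbo": (0.99, 0.99),  # assumption: -turbo uses the Turbo Price
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in dollars."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# A ten-token prompt with a 200-token answer on Deca 2.5 Pro.
print(f"${estimate_cost('deca-2.5-pro', 10, 200):.7f}")
```

Flex prices are listed as unknown above, so they are left out of the table rather than guessed.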
5. Turbo & Dwarf Modes
Add `-turbo` to any model name for lightning-fast responses (at a higher price). On Ultra models, use higher reasoning effort to activate more dwarves. Use `deca-2.5-ultra-max` to activate all dwarves.
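The naming rule above can be sketched as a small helper. `with_turbo` and `is_listed` are hypothetical names, and the set of listed names is copied from the snapshot lists on this page. How `-turbo` combines with tier suffixes such as `:flex` is not stated, so the sketch refuses those names instead of guessing.

```python
# Turbo names that appear in the snapshot lists on this page.
LISTED_TURBO = {"deca-2.5-pro-turbo", "deca-2.5-mini-turbo"}


def with_turbo(model: str) -> str:
    """Return the turbo variant of a model name."""
    if ":" in model:
        # Assumption: tier suffixes (":flex", ":free") are not combined with turbo.
        raise ValueError("tier snapshots are not handled by this sketch")
    if model.endswith("-turbo"):
        return model
    return model + "-turbo"


def is_listed(model: str) -> bool:
    """Check whether a turbo name appears in the snapshot lists above."""
    return model in LISTED_TURBO


for name in ("deca-2.5-pro", "deca-2.5-mini", "deca-2.5-ultra"):
    turbo = with_turbo(name)
    print(turbo, "listed" if is_listed(turbo) else "not listed above")
```

Checking against the listed names is worthwhile because no turbo snapshot is listed for Ultra.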
6. Error Codes That Matter
401 – Your key is missing or expired.
429 – Slow down, you’ve hit the rate limit.
502 & 500 – We tripped over a cable in the server room.
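One way to act on these codes is sketched below: fail fast on 401, and retry 429, 500, and 502 with exponential backoff. The retry policy, the delays, and the name `send_with_retry` are assumptions for illustration. This guide does not prescribe a retry strategy or document rate-limit headers.

```python
import time
import urllib.error
import urllib.request

# Codes from the list above that are worth retrying.
RETRYABLE = {429, 500, 502}


def send_with_retry(request, max_attempts=4, base_delay=1.0,
                    opener=urllib.request.urlopen, sleep=time.sleep):
    """Send a request, retrying rate-limit and server errors with backoff."""
    for attempt in range(max_attempts):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code == 401:
                # Retrying will not help: the key is missing or expired.
                raise RuntimeError("API key is missing or expired") from err
            if err.code not in RETRYABLE or attempt == max_attempts - 1:
                raise
            # Wait 1s, 2s, 4s, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

The `opener` and `sleep` parameters exist so the function can be exercised without touching the network or actually waiting.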