Calculate and compare the cost of using OpenAI, Azure, Anthropic Claude, Llama 2, Google Gemini, Mistral, and Cohere LLM APIs for your AI project with our simple and powerful free calculator. Latest numbers as of October 2024.
Understanding the pricing formula for LLM API calls
Basic Formula:
All costs are calculated in USD. Adjust the parameters below to see how changes affect your pricing.
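A minimal Python sketch of the per-call calculation, assuming the usual per-token billing: input tokens times the input price plus output tokens times the output price, with prices quoted per 1M tokens. The function name `call_cost` and its parameters are hypothetical. The 500 input and 100 output tokens used in the example are assumed values, chosen because they reproduce the "Per Call" figures in the table below.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_1m: float, output_price_per_1m: float,
              calls: int = 1) -> float:
    """Estimated cost in USD for `calls` identical API calls.

    Prices are in USD per 1 million tokens, as listed in the pricing table.
    """
    per_call = (input_tokens * input_price_per_1m
                + output_tokens * output_price_per_1m) / 1_000_000
    return per_call * calls


# gpt-4 row: $30.00 input and $60.00 output per 1M tokens.
# 500 input / 100 output tokens are assumed, not stated on the page.
print(f"${call_cost(500, 100, 30.00, 60.00):.4f}")  # $0.0210
```

Multiplying by `calls` gives the total for a batch of identical requests; real workloads with varying prompt sizes would sum the per-call cost over each request instead.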
Tip: Share your calculations with others by copying the URL, which includes all your current settings and filters.
See model-specific costs in the table below
Calculate token usage for images sent to vision-enabled language models
For image generation models, output tokens are fixed based on dimensions and quality settings
All prices in USD. Costs calculated based on your inputs above.
Chat/Completion Models

Provider | Model | Knowledge Cutoff | Context | Input/1M tokens | Output/1M tokens | Per Call | Total |
---|---|---|---|---|---|---|---|
OpenAI / Azure | gpt-4-32k | Sep 2021 | 32K | $60.00 | $120.00 | $0.0420 | $0.04 |
OpenAI / Azure | gpt-4 | Sep 2021 | 8K | $30.00 | $60.00 | $0.0210 | $0.02 |
Anthropic | claude opus | Aug 2023 | 200K | $15.00 | $75.00 | $0.0150 | $0.02 |
OpenAI / Azure | o3 | Mar 2024 | 128K | $10.00 | $40.00 | $0.0090 | $0.01 |
OpenAI / Azure | gpt-4 turbo | Apr 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4 turbo with vision | Apr 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4o preview | Apr 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4-0125-preview | Dec 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4-1106-preview | Apr 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4-1106-vision-preview | Apr 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4-turbo-preview | Dec 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
OpenAI / Azure | gpt-4-vision-preview | Apr 2023 | 128K | $10.00 | $30.00 | $0.0080 | $0.01 |
Anthropic | claude 3.7 sonnet | Jun 2024 | 200K | $8.00 | $24.00 | $0.0064 | $0.01 |
Mistral AI | mistral-large-latest | Sep 2021 | 32K | $8.00 | $24.00 | $0.0064 | $0.01 |
OpenAI / Azure | o3 (cached input) | Mar 2024 | 128K | $2.50 | $40.00 | $0.0053 | $0.01 |
OpenAI / Azure | gpt-4o | October 2023 | 128K | $5.00 | $15.00 | $0.0040 | $0.00 |
OpenAI / Azure | gpt-4o-2024-05-13 | October 2023 | 128K | $5.00 | $15.00 | $0.0040 | $0.00 |
Anthropic | claude sonnet | Aug 2023 | 200K | $3.00 | $15.00 | $0.0030 | $0.00 |
Anthropic | claude 3.5 sonnet | Apr 2024 | 200K | $3.00 | $15.00 | $0.0030 | $0.00 |
Google | gemini 1.5 pro | Early 2023 | 1 Million | $3.50 | $10.50 | $0.0028 | $0.00 |
Mistral AI | mistral-medium-latest | Sep 2021 | 32K | $2.70 | $8.10 | $0.0022 | $0.00 |
OpenAI / Azure | gpt-4o (cached input) | October 2023 | 128K | $1.25 | $15.00 | $0.0021 | $0.00 |
OpenAI / Azure | gpt-4.1 | June 2024 | 1M | $2.00 | $8.00 | $0.0018 | $0.00 |
Google | gemini 2.5 pro | April 2024 | 1 Million | $1.25 | $10.00 | $0.0016 | $0.00 |
Mistral AI | mistral-small-latest | Sep 2021 | 32K | $2.00 | $6.00 | $0.0016 | $0.00 |
OpenAI / Azure | text-davinci-003 | Jun 2021 | 4K | $2.00 | $2.00 | $0.0012 | $0.00 |
OpenAI / Azure | davinci-002 | Oct 2021 | 16K | $2.00 | $2.00 | $0.0012 | $0.00 |
OpenAI / Azure | gpt-4.1 (cached input) | June 2024 | 1M | $0.50 | $8.00 | $0.0011 | $0.00 |
OpenAI / Azure | o4-mini | Mar 2024 | 128K | $1.10 | $4.40 | $0.0010 | $0.00 |
OpenAI / Azure | gpt-3.5-turbo-instruct | Sep 2021 | 4K | $1.50 | $2.00 | $0.0010 | $0.00 |
OpenAI / Azure | gpt-3.5-turbo-instruct-0914 | Sep 2021 | 4K | $1.50 | $2.00 | $0.0010 | $0.00 |
Meta (via Anyscale) | llama 2 70b | Sep 2022 | 4K | $1.00 | $1.00 | $0.0006 | $0.00 |
OpenAI / Azure | o4-mini (cached input) | Mar 2024 | 128K | $0.28 | $4.40 | $0.0006 | $0.00 |
Amazon | titan text express | N/A | 8K | $0.80 | $1.60 | $0.0006 | $0.00 |
Mistral AI | open-mixtral-8x7b | Sep 2021 | 32K | $0.70 | $0.70 | $0.0004 | $0.00 |
OpenAI / Azure | gpt-3.5-turbo-0125 | Sep 2021 | 16K | $0.50 | $1.50 | $0.0004 | $0.00 |
Google | gemini 1.0 pro | Early 2023 | 32K | $0.50 | $1.50 | $0.0004 | $0.00 |
OpenAI / Azure | gpt-3.5-turbo-1106 | Sep 2021 | 16K | $0.50 | $1.50 | $0.0004 | $0.00 |
OpenAI / Azure | gpt-4.1-mini | June 2024 | 1M | $0.40 | $1.60 | $0.0004 | $0.00 |
Anthropic | claude haiku | Aug 2023 | 200K | $0.25 | $1.25 | $0.0003 | $0.00 |
Google | gemini 1.5 flash | Early 2023 | 1 Million | $0.35 | $0.70 | $0.0002 | $0.00 |
OpenAI / Azure | gpt-4.1-mini (cached input) | June 2024 | 1M | $0.10 | $1.60 | $0.0002 | $0.00 |
Amazon | titan text lite | N/A | 4K | $0.30 | $0.40 | $0.0002 | $0.00 |
Mistral AI | open-mistral-7b | | 32K | $0.25 | $0.25 | $0.0002 | $0.00 |
OpenAI / Azure | gpt-4o-mini | October 2023 | 128K | $0.15 | $0.60 | $0.0001 | $0.00 |
Google | gemini 2.5 flash | April 2024 | 1 Million | $0.15 | $0.60 | $0.0001 | $0.00 |
OpenAI / Azure | babbage-002 | Oct 2021 | 16K | $0.20 | $0.20 | $0.0001 | $0.00 |
OpenAI / Azure | gpt-4.1-nano | June 2024 | 1M | $0.10 | $0.40 | $0.0001 | $0.00 |
Mistral (via Anyscale) | mistral-7b-instruct-v0.1 | Sep 2021 | 32K | $0.15 | $0.15 | $0.0001 | $0.00 |
Google | gemini 2.0 flash | June 2024 | 1 Million | $0.10 | $0.40 | $0.0001 | $0.00 |
OpenAI / Azure | gpt-4o-mini (cached input) | October 2023 | 128K | $0.04 | $0.60 | $0.0001 | $0.00 |
OpenAI / Azure | gpt-4.1-nano (cached input) | June 2024 | 1M | $0.03 | $0.40 | $0.0001 | $0.00 |
Google | gemini 2.0 flash with context caching | June 2024 | 1 Million | $0.03 | $0.40 | $0.0001 | $0.00 |

Fine-tuning Models

Provider | Model | Knowledge Cutoff | Context | Input/1M tokens | Output/1M tokens | Per Call | Total |
---|---|---|---|---|---|---|---|
OpenAI | gpt-3.5 turbo | Sep 2021 | 4K | $3.00 | $6.00 | $0.0000 | $0.00 |
Google | palm 2 | Mid 2021 | 8K | $2.00 | $2.00 | $0.0000 | $0.00 |

Embedding Models

Provider | Model | Input/1M tokens | Per Call | Total |
---|---|---|---|---|
OpenAI / Azure | ada v2 | $0.10 | $0.0000 | $0.00 |
OpenAI / Azure | text-embedding-3-small | $0.02 | $0.0000 | $0.00 |
OpenAI / Azure | text-embedding-3-large | $0.13 | $0.0000 | $0.00 |
OpenAI / Azure | text-embedding-ada-002 | $0.10 | $0.0000 | $0.00 |
Anthropic | claude 3 haiku embeddings | $0.10 | $0.0000 | $0.00 |
Anthropic | claude 3 sonnet embeddings | $0.10 | $0.0000 | $0.00 |
Anthropic | claude 3 opus embeddings | $0.10 | $0.0000 | $0.00 |
Google | palm 2 | $0.40 | $0.0000 | $0.00 |
Cohere | embed | $0.10 | $0.0000 | $0.00 |
Mistral | embed | $0.10 | $0.0000 | $0.00 |
Amazon | titan embeddings | $0.10 | $0.0000 | $0.00 |

Image Generation Models

Provider | Model | Knowledge Cutoff | Context | Input/1M tokens | Output/1M tokens | Per Call | Total |
---|---|---|---|---|---|---|---|
OpenAI | gpt-image-1 (portrait, high) | Apr 2025 | 1024x1536 (6240 Output Tokens) | $5.00 | $40.00 | $0.2521 | $0.25 |
OpenAI | gpt-image-1 (landscape, high) | Apr 2025 | 1536x1024 (6208 Output Tokens) | $5.00 | $40.00 | $0.2508 | $0.25 |
OpenAI | gpt-image-1 (square, high) | Apr 2025 | 1024x1024 (4160 Output Tokens) | $5.00 | $40.00 | $0.1689 | $0.17 |
OpenAI | gpt-image-1 (portrait, medium) | Apr 2025 | 1024x1536 (1584 Output Tokens) | $5.00 | $40.00 | $0.0659 | $0.07 |
OpenAI | gpt-image-1 (landscape, medium) | Apr 2025 | 1536x1024 (1568 Output Tokens) | $5.00 | $40.00 | $0.0652 | $0.07 |
OpenAI | gpt-image-1 (square, medium) | Apr 2025 | 1024x1024 (1056 Output Tokens) | $5.00 | $40.00 | $0.0447 | $0.04 |
OpenAI | gpt-image-1 (portrait, low) | Apr 2025 | 1024x1536 (408 Output Tokens) | $5.00 | $40.00 | $0.0188 | $0.02 |
OpenAI | gpt-image-1 (landscape, low) | Apr 2025 | 1536x1024 (400 Output Tokens) | $5.00 | $40.00 | $0.0185 | $0.02 |
OpenAI | gpt-image-1 (square, low) | Apr 2025 | 1024x1024 (272 Output Tokens) | $5.00 | $40.00 | $0.0134 | $0.01 |
OpenAI | computer-use-preview-2025-03-11 | Mar 2025 | Computer Vision | $3.00 | $12.00 | $0.0000 | $0.00 |
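
The gpt-image-1 rows above can be reproduced with a short calculation: the fixed output token count for the chosen size and quality is billed at the output rate, and the text prompt is billed at the input rate. The sketch below is illustrative only. The function name `image_generation_cost` is hypothetical, and the 500 prompt tokens are an assumed value that happens to match the "Per Call" column.

```python
def image_generation_cost(output_tokens: int,
                          prompt_tokens: int = 500,
                          input_price_per_1m: float = 5.00,
                          output_price_per_1m: float = 40.00) -> float:
    """Estimated USD cost of one image generation call.

    `output_tokens` is the fixed token count listed in the table for a
    given size and quality; `prompt_tokens` (assumed) covers the text prompt.
    """
    return (prompt_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000


# Fixed output token counts taken from the table rows above.
print(f"portrait, high: ${image_generation_cost(6240):.4f}")  # $0.2521
print(f"square, low:    ${image_generation_cost(272):.4f}")   # $0.0134
```

Because the output token count dominates, quality and dimensions drive the cost far more than prompt length does.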
Remember: The cheapest AI call is the one you don't make. Optimize your application architecture to minimize unnecessary API calls.
Official OpenAI pricing can be found here.
gpt-image-1 pricing can be found here.
Mistral Platform pricing and rate limits can be found here.
Claude AI pricing and rate limits can be found here.
Google's AI pricing (Gemini and PaLM) can be found here.
Amazon's AI pricing (Bedrock) can be found here.
A token is a piece of text: a word, subword, punctuation mark, or symbol. It is the billing unit of the OpenAI APIs.
An OpenAI execution is the combination of the prompt sent to OpenAI and the response it returns. All text in the prompt and the response counts toward OpenAI billing.
The OpenAI prompt is the instruction or question you send to OpenAI in order to get a response. It is text written in normal, natural language, for example:
Prompt: "Write a tagline for an ice cream shop".
The OpenAI response is the text OpenAI returns for the prompt you sent, for example:
Prompt: "Write a tagline for an ice cream shop"
Response: "We serve up smiles with every scoop!"
You can set a spending limit in the billing section of the OpenAI dashboard in order to control your costs.
You can review your usage in the usage section of the OpenAI dashboard.
You can use OpenAI's tokenizer to estimate your token usage for a given prompt. You can find the tokenizer here.
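When a tokenizer is not at hand, a rough estimate can stand in for it. The sketch below is not OpenAI's tokenizer: it uses the common rule of thumb of roughly 4 characters per token for English text, which is an approximation only. The function name `estimate_tokens` and the `chars_per_token` parameter are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not exact)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


prompt = "Write a tagline for an ice cream shop"
print(estimate_tokens(prompt))
```

For billing-sensitive work, the count reported by the API response or by the real tokenizer should take precedence over this estimate.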
OpenAI vision models process images by dividing them into tiles. For GPT-4.1, an image that exceeds 2048px in either dimension is first scaled down to fit within 2048×2048px. If the shortest side is then still larger than 768px, the image is scaled down again so that the shortest side is 768px. The image is then divided into 512×512px tiles, with each tile costing 170 tokens. A base cost of 85 tokens is added per request. The total token cost is calculated as: (number of tiles × 170 tokens per tile) + 85 base tokens. You can use our Image Tokens calculator tab to estimate these costs precisely.
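The tiling steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than OpenAI's implementation: the function name `image_tokens` is hypothetical, scaling preserves the aspect ratio, and scaled dimensions are rounded to whole pixels.

```python
import math


def image_tokens(width: int, height: int) -> int:
    """Estimate input tokens for one image using the tile rules above."""
    # Step 1: fit the image within 2048 x 2048, keeping the aspect ratio.
    longest = max(width, height)
    if longest > 2048:
        scale = 2048 / longest
        width, height = round(width * scale), round(height * scale)

    # Step 2: if the shortest side is still above 768px, scale it to 768px.
    shortest = min(width, height)
    if shortest > 768:
        scale = 768 / shortest
        width, height = round(width * scale), round(height * scale)

    # Step 3: count 512 x 512 tiles, then add the 85-token base cost.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return tiles * 170 + 85


print(image_tokens(1024, 1024))  # 4 tiles  -> 765
print(image_tokens(2048, 4096))  # 6 tiles  -> 1105
```

Multiplying the result by the model's input price per 1M tokens gives the image's contribution to the call cost.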
Cached input pricing is a way to reduce costs when making multiple API calls that share the same input tokens. How caching is enabled depends on the provider; OpenAI, for example, applies it automatically to sufficiently long prompt prefixes that repeat across requests. With cached input, you are charged the full input token rate only for the first call, and a significantly reduced rate (typically 75% lower) for subsequent calls with the same content. This is particularly useful for applications that reuse the same context or instructions across multiple requests, such as chatbots with fixed system prompts or tools that process similar inputs frequently. Models with "(cached input)" in their name in our calculator represent the pricing for these cached token scenarios.
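To see what the reduced rate means across a series of calls, here is a simplified sketch. It assumes the entire prompt is identical on every call, the first call is billed at the full input rate, and every later call hits the cache; real cache behavior is less predictable. The function name `series_cost` and its parameters are hypothetical, and the prices are the gpt-4.1 rows from the table.

```python
def series_cost(calls: int, prompt_tokens: int, output_tokens: int,
                input_price: float, cached_input_price: float,
                output_price: float) -> float:
    """USD cost of `calls` requests sharing one prompt (prices per 1M tokens)."""
    if calls <= 0:
        return 0.0
    first_input = prompt_tokens * input_price                   # full rate once
    cached_input = (calls - 1) * prompt_tokens * cached_input_price
    output = calls * output_tokens * output_price               # never cached
    return (first_input + cached_input + output) / 1_000_000


# gpt-4.1: $2.00 input, $0.50 cached input, $8.00 output per 1M tokens.
with_cache = series_cost(100, 2000, 100, 2.00, 0.50, 8.00)
without_cache = series_cost(100, 2000, 100, 2.00, 2.00, 8.00)
print(f"with cache:    ${with_cache:.4f}")     # $0.1830
print(f"without cache: ${without_cache:.4f}")  # $0.4800
```

Output tokens are billed at the normal rate in both cases, so the saving grows with the share of each request taken up by the repeated prompt.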