Updated February 2026

AI Model Comparison

Compare the leading AI models side by side: capabilities, pricing, benchmarks, and best use cases.


Model (Release)	Provider	Context	Input Price	Output Price	Capabilities	MMLU	HumanEval	Best For
GPT-4o (May 2024)	OpenAI	128K	$2.50/1M	$10.00/1M	🎤	88.7%	90.2%	General use, Multimodal tasks
GPT-4 Turbo (Apr 2024)	OpenAI	128K	$10.00/1M	$30.00/1M		86.4%	87.1%	Complex reasoning, Long documents
Claude 3.5 Sonnet (Jun 2024)	Anthropic	200K	$3.00/1M	$15.00/1M		88.3%	92%	Coding, Analysis
Claude 3.5 Opus (Oct 2024)	Anthropic	200K	$15.00/1M	$75.00/1M		91.2%	94.5%	Research, Complex reasoning
Gemini 1.5 Pro (Feb 2024)	Google	2M	$1.25/1M	$5.00/1M	🎤🎬	85.9%	84.1%	Ultra-long context, Video analysis
Gemini Ultra (Dec 2023)	Google	128K	Enterprise	Enterprise	🎤🎬	90%	74.4%	Enterprise, Complex tasks
Grok-2 (Aug 2024)	xAI	128K	$2.00/1M	$10.00/1M		87.5%	88.4%	Real-time info, X integration
GPT-4o mini (Jul 2024)	OpenAI	128K	$0.15/1M	$0.60/1M		82%	87%	Cost-effective, High volume
Claude 3 Haiku (Mar 2024)	Anthropic	200K	$0.25/1M	$1.25/1M		75.2%	75.9%	Speed, Simple tasks
Gemini 1.5 Flash (May 2024)	Google	1M	$0.075/1M	$0.30/1M	🎤🎬	78.9%	74.3%	Cost-effective, Long context
Mistral Large 2 (Jul 2024)	Mistral	128K	$2.00/1M	$6.00/1M		84%	92.1%	Coding, European compliance
Command R+ (Apr 2024)	Cohere	128K	$2.50/1M	$10.00/1M		75.7%	72%	RAG, Enterprise
Llama 3.1 405B (Jul 2024)	Meta	128K	Free / $5.00/1M*	Free / $15.00/1M*		88.6%	89%	Self-hosting, Fine-tuning
Llama 3.1 70B (Jul 2024)	Meta	128K	Free / $0.90/1M*	Free / $0.90/1M*		83.6%	80.5%	Balanced, Self-hosting
Mixtral 8x22B (Apr 2024)	Mistral	64K	Free / $1.20/1M*	Free / $1.20/1M*		77.8%	75%	MoE efficiency, Multilingual
Qwen2 72B (Jun 2024)	Alibaba	128K	Free	Free		84.2%	86%	Chinese/English, Math
DeepSeek V2.5 (Sep 2024)	DeepSeek	128K	$0.14/1M	$0.28/1M		80.4%	89.4%	Coding, Ultra-cheap

* Prices for open-source models reflect API hosting costs (e.g., Together AI, Fireworks). Self-hosting carries no per-token fee, but you pay for your own hardware and operations.
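
Prices in the table are quoted in US dollars per 1M tokens, with input and output billed at different rates. As a rough illustration of how that translates into the cost of a workload, here is a minimal sketch. The `PRICES` dictionary and the `estimate_cost` helper are hypothetical names, not part of any provider's SDK, and the figures are copied from the table for a handful of models.

```python
# (input price, output price) in USD per 1M tokens, copied from the table.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
    "DeepSeek V2.5": (0.14, 0.28),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    input_price, output_price = PRICES[model]
    # Each side is billed separately, then scaled from "per 1M tokens".
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 200K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 1_000_000, 200_000):.2f}")
```

Because output tokens cost several times more than input tokens for most models listed, a generation-heavy workload can rank models differently than a comparison of input prices alone.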

Quick Recommendations

Best Value

GPT-4o mini

82% MMLU and 87% HumanEval at $0.15/1M input tokens and $0.60/1M output tokens. Well suited to high-volume applications.
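
"Best value" depends on how much benchmark performance you need. One way to make your own pick is to set minimum scores and sort the remaining models by price. The sketch below does that for a few rows of the table. The `Model` tuple, the `cheapest_meeting` helper, and the thresholds are illustrative assumptions, not a method the page itself uses.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    mmlu: float          # percent
    humaneval: float     # percent


# A subset of rows copied from the comparison table.
MODELS = [
    Model("GPT-4o", 2.50, 10.00, 88.7, 90.2),
    Model("GPT-4o mini", 0.15, 0.60, 82.0, 87.0),
    Model("Claude 3 Haiku", 0.25, 1.25, 75.2, 75.9),
    Model("Gemini 1.5 Flash", 0.075, 0.30, 78.9, 74.3),
    Model("DeepSeek V2.5", 0.14, 0.28, 80.4, 89.4),
]


def cheapest_meeting(models: List[Model], min_mmlu: float,
                     min_humaneval: float) -> List[Model]:
    """Keep models that clear both thresholds, cheapest input price first."""
    eligible = [m for m in models
                if m.mmlu >= min_mmlu and m.humaneval >= min_humaneval]
    return sorted(eligible, key=lambda m: m.input_price)


if __name__ == "__main__":
    for m in cheapest_meeting(MODELS, min_mmlu=82, min_humaneval=85):
        print(m.name, m.input_price)
```

With these hypothetical thresholds, GPT-4o mini comes first. Lowering the MMLU floor to 80 would put DeepSeek V2.5 ahead of it, so the answer is sensitive to the cutoffs you choose.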

Best for Coding

Claude 3.5 Sonnet

92% HumanEval score with 200K context. The developer's choice for complex coding tasks.

Best for Long Context

Gemini 1.5 Pro

2-million-token context window. Analyze entire codebases or books in one prompt.
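
To judge whether a document fits in a given context window, you can estimate its token count and compare it against the "Context" column. The sketch below is a rough check under stated assumptions: it treats "K" as 1,000 and "M" as 1,000,000, and it uses the common heuristic of about 4 characters per token for English text. Real tokenizers vary by model and language, so treat the result as approximate. All function names here are hypothetical.

```python
# Context window labels copied from the table for a few models.
CONTEXT = {
    "Gemini 1.5 Pro": "2M",
    "Gemini 1.5 Flash": "1M",
    "Claude 3.5 Sonnet": "200K",
    "GPT-4o": "128K",
}


def parse_window(label: str) -> int:
    """Convert a label such as '128K' or '2M' to a token count (assumed decimal units)."""
    units = {"K": 1_000, "M": 1_000_000}
    return int(float(label[:-1]) * units[label[-1]])


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; 4 characters per token is a heuristic, not a tokenizer."""
    return int(len(text) / chars_per_token) + 1


def models_that_fit(text: str, reserve_for_output: int = 4_000) -> list:
    """List models whose window holds the prompt plus room for the reply."""
    needed = estimate_tokens(text) + reserve_for_output
    return [name for name, label in CONTEXT.items()
            if parse_window(label) >= needed]


if __name__ == "__main__":
    document = "x" * 2_000_000  # stand-in for a roughly 2 MB text file
    print(models_that_fit(document))
```

For an exact count, use the tokenizer that the provider publishes for the specific model rather than a character heuristic.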
