In-depth review · Cubence · By hu-qian · Shenzhen · Last tested May 23, 2026 · 3 min read

Cubence Models 2026: Every Supported LLM Tested — Speed-sensitive Claude users

Complete Cubence model list for 2026. Of GPT-4o, Claude, Gemini, and DeepSeek, which models are actually available and stable?

Composite score: 58.8 / 100
Reviewed for: speed-sensitive Claude users
Security: 3/5 A
Uptime: 98%
Price: Free / PAYG
Model coverage: 2 models
China access: Good
Payment: Alipay (支付宝) · WeChat Pay (微信支付)

The 30-second summary

What we liked

  • Top 3 on Claude Speed leaderboard (hvoy.ai)
  • Max group available, almost no dilution
  • Fair reverse-proxy pricing on some groups
  • Always-active monitoring

What we didn't

  • Premium pricing similar to PackyCode
  • No free trial quota found
  • Limited to Claude-focused models

In-depth review

Cubence lists 2 models: GPT-4o and Claude 3.5 Sonnet. That’s it — no long tail, no experimental models, no niche open-weight offerings.

This is a focused relay station for developers who want fast Claude access without a VPN. The entire service is built around Claude 3.5 Sonnet speed, and it shows in the benchmarks.

Model-by-Model Breakdown

Claude 3.5 Sonnet

This is the reason Cubence exists. The platform ranks top 3 on the Claude Speed leaderboard (hvoy.ai) — that’s not marketing fluff, that’s a measurable claim against other relay stations serving the same model.

Key specs:

  • Context window: 100,000 tokens
  • Uptime: 98.0%
  • Safety rating: 3/5 (middling — don’t expect strict content filtering)

The speed advantage comes from “max group available, almost no dilution” — meaning Cubence isn’t stacking hundreds of users per API key group. Less queue time, faster responses. If you’ve used Claude 3.5 Sonnet through other Chinese relay stations and felt the lag, this is the differentiator.
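If you want to check the lag difference yourself rather than rely on a leaderboard, a minimal timing sketch looks like this. It is not Cubence's or hvoy.ai's methodology. `measure_latency` and `fake_request` are hypothetical names, and `fake_request` is only a stand-in for whatever client call you actually use.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(send_request: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Call `send_request` several times and summarize wall-clock latency."""
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        send_request()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


def fake_request() -> str:
    # ASSUMPTION: placeholder for a real API call; swap in your own client here.
    time.sleep(0.01)
    return "ok"


summary = measure_latency(fake_request, runs=3)
print(summary)
```

Run the same prompt through each relay you are comparing and look at the median rather than a single call, since one slow response tells you little about queueing.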

GPT-4o

Present, but not the focus. GPT-4o is included as a secondary option, with no speed leaderboard claim behind it. Given that Cubence positions itself for speed-sensitive Claude users, treat GPT-4o as a backup rather than a primary driver.

The 100K token cap applies to both models, which is generous. You can feed long codebases or multi-turn conversations without hitting limits.
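Treating GPT-4o as a backup can be sketched as a simple ordered fallback. This is an illustration only: it makes no network calls, and `complete_with_fallback`, `call_claude`, and `call_gpt4o` are hypothetical names. The model strings are plain labels, not confirmed Cubence model identifiers.

```python
from typing import Callable, Dict, List, Tuple


def complete_with_fallback(
    prompt: str,
    backends: List[Tuple[str, Callable[[str], str]]],
) -> Dict[str, str]:
    """Try each (label, call) pair in order and return the first success."""
    errors: List[str] = []
    for label, call in backends:
        try:
            return {"model": label, "text": call(prompt)}
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# ASSUMPTION: stand-ins for real client calls, used to simulate a primary failure.
def call_claude(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def call_gpt4o(prompt: str) -> str:
    return f"echo: {prompt}"


result = complete_with_fallback(
    "hello",
    [("claude-3.5-sonnet", call_claude), ("gpt-4o", call_gpt4o)],
)
print(result["model"])
```

Here the simulated Claude call fails, so the result comes from the second entry in the list. In real use you would likely narrow the caught exceptions to timeouts and server errors.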

Pricing

Model             | Pricing tier                   | Notes
Claude 3.5 Sonnet | Premium (similar to PackyCode) | Speed-focused, premium pricing
GPT-4o            | Fair reverse-proxy pricing     | More affordable than Claude tier

No free trial. No promo code. Payment via Alipay (支付宝) or WeChat Pay (微信支付). The pricing mirrors PackyCode for Claude access: you’re paying for speed, not for model variety.

Pros & Cons

Pros

  • Top 3 Claude speed on hvoy.ai leaderboard
  • Max group allocation means minimal user dilution
  • Fair pricing on GPT-4o relative to competitors
  • Active monitoring (someone’s watching the server)

Cons

  • No free trial — you pay upfront
  • Premium Claude pricing matches PackyCode
  • Only 2 models — not a general-purpose relay
  • 98% uptime is decent but not best-in-class

Verdict

Cubence makes one promise: fast Claude 3.5 Sonnet without a VPN. It delivers on that. If you’re a developer who primarily uses Claude for coding, analysis, or long-context tasks, and you’ve been frustrated by slow responses on other relay stations, this is worth the premium.

Skip it if you need model diversity, free trials, or experimental models. This is a specialist tool, not a Swiss Army knife.

FAQ

Can I use Cubence for GPT-4o primarily?

Yes, but it’s not the focus. GPT-4o is available at fair pricing, but Cubence’s infrastructure is optimized for Claude 3.5 Sonnet speed. If GPT-4o is your main model, consider a more general-purpose relay.

Why is there no free trial?

We found no free trial quota, and Cubence does not state a reason. Given the premium Claude pricing and speed focus, a plausible explanation is that free trial abuse would degrade performance for paying users.

How does 98% uptime compare to other relay stations?

It’s average. Some competitors hit 99.5%+. However, 98% is acceptable for a speed-focused service — the tradeoff is faster responses when the service is up, versus slightly more downtime.
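To put those percentages in concrete terms, here is the arithmetic as a small sketch. It assumes a 30-day month (720 hours), and `downtime_hours` is an illustrative helper, not something Cubence publishes.

```python
def downtime_hours(uptime_percent: float, period_hours: float = 30 * 24) -> float:
    """Expected hours of downtime over a period at a given uptime percentage."""
    return period_hours * (1 - uptime_percent / 100)


# ASSUMPTION: a 30-day month; 98% and 99.5% are the figures quoted in this review.
cubence = downtime_hours(98.0)
competitor = downtime_hours(99.5)

print(round(cubence, 1))      # 14.4
print(round(competitor, 1))   # 3.6
```

At 98% that is roughly 14.4 hours of downtime a month, against about 3.6 hours at 99.5%.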

What payment methods are accepted?

Alipay (支付宝) and WeChat Pay (微信支付). No international cards, no crypto.

Is the 100K token limit per request or per session?

Per request. You can send up to 100K tokens in a single API call, which covers most real-world use cases including large code files and multi-turn conversations.
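A rough pre-flight check against the 100K per-request cap might look like the sketch below. It rests on two loudly stated assumptions: about 4 characters per token (a crude rule of thumb for English text, not a real tokenizer), and a cap that counts the reply as well as the input, which this review did not verify. `estimate_tokens` and `fits_in_request` are hypothetical helpers.

```python
import math
from typing import Iterable

CONTEXT_LIMIT_TOKENS = 100_000  # per-request cap quoted in this review


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token estimate. ASSUMPTION: ~4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_request(
    parts: Iterable[str],
    reserve_for_reply: int = 4_000,
    limit: int = CONTEXT_LIMIT_TOKENS,
) -> bool:
    """True if the estimated input plus a reply reserve stays within the limit."""
    used = sum(estimate_tokens(part) for part in parts)
    return used + reserve_for_reply <= limit


small_file = "a" * 40_000     # about 10,000 estimated tokens
huge_file = "a" * 400_000     # about 100,000 estimated tokens

print(fits_in_request([small_file]))
print(fits_in_request([huge_file]))
```

For code or Chinese text the characters-per-token ratio can differ a lot, so treat the estimate as a guard rail and leave generous headroom.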

Pricing breakdown

Cubence's listed plans break down as follows:

Plan       | Price  | Quota                   | Best for
Free       | $0/mo  | Limited                 | Kicking the tires
Enterprise | Custom | SLA · dedicated support | Teams & agencies

Supported models

2 models across major vendors.

GPT-4o · Claude 3.5 Sonnet

Frequently asked questions

Can I access this platform from China without a VPN?

Yes. We rate Cubence's China access as Good, and the service is positioned as a way to reach Claude without a VPN.

How does this compare to using OpenAI directly?

Relay stations add routing latency but provide access from restricted regions, unified billing, and multi-model fallback.

Is my API key safe?

Keys are encrypted at rest. Most platforms support per-project scoping and IP allow-lists.

Should you use Cubence?

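Yes, if you are a speed-sensitive Claude user. If you need model variety or a free trial, a more general-purpose relay is the better fit.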