Latest AI Rules

A practical operating guide, based on official information as of March 6, 2026, for choosing among GPT-5.4, GPT-5.3-Codex xhigh, Claude Sonnet 4.6, Claude Opus 4.6, and Gemini 3.1 Pro across development, business operations design, Excel, presentations, legal work, medical literature, and finance.

Updated: 2026-03-06 (with references)

Fast recommendation

Use GPT-5.4 as the broad default, Codex for execution-heavy development, and never let a single model make the final call in high-stakes domains.

Naming note

Codex-5.3-xhigh is shorthand. Strictly speaking, it means GPT-5.3-Codex run with xhigh reasoning.
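
As a small illustration of the naming note, the shorthand can be expanded into an explicit model name plus a reasoning level. This is only a sketch: `ModelSpec`, `ALIASES`, and `expand_alias` are hypothetical names made up for this example, and the strings are the labels used in this guide, not confirmed API model identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    """A model label plus an optional reasoning level (hypothetical type)."""
    model: str
    reasoning: Optional[str] = None


# Shorthand labels mapped to their explicit form, following the naming note.
ALIASES = {
    "Codex-5.3-xhigh": ModelSpec(model="GPT-5.3-Codex", reasoning="xhigh"),
}


def expand_alias(label: str) -> ModelSpec:
    """Return the explicit spec for a shorthand, or wrap the label unchanged."""
    return ALIASES.get(label, ModelSpec(model=label))
```

Keeping the alias table in one place means prompts, logs, and configuration can all refer to the same explicit name.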

Operating baseline

Default

When in doubt, make GPT-5.4 the lead model. It handles consultation, requirements, research, documents, and design well as a single model.

Execution

Prefer GPT-5.3-Codex xhigh for implement-and-verify loops. It suits work that lives in the CLI, the repository, tests, and repeated fix cycles.

High stakes

For legal, medical, and financial work, always go through primary sources and human review. Models may draft, compare, and structure the issues, but they do not make the final call alone.

Model roles

GPT-5.4

The broad lead model. It pulls together research, documents, design, Office work, and mixed business decisions.

GPT-5.3-Codex xhigh

The implementation worker. It pushes through code changes, CI, debugging, and terminal-based execution.

Claude Sonnet 4.6

A cost-efficient all-round assistant. It is strong in high-volume production use, frontend work, and workflow-heavy operations.

Claude Opus 4.6

The reviewer for the hardest tasks. Use it for strict review, legal-heavy and finance-heavy analysis, and long multi-step analysis.

Gemini 3.1 Pro

The huge-context multimodal specialist. It suits large source bundles, PDFs, images, charts, video, and codebase-wide overviews.

Recommendations by use case

Use case	First choice	Second choice	Operational note
Consultation / sparring / requirements	GPT-5.4	Sonnet 4.6	Moves easily between levels of abstraction. Well suited to turning ambiguous requests into structure and decisions.
Research / source gathering / synthesis	GPT-5.4	Gemini 3.1 Pro	Use GPT for web-heavy work and Gemini for very large, multimodal source bundles.
Architecture / backend / integration design	GPT-5.4	Opus 4.6	Use GPT for boundary design, consistency, and constraint mapping. Use Opus to close out unusually hard cases.
Frontend design / prototypes	Sonnet 4.6	GPT-5.4	Sonnet balances visual intent and instruction following well. GPT is acceptable if you want to stay on a single model.
Development execution: coding / CI-CD / debugging / code consistency	GPT-5.3-Codex xhigh	Sonnet 4.6	This row intentionally merges overlapping development tasks. Use Codex when the repository and terminal are the main workspace, and Sonnet for parallel review or volume work.
Code logic / risky-change review	Opus 4.6	GPT-5.4	Use Opus for strict issue analysis, long-diff review, and checking for missed hard bugs.
Business workflows / CRM / contract routing	Sonnet 4.6	GPT-5.4	Suited to heavily branched business processes and long-running agents.
Business logic / cost structure / operating design	GPT-5.4	Gemini 3.1 Pro	Use GPT to break down components, compare tradeoffs, and write them up. Use Gemini when the source set becomes very large.
Sales planning / proposals / go-to-market writing	GPT-5.4	Sonnet 4.6	Handles positioning, proposal structure, comparison tables, and writing quality together.
Excel / spreadsheets / modeling	GPT-5.4	Opus 4.6	GPT is explicitly positioned for Excel integration and financial modeling. Opus adds accuracy on complex tables and analysis.
PPT / presentations / decks	GPT-5.4	Opus 4.6	Presentation-quality improvements are explicitly stated for GPT. Opus is also strong for detail-heavy financial decks.
Long documents / PDFs / charts / tables	Gemini 3.1 Pro	Sonnet 4.6	Gemini suits massive multimodal context. Sonnet is strong on OfficeQA-style enterprise-document reading.
Legal / contracts / terms / policy analysis	Opus 4.6	GPT-5.4	High-stakes. Lawyer review is mandatory for final decisions. Always record the exact primary sources and their dates.
Medical / clinical literature / research summaries	Gemini 3.1 Pro	GPT-5.4	High-stakes. Limit models to literature triage and comparative summaries. Humans make diagnosis, treatment, and prescribing decisions.
Finance / analysis / valuation / diligence	GPT-5.4	Opus 4.6	High-stakes. GPT is strong on Excel and finance workflows. Opus is strong on multi-step analysis and document quality.
OpenClaw main agent	GPT-5.4	Sonnet 4.6	Use a broad generalist as the lead, keep Codex as the implementation worker, and use Opus as the auditor.
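
The table above can be encoded as a small lookup so that tooling picks a model the same way each time. The following is a minimal sketch covering only a subset of rows. The short keys such as `"dev_execution"`, the `Route` type, and `pick_model` are hypothetical names, and the model strings are the labels used in this guide rather than confirmed API identifiers. The fallback's second choice is an assumption; the guide itself only says to default to GPT-5.4 when in doubt.

```python
from typing import NamedTuple, Tuple


class Route(NamedTuple):
    first: str
    second: str
    high_stakes: bool = False


# Subset of the use-case table; keys are hypothetical short labels.
ROUTES = {
    "requirements": Route("GPT-5.4", "Sonnet 4.6"),
    "research": Route("GPT-5.4", "Gemini 3.1 Pro"),
    "frontend": Route("Sonnet 4.6", "GPT-5.4"),
    "dev_execution": Route("GPT-5.3-Codex xhigh", "Sonnet 4.6"),
    "code_review": Route("Opus 4.6", "GPT-5.4"),
    "long_documents": Route("Gemini 3.1 Pro", "Sonnet 4.6"),
    "legal": Route("Opus 4.6", "GPT-5.4", high_stakes=True),
    "medical": Route("Gemini 3.1 Pro", "GPT-5.4", high_stakes=True),
    "finance": Route("GPT-5.4", "Opus 4.6", high_stakes=True),
}

# "When in doubt, GPT-5.4"; the second choice here is an assumption.
DEFAULT = Route("GPT-5.4", "Sonnet 4.6")


def pick_model(use_case: str, first_available: bool = True) -> Tuple[str, bool]:
    """Return (model label, needs_human_review) for a use case."""
    route = ROUTES.get(use_case, DEFAULT)
    model = route.first if first_available else route.second
    return model, route.high_stakes
```

Returning the high-stakes flag alongside the model keeps the human-review requirement attached to the routing decision instead of leaving it to the caller's memory.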

High-stakes guardrails

Rule 1

In legal, medical, and financial workflows, do not let the model produce the final conclusion alone. Assume review by an accountable owner and a qualified professional.

Rule 2

Always go back to primary sources, official materials, and original texts. Save each source URL together with the date you checked it.

Rule 3

Use the model for drafting, comparison, issue spotting, table cleanup, and checklist generation. Decisions stay with humans.
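
The three rules can be enforced mechanically before any output is treated as final. The sketch below is one possible shape, not a prescribed implementation: `Source`, `Draft`, `release_blockers`, and the domain labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Domain labels are assumptions for this sketch.
HIGH_STAKES = {"legal", "medical", "finance"}


@dataclass
class Source:
    url: str
    checked_on: date  # date the source was checked (Rule 2)


@dataclass
class Draft:
    domain: str
    text: str  # model output: a draft, comparison, or checklist (Rule 3)
    sources: List[Source] = field(default_factory=list)
    reviewed_by: Optional[str] = None  # accountable human reviewer (Rule 1)


def release_blockers(draft: Draft) -> List[str]:
    """Return reasons the draft cannot be treated as final; empty means none."""
    if draft.domain not in HIGH_STAKES:
        return []
    blockers = []
    if not draft.sources:
        blockers.append("no primary source recorded")
    if not draft.reviewed_by:
        blockers.append("no human reviewer recorded")
    return blockers
```

A check like this only verifies that a source and a reviewer were recorded. Whether the source is genuinely primary and the review genuinely careful still depends on the people involved.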

Two-CLI operating rule

GPT-5.4 CLI

Handles requirements, design, diff review, research, and documentation.

GPT-5.3-Codex CLI

Handles implementation, tests, CI/CD, debugging, and fix loops.

Shared rule

Never edit the same file from both CLIs at once. Keep a single handoff file. GPT directs and Codex builds.
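
One way to make the shared rule hard to break is a per-file claim plus a single handoff log. The following is a minimal sketch under assumptions: the `.lock` suffix, the `HANDOFF.md` file name, and the function names are hypothetical, and neither CLI is known to provide this behavior itself. It relies on `os.open` with `O_CREAT | O_EXCL`, which fails if the lock file already exists.

```python
import os
from pathlib import Path

HANDOFF_NAME = "HANDOFF.md"  # hypothetical name for the single handoff file


def _lock_path(target: Path) -> Path:
    return target.with_name(target.name + ".lock")


def claim(target: Path, owner: str) -> bool:
    """Try to claim `target` for `owner`; return False if already claimed."""
    try:
        # O_EXCL makes creation fail when the lock file already exists.
        fd = os.open(_lock_path(target), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(owner)
    return True


def release(target: Path) -> None:
    """Drop the claim on `target`; do nothing if it was not claimed."""
    try:
        _lock_path(target).unlink()
    except FileNotFoundError:
        pass


def hand_off(repo: Path, target: Path, owner: str, note: str) -> None:
    """Append a note to the handoff file, then release the claim."""
    with open(repo / HANDOFF_NAME, "a", encoding="utf-8") as f:
        f.write(f"- {target.name} ({owner}): {note}\n")
    release(target)
```

In this arrangement the lead CLI would claim a file, write its instructions through `hand_off`, and only then would the builder CLI claim the same file.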

References

OpenAI: Introducing GPT-5.4

OpenAI: Introducing GPT-5.3-Codex

OpenAI: ChatGPT for Excel and financial data integrations

OpenAI: OpenAI API Pricing

Anthropic: Claude Sonnet 4.6

Anthropic: Claude Opus 4.6

Google DeepMind: Gemini 3.1 Pro model card

Google: Gemini API pricing

Note

The rankings above are operational recommendations derived from each vendor's official information, public benchmarks, pricing, and usage scenarios published as of March 6, 2026. They are not an absolute ranking from a single standardized benchmark.
