GemScore V4 Is Live: Living Intelligence for Startup Evaluation

We spent months building V4. Today it's live.

If you read our earlier post back in January, you know the thesis: V3 gave you a snapshot. V4 gives you a living system. Reports that predict, interact, and stay current as your startup evolves.

That thesis is now real, running in production, and available to every user on Assessment and Verification tiers — at no extra cost.

Here's everything you need to know.

What V4 Actually Is

V4 is a Living Intelligence layer on top of GemScore. It doesn't replace V3 — it enhances it.

When you run a V4 evaluation, the system first performs the full V3 analysis you're used to: dual-axis scoring (Potential + Readiness), adversarial Bull vs. Bear debate, evidence verification, and IC-ready memo generation. Then V4 kicks in with six additional intelligence features on top of the V3 results: five modules run concurrently during the evaluation, and Live Monitoring runs afterward on the schedule you configure.

The result: your report goes from a document you read once to a dashboard you return to, interact with, and share with investors as a living artifact.

The Six V4 Features

1. Scenario Modeling

Instead of a single score, V4 generates probability-weighted futures. Each evaluation produces multiple scenarios — typically Optimistic, Base Case, and Pessimistic — with specific probabilities assigned by the AI based on your company's current position.

Each scenario includes:

Score projections — How your Potential and Readiness scores would change under this path
Triggers — The specific events that would push you toward this scenario (e.g., "Close 3 enterprise pilots within 6 months")
Milestones — What success looks like at 6, 12, and 18 month intervals
Timeline — How long this trajectory would take to materialize

Why it matters: Investors don't invest in certainty — they invest in risk-adjusted outcomes. Scenario modeling makes the probability distribution explicit, giving both founders and investors a shared framework for discussing upside and downside.
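To make "probability-weighted" concrete, here is a minimal sketch of how a set of scenarios can be blended into expected scores. The scenario names come from this post; the probabilities, the scores, and the `Scenario` and `expected_scores` names are hypothetical, and the post does not say whether GemScore itself reports a blended figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    probability: float  # share of the probability mass, 0..1
    potential: float    # projected Potential score under this path
    readiness: float    # projected Readiness score under this path


def expected_scores(scenarios):
    """Return probability-weighted (Potential, Readiness) across scenarios."""
    total = sum(s.probability for s in scenarios)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities must sum to 1.0, got {total}")
    potential = sum(s.probability * s.potential for s in scenarios)
    readiness = sum(s.probability * s.readiness for s in scenarios)
    return potential, readiness


# Illustrative numbers only.
scenarios = [
    Scenario("Optimistic", 0.25, potential=84, readiness=76),
    Scenario("Base Case", 0.50, potential=72, readiness=64),
    Scenario("Pessimistic", 0.25, potential=56, readiness=48),
]

print(expected_scores(scenarios))  # (71.0, 63.0)
```

The useful property is that upside and downside stay visible as separate paths while still rolling up into one comparable pair of numbers.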

Figure: Scenario Modeling in action. A probability distribution bar with Base Case and Optimistic paths, each showing Potential and Readiness score projections, key triggers, and milestone timelines.

2. Interactive Q&A

Your report becomes a conversation partner. A floating "Ask AI" button is always accessible while viewing your V4 report. Click it, and you can ask follow-up questions in natural language.

Example questions:

"Why did my Market score drop to 68?"
"What would it take to improve Readiness by 15 points?"
"Compare my competitive position to [specific competitor]"
"What if we pivot to enterprise-only?"

Every answer comes with citations — specific references to the evidence, scores, and sections that inform the response. This isn't a generic chatbot; it reasons over your actual evaluation data.

How it works: Instead of embedding your report into a vector database and doing similarity search (the standard RAG approach), we build a hierarchical tree structure from your evaluation results. When you ask a question, the system traverses this tree using LLM-guided reasoning, choosing which branches to explore based on your question's intent. This gives us structured, explainable retrieval with section-level citations, and avoids the precision loss that similarity search can introduce on structured reports.
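A minimal sketch of the tree-traversal idea, under loud assumptions: the `Node` shape, the `retrieve` function, and the node summaries are hypothetical, and a simple keyword-overlap function stands in for the LLM call that would choose a branch in a real system. The section titles are borrowed from the Knowledge Map described in this post.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    summary: str = ""  # short description of what the subtree covers
    children: list = field(default_factory=list)


def words(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_relevance(question, node):
    """Stand-in for the LLM: count words the question shares with a node."""
    return len(words(question) & words(node.title + " " + node.summary))


def retrieve(root, question, choose=keyword_relevance):
    """Walk from the root to a leaf, taking the best-rated child each step.

    Returns the leaf plus the path of titles, which doubles as a citation.
    """
    node, path = root, [root.title]
    while node.children:
        node = max(node.children, key=lambda child: choose(question, child))
        path.append(node.title)
    return node, path


report = Node("Report", children=[
    Node("Overall Scores", "potential readiness market product team score breakdown", [
        Node("Market", "market score 68 driven by a crowded segment"),
        Node("Team", "team score strong founders"),
    ]),
    Node("Adversarial Debate", "bull and bear arguments", [
        Node("Bull Case", "upside arguments"),
        Node("Bear Case", "downside arguments"),
    ]),
])

leaf, citation_path = retrieve(report, "Why did my Market score drop to 68?")
print(" > ".join(citation_path))  # Report > Overall Scores > Market
```

Because retrieval follows the report's own hierarchy, the path taken is itself the explanation of where the answer came from.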

Rate limits: Each Q&A session supports up to 50 questions, with a rate limit of 10 questions per minute to ensure quality responses. Subscribers get higher quotas.
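For readers curious how limits like these are commonly enforced, here is a sketch of a sliding-window limiter using the two numbers above (10 per minute, 50 per session). The `QASessionLimiter` class and the sliding-window choice are assumptions for illustration; the post does not describe the actual mechanism.

```python
from collections import deque


class QASessionLimiter:
    """Allow at most `per_minute` questions in any 60 s window
    and at most `per_session` questions overall."""

    def __init__(self, per_minute=10, per_session=50):
        self.per_minute = per_minute
        self.per_session = per_session
        self.total = 0
        self.recent = deque()  # timestamps (seconds) of accepted questions

    def allow(self, now):
        # Forget questions that have left the 60-second window.
        while self.recent and now - self.recent[0] >= 60.0:
            self.recent.popleft()
        if self.total >= self.per_session or len(self.recent) >= self.per_minute:
            return False
        self.recent.append(now)
        self.total += 1
        return True


limiter = QASessionLimiter()
burst = [limiter.allow(float(t)) for t in range(11)]  # 11 questions in 11 s
print(burst.count(True))  # 10
```

Passing the clock in as `now` keeps the limiter deterministic and easy to test.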

Figure: Interactive Q&A in action. A floating slide-over panel shows a conversation with the AI about improving scores, with evidence-grounded answers that cite specific report sections such as Validation Findings and Evidence Chain.

Figure: The Knowledge Map that powers interactive Q&A. A hierarchical tree of every section the AI analyzed, with 36 sections indexed across Overall Scores, Adversarial Debate, Validation Findings, and Evidence Chain. Click any leaf to view details or ask a follow-up question.

3. Financial Model Generator

V4 auto-generates a driver-based financial model from your evaluation data — no spreadsheet skills required.

What you get:

12-month monthly P&L — Revenue, expenses broken down by category (payroll by role, operations, services, marketing), net income, and cash position for each month
3-year yearly projections — High-level annual figures for pitch decks
Runway dashboard — Current runway in months, monthly burn rate with trend, and cash position with alerts when runway drops below 12 months
Unit economics — LTV, CAC, LTV/CAC ratio, payback period, gross margin
Industry benchmarks — Each assumption is compared against industry medians and flagged if it's significantly above or below (so you know when your projections look optimistic or pessimistic)
Fundraising advisor — When to start raising, target raise range, expected timeline, and dilution scenarios with pre/post-money calculations
What-If simulator — Interactive sliders to adjust assumptions (founder salary, engineer salary, marketing budget, hiring delays) and see the impact on runway in real time

The model is business-model aware — it detects whether you're SaaS, marketplace, consumer, services, hardware, or pre-revenue, and adjusts its driver assumptions accordingly.
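The arithmetic behind several of these outputs is simple enough to sketch. The formulas below are common simplified textbook definitions, not GemScore's actual drivers; the function names, the sample inputs, and the 2x benchmark threshold are assumptions. Only the 12-month runway alert comes from this post.

```python
import math

RUNWAY_ALERT_MONTHS = 12  # alert threshold mentioned in the post


def runway_months(cash, monthly_burn):
    """Months of cash left at the current burn (infinite if not burning)."""
    return math.inf if monthly_burn <= 0 else cash / monthly_burn


def unit_economics(arpa, gross_margin, monthly_churn, cac):
    """Simplified LTV, LTV/CAC, and payback from per-account inputs."""
    monthly_margin = arpa * gross_margin  # gross profit per account per month
    ltv = monthly_margin / monthly_churn  # expected lifetime gross profit
    return {
        "ltv": ltv,
        "ltv_cac": ltv / cac,
        "payback_months": cac / monthly_margin,
    }


def benchmark_flag(value, median, factor=2.0):
    """Flag an assumption that is `factor` times above or below the median."""
    ratio = value / median
    if ratio >= factor:
        return "above"
    if ratio <= 1 / factor:
        return "below"
    return None


def dilution(pre_money, raise_amount):
    """Fraction of the company sold: raise / post-money valuation."""
    return raise_amount / (pre_money + raise_amount)


# Illustrative inputs only.
runway = runway_months(cash=450_000, monthly_burn=50_000)
print(runway, runway < RUNWAY_ALERT_MONTHS)  # 9.0 True
print(unit_economics(arpa=500, gross_margin=0.8, monthly_churn=0.02, cac=6_000))
print(benchmark_flag(value=1_200, median=600))  # above
print(dilution(pre_money=8_000_000, raise_amount=2_000_000))
```

A what-if slider is then just a matter of re-running these functions with one input changed.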

Export: Download the full financial model as a formatted Excel workbook (.xlsx) with three sheets (Summary, Monthly P&L, Assumptions). Ready to share with investors or import into your own models.

Figure: Financial Projections. A unit economics strip (LTV, CAC, LTV/CAC, Payback, Gross Margin) above a 12-month P&L table with a monthly/yearly toggle, expenses by category, net income, cash position, and runway.

4. Evidence Graph

Every GemScore evaluation generates hundreds of data points: scores, dimension breakdowns, strengths, gaps, red flags, evidence items, and debate arguments. The Evidence Graph visualizes all of these as an interactive force-directed network.

Nodes are color-coded by type:

Scores — Your top-level Potential and Readiness
Dimensions — Individual scoring axes (Market, Product, Team, etc.)
Evidence — Verified claims and data points
Debate — Bull and Bear arguments

Click any node to highlight its connections and see metadata — confidence levels, sources, related evidence. Zoom, pan, and explore how the AI arrived at its conclusions.
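Underneath the visualization, a graph like this is a set of nodes and edges. The sketch below shows the two interactions described above, listing a node's connections and tracing a chain back to a score, on a tiny hypothetical graph. The node names, edges, and function names are invented for illustration and say nothing about how the real Evidence Graph is stored.

```python
from collections import deque

# Hypothetical edges: scores, dimensions, evidence items, and one debate argument.
EDGES = [
    ("Potential", "Market"), ("Potential", "Team"), ("Readiness", "Team"),
    ("Market", "E1"), ("Market", "Bear-1"), ("Team", "E2"), ("Bear-1", "E1"),
]


def build_adjacency(edges):
    """Undirected adjacency map: node -> set of connected nodes."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def connections(adj, node):
    """What a click would highlight: the node's direct neighbors."""
    return sorted(adj.get(node, ()))


def trace(adj, start, goal):
    """Breadth-first search for the shortest chain from start to goal."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adj.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection between the two nodes


adj = build_adjacency(EDGES)
print(connections(adj, "Market"))     # ['Bear-1', 'E1', 'Potential']
print(trace(adj, "E2", "Potential"))  # ['E2', 'Team', 'Potential']
```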

Why it matters: Transparency. Instead of trusting a black-box score, you can trace every number back through the reasoning chain. Investors reviewing your report can do the same.

Figure: Evidence Graph visualization. An interactive force-directed network with color-coded nodes for Scores (blue), Dimensions (purple), Strengths (green), Gaps (orange), Red Flags (red), and Evidence (gray); 80+ nodes and 84 connections map the full reasoning chain from scores to individual evidence items. Click any node to trace the logic.

5. Investor DNA Matching

V4 analyzes your evaluation results and matches them against investor profiles based on four dimensions:

Thesis alignment (40%) — Does this investor actually invest in your sector, stage, and geography?
Portfolio synergy (25%) — Are there complementary companies in their portfolio, or conflicts?
Check size fit (20%) — Does your raise target fall within their typical range?
Track record (15%) — What's their history with companies at your stage?

Each match includes a score (0-100%), specific match reasons, potential concerns, and a suggested approach strategy for outreach.

The matching runs through an AI scoring layer after a pre-filter pass that eliminates obvious mismatches (wrong sector, wrong stage, wrong check size). This keeps quality high while avoiding wasted compute on clearly irrelevant pairings.
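The pre-filter and the weighted formula can be sketched in a few lines. The weights are the ones listed above; everything else is an assumption for illustration: the dictionary fields, the function names, the sample investors, and the idea that the AI layer hands back one 0-100 sub-score per dimension.

```python
WEIGHTS = {"thesis": 0.40, "portfolio": 0.25, "check_size": 0.20, "track_record": 0.15}


def passes_prefilter(startup, investor):
    """Cheap checks that drop obvious mismatches before any AI scoring."""
    return (
        startup["sector"] in investor["sectors"]
        and startup["stage"] in investor["stages"]
        and investor["check_min"] <= startup["raise_target"] <= investor["check_max"]
    )


def match_score(sub_scores):
    """Weighted sum of 0-100 sub-scores, rounded to one decimal place."""
    return round(sum(WEIGHTS[k] * sub_scores[k] for k in WEIGHTS), 1)


def rank(startup, investors, sub_scores):
    """Score only the investors that survive the pre-filter, best first."""
    matches = [
        (inv["name"], match_score(sub_scores[inv["name"]]))
        for inv in investors
        if passes_prefilter(startup, inv)
    ]
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Hypothetical data.
startup = {"sector": "fintech", "stage": "seed", "raise_target": 2_000_000}
investors = [
    {"name": "Fund A", "sectors": {"fintech", "saas"}, "stages": {"seed", "series-a"},
     "check_min": 500_000, "check_max": 3_000_000},
    {"name": "Fund B", "sectors": {"fintech"}, "stages": {"series-b"},
     "check_min": 5_000_000, "check_max": 20_000_000},
]
sub_scores = {"Fund A": {"thesis": 90, "portfolio": 60, "check_size": 80, "track_record": 70}}

print(rank(startup, investors, sub_scores))  # [('Fund A', 77.5)]
```

Fund B never reaches the scoring step, which is the point of the pre-filter: no sub-scores are computed for pairings that fail on stage or check size.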

Figure: Investor DNA Matching. Each expanded match card shows thesis fit, check size analysis, portfolio synergy breakdown, match reasons (green), concerns (amber), and a suggested outreach approach.

6. Live Monitoring

Enable monitoring on any evaluated idea, and V4 will periodically re-scan for new evidence and changes. Configure the check interval (every 6, 12, 24, 48, or 72 hours) based on how actively things are changing.

The monitoring dashboard shows:

Active/Inactive status with toggle control
Total checks completed and updates found
Next scheduled check time
History of detected changes

Why it matters: Your report stays current. If a competitor raises funding, if your traction numbers change, or if new market data emerges — the system catches it and flags the impact.
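As a small illustration of the scheduling side, here is how a "next scheduled check" value could be derived from the last check and the chosen interval. The allowed intervals are the ones listed above; the `next_check` function and the validation behavior are assumptions, not a description of the production scheduler.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_INTERVAL_HOURS = (6, 12, 24, 48, 72)  # options listed in the post


def next_check(last_check, interval_hours):
    """Return when the next re-scan is due, rejecting unsupported intervals."""
    if interval_hours not in ALLOWED_INTERVAL_HOURS:
        raise ValueError(f"interval must be one of {ALLOWED_INTERVAL_HOURS}")
    return last_check + timedelta(hours=interval_hours)


last = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
print(next_check(last, 48).isoformat())  # 2025-01-03T09:00:00+00:00
```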

Model Presets: Choose Your AI Configuration

Every V4 evaluation is powered by a multi-agent pipeline where specialized AI models handle different stages: analysis, web search, debate (Bull and Bear perspectives), and synthesis. You can control which models power your evaluation by selecting a preset.

Available Presets

Express — Optimized for speed. Fast turnaround with capable models. Best for screening-stage evaluations where you need quick directional insight.
Standard — The default. Balanced between speed, depth, and cost. Recommended for most evaluations.
Professional — More capable models across all slots. Deeper analysis, stronger debate quality.
Executive — Premium models throughout the pipeline. For evaluations that need maximum analytical depth.
Research — The most thorough configuration. Cross-provider debate (different AI providers argue Bull vs. Bear), more debate rounds, and the most capable models. Best for high-stakes evaluations.

Each preset configures:

Which AI model handles each pipeline stage
The debate strategy (single-model, cross-provider, or rotation)
Number of debate rounds
Estimated processing time
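One way to picture a preset is as a small, validated configuration record. The sketch below is hypothetical throughout: the `Preset` class, the slot names, the `provider/model` identifier format, and every value in the example are placeholders, since the real model assignments live on the Models page rather than in this post.

```python
from dataclasses import dataclass

STAGES = ("analysis", "web_search", "bull", "bear", "synthesis")
DEBATE_STRATEGIES = ("single-model", "cross-provider", "rotation")


@dataclass(frozen=True)
class Preset:
    name: str
    models: dict         # pipeline stage -> "provider/model" identifier
    debate_strategy: str
    debate_rounds: int
    est_minutes: tuple   # (low, high) estimate

    def __post_init__(self):
        missing = [s for s in STAGES if s not in self.models]
        if missing:
            raise ValueError(f"no model configured for: {missing}")
        if self.debate_strategy not in DEBATE_STRATEGIES:
            raise ValueError(f"unknown debate strategy: {self.debate_strategy}")
        if self.debate_rounds < 1:
            raise ValueError("debate_rounds must be at least 1")
        if self.debate_strategy == "cross-provider":
            # Bull and Bear must come from different providers.
            bull = self.models["bull"].split("/")[0]
            bear = self.models["bear"].split("/")[0]
            if bull == bear:
                raise ValueError("cross-provider debate needs two providers")


# Placeholder values, not the real Research configuration.
research = Preset(
    name="Research",
    models={
        "analysis": "provider-a/large",
        "web_search": "provider-a/search",
        "bull": "provider-a/large",
        "bear": "provider-b/large",
        "synthesis": "provider-a/large",
    },
    debate_strategy="cross-provider",
    debate_rounds=3,
    est_minutes=(10, 15),
)
print(research.name, research.debate_strategy, research.debate_rounds)
```

Validating at construction time means an inconsistent preset fails before an evaluation starts rather than midway through the pipeline.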

You select your preset when ordering an evaluation, right on the evaluation type selection page. A compact pill selector shows all available options with descriptions, timing estimates, and debate strategy details.

Custom Presets (Coming Soon)

We're building the ability for subscribers to create their own custom presets — pick specific models for each pipeline slot, set debate rounds, and save configurations for reuse. This is currently in development and will be available soon.

You can compare all presets and their model configurations on the Models page.

How V4 Works Under the Hood

Here's a simplified view of what happens when you submit a V4 evaluation:

Step 1: V3 Analysis — The full V3 pipeline runs first: 6 specialized agents (Business, Market, Product, Team, Risk, and a Financial pre-scan) analyze your startup across multiple dimensions. Bull and Bear debaters stress-test every claim. Evidence chains are built and verified.

Step 2: V4 Enhancement — Once V3 completes, five V4 modules run concurrently:

Scenario Modeling Agent generates probability-weighted futures
Financial Model Agent builds the driver-based projections
Q&A Index Service constructs the retrieval tree
Evidence Graph Service maps the node/edge visualization
Investor Matching Service scores against the investor database

Step 3: Assembly — All results are saved and the report is assembled. If any single V4 enhancement fails, the others still complete — failures are isolated per module.
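The "run concurrently, fail independently" pattern in Steps 2 and 3 can be sketched with Python's `asyncio.gather` and `return_exceptions=True`. This is an illustration of the pattern only: the stub modules, their return values, and the `run_enhancements` function are hypothetical, and the post does not say what language or framework the real orchestrator uses.

```python
import asyncio


async def run_enhancements(modules, v3_result):
    """Run every module concurrently; collect failures instead of raising."""
    names = list(modules)
    outcomes = await asyncio.gather(
        *(modules[name](v3_result) for name in names),
        return_exceptions=True,  # one failing module does not cancel the rest
    )
    succeeded, failed = {}, {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            failed[name] = repr(outcome)
        else:
            succeeded[name] = outcome
    return succeeded, failed


# Stub modules standing in for the real services.
async def scenario_modeling(v3):
    return {"scenarios": 3}


async def financial_model(v3):
    return {"runway_months": 9}


async def investor_matching(v3):
    raise RuntimeError("investor database unavailable")


MODULES = {
    "scenario_modeling": scenario_modeling,
    "financial_model": financial_model,
    "investor_matching": investor_matching,
}

succeeded, failed = asyncio.run(run_enhancements(MODULES, {"potential": 72}))
print(sorted(succeeded))  # ['financial_model', 'scenario_modeling']
print(sorted(failed))     # ['investor_matching']
```

The report can then list which enhancements succeeded and which did not, while the V3 result passed in stays untouched.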

The entire pipeline is orchestrated by our multi-agent system, with results persisted and served through the web application in real time. V4 enhancements typically add 2-5 minutes on top of the V3 evaluation time, depending on the model preset selected.

How to Use V4

Step 1: Create or Select an Idea

Step 2: Order an Evaluation

Click "Order Report" and choose your evaluation type — Assessment or Verification. V4 enhancements are included automatically with both tiers.

Step 3: Pick a Model Preset

Use the AI Configuration selector to choose your preset (Express through Research). The default "Standard" preset works well for most evaluations. Each preset shows estimated time and debate strategy.

Step 4: Submit and Wait

The evaluation runs automatically. You'll see a progress tracker showing each phase: document analysis, web research, multi-agent scoring, debate, and V4 enhancements. Typical total time: 5-15 minutes depending on your preset.

Step 5: Explore Your V4 Report

Once complete, your report opens with the new V4 layout:

Sidebar navigation groups sections into Intelligence (Scenarios, Financial, Investors, Evidence), Network, Documents (Data Room, Investment Memo), Monitoring, and Workspace
Floating Ask AI button (bottom-right) opens the Q&A panel from any section
Overview dashboard shows V4 highlights — top scenario probabilities, runway headline, investor match count, and evidence graph stats

Step 6: Take Action

Export your financial model to Excel
Share your report with a public link (with optional password protection)
Enable monitoring to keep the report updated
Ask follow-up questions to drill deeper into any finding

Frequently Asked Questions

What's the difference between V3 and V4?

V3 produces a static evaluation report with dual-axis scoring, adversarial debate, and evidence chains. V4 adds six intelligence layers on top: scenario modeling, interactive Q&A, financial projections, evidence graph visualization, investor matching, and live monitoring. V3 is still the foundation — V4 enhances it.

Do I need to re-run existing evaluations to get V4 features?

Yes. V4 enhancements are generated during the evaluation process, so existing V3 reports won't retroactively gain V4 features. You'll need to run a new evaluation to get the full V4 experience.

Is V4 a separate tier? Does it cost extra?

No. V4 enhancements are included with both Assessment and Verification tiers at no additional cost. There is no separate "V4 tier" — the intelligence features are baked into the existing evaluation flow.

How does the Q&A system work? Is it just ChatGPT?

No. The Q&A system uses a tree-based retrieval method that's purpose-built for structured evaluation data. Instead of using vector embeddings (the typical RAG approach), it builds a hierarchical tree from your report and uses LLM-guided traversal to find relevant sections. Every answer includes citations pointing to specific evidence and sections. The underlying LLM depends on your selected model preset.

Are the financial projections accurate?

The projections are AI-generated starting points based on your inputs, industry benchmarks, and comparable companies. They flag assumptions that deviate significantly from industry medians (e.g., "your CAC assumption is 2x the median for SaaS at your stage"). They're designed to be defensible conversation starters with investors, not final forecasts. Always validate with your own financial advisor.

How does investor matching work? Is it real investors?

The system matches against a curated database of investor profiles with known thesis areas, check sizes, portfolio companies, and stage preferences. Matching uses a weighted scoring formula (thesis 40%, portfolio synergy 25%, check size 20%, track record 15%) with AI-powered analysis. Results include specific match reasons and suggested approach strategies.

Why not use standard vector search for Q&A?

Traditional RAG systems embed documents into vector databases and retrieve by similarity — which works well for unstructured text but loses the hierarchical structure of evaluation reports (sections, sub-sections, dimensions, evidence chains). Our tree-based approach preserves this structure and uses LLM reasoning to navigate it, giving more precise, explainable retrieval with section-level citations.

Can I export the financial model?

Yes. Click the "Export to Excel" button on the Financial Projections section. You'll get a formatted .xlsx workbook with three sheets: Summary (key metrics and unit economics), Monthly P&L (12-month breakdown), and Assumptions (all inputs with benchmark comparisons).

How often does live monitoring check?

You configure the interval: every 6, 12, 24, 48, or 72 hours. The system runs background checks at your chosen frequency, looking for new evidence, market changes, and competitor activity. You can toggle monitoring on/off at any time.

What model presets are available?

Five presets: Express (fastest), Standard (default, balanced), Professional (deeper analysis), Executive (premium models), and Research (maximum depth with cross-provider debate). You select when ordering an evaluation. Custom presets where you choose specific models for each pipeline slot are coming soon.

Can I share my V4 report with investors?

Yes. Every report can be shared via a public link with optional password protection. The shared view includes the core evaluation report in a read-only format. V4 interactive features like Q&A remain available only to the report owner.

What happens if a V4 enhancement fails during evaluation?

Each V4 module runs independently. If one fails (e.g., investor matching encounters an issue), the other four still complete. Your report will show which enhancements succeeded and which didn't. The core V3 evaluation is never affected by V4 enhancement failures.

What's Next

V4 is live, but we're not done. Here's what we're working on:

Custom model presets — Create your own AI configurations, pick specific models per pipeline slot, save and reuse
Video pitch analysis — Upload your pitch video, get feedback on transcript, slides, delivery, and extracted claims
Deeper investor database — Expanding the matching pool beyond the current curated set
API access — Programmatic access to evaluations, Q&A, and financial models for integrations
Enhanced monitoring — Richer change detection, automated re-scoring, and alert notifications

Get Started

V4 is available now for all Assessment and Verification evaluations. No waitlist, no extra cost.

Start your evaluation and experience what living intelligence feels like.

If you're already a GemScore user, your next evaluation will automatically include V4 features. If you're new, sign up and try a free Screening to see the platform — then upgrade to Assessment or Verification for the full V4 experience.

Questions? Reach out to us — we'd love to hear what you think.

V4 is the biggest update we've shipped. It represents months of work across AI architecture, financial modeling, information retrieval, and frontend design. We're excited to see how founders and investors use it.

— The Athanor Team