Dear Negotiation Explorer,
Welcome to Week 21 of our NegoAI series.
Last week, we explored the structured prompt framework (Role → Context → Instructions → Criteria → Output) and how voice prompting transforms AI from a search tool into a thinking partner.
But even with strong prompting, a crucial question remains:
How do you know whether the AI’s negotiation analysis is any good?
Because negotiation decisions rely on solid reasoning, we need a structured way to evaluate the quality of the AI’s analysis.
This is where a rigorous Quality Assessment (QA) system becomes indispensable.
Understanding AI Agents
AI agents are specialized systems designed to perform specific tasks autonomously within defined parameters. Unlike general LLMs, agents focus on particular functions and can work together in sequences to solve complex problems.
Research shows that effective negotiation preparation involves understanding counterparts' behavioral tendencies and communication preferences. Traditional approaches rely on subjective assessment, but AI agents can now systematically analyze digital footprints to generate evidence-based behavioral profiles.
Key components of functional AI agents include:
Defined scope: Narrow focus on specific tasks
Independent operation: Ability to process inputs without continuous guidance
Structured outputs: Consistent, formatted information delivery
Temperature control: Adjustable creativity vs. determinism balance
The Three LLM Limitations Negotiators Must Manage
LLMs have several structural limitations, and the three we are tackling at this stage are the ones that most directly affect the quality and reliability of negotiation analysis.
1. Non-Determinism (Inconsistent Outputs)
LLMs often produce different results even when given the same prompt and materials.
In negotiation preparation — where alignment and repeatability matter — this inconsistency is a real risk.
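One rough way to make this inconsistency visible is to run the same prompt twice, pull out the items each run listed (for example, the counterpart's interests), and measure how much the two lists overlap. The sketch below uses Jaccard similarity for that. The function name `overlap` and the two sample lists are made up for illustration; they are not output from a real model.

```python
def overlap(run_a, run_b):
    """Jaccard similarity between the items listed in two runs (0.0 to 1.0)."""
    # Normalize case and whitespace so "Price" and "price " count as the same item.
    a = {item.strip().lower() for item in run_a}
    b = {item.strip().lower() for item in run_b}
    if not a and not b:
        return 1.0  # two empty lists are trivially identical
    return len(a & b) / len(a | b)


# Made-up example lists, for illustration only.
run_1 = ["Price", "Speed of closing", "Certainty"]
run_2 = ["price", "Certainty", "Low hassle", "Tax timing"]

print(overlap(run_1, run_2))  # 2 shared items out of 5 distinct -> 0.4
```

Exact string matching is a blunt instrument, since two runs may phrase the same interest differently, so treat a low number as a prompt to compare the outputs by hand rather than as a verdict.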
2. Hallucinations (Confident but False Information)
LLMs may generate incorrect, invented, or unjustified statements while sounding fully confident.
In negotiation, hallucinations can distort:
interest analysis
BATNA assessments
stakeholder mapping
valuation estimates
risk evaluation
3. Opacity (Black-Box Reasoning)
LLMs do not reliably reveal the reasoning behind their conclusions; any explanation they offer is itself generated text.
You cannot see:
which assumptions they made
what they ignored
how they inferred missing details
why they weighted certain information
Without traceability, you cannot verify or trust the logic — a major issue in negotiation.
Why these limitations matter
Because these constraints are structural, you should never rely on the first AI output without evaluating it.
Quality Assessment is what turns AI into a reliable analytical partner.
The Quality Assessment Framework
A four-criterion system to evaluate any AI-generated negotiation analysis
Before diving into the detailed scoring rules, here is a simple overview table of the four criteria and their weights:
Quality Assessment Criteria & Weights
| Criterion | Description | Weight |
|---|---|---|
| 1. Interest Analysis | Depth, inferred interests, motivations, fears, constraints for both parties | 30% |
| 2. BATNA Analysis | Explicit + inferred BATNAs, consequences, strength evaluation | 25% |
| 3. Scenario Building | Multiple distinct counterpart profiles with ranked interests & BATNAs | 25% |
| 4. Value-Creation Insight | Creative, feasible, multi-issue options that expand the pie | 20% |
Now let’s break down each criterion in detail:
1. Interest Analysis (Depth & Accuracy)
Weight: 30%
Evaluate whether the AI captures the full spectrum of interests — not just positions.
Score 5 if:
goes far beyond price
includes motivations, fears, constraints
identifies hidden or inferred interests
ranks interests clearly
includes at least one contextual inference (e.g., if the seller already received a lower offer, anticipating their desire for certainty and speed)
extends interest analysis to the other party as well, especially within scenarios (specific, realistic, differentiated)
Score 1 if:
only positions are described
price dominates the analysis
2. BATNA Analysis (Explicit + Inferred)
Weight: 25%
Assess whether alternatives and consequences are analyzed properly.
Score 5 if:
identifies explicit alternatives
infers plausible unstated BATNAs
evaluates consequences of no agreement
distinguishes strong vs. weak BATNAs
Score 1 if:
simply repeats an obvious BATNA
ignores uncertainty
3. Scenario Building (Creativity + Plausibility)
Weight: 25%
Determine whether the output generates multiple, distinct, realistic scenarios for the counterpart.
Score 5 if:
3–4 differentiated scenarios
each with ranked interests and BATNA
scenarios feel grounded and specific
includes diagnostic questions to test which scenario is correct
Score 1 if:
one scenario only
scenarios are generic or similar
4. Value-Creation Insight (Integrative Thinking)
Weight: 20%
Evaluate whether the AI proposes options beyond price.
Score 5 if:
creative, feasible trade-offs
multi-issue solutions linked to interests
expands the pie meaningfully
Score 1 if:
purely distributive suggestions
little or no creativity
The Final Score
Final Score = (Interests × 0.30) + (BATNA × 0.25) + (Scenarios × 0.25) + (Value Creation × 0.20)
Interpretation:
4.5–5.0 → Expert-level analysis
4.0–4.49 → Strong and reliable
3.5–3.99 → Acceptable but requires refinement
3.0–3.49 → Major iteration required
<3.0 → Reject the analysis and re-prompt
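If you track scores in a spreadsheet or script, the formula and interpretation bands can be sketched in a few lines of Python. The weights come straight from the criteria table. The dictionary keys and function names are illustrative choices, and each band is matched by its lower bound, which is an assumption made so that in-between scores such as 4.45 still receive a label.

```python
# Weights copied from the criteria table; the key names are illustrative.
WEIGHTS = {
    "interests": 0.30,
    "batna": 0.25,
    "scenarios": 0.25,
    "value_creation": 0.20,
}

# Lower bound of each interpretation band, highest first (assumed matching rule).
BANDS = [
    (4.5, "Expert-level analysis"),
    (4.0, "Strong and reliable"),
    (3.5, "Acceptable but requires refinement"),
    (3.0, "Major iteration required"),
]


def final_score(scores):
    """Return the weighted total of the four 1-5 criterion scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError(f"expected scores for exactly: {sorted(WEIGHTS)}")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    # Round to two decimals to hide floating-point noise.
    return round(sum(scores[name] * WEIGHTS[name] for name in WEIGHTS), 2)


def interpret(score):
    """Map a final score to the first band whose lower bound it reaches."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Reject the analysis and re-prompt"


total = final_score({"interests": 4, "batna": 4, "scenarios": 5, "value_creation": 4})
print(total, "->", interpret(total))  # 4.25 -> Strong and reliable
```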
See This Live
Want to see this in action?
On November 20, I'm running a free 30-minute session where I’ll voice-prompt a full negotiation case study and demo the no-code workflow — built on negotiation and AI expertise, not AI hype.
Example Prompt: Quality Assessment of the AI’s Negotiation Analysis
Feel free to copy, save, or adapt this prompt for your next negotiation. It’s designed to be reused across cases and contexts.
Prompt
Evaluate the quality of the negotiation analysis you just generated using the following four criteria.
For each criterion, assign a score from 1–5 and explain the rationale in 2–3 lines.
Criteria
Interest Analysis (30%) – Have you identified a full spectrum of interests (motivations, fears, constraints) for both parties, clearly ranked and going beyond price?
BATNA Analysis (25%) – Have you identified explicit and inferred BATNAs, evaluated consequences of no agreement, and distinguished between strong and weak alternatives?
Scenario Building (25%) – Have you created multiple, distinct, realistic scenarios for the counterpart, each with ranked interests, BATNA, and diagnostic questions?
Value-Creation Insight (20%) – Have you proposed feasible, multi-issue options that expand the pie beyond price?
Then calculate the final weighted score using:
(Interests × 0.30) + (BATNA × 0.25) + (Scenarios × 0.25) + (Value Creation × 0.20)
Finally, provide:
The overall score
A short summary of strengths
A short summary of weaknesses
One concrete step to improve the weakest area
Example ChatGPT-5 Quality Assessment Response
Below is the actual response generated by ChatGPT-5 using this Quality Assessment prompt.
Quality Assessment of the Negotiation Analysis
1. Interest Analysis (Score: 4/5)
The analysis does a strong job identifying the seller’s interests beyond price: speed of transaction, avoiding holding costs, minimizing hassle, and limiting renegotiation risk.
It also identifies inferred interests (certainty, neutrality of relationship, maintaining neighborhood stability).
On the buyer side, interests are addressed within the scenarios but could be more deeply differentiated, especially around emotional or lifestyle drivers.
2. BATNA Analysis (Score: 4/5)
The BATNA is correctly identified as the $17,000 offer already available to the seller.
The analysis also infers alternative paths (waiting for a future buyer, holding the land, potential zoning changes).
Consequences of no agreement are addressed, though the relative strength of each BATNA could be more explicit.
3. Scenario Building (Score: 5/5)
Three realistic, distinct buyer scenarios are articulated:
The Practical Expander
The Strategic Investor
The Reluctant Neighbor
Each includes ranked interests and plausible BATNAs with clear diagnostic questions.
The differentiation is meaningful and grounded.
4. Value-Creation Insight (Score: 4/5)
The analysis proposes solid integrative options (closing flexibility, small concessions, cost sharing).
More creative multi-issue trades could be added (conditional price increases, shared improvements, staged agreements).
Final Weighted Score
= (4 × 0.30) + (4 × 0.25) + (5 × 0.25) + (4 × 0.20)
= 4.25 / 5
Overall Assessment
A strong, reliable analysis with excellent scenario design and well-structured interests.
Areas for improvement include deeper emotional interests and more creative value-creation options.
Next Refinement Step
Improve the Value-Creation Insight criterion by generating five new integrative options tied to the interests in each scenario.
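Because the model calculates its own weighted score, it is worth recomputing that number yourself before relying on it. The sketch below pulls the four scores out of a response and redoes the arithmetic. It assumes the response labels each criterion with "(Score: N/5)" in the same order as the prompt, as the example above does; other responses may be formatted differently, and the function name `recompute` is hypothetical.

```python
import re

# Weights in prompt order: Interests, BATNA, Scenarios, Value Creation.
WEIGHTS = [0.30, 0.25, 0.25, 0.20]


def recompute(response_text):
    """Extract '(Score: N/5)' values in order and recompute the weighted total."""
    found = re.findall(r"\(Score:\s*([1-5])/5\)", response_text)
    scores = [int(value) for value in found]
    if len(scores) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} scores, found {len(scores)}")
    total = sum(score * weight for score, weight in zip(scores, WEIGHTS))
    return scores, round(total, 2)


# Criterion headings taken from the example response above.
response = """
1. Interest Analysis (Score: 4/5)
2. BATNA Analysis (Score: 4/5)
3. Scenario Building (Score: 5/5)
4. Value-Creation Insight (Score: 4/5)
"""

scores, total = recompute(response)
reported = 4.25  # the figure the model stated
print(scores, total, abs(total - reported) < 0.005)  # [4, 4, 5, 4] 4.25 True
```

If the recomputed total and the reported total disagree, treat that as a warning sign about the rest of the self-assessment too.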
The Evaluation Loop
Follow this simple, repeatable process using the TOAST method — Test, Observe, Analyze, Solve, Test Again:
1. Test
Run your structured prompt to generate the initial output.
2. Observe
Score the output using the four criteria and identify the weakest one.
3. Analyze
Determine why that criterion failed — unclear instructions, missing steps, or insufficient detail.
4. Solve
Refine the prompt by adding clarity or constraints targeted at that specific weak area.
5. Test Again
Re-run the prompt and re-score the output.
Repeat until the analysis consistently reaches ≥ 4.2.
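For readers who automate their workflow, the loop above can be sketched as follows. The three callables `generate`, `score`, and `refine` are placeholders you would supply yourself (for example, a call to your AI tool, the Quality Assessment prompt, and your own prompt edits); none of them refers to a real library. Reading "consistently" as two consecutive runs at or above 4.2 is also an assumption, as is the cap of ten rounds.

```python
# Weights from the criteria table; the key names are illustrative.
WEIGHTS = {"interests": 0.30, "batna": 0.25, "scenarios": 0.25, "value_creation": 0.20}


def toast_loop(prompt, generate, score, refine,
               target=4.2, stable_runs=2, max_rounds=10):
    """Run Test/Observe/Analyze/Solve until `stable_runs` consecutive outputs
    reach `target`, or until `max_rounds` is used up.

    Returns the final prompt and the list of final scores, one per round.
    """
    streak = 0
    history = []
    for _ in range(max_rounds):
        output = generate(prompt)                  # Test
        scores = score(output)                     # Observe: criterion -> 1-5
        total = round(sum(scores[c] * w for c, w in WEIGHTS.items()), 2)
        history.append(total)
        if total >= target:
            streak += 1
            if streak >= stable_runs:
                break                              # consistently good enough
        else:
            streak = 0
            weakest = min(scores, key=scores.get)  # Analyze: lowest-scoring criterion
            prompt = refine(prompt, weakest)       # Solve: target that criterion
    return prompt, history
```

In practice the Analyze step needs human judgment about why a criterion scored low; the code only identifies which one did.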
This is how you train the AI to think better, not just answer better.
The evaluation stage is where AI moves from being a content generator to becoming a reliable negotiation partner.
