Dear Negotiation Explorer,

Welcome to Week 21 of our NegoAI series.

Last week, we explored the structured prompt framework (Role → Context → Instructions → Criteria → Output) and how voice prompting transforms AI from a search tool into a thinking partner.

But even with strong prompting, a crucial question remains:

How do you know whether the AI’s negotiation analysis is any good?

Because negotiation decisions rely on solid reasoning, we need a structured way to evaluate the quality of the AI’s analysis.

This is where a rigorous Quality Assessment (QA) system becomes indispensable.

Understanding AI Agents

AI agents are specialized systems designed to perform specific tasks autonomously within defined parameters. Unlike general LLMs, agents focus on particular functions and can work together in sequences to solve complex problems.

Research shows that effective negotiation preparation involves understanding counterparts' behavioral tendencies and communication preferences. Traditional approaches rely on subjective assessment, but AI agents can now systematically analyze digital footprints to generate evidence-based behavioral profiles.

Key components of functional AI agents include:

  • Defined scope: Narrow focus on specific tasks

  • Independent operation: Ability to process inputs without continuous guidance

  • Structured outputs: Consistent, formatted information delivery

  • Temperature control: Adjustable creativity vs. determinism balance
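For readers who experiment with API-based tools rather than a chat window, here is a minimal sketch of what temperature control looks like in practice. It assumes the OpenAI Python SDK; the model name is a placeholder, and the same idea carries over to other providers.

```python
# Minimal sketch of temperature control, assuming the OpenAI Python SDK.
# "gpt-4o" is a placeholder model name; substitute whichever model you use.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_prompt(prompt: str, temperature: float) -> str:
    """Run one prompt; lower temperature favors consistency, higher favors creativity."""
    response = client.chat.completions.create(
        model="gpt-4o",              # placeholder model name
        temperature=temperature,     # ~0 = more deterministic, ~1 = more varied
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Low temperature for repeatable scoring, higher temperature for option generation:
# run_prompt(qa_prompt, 0.2) vs. run_prompt(brainstorm_prompt, 0.9)
```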

The Three LLM Limitations Negotiators Must Manage

LLMs have several structural limitations, and the three we are tackling at this stage are the ones that most directly affect the quality and reliability of negotiation analysis.

1. Non-Determinism (Inconsistent Outputs)

LLMs often produce different results even when given the same prompt and materials.

In negotiation preparation — where alignment and repeatability matter — this inconsistency is a real risk.

2. Hallucinations (Confident but False Information)

LLMs may generate incorrect, invented, or unjustified statements while sounding fully confident.

In negotiation, hallucinations can distort:

  • interest analysis

  • BATNA assessments

  • stakeholder mapping

  • valuation estimates

  • risk evaluation

3. Opacity (Black-Box Reasoning)

LLMs do not reveal the reasoning behind their conclusions.

You cannot see:

  • which assumptions they made

  • what they ignored

  • how they inferred missing details

  • why they weighted certain information

Without traceability, you cannot verify or trust the logic — a major issue in negotiation.

Why these limitations matter

Because these constraints are structural, you should never rely on the first AI output without evaluating it.

Quality Assessment is what turns AI into a reliable analytical partner.

The Quality Assessment Framework

A four-criterion system to evaluate any AI-generated negotiation analysis

Before diving into the detailed scoring rules, here is a simple overview table of the four criteria and their weights:

Quality Assessment Criteria & Weights

1. Interest Analysis (30%) – Depth, inferred interests, motivations, fears, constraints for both parties

2. BATNA Analysis (25%) – Explicit + inferred BATNAs, consequences, strength evaluation

3. Scenario Building (25%) – Multiple distinct counterpart profiles with ranked interests & BATNAs

4. Value-Creation Insight (20%) – Creative, feasible, multi-issue options that expand the pie

Now let’s break down each criterion in detail:

1. Interest Analysis (Depth & Accuracy)

Weight: 30%

Evaluate whether the AI captures the full spectrum of interests — not just positions.

Score 5 if:

  • goes far beyond price

  • includes motivations, fears, constraints

  • identifies hidden or inferred interests

  • ranks interests clearly

  • includes at least one contextual inference (e.g., anticipating the seller's desire for certainty and speed if they have already received a lower offer)

  • extends interest analysis to the other party as well, especially within scenarios (specific, realistic, differentiated)

Score 1 if:

  • only positions are described

  • price dominates the analysis

2. BATNA Analysis (Explicit + Inferred)

Weight: 25%

Assess whether alternatives and consequences are analyzed properly.

Score 5 if:

  • identifies explicit alternatives

  • infers plausible unstated BATNAs

  • evaluates consequences of no agreement

  • distinguishes strong vs. weak BATNAs

Score 1 if:

  • simply repeats an obvious BATNA

  • ignores uncertainty

3. Scenario Building (Creativity + Plausibility)

Weight: 25%

Determine whether the output generates multiple, distinct, realistic scenarios for the counterpart.

Score 5 if:

  • 3–4 differentiated scenarios

  • each with ranked interests and BATNA

  • scenarios feel grounded and specific

  • includes diagnostic questions to test which scenario is correct

Score 1 if:

  • one scenario only

  • scenarios are generic or similar

4. Value-Creation Insight (Integrative Thinking)

Weight: 20%

Evaluate whether the AI proposes options beyond price.

Score 5 if:

  • creative, feasible trade-offs

  • multi-issue solutions linked to interests

  • expands the pie meaningfully

Score 1 if:

  • purely distributive suggestions

  • little or no creativity

The Final Score

Final Score = (Interests × 0.30) + (BATNA × 0.25) + (Scenarios × 0.25) + (Value Creation × 0.20)

Interpretation:

4.5–5.0 → Expert-level analysis

4.0–4.4 → Strong and reliable

3.5–3.9 → Acceptable but requires refinement

3.0–3.4 → Major iteration required

<3.0 → Reject the analysis and re-prompt
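If you prefer to compute the score outside the chat, here is a minimal sketch in Python of the same weighted formula and interpretation bands. The function names are illustrative only; the weights and thresholds are the ones defined above.

```python
# Minimal sketch of the weighted Quality Assessment score and its interpretation.
# Weights and bands are the ones defined in this framework; names are illustrative.

WEIGHTS = {
    "interests": 0.30,
    "batna": 0.25,
    "scenarios": 0.25,
    "value_creation": 0.20,
}

def final_score(scores: dict[str, float]) -> float:
    """Weighted average of the four criterion scores (each rated 1-5)."""
    return round(sum(scores[name] * weight for name, weight in WEIGHTS.items()), 2)

def interpret(score: float) -> str:
    """Map a final score to the interpretation bands above."""
    if score >= 4.5:
        return "Expert-level analysis"
    if score >= 4.0:
        return "Strong and reliable"
    if score >= 3.5:
        return "Acceptable but requires refinement"
    if score >= 3.0:
        return "Major iteration required"
    return "Reject the analysis and re-prompt"

# Example using the scores from the ChatGPT-5 assessment later in this issue:
scores = {"interests": 4, "batna": 4, "scenarios": 5, "value_creation": 4}
print(final_score(scores), "->", interpret(final_score(scores)))  # 4.25 -> Strong and reliable
```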

See This Live

Want to see this in action?

On November 20, I'm running a free 30-minute session where I’ll voice-prompt a full negotiation case study and demo the no-code workflow — built on negotiation and AI expertise, not AI hype.

Know someone who’d benefit? Tag them below.

Example Prompt: Quality Assessment of the AI’s Negotiation Analysis

Feel free to copy, save, or adapt this prompt for your next negotiation. It’s designed to be reused across cases and contexts.

Prompt

Evaluate the quality of the negotiation analysis you just generated using the following four criteria.

For each criterion, assign a score from 1–5 and explain the rationale in 2–3 lines.

Criteria

Interest Analysis (30%) – Have you identified a full spectrum of interests (motivations, fears, constraints) for both parties, clearly ranked and going beyond price?

BATNA Analysis (25%) – Have you identified explicit and inferred BATNAs, evaluated consequences of no agreement, and distinguished between strong and weak alternatives?

Scenario Building (25%) – Have you created multiple, distinct, realistic scenarios for the counterpart, each with ranked interests, BATNA, and diagnostic questions?

Value-Creation Insight (20%) – Have you proposed feasible, multi-issue options that expand the pie beyond price?

Then calculate the final weighted score using:

(Interests × 0.30) + (BATNA × 0.25) + (Scenarios × 0.25) + (Value Creation × 0.20)

Finally, provide:

  • The overall score

  • A short summary of strengths

  • A short summary of weaknesses

  • One concrete step to improve the weakest area
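If you run your preparation through an API rather than a chat window, the same prompt can be sent as a follow-up turn. This is only a sketch, again assuming the OpenAI Python SDK with a placeholder model name; the condensed QA_PROMPT below paraphrases the full prompt above.

```python
# Minimal sketch of sending the Quality Assessment prompt as a follow-up turn,
# assuming the OpenAI Python SDK. The model name is a placeholder and QA_PROMPT
# condenses the full prompt above; adapt both to your own setup.
from openai import OpenAI

client = OpenAI()

QA_PROMPT = (
    "Evaluate the negotiation analysis below on four criteria: Interest Analysis (30%), "
    "BATNA Analysis (25%), Scenario Building (25%), Value-Creation Insight (20%). "
    "Score each 1-5 with a 2-3 line rationale, compute the weighted final score, then list "
    "strengths, weaknesses, and one concrete step to improve the weakest area."
)

def assess(analysis: str) -> str:
    """Return the model's Quality Assessment of a previously generated analysis."""
    response = client.chat.completions.create(
        model="gpt-4o",      # placeholder model name
        temperature=0.2,     # keep the scoring relatively repeatable
        messages=[{"role": "user",
                   "content": f"Negotiation analysis to evaluate:\n\n{analysis}\n\n{QA_PROMPT}"}],
    )
    return response.choices[0].message.content
```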

Example ChatGPT-5 Quality Assessment Response

Below is the actual response generated by ChatGPT-5 using this Quality Assessment prompt.

Quality Assessment of the Negotiation Analysis

1. Interest Analysis (Score: 4/5)

The analysis does a strong job identifying the seller’s interests beyond price: speed of transaction, avoiding holding costs, minimizing hassle, and limiting renegotiation risk.

It also identifies inferred interests (certainty, neutrality of relationship, maintaining neighborhood stability).

On the buyer side, interests are addressed within the scenarios but could be more deeply differentiated, especially around emotional or lifestyle drivers.

2. BATNA Analysis (Score: 4/5)

The BATNA is correctly identified as the $17,000 offer already available to the seller.

The analysis also infers alternative paths (waiting for a future buyer, holding the land, potential zoning changes).

Consequences of no agreement are addressed, though the relative strength of each BATNA could be more explicit.

3. Scenario Building (Score: 5/5)

Three realistic, distinct buyer scenarios are articulated:

The Practical Expander

The Strategic Investor

The Reluctant Neighbor

Each includes ranked interests and plausible BATNAs with clear diagnostic questions.

The differentiation is meaningful and grounded.

4. Value-Creation Insight (Score: 4/5)

The analysis proposes solid integrative options (closing flexibility, small concessions, cost sharing).

More creative multi-issue trades could be added (conditional price increases, shared improvements, staged agreements).

Final Weighted Score

= (4 × 0.30) + (4 × 0.25) + (5 × 0.25) + (4 × 0.20)

= 4.25 / 5

Overall Assessment

A strong, reliable analysis with excellent scenario design and well-structured interests.

Areas for improvement include deeper emotional interests and more creative value-creation options.

Next Refinement Step

Improve the Value-Creation Insight criterion by generating five new integrative options tied to the interests in each scenario.

The Evaluation Loop

Follow this simple, repeatable process using the TOAST method (Test, Observe, Analyze, Solve, Test Again):

1. Test

Run your structured prompt to generate the initial output.

2. Observe

Score the output using the four criteria and identify the weakest one.

3. Analyze

Determine why that criterion failed — unclear instructions, missing steps, or insufficient detail.

4. Solve

Refine the prompt by adding clarity or constraints targeted at that specific weak area.

5. Test Again

Re-run the prompt and re-score the output.

Repeat until the analysis consistently reaches ≥ 4.2.
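For those who automate their preparation, the loop itself fits in a few lines. The sketch below is a skeleton only: generate, score, and refine are callables you supply (for example, wrappers around the prompts in this issue), not functions from any library.

```python
# Minimal sketch of the TOAST loop (Test, Observe, Analyze, Solve, Test Again).
# `generate`, `score`, and `refine` are callables you supply; they are not part
# of any library. Criterion weights and the 4.2 target come from this framework.
from typing import Callable

WEIGHTS = {"interests": 0.30, "batna": 0.25, "scenarios": 0.25, "value_creation": 0.20}

def toast_loop(
    prompt: str,
    generate: Callable[[str], str],            # runs the structured prompt, returns the analysis
    score: Callable[[str], dict[str, float]],  # scores the four criteria, 1-5 each
    refine: Callable[[str, str], str],         # tightens the prompt for the weakest criterion
    target: float = 4.2,
    max_rounds: int = 5,
) -> tuple[str, float]:
    analysis, final = "", 0.0
    for _ in range(max_rounds):
        analysis = generate(prompt)                              # Test
        scores = score(analysis)                                 # Observe
        final = sum(scores[k] * w for k, w in WEIGHTS.items())
        if final >= target:
            break                                                # target reached, stop iterating
        weakest = min(scores, key=scores.get)                    # Analyze: find the weakest criterion
        prompt = refine(prompt, weakest)                         # Solve, then Test Again on the next pass
    return analysis, final
```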

This is how you train the AI to think better — not just answer better.

The evaluation stage is where AI moves from being a content generator to becoming a reliable negotiation partner.
