Linguistic Connectors: The Small Detail That Makes Retrieval Bots Sound Human

A retrieval-based chatbot has a problem that generation-based bots do not: it is composing responses out of pre-written pieces. If you just concatenate the fragments, the result sounds like a list of facts, not an answer. The user can feel the seams.

The fix is small but important: connectors, the short transitional phrases that link fragments into a coherent paragraph. They are the difference between "a chatbot that returns facts" and "a chatbot that talks to you."

The Seam Problem

A naive retrieval bot returns something like:

> A Nash equilibrium is a strategy profile where no player wants to deviate. Nash equilibrium is used in non-cooperative game theory. The concept was introduced by John Nash in 1950.

Three fragments, no transitions. It reads like a glossary entry, not an answer.

A composed response with connectors:

> A Nash equilibrium is a strategy profile where no player wants to deviate. In formal game theory, this concept is used to analyze non-cooperative settings. Introduced by John Nash in 1950, it has since become a foundational tool in economics and strategic analysis.

Same three fragments, but the connectors turn them into a paragraph.
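As a rough illustration of the join step, here is a minimal sketch. It assumes each fragment is a complete sentence and that a connector (or none) has already been chosen for each gap. The function name `compose` and the `LOWERCASE_OK` word list are hypothetical, not part of ReLU.chat. Note that the composed example above also rephrases the fragments; this sketch only prefixes them.

```python
from typing import Optional, Sequence

# Sentence openers that are safe to lowercase after a connector.
# This short list is an assumption made for the sketch.
LOWERCASE_OK = {"The", "A", "An", "This", "These", "It", "They"}


def compose(fragments: Sequence[str], connectors: Sequence[Optional[str]]) -> str:
    """Join fragments, prefixing fragment i+1 with connectors[i] when it is not None."""
    if len(connectors) != len(fragments) - 1:
        raise ValueError("need exactly one connector slot per gap between fragments")

    parts = [fragments[0]]
    for connector, fragment in zip(connectors, fragments[1:]):
        if connector is None:
            # No relationship worth naming: leave the fragment untouched.
            parts.append(fragment)
            continue
        first_word, _, rest = fragment.partition(" ")
        # Lowercase only known function words so proper nouns keep their case.
        if first_word in LOWERCASE_OK:
            first_word = first_word.lower()
        parts.append(f"{connector} {first_word} {rest}".rstrip())
    return " ".join(parts)
```

Passing `None` for a gap leaves the seam as it is, which matters once the bot has to decide when not to connect.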

The Connector Library

ReLU.chat maintains a small library of connector phrases, organized by their linguistic role:

Additive — connect related facts:

"Additionally,"
"Furthermore,"
"Moreover,"
"In the same vein,"

Contrastive — link opposing or distinguishing facts:

"In contrast,"
"However,"
"On the other hand,"
"Unlike X, Y"

Causal — link cause to effect:

"Because of this,"
"As a result,"
"This leads to"
"Consequently,"

Exemplifying — introduce an example:

"For example,"
"Specifically,"
"To illustrate,"
"In particular,"

Sequential — indicate order:

"First,"
"Next,"
"Finally,"
"Subsequently,"

Frame-setting — introduce a definition or context:

"Formally,"
"In the context of X,"
"By definition,"
"More precisely,"

The full library holds roughly 30 connectors across 6 categories; the lists above show four from each. That is not a huge library, but it is enough that the bot does not repeat itself.
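A library this small fits in a plain lookup table. The sketch below holds only the connectors listed above, keyed by category. The names `CONNECTOR_LIBRARY` and `pick_connector` are hypothetical, and the "first unused phrase" strategy is an assumption made to illustrate how repetition could be avoided. Templated entries such as "Unlike X, Y" would still need slot filling, which is not shown.

```python
# The connectors listed above, grouped by linguistic role.
CONNECTOR_LIBRARY: dict[str, tuple[str, ...]] = {
    "additive": ("Additionally,", "Furthermore,", "Moreover,", "In the same vein,"),
    "contrastive": ("In contrast,", "However,", "On the other hand,", "Unlike X, Y"),
    "causal": ("Because of this,", "As a result,", "This leads to", "Consequently,"),
    "exemplifying": ("For example,", "Specifically,", "To illustrate,", "In particular,"),
    "sequential": ("First,", "Next,", "Finally,", "Subsequently,"),
    "frame_setting": ("Formally,", "In the context of X,", "By definition,", "More precisely,"),
}


def pick_connector(category: str, already_used: set[str]) -> str:
    """Return the first connector in the category not yet used in this response."""
    options = CONNECTOR_LIBRARY[category]
    for phrase in options:
        if phrase not in already_used:
            return phrase
    # Every phrase in the category was used already; fall back to the first one.
    return options[0]
```

Tracking `already_used` per response is what keeps the bot from opening two sentences with the same phrase.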

When To Connect (And When Not To)

The naive approach is to insert a connector between every fragment. That makes responses feel formulaic and bloated. The policy network has to learn when to use a connector and when to leave fragments alone.

The rule the policy learned:

Use a connector if the two fragments have a clear semantic relationship that the connector can name (additive, contrastive, causal, etc.)
Skip the connector if the fragments are independent facts that the response is simply listing
Never use more than 2-3 connectors in a single response — beyond that it reads as padding

How often connectors appear is itself an action head, controlled by the policy:

action.connectors ∈ {none, low, medium, high}

Here "none" means no connectors at all, and "high" means one between every pair of fragments.
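One way to picture how the four levels interact with the cap is the sketch below. It assumes each gap between adjacent fragments already has a relatedness score in [0, 1]. The thresholds for "low" and "medium", the function name `connector_slots`, and the choice to let the cap override "high" on long responses are all assumptions made for illustration, not the learned policy.

```python
from typing import Optional

# Minimum relatedness a gap needs before it gets a connector, per level.
# "none" disables connectors; "high" accepts every gap. The 0.75 and 0.5
# values are placeholders chosen for this sketch.
LEVEL_THRESHOLD: dict[str, Optional[float]] = {
    "none": None,
    "low": 0.75,
    "medium": 0.5,
    "high": 0.0,
}


def connector_slots(relatedness: list[float], level: str, max_connectors: int = 3) -> list[int]:
    """Return the gap indices that get a connector (gap i sits between fragments i and i+1)."""
    threshold = LEVEL_THRESHOLD[level]
    if threshold is None:
        return []
    candidates = [i for i, score in enumerate(relatedness) if score >= threshold]
    # If more gaps qualify than the cap allows, keep the strongest relationships.
    strongest = sorted(candidates, key=lambda i: relatedness[i], reverse=True)[:max_connectors]
    return sorted(strongest)
```

For a four-fragment response with gap scores `[0.9, 0.2, 0.6]`, "low" connects only the first gap, "medium" the first and third, and "high" all three.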

Choosing The Right Connector

The most subtle part is picking the correct connector, not just any connector. If two fragments are contrastive but the bot uses "Furthermore," the result is jarring.

The policy makes this decision based on:

The semantic relationship between the fragments (computed from the embedding similarity and the difference vector)
The intent tags of the fragments (a definition followed by an example → "For example"; a definition followed by a counterexample → "However")
The query type (a "compare X and Y" query → use contrastive connectors)

This is a small classification problem embedded in the policy. With 6 connector categories and a few examples per category, the policy learns it quickly.
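The three signals above can be sketched as follows. The policy described here learns this mapping; the hand-written rule table is only a stand-in that covers the examples given in the text. The tag names, the `"compare"` query label, and both function names are hypothetical.

```python
import math
from typing import Optional

# Tag-pair rules taken from the examples in the text.
TAG_RULES: dict[tuple[str, str], str] = {
    ("definition", "example"): "exemplifying",
    ("definition", "counterexample"): "contrastive",
}


def pair_features(a: list[float], b: list[float]) -> tuple[float, list[float]]:
    """Cosine similarity and difference vector (b - a) for two fragment embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cosine = dot / norm if norm else 0.0
    return cosine, [y - x for x, y in zip(a, b)]


def connector_category(prev_tag: str, next_tag: str, query_type: str) -> Optional[str]:
    """Pick a connector category, or None when no relationship is clear enough to name."""
    by_tags = TAG_RULES.get((prev_tag, next_tag))
    if by_tags is not None:
        return by_tags
    # Fall back on the query type: comparison queries favor contrastive connectors.
    if query_type == "compare":
        return "contrastive"
    return None
```

Returning `None` corresponds to skipping the connector. In a learned version, the output of `pair_features` would be fed to the classifier alongside the tags rather than going unused as it does here.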

Why Small Libraries Beat Large Ones

A natural question: why not use thousands of connectors, with the policy picking the perfect one for every context? The answer is overfitting and diminishing returns.

A library of 30 connectors, used well, sounds natural. A library of 1000, used well, still sounds natural — but the policy has to choose between many near-equivalent options, and the model becomes brittle. Adding connectors increases the dimensionality of the connector-selection problem without meaningfully improving the user experience.

The library is also bounded by what the user perceives. A human reader notices only so many distinct connective phrases; beyond 6-8 in a single response, the reader stops paying attention to them anyway.

What This Buys

The user-visible effect of good connectors:

Naturalness — the response reads as a paragraph, not a list
Coherence — the user can follow the logical flow
Trust — the response feels composed, not assembled

The cost is small: 30 strings, 6 categories, one action head in the policy. There is no separate model and no extra inference cost. It is pure linguistic engineering.

The Bigger Principle

Retrieval-based systems are sometimes criticized as "just a search engine with extra steps." Connectors are one of the small touches that rebut that. A search engine returns a list of documents. A retrieval chatbot with connectors returns a paragraph. The list is information; the paragraph is communication.

The 30 connectors in the library are not technically impressive. They are linguistically important. They are the difference between a bot that knows things and a bot that explains things.