Kivuli Index Lab
Research · Direction 1 — Representation · overview synthesis · 12 Mar 2026

Are Kenyan businesses underrepresented in AI answers?

A benchmark note on when Kenyan businesses vanish, blur or lose place in AI answers despite being relevant to the query.

Outcome

Kenyan business underrepresentation is best read through answer states rather than a single visibility score. Kivuli Index Lab treats absence, generic compression and displacement as separate behaviours, because each points to a different kind of evidence problem.

A Kenyan business can be active, licensed, reviewed and locally known, yet still fail to become the example an AI answer reaches for. The important question is what kind of absence is being observed.

A prompt about Kenyan tour operators gave the lab its first stubborn comparison. The Nairobi operator was named cleanly. The coastal operator, a composite scenario built from several observed patterns rather than one named company, had the sort of public traces that should matter: a working site, review snippets, licence wording and clear service pages. In several answer runs, it did not appear at all. The answer talked about Kenya tourism, but the business itself stayed outside the frame.

That absence would have been easy to overread. One missing name does not prove a national visibility problem. A model may choose a better-known example, a fresher page or a source it can parse with less friction. Still, the lab noticed the same shape elsewhere: county-level service firms blurred into broad categories, cooperatives skipped in favour of formal companies, and mobile-first sellers pushed aside by businesses with more conventional web evidence. The first question became plain enough to write on a sheet: are Kenyan businesses underrepresented in AI answers, or are a few weak examples making the gap look larger than it is?

Starting With Absence Before Explanation

Kivuli Index Lab does not begin this work by deciding that Kenyan businesses are hidden by AI systems. The method starts lower, closer to the floor. A run is recorded as an answer state: whether the business, sector, county or business form is named, skipped, blurred or displaced. That small vocabulary matters because underrepresentation is a heavy word. If the lab uses it too early, the word starts carrying more certainty than the evidence can support.

In the early runs, the most useful clue was not simply that some businesses were absent. It was the unevenness of absence. A Nairobi professional-services firm could be named in a category answer while a similar Kisumu firm became “local consultants.” A farm-supply cooperative could be replaced by a private agribusiness with clearer pages. A WhatsApp-first seller could be skipped even when the prompt asked for the right category and county. These are not the same failure wearing different clothes.

Underrepresentation, in the lab’s working definition, is the repeated weakening of relevant Kenyan business presence across comparable AI answers: relevant Kenyan examples are skipped, blurred or displaced in a patterned way rather than named. The definition is deliberately modest. It does not claim that every Kenyan business should appear. It says that when relevant examples lose place again and again, the pattern deserves measurement.

The lab is careful about the comparison set. Kenyan businesses should not be judged only against other Kenyan businesses if the question is whether the country’s business evidence travels well through AI answers. A tourism operator in Kenya may be compared with a similar operator in a high-income market when the prompt is framed at the same level of specificity. A professional service firm may be tested beside equivalent firms in markets where directories, review systems and structured pages are more mature. That comparison is still descriptive. It is a bench test, not a census.

The Four Visibility States In Practice

The lab’s anchor classification is simple because the answer behaviour is already complicated. A business can be named, skipped, blurred or displaced. These states are qualitative, not a score ladder. “Named” is not automatically success, and “skipped” is not automatically evidence of unfairness. Each state asks the researcher to slow down and describe what the answer actually did.

A named business appears in a way that a reader can recognise. That may sound straightforward, but naming can still carry errors. An answer may name a Kenyan operator, then attach the wrong county or imply a service it does not offer. The lab records the name as present, while keeping the inaccuracy separate. A named result with a damaged description is still a different thing from full absence.

A skipped business is absent even though the prompt makes it relevant to the comparison. In the coastal tour-operator composite, the operator is not merely absent from a broad “Kenya travel” answer. It is skipped when the prompt asks for a coastal category where the business would reasonably belong. This distinction matters. Broad prompts produce broad answers. A skipped state becomes meaningful only when the prompt has narrowed the path enough for the business or category to be a fair candidate.

Blurred is often the most Kenyan-looking failure in the lab’s notes. A specific business form becomes a label: “agricultural supplier,” “local artisan,” “transport provider,” “community finance group.” The answer has not fully ignored the space, but it has shaved off the local shape. This is where informal enterprises, cooperatives and county-level service providers often lose their edges. They remain visible as a type, not as a named participant.

Displaced means another reference takes the place the tested business or category could reasonably have occupied. A Nairobi operator may stand in for coastal operators. A formal firm may stand in for jua kali enterprises. A better-reviewed private supplier may stand in for a cooperative. Displacement is particularly important because the answer can look helpful to a casual reader. It gives a name. It fills the sentence. The loss is only visible when the researcher knows which candidate was pushed aside.
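The four states can be written down as a small data structure. What follows is a minimal sketch, not the lab's actual tooling: the names `AnswerState`, `Observation`, `description_accurate`, `generic_label` and `displaced_by` are hypothetical and chosen only for illustration. It uses a plain enumeration so the states stay unordered, and it keeps a naming error separate from the state itself, as described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnswerState(Enum):
    """Four qualitative states. A plain Enum has no ordering, so it is not a score ladder."""
    NAMED = "named"
    SKIPPED = "skipped"
    BLURRED = "blurred"
    DISPLACED = "displaced"


@dataclass(frozen=True)
class Observation:
    """One classified answer for one candidate business or category (hypothetical schema)."""
    candidate: str
    state: AnswerState
    # Recorded separately from the state: a named result can still be wrong.
    description_accurate: Optional[bool] = None
    # The generic label that replaced the specific form, e.g. "agricultural supplier".
    generic_label: Optional[str] = None
    # The reference that took the candidate's place.
    displaced_by: Optional[str] = None

    def __post_init__(self) -> None:
        if self.state is AnswerState.DISPLACED and not self.displaced_by:
            raise ValueError("a displaced observation should say what took the place")
        if self.state is AnswerState.BLURRED and not self.generic_label:
            raise ValueError("a blurred observation should record the generic label")
        if self.state is not AnswerState.NAMED and self.description_accurate is not None:
            raise ValueError("accuracy only applies when the candidate is named")


# A named result with a damaged description stays distinct from full absence.
named_but_wrong = Observation(
    candidate="coastal tour operator (composite)",
    state=AnswerState.NAMED,
    description_accurate=False,
)
```

Requiring `displaced_by` reflects the point that displacement is only visible when the researcher knows which candidate was pushed aside and what replaced it.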

Underrepresentation often hides inside a useful-looking answer: the model gives a Kenyan example, while the relevant county or business form quietly loses its chair.

What Makes Kenya A Difficult Test Case

Kenya is not a blank space online. That would be an easy story, and a false one. Nairobi technology firms, tourism brands, finance companies, logistics operators and professional service providers can have strong digital traces. The country’s mobile-money culture is widely discussed, and some sectors generate international attention. If the lab only looked at those surfaces, Kenyan visibility might seem healthier than it is.

The difficulty begins when evidence is distributed unevenly. Some businesses have websites that read cleanly to a machine. Others rely on social pages, WhatsApp, review snippets, directory entries, licence mentions or local reputation. Some county-level enterprises are locally active but thin in public records. A business may be visible to customers through phone numbers, signs, referrals and mobile-money flows while remaining hard for an answer engine to describe with confidence.

Language adds another crease. English is the default language for much business material, yet Swahili wording can change the category, audience and locality of a prompt. A query in English may pull in formal business pages. A Swahili prompt may trigger broader public-service language or category descriptions without the same named examples. The lab treats this as a visibility question, not a translation footnote. If the language path changes who appears, then the benchmark has to record that shift.

Business form matters too. Registered firms often fit the shape that answer engines expect: name, service, location, website, maybe reviews. Jua kali enterprises, SACCOs, cooperatives and mobile-first sellers may leave evidence in different places. Their public identity can be collective, seasonal, branch-based or tied to local relationships. When an AI answer prefers the cleanest company-shaped source, it may underrepresent businesses that are real but less tidy on the page.

The lab’s position is cautious here. It does not claim the engine “knows” Nairobi better in a human sense. It observes that certain evidence paths seem easier for the answer to reuse. Nairobi-based, formal, English-language and web-forward businesses often have more machine-readable hooks. That is an infrastructure observation. It points to the public evidence layer before it points to blame.

Reading Underrepresentation Without Turning It Into A Score

A single visibility score would be tempting. It would travel well in a slide deck. It would also flatten the very thing the lab is trying to see. Being skipped, being blurred and being displaced each mean something different in practice for a Kenyan business. A trade body cannot respond well if all failures are folded into one number and called “low visibility.”

For that reason, the benchmark frame keeps the states separate. Presence is recorded alongside omission, inaccuracy, regional skew, language divergence, freshness problems and business-form mismatch. The lab may compare answer behaviour across engines, but it does not turn the result into a national scoreboard. The question is more specific: what kind of evidence weakness appears, and where does it repeat?
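One way to keep the states separate in practice is to count them per condition and stop there. The sketch below assumes each run is a dictionary with `sector`, `region`, `language` and `state` keys; those key names and the `tally_states` helper are hypothetical, and the sample rows are invented for illustration only, not lab results.

```python
from collections import Counter, defaultdict


def tally_states(runs):
    """Count answer states per (sector, region, language) without collapsing them.

    Returns a dict mapping each condition to a dict of state counts.
    No weighting or averaging is applied, so no single score is produced.
    """
    tallies = defaultdict(Counter)
    for run in runs:
        condition = (run["sector"], run["region"], run["language"])
        tallies[condition][run["state"]] += 1
    return {condition: dict(counts) for condition, counts in tallies.items()}


# Invented rows, for illustration only.
sample_runs = [
    {"sector": "tourism", "region": "coast", "language": "en", "state": "skipped"},
    {"sector": "tourism", "region": "coast", "language": "en", "state": "skipped"},
    {"sector": "tourism", "region": "coast", "language": "en", "state": "displaced"},
    {"sector": "tourism", "region": "coast", "language": "sw", "state": "blurred"},
]

for condition, counts in tally_states(sample_runs).items():
    print(condition, counts)
```

The output stays a table of counts per condition. A reader can see where skipping repeats and where blurring repeats, which is the question the benchmark frame asks.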

In one composite run pattern, a coastal tourism operator is skipped in broad national prompts and appears only after the wording names the coast directly. That suggests regional pull. In another pattern, a cooperative is blurred even when the county is named, suggesting that the business form itself is hard for the answer to hold. In a third, a professional-services firm is named in English but vanishes under Swahili phrasing. Those are three different repairs if a business, county office or trade group wants to act.

This is where underrepresentation becomes useful rather than dramatic. It stops being a complaint that “AI does not show Kenyan businesses” and becomes a map of weaker passages: the county wording fails here, the business-form wording fails there, the language pair splits over there. The lab can then ask which passages are shared across a sector and which belong to one business’s thin evidence.

There is also a discipline in refusing perfect symmetry. Kenya is not underrepresented in the same way across tourism, fintech, agriculture and professional services. Some sectors have export-facing pages and international mentions. Some rely on local operating knowledge. Some generate reviews; others work through relationships that rarely become public text. The benchmark has to be lumpy because the market is lumpy.

What The Finding Can And Cannot Say

The strongest finding in this note is a methodological one: Kenyan business absence in AI answers should be classified before it is explained. If a reader sees no Kenyan example in an answer, the first move is not to guess why. The first move is to record the answer state, the prompt type, the engine, the language, the date, the sector and the region. Only then can the lab compare the result with nearby runs.
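The seven recorded fields fit naturally into one record per run. This is a minimal sketch under stated assumptions: the `RunRecord` class, its field names and the JSON serialisation are hypothetical. The rule that two runs are "nearby" when they differ in exactly one recorded condition is also an assumption made for illustration, since the note does not define the term.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class RunRecord:
    """One answer run, recorded before any explanation is attempted."""
    answer_state: str   # "named", "skipped", "blurred" or "displaced"
    prompt_type: str
    engine: str
    language: str
    run_date: date
    sector: str
    region: str

    def to_json(self) -> str:
        """Serialise to one JSON line so another reader can reconstruct the test path."""
        row = asdict(self)
        row["run_date"] = self.run_date.isoformat()
        return json.dumps(row, sort_keys=True)


# Conditions compared between runs; the state and date are outcomes, not conditions.
CONDITION_FIELDS = ("prompt_type", "engine", "language", "sector", "region")


def differing_fields(a: RunRecord, b: RunRecord) -> list:
    """List the recorded conditions on which two runs differ."""
    return [f for f in CONDITION_FIELDS if getattr(a, f) != getattr(b, f)]


def is_nearby(a: RunRecord, b: RunRecord) -> bool:
    """Assumed rule: two runs are 'nearby' when exactly one condition differs."""
    return len(differing_fields(a, b)) == 1
```

Comparing only runs that differ in one condition keeps a change in answer state attributable to that condition, such as the language of the prompt, rather than to several shifts at once.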

That habit protects against two bad readings. The first is overconfidence: treating one missing business as proof of a system-wide bias. The second is dismissal: treating absence as random noise because answer engines change. The lab sits between those temptations. It accepts instability, but it still asks whether instability has a repeated shape.

For Kenyan businesses, the practical implication is sobering but usable. A business can have real customers and still leave too little machine-readable evidence for an answer engine to name it. A county can have active enterprises and still be represented by one louder city. A sector can be visible nationally while its smaller operators are blurred into generic descriptions. These are public-evidence problems as much as model-output problems.

For county offices and trade bodies, the value is collective. If many businesses in a category are skipped, a shared evidence project may matter more than rewriting one website. If businesses are named but inaccurately described, category definitions and source clarity may be the weak point. If displacement keeps pulling examples toward Nairobi, county-linked materials may need to make local operators easier to identify as local operators, not just as Kenyan businesses in a broad sense.

Limits Of The Benchmark

This work does not prove that Kenyan businesses are underrepresented by a measured national percentage. The lab does not present invented sample sizes, exact shares or full-market coverage claims. Its samples are descriptive: sectors, counties, business forms, languages and evidence conditions chosen because they help the answer behaviour become visible. That makes the findings useful for diagnosis, but not a population estimate.

The method also cannot see every source an engine considered. An answer may skip a business because the model had weak evidence, because the prompt was too broad, because a competing source was easier to summarise, or because the engine’s interface changed its answer style. The lab can classify the output state and compare patterns. It cannot claim private knowledge of the model’s internal cause.

AI answers are unstable. A business skipped in one run may be named later. A named answer may carry a wrong service description. A county may appear after a wording change that is too narrow to count as broad visibility. The lab treats those shifts as part of the record. Repeatability means another reader can reconstruct the test path, not that every answer will be identical.

The fairest conclusion is therefore restrained: Kenyan businesses show observable risks of underrepresentation when answer states repeatedly move from named presence into skipped, blurred or displaced forms. The risk is strongest where evidence is thin, local, bilingual, informal, seasonal or county-specific. The next piece in the sequence turns from the broad question to sector comparison, because a national visibility problem is too blunt unless the lab can see which categories carry better evidence and which ones keep slipping out of view.
