# Animal Data Merge, Species Coverage, and Web Matcher Overlap — Diagnosis

Report date: 2026-04-20
Read-only diagnosis. No changes made.

---

## Part A — Animal Data Merge (SM + Profiler)

### How the matcher combines data

In `matchingEngine.ts:306-309`, the `findMatches()` function iterates over all animals from SM and fetches profiler notes separately:

```typescript
for (const animal of animals) {                 // SM data via fetchAnimals()
    const notes = getBehaviorNotes(animal.id);  // profiler data from behavior_notes table
    // ... scoring functions receive `animal` and `notes` separately ...
}
```

The two sources are **never merged into a single object for scoring**. Each scoring function receives `animal` (SM) and `notes` (profiler) as separate parameters and chooses which one to read for each field. There is no merge step; the matcher uses a dual-source lookup pattern.
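
A minimal sketch of the dual-source lookup pattern follows. The `Animal` and `BehaviorNotes` shapes and the `describeSources` helper are hypothetical, trimmed-down illustrations; the real types and scorer signatures in `matchingEngine.ts` may differ.

```typescript
// Hypothetical, trimmed-down shapes; the real types live in the codebase.
interface Animal {
  id: string;
  size?: string;  // SM
  color?: string; // SM
}

interface BehaviorNotes {
  color?: string;        // profiler
  specialNeeds?: string; // profiler
}

// Both sources arrive as separate parameters; each field picks one of them.
function describeSources(
  animal: Animal,
  notes: BehaviorNotes | undefined
): Record<string, string> {
  return {
    size: animal.size ?? "",                 // SM only
    specialNeeds: notes?.specialNeeds ?? "", // profiler only
    color: animal.color ?? "",               // SM is read today; notes.color is ignored
  };
}
```

The point of the sketch is that nothing reconciles the two objects: whichever source a scorer happens to read is the one that counts for that field.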

### Field-by-field: which source wins

| Attribute | SM has it? | Profiler has it? | Matcher reads from | Intended best source | Matches intent? |
|---|---|---|---|---|---|
| Species | ✅ `animal.species` | ❌ | **Neither** — not used at all | SM | ❌ Not used |
| Color | ✅ `animal.color` | ✅ `notes.color` | **SM only** (`scoreColorMatch` reads `animal.color`) | Profiler (richer) | ❌ Reads SM, ignores profiler |
| Age | ✅ `animal.age`, `animal.ageInYears` | ❌ | **SM** (`scoreTraitMatch` reads `animal.ageInYears`) | SM | ✅ |
| Sex | ✅ `animal.sex` | ❌ | **Neither** — not scored | SM | ⚠️ Not scored at all |
| Size | ✅ `animal.size` | ❌ | **SM** (`scoreSizeMatch` reads `animal.size`) | SM | ✅ |
| Breed | ✅ `animal.breed` | ❌ | **SM** (`scoreTraitMatch` reads `animal.breed`) | SM | ✅ |
| Energy | ❌ | ✅ `notes.energyLevel` (text), `notes.energyLevel_match` (enum) | **Profiler text only** (`scoreEnergyMatch` reads `notes.energyLevel`, ignores enum) | Profiler enum | ⚠️ Reads text, ignores enum |
| Special needs | ❌ (only `additionalFlags` has "On Meds") | ✅ `notes.specialNeeds` | **Profiler** (`scoreSpecialNeeds` + `scoreExperience` read `notes.specialNeeds`) | Profiler | ✅ |
| Good w/ Cats | ❌ | ✅ `notes.goodWithCats_match` (enum), `notes.goodWithCats_text` | **Neither usable** — reads `notes.otherAnimalReaction` (legacy, always empty) | Profiler enum | ❌ Reads dead legacy field |
| Good w/ Dogs | ❌ | ✅ `notes.goodWithDogs_match` (enum), `notes.goodWithDogs_text` | **Neither usable** — reads `notes.otherAnimalReaction` (legacy, always empty) | Profiler enum | ❌ Reads dead legacy field |
| Good w/ Kids | ❌ | ✅ `notes.goodWithKids_match` (enum), `notes.goodWithKids_text` | **Neither usable** — reads `notes.kidBehavior` (legacy, always empty) | Profiler enum | ❌ Reads dead legacy field |
| People reaction | ❌ | ✅ `notes.peopleReaction` | **Profiler** (used in `scoreTraitMatch` personality keywords) | Profiler | ✅ |
| Backstory | ❌ | ✅ `notes.backstory` | **Profiler** (used in `scoreTraitMatch` + `generateExplanation`) | Profiler | ✅ |

### Summary of merge deviations

- **6 fields read correctly:** age, size, breed from SM; special needs, people reaction, backstory from profiler.
- **1 field reads wrong source:** color reads SM (less descriptive) instead of profiler (richer).
- **3 fields read dead legacy fields:** goodWithCats, goodWithDogs, goodWithKids all read empty legacy fields instead of the enum fields that have actual data.
- **1 field reads text instead of enum:** energy reads freeform text and re-parses it instead of using the pre-parsed enum.
- **2 fields not scored at all:** species, sex.
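
The dead-legacy-field deviation could be corrected along these lines. This is a sketch only: the enum values (`yes` / `somewhat` / `no` / `unknown`) are the ones the Web Matcher filters expose, while `compatScore`, `scoreCatCompatibility`, and the numeric weights are hypothetical and not taken from `matchingEngine.ts`.

```typescript
// Enum values as exposed by the Web Matcher filters; score weights are illustrative.
type CompatMatch = "yes" | "somewhat" | "no" | "unknown";

interface BehaviorNotes {
  goodWithCats_match?: CompatMatch;
  goodWithDogs_match?: CompatMatch;
  goodWithKids_match?: CompatMatch;
  otherAnimalReaction?: string; // legacy, reported as always empty
  kidBehavior?: string;         // legacy, reported as always empty
}

// Hypothetical helper: map an enum value to a score between 0 and 1.
function compatScore(match: CompatMatch | undefined): number {
  switch (match) {
    case "yes":
      return 1;
    case "somewhat":
      return 0.5;
    case "no":
      return 0;
    default:
      return 0.5; // "unknown" or missing: neutral, neither rewarded nor penalized
  }
}

// Reads the populated enum field and never touches the legacy field.
function scoreCatCompatibility(notes: BehaviorNotes | undefined): number {
  return compatScore(notes?.goodWithCats_match);
}
```

Treating `unknown` as neutral is a design assumption; a stricter matcher might instead exclude unknowns when the adopter already has a cat.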

---

## Part B — Small Species Handling

### Does the ADOPTER_EXTRACTION_PROMPT handle small species?

**No.** The prompt (`attributeParser.ts:107-116`) does not mention rabbit, guinea pig, gerbil, hamster, ferret, bird, reptile, or any other small species. The `preferredTraits` field examples are:
- "Young, playful cat"
- "Calm adult dog, medium size"
- "Any age, cuddly personality"

If the coordinator says "they want a rabbit," GPT-4o will likely produce `preferredTraits: "Rabbit"` or `preferredTraits: "Rabbit, calm, any age"`, burying the species in the freeform traits string just as it does for cats and dogs. Because there is no structured species field, extraction treats every species the same way, and none of them well: the species is never available as a discrete value.
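
One possible shape for a structured species field is sketched below. The `AdopterPreferences` interface, `normalizeSpecies`, and `fillSpecies` are hypothetical names; the species list simply mirrors the values currently present in `animal_metadata`. The fallback parser is an assumption about how a missing field might be recovered from `preferredTraits`, not a description of existing behavior.

```typescript
// Species values mirror those reported in animal_metadata.
const KNOWN_SPECIES = ["Cat", "Dog", "Rabbit", "Ferret", "Guinea Pig"] as const;
type Species = (typeof KNOWN_SPECIES)[number];

// Hypothetical preference shape with a discrete species value.
interface AdopterPreferences {
  preferredSpecies: Species | null; // null when no species was stated
  preferredTraits: string;
}

// Hypothetical fallback: recover a species from freeform text.
// Returns the first known species found, in KNOWN_SPECIES order.
function normalizeSpecies(text: string): Species | null {
  const lower = text.toLowerCase();
  for (const species of KNOWN_SPECIES) {
    // Word boundaries keep "cat" from matching inside "location".
    const pattern = new RegExp(`\\b${species.toLowerCase()}s?\\b`);
    if (pattern.test(lower)) return species;
  }
  return null;
}

// Fill the structured field from the traits string only when it is empty.
function fillSpecies(prefs: AdopterPreferences): AdopterPreferences {
  return prefs.preferredSpecies
    ? prefs
    : { ...prefs, preferredSpecies: normalizeSpecies(prefs.preferredTraits) };
}
```

For example, `normalizeSpecies("Rabbit, calm, any age")` yields `"Rabbit"`, while `"Any age, cuddly personality"` yields `null` and leaves the adopter open to every species.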

### Species in the shelter database

From `animal_metadata`:

| Species | Count |
|---|---|
| Cat | 250 |
| Dog | 98 |
| Rabbit | 25 |
| Ferret | 1 |
| Guinea Pig | 1 |

SM returns each species distinctly — "Rabbit", "Guinea Pig", "Ferret" are separate `SPECIESNAME` values, not lumped under a generic "Small Animal" label.

### Small species in current Q1

Q1 text in staging-staff (index.html): **"Species, color, size and energy level"**

This is generic enough to cover any species. The problem isn't the question — it's the extraction and matching pipeline downstream.

---

## Part C — Current Q1 Text

Exact text from staging-staff/index.html:

```html
<li><span class="adopter-q-num">1</span> Species, color, size and energy level</li>
```

---

## Part D — Web Matcher Overlap

### Architecture

The Web Matcher is a **client-side filter app** — completely separate from the Adopter matcher's server-side scoring engine.

| Component | Adopter Matcher | Web Matcher |
|---|---|---|
| Location | `matchingEngine.ts` (server) | `matcher-web/app.js` (client) |
| Triggered by | `POST /api/match` or `POST /api/coordinator/process` | User clicks filter checkboxes |
| Data source | `fetchAnimals()` (SM) + `getBehaviorNotes()` (profiler) | `GET /api/animals` (SM + profiler notes + bios, merged server-side) |
| Species handling | **None** — no filter, no score | **Yes** — species tabs (Dog / Cat / Small Animals) |
| Matching method | Weighted scoring (7 factors, server-side) | Client-side checkbox filters (include/exclude) |
| Shared code | `matchingEngine.ts` | None — standalone JS |

### Web Matcher endpoints

- **Data:** `GET /api/animals` (server.ts:703) — returns all adoptable animals with `behaviorNotes` attached (including `_match` enum fields)
- **Static:** `/matcher` route (server.ts:6493) serves `matcher-web/` directory
- Does NOT use `/api/match` or `matchingEngine.ts` at all.

### Filter checkboxes (matcher-web/index.html)

| Filter | Values | Enabled? | Reads from |
|---|---|---|---|
| Species tabs | dog / cat / small | ✅ Enabled | `animal.species` via `matchesSpecies()` |
| Age | young / adult / senior | ✅ Enabled | `animal.ageInYears` (SM) |
| Sex | male / female | ✅ Enabled | `animal.sex` (SM) |
| Energy Level | low / medium / high | **Disabled** (HTML `disabled` attr) | `animal.behaviorNotes.energyLevel_match` (profiler **enum**) |
| Good with Cats | unknown / yes / somewhat / no | **Disabled** | `animal.behaviorNotes.goodWithCats_match` (profiler **enum**) |
| Good with Dogs | unknown / yes / somewhat / no | **Disabled** | `animal.behaviorNotes.goodWithDogs_match` (profiler **enum**) |
| Good with Kids | unknown / yes / somewhat / no | **Disabled** | `animal.behaviorNotes.goodWithKids_match` (profiler **enum**) |
| Special Needs | yes | **Disabled** | `animal.behaviorNotes.specialNeeds` (profiler text) |
| Color search | text input | ✅ Enabled | `animal.color` (SM) and `animal.behaviorNotes.color` (profiler); **both** are searched |

### Critical finding: Web Matcher already uses the _match enums correctly

The Web Matcher's `applyFilters()` function (app.js:311-400) reads the **correct fields**:

```javascript
// Energy: reads energyLevel_match enum (not freeform text)
const energyMatch = animal.behaviorNotes?.energyLevel_match;

// Good with cats: reads goodWithCats_match enum (not legacy field)
const catsMatch = animal.behaviorNotes?.goodWithCats_match;

// Good with dogs: reads goodWithDogs_match enum
const dogsMatch = animal.behaviorNotes?.goodWithDogs_match;

// Good with kids: reads goodWithKids_match enum
const kidsMatch = animal.behaviorNotes?.goodWithKids_match;
```

Comments in the code explicitly say "no text fallback, only use _match field" (app.js:246, 255, 266, 277).

**The Web Matcher also reads both color sources** for search. Alongside the SM `animal.color`, it reads the profiler color (app.js:292):
```javascript
const bnColor = (animal.behaviorNotes?.color || '').toLowerCase();
```

So the Web Matcher already does what the Adopter matcher should be doing — it reads enum fields and checks both SM and profiler color.
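
As a rough illustration of the two-source color check, the sketch below searches both fields. `colorMatches` is a hypothetical helper, not the function in `app.js`, and the empty-query behavior is an assumption.

```typescript
// Hypothetical, trimmed-down shape of the /api/animals payload.
interface Animal {
  color?: string;                            // SM
  behaviorNotes?: { color?: string } | null; // profiler
}

// True when the query appears in either the SM color or the profiler color.
function colorMatches(animal: Animal, query: string): boolean {
  const q = query.trim().toLowerCase();
  if (q === "") return true; // assumption: an empty search matches everything
  const smColor = (animal.color || "").toLowerCase();
  const bnColor = (animal.behaviorNotes?.color || "").toLowerCase();
  return smColor.includes(q) || bnColor.includes(q);
}
```

With this shape, an animal recorded as "Black" in SM but described as "Tuxedo with white socks" by the profiler is found by a search for either "black" or "tuxedo".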

### Small species in Web Matcher

The Web Matcher has a "Small Animals" tab (app.js:36, index.html:41) that filters on:

```javascript
const SMALL_SPECIES = ['rabbit', 'guinea pig', 'hamster', 'gerbil', 'ferret', 
  'chinchilla', 'rat', 'mouse', 'bird', 'reptile', 'small'];
```

So species filtering is already production-ready on the Web Matcher side, just not on the Adopter matching side.
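
The tab logic can be approximated as follows. This is a hypothetical reimplementation for illustration; the real `matchesSpecies()` in `app.js` was not reproduced in this report and may use substring rather than exact matching.

```typescript
const SMALL_SPECIES = ["rabbit", "guinea pig", "hamster", "gerbil", "ferret",
  "chinchilla", "rat", "mouse", "bird", "reptile", "small"];

type SpeciesTab = "dog" | "cat" | "small";

// Assumption: exact, case-insensitive comparison against the SM species name.
function matchesSpecies(species: string | undefined, tab: SpeciesTab): boolean {
  const s = (species || "").trim().toLowerCase();
  if (tab === "small") return SMALL_SPECIES.includes(s);
  return s === tab;
}
```

Because SM returns "Rabbit", "Guinea Pig", and "Ferret" as distinct values, all 27 small animals currently in the database would land under the `small` tab with this check.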

### Shared code analysis

| Fix needed | Adopter matcher impact | Web Matcher impact |
|---|---|---|
| Add `preferredSpecies` to extraction | `attributeParser.ts` + `adopter_preferences` schema | No impact — Web Matcher doesn't use adopter preferences |
| Species pre-filter in matcher | `matchingEngine.ts` | No impact — Web Matcher filters client-side via `matchesSpecies()` |
| Read `_match` enums instead of legacy fields | `matchingEngine.ts` | No impact — Web Matcher already reads enums correctly |
| Read profiler color in addition to SM | `matchingEngine.ts` | No impact — Web Matcher already reads both |

**Conclusion: The Adopter matcher and Web Matcher share zero code.** Fixes to `matchingEngine.ts` and `attributeParser.ts` will not affect the Web Matcher at all — no auto-benefit, but also no risk of breaking it.

The Web Matcher can serve as a **reference implementation** for the correct field reads. Its `applyFilters()` function demonstrates the exact pattern the Adopter matcher should adopt.
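
For the "species pre-filter in matcher" row, a server-side equivalent might look like the sketch below. `filterBySpecies` is a hypothetical helper, and passing the species as a plain string is an assumption about how `preferredSpecies` would be stored.

```typescript
// Hypothetical pre-filter: narrow the candidate pool before any scoring runs.
// With no stated species, every animal stays in the pool.
function filterBySpecies<T extends { species?: string }>(
  animals: T[],
  preferredSpecies: string | null
): T[] {
  if (!preferredSpecies) return animals;
  const wanted = preferredSpecies.trim().toLowerCase();
  return animals.filter(
    (a) => (a.species || "").trim().toLowerCase() === wanted
  );
}

// Usage: only the rabbit survives when the adopter asked for a rabbit.
const pool = [
  { id: "1", species: "Cat" },
  { id: "2", species: "Rabbit" },
  { id: "3", species: "Dog" },
];
const rabbitsOnly = filterBySpecies(pool, "rabbit");
```

Filtering before scoring, rather than adding species as another weighted factor, keeps a high-scoring cat from outranking every rabbit for an adopter who only wants a rabbit.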

---

## Summary of Findings

1. **Merge pattern:** No merge; dual-source lookup. 6 fields correct, 1 wrong source, 3 dead legacy reads, 1 text-instead-of-enum, 2 not scored.

2. **Small species:** SM tracks them distinctly (Rabbit, Guinea Pig, Ferret). The extraction prompt doesn't mention them, and GPT-4o will likely bury the species in the freeform `preferredTraits` string, exactly as it does for cats and dogs. The larger gap is species filtering in the matcher: none exists.

3. **Q1 text:** "Species, color, size and energy level" — generic enough for all species.

4. **Web Matcher vs Adopter Matcher:** Completely separate code. The Web Matcher already reads the `_match` enums and both color sources correctly, and has species tabs with small animal support. Adopter matcher fixes are isolated to `matchingEngine.ts` and `attributeParser.ts`, so they carry no risk of breaking the Web Matcher.
