For Fish Audio and Fish-Speech. Refinery generates candidate ref combinations, renders them against the same phrases and styles, lets you favorite the best outputs, and biases the next round toward those refs.
Voice cloning with Fish Audio and Fish-Speech depends heavily on the reference clips you condition on. Two different 3-of-5 picks from the same speaker can produce noticeably different output. Without structure, picking the best combination is an open-ended listening session.
Refinery wraps that listening session in a refinement loop. Render candidates
against the same phrase, mark the ones that sound right, run another
round biased toward refs from your favorites. Two or three passes
usually converge.
Sample candidate ref combinations from your folder.
Generate each variant against the same test phrases (and optional S2 style tags).
Compare variants side-by-side. Mark the ones that sound right.
Next round gives favorited refs 2× weight; at least half the new variants inherit a favorite.
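The weighting step above can be sketched in a few lines of Python. This is an illustrative sketch, not Refinery's actual code: the function names (`sample_combo`, `next_round`), the parameters, and the reading of "inherit a favorite" as "contains at least one favorited ref" are all assumptions made for the example.

```python
import random


def sample_combo(refs, k, weights, rng):
    """Draw k distinct refs without replacement, proportional to weight."""
    pool = list(refs)
    chosen = []
    for _ in range(k):
        pick = rng.choices(pool, weights=[weights[r] for r in pool], k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return tuple(sorted(chosen))


def next_round(refs, favorites, k=3, n_variants=6, boost=2.0, seed=None):
    """Hypothetical next-round sampler: favorited refs get `boost` weight,
    and the first half of the variants must contain a favorited ref."""
    rng = random.Random(seed)
    favorites = set(favorites)
    weights = {r: (boost if r in favorites else 1.0) for r in refs}
    must_inherit = (n_variants + 1) // 2 if favorites else 0

    variants, seen = [], set()
    attempts = 0
    while len(variants) < n_variants and attempts < 1000:
        attempts += 1
        combo = sample_combo(refs, k, weights, rng)
        if combo in seen:
            continue  # skip duplicate combinations
        if len(variants) < must_inherit and not favorites.intersection(combo):
            continue  # this slot is reserved for a combo with a favorite
        seen.add(combo)
        variants.append(combo)
    return variants


refs = ["ref_a.wav", "ref_b.wav", "ref_c.wav", "ref_d.wav", "ref_e.wav"]
for combo in next_round(refs, favorites={"ref_a.wav"}, seed=0):
    print(combo)
```

With 5 refs and K=3 there are only 10 possible combinations, so the duplicate check matters: without it, the 2× weight would quickly fill a round with repeats of the same favorite-heavy pick.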
Same speaker (public-domain LJSpeech), same phrase, same model. Only the reference combination differs between rounds. Round 1 is a random pick; round 3 is what Refinery converged on after two refinement passes.
K=3 of 5 references, chosen uniformly at random.
K=3 of 5 references, after two favorite-weighted refinement rounds.
A .lab transcription script is included for ref folders you haven't transcribed yet.
Python 3.11+ with uv, plus a Fish Audio API key or a Fish-Speech server. Pick a path.
# Option: Fish Audio hosted API
git clone https://github.com/mikeharty/refinery.git
cd refinery
cp .env.example .env
# In .env:
# FISH_TTS_URL=https://api.fish.audio/v1/tts
# FISH_API_KEY=your_api_key_here
# FISH_MODEL=s2-pro
uv sync
uv run uvicorn app:app --host 0.0.0.0 --port 5055 --reload
# Open http://localhost:5055
# Option: local Fish-Speech on macOS
git clone https://github.com/mikeharty/refinery.git
cd refinery
cp .env.example .env
# Install + start Fish-Speech project-locally (Apple Silicon, MPS)
scripts/install-fish-macos.sh --install-brew-deps
scripts/start-fish-macos.sh &
uv sync
uv run uvicorn app:app --host 0.0.0.0 --port 5055 --reload
# Open http://localhost:5055
# Option: Fish-Speech in Docker
git clone https://github.com/mikeharty/refinery.git
cd refinery
cp .env.example .env
# Linux/WSL with an NVIDIA CUDA GPU (24GB+ VRAM recommended for S2-Pro)
docker compose --profile download run --rm fish-models
docker compose --profile fish up --build
# Open http://localhost:5055
Full configuration reference, including all
FISH_* environment variables and the Apple Silicon
installer flags, lives in the
README.
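
Before starting the server, it can help to confirm that the settings from `.env` actually reached the environment. The sketch below checks only the variable names shown in the quick start (`FISH_TTS_URL`, `FISH_API_KEY`). The helper `check_fish_env` is hypothetical, and the rule that the API key is needed only when the URL points at `api.fish.audio` is an assumption rather than documented behavior; the README remains the authority.

```python
import os
from urllib.parse import urlparse


def check_fish_env(env=None):
    """Return a list of human-readable problems; empty means it looks OK.

    Hypothetical helper, not part of Refinery.
    """
    env = os.environ if env is None else env
    problems = []

    url = env.get("FISH_TTS_URL", "")
    if not url:
        problems.append("FISH_TTS_URL is not set")
        return problems

    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("FISH_TTS_URL should start with http:// or https://")

    # Assumption: only the hosted API needs a key; a local server may not.
    if parsed.hostname == "api.fish.audio" and not env.get("FISH_API_KEY"):
        problems.append("FISH_API_KEY is not set for the hosted Fish Audio API")

    return problems


if __name__ == "__main__":
    for problem in check_fish_env():
        print("config:", problem)
```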