Syncing from Shen's latest main on github
21
LICENSE
Normal file
@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Shen Ge

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
93
README.md
@ -0,0 +1,93 @@
# AI Workshop

A hands-on workshop where we will use technology to solve actual problems in your own life.

Today's technology is more capable than ever, and you might be surprised just how easy it is to control with the right techniques and mindset. Yet somehow the supposed miracles of technological development have yet to impact our lives in a positive way. Rather than serving us, technology often does the opposite: many of the most commercially successful AI applications _feed_ on our attention, often at the cost of our wellbeing. That's not because AI is evil; it's an unfortunate alignment of incentives, since much of the top tech talent also wants to make money. Rather than unknowingly selling your attention and data, you can take control and make your technology work for you.

This workshop

This course will focus on a personal project: something that would help you in your life or just something you think is cool. We're going to use te

This is not a programming course, though programming is supplemental. It's not a test or credential. You showed up because you're curious and motivated, and our job is to help you turn that into

---

## Who's running this

Ryan Franz and Shen Ge. Friends and engineers with backgrounds in physics, programming, and aerospace. We have decades of combined experience getting hard things to work in the real world, including putting two NOVA-C landers on the Moon with the IM-1 and IM-2 missions. We know how to make technology work, and we can help you get yours to work. We'll put you in the driver's seat of your own technology. How can we promise that? We're not the teachers, we're the guides. AI is the teacher. The skill is having a vision, knowing what's reasonable, and knowing where to look when things go wrong.

---

## The bet behind this class

There is a widening gap between what technology can do and what most people are *using* technology to do.

In the last few years, a new generation of AI tools has made it possible for a single motivated person, even one without a software background, to direct a computer to do things that, until very recently, required teams of engineers. The technology is here. What's missing, for most people, is the awareness that they can pick it up and use it themselves.

That is what this class is about: closing that gap, one person at a time.

---

## What you'll actually do

There is one organizing idea: **your personal project**.

Somewhere in your life, there's friction. A repetitive task you do by hand. Not being able to find things because you organize them into random piles in your attic. A small business workflow that should take 5 minutes and takes 45. Something you've been meaning to figure out for years but never had the time.

It has never been easier to address these frictions than it is right now. It doesn't have to be today, but I want you thinking: if you were Iron Man and could snap your fingers and make your laptop, cellphone, and smart devices do anything you wanted, what would you do with that power? It's not just "hypothetically possible" with AI. AI _is_ that power, and it's probably already in your pocket. It's my job to guide and motivate you into realizing what you can do with that.

See [`personal-project.md`](personal-project.md) for more on what a project can look like and how loose the requirements are. (Spoiler: very loose.)

---

## What this class is *not*

- **Not a SaaS demo.** We are not going to teach you how to pay company X to do thing Y. The whole point is to put *you* in control.
- **Not a graded class.** No tests, no homework, no minimum project complexity, no wrong answers. You can't mess up. Being present is the bar.
- **Not a credential.** There is no credential associated with taking this class. The reward is understanding how to use technology. Although if you _want_, I'll invent a digital token that we can agree represents a credential, and I'll try to convince you why that's more authentic and useful than any other artifact I could hand you.
- **Not a promise that everything is easy.** Some things will be frustrating. The two of us are here to get you unstuck when that happens.

---

## Our biases (so they're not surprises)

- **Free and open source by default.** Open-weight models, self-hostable tools, standard formats. We will not steer you into a walled garden if we can help it.
- **Local-first when it's reasonable.** Running things on your own machine is often simpler, cheaper, and more private than people assume. Cloud compute has genuine uses too, but you should have an idea of what's reasonable.
- **Honest about limits.** AI is powerful and AI is fallible. We'll show you both. Calibration matters more than hype.

---

## Sessions

We're starting with one class. If it feels valuable to people, it continues.

- [`sessions/01-orientation.md`](sessions/01-orientation.md): what to expect at the first session

---

## Repository layout

```
.
├── README.md                    ← you are here
├── personal-project.md          ← what a personal project is and isn't
├── sessions/                    ← per-session notes for students
├── reference/                   ← background material to dip into when a project needs it
│   ├── python/                  ← self-paced Python primer
│   ├── git/                     ← version control basics
│   ├── github/                  ← sharing and backing up code
│   ├── huggingface/             ← grabbing and running open-weight models
│   ├── pytorch/                 ← the framework most modern AI is written in
│   ├── docker/                  ← running other people's software cleanly
│   └── papers/                  ← a small reading list (Attention Is All You Need, GPT 1–4)
└── examples/                    ← full, working example projects you can run, read, or fork
    ├── image_meaning_db/        ← search images by meaning (CLIP + ChromaDB + FastAPI)
    ├── audio_meaning_db/        ← search spoken audio by what's said (Whisper + embeddings)
    └── everything_function/     ← ten "smart Python functions" backed by the same local model call + a browser UI
```

Most of the new entries above are placeholders for now. They'll fill in as the class progresses or as projects pull us toward them.

---
27
examples/README.md
Normal file
@ -0,0 +1,27 @@
# Example projects

Full, working projects you can clone, run, and tear apart. They're not lessons; they're more like worked examples. The point is to give you something concrete to look at when you're trying to imagine what your own project could look like (or to copy-paste from when one of them does roughly what you need).

These are intentionally on equal footing: there's no "beginner / intermediate / advanced" tier. Pick whichever interests you.

## Index

| Project | What it is | What you'll see in it |
|---------|------------|-----------------------|
| [`image_meaning_db/`](image_meaning_db/) | Search a folder of images by meaning, not filename. Drop in a query image, get back the closest matches. | CLIP embeddings, ChromaDB, FastAPI, a tiny browser UI, all in one Docker container. |
| [`audio_meaning_db/`](audio_meaning_db/) | Search spoken audio by what's said in it. Drop in a clip, get back the closest segments from your library. | Whisper transcription, sentence embeddings, segment chunking, FastAPI, Docker. |
| [`everything_function/`](everything_function/) | Ten Python functions (arithmetic, prime factorization, sentiment, translation, OCR, photo→recipe), all backed by the same one-line call to a local AI model. Browser UI + terminal REPLs. | A local Qwen vision-language model in Ollama, FastAPI, Docker Compose, and the realization that a "function" can have a prompt for a body. |

More will be added over time.

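
To make the "function with a prompt for a body" idea concrete, here is a minimal sketch. It is not the code from `everything_function/`; it assumes an Ollama server on its default local port, and the model tag, `ask_ollama`, and `make_smart_function` are hypothetical names chosen for illustration.

```python
import json
import urllib.request
from typing import Callable

# Assumptions: Ollama is listening on its default local port, and the model
# tag below is a placeholder for whichever model you have pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "qwen2.5vl"  # hypothetical tag, not taken from the example project


def ask_ollama(prompt: str) -> str:
    """Send one non-streaming prompt to a local Ollama server and return its text."""
    payload = json.dumps({"model": MODEL_NAME, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"].strip()


def make_smart_function(instruction: str, ask: Callable[[str], str] = ask_ollama):
    """Build a function whose "body" is a prompt rather than hand-written logic."""

    def smart_function(value: str) -> str:
        return ask(f"{instruction}\n\nInput: {value}\nAnswer:")

    return smart_function


# Two "functions" that differ only in their instruction text.
# Nothing is sent to the model until one of them is called.
sentiment = make_smart_function("Classify the sentiment as positive or negative.")
translate = make_smart_function("Translate the input into French.")
```

Passing the model call in as `ask` keeps the sketch testable without a running model: swap in any function that takes a prompt string and returns a string.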
## How to use these

Three reasonable modes, in increasing order of effort:

1. **Just run one.** Each project's README has a `docker compose up -d --build` line. Try it. Poke at the UI. Get a feel for what's possible.
2. **Read the code.** The backends are deliberately small: a single `main.py` per project. Open it and ask AI to walk you through any part you don't understand. This is one of the best ways to *see* how a complete small thing fits together.
3. **Fork and modify.** Copy the folder somewhere of your own, change things, see what breaks. Swap the embedding model. Change the seed images. Add a "delete by ID" endpoint. This is where it stops being an example and starts being a project.

## Prerequisites

All current examples need Docker. See [`../reference/docker/`](../reference/docker/) for the basics; each project's README also links the official install guides.
88
examples/audio_meaning_db/README.md
Normal file
@ -0,0 +1,88 @@
# audio_meaning_db

A self-contained semantic audio search tool for **spoken audio**. Upload audio clips (optionally with a description) to build up a database, then search by audio to find the nearest neighbors by what's being said. Runs as a single Docker service: a FastAPI backend that transcribes speech locally with Whisper (`openai/whisper-base`), embeds the transcripts with a sentence-transformer (`all-MiniLM-L6-v2`), and stores vectors in ChromaDB, served behind a minimal browser UI.

Long uploads are split into **60-second segments**; each segment is transcribed and indexed independently. Search returns the best-matching segment along with a link to the full parent clip.
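
As a rough illustration of the segmenting rule, the sketch below computes segment boundaries in seconds. The two constants mirror the ones in `backend/main.py`; the function name `segment_bounds` is hypothetical, and the real backend slices sample arrays rather than working with durations alone.

```python
SEGMENT_LENGTH_SEC = 60      # segment size described above
MIN_TAIL_SEGMENT_SEC = 3.0   # a shorter leftover after the last full segment is dropped


def segment_bounds(total_sec: float) -> list[tuple[float, float]]:
    """Return (start_sec, end_sec) pairs for a clip of the given duration."""
    # Clips that fit in one segment are indexed whole.
    if total_sec <= SEGMENT_LENGTH_SEC:
        return [(0.0, total_sec)]

    num_full = int(total_sec // SEGMENT_LENGTH_SEC)
    bounds = [
        (float(i * SEGMENT_LENGTH_SEC), float((i + 1) * SEGMENT_LENGTH_SEC))
        for i in range(num_full)
    ]

    # Keep the remainder only if it is long enough to be worth transcribing.
    tail_start = float(num_full * SEGMENT_LENGTH_SEC)
    if total_sec - tail_start >= MIN_TAIL_SEGMENT_SEC:
        bounds.append((tail_start, total_sec))
    return bounds


print(segment_bounds(150.0))  # two full segments plus a 30-second tail
print(segment_bounds(121.0))  # the 1-second tail is too short and is dropped
```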
## Why Whisper + sentence embeddings (and not CLAP)

This tool is optimized for finding audio by *what is said*, not by *how it sounds*. An audio-embedding model such as CLAP compares acoustics: two recordings that sound alike would embed close together regardless of what is said in them, while "I love cats" and "I'm fond of felines" recorded in different voices or rooms could look very different. Whisper transcription + text embeddings inverts that: meaning-preserving paraphrases match, and acoustic differences (voice, accent, background noise) are ignored.

If you want music/SFX/ambient similarity instead, swap in CLAP. The plumbing is the same.

## Prerequisites

You need Docker Engine and the Docker Compose plugin. If you don't already have them:

- **Linux (Ubuntu/Debian):** follow the official install guide at https://docs.docker.com/engine/install/ubuntu/. After installing, add your user to the `docker` group so you don't need `sudo`:
  ```bash
  sudo usermod -aG docker $USER
  newgrp docker
  ```
- **macOS / Windows:** install Docker Desktop from https://docs.docker.com/desktop/. Compose is bundled.

Verify it works:
```bash
docker --version
docker compose version
```

## Running it

From this directory:

```bash
docker compose up -d --build
```

Then open http://localhost:8082 in your browser.

### What to expect on the first run

The first `up --build` is slow. During it, the service:

1. Installs Python deps including CPU-only PyTorch (~200 MB pip download) and `ffmpeg` + `libsndfile`.
2. Downloads the Whisper model (~150 MB) and the sentence-transformer (~80 MB) into a cached volume on first server start.
3. Starts with an **empty database** (no seeding). Upload your own audio.

Watch progress with:
```bash
docker compose logs -f backend
```

You'll see `ASR model openai/whisper-base ready.` and `Text embedding model ... ready.` once it's warmed up. Subsequent runs reuse the cached models and existing database, so startup is fast.

## Using the UI

Two tabs:

- **Submit Audio**: drop or click to select an audio file (mp3, wav, m4a, flac, ogg). Add an optional description and click *Submit to Database*. The file is chunked into 60-second segments, each transcribed and embedded. You'll see the per-segment transcripts once it's done.
- **Search by Audio**: drop or click to select a query clip (≤ 60 seconds, hard-enforced). The backend transcribes it, embeds the transcript, and returns the most semantically similar stored segments, ranked by cosine similarity. Each result shows the segment's transcript, a playable audio slice of just that segment, and a toggle to play the full parent clip.
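
The ranking step can be pictured with a small pure-Python sketch. It uses toy 3-d vectors in place of real 384-d transcript embeddings, and `rank_segments` is a hypothetical helper; in the backend, ChromaDB performs the search and reports a cosine distance, which `main.py` converts to a similarity with `1 - distance`.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def rank_segments(query: list[float], stored: dict[str, list[float]], n: int = 10):
    """Return up to n (segment_id, similarity) pairs, most similar first."""
    scored = [
        (segment_id, round(cosine_similarity(query, vec), 4))
        for segment_id, vec in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy vectors standing in for stored segment embeddings (IDs are made up).
stored = {
    "clip-a:0": [1.0, 0.0, 0.0],
    "clip-a:1": [0.0, 1.0, 0.0],
    "clip-b:0": [1.0, 1.0, 0.0],
}
print(rank_segments([1.0, 0.0, 0.0], stored, n=2))
```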

## API

Direct endpoints:

- `POST /api/submit`: multipart form with `file` (audio) and an optional `description` (string). Returns `{parent_id, filename, segments_added, total_segments, duration_sec, segments}`, where `segments` includes per-segment timestamps and transcripts.
- `POST /api/search`: multipart form with `file` (audio, ≤ 60 s) and an optional query param `n` (default 10). Returns `{results, query_transcript}` with ranked matches.
- `GET /api/audio/{filename}`: serves a stored audio file in its original format.
- `GET /api/segment/{parent_filename}?start=<sec>&end=<sec>`: serves a WAV slice of a segment.
- `GET /api/stats`: returns `{total_segments, total_clips}`.
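
A minimal client sketch for the two `GET` endpoints, using only the standard library. It assumes the service is reachable on the default host port 8082; `segment_url`, `fetch_stats`, and the file name `example.mp3` are hypothetical and only for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8082"  # default host port from the "Running it" section


def segment_url(parent_filename: str, start_sec: float, end_sec: float,
                base_url: str = BASE_URL) -> str:
    """Build the URL for a WAV slice of one stored segment."""
    query = urllib.parse.urlencode({"start": start_sec, "end": end_sec})
    return f"{base_url}/api/segment/{urllib.parse.quote(parent_filename)}?{query}"


def fetch_stats(base_url: str = BASE_URL) -> dict:
    """GET /api/stats and decode the JSON body (needs the service to be running)."""
    with urllib.request.urlopen(f"{base_url}/api/stats", timeout=10) as response:
        return json.loads(response.read())


# Building a URL needs no running server; fetch_stats() does.
print(segment_url("example.mp3", 60.0, 120.0))
```

The stored file name to pass as `parent_filename` is the `parent_filename` field of a search result, or the `filename` field returned by `POST /api/submit`.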

## Stopping and resetting

```bash
docker compose down        # stop containers, keep data
docker compose down -v     # also delete the database, cached models, and stored audio
```

If you wipe volumes, the next start will re-download both models.

## Configuration

Environment variables set in `docker-compose.yml`:

- `ASR_MODEL`: Hugging Face Whisper model name. Default: `openai/whisper-base`. Smaller/faster: `openai/whisper-tiny`. Better quality: `openai/whisper-small` (~500 MB, slower on CPU). English-only variants (`openai/whisper-base.en`, `openai/whisper-tiny.en`) are slightly better for English-only content. If you change this, existing transcripts stay valid; only audio transcribed afterwards (new submissions and search queries) uses the new model.
- `TEXT_EMBEDDING_MODEL`: sentence-transformers model name. Default: `sentence-transformers/all-MiniLM-L6-v2` (384-d). If you change this, wipe the `chroma_data` volume, because embedding dimensions must match across all stored vectors.

Host port mapping is also in `docker-compose.yml`; change the left side of `"8082:8080"` if 8082 conflicts with something else.
23
examples/audio_meaning_db/backend/Dockerfile
Normal file
@ -0,0 +1,23 @@
FROM python:3.12-slim

WORKDIR /app

ENV HF_HOME=/root/.cache/huggingface \
    PIP_DEFAULT_TIMEOUT=180 \
    PYTHONUNBUFFERED=1

RUN apt-get update && apt-get install -y --no-install-recommends \
        ffmpeg \
        libsndfile1 \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN mkdir -p /app/audio /app/chroma_data

EXPOSE 8080

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
283
examples/audio_meaning_db/backend/main.py
Normal file
@ -0,0 +1,283 @@
import asyncio
import io
import os
import uuid
from pathlib import Path

import chromadb
import numpy as np
import soundfile as sf
import soxr
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import FileResponse, HTMLResponse, Response
from pydub import AudioSegment
from sentence_transformers import SentenceTransformer
from transformers import pipeline

ASR_MODEL = os.environ.get("ASR_MODEL", "openai/whisper-base")
TEXT_EMBEDDING_MODEL = os.environ.get(
    "TEXT_EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
)
SEGMENT_LENGTH_SEC = 60
# A leftover shorter than this after the last full segment is not indexed.
MIN_TAIL_SEGMENT_SEC = 3.0
# Search clips get a small tolerance above SEGMENT_LENGTH_SEC.
SEARCH_MAX_SEC = 60.5
# Everything is resampled to this rate before transcription.
TARGET_SAMPLE_RATE = 16000
AUDIO_DIR = Path("/app/audio")
CHROMA_DIR = Path("/app/chroma_data")

AUDIO_DIR.mkdir(parents=True, exist_ok=True)

app = FastAPI(title="Audio Meaning DB")

chroma_client = chromadb.PersistentClient(path=str(CHROMA_DIR))
collection = chroma_client.get_or_create_collection(
    name="audio_segments",
    metadata={"hnsw:space": "cosine"},
)

print(f"Loading ASR model {ASR_MODEL}...")
asr_pipeline = pipeline(
    "automatic-speech-recognition",
    model=ASR_MODEL,
    chunk_length_s=30,
    device="cpu",
)
print(f"ASR model {ASR_MODEL} ready.")

print(f"Loading text embedding model {TEXT_EMBEDDING_MODEL}...")
text_embedder = SentenceTransformer(TEXT_EMBEDDING_MODEL)
print(f"Text embedding model {TEXT_EMBEDDING_MODEL} ready.")


def _load_with_pydub(audio_bytes: bytes) -> tuple[np.ndarray, int]:
    audio = AudioSegment.from_file(io.BytesIO(audio_bytes))
    samples = np.array(audio.get_array_of_samples(), dtype=np.float32)
    if audio.channels > 1:
        samples = samples.reshape(-1, audio.channels)
    max_val = float(2 ** (audio.sample_width * 8 - 1))
    samples = samples / max_val
    return samples, audio.frame_rate


def load_audio(audio_bytes: bytes, filename: str) -> np.ndarray:
    """Decode audio bytes to mono float32 at TARGET_SAMPLE_RATE."""
    ext = Path(filename).suffix.lower().lstrip(".")
    data: np.ndarray
    sr: int
    if ext in ("wav", "flac", "ogg", "oga"):
        try:
            data, sr = sf.read(io.BytesIO(audio_bytes), dtype="float32", always_2d=False)
        except Exception:
            data, sr = _load_with_pydub(audio_bytes)
    else:
        data, sr = _load_with_pydub(audio_bytes)

    if data.ndim > 1:
        data = data.mean(axis=1)
    data = np.asarray(data, dtype=np.float32)

    if sr != TARGET_SAMPLE_RATE:
        data = soxr.resample(data, sr, TARGET_SAMPLE_RATE, quality="HQ").astype(np.float32)

    return data


def segment_audio(audio: np.ndarray) -> list[tuple[float, float, np.ndarray]]:
    """Split into ~60s segments. Returns [(start_sec, end_sec, samples), ...]."""
    sr = TARGET_SAMPLE_RATE
    total_sec = len(audio) / sr
    if total_sec <= SEGMENT_LENGTH_SEC:
        return [(0.0, total_sec, audio)]

    segments: list[tuple[float, float, np.ndarray]] = []
    segment_samples = SEGMENT_LENGTH_SEC * sr
    num_full = len(audio) // segment_samples
    for i in range(num_full):
        start = i * segment_samples
        end = start + segment_samples
        segments.append((start / sr, end / sr, audio[start:end]))

    tail_start = num_full * segment_samples
    tail = audio[tail_start:]
    tail_sec = len(tail) / sr
    if tail_sec >= MIN_TAIL_SEGMENT_SEC:
        segments.append((tail_start / sr, total_sec, tail))

    return segments


def transcribe(audio: np.ndarray) -> str:
    """Transcribe mono 16kHz float32 audio to text."""
    result = asr_pipeline({"array": audio, "sampling_rate": TARGET_SAMPLE_RATE})
    return (result.get("text") or "").strip()


def embed_text(text: str) -> list[float]:
    query = text if text else "(no speech detected)"
    vec = text_embedder.encode(query, convert_to_numpy=True, normalize_embeddings=True)
    return vec.tolist()


def fmt_time(sec: float) -> str:
    sec = int(round(sec))
    return f"{sec // 60}:{sec % 60:02d}"


@app.post("/api/submit")
async def submit_audio(
    file: UploadFile = File(...),
    description: str = Form(""),
):
    audio_bytes = await file.read()

    parent_id = str(uuid.uuid4())
    ext = Path(file.filename or "audio.wav").suffix or ".wav"
    filename = f"{parent_id}{ext}"
    filepath = AUDIO_DIR / filename
    filepath.write_bytes(audio_bytes)

    try:
        audio = load_audio(audio_bytes, file.filename or "")
    except Exception as e:
        filepath.unlink(missing_ok=True)
        raise HTTPException(status_code=400, detail=f"Could not decode audio: {e}")

    total_sec = len(audio) / TARGET_SAMPLE_RATE
    segments = segment_audio(audio)
    is_full_clip = len(segments) == 1

    ids: list[str] = []
    embeddings: list[list[float]] = []
    metadatas: list[dict] = []
    segment_summaries: list[dict] = []

    for i, (start, end, samples) in enumerate(segments):
        transcript = await asyncio.to_thread(transcribe, samples)
        embedding = await asyncio.to_thread(embed_text, transcript)
        segment_id = f"{parent_id}:{i}"
        ids.append(segment_id)
        embeddings.append(embedding)
        metadatas.append({
            "parent_id": parent_id,
            "parent_filename": filename,
            "parent_original_name": file.filename or "unknown",
            "segment_index": i,
            "start_sec": float(start),
            "end_sec": float(end),
            "parent_duration_sec": float(total_sec),
            "description": description,
            "transcript": transcript,
            "is_full_clip": is_full_clip,
        })
        segment_summaries.append({
            "segment_index": i,
            "start_sec": float(start),
            "end_sec": float(end),
            "transcript": transcript,
        })

    collection.add(ids=ids, embeddings=embeddings, metadatas=metadatas)

    return {
        "parent_id": parent_id,
        "filename": filename,
        "segments_added": len(segments),
        "total_segments": collection.count(),
        "duration_sec": total_sec,
        "segments": segment_summaries,
    }


@app.post("/api/search")
async def search_audio(file: UploadFile = File(...), n: int = 10):
    audio_bytes = await file.read()
    try:
        audio = load_audio(audio_bytes, file.filename or "")
    except Exception as e:
        raise HTTPException(status_code=400, detail=f"Could not decode audio: {e}")

    duration_sec = len(audio) / TARGET_SAMPLE_RATE
    if duration_sec > SEARCH_MAX_SEC:
        raise HTTPException(
            status_code=400,
            detail=f"Search audio must be at most {SEGMENT_LENGTH_SEC}s. Got {duration_sec:.1f}s.",
        )

    count = collection.count()
    if count == 0:
        return {"results": [], "message": "No audio in database yet.", "query_transcript": ""}

    transcript = await asyncio.to_thread(transcribe, audio)
    embedding = await asyncio.to_thread(embed_text, transcript)

    results = collection.query(
        query_embeddings=[embedding],
        # Clamp to at least 1 so that n <= 0 is not passed on as n_results.
        n_results=max(1, min(n, count)),
        include=["distances", "metadatas"],
    )

    matches = []
    for i, doc_id in enumerate(results["ids"][0]):
        distance = results["distances"][0][i]
        metadata = results["metadatas"][0][i]
        # The collection uses cosine space, so similarity is 1 - distance.
        similarity = 1 - distance
        matches.append({
            "id": doc_id,
            "parent_filename": metadata["parent_filename"],
            "parent_original_name": metadata["parent_original_name"],
            "segment_index": metadata["segment_index"],
            "start_sec": metadata["start_sec"],
            "end_sec": metadata["end_sec"],
            "parent_duration_sec": metadata["parent_duration_sec"],
            "description": metadata.get("description", ""),
            "transcript": metadata.get("transcript", ""),
            "is_full_clip": metadata.get("is_full_clip", True),
            "similarity": round(similarity, 4),
        })

    return {"results": matches, "query_transcript": transcript}


@app.get("/api/audio/{filename}")
async def get_audio(filename: str):
    filepath = AUDIO_DIR / filename
    if not filepath.exists():
        raise HTTPException(status_code=404, detail="Audio not found")
    return FileResponse(filepath)


@app.get("/api/segment/{parent_filename}")
async def get_segment(parent_filename: str, start: float, end: float):
    filepath = AUDIO_DIR / parent_filename
    if not filepath.exists():
        raise HTTPException(status_code=404, detail="Audio not found")
    if end <= start or start < 0:
        raise HTTPException(status_code=400, detail="Invalid segment range")

    audio_bytes = filepath.read_bytes()
    audio = load_audio(audio_bytes, parent_filename)
    sr = TARGET_SAMPLE_RATE
    start_idx = max(0, int(start * sr))
    end_idx = min(len(audio), int(end * sr))
    if end_idx <= start_idx:
        raise HTTPException(status_code=400, detail="Empty segment slice")

    buf = io.BytesIO()
    sf.write(buf, audio[start_idx:end_idx], sr, format="WAV", subtype="PCM_16")
    buf.seek(0)
    return Response(content=buf.read(), media_type="audio/wav")


@app.get("/api/stats")
async def stats():
    total_segments = collection.count()
    unique_parents = 0
    if total_segments > 0:
        all_meta = collection.get(include=["metadatas"])
        unique_parents = len({m["parent_id"] for m in all_meta["metadatas"]})
    return {"total_segments": total_segments, "total_clips": unique_parents}


@app.get("/", response_class=HTMLResponse)
async def index():
    return Path("static/index.html").read_text()
12
examples/audio_meaning_db/backend/requirements.txt
Normal file
@ -0,0 +1,12 @@
fastapi==0.115.6
uvicorn==0.34.0
python-multipart==0.0.20
chromadb==0.6.3
transformers==4.46.3
sentence-transformers==3.3.1
numpy>=1.26,<2.2
soundfile==0.12.1
soxr==0.5.0.post1
pydub==0.25.1
--extra-index-url https://download.pytorch.org/whl/cpu
torch==2.5.1+cpu
589
examples/audio_meaning_db/backend/static/index.html
Normal file
@ -0,0 +1,589 @@
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Audio Meaning DB</title>
  <style>
    :root {
      --bg: #0f1117;
      --surface: #1a1d27;
      --border: #2a2d3a;
      --text: #e1e4ed;
      --muted: #8b8fa3;
      --accent: #6c5ce7;
      --accent-hover: #7f71ed;
      --success: #2ecc71;
      --warning: #f39c12;
      --error: #e74c3c;
      --card-bg: #1e2130;
    }
    * { margin: 0; padding: 0; box-sizing: border-box; }
    body {
      font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', system-ui, sans-serif;
      background: var(--bg);
      color: var(--text);
      min-height: 100vh;
    }
    .container { max-width: 960px; margin: 0 auto; padding: 2rem 1.5rem; }
    h1 {
      font-size: 1.75rem;
      font-weight: 700;
      margin-bottom: 0.25rem;
    }
    .subtitle { color: var(--muted); margin-bottom: 2rem; font-size: 0.9rem; }
    .stats { color: var(--muted); font-size: 0.85rem; margin-bottom: 1.5rem; }

    .tabs {
      display: flex;
      gap: 0;
      margin-bottom: 2rem;
      border-bottom: 1px solid var(--border);
    }
    .tab {
      padding: 0.75rem 1.5rem;
      cursor: pointer;
      color: var(--muted);
      border-bottom: 2px solid transparent;
      font-size: 0.95rem;
      transition: all 0.15s;
    }
    .tab:hover { color: var(--text); }
    .tab.active {
      color: var(--accent);
      border-bottom-color: var(--accent);
      font-weight: 600;
    }

    .panel { display: none; }
    .panel.active { display: block; }

    .drop-zone {
      border: 2px dashed var(--border);
      border-radius: 12px;
      padding: 3rem 2rem;
      text-align: center;
      cursor: pointer;
      transition: all 0.2s;
      margin-bottom: 1.5rem;
      position: relative;
    }
    .drop-zone:hover, .drop-zone.dragover {
      border-color: var(--accent);
      background: rgba(108, 92, 231, 0.05);
    }
    .drop-zone input { display: none; }
    .drop-zone p { color: var(--muted); margin-top: 0.5rem; font-size: 0.9rem; }
    .drop-zone .icon { font-size: 2rem; margin-bottom: 0.5rem; }

    .preview {
      background: var(--surface);
      border: 1px solid var(--border);
      border-radius: 10px;
      padding: 1rem 1.25rem;
      margin-bottom: 1.5rem;
    }
    .preview-meta {
      display: flex;
      justify-content: space-between;
      align-items: baseline;
      gap: 1rem;
      margin-bottom: 0.75rem;
    }
    .preview-name {
      font-weight: 600;
      font-size: 0.95rem;
      word-break: break-all;
    }
    .preview-details {
      color: var(--muted);
      font-size: 0.8rem;
      white-space: nowrap;
    }
    .preview audio { width: 100%; margin-bottom: 0.75rem; }

    button {
      background: var(--accent);
      color: white;
      border: none;
      padding: 0.65rem 1.5rem;
      border-radius: 8px;
      font-size: 0.9rem;
      cursor: pointer;
      font-weight: 600;
      transition: background 0.15s;
    }
    button:hover { background: var(--accent-hover); }
    button:disabled { opacity: 0.5; cursor: not-allowed; }

    .status {
      margin-top: 1rem;
      padding: 0.75rem 1rem;
      border-radius: 8px;
      font-size: 0.85rem;
      display: none;
    }
    .status.show { display: block; }
    .status.info { background: rgba(108, 92, 231, 0.1); color: var(--accent); }
    .status.success { background: rgba(46, 204, 113, 0.1); color: var(--success); }
    .status.warning { background: rgba(243, 156, 18, 0.1); color: var(--warning); }
    .status.error { background: rgba(231, 76, 60, 0.1); color: var(--error); }

    .duration-error {
      color: var(--error);
      font-size: 0.8rem;
      margin-top: 0.25rem;
    }

    .query-transcript {
      background: var(--surface);
      border: 1px solid var(--border);
      border-left: 3px solid var(--accent);
      border-radius: 6px;
      padding: 0.65rem 0.9rem;
      font-size: 0.85rem;
      margin-top: 1rem;
    }
    .query-transcript .label {
      color: var(--muted);
      font-size: 0.75rem;
      text-transform: uppercase;
      letter-spacing: 0.05em;
      margin-bottom: 0.25rem;
    }

    .results {
      display: flex;
      flex-direction: column;
      gap: 0.9rem;
      margin-top: 1.5rem;
    }
    .result-card {
      background: var(--card-bg);
      border: 1px solid var(--border);
      border-radius: 10px;
      padding: 0.9rem 1rem;
    }
    .result-card-header {
      display: flex;
      align-items: baseline;
      gap: 0.75rem;
      flex-wrap: wrap;
      margin-bottom: 0.5rem;
    }
    .result-card .rank {
      font-size: 0.75rem;
      color: var(--muted);
      font-weight: 600;
    }
    .result-card .similarity {
      font-weight: 700;
      font-size: 1rem;
      color: var(--accent);
    }
    .result-card .orig-name {
      color: var(--muted);
      font-size: 0.8rem;
      flex: 1;
      min-width: 0;
      overflow: hidden;
      text-overflow: ellipsis;
      white-space: nowrap;
    }
    .result-card .segment-range {
      color: var(--muted);
      font-size: 0.75rem;
      font-variant-numeric: tabular-nums;
    }
    .result-card .description {
      color: var(--text);
      font-size: 0.85rem;
      margin: 0.3rem 0 0.5rem;
      font-style: italic;
    }
    .result-card .transcript {
      color: var(--text);
      font-size: 0.85rem;
      background: var(--surface);
      border-radius: 6px;
      padding: 0.5rem 0.7rem;
      margin: 0.5rem 0;
      line-height: 1.4;
      white-space: pre-wrap;
    }
    .result-card .transcript.empty {
      color: var(--muted);
      font-style: italic;
    }
    .result-card audio { width: 100%; margin-top: 0.25rem; }
    .result-card .full-toggle {
      color: var(--accent);
      font-size: 0.8rem;
|
||||
cursor: pointer;
|
||||
margin-top: 0.5rem;
|
||||
display: inline-block;
|
||||
user-select: none;
|
||||
}
|
||||
.result-card .full-toggle:hover { text-decoration: underline; }
|
||||
.result-card .full-audio { margin-top: 0.5rem; display: none; }
|
||||
.result-card .full-audio.show { display: block; }
|
||||
|
||||
textarea.description {
|
||||
width: 100%;
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 8px;
|
||||
color: var(--text);
|
||||
padding: 0.6rem 0.75rem;
|
||||
font-family: inherit;
|
||||
font-size: 0.9rem;
|
||||
resize: vertical;
|
||||
min-height: 72px;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
textarea.description:focus {
|
||||
outline: none;
|
||||
border-color: var(--accent);
|
||||
}
|
||||
|
||||
.loading {
|
||||
display: inline-block;
|
||||
width: 16px; height: 16px;
|
||||
border: 2px solid var(--border);
|
||||
border-top-color: var(--accent);
|
||||
border-radius: 50%;
|
||||
animation: spin 0.6s linear infinite;
|
||||
vertical-align: middle;
|
||||
margin-right: 0.5rem;
|
||||
}
|
||||
@keyframes spin { to { transform: rotate(360deg); } }
|
||||
|
||||
.segment-list {
|
||||
margin-top: 0.75rem;
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 8px;
|
||||
padding: 0.5rem 0.75rem;
|
||||
font-size: 0.82rem;
|
||||
color: var(--muted);
|
||||
max-height: 200px;
|
||||
overflow-y: auto;
|
||||
}
|
||||
.segment-list .seg {
|
||||
padding: 0.4rem 0;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
.segment-list .seg:last-child { border-bottom: none; }
|
||||
.segment-list .seg .range {
|
||||
color: var(--accent);
|
||||
font-variant-numeric: tabular-nums;
|
||||
font-weight: 600;
|
||||
margin-right: 0.5rem;
|
||||
}
|
||||
.segment-list .seg .text { color: var(--text); }
|
||||
.segment-list .seg .text.empty { color: var(--muted); font-style: italic; }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="container">
|
||||
<h1>Audio Meaning DB</h1>
|
||||
<p class="subtitle">Semantic audio search — speech transcribed with Whisper, indexed by sentence embedding</p>
|
||||
<p class="stats" id="stats">Loading...</p>
|
||||
|
||||
<div class="tabs">
|
||||
<div class="tab active" data-panel="submit">Submit Audio</div>
|
||||
<div class="tab" data-panel="search">Search by Audio</div>
|
||||
</div>
|
||||
|
||||
<!-- Submit Panel -->
|
||||
<div class="panel active" id="panel-submit">
|
||||
<div class="drop-zone" id="submit-drop">
|
||||
<div class="icon">🎤</div>
|
||||
<p>Drop or click to browse an audio file (mp3, wav, m4a, flac, ogg)</p>
|
||||
<input type="file" accept="audio/*" id="submit-file">
|
||||
</div>
|
||||
<div id="submit-preview" style="display:none">
|
||||
<div class="preview">
|
||||
<div class="preview-meta">
|
||||
<div class="preview-name" id="submit-file-name"></div>
|
||||
<div class="preview-details" id="submit-file-details"></div>
|
||||
</div>
|
||||
<audio controls id="submit-audio"></audio>
|
||||
<textarea class="description" id="submit-description"
|
||||
placeholder="Describe this audio (optional) — e.g. 'voice memo about the Q3 roadmap'"></textarea>
|
||||
<button id="submit-btn">Submit to Database</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="status" id="submit-status"></div>
|
||||
<div id="submit-segments"></div>
|
||||
</div>
|
||||
|
||||
<!-- Search Panel -->
|
||||
<div class="panel" id="panel-search">
|
||||
<div class="drop-zone" id="search-drop">
|
||||
<div class="icon">🔍</div>
|
||||
<p>Drop or click to browse a query audio clip (max 60 seconds)</p>
|
||||
<input type="file" accept="audio/*" id="search-file">
|
||||
</div>
|
||||
<div id="search-preview" style="display:none">
|
||||
<div class="preview">
|
||||
<div class="preview-meta">
|
||||
<div class="preview-name" id="search-file-name"></div>
|
||||
<div class="preview-details" id="search-file-details"></div>
|
||||
</div>
|
||||
<audio controls id="search-audio"></audio>
|
||||
<button id="search-btn">Find Similar Audio</button>
|
||||
<div class="duration-error" id="search-duration-error" style="display:none"></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="status" id="search-status"></div>
|
||||
<div id="query-transcript-box"></div>
|
||||
<div class="results" id="search-results"></div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
const escape = s => (s || '').replace(/[&<>"']/g, c =>
|
||||
({'&':'&amp;','<':'&lt;','>':'&gt;','"':'&quot;',"'":'&#39;'}[c]));
|
||||
|
||||
const fmtTime = sec => {
|
||||
sec = Math.round(Number(sec) || 0);
|
||||
const m = Math.floor(sec / 60);
|
||||
const s = sec % 60;
|
||||
return `${m}:${s.toString().padStart(2, '0')}`;
|
||||
};
|
||||
|
||||
// Tabs
|
||||
document.querySelectorAll('.tab').forEach(tab => {
|
||||
tab.addEventListener('click', () => {
|
||||
document.querySelectorAll('.tab').forEach(t => t.classList.remove('active'));
|
||||
document.querySelectorAll('.panel').forEach(p => p.classList.remove('active'));
|
||||
tab.classList.add('active');
|
||||
document.getElementById('panel-' + tab.dataset.panel).classList.add('active');
|
||||
});
|
||||
});
|
||||
|
||||
// Drop zones
|
||||
function setupDropZone(opts) {
|
||||
const { dropEl, fileInput, audioEl, nameEl, detailsEl, previewEl, onLoaded } = opts;
|
||||
let selectedFile = null;
|
||||
|
||||
dropEl.addEventListener('click', () => fileInput.click());
|
||||
dropEl.addEventListener('dragover', e => { e.preventDefault(); dropEl.classList.add('dragover'); });
|
||||
dropEl.addEventListener('dragleave', () => dropEl.classList.remove('dragover'));
|
||||
dropEl.addEventListener('drop', e => {
|
||||
e.preventDefault();
|
||||
dropEl.classList.remove('dragover');
|
||||
if (e.dataTransfer.files.length) handleFile(e.dataTransfer.files[0]);
|
||||
});
|
||||
fileInput.addEventListener('change', () => {
|
||||
if (fileInput.files.length) handleFile(fileInput.files[0]);
|
||||
});
|
||||
|
||||
function handleFile(file) {
|
||||
selectedFile = file;
|
||||
const url = URL.createObjectURL(file);
|
||||
audioEl.src = url;
|
||||
nameEl.textContent = file.name;
|
||||
detailsEl.textContent = `${(file.size / 1024).toFixed(1)} KB`;
|
||||
previewEl.style.display = 'block';
|
||||
audioEl.addEventListener('loadedmetadata', () => {
|
||||
detailsEl.textContent = `${fmtTime(audioEl.duration)} · ${(file.size / 1024).toFixed(1)} KB`;
|
||||
if (onLoaded) onLoaded(audioEl.duration);
|
||||
}, { once: true });
|
||||
}
|
||||
|
||||
return { getFile: () => selectedFile };
|
||||
}
|
||||
|
||||
const submitZone = setupDropZone({
|
||||
dropEl: document.getElementById('submit-drop'),
|
||||
fileInput: document.getElementById('submit-file'),
|
||||
audioEl: document.getElementById('submit-audio'),
|
||||
nameEl: document.getElementById('submit-file-name'),
|
||||
detailsEl: document.getElementById('submit-file-details'),
|
||||
previewEl: document.getElementById('submit-preview'),
|
||||
});
|
||||
|
||||
const searchBtn = document.getElementById('search-btn');
|
||||
const searchErr = document.getElementById('search-duration-error');
|
||||
|
||||
const searchZone = setupDropZone({
|
||||
dropEl: document.getElementById('search-drop'),
|
||||
fileInput: document.getElementById('search-file'),
|
||||
audioEl: document.getElementById('search-audio'),
|
||||
nameEl: document.getElementById('search-file-name'),
|
||||
detailsEl: document.getElementById('search-file-details'),
|
||||
previewEl: document.getElementById('search-preview'),
|
||||
onLoaded: dur => {
|
||||
if (dur > 60.5) {
|
||||
searchBtn.disabled = true;
|
||||
searchErr.style.display = 'block';
|
||||
searchErr.textContent = `Too long (${fmtTime(dur)}). Search query must be ≤ 60s. Trim it locally and try again.`;
|
||||
} else {
|
||||
searchBtn.disabled = false;
|
||||
searchErr.style.display = 'none';
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
function showStatus(id, msg, type) {
|
||||
const el = document.getElementById(id);
|
||||
el.className = 'status show ' + type;
|
||||
el.innerHTML = msg;
|
||||
}
|
||||
|
||||
function clearStatus(id) {
|
||||
const el = document.getElementById(id);
|
||||
el.className = 'status';
|
||||
el.innerHTML = '';
|
||||
}
|
||||
|
||||
// Stats
|
||||
async function loadStats() {
|
||||
try {
|
||||
const resp = await fetch('/api/stats');
|
||||
if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
const data = await resp.json();
|
||||
document.getElementById('stats').textContent =
|
||||
`${data.total_clips} clips · ${data.total_segments} segments in database`;
|
||||
} catch(e) {
|
||||
document.getElementById('stats').textContent = 'Could not load stats';
|
||||
}
|
||||
}
|
||||
loadStats();
|
||||
|
||||
// Submit
|
||||
document.getElementById('submit-btn').addEventListener('click', async () => {
|
||||
const file = submitZone.getFile();
|
||||
if (!file) return;
|
||||
const btn = document.getElementById('submit-btn');
|
||||
btn.disabled = true;
|
||||
document.getElementById('submit-segments').innerHTML = '';
|
||||
showStatus('submit-status',
|
||||
'<span class="loading"></span> Transcribing and embedding (this can take 10-30s on CPU)...',
|
||||
'info');
|
||||
|
||||
const form = new FormData();
|
||||
form.append('file', file);
|
||||
form.append('description', document.getElementById('submit-description').value);
|
||||
|
||||
try {
|
||||
const resp = await fetch('/api/submit', { method: 'POST', body: form });
|
||||
const data = await resp.json();
|
||||
if (resp.ok) {
|
||||
showStatus('submit-status',
|
||||
`Stored! ${data.segments_added} segment(s) added — ${data.total_segments} total segments in DB`,
|
||||
'success');
|
||||
document.getElementById('submit-description').value = '';
|
||||
|
||||
if (data.segments && data.segments.length) {
|
||||
const listHtml = data.segments.map(s => {
|
||||
const txtCls = s.transcript ? 'text' : 'text empty';
|
||||
const txt = s.transcript || '(no speech detected)';
|
||||
return `<div class="seg">
|
||||
<span class="range">${fmtTime(s.start_sec)}–${fmtTime(s.end_sec)}</span>
|
||||
<span class="${txtCls}">${escape(txt)}</span>
|
||||
</div>`;
|
||||
}).join('');
|
||||
document.getElementById('submit-segments').innerHTML =
|
||||
`<div class="segment-list"><div style="margin-bottom:0.4rem">Transcribed segments:</div>${listHtml}</div>`;
|
||||
}
|
||||
|
||||
loadStats();
|
||||
} else {
|
||||
showStatus('submit-status', 'Error: ' + escape(typeof data.detail === 'string' ? data.detail : JSON.stringify(data)), 'error');
|
||||
}
|
||||
} catch(e) {
|
||||
showStatus('submit-status', 'Request failed: ' + escape(e.message), 'error');
|
||||
}
|
||||
btn.disabled = false;
|
||||
});
|
||||
|
||||
// Search
|
||||
searchBtn.addEventListener('click', async () => {
|
||||
const file = searchZone.getFile();
|
||||
if (!file) return;
|
||||
searchBtn.disabled = true;
|
||||
showStatus('search-status',
|
||||
'<span class="loading"></span> Transcribing query and searching...',
|
||||
'info');
|
||||
document.getElementById('search-results').innerHTML = '';
|
||||
document.getElementById('query-transcript-box').innerHTML = '';
|
||||
|
||||
const form = new FormData();
|
||||
form.append('file', file);
|
||||
|
||||
try {
|
||||
const resp = await fetch('/api/search', { method: 'POST', body: form });
|
||||
const data = await resp.json();
|
||||
if (resp.ok) {
|
||||
if (data.query_transcript) {
|
||||
document.getElementById('query-transcript-box').innerHTML = `
|
||||
<div class="query-transcript">
|
||||
<div class="label">Query transcript</div>
|
||||
<div>${escape(data.query_transcript)}</div>
|
||||
</div>`;
|
||||
}
|
||||
if (!data.results || data.results.length === 0) {
|
||||
showStatus('search-status', data.message || 'No results found.', 'info');
|
||||
} else {
|
||||
showStatus('search-status', `Found ${data.results.length} results`, 'success');
|
||||
const container = document.getElementById('search-results');
|
||||
data.results.forEach((r, i) => {
|
||||
const card = document.createElement('div');
|
||||
card.className = 'result-card';
|
||||
const descHtml = r.description
|
||||
? `<div class="description">“${escape(r.description)}”</div>` : '';
|
||||
const transcriptCls = r.transcript ? 'transcript' : 'transcript empty';
|
||||
const transcriptText = r.transcript || '(no speech detected)';
|
||||
const rangeLabel = r.is_full_clip
|
||||
? `full clip · ${fmtTime(r.parent_duration_sec)}`
|
||||
: `${fmtTime(r.start_sec)}–${fmtTime(r.end_sec)} of ${fmtTime(r.parent_duration_sec)}`;
|
||||
const segUrl = `/api/segment/${encodeURIComponent(r.parent_filename)}?start=${r.start_sec}&end=${r.end_sec}`;
|
||||
const fullUrl = `/api/audio/${encodeURIComponent(r.parent_filename)}`;
|
||||
const showFull = !r.is_full_clip;
|
||||
card.innerHTML = `
|
||||
<div class="result-card-header">
|
||||
<span class="rank">#${i + 1}</span>
|
||||
<span class="similarity">${(r.similarity * 100).toFixed(1)}%</span>
|
||||
<span class="orig-name">${escape(r.parent_original_name)}</span>
|
||||
<span class="segment-range">${rangeLabel}</span>
|
||||
</div>
|
||||
${descHtml}
|
||||
<div class="${transcriptCls}">${escape(transcriptText)}</div>
|
||||
<audio controls src="${segUrl}"></audio>
|
||||
${showFull ? `
|
||||
<div class="full-toggle" data-url="${fullUrl}">▸ Play full clip</div>
|
||||
<audio controls class="full-audio" preload="none"></audio>
|
||||
` : ''}
|
||||
`;
|
||||
container.appendChild(card);
|
||||
});
|
||||
|
||||
container.querySelectorAll('.full-toggle').forEach(toggle => {
|
||||
toggle.addEventListener('click', () => {
|
||||
const audio = toggle.parentElement.querySelector('.full-audio');
|
||||
if (audio.classList.contains('show')) {
|
||||
audio.classList.remove('show');
|
||||
audio.pause();
|
||||
toggle.textContent = '▸ Play full clip';
|
||||
} else {
|
||||
if (!audio.src) audio.src = toggle.dataset.url;
|
||||
audio.classList.add('show');
|
||||
toggle.textContent = '▾ Hide full clip';
|
||||
}
|
||||
});
|
||||
});
|
||||
}
|
||||
} else {
|
||||
showStatus('search-status', 'Error: ' + escape(typeof data.detail === 'string' ? data.detail : JSON.stringify(data)), 'error');
|
||||
}
|
||||
} catch(e) {
|
||||
showStatus('search-status', 'Request failed: ' + escape(e.message), 'error');
|
||||
}
|
||||
searchBtn.disabled = false;
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
21
examples/audio_meaning_db/docker-compose.yml
Normal file
@ -0,0 +1,21 @@
|
||||
services:
|
||||
backend:
|
||||
build: ./backend
|
||||
ports:
|
||||
- "8082:8080"
|
||||
volumes:
|
||||
- audio_store:/app/audio
|
||||
- chroma_data:/app/chroma_data
|
||||
- hf_cache:/root/.cache/huggingface
|
||||
environment:
|
||||
- ASR_MODEL=openai/whisper-base
|
||||
- TEXT_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
|
||||
deploy:
|
||||
resources:
|
||||
limits:
|
||||
memory: 4G
|
||||
|
||||
volumes:
|
||||
audio_store:
|
||||
chroma_data:
|
||||
hf_cache:
|
||||
160
examples/everything_function/README.md
Normal file
@ -0,0 +1,160 @@
|
||||
# everything_function
|
||||
|
||||
A folder full of Python functions. Ten of them. They do ten different things — arithmetic, polynomial root-finding, prime factorization, sentiment analysis, translation, summarization, turning a messy paragraph into an action list, labeling photos, generating recipes from food pictures, reading text out of an image.
|
||||
|
||||
Open any of them and you will find roughly the same code:
|
||||
|
||||
```python
|
||||
def ai_something(input):
|
||||
prompt = "...some text describing the task, plus a couple of examples..."
|
||||
return ask(prompt) # ← exact same call every time
|
||||
```
|
||||
|
||||
The thing they all share is `ask` — a four-line wrapper around a local AI model. Each function's "logic" lives entirely in the prompt. There is no `if`/`else`. There is no library doing the real work behind the scenes. There is a description of a task, and there is a model predicting what should come next.
|
||||
|
||||
If that's all you take from this example, it's the right thing to take.
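
The real `ask` wrapper is not shown in this README, so the block below is only a minimal sketch of what such a wrapper could look like. It assumes Ollama's `/api/generate` endpoint on the default port and uses the standard library rather than `httpx`; the names `build_payload`, `OLLAMA_URL`, and the default values are illustrative assumptions, not the repo's code.

```python
import json
import os
import urllib.request

# Assumed defaults; the real scripts may resolve these differently.
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen3.5:9b")


def build_payload(prompt: str, model: str = OLLAMA_MODEL) -> dict:
    """Build the JSON body for a single non-streaming generation request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                # one JSON object back instead of a stream
        "options": {"temperature": 0},  # favour repeatable answers
    }


def ask(prompt: str) -> str:
    """Send the prompt to the local model and return its text reply."""
    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"].strip()
```

Splitting out `build_payload` keeps the network call trivial and makes the request body easy to inspect without a running model.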
|
||||
|
||||
## The headline
|
||||
|
||||
For a wide and growing set of problems, **you can get a usable result by describing the task in plain English and showing a couple of examples** — faster, cheaper, and often *better* than you could by collecting a dataset and training a model specifically for that task. The interesting thing isn't any single demo in this folder. It's that every demo is the same code.
|
||||
|
||||
A Python function used to mean "a body of code I wrote." It can also mean "a body of prompt the model continues." Once you see that, you start noticing how much of the work in your life could be described that way.
|
||||
|
||||
## What you actually run
|
||||
|
||||
Three pieces, all started by the one `docker-compose.yml`:
|
||||
|
||||
1. **The `ollama` container** — runs a vision-capable open-weight model (`qwen3.5:9b` by default) locally and exposes an HTTP API. Nothing leaves your machine.
|
||||
2. **The `web` container** — a small FastAPI service that imports the same `ai_xxx` functions used by the terminal scripts and exposes them on a browser-friendly page at <http://localhost:8082>. This is the one to project for a class.
|
||||
3. **The scripts in `scripts/`** — small Python programs you run on your host. Each one is a single concept ("AI as a translator", "AI as a prime factorizer") with its own canned examples followed by an interactive REPL. This is the one to read when you want to see exactly what the prompt looks like.
|
||||
|
||||
The web UI and the terminal scripts call the *same Python functions*. The web container just wraps them in HTTP. There is no second implementation.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
Docker Engine + Compose (see [`../../reference/docker/`](../../reference/docker/)). Python 3.10+ on the host if you also want to run the terminal scripts directly (which you should — they're the most instructive view).
|
||||
|
||||
### Bring everything up
|
||||
|
||||
From this folder:
|
||||
|
||||
```bash
|
||||
docker compose up -d --build
|
||||
```
|
||||
|
||||
That starts `ollama`, kicks off a one-shot `model-puller` container that downloads `qwen3.5:9b` (~6 GB) into a Docker volume, and starts the `web` service. Watch the model download with:
|
||||
|
||||
```bash
|
||||
docker compose logs -f model-puller
|
||||
```
|
||||
|
||||
When you see `Model qwen3.5:9b is ready.`, you're good to go. Subsequent runs reuse the volume, so the model is downloaded only once.
|
||||
|
||||
### Use the web UI
|
||||
|
||||
Open <http://localhost:8082> in a browser. Pick a demo from the sidebar (Math / Text / Vision), tweak the inputs, hit Run. Math demos show AI vs. Python side-by-side. Vision demos let you pick a sample image or upload your own.
|
||||
|
||||
### Use the terminal scripts
|
||||
|
||||
Install the small host-side Python deps once:
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
Then run any demo:
|
||||
|
||||
```bash
|
||||
cd scripts
|
||||
python arithmetic.py
|
||||
python algebra_roots.py
|
||||
python prime_factorization.py
|
||||
# ...etc
|
||||
```
|
||||
|
||||
Each script prints a handful of canned examples and then drops into an interactive REPL — type your own inputs at the `>` prompt, type `q` to quit. Vision demos default to images in `sample_images/`; pass `--image PATH` to try your own.
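
The interactive loop described above can be sketched in a few lines. This is an illustration under assumptions, not the scripts' actual code: `repl` and its `read`/`write` parameters are hypothetical names, and `handler` stands in for whichever `ai_xxx` function a script wraps.

```python
from typing import Callable


def repl(handler: Callable[[str], str],
         read: Callable[[str], str] = input,
         write: Callable[[str], None] = print) -> None:
    """Read lines at a '> ' prompt, print handler(line), stop on 'q' or end of input."""
    while True:
        try:
            line = read("> ").strip()
        except EOFError:
            break  # Ctrl-D or a closed pipe ends the session
        if line.lower() == "q":
            break
        if line:  # ignore empty lines
            write(handler(line))
```

Passing `read` and `write` as parameters is a design choice for testability: the same loop can be driven by canned inputs instead of a keyboard.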
|
||||
|
||||
### Pick a different model
|
||||
|
||||
Anything in the [Ollama library](https://ollama.com/library) that fits in your RAM will work. The vision demos need a vision-capable model. To swap:
|
||||
|
||||
```bash
|
||||
OLLAMA_MODEL=qwen2.5vl:3b docker compose up -d # smaller, faster, less accurate
|
||||
```
|
||||
|
||||
The terminal scripts read `OLLAMA_MODEL` from the environment too, so set it in your shell before running them.
|
||||
|
||||
### Tear it down
|
||||
|
||||
```bash
|
||||
docker compose down # stop containers, keep the downloaded model
|
||||
docker compose down -v # also delete the model (you'll re-download next time)
|
||||
```
|
||||
|
||||
## The four-ish gears of getting AI to do a thing
|
||||
|
||||
When you have a task and want to use AI for it, this is roughly the order to try things in. Start at the top. Stop when it's good enough.
|
||||
|
||||
1. **Zero-shot.** Just describe the task. *"Translate this sentence into Japanese."* Often this is all you need, especially with a frontier model. Cost: nothing. Time: one prompt.
|
||||
|
||||
2. **Few-shot.** Same prompt, plus a couple of input/output examples. This is what every script in this folder is doing — those little blocks of `Sentence: ... / Sentiment: positive` examples teach the model the output format you want without retraining anything. The GPT-3 paper called this *in-context learning* and made it the headline result, because nobody had quite believed how much of a difference a few examples in the prompt could make. Cost: still nothing. Time: one prompt, maybe slightly longer.
|
||||
|
||||
3. **Try a stronger model, or wait for one.** This sounds glib, but it's real: a task that just barely fails on a 7B local model may work on a 32B model, on a frontier model, or on next year's 7B model. Capability is moving fast enough that "wait six months" is sometimes a legitimate engineering plan. The 9B vision model you're running here would have been an unimaginable result in 2020.
|
||||
|
||||
4. **Fine-tune a model.** Take an open-weight model and continue training it on examples specific to your problem. This is real work — you need a dataset, GPU time, and a feedback loop — but the bar is much lower than it used to be. A nice trick: you can often use a frontier model to *generate the first version of your training set*, before you go and collect your own data.
|
||||
|
||||
5. **Build the whole pipeline.** Custom data, custom architecture, custom training. This is what you do when none of the above is enough and the problem is worth real money. Most projects never need to come down here.
|
||||
|
||||
Most personal projects live happily in step 1 or step 2. Most of the demos in this folder live in step 2. Try step 1 first; reach for step 2 when the answers are inconsistent.
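
The difference between steps 1 and 2 is only what goes into the prompt string. The sketch below shows one way to assemble it; `few_shot_prompt` and its label parameters are hypothetical names chosen for illustration, and the example sentences are made up. With an empty example list it produces a zero-shot prompt.

```python
def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str,
                    input_label: str = "Sentence",
                    output_label: str = "Sentiment") -> str:
    """Assemble a task description, worked examples, and the new input into one prompt."""
    lines = [task, ""]
    for text, answer in examples:
        lines.append(f"{input_label}: {text}")
        lines.append(f"{output_label}: {answer}")
        lines.append("")
    # End on the open output label so the model's continuation is the answer.
    lines.append(f"{input_label}: {query}")
    lines.append(f"{output_label}:")
    return "\n".join(lines)


prompt = few_shot_prompt(
    "Label each sentence as positive, negative, or neutral.",
    [("I love this phone.", "positive"), ("The train was late again.", "negative")],
    "The meeting is at 3 pm.",
)
print(prompt)
```

Ending the prompt on an unfinished `Sentiment:` line is what nudges the model to answer in the same format as the examples.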
|
||||
|
||||
## Local open-weight model vs. paying a frontier lab
|
||||
|
||||
We're running a local model in this example. That is a deliberate choice, not the only choice, and it's worth being explicit about the tradeoff.
|
||||
|
||||
**Why local / open-weight is the default for this workshop:**
|
||||
|
||||
- You are in full control. The model runs on your hardware, your inputs never leave your machine, and the container can run on an airplane.
|
||||
- Nothing to sign up for, nothing to pay for, nothing to expire.
|
||||
- You can see *what the model is*. It's a file. You can copy it, version it, swap it out. It's not a magic URL controlled by a company that might change its mind.
|
||||
- Most importantly: when you can run the thing yourself, you stop being intimidated by it. It becomes another piece of software.
|
||||
|
||||
**When a frontier API (Anthropic, OpenAI, Google) is a better fit:**
|
||||
|
||||
- You want the strongest possible quality and don't want to wait for open weights to catch up.
|
||||
- You're building infrastructure and the model is the cheap part of the stack.
|
||||
- You're prototyping and don't care which model wins — you just want to see whether *any* AI can solve your problem before committing to running one.
|
||||
- You want a model that is *currently* better than anything you can run at home — frontier models are typically 6–18 months ahead of what fits on a laptop, and that gap matters for hard tasks.
|
||||
|
||||
A reasonable workflow: prototype on a frontier API, then swap in a local model once you know the problem is solvable and want to bring the work back in-house. The Python code on your side barely changes — you swap the HTTP endpoint and maybe tweak a prompt.
|
||||
|
||||
## What's in `scripts/`
|
||||
|
||||
Each file defines one or more functions. Each function is a "smart Python function" backed by the same `ask` call.
|
||||
|
||||
| Script | What it does | Hand-written equivalent |
|
||||
|--------|--------------|-------------------------|
|
||||
| `arithmetic.py` | Add, subtract, multiply, divide. | `+`, `-`, `*`, `/`. We compare side-by-side. |
|
||||
| `algebra_roots.py` | Find the real roots of a polynomial. Pretty-prints the polynomial first. | `numpy.roots`. Same — compared side-by-side. |
|
||||
| `prime_factorization.py` | Factor an integer into its primes. | Trial division. Compared side-by-side. |
|
||||
| `sentiment.py` | Label a sentence as positive / negative / neutral. | Used to be a research problem. There isn't a clean built-in. |
|
||||
| `translate.py` | Translate text into any target language. | An API call to Google Translate, basically. |
|
||||
| `summarize.py` | Shrink a passage to a target word count. | No clean built-in. |
|
||||
| `action_list.py` | Turn a messy stream-of-consciousness into a bulleted list of actions. | None. This is the kind of thing you can only really do this way. |
|
||||
| `image_label.py` | "What's in this picture?" | An image classifier. Used to require training one. |
|
||||
| `recipe_from_food.py` | Picture of food → ingredients and steps. | None. Two tasks combined: identify the dish, then generate a recipe. |
|
||||
| `ocr.py` | Read printed text out of an image. | Tesseract / ABBYY / paid OCR APIs. |
|
||||
|
||||
The first three are the most important. If you watch a 9B local model add two-digit numbers correctly because you wrote `"2 + 3 = 5"` at the top of a prompt — and then watch the exact same code shape factor 360 into primes — you have seen the central trick. Everything else in the folder is the same trick pointed at fancier problems.
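
For the side-by-side comparison, the hand-written half is ordinary trial division. The sketch below is an assumption about how such a check could be written, not the code in `prime_factorization.py`; `check_model_factors` is a hypothetical helper that tests a model's claimed factors against the computed ones.

```python
def trial_division(n: int) -> list[int]:
    """Return the prime factors of n (n >= 2) in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:  # divide out each prime as many times as it fits
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # whatever remains is itself prime
        factors.append(n)
    return factors


def check_model_factors(n: int, claimed: list[int]) -> bool:
    """True only if the claimed factors are exactly the prime factorization of n."""
    return sorted(claimed) == trial_division(n)
```

Because prime factorizations are unique, comparing sorted lists catches both wrong products and non-prime "factors" such as 4 or 10.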
|
||||
|
||||
## Where this comes from (optional reading)
|
||||
|
||||
The "give the model a couple of examples in the prompt" technique is the headline result of [the GPT-3 paper](../../reference/papers/2020_05_28_gpt_3.pdf) (Brown et al., 2020, *Language Models are Few-Shot Learners*). The fact that a vision-language model can answer questions about a picture comes out of [CLIP](../../reference/papers/2021_02_26_CLIP.pdf) (Radford et al., 2021) and what came after it. The reason these models follow your instructions instead of going off and writing related trivia comes from [InstructGPT](../../reference/papers/2022_03_04_instructGPT.pdf) (Ouyang et al., 2022). And if you ever find yourself adding *"let's think step by step"* to a prompt to get a better answer, that's [Chain-of-Thought Prompting](../../reference/papers/2022_01_28_chain_of_thought.pdf) (Wei et al., 2022).
|
||||
|
||||
You do not need any of that to run the demos. Read them if you want to know why the technique works at all.
|
||||
|
||||
## Caveats worth being honest about
|
||||
|
||||
- **The model is wrong sometimes.** For arithmetic with big numbers, for ambiguous prompts, for tasks at the edge of its training. Watch for that — calibration matters more than enthusiasm. Prime factorization is a fun way to break it: small numbers work fine, but it confidently makes up factors for big ones.
|
||||
- **A 9B model is not a frontier model.** If a demo looks shaky, try the same prompt on a frontier API and see whether it's the technique that's failing or just this particular model. Usually it's the latter.
|
||||
- **Temperature 0 ≠ deterministic.** Even with `temperature=0`, the same prompt can give slightly different answers across runs, depending on the version of Ollama and the model. Don't be surprised.
|
||||
- **None of this is engineering best practice.** It's an *example*. Real production AI systems have output validation, retries, structured-output enforcement, monitoring, and so on. Start simple; complicate only when you need to.
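
As a small taste of the output validation and retries mentioned in the last caveat, here is a minimal sketch for the sentiment case. It is not part of the repo: `ask_validated` and `ALLOWED` are hypothetical names, and the `ask` argument is any callable that takes a prompt and returns the model's text.

```python
from typing import Callable

ALLOWED = {"positive", "negative", "neutral"}


def ask_validated(ask: Callable[[str], str], prompt: str, retries: int = 2) -> str:
    """Call ask(prompt) until the reply is an allowed label, with up to `retries` extra attempts."""
    for _ in range(retries + 1):
        # Normalize case and a trailing period before checking the label.
        reply = ask(prompt).strip().lower().rstrip(".")
        if reply in ALLOWED:
            return reply
    raise ValueError(f"no valid label after {retries + 1} attempts")
```

Raising after the last attempt, rather than returning a best guess, keeps a bad model reply from silently flowing into later code.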
|
||||
63
examples/everything_function/docker-compose.yml
Normal file
@ -0,0 +1,63 @@
|
||||
services:
|
||||
ollama:
|
||||
image: ollama/ollama:latest
|
||||
ports:
|
||||
- "11434:11434"
|
||||
volumes:
|
||||
- ollama_models:/root/.ollama
|
||||
healthcheck:
|
||||
test: ["CMD", "ollama", "list"]
|
||||
interval: 5s
|
||||
timeout: 5s
|
||||
retries: 60
|
||||
# Uncomment the block below if you have an NVIDIA GPU and the
|
||||
# NVIDIA container toolkit installed. Vision models are dramatically
|
||||
# faster with a GPU, but everything in this example also runs on CPU.
|
||||
# deploy:
|
||||
# resources:
|
||||
# reservations:
|
||||
# devices:
|
||||
# - driver: nvidia
|
||||
# count: all
|
||||
# capabilities: [gpu]
|
||||
|
||||
model-puller:
|
||||
image: ollama/ollama:latest
|
||||
depends_on:
|
||||
ollama:
|
||||
condition: service_healthy
|
||||
environment:
|
||||
- OLLAMA_HOST=http://ollama:11434
|
||||
entrypoint: ["/bin/bash", "-c"]
|
||||
# `ollama pull` shows a redrawing progress bar with carriage returns and
|
||||
# ANSI cursor-hide codes. Without a TTY (which is the case here) that
|
||||
# comes out of `docker logs` as invisible output plus a hijacked cursor.
|
||||
# The pipeline below converts \r into \n so each progress update becomes
|
||||
# its own log line, and strips ANSI escape sequences so the cursor is
|
||||
# left alone.
|
||||
command:
|
||||
- |
|
||||
set -eo pipefail
|
||||
echo "Pulling ${OLLAMA_MODEL:-qwen3.5:9b} (this may take a while on first run)..."
|
||||
ollama pull "${OLLAMA_MODEL:-qwen3.5:9b}" 2>&1 \
|
||||
| stdbuf -oL tr '\r' '\n' \
|
||||
| stdbuf -oL sed -E 's/\x1b\[[?0-9;]*[a-zA-Z]//g'
|
||||
echo "Model ${OLLAMA_MODEL:-qwen3.5:9b} is ready. You can now run the demo scripts on the host."
|
||||
restart: "no"
|
||||
|
||||
web:
|
||||
build: ./web
|
||||
ports:
|
||||
- "8082:8080"
|
||||
environment:
|
||||
- OLLAMA_URL=http://ollama:11434
|
||||
- OLLAMA_MODEL=${OLLAMA_MODEL:-qwen3.5:9b}
|
||||
volumes:
|
||||
- ./scripts:/app/scripts:ro
|
||||
- ./sample_images:/app/sample_images:ro
|
||||
depends_on:
|
||||
ollama:
|
||||
condition: service_healthy
|
||||
|
||||
volumes:
|
||||
ollama_models:
|
||||
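The comment on the `model-puller` command describes two transformations: carriage returns become newlines, and ANSI escape sequences are stripped. As a rough illustration only, the same two steps can be restated in Python. The function name `clean_progress_output` is hypothetical and is not part of this repo; the regular expression is copied from the `sed` expression above.

```python
import re

# Same pattern as the sed expression in the compose file:
# ESC, "[", any run of "?", digits or ";", then a single letter.
ANSI_CSI = re.compile(r"\x1b\[[?0-9;]*[a-zA-Z]")


def clean_progress_output(raw: str) -> str:
    """Turn redrawn progress output into plain, line-oriented log text."""
    # Step 1 (the `tr '\r' '\n'` stage): each redraw becomes its own line.
    text = raw.replace("\r", "\n")
    # Step 2 (the `sed` stage): drop cursor-hide/show and similar sequences.
    return ANSI_CSI.sub("", text)


if __name__ == "__main__":
    sample = "\x1b[?25lpulling 10%\rpulling 55%\rpulling 100%\x1b[?25h\n"
    print(clean_progress_output(sample), end="")
```

The shell pipeline stays the better choice inside the container, since it needs no Python runtime; the sketch is only meant to make the two stages easy to test in isolation.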
2
examples/everything_function/requirements.txt
Normal file
@ -0,0 +1,2 @@
|
||||
httpx==0.28.1
|
||||
numpy==2.2.1
|
||||
13
examples/everything_function/sample_images/SOURCES.md
Normal file
@ -0,0 +1,13 @@
|
||||
# Sample images
|
||||
|
||||
Tiny demo images for the vision scripts. Each was downsized to ~640px on the long side and JPEG-compressed to keep the repo light.
|
||||
|
||||
| File | Source | Notes |
|
||||
|------|--------|-------|
|
||||
| `food_pizza.jpg` | Wikimedia Commons — `Eq_it-na_pizza-margherita_sep2005_sml.jpg` | Photo of a pizza margherita. Resized from the original. |
|
||||
| `animal_dog.jpg` | Wikimedia Commons — `Labrador_on_Quantock_(2175262184).jpg` | Photo of a labrador. Resized from the original. |
|
||||
| `text_notice.jpg` | Generated locally with PIL | Synthetic notice text — used as a controlled OCR target so the script always has predictable content to read. |
|
||||
|
||||
If you want to swap any of these out, drop a replacement into this folder and update the corresponding script's `DEFAULT_IMAGE` constant (or just pass `--image YOUR_FILE.jpg`).
|
||||
|
||||
Wikimedia source files are typically CC BY-SA / CC BY or public domain — see the file description pages on commons.wikimedia.org for the exact license for each original. The synthetic OCR image is original to this workshop and is covered by the repo's main LICENSE.
|
||||
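The note above says each image was downsized to roughly 640px on the long side. The repo does not show how that was done, so the following is only a sketch of the size arithmetic under that assumption. `fit_long_side` is a hypothetical helper, and the choice never to upscale smaller images is also an assumption.

```python
def fit_long_side(width: int, height: int, target: int = 640) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals `target`, keeping aspect ratio."""
    long_side = max(width, height)
    if long_side <= target:
        # Assumption: images already small enough are left alone.
        return width, height
    scale = target / long_side
    # Round to whole pixels and never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting pair could then be passed to whatever image library does the actual resize before saving the replacement JPEG into this folder.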
BIN
examples/everything_function/sample_images/animal_dog.jpg
Normal file
|
After Width: | Height: | Size: 73 KiB |
BIN
examples/everything_function/sample_images/food_pizza.jpg
Normal file
|
After Width: | Height: | Size: 61 KiB |
BIN
examples/everything_function/sample_images/text_notice.jpg
Normal file
|
After Width: | Height: | Size: 39 KiB |
66
examples/everything_function/scripts/action_list.py
Normal file
@ -0,0 +1,66 @@
|
||||
"""Turn a messy stream-of-consciousness description into a clean action list.
|
||||
|
||||
This one is closer to how you'll probably actually use AI in your own
|
||||
projects: glue logic that turns "the way a human thinks about something"
|
||||
into a structured artifact a program can act on.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def ai_action_list(description: str) -> str:
|
||||
return ask(
|
||||
"Read the description and output a clean bulleted list of concrete "
|
||||
"actions to take, one per line, each starting with '- '. No preamble, "
|
||||
"no closing remarks.\n"
|
||||
"Description: ok so I need to like, get groceries — milk, bread, maybe "
|
||||
"those frozen dumplings — and then drop off the package at the post "
|
||||
"office, oh and call mom back, she texted twice yesterday.\n"
|
||||
"Actions:\n"
|
||||
"- Buy groceries (milk, bread, frozen dumplings)\n"
|
||||
"- Drop off package at the post office\n"
|
||||
"- Call mom back\n"
|
||||
f"Description: {description}\n"
|
||||
"Actions:\n"
|
||||
"- "
|
||||
)
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
messy = (
|
||||
"ugh ok so the dishwasher is making that noise again, I should probably "
|
||||
"look at the filter or just call the repair guy, also the lawn is getting "
|
||||
"long again it's been like three weeks, and we're almost out of dog food "
|
||||
"I keep meaning to grab a bag, oh and Sarah's birthday is on Saturday I "
|
||||
"haven't gotten anything yet"
|
||||
)
|
||||
print("input:")
|
||||
print(messy)
|
||||
print("\naction list:")
|
||||
print("- " + ai_action_list(messy))
|
||||
print()
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("--- interactive ---")
|
||||
print("Paste a messy description and we'll convert it to an action list. 'q' to quit.\n")
|
||||
while True:
|
||||
try:
|
||||
text = input("> ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if text.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
if not text:
|
||||
continue
|
||||
print("\naction list:")
|
||||
print("- " + ai_action_list(text))
|
||||
print()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
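`ai_action_list` returns one string, and because the prompt ends with `"- "` the first action arrives without its bullet. If a program is going to act on the result, it may help to split that string into a real list first. A minimal sketch, assuming the model follows the "one action per line" format; `parse_action_list` is a hypothetical helper, not part of the script above.

```python
def parse_action_list(completion: str) -> list[str]:
    """Split a bulleted completion into a list of action strings."""
    actions = []
    for line in completion.splitlines():
        # Tolerate both "- Mow the lawn" and a bare first line with no bullet.
        item = line.strip().lstrip("-").strip()
        if item:
            actions.append(item)
    return actions
```

Treating every non-empty line as an action is deliberately forgiving; a stricter version could reject lines that lack the `- ` prefix.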
78
examples/everything_function/scripts/ai_function.py
Normal file
@ -0,0 +1,78 @@
|
||||
"""The one-and-only AI primitive used by every demo in this folder.
|
||||
|
||||
Every script in scripts/ imports `ask` from this module. The whole point of
|
||||
the example is that *every* "smart function" you'll see is the same call —
|
||||
just a different prompt going in, and a string coming back out. The model
|
||||
doesn't know whether it's doing arithmetic, sentiment analysis, or reading
|
||||
text out of a photo. It is, in every case, predicting what comes next.
|
||||
|
||||
The model runs locally inside the `ollama` container started by the
|
||||
docker-compose file alongside this script. We hit its HTTP API; nothing
|
||||
about your input ever leaves the machine.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import base64
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
import httpx
|
||||
|
||||
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
|
||||
OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen3.5:9b")
|
||||
|
||||
# Reasonable timeout for a CPU-only first run; vision calls can be slow.
|
||||
_TIMEOUT = httpx.Timeout(connect=10.0, read=300.0, write=60.0, pool=10.0)
|
||||
|
||||
|
||||
def ask(prompt: str, image: str | Path | None = None, *, temperature: float = 0.0) -> str:
|
||||
"""Ask the local model to continue some text.
|
||||
|
||||
Args:
|
||||
prompt: The text the model sees. Whatever you write here is the
|
||||
"body" of your AI function — the same way regular Python functions
|
||||
have a body of code, an AI function has a body of prompt.
|
||||
image: Optional path to a local image. When provided, the model
|
||||
sees the image alongside the prompt (this needs a vision-capable
|
||||
model; the default comes from `OLLAMA_MODEL`, `qwen3.5:9b` unless overridden).
|
||||
temperature: 0 makes outputs roughly deterministic, which is what we
|
||||
want for demos. Raise it for more variety; 0 is already the low end.
|
||||
|
||||
Returns:
|
||||
The model's continuation as a plain string, with surrounding whitespace
|
||||
stripped.
|
||||
"""
|
||||
payload: dict = {
|
||||
"model": OLLAMA_MODEL,
|
||||
"prompt": prompt,
|
||||
"stream": False,
|
||||
"options": {"temperature": temperature},
|
||||
# Some Qwen builds support a "thinking" mode where the model writes
|
||||
# out an internal monologue before answering. For this workshop we
|
||||
# want clean, direct completions, so we ask for that explicitly.
|
||||
"think": False,
|
||||
}
|
||||
|
||||
if image is not None:
|
||||
image_path = Path(image)
|
||||
if not image_path.exists():
|
||||
raise FileNotFoundError(f"Image not found: {image_path}")
|
||||
payload["images"] = [base64.b64encode(image_path.read_bytes()).decode("ascii")]
|
||||
|
||||
with httpx.Client(timeout=_TIMEOUT) as client:
|
||||
try:
|
||||
r = client.post(f"{OLLAMA_URL}/api/generate", json=payload)
|
||||
except httpx.ConnectError as e:
|
||||
raise RuntimeError(
|
||||
f"Could not reach Ollama at {OLLAMA_URL}. "
|
||||
"Is the docker-compose stack running? Try `docker compose up -d`."
|
||||
) from e
|
||||
r.raise_for_status()
|
||||
return r.json()["response"].strip()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Sanity check — confirms the model is reachable and answering.
|
||||
print(f"Asking {OLLAMA_MODEL} at {OLLAMA_URL} to say hello...")
|
||||
print(ask("Say hello in five words or fewer."))
|
||||
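To see exactly what `ask` sends without a running Ollama container, the payload construction can be pulled out into a pure function. This is a sketch that mirrors the fields used above (`model`, `prompt`, `stream`, `options`, `think`, `images`); `build_generate_payload` is a hypothetical name, and taking raw `image_bytes` instead of a path is a simplification made for the example.

```python
from __future__ import annotations

import base64


def build_generate_payload(
    prompt: str,
    image_bytes: bytes | None = None,
    *,
    model: str = "qwen3.5:9b",
    temperature: float = 0.0,
) -> dict:
    """Build the JSON body that `ask` posts to /api/generate."""
    payload: dict = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
        "think": False,
    }
    if image_bytes is not None:
        # Images travel as base64 text inside the JSON body.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload
```

Printing the returned dictionary for a text-only prompt and for a prompt with an image makes the point of the module docstring concrete: the only thing that changes between demos is the prompt and, optionally, the `images` field.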
115
examples/everything_function/scripts/algebra_roots.py
Normal file
@ -0,0 +1,115 @@
|
||||
"""Find the roots of a polynomial by *asking* the model for them.
|
||||
|
||||
Compare against numpy's actual numerical root finder. For nice polynomials
|
||||
with integer roots, the AI often gets there. For uglier ones, it makes
|
||||
plausible-looking guesses that don't quite check out. Both behaviors are
|
||||
interesting — and both are what you should expect from a system that
|
||||
"reasons" by predicting the most likely next characters.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import numpy as np
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def format_polynomial(coeffs: list[float]) -> str:
|
||||
"""Render coefficients as a readable expression like '2x^3 - 5x + 1'."""
|
||||
terms = []
|
||||
degree = len(coeffs) - 1
|
||||
for i, c in enumerate(coeffs):
|
||||
if c == 0:
|
||||
continue
|
||||
power = degree - i
|
||||
# Coefficient formatting: drop trailing zeros, keep sign in front.
|
||||
abs_str = f"{abs(c):g}"
|
||||
sign = "+" if c > 0 else "-"
|
||||
if power == 0:
|
||||
body = abs_str
|
||||
else:
|
||||
coef = "" if abs_str == "1" else abs_str
|
||||
var = "x" if power == 1 else f"x^{power}"
|
||||
body = f"{coef}{var}"
|
||||
terms.append(f"{sign} {body}")
|
||||
if not terms:
|
||||
return "0"
|
||||
expr = " ".join(terms)
|
||||
# Drop the leading "+ " if positive.
|
||||
if expr.startswith("+ "):
|
||||
expr = expr[2:]
|
||||
elif expr.startswith("- "):
|
||||
expr = "-" + expr[2:]
|
||||
return expr
|
||||
|
||||
|
||||
def ai_polynomial_roots(coeffs: list[float]) -> str:
|
||||
"""Same shape as numpy.roots: pass coefficients high-degree to low-degree."""
|
||||
polynomial = format_polynomial(coeffs)
|
||||
return ask(
|
||||
"Find the real roots of the polynomial. Output only a Python list of "
|
||||
"numbers rounded to 3 decimal places, like [1.0, -2.0]. No explanation.\n"
|
||||
"Polynomial: x^2 - 5x + 6\n"
|
||||
"Roots: [2.0, 3.0]\n"
|
||||
"Polynomial: x^2 - 4\n"
|
||||
"Roots: [-2.0, 2.0]\n"
|
||||
"Polynomial: x^3 - 6x^2 + 11x - 6\n"
|
||||
"Roots: [1.0, 2.0, 3.0]\n"
|
||||
f"Polynomial: {polynomial}\n"
|
||||
"Roots: "
|
||||
)
|
||||
|
||||
|
||||
def py_polynomial_roots(coeffs: list[float]) -> list[float]:
|
||||
return sorted(float(round(r.real, 3)) for r in np.roots(coeffs) if abs(r.imag) < 1e-6)  # keep real roots only
|
||||
|
||||
|
||||
def _run_one(coeffs: list[float]) -> None:
|
||||
print(f" coefficients: {coeffs}")
|
||||
print(f" polynomial: {format_polynomial(coeffs)}")
|
||||
ai_out = ai_polynomial_roots(coeffs)
|
||||
py_out = py_polynomial_roots(coeffs)
|
||||
print(f" AI says: {ai_out}")
|
||||
print(f" numpy says: {py_out}")
|
||||
print()
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
cases = [
|
||||
[1, -7, 12], # roots 3, 4
|
||||
[1, 0, -9], # roots -3, 3
|
||||
[1, -6, 11, -6], # roots 1, 2, 3
|
||||
[2, -3, -11, 6], # roots 3, -2, 0.5
|
||||
]
|
||||
for coeffs in cases:
|
||||
_run_one(coeffs)
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("--- interactive ---")
|
||||
print(
|
||||
"Enter polynomial coefficients high-degree to low-degree, separated by "
|
||||
"commas or spaces. Example: `1, -7, 12` is x^2 - 7x + 12. 'q' to quit.\n"
|
||||
)
|
||||
while True:
|
||||
try:
|
||||
raw = input("coefficients > ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if raw.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
try:
|
||||
coeffs = [float(p) for p in raw.replace(",", " ").split() if p]
|
||||
except ValueError:
|
||||
print(" could not parse those as numbers, try again")
|
||||
continue
|
||||
if len(coeffs) < 2:
|
||||
print(" need at least 2 coefficients (a linear polynomial)")
|
||||
continue
|
||||
_run_one(coeffs)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
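The script prints the AI answer and the numpy answer side by side and leaves the comparison to the reader. One possible way to compare them automatically is to parse the model's string as a Python literal and check it against the reference within a tolerance. This is a sketch, not part of the script: `parse_roots` and `roots_match` are hypothetical helpers, and the tolerance of `1e-3` is an assumption chosen to match the 3-decimal rounding in the prompt.

```python
from __future__ import annotations

import ast
import math


def parse_roots(text: str) -> list[float] | None:
    """Parse a completion like '[1.0, -2.0]' into sorted floats, or None if malformed."""
    try:
        value = ast.literal_eval(text.strip())
    except (ValueError, SyntaxError):
        return None
    if not isinstance(value, (list, tuple)):
        return None
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
        return None
    return sorted(float(v) for v in value)


def roots_match(ai_text: str, expected: list[float], tol: float = 1e-3) -> bool:
    """True when the AI's roots agree with the reference roots within `tol`."""
    got = parse_roots(ai_text)
    if got is None or len(got) != len(expected):
        return False
    return all(math.isclose(g, e, abs_tol=tol) for g, e in zip(got, sorted(expected)))
```

`ast.literal_eval` only accepts literals, so a completion that contains prose or an expression is rejected instead of being executed.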
118
examples/everything_function/scripts/arithmetic.py
Normal file
@ -0,0 +1,118 @@
|
||||
"""Four "AI functions" that do arithmetic.
|
||||
|
||||
The point isn't to replace `+`, `-`, `*`, `/` with a 9-billion-parameter
|
||||
neural network (please don't). The point is that each of these functions
|
||||
looks identical in shape to a normal Python function — same `def`, same
|
||||
arguments, same return value — but the *body* of the function is a prompt
|
||||
instead of a hand-written rule. The "logic" is whatever the model fills in
|
||||
next.
|
||||
|
||||
Run this script and compare each AI answer to what plain Python computes.
|
||||
You should see them agree on simple cases. That agreement is doing a lot
|
||||
of work: it tells you that "predict the next token" is a real and useful
|
||||
operation, not a parlor trick.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def ai_add(a: float, b: float) -> str:
|
||||
return ask(
|
||||
"Output only the answer, no explanation.\n"
|
||||
"2 + 3 = 5\n"
|
||||
"10 + 7 = 17\n"
|
||||
"100 + 25 = 125\n"
|
||||
f"{a} + {b} = "
|
||||
)
|
||||
|
||||
|
||||
def ai_subtract(a: float, b: float) -> str:
|
||||
return ask(
|
||||
"Output only the answer, no explanation.\n"
|
||||
"9 - 4 = 5\n"
|
||||
"20 - 13 = 7\n"
|
||||
"100 - 41 = 59\n"
|
||||
f"{a} - {b} = "
|
||||
)
|
||||
|
||||
|
||||
def ai_multiply(a: float, b: float) -> str:
|
||||
return ask(
|
||||
"Output only the answer, no explanation.\n"
|
||||
"3 * 4 = 12\n"
|
||||
"8 * 7 = 56\n"
|
||||
"12 * 11 = 132\n"
|
||||
f"{a} * {b} = "
|
||||
)
|
||||
|
||||
|
||||
def ai_divide(a: float, b: float) -> str:
|
||||
return ask(
|
||||
"Output only the answer as a decimal rounded to 4 places, no explanation.\n"
|
||||
"10 / 4 = 2.5000\n"
|
||||
"9 / 3 = 3.0000\n"
|
||||
"22 / 7 = 3.1429\n"
|
||||
f"{a} / {b} = "
|
||||
)
|
||||
|
||||
|
||||
OPS: dict[str, tuple] = {
|
||||
"+": (ai_add, lambda a, b: a + b),
|
||||
"-": (ai_subtract, lambda a, b: a - b),
|
||||
"*": (ai_multiply, lambda a, b: a * b),
|
||||
"/": (ai_divide, lambda a, b: round(a / b, 4)),
|
||||
}
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
cases = [
|
||||
("+", 47, 28),
|
||||
("-", 153, 89),
|
||||
("*", 14, 17),
|
||||
("/", 44, 6),
|
||||
]
|
||||
|
||||
print(f"{'op':<4} {'inputs':<14} {'AI':<14} {'python':<14} match?")
|
||||
print("-" * 60)
|
||||
for op, a, b in cases:
|
||||
ai_fn, py_fn = OPS[op]
|
||||
ai_out = ai_fn(a, b).strip().rstrip(".")
|
||||
py_out = py_fn(a, b)
|
||||
match = "yes" if ai_out == str(py_out) else "no"
|
||||
print(f"{op:<4} {f'{a} {op} {b}':<14} {ai_out:<14} {str(py_out):<14} {match}")
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("\n--- interactive ---")
|
||||
print("Type an expression like `47 + 28`, or 'q' to quit. Ops: + - * /\n")
|
||||
while True:
|
||||
try:
|
||||
raw = input("> ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if raw.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
parts = raw.split()
|
||||
if len(parts) != 3 or parts[1] not in OPS:
|
||||
print(" expected: <number> <op> <number>")
|
||||
continue
|
||||
try:
|
||||
a, b = float(parts[0]), float(parts[2])
|
||||
except ValueError:
|
||||
print(" those don't look like numbers")
|
||||
continue
|
||||
op = parts[1]
|
||||
ai_fn, py_fn = OPS[op]
|
||||
ai_out = ai_fn(a, b).strip().rstrip(".")
|
||||
py_out = py_fn(a, b)
|
||||
match = "✓" if ai_out == str(py_out) else "✗"
|
||||
print(f" AI: {ai_out}")
|
||||
print(f" python: {py_out} {match}\n")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
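The `match` column compares strings, so `"2.5000"` from the model and `2.5` from Python count as a mismatch even though they are the same number, and interactive inputs parsed with `float()` can produce `75.0` against a model answer of `75`. A numeric comparison avoids that. The sketch below is one possible approach; `answers_match` is a hypothetical helper, and the 4-decimal tolerance is an assumption based on the rounding used by `ai_divide`.

```python
import math


def answers_match(ai_out: str, expected: float, places: int = 4) -> bool:
    """Compare the model's text answer to a number, ignoring formatting differences."""
    try:
        # Drop a trailing period and thousands separators before parsing.
        value = float(ai_out.strip().rstrip(".").replace(",", ""))
    except ValueError:
        return False  # the model answered with something that is not a number
    # Absolute tolerance of half a unit in the last requested decimal place.
    return math.isclose(value, expected, rel_tol=0.0, abs_tol=0.5 * 10 ** -places)
```

Setting `rel_tol=0.0` keeps the check strict for large integers, where a relative tolerance would let off-by-one answers slip through.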
35
examples/everything_function/scripts/image_label.py
Normal file
@ -0,0 +1,35 @@
|
||||
"""Take a picture, return a short label of what's in it.
|
||||
|
||||
Pre-2021, this was an entire subfield: ImageNet classifiers, object
|
||||
detectors, training pipelines. Here we hand a JPEG to a vision-language
|
||||
model and ask it the question in plain English.
|
||||
|
||||
The default image is `sample_images/animal_dog.jpg`. Pass `--image PATH`
|
||||
to point at your own picture.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
from pathlib import Path
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
DEFAULT_IMAGE = Path(__file__).resolve().parent.parent / "sample_images" / "animal_dog.jpg"
|
||||
|
||||
|
||||
def ai_image_label(image_path: str | Path) -> str:
|
||||
return ask(
|
||||
"Look at the image and give a short label (one to five words) describing "
|
||||
"what is in it. Output only the label, no full sentence and no punctuation.",
|
||||
image=image_path,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--image", type=Path, default=DEFAULT_IMAGE)
|
||||
args = parser.parse_args()
|
||||
|
||||
print(f"Image: {args.image}")
|
||||
print(f"Label: {ai_image_label(args.image)}")
|
||||
35
examples/everything_function/scripts/ocr.py
Normal file
@ -0,0 +1,35 @@
|
||||
"""Read printed text out of an image — what people usually call OCR.
|
||||
|
||||
Traditional OCR (Tesseract, ABBYY, etc.) is a dedicated system trained on
|
||||
piles of text-image pairs and uses character-level segmentation. We do
|
||||
the same job here by asking a general-purpose vision-language model to
|
||||
just *read out loud* what it sees.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
from pathlib import Path
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
DEFAULT_IMAGE = Path(__file__).resolve().parent.parent / "sample_images" / "text_notice.jpg"
|
||||
|
||||
|
||||
def ai_ocr(image_path: str | Path) -> str:
|
||||
return ask(
|
||||
"Read all the text visible in this image, exactly as written, preserving "
|
||||
"line breaks. Output only the text — no commentary, no quotes, no "
|
||||
"summary, no description of the image.",
|
||||
image=image_path,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--image", type=Path, default=DEFAULT_IMAGE)
|
||||
args = parser.parse_args()
|
||||
|
||||
print(f"Image: {args.image}\n")
|
||||
print("--- transcribed text ---")
|
||||
print(ai_ocr(args.image))
|
||||
90
examples/everything_function/scripts/prime_factorization.py
Normal file
@ -0,0 +1,90 @@
|
||||
"""Factor an integer into its prime factors — via AI and via real Python.
|
||||
|
||||
Like `arithmetic.py`, the point isn't to suggest that you should replace a
|
||||
hand-coded factoring routine with a 9-billion-parameter neural network.
|
||||
It's to point out that the *same kind of prompt that did addition* will
|
||||
also "do" prime factorization, for small numbers at least. The model
|
||||
gets it right surprisingly often; it also gets it wrong in interesting
|
||||
ways once the numbers get big enough to actually require the algorithm.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def ai_prime_factorize(n: int) -> str:
|
||||
return ask(
|
||||
"Give the prime factorization of the integer. Output only the answer "
|
||||
"in the form 'p1^e1 * p2^e2 * ...', with primes in ascending order and "
|
||||
"exponents omitted when they are 1. No explanation.\n"
|
||||
"12 = 2^2 * 3\n"
|
||||
"60 = 2^2 * 3 * 5\n"
|
||||
"1001 = 7 * 11 * 13\n"
|
||||
"97 = 97\n"
|
||||
f"{n} = "
|
||||
)
|
||||
|
||||
|
||||
def py_prime_factorize(n: int) -> str:
|
||||
"""Return the prime factorization formatted like '2^2 * 3'.
|
||||
|
||||
Trial division. Plenty fast for any number a student will type at a prompt.
|
||||
"""
|
||||
if n < 2:
|
||||
return str(n)
|
||||
factors: list[tuple[int, int]] = []
|
||||
remaining = n
|
||||
p = 2
|
||||
while p * p <= remaining:
|
||||
e = 0
|
||||
while remaining % p == 0:
|
||||
remaining //= p
|
||||
e += 1
|
||||
if e:
|
||||
factors.append((p, e))
|
||||
p += 1
|
||||
if remaining > 1:
|
||||
factors.append((remaining, 1))
|
||||
|
||||
parts = [f"{p}" if e == 1 else f"{p}^{e}" for p, e in factors]
|
||||
return " * ".join(parts)
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
cases = [84, 360, 1024, 2025, 9991]
|
||||
print(f"{'n':>8} {'AI':<24} {'python':<24} match?")
|
||||
print("-" * 70)
|
||||
for n in cases:
|
||||
ai_out = ai_prime_factorize(n).strip().rstrip(".")
|
||||
py_out = py_prime_factorize(n)
|
||||
match = "yes" if ai_out == py_out else "no"
|
||||
print(f"{n:>8} {ai_out:<24} {py_out:<24} {match}")
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("\n--- interactive ---")
|
||||
print("Type an integer to factor, or 'q' to quit.\n")
|
||||
while True:
|
||||
try:
|
||||
raw = input("n = ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if raw.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
try:
|
||||
n = int(raw)
|
||||
except ValueError:
|
||||
print(" not an integer, try again")
|
||||
continue
|
||||
ai_out = ai_prime_factorize(n).strip().rstrip(".")
|
||||
py_out = py_prime_factorize(n)
|
||||
match = "✓" if ai_out == py_out else "✗"
|
||||
print(f" AI: {ai_out}")
|
||||
print(f" python: {py_out} {match}\n")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
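Because the model's factorizations can be wrong in ways that still look plausible, it can be useful to check an answer on its own terms as well as against `py_prime_factorize`: every base should be prime and the product should equal `n`. A sketch under the assumption that the model keeps to the `p1^e1 * p2^e2` format from the prompt; `check_factorization` and `_is_prime` are hypothetical helpers, not part of the script.

```python
import re

# One term: a base, optionally followed by ^exponent (e.g. "7" or "2^3").
_TERM = re.compile(r"(\d+)(?:\^(\d+))?")


def _is_prime(p: int) -> bool:
    """Trial-division primality test, fine for demo-sized numbers."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def check_factorization(text: str, n: int) -> bool:
    """True when `text` is a product of prime powers that multiplies back to `n`."""
    product = 1
    for term in text.split("*"):
        m = _TERM.fullmatch(term.strip())
        if m is None:
            return False  # not in the expected 'p^e' shape
        base, exponent = int(m.group(1)), int(m.group(2) or 1)
        if not _is_prime(base):
            return False
        product *= base ** exponent
    return product == n
```

This check does not require ascending order or merged exponents, so it accepts `3 * 2^2 * 7` for 84 even though the string comparison in the script would mark it as a mismatch.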
34
examples/everything_function/scripts/recipe_from_food.py
Normal file
@ -0,0 +1,34 @@
|
||||
"""Take a picture of food, get a recipe back.
|
||||
|
||||
This one is fun because the model has to do two things at once: recognize
|
||||
what the dish is, then generate plausible instructions for making it.
|
||||
Neither step is hard-coded anywhere.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
from pathlib import Path
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
DEFAULT_IMAGE = Path(__file__).resolve().parent.parent / "sample_images" / "food_pizza.jpg"
|
||||
|
||||
|
||||
def ai_recipe_from_food(image_path: str | Path) -> str:
|
||||
return ask(
|
||||
"Look at the food in the image. First, on one line, write 'Dish: ' "
|
||||
"followed by the dish name. Then write 'Ingredients:' followed by a "
|
||||
"bulleted list (one per line, '- ' prefix). Then write 'Steps:' "
|
||||
"followed by a numbered list of brief cooking steps. Keep it concise.",
|
||||
image=image_path,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--image", type=Path, default=DEFAULT_IMAGE)
|
||||
args = parser.parse_args()
|
||||
|
||||
print(f"Image: {args.image}\n")
|
||||
print(ai_recipe_from_food(args.image))
|
||||
59
examples/everything_function/scripts/sentiment.py
Normal file
@ -0,0 +1,59 @@
|
||||
"""Classify the sentiment of a sentence.
|
||||
|
||||
This is the first demo where there isn't a plain-Python alternative to
|
||||
compare against. Pre-2017, "tell me if this review is positive or negative"
|
||||
was a research problem — papers, labeled datasets, custom-trained models.
|
||||
Here it's a few lines of prompt.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def ai_sentiment(text: str) -> str:
|
||||
return ask(
|
||||
"Classify the sentiment of the following sentence as exactly one of: "
|
||||
"positive, negative, neutral. Output only the single word.\n"
|
||||
'Sentence: "I love this product, it changed my life."\n'
|
||||
"Sentiment: positive\n"
|
||||
'Sentence: "It arrived broken and customer service ignored me."\n'
|
||||
"Sentiment: negative\n"
|
||||
'Sentence: "The package arrived on Tuesday."\n'
|
||||
"Sentiment: neutral\n"
|
||||
f'Sentence: "{text}"\n'
|
||||
"Sentiment: "
|
||||
)
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
samples = [
|
||||
"Honestly the best coffee I've had in months.",
|
||||
"The hotel was fine, nothing memorable either way.",
|
||||
"Worst flight of my life, I'll never fly this airline again.",
|
||||
"It works, but the instructions could be a lot clearer.",
|
||||
]
|
||||
for s in samples:
|
||||
print(f"input: {s}")
|
||||
print(f"sentiment: {ai_sentiment(s)}\n")
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("--- interactive ---")
|
||||
print("Type a sentence to classify, or 'q' to quit.\n")
|
||||
while True:
|
||||
try:
|
||||
text = input("> ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if text.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
if not text:
|
||||
continue
|
||||
print(f"sentiment: {ai_sentiment(text)}\n")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
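The prompt asks for exactly one of three words, but nothing in the script enforces that. If downstream code branches on the label, a small normalization step makes unexpected completions explicit. A minimal sketch; `normalize_sentiment` is a hypothetical helper, and returning `None` for anything off-list is a design assumption.

```python
from __future__ import annotations

LABELS = ("positive", "negative", "neutral")


def normalize_sentiment(raw: str) -> str | None:
    """Map a raw completion onto one of the three labels, or None if it doesn't fit."""
    # Trim whitespace, stray quotes and a trailing period, then lowercase.
    cleaned = raw.strip().strip(".\"'").lower()
    return cleaned if cleaned in LABELS else None
```

Returning `None` instead of guessing keeps the failure visible, which suits a demo whose purpose is to show where the model does and does not follow the format.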
83
examples/everything_function/scripts/summarize.py
Normal file
@ -0,0 +1,83 @@
|
||||
"""Summarize a chunk of text down to a target length.
|
||||
|
||||
The "function signature" is `summarize(text, max_words)`. Both arguments
|
||||
get pasted into the prompt. The model handles the rest. There is no
|
||||
sentence parser, no extractive ranking algorithm, no fine-tuned summarization
|
||||
head. There is a paragraph of text in, and a paragraph of text out.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def ai_summarize(text: str, max_words: int = 30) -> str:
|
||||
return ask(
|
||||
f"Summarize the passage in no more than {max_words} words. "
|
||||
"Output only the summary, no preamble.\n"
|
||||
"Passage: The Industrial Revolution, which began in Britain in the late "
|
||||
"18th century, transformed economies that had been based on agriculture and "
|
||||
"handicrafts into ones dominated by industry and machine manufacturing. "
|
||||
"It led to mass migration from the countryside to growing cities, dramatic "
|
||||
"increases in average income and population, and profound social changes "
|
||||
"that reshaped daily life across much of the world.\n"
|
||||
"Summary: The Industrial Revolution shifted economies from farming to "
|
||||
"factories, drove urban migration, and reshaped daily life.\n"
|
||||
f"Passage: {text}\n"
|
||||
"Summary: "
|
||||
)
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
text = (
|
||||
"Photosynthesis is the process by which green plants, algae, and certain "
|
||||
"bacteria convert light energy, typically from the sun, into chemical "
|
||||
"energy stored in glucose. Inside the chloroplasts of plant cells, "
|
||||
"chlorophyll absorbs sunlight and uses it to split water molecules into "
|
||||
"oxygen, which is released as a byproduct, and hydrogen, which combines "
|
||||
"with carbon dioxide drawn from the air to form sugars. These sugars "
|
||||
"fuel the plant's growth and ultimately feed nearly every organism on "
|
||||
"Earth, either directly or indirectly. The oxygen released as a byproduct "
|
||||
"is also what most life on the planet depends on to breathe."
|
||||
)
|
||||
print(f"original ({len(text.split())} words):")
|
||||
print(text)
|
||||
print()
|
||||
for limit in (40, 20, 10):
|
||||
print(f"--- max {limit} words ---")
|
||||
print(ai_summarize(text, max_words=limit))
|
||||
print()
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("--- interactive ---")
|
||||
print("Paste a passage to summarize, then a max word count. 'q' to quit.\n")
|
||||
while True:
|
||||
try:
|
||||
text = input("passage > ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if text.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
if not text:
|
||||
continue
|
||||
try:
|
||||
raw_limit = input("max words [30] > ").strip() or "30"
|
||||
limit = int(raw_limit)
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
except ValueError:
|
||||
print(" not a number, try again")
|
||||
continue
|
||||
print(f"\noriginal ({len(text.split())} words):")
|
||||
print(text)
|
||||
print(f"\nsummary (max {limit} words):")
|
||||
print(ai_summarize(text, max_words=limit))
|
||||
print()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
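`max_words` is only a request written into the prompt, so the model may overshoot it. The script already counts words with `len(text.split())`, and the same rule can be used to check a summary or trim it. A sketch, with `within_word_limit` and `truncate_to_words` as hypothetical helpers that are not part of the script.

```python
def within_word_limit(summary: str, max_words: int) -> bool:
    """True when the summary has at most `max_words` whitespace-separated words."""
    return len(summary.split()) <= max_words


def truncate_to_words(summary: str, max_words: int) -> str:
    """Hard-cut a summary to `max_words` words (may end mid-sentence)."""
    return " ".join(summary.split()[:max_words])
```

Truncating is blunt; re-asking the model with a smaller limit is another option, at the cost of a second call.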
63
examples/everything_function/scripts/translate.py
Normal file
@ -0,0 +1,63 @@
|
||||
"""Translate arbitrary text into an arbitrary target language.
|
||||
|
||||
A few years ago, building this would have meant either licensing Google
|
||||
Translate, or training your own sequence-to-sequence model per language
|
||||
pair. Here `target_language` is just a string the model reads.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from ai_function import ask
|
||||
|
||||
|
||||
def ai_translate(text: str, target_language: str) -> str:
|
||||
return ask(
|
||||
"Translate the sentence into the target language. Output only the "
|
||||
"translation, with no quotes and no explanation.\n"
|
||||
"Target language: French\n"
|
||||
'Sentence: "Where is the train station?"\n'
|
||||
"Translation: Où est la gare ?\n"
|
||||
"Target language: Japanese\n"
|
||||
'Sentence: "I would like a cup of coffee."\n'
|
||||
"Translation: コーヒーを一杯ください。\n"
|
||||
f"Target language: {target_language}\n"
|
||||
f'Sentence: "{text}"\n'
|
||||
"Translation: "
|
||||
)
|
||||
|
||||
|
||||
def _canned_examples() -> None:
|
||||
sentence = "The library closes at six o'clock on Sundays."
|
||||
print(f"original: {sentence}\n")
|
||||
for lang in ["Spanish", "German", "Mandarin Chinese", "Brazilian Portuguese"]:
|
||||
print(f"{lang:>22}: {ai_translate(sentence, lang)}")
|
||||
|
||||
|
||||
def _interactive() -> None:
|
||||
print("\n--- interactive ---")
|
||||
print("Translate your own text. 'q' as the text to quit.\n")
|
||||
while True:
|
||||
try:
|
||||
text = input("text > ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if text.lower() in {"q", "quit", "exit"}:
|
||||
return
|
||||
if not text:
|
||||
continue
|
||||
try:
|
||||
lang = input("target language > ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print()
|
||||
return
|
||||
if not lang:
|
||||
print(" no target language, skipping")
|
||||
continue
|
||||
print(f"original: {text}")
|
||||
print(f"translation: {ai_translate(text, lang)}\n")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
_canned_examples()
|
||||
_interactive()
|
||||
13
examples/everything_function/web/Dockerfile
Normal file
@ -0,0 +1,13 @@
|
||||
FROM python:3.12-slim
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
COPY requirements.txt .
|
||||
RUN pip install --no-cache-dir -r requirements.txt
|
||||
|
||||
COPY main.py .
|
||||
COPY static ./static
|
||||
|
||||
EXPOSE 8080
|
||||
|
||||
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
|
||||
195
examples/everything_function/web/main.py
Normal file
@ -0,0 +1,195 @@
|
||||
"""FastAPI backend for the everything_function web UI.
|
||||
|
||||
This server doesn't do any AI work itself. It imports the same `ai_xxx`
|
||||
functions used by the terminal demo scripts and exposes them as HTTP
|
||||
endpoints so a browser can call them. The point: the web UI is just
|
||||
another front-end for the *same Python functions* you can run from
|
||||
`scripts/`. Nothing magic is happening in here.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
|
||||
from fastapi.responses import FileResponse
|
||||
from fastapi.staticfiles import StaticFiles
|
||||
from pydantic import BaseModel
|
||||
|
||||
SCRIPTS_DIR = Path("/app/scripts")
|
||||
SAMPLES_DIR = Path("/app/sample_images")
|
||||
sys.path.insert(0, str(SCRIPTS_DIR))
|
||||
|
||||
import action_list # noqa: E402
|
||||
import algebra_roots # noqa: E402
|
||||
import arithmetic # noqa: E402
|
||||
import image_label # noqa: E402
|
||||
import ocr # noqa: E402
|
||||
import prime_factorization # noqa: E402
|
||||
import recipe_from_food # noqa: E402
|
||||
import sentiment # noqa: E402
|
||||
import summarize # noqa: E402
|
||||
import translate # noqa: E402
|
||||
|
||||
app = FastAPI(title="everything_function")
|
||||
|
||||
|
||||
# ------ arithmetic ------------------------------------------------------------
|
||||
|
||||
class ArithRequest(BaseModel):
|
||||
op: str
|
||||
a: float
|
||||
b: float
|
||||
|
||||
|
||||
@app.post("/api/arithmetic")
|
||||
def api_arithmetic(req: ArithRequest):
|
||||
if req.op not in arithmetic.OPS:
|
||||
raise HTTPException(400, f"unknown op: {req.op!r}")
|
||||
ai_fn, py_fn = arithmetic.OPS[req.op]
|
||||
ai_result = ai_fn(req.a, req.b).strip().rstrip(".")
|
||||
py_result = py_fn(req.a, req.b)
|
||||
return {"ai": ai_result, "python": str(py_result)}
|
||||
|
||||
|
||||
# ------ algebra ---------------------------------------------------------------
|
||||
|
||||
class AlgebraRequest(BaseModel):
|
||||
coefficients: list[float]
|
||||
|
||||
|
||||
@app.post("/api/algebra")
|
||||
def api_algebra(req: AlgebraRequest):
|
||||
if len(req.coefficients) < 2:
|
||||
raise HTTPException(400, "need at least 2 coefficients")
|
||||
polynomial = algebra_roots.format_polynomial(req.coefficients)
|
||||
ai_result = algebra_roots.ai_polynomial_roots(req.coefficients)
|
||||
py_result = algebra_roots.py_polynomial_roots(req.coefficients)
|
||||
return {"polynomial": polynomial, "ai": ai_result, "python": py_result}
|
||||
|
||||
|
||||
# ------ prime factorization ---------------------------------------------------
|
||||
|
||||
class PrimeRequest(BaseModel):
|
||||
n: int
|
||||
|
||||
|
||||
@app.post("/api/prime_factorization")
|
||||
def api_prime(req: PrimeRequest):
|
||||
ai_result = prime_factorization.ai_prime_factorize(req.n).strip().rstrip(".")
|
||||
py_result = prime_factorization.py_prime_factorize(req.n)
|
||||
return {"ai": ai_result, "python": py_result}
|
||||
|
||||
|
||||
# ------ text demos ------------------------------------------------------------
|
||||
|
||||
class SentimentRequest(BaseModel):
|
||||
text: str
|
||||
|
||||
|
||||
@app.post("/api/sentiment")
|
||||
def api_sentiment(req: SentimentRequest):
|
||||
return {"result": sentiment.ai_sentiment(req.text)}
|
||||
|
||||
|
||||
class TranslateRequest(BaseModel):
|
||||
text: str
|
||||
target_language: str
|
||||
|
||||
|
||||
@app.post("/api/translate")
|
||||
def api_translate(req: TranslateRequest):
|
||||
return {"result": translate.ai_translate(req.text, req.target_language)}
|
||||
|
||||
|
||||
class SummarizeRequest(BaseModel):
|
||||
text: str
|
||||
max_words: int = 30
|
||||
|
||||
|
||||
@app.post("/api/summarize")
|
||||
def api_summarize(req: SummarizeRequest):
|
||||
return {"result": summarize.ai_summarize(req.text, max_words=req.max_words)}
|
||||
|
||||
|
||||
class ActionListRequest(BaseModel):
|
||||
description: str
|
||||
|
||||
|
||||
@app.post("/api/action_list")
|
||||
def api_action_list(req: ActionListRequest):
|
||||
return {"result": "- " + action_list.ai_action_list(req.description)}
|
||||
|
||||
|
||||
# ------ image demos -----------------------------------------------------------
|
||||
|
||||
def _resolve_image(file: UploadFile | None, sample: str | None) -> Path:
|
||||
"""Save the upload to /tmp, or return the named sample image's path."""
|
||||
if file is not None and file.filename:
|
||||
suffix = Path(file.filename).suffix.lower() or ".jpg"
|
||||
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
|
||||
try:
|
||||
tmp.write(file.file.read())
|
||||
return Path(tmp.name)
|
||||
finally:
|
||||
tmp.close()
|
||||
if sample:
|
||||
path = SAMPLES_DIR / sample
|
||||
if path.resolve().parent != SAMPLES_DIR.resolve() or not path.exists():
|
||||
raise HTTPException(404, f"sample image not found: {sample!r}")
|
||||
return path
|
||||
raise HTTPException(400, "must provide either an uploaded file or a sample name")
|
||||
|
||||
|
||||
@app.post("/api/image_label")
|
||||
def api_image_label(
|
||||
file: UploadFile | None = File(None),
|
||||
sample: str | None = Form(None),
|
||||
):
|
||||
path = _resolve_image(file, sample)
|
||||
return {"result": image_label.ai_image_label(path)}
|
||||
|
||||
|
||||
@app.post("/api/recipe_from_food")
|
||||
def api_recipe(
|
||||
file: UploadFile | None = File(None),
|
||||
sample: str | None = Form(None),
|
||||
):
|
||||
path = _resolve_image(file, sample)
|
||||
return {"result": recipe_from_food.ai_recipe_from_food(path)}
|
||||
|
||||
|
||||
@app.post("/api/ocr")
|
||||
def api_ocr(
|
||||
file: UploadFile | None = File(None),
|
||||
sample: str | None = Form(None),
|
||||
):
|
||||
path = _resolve_image(file, sample)
|
||||
return {"result": ocr.ai_ocr(path)}
|
||||
|
||||
|
||||
# ------ sample images ---------------------------------------------------------
|
||||
|
||||
@app.get("/api/sample_images")
|
||||
def list_samples():
|
||||
return [
|
||||
{"name": p.name, "url": f"/api/sample_images/{p.name}"}
|
||||
for p in sorted(SAMPLES_DIR.glob("*.jpg"))
|
||||
]
|
||||
|
||||
|
||||
@app.get("/api/sample_images/{name}")
|
||||
def get_sample(name: str):
|
||||
path = SAMPLES_DIR / name
|
||||
if path.suffix.lower() not in {".jpg", ".jpeg", ".png"} or not path.exists():
|
||||
raise HTTPException(404)
|
||||
# Pin to the samples directory so we can't escape via "../" tricks.
|
||||
if path.resolve().parent != SAMPLES_DIR.resolve():
|
||||
raise HTTPException(404)
|
||||
return FileResponse(path)
|
||||
|
||||
|
||||
# Static frontend served from /
|
||||
app.mount("/", StaticFiles(directory="/app/static", html=True), name="static")
|
||||
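The `get_sample` handler above pins requests to the samples directory by comparing the resolved parent against `SAMPLES_DIR`. A minimal standalone sketch of that check, using only the standard library, might look like the following. The helper name `resolve_sample` and the temporary directory layout are hypothetical and exist only for illustration; they are not part of `main.py`.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_sample(samples_dir: Path, name: str) -> Path | None:
    """Return the sample's resolved path, or None if it is missing or outside samples_dir."""
    candidate = (samples_dir / name).resolve()
    # Only accept files that sit directly inside the samples directory,
    # so a name like "../secret.txt" cannot escape it.
    if candidate.parent != samples_dir.resolve():
        return None
    if not candidate.is_file():
        return None
    return candidate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        samples = Path(root) / "sample_images"
        samples.mkdir()
        (samples / "food_pizza.jpg").write_bytes(b"placeholder bytes")
        (Path(root) / "secret.txt").write_text("outside the samples dir")

        print(resolve_sample(samples, "food_pizza.jpg") is not None)  # True
        print(resolve_sample(samples, "../secret.txt"))               # None
```

Resolving both sides before comparing matters: it collapses `..` segments and symlinks, so the comparison is made on the real location of the file rather than on the string the client sent.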
5
examples/everything_function/web/requirements.txt
Normal file
@ -0,0 +1,5 @@
|
||||
fastapi==0.115.6
|
||||
uvicorn==0.34.0
|
||||
python-multipart==0.0.20
|
||||
httpx==0.28.1
|
||||
numpy==2.2.1
|
||||
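The `/api/algebra` endpoint in `main.py` returns a human-readable polynomial via `algebra_roots.format_polynomial`, whose implementation is not shown in this diff. As a rough sketch of what such a formatter could do, here is a Python function written under the assumption that coefficients are ordered from highest degree to lowest and that the output should look like `x^2 - 7x + 12`. The real function may differ; in particular, returning `"0"` for an all-zero polynomial is a choice made here, not something the source confirms.

```python
from __future__ import annotations


def format_polynomial(coeffs: list[float]) -> str:
    """Render coefficients (highest degree first) as text, e.g. 'x^2 - 7x + 12'."""
    degree = len(coeffs) - 1
    parts = []
    for i, c in enumerate(coeffs):
        if c == 0:
            continue  # skip terms that vanish
        power = degree - i
        magnitude = f"{abs(c):g}"  # 7.0 -> "7", 2.5 -> "2.5"
        sign = "+" if c > 0 else "-"
        if power == 0:
            body = magnitude
        else:
            coef = "" if abs(c) == 1 else magnitude  # write "x", not "1x"
            var = "x" if power == 1 else f"x^{power}"
            body = f"{coef}{var}"
        parts.append(f"{sign} {body}")
    if not parts:
        return "0"  # assumption: empty or all-zero input renders as "0"
    expr = " ".join(parts)
    # Tidy the leading sign: drop a leading "+", attach a leading "-".
    if expr.startswith("+ "):
        expr = expr[2:]
    elif expr.startswith("- "):
        expr = "-" + expr[2:]
    return expr


if __name__ == "__main__":
    print(format_polynomial([1, -7, 12]))  # x^2 - 7x + 12
```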
770
examples/everything_function/web/static/index.html
Normal file
@ -0,0 +1,770 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<title>everything_function</title>
|
||||
<style>
|
||||
:root {
|
||||
--bg: #f7f7f4;
|
||||
--panel: #ffffff;
|
||||
--border: #e3e1da;
|
||||
--text: #222;
|
||||
--muted: #7a766c;
|
||||
--accent: #2858d6;
|
||||
--accent-hover: #1d44ad;
|
||||
--code-bg: #f1efe6;
|
||||
--good: #1f7a3a;
|
||||
--bad: #b03030;
|
||||
--mono: ui-monospace, "SF Mono", Menlo, Consolas, monospace;
|
||||
--sans: -apple-system, BlinkMacSystemFont, "Segoe UI", system-ui, sans-serif;
|
||||
}
|
||||
* { box-sizing: border-box; }
|
||||
body {
|
||||
margin: 0;
|
||||
font-family: var(--sans);
|
||||
background: var(--bg);
|
||||
color: var(--text);
|
||||
font-size: 16px;
|
||||
line-height: 1.5;
|
||||
}
|
||||
.layout { display: grid; grid-template-columns: 260px 1fr; min-height: 100vh; }
|
||||
nav {
|
||||
background: var(--panel);
|
||||
border-right: 1px solid var(--border);
|
||||
padding: 24px 16px;
|
||||
}
|
||||
nav h1 {
|
||||
font-size: 14px;
|
||||
letter-spacing: 0.08em;
|
||||
text-transform: uppercase;
|
||||
color: var(--muted);
|
||||
margin: 0 0 8px 12px;
|
||||
font-weight: 600;
|
||||
}
|
||||
nav .group-label {
|
||||
font-size: 11px;
|
||||
letter-spacing: 0.1em;
|
||||
text-transform: uppercase;
|
||||
color: var(--muted);
|
||||
margin: 18px 0 6px 12px;
|
||||
}
|
||||
nav ul { list-style: none; padding: 0; margin: 0; }
|
||||
nav li button {
|
||||
width: 100%;
|
||||
text-align: left;
|
||||
background: transparent;
|
||||
border: 0;
|
||||
padding: 8px 12px;
|
||||
border-radius: 6px;
|
||||
color: var(--text);
|
||||
font: inherit;
|
||||
cursor: pointer;
|
||||
}
|
||||
nav li button:hover { background: var(--code-bg); }
|
||||
nav li button.active { background: var(--accent); color: white; }
|
||||
|
||||
main { padding: 32px 40px; max-width: 920px; }
|
||||
main h2 { margin: 0 0 6px; font-size: 28px; font-weight: 700; }
|
||||
main .subtitle { color: var(--muted); margin: 0 0 24px; }
|
||||
|
||||
.demo { display: none; }
|
||||
.demo.active { display: block; }
|
||||
|
||||
.field { margin: 12px 0; display: flex; flex-direction: column; gap: 4px; }
|
||||
.field label { font-weight: 600; font-size: 13px; }
|
||||
.field input, .field textarea, .field select {
|
||||
padding: 10px 12px;
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 6px;
|
||||
background: var(--panel);
|
||||
font: inherit;
|
||||
color: var(--text);
|
||||
}
|
||||
.field textarea { font-family: var(--sans); min-height: 100px; resize: vertical; }
|
||||
.field input[type="number"] { max-width: 200px; }
|
||||
.inline-fields { display: flex; gap: 12px; flex-wrap: wrap; align-items: end; }
|
||||
.inline-fields .field { flex: 0 1 auto; }
|
||||
|
||||
button.run {
|
||||
margin-top: 12px;
|
||||
background: var(--accent);
|
||||
color: white;
|
||||
border: 0;
|
||||
padding: 10px 22px;
|
||||
border-radius: 6px;
|
||||
font: inherit;
|
||||
font-weight: 600;
|
||||
cursor: pointer;
|
||||
}
|
||||
button.run:hover { background: var(--accent-hover); }
|
||||
button.run:disabled { opacity: 0.6; cursor: wait; }
|
||||
|
||||
.output {
|
||||
margin-top: 20px;
|
||||
padding: 16px 18px;
|
||||
background: var(--panel);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 8px;
|
||||
min-height: 60px;
|
||||
}
|
||||
.output .label {
|
||||
font-size: 11px;
|
||||
letter-spacing: 0.1em;
|
||||
text-transform: uppercase;
|
||||
color: var(--muted);
|
||||
margin-bottom: 4px;
|
||||
}
|
||||
.output .row { margin: 8px 0; }
|
||||
.output .value { font-family: var(--mono); font-size: 15px; white-space: pre-wrap; word-break: break-word; }
|
||||
.output .compare { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; }
|
||||
.output .compare > div { background: var(--code-bg); padding: 12px; border-radius: 6px; }
|
||||
.output.empty { color: var(--muted); font-style: italic; }
|
||||
.output .match-yes { color: var(--good); font-weight: 600; }
|
||||
.output .match-no { color: var(--bad); font-weight: 600; }
|
||||
.output .err { color: var(--bad); font-family: var(--mono); white-space: pre-wrap; }
|
||||
|
||||
.help { color: var(--muted); font-size: 13px; margin: 0 0 14px; }
|
||||
.pretty-poly { font-family: var(--mono); font-size: 18px; padding: 10px 14px; background: var(--code-bg); border-radius: 6px; display: inline-block; }
|
||||
|
||||
.samples { display: flex; gap: 12px; flex-wrap: wrap; margin-top: 8px; }
|
||||
.samples label {
|
||||
cursor: pointer;
|
||||
border: 2px solid transparent;
|
||||
border-radius: 6px;
|
||||
padding: 4px;
|
||||
background: var(--panel);
|
||||
}
|
||||
.samples label.selected { border-color: var(--accent); }
|
||||
.samples img { display: block; height: 110px; border-radius: 4px; }
|
||||
.samples input { display: none; }
|
||||
|
||||
.dropzone {
|
||||
margin-top: 4px;
|
||||
border: 2px dashed var(--border);
|
||||
border-radius: 8px;
|
||||
background: var(--panel);
|
||||
padding: 18px;
|
||||
text-align: center;
|
||||
cursor: pointer;
|
||||
transition: border-color 0.15s, background 0.15s;
|
||||
min-height: 200px;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
gap: 10px;
|
||||
}
|
||||
.dropzone:hover { border-color: var(--accent); }
|
||||
.dropzone.dragover {
|
||||
border-color: var(--accent);
|
||||
background: #eef3ff;
|
||||
}
|
||||
.dropzone img.preview-img {
|
||||
max-width: 100%;
|
||||
max-height: 280px;
|
||||
border-radius: 6px;
|
||||
border: 1px solid var(--border);
|
||||
background: white;
|
||||
}
|
||||
.dropzone .hint { color: var(--muted); font-size: 14px; }
|
||||
.dropzone .source-tag {
|
||||
font-size: 12px;
|
||||
color: var(--muted);
|
||||
font-family: var(--mono);
|
||||
}
|
||||
.dropzone input[type="file"] { display: none; }
|
||||
|
||||
.small { font-size: 13px; color: var(--muted); }
|
||||
.or-separator {
|
||||
text-align: center;
|
||||
color: var(--muted);
|
||||
font-size: 12px;
|
||||
letter-spacing: 0.1em;
|
||||
text-transform: uppercase;
|
||||
margin: 12px 0 6px;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="layout">
|
||||
<nav>
|
||||
<h1>everything_function</h1>
|
||||
<p class="small" style="margin: 0 12px 12px; color: var(--muted);">Same code, any function.</p>
|
||||
|
||||
<div class="group-label">Math</div>
|
||||
<ul>
|
||||
<li><button data-demo="arithmetic" class="active">Arithmetic</button></li>
|
||||
<li><button data-demo="algebra">Polynomial roots</button></li>
|
||||
<li><button data-demo="prime">Prime factorization</button></li>
|
||||
</ul>
|
||||
|
||||
<div class="group-label">Text</div>
|
||||
<ul>
|
||||
<li><button data-demo="sentiment">Sentiment</button></li>
|
||||
<li><button data-demo="translate">Translate</button></li>
|
||||
<li><button data-demo="summarize">Summarize</button></li>
|
||||
<li><button data-demo="action_list">Action list</button></li>
|
||||
</ul>
|
||||
|
||||
<div class="group-label">Vision</div>
|
||||
<ul>
|
||||
<li><button data-demo="image_label">Image label</button></li>
|
||||
<li><button data-demo="recipe">Food → recipe</button></li>
|
||||
<li><button data-demo="ocr">OCR</button></li>
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
<main>
|
||||
<!-- arithmetic -->
|
||||
<section class="demo active" id="demo-arithmetic">
|
||||
<h2>Arithmetic</h2>
|
||||
<p class="subtitle">AI as <code>+</code>, <code>-</code>, <code>*</code>, <code>/</code>. Compared side-by-side against real Python.</p>
|
||||
<form data-form="arithmetic">
|
||||
<div class="inline-fields">
|
||||
<div class="field"><label>a</label><input type="number" name="a" step="any" value="47" /></div>
|
||||
<div class="field"><label>op</label>
|
||||
<select name="op">
|
||||
<option value="+">+</option>
|
||||
<option value="-">-</option>
|
||||
<option value="*">*</option>
|
||||
<option value="/">/</option>
|
||||
</select>
|
||||
</div>
|
||||
<div class="field"><label>b</label><input type="number" name="b" step="any" value="28" /></div>
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="arithmetic">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- algebra -->
|
||||
<section class="demo" id="demo-algebra">
|
||||
<h2>Polynomial roots</h2>
|
||||
<p class="subtitle">Give the AI a polynomial. It guesses the real roots. <code>numpy.roots</code> solves it for real.</p>
|
||||
<form data-form="algebra">
|
||||
<div class="field">
|
||||
<label>Coefficients (high-degree to low-degree, comma- or space-separated)</label>
|
||||
<input type="text" name="coefficients" value="1, -7, 12" />
|
||||
<p class="help">e.g. <code>1, -7, 12</code> is x² − 7x + 12.</p>
|
||||
</div>
|
||||
<div id="algebra-poly-preview" style="margin: 8px 0;"></div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="algebra">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- prime factorization -->
|
||||
<section class="demo" id="demo-prime">
|
||||
<h2>Prime factorization</h2>
|
||||
<p class="subtitle">AI tries to give the prime factorization. Real Python does it by trial division.</p>
|
||||
<form data-form="prime">
|
||||
<div class="field">
|
||||
<label>n</label>
|
||||
<input type="number" name="n" min="2" value="360" />
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="prime">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- sentiment -->
|
||||
<section class="demo" id="demo-sentiment">
|
||||
<h2>Sentiment</h2>
|
||||
<p class="subtitle">Label a sentence as <code>positive</code>, <code>negative</code>, or <code>neutral</code>.</p>
|
||||
<form data-form="sentiment">
|
||||
<div class="field">
|
||||
<label>Sentence</label>
|
||||
<textarea name="text">Honestly the best coffee I've had in months.</textarea>
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="sentiment">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- translate -->
|
||||
<section class="demo" id="demo-translate">
|
||||
<h2>Translate</h2>
|
||||
<p class="subtitle">Type any sentence and any target language. No language pair training needed.</p>
|
||||
<form data-form="translate">
|
||||
<div class="field">
|
||||
<label>Sentence</label>
|
||||
<textarea name="text">The library closes at six o'clock on Sundays.</textarea>
|
||||
</div>
|
||||
<div class="field">
|
||||
<label>Target language</label>
|
||||
<input type="text" name="target_language" value="Brazilian Portuguese" />
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="translate">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- summarize -->
|
||||
<section class="demo" id="demo-summarize">
|
||||
<h2>Summarize</h2>
|
||||
<p class="subtitle">Shrink a passage down to a target word count.</p>
|
||||
<form data-form="summarize">
|
||||
<div class="field">
|
||||
<label>Passage</label>
|
||||
<textarea name="text">Photosynthesis is the process by which green plants, algae, and certain bacteria convert light energy, typically from the sun, into chemical energy stored in glucose. Inside the chloroplasts of plant cells, chlorophyll absorbs sunlight and uses it to split water molecules into oxygen, which is released as a byproduct, and hydrogen, which combines with carbon dioxide drawn from the air to form sugars. These sugars fuel the plant's growth and ultimately feed nearly every organism on Earth, either directly or indirectly. The oxygen released as a byproduct is also what most life on the planet depends on to breathe.</textarea>
|
||||
</div>
|
||||
<div class="field">
|
||||
<label>Max words</label>
|
||||
<input type="number" name="max_words" min="5" value="20" />
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="summarize">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- action list -->
|
||||
<section class="demo" id="demo-action_list">
|
||||
<h2>Action list</h2>
|
||||
<p class="subtitle">Messy stream-of-consciousness in, clean bulleted to-dos out.</p>
|
||||
<form data-form="action_list">
|
||||
<div class="field">
|
||||
<label>Description</label>
|
||||
<textarea name="description">ugh ok so the dishwasher is making that noise again, I should probably look at the filter or just call the repair guy, also the lawn is getting long again it's been like three weeks, and we're almost out of dog food I keep meaning to grab a bag, oh and Sarah's birthday is on Saturday I haven't gotten anything yet</textarea>
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="action_list">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- image label -->
|
||||
<section class="demo" id="demo-image_label">
|
||||
<h2>Image label</h2>
|
||||
<p class="subtitle">Hand the model a picture. Get a short label back.</p>
|
||||
<form data-form="image_label" data-image-form>
|
||||
<div class="field">
|
||||
<label>Image:</label>
|
||||
<div class="dropzone" data-dropzone tabindex="0">
|
||||
<div data-preview></div>
|
||||
<div class="hint">Drop · paste (Ctrl/Cmd+V) · click to choose a file</div>
|
||||
<div class="source-tag" data-source-tag></div>
|
||||
<input type="file" name="file" accept="image/*" />
|
||||
</div>
|
||||
<div class="or-separator">or pick a sample</div>
|
||||
<div class="samples" data-sample-grid></div>
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="image_label">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- recipe -->
|
||||
<section class="demo" id="demo-recipe">
|
||||
<h2>Food → recipe</h2>
|
||||
<p class="subtitle">Picture of a dish in. Ingredients and steps out.</p>
|
||||
<form data-form="recipe_from_food" data-image-form>
|
||||
<div class="field">
|
||||
<label>Image:</label>
|
||||
<div class="dropzone" data-dropzone tabindex="0">
|
||||
<div data-preview></div>
|
||||
<div class="hint">Drop · paste (Ctrl/Cmd+V) · click to choose a file</div>
|
||||
<div class="source-tag" data-source-tag></div>
|
||||
<input type="file" name="file" accept="image/*" />
|
||||
</div>
|
||||
<div class="or-separator">or pick a sample</div>
|
||||
<div class="samples" data-sample-grid></div>
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="recipe_from_food">Click Run to ask the model.</div>
|
||||
</section>
|
||||
|
||||
<!-- ocr -->
|
||||
<section class="demo" id="demo-ocr">
|
||||
<h2>OCR</h2>
|
||||
<p class="subtitle">Read printed text out of an image. No OCR engine — just a vision-language model asked to read.</p>
|
||||
<form data-form="ocr" data-image-form>
|
||||
<div class="field">
|
||||
<label>Image:</label>
|
||||
<div class="dropzone" data-dropzone tabindex="0">
|
||||
<div data-preview></div>
|
||||
<div class="hint">Drop · paste (Ctrl/Cmd+V) · click to choose a file</div>
|
||||
<div class="source-tag" data-source-tag></div>
|
||||
<input type="file" name="file" accept="image/*" />
|
||||
</div>
|
||||
<div class="or-separator">or pick a sample</div>
|
||||
<div class="samples" data-sample-grid></div>
|
||||
</div>
|
||||
<button class="run" type="submit">Run</button>
|
||||
</form>
|
||||
<div class="output empty" data-out="ocr">Click Run to ask the model.</div>
|
||||
</section>
|
||||
</main>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
// ----- nav switching -----
|
||||
const navButtons = document.querySelectorAll("nav button[data-demo]");
|
||||
const demos = document.querySelectorAll("section.demo");
|
||||
navButtons.forEach(btn => {
|
||||
btn.addEventListener("click", () => {
|
||||
navButtons.forEach(b => b.classList.remove("active"));
|
||||
btn.classList.add("active");
|
||||
demos.forEach(d => d.classList.remove("active"));
|
||||
document.getElementById("demo-" + btn.dataset.demo).classList.add("active");
|
||||
});
|
||||
});
|
||||
|
||||
// ----- helpers -----
|
||||
function setOutput(name, html) {
|
||||
const el = document.querySelector(`[data-out="${name}"]`);
|
||||
el.classList.remove("empty");
|
||||
el.innerHTML = html;
|
||||
}
|
||||
function setLoading(name) {
|
||||
setOutput(name, `<div class="small">asking the model... (CPU inference can take a few seconds, vision longer)</div>`);
|
||||
}
|
||||
function setError(name, err) {
|
||||
setOutput(name, `<div class="err">error: ${escapeHtml(err)}</div>`);
|
||||
}
|
||||
function escapeHtml(s) {
|
||||
return String(s).replace(/[&<>"']/g, c => ({"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;"}[c]));
|
||||
}
|
||||
async function postJSON(url, body) {
|
||||
const r = await fetch(url, {
|
||||
method: "POST",
|
||||
headers: {"Content-Type": "application/json"},
|
||||
body: JSON.stringify(body),
|
||||
});
|
||||
if (!r.ok) throw new Error(await r.text().catch(() => r.statusText));
|
||||
return r.json();
|
||||
}
|
||||
async function postForm(url, form) {
|
||||
const r = await fetch(url, {method: "POST", body: form});
|
||||
if (!r.ok) throw new Error(await r.text().catch(() => r.statusText));
|
||||
return r.json();
|
||||
}
|
||||
function withRunning(btn, fn) {
|
||||
const original = btn.textContent;
|
||||
btn.disabled = true;
|
||||
btn.textContent = "Running…";
|
||||
return fn().finally(() => { btn.disabled = false; btn.textContent = original; });
|
||||
}
|
||||
|
||||
// ----- arithmetic -----
|
||||
document.querySelector('[data-form="arithmetic"]').addEventListener("submit", async e => {
|
||||
e.preventDefault();
|
||||
const f = e.target;
|
||||
const a = parseFloat(f.a.value), b = parseFloat(f.b.value), op = f.op.value;
|
||||
setLoading("arithmetic");
|
||||
try {
|
||||
await withRunning(f.querySelector("button"), async () => {
|
||||
const r = await postJSON("/api/arithmetic", {a, b, op});
|
||||
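const match = r.ai === r.python || (r.ai.trim() !== "" && Number(r.ai) === Number(r.python)); // also accept numerically equal strings, e.g. "75" vs "75.0"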
|
||||
setOutput("arithmetic", `
|
||||
<div class="row"><span class="label">expression</span><div class="value">${a} ${op} ${b}</div></div>
|
||||
<div class="compare">
|
||||
<div><div class="label">AI says</div><div class="value">${escapeHtml(r.ai)}</div></div>
|
||||
<div><div class="label">python says</div><div class="value">${escapeHtml(r.python)}</div></div>
|
||||
</div>
|
||||
<div class="row" style="margin-top:14px;">match? <span class="${match ? 'match-yes' : 'match-no'}">${match ? 'yes' : 'no'}</span></div>
|
||||
`);
|
||||
});
|
||||
} catch (err) { setError("arithmetic", err.message); }
|
||||
});
|
||||
|
||||
// ----- algebra: live polynomial preview -----
|
||||
function formatPolynomial(coeffs) {
|
||||
if (!coeffs.length) return "0";
|
||||
const degree = coeffs.length - 1;
|
||||
const parts = [];
|
||||
coeffs.forEach((c, i) => {
|
||||
if (c === 0) return;
|
||||
const power = degree - i;
|
||||
const abs = Math.abs(c);
|
||||
const sign = c > 0 ? "+" : "-";
|
||||
let body;
|
||||
if (power === 0) body = String(abs);
|
||||
else {
|
||||
const coef = abs === 1 ? "" : String(abs);
|
||||
const v = power === 1 ? "x" : `x^${power}`;
|
||||
body = `${coef}${v}`;
|
||||
}
|
||||
parts.push(`${sign} ${body}`);
|
||||
});
|
||||
let expr = parts.join(" ");
|
||||
if (expr.startsWith("+ ")) expr = expr.slice(2);
|
||||
else if (expr.startsWith("- ")) expr = "-" + expr.slice(2);
|
||||
return expr;
|
||||
}
|
||||
function parseCoeffs(raw) {
|
||||
return raw.split(/[, ]+/).filter(p => p).map(p => parseFloat(p));
|
||||
}
|
||||
const algInput = document.querySelector('[data-form="algebra"] input[name="coefficients"]');
|
||||
const algPreview = document.getElementById("algebra-poly-preview");
|
||||
function updatePolyPreview() {
|
||||
const coeffs = parseCoeffs(algInput.value);
|
||||
if (coeffs.length < 2 || coeffs.some(isNaN)) {
|
||||
algPreview.innerHTML = `<span class="small">enter at least 2 numeric coefficients</span>`;
|
||||
} else {
|
||||
algPreview.innerHTML = `<span class="small">polynomial:</span> <span class="pretty-poly">${escapeHtml(formatPolynomial(coeffs))}</span>`;
|
||||
}
|
||||
}
|
||||
algInput.addEventListener("input", updatePolyPreview);
|
||||
updatePolyPreview();
|
||||
|
||||
document.querySelector('[data-form="algebra"]').addEventListener("submit", async e => {
|
||||
e.preventDefault();
|
||||
const f = e.target;
|
||||
const coeffs = parseCoeffs(f.coefficients.value);
|
||||
if (coeffs.length < 2 || coeffs.some(isNaN)) { setError("algebra", "need ≥2 numeric coefficients"); return; }
|
||||
setLoading("algebra");
|
||||
try {
|
||||
await withRunning(f.querySelector("button"), async () => {
|
||||
const r = await postJSON("/api/algebra", {coefficients: coeffs});
|
||||
setOutput("algebra", `
|
||||
<div class="row"><span class="label">polynomial</span><div class="value">${escapeHtml(r.polynomial)}</div></div>
|
||||
<div class="compare">
|
||||
<div><div class="label">AI says</div><div class="value">${escapeHtml(r.ai)}</div></div>
|
||||
<div><div class="label">numpy says</div><div class="value">${escapeHtml(JSON.stringify(r.python))}</div></div>
|
||||
</div>
|
||||
`);
|
||||
});
|
||||
} catch (err) { setError("algebra", err.message); }
|
||||
});
|
||||
|
||||
// ----- prime -----
|
||||
document.querySelector('[data-form="prime"]').addEventListener("submit", async e => {
|
||||
e.preventDefault();
|
||||
const f = e.target;
|
||||
const n = parseInt(f.n.value, 10);
|
||||
if (!Number.isFinite(n) || n < 2) { setError("prime", "n must be an integer ≥ 2"); return; }
|
||||
setLoading("prime");
|
||||
try {
|
||||
await withRunning(f.querySelector("button"), async () => {
|
||||
const r = await postJSON("/api/prime_factorization", {n});
|
||||
const match = r.ai === r.python;
|
||||
setOutput("prime", `
|
||||
<div class="row"><span class="label">n</span><div class="value">${n}</div></div>
|
||||
<div class="compare">
|
||||
<div><div class="label">AI says</div><div class="value">${escapeHtml(r.ai)}</div></div>
|
||||
<div><div class="label">python says</div><div class="value">${escapeHtml(r.python)}</div></div>
|
||||
</div>
|
||||
<div class="row" style="margin-top:14px;">match? <span class="${match ? 'match-yes' : 'match-no'}">${match ? 'yes' : 'no'}</span></div>
|
||||
`);
|
||||
});
|
||||
} catch (err) { setError("prime", err.message); }
|
||||
});
|
||||
|
||||
// ----- text demos: sentiment, translate, summarize, action_list -----
|
||||
function wireTextDemo(formName, apiPath, fieldsFn, renderFn) {
|
||||
document.querySelector(`[data-form="${formName}"]`).addEventListener("submit", async e => {
|
||||
e.preventDefault();
|
||||
const f = e.target;
|
||||
const payload = fieldsFn(f);
|
||||
if (payload === null) return;
|
||||
setLoading(formName);
|
||||
try {
|
||||
await withRunning(f.querySelector("button"), async () => {
|
||||
const r = await postJSON(apiPath, payload);
|
||||
setOutput(formName, renderFn(payload, r));
|
||||
});
|
||||
} catch (err) { setError(formName, err.message); }
|
||||
});
|
||||
}
|
||||
wireTextDemo(
|
||||
"sentiment", "/api/sentiment",
|
||||
f => ({text: f.text.value.trim()}),
|
||||
(req, r) => `
|
||||
<div class="row"><span class="label">input</span><div class="value">${escapeHtml(req.text)}</div></div>
|
||||
<div class="row"><span class="label">sentiment</span><div class="value">${escapeHtml(r.result)}</div></div>`,
|
||||
);
|
||||
wireTextDemo(
|
||||
"translate", "/api/translate",
|
||||
f => ({text: f.text.value.trim(), target_language: f.target_language.value.trim()}),
|
||||
(req, r) => `
|
||||
<div class="row"><span class="label">original</span><div class="value">${escapeHtml(req.text)}</div></div>
|
||||
<div class="row"><span class="label">target language</span><div class="value">${escapeHtml(req.target_language)}</div></div>
|
||||
<div class="row"><span class="label">translation</span><div class="value">${escapeHtml(r.result)}</div></div>`,
|
||||
);
|
||||
wireTextDemo(
|
||||
"summarize", "/api/summarize",
|
||||
f => ({text: f.text.value.trim(), max_words: parseInt(f.max_words.value, 10)}),
|
||||
(req, r) => `
|
||||
<div class="row"><span class="label">original (${req.text ? req.text.split(/\s+/).length : 0} words)</span><div class="value">${escapeHtml(req.text)}</div></div>
|
||||
<div class="row"><span class="label">summary (max ${req.max_words} words)</span><div class="value">${escapeHtml(r.result)}</div></div>`,
|
||||
);
|
||||
wireTextDemo(
|
||||
"action_list", "/api/action_list",
|
||||
f => ({description: f.description.value.trim()}),
|
||||
(req, r) => `
|
||||
<div class="row"><span class="label">input</span><div class="value">${escapeHtml(req.description)}</div></div>
|
||||
<div class="row"><span class="label">action list</span><div class="value">${escapeHtml(r.result)}</div></div>`,
|
||||
);
|
||||
|
||||
// ----- image demos: shared scaffolding -----
|
||||
// Each image form has four ways to provide an image:
|
||||
// 1. pick a sample from the grid below the dropzone
|
||||
// 2. drop a file onto the dropzone
|
||||
// 3. paste a file from the clipboard while the demo is active
|
||||
// 4. click the dropzone to open a file picker
|
||||
//
|
||||
// We track the current selection on the form element itself:
|
||||
// form._chosen = { kind: "sample", name: "food_pizza.jpg" }
|
||||
// | { kind: "file", file: File, url: ObjectURL }
|
||||
// This way submit knows what to send and preview knows what to show.
|
||||
function setChosen(form, chosen) {
|
||||
// revoke any previous object URL to avoid leaking memory
|
||||
if (form._chosen?.kind === "file" && form._chosen.url) {
|
||||
URL.revokeObjectURL(form._chosen.url);
|
||||
}
|
||||
form._chosen = chosen;
|
||||
|
||||
const preview = form.querySelector("[data-preview]");
|
||||
const tag = form.querySelector("[data-source-tag]");
|
||||
preview.innerHTML = "";
|
||||
|
||||
if (chosen.kind === "file") {
|
||||
const img = document.createElement("img");
|
||||
img.src = chosen.url;
|
||||
img.className = "preview-img";
|
||||
preview.appendChild(img);
|
||||
tag.textContent = `using uploaded image: ${chosen.file.name || "(pasted)"}`;
|
||||
} else if (chosen.kind === "sample") {
|
||||
const img = document.createElement("img");
|
||||
img.src = `/api/sample_images/${chosen.name}`;
|
||||
img.className = "preview-img";
|
||||
preview.appendChild(img);
|
||||
tag.textContent = `using sample: ${chosen.name}`;
|
||||
}
|
||||
|
||||
// sync the sample-grid radios to reflect the current choice
|
||||
form.querySelectorAll('input[name="sample"]').forEach(r => {
|
||||
r.checked = (chosen.kind === "sample" && r.value === chosen.name);
|
||||
r.closest("label").classList.toggle("selected", r.checked);
|
||||
});
|
||||
}
|
||||
|
||||
function attachFile(form, file) {
|
||||
if (!file || !file.type.startsWith("image/")) return false;
|
||||
const url = URL.createObjectURL(file);
|
||||
setChosen(form, {kind: "file", file, url});
|
||||
return true;
|
||||
}
|
||||
|
||||
fetch("/api/sample_images").then(r => r.json()).then(samples => {
|
||||
document.querySelectorAll("form[data-image-form]").forEach(form => {
|
||||
const formName = form.dataset.form;
|
||||
const grid = form.querySelector("[data-sample-grid]");
|
||||
|
||||
samples.forEach((s, i) => {
|
||||
const id = `${formName}-sample-${i}`;
|
||||
const label = document.createElement("label");
|
||||
label.innerHTML = `
|
||||
<input type="radio" name="sample" value="${s.name}" id="${id}" />
|
||||
<img src="${s.url}" alt="${s.name}" />
|
||||
<div class="small" style="text-align:center; margin-top:4px;">${s.name}</div>
|
||||
`;
|
||||
grid.appendChild(label);
|
||||
});
|
||||
grid.addEventListener("change", e => {
|
||||
if (e.target.name === "sample" && e.target.checked) {
|
||||
setChosen(form, {kind: "sample", name: e.target.value});
|
||||
}
|
||||
});
|
||||
|
||||
// default selection: first sample
|
||||
if (samples.length) setChosen(form, {kind: "sample", name: samples[0].name});
|
||||
|
||||
// dropzone wiring: click-to-pick, drag-and-drop
|
||||
const dropzone = form.querySelector("[data-dropzone]");
|
||||
const fileInput = form.querySelector('input[type="file"]');
|
||||
|
||||
dropzone.addEventListener("click", e => {
|
||||
// don't reopen the picker if the user is clicking a child link/button
|
||||
if (e.target.closest("input, button, a")) return;
|
||||
fileInput.click();
|
||||
});
|
||||
fileInput.addEventListener("change", () => {
|
||||
if (fileInput.files[0]) attachFile(form, fileInput.files[0]);
|
||||
fileInput.value = ""; // allow re-picking the same file later
|
||||
});
|
||||
|
||||
["dragenter", "dragover"].forEach(ev => {
|
||||
dropzone.addEventListener(ev, e => {
|
||||
e.preventDefault();
|
||||
dropzone.classList.add("dragover");
|
||||
});
|
||||
});
|
||||
["dragleave", "drop"].forEach(ev => {
|
||||
dropzone.addEventListener(ev, e => {
|
||||
e.preventDefault();
|
||||
dropzone.classList.remove("dragover");
|
||||
});
|
||||
});
|
||||
dropzone.addEventListener("drop", e => {
|
||||
const file = e.dataTransfer?.files?.[0];
|
||||
if (!attachFile(form, file)) {
|
||||
// Some browsers ship an image via the items API when dragging from another page
|
||||
for (const item of e.dataTransfer?.items || []) {
|
||||
if (item.kind === "file" && item.type.startsWith("image/")) {
|
||||
if (attachFile(form, item.getAsFile())) return;
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
// ----- paste-anywhere handler -----
|
||||
// When the active demo is an image one, Ctrl/Cmd+V attaches the clipboard image
|
||||
// (works for screenshots, copied images from web pages, etc.).
|
||||
document.addEventListener("paste", e => {
|
||||
const activeDemo = document.querySelector("section.demo.active");
|
||||
const form = activeDemo?.querySelector("form[data-image-form]");
|
||||
if (!form) return;
|
||||
for (const item of e.clipboardData?.items || []) {
|
||||
if (item.kind === "file" && item.type.startsWith("image/")) {
|
||||
const file = item.getAsFile();
|
||||
if (file) {
|
||||
// Rename pasted blobs so the source tag says something useful.
|
||||
const named = new File([file], file.name || `pasted-${Date.now()}.png`, {type: file.type});
|
||||
attachFile(form, named);
|
||||
e.preventDefault();
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
// ----- image-form submit -----
|
||||
document.querySelectorAll("form[data-image-form]").forEach(form => {
|
||||
form.addEventListener("submit", async e => {
|
||||
e.preventDefault();
|
||||
const chosen = form._chosen;
|
||||
if (!chosen) {
|
||||
setError(form.dataset.form, "pick a sample, upload, drop, or paste an image first");
|
||||
return;
|
||||
}
|
||||
|
||||
const fd = new FormData();
|
||||
let previewUrl;
|
||||
if (chosen.kind === "file") {
|
||||
fd.append("file", chosen.file);
|
||||
previewUrl = chosen.url;
|
||||
} else {
|
||||
fd.append("sample", chosen.name);
|
||||
previewUrl = `/api/sample_images/${chosen.name}`;
|
||||
}
|
||||
|
||||
const apiPath = `/api/${form.dataset.form}`;
|
||||
setLoading(form.dataset.form);
|
||||
try {
|
||||
await withRunning(form.querySelector("button"), async () => {
|
||||
const r = await postForm(apiPath, fd);
|
||||
setOutput(form.dataset.form, `
|
||||
<div class="row"><span class="label">input</span><div style="margin-top:6px"><img src="${previewUrl}" style="max-width:240px; max-height:180px; border-radius:6px; border:1px solid var(--border);" /></div></div>
|
||||
<div class="row"><span class="label">model output</span><div class="value">${escapeHtml(r.result)}</div></div>
|
||||
`);
|
||||
});
|
||||
} catch (err) { setError(form.dataset.form, err.message); }
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
85
examples/image_meaning_db/README.md
Normal file
@ -0,0 +1,85 @@
# image_meaning_db

A self-contained semantic image search tool. Upload images (optionally with a description) to build up a database, then search by image to find the nearest neighbors by meaning. Runs as a single Docker service: a FastAPI backend that embeds images locally with CLIP (`clip-ViT-B-32`) and stores vectors in ChromaDB, served behind a minimal browser UI.

On first launch it auto-seeds the database with ~100 sample images from Lorem Picsum so you have something to search against immediately.

## Prerequisites

You need Docker Engine and the Docker Compose plugin. If you don't already have them:

- **Linux (Ubuntu/Debian):** follow the official install guide at https://docs.docker.com/engine/install/ubuntu/. After installing, add your user to the `docker` group so you don't need `sudo`:
  ```bash
  sudo usermod -aG docker $USER
  newgrp docker
  ```
- **macOS / Windows:** install Docker Desktop from https://docs.docker.com/desktop/. Compose is bundled.

Verify it works:
```bash
docker --version
docker compose version
```

## Running it

From the project root (`examples/image_meaning_db/`, where `docker-compose.yml` lives):

```bash
docker compose up -d --build
```

(If your Compose is the older standalone binary, use `docker-compose` with a hyphen instead.)

Then open http://localhost:8081 in your browser.

### What to expect on the first run

The first `up --build` is slow because it:

1. Installs Python deps including CPU-only PyTorch (~200 MB pip download).
2. Downloads the CLIP model weights (~600 MB) into a cached volume on first server start.
3. Fetches 100 seed images from picsum.photos and embeds them.

Watch progress with:
```bash
docker compose logs -f backend
```

You'll see `Model clip-ViT-B-32 ready.`, then `Seed: N images indexed...` messages as the database fills. The UI is usable throughout; refresh to see the image count climb.

Subsequent runs reuse the cached model and the existing database, so startup is fast.

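If you would rather script the wait than refresh the page, the count reported by `GET /api/stats` can be polled. The following is a minimal sketch, not part of the project: `fetch_total` and `wait_for_count` are hypothetical helper names, and the base URL assumes the default `8081` host port. Seeding can index fewer images than requested if some downloads fail, so the loop is bounded by `max_polls` instead of waiting forever.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_total(base_url: str = "http://localhost:8081") -> int:
    """Read total_images from GET /api/stats (assumes the default host port)."""
    with urllib.request.urlopen(f"{base_url}/api/stats", timeout=10) as resp:
        return int(json.load(resp)["total_images"])


def wait_for_count(
    target: int,
    get_total: Callable[[], int] = fetch_total,
    interval: float = 2.0,
    max_polls: int = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the count reaches `target` or the poll budget is spent.

    Returns the last count seen, which may be below `target`.
    """
    total = 0
    for _ in range(max_polls):
        total = get_total()
        if total >= target:
            break
        sleep(interval)
    return total
```

For example, `wait_for_count(100)` would block until the default seed set is indexed or the poll budget runs out, then return the last count it saw.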
## Using the UI

Two tabs:

- **Submit Image**: drop, paste (Ctrl/Cmd+V), or click to select an image. Add an optional description (e.g. `"red coffee mug on wooden desk"`) and click *Submit to Database*. The image is embedded and stored.
- **Search by Image**: drop/paste/select a query image. The backend embeds it and returns the most semantically similar stored images, ranked by cosine similarity, with any descriptions they were submitted with.

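As a rough illustration of what "ranked by cosine similarity" means, here is a toy sketch in plain Python. It is not the backend's code (the backend hands the search to ChromaDB over CLIP embeddings). The three-dimensional vectors and the names `cosine_similarity` and `rank_by_similarity` are made up for illustration, and all vectors are assumed to be non-zero.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: list[float], stored: dict[str, list[float]], n: int = 10
) -> list[tuple[str, float]]:
    """Return up to n (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy "embeddings"; real CLIP vectors have hundreds of dimensions.
toy_db = {
    "same_direction": [2.0, 0.0, 0.0],
    "diagonal": [1.0, 1.0, 0.0],
    "orthogonal": [0.0, 1.0, 0.0],
}
ranking = rank_by_similarity([1.0, 0.0, 0.0], toy_db)
```

Vector length does not affect the score, only direction does: `same_direction` is twice as long as the query yet ranks first, ahead of `diagonal` and then `orthogonal`.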
## API

If you want to hit the backend directly:

- `POST /api/submit`: multipart form with `file` (image) and an optional `description` (string). Returns `{id, filename, total_images}`.
- `POST /api/search`: multipart form with `file` (image), plus an optional query param `n` (default 10). Returns `{results: [...]}`, a ranked list of matches with similarity scores.
- `GET /api/images/{filename}`: serves a stored image.
- `GET /api/stats`: returns `{total_images: N}`.

## Stopping and resetting

```bash
docker compose down       # stop containers, keep data
docker compose down -v    # also delete the database, cached model, and stored images
```

If you wipe volumes, the next start will re-download the CLIP model and re-seed the 100 sample images.

## Configuration

Environment variables read by the backend; set them under `environment:` in `docker-compose.yml`:

- `EMBEDDING_MODEL`: sentence-transformers model name. Default: `clip-ViT-B-32`. If you change this, wipe the `chroma_data` volume, because embedding dimensions must match across all stored vectors.
- `SEED_COUNT`: number of sample images to seed on first launch. Default: `100`. Set to `0` to skip seeding.

Host port mapping is also in `docker-compose.yml`; change the left side of `"8081:8080"` if 8081 conflicts with something else on your machine.
17
examples/image_meaning_db/backend/Dockerfile
Normal file
@ -0,0 +1,17 @@
FROM python:3.12-slim

WORKDIR /app

ENV HF_HOME=/root/.cache/huggingface \
    PIP_DEFAULT_TIMEOUT=180

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN mkdir -p /app/images /app/chroma_data

EXPOSE 8080

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
197
examples/image_meaning_db/backend/main.py
Normal file
@ -0,0 +1,197 @@
import asyncio
import io
import os
import uuid
from pathlib import Path

import chromadb
import httpx
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import FileResponse, HTMLResponse
from fastapi.staticfiles import StaticFiles
from PIL import Image
from sentence_transformers import SentenceTransformer

EMBEDDING_MODEL = os.environ.get("EMBEDDING_MODEL", "clip-ViT-B-32")
SEED_COUNT = int(os.environ.get("SEED_COUNT", "100"))
IMAGE_DIR = Path("/app/images")
CHROMA_DIR = Path("/app/chroma_data")

IMAGE_DIR.mkdir(parents=True, exist_ok=True)

app = FastAPI(title="Image Meaning DB")

chroma_client = chromadb.PersistentClient(path=str(CHROMA_DIR))
collection = chroma_client.get_or_create_collection(
    name="images",
    metadata={"hnsw:space": "cosine"},
)

print(f"Loading embedding model {EMBEDDING_MODEL}...")
embedder = SentenceTransformer(EMBEDDING_MODEL)
print(f"Model {EMBEDDING_MODEL} ready.")


def load_image(image_bytes: bytes) -> Image.Image:
    """Decode bytes to RGB and shrink so the longest side is at most 512 px."""
    img = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    max_dim = 512
    if max(img.size) > max_dim:
        img.thumbnail((max_dim, max_dim), Image.LANCZOS)
    return img


def get_embedding(img: Image.Image) -> list[float]:
    """Return the unit-normalized embedding of an image as a plain list."""
    vec = embedder.encode(img, convert_to_numpy=True, normalize_embeddings=True)
    return vec.tolist()


async def _seed_database():
    if SEED_COUNT <= 0:
        print("Seed: SEED_COUNT is 0, skipping.")
        return
    if collection.count() > 0:
        print(f"Seed: collection already has {collection.count()} images, skipping.")
        return
    print(f"Seed: fetching {SEED_COUNT} sample images from picsum.photos...")
    try:
        async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
            resp = await client.get(
                "https://picsum.photos/v2/list",
                params={"page": 1, "limit": SEED_COUNT},
            )
            resp.raise_for_status()
            photos = resp.json()

            # cap concurrent downloads at 8
            sem = asyncio.Semaphore(8)

            async def fetch(p):
                async with sem:
                    try:
                        r = await client.get(f"https://picsum.photos/id/{p['id']}/512/512")
                        r.raise_for_status()
                        return p, r.content
                    except Exception as e:
                        print(f"Seed: fetch failed id={p['id']}: {e}")
                        return None

            fetched = await asyncio.gather(*(fetch(p) for p in photos))
    except Exception as e:
        print(f"Seed: list fetch failed: {e}")
        return

    added = 0
    for item in fetched:
        if item is None:
            continue
        p, image_bytes = item
        try:
            img = load_image(image_bytes)
            # embed in a worker thread so the event loop keeps serving requests
            embedding = await asyncio.to_thread(get_embedding, img)
            image_id = f"seed-{p['id']}"
            filename = f"{image_id}.jpg"
            (IMAGE_DIR / filename).write_bytes(image_bytes)
            collection.add(
                ids=[image_id],
                embeddings=[embedding],
                metadatas=[{
                    "filename": filename,
                    "original_name": f"picsum-{p['id']}.jpg",
                    "description": f"Photo by {p['author']}",
                }],
            )
            added += 1
            if added % 10 == 0:
                print(f"Seed: {added} images indexed...")
        except Exception as e:
            print(f"Seed: embed/store failed id={p['id']}: {e}")
    print(f"Seed: finished, {added} images added.")


@app.on_event("startup")
async def on_startup():
    # keep a reference so the background task is not garbage-collected mid-run
    app.state.seed_task = asyncio.create_task(_seed_database())


@app.post("/api/submit")
async def submit_image(
    file: UploadFile = File(...),
    description: str = Form(""),
):
    """Upload an image, embed it, and store it."""
    image_bytes = await file.read()
    img = load_image(image_bytes)

    embedding = get_embedding(img)

    image_id = str(uuid.uuid4())
    ext = Path(file.filename or "image.png").suffix or ".png"
    filename = f"{image_id}{ext}"
    filepath = IMAGE_DIR / filename
    filepath.write_bytes(image_bytes)

    collection.add(
        ids=[image_id],
        embeddings=[embedding],
        metadatas=[{
            "filename": filename,
            "original_name": file.filename or "unknown",
            "description": description,
        }],
    )

    count = collection.count()
    return {"id": image_id, "filename": filename, "total_images": count}


@app.post("/api/search")
async def search_image(file: UploadFile = File(...), n: int = 10):
    """Upload an image, embed it, and find nearest neighbors."""
    image_bytes = await file.read()
    img = load_image(image_bytes)

    embedding = get_embedding(img)

    count = collection.count()
    if count == 0:
        return {"results": [], "message": "No images in database yet."}

    results = collection.query(
        query_embeddings=[embedding],
        n_results=min(n, count),
        include=["distances", "metadatas"],
    )

    matches = []
    for i, doc_id in enumerate(results["ids"][0]):
        distance = results["distances"][0][i]
        metadata = results["metadatas"][0][i]
        similarity = 1 - distance  # cosine distance -> similarity
        matches.append({
            "id": doc_id,
            "filename": metadata["filename"],
            "original_name": metadata["original_name"],
            "description": metadata.get("description", ""),
            "similarity": round(similarity, 4),
        })

    return {"results": matches}


@app.get("/api/images/{filename}")
async def get_image(filename: str):
    """Serve a stored image."""
    filepath = IMAGE_DIR / filename
    # is_file() also rejects directory names such as "..", which exists() would accept
    if not filepath.is_file():
        raise HTTPException(status_code=404, detail="Image not found")
    return FileResponse(filepath)


@app.get("/api/stats")
async def stats():
    """Return database stats."""
    return {"total_images": collection.count()}


@app.get("/", response_class=HTMLResponse)
async def index():
    return Path("static/index.html").read_text()
10
examples/image_meaning_db/backend/requirements.txt
Normal file
@ -0,0 +1,10 @@
fastapi==0.115.6
uvicorn==0.34.0
python-multipart==0.0.20
httpx==0.28.1
chromadb==0.6.3
Pillow==11.1.0
sentence-transformers==3.3.1
--extra-index-url https://download.pytorch.org/whl/cpu
torch==2.5.1+cpu
torchvision==0.20.1+cpu
423
examples/image_meaning_db/backend/static/index.html
Normal file
@ -0,0 +1,423 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>Image Meaning DB</title>
|
||||
<style>
|
||||
:root {
|
||||
--bg: #0f1117;
|
||||
--surface: #1a1d27;
|
||||
--border: #2a2d3a;
|
||||
--text: #e1e4ed;
|
||||
--muted: #8b8fa3;
|
||||
--accent: #6c5ce7;
|
||||
--accent-hover: #7f71ed;
|
||||
--success: #2ecc71;
|
||||
--card-bg: #1e2130;
|
||||
}
|
||||
* { margin: 0; padding: 0; box-sizing: border-box; }
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', system-ui, sans-serif;
|
||||
background: var(--bg);
|
||||
color: var(--text);
|
||||
min-height: 100vh;
|
||||
}
|
||||
.container { max-width: 960px; margin: 0 auto; padding: 2rem 1.5rem; }
|
||||
h1 {
|
||||
font-size: 1.75rem;
|
||||
font-weight: 700;
|
||||
margin-bottom: 0.25rem;
|
||||
}
|
||||
.subtitle { color: var(--muted); margin-bottom: 2rem; font-size: 0.9rem; }
|
||||
.stats { color: var(--muted); font-size: 0.85rem; margin-bottom: 1.5rem; }
|
||||
|
||||
.tabs {
|
||||
display: flex;
|
||||
gap: 0;
|
||||
margin-bottom: 2rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
.tab {
|
||||
padding: 0.75rem 1.5rem;
|
||||
cursor: pointer;
|
||||
color: var(--muted);
|
||||
border-bottom: 2px solid transparent;
|
||||
font-size: 0.95rem;
|
||||
transition: all 0.15s;
|
||||
}
|
||||
.tab:hover { color: var(--text); }
|
||||
.tab.active {
|
||||
color: var(--accent);
|
||||
border-bottom-color: var(--accent);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.panel { display: none; }
|
||||
.panel.active { display: block; }
|
||||
|
||||
.drop-zone {
|
||||
border: 2px dashed var(--border);
|
||||
border-radius: 12px;
|
||||
padding: 3rem 2rem;
|
||||
text-align: center;
|
||||
cursor: pointer;
|
||||
transition: all 0.2s;
|
||||
margin-bottom: 1.5rem;
|
||||
position: relative;
|
||||
}
|
||||
.drop-zone:hover, .drop-zone.dragover {
|
||||
border-color: var(--accent);
|
||||
background: rgba(108, 92, 231, 0.05);
|
||||
}
|
||||
.drop-zone input { display: none; }
|
||||
.drop-zone p { color: var(--muted); margin-top: 0.5rem; font-size: 0.9rem; }
|
||||
.drop-zone .icon { font-size: 2rem; margin-bottom: 0.5rem; }
|
||||
|
||||
.preview-container {
|
||||
display: flex;
|
||||
align-items: flex-start;
|
||||
gap: 1.5rem;
|
||||
margin-bottom: 1.5rem;
|
||||
}
|
||||
.preview-img {
|
||||
max-width: 200px;
|
||||
max-height: 200px;
|
||||
border-radius: 8px;
|
||||
object-fit: cover;
|
||||
border: 1px solid var(--border);
|
||||
}
|
||||
.preview-info { flex: 1; }
|
||||
.preview-info p { color: var(--muted); font-size: 0.85rem; margin-bottom: 0.5rem; }
|
||||
|
||||
button {
|
||||
background: var(--accent);
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 0.65rem 1.5rem;
|
||||
border-radius: 8px;
|
||||
font-size: 0.9rem;
|
||||
cursor: pointer;
|
||||
font-weight: 600;
|
||||
transition: background 0.15s;
|
||||
}
|
||||
button:hover { background: var(--accent-hover); }
|
||||
button:disabled { opacity: 0.5; cursor: not-allowed; }
|
||||
|
||||
.status {
|
||||
margin-top: 1rem;
|
||||
padding: 0.75rem 1rem;
|
||||
border-radius: 8px;
|
||||
font-size: 0.85rem;
|
||||
display: none;
|
||||
}
|
||||
.status.show { display: block; }
|
||||
.status.info { background: rgba(108, 92, 231, 0.1); color: var(--accent); }
|
||||
.status.success { background: rgba(46, 204, 113, 0.1); color: var(--success); }
|
||||
.status.error { background: rgba(231, 76, 60, 0.1); color: #e74c3c; }
|
||||
|
||||
.results {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fill, minmax(200px, 1fr));
|
||||
gap: 1rem;
|
||||
margin-top: 1.5rem;
|
||||
}
|
||||
.result-card {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 10px;
|
||||
overflow: hidden;
|
||||
transition: transform 0.15s;
|
||||
}
|
||||
.result-card:hover { transform: translateY(-2px); }
|
||||
.result-card img {
|
||||
width: 100%;
|
||||
aspect-ratio: 1;
|
||||
object-fit: cover;
|
||||
display: block;
|
||||
}
|
||||
.result-card .card-info {
|
||||
padding: 0.6rem 0.75rem;
|
||||
}
|
||||
.result-card .similarity {
|
||||
font-weight: 700;
|
||||
font-size: 0.95rem;
|
||||
color: var(--accent);
|
||||
}
|
||||
.result-card .card-name {
|
||||
color: var(--muted);
|
||||
font-size: 0.75rem;
|
||||
margin-top: 0.2rem;
|
||||
white-space: nowrap;
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
}
|
||||
.result-card .card-desc {
|
||||
color: var(--text);
|
||||
font-size: 0.8rem;
|
||||
margin-top: 0.35rem;
|
||||
line-height: 1.3;
|
||||
}
|
||||
.result-card .rank {
|
||||
font-size: 0.7rem;
|
||||
color: var(--muted);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
textarea.description {
|
||||
width: 100%;
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 8px;
|
||||
color: var(--text);
|
||||
padding: 0.6rem 0.75rem;
|
||||
font-family: inherit;
|
||||
font-size: 0.9rem;
|
||||
resize: vertical;
|
||||
min-height: 72px;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
textarea.description:focus {
|
||||
outline: none;
|
||||
border-color: var(--accent);
|
||||
}
|
||||
|
||||
.loading {
|
||||
display: inline-block;
|
||||
width: 16px; height: 16px;
|
||||
border: 2px solid var(--border);
|
||||
border-top-color: var(--accent);
|
||||
border-radius: 50%;
|
||||
animation: spin 0.6s linear infinite;
|
||||
vertical-align: middle;
|
||||
margin-right: 0.5rem;
|
||||
}
|
||||
@keyframes spin { to { transform: rotate(360deg); } }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="container">
|
||||
<h1>Image Meaning DB</h1>
|
||||
<p class="subtitle">Semantic image search powered by CLIP embeddings</p>
|
||||
<p class="stats" id="stats">Loading...</p>
|
||||
|
||||
<div class="tabs">
|
||||
<div class="tab active" data-panel="submit">Submit Image</div>
|
||||
<div class="tab" data-panel="search">Search by Image</div>
|
||||
</div>
|
||||
|
||||
<!-- Submit Panel -->
|
||||
<div class="panel active" id="panel-submit">
|
||||
<div class="drop-zone" id="submit-drop">
|
||||
<div class="icon">🖼</div>
|
||||
<p>Drop, paste (Ctrl/Cmd+V), or click to browse</p>
|
||||
<input type="file" accept="image/*" id="submit-file">
|
||||
</div>
|
||||
<div id="submit-preview" style="display:none">
|
||||
<div class="preview-container">
|
||||
<img class="preview-img" id="submit-preview-img">
|
||||
<div class="preview-info">
|
||||
<p id="submit-file-name"></p>
|
||||
<textarea class="description" id="submit-description"
|
||||
placeholder="Describe this image (optional) — e.g. 'red coffee mug on wooden desk'"></textarea>
|
||||
<button id="submit-btn">Submit to Database</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="status" id="submit-status"></div>
|
||||
</div>
|
||||
|
||||
<!-- Search Panel -->
|
||||
<div class="panel" id="panel-search">
|
||||
<div class="drop-zone" id="search-drop">
|
||||
<div class="icon">🔍</div>
|
||||
<p>Drop, paste (Ctrl/Cmd+V), or click to browse a query image</p>
|
||||
<input type="file" accept="image/*" id="search-file">
|
||||
</div>
|
||||
<div id="search-preview" style="display:none">
|
||||
<div class="preview-container">
|
||||
<img class="preview-img" id="search-preview-img">
|
||||
<div class="preview-info">
|
||||
<p id="search-file-name"></p>
|
||||
<button id="search-btn">Find Similar Images</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="status" id="search-status"></div>
|
||||
<div class="results" id="search-results"></div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
// Tabs
|
||||
document.querySelectorAll('.tab').forEach(tab => {
|
||||
tab.addEventListener('click', () => {
|
||||
document.querySelectorAll('.tab').forEach(t => t.classList.remove('active'));
|
||||
document.querySelectorAll('.panel').forEach(p => p.classList.remove('active'));
|
||||
tab.classList.add('active');
|
||||
document.getElementById('panel-' + tab.dataset.panel).classList.add('active');
|
||||
});
|
||||
});
|
||||
|
||||
// Drop zones
|
||||
function setupDropZone(dropEl, fileInput, previewImg, fileNameEl, previewContainer) {
|
||||
let selectedFile = null;
|
||||
|
||||
dropEl.addEventListener('click', () => fileInput.click());
|
||||
dropEl.addEventListener('dragover', e => { e.preventDefault(); dropEl.classList.add('dragover'); });
|
||||
dropEl.addEventListener('dragleave', () => dropEl.classList.remove('dragover'));
|
||||
dropEl.addEventListener('drop', e => {
|
||||
e.preventDefault();
|
||||
dropEl.classList.remove('dragover');
|
||||
if (e.dataTransfer.files.length) handleFile(e.dataTransfer.files[0]);
|
||||
});
|
||||
fileInput.addEventListener('change', () => {
|
||||
if (fileInput.files.length) handleFile(fileInput.files[0]);
|
||||
});
|
||||
|
||||
function handleFile(file) {
|
||||
selectedFile = file;
|
||||
const url = URL.createObjectURL(file);
|
||||
previewImg.src = url;
|
||||
fileNameEl.textContent = file.name + ' (' + (file.size / 1024).toFixed(1) + ' KB)';
|
||||
previewContainer.style.display = 'block';
|
||||
}
|
||||
|
||||
return { getFile: () => selectedFile, handleFile };
|
||||
}
|
||||
|
||||
const submitZone = setupDropZone(
|
||||
document.getElementById('submit-drop'),
|
||||
document.getElementById('submit-file'),
|
||||
document.getElementById('submit-preview-img'),
|
||||
document.getElementById('submit-file-name'),
|
||||
document.getElementById('submit-preview')
|
||||
);
|
||||
|
||||
const searchZone = setupDropZone(
|
||||
document.getElementById('search-drop'),
|
||||
document.getElementById('search-file'),
|
||||
document.getElementById('search-preview-img'),
|
||||
document.getElementById('search-file-name'),
|
||||
document.getElementById('search-preview')
|
||||
);
|
||||
|
||||
const getSubmitFile = submitZone.getFile;
|
||||
const getSearchFile = searchZone.getFile;
|
||||
|
||||
document.addEventListener('paste', (e) => {
|
||||
const target = e.target;
|
||||
if (target && (target.tagName === 'TEXTAREA' || target.tagName === 'INPUT')) return;
|
||||
const items = e.clipboardData && e.clipboardData.items;
|
||||
if (!items) return;
|
||||
for (const item of items) {
|
||||
if (item.kind === 'file' && item.type.startsWith('image/')) {
|
||||
const blob = item.getAsFile();
|
||||
if (!blob) continue;
|
||||
const ext = (item.type.split('/')[1] || 'png').split('+')[0];
|
||||
const file = new File([blob], `pasted-${Date.now()}.${ext}`, { type: item.type });
|
||||
const activeTab = document.querySelector('.tab.active').dataset.panel;
|
||||
(activeTab === 'submit' ? submitZone : searchZone).handleFile(file);
|
||||
e.preventDefault();
|
||||
break;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
function showStatus(id, msg, type) {
|
||||
const el = document.getElementById(id);
|
||||
el.className = 'status show ' + type;
|
||||
el.innerHTML = msg;
|
||||
}
|
||||
|
||||
// Stats
|
||||
async function loadStats() {
|
||||
try {
|
||||
const resp = await fetch('/api/stats');
|
||||
const data = await resp.json();
|
||||
document.getElementById('stats').textContent = data.total_images + ' images in database';
|
||||
} catch(e) {
|
||||
document.getElementById('stats').textContent = 'Could not load stats';
|
||||
}
|
||||
}
|
||||
loadStats();
|
||||
|
||||
// Submit
|
||||
document.getElementById('submit-btn').addEventListener('click', async () => {
|
||||
const file = getSubmitFile();
|
||||
if (!file) return;
|
||||
const btn = document.getElementById('submit-btn');
|
||||
btn.disabled = true;
|
||||
showStatus('submit-status', '<span class="loading"></span> Embedding image...', 'info');
|
||||
|
||||
const form = new FormData();
|
||||
form.append('file', file);
|
||||
form.append('description', document.getElementById('submit-description').value);
|
||||
|
||||
try {
|
||||
const resp = await fetch('/api/submit', { method: 'POST', body: form });
|
||||
const data = await resp.json();
|
||||
if (resp.ok) {
|
||||
showStatus('submit-status',
|
||||
'Image stored! ID: ' + data.id + ' — ' + data.total_images + ' total images in DB', 'success');
|
||||
document.getElementById('submit-description').value = '';
|
||||
loadStats();
|
||||
} else {
|
||||
showStatus('submit-status', 'Error: ' + JSON.stringify(data), 'error');
|
||||
}
|
||||
} catch(e) {
|
||||
showStatus('submit-status', 'Request failed: ' + e.message, 'error');
|
||||
}
|
||||
btn.disabled = false;
|
||||
});
|
||||
|
||||
// Search
|
||||
document.getElementById('search-btn').addEventListener('click', async () => {
|
||||
const file = getSearchFile();
|
||||
if (!file) return;
|
||||
const btn = document.getElementById('search-btn');
|
||||
btn.disabled = true;
|
||||
showStatus('search-status', '<span class="loading"></span> Embedding and searching...', 'info');
|
||||
document.getElementById('search-results').innerHTML = '';
|
||||
|
||||
const form = new FormData();
|
||||
form.append('file', file);
|
||||
|
||||
try {
|
||||
const resp = await fetch('/api/search', { method: 'POST', body: form });
|
||||
const data = await resp.json();
|
||||
if (resp.ok) {
|
||||
if (data.results.length === 0) {
|
||||
showStatus('search-status', data.message || 'No results found.', 'info');
|
||||
} else {
|
||||
showStatus('search-status', 'Found ' + data.results.length + ' results', 'success');
|
||||
const container = document.getElementById('search-results');
|
||||
const escape = s => (s || '').replace(/[&<>"']/g, c =>
|
||||
({'&':'&amp;','<':'&lt;','>':'&gt;','"':'&quot;',"'":'&#39;'}[c]));
|
||||
data.results.forEach((r, i) => {
|
||||
const card = document.createElement('div');
|
||||
card.className = 'result-card';
|
||||
const descHtml = r.description
|
||||
? `<div class="card-desc">${escape(r.description)}</div>` : '';
|
||||
card.innerHTML = `
|
||||
<img src="/api/images/${r.filename}" alt="${escape(r.original_name)}">
|
||||
<div class="card-info">
|
||||
<span class="rank">#${i + 1}</span>
|
||||
<span class="similarity">${(r.similarity * 100).toFixed(1)}%</span>
|
||||
<div class="card-name">${escape(r.original_name)}</div>
|
||||
${descHtml}
|
||||
</div>`;
|
||||
container.appendChild(card);
|
||||
});
|
||||
}
|
||||
} else {
|
||||
showStatus('search-status', 'Error: ' + JSON.stringify(data), 'error');
|
||||
}
|
||||
} catch(e) {
|
||||
showStatus('search-status', 'Request failed: ' + e.message, 'error');
|
||||
}
|
||||
btn.disabled = false;
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
20
examples/image_meaning_db/docker-compose.yml
Normal file
@ -0,0 +1,20 @@
|
||||
services:
  backend:
    build: ./backend
    ports:
      - "8081:8080"
    volumes:
      - image_store:/app/images
      - chroma_data:/app/chroma_data
      - hf_cache:/root/.cache/huggingface
    environment:
      - EMBEDDING_MODEL=clip-ViT-B-32
    deploy:
      resources:
        limits:
          memory: 4G

volumes:
  image_store:
  chroma_data:
  hf_cache:
|
||||
BIN
images/CodingAIWorkshop_ShenRyan.png
Executable file
|
After Width: | Height: | Size: 1.7 MiB |
BIN
images/CodingAIWorkshop_ShenRyan_V1.png
Executable file
|
After Width: | Height: | Size: 2.0 MiB |
BIN
images/CodingAIWorkshop_ShenRyan_V2.png
Executable file
|
After Width: | Height: | Size: 2.0 MiB |
BIN
images/CodingAIWorkshop_ShenRyan_V3.png
Executable file
|
After Width: | Height: | Size: 1.6 MiB |
BIN
images/CodingAIWorkshop_ShenRyan_V4.png
Executable file
|
After Width: | Height: | Size: 1.7 MiB |
BIN
images/CodingAIWorkshop_ShenRyan_V5.png
Executable file
|
After Width: | Height: | Size: 1.7 MiB |
BIN
images/CodingAIWorkshop_ShenRyan_printable.png
Executable file
|
After Width: | Height: | Size: 1.6 MiB |
68
personal-project.md
Normal file
@ -0,0 +1,68 @@
|
||||
# Your Personal Project
|
||||
|
||||
Everyone in this class is working toward one thing: a project that solves some problem in your life, or that you just think would be cool. There are no wrong answers. We'll collaborate and learn what's possible as we go, so there's a good chance you'll like the result. We call it your *personal project*. It's the through-line of everything else we do.
|
||||
|
||||
---
|
||||
|
||||
## How to think about it
|
||||
|
||||
A personal project is **something you want to exist** that doesn't yet exist for you in a form that works.
|
||||
|
||||
It can be small. It probably *should* be small to start. The goal is not to build a startup. The goal is to take one piece of friction out of your life and replace it with something you control.
|
||||
|
||||
Some shapes a personal project can take:
|
||||
|
||||
- A script that takes a folder of messy files and renames or organizes them by content.
|
||||
- A tool that searches your photos by what's in them, not by date or filename.
|
||||
- An assistant that reads a long PDF (a contract, a manual, a research paper) and answers your specific questions about it.
|
||||
- A workflow that pulls data from somewhere you check often and gives you a summary on a schedule.
|
||||
- A small program that does one tedious thing your job currently asks you to do by hand.
|
||||
- An interface for a hobby — your books, your recipes, your training data, your woodworking inventory.
|
||||
- Something for someone you care about — a parent who can't navigate menus, a kid who'd love a custom learning tool.
|
||||
|
||||
None of those is a *requirement*. They're only meant to get the shape across.
|
||||
|
||||
---
|
||||
|
||||
## You don't need one yet
|
||||
|
||||
You do not need to walk in knowing what your project is.
|
||||
|
||||
Most people don't, the first time they're asked. Friction in your own life is often invisible — you've adapted around it for so long that you no longer see it as friction. Part of what we'll work on together is **noticing**.
|
||||
|
||||
Helpful questions, to sit with rather than answer immediately:
|
||||
|
||||
- What do I do every week that I find tedious?
|
||||
- What information do I keep losing track of?
|
||||
- What is something I wish my computer would just *do*, that it doesn't?
|
||||
- If I had a tireless, patient assistant who could read, write, and run code, what's the first thing I'd point them at?
|
||||
|
||||
If something rises to the top, write it down. If nothing does, that's also fine. Bring an open mind to the next session and we'll work on it together.
|
||||
|
||||
---
|
||||
|
||||
## What we (the instructors) actually do
|
||||
|
||||
We are not graders. We are collaborators with more practice than you have at this specific thing.
|
||||
|
||||
Concretely, our role is to:
|
||||
|
||||
- **Help you see what's possible.** Most of the limits you imagine on your devices are not real. We'll show you the real ones.
|
||||
- **Help you frame the problem.** A vague wish ("I wish my email was less of a mess") becomes a tractable project ("a script that flags all unread email older than 30 days from senders I've never replied to") — but that translation is a skill, and we have it.
|
||||
- **Get you unstuck.** When something fails — and things will fail — we help you debug, redirect, or pick a different approach.
|
||||
- **Push back when it helps.** If your project is too ambitious for the time you have, or too small to be worth the work, we'll say so. You can ignore us.
|
||||
|
||||
---
|
||||
|
||||
## What we *don't* do
|
||||
|
||||
- We don't pick your project for you.
|
||||
- We don't write your project for you. AI will do most of the writing; you do the steering. We help with the steering.
|
||||
- We don't grade or evaluate your project. There is no rubric.
|
||||
- We don't lock you into any particular tool, vendor, or stack. Your project is yours, on your machine, in formats you control.
|
||||
|
||||
---
|
||||
|
||||
## A closing note
|
||||
|
||||
If at the end of this class you have a small thing running on your computer that does one useful thing for *you* specifically — something that didn't exist before, that you can keep using or modify whenever you want — then this class worked. That's the whole bar.
|
||||
32
reference/README.md
Normal file
@ -0,0 +1,32 @@
|
||||
# Reference
|
||||
|
||||
Background material to dip into when your project pulls you toward something you don't yet know. None of it is required reading. The class is driven by your project; this folder exists so that when a project hits a wall, there's something concrete to point at.
|
||||
|
||||
There are two flavors of reference here, and they're meant to be used differently.
|
||||
|
||||
## Tools and tech (hands-on)
|
||||
|
||||
Short, self-paced primers on the things you'll most often end up touching. Read them when you have a reason to, not before.
|
||||
|
||||
| Folder | What it covers |
|
||||
|--------|----------------|
|
||||
| [`python/`](python/) | Python basics — enough to read and tweak the code AI writes for you |
|
||||
| [`git/`](git/) | Tracking changes, undoing mistakes, working on more than one thing at once |
|
||||
| [`github/`](github/) | Putting code somewhere others (or future-you) can find it |
|
||||
| [`huggingface/`](huggingface/) | Where most open-weight models live; how to grab one and use it |
|
||||
| [`pytorch/`](pytorch/) | The framework most modern models are written in |
|
||||
| [`docker/`](docker/) | Running other people's software without polluting your own machine |
|
||||
|
||||
More will appear here as projects surface the need for them.
|
||||
|
||||
## Papers (reading)
|
||||
|
||||
A small, opinionated set of papers that, taken together, give you a feel for *how we got here*. Skim them, read the abstracts, or just look at the dates and the names — the trajectory matters more than any individual result.
|
||||
|
||||
See [`papers/`](papers/).
|
||||
|
||||
## How to use this folder
|
||||
|
||||
- **Don't binge it.** Reading reference material in the abstract is the slowest way to learn it. Wait until your project gives you a reason.
|
||||
- **Skim, then dive.** Read the README of a topic before reading any of the lessons. You may find you only need one section.
|
||||
- **Ask AI as you go.** These primers are starting points, not textbooks. If something doesn't click, paste the snippet into an AI chat and ask.
|
||||
89
reference/docker/00_installing-docker.md
Normal file
@ -0,0 +1,89 @@
|
||||
# Installing Docker
|
||||
|
||||
You only need this if your project leads you into Docker. Skip it otherwise.
|
||||
|
||||
There are two flavors of Docker install:
|
||||
|
||||
- **Docker Desktop** — a graphical app for Mac and Windows that bundles the Docker engine, a small Linux VM, and a GUI. The easiest path on those platforms.
|
||||
- **Docker Engine** — the command-line daemon, no GUI. The standard way to install Docker on Linux.
|
||||
|
||||
Both run the same `docker` command. Pick whichever fits your OS.
|
||||
|
||||
## Windows
|
||||
|
||||
1. Go to https://docs.docker.com/desktop/install/windows-install/
|
||||
2. Download **Docker Desktop for Windows**.
|
||||
3. Run the installer. If it asks for your email, you can skip that step; it isn't required.
|
||||
4. When asked, leave **"Use WSL 2 instead of Hyper-V"** checked (this is the modern, recommended backend).
|
||||
5. Restart your machine when prompted.
|
||||
6. Launch Docker Desktop from the Start menu. The first launch takes a minute as it sets up.
|
||||
|
||||
Verify in PowerShell or Command Prompt:
|
||||
|
||||
```
|
||||
docker --version
|
||||
docker compose version
|
||||
```
|
||||
|
||||
You should see versions printed for both.
|
||||
|
||||
> Note: Docker Desktop is free for personal use, students, education, and small businesses. Read the license terms if you'll use it at a larger company — they may require a paid subscription. On Linux you don't need Docker Desktop at all.
|
||||
|
||||
## Mac
|
||||
|
||||
1. Go to https://docs.docker.com/desktop/install/mac-install/
|
||||
2. Download the version for your chip — **Apple Silicon** (M1/M2/M3/M4) or **Intel**. If you're not sure: click the Apple menu → About This Mac. If it says "Apple M…" pick Apple Silicon.
|
||||
3. Open the `.dmg` and drag **Docker** to your Applications folder.
|
||||
4. Launch Docker from Applications. Grant the permissions it asks for. You'll see a whale icon in your menu bar when it's running.
|
||||
|
||||
Verify in Terminal:
|
||||
|
||||
```
|
||||
docker --version
|
||||
docker compose version
|
||||
```
|
||||
|
||||
## Linux
|
||||
|
||||
On Linux you install the engine directly — no Desktop app required.
|
||||
|
||||
The quickest path is Docker's official convenience script, which supports the most common distros (Docker recommends it for development machines rather than production servers):
|
||||
|
||||
```
|
||||
curl -fsSL https://get.docker.com | sh
|
||||
```
|
||||
|
||||
Then add your user to the `docker` group so you don't have to `sudo` every command:
|
||||
|
||||
```
|
||||
sudo usermod -aG docker $USER
|
||||
```
|
||||
|
||||
**Log out and log back in** for that group change to take effect.
|
||||
|
||||
Verify:
|
||||
|
||||
```
|
||||
docker --version
|
||||
docker compose version
|
||||
docker run hello-world
|
||||
```
|
||||
|
||||
If you'd rather use your distribution's package manager directly, the official instructions live at https://docs.docker.com/engine/install/ — pick your distro from the sidebar.
|
||||
|
||||
## A note on `docker compose` vs `docker-compose`
|
||||
|
||||
- `docker compose` (two words, a space) is the modern V2 plugin. Use this.
|
||||
- `docker-compose` (one word, a hyphen) is the legacy V1 tool. Older tutorials use it. The commands are nearly identical. If you see `docker-compose up` somewhere, mentally translate it to `docker compose up`.
|
||||
|
||||
Modern Docker installs include the V2 plugin out of the box. You should not need to install `docker-compose` separately.
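The "mental translation" described above can be sketched in a few lines of Python. This is only an illustration: `modernize_compose` is a made-up helper name, and it assumes the rest of the command line carries over unchanged, which holds for common commands but is not guaranteed for every V1 flag.

```python
import shlex


def modernize_compose(command: str) -> str:
    """Rewrite a legacy `docker-compose ...` command line as `docker compose ...`."""
    rewritten = []
    for token in shlex.split(command):
        if token == "docker-compose":
            # The V1 binary name becomes the `compose` subcommand of `docker`.
            rewritten.extend(["docker", "compose"])
        else:
            # Flags and file names (such as docker-compose.yml) are kept as-is.
            rewritten.append(token)
    return shlex.join(rewritten)


print(modernize_compose("docker-compose -f docker-compose.yml up -d"))
```

Only the standalone word `docker-compose` is rewritten, so a file named `docker-compose.yml` keeps its name.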
|
||||
|
||||
## Did it work?
|
||||
|
||||
The classic smoke test. From any terminal:
|
||||
|
||||
```
|
||||
docker run hello-world
|
||||
```
|
||||
|
||||
You should see a friendly message ending with "Hello from Docker!". If you see that, you're done — head back to [`01_what_is_docker.md`](01_what_is_docker.md).
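If you would rather script the version check than type it, here is a minimal sketch in Python. The function name `docker_version` is hypothetical. It only confirms that the `docker` client is on your PATH and responds; it does not prove the engine can run containers, which is what `docker run hello-world` tests.

```python
import shutil
import subprocess


def docker_version():
    """Return the output of `docker --version`, or None if Docker is unavailable."""
    if shutil.which("docker") is None:
        return None  # no `docker` executable on PATH
    try:
        result = subprocess.run(
            ["docker", "--version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = docker_version()
print(version if version else "Docker not found; see the install steps above.")
```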
|
||||
87
reference/docker/01_what_is_docker.md
Normal file
@ -0,0 +1,87 @@
|
||||
# Lesson 01: What is Docker?
|
||||
|
||||
Before any commands, the mental model. Without it, the commands feel like incantations.
|
||||
|
||||
## The problem Docker solves
|
||||
|
||||
You want to run a piece of software — somebody else's web app, a database, a model server, a tool from a research paper. Normally, that means:
|
||||
|
||||
1. Install the right version of Python (or Node, or Java, or…).
|
||||
2. Install the right system libraries it depends on. Hope they don't conflict with the ones already on your machine.
|
||||
3. Set environment variables. Edit a config file. Maybe set up a database the app expects.
|
||||
4. Pray that the README was up to date.
|
||||
|
||||
Multiply that by every project you want to try, and your laptop becomes a tangled mess. Worse, when it finally works on your machine, it'll behave differently on a coworker's machine or on a server because *their* tangle is different from yours.
|
||||
|
||||
Docker fixes this by giving every piece of software its own little box — a **container** — that already has everything it needs inside it. The box runs on your machine, but its contents are isolated from the rest of your system. You can have ten of them running simultaneously, each with completely different Python versions, system libraries, even different Linux distributions inside, and none of them step on each other or on you.
|
||||
|
||||
## What a container actually is
|
||||
|
||||
A container is a process running on your computer, just like any other program — but the operating system has lied to that process about what's around it. From inside, the process sees:
|
||||
|
||||
- Its own filesystem (a tiny Linux install, or whatever the container's image provides)
|
||||
- Its own network interface
|
||||
- Its own process list (it can't see your other programs)
|
||||
|
||||
But it shares your computer's kernel, your CPU, and (if you ask) your network and folders. (On Mac and Windows, "your kernel" means the kernel of the small Linux VM that Docker Desktop runs for you.) That's why containers are dramatically lighter than virtual machines: a VM ships a whole operating system, kernel included, and runs it on top of yours. A container ships just the parts above the kernel and uses the host's. A VM typically takes tens of seconds to minutes to boot and gigabytes of disk. A container typically starts in a second or two and often needs only tens of megabytes.
|
||||
|
||||
```
|
||||
Heavy:                  Light:
┌─────────────┐         ┌──────────┐  ┌──────────┐
│ App         │         │ App A    │  │ App B    │
├─────────────┤         ├──────────┤  ├──────────┤
│ Libs        │         │ Libs A   │  │ Libs B   │
├─────────────┤         └────┬─────┘  └─────┬────┘
│ Guest OS    │  ← VM        │              │
├─────────────┤              └─── Docker ───┘
│ Hypervisor  │                      │
├─────────────┤               ┌──────┴───────┐
│ Host OS     │               │ Host OS      │
└─────────────┘               └──────────────┘
|
||||
```
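One way to feel the "lied to about its surroundings" idea is to run the same small script on your host and inside a container, then compare the output. The sketch below is an illustration, not part of the original lesson; it reads `/proc`, so on a non-Linux host it simply reports zero visible processes. Inside a container you would typically see a different hostname and only a handful of processes, with your own program often running as PID 1.

```python
import os
import socket


def visible_pids():
    """List process IDs visible through /proc (Linux only; empty elsewhere)."""
    if not os.path.isdir("/proc"):
        return []
    return sorted(int(name) for name in os.listdir("/proc") if name.isdigit())


pids = visible_pids()
print("hostname:", socket.gethostname())
print("my pid:", os.getpid())
print("processes I can see:", len(pids))
```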
|
||||
|
||||
## Free and open source
|
||||
|
||||
The pieces that actually run containers (`containerd`, `runc`, the Docker engine) are open source under permissive licenses. The format containers use (OCI, the Open Container Initiative spec) is an open standard, so images built with Docker also run on alternative engines such as Podman, and on the container runtimes Kubernetes uses. You are not locked into anyone's product.
|
||||
|
||||
There is a company called Docker, Inc. that makes a commercial GUI app (Docker Desktop) and a hosted image registry (Docker Hub). You can use both for free for personal projects, or you can skip both entirely and run pure open-source Docker on Linux pushing to your own registry. Nothing in this primer requires you to pay anyone.
|
||||
|
||||
## Why it's a good toolbox item, even if you're not "deploying" anything
|
||||
|
||||
The same idea — "describe an environment in a file, run it in a box" — works at every scale:
|
||||
|
||||
- **Single command on your laptop.** `docker run somebody/cool-thing` and a moment later it's running. No system pollution.
|
||||
- **A whole project's worth of services.** One `docker-compose.yml` file describes a database, a backend, a worker, a frontend — and `docker compose up` brings the whole thing online. Stop it with one command. Delete it with one command.
|
||||
- **Cloud deployment.** The same image you ran locally can be pushed to a registry and pulled down on a server. The behavior is identical.
|
||||
- **Industrial scale.** Kubernetes, which runs a large share of modern internet services, schedules containers. The container you build today is, structurally, the same artifact those systems orchestrate.
|
||||
|
||||
You don't have to care about all of those. You can stop at "I can run somebody's cool GitHub project without breaking my laptop" and you've already gotten huge value from Docker.
|
||||
|
||||
## Why it supersedes per-language environments
|
||||
|
||||
If you've used `venv`, `conda`, `pyenv`, `pipenv`, `poetry`, `rbenv`, `nvm`, or `asdf` — Docker is doing the same job (isolation) but at the operating-system layer instead of the language layer. That has two big advantages:
|
||||
|
||||
1. **It's language-agnostic.** A Python project, a Rust binary, and a PostgreSQL database all use the same isolation mechanism. You learn it once.
|
||||
2. **It captures *system* dependencies too.** Many Python projects need a system library (`ffmpeg`, `libpq`, the CUDA toolkit, etc.). A `venv` can't install those. A Dockerfile can.
|
||||
|
||||
This doesn't mean you should never use `venv` again — for a one-file script, `venv` is fine. But the moment a project starts needing more than one language, more than one service, or system-level dependencies, Docker becomes the easier path.
|
||||
|
||||
## The two things you'll keep hearing
|
||||
|
||||
Two words, often used loosely, that mean very specific things:
|
||||
|
||||
- **Image** — the recipe. A frozen, read-only snapshot of a filesystem plus instructions for what to run when it starts. Images have names like `python:3.12-slim` or `postgres:16`.
|
||||
- **Container** — a running instance of an image. You can have many containers from the same image, like running the same `.exe` twice.
|
||||
|
||||
Lesson 04 belabors this distinction, because confusing the two is the most common stumbling block when starting out.
|
||||
|
||||
## Try it yourself
|
||||
|
||||
Before doing anything else: do you have Docker installed? If not, see [`00_installing-docker.md`](00_installing-docker.md). Otherwise run:
|
||||
|
||||
```bash
|
||||
docker --version
|
||||
docker compose version
|
||||
```
|
||||
|
||||
If both print versions, you're ready for [`02_hello_world.md`](02_hello_world.md).
|
||||
87
reference/docker/02_hello_world.md
Normal file
@ -0,0 +1,87 @@
|
||||
# Lesson 02: Hello World
|
||||
|
||||
Yes — Docker has a `hello-world` image, just like programming languages have a "hello world" program. Running it is the canonical first step.
|
||||
|
||||
Open a terminal and run:
|
||||
|
||||
```bash
|
||||
docker run hello-world
|
||||
```
|
||||
|
||||
You should see something like:
|
||||
|
||||
```
|
||||
Unable to find image 'hello-world:latest' locally
|
||||
latest: Pulling from library/hello-world
|
||||
...
|
||||
Status: Downloaded newer image for hello-world:latest
|
||||
|
||||
Hello from Docker!
|
||||
This message shows that your installation appears to be working correctly.
|
||||
|
||||
To generate this message, Docker took the following steps:
|
||||
1. The Docker client contacted the Docker daemon.
|
||||
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
|
||||
3. The Docker daemon ran a new container from that image, which runs the
|
||||
executable that produces the output you are currently reading.
|
||||
4. The Docker daemon streamed that output to the Docker client, which
|
||||
sent it to your terminal.
|
||||
...
|
||||
```
|
||||
|
||||
## What just happened
|
||||
|
||||
`docker run hello-world` is a small command doing a surprising amount of work:
|
||||
|
||||
1. You asked Docker for an image named `hello-world`.
|
||||
2. Docker checked your local image cache. Not there.
|
||||
3. Docker reached out to **Docker Hub** — a public registry of images — and downloaded it.
|
||||
4. Docker created a new container from that image and started the program inside.
|
||||
5. The program printed its message and exited.
|
||||
6. The container stopped (but still exists; we'll deal with that in lesson 04).
|
||||
|
||||
The whole thing took a few seconds and didn't install anything onto your system the way a normal program would. The `hello-world` "program" lives entirely inside the image's filesystem.
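The six steps above can be modelled as a toy function. This is a deliberately simplified sketch, not how Docker is implemented: the `registry` and `cache` dictionaries are stand-ins for Docker Hub and your local image store, and the strings are invented for illustration.

```python
def run(image, local_cache, registry):
    """Toy model of `docker run`: check the cache, pull if missing, then start."""
    steps = []
    if image not in local_cache:
        steps.append(f"pull {image} from registry")
        local_cache[image] = registry[image]  # KeyError if the image is unknown
    else:
        steps.append(f"use cached {image}")
    steps.append(f"create container from {image}")
    steps.append(f"start: {local_cache[image]}")
    steps.append("container exits (still listed by docker ps -a)")
    return steps


registry = {"hello-world": "print welcome message"}  # stand-in for Docker Hub
cache = {}                                           # stand-in for local images
first = run("hello-world", cache, registry)
second = run("hello-world", cache, registry)
print(first[0])
print(second[0])
```

The first call has to pull; the second finds the image in the cache and skips straight to creating the container.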
|
||||
|
||||
## The shape of every `docker run` command
|
||||
|
||||
```bash
|
||||
docker run [options] IMAGE [command and arguments]
|
||||
```
|
||||
|
||||
- `IMAGE` is the name of the image to run (`hello-world`, `ubuntu`, `python:3.12`, etc.).
|
||||
- `[options]` go *before* the image name. (Order matters; this catches everyone at first.)
|
||||
- `[command and arguments]` go *after* the image, and override the image's default command.
|
||||
|
||||
For `hello-world` the image's default command is "print the welcome message and exit", and we didn't override it.
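If you ever drive Docker from a script, the same ordering rule applies. The helper below is a hypothetical sketch (the name `docker_run_argv` is ours, not part of Docker) that assembles the argument list in the order the shape above requires.

```python
def docker_run_argv(image, options=(), command=()):
    """Assemble `docker run [options] IMAGE [command and arguments]` as a list."""
    # Options must come before the image; anything after it goes to the container.
    return ["docker", "run", *options, image, *command]


print(docker_run_argv("hello-world"))
print(docker_run_argv("ubuntu", options=["-it"], command=["bash"]))
```

A list like this can be handed to Python's `subprocess.run` as-is. Building it as a list rather than one long string avoids quoting mistakes.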
|
||||
|
||||
## Run it again
|
||||
|
||||
```bash
|
||||
docker run hello-world
|
||||
```
|
||||
|
||||
This time it skips the download — the image is already in your local cache — and just runs the container. Almost instant.
|
||||
|
||||
## A first peek at what's there
|
||||
|
||||
Docker stores every image it has downloaded. List them:
|
||||
|
||||
```bash
|
||||
docker images
|
||||
```
|
||||
|
||||
You should see `hello-world` with a size of around 14 kilobytes. That's the entire program plus its filesystem. (For comparison, an installed copy of, say, Python 3 takes hundreds of megabytes.)
|
||||
|
||||
You can also list every container that's ever existed, even stopped ones:
|
||||
|
||||
```bash
|
||||
docker ps -a
|
||||
```
|
||||
|
||||
You'll see two entries — one for each time you ran `hello-world`. They both ran for a fraction of a second and then exited. Don't worry about them; lesson 04 covers cleaning up.
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Run `docker run hello-world` a third time. Notice how fast it is now that the image is cached.
|
||||
2. Run `docker images` and find the row for `hello-world`. The `IMAGE ID` column shows a shortened content hash, which is how Docker uniquely identifies images under the hood.
|
||||
3. Move on to [`03_running_containers.md`](03_running_containers.md), where we'll run something more interesting than a "hello".
|
||||
115
reference/docker/03_running_containers.md
Normal file
@ -0,0 +1,115 @@
|
||||
# Lesson 03: Running Containers (and Pretending You're on a Different OS)
|
||||
|
||||
Now we run something with substance. This lesson is mostly about feeling, in your hands, what container isolation actually means.
|
||||
|
||||
## Step into Ubuntu — even if you're on a Mac or Windows
|
||||
|
||||
```bash
|
||||
docker run -it ubuntu bash
|
||||
```
|
||||
|
||||
The new flags:
|
||||
|
||||
- `-i` — keep STDIN open so you can type into the container.
|
||||
- `-t` — allocate a pseudo-TTY so the terminal behaves like a real shell.
|
||||
- (`-it` is just `-i` and `-t` stuck together — you'll see this everywhere.)
|
||||
|
||||
`ubuntu` is the image name. `bash` is the command to run inside, overriding the image's default.
|
||||
|
||||
After a brief download, your prompt changes to something like:
|
||||
|
||||
```
|
||||
root@a1b2c3d4e5f6:/#
|
||||
```
|
||||
|
||||
You are now in a shell *inside* an Ubuntu container. Try a few commands:
|
||||
|
||||
```bash
|
||||
cat /etc/os-release # confirm it really is Ubuntu
|
||||
ls / # the container's root filesystem
|
||||
whoami # root, by default, inside a container
|
||||
apt-get update # this Ubuntu's package manager (works even on a Mac)
|
||||
apt-get install -y cowsay # install something
|
||||
/usr/games/cowsay "I am running inside Docker"
|
||||
```
|
||||
|
||||
You just installed `cowsay` into an Ubuntu system on your machine without touching your real machine at all. To leave:
|
||||
|
||||
```bash
|
||||
exit
|
||||
```
|
||||
|
||||
You're back on your real machine. None of those packages are installed on your host. The container is still around (stopped, but present) — we'll clean it up in the next lesson.
|
||||
|
||||
## Now try Alpine — a totally different distro
|
||||
|
||||
```bash
|
||||
docker run -it alpine sh
|
||||
```
|
||||
|
||||
Alpine is a tiny Linux distro often used for containers (the image is around 5MB). It uses `sh` instead of `bash` and `apk` instead of `apt-get`:
|
||||
|
||||
```sh
|
||||
cat /etc/os-release # now it's Alpine
|
||||
apk add --no-cache cowsay
|
||||
cowsay "Different OS, same laptop"
|
||||
exit
|
||||
```
|
||||
|
||||
Two completely different Linux distributions, one after the other, on the same physical machine, and neither one installed anything on your host.
|
||||
|
||||
## Now Python, with no Python installed on your host
|
||||
|
||||
```bash
|
||||
docker run -it python:3.12 python
|
||||
```
|
||||
|
||||
You're dropped into a Python 3.12 REPL — running inside a container that has Python 3.12 installed, even if your real machine has Python 3.10, Python 2, or no Python at all. Type:
|
||||
|
||||
```python
|
||||
import sys
|
||||
sys.version
|
||||
exit()
|
||||
```
|
||||
|
||||
You just used Python 3.12 without installing it.
|
||||
|
||||
## The `:tag` part
|
||||
|
||||
Notice `python:3.12`. The part after the colon is the **tag**, usually a version. If you leave it off (`python`), Docker assumes `:latest`, which by convention is whatever the maintainers currently publish as their default (it is just a tag name, not a guarantee of the newest version). You should usually pin a version explicitly so your work is reproducible.
|
||||
|
||||
Some examples of tags you'll see:
|
||||
|
||||
- `python:3.12` — Python 3.12 on a default base.
|
||||
- `python:3.12-slim` — same Python, on a much smaller Debian base. Faster to download, smaller disk footprint.
|
||||
- `python:3.12-alpine` — same Python, on the tiny Alpine base. Even smaller. Occasionally causes issues with packages that expect a glibc-based system.
|
||||
- `ubuntu:22.04`, `ubuntu:24.04`, `ubuntu:latest` — Ubuntu by version number.
|
||||
- `postgres:16`, `redis:7`, `nginx:1.27` — pick the major version you want.
|
||||
|
||||
When in doubt, look up the image on Docker Hub (https://hub.docker.com/) and read its tag list.
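The "missing tag means `:latest`" rule can be written down as a small parser. This is a simplified sketch under stated assumptions: `split_image_ref` is a made-up name, and it does not handle digest references of the form `name@sha256:...`.

```python
def split_image_ref(ref):
    """Split an image reference into (name, tag), defaulting the tag to 'latest'."""
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    # A colon only marks a tag if it comes after the last slash; otherwise it
    # belongs to a registry host such as localhost:5000.
    if last_colon > last_slash:
        return ref[:last_colon], ref[last_colon + 1:]
    return ref, "latest"


for ref in ["python", "python:3.12-slim", "localhost:5000/app"]:
    print(ref, "->", split_image_ref(ref))
```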
|
||||
|
||||
## Run a container in the background
|
||||
|
||||
Some containers (web servers, databases) are meant to keep running. Use `-d` (detached) to start one in the background:
|
||||
|
||||
```bash
|
||||
docker run -d --name webserver nginx
|
||||
```
|
||||
|
||||
`--name webserver` gives the container a memorable name instead of a random one like `peaceful_einstein`. Visit `http://localhost` … wait, you can't yet. We haven't told Docker to expose its port. That's lesson 07. For now, just check it's running:
|
||||
|
||||
```bash
|
||||
docker ps
|
||||
```
|
||||
|
||||
You should see one running container. Stop it:
|
||||
|
||||
```bash
|
||||
docker stop webserver
|
||||
```
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Run an Ubuntu container and a Debian container side-by-side in two terminal windows. Note that `cat /etc/os-release` shows different distros in each, even though they share your machine.
|
||||
2. Run `docker run -it python:3.11 python` and then `docker run -it python:3.12 python`. Two different Python versions, on the same laptop, in the same minute, without installing anything.
|
||||
3. Move on to [`04_images_vs_containers.md`](04_images_vs_containers.md) — by now you've created several containers and you may be wondering where they all went.
|
||||
130
reference/docker/04_images_vs_containers.md
Normal file
@ -0,0 +1,130 @@
|
||||
# Lesson 04: Images vs. Containers
|
||||
|
||||
This is the distinction that separates "I'm fumbling" from "I get it." Internalize this and most of Docker stops feeling magical.
|
||||
|
||||
## The recipe vs. the cake
|
||||
|
||||
- **Image** — a frozen, read-only snapshot of a filesystem with a default command attached. The "recipe." Doesn't run, doesn't change. Examples: `ubuntu:24.04`, `python:3.12-slim`, `postgres:16`.
|
||||
- **Container** — a running (or stopped) instance of an image. The "cake." Each container is its own thing, with its own writable layer on top of the image's read-only one.
|
||||
|
||||
You can make many containers from one image, the same way you can bake many cakes from one recipe. Each container changes independently. None of them changes the image.
|
||||
|
||||
```
|
||||
IMAGE: python:3.12 (read-only — like a class)
|
||||
│
|
||||
├─ CONTAINER A (a running instance — like an object)
|
||||
├─ CONTAINER B (another instance, unaffected by A)
|
||||
└─ CONTAINER C
|
||||
```
|
||||
|
||||
If you're a programmer: an image is like a class, a container is like an instance of that class. Multiple instances; each has its own state.
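Taking the class/instance analogy literally, here is a toy model in Python. It is an illustration only: real Docker uses copy-on-write filesystem layers rather than dictionaries, and the `Image` and `Container` classes below are invented for this sketch.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass(frozen=True)
class Image:
    name: str
    files: MappingProxyType  # read-only view of the image's filesystem


@dataclass
class Container:
    image: Image
    writable_layer: dict = field(default_factory=dict)

    def write(self, path, content):
        # Changes land in this container's own layer, never in the image.
        self.writable_layer[path] = content

    def read(self, path):
        # The container's own changes shadow the image's files.
        if path in self.writable_layer:
            return self.writable_layer[path]
        return self.image.files[path]


image = Image("python:3.12", MappingProxyType({"/etc/motd": "hello"}))
a = Container(image)
b = Container(image)
a.write("/etc/motd", "changed in A")
print(a.read("/etc/motd"))
print(b.read("/etc/motd"))
```

Container A sees its own change, container B still sees the image's original file, and the image itself is untouched.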
|
||||
|
||||
## Inspect what you have
|
||||
|
||||
```bash
|
||||
docker images
|
||||
```
|
||||
|
||||
Lists every image on your machine. Columns: repository name, tag, image ID, age, size.
|
||||
|
||||
```bash
|
||||
docker ps
|
||||
```
|
||||
|
||||
Lists **running** containers only.
|
||||
|
||||
```bash
|
||||
docker ps -a
|
||||
```
|
||||
|
||||
Lists **all** containers — running, stopped, exited, whatever. By now you probably have a pile of stopped containers from the previous lessons.
|
||||
|
||||
Each container has a name (random or chosen with `--name`) and an ID (a long hex string). You can refer to a container by either; the first few characters of the ID usually suffice.
|
||||
|
||||
## Stop, start, restart
|
||||
|
||||
```bash
|
||||
docker stop <name-or-id> # send SIGTERM, then SIGKILL after 10s
|
||||
docker start <name-or-id> # start a stopped container again
|
||||
docker restart <name-or-id> # stop and start
|
||||
```
|
||||
|
||||
A stopped container still exists on disk, and its filesystem state is preserved. Starting it again does not resume the old process where it left off: the container's command runs again from the beginning, but any files changed in the writable layer persist until you remove the container.
|
||||
|
||||
## Get rid of containers
|
||||
|
||||
```bash
|
||||
docker rm <name-or-id> # remove a stopped container
|
||||
docker rm -f <name-or-id> # force: stop and remove in one shot
|
||||
```
|
||||
|
||||
To clean up *all* stopped containers at once:
|
||||
|
||||
```bash
|
||||
docker container prune
|
||||
```
|
||||
|
||||
It will ask for confirmation. This frees disk and de-clutters `docker ps -a`. It does not delete images.
|
||||
|
||||
## Get rid of images
|
||||
|
||||
```bash
|
||||
docker rmi <name-or-id> # remove an image
|
||||
```
|
||||
|
||||
You can't remove an image while a container (running or stopped) is using it. Remove the containers first, or use `docker rmi -f` to force removal when the only containers using it are stopped.
|
||||
|
||||
To clean up all images that aren't used by any container:
|
||||
|
||||
```bash
|
||||
docker image prune -a
|
||||
```
|
||||
|
||||
Again, asks for confirmation. Saves a lot of disk on a machine where you've been experimenting.
|
||||
|
||||
## The four-step lifecycle
|
||||
|
||||
This is essentially all of Docker's container model:
|
||||
|
||||
```
|
||||
docker pull           docker run            docker stop           docker rm
────────────►         ────────────►         ────────────►         ────────────►
image cached          container running     container stopped     container gone
(or built locally)                          (still on disk)
|
||||
```
|
||||
|
||||
`docker run` is actually two steps: create a container, then start it. Most of the time you don't care, but it's why you'll occasionally see `docker create` followed by `docker start` in scripts.
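The lifecycle, including the create-then-start split, can be sketched as a small state machine. The state names and transition table below are a simplified model chosen for this example; Docker itself tracks more states (paused, restarting, and so on), and "removed" here just means the container no longer exists.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    EXITED = "exited"
    REMOVED = "removed"


# Allowed transitions in this simplified model, keyed by (state, command).
TRANSITIONS = {
    (State.CREATED, "start"): State.RUNNING,
    (State.RUNNING, "stop"): State.EXITED,
    (State.EXITED, "start"): State.RUNNING,
    (State.CREATED, "rm"): State.REMOVED,
    (State.EXITED, "rm"): State.REMOVED,
}


def apply(state, command):
    """Return the next state, or raise if the command is not allowed here."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"cannot {command} a container that is {state.value}") from None


def run():
    """`docker run` modelled as create followed by start."""
    return apply(State.CREATED, "start")


state = run()
for command in ["stop", "start", "stop", "rm"]:
    state = apply(state, command)
print(state.value)
```

In this model `rm` on a running container is rejected, which mirrors why plain `docker rm` needs the container stopped first (or the `-f` flag).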
|
||||
|
||||
## A few extra moves you'll want
|
||||
|
||||
Run a command inside a *running* container (useful for debugging):
|
||||
|
||||
```bash
|
||||
docker exec -it <name-or-id> bash
|
||||
```
|
||||
|
||||
This drops you into a shell inside an already-running container. Different from `docker run`, which starts a *new* container.
|
||||
|
||||
Stream the logs of a running container:
|
||||
|
||||
```bash
|
||||
docker logs -f <name-or-id>
|
||||
```
|
||||
|
||||
`-f` means "follow" — like `tail -f`. Press `Ctrl+C` to stop following (the container keeps running).
|
||||
|
||||
Inspect details:
|
||||
|
||||
```bash
|
||||
docker inspect <name-or-id>
|
||||
```
|
||||
|
||||
Dumps a giant JSON blob with everything Docker knows about a container or image. Most of the time you don't need it; when you do, it's there.
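Because the output is JSON, a few lines of Python can pull out just the fields you care about. The sample below is a trimmed, made-up stand-in shaped like `docker inspect` output for a container; real output has many more fields, and image output is laid out differently. The `summarize` helper is hypothetical.

```python
import json

# Trimmed, illustrative sample; the values are invented for this example.
sample = """
[
  {
    "Id": "a1b2c3d4e5f6",
    "Name": "/webserver",
    "State": {"Status": "exited", "Running": false},
    "Config": {"Image": "nginx"}
  }
]
"""


def summarize(inspect_json):
    """Pull a few commonly useful fields out of container inspect JSON."""
    summaries = []
    for entry in json.loads(inspect_json):
        summaries.append({
            "name": entry["Name"].lstrip("/"),  # names carry a leading slash
            "image": entry["Config"]["Image"],
            "status": entry["State"]["Status"],
        })
    return summaries


print(summarize(sample))
```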
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Run `docker ps -a` and count how many stopped containers you've accumulated from lessons 02 and 03.
|
||||
2. Pick one of the stopped Ubuntu containers and run `docker start -ai <name>` to wake it back up and reattach. You'll find yourself back inside that container's filesystem, with whatever changes you made still there.
|
||||
3. Now run `docker container prune` to clean them all up. (Confirm with `y`.) Then `docker ps -a` should be empty (or nearly so).
|
||||
4. Run `docker images` to see what's left. Even with no containers, the *images* are still there, ready for the next `docker run`.
|
||||
5. Move on to [`05_dockerfiles.md`](05_dockerfiles.md) where we'll build our own image instead of using other people's.
|
||||
151
reference/docker/05_dockerfiles.md
Normal file
@ -0,0 +1,151 @@
|
||||
# Lesson 05: Writing a Dockerfile
|
||||
|
||||
So far we've used images other people made. Now we'll make our own. The recipe is just a text file called `Dockerfile`.
|
||||
|
||||
## The smallest useful Dockerfile
|
||||
|
||||
Make a new folder somewhere on your machine. We'll call it `my-first-image/`.
|
||||
|
||||
```bash
|
||||
mkdir my-first-image
|
||||
cd my-first-image
|
||||
```
|
||||
|
||||
Inside it, create two files.
|
||||
|
||||
**`app.py`** — a tiny Python program:
|
||||
|
||||
```python
|
||||
print("Hello from a container I built myself!")
|
||||
```
|
||||
|
||||
**`Dockerfile`** — no file extension, capital D, exactly that name:
|
||||
|
||||
```dockerfile
|
||||
FROM python:3.12-slim
|
||||
WORKDIR /app
|
||||
COPY app.py .
|
||||
CMD ["python", "app.py"]
|
||||
```
|
||||
|
||||
Four lines, four instructions. Reading top to bottom:
|
||||
|
||||
- `FROM python:3.12-slim` — start from an existing image. Our image inherits everything from `python:3.12-slim`. We rarely build images from scratch; we almost always start from someone else's base.
|
||||
- `WORKDIR /app` — set the working directory inside the image to `/app`. Like `cd`-ing into a folder. Creates it if it doesn't exist.
|
||||
- `COPY app.py .` — copy `app.py` from your folder (the "build context") into the image's `/app/` directory.
|
||||
- `CMD ["python", "app.py"]` — set the default command. This is what runs when someone does `docker run` without giving their own command.
|
||||
|
||||
## Build the image
|
||||
|
||||
```bash
|
||||
docker build -t my-first-image .
|
||||
```
|
||||
|
||||
Breaking that down:
|
||||
|
||||
- `docker build` — build an image.
|
||||
- `-t my-first-image` — tag (name) it `my-first-image`.
|
||||
- `.` — use the current directory as the build context (this is where Docker looks for the Dockerfile and any files you `COPY`).
|
||||
|
||||
You'll see Docker work through the Dockerfile step by step. When it's done:
|
||||
|
||||
```bash
|
||||
docker images
|
||||
```
|
||||
|
||||
You'll find `my-first-image` in the list.
|
||||
|
||||
## Run it
|
||||
|
||||
```bash
|
||||
docker run my-first-image
|
||||
```
|
||||
|
||||
Output:
|
||||
|
||||
```
|
||||
Hello from a container I built myself!
|
||||
```
|
||||
|
||||
The container ran your script and exited. The image is yours now — you can hand it to someone else, push it to a registry (lesson 08), or run it on any machine with Docker installed.
|
||||
|
||||
## A more realistic Dockerfile
|
||||
|
||||
Most projects have dependencies. Let's say `app.py` uses the `requests` library.
|
||||
|
||||
**`app.py`**:
|
||||
|
||||
```python
|
||||
import requests
|
||||
r = requests.get("https://api.github.com")
|
||||
print("GitHub API status:", r.status_code)
|
||||
```
|
||||
|
||||
**`requirements.txt`**:
|
||||
|
||||
```
|
||||
requests
|
||||
```
|
||||
|
||||
**`Dockerfile`**:
|
||||
|
||||
```dockerfile
|
||||
FROM python:3.12-slim
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Copy requirements first and install them.
|
||||
# This step is cached separately from your source code,
|
||||
# so changing app.py doesn't reinstall dependencies.
|
||||
COPY requirements.txt .
|
||||
RUN pip install --no-cache-dir -r requirements.txt
|
||||
|
||||
# Now copy the rest of the source.
|
||||
COPY . .
|
||||
|
||||
CMD ["python", "app.py"]
|
||||
```
|
||||
|
||||
One new instruction and one ordering rule:
|
||||
|
||||
- `RUN <command>` — run a shell command *during the build*. The result becomes part of the image. Use this to install packages, compile code, set up directories.
|
||||
- The order matters for caching: copying `requirements.txt` and installing it *before* copying the rest of the source means that when you only edit `app.py`, Docker reuses the cached "dependencies installed" layer and just redoes the source copy. This is the single biggest performance trick in Dockerfiles.
|
||||
|
||||
Build and run:
|
||||
|
||||
```bash
|
||||
docker build -t my-app .
|
||||
docker run my-app
|
||||
```
|
||||
|
||||
You should see GitHub's API status code (probably `200`).
|
||||
|
||||
## The most common Dockerfile instructions
|
||||
|
||||
| Instruction | What it does |
|
||||
|-------------|--------------|
|
||||
| `FROM` | Base image to start from. Must be the first instruction (only `ARG` may precede it). |
|
||||
| `WORKDIR` | Set the working directory for subsequent steps. |
|
||||
| `COPY <src> <dest>` | Copy files from build context into the image. |
|
||||
| `RUN <cmd>` | Run a shell command at build time. Used for installs, etc. |
|
||||
| `ENV KEY=value` | Set an environment variable inside the image. |
|
||||
| `EXPOSE 8080` | Document that the container listens on a port. (Doesn't actually open it — that's `-p` at run time.) |
|
||||
| `CMD ["python", "app.py"]` | Default command when the container starts. |
|
||||
| `ENTRYPOINT ["…"]` | Like `CMD` but harder to override. Use `CMD` until you need `ENTRYPOINT`. |
|
||||
|
||||
## Look at a real Dockerfile
|
||||
|
||||
Open [`../../examples/image_meaning_db/backend/Dockerfile`](../../examples/image_meaning_db/backend/Dockerfile). It's not much bigger than what we just wrote. Real-world Dockerfiles are usually under 30 lines.
|
||||
|
||||
## Layers (a useful 60-second mental model)
|
||||
|
||||
Each instruction that changes the filesystem (`RUN`, `COPY`, `ADD`) creates a "layer", basically a diff on top of the previous one; the other instructions only record metadata. Every step is cached individually. If you change `app.py` and rebuild, Docker reuses every cached step up to the `COPY . .` step, then redoes only that and anything after.
|
||||
|
||||
This is why people write Dockerfiles in a specific order: things that change least frequently (base image, system packages) go near the top. Things that change most (your source code) go near the bottom. Get this right and your rebuilds are seconds instead of minutes.
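
The caching rule above can be sketched as a chain of hashes. This is a toy model, not Docker's actual algorithm: the function names `layer_keys` and `rebuilt_steps` are made up for illustration, and real builders track more than file contents. It assumes each step's cache key depends on the previous key, the instruction text, and (for `COPY`) the copied files.

```python
import hashlib


def layer_keys(instructions, files):
    """Return one cache key per instruction (toy model of layer caching)."""
    keys = []
    previous = ""
    for instruction in instructions:
        h = hashlib.sha256()
        h.update(previous.encode())      # a step depends on everything before it
        h.update(instruction.encode())
        if instruction.startswith("COPY "):
            source = instruction.split()[1]
            # "COPY . ." copies every file in the build context.
            names = sorted(files) if source == "." else [source]
            for name in names:
                h.update(name.encode())
                h.update(files[name].encode())
        previous = h.hexdigest()
        keys.append(previous)
    return keys


def rebuilt_steps(instructions, old_files, new_files):
    """List the instructions whose cache key changed between two builds."""
    old = layer_keys(instructions, old_files)
    new = layer_keys(instructions, new_files)
    return [ins for ins, a, b in zip(instructions, old, new) if a != b]


DOCKERFILE = [
    "FROM python:3.12-slim",
    "WORKDIR /app",
    "COPY requirements.txt .",
    "RUN pip install --no-cache-dir -r requirements.txt",
    "COPY . .",
    'CMD ["python", "app.py"]',
]

before = {"app.py": "print('v1')", "requirements.txt": "requests"}
code_edit = {"app.py": "print('v2')", "requirements.txt": "requests"}
deps_edit = {"app.py": "print('v1')", "requirements.txt": "requests\nrich"}

# Editing only app.py invalidates "COPY . ." and what follows it.
print(rebuilt_steps(DOCKERFILE, before, code_edit))
# Editing requirements.txt invalidates everything from "COPY requirements.txt ." on.
print(rebuilt_steps(DOCKERFILE, before, deps_edit))
```

In this model the expensive `RUN pip install` step only appears in the second list, which is the whole point of copying `requirements.txt` separately.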
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Write a Dockerfile for a script that uses two libraries (e.g., `requests` and `rich`) and prints something fancy. Build and run.
|
||||
2. Edit just the script (not `requirements.txt`) and rebuild. Notice that Docker reuses the cached pip-install layer.
|
||||
3. Now edit `requirements.txt` and rebuild. Notice that everything from the `COPY requirements.txt .` step onward now re-runs, including the pip install.
|
||||
4. Move on to [`06_volumes_and_persistence.md`](06_volumes_and_persistence.md): your containers can create files, but those files vanish when the container is removed. Time to fix that.
|
||||
141
reference/docker/06_volumes_and_persistence.md
Normal file
@ -0,0 +1,141 @@
|
||||
# Lesson 06: Volumes and Persistence
|
||||
|
||||
By default, containers are **ephemeral**. Whatever a container writes to its own filesystem disappears the moment that container is removed. That's a feature, not a bug — it's what keeps containers clean — but it's a surprise the first time you run a database in a container and then `docker rm` it.
|
||||
|
||||
## See the problem
|
||||
|
||||
```bash
|
||||
docker run -it --name scratch ubuntu bash
|
||||
```
|
||||
|
||||
Inside:
|
||||
|
||||
```bash
|
||||
echo "important data" > /important.txt
|
||||
cat /important.txt
|
||||
exit
|
||||
```
|
||||
|
||||
Now remove the container and try again with a fresh one:
|
||||
|
||||
```bash
|
||||
docker rm scratch
docker run -it --name scratch ubuntu bash
# The next two commands run inside the new container:
cat /important.txt      # No such file or directory
exit
# Back on the host:
docker rm scratch
|
||||
```
|
||||
|
||||
The file existed only inside the old container's writable layer. New container, new layer, no file.
|
||||
|
||||
## Two ways to persist data
|
||||
|
||||
Both are called "mounts" — they connect a path *inside* the container to something that survives the container.
|
||||
|
||||
### 1. Bind mounts — connect to a folder on your host
|
||||
|
||||
A bind mount maps a folder on your real machine to a folder inside the container. Changes show up on both sides instantly.
|
||||
|
||||
```bash
|
||||
mkdir ~/docker-data
|
||||
docker run -it --name scratch \
|
||||
-v ~/docker-data:/data \
|
||||
ubuntu bash
|
||||
```
|
||||
|
||||
The flag `-v <host-path>:<container-path>` is the mount. Inside the container:
|
||||
|
||||
```bash
|
||||
echo "important data" > /data/important.txt
|
||||
exit
|
||||
```
|
||||
|
||||
Back on your host:
|
||||
|
||||
```bash
|
||||
cat ~/docker-data/important.txt
|
||||
```
|
||||
|
||||
The file is there, on your real machine. Remove and re-create the container — the file is still there because it never lived inside the container in the first place.
|
||||
|
||||
Bind mounts are great for:
|
||||
|
||||
- **Development** — mount your source code into the container so edits on your laptop are picked up immediately.
|
||||
- **Configs** — point the container at a config file you maintain on your host.
|
||||
- **"I need to see the files"** — folders you want to open in your file manager.
|
||||
|
||||
> On Windows, host paths look like `C:\Users\you\docker-data` or, in PowerShell, `${PWD}\docker-data`. On Mac/Linux, `~/docker-data` or `$(pwd)/docker-data` works.
|
||||
|
||||
### 2. Named volumes — Docker-managed storage
|
||||
|
||||
A named volume lives somewhere Docker chooses (you don't need to care where), and you refer to it by name.
|
||||
|
||||
```bash
|
||||
docker volume create mydata
|
||||
docker run -it --name scratch \
|
||||
-v mydata:/data \
|
||||
ubuntu bash
|
||||
```
|
||||
|
||||
Inside the container, write to `/data` as before, exit, remove the container, run a fresh one with the same `-v mydata:/data`, and your files are still there.
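
From the application's side, a mounted path is just a directory. The sketch below is a hypothetical helper (`record_run` and the file name `runs.txt` are assumptions, not part of the workshop code) that appends one line per run. A temporary directory stands in for the volume so the example runs anywhere; inside a container you would pass `/data` instead.

```python
import tempfile
from pathlib import Path


def record_run(data_dir):
    """Append one line to runs.txt in data_dir and return the total line count."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    log = data_dir / "runs.txt"
    with log.open("a", encoding="utf-8") as f:
        f.write("run\n")
    return len(log.read_text(encoding="utf-8").splitlines())


# The temporary directory plays the role of the volume mounted at /data.
with tempfile.TemporaryDirectory() as volume:
    print(record_run(volume))  # first "container" -> 1
    print(record_run(volume))  # fresh "container", same volume -> 2
```

The count keeps growing only because both calls see the same directory. Point the second call at a different directory (a container without the mount) and it starts again at 1.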
|
||||
|
||||
Named volumes are better than bind mounts when:
|
||||
|
||||
- You don't care where on disk the data lives — you just want it to *persist*.
|
||||
- The data is "the database's data" or "the model's cache" — internal to the app, not something a human is going to open.
|
||||
- You're running on a server and don't want to commit to specific host paths.
|
||||
|
||||
List your volumes:
|
||||
|
||||
```bash
|
||||
docker volume ls
|
||||
```
|
||||
|
||||
Remove a volume (only when you're sure):
|
||||
|
||||
```bash
|
||||
docker volume rm mydata
|
||||
```
|
||||
|
||||
## A realistic example: persisting a database
|
||||
|
||||
```bash
|
||||
docker run -d \
|
||||
--name pg \
|
||||
-e POSTGRES_PASSWORD=secret \
|
||||
-v pgdata:/var/lib/postgresql/data \
|
||||
postgres:16
|
||||
```
|
||||
|
||||
Run that, connect, create a database, write some rows, then:
|
||||
|
||||
```bash
|
||||
docker stop pg
|
||||
docker rm pg
|
||||
```
|
||||
|
||||
Now run it again with the **same** `-v pgdata:/var/lib/postgresql/data`. Your database is still there. The volume outlived the container.
|
||||
|
||||
Without that volume, removing the container would have erased everything.
|
||||
|
||||
## The danger to know about
|
||||
|
||||
When you remove a stopped container with `docker rm`, the volume *survives*. Good.
|
||||
|
||||
When you bring down a Compose stack (lesson 09) with `docker compose down`, volumes *survive*. Good.
|
||||
|
||||
When you bring it down with `docker compose down -v`, the `-v` is "and also nuke the volumes." That command will silently delete your database data. **`down -v` is destructive. Use it deliberately.**
|
||||
|
||||
Likewise:
|
||||
|
||||
```bash
|
||||
docker volume prune
|
||||
```
|
||||
|
||||
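Removes volumes that no container references, whether that container is running or stopped. On Docker Engine 23.0 and later it only removes anonymous volumes by default; add `--all` to include named ones. Convenient for cleanup; catastrophic if you had already removed your database container (for example with `docker compose down`) and then pruned with `--all`.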
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Use a bind mount to share your current directory with a Python container: `docker run -it -v "$(pwd)":/work -w /work python:3.12 bash`. Then `ls /work` inside the container — you should see the same files as on your host.
|
||||
2. Edit a file on your host (in any editor) while that container is running, then `cat` it from inside the container. The change is instant.
|
||||
3. Move on to [`07_ports_and_env.md`](07_ports_and_env.md) where we'll let containers talk to the outside world.
|
||||
125
reference/docker/07_ports_and_env.md
Normal file
@ -0,0 +1,125 @@
|
||||
# Lesson 07: Ports and Environment Variables
|
||||
|
||||
A container by itself is an island. Two things you'll almost always want:
|
||||
|
||||
1. To reach a server running *inside* the container from your browser or curl.
|
||||
2. To pass configuration (API keys, database URLs, settings) *into* the container without baking it into the image.
|
||||
|
||||
## Exposing a port: `-p`
|
||||
|
||||
Run a web server in a container:
|
||||
|
||||
```bash
|
||||
docker run -d --name web -p 8080:80 nginx
|
||||
```
|
||||
|
||||
The new flag:
|
||||
|
||||
- `-p 8080:80` — map host port `8080` to container port `80`. Anyone hitting `http://localhost:8080` on your machine lands on port `80` inside the container.
|
||||
|
||||
Open http://localhost:8080 in your browser. You should see the default nginx welcome page. That web server is running inside an isolated container, but you can talk to it over the network like any other local service.
|
||||
|
||||
Stop it:
|
||||
|
||||
```bash
|
||||
docker stop web
|
||||
docker rm web
|
||||
```
|
||||
|
||||
### Port-mapping rules of thumb
|
||||
|
||||
- Format: `-p HOST_PORT:CONTAINER_PORT`. The first number is what you'll type in your browser. The second is what the program inside the container listens on.
|
||||
- If two containers want port 80, you can't map both to host port 80. Use different host ports: `-p 8081:80` and `-p 8082:80`. Inside each container, both still see port 80.
|
||||
- For local-only access, you can bind to localhost: `-p 127.0.0.1:8080:80`. Now nothing outside your machine can reach it, even if your firewall is open.
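
The rules of thumb above can be sketched as a small parser. This is an illustration only: `parse_publish` and `clashing_host_ports` are hypothetical helpers, and the sketch handles just the two forms shown in this lesson (no port ranges, `/udp` suffixes, or IPv6 addresses).

```python
from collections import Counter


def parse_publish(spec):
    """Split a -p value into (host_ip, host_port, container_port)."""
    parts = spec.split(":")
    if len(parts) == 2:
        host_ip = "0.0.0.0"  # no IP given: publish on every host interface
        host_port, container_port = parts
    elif len(parts) == 3:
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"unsupported -p value: {spec!r}")
    return host_ip, int(host_port), int(container_port)


def clashing_host_ports(specs):
    """Return host ports requested more than once (ignores the host IP)."""
    counts = Counter(parse_publish(spec)[1] for spec in specs)
    return sorted(port for port, n in counts.items() if n > 1)


print(parse_publish("8080:80"))            # ('0.0.0.0', 8080, 80)
print(parse_publish("127.0.0.1:8080:80"))  # ('127.0.0.1', 8080, 80)
print(clashing_host_ports(["80:80", "80:80"]))      # [80]
print(clashing_host_ports(["8081:80", "8082:80"]))  # []
```

Note that the container port (80) can repeat freely; only the host side has to be unique.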
|
||||
|
||||
## Passing config: `-e`
|
||||
|
||||
The flag `-e KEY=value` sets an environment variable inside the container.
|
||||
|
||||
```bash
|
||||
docker run --rm \
|
||||
-e GREETING="Hello from the env" \
|
||||
ubuntu bash -c 'echo $GREETING'
|
||||
```
|
||||
|
||||
(`--rm` means "delete this container as soon as it exits" — handy for one-shot commands so you don't accumulate junk.)
|
||||
|
||||
Real-world example: Postgres expects a password via env var.
|
||||
|
||||
```bash
|
||||
docker run -d \
|
||||
--name pg \
|
||||
-e POSTGRES_PASSWORD=secret \
|
||||
-p 5432:5432 \
|
||||
postgres:16
|
||||
```
|
||||
|
||||
Now you have a Postgres server reachable at `localhost:5432`, with no installation footprint on your host.
|
||||
|
||||
## Env files: `--env-file`
|
||||
|
||||
If you have a lot of variables (or you don't want them in your shell history), put them in a file:
|
||||
|
||||
**`.env`**:
|
||||
|
||||
```
|
||||
POSTGRES_USER=workshop
|
||||
POSTGRES_PASSWORD=secret
|
||||
POSTGRES_DB=projects
|
||||
```
|
||||
|
||||
Then:
|
||||
|
||||
```bash
|
||||
docker run -d --name pg \
|
||||
--env-file .env \
|
||||
-p 5432:5432 \
|
||||
postgres:16
|
||||
```
|
||||
|
||||
> Don't commit `.env` files containing secrets to git. Add `.env` to your `.gitignore` and (if you want a checked-in example) commit `.env.example` with placeholder values.
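
To make the file format concrete, here is a rough approximation of how an env file is read. It is a sketch under assumptions, not Docker's parser: `parse_env_file` is a hypothetical name, and the rules modeled are that blank lines and `#` comments are skipped, `KEY=value` is taken literally (quotes are not stripped), and a bare `KEY` copies the value from the host environment if it is set there.

```python
def parse_env_file(text, host_env=None):
    """Approximate the KEY=value rules of an env file (simplified)."""
    host_env = {} if host_env is None else host_env
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.lstrip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            variables[key] = value  # quotes, if any, stay part of the value
        elif key in host_env:
            variables[key] = host_env[key]  # bare KEY: pass through from host
    return variables


example = """\
# database settings
POSTGRES_USER=workshop
POSTGRES_PASSWORD=secret
POSTGRES_DB=projects
"""
print(parse_env_file(example))
```

The practical takeaway is the quoting rule: writing `POSTGRES_PASSWORD="secret"` in the file would make the quotes part of the password.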
|
||||
|
||||
## A note on networking between containers
|
||||
|
||||
Two containers on the same user-defined Docker network can reach each other by name. Docker also creates a default `bridge` network automatically, but containers on it can only reach each other by IP address; for multi-container setups you'll usually create your own network (or let `docker compose` do it for you, lesson 09).
|
||||
|
||||
Quick taste — two containers on a shared network:
|
||||
|
||||
```bash
|
||||
docker network create mynet
|
||||
docker run -d --network mynet --name api nginx
|
||||
docker run --rm --network mynet curlimages/curl curl -s api
|
||||
```
|
||||
|
||||
The second container reached the first by the name `api`, no IP address required. Compose makes this automatic for you, which is why we recommend going there next.
|
||||
|
||||
Clean up:
|
||||
|
||||
```bash
|
||||
docker stop api && docker rm api
|
||||
docker network rm mynet
|
||||
```
|
||||
|
||||
## Putting it together
|
||||
|
||||
A single command that runs a backend container with everything plugged in:
|
||||
|
||||
```bash
|
||||
docker run -d \
|
||||
--name my-backend \
|
||||
-p 8080:8080 \
|
||||
-e DATABASE_URL=postgres://workshop:secret@db:5432/projects \
|
||||
-e LOG_LEVEL=info \
|
||||
-v ./uploads:/app/uploads \
|
||||
my-backend-image:1.0
|
||||
```
|
||||
|
||||
Read it top to bottom: in the background, named `my-backend`, port 8080 mapped, two env vars set, a folder mounted, using image `my-backend-image:1.0`.
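
On the receiving end, the program inside the container reads those values from its environment. A minimal sketch of what that might look like, assuming the backend is written in Python; `load_config` and the returned keys are hypothetical, since the workshop does not show the backend's code.

```python
import os
from urllib.parse import urlsplit


def load_config(env=None):
    """Read DATABASE_URL and LOG_LEVEL from the environment."""
    env = os.environ if env is None else env
    url = urlsplit(env["DATABASE_URL"])  # raises KeyError if it was never set
    return {
        "db_host": url.hostname,
        "db_port": url.port or 5432,      # fall back to Postgres's usual port
        "db_user": url.username,
        "db_name": url.path.lstrip("/"),
        "log_level": env.get("LOG_LEVEL", "info"),
    }


# Passing a dict here stands in for the variables set with -e.
config = load_config({
    "DATABASE_URL": "postgres://workshop:secret@db:5432/projects",
    "LOG_LEVEL": "info",
})
print(config["db_host"], config["db_port"], config["db_name"])  # db 5432 projects
```

Because the settings arrive through the environment, the same image can point at a different database just by changing the `-e` flags.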
|
||||
|
||||
This is the kind of `docker run` command that, once it grows past about four flags, gets annoying to retype every time. Which is the perfect motivation for `docker compose`, coming in lesson 09.
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Run nginx with `-p 8080:80` and visit it in your browser. Stop it and run it again with `-p 9000:80`. Same image, different host port.
|
||||
2. Run a Postgres container with `-e POSTGRES_PASSWORD=secret` and connect to it from your host with a database tool (DBeaver, `psql`, or `pgcli`). No Postgres installed on your machine, fully working server.
|
||||
3. Move on to [`08_registries.md`](08_registries.md) — how images get from one machine to another.
|
||||
111
reference/docker/08_registries.md
Normal file
@ -0,0 +1,111 @@
|
||||
# Lesson 08: Registries — Pulling and Pushing Images
|
||||
|
||||
A **registry** is a server that stores images. So far, every time you typed `docker run ubuntu`, Docker silently pulled the image from a registry — by default, **Docker Hub** (https://hub.docker.com/).
|
||||
|
||||
Registries are how an image you build on your laptop gets onto a server, or into a colleague's hands.
|
||||
|
||||
## Pulling explicitly
|
||||
|
||||
`docker run` pulls if needed, but you can also pull on its own:
|
||||
|
||||
```bash
|
||||
docker pull postgres:16
|
||||
```
|
||||
|
||||
Useful when you want to grab images ahead of time (before a flight, before a workshop, etc.) without running them.
|
||||
|
||||
## Where images live in image names
|
||||
|
||||
The full form of an image reference is:
|
||||
|
||||
```
|
||||
<registry-host>/<namespace>/<repository>:<tag>
|
||||
```
|
||||
|
||||
Some examples expanded:
|
||||
|
||||
- `ubuntu` → `docker.io/library/ubuntu:latest` (Docker Hub's "official" images live under the `library` namespace, and the registry/tag are filled in by default.)
|
||||
- `python:3.12-slim` → `docker.io/library/python:3.12-slim`.
|
||||
- `ghcr.io/myname/myproject:v1` → an image hosted on **GitHub Container Registry**, under your account, version `v1`.
|
||||
- `registry.example.com/team/service:1.4` → an image in some company's private registry.
|
||||
|
||||
If the registry isn't specified, Docker assumes `docker.io` (Docker Hub).
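
The expansion rules can be written down as a short function. This is a simplified sketch (`expand_image_reference` is a hypothetical helper, and digests such as `name@sha256:...` are not handled). It assumes the first path component counts as a registry host only if it contains a `.` or `:` or is `localhost`.

```python
def expand_image_reference(ref):
    """Fill in the default registry, namespace, and tag for an image name."""
    first, slash, rest = ref.partition("/")
    # The first component is a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", ref

    # A tag is whatever follows a ":" in the last path component.
    last = path.rsplit("/", 1)[-1]
    if ":" in last:
        path, tag = path.rsplit(":", 1)
    else:
        tag = "latest"

    # Official Docker Hub images live under the "library" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path

    return f"{registry}/{path}:{tag}"


for name in ["ubuntu", "python:3.12-slim", "yourname/my-app:1.0",
             "ghcr.io/myname/myproject:v1", "localhost:5000/my-app"]:
    print(name, "->", expand_image_reference(name))
# ubuntu -> docker.io/library/ubuntu:latest
# python:3.12-slim -> docker.io/library/python:3.12-slim
# yourname/my-app:1.0 -> docker.io/yourname/my-app:1.0
# ghcr.io/myname/myproject:v1 -> ghcr.io/myname/myproject:v1
# localhost:5000/my-app -> localhost:5000/my-app:latest
```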
|
||||
|
||||
## Hosted registry options (and the FOSS take)
|
||||
|
||||
You have choices about where your images live. None of them locks you in — you can move images between registries with `docker pull` and `docker push`.
|
||||
|
||||
- **Docker Hub** — the default. Free for public images. Has rate limits on anonymous pulls (a real annoyance for CI, less so for personal use).
|
||||
- **GitHub Container Registry (`ghcr.io`)** — free for public images, integrates with your GitHub account, no pull rate limit. A good choice if you already use GitHub.
|
||||
- **GitLab Container Registry** — built into GitLab projects.
|
||||
- **Self-hosted** — the reference Docker registry is itself open source. `docker run -d -p 5000:5000 registry:2` and you have your own registry running on your machine, no third party involved. Useful if you want full ownership of your images or you're working offline.
|
||||
|
||||
## Pushing your own image
|
||||
|
||||
1. **Create an account** on whichever registry you want to use. For Docker Hub, sign up at https://hub.docker.com/. For GHCR, you already have one (your GitHub account).
|
||||
|
||||
2. **Log in** from the command line:
|
||||
|
||||
```bash
|
||||
docker login # Docker Hub
|
||||
docker login ghcr.io # GitHub Container Registry
|
||||
```
|
||||
|
||||
You'll be prompted for a username and a password or personal access token. A token is recommended on Docker Hub and required on GHCR; treat it like a password and store it carefully.
|
||||
|
||||
3. **Tag your image** with the destination address. Suppose you built `my-app` in lesson 05 and your Docker Hub username is `yourname`:
|
||||
|
||||
```bash
|
||||
docker tag my-app yourname/my-app:1.0
|
||||
```
|
||||
|
||||
`docker tag` doesn't copy the image — it just adds another name pointing at the same content. Now `docker images` shows both `my-app` and `yourname/my-app:1.0`.
|
||||
|
||||
4. **Push it:**
|
||||
|
||||
```bash
|
||||
docker push yourname/my-app:1.0
|
||||
```
|
||||
|
||||
Docker uploads it layer by layer to the registry.
|
||||
|
||||
5. **Confirm:** anyone (or you, from a different machine) can now run:
|
||||
|
||||
```bash
|
||||
docker pull yourname/my-app:1.0
|
||||
docker run yourname/my-app:1.0
|
||||
```
|
||||
|
||||
That's the whole flow. Tag, push, pull, run.
|
||||
|
||||
## Tags as versions
|
||||
|
||||
Tags are how you publish multiple versions of the same image. Convention is to:
|
||||
|
||||
- Tag specific releases: `yourname/my-app:1.0`, `yourname/my-app:1.1`, …
|
||||
- Also tag the most recent as `:latest` so people who don't specify a tag get something reasonable.
|
||||
|
||||
```bash
|
||||
docker tag yourname/my-app:1.0 yourname/my-app:latest
|
||||
docker push yourname/my-app:latest
|
||||
```
|
||||
|
||||
`:latest` is a convention, not magic. It's just a tag named "latest", and you have to explicitly point it at whatever you want to be "latest".
|
||||
|
||||
## Public vs private
|
||||
|
||||
By default Docker Hub repositories are public. If you want a private one, you set that on the registry's website (Docker Hub has free private repos with limits; GHCR private images are free with reasonable limits tied to your GitHub plan). Pushing to a private repo works exactly the same as pushing to a public one — the only difference is who can pull.
|
||||
|
||||
## A small note on supply-chain caution
|
||||
|
||||
Anyone can publish an image to Docker Hub. When you run `docker run somebody/cool-thing`, you're trusting that "somebody" to not have put anything malicious in there. Use your judgment:
|
||||
|
||||
- **Official images** (the ones in the `library` namespace: `python`, `postgres`, `nginx`, `ubuntu`, etc.) are curated and reviewed by Docker, usually together with the upstream projects, and are pretty trustworthy.
- **Verified publisher** images on Docker Hub come from organizations whose identity Docker has verified.
|
||||
- **Random images from random accounts** are the same risk as random code from random GitHub accounts. Read the Dockerfile if it's available; check the publish date and pull count; prefer well-known maintainers.
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Tag the `my-app` image you built in lesson 05 with your registry username, then push it to a public Docker Hub repo.
|
||||
2. From a different machine (or after running `docker rmi yourname/my-app:1.0` to clear your local copy), `docker pull yourname/my-app:1.0`. Confirm it runs.
|
||||
3. Move on to [`09_compose_basics.md`](09_compose_basics.md), where we stop typing long `docker run` commands.
|
||||
164
reference/docker/09_compose_basics.md
Normal file
@ -0,0 +1,164 @@
|
||||
# Lesson 09: Docker Compose — Putting the Run Command in a File
|
||||
|
||||
Recall the last `docker run` from lesson 07:
|
||||
|
||||
```bash
|
||||
docker run -d \
|
||||
--name my-backend \
|
||||
-p 8080:8080 \
|
||||
-e DATABASE_URL=postgres://workshop:secret@db:5432/projects \
|
||||
-e LOG_LEVEL=info \
|
||||
-v ./uploads:/app/uploads \
|
||||
my-backend-image:1.0
|
||||
```
|
||||
|
||||
Imagine typing that — correctly — every time. Imagine a friend trying to run your project and you having to send them the right invocation through chat. Imagine having to spin up a backend *and* a database *and* a cache.
|
||||
|
||||
This is the problem `docker compose` solves. You describe what you want in a YAML file. Then you run `docker compose up` and Docker does the rest.
|
||||
|
||||
## Your first compose file
|
||||
|
||||
Make a folder and put this in it as `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
web:
|
||||
image: nginx
|
||||
ports:
|
||||
- "8080:80"
|
||||
```
|
||||
|
||||
That's it. Five lines of YAML describe an entire deployment.
|
||||
|
||||
In the same folder:
|
||||
|
||||
```bash
|
||||
docker compose up
|
||||
```
|
||||
|
||||
What happens:
|
||||
|
||||
- Compose reads the file.
|
||||
- It pulls `nginx` if needed.
|
||||
- It creates a network for this stack.
|
||||
- It starts a container for the `web` service (named after the folder and the service, something like `myfolder-web-1`) with host port 8080 mapped to container port 80.
|
||||
- It streams logs to your terminal.
|
||||
|
||||
Open http://localhost:8080. There's nginx.
|
||||
|
||||
Press `Ctrl+C` to stop and (optionally) clean up with:
|
||||
|
||||
```bash
|
||||
docker compose down
|
||||
```
|
||||
|
||||
The whole stack comes down.
|
||||
|
||||
## Run in the background
|
||||
|
||||
Add `-d` (detached):
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
The stack starts and returns your terminal immediately. To see logs:
|
||||
|
||||
```bash
|
||||
docker compose logs -f
|
||||
```
|
||||
|
||||
To stop:
|
||||
|
||||
```bash
|
||||
docker compose down
|
||||
```
|
||||
|
||||
## A more realistic single-service file
|
||||
|
||||
Translating the long `docker run` from the start of this lesson:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
backend:
|
||||
image: my-backend-image:1.0
|
||||
ports:
|
||||
- "8080:8080"
|
||||
environment:
|
||||
- DATABASE_URL=postgres://workshop:secret@db:5432/projects
|
||||
- LOG_LEVEL=info
|
||||
volumes:
|
||||
- ./uploads:/app/uploads
|
||||
restart: unless-stopped
|
||||
```
|
||||
|
||||
Same setup as the long shell command, plus a restart policy. But:
|
||||
|
||||
- It's version-controllable (commit it to git).
|
||||
- A new person can clone your project and `docker compose up` and have the same thing running.
|
||||
- `restart: unless-stopped` means "if this container crashes or the machine reboots, bring it back automatically." That's a real production-grade behavior with one line of YAML.
|
||||
|
||||
## `image:` vs `build:`
|
||||
|
||||
A service can either pull an existing image or build one from a Dockerfile in the same project:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
backend:
|
||||
build: ./backend # Look for a Dockerfile in ./backend/ and build it
|
||||
ports:
|
||||
- "8080:8080"
|
||||
```
|
||||
|
||||
When you `docker compose up`, Compose builds the image if it doesn't exist yet, then starts the service. After editing the Dockerfile or the source it copies, run `docker compose up --build` to force a rebuild; a plain `up` would reuse the existing image.
|
||||
|
||||
This is how most of the real-world Compose projects you'll see work: their own services use `build:`, and standard things like databases use `image:`.
|
||||
|
||||
## The lifecycle commands
|
||||
|
||||
| Command | What it does |
|
||||
|---------|--------------|
|
||||
| `docker compose up` | Start the stack, attach to logs. |
|
||||
| `docker compose up -d` | Start the stack in the background. |
|
||||
| `docker compose up --build` | Rebuild any `build:` services before starting. |
|
||||
| `docker compose down` | Stop and remove containers and networks. **Keeps volumes.** |
|
||||
| `docker compose down -v` | Same plus delete the named volumes. **Destroys data — be deliberate.** |
|
||||
| `docker compose logs -f` | Tail the logs of every service. |
|
||||
| `docker compose logs -f backend` | Logs of one service only. |
|
||||
| `docker compose ps` | List the containers in this stack. |
|
||||
| `docker compose exec backend bash` | Open a shell inside a running service. |
|
||||
| `docker compose restart backend` | Restart one service. |
|
||||
|
||||
You can spend years writing software and use only these commands.
|
||||
|
||||
## Where `compose.yml` lives
|
||||
|
||||
Compose looks for `compose.yaml` (the preferred name), `compose.yml`, `docker-compose.yaml`, or `docker-compose.yml` in the current directory. Each project gets its own folder with its own compose file. Running `docker compose up` in different folders gives you independent stacks.
|
||||
|
||||
The two example projects in this repo are both Compose-based:
|
||||
|
||||
- [`../../examples/image_meaning_db/docker-compose.yml`](../../examples/image_meaning_db/docker-compose.yml)
|
||||
- [`../../examples/audio_meaning_db/docker-compose.yml`](../../examples/audio_meaning_db/docker-compose.yml)
|
||||
|
||||
Both are small enough to read top-to-bottom in a minute. Cross-reference them with what you've learned so far and most lines should make sense.
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Create a folder, put a `docker-compose.yml` with the simple nginx example in it, and `docker compose up`. Visit it in your browser.
|
||||
2. Add another service to the same file — say, a second nginx on port 8081:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
web1:
|
||||
image: nginx
|
||||
ports:
|
||||
- "8080:80"
|
||||
web2:
|
||||
image: nginx
|
||||
ports:
|
||||
- "8081:80"
|
||||
```
|
||||
|
||||
Run `docker compose up`. Both are reachable on their respective ports. `docker compose down` brings them both down at once.
|
||||
|
||||
3. Move on to [`10_compose_multi_service.md`](10_compose_multi_service.md) where the services actually talk to each other.
|
||||
114
reference/docker/10_compose_multi_service.md
Normal file
@ -0,0 +1,114 @@
|
||||
# Lesson 10: A Multi-Service Stack
|
||||
|
||||
This is where Docker stops being "a fancier way to run a single program" and starts being "I can describe a whole system on one laptop." We'll build a tiny web app that talks to a real database. Two containers, one file, one command.
|
||||
|
||||
## What we're building
|
||||
|
||||
- **`db`** — a Postgres database, using the official image, with persistent storage.
|
||||
- **`adminer`** — a small web-based UI that connects to the database, so we can poke at it without installing any database tools on the host.
|
||||
|
||||
This is a deliberately small example. But the pattern — one or more services, plus a database, plus some glue — covers a huge fraction of what people actually deploy.
|
||||
|
||||
## The file
|
||||
|
||||
In a fresh folder, create `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
db:
|
||||
image: postgres:16
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
POSTGRES_USER: workshop
|
||||
POSTGRES_PASSWORD: secret
|
||||
POSTGRES_DB: projects
|
||||
volumes:
|
||||
- pgdata:/var/lib/postgresql/data
|
||||
|
||||
adminer:
|
||||
image: adminer:5
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "8080:8080"
|
||||
depends_on:
|
||||
- db
|
||||
|
||||
volumes:
|
||||
pgdata:
|
||||
```
|
||||
|
||||
Read it carefully. A few things worth noticing:
|
||||
|
||||
- **Two services**, `db` and `adminer`. Compose will start each as a separate container.
|
||||
- **No `ports:` on `db`.** The database is not exposed to your host. It doesn't need to be — only the other container needs to reach it, and they can reach each other on the internal network.
|
||||
- **`adminer`'s `ports:`** make its UI available at `http://localhost:8080`.
|
||||
- **`depends_on: - db`** tells Compose to start `db` before `adminer`. (It doesn't wait for Postgres to be *ready* to accept connections, only for the `db` container to have started. For real production you'd add a healthcheck. For a workshop, this is fine.)
|
||||
- **A named volume `pgdata`** persists Postgres's data across `down`/`up` cycles.
|
||||
|
||||
## Bring it up
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
Then visit http://localhost:8080. You'll see Adminer's login screen. Fill in:
|
||||
|
||||
- **System:** PostgreSQL
|
||||
- **Server:** `db` ← this is the key part: services reach each other by service name
|
||||
- **Username:** `workshop`
|
||||
- **Password:** `secret`
|
||||
- **Database:** `projects`
|
||||
|
||||
You should land in a (mostly empty) Postgres database. Create a table, insert a row, browse it — whatever you like. You're using a database server you didn't install, through a UI you didn't install, both running on your laptop in containers.
|
||||
|
||||
## The thing that's quietly amazing
|
||||
|
||||
In the Adminer login, you typed `db` as the server. Not an IP address, not `localhost`, not `127.0.0.1`. **Just the service name.** Compose set up a private network for this stack and made every service reachable by its name.
|
||||
|
||||
This is the same mechanism that lets a "backend" container reach a "database" container with a URL like `postgres://user:pass@db:5432/projects`. The hostname is just the service's name in the compose file. Rename the service, the hostname changes with it.
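A minimal sketch of that pattern, assuming a hypothetical `backend` service built from a `./backend` folder with its own Dockerfile (neither exists in this lesson), and assuming the app reads its connection string from an environment variable named `DATABASE_URL` (a common convention, not something Compose requires):

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: workshop
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: projects

  backend:                    # hypothetical service, not part of this lesson's stack
    build: ./backend          # assumes a Dockerfile lives in ./backend
    environment:
      # "db" is the service name above; Compose's internal DNS resolves it
      DATABASE_URL: postgres://workshop:secret@db:5432/projects
    depends_on:
      - db
```

Note that the port in the URL is the port Postgres listens on *inside* the network (5432). No `ports:` entry on `db` is needed for this to work.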
|
||||
|
||||
## Persistence in action
|
||||
|
||||
Stop the stack:
|
||||
|
||||
```bash
|
||||
docker compose down
|
||||
```
|
||||
|
||||
Both containers are gone. Bring it back up:
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
Log back into Adminer. Your table and row are still there — because the `pgdata` volume survived the `down`.
|
||||
|
||||
Now (carefully) try the destructive form:
|
||||
|
||||
```bash
|
||||
docker compose down -v
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
Log in again. Empty database. The `-v` flag deleted the named volumes. **This is the one Compose flag to fear** — it's how you nuke your data, and it looks like every other innocent flag.
|
||||
|
||||
## A taste of what scales from here
|
||||
|
||||
What you just did was a two-container stack. The same compose file format, with the same `services:` / `volumes:` / `depends_on:` / `ports:` structure, can describe ten services. Twenty. A frontend, a backend, a worker queue, a Redis cache, a search index, a model server. All in one file.
|
||||
|
||||
Some real-world patterns you'll see in compose files:
|
||||
|
||||
- **`build:` for your own services, `image:` for stock components.** Your code is in a Dockerfile; the database and cache come from the registry.
|
||||
- **Healthchecks** so dependent services wait for upstreams to be ready (not just running).
|
||||
- **Profiles** so you can opt into extra services (`docker compose --profile gpu up`).
|
||||
- **`.env` files** that Compose reads automatically — you reference `${POSTGRES_PASSWORD}` in the YAML and it gets substituted from `.env`.
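As a rough sketch of the last two patterns applied to this lesson's stack: assume a `.env` file sits next to `docker-compose.yml` containing the single line `POSTGRES_PASSWORD=secret`, and assume we invent a profile called `tools` (the name is arbitrary) so that Adminer only starts when asked for.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: workshop
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}   # substituted from .env at startup
      POSTGRES_DB: projects
    volumes:
      - pgdata:/var/lib/postgresql/data

  adminer:
    image: adminer:5
    profiles: ["tools"]       # hypothetical profile name; skipped unless enabled
    ports:
      - "8080:8080"
    depends_on:
      - db

volumes:
  pgdata:
```

With this file, `docker compose up -d` should start only `db`, while `docker compose --profile tools up -d` starts both. Keeping the password in `.env` also means the compose file itself can be committed without the secret in it, provided `.env` is left out of version control.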
|
||||
|
||||
When your project grows past what fits comfortably in Compose — usually because you want to run across multiple machines, or want managed health/restart/scaling — the natural next step is Kubernetes. Critically: the *containers* don't change. The compose file becomes a different deployment file. Your `backend:1.0` image is still your `backend:1.0` image.
|
||||
|
||||
## Try it yourself
|
||||
|
||||
1. Bring up the stack above, log into Adminer, create a table with two columns, insert a row, log out, `down`, `up`, log back in. Confirm the row survived.
|
||||
2. Now `down -v`. Bring it back up. Confirm the row is gone. Feel that fear; remember it.
|
||||
3. Add a third service of your own — say, a service named `python_service` that uses the `python:3.12` image with `command: ["sleep", "infinity"]` so it just sits there. Run `docker compose exec python_service bash` to drop into it. Note that from inside that container, `ping db` (after `apt-get update && apt-get install -y iputils-ping`) works — the database is reachable by name.
|
||||
4. Look at [`../../examples/image_meaning_db/docker-compose.yml`](../../examples/image_meaning_db/docker-compose.yml) and read it line by line. Almost every concept is one you've now seen.
|
||||
5. Move on to [`11_cleanup_and_next_steps.md`](11_cleanup_and_next_steps.md) — disk reclamation, and where to go from here.
|
||||
99
reference/docker/11_cleanup_and_next_steps.md
Normal file
@ -0,0 +1,99 @@
|
||||
# Lesson 11: Cleanup, Disk, and Where to Go Next
|
||||
|
||||
After working through this primer your machine has accumulated images, containers, volumes, and build caches. This lesson is the maintenance you'll do every few weeks. It also points at where to learn more.
|
||||
|
||||
## See what Docker is using
|
||||
|
||||
```bash
|
||||
docker system df
|
||||
```
|
||||
|
||||
A high-level summary: how much disk space your images, containers, volumes, and build cache are taking up.
|
||||
|
||||
For more detail:
|
||||
|
||||
```bash
|
||||
docker images # all images
|
||||
docker ps -a # all containers (running and stopped)
|
||||
docker volume ls # all named volumes
|
||||
```
|
||||
|
||||
## Surgical cleanup
|
||||
|
||||
If you know exactly what to remove:
|
||||
|
||||
```bash
|
||||
docker rm <container> # one container
|
||||
docker rmi <image> # one image
|
||||
docker volume rm <volume> # one volume
|
||||
```
|
||||
|
||||
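A running container has to be stopped before `docker rm` will remove it (or use `rm -f`). An image can't be removed while any container, even a stopped one, still uses it, so remove the container first.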
|
||||
|
||||
## Broad cleanup
|
||||
|
||||
When you just want disk back:
|
||||
|
||||
```bash
|
||||
docker container prune # remove all stopped containers
|
||||
docker image prune # remove dangling images (no tag, not used)
|
||||
docker image prune -a       # remove ANY image not used by an existing container (running or stopped)
|
||||
docker volume prune         # remove unused volumes (Docker 23+: anonymous only; add -a for named ones)
|
||||
docker network prune # remove networks not used
|
||||
docker builder prune # remove the build cache
|
||||
```
|
||||
|
||||
Each one asks for confirmation.
|
||||
|
||||
The all-in-one "clean everything not currently in use" command:
|
||||
|
||||
```bash
|
||||
docker system prune -a
|
||||
```
|
||||
|
||||
And the scariest version, which also removes volumes:
|
||||
|
||||
```bash
|
||||
docker system prune -a --volumes
|
||||
```
|
||||
|
||||
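Read that twice before running it. It removes stopped containers first, so the volumes of any stopped database are left unused and can be deleted along with their data. (On Docker Engine 23 and later, `--volumes` only takes anonymous volumes; older versions also take named ones.) The convenience of one command is balanced by the cost of using it without thinking.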
|
||||
|
||||
## Patterns that keep your disk tidy
|
||||
|
||||
- Use `--rm` on one-shot containers: `docker run --rm -it ubuntu bash`. The container is deleted as soon as it exits, so nothing piles up.
|
||||
- Run `docker system df` occasionally. If your build cache is 30 GB, `docker builder prune` is the answer.
|
||||
- When you're truly done with a project, use `docker compose down --rmi all -v` instead of a plain `docker compose down` to remove its images and volumes too. (Be sure you're truly done.)
|
||||
|
||||
## Where to go next
|
||||
|
||||
The four pieces of Docker we *didn't* cover, in rough order of when you'll probably want them:
|
||||
|
||||
1. **Healthchecks.** Tell Compose how to know a service is actually ready (not just "the process exists"). Add a `healthcheck:` block to a service in `docker-compose.yml`, and use `depends_on: { db: { condition: service_healthy } }` to wait properly.
|
||||
|
||||
2. **Multi-stage builds.** A Dockerfile pattern where you build in one image and copy the artifact into a smaller "final" image. Cuts image size dramatically, especially for compiled languages.
|
||||
|
||||
3. **`.dockerignore`.** Like `.gitignore` for Docker builds — exclude files from the build context so they don't bloat your image or invalidate your cache.
|
||||
|
||||
4. **Container orchestration (Kubernetes, ECS, Nomad).** The next step up from Compose, for running containers across multiple machines. If you find yourself building production systems for paying customers, you'll meet one of these. If you're not, you almost certainly don't need it.
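To make the first item concrete, here is a minimal sketch of a healthcheck added to the Lesson 10 stack. It assumes the `pg_isready` utility that ships with the official Postgres image; the interval, timeout, and retry values are illustrative guesses, not tuned recommendations.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: workshop
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: projects
    healthcheck:
      # pg_isready exits 0 once Postgres accepts connections
      test: ["CMD-SHELL", "pg_isready -U workshop -d projects"]
      interval: 5s            # illustrative values; adjust for your machine
      timeout: 3s
      retries: 10

  adminer:
    image: adminer:5
    ports:
      - "8080:8080"
    depends_on:
      db:
        condition: service_healthy   # wait for the healthcheck, not just container start
```

With this in place, `docker compose up -d` should hold `adminer` back until `db` reports healthy, and `docker compose ps` shows the health status next to each service.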
|
||||
|
||||
## Resources we trust
|
||||
|
||||
- **Official Docker docs** — https://docs.docker.com/ . The "Get Started" guide is excellent. The reference sections (Dockerfile reference, Compose file reference) are the authoritative source when you're hunting for a specific flag.
|
||||
- **Play with Docker** — https://labs.play-with-docker.com/ . Free, in-browser Docker environment. Great for experimenting without touching your laptop.
|
||||
- **Docker Curriculum** — https://docker-curriculum.com/ . A longer community-written tutorial, takes you a step further than this primer (deploys a multi-container app to AWS).
|
||||
- **The Compose spec** — https://compose-spec.io/ . The Compose file format is now an open spec, not just a Docker product. Useful when you want to know every legal field.
|
||||
- **Awesome Docker** — https://github.com/veggiemonk/awesome-docker . A curated index of tools, tutorials, and useful images.
|
||||
|
||||
## And finally
|
||||
|
||||
Docker is one of those tools where a small, well-chosen vocabulary covers ninety percent of the work. You now have it:
|
||||
|
||||
- `docker run`, `docker ps`, `docker images`, `docker rm`, `docker rmi`
|
||||
- `docker build`, `Dockerfile`, `FROM/WORKDIR/COPY/RUN/CMD`
|
||||
- `docker pull`, `docker push`, `docker tag`
|
||||
- `docker compose up`, `down`, `logs`, `exec`
|
||||
- `-p`, `-v`, `-e`, `-d`, `-it`, `--name`, `--rm`
|
||||
- the words *image*, *container*, *volume*, *registry*, *service*
|
||||
|
||||
Everything else is a refinement. When your project needs the refinement, look it up. Until then, you've got the toolbox.
|
||||
51
reference/docker/README.md
Normal file
@ -0,0 +1,51 @@
|
||||
# Docker — reference material
|
||||
|
||||
This folder is a self-paced Docker primer. It is **not** the spine of the class — your project is. We've put it here because Docker is one of the most useful tools you can have in your toolbox: it lets you run almost any piece of software cleanly on your machine, share it with others, and deploy it to a server later — using the *same* setup the whole way through.
|
||||
|
||||
Docker is **free and open source**. The core engine (`containerd`, `runc`, the Docker daemon) is openly developed and you can run it forever without an account, a subscription, or even an internet connection once an image is pulled. There are paid Docker products too, but they're optional — everything in this primer uses the free pieces.
|
||||
|
||||
## When to dip in
|
||||
|
||||
- You want to run a piece of software (a database, a model server, someone's GitHub project) without installing a dozen system dependencies.
|
||||
- Your project is starting to need more than one service (a web app + a database, a frontend + a backend) and you want them to start and stop together.
|
||||
- You've hit "works on my machine" pain and want a way to make the environment portable.
|
||||
- You want to deploy your project somewhere later and you'd like the deployment to behave the same as your laptop.
|
||||
- You're tired of Python virtualenvs, Node version managers, Ruby rbenv, and the rest of the per-language environment menagerie. Docker supersedes all of that with one mental model.
|
||||
|
||||
## When *not* to dip in
|
||||
|
||||
- Out of a sense of obligation. There's no test.
|
||||
- Before you have a project that benefits from it. The lessons make far more sense once you've felt the pain Docker is solving.
|
||||
|
||||
## How it scales
|
||||
|
||||
The `docker run` you type on your laptop is the same command that runs containers in production at companies serving billions of requests. The `docker-compose.yml` that brings up your two-service hobby project describes the same building blocks as large cloud deployments (Kubernetes, ECS, Nomad — they all eat the same container images). Starting with Docker means starting at the bottom of a ladder that goes very high. You don't have to climb it. But the rung you're on is the same wood as every rung above.
|
||||
|
||||
## Lessons
|
||||
|
||||
Work in order if you're starting from zero. Skip around if you already know parts of this.
|
||||
|
||||
| File | Topic |
|
||||
|------|-------|
|
||||
| `01_what_is_docker.md` | What a container is, why it exists, how it differs from a VM |
|
||||
| `02_hello_world.md` | Your first container: `docker run hello-world` |
|
||||
| `03_running_containers.md` | Running Ubuntu, Alpine, and Python — OS abstraction in action |
|
||||
| `04_images_vs_containers.md` | The crucial distinction. `ps`, `images`, `rm`, `rmi` |
|
||||
| `05_dockerfiles.md` | Writing a `Dockerfile` and building your own image |
|
||||
| `06_volumes_and_persistence.md` | Ephemeral by default — how to keep data around |
|
||||
| `07_ports_and_env.md` | Exposing ports (`-p`), passing config (`-e`) |
|
||||
| `08_registries.md` | Docker Hub, pulling and pushing images |
|
||||
| `09_compose_basics.md` | `docker compose` — describing a container in a file |
|
||||
| `10_compose_multi_service.md` | A two-service stack: web app + database |
|
||||
| `11_cleanup_and_next_steps.md` | Reclaiming disk space, and where to learn more |
|
||||
|
||||
If Docker isn't installed yet, see [`installing-docker.md`](installing-docker.md).
|
||||
|
||||
## Better resources we trust
|
||||
|
||||
This primer covers the slice of Docker we think is useful for the kinds of projects this workshop tends to produce. It is deliberately small. If you want to go deeper, these are good places:
|
||||
|
||||
- **The official docs** — https://docs.docker.com/get-started/ . Genuinely well written.
|
||||
- **Play with Docker** — https://labs.play-with-docker.com/ . Free in-browser Docker environment if you haven't installed it yet or want to experiment without touching your laptop.
|
||||
- **Docker Curriculum** — https://docker-curriculum.com/ . A community-written end-to-end tutorial that goes a bit further than this primer.
|
||||
- **"Docker Deep Dive" by Nigel Poulton** — if you end up needing a book.
|
||||
21
reference/git/README.md
Normal file
@ -0,0 +1,21 @@
|
||||
# Git — reference material
|
||||
|
||||
*Placeholder.* This folder will hold a self-paced primer on Git: the tool that lets you save versions of your project, undo changes safely, and try out ideas without losing what already works.
|
||||
|
||||
Planned topics (rough sketch — order subject to change):
|
||||
|
||||
- What a repository is and why you'd want one
|
||||
- `git init`, `git status`, `git add`, `git commit` — the daily loop
|
||||
- Looking at history: `git log`, `git diff`
|
||||
- Undoing things: `git restore`, `git revert`, and when *not* to use `git reset --hard`
|
||||
- Branches: trying something without breaking your working version
|
||||
- Merging and the basics of resolving a conflict
|
||||
- Ignoring files (`.gitignore`)
|
||||
|
||||
## When to dip in
|
||||
|
||||
Once your project has more than a handful of files, or once you've lost work to an editor crash. Until then, you can probably get by with "save often."
|
||||
|
||||
## Installing
|
||||
|
||||
Will be filled in. Short version: most operating systems either have it or make it a one-line install.
|
||||
21
reference/github/README.md
Normal file
@ -0,0 +1,21 @@
|
||||
# GitHub — reference material
|
||||
|
||||
*Placeholder.* This folder will hold a primer on GitHub: where most of the world's open-source code lives, and a convenient place to stash your own.
|
||||
|
||||
Planned topics:
|
||||
|
||||
- What GitHub is (and what it isn't — it's not the same thing as Git)
|
||||
- Creating an account and your first repository
|
||||
- Pushing a local project to GitHub
|
||||
- Reading other people's repos: where the README, code, and issues live
|
||||
- Cloning someone else's project to your machine
|
||||
- The basics of pull requests, in case you ever want to contribute back
|
||||
- SSH keys vs. HTTPS — which to set up and why
|
||||
|
||||
## When to dip in
|
||||
|
||||
When you want to back up your project somewhere that isn't your laptop, share it with someone, or look inside a project you found online.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
A working understanding of [`../git/`](../git/) helps but isn't strictly required for the basics.
|
||||
23
reference/huggingface/README.md
Normal file
@ -0,0 +1,23 @@
|
||||
# Hugging Face — reference material
|
||||
|
||||
*Placeholder.* This folder will cover Hugging Face: the de facto hub for open-weight AI models, datasets, and demos.
|
||||
|
||||
Planned topics:
|
||||
|
||||
- What Hugging Face is and why it matters (the "GitHub of AI models")
|
||||
- Browsing the model hub: filtering by task, size, license
|
||||
- Reading a model card: what to look for before you commit to a model
|
||||
- Downloading and running a model with `transformers`
|
||||
- The `pipeline()` shortcut for common tasks (text generation, transcription, classification, etc.)
|
||||
- `sentence-transformers` for embeddings
|
||||
- Where the weights actually live on your disk, and how to clean them up
|
||||
- Datasets and Spaces (brief tour)
|
||||
- Authentication and the few model families that need it (e.g. Llama)
|
||||
|
||||
## When to dip in
|
||||
|
||||
When your project needs a model — for transcription, summarization, image generation, embeddings, classification — and you'd rather run something locally than pay for an API.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Some [`../python/`](../python/), and ideally [`../pytorch/`](../pytorch/) if you want to peek under the hood.
|
||||
BIN
reference/papers/2012_12_03_alexnet.pdf
Normal file
BIN
reference/papers/2017_06_12_attention_is_all_you_need.pdf
Normal file
BIN
reference/papers/2017_12_05_mastering_chess_and_shogi.pdf
Normal file
BIN
reference/papers/2018_06_11_gpt_1.pdf
Normal file
BIN
reference/papers/2019_02_14_gpt_2.pdf
Normal file
BIN
reference/papers/2020_01_15_alphafold.pdf
Normal file
BIN
reference/papers/2020_01_23_scaling_laws.pdf
Normal file
BIN
reference/papers/2020_05_28_gpt_3.pdf
Normal file
BIN
reference/papers/2021_02_26_CLIP.pdf
Normal file
12150
reference/papers/2021_07_15_alphafold2.pdf
Normal file
BIN
reference/papers/2022_01_28_chain_of_thought.pdf
Normal file
BIN
reference/papers/2022_03_04_instructGPT.pdf
Normal file
BIN
reference/papers/2022_03_29_chinchilla.pdf
Normal file
BIN
reference/papers/2023_03_15_gpt_4.pdf
Normal file
BIN
reference/papers/2023_03_15_gpt_4_technical_report.pdf
Normal file
BIN
reference/papers/2023_08_30_superhuman_drone_racing.pdf
Normal file
BIN
reference/papers/2023_09_20_dalle_3.pdf
Normal file
35401
reference/papers/2024_05_08_alphafold3.pdf
Normal file
115
reference/papers/README.md
Normal file
@ -0,0 +1,115 @@
|
||||
# Papers
|
||||
|
||||
A curated reading list. The goal isn't to make you a researcher — it's to give you a *feel* for how the tools you're using actually came to exist, why the last few years surprised even the people building them, and why the same trick keeps eating new domains.
|
||||
|
||||
You can skip this section entirely. Nothing else in the workshop depends on it.
|
||||
|
||||
> Each PDF is prefixed with its publication date (`YYYY_MM_DD_…`) so the directory listing sorts chronologically. The sections below organize the same papers by *track* (language, images, games, the physical world) — but if you'd rather walk through them in pure release order, just sort the folder by name.
|
||||
|
||||
## The pattern, in one sentence
|
||||
|
||||
Take a general-purpose neural network, give it a stupidly simple objective (*predict the next word*, *win the game*, *find the shape of the protein*), and pour in absurd amounts of compute and data. Out the other side comes a system that solves a problem we used to think required something distinctly human. The papers below are the same trick, applied to wildly different problems, over the course of about a decade.
|
||||
|
||||
If you read these in order, three things should land:
|
||||
|
||||
1. **The leap was not one breakthrough.** It was the same handful of ideas, scaled up and pointed at new targets. You're not watching ten different revolutions — you're watching one revolution arrive in ten different rooms.
|
||||
2. **The "AI does that now?" era is shockingly recent.** The transformer architecture is 2017. AlphaGo is 2016. The model most people would say passes an informal Turing test (GPT-4) is 2023. Protein structure prediction stopped being an unsolved problem in 2021. Drones started beating human world champions at racing in 2023. End-to-end, the surprising stuff fits inside a single decade.
|
||||
3. **It is not just about chatbots.** Text, images, games, robotics, biology — the *same family of techniques* is making progress in all of them, often by the same labs. If you're trying to predict what gets disrupted next, that's the signal worth tracking.
|
||||
|
||||
By the end you should have the vocabulary to read AI announcements skeptically, and a working theory of where the field is going next.
|
||||
|
||||
## The unifying recipe
|
||||
|
||||
Across every paper here, four ingredients show up over and over. If you only remember one thing, remember this list — it's how to spot a paper that's part of the trajectory versus a paper that isn't.
|
||||
|
||||
1. **A simple, scalable objective.** No hand-designed rules. "Predict what comes next," "win the game," "match the experimental data." The training signal is something a computer can grade automatically, which means it can be applied at enormous scale.
|
||||
2. **A general-purpose architecture.** Mostly neural networks, and increasingly the *same* neural network blueprint (the transformer) across very different problems. There's no separate "language brain" and "vision brain" — there's one machine that learns whatever you feed it.
|
||||
3. **Self-generated or web-scale data.** Either the model learns from a huge slice of the internet, or — even better — it generates its own training data by playing against itself or simulating the world.
|
||||
4. **Compute, scaled aggressively.** More parameters, more data, more chips, more electricity. Capabilities improve in a way that's now predictable in advance (this is what "scaling laws" means).
|
||||
|
||||
When you see a result that combines those four, that's the trajectory. When you see a result that brags about clever hand-designed features and works on a small dataset, that's a different (and shrinking) part of the field.
|
||||
|
||||
## Track 1 — Language
|
||||
|
||||
The trunk of the tree. Everything modern starts here, even results that aren't about text.
|
||||
|
||||
**[Attention Is All You Need](2017_06_12_attention_is_all_you_need.pdf)** — *Vaswani et al., Google, 2017.* The transformer paper. Before this, language models read text one word at a time, like a person reading left to right. The transformer reads everything at once and learns which words should "pay attention" to which other words. That's the whole architectural change. Every chatbot you've used is downstream of this paper.
|
||||
|
||||
**[GPT-1: Improving Language Understanding by Generative Pre-Training](2018_06_11_gpt_1.pdf)** — *Radford et al., OpenAI, 2018.* The recipe: take a transformer, train it on a huge pile of text to predict the next word, then lightly fine-tune it for whatever task you actually care about. This is the template everyone copies for the next five years. About 117 million parameters — easily runs on a laptop.
|
||||
|
||||
**[GPT-2: Language Models are Unsupervised Multitask Learners](2019_02_14_gpt_2.pdf)** — *Radford et al., OpenAI, 2019.* Same recipe, ten times bigger, ten times more text. It can now write coherent paragraphs. OpenAI initially refused to release the largest version because they thought it was too dangerous — the first time anyone seriously argued that a language model itself could be a hazard. About 1.5 billion parameters in its largest version. Still runs on a laptop, but benefits from a GPU.
|
||||
|
||||
**[GPT-3: Language Models are Few-Shot Learners](2020_05_28_gpt_3.pdf)** — *Brown et al., OpenAI, 2020.* Same recipe, a hundred times bigger again. Something unexpected happens: you stop needing to fine-tune the model for new tasks. Describe the task in the prompt, give it a couple of examples, and it figures out what you want. The first sign that scale wasn't just buying better autocomplete — it was buying *generality*. 175 billion parameters, too big to run on reasonable consumer hardware (though you could if you tried). A note on hardware overhang: today's open-source models with fewer than 10 billion parameters, which run easily on a laptop, are far stronger than GPT-3, and 2020 wasn't that long ago. A 2016 laptop that was never intended to hold an intelligent conversation can, in 2026, load a 13-billion-parameter model and far outperform what was state of the art on a server in 2020.
|
||||
|
||||
**[Scaling Laws for Neural Language Models](2020_01_23_scaling_laws.pdf)** — *Kaplan et al., OpenAI, 2020.* The paper that turned "make it bigger" from a hunch into a forecast. Performance improves with model size, data, and compute in smooth, predictable curves. This is why labs felt comfortable spending hundreds of millions of dollars on a single training run: the result was no longer a gamble.
|
||||
|
||||
**[Training Compute-Optimal Large Language Models ("Chinchilla")](2022_03_29_chinchilla.pdf)** — *Hoffmann et al., DeepMind, 2022.* A correction to the earlier scaling laws. Everyone had been making models that were *too big and too undertrained*. Match data to model size properly and you get more performance for the same money. The reason modern open models punch above their weight is largely this paper.
|
||||
|
||||
**[Training Language Models to Follow Instructions ("InstructGPT")](2022_03_04_instructGPT.pdf)** — *Ouyang et al., OpenAI, 2022.* The missing piece between GPT-3 and ChatGPT. The base model from the GPT-3 paper is a brilliant-but-feral autocomplete — give it "Q: what's the capital of France?" and it might respond with a list of more trivia questions instead of an answer. InstructGPT introduces *reinforcement learning from human feedback* (RLHF): people rank model outputs, and the model learns to prefer the kind of response humans want. This is the paper that made AI feel like a product instead of a research demo.
|
||||
|
||||
**[Chain-of-Thought Prompting Elicits Reasoning](2022_01_28_chain_of_thought.pdf)** — *Wei et al., Google, 2022.* A tiny paper with outsized consequences. Put a few worked examples that reason step by step into the prompt (follow-up work showed that even just adding "let's think step by step" helps) and large models suddenly become much better at math and logic, because they show their work. This is the seed of the current reasoning-model era (the o1 / o3 / Claude-with-extended-thinking line) — those systems are, roughly, "what if we trained the model to do this on every problem, all the time?"
|
||||
|
||||
**[GPT-4 Technical Report](2023_03_15_gpt_4_technical_report.pdf)** and **[GPT-4](2023_03_15_gpt_4.pdf)** — *OpenAI, 2023.* The model most people would agree passes an informal Turing test. Performs at or near human level on a wide range of academic and professional exams. Accepts images as input — the moment language models became *multimodal*. The report is striking for what it leaves out: no model size, no architecture details, no training data details. The opacity is part of the story — frontier AI has become a commercial product, not an open science project. Read the "Predictable Scaling" section (they forecast GPT-4's performance from much smaller test runs — scaling laws, in action) and the "Limitations" section. One striking result that I think earns the label "spark of AGI": they trained multiple versions of GPT-4, including a text-only one that _didn't_ see images as input. The text-only model was able to generate the code for SVG graphics of various objects, and the researchers used it to create an SVG graphic of a unicorn. To test whether GPT-4 understood what it had generated, they moved the unicorn's horn off its head (a matter of updating the location parameters for the triangle representing the horn) and, without any further explanation, gave the altered SVG to GPT-4 in a fresh prompt (so it had no memory of creating it) and asked it to "fix it". It moved the horn back to the head, and the result was robust and repeatable. What I found remarkable is that something that has _only ever seen text in its entire existence_ seemed to know what a unicorn _looks like_. To me that is evidence that raw text compresses an understanding of what the text represents, one that transcends the text itself. And that goes for everything, not just unicorns. This makes sense if you've ever read detailed science articles about some physical process you could never see with your eyes: how atoms work, how the Earth's interior works, how fusion happens inside stars, or the large-scale structure of the universe. Nobody has ever seen these things with their own eyes, but if you read a bunch of books on these topics, I'd wager you have a pretty sharp and accurate mental model of how they work, and you'd confidently claim to "understand" them even though you've never seen them. It's not too much of a stretch to imagine the same thing happens with language models: they need to understand the concepts that produced the text in order to do a good job of producing the text themselves.
|
||||
|
||||
## Track 2 — Images
|
||||
|
||||
The same machinery, pointed at pixels. This is also where the modern deep-learning era starts — the image track is older than the language track, and one paper here is the origin point for almost everything else on the list. Watch the architecture diverge from the text track around 2020, then converge again around 2023.
|
||||
|
||||
**[ImageNet Classification with Deep Convolutional Neural Networks ("AlexNet")](2012_12_03_alexnet.pdf)** — *Krizhevsky, Sutskever, Hinton, University of Toronto, 2012.* The paper that kicked off everything else on this list. ImageNet was a benchmark with roughly a million labeled photos across a thousand categories, and prior methods had plateaued around 26% top-5 error using hand-engineered feature pipelines. AlexNet trained a deep convolutional neural network on two consumer GPUs and roughly halved that error in one shot. Within a year, the rest of computer vision had thrown out the old toolkit and switched to deep learning; within five years the same recipe was eating language, games, and the sciences. The ingredients weren't new — convolutional nets dated to the 1980s — what was new was the scale of data, the use of GPUs, and the willingness to make the network deep. If you only read one paper for historical perspective on the whole list, read this one.
|
||||
|
||||
**[Learning Transferable Visual Models from Natural Language Supervision ("CLIP")](2021_02_26_CLIP.pdf)** — *Radford et al., OpenAI, 2021.* The bridge between language and vision. CLIP learns to put images and captions into the same conceptual space, so "a photo of a golden retriever" and the actual photo end up nearby. Suddenly you can search images by description, classify pictures you've never trained on, and — crucially — *condition image generators on text*. Almost every text-to-image system since runs on top of this idea.
|
||||
|
||||
**[DALL·E 1: Zero-Shot Text-to-Image Generation](2021_02_24_dalle_1.pdf)** — *Ramesh et al., OpenAI, 2021.* "What if we did GPT, but for images?" Chop images into a grid of tokens (like words), and train a transformer to predict the next image-token given a caption. Results were rough but stunning in 2021 — the first time a single model could draw an arbitrary thing you described in plain English.
|
||||
|
||||
**[Denoising Diffusion Probabilistic Models](2020_06_19_diffusion_models.pdf)** — *Ho et al., Berkeley, 2020.* The other paradigm for generating images. Instead of predicting pixels one at a time, start with pure noise and gradually denoise it into a picture. Counterintuitive, mathematically elegant, and — it turns out — much better at producing photorealistic images than the autoregressive approach. Every modern image generator (Stable Diffusion, Midjourney, DALL·E 2 and 3) descends from this paper.
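The "start with noise and denoise" description has a mirror image that is easy to sketch: the forward process that turns a clean image into noise. The sketch below covers only that noising half, on a toy four-value "image"; the learned denoising network is omitted entirely. The linear schedule from 1e-4 to 0.02 over 1000 steps follows the values reported in the DDPM paper, but treat the numbers and all function names as illustrative assumptions.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly from beta_start to beta_end."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_signal(betas):
    """alpha_bar[t]: product of (1 - beta) up to step t, i.e. surviving signal."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Sample a noisy version of x0 directly at the step with this alpha_bar."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * value + noise * rng.gauss(0.0, 1.0) for value in x0]

rng = random.Random(0)
alpha_bars = cumulative_signal(linear_beta_schedule(1000))
pixels = [1.0, -1.0, 0.5, 0.0]  # a toy 4-"pixel" image

slightly_noisy = add_noise(pixels, alpha_bars[10], rng)     # still mostly image
almost_pure_noise = add_noise(pixels, alpha_bars[-1], rng)  # almost no image left
print(round(alpha_bars[10], 4), round(alpha_bars[-1], 6))
```

A trained model learns to predict the noise that was added at each step, and generation runs that prediction in reverse, starting from pure noise.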
|
||||
|
||||
**[High-Resolution Image Synthesis with Latent Diffusion Models](2021_12_20_latent_diffusion.pdf)** — *Rombach et al., 2022.* Also known as the Stable Diffusion paper. The trick: instead of denoising pixels directly (slow, expensive), compress the image into a much smaller "latent" representation first and denoise *that*. The result is a diffusion model you can actually run on a consumer GPU. This is the paper that made image generation a thing normal people could do at home.
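A little arithmetic shows why denoising in latent space is cheaper. The sketch assumes a 512x512 RGB image, an autoencoder that shrinks each side by a factor of 8, and 4 latent channels; those numbers match a commonly used Stable Diffusion configuration but are assumptions here, not figures stated above.

```python
def count_values(height, width, channels):
    """Number of scalar values in an image-shaped array."""
    return height * width * channels

# Assumed sizes (illustrative): 512x512 RGB input, 8x downsampling, 4 latent channels.
image_side = 512
downsample_factor = 8
latent_channels = 4

pixel_values = count_values(image_side, image_side, 3)
latent_side = image_side // downsample_factor
latent_values = count_values(latent_side, latent_side, latent_channels)
compression_ratio = pixel_values / latent_values

print(pixel_values, latent_values, compression_ratio)
```

Under these assumptions every denoising step touches roughly 48 times fewer values than it would in pixel space.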
|
||||
|
||||
**[DALL·E 2: Hierarchical Text-Conditional Image Generation with CLIP Latents](2022_04_13_dalle_2.pdf)** — *Ramesh et al., OpenAI, 2022.* Combines CLIP with diffusion. Quality jumps from "interesting curiosity" to "could plausibly be a commercial product." Note the architectural divergence from the text track: image generation has now broken away from the pure transformer recipe.
|
||||
|
||||
**[DALL·E 3: Improving Image Generation with Better Captions](2023_09_20_dalle_3.pdf)** — *Betker et al., OpenAI, 2023.* Less an architectural leap than a usability one. The headline: it actually follows your prompt. Earlier image models ignored half of what you asked for; DALL·E 3 was trained on detailed, machine-written captions, and it leans on a GPT-style model to expand what you wrote before generating. The end of the "trending on artstation, 8k, hyperreal" era of prompt engineering. By 2023 the language and image tracks have converged again, with language models in the driver's seat.
|
||||
|
||||
## Track 3 — Games and self-play
|
||||
|
||||
This track predates the modern language work and quietly establishes the most important pattern in the whole list: *the model can supervise itself.*
|
||||
|
||||
**[Mastering the Game of Go with Deep Neural Networks and Tree Search ("AlphaGo")](2016_01_28_Mastering_the_game_of_Go_with_deep_neural_networks.pdf)** — *Silver et al., DeepMind, 2016.* Go was the canonical problem that brute-force search couldn't solve: too many positions, too few patterns a programmer could write down. AlphaGo combined deep neural networks (for intuition about which moves look good) with tree search (for checking where those moves lead). The paper reports a 5-0 win over the European champion Fan Hui; two months later the same system beat Lee Sedol, one of the strongest players in the world, 4-1. Until this paper, "AI plays Go at superhuman level" was a thing experts said was decades away.
|
||||
|
||||
**[Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm ("AlphaZero")](2017_12_05_mastering_chess_and_shogi.pdf)** — *Silver et al., DeepMind, 2017.* The deeper result. AlphaGo bootstrapped from a database of human games. AlphaZero starts from random weights, plays only against itself, and within about a day of training reaches superhuman play in chess, shogi, and Go using the *same algorithm and architecture* for all three games. Two things to take from this paper: (1) the same general technique handles multiple distinct problems, and (2) human data is a *crutch*, not a requirement, when the system can generate its own. That second insight is now reshaping how the language models in Track 1 get trained for reasoning.
|
||||
|
||||
## Track 4 — The physical and natural world
|
||||
|
||||
Where the pattern is now arriving. These results are less famous than ChatGPT, and arguably more important.
|
||||
|
||||
**[Champion-level Drone Racing using Deep Reinforcement Learning](2023_08_30_superhuman_drone_racing.pdf)** — *Kaufmann et al., UZH, Nature 2023.* An autonomous drone (the system is called Swift) that beats human world champions in head-to-head racing through a physical course at over 50 mph. The interesting bits: it learns in simulation, transfers to the real world, and runs entirely onboard the drone — no remote supercomputer in the loop. This is what it looks like when the recipe leaves the data center and enters a body. Robotics has been "five years away" for forty years; results like this are the reason that's starting to change.
|
||||
|
||||
**[Improved Protein Structure Prediction Using Potentials from Deep Learning ("AlphaFold 1")](2020_01_15_alphafold.pdf)** — *Senior et al., DeepMind, Nature 2020.* Proteins are strings of amino acids that fold into 3D shapes, and the shape determines what the protein does. Predicting that shape from the sequence had been one of biology's grand challenge problems for fifty years. AlphaFold 1 won the field's blind benchmark (CASP13, held in 2018) by a margin nobody had managed before. It set the stage for the result that actually closed the problem out.
|
||||
|
||||
**[Highly Accurate Protein Structure Prediction with AlphaFold ("AlphaFold 2")](2021_07_15_alphafold2.pdf)** — *Jumper et al., DeepMind, Nature 2021.* The structure-prediction problem, effectively solved. Predictions at accuracies competitive with experimental measurements that take months in a lab. DeepMind released predicted structures for essentially every known protein for free, an act that probably accelerated biology by years. If you want a concrete example of "AI did something that mattered outside computer science," this is the one to point at.
|
||||
|
||||
**[Accurate Structure Prediction of Biomolecular Interactions ("AlphaFold 3")](2024_05_08_alphafold3.pdf)** — *Abramson et al., DeepMind, Nature 2024.* AlphaFold 2 handled proteins in isolation. AlphaFold 3 handles proteins *interacting with* DNA, RNA, drug molecules, ions — the actual machinery of a cell. Architecturally, it borrows from the image-generation track (it uses diffusion). The convergence is the point: techniques developed for cat pictures are now used to design medicines.
|
||||
|
||||
## What it all means
|
||||
|
||||
Step back from the individual papers and what you have is something like this:
|
||||
|
||||
A general-purpose technique — neural networks trained at scale on simple objectives — keeps walking into fields that previously demanded specialized expertise, and within a few years it's competitive with or better than the specialists. Language. Vision. Games requiring intuition. Real-time control of physical machines. Predicting how molecules behave. Each of these used to be its own subfield with its own community and its own decade-long roadmap. Now they share an underlying method, and progress in one often transfers to the others.
|
||||
|
||||
The features that show up everywhere are the four from the recipe section: simple objective, general architecture, scalable data, lots of compute. There is no obvious wall yet. Compute keeps getting cheaper, models keep getting better in ways that scaling laws keep predicting, and the set of "problems we used to think needed a human specialist" keeps shrinking.
|
||||
|
||||
For most people, the practical implications are roughly:
|
||||
|
||||
- **Anything that can be framed as "predict the next thing" is fair game** — and that turns out to include most knowledge work, a lot of creative work, and an increasing slice of physical work too.
|
||||
- **The bottleneck is shifting from "can a machine do this?" to "have we pointed the machine at it yet?"** Most of the value over the next decade will come from people who notice an unaddressed problem, gather the right data, and apply the recipe — not from new architectures.
|
||||
- **The trajectory is jagged, not smooth.** Capabilities arrive in lumps. Whole categories of work look untouched until they're suddenly not. Plan for surprise.
|
||||
- **Where this most likely goes next:** systems that take actions in the world (agents, robots), systems that do real scientific discovery (the AlphaFold pattern applied to chemistry, materials, medicine), and reasoning systems that can chain together long thought processes before answering. All three are already in flight.
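The "predict the next thing" framing in the first bullet can be shown at toy scale. This is a minimal sketch of a bigram model that counts which word tends to follow which; it is nothing like a modern language model in scale or architecture, and the corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigram(corpus)

print(predict_next(model, "the"))
print(predict_next(model, "sat"))
print(predict_next(model, "zebra"))
```

The objective is the same one the list keeps returning to: given what came before, guess what comes next, and let data and scale do the rest.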
|
||||
|
||||
If you finish this list and one feeling sticks, it should be this: *the surprising stuff isn't done arriving.*
|
||||
|
||||
## How to read a paper without a PhD
|
||||
|
||||
- Read the abstract first. If it doesn't grab you, skip it.
|
||||
- Look at the figures. They usually carry most of the story.
|
||||
- Read the introduction and the conclusion. Skip methods and experiments unless you care about the details.
|
||||
- Paste sections into a chatbot and ask "explain this like I'm not a researcher." It will do a remarkably good job — which is itself part of the point.
|
||||
- Don't try to *implement* anything from a paper unless that's your project.
|
||||
24
reference/python/01_hello_world.py
Normal file
@ -0,0 +1,24 @@
|
||||
# Lesson 01: Hello World
|
||||
#
|
||||
# This is your very first Python program. Every programming journey starts here.
|
||||
# Lines that start with '#' are comments — Python ignores them. Use them to
|
||||
# leave notes for yourself or anyone reading your code.
|
||||
#
|
||||
# To run this file:
|
||||
# python 01_hello_world.py
|
||||
|
||||
# The print() function outputs text to the screen.
|
||||
print("Hello, World!")
|
||||
|
||||
# You can print any text you like — just put it inside the quotes.
|
||||
print("Welcome to the coding workshop!")
|
||||
|
||||
# You can also print numbers without quotes.
|
||||
print(42)
|
||||
print(3.14)
|
||||
|
||||
# print() can combine text and numbers using commas.
|
||||
print("The answer is", 42)
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Change "Hello, World!" to your own greeting and run the file again.
|
||||
60
reference/python/02_variables.py
Normal file
@ -0,0 +1,60 @@
|
||||
# Lesson 02: Variables
|
||||
#
|
||||
# A variable is a named container that holds a value. Think of it like a
|
||||
# labeled box: you put something inside and can retrieve it later by name.
|
||||
#
|
||||
# To run this file:
|
||||
# python 02_variables.py
|
||||
|
||||
# --- Creating variables ---
|
||||
# Use the '=' sign to assign a value to a variable name.
|
||||
name = "Alice"
|
||||
age = 30
|
||||
height = 5.6
|
||||
is_student = True
|
||||
|
||||
print(name)
|
||||
print(age)
|
||||
print(height)
|
||||
print(is_student)
|
||||
|
||||
# --- Using variables in print ---
|
||||
print("Name:", name)
|
||||
print("Age:", age)
|
||||
|
||||
# --- Updating variables ---
|
||||
# You can change the value stored in a variable at any time.
|
||||
age = 31
|
||||
print("Next year, age:", age)
|
||||
|
||||
# --- Variable naming rules ---
|
||||
# Good names: descriptive, lowercase, underscores between words
|
||||
first_name = "Bob"
|
||||
last_name = "Smith"
|
||||
total_score = 100
|
||||
|
||||
# Bad (but technically valid) names — avoid these:
|
||||
# x = "Bob"          # too vague
|
||||
# FirstName = "Bob"  # use lowercase_underscore style for variables
|
||||
|
||||
# --- Combining variables (string formatting) ---
|
||||
# f-strings let you embed variables directly inside a string.
|
||||
# Put an 'f' before the opening quote, then use {variable_name} inside.
|
||||
greeting = f"Hello, my name is {first_name} {last_name}."
|
||||
print(greeting)
|
||||
|
||||
score_message = f"Total score: {total_score}"
|
||||
print(score_message)
|
||||
|
||||
# --- Multiple assignment ---
|
||||
# You can assign the same value to several variables at once.
|
||||
x = y = z = 0
|
||||
print(x, y, z)
|
||||
|
||||
# Or assign different values on one line.
|
||||
a, b, c = 1, 2, 3
|
||||
print(a, b, c)
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Create variables for your own name, age, and favorite number.
|
||||
# Print a sentence that uses all three.
|
||||
79
reference/python/03_datatypes.py
Normal file
@ -0,0 +1,79 @@
|
||||
# Lesson 03: Data Types
|
||||
#
|
||||
# Every value in Python has a type. The type determines what you can do with
|
||||
# that value (e.g., you can add numbers but not multiply a word by a word).
|
||||
#
|
||||
# The built-in function type() tells you what type a value is.
|
||||
#
|
||||
# To run this file:
|
||||
# python 03_datatypes.py
|
||||
|
||||
# --- Integer (int) ---
|
||||
# Whole numbers, positive or negative, with no decimal point.
|
||||
apples = 5
|
||||
temperature = -10
|
||||
print(type(apples)) # <class 'int'>
|
||||
|
||||
# --- Float ---
|
||||
# Numbers with a decimal point.
|
||||
price = 19.99
|
||||
pi = 3.14159
|
||||
print(type(price)) # <class 'float'>
|
||||
|
||||
# --- String (str) ---
|
||||
# Text — any sequence of characters wrapped in quotes (single or double).
|
||||
first_name = "Alice"
|
||||
last_name = 'Smith'
|
||||
sentence = "She said, 'hello!'"
|
||||
print(type(first_name)) # <class 'str'>
|
||||
|
||||
# Common string operations:
|
||||
print(len(first_name)) # number of characters: 5
|
||||
print(first_name.upper()) # ALICE
|
||||
print(first_name.lower()) # alice
|
||||
print(first_name + " " + last_name) # concatenation: Alice Smith
|
||||
print(first_name * 3) # repetition: AliceAliceAlice
|
||||
|
||||
# --- Boolean (bool) ---
|
||||
# Only two possible values: True or False.
|
||||
is_raining = False
|
||||
is_sunny = True
|
||||
print(type(is_raining)) # <class 'bool'>
|
||||
|
||||
# --- None ---
|
||||
# Represents the absence of a value — like "nothing" or "not set yet".
|
||||
result = None
|
||||
print(type(result)) # <class 'NoneType'>
|
||||
print(result) # None
|
||||
|
||||
# --- Type conversion ---
|
||||
# You can convert between types using int(), float(), str(), bool().
|
||||
number_as_string = "42"
|
||||
actual_number = int(number_as_string) # "42" → 42
|
||||
print(actual_number + 8) # 50
|
||||
|
||||
float_number = float("3.14") # "3.14" → 3.14
|
||||
print(float_number)
|
||||
|
||||
number_to_string = str(100) # 100 → "100"
|
||||
print("Score: " + number_to_string)
|
||||
|
||||
# Booleans from other types:
|
||||
# 0, empty string "", None, and empty containers are False. Everything else is True.
|
||||
print(bool(0)) # False
|
||||
print(bool(1)) # True
|
||||
print(bool("")) # False
|
||||
print(bool("hi")) # True
|
||||
|
||||
# --- Arithmetic with types ---
|
||||
print(10 + 3) # addition: 13
|
||||
print(10 - 3) # subtraction: 7
|
||||
print(10 * 3) # multiplication: 30
|
||||
print(10 / 3) # division: 3.3333... (always a float)
|
||||
print(10 // 3) # floor division: 3 (integer, rounds down)
|
||||
print(10 % 3) # modulus: 1 (the remainder)
|
||||
print(10 ** 3) # exponent: 1000
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Create one variable of each type above.
|
||||
# Use type() on each and print the result.
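Two claims in the data types lesson above are stated but not demonstrated: empty containers count as False, and a word cannot be multiplied by a word. A minimal sketch that checks both; the variable names are illustrative and not part of the lesson file.

```python
# Zero, empty text, None, and empty containers are "falsy".
falsy_values = [0, 0.0, "", None, [], {}, ()]
# Anything non-zero or non-empty is "truthy", even the string "0".
truthy_values = [1, -1, "0", [0], {"key": None}]

print([bool(value) for value in falsy_values])
print([bool(value) for value in truthy_values])

# A string times an int repeats the string, but str * str raises TypeError.
try:
    "ab" * "cd"
    multiplied = True
except TypeError as error:
    multiplied = False
    print("TypeError:", error)
```

Note that a list containing a single zero is still truthy: truthiness of a container depends on whether it is empty, not on what it holds.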
|
||||
110
reference/python/04_conditionals.py
Normal file
@ -0,0 +1,110 @@
|
||||
# Lesson 04: Conditionals
|
||||
#
|
||||
# Conditionals let your program make decisions: "if this is true, do that;
|
||||
# otherwise, do something else." The keywords are: if, elif, else.
|
||||
#
|
||||
# Indentation matters in Python! Everything inside an if block must be
|
||||
# indented by the same amount (typically 4 spaces).
|
||||
#
|
||||
# To run this file:
|
||||
# python 04_conditionals.py
|
||||
|
||||
# --- Basic if statement ---
|
||||
temperature = 35
|
||||
|
||||
if temperature > 30:
|
||||
    print("It's hot outside!")
|
||||
|
||||
# --- if / else ---
|
||||
# The else block runs when the if condition is False.
|
||||
is_raining = False
|
||||
|
||||
if is_raining:
|
||||
    print("Bring an umbrella.")
|
||||
else:
|
||||
    print("No umbrella needed.")
|
||||
|
||||
# --- if / elif / else ---
|
||||
# Use elif (short for "else if") to check multiple conditions in sequence.
|
||||
# Python checks each condition top to bottom and stops at the first True one.
|
||||
score = 78
|
||||
|
||||
if score >= 90:
|
||||
    print("Grade: A")
|
||||
elif score >= 80:
|
||||
    print("Grade: B")
|
||||
elif score >= 70:
|
||||
    print("Grade: C")
|
||||
elif score >= 60:
|
||||
    print("Grade: D")
|
||||
else:
|
||||
    print("Grade: F")
|
||||
|
||||
# --- Comparison operators ---
|
||||
# These return True or False and are used inside conditions.
|
||||
#
|
||||
# == equal to
|
||||
# != not equal to
|
||||
# > greater than
|
||||
# < less than
|
||||
# >= greater than or equal to
|
||||
# <= less than or equal to
|
||||
|
||||
x = 10
|
||||
print(x == 10) # True
|
||||
print(x != 5) # True
|
||||
print(x > 20) # False
|
||||
print(x <= 10) # True
|
||||
|
||||
# --- Logical operators: and, or, not ---
|
||||
# Combine multiple conditions together.
|
||||
|
||||
age = 25
|
||||
has_id = True
|
||||
|
||||
# 'and' requires BOTH sides to be True
|
||||
if age >= 18 and has_id:
|
||||
    print("Entry allowed.")
|
||||
|
||||
# 'or' requires AT LEAST ONE side to be True
|
||||
is_weekend = False
|
||||
is_holiday = True
|
||||
|
||||
if is_weekend or is_holiday:
|
||||
    print("No work today!")
|
||||
|
||||
# 'not' flips True to False and False to True
|
||||
is_logged_in = False
|
||||
|
||||
if not is_logged_in:
|
||||
    print("Please log in.")
|
||||
|
||||
# --- Nested conditionals ---
|
||||
# You can place an if inside another if. Keep nesting shallow when possible.
|
||||
user_role = "admin"
|
||||
is_active = True
|
||||
|
||||
if is_active:
|
||||
    if user_role == "admin":
|
||||
        print("Welcome, administrator.")
|
||||
    else:
|
||||
        print("Welcome, user.")
|
||||
else:
|
||||
    print("Account is inactive.")
|
||||
|
||||
# --- Checking membership with 'in' ---
|
||||
# 'in' tests whether a value exists inside a collection (like a list or string).
|
||||
allowed_colors = ["red", "green", "blue"]
|
||||
chosen_color = "green"
|
||||
|
||||
if chosen_color in allowed_colors:
|
||||
    print(f"{chosen_color} is a valid choice.")
|
||||
else:
|
||||
    print(f"{chosen_color} is not allowed.")
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Write an if/elif/else block that:
|
||||
# - Takes a variable called 'hour' (an integer 0–23 representing the time of day)
|
||||
# - Prints "Good morning" if hour < 12
|
||||
# - Prints "Good afternoon" if 12 <= hour < 18
|
||||
# - Prints "Good evening" otherwise
|
||||
126
reference/python/05_loops.py
Normal file
@ -0,0 +1,126 @@
|
||||
# Lesson 05: Loops
|
||||
#
|
||||
# Loops let you repeat a block of code many times without copy-pasting it.
|
||||
# Python has two kinds of loops: for and while.
|
||||
#
|
||||
# To run this file:
|
||||
# python 05_loops.py
|
||||
|
||||
# ============================================================
|
||||
# FOR LOOPS
|
||||
# ============================================================
|
||||
# A for loop iterates over a sequence (list, string, range, etc.)
|
||||
# and runs the block once for each item.
|
||||
|
||||
print("--- for loop over a list ---")
|
||||
fruits = ["apple", "banana", "cherry"]
|
||||
|
||||
for fruit in fruits:
|
||||
    print(fruit)
|
||||
|
||||
# --- range() ---
|
||||
# range(n) generates numbers from 0 up to (but not including) n.
|
||||
print("\n--- range(5) ---")
|
||||
for i in range(5):
|
||||
    print(i)  # prints 0, 1, 2, 3, 4
|
||||
|
||||
# range(start, stop) starts at 'start', stops before 'stop'.
|
||||
print("\n--- range(2, 7) ---")
|
||||
for i in range(2, 7):
|
||||
    print(i)  # prints 2, 3, 4, 5, 6
|
||||
|
||||
# range(start, stop, step) jumps by 'step' each time.
|
||||
print("\n--- range(0, 20, 5) ---")
|
||||
for i in range(0, 20, 5):
|
||||
    print(i)  # prints 0, 5, 10, 15
|
||||
|
||||
# --- Iterating over a string ---
|
||||
print("\n--- iterating over a string ---")
|
||||
for letter in "hello":
|
||||
    print(letter)
|
||||
|
||||
# --- enumerate() ---
|
||||
# When you need both the index and the value, use enumerate().
|
||||
print("\n--- enumerate ---")
|
||||
colors = ["red", "green", "blue"]
|
||||
|
||||
for index, color in enumerate(colors):
|
||||
    print(f" [{index}] {color}")
|
||||
|
||||
# ============================================================
|
||||
# WHILE LOOPS
|
||||
# ============================================================
|
||||
# A while loop keeps running as long as its condition is True.
|
||||
# You're responsible for changing something so the condition eventually
|
||||
# becomes False — otherwise the loop runs forever (infinite loop).
|
||||
|
||||
print("\n--- while loop ---")
|
||||
count = 0
|
||||
|
||||
while count < 5:
|
||||
    print(count)
|
||||
    count += 1  # same as: count = count + 1
|
||||
|
||||
# --- Countdown example ---
|
||||
print("\n--- countdown ---")
|
||||
n = 5
|
||||
|
||||
while n > 0:
|
||||
    print(n)
|
||||
    n -= 1
|
||||
print("Blast off!")
|
||||
|
||||
# ============================================================
|
||||
# LOOP CONTROL: break AND continue
|
||||
# ============================================================
|
||||
|
||||
# break — exit the loop immediately, no matter what.
|
||||
print("\n--- break ---")
|
||||
for i in range(10):
|
||||
    if i == 5:
|
||||
        break  # stop when we reach 5
|
||||
    print(i)
|
||||
|
||||
# continue — skip the rest of this iteration and go to the next one.
|
||||
print("\n--- continue (skip evens) ---")
|
||||
for i in range(10):
|
||||
    if i % 2 == 0:
|
||||
        continue  # skip even numbers
|
||||
    print(i)  # only odd numbers are printed
|
||||
|
||||
# ============================================================
|
||||
# NESTED LOOPS
|
||||
# ============================================================
|
||||
# A loop inside a loop. The inner loop runs completely for every
|
||||
# single iteration of the outer loop.
|
||||
|
||||
print("\n--- multiplication table (3x3) ---")
|
||||
for row in range(1, 4):
|
||||
    for col in range(1, 4):
|
||||
        print(f"{row * col:3}", end="")  # end="" keeps output on same line
|
||||
    print()  # newline after each row
|
||||
|
||||
# ============================================================
|
||||
# BUILDING RESULTS WITH LOOPS
|
||||
# ============================================================
|
||||
|
||||
# Summing numbers 1 to 100
|
||||
total = 0
|
||||
for i in range(1, 101):
|
||||
    total += i
|
||||
print(f"\nSum of 1 to 100: {total}")
|
||||
|
||||
# Collecting items that pass a filter
|
||||
numbers = [3, 17, 2, 41, 8, 99, 5]
|
||||
big_numbers = []
|
||||
|
||||
for n in numbers:
|
||||
    if n > 10:
|
||||
        big_numbers.append(n)
|
||||
|
||||
print(f"Numbers > 10: {big_numbers}")
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Use a for loop and range() to print every multiple of 3 between 1 and 50.
|
||||
# Use a while loop to keep asking for user input until they type "quit".
|
||||
# Hint: user_input = input("Enter something: ")
|
||||
142
reference/python/06_functions.py
Normal file
@ -0,0 +1,142 @@
|
||||
# Lesson 06: Functions
|
||||
#
|
||||
# A function is a reusable block of code that you define once and can call
|
||||
# as many times as you need. Functions help you avoid repeating yourself and
|
||||
# make your code easier to read and test.
|
||||
#
|
||||
# Anatomy:
|
||||
#   def function_name(parameters):
|
||||
#       # body (indented)
|
||||
#       return value  # optional
|
||||
#
|
||||
# To run this file:
|
||||
# python 06_functions.py
|
||||
|
||||
# ============================================================
|
||||
# DEFINING AND CALLING A FUNCTION
|
||||
# ============================================================
|
||||
|
||||
def greet():
|
||||
    print("Hello, World!")
|
||||
|
||||
greet() # call the function
|
||||
greet() # call it again — the code runs twice without copy-pasting
|
||||
|
||||
# ============================================================
|
||||
# PARAMETERS AND ARGUMENTS
|
||||
# ============================================================
|
||||
# Parameters are the variable names listed in the definition.
|
||||
# Arguments are the actual values you pass when calling the function.
|
||||
|
||||
def greet_person(name): # 'name' is the parameter
|
||||
    print(f"Hello, {name}!")
|
||||
|
||||
greet_person("Alice") # "Alice" is the argument
|
||||
greet_person("Bob")
|
||||
|
||||
# Multiple parameters:
|
||||
def add(a, b):
|
||||
    result = a + b
|
||||
    print(f"{a} + {b} = {result}")
|
||||
|
||||
add(3, 4)
|
||||
add(10, 25)
|
||||
|
||||
# ============================================================
|
||||
# RETURN VALUES
|
||||
# ============================================================
|
||||
# Use 'return' to send a result back to wherever the function was called.
|
||||
|
||||
def multiply(a, b):
|
||||
    return a * b
|
||||
|
||||
product = multiply(6, 7)
|
||||
print(f"6 x 7 = {product}")
|
||||
|
||||
# You can use the return value directly in an expression.
|
||||
print(multiply(3, 3) + multiply(2, 5)) # 9 + 10 = 19
|
||||
|
||||
# A function returns None by default if there is no return statement.
|
||||
|
||||
# ============================================================
|
||||
# DEFAULT PARAMETER VALUES
|
||||
# ============================================================
|
||||
# You can give parameters a default value. If the caller omits that
|
||||
# argument, the default is used instead.
|
||||
|
||||
def greet_with_title(name, title="Friend"):
|
||||
    print(f"Hello, {title} {name}!")
|
||||
|
||||
greet_with_title("Alice", "Dr.") # Hello, Dr. Alice!
|
||||
greet_with_title("Bob") # Hello, Friend Bob!
|
||||
|
||||
# ============================================================
|
||||
# KEYWORD ARGUMENTS
|
||||
# ============================================================
|
||||
# You can pass arguments by name (in any order) to make calls clearer.
|
||||
|
||||
def describe_pet(animal, name, age):
|
||||
    print(f"{name} is a {age}-year-old {animal}.")
|
||||
|
||||
describe_pet(name="Rex", age=3, animal="dog")
|
||||
|
||||
# ============================================================
|
||||
# VARIABLE SCOPE
|
||||
# ============================================================
|
||||
# Variables created inside a function only exist inside that function.
|
||||
# This is called "local scope."
|
||||
|
||||
def calculate_area(width, height):
|
||||
    area = width * height  # 'area' is local: only visible inside here
|
||||
    return area
|
||||
|
||||
print(calculate_area(5, 4))
|
||||
# print(area) # ← this would raise a NameError — 'area' doesn't exist out here
|
||||
|
||||
# Variables defined outside functions are "global" and readable everywhere,
|
||||
# but reassigning them inside a function requires the 'global' keyword (usually
|
||||
# best to avoid — prefer passing values as arguments and returning results).
|
||||
|
||||
# ============================================================
|
||||
# FUNCTIONS CALLING OTHER FUNCTIONS
|
||||
# ============================================================
|
||||
|
||||
def square(n):
|
||||
    return n * n
|
||||
|
||||
def sum_of_squares(a, b):
|
||||
    return square(a) + square(b)
|
||||
|
||||
print(sum_of_squares(3, 4)) # 9 + 16 = 25
|
||||
|
||||
# ============================================================
|
||||
# A PRACTICAL EXAMPLE: TEMPERATURE CONVERTER
|
||||
# ============================================================
|
||||
|
||||
def celsius_to_fahrenheit(celsius):
|
||||
    return (celsius * 9 / 5) + 32
|
||||
|
||||
def fahrenheit_to_celsius(fahrenheit):
|
||||
    return (fahrenheit - 32) * 5 / 9
|
||||
|
||||
temps_c = [0, 20, 37, 100]
|
||||
for c in temps_c:
|
||||
    f = celsius_to_fahrenheit(c)
|
||||
    print(f"{c}°C = {f:.1f}°F")
|
||||
|
||||
# ============================================================
|
||||
# RETURNING MULTIPLE VALUES
|
||||
# ============================================================
|
||||
# Python lets a function return several values as a tuple.
|
||||
|
||||
def min_max(numbers):
|
||||
    return min(numbers), max(numbers)
|
||||
|
||||
data = [5, 2, 9, 1, 7, 3]
|
||||
lowest, highest = min_max(data)
|
||||
print(f"Min: {lowest}, Max: {highest}")
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Write a function called 'is_even' that takes a number and returns
|
||||
# True if it's even, False if it's odd.
|
||||
# Then use it in a loop to print only the even numbers from 1 to 20.
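Two remarks from the functions lesson above come without a demonstration: a function with no `return` hands back `None`, and rebinding a global name inside a function needs the `global` keyword. A minimal sketch of both, using illustrative names (`counter`, `increment`, `say_hi`) that are not part of the lesson file:

```python
counter = 0  # a global variable

def read_counter():
    return counter  # reading a global needs no special keyword

def broken_increment():
    # Assigning to 'counter' makes it local to this function, so reading it
    # on the right-hand side fails with UnboundLocalError.
    counter = counter + 1

def increment():
    global counter  # explicitly rebind the global name
    counter += 1

def say_hi():
    print("hi")  # no return statement

increment()
print(read_counter())

try:
    broken_increment()
except UnboundLocalError as error:
    print("UnboundLocalError:", error)

result = say_hi()
print(result)
```

In most programs the cleaner alternative is the one the lesson recommends: pass the value in as an argument and return the updated value.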
|
||||
169
reference/python/07_lists_and_dicts.py
Normal file
@ -0,0 +1,169 @@
|
||||
# Lesson 07: Lists and Dictionaries
|
||||
#
|
||||
# Lists and dictionaries are the two most important built-in data structures
|
||||
# in Python. Lists store ordered sequences; dictionaries store key-value pairs.
|
||||
#
|
||||
# To run this file:
|
||||
# python 07_lists_and_dicts.py
|
||||
|
||||
# ============================================================
|
||||
# LISTS
|
||||
# ============================================================
|
||||
# A list is an ordered, changeable collection. Items keep their position
|
||||
# and you access them by index (starting at 0).
|
||||
|
||||
fruits = ["apple", "banana", "cherry", "date"]
|
||||
|
||||
# --- Accessing items by index ---
|
||||
print(fruits[0]) # apple (first item)
|
||||
print(fruits[2]) # cherry
|
||||
print(fruits[-1]) # date (last item — negative index counts from the end)
|
||||
|
||||
# --- Slicing: fruits[start:stop] — stop is NOT included ---
|
||||
print(fruits[1:3]) # ['banana', 'cherry']
|
||||
print(fruits[:2]) # ['apple', 'banana'] (start defaults to 0)
|
||||
print(fruits[2:]) # ['cherry', 'date'] (stop defaults to end)
|
||||
|
||||
# --- Modifying lists ---
|
||||
fruits[1] = "blueberry" # replace an item
|
||||
print(fruits)
|
||||
|
||||
fruits.append("elderberry") # add to the end
|
||||
print(fruits)
|
||||
|
||||
fruits.insert(1, "avocado") # insert at a specific index
|
||||
print(fruits)
|
||||
|
||||
fruits.remove("cherry") # remove by value (first match)
|
||||
print(fruits)
|
||||
|
||||
popped = fruits.pop() # remove and return the last item
|
||||
print(f"Removed: {popped}, List: {fruits}")
|
||||
|
||||
popped_at = fruits.pop(0) # remove and return item at index 0
|
||||
print(f"Removed: {popped_at}, List: {fruits}")
|
||||
|
||||
# --- Useful list operations ---
|
||||
numbers = [3, 1, 4, 1, 5, 9, 2, 6]
|
||||
|
||||
print(len(numbers)) # 8 — number of items
|
||||
print(min(numbers)) # 1 — smallest value
|
||||
print(max(numbers)) # 9 — largest value
|
||||
print(sum(numbers)) # 31 — total
|
||||
print(numbers.count(1)) # 2 — how many times 1 appears
|
||||
print(1 in numbers) # True — membership test
|
||||
|
||||
numbers.sort() # sort in place (modifies the list)
|
||||
print(numbers)
|
||||
|
||||
numbers.sort(reverse=True) # sort descending
|
||||
print(numbers)
|
||||
|
||||
# sorted() returns a new sorted list without changing the original
|
||||
original = [3, 1, 4, 1, 5]
|
||||
new_sorted = sorted(original)
|
||||
print(original, "→", new_sorted)
|
||||
|
||||
# --- Nested lists (2D) ---
|
||||
grid = [
|
||||
[1, 2, 3],
|
||||
[4, 5, 6],
|
||||
[7, 8, 9],
|
||||
]
|
||||
print(grid[1][2]) # 6 — row 1, column 2
|
||||
|
||||
# --- List comprehension ---
|
||||
# A concise way to build a new list by transforming or filtering another.
|
||||
squares = [x ** 2 for x in range(1, 6)]
|
||||
print(squares) # [1, 4, 9, 16, 25]
|
||||
|
||||
evens = [x for x in range(20) if x % 2 == 0]
|
||||
print(evens) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
|
||||
|
||||
# ============================================================
|
||||
# DICTIONARIES
|
||||
# ============================================================
|
||||
# A dictionary stores data as key-value pairs.
|
||||
# Keys must be unique and immutable (usually strings or numbers).
|
||||
# Use the key to look up the value — like a real dictionary.
|
||||
|
||||
person = {
|
||||
"name": "Alice",
|
||||
"age": 30,
|
||||
"city": "New York",
|
||||
}
|
||||
|
||||
# --- Accessing values ---
|
||||
print(person["name"]) # Alice
|
||||
print(person.get("age")) # 30
|
||||
print(person.get("email", "N/A")) # N/A — .get() returns a default if key missing
|
||||
|
||||
# --- Adding and updating entries ---
|
||||
person["email"] = "alice@example.com" # add a new key
|
||||
person["age"] = 31 # update an existing key
|
||||
print(person)
|
||||
|
||||
# --- Removing entries ---
|
||||
del person["city"]
|
||||
removed = person.pop("email") # removes and returns the value
|
||||
print(f"Removed email: {removed}")
|
||||
print(person)
|
||||
|
||||
# --- Checking for keys ---
|
||||
print("name" in person) # True
|
||||
print("phone" in person) # False
|
||||
|
||||
# --- Iterating ---
|
||||
scores = {"math": 95, "english": 88, "science": 92}
|
||||
|
||||
for subject in scores: # iterate over keys
|
||||
print(subject)
|
||||
|
||||
for subject, score in scores.items(): # iterate over key-value pairs
|
||||
print(f" {subject}: {score}")
|
||||
|
||||
for score in scores.values(): # iterate over values only
|
||||
print(score)
|
||||
|
||||
# --- Useful dictionary methods ---
|
||||
print(scores.keys()) # dict_keys(['math', 'english', 'science'])
|
||||
print(scores.values()) # dict_values([95, 88, 92])
|
||||
print(len(scores)) # 3
|
||||
|
||||
# --- Nested dictionary ---
|
||||
students = {
|
||||
"alice": {"grade": "A", "score": 95},
|
||||
"bob": {"grade": "B", "score": 83},
|
||||
}
|
||||
print(students["alice"]["score"]) # 95
|
||||
|
||||
# --- Dictionary comprehension ---
|
||||
words = ["apple", "banana", "cherry"]
|
||||
word_lengths = {word: len(word) for word in words}
|
||||
print(word_lengths) # {'apple': 5, 'banana': 6, 'cherry': 6}
|
||||
|
||||
# ============================================================
|
||||
# OTHER COLLECTION TYPES (brief overview)
|
||||
# ============================================================
|
||||
|
||||
# Tuple — like a list but immutable (cannot be changed after creation).
|
||||
# Good for fixed data like coordinates or RGB colors.
|
||||
coordinates = (40.7128, -74.0060)
|
||||
red, green, blue = (255, 0, 0)
|
||||
print(red, green, blue)
|
||||
|
||||
# Set — unordered collection of unique values. Duplicates are ignored.
|
||||
# Useful for membership tests and removing duplicates.
|
||||
unique_numbers = {1, 2, 3, 2, 1}
|
||||
print(unique_numbers) # {1, 2, 3}
|
||||
|
||||
tags = {"python", "beginner", "tutorial"}
|
||||
tags.add("code")
|
||||
print("python" in tags) # True
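# A short sketch of removing duplicates from a list. Converting to a set
# drops duplicates but does not promise any particular order; dict.fromkeys
# is one common way to keep first-seen order. The addresses are made up.

```python
emails = ["a@example.com", "b@example.com", "a@example.com"]  # made-up data

unique = set(emails)               # duplicates dropped, order not guaranteed
print(len(unique))                 # 2

# dict keys are unique and remember insertion order, so this keeps
# the first occurrence of each address in its original position.
ordered_unique = list(dict.fromkeys(emails))
print(ordered_unique)              # ['a@example.com', 'b@example.com']
```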
|
||||
|
||||
# --- Try it yourself ---
|
||||
# 1. Create a list of 5 of your favorite movies.
|
||||
# Sort them alphabetically and print each one with its rank (1, 2, 3...).
|
||||
#
|
||||
# 2. Create a dictionary representing a simple contact:
|
||||
# name, phone, email. Print each field on its own line.
|
||||
176
reference/python/08_classes.py
Normal file
@ -0,0 +1,176 @@
|
||||
# Lesson 08: Classes and Object-Oriented Programming
|
||||
#
|
||||
# A class is a blueprint for creating objects. An object bundles together
|
||||
# data (attributes) and behavior (methods) that belong to the same concept.
|
||||
#
|
||||
# Example: a "Dog" class defines what every Dog has (name, breed) and
|
||||
# what every Dog can do (bark, fetch). Each actual dog is an instance.
|
||||
#
|
||||
# To run this file:
|
||||
# python 08_classes.py
|
||||
|
||||
# ============================================================
|
||||
# DEFINING A CLASS
|
||||
# ============================================================
|
||||
|
||||
class Dog:
    # __init__ is the constructor. Python calls it automatically
    # whenever you create a new Dog. 'self' always refers to the
    # specific instance being created or used.
    def __init__(self, name, breed, age):
        self.name = name    # instance attribute
        self.breed = breed
        self.age = age

    # Methods are functions that belong to the class.
    def bark(self):
        print(f"{self.name} says: Woof!")

    def describe(self):
        print(f"{self.name} is a {self.age}-year-old {self.breed}.")

    def have_birthday(self):
        self.age += 1
        print(f"Happy birthday, {self.name}! You are now {self.age}.")
|
||||
|
||||
|
||||
# --- Creating instances ---
|
||||
dog1 = Dog("Rex", "German Shepherd", 3)
|
||||
dog2 = Dog("Bella", "Labrador", 5)
|
||||
|
||||
dog1.bark()
|
||||
dog2.bark()
|
||||
dog1.describe()
|
||||
dog2.describe()
|
||||
|
||||
dog1.have_birthday()
|
||||
dog1.describe()
|
||||
|
||||
# --- Accessing attributes directly ---
|
||||
print(dog2.name) # Bella
|
||||
print(dog2.age) # 5
|
||||
|
||||
# ============================================================
|
||||
# A MORE COMPLETE EXAMPLE: BankAccount
|
||||
# ============================================================
|
||||
|
||||
class BankAccount:
|
||||
def __init__(self, owner, balance=0):
|
||||
self.owner = owner
|
||||
self.balance = balance
|
||||
self.transactions = [] # every instance gets its own empty list
|
||||
|
||||
def deposit(self, amount):
|
||||
if amount <= 0:
|
||||
print("Deposit amount must be positive.")
|
||||
return
|
||||
self.balance += amount
|
||||
self.transactions.append(f"+{amount}")
|
||||
print(f"Deposited ${amount:.2f}. New balance: ${self.balance:.2f}")
|
||||
|
||||
def withdraw(self, amount):
|
||||
if amount <= 0:
|
||||
print("Withdrawal amount must be positive.")
|
||||
return
|
||||
if amount > self.balance:
|
||||
print("Insufficient funds.")
|
||||
return
|
||||
self.balance -= amount
|
||||
self.transactions.append(f"-{amount}")
|
||||
print(f"Withdrew ${amount:.2f}. New balance: ${self.balance:.2f}")
|
||||
|
||||
def show_statement(self):
|
||||
print(f"\n--- Statement for {self.owner} ---")
|
||||
for t in self.transactions:
|
||||
print(f" {t}")
|
||||
print(f" Balance: ${self.balance:.2f}")
|
||||
|
||||
|
||||
account = BankAccount("Alice", balance=1000)
|
||||
account.deposit(500)
|
||||
account.withdraw(200)
|
||||
account.withdraw(2000) # should fail
|
||||
account.show_statement()
|
||||
|
||||
# ============================================================
|
||||
# INHERITANCE
|
||||
# ============================================================
|
||||
# A child class can inherit everything from a parent class and then
|
||||
# add or override behavior. This avoids code duplication.
|
||||
|
||||
class Animal:
|
||||
def __init__(self, name, sound):
|
||||
self.name = name
|
||||
self.sound = sound
|
||||
|
||||
def speak(self):
|
||||
print(f"{self.name} says {self.sound}!")
|
||||
|
||||
def __str__(self):
|
||||
# __str__ controls what print(obj) shows
|
||||
return f"Animal(name={self.name})"
|
||||
|
||||
|
||||
class Cat(Animal): # Cat inherits from Animal
|
||||
def __init__(self, name, indoor=True):
|
||||
super().__init__(name, "Meow") # call the parent __init__
|
||||
self.indoor = indoor
|
||||
|
||||
def purr(self): # method unique to Cat
|
||||
print(f"{self.name} purrs contentedly.")
|
||||
|
||||
|
||||
class Parrot(Animal):
|
||||
def __init__(self, name, phrase):
|
||||
super().__init__(name, phrase)
|
||||
self.phrase = phrase
|
||||
|
||||
def speak(self): # override the parent method
|
||||
print(f"{self.name} squawks: '{self.phrase}!'")
|
||||
|
||||
|
||||
generic = Animal("Generic", "...")
|
||||
cat = Cat("Whiskers")
|
||||
parrot = Parrot("Polly", "Polly wants a cracker")
|
||||
|
||||
generic.speak()
|
||||
cat.speak() # inherited from Animal
|
||||
cat.purr() # Cat-specific
|
||||
parrot.speak() # overridden version
|
||||
|
||||
print(cat.indoor) # True
|
||||
print(str(generic)) # Animal(name=Generic)
|
||||
|
||||
# --- isinstance() checks whether an object is an instance of a class ---
|
||||
print(isinstance(cat, Cat)) # True
|
||||
print(isinstance(cat, Animal)) # True — Cat IS an Animal (via inheritance)
|
||||
print(isinstance(cat, Dog)) # False
|
||||
|
||||
# ============================================================
|
||||
# CLASS ATTRIBUTES (shared by all instances)
|
||||
# ============================================================
|
||||
|
||||
class Counter:
    count = 0    # class attribute, shared across all instances

    def __init__(self):
        Counter.count += 1
        self.id = Counter.count

    def __str__(self):
        return f"Counter #{self.id}"
|
||||
|
||||
|
||||
c1 = Counter()
|
||||
c2 = Counter()
|
||||
c3 = Counter()
|
||||
print(c1, c2, c3)
|
||||
print(f"Total counters created: {Counter.count}")
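# One thing the Counter example does not show: assigning to a class
# attribute through an instance creates a separate instance attribute
# rather than changing the shared value. A minimal sketch follows; the
# Settings class is hypothetical and exists only for this illustration.

```python
class Settings:            # hypothetical class for illustration
    theme = "light"        # class attribute, shared by all instances


a = Settings()
b = Settings()

a.theme = "dark"           # creates an instance attribute on 'a' only
print(a.theme, b.theme, Settings.theme)   # dark light light

Settings.theme = "blue"    # changes the shared value
print(a.theme, b.theme)    # dark blue ('a' still sees its own attribute)
```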
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Create a 'Rectangle' class with:
|
||||
# - __init__ that takes width and height
|
||||
# - an area() method that returns width * height
|
||||
# - a perimeter() method that returns 2 * (width + height)
|
||||
# - a __str__ method that returns a string like "Rectangle(4 x 6)"

|
||||
# Create two rectangles and print their area and perimeter.
|
||||
164
reference/python/09_error_handling.py
Normal file
@ -0,0 +1,164 @@
|
||||
# Lesson 09: Error Handling
|
||||
#
|
||||
# Errors (called exceptions) happen at runtime — bad input, missing files,
|
||||
# dividing by zero, etc. Instead of crashing, you can catch and handle them
|
||||
# gracefully using try/except blocks.
|
||||
#
|
||||
# To run this file:
|
||||
# python 09_error_handling.py
|
||||
|
||||
# ============================================================
|
||||
# WHY PROGRAMS CRASH
|
||||
# ============================================================
|
||||
# Uncomment any line below to see the error it produces, then re-comment it.
|
||||
|
||||
# print(10 / 0) # ZeroDivisionError
|
||||
# print(int("hello")) # ValueError
|
||||
# print([][5]) # IndexError
|
||||
# print({"a": 1}["z"]) # KeyError
|
||||
# print(open("nope.txt")) # FileNotFoundError
|
||||
|
||||
# ============================================================
|
||||
# BASIC TRY / EXCEPT
|
||||
# ============================================================
|
||||
# Code inside 'try' runs normally.
|
||||
# If an error occurs, Python jumps to the matching 'except' block
|
||||
# instead of crashing.
|
||||
|
||||
try:
|
||||
result = 10 / 0
|
||||
except ZeroDivisionError:
|
||||
print("Cannot divide by zero!")
|
||||
|
||||
# The program continues running after the except block.
|
||||
print("Program is still running.")
|
||||
|
||||
# ============================================================
|
||||
# CATCHING MULTIPLE EXCEPTION TYPES
|
||||
# ============================================================
|
||||
|
||||
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        print("Error: division by zero.")
        return None
    except TypeError:
        print("Error: both arguments must be numbers.")
        return None
|
||||
|
||||
print(safe_divide(10, 2)) # 5.0
|
||||
print(safe_divide(10, 0)) # Error message, then None
|
||||
print(safe_divide(10, "x")) # Error message, then None
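# When several exception types should be handled the same way, they can
# share one except clause by listing them in a tuple. A minimal sketch;
# to_ratio is a hypothetical helper, not part of this lesson.

```python
def to_ratio(a, b):
    """Convert both inputs to floats and divide them (hypothetical helper)."""
    try:
        return float(a) / float(b)
    except (ValueError, ZeroDivisionError) as e:
        # One handler covers both a bad conversion and a zero divisor.
        print(f"Could not compute ratio: {e}")
        return None


print(to_ratio("3", "4"))   # 0.75
print(to_ratio("3", "0"))   # error message, then None
print(to_ratio("x", "4"))   # error message, then None
```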
|
||||
|
||||
# ============================================================
|
||||
# THE else AND finally CLAUSES
|
||||
# ============================================================
|
||||
# else runs only if NO exception occurred.
|
||||
# finally always runs, whether or not an exception occurred.
|
||||
# Use finally for cleanup (closing files, database connections, etc.).
|
||||
|
||||
def parse_number(text):
|
||||
try:
|
||||
number = int(text)
|
||||
except ValueError:
|
||||
print(f" '{text}' is not a valid integer.")
|
||||
else:
|
||||
print(f" Parsed successfully: {number}")
|
||||
finally:
|
||||
print(" (parse attempt complete)") # always runs
|
||||
|
||||
print("\nParsing '42':")
|
||||
parse_number("42")
|
||||
|
||||
print("\nParsing 'abc':")
|
||||
parse_number("abc")
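# The parse_number example only prints from finally. Here is a sketch of
# the cleanup use mentioned above. FakeConnection is a hypothetical
# stand-in for a real file or database handle.

```python
class FakeConnection:
    """Hypothetical resource that must be closed after use."""

    def __init__(self):
        self.is_open = True

    def query(self, text):
        if not text:
            raise ValueError("empty query")
        return f"ran: {text}"

    def close(self):
        self.is_open = False


def run_query(conn, text):
    try:
        return conn.query(text)
    except ValueError as e:
        print(f"Query failed: {e}")
        return None
    finally:
        # Runs on success and on failure, even though both paths return.
        conn.close()


good = FakeConnection()
print(run_query(good, "SELECT 1"), good.is_open)   # ran: SELECT 1 False

bad = FakeConnection()
print(run_query(bad, ""), bad.is_open)             # error message, then None False
```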
|
||||
|
||||
# ============================================================
|
||||
# GETTING THE ERROR MESSAGE
|
||||
# ============================================================
|
||||
# Use 'as e' to capture the exception object and read its message.
|
||||
|
||||
try:
|
||||
value = int("not a number")
|
||||
except ValueError as e:
|
||||
print(f"\nCaught a ValueError: {e}")
|
||||
|
||||
# ============================================================
|
||||
# RAISING EXCEPTIONS
|
||||
# ============================================================
|
||||
# You can deliberately raise an exception to signal that something
|
||||
# went wrong in your own code.
|
||||
|
||||
def set_age(age):
|
||||
if not isinstance(age, int):
|
||||
raise TypeError("Age must be an integer.")
|
||||
if age < 0 or age > 150:
|
||||
raise ValueError(f"Age {age} is out of realistic range.")
|
||||
return age
|
||||
|
||||
try:
|
||||
set_age(-5)
|
||||
except ValueError as e:
|
||||
print(f"\nInvalid age: {e}")
|
||||
|
||||
try:
|
||||
set_age("thirty")
|
||||
except TypeError as e:
|
||||
print(f"Type error: {e}")
|
||||
|
||||
# ============================================================
|
||||
# PRACTICAL EXAMPLE: ROBUST USER INPUT
|
||||
# ============================================================
|
||||
# In a real program you'd use input() here. For the demo we
|
||||
# simulate the user typing a value.
|
||||
|
||||
def get_positive_number(value_str):
|
||||
try:
|
||||
number = float(value_str)
|
||||
if number <= 0:
|
||||
raise ValueError("Number must be positive.")
|
||||
return number
|
||||
except ValueError as e:
|
||||
print(f"Invalid input: {e}")
|
||||
return None
|
||||
|
||||
test_inputs = ["42", "-3", "0", "abc", "7.5"]
|
||||
for val in test_inputs:
|
||||
result = get_positive_number(val)
|
||||
if result is not None:
|
||||
print(f" Accepted: {result}")
|
||||
|
||||
# ============================================================
|
||||
# CUSTOM EXCEPTIONS
|
||||
# ============================================================
|
||||
# You can create your own exception classes by inheriting from Exception.
|
||||
# This is useful when you want callers to catch a specific, named error.
|
||||
|
||||
class InsufficientFundsError(Exception):
|
||||
pass
|
||||
|
||||
class Wallet:
|
||||
def __init__(self, balance):
|
||||
self.balance = balance
|
||||
|
||||
def spend(self, amount):
|
||||
if amount > self.balance:
|
||||
raise InsufficientFundsError(
|
||||
f"Tried to spend ${amount:.2f} but only have ${self.balance:.2f}."
|
||||
)
|
||||
self.balance -= amount
|
||||
print(f"Spent ${amount:.2f}. Remaining: ${self.balance:.2f}")
|
||||
|
||||
wallet = Wallet(50)
|
||||
try:
|
||||
wallet.spend(30)
|
||||
wallet.spend(30) # this will fail
|
||||
except InsufficientFundsError as e:
|
||||
print(f"\nPayment failed: {e}")
|
||||
|
||||
# --- Try it yourself ---
|
||||
# Write a function 'safe_index(lst, i)' that:
|
||||
# - Returns lst[i] if i is a valid index
|
||||
# - Catches IndexError and prints a friendly message instead of crashing
|
||||
# - Returns None when the index is out of range
|
||||
149
reference/python/10_file_io.py
Normal file
@ -0,0 +1,149 @@
|
||||
# Lesson 10: File Input and Output
|
||||
#
|
||||
# Programs often need to read data from files and write results back.
|
||||
# Python makes this straightforward with the built-in open() function.
|
||||
#
|
||||
# File modes:
|
||||
# "r" — read (default). File must exist.
|
||||
# "w" — write. Creates the file; OVERWRITES if it already exists.
|
||||
# "a" — append. Creates the file; adds to the end if it already exists.
|
||||
# "r+" — read and write.
|
||||
#
|
||||
# Always use the 'with' statement — it closes the file automatically,
|
||||
# even if an error occurs.
|
||||
#
|
||||
# To run this file:
|
||||
# python 10_file_io.py
|
||||
|
||||
import os
|
||||
|
||||
# ============================================================
|
||||
# WRITING TO A FILE
|
||||
# ============================================================
|
||||
|
||||
output_path = "sample_output.txt"
|
||||
|
||||
with open(output_path, "w") as f:
|
||||
f.write("Line 1: Hello from Python!\n")
|
||||
f.write("Line 2: Writing to files is easy.\n")
|
||||
f.write("Line 3: The 'with' block closes the file for us.\n")
|
||||
|
||||
print(f"File written: {output_path}")
|
||||
|
||||
# ============================================================
|
||||
# READING AN ENTIRE FILE AT ONCE
|
||||
# ============================================================
|
||||
|
||||
with open(output_path, "r") as f:
|
||||
contents = f.read()
|
||||
|
||||
print("\n--- Full file contents ---")
|
||||
print(contents)
|
||||
|
||||
# ============================================================
|
||||
# READING LINE BY LINE
|
||||
# ============================================================
|
||||
|
||||
print("--- Reading line by line ---")
|
||||
with open(output_path, "r") as f:
|
||||
for line in f:
|
||||
# line includes the trailing newline character, so strip it
|
||||
print(repr(line.strip()))
|
||||
|
||||
# --- readlines() returns a list of all lines ---
|
||||
with open(output_path, "r") as f:
|
||||
lines = f.readlines()
|
||||
|
||||
print(f"\nNumber of lines: {len(lines)}")
|
||||
print(f"First line: {lines[0].strip()}")
|
||||
|
||||
# ============================================================
|
||||
# APPENDING TO A FILE
|
||||
# ============================================================
|
||||
|
||||
with open(output_path, "a") as f:
|
||||
f.write("Line 4: Appended after the original content.\n")
|
||||
|
||||
with open(output_path, "r") as f:
|
||||
print("\n--- After appending ---")
|
||||
print(f.read())
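# The sections above use "w", "r", and "a" but not "r+". This sketch
# runs all four modes from the list at the top of the file against one
# temporary file. The file name modes_demo.txt is made up for the demo.

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
path = os.path.join(demo_dir, "modes_demo.txt")   # made-up file name

with open(path, "w") as f:      # creates the file
    f.write("first\n")

with open(path, "w") as f:      # "w" again: the earlier content is gone
    f.write("second\n")

with open(path, "a") as f:      # "a": adds to the end
    f.write("third\n")

with open(path, "r+") as f:     # "r+": the file must already exist
    before = f.read()           # reading moves the position to the end
    f.write("fourth\n")         # so this write lands after the old content

with open(path, "r") as f:
    after = f.read()

print(repr(before))   # 'second\nthird\n'
print(repr(after))    # 'second\nthird\nfourth\n'

os.remove(path)
os.rmdir(demo_dir)
```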
|
||||
|
||||
# ============================================================
|
||||
# WRITING MULTIPLE LINES WITH writelines()
|
||||
# ============================================================
|
||||
|
||||
shopping_list = ["eggs\n", "milk\n", "bread\n", "butter\n"]
|
||||
|
||||
with open("shopping.txt", "w") as f:
|
||||
f.writelines(shopping_list)
|
||||
|
||||
print("shopping.txt created.")
|
||||
|
||||
# ============================================================
|
||||
# WORKING WITH CSV-STYLE DATA
|
||||
# ============================================================
|
||||
# CSV (comma-separated values) is a common plain-text format for tables.
|
||||
# Python's csv module handles quoting and edge cases automatically.
|
||||
|
||||
import csv
|
||||
|
||||
students = [
|
||||
{"name": "Alice", "grade": "A", "score": 95},
|
||||
{"name": "Bob", "grade": "B", "score": 83},
|
||||
{"name": "Carol", "grade": "A", "score": 91},
|
||||
]
|
||||
|
||||
csv_path = "students.csv"
|
||||
|
||||
# Write CSV
|
||||
with open(csv_path, "w", newline="") as f:
|
||||
fieldnames = ["name", "grade", "score"]
|
||||
writer = csv.DictWriter(f, fieldnames=fieldnames)
|
||||
writer.writeheader()
|
||||
writer.writerows(students)
|
||||
|
||||
print(f"\n{csv_path} written.")
|
||||
|
||||
# Read CSV
|
||||
print("\n--- Reading students.csv ---")
|
||||
with open(csv_path, "r", newline="") as f:  # the csv module expects newline="" when reading too
|
||||
reader = csv.DictReader(f)
|
||||
for row in reader:
|
||||
print(f" {row['name']}: {row['grade']} ({row['score']})")
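# A sketch of the quoting the csv module does for you. The row below
# contains both a comma and double quotes, which would break a naive
# ",".join(). io.StringIO is used as an in-memory file so nothing is
# written to disk; the row contents are made up.

```python
import csv
import io

buffer = io.StringIO()                 # in-memory stand-in for a file
writer = csv.writer(buffer)
writer.writerow(["name", "note"])
writer.writerow(["Carol", 'Said "hi", then left'])   # made-up row

raw_text = buffer.getvalue()
print(raw_text)    # the note is wrapped in quotes, inner quotes are doubled

buffer.seek(0)                         # rewind before reading back
rows = list(csv.reader(buffer))
print(rows[1])     # ['Carol', 'Said "hi", then left']
```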
|
||||
|
||||
# ============================================================
|
||||
# CHECKING IF A FILE EXISTS
|
||||
# ============================================================
|
||||
|
||||
print(f"\nDoes '{output_path}' exist? {os.path.exists(output_path)}")
|
||||
print(f"Does 'missing.txt' exist? {os.path.exists('missing.txt')}")
|
||||
|
||||
# ============================================================
|
||||
# SAFE READING WITH ERROR HANDLING
|
||||
# ============================================================
|
||||
|
||||
def read_file_safe(path):
|
||||
try:
|
||||
with open(path, "r") as f:
|
||||
return f.read()
|
||||
except FileNotFoundError:
|
||||
print(f"File not found: {path}")
|
||||
return None
|
||||
|
||||
content = read_file_safe("missing.txt")
|
||||
if content is None:
|
||||
print("Could not read file — skipping.")
|
||||
|
||||
# ============================================================
|
||||
# CLEANUP: remove the files we created
|
||||
# ============================================================
|
||||
|
||||
for filename in [output_path, "shopping.txt", csv_path]:
|
||||
if os.path.exists(filename):
|
||||
os.remove(filename)
|
||||
print(f"Removed: {filename}")
|
||||
|
||||
# --- Try it yourself ---
|
||||
# 1. Write a program that asks the user to enter their name and favorite color,
|
||||
# then saves those to a file called "profile.txt".
|
||||
# 2. Read the file back and print a greeting using the saved information.
|
||||
198
reference/python/11_putting_it_together.py
Normal file
@ -0,0 +1,198 @@
|
||||
# Lesson 11: Putting It All Together
|
||||
#
|
||||
# This final lesson builds a small, complete command-line program that
|
||||
# uses everything from the previous lessons:
|
||||
# - variables and data types
|
||||
# - conditionals and loops
|
||||
# - functions
|
||||
# - classes
|
||||
# - error handling
|
||||
# - file I/O
|
||||
#
|
||||
# The program is a simple Student Grade Tracker. It lets you:
|
||||
# 1. Add a student and their score
|
||||
# 2. View all students and their grades
|
||||
# 3. Show class statistics
|
||||
# 4. Save the roster to a file
|
||||
# 5. Load a previously saved roster
|
||||
#
|
||||
# To run this file:
|
||||
# python 11_putting_it_together.py
|
||||
|
||||
import csv
|
||||
import os
|
||||
|
||||
# ============================================================
|
||||
# CONSTANTS
|
||||
# ============================================================
|
||||
SAVE_FILE = "roster.csv"
|
||||
|
||||
# ============================================================
|
||||
# CLASSES
|
||||
# ============================================================
|
||||
|
||||
class Student:
|
||||
def __init__(self, name, score):
|
||||
self.name = name
|
||||
self.score = float(score)
|
||||
|
||||
@property
|
||||
def grade(self):
|
||||
if self.score >= 90:
|
||||
return "A"
|
||||
elif self.score >= 80:
|
||||
return "B"
|
||||
elif self.score >= 70:
|
||||
return "C"
|
||||
elif self.score >= 60:
|
||||
return "D"
|
||||
else:
|
||||
return "F"
|
||||
|
||||
def __str__(self):
|
||||
return f"{self.name:<20} Score: {self.score:5.1f} Grade: {self.grade}"
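# The Student class uses @property, which the earlier lessons do not
# cover. A property is a method that is read like an attribute, without
# parentheses, and is recomputed on every access. A minimal sketch; the
# Temperature class is hypothetical and not part of this program.

```python
class Temperature:         # hypothetical class for illustration
    def __init__(self, celsius):
        self.celsius = celsius

    @property
    def fahrenheit(self):
        # Computed from the current celsius value each time it is read.
        return self.celsius * 9 / 5 + 32


t = Temperature(100)
print(t.fahrenheit)        # 212.0 (note: no parentheses)

t.celsius = 0
print(t.fahrenheit)        # 32.0, follows the updated celsius value
```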
|
||||
|
||||
|
||||
class Roster:
|
||||
def __init__(self):
|
||||
self.students = []
|
||||
|
||||
    def add_student(self, name, score):
        try:
            score = float(score)
        except ValueError:
            print(f" Error: '{score}' is not a valid score.")
            return
        # Written as one chained comparison so that float("nan"), which
        # fails every comparison, is rejected instead of slipping through.
        if not 0 <= score <= 100:
            print(" Error: Score must be between 0 and 100.")
            return
        self.students.append(Student(name, score))
        print(f" Added: {name} (score: {score})")
|
||||
|
||||
def display(self):
|
||||
if not self.students:
|
||||
print(" No students in roster.")
|
||||
return
|
||||
print(f"\n {'Name':<20} {'Score':>7} {'Grade':>6}")
|
||||
print(" " + "-" * 38)
|
||||
for student in sorted(self.students, key=lambda s: s.score, reverse=True):
|
||||
print(f" {student}")
|
||||
|
||||
def statistics(self):
|
||||
if not self.students:
|
||||
print(" No data yet.")
|
||||
return
|
||||
scores = [s.score for s in self.students]
|
||||
print(f"\n Students : {len(scores)}")
|
||||
print(f" Average : {sum(scores) / len(scores):.1f}")
|
||||
print(f" Highest : {max(scores):.1f}")
|
||||
print(f" Lowest : {min(scores):.1f}")
|
||||
|
||||
grade_counts = {}
|
||||
for student in self.students:
|
||||
grade_counts[student.grade] = grade_counts.get(student.grade, 0) + 1
|
||||
print(" Grades :", ", ".join(
|
||||
f"{g}: {n}" for g, n in sorted(grade_counts.items())
|
||||
))
|
||||
|
||||
def save(self, path):
|
||||
try:
|
||||
with open(path, "w", newline="") as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerow(["name", "score"])
|
||||
for s in self.students:
|
||||
writer.writerow([s.name, s.score])
|
||||
print(f" Saved {len(self.students)} student(s) to {path}")
|
||||
except OSError as e:
|
||||
print(f" Save failed: {e}")
|
||||
|
||||
    def load(self, path):
        if not os.path.exists(path):
            print(f" File not found: {path}")
            return
        try:
            # newline="" matches how save() opened the file and is what
            # the csv module expects for files it reads.
            with open(path, "r", newline="") as f:
                reader = csv.DictReader(f)
                loaded = 0
                for row in reader:
                    self.students.append(Student(row["name"], row["score"]))
                    loaded += 1
            print(f" Loaded {loaded} student(s) from {path}")
        except (OSError, KeyError, ValueError) as e:
            print(f" Load failed: {e}")
|
||||
|
||||
|
||||
# ============================================================
|
||||
# MENU HELPERS
|
||||
# ============================================================
|
||||
|
||||
def print_menu():
|
||||
print("\n" + "=" * 40)
|
||||
print(" Student Grade Tracker")
|
||||
print("=" * 40)
|
||||
print(" 1. Add a student")
|
||||
print(" 2. View all students")
|
||||
print(" 3. Show statistics")
|
||||
print(" 4. Save roster")
|
||||
print(" 5. Load roster")
|
||||
print(" 6. Quit")
|
||||
print("=" * 40)
|
||||
|
||||
|
||||
def get_choice():
|
||||
while True:
|
||||
choice = input(" Enter choice (1-6): ").strip()
|
||||
if choice in ("1", "2", "3", "4", "5", "6"):
|
||||
return choice
|
||||
print(" Please enter a number from 1 to 6.")
|
||||
|
||||
|
||||
# ============================================================
|
||||
# MAIN PROGRAM
|
||||
# ============================================================
|
||||
|
||||
def main():
|
||||
roster = Roster()
|
||||
|
||||
# Pre-populate with some sample data so the demo is interesting
|
||||
# even without any user input.
|
||||
sample_data = [
|
||||
("Alice", 95),
|
||||
("Bob", 83),
|
||||
("Carol", 76),
|
||||
("David", 61),
|
||||
("Eve", 100),
|
||||
]
|
||||
for name, score in sample_data:
|
||||
roster.add_student(name, score)
|
||||
|
||||
while True:
|
||||
print_menu()
|
||||
choice = get_choice()
|
||||
|
||||
if choice == "1":
|
||||
name = input(" Student name: ").strip()
|
||||
score = input(" Score (0-100): ").strip()
|
||||
roster.add_student(name, score)
|
||||
|
||||
elif choice == "2":
|
||||
roster.display()
|
||||
|
||||
elif choice == "3":
|
||||
roster.statistics()
|
||||
|
||||
elif choice == "4":
|
||||
roster.save(SAVE_FILE)
|
||||
|
||||
elif choice == "5":
|
||||
roster.load(SAVE_FILE)
|
||||
|
||||
elif choice == "6":
|
||||
print("\n Goodbye!\n")
|
||||
break
|
||||
|
||||
|
||||
# This guard ensures main() only runs when you execute this file directly,
|
||||
# not when it is imported as a module by another script.
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
44
reference/python/README.md
Normal file
@ -0,0 +1,44 @@
|
||||
# Python — reference material
|
||||
|
||||
This folder is a self-paced Python primer. It is **not** the spine of the class — your project is. We've put it here because some projects will need a working understanding of Python (or another language), and when that comes up it's useful to have something concrete to point at.
|
||||
|
||||
## When to dip in
|
||||
|
||||
- Your project genuinely needs you to read or modify Python code, and AI explanations alone aren't sticking.
|
||||
- You're curious how the code AI generates for you actually works.
|
||||
- You want a foundation so you can steer AI more confidently.
|
||||
|
||||
## When *not* to dip in
|
||||
|
||||
- Out of a sense of obligation. There's no test. Nobody is going to ask if you finished `08_classes.py`.
|
||||
- Before you have a problem to apply it to. These lessons are dense if you're reading them in the abstract; they click much faster when you have a concrete reason to care.
|
||||
|
||||
## How to run a script
|
||||
|
||||
All scripts are standalone. From this folder:
|
||||
|
||||
```bash
|
||||
python3 01_hello_world.py
|
||||
```
|
||||
|
||||
(Use `python` instead of `python3` on Windows if that's what your install named.)
|
||||
|
||||
## Lessons
|
||||
|
||||
Work in order if you're starting from zero. Skip around if you already know parts of this.
|
||||
|
||||
| File | Topic |
|
||||
|------|-------|
|
||||
| `01_hello_world.py` | `print()` — your first program |
|
||||
| `02_variables.py` | Variables, f-strings |
|
||||
| `03_datatypes.py` | int, float, str, bool, None; type conversion |
|
||||
| `04_conditionals.py` | `if`, `elif`, `else`; comparisons; logical operators |
|
||||
| `05_loops.py` | `for`, `while`, `break`, `continue`, `range()` |
|
||||
| `06_functions.py` | Defining and calling functions; parameters; return values |
|
||||
| `07_lists_and_dicts.py` | Core data structures; list comprehensions |
|
||||
| `08_classes.py` | Objects, attributes, methods, inheritance |
|
||||
| `09_error_handling.py` | `try`/`except`; raising exceptions |
|
||||
| `10_file_io.py` | Reading and writing text and CSV files |
|
||||
| `11_putting_it_together.py` | A complete CLI program combining everything |
|
||||
|
||||
If Python isn't installed yet, see [`installing-python.md`](installing-python.md).
|
||||
58
reference/python/installing-python.md
Normal file
@ -0,0 +1,58 @@
|
||||
# Installing Python
|
||||
|
||||
You only need this if your project leads you into Python. Skip it otherwise.
|
||||
|
||||
## Windows
|
||||
|
||||
1. Open your browser and go to https://www.python.org/downloads/
|
||||
2. Click the big yellow **"Download Python 3.x.x"** button.
|
||||
3. Run the installer. **Important:** on the first screen, check the box that says **"Add Python to PATH"** (newer installers label it **"Add python.exe to PATH"**) before clicking Install Now.
|
||||
4. Click **Install Now** and follow the prompts.
|
||||
|
||||
Verify:
|
||||
|
||||
1. Press `Windows + R`, type `cmd`, press Enter.
|
||||
2. Run:
|
||||
```
|
||||
python --version
|
||||
```
|
||||
You should see something like `Python 3.12.0`. If not, try `py --version` or `python3 --version`.
|
||||
|
||||
## Mac
|
||||
|
||||
Older versions of macOS shipped with Python 2, and newer ones may not include a ready-to-use Python at all. Either way, we need Python 3.
|
||||
|
||||
**Option A — direct download:**
|
||||
|
||||
1. Go to https://www.python.org/downloads/
|
||||
2. Click **"Download Python 3.x.x"** and run the `.pkg` installer.
|
||||
|
||||
**Option B — Homebrew (recommended if you'll code regularly):**
|
||||
|
||||
1. Open **Terminal** (`Cmd + Space`, type "Terminal", Enter).
|
||||
2. Install Homebrew:
|
||||
```
|
||||
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
|
||||
```
|
||||
3. Install Python:
|
||||
```
|
||||
brew install python
|
||||
```
|
||||
|
||||
Verify in Terminal:
|
||||
```
|
||||
python3 --version
|
||||
```
|
||||
|
||||
## Linux
|
||||
|
||||
Python 3 is almost certainly already installed. Check:
|
||||
```
|
||||
python3 --version
|
||||
```
|
||||
|
||||
If not, install it through your distribution's package manager (`apt`, `dnf`, `pacman`, etc.).
|
||||
|
||||
## A note on `python` vs `python3`
|
||||
|
||||
On Mac and Linux, `python` may not exist, or on older systems it may point to Python 2. Use `python3` if `python` doesn't work. When both commands exist on a modern system they usually run the same Python 3 interpreter, so the only difference is the name.
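
If you're unsure which interpreter a command actually starts, a short script can report it. This is a minimal sketch using only the standard library; save it under any name (for example `check_python.py`, a made-up filename) and run it once with `python` and once with `python3` to compare.

```python
import sys

# Full path of the interpreter that is running this script.
print("Interpreter path:", sys.executable)

# Major, minor, and patch version as a dotted string.
version_text = ".".join(str(part) for part in sys.version_info[:3])
print("Version:", version_text)

# The lessons in this folder assume Python 3.
is_python3 = sys.version_info[0] >= 3
print("Python 3?", is_python3)
```

If both commands print the same interpreter path, they are the same install under two names.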
|
||||
26
reference/pytorch/README.md
Normal file
@ -0,0 +1,26 @@
|
||||
# PyTorch — reference material
|
||||
|
||||
*Placeholder.* This folder will cover PyTorch: the framework most modern AI models are written in. You don't need to know PyTorch to *use* a model, but you'll see it everywhere once you start poking around.
|
||||
|
||||
Planned topics:
|
||||
|
||||
- What a tensor is, and why it's basically a multi-dimensional array
|
||||
- Moving tensors between CPU and GPU (and what to do if you don't have a GPU)
|
||||
- Loading a pretrained model and running inference
|
||||
- The shape of a forward pass, conceptually
|
||||
- `torch.no_grad()` — and when it matters
|
||||
- Reading a model architecture without panicking
|
||||
- A very gentle intro to training and fine-tuning (separate file, optional)
|
||||
- CPU-only PyTorch installs vs. CUDA installs — which one you actually want
|
||||
|
||||
## When to dip in
|
||||
|
||||
When you find yourself reading model code and the lines starting with `torch.` are blocking you, or when you want to fine-tune something on your own data.
|
||||
|
||||
## When *not* to dip in
|
||||
|
||||
If you only ever call models through a high-level API (Hugging Face `pipeline()`, an OpenAI-compatible client, etc.), you may never need this.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Comfort with [`../python/`](../python/), especially classes and basic data structures.
|
||||
24
registration.md
Normal file
@ -0,0 +1,24 @@
# Event Registration

![Houston Hackerspace](assets/houston-hackerspace.jpg)
## Event Details

| | |
|---|---|
| **Date & Time** | June 11, 2026, 6:00 – 8:00 PM |
| **Location** | 708 Main Street, Houston, TX — 10th Floor |
| **Cost** | Pay What You Want |

## Support the Event

Donations are appreciated and help us keep this workshop accessible to everyone.

- [Venmo](https://venmo.com/u/Shen-Ge)
- [PayPal / Credit Card](https://www.paypal.com/ncp/payment/E5LCXMMQEEVPN)

## Pre-Workshop Survey

Please complete this short survey before attending:

[Take the Survey](https://docs.google.com/forms/d/e/1FAIpQLSd8an-FsTQZoksPjGanLGXqwDtDddY81RwlouBLvIpcQYnvyg/viewform)
44
sessions/01-orientation.md
Normal file
@ -0,0 +1,44 @@
# Session 1 — Orientation
This first session is about **calibration**: getting your sense of what computers and AI can do today onto the same page as reality.

You will not leave this session with a project. You'll leave with a question rattling around your head and, we hope, a different relationship with the device in your pocket.

---
## What to bring

Nothing. No laptop required. No installs. No accounts.

Come ready to watch, ask questions, and think.

---
## What we'll do together

The session has roughly four parts:

1. **Why this class exists.** The gap between what technology can do and what people are doing with it, and why right now is an unusual moment.
2. **The flip.** Your computer is not a fixed set of apps with bugs you have to live with. It's a programmable, instructable machine. Once you see it that way, you can't unsee it.
3. **Live demos.** Real things, solving real friction, in front of you. We'll be typing, correcting, and iterating on stage, not because we're showing off, but because seeing the human in the loop is the part that demystifies it.
4. **The seed question.** Where in *your* life is there friction because technology has never been pointed at it for *you*? We'll talk about it together. You don't have to have an answer.

---
## What you'll take home

One assignment, if you want to call it that:

> **Notice friction this week.**
>
> Pay attention to the things you do on a computer or a phone that feel tedious, repetitive, or "this should just work." Write a few of them down. Don't filter; even the silly ones count.

Bring whatever you have (or don't have) to the next session. We'll start working from there.

---
## After the session

If a thought stuck with you and you want to talk about it before next time, reach out to either of us. That's what we're here for.

See [`../personal-project.md`](../personal-project.md) if you want to read more about where this is all going.