TL;DR

The author photographed ~470 books and used Claude to build a pipeline that extracts metadata via a vision API, fetches covers, and renders a spine-based bookshelf UI. The system achieved roughly 90% automated accuracy; the author handled the remaining fixes, made the design decisions, and removed features that hurt the experience.

What happened

The project began with an afternoon of photographing roughly 470 books (spines, covers and duplicates). Claude generated scripts that converted image formats, sent images to OpenAI's vision API to extract author, title and publisher, normalized names, resized images and wrote the results to a JSON dataset. About 90% of entries came back correct; the remaining edge cases (poor lighting, damaged spines, rare editions) were corrected by hand.

For cover artwork, Claude queried the Open Library API first, ran quality scoring on the results, and fell back to Google Images via SerpAPI when matches were poor. Ten covers that no API could supply were edited in Photoshop.

For the UI, Claude built a bookshelf that recreates spines by extracting dominant colors and mapping page counts to spine widths. Framer Motion provided a scroll-based tilt animation, later optimized by moving the animation work outside React's render cycle. Infinite scroll was added, proved unnecessary and was removed. A stack view for mobile was also implemented.
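
As illustration only, the extraction step might look something like the sketch below. It assumes the OpenAI Node SDK and its Chat Completions image input; the model name, prompt and field names are invented for the example, not taken from the post.

```typescript
import { readFileSync } from "fs";
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical result shape; the post mentions author, title and publisher.
interface BookMetadata {
  title: string;
  author: string;
  publisher: string;
}

async function extractMetadata(imagePath: string): Promise<BookMetadata> {
  const base64 = readFileSync(imagePath).toString("base64");
  const response = await client.chat.completions.create({
    model: "gpt-4o", // assumed; the post does not name the model
    response_format: { type: "json_object" },
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: 'Read this book spine photo and return JSON with "title", "author" and "publisher".',
          },
          {
            type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${base64}` },
          },
        ],
      },
    ],
  });
  return JSON.parse(response.choices[0].message.content ?? "{}");
}
```

A normalization pass (trimming whitespace, canonicalizing author names) would then run over the parsed JSON before it lands in the dataset.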

Why it matters

  • AI agents can remove repetitive execution work, shifting the human role toward judgment and taste.
  • Accepting imperfect automated results (90% accuracy) can be more productive than chasing edge cases.
  • Combining multiple APIs and fallbacks helps handle nonstandard or rare editions that single services miss.
  • Tooling and small UX choices—like spine widths and animation performance—still rely on human decisions.

Key facts

  • Approximately 470 photos of books were taken and processed.
  • Claude wrote and executed scripts to convert images and call a vision API for metadata extraction.
  • The automated pipeline returned roughly 90% correct metadata; the author manually fixed the remainder.
  • Covers were fetched primarily from the Open Library API, with SerpAPI/Google Images as a fallback.
  • Ten covers were manually edited in Photoshop because no automated source provided acceptable assets.
  • Spine visuals used dominant color extraction and mapped Open Library page counts to spine widths (see the first sketch after this list).
  • Scroll-based tilt animation was implemented with Framer Motion and optimized by keeping motion values and springs outside React render cycles (see the second sketch after this list).
  • An infinite-scroll feature was added then removed because it degraded user experience despite working technically.
  • A mobile 'stack' view was added, reusing animation timing and data patterns from the shelf.
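
To make the spine facts concrete, here is a minimal sketch of both steps. It assumes the sharp image library for color sampling; every constant is an invented placeholder rather than a value from the post.

```typescript
import sharp from "sharp";

// Approximate a dominant color by averaging: resize the spine photo to a
// single pixel and read it back. A crude stand-in for whatever extraction
// the author's pipeline actually used.
async function dominantColor(imagePath: string): Promise<string> {
  const { data } = await sharp(imagePath)
    .resize(1, 1)
    .raw()
    .toBuffer({ resolveWithObject: true });
  const [r, g, b] = data;
  return `rgb(${r}, ${g}, ${b})`;
}

// Map an Open Library page count to a spine width in pixels.
// The clamping range and pixel bounds are illustrative assumptions.
function spineWidth(pageCount: number | undefined): number {
  const pages = Math.min(Math.max(pageCount ?? 250, 100), 1200);
  const minPx = 24;
  const maxPx = 72;
  return minPx + ((pages - 100) / (1200 - 100)) * (maxPx - minPx);
}
```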

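And a sketch of the scroll-driven tilt, using Framer Motion's motion-value hooks so the animation runs outside React's render cycle; the component shape, tilt range and spring settings are illustrative, not the author's values.

```tsx
import { motion, useScroll, useSpring, useTransform } from "framer-motion";
import type { ReactNode } from "react";

export function TiltOnScroll({ children }: { children: ReactNode }) {
  const { scrollYProgress } = useScroll();
  // Motion values update outside the React render cycle, so scrolling
  // animates the shelf without re-rendering it.
  const tilt = useTransform(scrollYProgress, [0, 1], [-8, 8]);
  const smoothTilt = useSpring(tilt, { stiffness: 120, damping: 20 });
  return <motion.div style={{ rotateX: smoothTilt }}>{children}</motion.div>;
}
```
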
What to watch next

  • How this pipeline scales to larger or more heterogeneous collections is not confirmed in the source.
  • Whether future additions will rely more on autogenerated covers or require additional manual edits is not confirmed in the source.
  • Potential long-term maintenance needs for API dependencies and format conversions are not confirmed in the source.

Quick glossary

  • Claude: A generative AI assistant used here to write and run scripts, orchestrate API calls and modify UI code.
  • OpenAI vision API: An image understanding service that extracts text and metadata from images.
  • Open Library API: A public service providing bibliographic metadata and cover images for many published works.
  • SerpAPI: A search engine results API used here as a fallback to fetch images via Google Images.
  • Framer Motion: A JavaScript library for animating React components, used to implement scroll-based tilt effects.

Reader FAQ

How accurate was the automated metadata extraction?
About 90% of books were correctly identified by the automated pipeline.

What did the author do manually?
The author fixed the remaining metadata errors, edited ten covers in Photoshop, and made UX decisions such as removing infinite scroll and choosing the spine-based layout.

Were any external APIs used for covers?
Yes — Open Library was used first, with SerpAPI/Google Images as a fallback for low-quality or missing matches.
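
As a rough sketch of that order of operations: the URL schemes below are Open Library's and SerpAPI's public ones, but isGoodEnough() is a stand-in for the quality scoring the author describes, and its threshold is invented.

```typescript
// Stand-in quality gate. Open Library's `default=false` flag makes the
// cover endpoint return 404 instead of a placeholder image; the size
// threshold here is an assumption, not from the source.
async function isGoodEnough(res: Response): Promise<boolean> {
  const bytes = Number(res.headers.get("content-length") ?? 0);
  return res.ok && bytes > 10_000;
}

async function fetchCover(isbn: string, title: string): Promise<string | null> {
  // 1. Try Open Library's public cover endpoint first.
  const openLibraryUrl = `https://covers.openlibrary.org/b/isbn/${isbn}-L.jpg?default=false`;
  if (await isGoodEnough(await fetch(openLibraryUrl))) return openLibraryUrl;

  // 2. Fall back to Google Images via SerpAPI.
  const serp = await fetch(
    `https://serpapi.com/search.json?engine=google_images` +
      `&q=${encodeURIComponent(`${title} book cover`)}` +
      `&api_key=${process.env.SERPAPI_KEY}`,
  );
  const data = await serp.json();
  return data.images_results?.[0]?.original ?? null;
}
```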

How long did the initial capture take?
Photographing the collection took one afternoon.

Is the exact deployment or hosting setup described?
Not confirmed in the source.

