Zero-Touch Metadata
Define your standard (MARC, Dublin Core, or a custom schema) as a Pydantic model. BookWyrm scans the document content and populates the fields automatically, strictly adhering to your types.
Standardized
Enforce consistent formatting for authors, dates, and subjects.
Comprehensive
Extract abstract summaries, keywords, and ISBNs simultaneously.
Type-Safe
Integration-ready JSON output for your CMS or catalog software.
Developer Implementation
1. Define the Schema (catalog_model.py)
from pydantic import BaseModel, Field
from typing import List

class LibraryRecord(BaseModel):
    title: str = Field(description="Official title of the work")
    authors: List[str] = Field(description="List of primary authors")
    isbn: str | None = Field(description="ISBN-13 if available")
    dewey_class: str | None = Field(description="Suggested Dewey Decimal class")
    subjects: List[str] = Field(description="Library of Congress Subject Headings")
    abstract: str = Field(description="A concise 100-word summary")
2. Run Extraction
bookwyrm summarize incoming_scan.txt \
    --model-class-file catalog_model.py \
    --model-class-name LibraryRecord \
    --model-strength smart \
    --output record_metadata.json
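Before handing the result to a CMS or catalog system, it can be worth re-validating the JSON against the same model. The sketch below is illustrative rather than part of BookWyrm: it assumes Pydantic v2, assumes the output file holds a single JSON object whose keys match the LibraryRecord fields, and uses a hypothetical helper named load_record with made-up sample values. The model is repeated so the snippet runs on its own, with Optional[str] standing in for str | None so it also works on Python versions before 3.10.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class LibraryRecord(BaseModel):
    # Mirrors the schema in catalog_model.py.
    title: str = Field(description="Official title of the work")
    authors: List[str] = Field(description="List of primary authors")
    isbn: Optional[str] = Field(description="ISBN-13 if available")
    dewey_class: Optional[str] = Field(description="Suggested Dewey Decimal class")
    subjects: List[str] = Field(description="Library of Congress Subject Headings")
    abstract: str = Field(description="A concise 100-word summary")


def load_record(raw_json: str) -> Optional[LibraryRecord]:
    """Hypothetical helper: parse and validate one extracted record.

    Returns None (after reporting the problem) if the JSON does not
    satisfy the schema.
    """
    try:
        return LibraryRecord.model_validate_json(raw_json)
    except ValidationError as exc:
        print(f"Record rejected: {exc.error_count()} validation error(s)")
        return None


# Made-up sample standing in for the contents of record_metadata.json.
sample = """
{
  "title": "Example Title",
  "authors": ["A. Author"],
  "isbn": null,
  "dewey_class": "025.3",
  "subjects": ["Cataloging"],
  "abstract": "Placeholder summary."
}
"""

record = load_record(sample)
if record is not None:
    print(record.title, record.authors)
```

In practice the string would come from the file itself, for example pathlib.Path("record_metadata.json").read_text(encoding="utf-8"). Records that fail validation can then be routed to manual review instead of being imported.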