VLM × Agent
Recipe Cookbook
Battle-tested recipes for using VLMs in agent systems. Each recipe covers which model to use, which prompt, working code, and the gotchas we hit in production.
Latest Recipes
Document Understanding · Intermediate
Extract Key Clauses from Contract Images
Have your agent automatically extract parties, amounts, dates, penalty clauses, and jurisdiction from scanned contracts and output structured JSON for approval workflows or risk systems.
Claude 3.5 Sonnet, GPT-4o, Gemini 1.5 Pro
Document Understanding · Intermediate
Parse Medical Lab Report Images
Extract test item names, values, units, reference ranges, and abnormal flags from photos of blood work and biochemistry reports, with structured output for health management agents.
GPT-4o, Claude 3.5 Sonnet
Chart & Table · Intermediate
Extract Data from Bar Charts
Have your agent pull numerical data (labels, values, series names) out of bar chart images from reports, dashboards, or slides and return structured JSON.
GPT-4o, Gemini 1.5 Pro, Claude 3.5 Sonnet