🔬

UCReview AI

AI pipeline for auditing university course sheets

AI/LLM · Research · Python

Overview

During my Research Initiation Grant at FEUP, I built UCReview AI — a pipeline that automatically scrapes university course sheets, parses their structure, and uses LLMs to audit them for completeness and quality. The goal: make course information more transparent and comparable for students.

Stack

Python · LLM APIs · Web Scraping · Data Processing

What I learned

Build log

Struggles, findings, decisions, breakthroughs — the honest story.

🔴 Challenge

Course sheets have zero consistency

Every faculty formats its course sheets differently. Some are PDFs, some are HTML, and some are Word exports converted to web pages. The parser had to be fault-tolerant by design.
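To make "fault-tolerant by design" concrete, here is a minimal sketch of one way such a parser could be organised. It is not the project's actual code. It guesses the format from the content, dispatches to a per-format parser, and records a warning instead of raising. All names (`ParsedSheet`, `sniff_format`, `parse_sheet`) are hypothetical, and the two parsers are placeholders for real HTML and PDF extraction.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ParsedSheet:
    source: str
    text: str = ""
    warnings: list[str] = field(default_factory=list)


def sniff_format(raw: bytes) -> str:
    """Guess the format from the leading bytes instead of trusting the extension."""
    head = raw[:1024].lstrip().lower()
    if head.startswith(b"%pdf"):
        return "pdf"
    if head.startswith(b"<!doctype") or b"<html" in head:
        return "html"
    return "unknown"


def parse_html(raw: bytes) -> str:
    # Placeholder: a real pipeline would run an HTML parser here.
    return raw.decode("utf-8", errors="replace")


def parse_pdf(raw: bytes) -> str:
    # Placeholder: a real pipeline would call a PDF text extractor here.
    raise NotImplementedError("PDF extraction is not wired up in this sketch")


PARSERS: dict[str, Callable[[bytes], str]] = {"html": parse_html, "pdf": parse_pdf}


def parse_sheet(source: str, raw: bytes) -> ParsedSheet:
    """Never raise: record what went wrong and return whatever text was recovered."""
    sheet = ParsedSheet(source=source)
    fmt = sniff_format(raw)
    parser = PARSERS.get(fmt)
    if parser is None:
        sheet.warnings.append("unknown format, falling back to plain text")
        sheet.text = raw.decode("utf-8", errors="replace")
        return sheet
    try:
        sheet.text = parser(raw)
    except Exception as exc:  # one bad sheet must not abort the whole run
        sheet.warnings.append(f"{fmt} parser failed: {exc!r}")
    return sheet
```

Because failures are collected as warnings, a batch run can finish and list the problem sheets at the end rather than stopping at the first malformed one.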

🔀 Decision

Multi-provider factory pattern

Instead of hardcoding OpenAI, I built a factory that could swap providers. This saved the project when one provider had downtime during a critical testing phase.
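A provider factory of this kind can be sketched in a few lines. This is an illustration under assumptions, not the project's implementation: `EchoProvider` is a stand-in for a class wrapping a real vendor SDK, the registry names are made up, and the automatic fallback in `complete_with_fallback` goes one step beyond the manual swap described above.

```python
from typing import Callable, Protocol


class LLMProvider(Protocol):
    name: str

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in provider for the sketch; a real one would wrap a vendor SDK."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def complete(self, prompt: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"[{self.name}] {prompt}"


_REGISTRY: dict[str, Callable[[], LLMProvider]] = {}


def register(name: str, builder: Callable[[], LLMProvider]) -> None:
    _REGISTRY[name] = builder


def create_provider(name: str) -> LLMProvider:
    """The factory: callers ask for a provider by name, never by class."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown provider {name!r}; known: {sorted(_REGISTRY)}") from None


def complete_with_fallback(prompt: str, order: list[str]) -> str:
    """Try providers in order and return the first successful completion."""
    errors = []
    for name in order:
        try:
            return create_provider(name).complete(prompt)
        except ConnectionError as exc:
            errors.append(str(exc))
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical setup: the primary provider is down, the backup is healthy.
register("primary", lambda: EchoProvider("primary", healthy=False))
register("backup", lambda: EchoProvider("backup"))
```

With this setup, `complete_with_fallback("audit this sheet", ["primary", "backup"])` is served by the backup provider, and the rest of the pipeline never needs to know which vendor answered.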

💡 Finding

LLMs hallucinate on academic jargon

Early runs had the model confidently misinterpreting Portuguese academic terminology. Had to add a validation layer and constrain outputs to defined categories.
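A validation layer like this can be as simple as refusing any label outside a fixed set. The sketch below assumes three categories (`complete`, `partial`, `missing`). They are hypothetical and chosen only for illustration, as are the function names.

```python
from enum import Enum
from typing import Optional


class Verdict(str, Enum):
    # Hypothetical category set; the project's real categories are not shown here.
    COMPLETE = "complete"
    PARTIAL = "partial"
    MISSING = "missing"


def validate_verdict(raw_label: str) -> Optional[Verdict]:
    """Return a Verdict if the model's label is in the allowed set, else None."""
    cleaned = raw_label.strip().lower()
    try:
        return Verdict(cleaned)
    except ValueError:
        return None


def audit_field(field_name: str, raw_label: str) -> dict:
    """Wrap a model label so that anything off-list is flagged, not trusted."""
    verdict = validate_verdict(raw_label)
    return {
        "field": field_name,
        "verdict": verdict.value if verdict else None,
        "needs_review": verdict is None,
    }
```

Anything the model invents outside the list ends up with `needs_review` set, so a misread term surfaces for a human check instead of silently entering the results.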

Breakthrough

Structured output solved everything

Switching to JSON-mode / structured outputs dramatically improved reliability. Constraining the model's response format cut hallucination rate by ~80%.
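As a rough illustration of what constraining the response format means, the sketch below defines a JSON Schema that could be passed to a provider's structured-output feature, plus a strict parser that rejects any reply that does not fit it. The schema fields and enum values are assumptions made for the example, and the provider call itself is left out because it differs per vendor.

```python
import json

# Hypothetical schema: field names and enum values are illustrative only.
AUDIT_SCHEMA = {
    "type": "object",
    "properties": {
        "section": {"type": "string"},
        "verdict": {"type": "string", "enum": ["complete", "partial", "missing"]},
        "evidence": {"type": "string"},
    },
    "required": ["section", "verdict", "evidence"],
    "additionalProperties": False,
}


def parse_audit(response_text: str) -> dict:
    """Parse a model reply and reject anything that does not match AUDIT_SCHEMA."""
    data = json.loads(response_text)  # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    required = set(AUDIT_SCHEMA["required"])
    if set(data) != required:
        raise ValueError(f"missing or unexpected keys: {sorted(set(data) ^ required)}")
    for key in required:
        if not isinstance(data[key], str):
            raise ValueError(f"{key!r} must be a string")
    allowed = AUDIT_SCHEMA["properties"]["verdict"]["enum"]
    if data["verdict"] not in allowed:
        raise ValueError(f"verdict {data['verdict']!r} is not one of {allowed}")
    return data
```

Keeping a local check even when the provider enforces the schema means a swapped-in provider without that feature fails loudly rather than quietly producing free text.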

🔴 Challenge

Rate limits at scale

FEUP has hundreds of course sheets. Hitting API rate limits mid-run was painful. Built an async queue with backoff and checkpointing so runs could resume without starting over.
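The queue, backoff, and checkpoint pieces fit together roughly as in the sketch below. It is a simplified illustration, not the project's code. `RateLimitError` stands in for whatever exception a provider SDK raises on a rate limit, `audit` is any async function taking a sheet ID, and the checkpoint is a single JSON file mapping sheet IDs to results.

```python
import asyncio
import json
import random
from pathlib import Path
from typing import Awaitable, Callable


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def load_checkpoint(path: Path) -> dict[str, str]:
    return json.loads(path.read_text()) if path.exists() else {}


def save_checkpoint(path: Path, done: dict[str, str]) -> None:
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(done))
    tmp.replace(path)  # rename over the old file so it is never half-written


async def call_with_backoff(
    audit: Callable[[str], Awaitable[str]], sheet_id: str,
    retries: int = 5, base: float = 0.5,
) -> str:
    for attempt in range(retries):
        try:
            return await audit(sheet_id)
        except RateLimitError:
            # Exponential backoff plus jitter so workers do not retry in lockstep.
            await asyncio.sleep(base * 2**attempt + random.uniform(0, base))
    raise RuntimeError(f"gave up on {sheet_id} after {retries} attempts")


async def run_audits(
    sheet_ids: list[str], audit: Callable[[str], Awaitable[str]],
    checkpoint: Path, workers: int = 4, base: float = 0.5,
) -> dict[str, str]:
    done = load_checkpoint(checkpoint)
    queue = asyncio.Queue()
    for sheet_id in sheet_ids:
        if sheet_id not in done:  # skip work finished in a previous run
            queue.put_nowait(sheet_id)

    async def worker() -> None:
        while True:
            try:
                sheet_id = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            done[sheet_id] = await call_with_backoff(audit, sheet_id, base=base)
            save_checkpoint(checkpoint, done)  # persist after every sheet

    await asyncio.gather(*(worker() for _ in range(workers)))
    return done
```

Calling `asyncio.run(run_audits(ids, audit, Path("checkpoint.json")))` a second time after a crash only processes the IDs missing from the checkpoint file, which is what lets a run resume without starting over.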