Documentation

A product-style technical brief that explains the complete grading architecture, major stages, technologies, and output artifacts.

OCR Backends (2): EasyOCR and Azure Document Intelligence
Scoring Engines (2): SBERT local path and Gemini LLM path
Modalities (3): Text, diagrams, and formula understanding
Primary Outputs (3): JSON, live ranking table, and PDF reports

Frontend Layer

Next.js website plus Streamlit workspace for research-style testing.

Backend Layer

FastAPI APIs, run management, job history, and report download endpoints.

Core Evaluation Engine

Reusable Python pipeline powering OCR, formulas, diagrams, scoring, and reports.

Pipeline Readiness

A quick visual indicator of how mature each major system block is in the current implementation.

  • Upload and validation: 100%
  • OCR processing: 100%
  • Formula understanding: 95%
  • Diagram extraction: 90%
  • Scoring workflow: 100%
  • Reporting: 100%

Stage-by-Stage Workflow

This is the technical narrative you can use while explaining the project to your guide.

Input and Run Setup

Collect the ideal answer sheet, rubric, and student answer sheets for a traceable batch run.

  • The ideal answer sheet becomes the reference answer source.
  • Rubric JSON controls marks distribution and weighting logic.
  • Student PDFs are stored under per-run folders for reproducibility.

Tech used: Next.js forms, FastAPI upload handling, JSON validation
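The rubric file referenced above might look like the following sketch; the field names (`questions`, `max_marks`, `weights`) are illustrative assumptions, not the project's actual schema:

```json
{
  "questions": [
    {
      "id": "Q1",
      "max_marks": 10,
      "weights": { "text": 0.6, "diagram": 0.2, "formula": 0.2 }
    },
    {
      "id": "Q2",
      "max_marks": 5,
      "weights": { "text": 1.0, "diagram": 0.0, "formula": 0.0 }
    }
  ]
}
```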
OCR and Content Capture

Convert PDF pages into structured textual evidence.

  • EasyOCR is used for cleaner printed answer sheets.
  • Azure Document Intelligence is used for handwritten sheets.
  • OCR output is stored page-wise for downstream reuse.

Tech used: pdf2image, Poppler, EasyOCR, Azure Document Intelligence
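Page-wise storage can be sketched as a small helper; the directory layout and field names here are assumptions, not the repository's actual structure:

```python
import json
from pathlib import Path

def save_page_ocr(run_dir: str, student: str, page_no: int, blocks: list) -> Path:
    """Persist one page's OCR blocks as JSON so downstream stages can reuse them.

    `blocks` is a list of dicts such as {"text": ..., "box": [x, y, w, h]}.
    """
    out_dir = Path(run_dir) / "ocr" / student
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"page_{page_no:03d}.json"
    out_path.write_text(json.dumps({"page": page_no, "blocks": blocks}, indent=2))
    return out_path
```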
Formula and Diagram Processing

Handle non-text answer content separately from regular prose.

  • Formula-like regions are converted into LaTeX using pix2tex.
  • SymPy checks mathematical equivalence rather than plain string similarity.
  • Diagram regions are extracted independently for visual comparison.

Tech used: pix2tex, SymPy, OpenCV, connected-component extraction
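The "formula-like region" decision can be approximated with a simple token heuristic; this sketch flags OCR lines whose characters are mostly digits, operators, or brackets (the character set and threshold are illustrative, not the project's actual detector):

```python
import re

# Characters that suggest mathematical content (illustrative set).
MATH_CHARS = re.compile(r"[=+\-*/^()0-9]")

def looks_like_formula(line: str, threshold: float = 0.3) -> bool:
    """Flag a line as formula-like when a large share of its
    non-space characters are digits, operators, or brackets."""
    chars = [c for c in line if not c.isspace()]
    if not chars:
        return False
    math_count = sum(1 for c in chars if MATH_CHARS.match(c))
    return math_count / len(chars) >= threshold
```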
Scoring and Evaluation

Combine all answer signals under the rubric.

  • SBERT measures semantic answer similarity in the local evaluation path.
  • CLIP compares extracted diagrams visually.
  • Gemini acts as the LLM-based rubric-aware grading alternative.

Tech used: Sentence Transformers, CLIP, SymPy, Gemini 2.5 Flash
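The rubric-weighted combination step can be sketched as follows; the modality keys ("text", "diagram", "formula") are assumptions mirroring the stages above, not the repository's actual API:

```python
def combine_scores(similarities: dict, weights: dict, max_marks: float) -> float:
    """Blend per-modality similarities (each 0..1) into a rubric-weighted mark.

    `similarities` and `weights` share keys such as "text", "diagram", "formula";
    weights are renormalized so a question without diagrams does not lose marks.
    """
    active = {k: w for k, w in weights.items() if k in similarities and w > 0}
    total_w = sum(active.values())
    if total_w == 0:
        return 0.0
    blended = sum(similarities[k] * w for k, w in active.items()) / total_w
    return round(blended * max_marks, 2)
```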
Reports and Delivery

Produce interpretable outputs for faculty and testing.

  • Each run produces structured JSON result artifacts.
  • Each student receives a PDF evaluation report.
  • The interfaces provide ranking tables and downloadable reports.

Tech used: FPDF2, JSON serialization, tabular summaries, API report downloads
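The per-student evaluation artifact might have a shape like this; the keys below are an illustrative sketch, not the exact schema:

```json
{
  "run_id": "run_001",
  "student": "s01",
  "questions": [
    {
      "id": "Q1",
      "text_sim": 0.82,
      "diagram_sim": 0.74,
      "formula_sim": 1.0,
      "marks": 8.5,
      "max_marks": 10,
      "remark": "Good coverage; diagram partially matches."
    }
  ],
  "total": 8.5
}
```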

Tech Stack and Artifacts

A concise view of what the product uses internally and what it produces after every run.

Next.js, FastAPI, Streamlit, EasyOCR, Azure Document Intelligence, Sentence Transformers, CLIP, pix2tex, SymPy, Gemini 2.5 Flash, FPDF2, OpenCV
  • OCR JSON : Page-wise extracted text, boxes, and OCR metadata.
  • Formula Results : Detected expressions and LaTeX/SymPy comparison data.
  • Diagram Crops : Visual regions used for diagram-level comparison.
  • Evaluation JSON : Question-wise scoring, totals, and remarks.
  • PDF Reports : Faculty-ready downloadable result summaries.

README Reference

The current repository README is reproduced below for full technical traceability.

Automated Subjective Answer Sheet Evaluation Platform

Overview

This project is a multimodal subjective answer-sheet evaluation system built for academic and research workflows. It supports both printed and handwritten answer sheets and combines OCR, semantic text comparison, diagram analysis, formula-aware scoring, and PDF report generation inside a Streamlit dashboard.

Core Capabilities

  • OCR for printed sheets using EasyOCR
  • OCR for handwritten sheets using Azure Document Intelligence
  • Semantic answer evaluation using Sentence-BERT
  • LLM-assisted evaluation using Gemini
  • Diagram extraction and similarity scoring
  • Formula detection, pix2tex OCR, and SymPy-based mathematical equivalence scoring
  • Student-wise PDF report generation
  • Evaluation summaries and downloadable outputs through the UI

High-Level Workflow

  1. Upload the ideal answer sheet, rubric JSON, and student PDFs.
  2. Select the OCR mode and evaluation engine.
  3. Run OCR page by page on each PDF.
  4. Extract diagrams from each page.
  5. In SBERT mode, detect formula regions and convert them into LaTeX with pix2tex.
  6. Score text, diagrams, and formulas according to rubric weights.
  7. Save JSON results and generate PDF reports.
  8. Review the output through the dashboard table and download actions.
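The eight steps above can be sketched as a single orchestration loop; the function names here are placeholders, not the repository's actual API:

```python
def evaluate_run(student_pdfs, rubric, ocr_fn, score_fn, report_fn):
    """Minimal orchestration sketch: OCR each student PDF, score the
    extracted content against the rubric, and emit one result per student."""
    results = []
    for pdf in student_pdfs:
        pages = ocr_fn(pdf)                      # step 3: page-by-page OCR
        scores = score_fn(pages, rubric)         # steps 4-6: diagrams, formulas, text
        results.append(report_fn(pdf, scores))   # steps 7-8: JSON + PDF report
    return results
```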

Current OCR Modes

Clean/printed sheets (EasyOCR)

Used for cleaner, printed, or less noisy answer sheets.

Handwritten sheets (Azure Document Intelligence)

Used for handwritten answer sheets and works page by page to stay within practical resource limits.

Current Evaluation Modes

SBERT (fast, local)

  • Text similarity with sentence-transformers/all-MiniLM-L6-v2
  • Diagram similarity with CLIP
  • Formula-aware scoring with pix2tex and SymPy
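SBERT text similarity ultimately reduces to cosine similarity between sentence embeddings; a dependency-free sketch of that final step:

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```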

Gemini 2.5 Flash (LLM API)

  • OCR text is passed to Gemini
  • Diagram images can be attached
  • Returns score and textual feedback
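The Gemini path's input can be sketched as a prompt-assembly helper; the wording and signature below are illustrative assumptions, not the repository's actual prompt:

```python
def build_grading_prompt(question: str, ideal: str, student: str, max_marks: int) -> str:
    """Assemble a rubric-aware grading prompt for the LLM evaluation path."""
    return (
        "You are grading a subjective exam answer.\n"
        f"Question: {question}\n"
        f"Ideal answer: {ideal}\n"
        f"Student answer (from OCR, may contain noise): {student}\n"
        f"Award a score out of {max_marks} and give one short feedback line."
    )
```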

Note: formula-aware symbolic scoring is currently strongest in the SBERT path.

Formula-Aware Evaluation

The project now includes a dedicated formula stage:

  • OCR blocks are grouped into line candidates
  • Formula-like lines are cropped from page images
  • pix2tex converts those crops to LaTeX
  • SymPy attempts symbolic equivalence checks
  • Formula similarity becomes a separate rubric-weighted component

This allows mathematically equivalent expressions such as (x+1)^2 and x^2 + 2x + 1 to receive appropriate credit.
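The equivalence check behind this example can be sketched with SymPy; this is a minimal version of the idea, not the project's exact implementation:

```python
import sympy

def formulas_equivalent(expr_a: str, expr_b: str) -> bool:
    """True when two expressions are symbolically equal,
    e.g. (x+1)**2 versus x**2 + 2*x + 1."""
    try:
        diff = sympy.simplify(sympy.sympify(expr_a) - sympy.sympify(expr_b))
        return diff == 0
    except (sympy.SympifyError, TypeError):
        return False
```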

Output Directories

  • results/ocr : page-wise OCR JSON
  • results/diagrams : extracted diagram crops
  • results/formulas : formula OCR JSON
  • results/formula_crops : cropped formula images
  • results/eval : SBERT evaluation JSON
  • results/eval_llm : Gemini evaluation JSON
  • results/reports : SBERT PDF reports
  • results/reports_llm : Gemini PDF reports

Technology Stack

  • Streamlit
  • FastAPI
  • Next.js
  • EasyOCR
  • Azure Document Intelligence
  • Sentence Transformers
  • OpenAI CLIP
  • pix2tex
  • SymPy
  • pdf2image
  • OpenCV
  • Pillow
  • FPDF2

Production Deployment Path

Recommended public deployment:

  • web/ on Vercel
  • backend_api/ + src/ on Google Cloud Run

This split is preferred because:

  • Vercel is ideal for the professional Next.js frontend
  • Cloud Run gives the backend a direct HTTPS URL and handles container deployment cleanly
  • Cloud Run supports larger request bodies and longer request timeouts than the Vercel upload-proxy path
  • Handwritten OCR can still use Azure Document Intelligence, but only as an OCR provider, not as the hosting platform

Deployment guide:

Current Assumptions

  • One page is treated as one question
  • Diagram extraction relies on connected-component based cropping
  • Formula extraction uses heuristic detection on OCR layout

Current Limitations

  • Multi-step derivations may still be harder than isolated formulas
  • OCR quality still strongly affects text, symbols, and labels
  • Gemini mode does not yet use the symbolic formula scoring path
  • Public deployment is not the current focus

Team

The dashboard includes a Team tab with the current project members, roles, and contact links.

Interfaces

The project now supports two interfaces:

1. Streamlit Research Dashboard

Used for rapid prototyping, internal testing, and quick end-to-end experiments.

Run with:

.venv\Scripts\python.exe -m streamlit run src\app.py

2. Product Website Architecture

Used for the professional web version:

  • backend_api/ : FastAPI backend
  • web/ : Next.js frontend

Run the API with:

.venv\Scripts\python.exe -m uvicorn backend_api.main:app --reload

Run the web app with:

cd web
npm install
npm run dev

Default local URLs:

  • FastAPI: http://127.0.0.1:8000
  • Next.js: http://localhost:3000
  • Streamlit: http://localhost:8501

Next Directions

  • Improve formula-region detection for complex handwritten derivations
  • Separate printed and handwritten result directories
  • Add charts, analytics, and richer per-question visualization
  • Move reports and artifacts to more scalable persistent storage