Documentation

A product-style technical brief that explains the complete grading architecture, major stages, technologies, and output artifacts.

OCR Backends (2): EasyOCR and Azure Document Intelligence
Scoring Engines (2): SBERT local path and Gemini LLM path
Modalities (3): Text, diagrams, and formula understanding
Primary Outputs (3): JSON, live ranking table, and PDF reports

Frontend Layer

Next.js website plus Streamlit workspace for research-style testing.

Backend Layer

FastAPI APIs, run management, job history, and report download endpoints.

Core Evaluation Engine

Reusable Python pipeline powering OCR, formulas, diagrams, scoring, and reports.

Pipeline Readiness

A quick visual indicator of how mature each major system block is in the current implementation.

  • Upload and validation: 100%
  • OCR processing: 100%
  • Formula understanding: 95%
  • Diagram extraction: 90%
  • Scoring workflow: 100%
  • Reporting: 100%

Stage-by-Stage Workflow

This is the technical narrative you can use while explaining the project to your guide.

Input and Run Setup

Collect the ideal answer sheet, rubric, and student answer sheets for a traceable batch run.

  • The ideal answer sheet becomes the reference answer source.
  • Rubric JSON controls marks distribution and weighting logic.
  • Student PDFs are stored under per-run folders for reproducibility.

Tech used: Next.js forms, FastAPI upload handling, JSON validation
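The rubric file referenced above might look like the following sketch; the field names (`questions`, `max_marks`, `weights`) are illustrative assumptions, not the project's actual schema:

```json
{
  "questions": [
    {
      "id": "Q1",
      "max_marks": 10,
      "weights": { "text": 0.6, "diagram": 0.2, "formula": 0.2 }
    },
    {
      "id": "Q2",
      "max_marks": 5,
      "weights": { "text": 1.0, "diagram": 0.0, "formula": 0.0 }
    }
  ]
}
```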
OCR and Content Capture

Convert PDF pages into structured textual evidence.

  • EasyOCR is used for cleaner printed answer sheets.
  • Azure Document Intelligence is used for handwritten sheets.
  • OCR output is stored page-wise for downstream reuse.

Tech used: pdf2image, Poppler, EasyOCR, Azure Document Intelligence
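Page-wise storage can be sketched as a small helper; the directory layout and field names here are assumptions, not the repository's actual structure:

```python
import json
from pathlib import Path

def save_page_ocr(run_dir: str, student: str, page_no: int, blocks: list) -> Path:
    """Persist one page's OCR blocks as JSON so downstream stages can reuse them.

    `blocks` is a list of dicts such as {"text": ..., "box": [x, y, w, h]}.
    """
    out_dir = Path(run_dir) / "ocr" / student
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"page_{page_no:03d}.json"
    out_path.write_text(json.dumps({"page": page_no, "blocks": blocks}, indent=2))
    return out_path
```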
Formula and Diagram Processing

Handle non-text answer content separately from regular prose.

  • Formula-like regions are converted into LaTeX using pix2tex.
  • SymPy checks mathematical equivalence rather than plain string similarity.
  • Diagram regions are extracted independently for visual comparison.

Tech used: pix2tex, SymPy, OpenCV, connected-component extraction
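The "formula-like region" decision can be approximated with a simple token heuristic; this sketch flags OCR lines whose characters are mostly digits, operators, or brackets (the character set and threshold are illustrative, not the project's actual detector):

```python
import re

# Characters that suggest mathematical content (illustrative set).
MATH_CHARS = re.compile(r"[=+\-*/^()0-9]")

def looks_like_formula(line: str, threshold: float = 0.3) -> bool:
    """Flag a line as formula-like when a large share of its
    non-space characters are digits, operators, or brackets."""
    chars = [c for c in line if not c.isspace()]
    if not chars:
        return False
    math_count = sum(1 for c in chars if MATH_CHARS.match(c))
    return math_count / len(chars) >= threshold
```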
Scoring and Evaluation

Combine all answer signals under the rubric.

  • SBERT measures semantic answer similarity in the local evaluation path.
  • CLIP compares extracted diagrams visually.
  • Gemini acts as the LLM-based rubric-aware grading alternative.

Tech used: Sentence Transformers, CLIP, SymPy, Gemini 2.5 Flash
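The rubric-weighted combination step can be sketched as follows; the modality keys ("text", "diagram", "formula") are assumptions mirroring the stages above, not the repository's actual API:

```python
def combine_scores(similarities: dict, weights: dict, max_marks: float) -> float:
    """Blend per-modality similarities (each 0..1) into a rubric-weighted mark.

    `similarities` and `weights` share keys such as "text", "diagram", "formula";
    weights are renormalized so a question without diagrams does not lose marks.
    """
    active = {k: w for k, w in weights.items() if k in similarities and w > 0}
    total_w = sum(active.values())
    if total_w == 0:
        return 0.0
    blended = sum(similarities[k] * w for k, w in active.items()) / total_w
    return round(blended * max_marks, 2)
```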
Reports and Delivery

Produce interpretable outputs for faculty and testing.

  • Each run produces structured JSON result artifacts.
  • Each student receives a PDF evaluation report.
  • The interfaces provide ranking tables and downloadable reports.

Tech used: FPDF2, JSON serialization, tabular summaries, API report downloads
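The per-student evaluation artifact might have a shape like this; the keys below are an illustrative sketch, not the exact schema:

```json
{
  "run_id": "run_001",
  "student": "s01",
  "questions": [
    {
      "id": "Q1",
      "text_sim": 0.82,
      "diagram_sim": 0.74,
      "formula_sim": 1.0,
      "marks": 8.5,
      "max_marks": 10,
      "remark": "Good coverage; diagram partially matches."
    }
  ],
  "total": 8.5
}
```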

Tech Stack and Artifacts

A concise view of what the product uses internally and what it produces after every run.

Next.js, FastAPI, Streamlit, EasyOCR, Azure Document Intelligence, Sentence Transformers, CLIP, pix2tex, SymPy, Gemini 2.5 Flash, FPDF2, OpenCV
  • OCR JSON : Page-wise extracted text, boxes, and OCR metadata.
  • Formula Results : Detected expressions and LaTeX/SymPy comparison data.
  • Diagram Crops : Visual regions used for diagram-level comparison.
  • Evaluation JSON : Question-wise scoring, totals, and remarks.
  • PDF Reports : Faculty-ready downloadable result summaries.

README Reference

The current repository README is reproduced below for full technical traceability.

Automated Subjective Answer Sheet Evaluation Platform

Overview

This project is a multimodal subjective answer-sheet evaluation system built for academic and research workflows. It supports both printed and handwritten answer sheets and combines OCR, semantic text comparison, diagram analysis, formula-aware scoring, and PDF report generation inside a Streamlit dashboard.

Core Capabilities

  • OCR for printed sheets using EasyOCR
  • OCR for handwritten sheets using Azure Document Intelligence
  • Semantic answer evaluation using Sentence-BERT
  • LLM-assisted evaluation using Gemini
  • Diagram extraction and similarity scoring
  • Formula detection, pix2tex OCR, and SymPy-based mathematical equivalence scoring
  • Student-wise PDF report generation
  • Evaluation summaries and downloadable outputs through the UI

High-Level Workflow

  1. Upload the ideal answer sheet, rubric JSON, and student PDFs.
  2. Select the OCR mode and evaluation engine.
  3. Run OCR page by page on each PDF.
  4. Extract diagrams from each page.
  5. In SBERT mode, detect formula regions and convert them into LaTeX with pix2tex.
  6. Score text, diagrams, and formulas according to rubric weights.
  7. Save JSON results and generate PDF reports.
  8. Review the output through the dashboard table and download actions.
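The eight steps above can be sketched as a single orchestration loop; the function names here are placeholders, not the repository's actual API:

```python
def evaluate_run(student_pdfs, rubric, ocr_fn, score_fn, report_fn):
    """Minimal orchestration sketch: OCR each student PDF, score the
    extracted content against the rubric, and emit one result per student."""
    results = []
    for pdf in student_pdfs:
        pages = ocr_fn(pdf)                      # step 3: page-by-page OCR
        scores = score_fn(pages, rubric)         # steps 4-6: diagrams, formulas, text
        results.append(report_fn(pdf, scores))   # steps 7-8: JSON + PDF report
    return results
```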

Current OCR Modes

Clean/printed sheets (EasyOCR)

Used for cleaner, printed, or less noisy answer sheets.

Handwritten sheets (Azure Document Intelligence)

Used for handwritten answer sheets and works page by page to stay within practical resource limits.

Current Evaluation Modes

SBERT (fast, local)

  • Text similarity with sentence-transformers/all-MiniLM-L6-v2
  • Diagram similarity with CLIP
  • Formula-aware scoring with pix2tex and SymPy
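SBERT text similarity ultimately reduces to cosine similarity between sentence embeddings; a dependency-free sketch of that final step:

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```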

Gemini 2.5 Flash (LLM API)

  • OCR text is passed to Gemini
  • Diagram images can be attached
  • Returns score and textual feedback
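The Gemini path's input can be sketched as a prompt-assembly helper; the wording and signature below are illustrative assumptions, not the repository's actual prompt:

```python
def build_grading_prompt(question: str, ideal: str, student: str, max_marks: int) -> str:
    """Assemble a rubric-aware grading prompt for the LLM evaluation path."""
    return (
        "You are grading a subjective exam answer.\n"
        f"Question: {question}\n"
        f"Ideal answer: {ideal}\n"
        f"Student answer (from OCR, may contain noise): {student}\n"
        f"Award a score out of {max_marks} and give one short feedback line."
    )
```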

Note: formula-aware symbolic scoring is currently strongest in the SBERT path.

Formula-Aware Evaluation

The project now includes a dedicated formula stage:

  • OCR blocks are grouped into line candidates
  • Formula-like lines are cropped from page images
  • pix2tex converts those crops to LaTeX
  • SymPy attempts symbolic equivalence checks
  • Formula similarity becomes a separate rubric-weighted component

This allows mathematically equivalent expressions such as (x+1)^2 and x^2 + 2x + 1 to receive appropriate credit.
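The equivalence check behind this example can be sketched with SymPy; this is a minimal version of the idea, not the project's exact implementation:

```python
import sympy

def formulas_equivalent(expr_a: str, expr_b: str) -> bool:
    """True when two expressions are symbolically equal,
    e.g. (x+1)**2 versus x**2 + 2*x + 1."""
    try:
        diff = sympy.simplify(sympy.sympify(expr_a) - sympy.sympify(expr_b))
        return diff == 0
    except (sympy.SympifyError, TypeError):
        return False
```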

Output Directories

  • results/ocr : page-wise OCR JSON
  • results/diagrams : extracted diagram crops
  • results/formulas : formula OCR JSON
  • results/formula_crops : cropped formula images
  • results/eval : SBERT evaluation JSON
  • results/eval_llm : Gemini evaluation JSON
  • results/reports : SBERT PDF reports
  • results/reports_llm : Gemini PDF reports

Technology Stack

  • Streamlit
  • FastAPI
  • Next.js
  • EasyOCR
  • Azure Document Intelligence
  • Sentence Transformers
  • OpenAI CLIP
  • pix2tex
  • SymPy
  • pdf2image
  • OpenCV
  • Pillow
  • FPDF2

Production Deployment Path

Recommended public deployment:

  • web/ on Vercel
  • backend_api/ + src/ on Google Cloud Run

This split is preferred because:

  • Vercel is ideal for the professional Next.js frontend
  • Cloud Run gives the backend a direct HTTPS URL and handles container deployment cleanly
  • Cloud Run supports larger request bodies and longer request timeouts than the Vercel upload-proxy path
  • Handwritten OCR can still use Azure Document Intelligence, but only as an OCR provider, not as the hosting platform

Deployment guide:

Current Assumptions

  • One page is treated as one question
  • Diagram extraction relies on connected-component based cropping
  • Formula extraction uses heuristic detection on OCR layout

Current Limitations

  • Multi-step derivations may still be harder than isolated formulas
  • OCR quality still strongly affects text, symbols, and labels
  • Gemini mode does not yet use the symbolic formula scoring path
  • Public deployment is not the current focus

Team

The dashboard includes a Team tab with the current project members, roles, and contact links.

Interfaces

The project now supports two interfaces:

1. Streamlit Research Dashboard

Used for rapid prototyping, internal testing, and quick end-to-end experiments.

Run with:

.venv\Scripts\python.exe -m streamlit run src\app.py

2. Product Website Architecture

Used for the professional web version:

  • backend_api/ : FastAPI backend
  • web/ : Next.js frontend

Run the API with:

.venv\Scripts\python.exe -m uvicorn backend_api.main:app --reload

Run the web app with:

cd web
npm install
npm run dev

Default local URLs:

  • FastAPI: http://127.0.0.1:8000
  • Next.js: http://localhost:3000
  • Streamlit: http://localhost:8501

Next Directions

  • Improve formula-region detection for complex handwritten derivations
  • Separate printed and handwritten result directories
  • Add charts, analytics, and richer per-question visualization
  • Move reports and artifacts to more scalable persistent storage