An auto-grading system for handwritten digits, implemented in Python.
Authors: Veronica Hong, Jephian Lin, and Chan-Yu Pan
Requirements: tensorflow 2.1.0, pdf2image 1.13.1, Pillow 7.0.0, and pyzbar 0.1.8.
- main.ipynb : Main notebook for executing the auto-grading system.
- extractor.py : This script contains the functions for extracting information from the scanned file.
- ocr.py : This script holds all the code to create the OCR model.
- annotator.py : This script contains the functions for annotating grades on the examination papers.
- OCR_mdl.h5 : This file is the trained model for the OCR system.
- font_style : This folder contains the font files the annotator needs for grading.
- nsysu-digits : This folder is our handwritten digit database. All images are grayscale and 28×28 pixels in size.
If you have Jupyter Notebook installed, run main.ipynb on your machine, and make sure the required libraries are installed.
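One way to confirm that the required libraries are present before opening the notebook is to compare installed versions against the pinned ones. The following is only a sketch and not part of this repository: the helper name `check_requirements` is hypothetical, the distribution names are assumed to match the names listed in the requirements, and on Python older than 3.8 it assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: on older Python the importlib_metadata backport is installed.
    import importlib_metadata as metadata

# Versions pinned in the requirements above.
REQUIRED = {
    "tensorflow": "2.1.0",
    "pdf2image": "1.13.1",
    "Pillow": "7.0.0",
    "pyzbar": "0.1.8",
}


def check_requirements(required=REQUIRED):
    """Return {name: (wanted, found, matches)}; found is None if not installed."""
    report = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in check_requirements().items():
    status = "ok" if ok else "check"
    print("%-12s wanted %-8s found %-8s %s" % (name, wanted, found, status))
```

A version mismatch does not necessarily mean the notebook will fail; the report only shows where the local environment differs from the versions listed here.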
import os
import urllib.request  # `import urllib` alone does not load the request submodule
import numpy as np

base = r"https://github.com/SageLabTW/auto-grading/raw/master/nsysu-digits/"
for c in ['X', 'y']:
    filename = "nsysu-digits-%s.csv" % c
    # download each CSV only if it is not already in the working directory
    if not os.path.exists(filename):
        print(filename, 'not found --- will download')
        urllib.request.urlretrieve(base + c + ".csv", filename)

Xsys = np.genfromtxt('nsysu-digits-X.csv', dtype=int, delimiter=',') ### flattened already
ysys = np.genfromtxt('nsysu-digits-y.csv', dtype=int, delimiter=',')
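Because the rows of the X file are already flattened, they usually need to be reshaped back to 28×28 before they can be displayed or passed to an image model. The sketch below uses a hypothetical helper `to_images` and a synthetic array in place of the downloaded data; it assumes each row stores its 784 pixels in row-major order, which the source does not state explicitly.

```python
import numpy as np


def to_images(X_flat, side=28):
    """Reshape an (n, side*side) array of flattened images to (n, side, side)."""
    X_flat = np.asarray(X_flat)
    if X_flat.ndim != 2 or X_flat.shape[1] != side * side:
        raise ValueError(
            "expected shape (n, %d), got %s" % (side * side, X_flat.shape)
        )
    # Assumption: pixels are stored row by row (row-major order).
    return X_flat.reshape(-1, side, side)


# Synthetic stand-in for the loaded data: 3 fake "images" of 784 values each.
X_demo = np.arange(3 * 784).reshape(3, 784)
images = to_images(X_demo)
print(images.shape)
```

With the real data, `to_images(Xsys)` would give one 28×28 array per row, and `ysys[i]` would be the label belonging to `images[i]`, assuming the two files are aligned row by row.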
This NSYSU-digits database is made available by Veronica Hong, Jephian Lin, and Chan-Yu Pan under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/.
Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/.
- Jul 22, 2023: 1890 pictures
- Sep 9, 2022: 1639 pictures
- Nov 1, 2020: 552 pictures
- improve OCR
- test robustness of find_box and find_qr_box