
SageLabTW/auto-grading

Auto-grading

An auto-grading system for handwritten digits, implemented in Python.
Authors: Veronica Hong, Jephian Lin, and Chan-Yu Pan

Requirements: tensorflow 2.1.0, pdf2image 1.13.1, Pillow 7.0.0, and pyzbar 0.1.8.

Repo contents

  • main.ipynb : Main notebook for executing the auto-grading system.
  • extractor.py : Functions for extracting the contents of the scanned file.
  • ocr.py : All the code for creating the OCR model.
  • annotator.py : Functions for marking grades on the examination papers.
  • OCR_mdl.h5 : The trained model for the OCR system.
  • font_style : Folder with the font files the annotator needs for grading.
  • nsysu-digits : Folder containing our handwritten digit database. All images are grayscale and 28*28 pixels in size.

How to Use

If you have Jupyter Notebook installed, open main.ipynb and run it on your machine. Make sure the required libraries listed above are installed first.
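
A quick way to confirm that the libraries are present before opening the notebook is sketched below. It only checks that each module can be found for import, not that the installed version matches the one listed in this README. The helper name missing_modules is made up for illustration and is not part of this repository.

```python
from importlib.util import find_spec

# Import names of the libraries listed in the requirements
# (Pillow is imported as PIL).
REQUIRED_MODULES = ["tensorflow", "pdf2image", "PIL", "pyzbar"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All listed libraries were found.")
```

If anything is reported missing, install it (for example with pip) before running main.ipynb.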

How to load the nsysu-digits dataset

import os
import urllib.request  # the request submodule must be imported explicitly
import numpy as np

base = r"https://github.com/SageLabTW/auto-grading/raw/master/nsysu-digits/"
for c in ['X', 'y']:
    filename = "nsysu-digits-%s.csv" % c
    # Download each file only if it is not already in the current directory
    if not os.path.exists(filename):
        print(filename, 'not found --- will download')
        urllib.request.urlretrieve(base + c + ".csv", filename)

# Xsys holds the images (one flattened image per row); ysys holds the labels
Xsys = np.genfromtxt('nsysu-digits-X.csv', dtype=int, delimiter=',')
ysys = np.genfromtxt('nsysu-digits-y.csv', dtype=int, delimiter=',')
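
Because the rows of Xsys are already flattened, they may need to be reshaped back into 28*28 images before plotting or feeding them to a convolutional model. The sketch below assumes each row has 784 values stored row by row (row-major order), which this README does not state explicitly. The helper to_images is hypothetical, and a synthetic array stands in for Xsys so the example runs without downloading anything.

```python
import numpy as np

IMAGE_SIDE = 28  # from the README: images are 28*28 grayscale


def to_images(X_flat):
    """Reshape flattened rows of length 784 into an (n, 28, 28) array."""
    X_flat = np.asarray(X_flat)
    if X_flat.ndim != 2 or X_flat.shape[1] != IMAGE_SIDE * IMAGE_SIDE:
        raise ValueError("expected shape (n, 784), got %s" % (X_flat.shape,))
    # Assumption: pixels are stored row by row (row-major order).
    return X_flat.reshape(-1, IMAGE_SIDE, IMAGE_SIDE)


# Synthetic stand-in for Xsys: two "images" with increasing pixel values.
demo = np.arange(2 * IMAGE_SIDE * IMAGE_SIDE).reshape(2, -1)
images = to_images(demo)
print(images.shape)
```

With the real data, to_images(Xsys) would be used in place of the synthetic array.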

License for NSYSU-digits database

This NSYSU-digits database is made available by Veronica Hong, Jephian Lin, and Chan-Yu Pan under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/.
Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/.

Versions

  • Jul 22, 2023: 1890 pictures
  • Sep 9, 2022: 1639 pictures
  • Nov 1, 2020: 552 pictures

TODO list

  • improve OCR
  • test robustness of find_box and find_qr_box