PluColl - The UNIPEN/NICI/HP data collection of Summer/Autumn 1994
Description
This data set contains 'on-line' handwritten word data collected on a thin Wacom PL100V, an integrated tablet and grey-scale LCD screen (i.e., long before the iPad!), in Summer/Autumn 1994 in a collaboration project between the handwriting group at Nijmegen University and Hewlett-Packard Bristol. HP donated this data to the International Unipen Foundation. The individually labeled characters were not part of the Unipen data set (doi: 10.5281/zenodo.1195802); they are included in this data set.
___________________________________________________________________________________________________
Files overview
plucoll-1994-2023.pdf Our report to HP, from 1996. Old PostScript version converted to PDF in 2023 with minor changes.
At the time of the report, 35 writers were in the data set. Ultimately there were 46 writers in total.
plucoll-1994-2023.txt Flat text version of the .pdf
___________________________________________________________________________________________________
plucoll-2001.tgz Unipen file format pen-tip coordinates for words and .png images.
plucoll-2001-tgz.lst 46 writers, 210 isolated words per writer
./plucoll/
./plucoll/angelien/
./plucoll/angelien/set1.dat
./plucoll/angelien/set6.dat
./plucoll/angelien/set2.dat
./plucoll/angelien/test.dat
./plucoll/angelien/set3.dat
./plucoll/angelien/set4.dat
./plucoll/willem/
./plucoll/willem/set1.dat
./plucoll/willem/set5.dat
./plucoll/willem/set6.dat
./plucoll/willem/test.dat
./plucoll/willem/set3.dat
./plucoll/willem/set4.dat
./plucoll/piet/
./plucoll/piet/set1.dat
./plucoll/piet/set6.dat
./plucoll/piet/set2.dat
./plucoll/piet/test.dat
./plucoll/piet/set3.dat
./plucoll/piet/set4.dat
(etc.)
___________________________________________________________________________________________________
Plucoll-hwr-lbl.tgz Separate plain ASCII coordinate files (.hwr) with x,y,z and corresponding label files (.lbl) for characters
Plucoll-hwr-lbl-tgz.txt
./Plucoll-hwr-lbl/
./Plucoll-hwr-lbl/miep/
./Plucoll-hwr-lbl/miep/set1/
./Plucoll-hwr-lbl/miep/set1/miep-set1-035-bouquet.hwr
./Plucoll-hwr-lbl/miep/set1/miep-set1-035-bouquet.lbl
./Plucoll-hwr-lbl/miep/set1/miep-set1-089-fjord.hwr
./Plucoll-hwr-lbl/miep/set1/miep-set1-089-fjord.lbl
./Plucoll-hwr-lbl/miep/set1/miep-set1-166-sandwich.hwr
./Plucoll-hwr-lbl/miep/set1/miep-set1-166-sandwich.lbl
./Plucoll-hwr-lbl/miep/set1/miep-set1-006-afghanistan.hwr
./Plucoll-hwr-lbl/miep/set1/miep-set1-006-afghanistan.lbl
(etc.)
cat anton/set5/anton-set5-209-zigzag.lbl
z 12 18 0.95
i 53 28 0.95
g 95 35 0.95
z 149 45 0.95
a 213 32 0.95
g 243 42 0.95
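A label file like the one above can be read with a few lines of Python. This is only a minimal sketch, not the original tooling. The column meanings are an assumption, since they are not documented here: by analogy with the ioff and npts filename tags of the simplified letter collection, the second column is read as the index of the first coordinate in the .hwr file and the third as the number of points. The fourth column is kept as an uninterpreted number. The names LabelRow and parse_lbl are hypothetical.

```python
from typing import List, NamedTuple


class LabelRow(NamedTuple):
    char: str     # the labeled character, e.g. 'z'
    first: int    # ASSUMPTION: index of the first coordinate in the .hwr file
    count: int    # ASSUMPTION: number of coordinates for this character
    value: float  # fourth column; its meaning is not documented in the source


def parse_lbl(text: str) -> List[LabelRow]:
    """Parse the contents of a .lbl file into a list of rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        char, first, count, value = fields
        rows.append(LabelRow(char, int(first), int(count), float(value)))
    return rows


# The contents of anton/set5/anton-set5-209-zigzag.lbl as listed above.
example = """\
z 12 18 0.95
i 53 28 0.95
g 95 35 0.95
z 149 45 0.95
a 213 32 0.95
g 243 42 0.95
"""

rows = parse_lbl(example)
print("".join(row.char for row in rows))  # the labels spell the written word
```

Under the same assumption, the points of one character would be the slice from first to first + count of the coordinate list in the matching .hwr file.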
cat anton/set5/anton-set5-209-zigzag.hwr
3584 3500 100
3582 3502 100
3578 3502 100
3576 3502 100
3576 3502 100
3578 3502 100
3582 3504 100
3596 3508 100
3612 3512 100
3630 3522 100
3648 3526 100
. . .
. . .
(columns: x y z, where z is the 'pressure': 0 = pen up, 100 = pen down)
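The coordinate listing can be turned into pen-down strokes along the following lines. This is a hedged sketch: the function names parse_hwr and split_strokes are hypothetical, and treating every sample with z greater than 0 as pen-down is an assumption based on the two values mentioned above (0 and 100). Lines that do not consist of three integers, such as the ". . ." placeholders in the listing, are skipped.

```python
from typing import List, Tuple

Sample = Tuple[int, int, int]  # (x, y, z)
Point = Tuple[int, int]        # (x, y)


def parse_hwr(text: str) -> List[Sample]:
    """Parse the contents of a .hwr file into (x, y, z) samples."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        try:
            x, y, z = (int(f) for f in fields)
        except ValueError:
            continue  # not a coordinate line, e.g. ". . ."
        samples.append((x, y, z))
    return samples


def split_strokes(samples: List[Sample]) -> List[List[Point]]:
    """Group consecutive pen-down samples into strokes.

    ASSUMPTION: z == 0 means pen up, any z > 0 means pen down.
    """
    strokes, current = [], []
    for x, y, z in samples:
        if z > 0:
            current.append((x, y))
        elif current:
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes


# The first samples of anton/set5/anton-set5-209-zigzag.hwr as listed above.
example = """\
3584 3500 100
3582 3502 100
3578 3502 100
3576 3502 100
3576 3502 100
3578 3502 100
3582 3504 100
3596 3508 100
3612 3512 100
3630 3522 100
3648 3526 100
. . .
"""

samples = parse_hwr(example)
strokes = split_strokes(samples)
print(len(samples), "samples in", len(strokes), "stroke(s)")
```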
___________________________________________________________________________________________________
PluColl-Letters-for-CogniGron.tgz Simplified version, ASCII with only (x,y) coordinates, 311925 characters
PluColl-Letters-for-CogniGron-tgz.lst This collection was used for our IOP article on bio-inspired twitch ensemble trajectory control.
./Letters/
./Letters/x/
./Letters/x/Letter-x-ioff-349-npts-40-janneke-set3-018-appendix.xy
./Letters/x/Letter-x-ioff-174-npts-45-marieke-set4-034-borax.xy
./Letters/x/Letter-x-ioff-6-npts-35-hannie-set6-206-xylophone.xy
./Letters/x/Letter-x-ioff-143-npts-50-corrie-set5-072-dixieland.xy
./Letters/x/Letter-x-ioff-91-npts-45-janneke-set2-072-dixieland.xy
./Letters/x/Letter-x-ioff-41-npts-41-eelco-set2-081-excellent.xy
./Letters/x/Letter-x-ioff-186-npts-42-heleen-set6-034-borax.xy
./Letters/x/Letter-x-ioff-78-npts-67-floris-set4-130-luxe.xy
./Letters/x/Letter-x-ioff-62-npts-58-rintje-set3-146-oxford.xy
./Letters/x/Letter-x-ioff-161-npts-36-martijn-set3-034-borax.xy
./Letters/x/Letter-x-ioff-107-npts-33-saskia-set4-135-maxwell.xy
./Letters/x/Letter-x-ioff-59-npts-49-corrie-set6-078-excellent.xy
./Letters/x/Letter-x-ioff-168-npts-32-katrien-set4-161-reflex.xy
(etc.)
Filename tags:
ioff is the index of the first coordinate of a character in the original .hwr file
npts is the number of (x,y) points for that character
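The tags can be recovered from a filename with a regular expression. The sketch below is an illustration only: the function name parse_letter_filename and the field names writer, set, index and word are hypothetical, and the pattern is inferred from the example paths in the listing above rather than from a formal specification.

```python
import re
from typing import Dict, Optional

# Pattern inferred from names such as
# Letter-x-ioff-349-npts-40-janneke-set3-018-appendix.xy
_LETTER_RE = re.compile(
    r"Letter-(?P<char>.)-ioff-(?P<ioff>\d+)-npts-(?P<npts>\d+)"
    r"-(?P<writer>[^-]+)-(?P<set>[^-]+)-(?P<index>\d+)-(?P<word>.+)\.xy$"
)


def parse_letter_filename(path: str) -> Optional[Dict[str, object]]:
    """Return the tags encoded in a Letters/*.xy filename, or None."""
    match = _LETTER_RE.search(path)
    if match is None:
        return None
    tags: Dict[str, object] = match.groupdict()
    tags["ioff"] = int(match.group("ioff"))  # first coordinate index in the .hwr file
    tags["npts"] = int(match.group("npts"))  # number of (x,y) points
    return tags


info = parse_letter_filename(
    "./Letters/x/Letter-x-ioff-349-npts-40-janneke-set3-018-appendix.xy"
)
print(info)
```

The word index is kept as a string so that leading zeros, as in 018, are preserved.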
cat Letter-_-ioff-203-npts-14-angelien-set3-161-reflex.xy
3942 3434
3952 3434
3962 3434
3982 3436
4002 3442
4026 3448
4054 3458
4082 3464
4112 3474
4134 3484
4160 3492
Note: the character '_' (underscore) represents the connecting stroke between two characters (if present)
Explicit modeling of the connecting stroke was important in 'formal' rule-based approaches to handwriting recognition.
Lambert Schomaker, May 2023
Files
(222.7 MB)
md5 | Size |
---|---|
md5:d35818950b4709f8c21a7091c8056532 | 323.6 kB |
md5:bf8282b680eff223c7a25eeb205a6848 | 29.3 kB |
md5:908fa1b0e100f8d647115441d27c485c | 13.1 kB |
md5:2c3bc5d685e1eabd8f9feaa321acebe8 | 89.4 MB |
md5:9d8db7fd7268312b60051021b3621015 | 5.2 MB |
md5:4678cb68c9b00f6ce43769849c2290f6 | 58.1 MB |
md5:cddd9111a86b0891a6be086a99da4d81 | 20.3 MB |
md5:de93d7863d75afcc637f3140fea153c1 | 49.4 MB |
Additional details
References
- L. Vuurpijl and L. Schomaker, "Finding structure in diversity: a hierarchical clustering method for the categorization of allographs in handwriting," Proceedings of the Fourth International Conference on Document Analysis and Recognition, Ulm, Germany, 1997, pp. 387-393 vol.1, doi: 10.1109/ICDAR.1997.619876.