resegment: running for 155 minutes(?)... #73
Aborting after 244 minutes... Log file: https://digi.ub.uni-heidelberg.de/diglitData/v/cmd-0026e786738187ab1652ac53ccc5184f.log
Finally went through; took hours. Since this only occurs in combination with pc-segmentation, and pc-segmentation currently seems to be the weakest segmentation method, I'll close this case.
I would really like to debug this, but unfortunately I have not been able to run ocrd-pc-segmentation in the past. So could you please provide me with the last input file? I.e. fileGrp Before we close, we should make sure this is not a bug on the ocrd_cis side. Could you please
I've let it run again... Note: the complete workflow took longer than with sbb_textline; resegment alone took 3:30 wall-clock time. I don't know exactly which page affects resegment execution time. Perhaps it is a consequence of poor-quality input to resegment. Let's wait and see whether anyone else complains in combination with sbb_textline or similar.
I was able to run From what I see, this is somewhat related to bad segmentation quality (undetected multi-column layouts). But this also exposes a weakness in the resegmentation algorithm: if input regions are quite large, then the new line segmentation plus pair-wise comparison with existing lines and majority vote is inefficient. I'll have to think about this.
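To illustrate why pair-wise comparison gets expensive on large regions, here is a minimal sketch in plain Python. It is not the ocrd_cis code: the function names, the nested-list label images, and the assumption that existing lines do not overlap are all hypothetical, chosen only for illustration. The first function rescans the whole region once per (existing line, new line) pair; the second collects the same votes in one pass.

```python
from collections import Counter


def majority_vote_pairwise(new_labels, old_line_ids):
    """Naive vote: rescan the whole region for every (old line, new label) pair.

    Both arguments are hypothetical label images (lists of rows), 0 = background.
    """
    new_set = sorted({v for row in new_labels for v in row if v})
    old_set = sorted({v for row in old_line_ids for v in row if v})
    result = {}
    for old in old_set:
        best_label, best_count = 0, 0
        for new in new_set:
            # One full pass over all pixels for this single pair.
            count = sum(
                1
                for row_new, row_old in zip(new_labels, old_line_ids)
                for n, o in zip(row_new, row_old)
                if n == new and o == old
            )
            if count > best_count:
                best_label, best_count = new, count
        result[old] = best_label
    return result


def majority_vote_single_pass(new_labels, old_line_ids):
    """Same vote, but all overlap counts are gathered in one pass over the pixels."""
    votes = {}
    for row_new, row_old in zip(new_labels, old_line_ids):
        for n, o in zip(row_new, row_old):
            if n and o:
                votes.setdefault(o, Counter())[n] += 1
    return {old: counts.most_common(1)[0][0] for old, counts in votes.items()}


# Toy region: two new line labels (1, 2) and two existing lines (7, 8).
new_labels = [
    [1, 1, 1, 0],
    [1, 2, 2, 2],
    [2, 2, 2, 2],
]
old_line_ids = [
    [7, 7, 7, 7],
    [7, 7, 8, 8],
    [8, 8, 8, 8],
]

pairwise = majority_vote_pairwise(new_labels, old_line_ids)
single = majority_vote_single_pass(new_labels, old_line_ids)
```

With P pixels, K existing lines and M new lines, the naive variant does on the order of K·M·P work, while the single pass does on the order of P. An undetected multi-column layout would push all three numbers up at once, which is at least consistent with the long runtimes reported here.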
Could you please revisit with the current master version, @jbarth-ubhd?
and still running.
Workflow:
@bertsky: same image set as in last email.
PS: no cis-ocropy-clip for obvious reasons :-)