GROTOAP: GROund Truth for Open Access Publications

Full item record

dc.contributor.author: Tkaczyk, Dominika
dc.contributor.author: Czeczko, Artur
dc.contributor.author: Rusek, Krzysztof
dc.contributor.author: Bolikowski, Łukasz
dc.contributor.author: Bogacewicz, Roman
dc.contributor.organization: Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw
dc.description.abstract: The field of digital document content analysis includes many important tasks, for example page segmentation or zone classification. It is impossible to build effective solutions for such problems and evaluate their performance without a reliable test set that contains both input documents and the expected results of segmentation and classification. In this paper we present GROTOAP, a test set useful for training and performance evaluation of page segmentation and zone classification tasks. The test set contains input articles in digital form and corresponding ground truth files. All input documents included in the test set have been selected from the DOAJ database, which indexes articles published under the CC-BY license. The whole test set is available under the same license.
dc.description.eperson: Łukasz Bolikowski
dc.description.sponsorship: National Centre for Research and Development (NCBiR) Grant No. SP/I/1/77065/10
dc.identifier.citation: D. Tkaczyk, A. Czeczko, K. Rusek, Ł. Bolikowski, and R. Bogacewicz, “GROTOAP: GROund Truth for Open Access Publications,” in Proceedings of the 2012 ACM/IEEE on Joint Conference on Digital Libraries, 2012, pp. 381-382.
dc.rights: Attribution 3.0 Poland (CC BY 3.0 PL)
dc.title: GROTOAP: GROund Truth for Open Access Publications
Files for this record
Original bundle
Name: p381-tkaczyk.pdf
Size: 300.85 KB
Format: Adobe Portable Document Format
Description: Main article
License files
Name: license.txt
Size: 234 B
Format: Item-specific license agreed to upon submission