Towards the Optical Character Recognition of DSLs - Artifact
Authors/Creators
- 1. University of Extremadura
- 2. Open University of Catalonia
Description
img2DSL is an image recognition toolkit designed to study how Optical Character Recognition (OCR) can be applied to images that contain DSL snippets. Using the Object Constraint Language (OCL) as an example of a textual DSL, and given a dataset of Ecore models and their OCL expressions, the toolkit encodes the OCL expressions into images and tests how different strategies improve on the default OCR quality. The project uses Tesseract as the OCR engine; the strategies consist of different OCR models and custom algorithms.
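As a rough illustration of what a custom post-processing strategy could look like, the sketch below normalizes typographic characters that an OCR engine might emit in place of the ASCII characters used in OCL, then scores the result against the original expression. It is not taken from the artifact: the substitution table `OCR_FIXES`, the helpers `post_correct` and `similarity`, and the sample expression are all assumptions made for this example, and the similarity score is a generic string ratio rather than the toolkit's own metric.

```python
from difflib import SequenceMatcher

# Hypothetical substitution table (an assumption, not from the artifact):
# typographic characters mapped back to the ASCII ones used in OCL text.
OCR_FIXES = {
    "\u2014": "-",   # em dash, e.g. in a misread "->" arrow
    "\u2013": "-",   # en dash
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
}


def post_correct(text: str) -> str:
    """Apply the substitutions, then collapse runs of whitespace."""
    for wrong, right in OCR_FIXES.items():
        text = text.replace(wrong, right)
    return " ".join(text.split())


def similarity(expected: str, recognized: str) -> float:
    """Generic string similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, expected, recognized).ratio()


# Illustrative OCL expression and a simulated OCR output for it.
expected = "self.employees->forAll(e | e.age >= 18)"
raw = "self.employees\u2014>forAll(e | e.age  >= 18)"

corrected = post_correct(raw)
score_before = similarity(expected, raw)
score_after = similarity(expected, corrected)
```

Comparing `score_before` with `score_after` gives a simple per-expression indication of whether a strategy moved the recognized text closer to the original.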
To evaluate the toolkit and the quality of its strategies, we load the recognized expressions into the USE tool and measure how many of them are still valid after recognition.
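The validity measure described above can be sketched as a simple ratio. The example below is a minimal illustration under stated assumptions: `valid_ratio` and `balanced_parens` are hypothetical names, and `balanced_parens` is only a stand-in for a real validator. It does not call the USE tool, whose interface is not described here, and the sample expressions are invented.

```python
from typing import Callable, Iterable


def valid_ratio(expressions: Iterable[str],
                is_valid: Callable[[str], bool]) -> float:
    """Fraction of expressions accepted by the given validator."""
    items = list(expressions)
    if not items:
        return 0.0
    return sum(1 for expr in items if is_valid(expr)) / len(items)


def balanced_parens(expr: str) -> bool:
    """Stand-in validity check (NOT the USE tool): parentheses must balance."""
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Invented examples of recognized expressions; the third lost its ")".
recognized = [
    "self.age >= 18",
    "self.items->size() > 0",
    "self.items->forAll(i | i.price > 0",
    "self.name <> ''",
]

ratio = valid_ratio(recognized, balanced_parens)
```

In an actual evaluation, the `is_valid` callable would wrap whatever mechanism loads an expression into the validating tool and reports success or failure.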
Files
| Name | Size | Checksum |
|---|---|---|
| img2DSL-SLE20-artifact.zip | 10.9 MB | md5:825dea1b9d29dc252685940adc9b2188 |
Additional details
Related works
- Is supplement to
- Conference paper: 10.1145/3426425.3426937 (DOI)