Project deliverable Open Access
Kretschmer, Martin; Margoni, Thomas; Oruc, Pinar
There is global attention on new data analytic methods. Machine learning (essentially pattern recognition dressed as Artificial Intelligence or AI) is seen as a critical technology. Data scraping, the acquiring and structuring of information from online sources, is a typical first step for many advanced data analytic methods.
The technologies of scraping, mining and learning are often conflated, as are the legal regimes under which they are regulated. No single regulatory lever under a single legal regime will deliver policy aims such as innovation, personal dignity, Open Science, or the currently popular ‘data sovereignty’. The legal issues involved in the governance of data range from proprietary approaches (copyright, database rights) to privacy and data protection.
In addition, there is a wide range of public law instruments, for example relating to public sector data governance or the right to non-discrimination. Competition law (which may be both privately and publicly enforceable) also increasingly prescribes conduct in relation to data, such as in merger or acquisition cases, or in transparency provisions (Art. 17 CDSM; and centrally in the proposed DMA and AI Regulation).
The scope of our enquiry in this report is within private law, specifically the attempt to assert quasi-proprietary control of information and data or, conversely, to limit such attempts, for example by exempting desired activities via copyright exceptions, such as the exceptions for text and data mining in Arts. 3 and 4 CDSM.
We focus on case studies of three technological processes to explore in detail possible descriptions that would allow legal analysis, and an assessment of the need for a harmonisation of rights and connected exceptions under copyright law. The three case studies are:
(1) Data scraping for scientific purposes.
(2) Machine learning, in the context of Natural Language Processing (NLP).
(3) Computer vision, in the context of content moderation of images.
In parallel, we offer a thorough analysis of the policy rationale and legal context for the introduction of the two exceptions for text and data mining in the CDSM Directive (Art. 3 Text and data mining for the purposes of scientific research; Art. 4 Exception or limitation for text and data mining) which includes an analysis of how the right of reproduction (Art. 2 ISD) and its limitations (mainly Art. 5(1) ISD) interface with the overall.
The deliverable is pending acceptance by the European Commission.
870626_D3.6 Interim study on the state of harmonisation of the rights of reproduction and adaptation and connected exceptions_final.pdf