Published October 17, 2023 | Version v1
Drawing Open

License Flowers

Description


Carlos Vivar Rios, Open Research Data Engineer, Swiss Data Science Center

Cyril Matthey–Doret, Sr. Data Science Engineer, Swiss Data Science Center

Stefan Milosavljevic, Biomedical Data Engineer, Swiss Data Science Center

A joint venture between ETH Zurich, the EPFL and the Paul Scherrer Institute, the Swiss Data Science Center was created in 2017. The center's mission is to accelerate the use of data science and machine learning techniques within academic disciplines of the ETH Domain, the Swiss academic community at large, public institutions, and the industrial sector, and enable data-driven science and innovation for societal impact.

What we do in our work: we are all data engineers who build infrastructure that makes life easier for the people analyzing data (usually data scientists). We belong to the Open Research Data team, where we use and share open-source code, following best practices that ensure long-term (re)usability in line with the FAIR principles. Our professional positions and contributions build on the open-source code and material available online, which is why licenses play a fundamental role in our day-to-day work.

Project Overview:

Idea: Every code license is extremely abstract and dull, but beyond this dullness lies a world where others can (re)use code without legal issues, promoting progress, collaboration and personal rights. The goal of the artwork is to bring each license to life, highlighting its beauty and promoting its use. We started by imagining open-source software development as an ecosystem of living organisms with code as genetic information. For this ecosystem to thrive, pieces of code are used, adapted, shared, or refactored, all depending on the license behind them. Posting your code on the internet is not enough: without a license specifying the terms of use, it defaults to "all rights reserved", meaning no one is allowed to use, redistribute or adapt it. Following the biological metaphor, if code is DNA, then licenses are like organs defining whether pieces of code are compatible and fertile. Like different species, some licenses are more promiscuous than others, and they therefore follow different reproductive strategies.

Flowers encapsulate our metaphor perfectly, as they are the reproductive organs of plants and are also used to represent fertility in humans. In our art piece, we decided to use this strong symbol by associating each license with one flower. As with real flowers, we reimagined each code license as a unique character with its own story, personality, and quirks. These characters are brought to life through intricate and evocative drawings, seamlessly integrated into the pages of real notebooks. The drawings are inspired by the organic world of biology, specifically flowers, and follow the style of old biology books.

Objective: Visualize the licenses used in a representative sample of repositories from Papers with Code (https://paperswithcode.com/).

Steps:

1. Data Collection & Cleaning:

We began by collecting information from 220 repositories (14,000 in the second iteration) listed on Papers with Code. We then removed the repositories that did not contain code.
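The cleaning step can be sketched as follows. This is only a minimal illustration, not the workflow's actual code: the record fields (`name`, `license`, `language`) are hypothetical, and treating a missing detected language as "contains no code" is an assumption made for the example.

```python
from typing import List, Optional, TypedDict


class Repo(TypedDict):
    # Hypothetical record layout; the real dataset's fields are not given in the text.
    name: str
    license: Optional[str]   # None when the repository has no license
    language: Optional[str]  # None when no programming language was detected


def drop_repos_without_code(repos: List[Repo]) -> List[Repo]:
    """Keep repositories with a detected language (assumed proxy for 'contains code')."""
    return [repo for repo in repos if repo["language"]]


# Made-up sample records, for illustration only.
sample: List[Repo] = [
    {"name": "repo-a", "license": "MIT", "language": "Python"},
    {"name": "repo-b", "license": None, "language": None},
    {"name": "repo-c", "license": None, "language": "C++"},
]

cleaned = drop_repos_without_code(sample)
```

Note that unlicensed repositories are kept on purpose: "no license" is itself one of the categories the project visualizes.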

2. Data Aggregation and Use:

The cleaned data from the second iteration was then aggregated by license type and programming language. From this we selected the five most common license categories: no license, MIT, BSD 3-Clause, GPL 3.0 and Apache. We also added Creative Commons, an artistic license, even though it is rare, to highlight diversity.
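One way to perform this kind of aggregation is with `collections.Counter` from the Python standard library. The sketch below is an assumption about how the counting could be done, not the authors' implementation; the record fields and the sample data are made up.

```python
from collections import Counter

NO_LICENSE = "no license"  # label used here for repositories without a license

# Made-up records with hypothetical fields, for illustration only.
repos = [
    {"license": None, "language": "Python"},
    {"license": None, "language": "Python"},
    {"license": None, "language": "C++"},
    {"license": "MIT", "language": "Python"},
    {"license": "MIT", "language": "Julia"},
    {"license": "Apache-2.0", "language": "Python"},
]


def count_licenses(records):
    """Count repositories per license, mapping a missing license to NO_LICENSE."""
    return Counter(r["license"] or NO_LICENSE for r in records)


def count_license_language(records):
    """Count repositories per (license, language) pair."""
    return Counter((r["license"] or NO_LICENSE, r["language"]) for r in records)


license_counts = count_licenses(repos)
pair_counts = count_license_language(repos)

# The project keeps the five most common categories; the toy sample keeps two.
top_licenses = license_counts.most_common(2)
```

With the full dataset, `most_common(5)` would return the five categories to keep, and `pair_counts` gives the breakdown by programming language.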

3. Visualization Approach:

We wanted two inputs for AI image generation: a seed image and a prompt. The former guides the AI algorithm on style and feeling, while the latter steers the generation towards a specific composition and design.

For the seed image, instead of opting for a conventional plotting library, we drew inspiration from the unique visualization style of Dear Data (http://www.dear-data.com/). This led us to design a hand-drawn doodle, reminiscent of a flower, as the seed for style. 

For the prompt, we compiled the features of each selected license from https://www.tldrlegal.com/. This resource breaks each license down into what can, must, and cannot be done. With this profile in hand, we asked ChatGPT to describe flowers that represent those traits. The result was one prompt per license, with guidance for the color, the shape of the stem, the petals, and so on.
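The profiling step can be pictured as turning a can/must/cannot summary into a request for a flower description. The sketch below is hypothetical: the `LicenseProfile` class, the `build_flower_request` function, the wording of the request, and the example feature lists are all illustrative assumptions, not the text actually sent to ChatGPT or copied from tldrlegal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LicenseProfile:
    """Hypothetical container for a license's can/must/cannot summary."""
    name: str
    can: List[str] = field(default_factory=list)
    must: List[str] = field(default_factory=list)
    cannot: List[str] = field(default_factory=list)


def build_flower_request(profile: LicenseProfile) -> str:
    """Assemble a plain-text request asking for a flower that reflects the license traits."""
    lines = [
        f"Describe a flower that represents the {profile.name} license.",
        "Can: " + ", ".join(profile.can),
        "Must: " + ", ".join(profile.must),
        "Cannot: " + ", ".join(profile.cannot),
        "Give guidance for the color, the shape of the stem, and the petals.",
    ]
    return "\n".join(lines)


# Illustrative feature lists, written for this example only.
mit = LicenseProfile(
    name="MIT",
    can=["commercial use", "modify", "distribute"],
    must=["include copyright notice", "include license text"],
    cannot=["hold authors liable"],
)

request = build_flower_request(mit)
```

Keeping the profile as structured data makes it easy to generate one request per license from the same template, so the resulting flower descriptions differ only because the license traits differ.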

After refinement of both inputs, we used Midjourney to obtain suggestions of different flowers. Then we used inpainting, upscaling and cropping methods to refine the images. As a final touch to the flower pictures, we also added their names written in cursive using the online tool https://www.calligrapher.ai/, trying to match the writing style to each license character.

4. Physical display:

We selected six different physical formats on which to print the flowers, one per flower. Each format represents the personality of the people who might be interested in that license. For example, the CC license flower was assigned to a sketchbook, since it's an artistic license. For MIT, since it's a very relaxed and popular license, we chose a simple notebook with blank pages. For Apache, a license with many terms that protects recognition of contributions, we chose a black notebook with lined pages, to highlight its professional and precise nature. GPL can be seen as a passionate license, trying to spread its wisdom to future code, so we picked a notebook with a soft brown cover, better suited to repeated use. BSD-3 is similar to MIT, but with some details about what needs to be mentioned when changing things, much like a friendly guide, so we picked a purple notebook with grid paper, which gives structure but still leaves you freedom of use. Finally, the flower for no license was printed on an A3 poster, as it represents code that could have been in a notebook, but is not.

Only the MIT license is in an A4 format notebook, since it's the most popular license by far. All other licenses are in A5. To highlight the vast majority of code with no license, the no license poster is in A3 format. The poster will be displayed on an easel, overlooking the license flowers and showing its dominance over them.

Files

Apache_final_signed.png

Files (105.0 MB)

md5:335605922a4498a006f14bcb3482fbf7 (26.9 MB)
md5:d673a434c08c2b64026ec129f612c0d9 (29.1 MB)
md5:bc5daf08d047bf4f9a6cd2ac33ab93f6 (23.2 MB)
md5:38d1df7a6ecf83d81607f1ad2b2afe17 (6.8 MB)
md5:a81e778fd2eec706a9fcdd1b5e491b76 (17.9 MB)
md5:277cb8a457a54ae51ce3cd276c40d5a9 (645.4 kB)
md5:668f7f1dda962369ec1b5e313f90b6c4 (552.0 kB)

Additional details

Related works

Is derived from
Workflow: https://github.com/SDSC-ORD/license-collector (URL)