Interviewer: So thank you so much for agreeing to this interview. You have already sent me your form of consent. But I just want to remind you that this interview is being recorded and the video and audio will be used by myself and my student research assistants to make a transcript and we will be using a local large language model to help us with that. Are you okay with that?

Interviewee 19: Yes, I am okay with that.

Interviewer: Brilliant. So this is a project about open science practices in linguistics. And linguistics is a very broad field, as you know. So my first question is, where do you situate yourself and your research within linguistics?

Interviewee 19: I would situate myself within, I think, three broad areas of linguistics. First being corpus linguistics, then cognitive linguistics, and within that in PROJECT.

Interviewer: Oh, brilliant. Thank you. And we'll begin with you and your work and your associations. What do you associate with open science and open science practices? So what springs to mind?

Interviewee 19: For me, open science is being rigorous about your whole research process from the planning stage through the actual research, through the publication stage. And it's, I guess for me, not only about making everything publicly available, but also making my own life easier because it entails like writing very rigorous documentation in the sense of documenting decisions I made, say, in corpus linguistics when it comes to annotation: which part of my data did I annotate? How and why? What were the critical parts that we discussed in the team? So it's about making it reusable for others, but also reusable for me later.

Interviewer: That's good. Yeah. And I introduced this study and this project under the umbrella term open science. But linguistics is traditionally thought of as a humanity. And in the humanities, some people prefer the term open research rather than open science. And others still prefer open scholarship, which is often thought more as an umbrella term to encompass open science slash research. And open education as well. I wonder, have you thought about this before? What term do you think suits linguistics better? I'm going to put them in the chat so you can see them.

Interviewee 19: Yeah, thank you for that. I mean, I originally come from an open education background, so that's how I ended up in open science or in the practice anyway, because I was publishing my teaching materials and then found it rather weird not to be the same about my research process. So, honestly, I always say open science because I think it is the most used term and it's most easily understood when I talk to people. It is at least my impression. I think having something like open scholarship is a good goal to also encompass open education, but I'm not sure if it really does. Because I still think that scholarship doesn't sound like you have higher education in there. At least for me, it doesn't. I haven't talked to anybody about it. But for me, if I read open scholarship, it sounds even more, honestly, it sounds like a humanities thing, referring back to your question, because being a scholar sounds a little bit antiquated. Is that a word? To me, I'm a scholar that sounds like I would sit in libraries a lot and read and think really hard about the stuff that I do. And I wonder, because there is also the open theory movement, whether that would be part of this as well. I think it would fit better to open scholarship than open education. I'm not sure I even answered your question because I think it's really hard.

Interviewer: Yeah. No, but still, it's really, really interesting. I will continue to use the term open science, but I mean it in a very broad sense. And you can use whichever term, you know, you prefer and you feel is most appropriate. But we'll now move to your own experiences. Which open science practices are you involved in or have you been involved in?

Interviewee 19: I would say, to sum it up, in a lot, because I'm currently working in a PROJECT, so PROJECT methodology statistics PROJECT. And one of our goals is to ensure that the data produced in our project, so within the whole PROJECT, is FAIR data, so as open as possible but as closed as necessary, in the sense that we also have, say, clinical linguistics, and you of course don't want to publish people's results to personality tests correlation with their other features. But yeah, so basically involved in pre-registration, consent form making, working with law people here in our project, involved in writing the documentation, advising projects on proper documentation practices, on data wrangling, analysis and publication in the end. And for my own research, it's basically the same because the only thing I have never done is pre-registration for my own research. But within the PROJECT, I'm involved in that too. So I think, like, and I also teach about open science. So maybe I'm also an educator about open science while also doing open science myself. So it's a bit, yeah.

Interviewer: Yeah, so very involved, that's for sure.

Interviewee 19: I think so, yeah.

Interviewer: And you've partly answered my next question, which is how did you first learn about these practices and what or who encouraged you to start them? But perhaps you can tell us a bit more about that.

Interviewee 19: Yeah, I first learned about these practices during my PhD when I was doing actually an education certificate, so a teaching certificate for university, when I took a course on developing open teaching materials, so OERs, open educational resources. And I thought, man, that is so cool. I wish I would have had that. When I started out, I had to ask a lot of people, can you give me your slides? That would be so nice. And blah, blah. And I always felt that it's kind of unfair that having these resources relies on having a network. Because not everybody has the same network and not everybody can or dares to ask other people about their materials. Because it always feels a bit like maybe I'm not good enough myself to come up with these things if I need other people's stuff as a basis for that. And so I thought, well, I wish I would have had it. So I'm going to do it myself. So I published OERs. And then when I published my first papers, I thought, well, now you do all this open education stuff and are very transparent about your teaching. Why aren't you doing it in your research as well? Sounds kind of inconsequential. So I started doing it. And I was informed a lot about CC licenses in the OER context. Anyway, that was basically where I started out with my research materials. And then I started publishing my code and data where possible. And yeah, documentation for the annotation. And then it kind of grew from there.

Interviewer: Yeah, super interesting. And when you were interested in, for instance, publishing your code, the last thing you mentioned, but all of these other practices too, where did you learn how to do it?

Interviewee 19: Yeah, I talked to people. So here's the network aspect again. I looked at how other people did it. I saw other papers where they said, ah, data and code are available on OSF. Or this is available on my GitHub. And I thought, well, okay, this is obviously how you do it. So I checked it out. I talked to them. And then I tried it myself. Usually there is no punishment or anything. So if you do it wrong, nobody comes around and tells you, well, this is how you should have done it. Maybe it's a good thing. Maybe it's not a good thing. I'm not sure. Because how am I supposed to know if I did something wrong if nobody looks at it? And when I started reviewing papers, and I got papers with very good markdown documents, having their whole analysis in a very well-written markdown document and everything. It's like, okay, I love this so much as a reviewer. I want to do it like this because it made me so happy about the paper. Um, so yeah, I looked at other people's stuff, I talked to other people, and I tried to do what other people did that I liked. So that was easy for me. That's what I liked. So I'm gonna do it like this.

Interviewer: Yeah, that sounds great. And it sounds like none of this was obligatory at any stage, or am I mistaken, did you ever submit to a journal that required you to do this? Or did a supervisor encourage you or require you to do any of this?

Interviewee 19: No, my supervisor never required me to do it because she never did it herself. For her, it was super new. And I think she comes from a time where it was highly unusual to publish your data out of fear that somebody else might grab it and then might grab the super interesting research question you're working on from your hands. And I've never submitted to a journal that really required it, but to journals that encouraged it. But let me think. Maybe one, I'm not sure whether one journal actually required it, but that was a stage where I was doing it anyway. So I'm not really sure. I was like, okay, whatever, I'm gonna submit it anyway. Um, yeah, no, yeah, it was never mandatory.

Interviewer: Interesting. And I'm not sure actually, have you ever attended a meeting from ReproducibiliTea in the HumanitiTeas?

Interviewee 19: Yes.

Interviewer: Yes, you have. Okay, so I can ask the next question. What was your motivation for attending these meetings?

Interviewee 19: I think it was twofold because one, I found it really interesting and I wanted to learn about what other people do because I always find it great to see what other people do and then get super inspired and then have too many projects myself. I think you can maybe relate to that. And I mean, your ReproducibiliTea in Cologne is also often online or hybrid, or it's very easy to join from afar, but actually here in my university the ReproducibiliTea is part of my job also. So in our project we are supposed to foster open science and to spread the word and we have a ReproducibiliTea here, which was kind of, it was basically dead. Only the person who organized it was there. And I was super surprised when I came about in the first session when I was at the university. They were really happy that somebody's there again. And now we're like six people or that ish. So, yeah, I take it in my own university. I take it as an opportunity to read cool papers that I wouldn't have time to read myself or that I wouldn't make time to read, let's say like that. And then discuss them with other people from like psychology, psycholinguistics, because those are often perspectives that I don't hear a lot about. So it's super valuable.

Interviewer: Yeah great, let's now move away from your personal experiences and associations with open science. And let's try and think about the broader community of linguists. And you're welcome to think about the subdisciplines that you're most familiar with, of course. And as far as you can tell, how widespread are open science practices in linguistics at the moment?

Interviewee 19: I think it depends on what practices we're actually talking about and what subdiscipline we're talking about. So I think in psycholinguistics, from what I've gathered, pre-registration is a very common thing because they are used to adhere to the psychological way, psychology way of, not psychology, psychology way of doing it. And then publishing data and code is more usual there. And my feeling is that in corpus linguistics, or like LANGUAGE corpus linguistics, I should say, it's getting more and more common to publish data, code and also documentation, especially among younger researchers. And by young, I mean like age young and not the stage of academia they're in.

Interviewer: Okay. Yeah. That's interesting. And so the practices that most linguists are familiar with, you'd say, are pre-registration, open data, open code.

Interviewee 19: Yeah, that's at least my impression. I think what's coming about now is open documentation because people are learning about how useful it is to have like, say, an annotation scheme, even if people are not doing exactly the same thing that you're doing. And I think it's coming about because of the corpora that are being published. And these corpora often come enriched with a lot of metadata and a lot of documentation. And then that kind of spreads from there, I would say. So enriching your stuff with metadata is very unusual so far, I would say. Because most people, when you tell them, well, do you have any metadata for that? They're like, what is that? What even is metadata? So I would say that is very uncommon.

Interviewer: Okay, yeah. And what about publishing preprints and postprints?

Interviewee 19: I think that is very common in computer linguistics, from what I've gathered, and also psycholinguistics. Not so far in my branch, honestly. Because I feel, and it's also the same for me, that the legal aspects of publishing preprints are often not very well known. So, for me personally, I would also have to say that it's the part that I know the least about.

Interviewer: And then my next question would be, what do you think are some of the reasons why there is this, as far as you can tell, sort of at least mixed uptake of open science practices in linguistics? And what are some of the factors that contribute to this low-ish uptake in several subdisciplines?

Interviewee 19: I think there are several reasons. So for one, you can get away with not doing it. So, if it's not mandatory, then why should I do it? Because it takes more time. That's basically the second reason snuck in there. Writing good documentation, making your code publishable, making your data publishable takes a lot of time. And we are all very short on time. So writing the paper takes enough time itself. And if I can get away with not doing anything extra, then fine. And so we have it not being obligatory, time constraints, and then I think it's also, I wouldn't say maybe necessarily fear, but being maybe uneducated about the legal aspects of it. So if I do it, do I accidentally publish data or a preprint that I'm not allowed to publish? Will I then get sued? Will the university get sued? Will I lose my job over this and so on? And then connected to that, I think, is an actual fear of criticism because, oh, no, I'm publishing my code now. People will see that I do not write pretty code or maybe I did it super complicated. And then a very well-experienced researcher will come around and say, this is crap. Sorry if you have to beep that out or anything. I don't know. It's the same for teaching materials, basically. So the fear of criticism in a competitive environment is real, I would say. And if I'm being transparent about what I do, I'm making myself very prone or prone to attacks, maybe. I'm thinking of the German word angreifbar [open to attack], so I'm not sure.

Interviewer: Yeah, I understand. And do you think this fear is justified?

Interviewee 19: Partially, yes, I think, because I've heard from people who have had experiences like that and also from people who have the experience that their ideas were actually stolen. And, yeah, this is not really sanctioned other than people saying, oh, well, this person stole that idea. That wasn't so nice. Well, let's move on with our lives. Yeah, so I can understand it. But from my experience, 95% of people don't even look at your data and your code. So if you publish it, nothing happens, basically, because people are like, ah, no, too much work. I can't. People who do look at it will mostly have something to criticize. Like the same with your publication: so you write a normal paper. There will be people who will be like, yeah, this is great, and then there are other people who are like, no, this is wrong, I'm gonna tell you now exactly why in another publication. And yeah, I think through this science also improves itself. So it's, the criticism is scary, but it's also helpful. But for this a good culture of criticism needs to be established. And I'm not sure we have that in the open science community yet.

Interviewer: Yeah, interesting. And do you think there are any specificities about linguistics that need to be taken into consideration when trying to implement open science practices in linguistics?

Interviewee 19: I'm not sure if you can even put it like that, open science practices in linguistics, because linguistics deals with so many different things and types of data and is adjacent to so many other disciplines. So when I do psycholinguistics, I have to adhere to the psychology traditions. When I do sociolinguistics, maybe I'm more to the sociology traditions. When I do historical linguistics, and more like, and so on, right? And the data is very varied. And so going through a whole open science cycle in linguistics, I don't think is a thing, maybe. Maybe it's subdiscipline specific. Yeah. Of course, like general things, preprints and everything, you could do in every subdiscipline. But especially with making code and data available or documentation, it will look very different depending on where you go.

Interviewer: Yeah. And this brings me actually to the next question, which is there is a risk that open science advocates, if you can put it that way, tend to, there's a risk that they continue preaching to the choir in the sense that we're speaking to people who are already aware and interested and largely agree with these practices. What can we do to reach out to more linguists, the ones who don't know or don't like these ideas?

Interviewee 19: Yeah, I'm not sure about that, actually. We have the same problem in open education because I do a lot of like events for open education where I talk about what I did and blah. And of course, the people there are interested in open education because they came to an event about it. And an alternative might be to make it obligatory for people to take courses or whatever. But then, you know, if I take a course because I have to take a course, then I'm going to sit there and ignore what is being said. I think the only real way to get people to do it would be to make it mandatory in the publication process. So if I can only publish my paper if I put everything on OSF or wherever, then people will do it because they have to, because they have to have publications, right? Um, it's the same with open access. In the beginning, everybody was skeptical, like, nah, do I really need it? Libraries have access anyway. And who cares about these three people without a university affiliation? They can just email me, right? There we are, the network thing again, because you have to dare to ask and you blah. Yeah. But then universities started funding open access and people noticed that it's actually useful. But first universities and like, what's it called? Third party Drittmittelgeber [funders] said, yeah, you can do it, but you have to publish it open access. And then people are like, okay, yeah, well, if you want me to. But then they found out, oh, super convenient that if you go on the website, you can download the article. Awesome. Right? And then it spread. But in the beginning, it was because it was obligatory or at least partially obligatory. And I think that would be the same as well for open science.

Interviewer: Super interesting. Yeah, that was my last official question, but is there anything else you want to add on the topic of open science and linguistics or in the humanities more broadly?

Interviewee 19: I think one thing I've noticed in the workshop about open science that I taught recently is that it's a very misunderstood concept by people who don't know much about it in the sense that they think I now have to immediately make everything open. And that is really not the case because some things can't be open and some things shouldn't be open. And also not immediately. I mean, I don't remember who it was. There was this one PhD student who wrote their whole thesis openly. Like you could access the page where they published it at any time and see their progress.

Interviewer: Oh, wow.

Interviewee 19: I don't remember who it was. Yeah, it was, I don't remember who it was, but that of course is super.

Interviewer: Like Big Brother, PhD style.

Interviewee 19: Yeah, basically. And you could check like, oh, this week nothing has changed. Ha ha ha. Right. But that of course is like super scary. But I think that's what people think about when they hear open science, that they are now going to stand in front of a tribunal of people judging what they do because they have to make everything open. So I'm not sure how to, I think we're at the preaching to the choir part again. I think for open science to be more prevalent in the research community, people should understand better what open science actually means. Because it's such a broad term. Yeah. Yeah.

Interviewer: Great. Thank you. We'll stop the recording there. All right.
