Title: The power of collaboration: creating maximum value from open-source energy projects

The case for openness

Have you ever wondered how much of your time and energy is wasted duplicating work someone somewhere else has done? Or, at a grander scale, how much of our national investment in research and innovation involves duplication and inefficiency? Have you ever met someone working on something similar to you and been frustrated that you hadn’t come across them much earlier?

Those frustrations are what the Catalogue of Projects on Energy Data (CoPED) – an open-source directory of energy projects in the UK – sets out to address. Conceived as part of efforts to unify existing but disparate information on UK Research and Innovation (UKRI)-funded energy projects and their associated data, CoPED illuminates not only the key players and partnerships in the sector, but the gaps and inequalities in funding.

“We wanted CoPED to be a centre point for the diverse communities of industry and academia to come together, through open data, and identify new trends, links and potential partnerships,” says Dr Alison Halford, a researcher at Coventry University and one of the original developers of CoPED. “But we also wanted to build narratives around opportunity: are some universities or some parts of the sector receiving all the funding while others struggle for representation? It’s as much about raising questions as it is solving problems.”

Openness offers several benefits:

* it reduces duplication of effort and enables more efficient allocation of resources
* it aids reproducibility, increasing the confidence we can have in our collective knowledge
* it drives innovation – visibility encourages others to build on the work and develop new products
* it makes it easier for people with common interests to work together.

CoPED aims to deliver these advantages by building on the openness of other project metadata sets.
For example, CoPED is currently based on the already-open UKRI dataset, filtered to focus only on energy projects and augmented with visualisation tools. These additions broaden its audience and increase the likelihood that others will mine it for further benefit. Future updates will add other data sources to further increase its value.

But how do you achieve all of this in practice?

Building openness in practice

Making something open-source is easy, right? Upload your code to GitHub, write a README and you’re done. While that is better than nothing, it’s like throwing eggs, flour and sugar on the table and calling it a cake! It takes a lot more to make something open-source and actually useful to people.

Energy Systems Catapult (the Catapult) exists to help translate innovation into tangible impact in the energy sector. It has played a key role in CoPED, making sure the project is set up to drive enduring impact. What does that involve?

A successful, impactful open-source project requires:

1. Designing with a diverse set of end users in mind
2. Building an inclusive, active community
3. Identifying an ongoing model for support
4. Getting the basics right: choosing an appropriate licence.

Designing with a diverse set of end users in mind

One of the biggest reasons open-source projects are under-used is that the original developers focused too narrowly on their own application and never stopped to consider how potential end users might actually use the project (or even who those end users might be).
Sam Young, who manages the Data Science and AI practice at the Catapult, says: “We’ve seen several examples of open-source tools in the energy sector that have sophisticated, conceptually elegant frameworks for configuration and change management, but which make it incredibly complex to do the simplest thing – which is what 95% of end users will actually want to do!”

It is vital both to think about what end users might want and to actively engage with those end users throughout.

Alison says: “We saw the open-source and open-access aspects of CoPED as an opportunity to build capacity; to make sure its tools could be used by those with the least technical knowledge as well as those with the most. That has to be part of the open data gold standard. We need to think deeply about concepts such as inclusivity to maximise the usability and user base of the tools we produce.”

Catherine Jones, Energy Data Centre Lead at the UK Energy Research Centre and a member of the CoPED working group, emphasises the importance of considering this from the start.

“It’s easy to forget that not everyone is able to navigate or interact with online resources in the same way,” says Catherine. “One of my inputs on the CoPED working group was to highlight the importance of the tools and documentation being fully functional for people who use assistive technology. Both accessibility and openness need to be built in from the start. Openness in particular is a state of mind – if it’s something you only think about at the end, you’re far less likely to implement it properly and more likely simply to move on to the next project.”

“With CoPED, we sought to embed stakeholder voices right from the beginning through a series of workshops with academia, industry and policymakers,” says Alison. “The idea was ‘co-creation’, but with hindsight we didn’t really do enough in this regard.
How to make sure people are engaged throughout the process isn’t an easy question, but my top tip for collaboration is to do your homework and be really targeted at the stakeholders you choose to engage with and the partners you pick to work with you. Often that won’t necessarily be the most obvious people or organisations.”

If you are struggling to engage with stakeholders, look for organisations that bridge different perspectives and stakeholder groups (like the Catapult) and can open up connections for you.

“Organisations like the UK’s Catapult Network can play an important role in supporting open-source,” says Sam. “We can coach and mentor developers, and – particularly if they come from an academic background – offer them an understanding of what it takes to apply their work in industry.”

Building an inclusive, active community

Engaging with stakeholders and end users early on is also key to building an active community. The greatest value from open-source comes when different people join in to collaborate on a tool or project and a community forms around it. This rarely just happens – it often takes intentional action, concerted effort and time.

CoPED is designed to help people find others who might be involved in similar work, and the way the project was run offers further lessons:

* Reach out to a diverse range of stakeholders and not just your immediate circle – they will bring new perspectives and often surprising insights
* Publicise your work in a diverse range of channels – and don’t just promote it once, but keep talking about it over weeks and months so new people hear about it
* Look for organisations and people with a clear commitment to openness – they may be more willing to get involved.

Identifying an ongoing model for support

Sam Young says that “another key reason open-source projects fail to deliver lasting value is that they don’t plan for ongoing support”. Alison agrees.
“You’ve got to have an exit strategy and a legacy plan,” she says. “For CoPED, that’s where the Catapult came in – being completely honest, we hadn’t thought fully about what would happen to CoPED when the initial project came to an end, so it was great that the Catapult was working with us and was able to take it on. I feel that it’s almost disrespectful to the data and community if it isn’t kept maintained and updated.”

This support could be grant-funded or voluntary, but it doesn’t have to be. Damon Roberts, a Data Consultant at the Catapult, has worked with a host of private organisations to help them deliver open-source projects in the energy sector.

“One of my favourite projects – and a hugely successful one – involved a company called Heatweb,” says Damon. “Heatweb produced open hardware controllers for heating and cooling units using Raspberry Pi devices. The goal was to give people access to data about their systems and allow them to make minimal, commonsense improvements that could have massive impacts.”

With Heatweb, clients will happily pay for deep expertise and knowledge of the product as well as the hardware or software itself. Damon adds: “Developing a successful business model for an open-source project involves a different way of thinking. In my experience, companies that are flexible are well placed to take advantage of the opportunities; companies that are more set in their ways often struggle.”

Sam gives a key reason ongoing support is required: “If you successfully build a tool that people are using and depending on, you have to recognise that their needs may change over time and you might need to change your tool to keep delivering for them.”

Getting the basics right: choosing an appropriate licence

Finally, if people are going to use your work, it’s important that you get the basics right: in particular, you must be careful to choose an appropriate licence.
Robbie Morrison, a member of the CoPED working group, is a specialist in licensing and governance who has worked in the energy systems modelling field for almost three decades. He says: “Energy systems modelling has a reputation for being an area committed to the principles of openness. The Achilles’ heel here is data: the issues and complexities around data licensing are poorly resolved, and the overarching data standards also have to be licensed suitably. Personally, I take a hardline approach to openness: data, to be truly open, needs to be published under a Creative Commons BY 4.0 licence where possible. What’s clear, though, is that researchers need to understand licensing, and that licensing is an important part of their open work.”

Selina So, a data science lead at Innovate UK and another member of the CoPED working group, points to alternative ways in which the wider community can benefit from industry-produced data without the need for an entirely open approach.

Selina says: “Some data will be commercially sensitive, for example, and can’t be made open. In those cases, we can use techniques like federated learning, where data is used to help train models and share knowledge without having to be made public. Having said that, I do believe there are strong benefits to open data – not least in addressing potential biases.”

Get involved

CoPED aims to foster openness and transparency, and we hope that reading about it has inspired you to do more towards those aims in your own area. Perhaps consider:

* thinking about who might benefit from making your own work more open, and talking to a diverse range of people about it
* making your work open and trying to build a community around the most useful parts
* contributing to, or publicising, existing open-source work – such as CoPED itself.
Authors and contributors

* Stuart Gillespie
* Alison Halford
* Catherine Jones
* Damon Roberts
* Robbie Morrison
* Samuel Young
* Selina So
* Stephen Haben
* Alexandra Araujo Alvarez
* Kirstie Whitaker
* Lucy Killoran
* Malvika Sharan

Acknowledgements

This case study is published under The Turing Way Practitioners Hub Cohort 1 - case study series. The Practitioners Hub is The Turing Way project that works with experts from partnering organisations to promote data science best practices. In 2023, The Turing Way team partnered with five organisations in the UK including the Catalogue of Projects on Energy Data (CoPED) team of Energy Systems Catapult. This work is supported by Innovate UK BridgeAI. The Practitioners Hub has also received funding and support from the Ecosystem Leadership Award under the EPSRC Grant EP/X03870X/1 & The Alan Turing Institute.

We thank Dr Samuel Young, Practice Manager for Data Science & AI at Energy Systems Catapult, and Dr Stephen Haben, Senior Data Science Consultant at Energy Systems Catapult and an Expert in Residence for the first cohort of The Turing Way Practitioners Hub, for facilitating the development of this case study.

The inaugural cohort of The Turing Way Practitioners Hub has been designed and led by Dr Malvika Sharan. The Research Project Manager is Alexandra Araujo Alvarez. Stuart Gillespie is the technical writer for this case study, and others in the series. Lucy Killoran, a Turing Enrichment Student and PhD Researcher - Archaeology and Computing Science at University of Glasgow and Historic Environment Scotland, served as The Turing Way liaison to the ESC contributors and the writing team. Cami Rincón, previous Research Applications Officer at the Turing Institute, contributed to the development of the Case Study Framework in this project.

Led by Dr. Kirstie Whitaker, Programme Director of the Tools, Practices, and Systems research program, The Turing Way was launched in 2019.
The Turing Way Practitioners Hub, established in 2023, aims to accelerate the adoption of best practices. Through a six-month cohort-based program, the Hub facilitates knowledge sharing, skill exchange, case study co-creation, and the adoption of open science practices. It also fosters a network of 'Experts in Residence' across partnering organisations.

For any comments, questions or collaboration with The Turing Way, please email: turingway@turing.ac.uk.

Cite this publication

Gillespie, S., Halford, A., Jones, C., Roberts, D., Morrison, R., Young, S., So, S., Haben, S., Araujo Alvarez, A., Whitaker, K., Killoran, L., Sharan, M. (2023). Shared under CC-BY 4.0 International License. Zenodo. https://doi.org/10.5281/zenodo.10338376