Writing Clean Scientific Software
Description
This presentation reviews clean coding strategies for students and scientists who have learned to program on their own, without formal training. Scientific software has many common pain points, such as poor user-friendliness and difficult-to-read code. Many of these pain points exist because the de facto programming paradigm in scientific research is publication-driven development (PDD), in which software is written to publish the next research article at the expense of long-term software sustainability. These pain points make it harder to begin research and to collaborate with other scientists, and all too often make research frustrating. We can address many of them by writing readable, reusable, and maintainable code. Variable names should be chosen to reveal intention and meaning. The length of a variable name should be measured not by its number of characters, but by the time needed to understand its meaning. Decomposing large programs into smaller functions improves code readability, reusability, and testability. Functions should be short and do exactly one thing, with no side effects. High-level, big-picture code should be separated from low-level implementation details, for example by writing code as a top-down narrative. Because comments often become out of date as code evolves, it is preferable to refactor code to improve its readability rather than to add comments describing how it works. Well-written automated tests increase the flexibility of code, because they let us change it with confidence that existing behavior still works. Tests should be run frequently so that we can find bugs as soon as we introduce them. In summary, we should think of code as communication.
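As a rough illustration of several of these practices together (intention-revealing names, short single-purpose functions without side effects, a top-down narrative, and automated tests), here is a minimal Python sketch. The temperature example and every name in it (`mean_temperature_celsius`, `ensure_physical`, `to_celsius`, `FREEZING_POINT_KELVIN`) are hypothetical and chosen only for illustration; they are not taken from the presentation.

```python
from statistics import mean

# A named constant reveals intent better than a bare 273.15 in a formula.
FREEZING_POINT_KELVIN = 273.15


def mean_temperature_celsius(temperatures_kelvin):
    """High-level narrative: validate, convert, then average."""
    ensure_physical(temperatures_kelvin)
    temperatures_celsius = to_celsius(temperatures_kelvin)
    return mean(temperatures_celsius)


# Low-level details follow the high-level function, so the file reads top-down.

def ensure_physical(temperatures_kelvin):
    """Raise ValueError if there are no readings or any reading is below 0 K."""
    if not temperatures_kelvin:
        raise ValueError("no temperature readings given")
    if any(t < 0 for t in temperatures_kelvin):
        raise ValueError("temperatures in kelvin cannot be negative")


def to_celsius(temperatures_kelvin):
    """Return a new list; the input list is left unmodified (no side effects)."""
    return [t - FREEZING_POINT_KELVIN for t in temperatures_kelvin]


# Small automated tests that can be rerun after every change.

def test_mean_temperature_celsius():
    readings_kelvin = [273.15, 283.15, 293.15]
    # Compare with a tolerance because floating-point subtraction is inexact.
    assert abs(mean_temperature_celsius(readings_kelvin) - 10.0) < 1e-9


def test_negative_kelvin_is_rejected():
    try:
        mean_temperature_celsius([-1.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative kelvin reading")
```

Rerunning the two test functions after each change, for instance with a test runner such as pytest, is one way to catch a bug soon after it is introduced. Note that the sketch needs few explanatory comments about how it works, since the function and variable names carry most of the meaning.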
Files
WritingCleanScientificSoftware_v3.pdf (796.6 kB)
md5:968f02dab1bba227e87e3befd744a1b9
Additional details
Related works
- Has part
- 10.5281/zenodo.3491142 (DOI)
References
- Edmondson, A. (2018). The Fearless Organization: Creating Psychological Safety in the Workplace for Learning, Innovation, and Growth.
- Feathers, M. (2004). Working Effectively with Legacy Code.
- Fowler, M. (2011). Eradicating Non-Determinism in Tests. https://martinfowler.com/articles/nonDeterminism.html
- Gamma, E.; Helm, R.; Johnson, R.; and Vlissides, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software.
- Hicks, S. Code Is Communication. https://steven-j-hicks-speaking.netlify.app/code-is-communication
- Martin, R. C. (2009). Clean Code: A Handbook of Agile Software Craftsmanship.
- Martin, R. C. (2018). Clean Architecture: A Craftsman's Guide to Software Structure and Design.
- McConnell, S. (2004). Code Complete: A practical handbook of software construction, 2nd edition.
- Wilson, G.; Aruliah, D. A.; Brown, C. T.; Chue Hong, N. P.; Davis, M.; Guy, R. T. et al. (2014). Best Practices for Scientific Computing, PLoS Biology, 12, e1001745, https://doi.org/10.1371/journal.pbio.1001745
- Wilson, G.; Bryan, J.; Cranston, K.; Kitzes, J.; Nederbragt, L.; and Teal, T. K. (2017). Good enough practices in scientific computing. PLoS Computational Biology, 13, e1005510, https://doi.org/10.1371/journal.pcbi.1005510
- Young-McLear, K.; Zelmanowitz, P. E.; James, R. W.; Brunswick, D.; and DeNucci, T. W. (2021). Beyond Buzzwords and Bystanders: A Framework for Systematically Developing a Diverse, Mission Ready, and Innovative Coast Guard Workforce, https://strategy.asee.org/36070