Published July 25, 2023 | Version 4
Presentation Open

Writing Clean Scientific Software

  • 1. Center for Astrophysics | Harvard & Smithsonian

Description

This presentation reviews clean coding strategies for students and scientists who have learned to program on their own without formal training. Scientific software has many common pain points, such as a lack of user-friendliness and code that is difficult to read. Many of these pain points exist because the de facto programming paradigm in scientific research is publication-driven development (PDD), in which software is written to publish the next research article at the expense of long-term software sustainability. These pain points make it harder to begin research and to collaborate with other scientists, and all too often make research frustrating. We can address many of them by writing readable, reusable, and maintainable code. Variable names should be chosen to reveal intention and meaning. The length of a variable name should be measured not by its number of characters, but by the time needed to understand its meaning. Decomposing large programs into smaller functions improves code readability, reusability, and testability. Functions should be short and do exactly one thing with no side effects. High-level, big-picture code should be separated from low-level implementation details, for example by writing code as a top-down narrative. Because comments often become out of date as code evolves, it is preferable to refactor code to improve its readability rather than to add comments describing how it works. Well-written, automated tests increase the flexibility of code. Tests should be run frequently so that we can find bugs as soon as we introduce them. In summary, we should think of code as communication.
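As a rough illustration of the strategies summarized above, the following Python sketch is one possible way to apply them together. It is not taken from the slides: the temperature-averaging task and every name in it (mean_temperature_in_kelvin, remove_missing_readings, celsius_to_kelvin) are hypothetical and chosen only for this example. The high-level function comes first and reads as a short narrative, the low-level helpers follow it, each function does one thing without modifying its inputs, and two small automated tests check the behavior.

```python
from statistics import mean

ABSOLUTE_ZERO_IN_CELSIUS = -273.15


def mean_temperature_in_kelvin(readings_in_celsius):
    """High-level narrative: drop missing data, convert units, then average."""
    valid_readings = remove_missing_readings(readings_in_celsius)
    readings_in_kelvin = [celsius_to_kelvin(reading) for reading in valid_readings]
    return mean(readings_in_kelvin)


def remove_missing_readings(readings):
    """Return a new list without None entries; the input list is not modified."""
    return [reading for reading in readings if reading is not None]


def celsius_to_kelvin(temperature_in_celsius):
    """Convert a single temperature from degrees Celsius to kelvin."""
    return temperature_in_celsius - ABSOLUTE_ZERO_IN_CELSIUS


def test_celsius_to_kelvin():
    assert abs(celsius_to_kelvin(0.0) - 273.15) < 1e-9


def test_mean_temperature_ignores_missing_readings():
    readings = [0.0, None, 10.0]
    assert abs(mean_temperature_in_kelvin(readings) - 278.15) < 1e-9
```

The test functions follow the naming convention that a test runner such as pytest discovers automatically, so they can be rerun after every change. The names are longer than single letters, but each one states what the value or function means, which is the measure of length suggested above.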

Notes

Many of the suggestions described in this presentation were adapted from the references linked in the slides. A minor portion of this presentation was adapted from the paper "Best Practices for Scientific Computing" by G. Wilson et al., which is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Files

WritingCleanScientificSoftware_v4.pdf (863.7 kB)
md5:39670b75a450a64ad0f0a6f1f422a137

Additional details

Funding

Collaborative Research: Frameworks: An open source software ecosystem for plasma physics 1931388
National Science Foundation

References

  • Edmondson, A. (2018). The Fearless Organization: Creating Psychological Safety in the Workplace for Learning, Innovation, and Growth.
  • Feathers, M. (2004). Working Effectively with Legacy Code.
  • Fowler, M. (2011). Eradicating Non-Determinism in Tests. https://martinfowler.com/articles/nonDeterminism.html
  • Gamma, E.; Helm, R.; Johnson, R.; and Vlissides, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software.
  • Hicks, S. Code Is Communication. https://steven-j-hicks-speaking.netlify.app/code-is-communication
  • Khorikov, V. (2020). Unit Testing Principles, Practices, and Patterns.
  • Martin, R. C. (2009). Clean Code: A Handbook of Agile Software Craftsmanship.
  • Martin, R. C. (2018). Clean Architecture: A Craftsman's Guide to Software Structure and Design.
  • McConnell, S. (2004). Code Complete: A Practical Handbook of Software Construction, 2nd edition.
  • Wilson, G.; Aruliah, D. A.; Brown, C. T.; Chue Hong, N. P.; Davis, M.; Guy, R. T. et al. (2014). Best Practices for Scientific Computing. PLoS Biology, 12, e1001745, https://doi.org/10.1371/journal.pbio.1001745
  • Wilson, G.; Bryan, J.; Cranston, K.; Kitzes, J.; Nederbragt, L.; and Teal, T. K. (2017). Good enough practices in scientific computing. PLoS Computational Biology, 13, e1005510, https://doi.org/10.1371/journal.pcbi.1005510
  • Young-McLear, K.; Zelmanowitz, P. E.; James, R. W.; Brunswick, D.; and DeNucci, T. W. (2021). Beyond Buzzwords and Bystanders: A Framework for Systematically Developing a Diverse, Mission Ready, and Innovative Coast Guard Workforce, https://strategy.asee.org/36070