Published July 25, 2023 | Version 4
Presentation Open

Writing Clean Scientific Software

  • 1. Center for Astrophysics | Harvard & Smithsonian


This presentation reviews clean coding strategies for students and scientists who learned to program on their own, without formal training. Scientific software has well-known pain points, such as poor user-friendliness and difficult-to-read code. Many of these pain points exist because the de facto programming paradigm in scientific research is publication-driven development (PDD), in which software is written to publish the next research article at the expense of long-term software sustainability. These pain points make it harder to begin research and to collaborate with other scientists, and all too often make research frustrating. We can address many of them by writing readable, reusable, and maintainable code. Variable names should reveal intention and meaning. The length of a variable name should be measured not by its number of characters but by the time needed to understand its meaning. Decomposing large programs into smaller functions improves readability, reusability, and testability. Functions should be short and do exactly one thing, with no side effects. High-level, big-picture code should be separated from low-level implementation details, for example by writing code as a top-down narrative. Because comments often become out of date as code evolves, it is preferable to refactor code to improve readability rather than to add comments describing how it works. Well-written automated tests increase the flexibility of code. Tests should be run frequently so that we find bugs as soon as we introduce them. In summary, we should think of code as communication.
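As a minimal sketch of how several of these suggestions can fit together, the Python example below uses intention-revealing names, short single-purpose functions with no side effects, a top-down narrative (the high-level function comes first and the helpers it calls follow), and small automated tests. The function names and the sample data are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
import math


def mean_of_valid_measurements(measurements):
    """High-level narrative: drop missing values, then average the rest."""
    valid_measurements = remove_missing_values(measurements)
    return arithmetic_mean(valid_measurements)


def remove_missing_values(measurements):
    """Return a new list without NaN entries; the input is left unchanged."""
    return [value for value in measurements if not math.isnan(value)]


def arithmetic_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("cannot average an empty sequence")
    return sum(values) / len(values)


# Tests written in the style collected by a runner such as pytest.
def test_mean_ignores_missing_values():
    measurements = [1.0, math.nan, 3.0]
    assert mean_of_valid_measurements(measurements) == 2.0


def test_removing_missing_values_does_not_modify_input():
    measurements = [1.0, math.nan, 3.0]
    remove_missing_values(measurements)
    assert len(measurements) == 3
```

Because each helper does one thing and returns a new value instead of modifying its argument, each can be tested on its own, and the top-level function reads as a summary of the analysis without any explanatory comments.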


Many of the suggestions in this presentation were adapted from the references linked in the slides. A minor portion was adapted from the paper "Best Practices for Scientific Computing" by G. Wilson et al., which is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.



Files (863.7 kB)


Additional details

Funding: National Science Foundation, award 1931388, "Collaborative Research: Frameworks: An open source software ecosystem for plasma physics"


References

  • Edmondson, A. (2018). The Fearless Organization: Creating Psychological Safety in the Workplace for Learning, Innovation, and Growth.
  • Feathers, M. (2004). Working Effectively with Legacy Code.
  • Fowler, M. (2011). Eradicating Non-Determinism in Tests.
  • Gamma, E.; Helm, R.; Johnson, R.; and Vlissides, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software.
  • Hicks, S. Code Is Communication.
  • Khorikov, V. (2020). Unit Testing Principles, Practices, and Patterns.
  • Martin, R. C. (2009). Clean Code: A Handbook of Agile Software Craftsmanship.
  • Martin, R. C. (2018). Clean Architecture: A Craftsman's Guide to Software Structure and Design.
  • McConnell, S. (2004). Code Complete: A Practical Handbook of Software Construction, 2nd edition.
  • Wilson, G.; Aruliah, D. A.; Brown, C. T.; Chue Hong, N. P.; Davis, M.; Guy, R. T. et al. (2014). Best Practices for Scientific Computing. PLoS Biology, 12, e1001745.
  • Wilson, G.; Bryan, J.; Cranston, K.; Kitzes, J.; Nederbragt, L.; and Teal, T. K. (2017). Good enough practices in scientific computing. PLoS Computational Biology, 13, e1005510.
  • Young-McLear, K.; Zelmanowitz, P. E.; James, R. W.; Brunswick, D.; and DeNucci, T. W. (2021). Beyond Buzzwords and Bystanders: A Framework for Systematically Developing a Diverse, Mission Ready, and Innovative Coast Guard Workforce.