Published February 24, 2023 | Version v0.3.0
Software Open

StAndrewsMedTech/icairdpath-public: Release for publication with endometrial and cervical models

  • 1. University of St Andrews
  • 2. University of St Andrews

Description

iCAIRD Pathology

The iCAIRD deep learning for pathology project. Release for publication with endometrial and cervical models.

Structure of the project

The project is designed as a set of experiments, where each experiment contains a series of steps. Each step can read data from the data or results directories and write its output to the results directory. The steps should not call each other, and any shared code is factored out into a separate, independent package. Note that the data and results directories have been added to the .gitignore file, as they are assumed to be very large.

The project is organised into the following directories:

  • repath contains the python code for the project.
  • data should contain the "raw" immutable data that will be read by the project.
  • libraries contains the installation files for any 3rd party libraries used that are installed outside of the conventional system or python package manager.
  • notebooks contains any jupyter notebooks used in the project.
  • results contains any intermediate data and any results that are generated by the project. This directory is organised with a folder for each experiment, which can be retrieved programmatically.
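As an illustration of the per-experiment results folder, here is a minimal sketch of how such a folder could be located from code. The helper name experiment_results_dir and the root argument are assumptions made for this example; they are not taken from the repath package, which may resolve its results paths differently.

```python
from pathlib import Path


def experiment_results_dir(experiment_name: str, root: Path = Path(".")) -> Path:
    """Return <root>/results/<experiment_name>, creating it if it is missing.

    Hypothetical helper for illustration only; not part of repath.
    """
    path = root / "results" / experiment_name
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Called from the project root, experiment_results_dir("wang") would return the path results/wang.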

The experiments described in published papers are in repath/experiments.

To run an experiment, run the following command from the root directory of the project:

python repath <name of experiment> <name of step>
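The command above takes an experiment name and a step name. The sketch below shows one way such a two-argument command line could be dispatched to step functions. It is not the project's actual repath/__main__.py; the EXPERIMENTS table, the step functions, and the name example_experiment are all hypothetical.

```python
from typing import Callable, Dict, List


# Hypothetical steps standing in for real experiment code.
def preprocess() -> str:
    return "preprocess done"


def train() -> str:
    return "train done"


# Maps experiment name -> step name -> function. All names are illustrative.
EXPERIMENTS: Dict[str, Dict[str, Callable[[], str]]] = {
    "example_experiment": {"preprocess": preprocess, "train": train},
}


def run(argv: List[str]) -> str:
    """Look up and run one step, given [experiment_name, step_name]."""
    if len(argv) != 2:
        raise SystemExit("usage: python repath <name of experiment> <name of step>")
    experiment, step = argv
    try:
        step_fn = EXPERIMENTS[experiment][step]
    except KeyError:
        raise SystemExit(f"unknown experiment or step: {experiment} {step}")
    return step_fn()
```

In a real entry point, run would be called with sys.argv[1:]. Each step is looked up and run on its own, which matches the rule that steps do not call each other.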

Using this project

This project is designed to run in a Docker container and uses Make to automate much of its setup.

Assuming your system has Docker installed, here are the steps to get up and running:

  1. In a terminal, inside the project's directory, run make docker_image. This will build the docker image from the Dockerfile and install all the required dependencies.
  2. Run the project using the command make docker_run. This instantiates a new container from the image. The container should log in as the icaird user. Note - if your file system has the data and results directories in different locations, these should be remapped (see Remapping Data Locations).

Remapping Data Locations

The project reads and writes data in two directories: data and results. By default, the make docker_run command maps the project directory on the host machine to the project directory inside the container. To map the data and results directories to other locations on the host, I suggest copying the docker_run target in the Makefile, renaming it, and adding volume arguments to the docker run command for your machine. For example:
# Example only: adjust the host paths (left of each colon) and the container paths (right of each colon) to your setup.
docker_run_my_machine:
	docker run --gpus all -p $(JUPYTER_PORT):$(JUPYTER_PORT) \
		-v $(PROJECT_DIR):/home/icaird/$(PROJECT_NAME) \
		-v /raid/data/$(PROJECT_NAME):/home/icaird/data \
		-v /raid/data/$(PROJECT_NAME):/home/icaird/results \
		-it $(PROJECT_NAME):latest
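Each -v argument to docker run takes the form host_path:container_path. As a rough sketch of how the mappings fit together, the function below assembles an equivalent argument list in Python. The function name docker_run_args and every path in the usage note are assumptions for illustration; only the --gpus, -p, -v and -it flags come from the Makefile target above.

```python
from typing import Dict, List


def docker_run_args(image: str, mounts: Dict[str, str], port: int) -> List[str]:
    """Build a `docker run` argument list.

    `mounts` maps host path -> container path. Hypothetical helper, not part
    of the project; it only assembles the arguments and does not run docker.
    """
    args = ["docker", "run", "--gpus", "all", "-p", f"{port}:{port}"]
    for host_path, container_path in mounts.items():
        args += ["-v", f"{host_path}:{container_path}"]
    args += ["-it", f"{image}:latest"]
    return args
```

For example, docker_run_args("myproject", {"/raid/data/myproject": "/home/icaird/data"}, 8888) yields a list containing "-v" followed by "/raid/data/myproject:/home/icaird/data", with the host side first.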

Updating the dependencies

When you add a dependency to the project (using conda), it's good practice to make sure it's added to the environment.yml file. To do this, run the command make export_environment. Once this is done, commit the updated environment.yml to git.

Working with Git

This assumes you have cloned the repository to a location on the host machine but are working on the code from inside the container. Unless it has been changed, the run command maps the project directory into the same directory inside the home directory of the container's icaird user. This means that you can use git commands to push and pull code from inside or outside the container.

Using Jupyter Notebooks and Lab

You can run a Jupyter notebook or lab server with the make targets make notebook and make run_lab. Note that the Makefile uses a constant called JUPYTER_PORT, defined on line 11, to configure which port Jupyter runs on. If you are running on a remote machine, this port will need to be open for you to access the notebook or lab. Change it depending on your setup.

Adding an experiment

To program a new experiment, create a new .py file inside the repath/experiments directory. Inside it, define a function for each step of the experiment. Then add an import to the top of repath/__main__.py so it can be run using the python command. For an example, see repath/experiments/wang.
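To make the one-function-per-step layout concrete, here is a minimal sketch of what a new experiment file might look like. The file name my_experiment.py, the step names, and the file_count.txt output are invented for illustration and do not come from the repository. The second step reads the first step's output from the results directory instead of calling the first step.

```python
from pathlib import Path

# Hypothetical module, e.g. repath/experiments/my_experiment.py.
DATA_DIR = Path("data")
RESULTS_DIR = Path("results") / "my_experiment"


def count_files(data_dir: Path = DATA_DIR, results_dir: Path = RESULTS_DIR) -> Path:
    """Step 1: read from the data directory and write to the results directory."""
    results_dir.mkdir(parents=True, exist_ok=True)
    n_files = sum(1 for p in data_dir.glob("*") if p.is_file())
    out_file = results_dir / "file_count.txt"
    out_file.write_text(str(n_files))
    return out_file


def report(results_dir: Path = RESULTS_DIR) -> str:
    """Step 2: read step 1's output from results; it does not call step 1."""
    count = (results_dir / "file_count.txt").read_text()
    return f"files found: {count}"
```

Because the steps communicate only through files in results, each one can be rerun on its own once the earlier outputs exist.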

Building on different machines

Before building the image on a different machine, make sure that the user id, group id, and data writers group id are set correctly. The user id and group id should be set to the ids of the icaird user and group on that machine. These can be found using the command id icaird. If this user and group don't exist, ask your system admin to make them, or just set the id values to whatever makes sense (such as your own user). The values can be found on lines 12, 13, and 14 of the Dockerfile.
ARG UID=1016
ARG GID=1017
ARG DATA_WRITERS_GID=1016
Note - these can be passed as arguments to the image building process, so if you wanted to, you could create separate make targets that build the image for different machines (for example dgx, tar, or home).
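As a sketch of passing these values at build time, the snippet below looks up a user's ids and assembles a docker build command that overrides the three ARG values with --build-arg. The helper names user_ids and docker_build_args are assumptions for this example, and the image tag in the usage note is a placeholder. The pwd module is Unix-only.

```python
import pwd
from typing import List, Tuple


def user_ids(username: str) -> Tuple[int, int]:
    """Return (uid, primary gid) for a local user; raises KeyError if absent."""
    entry = pwd.getpwnam(username)
    return entry.pw_uid, entry.pw_gid


def docker_build_args(image: str, uid: int, gid: int, data_writers_gid: int) -> List[str]:
    """Build a `docker build` argument list overriding the Dockerfile ARGs.

    Hypothetical helper; it only assembles the arguments and does not run docker.
    """
    return [
        "docker", "build",
        "--build-arg", f"UID={uid}",
        "--build-arg", f"GID={gid}",
        "--build-arg", f"DATA_WRITERS_GID={data_writers_gid}",
        "-t", f"{image}:latest",
        ".",
    ]
```

For instance, docker_build_args("myproject", 1016, 1017, 1016) reproduces the default values shown above, and the uid and gid could instead come from user_ids("icaird") on a machine where that user exists.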

Converting iSyntax files to an intermediary raw format

Working with iSyntax files can be complex due to their reliance on Philips software libraries. Glencoe Software provides a [3rd-party tool](https://github.com/glencoesoftware/isyntax2raw) that converts them into a raw format that is more straightforward to work with.

Files

StAndrewsMedTech/icairdpath-public-v0.3.0.zip

Files (149.9 MB)

md5:c5ea335a7fa246c79decf20859f766cc

Additional details