Methodology

We propose to build a tool that automates the aforementioned procedure for diagnosing spinal deformity: given the frontal-view and lateral-view X-ray images of a patient, the tool will produce a spine model that accurately matches each vertebra on the X-ray images.

1. Objectives

  • i) Model the spine as a parametric curve (a minimal curve-fitting sketch is given after this list)

  • ii) Model the spine realistically so that it agrees with the landmark detection / vertebra segmentation results

  • iii) Adjust the model by taking the rotation of each vertebra into account

  • iv) Fine-tune the model by treating the position, orientation and shape of each vertebra as a whole, taking possible biological constraints and variations into consideration
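The sketch below illustrates objective i) only: fitting a smooth parametric curve through vertebral-centre landmarks. The landmark coordinates are hypothetical placeholders for the output of a detection network, and the use of SciPy's spline routines is an assumption rather than the project's chosen method.

```python
# Minimal sketch of objective i): fit a parametric B-spline through
# hypothetical vertebral-centre landmarks detected on a frontal X-ray.
import numpy as np
from scipy.interpolate import splprep, splev

# Hypothetical (x, y) centres of 17 vertebrae, ordered top to bottom (pixels).
centres = np.array([[256 + 30 * np.sin(t / 3.0), 40 * t + 50] for t in range(17)])

# Fit a smoothing cubic B-spline parameterised over [0, 1].
tck, u = splprep(centres.T, s=5.0, k=3)

# Evaluate the curve densely so it can be overlaid on the X-ray image.
t_dense = np.linspace(0.0, 1.0, 200)
curve_x, curve_y = splev(t_dense, tck)
```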

2. Methodology

i) Platform Setup

Our project will mainly run on the GPU farm of the Department of Computer Science and the workstation owned by the AI Spine team.

Python is chosen as our programming language because it offers many useful libraries such as OpenCV, PyTorch, and PIL.
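As a minimal illustration of how these libraries fit together, the sketch below loads an X-ray, enhances its contrast, and converts it to a tensor. The file path is hypothetical, and the fallback synthetic image is only there to keep the snippet self-contained.

```python
# Sketch of the intended toolchain: OpenCV for I/O and preprocessing,
# PIL for format conversion, PyTorch for tensors.
import cv2
import numpy as np
import torch
from PIL import Image

xray = cv2.imread("data/frontal/patient_0001.png", cv2.IMREAD_GRAYSCALE)  # hypothetical path
if xray is None:                                   # fall back to a synthetic image
    xray = np.random.randint(0, 256, (512, 256), dtype=np.uint8)

xray = cv2.equalizeHist(xray)                      # boost bone contrast
pil_image = Image.fromarray(xray)                  # hand over to PIL if needed
tensor = torch.from_numpy(xray).float().unsqueeze(0) / 255.0  # 1 x H x W tensor
```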

ii) Data Collection

In the initial stage, 1,000 paired X-ray images (frontal-view and lateral-view images taken of the same patient simultaneously) will be provided by the Department of Orthopedics, together with pretrained models and a standard 3D spine model.

Later, if we need to examine finer details of the vertebrae, we may have to mark and label the landmarks ourselves.

iii) Algorithm Development

Computer vision methods will be applied to the input images to highlight certain features, to visualize the inputs and outputs in different forms, and to perform spatial transformations of the model. In addition, computer vision provides useful methods for matching points in an image with the corresponding points on an object. Many of these methods are available as built-in functions of Python libraries.
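The sketch below shows two such operations with OpenCV: contrast enhancement of an X-ray and matching detected 2D landmarks to points on a 3D model. All arrays, including the camera intrinsics, are hypothetical placeholders used only to make the snippet self-contained.

```python
# Hedged sketch of two building blocks mentioned above, using OpenCV.
import cv2
import numpy as np

# 1) Highlight bony features with adaptive histogram equalisation (CLAHE).
xray = np.random.randint(0, 256, (512, 512), dtype=np.uint8)   # placeholder X-ray
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(xray)

# 2) Match landmarks detected in the image to points on a 3D spine model.
model_points = np.random.rand(17, 3).astype(np.float32) * 100  # 3D model landmarks (mm)
image_points = np.random.rand(17, 2).astype(np.float32) * 512  # detected 2D landmarks (px)
camera_matrix = np.array([[1000, 0, 256],
                          [0, 1000, 256],
                          [0,    0,   1]], dtype=np.float32)    # assumed intrinsics
ok, rvec, tvec = cv2.solvePnP(model_points, image_points, camera_matrix, None)
```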

Additionally, several convolutional neural networks (CNNs) will be trained for different purposes. For example, we will train a network to recognize the rotation of each vertebra, and in later stages we may train multi-task networks that handle more spatial information simultaneously to improve accuracy.
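As a rough illustration of the rotation-recognition idea, the PyTorch sketch below regresses a rotation angle from a cropped vertebra patch. The architecture, patch size, and loss are assumptions for demonstration, not the project's final design.

```python
# Minimal sketch of a CNN that regresses the rotation angle of a vertebra patch.
import torch
import torch.nn as nn

class RotationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)   # predicted rotation angle (e.g. degrees)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = RotationNet()
patches = torch.randn(8, 1, 64, 64)                 # hypothetical vertebra crops
loss = nn.functional.mse_loss(model(patches), torch.zeros(8, 1))
```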

In particular, we have found Mask R-CNN well suited to object detection and instance segmentation; even with a limited number of training images, it can still produce relatively accurate results. Another network of great interest to us is the High-Resolution Network (HRNet), which preserves image resolution, and therefore fine detail, throughout processing, and consequently produces more accurate results in tasks sensitive to spatial position, such as ours.
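The sketch below shows how an off-the-shelf Mask R-CNN from torchvision could be adapted to vertebra instance segmentation. The two-class setup (background and vertebra), input size, and use of torchvision's pretrained weights are assumptions for illustration; HRNet is not shown here.

```python
# Hedged sketch: adapting torchvision's Mask R-CNN to a vertebra class.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

num_classes = 2  # background + vertebra (assumed labelling scheme)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)

# Run inference on a dummy image to check the adapted heads.
model.eval()
with torch.no_grad():
    prediction = model([torch.rand(3, 512, 256)])
```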

Meanwhile, we will continuously review related work by others and validate our approach against it.