Parallel Computational Methods for Cone Beam X-Ray Computed Tomography

The focus of this work is accurate three-dimensional tomography from two-dimensional projections using a cluster of workstations. The Feldkamp algorithm for circular orbit tomography will be used. Projection data in tomographic systems can be acquired much faster than they can be reconstructed by serial processing on a single processor. One motivation for this work is to improve the computational performance of cone beam tomography in the Henry Ford Hospital Microtomography Laboratory, which examines specimens at microscopic (10--250 $\mu$m) resolution. Other laboratory systems around the world are currently in place, and still others are under development. Simulated and real data from the Henry Ford Hospital Microtomography Laboratory will be used. The tomography problem can be parallelized by partitioning the volume and distributing a portion of the volume to each processor. Each projection is then sent to every processor for backprojection. This algorithm is analyzed from a theoretical viewpoint with a LogP model. The extent to which the algorithm can be improved using parallelization is then presented. The errors in reconstructed images are addressed. The implications of the theoretical speedup obtained through parallelization are then addressed. Finally, a comparison between theory and experiment is presented. The results indicate that cone beam tomography can be parallelized, but because of the high cost of communication, it is not trivially parallel.
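The partitioning scheme described above can be illustrated with a minimal sketch. It assumes the volume is cut into contiguous slabs of slices along one axis and that each voxel is updated once per view; the abstract says only that the volume is partitioned, so the slab layout and the function names `partition_slices` and `voxel_updates_per_processor` are hypothetical.

```python
import math


def partition_slices(n_slices, n_procs):
    """Split n_slices into n_procs contiguous, near-equal slabs.

    Returns a list of (start, stop) half-open slice ranges, one per processor.
    The slab layout is an assumption made for illustration only.
    """
    base, extra = divmod(n_slices, n_procs)
    slabs = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` processors each take one additional slice.
        size = base + (1 if rank < extra else 0)
        slabs.append((start, start + size))
        start += size
    return slabs


def voxel_updates_per_processor(n, n_procs):
    """Count voxel updates per processor for an n^3 volume.

    Every projection is sent to every processor, so each processor
    backprojects all views into its own slab of the volume.
    """
    n_views = math.ceil(math.pi * n / 2)  # roughly pi*N/2 views
    return [
        (stop - start) * n * n * n_views
        for start, stop in partition_slices(n, n_procs)
    ]
```

Under these assumptions the computation divides almost evenly among processors, while every processor must still receive every projection. That communication is the cost the abstract identifies as limiting the speedup.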

The main results indicate that the cone beam tomography problem can be solved in $O(N^4)$ time using the Feldkamp algorithm to reconstruct a volume of $N^3$ voxels from $\pi N/2$ views. The parallel algorithm allows a speedup that depends on the ratio of computation to communication. Speedup increases as $O(\sqrt{N})$, which results in an optimal parallel running time of $O(N^{3.5})$. The model accurately predicts speedup on a cluster of workstations connected with 10 and 100 Mbps Ethernet and 155 Mbps ATM.
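The arithmetic behind these bounds can be written out as follows, assuming that backprojection dominates the running time and that each voxel update costs constant time per view. The symbols $T_{\text{serial}}$ and $T_{\text{parallel}}$ are introduced here for illustration only.

$$
T_{\text{serial}} = N^3 \cdot \frac{\pi N}{2} \cdot O(1) = O(N^4),
\qquad
T_{\text{parallel}} = \frac{T_{\text{serial}}}{O(\sqrt{N})} = O(N^{3.5}).
$$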

The volume segmentation used in the parallel algorithm can be extended to other three-dimensional imaging problems involving analysis and display of data. The LogP model can likewise be used to assess the performance of other parallel algorithms on clusters of workstations.