- •Biological and Medical Physics, Biomedical Engineering
- •Medical Image Processing
- •Preface
- •Contents
- •Contributors
- •1.1 Medical Image Processing
- •1.2 Techniques
- •1.3 Applications
- •1.4 The Contribution of This Book
- •References
- •2.1 Introduction
- •2.2 MATLAB and DIPimage
- •2.2.1 The Basics
- •2.2.2 Interactive Examination of an Image
- •2.2.3 Filtering and Measuring
- •2.2.4 Scripting
- •2.3 Cervical Cancer and the Pap Smear
- •2.4 An Interactive, Partial History of Automated Cervical Cytology
- •2.5 The Future of Automated Cytology
- •2.6 Conclusions
- •References
- •3.1 The Need for Seed-Driven Segmentation
- •3.1.1 Image Analysis and Computer Vision
- •3.1.2 Objects Are Semantically Consistent
- •3.1.3 A Separation of Powers
- •3.1.4 Desirable Properties of Seeded Segmentation Methods
- •3.2 A Review of Segmentation Techniques
- •3.2.1 Pixel Selection
- •3.2.2 Contour Tracking
- •3.2.3 Statistical Methods
- •3.2.4 Continuous Optimization Methods
- •3.2.4.1 Active Contours
- •3.2.4.2 Level Sets
- •3.2.4.3 Geodesic Active Contours
- •3.2.5 Graph-Based Methods
- •3.2.5.1 Graph Cuts
- •3.2.5.2 Random Walkers
- •3.2.5.3 Watershed
- •3.2.6 Generic Models for Segmentation
- •3.2.6.1 Continuous Models
- •3.2.6.2 Hierarchical Models
- •3.2.6.3 Combinations
- •3.3 A Unifying Framework for Discrete Seeded Segmentation
- •3.3.1 Discrete Optimization
- •3.3.2 A Unifying Framework
- •3.3.3 Power Watershed
- •3.4 Globally Optimum Continuous Segmentation Methods
- •3.4.1 Dealing with Noise and Artifacts
- •3.4.2 Globally Optimal Geodesic Active Contour
- •3.4.3 Maximal Continuous Flows and Total Variation
- •3.5 Comparison and Discussion
- •3.6 Conclusion and Future Work
- •References
- •4.1 Introduction
- •4.2 Deformable Models
- •4.2.1 Point-Based Snake
- •4.2.1.1 User Constraint Energy
- •4.2.1.2 Snake Optimization Method
- •4.2.2 Parametric Deformable Models
- •4.2.3 Geometric Deformable Models (Active Contours)
- •4.2.3.1 Curve Evolution
- •4.2.3.2 Level Set Concept
- •4.2.3.3 Geodesic Active Contour
- •4.2.3.4 Chan–Vese Deformable Model
- •4.3 Comparison of Deformable Models
- •4.4 Applications
- •4.4.1 Bone Surface Extraction from Ultrasound
- •4.4.2 Spinal Cord Segmentation
- •4.4.2.1 Spinal Cord Measurements
- •4.4.2.2 Segmentation Using Geodesic Active Contour
- •4.5 Conclusion
- •References
- •5.1 Introduction
- •5.2 Imaging Body Fat
- •5.3 Image Artifacts and Their Impact on Segmentation
- •5.3.1 Partial Volume Effect
- •5.3.2 Intensity Inhomogeneities
- •5.4 Overview of Segmentation Techniques Used to Isolate Fat
- •5.4.1 Thresholding
- •5.4.2 Selecting the Optimum Threshold
- •5.4.3 Gaussian Mixture Model
- •5.4.4 Region Growing
- •5.4.5 Adaptive Thresholding
- •5.4.6 Segmentation Using Overlapping Mosaics
- •5.6 Conclusions
- •References
- •6.1 Introduction
- •6.2 Clinical Context
- •6.3 Vessel Segmentation
- •6.3.1 Survey of Vessel Segmentation Methods
- •6.3.1.1 General Overview
- •6.3.1.2 Region-Growing Methods
- •6.3.1.3 Differential Analysis
- •6.3.1.4 Model-Based Filtering
- •6.3.1.5 Deformable Models
- •6.3.1.6 Statistical Approaches
- •6.3.1.7 Path Finding
- •6.3.1.8 Tracking Methods
- •6.3.1.9 Mathematical Morphology Methods
- •6.3.1.10 Hybrid Methods
- •6.4 Vessel Modeling
- •6.4.1 Motivation
- •6.4.1.1 Context
- •6.4.1.2 Usefulness
- •6.4.2 Deterministic Atlases
- •6.4.2.1 Pioneering Works
- •6.4.2.2 Graph-Based and Geometric Atlases
- •6.4.3 Statistical Atlases
- •6.4.3.1 Anatomical Variability Handling
- •6.4.3.2 Recent Works
- •References
- •7.1 Introduction
- •7.2 Linear Structure Detection Methods
- •7.3.1 CCM for Imaging Diabetic Peripheral Neuropathy
- •7.3.2 CCM Image Characteristics and Noise Artifacts
- •7.4.1 Foreground and Background Adaptive Models
- •7.4.2 Local Orientation and Parameter Estimation
- •7.4.3 Separation of Nerve Fiber and Background Responses
- •7.4.4 Postprocessing the Enhanced-Contrast Image
- •7.5 Quantitative Analysis and Evaluation of Linear Structure Detection Methods
- •7.5.1 Methodology of Evaluation
- •7.5.2 Database and Experiment Setup
- •7.5.3 Nerve Fiber Detection Comparison Results
- •7.5.4 Evaluation of Clinical Utility
- •7.6 Conclusion
- •References
- •8.1 Introduction
- •8.2 Methods
- •8.2.1 Linear Feature Detection by MDNMS
- •8.2.2 Check Intensities Within 1D Window
- •8.2.3 Finding Features Next to Each Other
- •8.2.4 Gap Linking for Linear Features
- •8.2.5 Quantifying Branching Structures
- •8.3 Linear Feature Detection on GPUs
- •8.3.1 Overview of GPUs and Execution Models
- •8.3.2 Linear Feature Detection Performance Analysis
- •8.3.3 Parallel MDNMS on GPUs
- •8.3.5 Results for GPU Linear Feature Detection
- •8.4.1 Architecture and Implementation
- •8.4.2 HCA-Vision Features
- •8.4.3 Linear Feature Detection and Analysis Results
- •8.5 Selected Applications
- •8.5.1 Neurite Tracing for Drug Discovery and Functional Genomics
- •8.5.2 Using Linear Features to Quantify Astrocyte Morphology
- •8.5.3 Separating Adjacent Bacteria Under Phase Contrast Microscopy
- •8.6 Perspectives and Conclusions
- •References
- •9.1 Introduction
- •9.2 Bone Imaging Modalities
- •9.2.1 X-Ray Projection Imaging
- •9.2.2 Computed Tomography
- •9.2.3 Magnetic Resonance Imaging
- •9.2.4 Ultrasound Imaging
- •9.3 Quantifying the Microarchitecture of Trabecular Bone
- •9.3.1 Bone Morphometric Quantities
- •9.3.2 Texture Analysis
- •9.3.3 Frequency-Domain Methods
- •9.3.4 Use of Fractal Dimension Estimators for Texture Analysis
- •9.3.4.1 Frequency-Domain Estimation of the Fractal Dimension
- •9.3.4.2 Lacunarity
- •9.3.4.3 Lacunarity Parameters
- •9.3.5 Computer Modeling of Biomechanical Properties
- •9.4 Trends in Imaging of Bone
- •References
- •10.1 Introduction
- •10.1.1 Adolescent Idiopathic Scoliosis
- •10.2 Imaging Modalities Used for Spinal Deformity Assessment
- •10.2.1 Current Clinical Practice: The Cobb Angle
- •10.2.2 An Alternative: The Ferguson Angle
- •10.3 Image Processing Methods
- •10.3.1 Previous Studies
- •10.3.2 Discrete and Continuum Functions for Spinal Curvature
- •10.3.3 Tortuosity
- •10.4 Assessment of Image Processing Methods
- •10.4.1 Patient Dataset and Image Processing
- •10.4.2 Results and Discussion
- •10.5 Summary
- •References
- •11.1 Introduction
- •11.2 Retinal Imaging
- •11.2.1 Features of a Retinal Image
- •11.2.2 The Reason for Automated Retinal Analysis
- •11.2.3 Acquisition of Retinal Images
- •11.3 Preprocessing of Retinal Images
- •11.4 Lesion Based Detection
- •11.4.1 Matched Filtering for Blood Vessel Segmentation
- •11.4.2 Morphological Operators in Retinal Imaging
- •11.5 Global Analysis of Retinal Vessel Patterns
- •11.6 Conclusion
- •References
- •12.1 Introduction
- •12.1.1 The Progression of Diabetic Retinopathy
- •12.2 Automated Detection of Diabetic Retinopathy
- •12.2.1 Automated Detection of Microaneurysms
- •12.3 Image Databases
- •12.4 Tortuosity
- •12.4.1 Tortuosity Metrics
- •12.5 Tracing Retinal Vessels
- •12.5.1 NeuronJ
- •12.5.2 Other Software Packages
- •12.6 Experimental Results and Discussion
- •12.7 Summary and Future Work
- •References
- •13.1 Introduction
- •13.2 Volumetric Image Visualization Methods
- •13.2.1 Multiplanar Reformation (2D slicing)
- •13.2.2 Surface-Based Rendering
- •13.2.3 Volumetric Rendering
- •13.3 Volume Rendering Principles
- •13.3.1 Optical Models
- •13.3.2 Color and Opacity Mapping
- •13.3.2.2 Transfer Function
- •13.3.3 Composition
- •13.3.4 Volume Illumination and Illustration
- •13.4 Software-Based Raycasting
- •13.4.1 Applications and Improvements
- •13.5 Splatting Algorithms
- •13.5.1 Performance Analysis
- •13.5.2 Applications and Improvements
- •13.6 Shell Rendering
- •13.6.1 Application and Improvements
- •13.7 Texture Mapping
- •13.7.1 Performance Analysis
- •13.7.2 Applications
- •13.7.3 Improvements
- •13.7.3.1 Shading Inclusion
- •13.7.3.2 Empty Space Skipping
- •13.8 Discussion and Outlook
- •References
- •14.1 Introduction
- •14.1.1 Magnetic Resonance Imaging
- •14.1.2 Compressed Sensing
- •14.1.3 The Role of Prior Knowledge
- •14.2 Sparsity in MRI Images
- •14.2.1 Characteristics of MR Images (Prior Knowledge)
- •14.2.2 Choice of Transform
- •14.2.3 Use of Data Ordering
- •14.3 Theory of Compressed Sensing
- •14.3.1 Data Acquisition
- •14.3.2 Signal Recovery
- •14.4 Progress in Sparse Sampling for MRI
- •14.4.1 Review of Results from the Literature
- •14.4.2 Results from Our Work
- •14.4.2.1 PECS
- •14.4.2.2 SENSECS
- •14.4.2.3 PECS Applied to CE-MRA
- •14.5 Prospects for Future Developments
- •References
- •15.1 Introduction
- •15.2 Acquisition of DT Images
- •15.2.1 Fundamentals of DTI
- •15.2.2 The Pulsed Field Gradient Spin Echo (PFGSE) Method
- •15.2.3 Diffusion Imaging Sequences
- •15.2.4 Example: Anisotropic Diffusion of Water in the Eye Lens
- •15.2.5 Data Acquisition
- •15.3 Digital Processing of DT Images
- •15.3.2 Diagonalization of the DT
- •15.3.3 Gradient Calibration Factors
- •15.3.4 Sorting Bias
- •15.3.5 Fractional Anisotropy
- •15.3.6 Other Anisotropy Metrics
- •15.4 Applications of DTI to Articular Cartilage
- •15.4.1 Bovine AC
- •15.4.2 Human AC
- •References
- •Index
L. Domanski et al.
Length and width measurements are particularly important in applications (see Sect. 8.5 for some examples). Before the tree growing process is initiated, the length of each segment is estimated [8]. The average width of each segment is also computed, using the method proposed by Lagerstrom et al. [9]. The average brightness of each segment is computed not only to guide the watershed process, but also as a reportable measure in its own right. As each segment is removed from the queue, we accumulate the length of the path back to the centre for the segment, the longest path back to the centre for the tree, and the total length of the tree. In a similar fashion, the average width, total area, average brightness, and integrated intensity of the tree are also accumulated. Once the trees have been grown, the total field area is calculated, defined as the area of the convex hull of all trees associated with a single organizing centre.
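The field-area computation described above can be illustrated with a small, self-contained sketch. The function names (`convex_hull`, `field_area`) and the pure-Python monotone-chain implementation are ours, chosen for illustration; the chapter's actual implementation is not shown here.

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull of 2D points (returned CCW)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Concatenate halves, dropping the duplicated endpoints.
    return lower[:-1] + upper[:-1]

def field_area(points):
    """Total field area: area of the convex hull (shoelace formula)."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 3:
        return 0.0
    s = 0.0
    for i in range(n):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0
```

In practice the input would be the set of pixel coordinates of all tree segments belonging to one organizing centre; interior points are discarded by the hull step, so only the outline contributes to the area.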
A variety of complexity measures capturing additional morphological properties are also collected during the tree growing process. Trees often display behaviour in which a dominant or primary branch extends from the centre, with secondary branches projecting from the primary branches, and so on recursively. On a per-segment basis, we refer to this as the branching layer. Root segments are assigned a primary branching layer coded as “1.” As segments are removed from the queue, a child segment inherits its parent’s branching layer if it has the highest average brightness among its siblings; the remaining child segments are assigned an incremented branching layer. The average branching layer per tree, the number of branching points per tree, and the number of extreme segments (i.e., those with no children) are accumulated as the tree is grown.
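The branching-layer propagation rule can be sketched as a breadth-first pass over the tree. The data structures (`children` and `brightness` dictionaries) and the function name are hypothetical, introduced only to make the inheritance rule concrete:

```python
from collections import deque

def assign_branching_layers(children, brightness):
    """Propagate branching layers through a segment tree.

    children:   dict mapping segment id -> list of child segment ids
    brightness: dict mapping segment id -> average segment brightness
    Root segments (those that are nobody's child) start at layer 1.
    """
    all_children = {c for kids in children.values() for c in kids}
    roots = [s for s in children if s not in all_children]
    layer = {r: 1 for r in roots}
    queue = deque(roots)
    while queue:
        seg = queue.popleft()
        kids = children.get(seg, [])
        if not kids:
            continue
        # The brightest child continues the parent's branch ...
        main = max(kids, key=lambda c: brightness[c])
        for c in kids:
            # ... while its siblings start a new, deeper branching layer.
            layer[c] = layer[seg] if c == main else layer[seg] + 1
            queue.append(c)
    return layer
```

From the resulting `layer` map, the per-tree averages and the count of extreme (childless) segments follow directly.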
8.3 Linear Feature Detection on GPUs
While the algorithm presented in Sect. 8.2 is generally fast, execution time can become an issue for large images or images containing complex linear structures. In this context, complexity refers both to the density of linear structures and their branching rate, and to the variation of intensity along linear features. In high-throughput biological experiments, where thousands of images may need to be processed in batch during a single experiment, the overall increase in processing time can be significant, motivating attempts to improve algorithm performance further.
In this section, we will look at using many-core processing units and parallel programming techniques to help accelerate parts of the linear feature detection algorithm described in Sect. 8.2. This problem will serve as an interesting example of how these methods can be used to accelerate image processing problems in general. We will utilize Graphics Processing Units (GPUs) as the many-core processors in our discussions and tests. These commodity processing chips are now a widely available and popular parallel processing platform, with most personal and office computers containing one or more of them.
8 High-Throughput Detection of Linear Features: Selected Applications...
8.3.1 Overview of GPUs and Execution Models
Originally designed to accelerate the rendering of 3D computer graphics, GPUs are now widely used as an architecture for executing general-purpose parallel programs [10]. Modern GPUs consist of hundreds of lightweight processor cores capable of executing thousands of parallel threads concurrently. GPUs are coupled to dedicated off-chip RAM through a high-bandwidth memory interface. Data is transferred between this dedicated GPU RAM and the host processor’s memory via an expansion card bus (usually PCI-Express). GPUs also provide a number of on-chip storage spaces, including register files, software-managed shared memory, and various memory caches. Accessing these on-chip storage spaces can be orders of magnitude faster than accessing GPU RAM, which, while high-bandwidth, can incur significant access latencies.
On the NVIDIA GPUs [4] used in our tests, the processors are grouped into a number of streaming multi-processors (SMs), which can concurrently execute a large number of assigned threads by switching execution between different groups of these threads. On-chip storage is arranged such that each SM has its own private register file and shared memory space, which cannot be accessed by threads executing on other SMs.
Threads are logically grouped into n-dimensional blocks whose sizes are customisable. A regular grid of common sized blocks is used to parameterize threads over a problem domain. Each thread is assigned unique n-dimensional grid and block level IDs to distinguish it from other threads. A block is assigned to a single SM for its lifetime, and its threads can synchronize their execution and share data via SM shared memory. Each thread in the grid of blocks executes the same program, which is defined by a parallel kernel function. A grid of threads only executes a single kernel at a time, and on-chip storage does not remain persistent between kernel launches.
For example, a simple image operation that subtracts image A from image B on the GPU, storing the result in image C, can be performed as follows:
1. Transfer images A and B from host RAM to GPU RAM.
2. Assign a single thread to each output pixel by constructing a grid of 2D thread blocks that covers the image domain.
3. Launch a subtraction kernel in which each thread reads the corresponding pixel values from images A and B in GPU RAM and writes the subtraction result to image C in GPU RAM.
4. Transfer image C from GPU RAM back to host RAM.
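The steps above can be sketched in code. Since an actual CUDA kernel cannot be run without a GPU, the following Python sketch mimics the grid/block indexing a real kernel would use: `launch` is a serial stand-in for a GPU kernel launch, and all names are illustrative, not part of any GPU API.

```python
def subtract_kernel(block_id, thread_id, block_dim, A, B, C):
    """Per-thread kernel body: compute one output pixel of C = B - A."""
    y = block_id[0] * block_dim[0] + thread_id[0]  # global row index
    x = block_id[1] * block_dim[1] + thread_id[1]  # global column index
    if y < len(A) and x < len(A[0]):               # guard threads outside image
        C[y][x] = B[y][x] - A[y][x]

def launch(kernel, grid_dim, block_dim, *args):
    """Serial stand-in for a GPU launch: visit every thread in the grid."""
    for by in range(grid_dim[0]):
        for bx in range(grid_dim[1]):
            for ty in range(block_dim[0]):
                for tx in range(block_dim[1]):
                    kernel((by, bx), (ty, tx), block_dim, *args)

# Subtract two 4x4 "images" using a 2x2 grid of 2x2 thread blocks.
A = [[1] * 4 for _ in range(4)]
B = [[3] * 4 for _ in range(4)]
C = [[0] * 4 for _ in range(4)]
launch(subtract_kernel, (2, 2), (2, 2), A, B, C)
```

On a real GPU, the four nested loops collapse into concurrent hardware threads; only the kernel body and the bounds guard carry over essentially unchanged.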
Data transfer between host and GPU can be a performance bottleneck, so it should be avoided where possible. For example, when performing a number of dependent parallel image operations in succession on the GPU, it may not be necessary to transfer the result of each operation back to the host.
Fig. 8.6 Execution times for different stages of linear feature detection for images shown in Fig. 8.7. MDNMS, small object removal, and gap filling steps are described in Sects. 8.2.1–8.2.4. Aggregated time taken by short utility operations performed between steps, such as labelling and skeletonization, is represented as “other”
8.3.2 Linear Feature Detection Performance Analysis
Before attempting to parallelize the algorithm on a GPU, one should analyze its performance and determine what effect different changes might have. For example, the MDNMS algorithm from Sect. 8.2 is a classic neighborhood filter: it computes the value of each output pixel by analyzing only the pixels within a small neighborhood around it. Although the number of symmetry checks performed (Sect. 8.2.2) may vary with image content, its performance is determined primarily by the size of the image and by the size and number of linear windows used for filtering. Accelerating this portion of the algorithm should therefore provide a performance improvement irrespective of image complexity. However, the number of false objects and feature mask endpoints in the MDNMS output can increase significantly with input image complexity, placing a higher workload on the steps that remove small objects and bridge gaps in the resulting feature masks. The performance of these steps is, therefore, affected more strongly by image complexity than by image size.
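As a rough illustration of this cost structure — per-pixel work proportional to window length times the number of orientations, independent of image content — consider the following much-simplified, hypothetical stand-in for a multi-directional non-maximum suppression filter. It is not the published MDNMS algorithm (it omits the symmetry checks, among other things); it exists only to show where the filtering cost comes from.

```python
def simple_mdnms(img, half_len=2, thresh=1):
    """Keep a pixel if it meets a brightness threshold and is the maximum
    of its 1D neighborhood along at least one of four orientations.
    Per-pixel cost: O(window length x number of orientations)."""
    h, w = len(img), len(img[0])
    dirs = [(0, 1), (1, 0), (1, 1), (1, -1)]  # 0, 90, 45, 135 degrees
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x] < thresh:
                continue
            for dy, dx in dirs:
                # Gather the 1D window along this orientation, clipped
                # to the image bounds.
                vals = [img[y + k * dy][x + k * dx]
                        for k in range(-half_len, half_len + 1)
                        if 0 <= y + k * dy < h and 0 <= x + k * dx < w]
                if img[y][x] == max(vals):
                    out[y][x] = 1
                    break
    return out
```

Because each thread-sized unit of work here touches the same fixed number of pixels, this inner loop maps naturally onto one GPU thread per output pixel, which is the parallelization pursued in Sect. 8.3.3.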
Figure 8.6 shows the breakdown of linear feature detection processing time for a number of images of varying size and linear structure complexity (Fig. 8.7). In general, MDNMS takes the largest portion of overall execution time for each image. Gap filling also consumes a large amount of time for complex images (Fig. 8.7, img 2 and img 3), owing to the increased number of feature mask endpoints produced by the MDNMS step and the need to perform a costly shortest path calculation for each endpoint (Sect. 8.2.4). Although small object removal is also affected by image complexity, as hypothesized above, its execution time is low relative to the other two steps; the same holds for the utility functions.
Fig. 8.7 Neurite images used in performance tests. (a) img 1: 1,300 × 1,030 pixels. (b) img 2: 1,280 × 1,280 pixels. (c) img 3: 1,280 × 1,280 pixels. (d) img 4: 640 × 640 pixels. (e) img 5: 694 × 520 pixels. (f) img 6: 512 × 512 pixels