3D modelling of real objects has applications in visualisation, measurement and statistical analysis of populations. In medical science, 3D models help in achieving accurate diagnoses, creating biomechanical models, developing educational resources, and improving intervention efficacy. Such models are particularly important for health science research into common conditions such as lumbar spine pathologies, which have a lifetime prevalence of 60–70% in industrialised countries (Kaplan, et al., 2013) and impose a huge social and financial burden on society (Mounce, 2002). Despite the prevalence of spinal pathologies, there is a lack of freely available three-dimensional (3D) data sets of individual vertebrae of the human spine, and in particular of the lumbar spine. Most available data sets are commercial,¹ ² of dubious origin and accuracy, or contain data from a single individual; existing data sets are therefore not easily accessed, not readily usable, and may not be valid. The existing “SpineWeb”³ platform provides several Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) scans of the human spine with reference manual segmentations, which are subject to error (André, Dansereau and Labelle, 1994). However, that data set does not provide surface data directly, as we do in this paper.
3D data sets can be constructed via three main methodologies: volumetric scanning, active scanning (such as laser or sonar), or passive, image-based reconstruction. While volumetric scanning (such as CT and MRI) can generate 3D images for biomechanical modelling or statistical analyses, its use is limited because both modalities are costly and require skilled technicians to operate. In addition, CT potentially involves an unacceptable radiation exposure to participants (Brenner and Elliston, 2004), whereas MRI is generally safer but often cannot be used when patients have implants or metallic fragments in their bodies (Shellock and Titterington, 2014).
Acquiring 3D images from active scanners such as laser, structured-light or sonar scanners also has limitations. These scanners send a signal into the environment and measure the effect the environment has on the signal. They can produce very accurate 3D reconstructions, but they can be expensive depending on the accuracy required. Kusnoto and Evans (2002) show that a Minolta Vivid700 3D surface laser scanner achieves an accuracy of 1.9 millimetres for face surface scanning and less than one millimetre for molar scanning; however, such accuracy is only obtained from expensive scanners (e.g. 93,000 euros for a Z35 scanner with an accuracy of ±18–148 µm (Slizewski and Semal, 2009)).
Passive scanning typically reconstructs 3D objects from multiple images of the object. The only equipment required is a digital camera, a computer, and the relevant software, so this option is very inexpensive, flexible and easy to use. The method is usually known as structure-from-motion or multi-view stereo, and it works by finding corresponding points among images. When the camera positions and imaging geometry are known (“camera calibration”: the internal and external parameters of the camera) (Fraser, 2013; Gruen and Huang, 2013; Remondino and Fraser, 2006), the distance of the points from the cameras can be reconstructed directly, up to an overall scale factor; this scale ambiguity is unavoidable in image-based reconstruction. If camera positions and geometry are unknown, then they too must be estimated, which is possible and common for such data (Seitz, et al., 2006).
Although such techniques have been used in other applications such as archaeology (Doneus et al., 2011; Kersten and Lindstaedt, 2012; Plets, et al., 2012; Verhoeven, 2011) and reconstructing human crania (Katz and Friess, 2014), there are relatively few studies reporting the results of such techniques in human bone models. Furthermore, there are no studies examining whether it is possible to produce a validated model of complex shapes such as that of a human vertebra.
The lack of accurate 3D data for spinal modelling, and the cost and potential difficulty associated with generating such data, is problematic. Parent et al. (2002) used a stylus-based 3D digitiser to identify approximately 190 landmarks on lumbar vertebrae; however, this did not produce a 3D reconstruction of the vertebra. They demonstrated that the process is accurate, but it has not yet been validated against another method. The goal of this work is to evaluate and validate an inexpensive method of 3D reconstruction of vertebrae and to establish a freely available surface data set of the same.
2 Materials and methods
2.1 Ground-truth data acquisition
To validate image-based reconstruction, we make use of physical measurements and laser-based range reconstruction of ten vertebrae in the data set. This subsection outlines those methods and the data set itself.
2.1.1 Human skeletal material
Human lumbar vertebrae used for the purposes of generating the 3D models were accessed through the W. D. Trotter Anatomy Museum at the Department of Anatomy, Otago School of Medical Sciences, University of Otago, New Zealand. All human material was utilised in accordance with local ethical guidelines and the New Zealand Human Tissue Act (2008). A total of 86 lumbar vertebrae, selected from each of the five lumbar vertebral levels, was utilised for this study. Lumbar vertebrae are typically numbered 1 to 5 from the most superior to inferior, and are denoted as originating from the lumbar region by the prefix L. Lumbar vertebrae used in this study were: 21 from L1, 13 from L2, 19 from L3, 20 from L4, and 13 from L5. Exclusion criteria for this study included gross anatomical abnormality as determined by an experienced anatomist (JC). Vertebrae were from modern Indian donors of both sexes and were from individuals of skeletal maturity (over 25 years of age, as determined by closure of epiphyses).
2.1.2 Manual measurements of vertebrae
Electronic callipers were used to measure five physical parameters on ten real vertebrae chosen randomly from the 86 vertebrae available. The parameters, described in Figure 1, were chosen to provide a variety of points covering the main physical properties of a single vertebra.
2.1.3 Arm scanning
Arm scanning models were acquired using a Faro Platinum (Metris, Leuven, Belgium) scanning arm, equipped with a “Model Maker” Z70 scanning head with a range of 10 centimetres. It takes an experienced technician approximately 90 minutes to acquire a 3D model of a single vertebra with the arm scanner. For a flat-plane measurement, the “Model Maker” Z70 is advertised as having an accuracy of 0.05 millimetres at 2σ (95%) and 0.075 millimetres at 3σ (99.5%).⁴ A “Model Maker” Z70 scanner was rented and used to scan ten vertebrae at a total cost of NZD$400 (an average of NZD$40 per vertebra scanned). The ten scanned vertebrae were randomly chosen from the 86 vertebrae available.
2.1.4 Validation of arm scanning models
To verify the accuracy of the arm scanning, we measured on the constructed models the same distances that were measured on the real vertebrae. Measurements on the models were made with the MeshLab⁵ software. We compared the measurements of real vertebrae and constructed models by calculating the absolute error and the relative error. If we denote the distance on a real vertebra as $d_r$ and the distance on the reconstructed model as $d_m$, the absolute error is defined as:
$$e_{abs} = |d_r - d_m|$$
where $|\cdot|$ is the absolute value. The relative error, expressed as a percentage and normalised by the mean of the two distances, is defined as:
$$e_{rel} = \frac{|d_r - d_m|}{(d_r + d_m)/2} \times 100$$
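The two error measures can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the normalisation of the relative error by the mean of the two distances is an assumption, chosen because it reproduces the relative errors tabulated in the validation tables. The sample pairs are the parameter A distances for real vertebrae and image-based models.

```python
def absolute_error(d_real, d_model):
    """Absolute difference between two distances, in millimetres."""
    return abs(d_real - d_model)


def relative_error_percent(d_real, d_model):
    """Relative error in percent.

    Assumption: the difference is normalised by the mean of the two
    distances, which matches the tabulated relative errors.
    """
    return 100.0 * absolute_error(d_real, d_model) / ((d_real + d_model) / 2.0)


# Parameter A: (real vertebra, image-based model) distances in millimetres.
pairs = [(55.7, 51.7), (81.8, 81.5), (64.3, 53.1), (53.3, 52.5), (76.7, 72.0)]

for d_real, d_model in pairs:
    print(f"real={d_real:5.1f}  model={d_model:5.1f}  "
          f"abs={absolute_error(d_real, d_model):4.1f} mm  "
          f"rel={relative_error_percent(d_real, d_model):4.1f} %")
```

Because the tabulated errors appear to have been computed from unrounded measurements, values recomputed from the rounded table entries may differ in the last digit.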
2.2 Generation of 3D models
Constructing 3D models from images requires four steps: generating digital images of the vertebra, pre-processing the images, constructing the 3D models, and post-processing the 3D models.
2.2.1 Image generation
A Canon EOS 650D (Canon Inc., Tokyo, Japan) equipped with an EF 20 mm f/2.8 USM lens was used to capture images. Each individual vertebra was mounted on a turntable using Blu-Tack⁶; images were captured from a distance of 40 centimetres. The turntable was rotated approximately 10° between photographs, giving 30–40 images for a single 360° rotation. Approximate, rather than precise, rotations were used because PhotoScan is able to handle such uncertainty and it makes image acquisition easier for the practitioner. Each vertebra was mounted in four different positions, and the camera was positioned at two different heights (horizon and 10 centimetres above the horizon) to acquire a higher degree of overlap between images. This resulted in eight possible camera/vertebra configurations, as shown in Figure 3, with 30–40 images for each configuration and therefore 240–320 images per vertebra. Acquiring 280 images took approximately 15 minutes. A plain background was used to allow easy separation of the vertebrae from the background. Originally a white background was used; during the later stages of image acquisition this was swapped for a black background to match the colour of the turntable. Figure 2 shows one position of a single vertebra with the camera positioned 10 centimetres above the horizon with a white background (left image) and with the camera at the horizon with a black background (right image). Images were taken with autofocus off, ISO set to 400, f-stop set to 8.0 and an exposure time of 1/30 s. No special illumination conditions were set, and the room was illuminated both by artificial light (fluorescent tubes) and natural light (from nearby windows).
2.2.2 Image pre-processing
In the photographic images acquired for each vertebra, some unwanted objects were visible, such as the turntable or the Blu-Tack. To keep these objects out of the reconstruction, we automatically segmented the images using K-means clustering (Kanungo, et al., 2002). The background was either white or black. With a white background there are three different colours (vertebra, white background and black turntable), so K is equal to 3. With a black background, K is equal to 2. The colour of the background did not have any impact on the quality of the reconstruction as long as a good segmentation was obtained.
The segmentation was rerun until the result was acceptable on visual inspection (see Figure 4 for good and poor segmentations). K-means initialises centroids randomly, and therefore different runs of the algorithm can produce different results. Processing one image takes from 10 to 40 seconds depending on the centroid initialisation, the number of centroids, the size of the image, and the time the user needs to judge whether the segmentation is good enough. For more details see Kanungo, et al. (2002).
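The colour-based segmentation described above can be illustrated with a small, dependency-free K-means sketch. It is not the code used in this study: the names `kmeans`, `foreground_mask` and `bone_rgb` are hypothetical, pixels are assumed to be RGB tuples, and choosing the vertebra cluster by its closeness to a reference bone colour is an assumption made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two colour tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(pixels, k, seed=0, max_iter=50):
    """Cluster RGB pixels into k groups; returns (labels, centroids).

    Centroids start at k randomly chosen distinct pixel colours, so a
    different seed can give a different segmentation.
    """
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(pixels)), k)
    labels = [0] * len(pixels)
    for _ in range(max_iter):
        # Assignment step: each pixel joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in pixels]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids


def foreground_mask(labels, centroids, bone_rgb):
    """True for pixels in the cluster closest to a reference bone colour."""
    target = min(range(len(centroids)),
                 key=lambda j: sq_dist(centroids[j], bone_rgb))
    return [lab == target for lab in labels]


# Toy image on a white background: K = 3 (background, turntable, bone).
pixels = [(250, 250, 250)] * 4 + [(10, 10, 10)] * 3 + [(200, 180, 150)] * 5
labels, centroids = kmeans(pixels, k=3)
mask = foreground_mask(labels, centroids, bone_rgb=(200, 180, 150))
print(sum(mask), "of", len(pixels), "pixels kept as vertebra")
```

For a black background the same call would use `k=2`. On real photographs a poor initialisation can merge bone and background into one cluster, which is why a rerun with another seed may be needed.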
2.2.3 3D construction of vertebrae models
The 3D construction was performed using an image-based algorithm. An educational licence was acquired to use Agisoft PhotoScan software⁷ on a Dell PowerEdge R815 with 64 CPUs and 512 GB RAM (Dell Corp., Austin, TX). The operating system was CentOS⁸ 6.2 (64-bit Linux). PhotoScan can also run on a standard desktop, but the performance depends highly on the specifications of the machine used. The software takes as input all available segmented images and produces a 3D model in the form of a triangle-based surface mesh and a 3D point cloud. Camera calibration is done automatically by PhotoScan using well established methods. Initial intrinsic parameters are obtained from image EXIF data and are then optimised along with extrinsic parameters.⁹ It took approximately five hours of computer time to produce a single model from images of an individual vertebra; once the computer modelling was initiated, no supervision of the process was required. In Agisoft PhotoScan, users can choose between three different accuracies: high, medium, and low. Pair pre-selection can be either disabled or generic; the generic setting first looks for pairs of images that overlap and then matches only those pairs, which can help reduce processing time.¹⁰ In this instance, high accuracy was used with pre-selection disabled; if the reconstruction failed, we changed the pre-selection parameters and lowered the accuracy of reconstruction.
Once the 3D models were constructed, a post-processing step was used to remove spurious model parts, as on occasion the Blu-Tack was reconstructed as part of the model. This was performed in MeshLab by removing visible vertices that obviously did not belong to the vertebra, while simultaneously visually verifying that the digital reconstruction of the vertebra was consistent with the physical specimen.
PhotoScan sometimes generated disconnected triangles and vertices that did not belong to the surface, duplicated vertices that might generate edges with zero length, and surfaces with zero area. Furthermore, the topology of the object is not necessarily respected in the reconstructed model. The filling tool of MeshLab was used to generate closed object models. Duplicated vertices, zero-length edges and zero-area triangles were removed from the model programmatically. The vertebral foramen, the hole on the posterior aspect of the vertebra that usually contains the spinal cord, was not considered for this reconstruction because the focus was on reconstructing the complex, external bony shape of the vertebrae in the first instance.
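The programmatic clean-up of duplicated vertices, zero-length edges and zero-area triangles could look like the following sketch. It is an illustration under stated assumptions, not the post-processing code released with the data set: the function names, the rounding precision `decimals` and the area threshold `min_area` are all hypothetical choices, and the mesh is assumed to be a list of vertex tuples plus a list of index triples.

```python
def triangle_area(a, b, c):
    """Area of triangle abc from the cross product of two edge vectors."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5


def clean_mesh(vertices, faces, decimals=6, min_area=1e-12):
    """Merge duplicate vertices and drop degenerate or orphaned geometry."""
    # 1. Merge vertices whose rounded coordinates coincide.
    index_of, unique, remap = {}, [], []
    for v in vertices:
        key = tuple(round(c, decimals) for c in v)
        if key not in index_of:
            index_of[key] = len(unique)
            unique.append(v)
        remap.append(index_of[key])

    # 2. Drop faces with a repeated vertex (zero-length edge) or zero area.
    kept = []
    for face in faces:
        a, b, c = (remap[i] for i in face)
        if len({a, b, c}) < 3:
            continue
        if triangle_area(unique[a], unique[b], unique[c]) <= min_area:
            continue
        kept.append((a, b, c))

    # 3. Remove vertices no longer referenced by any face and reindex.
    used = sorted({i for face in kept for i in face})
    new_id = {old: new for new, old in enumerate(used)}
    return ([unique[i] for i in used],
            [tuple(new_id[i] for i in face) for face in kept])


vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 0, 0), (2, 0, 0), (5, 5, 5)]
faces = [(0, 1, 2), (0, 1, 3), (0, 1, 4)]
clean_vertices, clean_faces = clean_mesh(vertices, faces)
print(clean_vertices, clean_faces)
```

In the toy mesh, the second face collapses once the duplicated vertex is merged, the third face is collinear, and the last vertex is unreferenced, so only one triangle remains.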
2.3 3D reconstruction validation
The image-based reconstruction was validated by directly comparing the image-based models with the arm scanning models and the real vertebrae after alignment. The image-based method does not preserve the original scale, so alignment is required. The models were roughly aligned manually, and then the iterative closest point (ICP) algorithm (Besl and McKay, 1992) was used to align the image-based models more precisely to the arm scanning models, thus fixing the scale of the image-based models. Manual measurements of the five distances presented in Section 2.1.2 were then performed.
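To show how an iterative closest-point loop can recover scale, the sketch below alternates nearest-neighbour matching with a closed-form least-squares fit. It is deliberately reduced: it estimates only a uniform scale and a translation and omits the rotation that full ICP also estimates, on the assumption that the manual rough alignment has already removed most of the rotation. All function names are hypothetical and this is not the alignment code used in the study.

```python
import math


def centroid(points):
    """Mean position of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def fit_scale_translation(source, target):
    """Least-squares s, t minimising sum |s*p + t - q|^2 over matched pairs."""
    cs, ct = centroid(source), centroid(target)
    num = sum((p[i] - cs[i]) * (q[i] - ct[i])
              for p, q in zip(source, target) for i in range(3))
    den = sum((p[i] - cs[i]) ** 2 for p in source for i in range(3))
    s = num / den
    t = tuple(ct[i] - s * cs[i] for i in range(3))
    return s, t


def transform(points, s, t):
    """Apply uniform scale s followed by translation t."""
    return [tuple(s * p[i] + t[i] for i in range(3)) for p in points]


def align(source, target, iterations=20):
    """Reduced ICP: match each moved point to its nearest target, then refit."""
    s, t = 1.0, (0.0, 0.0, 0.0)
    for _ in range(iterations):
        moved = transform(source, s, t)
        matched = [min(target, key=lambda q: math.dist(m, q)) for m in moved]
        s, t = fit_scale_translation(source, matched)
    return s, t


# Toy example: a unit cube and a slightly larger, shifted copy of it.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
reference = transform(cube, 1.1, (0.1, 0.0, 0.0))
scale, shift = align(cube, reference)
print(scale, shift)
```

Like ICP itself, this loop only converges to the right answer from a reasonable starting pose, which is why a rough manual alignment precedes it.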
3 Results
3.1 Arm scanning models
3.1.1 Manual Measurements
The measurements taken from real vertebrae are presented in Table 1. The distance between most lateral points of the two transverse processes was between 53.3 and 90.7 millimetres, while the width of the vertebral body ranged between 20.9 and 27.2 millimetres. The height of the vertebral body ranged between 41.4 and 57.3 millimetres, and the anterior-posterior length of vertebral body was between 28.6 and 40.3 millimetres. Finally, the anterior-posterior distance between the anterior edge of the vertebral body and the posterior tip of the spinous process was between 65.8 and 96.2 millimetres.
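Table 1. Arm scanning validation (columns 1–10 are the ten scanned vertebrae)
| Quantity | Parameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Arm scanning models (millimetres) | A | 56.0 | 81.6 | 90.4 | 65.2 | 69.6 | 63.2 | 79.0 | 73.2 | 53.5 | 76.0 |
| Real vertebrae (millimetres) | A | 55.7 | 81.8 | 90.7 | 66.0 | 70.1 | 64.3 | 79.1 | 74.1 | 53.3 | 76.7 |
| Absolute error (millimetres) | A | 0.3 | 0.2 | 0.3 | 0.8 | 0.5 | 1.1 | 0.1 | 0.9 | 0.2 | 0.7 |
| Relative error (%) | A | 0.5 | 0.2 | 0.3 | 1.2 | 0.7 | 1.7 | 0.1 | 1.2 | 0.3 | 0.9 |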
3.1.2 Arm scanning models
Arm scanning models are shown in Figure 5. The presented models are not complete; some parts are missing, especially around the region incorporating the spinal canal. To minimise human error, the manual measurements of the arm scanning models were repeated five times for each physical parameter using the measurement tool of MeshLab. The distances are presented in Table 1.
3.2 Validation of arm scanning models
The differences between the distances measured on real vertebrae (manually measured with electronic callipers) and on arm scanning models are shown in Table 1. The maximum relative difference between the real vertebrae and the models constructed by arm scanning is 4.8% (mean 1.1%, standard deviation 1.0%). The first histogram of Figure 8 shows the relative errors between arm scanning models and real vertebrae. A Bland-Altman (1986) analysis indicates that the 95% limits of agreement for the arm scanning models versus real vertebrae using calliper measurements are –0.9 and 1.7 millimetres.
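A Bland-Altman analysis of this kind reduces to the mean of the paired differences (the bias) plus and minus 1.96 standard deviations. The sketch below applies it to the parameter A distances from Table 1 only, so its output is not expected to match the pooled figures quoted above, which cover all measured parameters. The function name and the sign convention (real minus model) are assumptions made for the example.

```python
import statistics


def bland_altman(reference, method, z=1.96):
    """Return (bias, lower limit, upper limit) of the paired differences.

    Differences are taken as reference minus method; the limits are
    bias -/+ z sample standard deviations.
    """
    diffs = [r - m for r, m in zip(reference, method)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


# Parameter A distances in millimetres, taken from Table 1.
real = [55.7, 81.8, 90.7, 66.0, 70.1, 64.3, 79.1, 74.1, 53.3, 76.7]
arm = [56.0, 81.6, 90.4, 65.2, 69.6, 63.2, 79.0, 73.2, 53.5, 76.0]

bias, lower, upper = bland_altman(real, arm)
print(f"bias={bias:.2f} mm, limits of agreement=({lower:.2f}, {upper:.2f}) mm")
```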
3.3 3D reconstruction data
3.3.1 Photographic images
Figure 3 shows examples of the image dataset of the same vertebra from different angles.
3.3.2 3D models visualisation
In this section we present 10 out of 86 vertebrae of this data set (Figure 6). Vertebrae are presented from different angles to have a better view of the data set. These 10 vertebrae match those constructed with arm scanning. As can be seen in Figure 6 there are some models containing the spinal canal and others not. Across the whole data set, 29 models contain the canal and 57 models do not. For our study the reconstruction of the spinal canal has been ignored.
3.4 Validation of 3D reconstruction data
Image-based models were compared with the real vertebrae and with the models generated by the arm scanner. The absolute and relative errors between real vertebrae and the image-based models are shown in Table 2; the comparison follows the procedure used to validate the arm scanning models. The maximum relative error is 19.1%, with a mean relative error of 5.2% and a standard deviation of 4.2%. The second histogram of Figure 8 shows the relative errors from Table 2. A Bland-Altman (1986) analysis indicates that the 95% limits of agreement for the image-based models versus real vertebrae using calliper measurements are –4.4 and 5.4 millimetres.
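Table 2. Image-based models validation with real vertebrae (columns 1–10 are the ten scanned vertebrae)
| Quantity | Parameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Image-based models (millimetres) | A | 51.7 | 81.5 | –* | –* | –* | 53.1 | –* | –* | 52.5 | 72.0 |
| Real vertebrae (millimetres) | A | 55.7 | 81.8 | 90.7 | 66.0 | 70.1 | 64.3 | 79.1 | 74.1 | 53.3 | 76.7 |
| Absolute error (millimetres) | A | 4.0 | 0.3 | –* | –* | –* | 11.2 | –* | –* | 0.8 | 4.7 |
| Relative error (%) | A | 7.4 | 0.4 | –* | –* | –* | 19.1 | –* | –* | 1.5 | 6.3 |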
The differences between the distances measured on arm scanning models and on image-based models are shown in Table 3. The maximum relative error is 17.3%, with a mean of 4.7% and a standard deviation of 4.1%. The third histogram of Figure 8 shows the errors between image-based models and arm scanning models. A Bland-Altman (1986) analysis indicates that the 95% limits of agreement for the image-based models versus arm scanning models are –4.8 and 5.0 millimetres.
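Table 3. Image-based models validation with arm scanning models (columns 1–10 are the ten scanned vertebrae)
| Quantity | Parameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Arm scanning models (millimetres) | A | 56.0 | 81.6 | 90.4 | 65.2 | 69.6 | 63.2 | 79.0 | 73.2 | 53.5 | 76.0 |
| Image-based models (millimetres) | A | 51.7 | 81.5 | –* | –* | –* | 53.1 | –* | –* | 52.5 | 72.0 |
| Absolute error (millimetres) | A | 4.3 | 0.1 | –* | –* | –* | 10.1 | –* | –* | 1.0 | 4.0 |
| Relative error (%) | A | 7.9 | 0.1 | –* | –* | –* | 17.3 | –* | –* | 1.8 | 5.4 |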
The distribution of errors for all vertices when comparing the arm scanning models and image-based models is shown in Figure 7. Across all vertices, 95% of the errors are less than 3.5 millimetres, with a median of 1.1 millimetres.
The data in Figure 7 indicate that, for the worst matching vertebra, 90% of vertices have errors of less than four millimetres, 75% less than three millimetres, and 57% less than two millimetres. Figure 9 shows heat maps of errors for four example vertebrae.
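Summary statistics of this kind can be computed from per-vertex distances between two aligned models. The sketch below is a simplified illustration: it measures the distance from each vertex of one model to the nearest vertex of the other, which is an assumption (a point-to-surface distance would be a finer measure), it uses a brute-force search that is only practical for small meshes, and the function names are hypothetical.

```python
import math
import statistics


def nearest_distances(source, target):
    """Distance from each source vertex to its nearest target vertex."""
    return [min(math.dist(p, q) for q in target) for p in source]


def fraction_within(distances, threshold):
    """Fraction of distances strictly below the threshold."""
    return sum(d < threshold for d in distances) / len(distances)


def percentile(distances, q):
    """Nearest-rank percentile, q in (0, 100]."""
    ordered = sorted(distances)
    rank = math.ceil(q / 100.0 * len(ordered))
    return ordered[rank - 1]


# Toy data: two reference vertices and four reconstructed vertices (mm).
arm_vertices = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
image_vertices = [(0.0, 0.0, 1.0), (10.0, 0.0, 3.0),
                  (0.0, 2.0, 0.0), (10.0, 4.0, 0.0)]

errors = nearest_distances(image_vertices, arm_vertices)
print("median error:", statistics.median(errors))
print("95th percentile:", percentile(errors, 95))
print("fraction below 3.5 mm:", fraction_within(errors, 3.5))
```

For meshes with many thousands of vertices, a spatial index such as a k-d tree would replace the brute-force search.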
The image-based method sometimes reconstructs unwanted details, such as the Blu-Tack. Superfluous items visible in the images are removed during post-processing, and the resulting holes are filled with extra faces to obtain a closed manifold. The filling algorithm does not always fit the shapes smoothly; this is demonstrated by the irregularities in vertebrae 4 and 9 in Figure 6. The models generated by arm scanning look less smooth than those generated by the image-based method because PhotoScan smooths surfaces by default, whereas the arm scanning models are not smoothed. Arm scanning models could be smoothed (e.g. as Poisson surfaces (Kazhdan, Bolitho and Hoppe, 2006)), but since it makes little difference to model accuracy, this was not done.
This study has examined an inexpensive method of providing a 3D reconstruction of a complex anatomical shape. The results indicate that human vertebrae can be reconstructed in 3D with 95% of vertex errors below 3.5 millimetres and a median error of 1.1 millimetres.
Single human vertebrae are commonly available around the world (Le Bras, et al., 2003; Varol, et al., 2006); however, they have mostly been used for physical measurements (e.g. population parameters) and not for 3D reconstruction. If accessed and processed using 3D modelling, these specimens could yield larger data sets of lumbar vertebrae for shape modelling and analysis. 3D model data sets of bony structures are an essential component in understanding the variation of body shape in the human population. Making such databases publicly available helps to advance the state of the art in various medical and anatomical fields. The use of callipers to measure real bone shape is widespread, especially in quantitative morphometric studies (Gour, Shrivastava and Thakare, 2011; Hurxthal, 1968; Kanani, et al., 2012; Varol, et al., 2006; Zamora, Sari-Sarraf and Long, 2003), and is considered the standard method for vertebrae measurements. However, calliper measurements are insufficient for constructing 3D models of bone shape, and such models are desirable for automatic and semi-automatic interpretation of medical images.
This study used between 240 and 320 images of every individual vertebra for constructing each model. We have not investigated whether this number is optimal. Generally, a large number of views is needed to cover the whole object and model it completely; however, more views lengthen both the photography and the image analysis in the software. The closer the camera is positioned to the object, the more detail, precision and accuracy can be obtained in the acquired images. Furthermore, increasing the f-stop from that used in this study (8.0) may also improve the image analysis.
The high number of views used in this study was chosen because vertebrae are complex shapes; fewer images might be appropriate, but we have not investigated this systematically. Katz and Friess (2014) used 65–85 views in a similar process to construct models of human crania, indicating that for some anatomical shapes far fewer images could be used to produce 3D models of acceptable quality. The main advantage of image-based approaches is cost, at the expense of some accuracy. The models produced were accurate enough, with, in the worst case, 90% of reconstructed points being within 4 millimetres of the arm scanning models. Given the likely variability in spinal morphology across the population, small reconstruction errors are of little importance if the goal is to build a statistical model of vertebra variability, such as an active shape model (Cootes and Taylor, 1995). A further advantage is that the equipment required is very portable and can easily be used in the field and in challenging environments such as underwater. Image-based methods do not appear to require significantly more human time than active scanner methods, although they do require more computer time. Neither method works for internal structures in vivo, for which CT, MRI and ultrasound are currently the most common approaches.
Other low-cost 3D reconstruction tools have been developed, such as KinectFusion (Izadi, et al., 2011; Newcombe, et al., 2011), based on Microsoft’s Kinect sensor. KinectFusion reconstructs 3D models in real time, and is ideal for medium-sized objects or scenes. For smaller objects, however, the resolution is limited to 1–2 millimetres per voxel. Meister et al. (2012) independently evaluated KinectFusion and found that it was suitable when ten-millimetre resolution in world coordinates was sufficient (75% of surface points were within ten millimetres of ground truth). Our results show that for the worst scanned vertebra 75% of the surface points were within three millimetres of the arm scanning model.
We also initially experimented with the Kinect 1 as a tool for 3D reconstruction, but found that the resolution of the disparity image (640 × 480 pixels) was too low for our purposes. The low resolution makes the models much less dense, as confirmed by Khoshelham and Elberink (2012). However, KinectFusion could still be useful for many applications, especially given its speed and ease of use.
The method used in this study does have several limitations. It is less accurate than active scanning methods, which are preferred if high accuracy is needed. In particular, highly concave object parts, or holes, are often poorly reconstructed, as was the vertebral foramen in this study. Furthermore, the scale of reconstructed models is arbitrary, and if a metric reconstruction is needed, then at least one physical measurement of the object is required. The use of horizontal and vertical calibrated scale bars around the object could also improve the geometric quality of the modelled objects, since the scale bars can be measured precisely in the high-resolution images. The structure-from-motion technique used by PhotoScan also has limitations, as it can fail to find the right sequential bundle in a large set of images (Remondino, et al., 2012), which happened in some of our reconstructions. We found that setting PhotoScan to construct lower-resolution models solved this problem. Additionally, manual editing of the resulting models is often needed to remove gross errors.
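Fixing the arbitrary scale from a single physical measurement amounts to one multiplication. The sketch below assumes that the two endpoints of the measured distance can be identified as vertices of the model; the function name and its parameters are hypothetical and chosen only for illustration.

```python
import math


def rescale_to_metric(vertices, index_a, index_b, real_distance_mm):
    """Scale a model so two chosen vertices lie a measured distance apart.

    Returns the rescaled vertices and the scale factor that was applied.
    """
    model_distance = math.dist(vertices[index_a], vertices[index_b])
    factor = real_distance_mm / model_distance
    return [tuple(factor * c for c in v) for v in vertices], factor


# Toy model in arbitrary units; vertices 0 and 1 stand for the two ends
# of a calliper measurement of 55.7 millimetres.
model = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
metric_model, factor = rescale_to_metric(model, 0, 1, 55.7)
print(factor, metric_model)
```

A single measurement transfers its own error to the whole model, so averaging the factor over several measured distances would be a natural refinement.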
Finally, how well the data set represents the general population is uncertain because the specimens used to generate it are homogeneous. The vertebrae are all from the same spinal region and include none with pathology. Anatomically, they are representative of “normal” vertebrae, but may not represent the general population, which includes people of other ethnicities and origins. However, the method could be used to generate more representative data at little cost if appropriate specimens were available.
This study has illustrated a cost-effective method for constructing a 3D model of a complicated anatomical shape such as the human vertebra and has shown that such a method, while not as accurate as active scanning approaches, is accurate enough for several applications, including visualisation and the construction of statistical shape models. Although these methods have been used on similar problems previously (notably human crania), they have not been applied to such a large collection of complex anatomical shapes. Both the images and the reconstructed data set are provided for future use by the research community. The models are in the repository “3D Lumbar Vertebrae Data Set”,¹¹ and the images are provided upon request as their size is about 1 GB per vertebra. The source code for segmentation and post-processing is also available. As far as we are aware, no such public repository for human vertebrae currently exists. Additional investigations are required across different user groups to further validate the generated data and determine its usefulness across applications.