Abstract

The package facilitates the interaction with and manipulation of medical imaging data that conform to the ANALYZE, NIfTI and AFNI formats. The 4 class framework is used to develop basic ANALYZE and NIfTI classes, where NIfTI extensions may be used to extend the fixed-byte NIfTI header. One example of this, that has been implemented, is an -based “audit trail” tracking the history of operations applied to a data set. The conversion from DICOM to ANALYZE/NIfTI is straightforward using the capabilities of . The 4 classes have been developed to provide a user-friendly interface to the ANALYZE/NIfTI data formats; allowing easy data input, data output, image processing and visualization.

## oro.nifti 0.10.3

Introduction

Medical imaging is well established in both the clinical and research areas with numerous equipment manufacturers supplying a wide variety of modalities. The ANALYZE format was developed at the Mayo Clinic (in the 1990s) to store multidimensional biomedical images. It is fundamentally different from the DICOM standard since it groups all images from a single acquisition (typically three- or four-dimensional) into a pair of binary files, one containing header information and one containing the image information. The DICOM standard groups the header and image information, typically a single two-dimensional image, into a single file. Hence, a single acquisition will contain multiple DICOM files but only a pair of ANALYZE files.

The NIfTI format was developed in the early 2000s by the DFWG (Data Format Working Group) in an effort to improve upon the ANALYZE format. The resulting NIfTI-1 format adheres to the basic header/image combination from the ANALYZE format, but allows the pair of files to be combined into a single file and re-defines the header fields. In addition, NIfTI extensions allow one to store additional information (e.g., key acquisition parameters, experimental design) inside a NIfTI file.

The material presented here provides users with a method of interacting with ANALYZE and NIfTI files in . Real-world data sets, that are publicly available, are used to illustrate the basic functionality of . It should be noted that focuses on functions for data input/output and visualization. 4 classes and are provided for further statistical analysis in without losing contextual information from the original ANALYZE or NIfTI files. Images in the metadata-rich DICOM format may be converted to NIfTI semi-automatically using by utilizing as much information from the DICOM files as possible. Basic visualization functions, similar to those commonly used in the medical imaging community, are provided for and objects. Additionally, the package allows one to track every operation on a object in an -based audit trail.

The package should appeal not only to package developers, but also to scientists and researchers who want to interrogate medical imaging data using the statistical capabilities of without writing and validating their own basic data input/output functionality. Table~ lists the key functions for and groups them according to common functionality. An example of using statistical methodology in for the analysis of functional magnetic resonance imaging (fMRI) data is given in section~. Packages already available on CRAN that utilize include: , , , and .

Although the industry standard for medical imaging data is DICOM, another format has come to be heavily used in the image analysis community. The ANALYZE format was originally developed in conjunction with an image processing system (of the same name) at the Mayo Foundation. A common version of the format, although not the most recent, is called ANALYZE 7.5. A copy of the file ANALYZE75.pdf has been included in (accessed via ) since it does not appear to be available from any longer. An ANALYZE 7.5 format image is comprised of two files, the “.hdr” and “.img” files, that contain information about the acquisition and the acquisition itself, respectively. A more recent adaption of this format is known as NIfTI-1 and is a product of the Data Format Working Group (DFWG) from the Neuroimaging Informatics Technology Initiative (NIfTI; ). The NIfTI-1 data format is almost identical to the ANALYZE format, but offers a few improvements

There are several packages that also offer input/output functionality for the NIfTI and ANALYZE data formats in addition to image analysis capabilities for specific MRI acquisition sequences; e.g., , and . The package provides access to NIfTI data via the library .

The NIfTI header inherits its structure (348 bytes in length) from the ANALYZE data format. The last four bytes in the NIfTI header correspond to the “magic” field and denote whether or not the header and image are contained in a single file () or two separate files (), the latter being identical to the structure of the ANALYZE data format. The NIfTI data format added an additional four bytes to allow for “extensions” to the header. By default these four bytes are set to zero.

The first example of reading in, and displaying, medical imaging data in NIfTI format was obtained from the NIfTI website (). Successful execution of the commands

fname <- system.file(file.path("nifti", "mniLR.nii.gz"), package="oro.nifti")
(mniLR <- readNIfTI(fname))

## NIfTI-1 format
##   Type            : nifti
##   Data Type       : 2 (UINT8)
##   Bits per Pixel  : 8
##   Slice Code      : 0 (Unknown)
##   Intent Code     : 0 (None)
##   Qform Code      : 0 (Unknown)
##   Sform Code      : 4 (MNI_152)
##   Dimension       : 91 x 109 x 91
##   Pixel Dimension : 2 x 2 x 2
##   Voxel Units     : mm
##   Time Units      : sec

pixdim(mniLR)

## [1] 0 2 2 2 1 1 1 1

descrip(mniLR)

## [1] "FSL3.2beta"

aux.file(mniLR)

## [1] "none                   "

produces an 4 object (or if the audit trail option is set). Some accessor functions are also provided; e.g., and . The former is used to access the original name of the file (if it has been provided) and the latter is the name of a valid NIfTI header field used to hold a “description” (up to 80 characters in length).

Image information begins at the byte position determined by the slot. In a single NIfTI file (), this is by default after the first 352 bytes. Header extensions extend the size of the header and come before the image information leading to a consequent increase of for single NIfTI files. The split NIfTI () and ANALYZE formats contain pairs of files, where the header and image information are separated, and do not have this problem. In this case is set to 0.

The function has been overloaded so that it behaves differently when dealing with medical image objects ( and ). The command

image(mniLR)

produces a three-dimensional array of the MNI brain, with the default NIfTI axes, and is displayed on a $10{\times}10$ grid of images (Figure a). The function for medical image 4 objects is an attempt to balance minimal user input with enough flexibility to customize the display when necessary. For example, single slices may be viewed by using the option in conjunction with the option to specify the slice.

The second example of reading in and displaying medical imaging data in the NIfTI format was also obtained from the NIfTI website (). Successful execution of the commands

fname <- system.file(file.path("nifti", "mniRL.nii.gz"), package="oro.nifti")
(mniRL <- readNIfTI(fname))

## NIfTI-1 format
##   Type            : nifti
##   Data Type       : 2 (UINT8)
##   Bits per Pixel  : 8
##   Slice Code      : 0 (Unknown)
##   Intent Code     : 0 (None)
##   Qform Code      : 0 (Unknown)
##   Sform Code      : 4 (MNI_152)
##   Dimension       : 91 x 109 x 91
##   Pixel Dimension : 2 x 2 x 2
##   Voxel Units     : mm
##   Time Units      : sec

image(mniRL)

produces a three-dimensional array of the MNI brain that is displayed in a $10{\times}10$ grid of images (Figure b). The two sets of data in Figure are stored in two different orientations, commonly referred to as the and conventions. The neurological convention is where “right is right” and one is essentially looking through the subject. The radiological convention is where “right is left” and one is looking at the subject.

An additional graphical display function has been added for and objects that allows a so-called orthographic visualization of the data.

orthographic(mniRL)

As seen in Figure the mid-axial, mid-sagittal and mid-coronal planes are displayed by default. The slices used may be set using , where $(I,J,K)$ are appropriate indices, and the crosshairs will provide a spatial reference in each plane relative to the other two.

The NIfTI format contains an implicit generalized spatial transformation from the data co-ordinate system $(i,j,k)$ into a real-space “right-handed” co-ordinate system. In this real-space system, the $(x,y,z)$ axes are set such that $x$ increases from left to right, $y$ increases from posterior to anterior and $z$ increases from inferior to superior.

At this point in time the package cannot apply an arbitrary transform to the imaging data into $(x,y,z)$ space – such a transform may require non-integral indices and interpolation steps. The package does accommodate straightforward transformations of imaging data; e.g., setting the $i$ -axis to increase from right to left (the neurological convention). Future versions of will attempt to address more complicated spatial transformations and provide functionality to display the $(x,y,z)$ axes on orthographic plots.

A major improvement in the package is the fact that standard medical imaging formats are stored in unique classes under the 4 system . Essentially, NIfTI and ANALYZE data are stored as multi-dimensional arrays with extra slots created that capture the format-specific header information; e.g., for a object

slotNames(mniRL)

##  [1] ".Data"          "sizeof_hdr"     "data_type"     
##  [4] "db_name"        "extents"        "session_error" 
##  [7] "regular"        "dim_info"       "dim_"          
## [10] "intent_p1"      "intent_p2"      "intent_p3"     
## [13] "intent_code"    "datatype"       "bitpix"        
## [16] "slice_start"    "pixdim"         "vox_offset"    
## [19] "scl_slope"      "scl_inter"      "slice_end"     
## [22] "slice_code"     "xyzt_units"     "cal_max"       
## [25] "cal_min"        "slice_duration" "toffset"       
## [28] "glmax"          "glmin"          "descrip"       
## [31] "aux_file"       "qform_code"     "sform_code"    
## [34] "quatern_b"      "quatern_c"      "quatern_d"     
## [37] "qoffset_x"      "qoffset_y"      "qoffset_z"     
## [40] "srow_x"         "srow_y"         "srow_z"        
## [43] "intent_name"    "magic"          "extender"      
## [46] "reoriented"

c(cal.min(mniRL), cal.max(mniRL))

## [1]   0 255

range(mniRL)

## [1]   0 255

mniRL@"datatype"

## [1] 2

convert.datatype(mniRL@"datatype")

## [1] "UINT8"

Note, an ANALYZE object has a slightly different set of slots. Slots 4–47 are taken verbatim from the definition of the NIfTI format and are read directly from a file. The slot is the multidimensional array (since class inherits from class ) and the slots , and are used for internal bookkeeping. In the code above we have accessed the min/max values of the imaging data using the and accessor functions which matches a direct interrogation of the slot using the function. Looking at the slot provides a numeric code that may be converted into a value that indicates the type of byte structure used (in this case an 8-bit or 1-byte unsigned integer).

As introduced in Section~ there are currently only two accessor functions to slots in the NIfTI header ( and ) – all other slots are either ignored or used inside of functions that operate on ANALYZE/NIfTI objects. The NIfTI class also has the ability to read and write extensions that conform to the NIfTI data format. Customized printing and validity-checking functions are available to the user and every attempt has been made to ensure that the information from the multi-dimensional array is in agreement with the header values.

The constructor function produces valid NIfTI objects, including a consistent header, from an arbitrary array.

n <- 100
(random.image <- nifti(array(runif(n*n), c(n,n,1))))

## NIfTI-1 format
##   Type            : nifti
##   Data Type       : 2 (UINT8)
##   Bits per Pixel  : 8
##   Slice Code      : 0 (Unknown)
##   Intent Code     : 0 (None)
##   Qform Code      : 0 (Unknown)
##   Sform Code      : 0 (Unknown)
##   Dimension       : 100 x 100 x 1
##   Pixel Dimension : 1 x 1 x 1
##   Voxel Units     : Unknown
##   Time Units      : Unknown

random.image@"dim_"

## [1]   3 100 100   1   1   1   1   1

dim(random.image)

## [1] 100 100   1

The function outputs valid NIfTI class files, which can be opened in other medical imaging software. Files can either be stored as standard files or compressed with gnuzip (default).

writeNIfTI(random.image, "random")
list.files(pattern="random")

## [1] "random.nii.gz"

## [1] TRUE

Following on from the 4 implementation of both the NIfTI and ANALYZE data formats, the ability to extend the NIfTI data format header is utilized in the package. Please use the command

options(niftiAuditTrail=TRUE)

to turn on the “audit trail” option in and then execute the function . With the option enabled extensions are properly handled when reading and writing NIfTI data, users are allowed to add extensions to newly-created NIfTI objects by casting them as objects and adding objects to the slot, and all operations that are performed on a NIfTI object will generate what we call an that consists of an -based log .

Figure displays output from the accessor function , the -based audit trail that is stored as a NIfTI header extension.

Each log entry contains information not only about the function applied to the NIfTI object, but also various system-level information; e.g., version of , user name, date, time, etc. When writing NIfTI-class objects to disk, the -based NIfTI extension is converted into plain text and saved using to denote plain ASCII text. The user may control the tracking of data manipulation via the audit trail using the global option .

Basic visualization of and class images can be achieved with any visualization for arrays in . For example, the package provides functions and for visualization . Please note that functions in expect grey-scale values in the range $[0,1]$ , hence the display of data may be performed using

mniLR.range <- range(mniLR)
EBImage::display((mniLR - min(mniLR)) / diff(mniLR.range))

Interactive visualization of multi-dimensional arrays, stored in NIfTI or ANALYZE format, is however best performed outside of at this point in time. Popular viewers, especially for neuroimaging data, include

The package provides basic interactive visualization of ANALYZE/NIfTI data using a / interface .

This is an example of reading in, and displaying, a four-dimensional medical imaging data set in NIfTI format obtained from the evaluation and example data suite (). Successful execution of the commands

filtered.func.data <- 
  system.file(file.path("nifti", "filtered_func_data.nii.gz"), 
              package="oro.nifti")
(ffd <- readNIfTI(filtered.func.data))

## NIfTI-1 format
##   Type            : niftiExtension
##   Data Type       : 4 (INT16)
##   Bits per Pixel  : 16
##   Slice Code      : 0 (Unknown)
##   Intent Code     : 0 (None)
##   Qform Code      : 0 (Unknown)
##   Sform Code      : 0 (Unknown)
##   Dimension       : 64 x 64 x 21 x 64
##   Pixel Dimension : 1 x 1 x 1 x 1
##   Voxel Units     : Unknown
##   Time Units      : Unknown

image(ffd, zlim=range(ffd)*0.95)

produces a four-dimensional (4D) array of imaging data that may be displayed in a $5{\times}5$ grid of images (Figure a). The first three dimensions are spatial locations of the voxel (volume element) and the fourth dimension is time for this functional MRI (fMRI) acquisition. As seen from the summary of object, there are 21 axial slices of fairly coarse resolution ( $4{\times}4{\times}6\;\text{mm}$ ) and reasonable temporal resolution ( $3\;\text{s}$ ). Figure b depicts the orthographic display of the using the axial plane containing the left-and-right thalamus to approximately center the crosshair vertically.

orthographic(ffd, xyz=c(34,29,10), zlim=range(ffd)*0.9)

The programming environment provides a wide variety of statistical methodology for the quantitative analysis of medical imaging data. For example, functional MRI (fMRI) data are typically analyzed by applying a multiple linear regression model, commonly referred to in the literature as a general linear model (GLM), that utilizes the stimulus experiment to construct the design matrix. Estimation of the regression coefficients in the GLM produces a statistical image; e.g., $Z$ -statistics for a voxel-wise hypothesis test on activation in fMRI experiments .

The 4D volume of imaging data in was acquired in an experiment with a repetition time $\text{TR}=3\;\text{s}$ , using both visual and auditory stimuli. The visual stimulus was applied using an on/off pattern for a duration of 60 seconds and the auditory stimulus was applied using an on/off pattern for a duration of 90 seconds. A parametric haemodynamic response function (HRF), with mean $\mu=6$ and standard deviation $\sigma=3$ , is utilized here which is similar to the default values in . We construct the experimental design and HRF in seconds, perform the convolution and then downsample by a factor of three in order to obtain columns of the design matrix that match the acquisition of the MRI data.

visual <- rep(c(-0.5,0.5), each=30, times=9)
auditory <- rep(c(-0.5,0.5), each=45, times=6)
hrf <- c(dgamma(1:15, 4, scale=1.5))
hrf0 <- c(hrf, rep(0, length(visual)-length(hrf)))
visual.hrf <- convolve(hrf0, visual)
hrf0 <- c(hrf, rep(0, length(auditory)-length(hrf)))
auditory.hrf <- convolve(hrf0, auditory)
index <- seq(3, 540, by=3)
visual.hrf <- visual.hrf[index]
auditory.hrf <- auditory.hrf[index]

Figure depicts the visual and auditory stimuli, convolved with the HRF, in the order of acquisition. The design matrix is then used in a voxel-wise GLM, where the function in estimates the parameters in the linear regression. At each voxel $t$ -statistics and their associated $p$ -values are computed for the hypothesis test of no effect for each individual stimulus, along with an $F$ -statistic for the hypothesis test of no effect of any stimuli using the function.

##reduced length due to R package storage limitations
visual.hrf<-visual.hrf[1:64]
auditory.hrf<-auditory.hrf[1:64]
## background threshold: 10% max intensity
voxel.lsfit <- function(x, thresh) { # general linear model
  ## check against background threshold
  if (max(x) < thresh) {
    return(rep(NA, 5))
  }
  ## glm
  output <- lsfit(cbind(visual.hrf, auditory.hrf), x)
  ## extract t-statistic, p-values
  output.t <- ls.print(output, print.it=FALSE)$coef.table[[1]][2:3,3:4]
  output.f <- ls.print(output, print.it=FALSE)$summary[3]
  c(output.t, as.numeric(output.f))
}
## apply local glm to each voxel
ffd.glm <- apply(ffd, 1:3, voxel.lsfit, thresh=0.1 * max(ffd))

Given the multidimensional array of output from the GLM fitting procedure, the $t$ -statistics are separated and converted into $Z$ -statistics to follow the convention used in . For the purposes of this example we have not applied any multiple comparisons correction procedure and, instead, use a relatively large threshold of $Z>5$ for visualization.

dof <- ntim(ffd) - 1
Z.visual <- nifti(qnorm(pt(ffd.glm[1,,,], dof, log.p=TRUE), log.p=TRUE),
                  datatype=16)
Z.auditory <- nifti(qnorm(pt(ffd.glm[2,,,], dof, log.p=TRUE), log.p=TRUE),
                    datatype=16)

yrange <- c(5, max(Z.visual, na.rm=TRUE))
overlay(ffd, ifelse(Z.visual > 5, Z.visual, NA), 
        zlim.x=range(ffd)*0.95, zlim.y=yrange)

yrange <- c(5, max(Z.auditory, na.rm=TRUE))
overlay(ffd, ifelse(Z.auditory > 5, Z.auditory, NA), 
        zlim.x=range(ffd)*0.95, zlim.y=yrange)

Statistical images in neuroimaging are commonly displayed as an overlay on top of a reference image (one of the dynamic acquisitions) in order to provide anatomical context. The command in allows one to display the statistical image of voxel-wise activations overlayed on one of the original EPI (echo planar imaging) volumes acquired in the fMRI experiment. The 3D array of $Z$ -statistics for the visual and auditory tasks are overlayed on the original data for “anatomical” reference in Figure . The $Z$ -statistics that exceed the threshold appear to match know neuroanatomy, where the visual cortex in the occipital lobe shows activation under the visual stimulus (Figure a) and the primary auditory cortex in the temporal lobe shows activation under the auditory stimulus (Figure b).

Medical image analysis depends on the efficient manipulation and conversion of DICOM data. The package has been developed to provide the user with a set of functions that mask as many of the background details as possible while still providing flexible and robust performance.

The future of medical image analysis in will benefit from a unified view of the imaging data standards: DICOM, NIfTI, ANALYZE, AFNI, MINC, etc. The existence of a single package for handling imaging data formats would facilitate interoperability between the ever increasing number of packages devoted to medical image analysis. We do not assume that the data structures in are best-suited for this purpose and we welcome an open discussion around how best to provide this standardization to the end user.

The authors would like to thank the National Biomedical Imaging Archive (NBIA), the National Cancer Institute (NCI), the National Institute of Health (NIH) and all institutions that have contributed medical imaging data to the public domain. The authors would also like to thank K. Tabelow for providing functionality around the AFNI data format. VS is supported by the German Research Council (DFG SCHM 2747/1-1).

true

Brandon Whitcher

Volker J. Schmid

Andrew Thornton

Introduction