Getting Data from the Human Connectome Project (HCP)
John Muschelli
2020-10-02
All code for this document is located here.
Human Connectome Project (HCP)
The Human Connectome Project (HCP) is a consortium of sites whose goal is to map “human brain circuitry in a target number of 1200 healthy adults using cutting-edge methods of noninvasive neuroimaging” (https://www.humanconnectome.org/). It includes a large cohort of individuals with a vast amount of neuroimaging data, including structural magnetic resonance imaging (MRI), functional MRI (both task-based and resting-state), and diffusion tensor imaging (DTI), collected at multiple sites.
Getting Access to the Data
The data are available to those who agree to the license. Users can either pay to have hard drives of the data shipped to them (“Connectome In A Box”) or access the data online through the database at http://db.humanconnectome.org. Data can be downloaded from the website directly in a browser or through an Amazon Simple Storage Service (S3) bucket. We will focus on accessing the data from S3.
Getting an Access/API Key
After logging in to http://db.humanconnectome.org and accepting the terms, the user must enable Amazon S3 access for their account. The user is then provided an access key identifier (ID) and a secret key, both of which are required to authenticate to Amazon. These access and secret keys are necessary for the neurohcp package, and will be referred to as access keys or API (application programming interface) keys.
Installing the neurohcp package
We will install the neurohcp package using the Neuroconductor installer:
source("http://neuroconductor.org/neurocLite.R")
neuro_install("neurohcp", release = "stable")
Setting the API key
In the neurohcp package, set_aws_api_key will set the AWS access keys:
set_aws_api_key(access_key = "ACCESS_KEY", secret_key = "SECRET_KEY")
Alternatively, these can be stored in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, respectively.
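The environment-variable route can be sketched in base R as follows. This is a minimal illustration, not part of neurohcp: the helper name aws_keys_present is hypothetical, and the key values are placeholders rather than real credentials.

```r
# Hypothetical helper (not part of neurohcp): TRUE only if both AWS
# environment variables are set to non-empty values.
aws_keys_present <- function() {
  vars <- c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
  values <- Sys.getenv(vars, unset = "")
  all(nzchar(values))
}

# Set the variables for the current R session only.
# "ACCESS_KEY" and "SECRET_KEY" are placeholders, not real credentials.
Sys.setenv(
  AWS_ACCESS_KEY_ID = "ACCESS_KEY",
  AWS_SECRET_ACCESS_KEY = "SECRET_KEY"
)
aws_keys_present()
```

Variables set with Sys.setenv last only for the current session. To persist them, one option is to place the same two entries in a user-level .Renviron file, which keeps the keys out of analysis scripts.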
Once these are set, the functions of neurohcp are ready to use. To test that the API keys are set correctly, one can run bucketlist:
neurohcp::bucketlist()
                          Bucket             CreationDate
1                 hcp-openaccess 2018-08-13T15:10:17.000Z
2        hcp-openaccess-logfiles 2018-07-25T21:19:33.000Z
3       hcp-openaccess-logs-temp 2018-04-20T15:38:14.000Z
4 hcp-openaccess-logstorage-temp 2018-06-08T15:51:53.000Z
5            hcp-openaccess-test 2018-06-29T13:17:02.000Z
6      hcp-openaccess-trail-temp 2018-04-20T15:43:24.000Z
We see that hcp-openaccess is a bucket that we have access to, and therefore have access to the data.
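The same check can be done programmatically rather than by eye. The sketch below assumes the listing is a data frame with a Bucket column, as in the printed output above. The helper has_bucket is hypothetical, and example_buckets is a small stand-in built from two of the bucket names shown, so the block runs without credentials.

```r
# Hypothetical helper: TRUE if `name` appears in the Bucket column of a
# bucket listing shaped like the output above.
has_bucket <- function(buckets, name = "hcp-openaccess") {
  name %in% as.character(buckets$Bucket)
}

# Small stand-in for the listing, using two of the bucket names shown above.
example_buckets <- data.frame(
  Bucket = c("hcp-openaccess", "hcp-openaccess-test"),
  stringsAsFactors = FALSE
)
has_bucket(example_buckets)

# With valid keys and network access, the same check could be applied to the
# live listing:
# has_bucket(neurohcp::bucketlist())
```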
Getting Data: Downloading a Directory of Data
In the neurohcp package, there is a data set named hcp_900_scanning_info that lists the scans acquired for each subject. We can select the subjects who have diffusion MRI (dMRI) data:
library(neurohcp)
library(dplyr)
# keep one row per subject with at least one diffusion MRI scan
ids_with_dwi = hcp_900_scanning_info %>%
  filter(scan_type %in% "dMRI") %>%
  select(id) %>%
  unique
head(ids_with_dwi)
# A tibble: 6 x 1
  id
  <chr>
1 100307
2 100408
3 101006
4 101107
5 101309
6 101410
Let us download the complete directory of diffusion data using download_hcp_dir:
r = download_hcp_dir("HCP/100307/T1w/Diffusion", verbose = FALSE)
print(basename(r$output_files))
[1] "bvals" "bvecs" "data.nii.gz" "grad_dev.nii.gz"
[5] "nodif_brain_mask.nii.gz"
These diffusion data can be used to compute summaries such as fractional anisotropy and mean diffusivity.
If we create a new column with all the directories, we can iterate over these to download all the diffusion data for these subjects from the HCP database.
ids_with_dwi = ids_with_dwi %>%
  mutate(id_dir = paste0("HCP/", id, "/T1w/Diffusion"))
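One way to iterate over the id_dir column is sketched below. This is an illustration under assumptions, not the package's own batch interface: the helper download_all_dirs is hypothetical, and it calls download_hcp_dir with the same arguments used earlier in this tutorial. Each call is wrapped in tryCatch so that a failure for one subject does not stop the remaining downloads.

```r
# Hypothetical helper: download every directory in `dirs`, continuing past
# failures. By default each directory is fetched with
# neurohcp::download_hcp_dir(d, verbose = FALSE), as in the example above.
download_all_dirs <- function(dirs,
                              downloader = function(d) {
                                neurohcp::download_hcp_dir(d, verbose = FALSE)
                              }) {
  results <- lapply(dirs, function(d) {
    tryCatch(
      downloader(d),
      error = function(e) {
        # Report the failure and record NULL for this directory.
        message("Failed to download ", d, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- dirs
  results
}

# Usage (needs valid AWS keys and network access; diffusion data are large,
# so it is sensible to try a couple of subjects first):
# all_dwi <- download_all_dirs(head(ids_with_dwi$id_dir, 2))
```

The downloader argument exists so the loop can be exercised with a stand-in function before committing to a large transfer.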
Getting Data: Downloading a Single File
We can also download a single file using download_hcp_file. Here we will simply download the bvals file:
ret = download_hcp_file("HCP/100307/T1w/Diffusion/bvals", verbose = FALSE)
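Once a bvals file is on disk, it can be read with base R. The sketch below assumes the usual FSL-style layout, in which b-values are stored as whitespace-separated numbers in a text file. The helper read_bvals is hypothetical, and the values written to the temporary file are made up for the demonstration, not taken from HCP.

```r
# Hypothetical helper: read b-values from a whitespace-separated text file.
read_bvals <- function(path) {
  scan(path, what = numeric(), quiet = TRUE)
}

# Self-contained demonstration with made-up values in a temporary file.
tmp <- tempfile(fileext = ".bval")
writeLines("5 1000 1995 3005 5 1000", tmp)
bvals <- read_bvals(tmp)

# Count how many volumes share each b-value.
table(bvals)
```

If ret holds the local path of the downloaded file (an assumption worth checking against the package documentation), read_bvals(ret) would apply in the same way.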
Session Info
devtools::session_info()
─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.0.2 (2020-06-22)
os macOS Catalina 10.15.6
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2020-10-02
─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────
package * version date lib source
abind 1.4-5 2016-07-21 [2] CRAN (R 4.0.0)
ANTsR 0.5.6.1 2020-06-01 [2] Github (ANTsX/ANTsR@9c7c9b7)
ANTsRCore 0.7.4.6 2020-07-07 [2] Github (muschellij2/ANTsRCore@61c37a1)
assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.0.0)
aws.s3 0.3.21 2020-04-07 [1] CRAN (R 4.0.2)
aws.signature 0.6.0 2020-06-01 [2] Github (cloudyr/aws.signature@5689733)
backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2)
base64enc 0.1-3 2015-07-28 [2] CRAN (R 4.0.0)
bitops 1.0-6 2013-08-17 [2] CRAN (R 4.0.0)
callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2)
cli 2.0.2 2020-02-28 [2] CRAN (R 4.0.0)
colorout * 1.2-2 2020-06-01 [2] Github (jalvesaq/colorout@726d681)
crayon 1.3.4 2017-09-16 [2] CRAN (R 4.0.0)
curl 4.3 2019-12-02 [2] CRAN (R 4.0.0)
desc 1.2.0 2020-06-01 [2] Github (muschellij2/desc@b0c374f)
devtools 2.3.1.9000 2020-08-25 [2] Github (r-lib/devtools@df619ce)
digest 0.6.25 2020-02-23 [2] CRAN (R 4.0.0)
dplyr * 1.0.2 2020-08-18 [2] CRAN (R 4.0.2)
ellipsis 0.3.1 2020-05-15 [2] CRAN (R 4.0.0)
evaluate 0.14 2019-05-28 [2] CRAN (R 4.0.0)
EveTemplate 1.0.0 2020-06-01 [2] Github (muschellij2/EveTemplate@ed54115)
extrantsr 3.9.13.1 2020-09-03 [2] Github (muschellij2/extrantsr@00c75ad)
fansi 0.4.1 2020-01-08 [2] CRAN (R 4.0.0)
fs 1.5.0 2020-07-31 [2] CRAN (R 4.0.2)
fslr 2.25.0 2020-09-24 [1] local
generics 0.0.2 2018-11-29 [2] CRAN (R 4.0.0)
glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
htmltools 0.5.0 2020-06-16 [2] CRAN (R 4.0.0)
httr 1.4.2 2020-07-20 [2] CRAN (R 4.0.2)
ITKR 0.5.3.2.0 2020-06-01 [2] Github (stnava/ITKR@9bdd5f8)
knitr * 1.30 2020-09-22 [1] CRAN (R 4.0.2)
lattice 0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
lifecycle 0.2.0 2020-03-06 [2] CRAN (R 4.0.0)
magrittr 1.5 2014-11-22 [2] CRAN (R 4.0.0)
Matrix 1.2-18 2019-11-27 [2] CRAN (R 4.0.2)
matrixStats 0.56.0 2020-03-13 [2] CRAN (R 4.0.0)
memoise 1.1.0 2017-04-21 [2] CRAN (R 4.0.0)
mgcv 1.8-32 2020-08-19 [2] CRAN (R 4.0.2)
neurobase * 1.31.0 2020-09-04 [2] local
neurohcp * 0.9.0 2020-10-01 [1] local
nlme 3.1-149 2020-08-23 [2] CRAN (R 4.0.2)
oro.nifti * 0.11.0 2020-09-04 [2] local
pillar 1.4.6 2020-07-10 [2] CRAN (R 4.0.2)
pkgbuild 1.1.0 2020-07-13 [2] CRAN (R 4.0.2)
pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.0.0)
pkgload 1.1.0 2020-05-29 [2] CRAN (R 4.0.0)
plyr 1.8.6 2020-03-03 [2] CRAN (R 4.0.0)
prettyunits 1.1.1 2020-01-24 [2] CRAN (R 4.0.0)
processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2)
ps 1.3.4 2020-08-11 [2] CRAN (R 4.0.2)
purrr 0.3.4 2020-04-17 [2] CRAN (R 4.0.0)
R.matlab 3.6.2 2018-09-27 [2] CRAN (R 4.0.0)
R.methodsS3 1.8.0 2020-02-14 [2] CRAN (R 4.0.0)
R.oo 1.23.0 2019-11-03 [2] CRAN (R 4.0.0)
R.utils 2.9.2 2019-12-08 [2] CRAN (R 4.0.0)
R6 2.4.1 2019-11-12 [2] CRAN (R 4.0.0)
Rcpp 1.0.5 2020-07-06 [2] CRAN (R 4.0.0)
RcppEigen 0.3.3.7.0 2019-11-16 [2] CRAN (R 4.0.0)
remotes 2.2.0 2020-07-21 [2] CRAN (R 4.0.2)
rlang 0.4.7.9000 2020-09-09 [1] Github (r-lib/rlang@60c0151)
rmarkdown * 2.3 2020-06-18 [2] CRAN (R 4.0.0)
RNifti 1.2.2 2020-09-07 [1] CRAN (R 4.0.2)
rprojroot 1.3-2 2018-01-03 [2] CRAN (R 4.0.0)
rsconnect 0.8.16 2019-12-13 [2] CRAN (R 4.0.0)
rstudioapi 0.11 2020-02-07 [2] CRAN (R 4.0.0)
sessioninfo 1.1.1 2018-11-05 [2] CRAN (R 4.0.0)
stapler 0.7.2 2020-07-09 [2] Github (muschellij2/stapler@79e23d2)
stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
stringr * 1.4.0 2019-02-10 [2] CRAN (R 4.0.0)
testthat 2.99.0.9000 2020-09-17 [1] Github (r-lib/testthat@fbbd667)
tibble 3.0.3 2020-07-10 [2] CRAN (R 4.0.2)
tidyselect 1.1.0 2020-05-11 [2] CRAN (R 4.0.0)
usethis 1.6.1.9001 2020-08-25 [2] Github (r-lib/usethis@860c1ea)
utf8 1.1.4 2018-05-24 [2] CRAN (R 4.0.0)
vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
WhiteStripe 2.3.2 2019-10-01 [2] CRAN (R 4.0.0)
withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2)
xfun 0.18 2020-09-29 [1] CRAN (R 4.0.2)
xml2 1.3.2 2020-04-23 [2] CRAN (R 4.0.0)
yaml * 2.2.1 2020-02-01 [2] CRAN (R 4.0.0)
[1] /Users/johnmuschelli/Library/R/4.0/library
[2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library