Bridges User Guide
A community dataset space allows Bridges users from different grants to share data in a common space. Bridges hosts both public and private datasets, providing rapid access for individuals, collaborations and communities with appropriate protections.
Data collections are stored on pylon5, Bridges' persistent file system. The space they use counts toward the Bridges storage allocation for the grant hosting them.
If you would like to host a data collection on Bridges, let us know what you need by completing the Community Dataset Request form.
Publicly available datasets
Some data collections are available to anyone with a Bridges account. They include:
- Natural Language Toolkit (NLTK) Data
NLTK comes with many corpora, toy grammars, trained models, etc. A complete list of the available data is posted at: http://nltk.org/nltk_data/
Available on Bridges at /pylon5/datasets/community/nltk
- Genomics Data
Several genomics datasets are publicly available.
Dataset            Available at
Barrnap            /pylon5/datasets/community/barrnap
BLAST              /pylon5/datasets/community/blast
CheckM             /pylon5/datasets/community/checkm
Dammit             /pylon5/datasets/community/dammit
Dammit uniref90    /pylon5/datasets/community/dammit_uniref90
Homer              /pylon5/datasets/community/homer
Longranger         /pylon5/datasets/community/longranger
MetaPhlAn2         /pylon5/datasets/community/metaphlan2
Prokka             /pylon5/datasets/community/prokka
Repbase            /pylon5/datasets/community/repbase
Whippet            /pylon5/datasets/community/whippet
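The paths listed above can be looked up from a script rather than typed by hand. The following Python snippet is a minimal sketch, not an official Bridges tool: the names COMMUNITY_ROOT, DATASETS, dataset_path, available_datasets and add_nltk_data_path are hypothetical helpers introduced for illustration, and only the directory paths themselves come from this guide. The NLTK helper assumes the nltk package is installed; it relies on nltk.data.path, the list of directories NLTK searches for data.

```python
import os

# Common parent directory of the community datasets listed in this guide.
COMMUNITY_ROOT = "/pylon5/datasets/community"

# Dataset name -> subdirectory, copied from the list above.
DATASETS = {
    "NLTK": "nltk",
    "Barrnap": "barrnap",
    "BLAST": "blast",
    "CheckM": "checkm",
    "Dammit": "dammit",
    "Dammit uniref90": "dammit_uniref90",
    "Homer": "homer",
    "Longranger": "longranger",
    "MetaPhlAn2": "metaphlan2",
    "Prokka": "prokka",
    "Repbase": "repbase",
    "Whippet": "whippet",
}


def dataset_path(name, root=COMMUNITY_ROOT):
    """Return the directory of a listed dataset (name is case-insensitive)."""
    lookup = {key.lower(): sub for key, sub in DATASETS.items()}
    try:
        return f"{root}/{lookup[name.lower()]}"
    except KeyError:
        raise KeyError(f"unknown community dataset: {name!r}") from None


def available_datasets(root=COMMUNITY_ROOT):
    """Return the listed datasets whose directories exist on this machine."""
    return [name for name in DATASETS if os.path.isdir(dataset_path(name, root))]


def add_nltk_data_path(root=COMMUNITY_ROOT):
    """Add the shared NLTK directory to NLTK's data search path.

    Assumes the nltk package is installed in the current environment.
    """
    import nltk.data

    shared = dataset_path("NLTK", root)
    if shared not in nltk.data.path:
        nltk.data.path.append(shared)
    return shared
```

For example, dataset_path("BLAST") returns /pylon5/datasets/community/blast, and available_datasets() returns an empty list on a machine where pylon5 is not mounted. Keeping the names in one dictionary means a script fails with a clear error on a misspelled dataset name instead of silently reading from a path that does not exist.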