Accessing eRI storage
What is the difference between a dataset and a project provisioning type?
Both types are filesets on the filesystem (storage).
The dataset consists of one fileset, in /tdc_persist/datasets/.
The project consists of two filesets, one in /tdc_persist/projects/ and the other in /tdc_scratch/projects/.
On AgR eRI login or compute nodes
Symbolic links are:
/agr/projects
/agr/datasets
/agr/scratch
Links are:
/tdc_persist/projects
/tdc_persist/datasets
/tdc_scratch/projects
Mount points are:
/mnt/gpfs/persist/ for both datasets and projects
/mnt/gpfs/scratch/ for project temporary working space
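To see how these layers fit together on a node, the links can be resolved from the shell. The following is a minimal sketch, not a supported tool: the function name show_link_target is hypothetical, and it assumes GNU coreutils readlink (with the -f option) is available, as is typical on Linux nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each given path with the location it finally
# resolves to. Assumes GNU coreutils `readlink -f`, which follows every
# symbolic link in the chain.
show_link_target() {
    local path
    for path in "$@"; do
        if [ -e "$path" ]; then
            printf '%s -> %s\n' "$path" "$(readlink -f "$path")"
        else
            printf '%s (not found on this node)\n' "$path"
        fi
    done
}

# Example call on a login or compute node:
# show_link_target /agr/projects /agr/datasets /agr/scratch
```

Paths that do not exist on the current node are reported rather than treated as errors, so the same call can be used on any node.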
See also How to access the AgResearch eRI compute cluster
Locating legacy data on eRI
To ease the transition from HPC to eRI, a legacy link farm has been created, with symbolic links from the legacy paths to the new locations. It is mounted on all eRI nodes (login and compute).
This addresses the problem of the many legacy scripts that have hardcoded paths to the legacy data locations.
Each legacy dataset directory contains links of the form:
active -> /agr/persist/projects/{uniqueId}/active
scratch -> /agr/persist/projects/{uniqueId}/scratch
itmp -> /agr/scratch/projects/{uniqueId}
archive -> /agr/persist/datasets/{uniqueId}
Example:
login-0$ ls -l /dataset/blastdata/
total 2
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 43 Jul 18 16:29 active -> /agr/persist/projects/2002-blastdata/active
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 44 Jul 18 16:29 scratch -> /agr/persist/projects/2002-blastdata/scratch
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 36 Jul 18 16:29 itmp -> /agr/scratch/projects/2002-blastdata
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 36 Jul 18 16:29 archive -> /agr/persist/datasets/2002-blastdata
login-0$ ls /dataset/blastdata/active/
454temp hs_faa_bak plant.protein.faa.psi swissprot.00.pnd temp UniVec.nhr
agall.seq.exists junkx1 plant.protein.faa.psq swissprot.00.pni tigr UniVec.nin
agfilter1.pl ln_mirror.sh plant.rna.fna.nhr swissprot.00.pog Tobacco_MF_assembl.exists UniVec.nsd
bgi_sheep ln_mirror.sh.bu1 plant.rna.fna.nin swissprot.00.ppd Tobacco_MF_assembly.fa.exists UniVec.nsi
blastdata.exists log plant.rna.fna.nnd swissprot.00.ppi UMD3_OA_v1 UniVec.nsq
BTA_OA_ver.2.exists merged.dmp plant.rna.fna.nni swissprot.00.psd uniprot_kb.fa UniVec.prop
bt_faa_bak mirror plant.rna.fna.nsd swissprot.00.psi uniprot_kb.fasta.exists unpack.sh
citations.dmp names.dmp plant.rna.fna.nsi swissprot.00.psq uniprot_sprot.fa.exists vector.fa
cs08.seq.exists nodes.dmp plant.rna.fna.nsq swissprot.pal uniprot_swissprot.fasta.exists vector.nhr
delnodes.dmp OA_chromosomes_ver.1.0 public_readonly taxbti.bti.bak UniVec vector.nin
division.dmp OAR_chromosomes_ver.1.0.exists readme.txt taxdb.btd UniVec_Core.exists vector.nnd
est.exists OARv3.0_masked_with_SNPs_and_indels.exists reorg1.sh.bu1 taxdb.btd.bu1 UniVec_Core.fa.exists vector.nni
gc.prt obsolete_deleteafter01112008 riceensembl taxdb.bti UniVec_Core.nhr vector.nsd
gencode.dmp plant.protein.faa.phr sheep_chr_OAR.exists taxdb.bti.bak UniVec_Core.nin vector.nsi
geneious_blast_template plant.protein.faa.pin sheep.v3.0.14th.final.fa.exists taxdb.bti.bu1 UniVec_Core.nsd vector.nsq
geneious_blast_template.0 plant.protein.faa.pnd species taxdb.tar.gz UniVec_Core.nsi vector.tar.gz
geneious_blast_testdb plant.protein.faa.pni stampfiles taxdb.tar.gz.1 UniVec_Core.nsq Wrightson_ESTs.exists
gi plant.protein.faa.pog swissprot.00.phr taxdb.tar.gz.2 UniVec_Core.prop
hg17 plant.protein.faa.psd swissprot.00.pin taxdump.tar.gz UniVec.exists
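When updating a legacy script, it can help to see where each legacy link now points. The sketch below lists the symbolic links directly inside a legacy directory together with their targets. The function name list_legacy_links is hypothetical; /dataset/blastdata is the example directory from the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every symbolic link directly inside a legacy
# directory together with the target stored in the link.
list_legacy_links() {
    local dir="$1" entry
    for entry in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$entry" ] || continue
        printf '%s -> %s\n' "$(basename "$entry")" "$(readlink "$entry")"
    done
}

# Example call on an eRI node:
# list_legacy_links /dataset/blastdata
```

The output gives the new location to substitute for each hardcoded legacy path.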
From a Windows client
S: \\storage.eri.agresearch.co.nz\datasets
M: \\storage.eri.agresearch.co.nz\projects
For standard workstations, a Group Policy will automatically map these with drive letters as above.
The storage can also be accessed as a network location using the options “Add a network location”, “Quick access”, or “Map network drive…” in Windows File Explorer while connected to the AgResearch LAN (or via VPN).
Data Recovery
Weekly snapshots of the scratch directory are kept for four weeks in:
/mnt/gpfs/scratch/.snapshots
This directory contains four subdirectories, one per snapshot, with names like scratch@GMT-2024.10.04-09.00.50.
In the listing below, the latest is /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects
login-0 ~ $ ll /mnt/gpfs/scratch/.snapshots
total 2
drwxr-xr-x. 8 root root 4096 Feb 19 2024 scratch@GMT-2024.09.13-10.00.48
drwxr-xr-x. 8 root root 4096 Feb 19 2024 scratch@GMT-2024.09.20-10.00.49
drwxr-xr-x. 8 root root 4096 Feb 19 2024 scratch@GMT-2024.09.27-10.00.50
drwxr-xr-x. 8 root root 4096 Feb 19 2024 scratch@GMT-2024.10.04-09.00.50
Example: if file.txt is lost from my project's scratch directory, it can be searched for with:
ls -l /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects/<project_id>/
and then copied back to where it is required:
cp /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects/<project_id>/file.txt /where_i_need_it/
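The manual steps above can be sketched as a small shell helper that picks the most recent snapshot and copies a file back. This is an illustration under assumptions, not a supported tool. The function names latest_snapshot and restore_from_snapshot are hypothetical. It also assumes snapshot directory names always follow the scratch@GMT-YYYY.MM.DD-HH.MM.SS pattern shown in the listing, so that sorting by name is also sorting by time.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and argument order are illustrative only.

# Print the newest snapshot directory under the given .snapshots directory.
# Relies on the scratch@GMT-YYYY.MM.DD-HH.MM.SS naming, which sorts by date.
latest_snapshot() {
    local snap_root="$1" dir newest=""
    for dir in "$snap_root"/scratch@GMT-*/; do
        [ -d "$dir" ] || continue
        newest="${dir%/}"    # glob expands in sorted order; the last one wins
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Copy one file from the newest snapshot of a project into a destination dir.
# Usage: restore_from_snapshot <snap_root> <project_id> <relative_file> <dest_dir>
restore_from_snapshot() {
    local snap_root="$1" project_id="$2" rel_file="$3" dest_dir="$4" snap src
    snap="$(latest_snapshot "$snap_root")" || return 1
    src="$snap/projects/$project_id/$rel_file"
    if [ ! -f "$src" ]; then
        echo "not found in latest snapshot: $src" >&2
        return 1
    fi
    cp -p "$src" "$dest_dir"/
}

# Example call (project id and destination are placeholders):
# restore_from_snapshot /mnt/gpfs/scratch/.snapshots <project_id> file.txt /where_i_need_it
```

If the file was already missing or damaged when the latest snapshot was taken, the helper will not find a good copy, and the older snapshot directories need to be checked by hand.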