Getting started on AgResearch eRI

Work in progress: data transfer and staging are still running in the background, so not all of your datasets will have read/write access yet.
We refer to this as the “cutover process”. Please raise a request to confirm the status of your dataset.

Getting started

When moving over to eRI, there is some information you can prepare or think through in advance to help smooth the process.

New AgResearch staff (or those who have not used the legacy HPC) who plan to use the compute resources will need a username created and a home directory (/home/agresearch.co.nz/username) provisioned.
- Do this by emailing NeSI’s support desk at support@cloud.nesi.org.nz.
- or via the eRI Support Portal
Do you have legacy data that needs to be moved (cut over)? This could take some time and requires activation by AgResearch.
- Background: Most data has already been transferred from the legacy HPC to the new eRI HPC. Any data that has not been migrated will not be readable/writable until it has been cut over. This ensures both data sets remain in sync.
- Please make a request to support@cloud.nesi.org.nz
Do you need a new dataset/project (with compute allocation) created or will you be accessing another workspace?
- If you are unsure what the difference between a dataset and a project is, check out the frequently asked questions page for more information.
- Go to ColdFront at https://coldfront.eri.agresearch.co.nz/, a self-service tool that allows you to set up a project, apply for a share of compute resources and perform administrative tasks such as managing other project members and checking the progress of the project all in one place. For more information about this tool and how to connect to it, click here.
- Important: when setting up a new project, you need to give it a title following the format yyyy-nameofproject, for example, 2077-micetrial.
- If you experience difficulty getting set up with ColdFront, you can reach out for more support by emailing NeSI’s support desk.
How do I know if there is already data or a project?
- See How do I check my allocations on eRI?
What software/version might you need?
- If you are unsure what software is available, please ask and we can help build and/or make it available.
- How to find and load software (aka modules)
Finally, NeSI also offers regular online office hour sessions, hosted via Zoom. These sessions are open to anyone - you don't need to be an existing NeSI user. Information on when the next office hour will be hosted is here.
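
The yyyy-nameofproject title format mentioned above can be checked before you submit a project in ColdFront. The following is a minimal sketch only: the function name is hypothetical, and it assumes that "yyyy" means exactly four digits and that the name part contains only letters, digits or underscores. ColdFront itself may apply different rules.

```shell
# Hypothetical helper, not part of ColdFront.
# Assumption: a valid title is four digits, a hyphen, then a non-empty name
# made of letters, digits or underscores (e.g. 2077-micetrial).
is_valid_project_title() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[A-Za-z0-9_]+$'
}

# Try one title that follows the format and one that does not.
for title in 2077-micetrial micetrial; do
    if is_valid_project_title "$title"; then
        echo "$title: matches yyyy-nameofproject"
    else
        echo "$title: does not match yyyy-nameofproject"
    fi
done
```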

Logging in

OPTIONAL: If working from outside the AgResearch local network, connect to the AgR VPN first (or log in to Inscrutable or Iramohio first, e.g. via a Terminal application):
ssh inscrutable.agresearch.co.nz
See also Connecting to the eRI compute cluster from Windows
Once logged on, connect to an eRI login node (login-0 or login-1) and enter your password if prompted:
ssh username@agresearch.co.nz@login-0.eri.agresearch.co.nz
List the contents of your sandbox project folder:
ls -la /agr/persist/projects/XXXX-abc_defghijklm
List the contents of your scratch folder:
ls -la /agr/scratch/projects/XXXX-abc_defghijklm
Change into the scratch location:
cd /agr/scratch/projects/XXXX-abc_defghijklm
More technical details are given here on how to connect to the AgResearch HPC from your computer.
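
The login target above contains two @ signs, which is easy to mistype. As a rough sketch, a small shell function can assemble it from a username and a login node number. The function name and the username jsmith are placeholders for illustration, not real accounts or tools.

```shell
# Hypothetical helper: build the ssh target for an eRI login node.
# $1 = AgResearch username (placeholder below), $2 = login node number (default 0).
eri_login_target() {
    printf '%s@agresearch.co.nz@login-%s.eri.agresearch.co.nz\n' "$1" "${2:-0}"
}

# Print the command rather than running it, so this is safe to try anywhere.
echo "ssh $(eri_login_target jsmith)"
echo "ssh $(eri_login_target jsmith 1)"
```

Replace jsmith with your own username, then run the printed command to connect.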

Accessing storage resources

Every AgResearch user can access the storage. Some datasets have tighter access restrictions.
Please raise a request to clarify the access permissions for your dataset.

See Accessing eRI storage

What is the difference between a dataset and a project provisioning type?

Both types are filesets on the filesystem (storage).
The dataset consists of one fileset in /tdc_persist/datasets/.
The project consists of two filesets, one in /tdc_persist/projects/ and the other in /tdc_scratch/projects/.
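
As an illustration of the difference, the sketch below prints the filesets that belong to each provisioning type, using the locations given above. The function name is hypothetical, and the identifier 2002-blastdata is used only as an example.

```shell
# Hypothetical helper: list the filesets behind a provisioning type.
# $1 = type ("dataset" or "project"), $2 = identifier such as 2002-blastdata.
filesets_for() {
    case "$1" in
        dataset)
            # A dataset is a single persistent fileset.
            echo "/tdc_persist/datasets/$2"
            ;;
        project)
            # A project has a persistent fileset and a scratch fileset.
            echo "/tdc_persist/projects/$2"
            echo "/tdc_scratch/projects/$2"
            ;;
        *)
            echo "unknown provisioning type: $1" >&2
            return 1
            ;;
    esac
}

filesets_for dataset 2002-blastdata
filesets_for project 2002-blastdata
```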

On AgR eRI login or compute nodes

Symbolic links are:

/agr/projects
/agr/datasets
/agr/scratch

Links are:
/tdc_persist/projects
/tdc_persist/datasets
/tdc_scratch/projects
Mount points are:

/mnt/gpfs/persist/ for both datasets and projects, and
/mnt/gpfs/scratch/ for temporary project working space

Locating legacy data on eRI

In order to ease the transition from HPC to eRI, a legacy link farm has been created, with symbolic links from the legacy paths to the new locations. This is mounted on all eRI nodes (login and compute).

This addresses the problem of the many legacy scripts that have hardcoded paths to the legacy data locations.

active -> /agr/persist/projects/{uniqueId}/active
scratch -> /agr/persist/projects/{uniqueId}/scratch
itmp -> /agr/scratch/projects/{uniqueId}
archive -> /agr/persist/datasets/{uniqueId}

Example:

login-0$ ls -l /dataset/blastdata/
total 2
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 43 Jul 18 16:29 active -> /agr/persist/projects/2002-blastdata/active
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 44 Jul 18 16:29 scratch -> /agr/persist/projects/2002-blastdata/scratch
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 36 Jul 18 16:29 itmp -> /agr/scratch/projects/2002-blastdata
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 36 Jul 18 16:29 archive -> /agr/persist/datasets/2002-blastdata
login-0$ ls -l /dataset/blastdata/active/
454temp                    hs_faa_bak                                  plant.protein.faa.psi            swissprot.00.pnd  temp                            UniVec.nhr
agall.seq.exists           junkx1                                      plant.protein.faa.psq            swissprot.00.pni  tigr                            UniVec.nin
agfilter1.pl               ln_mirror.sh                                plant.rna.fna.nhr                swissprot.00.pog  Tobacco_MF_assembl.exists       UniVec.nsd
bgi_sheep                  ln_mirror.sh.bu1                            plant.rna.fna.nin                swissprot.00.ppd  Tobacco_MF_assembly.fa.exists   UniVec.nsi
blastdata.exists           log                                         plant.rna.fna.nnd                swissprot.00.ppi  UMD3_OA_v1                      UniVec.nsq
BTA_OA_ver.2.exists        merged.dmp                                  plant.rna.fna.nni                swissprot.00.psd  uniprot_kb.fa                   UniVec.prop
bt_faa_bak                 mirror                                      plant.rna.fna.nsd                swissprot.00.psi  uniprot_kb.fasta.exists         unpack.sh
citations.dmp              names.dmp                                   plant.rna.fna.nsi                swissprot.00.psq  uniprot_sprot.fa.exists         vector.fa
cs08.seq.exists            nodes.dmp                                   plant.rna.fna.nsq                swissprot.pal     uniprot_swissprot.fasta.exists  vector.nhr
delnodes.dmp               OA_chromosomes_ver.1.0                      public_readonly                  taxbti.bti.bak    UniVec                          vector.nin
division.dmp               OAR_chromosomes_ver.1.0.exists              readme.txt                       taxdb.btd         UniVec_Core.exists              vector.nnd
est.exists                 OARv3.0_masked_with_SNPs_and_indels.exists  reorg1.sh.bu1                    taxdb.btd.bu1     UniVec_Core.fa.exists           vector.nni
gc.prt                     obsolete_deleteafter01112008                riceensembl                      taxdb.bti         UniVec_Core.nhr                 vector.nsd
gencode.dmp                plant.protein.faa.phr                       sheep_chr_OAR.exists             taxdb.bti.bak     UniVec_Core.nin                 vector.nsi
geneious_blast_template    plant.protein.faa.pin                       sheep.v3.0.14th.final.fa.exists  taxdb.bti.bu1     UniVec_Core.nsd                 vector.nsq
geneious_blast_template.0  plant.protein.faa.pnd                       species                          taxdb.tar.gz      UniVec_Core.nsi                 vector.tar.gz
geneious_blast_testdb      plant.protein.faa.pni                       stampfiles                       taxdb.tar.gz.1    UniVec_Core.nsq                 Wrightson_ESTs.exists
gi                         plant.protein.faa.pog                       swissprot.00.phr                 taxdb.tar.gz.2    UniVec_Core.prop
hg17                       plant.protein.faa.psd                       swissprot.00.pin                 taxdump.tar.gz    UniVec.exists
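
To see where a legacy path now points without reading the full ls -l output, the links can be resolved one by one. This is a minimal sketch: the function name is hypothetical, and it assumes the readlink command is available, as it is on typical Linux systems.

```shell
# Hypothetical helper: print each symbolic link in a directory with its target.
# $1 = a legacy path such as /dataset/blastdata
show_link_targets() {
    for entry in "$1"/*; do
        # Skip anything that is not a symbolic link (including an unmatched glob).
        [ -L "$entry" ] || continue
        printf '%s -> %s\n' "$(basename "$entry")" "$(readlink "$entry")"
    done
}

# Example call; prints nothing if the path does not exist on this machine.
show_link_targets /dataset/blastdata
```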

From a Windows client

S: \\storage.eri.agresearch.co.nz\datasets (S drive)
M: \\storage.eri.agresearch.co.nz\projects (M drive)

For standard workstations, a Group Policy will automatically map these with drive letters as above.

The storage can also be accessed as a network location using the options “Add a network location”, “Quick access”, or “Map network drive…” in the Windows File Explorer while connected to the AgResearch LAN (or via VPN).

Data Recovery

Weekly snapshots of the scratch directory are kept (4 weeks in total) under

/mnt/gpfs/scratch/.snapshots

which holds 4 subdirectories with names like scratch@GMT-2024.10.04-09.00.50

In this example, the latest snapshot of the projects scratch space is /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects

login-0 ~ $ ll /mnt/gpfs/scratch/.snapshots
total 2
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.09.13-10.00.48
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.09.20-10.00.49
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.09.27-10.00.50
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.10.04-09.00.50

Example: if file.txt is lost from my project's scratch directory, it can be searched for with:
ls -l /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects/<project_id>/
and then copied back to where it is required:
cp /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects/<project_id>/file.txt /where_i_need_it/
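
Because the snapshot names change every week, it can help to look up the newest one rather than typing it by hand. The sketch below is one possible approach. The function name is hypothetical, and it assumes snapshot names keep the scratch@GMT-YYYY.MM.DD-HH.MM.SS pattern shown in the listing above, so that sorting them as text also sorts them by date.

```shell
# Hypothetical helper: print the name of the newest snapshot in a directory.
# $1 = snapshot directory, e.g. /mnt/gpfs/scratch/.snapshots
latest_snapshot() {
    name=""
    # The shell expands the glob in sorted order, so the last match is the newest.
    for dir in "$1"/scratch@GMT-*; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
    done
    if [ -n "$name" ]; then
        echo "$name"
    else
        return 1
    fi
}

snapshot_root=/mnt/gpfs/scratch/.snapshots
if newest=$(latest_snapshot "$snapshot_root"); then
    echo "Newest snapshot: $snapshot_root/$newest/projects"
else
    echo "No snapshots found under $snapshot_root"
fi
```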

Accessing compute resources

Compute access is currently handled via a service request.

How to access the AgResearch eRI compute cluster

Accessing Open OnDemand (OOD)

Browser Access - https://ondemand.eri.agresearch.co.nz/

Login - use one of the following formats:

first.last@agresearch.co.nz

userid@agresearch.co.nz (this is the most likely to work)

agresearch\userid

Password - your AgResearch password

Authentication

How to get help?

How do I get help?

AgResearch eResearch Infrastructure