Note

Work in progress: data is still being transferred and staged in the background, so not all of your datasets will have read/write access yet.
We refer to this as the “cutover process”. Please raise a request to confirm the status of your dataset.

...

  • New AgResearch staff (or those who have not used the legacy HPC) who plan to use the compute resources will need a username created and a home directory (/home/agresearch.co.nz/username) provisioned.

  • Do you have legacy data that needs to be moved (cut over)? This could take some time and requires activation by AgResearch.

    • Background: Most data has already been transferred from the legacy HPC to the new eRI HPC. Any data that has not been migrated will not be readable/writable until it has been cut over. This ensures that both copies of the data remain in sync.

    • Please make a request to support@cloud.nesi.org.nz

  • Do you need a new dataset/project (with compute allocation) created or will you be accessing another workspace?

    • If you are unsure what the difference between a dataset and a project is, check out the frequently asked questions page for more information.

    • Go to ColdFront at https://coldfront.eri.agresearch.co.nz/, a self-service tool that allows you to set up a project, apply for a share of compute resources, and perform administrative tasks such as managing other project members and checking the progress of the project, all in one place. This will give you access to a project directory, a scratch directory (for high-performance working storage within the HPC environment), and a share of HPC (i.e. compute/analysis) resources. For more information about this tool and how to connect to it, click here.

    • Important: when setting up a new project, you need to give it a title following the format yyyy-nameofproject, for example, 2077-micetrial.

    • If you experience difficulty getting set up with ColdFront, you can reach out for more support by emailing NeSI’s support desk.

  • How do I know if there is already data or a project?

  • What software/version might you need?

  • Finally, NeSI also offers regular online office hour sessions, hosted via Zoom. These sessions are open to anyone - you don't need to be an existing NeSI user. Information on when the next office hour will be hosted is here.
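The project title format mentioned in the checklist (yyyy-nameofproject) can be checked before a request is submitted. The following is a minimal sketch only: the function name valid_project_title is hypothetical, and the set of characters allowed in the name part (letters, digits, underscores) is an assumption based on the examples on this page, not a documented ColdFront rule.

```shell
# Hypothetical helper: check a proposed project title against the
# yyyy-nameofproject format (four digits, a hyphen, then a name).
# ASSUMPTION: the name part may contain letters, digits and underscores only.
valid_project_title() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[A-Za-z0-9_]+$'
}

# Example with the title used on this page.
if valid_project_title "2077-micetrial"; then
  echo "2077-micetrial matches yyyy-nameofproject"
fi
```

A title such as micetrial (no year prefix) would be rejected by this check.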

...

  • OPTIONAL: If working from outside the AgResearch local network, connect to the AgR VPN
    (or log in to Inscrutable or Iramohio first, e.g. from a Terminal application:
    ssh inscrutable.agresearch.co.nz).
    See also Connecting to the eRI compute cluster from Windows.
    Once logged on, continue to the next step.

  • Connect to an eRI login node:
    ssh username@agresearch.co.nz@login-0.eri.agresearch.co.nz
    (or login-1). Enter your password if prompted.

  • List the contents of your sandbox project folder:
    ls -la /agr/persist/projects/XXXX-abc_defghijklm

  • List the contents of your scratch folder:
    ls -la /agr/scratch/projects/XXXX-abc_defghijklm

  • Change into the scratch location:
    cd /agr/scratch/projects/XXXX-abc_defghijklm

  • More technical details are given here on how to connect to the AgResearch HPC from your computer.
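The connection steps above can be sketched as a small script that builds the commands for a given user and project. This is an illustration under stated assumptions: the helper names eri_login_target and eri_project_dirs are hypothetical, and jbloggs and 2077-micetrial are placeholders for your own username and project ID. The script prints the commands rather than running them.

```shell
# Placeholders only: replace with your own username and project ID.
ERI_USER="jbloggs"
ERI_PROJECT="2077-micetrial"

# Hypothetical helper: build the ssh destination for a login node.
# $1 = AgResearch username, $2 = login node number (0 or 1; defaults to 0).
eri_login_target() {
  printf '%s@agresearch.co.nz@login-%s.eri.agresearch.co.nz\n' "$1" "${2:-0}"
}

# Hypothetical helper: print the persistent and scratch directories
# for a project ID, using the paths shown in the steps above.
eri_project_dirs() {
  printf '/agr/persist/projects/%s\n/agr/scratch/projects/%s\n' "$1" "$1"
}

# Show the commands instead of running them, so they can be checked first.
echo "ssh $(eri_login_target "$ERI_USER")"
for d in $(eri_project_dirs "$ERI_PROJECT"); do
  echo "ls -la $d"
done
```

Passing 1 as the second argument to eri_login_target gives the login-1 form of the destination.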

...

Every AgResearch user can access the storage. Some datasets have tighter access restrictions.
Please raise a request to clarify the access permissions for your dataset.

See Accessing eRI storage

What is the difference between a dataset and a project provisioning type?

Both types are filesets on the filesystem (storage).
The dataset consists of one fileset in /tdc_persist/datasets/.
The project consists of two filesets, one in /tdc_persist/projects/ and the other in /tdc_scratch/projects/.

On AgR eRI login or compute nodes

Symbolic links are:

/agr/projects
/agr/datasets
/agr/scratch

Links are:

/tdc_persist/projects
/tdc_persist/datasets
/tdc_scratch/projects

Mount points are:

/mnt/gpfs/persist/ for both datasets and projects
/mnt/gpfs/scratch/ for temporary project working space
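To see how one of the links listed above resolves to its mount point on a login or compute node, the links can be followed with readlink. This is a minimal sketch: the helper name show_real_path is hypothetical, and it assumes the GNU coreutils version of readlink (with the -f option) is available. Paths that do not exist on the current machine are skipped.

```shell
# Hypothetical helper: print a path followed by its fully resolved location.
# ASSUMPTION: `readlink -f` (GNU coreutils) is available on the node.
show_real_path() {
  printf '%s -> %s\n' "$1" "$(readlink -f "$1")"
}

# Resolve the symbolic links listed above, skipping any that are absent.
for p in /agr/projects /agr/datasets /agr/scratch; do
  if [ -e "$p" ]; then
    show_real_path "$p"
  fi
done
```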

See also How to access the AgResearch eRI compute cluster

Locating legacy data on eRI

To ease the transition from the legacy HPC to eRI, a legacy link farm has been created, with symbolic links from the legacy paths to the new locations. This is mounted on all eRI nodes (login and compute).

This addresses the problem of the many legacy scripts that have hardcoded paths to the legacy data locations.

Info

active -> /agr/persist/projects/{uniqueId}/active
scratch -> /agr/persist/projects/{uniqueId}/scratch
itmp -> /agr/scratch/projects/{uniqueId}
archive -> /agr/persist/datasets/{uniqueId}

Example:

Code Block
login-0$ ls -l /dataset/blastdata/
total 2
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 43 Jul 18 16:29 active -> /agr/persist/projects/2002-blastdata/active
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 44 Jul 18 16:29 scratch -> /agr/persist/projects/2002-blastdata/scratch
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 36 Jul 18 16:29 itmp -> /agr/scratch/projects/2002-blastdata
lrwxrwxrwx. 1 eri_migration@iam.flexi.nesi.org.nz eri_migration@iam.flexi.nesi.org.nz 36 Jul 18 16:29 archive -> /agr/persist/datasets/2002-blastdata
login-0$ ls -l /dataset/blastdata/active/
454temp                    hs_faa_bak                                  plant.protein.faa.psi            swissprot.00.pnd  temp                            UniVec.nhr
agall.seq.exists           junkx1                                      plant.protein.faa.psq            swissprot.00.pni  tigr                            UniVec.nin
agfilter1.pl               ln_mirror.sh                                plant.rna.fna.nhr                swissprot.00.pog  Tobacco_MF_assembl.exists       UniVec.nsd
bgi_sheep                  ln_mirror.sh.bu1                            plant.rna.fna.nin                swissprot.00.ppd  Tobacco_MF_assembly.fa.exists   UniVec.nsi
blastdata.exists           log                                         plant.rna.fna.nnd                swissprot.00.ppi  UMD3_OA_v1                      UniVec.nsq
BTA_OA_ver.2.exists        merged.dmp                                  plant.rna.fna.nni                swissprot.00.psd  uniprot_kb.fa                   UniVec.prop
bt_faa_bak                 mirror                                      plant.rna.fna.nsd                swissprot.00.psi  uniprot_kb.fasta.exists         unpack.sh
citations.dmp              names.dmp                                   plant.rna.fna.nsi                swissprot.00.psq  uniprot_sprot.fa.exists         vector.fa
cs08.seq.exists            nodes.dmp                                   plant.rna.fna.nsq                swissprot.pal     uniprot_swissprot.fasta.exists  vector.nhr
delnodes.dmp               OA_chromosomes_ver.1.0                      public_readonly                  taxbti.bti.bak    UniVec                          vector.nin
division.dmp               OAR_chromosomes_ver.1.0.exists              readme.txt                       taxdb.btd         UniVec_Core.exists              vector.nnd
est.exists                 OARv3.0_masked_with_SNPs_and_indels.exists  reorg1.sh.bu1                    taxdb.btd.bu1     UniVec_Core.fa.exists           vector.nni
gc.prt                     obsolete_deleteafter01112008                riceensembl                      taxdb.bti         UniVec_Core.nhr                 vector.nsd
gencode.dmp                plant.protein.faa.phr                       sheep_chr_OAR.exists             taxdb.bti.bak     UniVec_Core.nin                 vector.nsi
geneious_blast_template    plant.protein.faa.pin                       sheep.v3.0.14th.final.fa.exists  taxdb.bti.bu1     UniVec_Core.nsd                 vector.nsq
geneious_blast_template.0  plant.protein.faa.pnd                       species                          taxdb.tar.gz      UniVec_Core.nsi                 vector.tar.gz
geneious_blast_testdb      plant.protein.faa.pni                       stampfiles                       taxdb.tar.gz.1    UniVec_Core.nsq                 Wrightson_ESTs.exists
gi                         plant.protein.faa.pog                       swissprot.00.phr                 taxdb.tar.gz.2    UniVec_Core.prop
hg17                       plant.protein.faa.psd                       swissprot.00.pin                 taxdump.tar.gz    UniVec.exists
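The link-farm layout described above (active, scratch, itmp, archive) can also be written down as a lookup, which may help when checking where a hardcoded legacy path will end up. This is a sketch only: the function name legacy_link_target is hypothetical, and it simply reproduces the four mappings listed on this page without touching the filesystem.

```shell
# Hypothetical helper mirroring the link-farm layout listed on this page.
# $1 = legacy link name (active, scratch, itmp or archive)
# $2 = unique ID of the project or dataset, e.g. 2002-blastdata
legacy_link_target() {
  case "$1" in
    active)  echo "/agr/persist/projects/$2/active" ;;
    scratch) echo "/agr/persist/projects/$2/scratch" ;;
    itmp)    echo "/agr/scratch/projects/$2" ;;
    archive) echo "/agr/persist/datasets/$2" ;;
    *)       echo "unknown legacy link: $1" >&2; return 1 ;;
  esac
}

# Where does the legacy itmp link for blastdata point?
legacy_link_target itmp 2002-blastdata
```

On an eRI node, the result can be compared with the real link, for example with readlink /dataset/blastdata/itmp.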

From a Windows client

S: \\storage.eri.agresearch.co.nz\datasets (S drive)
M: \\storage.eri.agresearch.co.nz\projects (M drive)

For standard workstations, a Group Policy will automatically map these to the drive letters above.

The storage can also be accessed as a network location using the options “Add a network location”, “Quick access”, or “Map network drive…” in the Windows File Explorer while connected to the AgResearch LAN (or via VPN).

 

Data Recovery

Weekly snapshots of the scratch directory are kept (4 weeks in total) under:

Code Block
/mnt/gpfs/scratch/.snapshots

This directory contains 4 subdirectories with names like scratch@GMT-2024.10.04-09.00.50.

In the listing below, the latest is /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects

Code Block
login-0 ~ $ ll /mnt/gpfs/scratch/.snapshots
total 2
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.09.13-10.00.48
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.09.20-10.00.49
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.09.27-10.00.50
drwxr-xr-x. 8 root root 4096 Feb 19  2024 scratch@GMT-2024.10.04-09.00.50

Example: if file.txt is lost from your project's scratch directory, you can look for it with
ls -l /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects/<project_id>/
and then copy it back to where it is required:
cp /mnt/gpfs/scratch/.snapshots/scratch@GMT-2024.10.04-09.00.50/projects/<project_id>/file.txt /where_i_need_it/
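The recovery steps above can be wrapped in a small script so that the newest snapshot does not have to be typed by hand. This is a minimal sketch under stated assumptions: the function names latest_snapshot and restore_from_snapshot and the SNAP_ROOT variable are hypothetical, and it assumes snapshot directory names keep the scratch@GMT-YYYY.MM.DD-HH.MM.SS pattern shown above, so that sorting by name also sorts by date. It restores single files only.

```shell
# ASSUMPTION: snapshots live here, as described above; override if needed.
SNAP_ROOT="${SNAP_ROOT:-/mnt/gpfs/scratch/.snapshots}"

# Hypothetical helper: print the name of the newest snapshot directory.
# Names embed a fixed-width timestamp, so a plain sort is chronological.
latest_snapshot() {
  ls -1 "$SNAP_ROOT" 2>/dev/null | LC_ALL=C sort | tail -n 1
}

# Hypothetical helper: copy one file back from the newest snapshot.
# $1 = project ID, $2 = file path relative to the project's scratch
# directory, $3 = destination (file or directory).
restore_from_snapshot() {
  snap="$(latest_snapshot)"
  if [ -z "$snap" ]; then
    echo "no snapshots found under $SNAP_ROOT" >&2
    return 1
  fi
  src="$SNAP_ROOT/$snap/projects/$1/$2"
  if [ ! -f "$src" ]; then
    echo "not found in latest snapshot: $src" >&2
    return 1
  fi
  # -p keeps the original timestamps and permissions where possible.
  cp -p -- "$src" "$3"
}
```

For example, restore_from_snapshot <project_id> file.txt /where_i_need_it/ would copy file.txt from the newest snapshot. If the file was deleted more than a week ago it may only exist in an older snapshot, which this sketch does not search.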

Accessing compute resources

...

How to access the AgResearch eRI compute cluster

Accessing Open OnDemand (OOD)

Browser Access - https://ondemand.eri.agresearch.co.nz/

Login - any one of the following formats should work

first.last@agresearch.co.nz

userid@agresearch.co.nz (this is the format most likely to work)

agresearch\userid

Password - your AgResearch password

Authentication

See the Authentication section of How to access the AgResearch eRI compute cluster.

...