PRESENTED BY:
HPC Exercises
Interacting with High Performance
Computing Systems
6/7/18 1
Virginia Trueheart, MSIS
Texas Advanced Computing Center
vtrueheart@tacc.utexas.edu
Logging In
• To access the TACC machines, you will need to log in using a
terminal or SSH client.
• SSH is an encrypted network protocol for accessing a secure system
over an unsecured network.
• The following example is for the XSEDE login.
• You can also log in directly to the TACC machine, but your password may be
different.
Logging In (Mac Terminal)
$ ssh -l <username> login.xsede.org
Please login to this system using your XSEDE username and password:
password:
Duo two-factor login for <username>
Enter a passcode or select one of the following options
1. Duo Push to XXX-XXX-XXXX
2. Phone call to XXX-XXX-XXXX
Passcode or option (1-2):
Logging In pt. 2
# For example, to login to the Comet system at SDSC, enter: gsissh comet
#
# Email help@xsede.org if you require assistance in the use of this system.
[username@ssohub ~]$ gsissh stampede2
Interacting with the System
After logging in you will be able to see the TACC Info box which will tell
you what projects you are associated with and how much of the file
system you have used.
Welcome to Stampede2, *please* read these important system notes:
--> Stampede2, Phase 2 Skylake nodes are now available for jobs
--> Stampede2 user documentation is available at:
https://portal.tacc.utexas.edu/user-guides/stampede2
----------------------- Project balances for user vtrue -----------------------
| Name           Avail SUs     Expires |                                      |
| A-ccsc            189624  2018-12-31 |                                      |
------------------------- Disk quotas for user vtrue --------------------------
| Disk         Usage (GB)     Limit    %Used   File Usage       Limit   %Used |
| /home1              1.9      10.0    19.43        39181      200000   19.59 |
| /work             311.8    1024.0    30.45       225008     3000000    7.50 |
| /scratch            0.0       0.0     0.00            4           0    0.00 |
-------------------------------------------------------------------------------
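The %Used columns in the quota table appear to be usage divided by limit. As a minimal sketch, assuming a hypothetical helper named percent_used (it is not a TACC tool), the file-count percentages can be checked like this:

```python
def percent_used(usage, limit):
    """Return usage as a percentage of limit, rounded to two decimals.

    A limit of 0 (as shown for /scratch above) is treated here as
    'no quota' and reported as 0.0 to avoid dividing by zero.
    """
    if limit == 0:
        return 0.0
    return round(100.0 * usage / limit, 2)


# File-count figures copied from the quota table above.
print(percent_used(39181, 200000))    # /home1
print(percent_used(225008, 3000000))  # /work
```

The Usage (GB) column is already rounded in the display, so recomputing the disk percentages from it may differ slightly from the printed values.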
Creating a File
Terminal-based text editors can be very useful when interacting with the system,
so let's use a very simple one (nano) to create a file that we can use to
execute some code.
Create a File
login1.stampede2$ cd $WORK
login1.stampede2$ pwd
/work/03658/vtrue/stampede2
login1.stampede2$ nano helloWorld.py
Editing a File
• You should now be in the editing environment for the
helloWorld.py file.
• Type in the code found on the next slide to create the
contents of the file.
• Press Ctrl+X to exit, type Y when prompted to save the file, and press Enter
to confirm the file name.
A Very Small File
#!/usr/bin/env python
"""
Hello World
"""
import datetime as DT

today = DT.datetime.today()
# Parenthesized print works under both Python 2 and Python 3.
print("Hello World! Today is:")
print(today.strftime("%d %b %Y"))
Executing Our File
• Running code on the login nodes is prohibited because they are a shared
resource.
• To run the small script we have written, we first need to
start an idev (interactive development) session on a compute node.
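As a guard against accidentally running on a login node, a script can inspect the hostname before doing any work. The following is only a sketch: the helper name looks_like_login_node is hypothetical, and it assumes login nodes are named "login<N>...", as the "login1.stampede2$" prompt in this walkthrough suggests. Check the naming on your own system before relying on it.

```python
import socket


def looks_like_login_node(hostname):
    """Guess whether a hostname belongs to a login node.

    Assumption: login nodes are named 'login<N>...', as in the
    'login1.stampede2$' prompt in this walkthrough.
    """
    return hostname.split(".")[0].startswith("login")


if __name__ == "__main__":
    host = socket.gethostname()
    if looks_like_login_node(host):
        print("On a login node (%s): start an idev session first." % host)
    else:
        print("Not a login node (%s): OK to run." % host)
```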
helloWorld.py
staff.stampede2(1005)$ idev
-> Checking on the status of development queue. OK
-> Defaults file : ~/.idevrc
-> System : stampede2
-> Queue : development (idev default )
[...]
c455-012[knl](1019)$ python helloWorld.py
Hello World! Today is:
17 Jun 2018
c455-012[knl](1020)$
Do More with Our File
• Now that we have seen that helloWorld.py runs on the compute node
and produces output, let’s test the parallel aspects of running on a
node, namely, using all of the cores on the node.
• Type ‘nano helloWorld.py’ to reopen your file and begin editing it
again.
• Replace its contents with the Python code on the next slide, then press Ctrl+X
again to save your changes.
#!/usr/bin/env python
"""
Parallel Hello World
"""
from mpi4py import MPI
import sys

size = MPI.COMM_WORLD.Get_size()   # total number of MPI tasks
rank = MPI.COMM_WORLD.Get_rank()   # this task's ID, from 0 to size - 1
name = MPI.Get_processor_name()    # name of the node running this task

sys.stdout.write(
    "Hello, World! I am process %d of %d on %s.\n"
    % (rank, size, name))
Running in Parallel
• Even though this is still Python code, we are now taking advantage
of the parallel capabilities of the node.
• As such, you need to launch your code with ‘ibrun’ instead of just
typing ‘python’.
• When you run this code you will see one line of output from each core on
the node.
Parallel helloWorld.py
c455-012[knl](1019)$ ibrun python helloWorld.py
TACC: Starting up job 1595632
TACC: Starting parallel tasks...
Hello, World! I am process 1 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 49 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 66 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 67 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 64 of 68 on c456-042.stampede2.tacc.utexas.edu.
...
TACC: Shutdown complete. Exiting.
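Each process in the output above knows its own rank and the total number of processes, and that pair is typically what a parallel program uses to decide which share of the work belongs to it. The sketch below illustrates one common way to split a list of items evenly across ranks. The helper block_for_rank and the figure of 1000 items are made up for illustration, and the fallback to a single task is only there so the sketch also runs where mpi4py is not installed.

```python
def block_for_rank(n_items, rank, size):
    """Return the half-open range [start, stop) of items owned by `rank`.

    Items are split as evenly as possible; the first `n_items % size`
    ranks each take one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    try:
        from mpi4py import MPI
        rank = MPI.COMM_WORLD.Get_rank()
        size = MPI.COMM_WORLD.Get_size()
    except ImportError:
        # Fallback so the sketch also runs without MPI, as a single task.
        rank, size = 0, 1
    start, stop = block_for_rank(1000, rank, size)
    print("Rank %d of %d handles items %d..%d" % (rank, size, start, stop - 1))
```

Launched with ibrun on 68 tasks, every rank would report a different, non-overlapping slice of the 1000 items.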
Exiting an idev Session
• Great! Now that you can see what interactive jobs look like we can go
on to more advanced job submission.
• To leave an idev session simply type ‘exit’. Let’s do that now.
Submitting a Job
• The previous two examples ran a job interactively. This means you
could be on the node and see the output generated as it happened.
• This isn’t always practical though so we need a way to submit jobs
and then leave them to the system to run whenever nodes become
available.
• To do this we’ll take advantage of the SLURM system.
Create a New Nano File
• Create a new file called batchJob.sh
• Input the code found on the next slide and save the file.
An Example SLURM Batch File
#!/bin/bash
#SBATCH -J myJob # Job name
#SBATCH -o myJob.o%j # Name of stdout output file
#SBATCH -e myJob.e%j # Name of stderr error file
#SBATCH -p development # Queue (partition) name
#SBATCH -N 1 # Total # of nodes
#SBATCH -n 68 # Total # of mpi tasks
#SBATCH -t 00:05:00 # Run time (hh:mm:ss)
#SBATCH -A myproject # Allocation name (req'd if you have more than 1)
#SBATCH --mail-user=hkang@austin.utexas.edu
#SBATCH --mail-type=all # Send email at begin and end of job
# Other commands must follow all #SBATCH directives...
module list
pwd
date
# Launch code...
ibrun python helloWorld.py
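The -t directive in the batch file gives the run time limit as hh:mm:ss. As a small sketch, assuming a hypothetical helper named walltime_to_seconds, the value can be converted to seconds to sanity-check a request before submitting. Slurm accepts other time formats as well; this sketch handles only the hh:mm:ss form used above.

```python
def walltime_to_seconds(walltime):
    """Convert an 'hh:mm:ss' string, as passed to '#SBATCH -t', to seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in 0..59: %r" % walltime)
    return hours * 3600 + minutes * 60 + seconds


print(walltime_to_seconds("00:05:00"))  # the 5-minute limit requested above
```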
Submitting the Batch Job
• To submit the SLURM Batch job use the following command
• sbatch batchJob.sh
• This should generate a text output to let you know about the
parameters of the job and then provide you with a Job ID once the
job has successfully been admitted to the queues.
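If you want to capture the Job ID in a script rather than read it off the screen, one approach is to look for the "Submitted batch job <ID>" line that sbatch prints on success. The sketch below assumes a hypothetical helper named parse_job_id, and the sample string is illustrative only; real output on a given system will include additional site-specific messages.

```python
import re


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's 'Submitted batch job N' line.

    Returns None if no such line is present (for example, if the
    submission was rejected).
    """
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return int(match.group(1)) if match else None


# Illustrative text only; real output includes site-specific messages.
sample = "Submitted batch job 1604426\n"
print(parse_job_id(sample))
```

In practice the text could come from subprocess.run(["sbatch", "batchJob.sh"], capture_output=True, text=True).stdout.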
Checking the Status of a Job
• Once a job is submitted to the queues, you can’t watch it run the
way you could when you were running interactively.
• Instead, we use monitoring commands to see what state our job is in.
• The command ‘squeue’ is very useful for this and has several flags
available to control its output. Let’s use the -u flag to see the status of all
jobs under our username.
staff.stampede2(1009)$ squeue -u vtrue
JOBID    PARTITION    NAME      USER   ST  TIME   NODES  NODELIST(REASON)
1604426  development  idv20717  vtrue  R   16:57  1      c455-001
staff.stampede2(1010)$ scontrol show job=1604426
JobId=1604426 JobName=idv20717
UserId=vtrue(829572) GroupId=G-815499(815499) MCS_label=N/A
Priority=400 Nice=0 Account=A-ccsc QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:18:08 TimeLimit=00:30:00 TimeMin=N/A
SubmitTime=2018-06-09T21:27:33 EligibleTime=2018-06-09T21:27:33
StartTime=2018-06-09T21:27:36 EndTime=2018-06-09T21:57:36 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2018-06-09T21:27:36
...
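The scontrol output above is a series of Key=Value pairs, which makes it straightforward to read from a script. The sketch below uses a hypothetical helper named parse_scontrol and a sample copied from a few of the lines above. It assumes values contain no spaces, which holds for the fields shown here but not for every field scontrol can print.

```python
def parse_scontrol(text):
    """Parse whitespace-separated Key=Value pairs into a dict.

    Assumes values contain no spaces; tokens without '=' are skipped.
    """
    fields = {}
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


sample = """JobId=1604426 JobName=idv20717
JobState=RUNNING Reason=None Dependency=(null)
RunTime=00:18:08 TimeLimit=00:30:00 TimeMin=N/A"""

job = parse_scontrol(sample)
print(job["JobState"], job["RunTime"], job["TimeLimit"])
```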
Cleaning Up Jobs
• When your batch job has finished running, it will automatically be
cleared from the queues.
• Your output will be in the location you pointed it to within your batch
job file.
• If for some reason you wish to cancel your job while it is still running,
you can do so with ‘scancel <jobID>’.

HPC Examples

  • 1. PRESENTED BY: HPC Exercises Interacting with High Performance Computing Systems 6/7/18 1 VirginiaTrueheart, MSIS Texas Advanced Computing Center vtrueheart@tacc.utexas.edu
  • 2. Logging In • In order to access the TACC machines you will need to login using a terminal or SSH client • SSH is an encrypted network protocol to access a secure system over an unsecured network • The following example is for XSEDE Login • You can also log directly into the TACC machine but your password may be different.
  • 3. Logging In (Mac Terminal) $ ssh –l <username> login.xsede.org Please login to this system using your XSEDE username and password: password: Duo two-factor login for <username> Enter a passcode or select one of the following options 1. Duo Push to XXX-XXX-XXXX 2. Phone call to XXX-XXX-XXXX Passcode or option (1-2): 6/7/18 3
  • 4. Logging In pt. 2
      # For example, to login to the Comet system at SDSC, enter: gsissh comet
      #
      # Email help@xsede.org if you require assistance in the use of this system.
      [username@ssohub ~]$ gsissh stampede2
  • 5. Interacting with the System
      After logging in you will see the TACC info box, which tells you which projects you are associated with and how much of the file system you have used.
  • 6. Welcome to Stampede2, *please* read these important system notes:
      --> Stampede2, Phase 2 Skylake nodes are now available for jobs
      --> Stampede2 user documentation is available at:
          https://portal.tacc.utexas.edu/user-guides/stampede2
      ----------------------- Project balances for user vtrue -----------------------
      | Name           Avail SUs     Expires |                                      |
      | A-ccsc            189624  2018-12-31 |                                      |
      ------------------------- Disk quotas for user vtrue --------------------------
      | Disk         Usage (GB)     Limit    %Used   File Usage       Limit   %Used |
      | /home1              1.9      10.0    19.43        39181      200000   19.59 |
      | /work             311.8    1024.0    30.45       225008     3000000    7.50 |
      | /scratch            0.0       0.0     0.00            4           0    0.00 |
      -------------------------------------------------------------------------------
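
The disk quota rows in the info box follow a fixed column layout, so they are easy to check programmatically. A minimal Python sketch, assuming the rows have been copied into a string as shown above; the helper names `parse_quotas` and `percent_used` and the 80% warning threshold are illustrative choices, not part of any TACC tool:

```python
def parse_quotas(text):
    """Parse '| /fs usage limit %used files file_limit %used |' rows into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.strip().strip("|").split()
        # Data rows have seven columns and start with a mount point such as /home1.
        if len(fields) != 7 or not fields[0].startswith("/"):
            continue
        rows.append({
            "disk": fields[0],
            "usage_gb": float(fields[1]),
            "limit_gb": float(fields[2]),
            "files": int(fields[4]),
            "file_limit": int(fields[5]),
        })
    return rows


def percent_used(used, limit):
    """Return usage as a percentage, or None when the reported limit is 0."""
    return None if limit == 0 else 100.0 * used / limit


# Rows copied from the info box above.
SAMPLE = """
| /home1 1.9 10.0 19.43 39181 200000 19.59 |
| /work 311.8 1024.0 30.45 225008 3000000 7.50 |
| /scratch 0.0 0.0 0.00 4 0 0.00 |
"""

for row in parse_quotas(SAMPLE):
    pct = percent_used(row["usage_gb"], row["limit_gb"])
    label = "no limit reported" if pct is None else "%.2f%% of space used" % pct
    # Assumption: 80% is an arbitrary threshold chosen for this example.
    flag = "  <-- WARNING" if pct is not None and pct > 80.0 else ""
    print("%-10s %s%s" % (row["disk"], label, flag))
```

The same `percent_used` helper works for the file-count columns, which have their own limits.
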
  • 7. Creating a File
      Inline text editors can be very useful when interacting with the system, so let’s use a very simple one (nano) to create a file that we can use to execute some code.
  • 8. Create a File
      login1.stampede2$ cd $WORK
      login1.stampede2$ pwd
      /work/03658/vtrue/stampede2
      login1.stampede2$ nano helloWorld.py
  • 9. Editing a File
      • You should now be sitting in the editing environment of the helloWorld.py file.
      • You can type in the code found on the next slide in order to create the contents of the file.
      • Type Ctrl+X to exit and type Y to save the file when you are prompted.
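  • 10. A Very Small File
      #!/usr/bin/env python
      """
      Hello World
      """
      import datetime as DT

      # Current date and time on the node running the script
      today = DT.datetime.today()
      # print() with a single argument behaves the same in Python 2 and Python 3
      print("Hello World! Today is:")
      print(today.strftime("%d %b %Y"))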
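
The format string passed to strftime controls how the date is printed: %d is the zero-padded day of the month, %b the abbreviated month name, and %Y the four-digit year. A small sketch that should run under both Python 2 and Python 3; the helper name `format_greeting` is hypothetical, and the fixed date is only there to make the result repeatable:

```python
import datetime as DT


def format_greeting(day):
    """Return the two lines the hello-world script prints for a given date."""
    return ["Hello World! Today is:", day.strftime("%d %b %Y")]


# Today's date, as in the slide.
for line in format_greeting(DT.datetime.today()):
    print(line)

# A fixed date makes the formatting easy to inspect.
# Note: %b follows the process locale, so the month name can differ
# on systems that set a non-English locale.
fixed = format_greeting(DT.datetime(2018, 6, 17))
print(fixed[1])
```
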
  • 11. Executing Our File
      • It is prohibited to run code on the login nodes, as they are a shared resource.
      • To run the little code we have written, we first need to start an idev (interactive development) session.
  • 12. helloWorld.py
      staff.stampede2(1005)$ idev
      -> Checking on the status of development queue. OK
      -> Defaults file : ~/.idevrc
      -> System : stampede2
      -> Queue : development (idev default )
      [...]
      c455-012[knl](1019)$
  • 13. helloWorld.py
      staff.stampede2(1005)$ idev
      -> Checking on the status of development queue. OK
      -> Defaults file : ~/.idevrc
      -> System : stampede2
      -> Queue : development (idev default )
      [...]
      c455-012[knl](1019)$ python helloWorld.py
      Hello World! Today is:
      17 Jun 2018
      c455-012[knl](1020)$
  • 14. Do More with Our File
      • Now that we have seen helloWorld.py run on the compute node and produce output, let’s test the parallel aspects of running on a node: namely, accessing all of the cores on the node.
      • Type ‘nano helloWorld.py’ to reopen your file and begin editing it again.
      • Input the Python code you see on the next slide, then press Ctrl+X again and save your changes. Note that the later slides run this script as helloParallel.py, so either save it under that name or adjust the commands to match.
  • 15. #!/usr/bin/env python
      """
      Parallel Hello World
      """
      from mpi4py import MPI
      import sys

      size = MPI.COMM_WORLD.Get_size()   # total number of MPI tasks
      rank = MPI.COMM_WORLD.Get_rank()   # this task's ID, from 0 to size - 1
      name = MPI.Get_processor_name()    # host name of the node running this task

      sys.stdout.write(
          "Hello, World! I am process %d of %d on %s.\n"
          % (rank, size, name))
  • 16. Running in Parallel
      • Even though this is still Python code, we are now taking advantage of the parallel capabilities of the node.
      • As such, you need to start your code with ‘ibrun’ instead of just typing ‘python’.
      • When you run this code you will receive feedback from each core on the node.
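
Printing a greeting from every task only shows that the tasks exist. In practice each task usually uses its rank to pick its own share of the work. The sketch below illustrates one common way to do that, a block decomposition; it is not taken from the slides. The helper `block_range` and the problem size of 1000 items are hypothetical, and the script falls back to a single task when mpi4py is not installed so that it can also be tried outside the cluster:

```python
try:
    from mpi4py import MPI
    rank = MPI.COMM_WORLD.Get_rank()
    size = MPI.COMM_WORLD.Get_size()
except ImportError:
    # Assumption: without mpi4py we behave like a one-task job.
    rank, size = 0, 1


def block_range(rank, size, n_items):
    """Return the half-open [start, stop) range of items owned by this rank.

    The first (n_items % size) ranks get one extra item, so the shares
    differ by at most one.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


N_ITEMS = 1000  # hypothetical problem size
start, stop = block_range(rank, size, N_ITEMS)
print("Task %d of %d handles items %d to %d" % (rank, size, start, stop - 1))
```

Launched with ibrun as described above, each task would report a different range; run directly with python, it reports the range for a single task.
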
  • 17. Parallel helloWorld.py
      c455-012[knl](1019)$ ibrun python helloParallel.py
      TACC: Starting up job 1595632
      TACC: Starting parallel tasks...
      Hello, World! I am process 1 of 68 on c456-042.stampede2.tacc.utexas.edu.
      Hello, World! I am process 49 of 68 on c456-042.stampede2.tacc.utexas.edu.
      Hello, World! I am process 66 of 68 on c456-042.stampede2.tacc.utexas.edu.
      Hello, World! I am process 67 of 68 on c456-042.stampede2.tacc.utexas.edu.
      Hello, World! I am process 64 of 68 on c456-042.stampede2.tacc.utexas.edu.
      ...
      TACC: Shutdown complete. Exiting.
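
The greetings arrive in whatever order the tasks happen to write them, so the output is not sorted by rank. One way to confirm that every task reported is to parse the captured output. A sketch, assuming the output has been saved to a string; `missing_ranks` is a hypothetical helper, the regular expression matches the message format shown in the output above, and the 4-task sample is made up for illustration:

```python
import re

GREETING = re.compile(r"Hello, World! I am process (\d+) of (\d+) on (\S+)\.")


def missing_ranks(output):
    """Return the sorted list of ranks that did not print a greeting."""
    seen = set()
    size = None
    for match in GREETING.finditer(output):
        seen.add(int(match.group(1)))
        size = int(match.group(2))
    if size is None:
        raise ValueError("no greeting lines found")
    return sorted(set(range(size)) - seen)


# Hypothetical 4-task run in which rank 2 never reported.
sample = """TACC: Starting parallel tasks...
Hello, World! I am process 1 of 4 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 3 of 4 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 0 of 4 on c456-042.stampede2.tacc.utexas.edu.
TACC: Shutdown complete. Exiting.
"""
print(missing_ranks(sample))
```
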
  • 18. Exiting an idev Session
      • Great! Now that you can see what interactive jobs look like, we can go on to more advanced job submission.
      • To leave an idev session simply type ‘exit’. Let’s do that now.
  • 19. Submitting a Job
      • The previous two examples ran a job interactively. This means you could be on the node and see the output generated as it happened.
      • This isn’t always practical, though, so we need a way to submit jobs and then leave them to the system to run whenever nodes become available.
      • To do this we’ll take advantage of the SLURM system.
  • 20. Create a New Nano File
      • Create a new file called batchJob.sh.
      • Input the code found on the next slide and save the file.
  • 21. An Example SLURM Batch File
      #!/bin/bash
      #SBATCH -J myJob           # Job name
      #SBATCH -o myJob.o%j       # Name of stdout output file
      #SBATCH -e myJob.e%j       # Name of stderr error file
      #SBATCH -p development     # Queue (partition) name
      #SBATCH -N 1               # Total # of nodes
      #SBATCH -n 68              # Total # of mpi tasks
      #SBATCH -t 00:05:00        # Run time (hh:mm:ss)
      #SBATCH -A myproject       # Allocation name (req'd if you have more than 1)
      #SBATCH --mail-user=hkang@austin.utexas.edu
      #SBATCH --mail-type=all    # Send email at begin and end of job

      # Other commands must follow all #SBATCH directives...
      module list
      pwd
      date

      # Launch code...
      ibrun python helloParallel.py
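
The -N, -n, and -t values in the batch file are related: with 68 tasks on one node, -n is the number of nodes times the tasks per node, and -t is a wall-clock limit in hh:mm:ss. A sketch that fills in those directives from a few numbers; the functions `format_walltime` and `render_batch_script`, their parameters, and the default of 68 tasks per node (taken from the example above) are illustrative, and the sketch only emits directives that already appear in the slide:

```python
def format_walltime(minutes):
    """Convert a whole number of minutes to SLURM's hh:mm:ss form."""
    hours, mins = divmod(int(minutes), 60)
    return "%02d:%02d:00" % (hours, mins)


def render_batch_script(job_name, nodes, minutes, command,
                        queue="development", tasks_per_node=68):
    """Return the text of a batch script using directives from the slide."""
    lines = [
        "#!/bin/bash",
        "#SBATCH -J %s" % job_name,                   # job name
        "#SBATCH -o %s.o%%j" % job_name,              # stdout file; %j is the job ID
        "#SBATCH -e %s.e%%j" % job_name,              # stderr file
        "#SBATCH -p %s" % queue,                      # queue (partition) name
        "#SBATCH -N %d" % nodes,                      # total number of nodes
        "#SBATCH -n %d" % (nodes * tasks_per_node),   # total number of MPI tasks
        "#SBATCH -t %s" % format_walltime(minutes),   # run time (hh:mm:ss)
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


script = render_batch_script("myJob", nodes=1, minutes=5,
                             command="ibrun python helloParallel.py")
print(script)
```

The returned text could then be written to batchJob.sh with an ordinary file write before submitting it.
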
  • 22. Submitting the Batch Job
      • To submit the SLURM batch job, use the following command:
          sbatch batchJob.sh
      • This should generate text output describing the parameters of the job, followed by a Job ID once the job has successfully been admitted to the queues.
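
When submission succeeds, sbatch reports the new job with a line of the form "Submitted batch job <ID>", and that ID is what the monitoring commands take. A sketch of capturing it from Python; `parse_job_id` and `submit` are hypothetical helper names, `submit` assumes sbatch is on the PATH (so it only works on a SLURM system and is not called here), and the sample text is made up for illustration:

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's output, or return None."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return int(match.group(1)) if match else None


def submit(script_path):
    """Run sbatch on script_path and return the job ID (requires SLURM)."""
    output = subprocess.check_output(["sbatch", script_path])
    return parse_job_id(output.decode())


# Made-up sample output; real systems may print additional lines first.
sample = "(site-specific checks of the job parameters)\nSubmitted batch job 1604426\n"
print(parse_job_id(sample))
```
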
  • 23. Checking the Status of a Job
      • Once a job is submitted to the queues, you can’t see it running the way you could when you were running interactively.
      • Instead we use monitoring commands to see what state our job is in.
      • The command ‘squeue’ is very useful for this and has several flags available to control its output. Let’s use the -u flag to see what all of the jobs under our username are doing.
  • 24. staff.stampede2(1009)$ squeue -u vtrue
      JOBID    PARTITION    NAME      USER   ST  TIME   NODES  NODELIST(REASON)
      1604426  development  idv20717  vtrue  R   16:57  1      c455-001
      staff.stampede2(1010)$ scontrol show job=1604426
      JobId=1604426 JobName=idv20717
      UserId=vtrue(829572) GroupId=G-815499(815499) MCS_label=N/A
      Priority=400 Nice=0 Account=A-ccsc QOS=normal
      JobState=RUNNING Reason=None Dependency=(null)
      Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
      RunTime=00:18:08 TimeLimit=00:30:00 TimeMin=N/A
      SubmitTime=2018-06-09T21:27:33 EligibleTime=2018-06-09T21:27:33
      StartTime=2018-06-09T21:27:36 EndTime=2018-06-09T21:57:36 Deadline=N/A
      PreemptTime=None SuspendTime=None SecsPreSuspend=0
      LastSchedEval=2018-06-09T21:27:36
      ...
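
scontrol prints its report as KEY=VALUE pairs, which makes it straightforward to pull out individual fields. A sketch that turns the report into a dictionary and computes how much of the time limit is left; the helper names are hypothetical, and the sample is a subset of the output shown above:

```python
def parse_scontrol(text):
    """Split whitespace-separated KEY=VALUE pairs into a dict.

    Simplification: a value that itself contains spaces would be cut short.
    """
    fields = {}
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


def to_seconds(hhmmss):
    """Convert hh:mm:ss to seconds (a leading day count is not handled)."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


sample = """JobId=1604426 JobName=idv20717
JobState=RUNNING Reason=None Dependency=(null)
RunTime=00:18:08 TimeLimit=00:30:00 TimeMin=N/A
"""

job = parse_scontrol(sample)
remaining = to_seconds(job["TimeLimit"]) - to_seconds(job["RunTime"])
print("%s is %s with %d s left" % (job["JobId"], job["JobState"], remaining))
```
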
  • 25. Cleaning Up Jobs
      • When your batch job has finished running, it will automatically be cleared from the queues.
      • Your output will be in the folder you pointed it to within your batch job file.
      • If for some reason you wish to cancel your job while it is still running, you can do so with ‘scancel <jobID>’.

Editor's Notes

  1. Inputs at the password and token prompts will likely appear blank, so type carefully.
  2. Pay attention to your command prompt. Of course you can change this if you want, but many systems have a default that is designed to be helpful.
  3. Single processor
  4. Single node/task = one output. Shift + ZZ to save and exit (this is the vi shortcut; in nano, use Ctrl+X and then Y as described in the slides). Use ls to see if the file was saved. We’ll come back to this later when we start running some examples, but for now make sure it’s saved and try to remember where you put it.
  5. Single processor per task (multithreaded) but not yet hyperthreaded. Great! Now you know how to run jobs interactively.
  6. “slurm batch”