SlideShare uma empresa Scribd logo
1 de 11
Apache Command Set
● What types ?
● What are they ?
● What do they do ?
● Environment
● Configuration
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What types ?
● User commands
● Administration commands
● Generic options for all commands
● Configuration options
● Environment
– Variables i.e. HADOOP_PREFIX
– Aliases i.e. hls = hadoop fs -ls
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What are they ?
User Commands
● archive – save files to a har archive
● distcp – copy files or directories recursively
● fs – file system commands
– cat – copies file to stdout
– chgrp – change group associated with file
– chmod – change file permissions
– chown – change file ownership
– CopyFromLocal – copy from local file reference
– CopyToLocal – copy to local file reference
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What are they ?
User Commands
● fs – file system commands
– count – count of dir / files/ bytes
– cp – copy files
– du – size of files and directories
– dus – display file lengths
– expunge – empty trash
– get – copy files to local file system
– getmerge – get but merge files
– ls – file listing
– lsr recursive ls
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What are they ?
User Commands
● fs – file system commands
– mkdir – make directory
– moveFromLocal – put with delete of origin
– mv – move from source to destination
– put – copy between file systems
– rm – remove a file
– rmr – recursive delete
– setrep – change file replication factor
– stat – returns file stat information
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What are they ?
User Commands
● fs – file system commands
– tail – display end of file
– test – check file existence / type
– text – output file as text
– touchz – create zero length file
● fsck – HDFS file system check
● fetchdt – get delegation token from name node
● jar – run jar file
● Job – manage mapreduce jobs
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What are they ?
User Commands
● pipes – run a pipe job
● queue – interact and view job queue
● version – get Hadoop version
● CLASSNAME – run class named CLASSNAME
● classpath – print the class path
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop commands – What are they ?
Administration Commands
● balancer – run cluster balancing
● daemonlog – get/set daemon log level
● datanode – run hdfs data node
● dfsadmin – run dfsadmin client
● mradmin – run map reduce admin client
● jobtracker – run mr jobtracker node
● namenode – runs the name node
● secondarynamenode – run secondary name node
● tasktracker – run task tracker node
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop Environment
See the .bashrc for environment set up
##export HADOOP_HOME=/usr/local/hadoop ## deprecated
export HADOOP_PREFIX=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
unalias hfs &> /dev/null
alias hfs="hadoop fs"
unalias hls &> /dev/null ; alias hls="hfs -ls"
unalias hup1 &> /dev/null ; alias hup1="cd $HADOOP_PREFIX/bin ; ./start-dfs.sh"
unalias hup2 &> /dev/null ; alias hup2="cd $HADOOP_PREFIX/bin ; ./start-mapred.sh"
unalias hdwn1 &> /dev/null ; alias hdwn1="cd $HADOOP_PREFIX/bin ; ./stop-mapred.sh"
unalias hdwn2 &> /dev/null ; alias hdwn2="cd $HADOOP_PREFIX/bin ; ./stop-dfs.sh"
# if using LZO compression then add entry here for viewing
# LZO compressed files
##PATH=$PATH:$HADOOP_HOME/bin ## deprecated
PATH=$PATH:$HADOOP_PREFIX/bin
PATH=$PATH:$JAVA_HOME/bin
export PATH
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Hadoop Configuration
● Configuration files under $HADOOP_PREFIX/conf
● Initial set up in
– core-site.xml
– hdfs-site.xml
– mapred-site.xml
● Example from core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You can just pay for those hours that you need
● To solve your problems

Mais conteúdo relacionado

Destaque

бсп (обоб. урок)
бсп (обоб. урок)бсп (обоб. урок)
бсп (обоб. урок)HomichAlla
 
правописание приставок урок№4
правописание приставок урок№4правописание приставок урок№4
правописание приставок урок№4HomichAlla
 
Troubleshooting mysql-tutorial
Troubleshooting mysql-tutorialTroubleshooting mysql-tutorial
Troubleshooting mysql-tutorialjames tong
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Spark Summit
 
Windowing in Apache Apex
Windowing in Apache ApexWindowing in Apache Apex
Windowing in Apache ApexApache Apex
 
The 5 People in your Organization that grow Legacy Code
The 5 People in your Organization that grow Legacy CodeThe 5 People in your Organization that grow Legacy Code
The 5 People in your Organization that grow Legacy CodeRoberto Cortez
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application Apache Apex
 
Build your shiny new pc, with Pangoly
Build your shiny new pc, with PangolyBuild your shiny new pc, with Pangoly
Build your shiny new pc, with PangolyPangoly
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Emilio Coppa
 
Introduction to UNIX Command-Lines with examples
Introduction to UNIX Command-Lines with examplesIntroduction to UNIX Command-Lines with examples
Introduction to UNIX Command-Lines with examplesNoé Fernández-Pozo
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingApache Apex
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Apex
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsAnton Kirillov
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingApache Apex
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataApache Apex
 
Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msApache Apex
 
Deep Dive into Apache Apex App Development
Deep Dive into Apache Apex App DevelopmentDeep Dive into Apache Apex App Development
Deep Dive into Apache Apex App DevelopmentApache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache ApexApache Apex
 
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)Apache Apex
 

Destaque (20)

бсп (обоб. урок)
бсп (обоб. урок)бсп (обоб. урок)
бсп (обоб. урок)
 
правописание приставок урок№4
правописание приставок урок№4правописание приставок урок№4
правописание приставок урок№4
 
Troubleshooting mysql-tutorial
Troubleshooting mysql-tutorialTroubleshooting mysql-tutorial
Troubleshooting mysql-tutorial
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
 
Windowing in Apache Apex
Windowing in Apache ApexWindowing in Apache Apex
Windowing in Apache Apex
 
The 5 People in your Organization that grow Legacy Code
The 5 People in your Organization that grow Legacy CodeThe 5 People in your Organization that grow Legacy Code
The 5 People in your Organization that grow Legacy Code
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
 
Build your shiny new pc, with Pangoly
Build your shiny new pc, with PangolyBuild your shiny new pc, with Pangoly
Build your shiny new pc, with Pangoly
 
HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
 
Introduction to UNIX Command-Lines with examples
Introduction to UNIX Command-Lines with examplesIntroduction to UNIX Command-Lines with examples
Introduction to UNIX Command-Lines with examples
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data Processing
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
 
Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 ms
 
Deep Dive into Apache Apex App Development
Deep Dive into Apache Apex App DevelopmentDeep Dive into Apache Apex App Development
Deep Dive into Apache Apex App Development
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
 

Mais de Mike Frampton (20)

Apache Airavata
Apache AiravataApache Airavata
Apache Airavata
 
Apache MADlib AI/ML
Apache MADlib AI/MLApache MADlib AI/ML
Apache MADlib AI/ML
 
Apache MXNet AI
Apache MXNet AIApache MXNet AI
Apache MXNet AI
 
Apache Gobblin
Apache GobblinApache Gobblin
Apache Gobblin
 
Apache Singa AI
Apache Singa AIApache Singa AI
Apache Singa AI
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
OrientDB
OrientDBOrientDB
OrientDB
 
Prometheus
PrometheusPrometheus
Prometheus
 
Apache Tephra
Apache TephraApache Tephra
Apache Tephra
 
Apache Kudu
Apache KuduApache Kudu
Apache Kudu
 
Apache Bahir
Apache BahirApache Bahir
Apache Bahir
 
Apache Arrow
Apache ArrowApache Arrow
Apache Arrow
 
JanusGraph DB
JanusGraph DBJanusGraph DB
JanusGraph DB
 
Apache Ignite
Apache IgniteApache Ignite
Apache Ignite
 
Apache Samza
Apache SamzaApache Samza
Apache Samza
 
Apache Flink
Apache FlinkApache Flink
Apache Flink
 
Apache Edgent
Apache EdgentApache Edgent
Apache Edgent
 
Apache CouchDB
Apache CouchDBApache CouchDB
Apache CouchDB
 
An introduction to Apache Mesos
An introduction to Apache MesosAn introduction to Apache Mesos
An introduction to Apache Mesos
 
An introduction to Pentaho
An introduction to PentahoAn introduction to Pentaho
An introduction to Pentaho
 

Último

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 

Último (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

An introduction to the Apache Hadoop command set

  • 1. Apache Command Set ● What types ? ● What are they ? ● What do they do ? ● Environment ● Configuration www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 2. Hadoop commands – What types ? ● User commands ● Administration commands ● Generic options for all commands ● Configuration options ● Environment – Variables i.e. HADOOP_PREFIX – Aliases i.e. hls = hadoop fs -ls www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 3. Hadoop commands – What are they ? User Commands ● archive – save files to a har archive ● distcp – copy files or directories recursively ● fs – file system commands – cat – copies file to stdout – chgrp – change group associated with file – chmod – change file permissions – chown – change file ownership – CopyFromLocal – copy from local file reference – CopyToLocal – copy to local file reference www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 4. Hadoop commands – What are they ? User Commands ● fs – file system commands – count – count of dir / files/ bytes – cp – copy files – du – size of files and directories – dus – display file lengths – expunge – empty trash – get – copy files to local file system – getmerge – get but merge files – ls – file listing – lsr recursive ls www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 5. Hadoop commands – What are they ? User Commands ● fs – file system commands – mkdir – make directory – moveFromLocal – put with delete of origin – mv – move from source to destination – put – copy between file systems – rm – remove a file – rmr – recursive delete – setrep – change file replication factor – stat – returns file stat information www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 6. Hadoop commands – What are they ? User Commands ● fs – file system commands – tail – display end of file – test – check file existence / type – text – output file as text – touchz – create zero length file ● fsck – HDFS file system check ● fetchdt – get delegation token from name node ● jar – run jar file ● Job – manage mapreduce jobs www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 7. Hadoop commands – What are they ? User Commands ● pipes – run a pipe job ● queue – interact and view job queue ● version – get Hadoop version ● CLASSNAME – run class named CLASSNAME ● classpath – print the class path www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 8. Hadoop commands – What are they ? Administration Commands ● balancer – run cluster balancing ● daemonlog – get/set daemon log level ● datanode – run hdfs data node ● dfsadmin – run dfsadmin client ● mradmin – run map reduce admin client ● jobtracker – run mr jobtracker node ● namenode – runs the name node ● secondarynamenode – run secondary name node ● tasktracker – run task tracker node www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 9. Hadoop Environment See the .bashrc for environment set up ##export HADOOP_HOME=/usr/local/hadoop ## deprecated export HADOOP_PREFIX=/usr/local/hadoop export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386 unalias hfs &> /dev/null alias hfs="hadoop fs" unalias hls &> /dev/null ; alias hls="hfs -ls" unalias hup1 &> /dev/null ; alias hup1="cd $HADOOP_PREFIX/bin ; ./start-dfs.sh" unalias hup2 &> /dev/null ; alias hup2="cd $HADOOP_PREFIX/bin ; ./start-mapred.sh" unalias hdwn1 &> /dev/null ; alias hdwn1="cd $HADOOP_PREFIX/bin ; ./stop-mapred.sh" unalias hdwn2 &> /dev/null ; alias hdwn2="cd $HADOOP_PREFIX/bin ; ./stop-dfs.sh" # if using LZO compression then add entry here for viewing # LZO compressed files ##PATH=$PATH:$HADOOP_HOME/bin ## deprecated PATH=$PATH:$HADOOP_PREFIX/bin PATH=$PATH:$JAVA_HOME/bin export PATH www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 10. Hadoop Configuration ● Configuration files under $HADOOP_PREFIX/conf ● Initial set up in – core-site.xml – hdfs-site.xml – mapred-site.xml ● Example from core-site.xml <property> <name>hadoop.tmp.dir</name> <value>/app/hadoop/tmp</value> <description>A base for other temporary directories.</description> </property> www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 11. Contact Us ● Feel free to contact us at – www.semtech-solutions.co.nz – info@semtech-solutions.co.nz ● We offer IT project consultancy ● We are happy to hear about your problems ● You can just pay for those hours that you need ● To solve your problems