SlideShare uma empresa Scribd logo
1 de 9
UNIX - awk




        Data extraction and
     formatted Reporting Tool



               Presentation By

                         Nihar R Paital
Introduction

   Developer          : Alfred Aho
                         Peter Weinberger
                         Brian Kernighan

   Appears in         : Version 7 UNIX onwards

   Developed during   : 1970 s

   Developed at       : Bell Labs

   Category           : UNIX Utility

   Supported by       : All UNIX flavors         Nihar R Paital
Definition


 The AWK utility is a data extraction and
 reporting tool that uses a data-driven
 scripting language consisting of a set of
 actions to be taken against textual data
 (either in files or data streams) for the
 purpose of producing formatted reports.


                               Nihar R Paital
It performs basic text formatting on an input
stream ( A file / input from a pipeline )

 Formatting using input file
$ awk {print $n} Filename
Example:
$ awk {print $1} awk.txt > awk.txt.bak

 Formatting using a filter in a pipeline
$ generate_data | awk {print $1}
Example:
$ cat awk.txt | awk {print $1} > awk.txt.bak


Before proceeding to next slide please create a file named awk.txt with following Contents.

07.46.199.184 [28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"
123.125.71.19 [28/Sep/2010:04:20:11] "GET / HTTP/1.1" 304 - "Baiduspider"


                                                                        Nihar R Paital
Basic but important for awk

   Syntax :
        awk {print $n} filename
        Generate data : awk {print $n}

   Awk programs will start with a "{" and end with a "}"

   $0 is the entire line

   Awk parses the line in to fields for you automatically, using any whitespace
    (space, tab) as a delimiter.

   Fields of a regular file will be available using $1,$2,$3 … etc

   NF : It is a special Variable contains the number of fields in the current line. We
    can print the last field by printing the field $NF
   NR : It prints the row number being currently processed.          Nihar R Paital
Basic Examples

 $ awk '{print $0}' awk.txt
 It will print all the lines as they are in File
 $ echo 'this is a test' | awk '{print $3}'
  It will print 'a'
 $ echo 'this is a test' | awk '{print $NF}'
  It prints "test"
 $ awk '{print $1, $(NF-2) }' awk.txt
 It will print the last 3rd word of file awk.txt
 $ awk '{print NR ") " $1 " -> " $(NF-2)}‘
 Output:
      1) 07.46.199.184 -> 200
      2) 123.125.71.19 -> 304
                                                   Nihar R Paital
Advance use of AWK
$ awk '{print $2}' logs.txt
Output:
    [28/Sep/2010:04:08:20]
    [28/Sep/2010:04:20:11]
The date field is separated by "/" and ":" characters.
Suppose I want to print like
[28/Sep/2010
[28/Sep/2010

$ awk '{print $2}' logs.txt | awk 'BEGIN{FS=":"}{print $1}'
Output:
    [28/Sep/2010
    [28/Sep/2010
Here FS=“:” means Field Separator as colon(:)

$ awk '{print $2}' logs.txt | awk 'BEGIN{FS=":"}{print $1}' | sed 's/[//'
Output:
    28/Sep/2010
    28/Sep/2010
Here We are Substituting [ with NULL value                  Nihar R Paital
Advance Use of AWK
If I want to return only the 200 status lines
$ awk '{if ($(NF-2) == "200") {print $0}}' logs.txt

   Output:
   07.46.199.184 [28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot"


$ awk '{a+=$(NF-2); print "Total so far:", a}' logs.txt

   Output:
   Total so far: 200
   Total so far: 504


$ awk '{a+=$(NF-2)}END{print "Total:", a}' logs.txt

   Output:
   Total: 504
                                                                  Nihar R Paital
Nihar R Paital

Mais conteúdo relacionado

Mais procurados

Our challenge for Bulkload reliability improvement
Our challenge for Bulkload reliability  improvementOur challenge for Bulkload reliability  improvement
Our challenge for Bulkload reliability improvement
Satoshi Akama
 
Live coding java 8 urs peter
Live coding java 8   urs peterLive coding java 8   urs peter
Live coding java 8 urs peter
NLJUG
 
Simseer.com - Malware Similarity and Clustering Made Easy
Simseer.com - Malware Similarity and Clustering Made EasySimseer.com - Malware Similarity and Clustering Made Easy
Simseer.com - Malware Similarity and Clustering Made Easy
Silvio Cesare
 

Mais procurados (12)

Stack c6
Stack c6Stack c6
Stack c6
 
Our challenge for Bulkload reliability improvement
Our challenge for Bulkload reliability  improvementOur challenge for Bulkload reliability  improvement
Our challenge for Bulkload reliability improvement
 
Python Programming for ArcGIS: Part II
Python Programming for ArcGIS: Part IIPython Programming for ArcGIS: Part II
Python Programming for ArcGIS: Part II
 
Live coding java 8 urs peter
Live coding java 8   urs peterLive coding java 8   urs peter
Live coding java 8 urs peter
 
An Introduction to Reactive Cocoa
An Introduction to Reactive CocoaAn Introduction to Reactive Cocoa
An Introduction to Reactive Cocoa
 
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
 
Dutch hug
Dutch hugDutch hug
Dutch hug
 
Unix interview questions
Unix interview questionsUnix interview questions
Unix interview questions
 
Simseer.com - Malware Similarity and Clustering Made Easy
Simseer.com - Malware Similarity and Clustering Made EasySimseer.com - Malware Similarity and Clustering Made Easy
Simseer.com - Malware Similarity and Clustering Made Easy
 
spaCy lightning talk for KyivPy #21
spaCy lightning talk for KyivPy #21spaCy lightning talk for KyivPy #21
spaCy lightning talk for KyivPy #21
 
ELK - from zero to coding class hero
ELK - from zero to coding class heroELK - from zero to coding class hero
ELK - from zero to coding class hero
 
Streaming data to s3 using akka streams
Streaming data to s3 using akka streamsStreaming data to s3 using akka streams
Streaming data to s3 using akka streams
 

Semelhante a Unix - Class7 - awk

101 3.2 process text streams using filters
101 3.2 process text streams using filters101 3.2 process text streams using filters
101 3.2 process text streams using filters
Acácio Oliveira
 
Unit 8 text processing tools
Unit 8 text processing toolsUnit 8 text processing tools
Unit 8 text processing tools
root_fibo
 
ShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaa
ShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaaShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaa
ShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaa
ewout2
 

Semelhante a Unix - Class7 - awk (20)

Awk programming
Awk programming Awk programming
Awk programming
 
awk_intro.ppt
awk_intro.pptawk_intro.ppt
awk_intro.ppt
 
Linux class 15 26 oct 2021
Linux class 15   26 oct 2021Linux class 15   26 oct 2021
Linux class 15 26 oct 2021
 
Unix day4 v1.3
Unix day4 v1.3Unix day4 v1.3
Unix day4 v1.3
 
Awk primer and Bioawk
Awk primer and BioawkAwk primer and Bioawk
Awk primer and Bioawk
 
Daq toolbox examples_matlab
Daq toolbox examples_matlabDaq toolbox examples_matlab
Daq toolbox examples_matlab
 
NS2: AWK and GNUplot - PArt III
NS2: AWK and GNUplot - PArt IIINS2: AWK and GNUplot - PArt III
NS2: AWK and GNUplot - PArt III
 
101 3.2 process text streams using filters
101 3.2 process text streams using filters101 3.2 process text streams using filters
101 3.2 process text streams using filters
 
Unit 8 text processing tools
Unit 8 text processing toolsUnit 8 text processing tools
Unit 8 text processing tools
 
Linux intro 5 extra: awk
Linux intro 5 extra: awkLinux intro 5 extra: awk
Linux intro 5 extra: awk
 
Unix Tutorial
Unix TutorialUnix Tutorial
Unix Tutorial
 
Awk A Pattern Scanning And Processing Language
Awk   A Pattern Scanning And Processing LanguageAwk   A Pattern Scanning And Processing Language
Awk A Pattern Scanning And Processing Language
 
Linux
LinuxLinux
Linux
 
Linux
LinuxLinux
Linux
 
Linux
LinuxLinux
Linux
 
Linux
LinuxLinux
Linux
 
ShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaa
ShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaaShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaa
ShellAdvanced aaäaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Linux com
Linux comLinux com
Linux com
 
Awk Introduction
Awk IntroductionAwk Introduction
Awk Introduction
 
What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?
 

Mais de Nihar Ranjan Paital (11)

Oracle Select Query
Oracle Select QueryOracle Select Query
Oracle Select Query
 
Useful macros and functions for excel
Useful macros and functions for excelUseful macros and functions for excel
Useful macros and functions for excel
 
UNIX - Class5 - Advance Shell Scripting-P2
UNIX - Class5 - Advance Shell Scripting-P2UNIX - Class5 - Advance Shell Scripting-P2
UNIX - Class5 - Advance Shell Scripting-P2
 
UNIX - Class3 - Programming Constructs
UNIX - Class3 - Programming ConstructsUNIX - Class3 - Programming Constructs
UNIX - Class3 - Programming Constructs
 
UNIX - Class2 - vi Editor
UNIX - Class2 - vi EditorUNIX - Class2 - vi Editor
UNIX - Class2 - vi Editor
 
UNIX - Class1 - Basic Shell
UNIX - Class1 - Basic ShellUNIX - Class1 - Basic Shell
UNIX - Class1 - Basic Shell
 
UNIX - Class6 - sed - Detail
UNIX - Class6 - sed - DetailUNIX - Class6 - sed - Detail
UNIX - Class6 - sed - Detail
 
UNIX - Class4 - Advance Shell Scripting-P1
UNIX - Class4 - Advance Shell Scripting-P1UNIX - Class4 - Advance Shell Scripting-P1
UNIX - Class4 - Advance Shell Scripting-P1
 
Test funda
Test fundaTest funda
Test funda
 
Csql for telecom
Csql for telecomCsql for telecom
Csql for telecom
 
Select Operations in CSQL
Select Operations in CSQLSelect Operations in CSQL
Select Operations in CSQL
 

Último

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Unix - Class7 - awk

  • 1. UNIX - awk Data extraction and formatted Reporting Tool Presentation By Nihar R Paital
  • 2. Introduction  Developer : Alfred Aho Peter Weinberger Brian Kernighan  Appears in : Version 7 UNIX onwards  Developed during : 1970 s  Developed at : Bell Labs  Category : UNIX Utility  Supported by : All UNIX flavors Nihar R Paital
  • 3. Definition The AWK utility is a data extraction and reporting tool that uses a data-driven scripting language consisting of a set of actions to be taken against textual data (either in files or data streams) for the purpose of producing formatted reports. Nihar R Paital
  • 4. It performs basic text formatting on an input stream ( A file / input from a pipeline )  Formatting using input file $ awk {print $n} Filename Example: $ awk {print $1} awk.txt > awk.txt.bak  Formatting using a filter in a pipeline $ generate_data | awk {print $1} Example: $ cat awk.txt | awk {print $1} > awk.txt.bak Before proceeding to next slide please create a file named awk.txt with following Contents. 07.46.199.184 [28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot" 123.125.71.19 [28/Sep/2010:04:20:11] "GET / HTTP/1.1" 304 - "Baiduspider" Nihar R Paital
  • 5. Basic but important for awk  Syntax :  awk {print $n} filename  Generate data : awk {print $n}  Awk programs will start with a "{" and end with a "}"  $0 is the entire line  Awk parses the line in to fields for you automatically, using any whitespace (space, tab) as a delimiter.  Fields of a regular file will be available using $1,$2,$3 … etc  NF : It is a special Variable contains the number of fields in the current line. We can print the last field by printing the field $NF  NR : It prints the row number being currently processed. Nihar R Paital
  • 6. Basic Examples $ awk '{print $0}' awk.txt It will print all the lines as they are in File $ echo 'this is a test' | awk '{print $3}' It will print 'a' $ echo 'this is a test' | awk '{print $NF}' It prints "test" $ awk '{print $1, $(NF-2) }' awk.txt It will print the last 3rd word of file awk.txt $ awk '{print NR ") " $1 " -> " $(NF-2)}‘ Output: 1) 07.46.199.184 -> 200 2) 123.125.71.19 -> 304 Nihar R Paital
  • 7. Advance use of AWK $ awk '{print $2}' logs.txt Output: [28/Sep/2010:04:08:20] [28/Sep/2010:04:20:11] The date field is separated by "/" and ":" characters. Suppose I want to print like [28/Sep/2010 [28/Sep/2010 $ awk '{print $2}' logs.txt | awk 'BEGIN{FS=":"}{print $1}' Output: [28/Sep/2010 [28/Sep/2010 Here FS=“:” means Field Separator as colon(:) $ awk '{print $2}' logs.txt | awk 'BEGIN{FS=":"}{print $1}' | sed 's/[//' Output: 28/Sep/2010 28/Sep/2010 Here We are Substituting [ with NULL value Nihar R Paital
  • 8. Advance Use of AWK If I want to return only the 200 status lines $ awk '{if ($(NF-2) == "200") {print $0}}' logs.txt Output: 07.46.199.184 [28/Sep/2010:04:08:20] "GET /robots.txt HTTP/1.1" 200 0 "msnbot" $ awk '{a+=$(NF-2); print "Total so far:", a}' logs.txt Output: Total so far: 200 Total so far: 504 $ awk '{a+=$(NF-2)}END{print "Total:", a}' logs.txt Output: Total: 504 Nihar R Paital