1. Harness the power of
Sed & Awk
Everything a PHP developer should know about sed & awk
Edition: PHPBenelux, Jan 29, 2011, Antwerp
http://en.wikipedia.org/wiki/File:Slender_Loris.jpg Sed & Awk - http://joind.in/2489
11. First, who am I?
Joshua Thijssen (32)
Senior Software Engineer currently working
at Enrise (4worx)
Development in PHP, Python, Perl, C, Java,
assembly.
Certified MySQL DBE, MySQL DBA,
LPIC-1, LPIC-2, Zend PHP5, Zend PHP5.3,
Zend Framework, Ubuntu professional.
Blogs: http://www.adayinthelifeof.nl
http://www.enrise.com/blog
Email: joshua@enrise.com
Twitter: @jaytaph
16. Comfort zones
You are a PHP programmer...
...so you program in PHP
http://www.gnux-consultant.fr/service/fileadmin/templates/template_fr_1_FILES/elephant-php.gif
22. Comfort zones
You are doing this already:
• SQL (where, order, limit, join)
• Frameworks (Zend, Symfony, etc)
• jQuery, Dojo, etc.
27. Comfort zones
But why learn new stuff when you can do it in PHP?
✓ Might be easier to use...
✓ Might be faster to write...
✓ Might be better suited for the job...
✓ More efficient
31. Comfort zones
I don’t want to tell you HOW to use Sed & Awk.
I want to tell you that for certain jobs, tools
like Sed & Awk are much better suited than
PHP.
Know the capabilities of your tools and you
become a better developer...
37. Comfort zones
Why Sed & Awk?
✓ Useful for data manipulation
✓ They work well together
✓ Both have a similar processing method
✓ Both rely heavily on regular expressions
✓ Hardly anybody really harnesses their power
38. Part I : SED
http://www.flickr.com/photos/joachim_s_mueller/4298348196/
42. SED
• is a Stream EDitor
• applies rules to a stream of data, one line at a time
• there is no going back in the stream (forward only)
46. Why use SED?
Useful for:
• mutation of large datasets
  Changing IP addresses or other data through many files.
• complex find & replace
  Only change data in certain blocks of code/data (for instance,
  CSV, TXT, SQL files, docblocks etc)
• complex retrieval of data
  Only print the next 10 lines after each 404 code read from an
  apache log file or print all docblocks and function headers
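As a rough illustration of the first bullet, the sketch below swaps one IP address for another in every `.conf` file of a directory. The directory, file names and addresses are made up for the example. It writes through a temporary file instead of using `sed -i`, because the `-i` syntax differs between GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical demo data: two config files that mention the old address.
dir=$(mktemp -d)
printf 'listen 10.0.0.1:80\nupstream 10.0.0.15\n' > "$dir/a.conf"
printf 'allow 10.0.0.1;\n' > "$dir/b.conf"

old='10\.0\.0\.1'    # dots escaped so they only match a literal dot
new='192.168.1.1'

# First expression: old address followed by a non-digit (kept via \1).
# Second expression: old address at the very end of a line.
# Together they leave longer addresses such as 10.0.0.15 untouched.
for f in "$dir"/*.conf; do
    sed -e "s/$old\([^0-9]\)/$new\1/g" -e "s/$old\$/$new/" "$f" > "$f.tmp" &&
        mv "$f.tmp" "$f"
done

result=$(cat "$dir"/*.conf)
printf '%s\n' "$result"
rm -rf "$dir"
```

The pattern does not guard against a digit in front of the address (110.0.0.1 would also match), so a real migration would need a stricter expression.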
48. When to use SED?
Use sed when:
• You need to change hundreds or
thousands of files
• “Complex” mutations
• Fast “one liners” in scripts
Don’t use sed when:
• You need to change one or two items
• You need aggregation or variables
49. SED
Most common example:
sed 's/foo/bar/g' old > new
changes ‘foo’ into ‘bar’ throughout the
file ‘old’ and places output into file ‘new’
50. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed 's/foo/bar/g' foo.txt
bar bar bar
bar bar bar
bar bar bar
bar bar bar
bar bar bar
51. SED
Another common example:
sed 's/foo//g' old > new
deletes ‘foo’ throughout the file
‘old’ and places output into file ‘new’
52. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed 's/foo//g' foo.txt
bar
bar bar
bar
bar bar
bar
53. SED
A bit more advanced:
sed 's/foo/FOO/2' old > new
changes the second ‘foo’ on each line into ‘FOO’
54. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed 's/foo/FOO/2' foo.txt
foo bar FOO
bar foo bar
foo bar FOO
bar bar foo
foo FOO bar
55. SED
Sed can use address ranges:
sed '1,3 s/foo/bar/g' file
changes all ‘foo’s to ‘bar’s on lines 1 to 3
56. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed '1,3 s/foo/bar/g' foo.txt
bar bar bar
bar bar bar
bar bar bar
bar bar foo
foo foo bar
57. SED
But you can also use a regex:
sed '1,/^$/ s/foo/bar/g' file
changes all ‘foo’s to ‘bar’s from
line 1 up to the first empty line
58. SED
$ cat foo.txt
foo bar foo
bar foo bar

foo bar foo
bar bar foo
foo foo bar
$ sed '1,/^$/ s/foo/bar/g' foo.txt
bar bar bar
bar bar bar

foo bar foo
bar bar foo
foo foo bar
59. SED
The reverse: from the first empty line to the end of the file
$ cat foo.txt
foo bar foo
bar foo bar

foo bar foo
bar bar foo
foo foo bar
$ sed '/^$/,$ s/foo/bar/g' foo.txt
foo bar foo
bar foo bar

bar bar bar
bar bar bar
bar bar bar
60. SED
A ! negates the address range:
sed '1,3 ! s/foo/bar/g' file
changes all ‘foo’s to ‘bar’s on
every line except lines 1 to 3
61. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed '1,3 ! s/foo/bar/g' foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar bar
bar bar bar
62. SED
Multiple commands per range:
sed '1,3 { s/foo/bar/g ; s/.*/Line &/ ; }' file
or, spread over multiple lines:
sed '1,3 {
    s/foo/bar/g
    s/.*/Line &/
}' file
for lines 1 to 3: change foo’s into bar’s
and prepend ‘Line’ to the line
63. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed '1,3 { s/foo/bar/g ; s/.*/Line &/ ; }' foo.txt
Line bar bar bar
Line bar bar bar
Line bar bar bar
bar bar foo
foo foo bar
64. SED
Multiple ranges:
sed '1,3 s/foo/bar/g
5,7 s/bar/foo/g
s/\(.*\)/Line \1/' file
on lines 1 to 3: change foo’s into bar’s
on lines 5 to 7: change bar’s into foo’s
on all lines: add ‘Line’ in front of the line
65. SED
$ cat foo.txt
foo bar foo
bar foo bar
foo bar foo
bar bar foo
foo foo bar
$ sed '1,3 s/foo/bar/g ; 5,7 s/bar/foo/g ; s/.*/Line: &/' foo.txt
Line: bar bar bar
Line: bar bar bar
Line: bar bar bar
Line: bar bar foo
Line: foo foo foo
66. SED
sed -n '
/^cut/ q
1,3 { s/foo/bar/g ; p ; }
4,$ { s/bar/foo/g ; p ; }
' file
• -n means ‘don’t print’ lines automatically; only p prints
• if a line starts with ‘cut’, end processing
• lines 1 to 3: replace ‘foo’ with ‘bar’ and print the line to output
• line 4 to the end: replace ‘bar’ with ‘foo’ and print the line to output
70. SED
$ cat file.txt
foo bar foo bar
bar bar foo foo
foo bar foo bar
bar bar foo foo
foo bar foo bar
foo bar foo bar
bar bar foo foo
bar bar foo foo
cut
this line is not added
$ sed -n '/^cut/ q ; 1,3 { s/foo/bar/g ; p ; } ; 4,$ { s/bar/foo/g ; p ; }' file.txt
bar bar bar bar
bar bar bar bar
bar bar bar bar
foo foo foo foo
foo foo foo foo
foo foo foo foo
foo foo foo foo
foo foo foo foo
71. SED commands
a   append ‘text’ to output
s   substitute data
y   transform data (like ‘tr’)
d   delete line (don’t print)
n   (print) and go to next line
N   append next line to pattern space
p   print pattern space
q   quit processing
r   append contents of file to output
#   comment
=   print current line number
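A small sketch that combines a few of the commands from this list (`d`, `y`, `q` and `=`). The input text is invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical four-line input.
input=$(printf 'foo bar\n# a comment\nbar foo\nnever reached\n')

# /^#/d       delete comment lines (nothing is printed for them)
# y/abo/ABO/  transliterate a, b and o to upper case, like tr
# 3q          print line 3 and stop, so line 4 is never processed
result=$(printf '%s\n' "$input" | sed -e '/^#/d' -e 'y/abo/ABO/' -e '3q')
printf '%s\n' "$result"

# $= prints the number of the last line: a line count, like wc -l
count=$(printf '%s\n' "$input" | sed -n '$=')
printf 'lines: %s\n' "$count"
```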
75. SED flow control
b <label>   Unconditionally branch to <label>
: <label>   Set <label>
t <label>   Branch to <label> if a substitution succeeded since the last line was read (or since the last t)
76. SED flow control
solfege.txt:
do
re\
mi\
fa
so
la
si
do
sed script:
sed '
:loop
/\\$/ N
s/\\\n */ /
t loop
' solfege.txt
Pattern buffer, step by step:
do                  (no trailing backslash: printed as is)
re\                 (trailing backslash: N appends the next line)
re\<newline>mi\     (s replaces backslash + newline with a space, t jumps back to loop)
re mi\              (trailing backslash again: N appends the next line)
re mi\<newline>fa   (s joins the lines, t jumps back to loop)
re mi fa            (nothing left to substitute: t falls through, line is printed)
so                  (next line is read)
Output:
do
re mi fa
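A runnable sketch of the `:label` / `N` / `t` loop pattern, applied here to joining lines that end in a backslash with the line that follows. The sample input and the trailing backslashes are assumptions made for this demonstration.

```shell
#!/bin/sh
# Hypothetical input: "re\" and "mi\" are continued on the next line.
input=$(printf '%s\n' 'do' 're\' 'mi\' 'fa' 'so')

# :loop        label to jump back to
# /\\$/N       if the pattern space ends in a backslash, append the next line
# s/\\\n */ /  replace backslash + newline (+ leading spaces) with one space
# t loop       if that substitution happened, go back and check again
result=$(printf '%s\n' "$input" |
    sed -e ':loop' -e '/\\$/N' -e 's/\\\n */ /' -e 't loop')
printf '%s\n' "$result"
```

Each command is passed with its own `-e` because a label name runs to the end of the line in some sed implementations.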
91. Sed not powerful?
Delete all lines containing ‘regex’
sed '/regex/d' filename
Remove trailing spaces from each line
sed 's/ *$//' filename
Reverse the order of the lines in a file (makes use of the hold buffer)
sed -n '1 ! G ; h ; $ p' filename
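To see how the hold buffer one-liner works, here is a small sketch on a three-line input (the input is invented for the example).

```shell
#!/bin/sh
# 1!G  on every line except the first, append the hold buffer to the pattern space
# h    copy the pattern space (current line + everything before it) to the hold buffer
# $p   on the last line, print the accumulated result
#
# line 1: pattern "one"              hold "one"
# line 2: pattern "two\none"         hold "two\none"
# line 3: pattern "three\ntwo\none"  printed
result=$(printf '%s\n' one two three | sed -n '1!G;h;$p')
printf '%s\n' "$result"
```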
92. Sed not powerful?
Meet Sedtris:
A fully functional Tetris clone written in SED
http://uuner.doslash.org/forfun/
96. So use sed when:
• Repetitive work on many files
• “complex” mutations
• Fast “oneliners” in scripts etc..
97. Part 2 : AWK
http://www.flickr.com/photos/joachim_s_mueller/138409464/in/photostream/
101. AWK
• AWK is a full-fledged programming
language.
• There is NO way I can teach you AWK in
about 20 minutes
• But I’ll try...
106. AWK
• Alfred V. Aho, Peter J. Weinberger, Brian W.
Kernighan
• Written in 1977 at AT&T Bell Laboratories
• Multiple versions: AWK, NAWK, GAWK,
MAWK and more...
• Pattern-directed scanning and processing
language...
109. AWK
• [condition] { actions }
• 2 special “patterns” : BEGIN and END
110. Simple AWK
$ cat solfege.txt
do
re
mi
fa
sol
la
ti
do
$ awk '
BEGIN { print "start" }
/o/ { print "I just saw an o in " $0 }
END { print "the end" }' solfege.txt
start
I just saw an o in do
I just saw an o in sol
I just saw an o in do
the end
111. Apache logfile (combined)
Awk processes through “records” and “fields”
72.30.161.230 - - [18/Jan/2011:20:28:09 +0100] "GET /robots.txt HTTP/1.0" 200 387 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
72.30.161.230 - - [18/Jan/2011:20:28:10 +0100] "GET / HTTP/1.0" 200 7235 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
you can control the record and field separators
120. Some global AWK knowledge
• You can set the field and record separator
• FS="|"; RS="\t"
• $0 holds the complete record (line)
• $1 holds first field, $2 second field etc...
• NF holds the number of fields in the record ($NF is the last field)
• NR holds the number of the CURRENT record
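A short sketch of these built-in variables at work, using a made-up pipe-separated input. Note that the variables are written without a dollar sign: `$NF` means "the field whose number is NF", which is the last field.

```shell
#!/bin/sh
# Hypothetical input with "|" as field separator.
result=$(printf 'alice|30|Antwerp\nbob|25\n' |
    awk 'BEGIN { FS = "|" }
         { print NR ": " NF " fields, first=" $1 ", last=" $NF }')
printf '%s\n' "$result"
```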
122. Apache logfile (combined)
Print the “user agents” from the logfile
and count them (through external tools)
$ awk -F\" '{ print $6 }' apache.log | sort | uniq -c
2 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322) NS8/0.9.6
1 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
7 Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
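The counting can also be done inside awk with an associative array instead of `sort | uniq -c`. The sketch below assumes the combined log format shown earlier; the three log lines and the agent names are invented. A final `sort` is kept only because the order of `for (key in array)` is not defined.

```shell
#!/bin/sh
# Hypothetical sample log in combined format.
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [18/Jan/2011:20:28:09 +0100] "GET /robots.txt HTTP/1.0" 200 387 "-" "BotA/1.0"
1.2.3.4 - - [18/Jan/2011:20:28:10 +0100] "GET / HTTP/1.0" 200 7235 "-" "BotA/1.0"
5.6.7.8 - - [18/Jan/2011:20:29:00 +0100] "GET /x HTTP/1.1" 404 120 "-" "BrowserB/2.0"
EOF

# With a double quote as field separator, the user agent is field 6.
result=$(awk -F'"' '{ count[$6]++ }
    END { for (ua in count) print count[ua], ua }' "$log" | sort)
printf '%s\n' "$result"
rm -f "$log"
```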
123. Apache logfile (combined)
Print the total bytes sent out per status code
$ awk -F' ' '{ totals[$9] += $10; }
END { for (i in totals) { printf "%d : %d bytes\n", i, totals[i]; } }' apache.log
200 : 26197250 bytes
206 : 180578 bytes
301 : 31072 bytes
302 : 2991 bytes
304 : 44715 bytes
404 : 82866 bytes
500 : 361783 bytes
124. Apache logfile (combined)
Print the “user agents” from the logfile
who triggered a 4xx code
$ awk -F' ' '$9 ~ /4[0-9][0-9]/ { FS="\""; $0=$0; print $6; FS=" " }' apache.log
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
libwww-perl/5.805
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b4) Gecko/20050908 Firefox/1.4
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b4) Gecko/20050908 Firefox/1.4
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
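The one-liner above switches FS and re-assigns `$0` so that the current record is split again on double quotes. An alternative sketch, not the method from the slide, is to leave FS alone and call `split()` on the record. The sample log lines are invented.

```shell
#!/bin/sh
# Hypothetical sample log in combined format.
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [18/Jan/2011:20:28:09 +0100] "GET /robots.txt HTTP/1.0" 200 387 "-" "BotA/1.0"
5.6.7.8 - - [18/Jan/2011:20:29:00 +0100] "GET /x HTTP/1.1" 404 120 "-" "BrowserB/2.0"
EOF

# $9 is the status code under the default whitespace splitting.
# split() cuts the whole record on double quotes; piece 6 is the user agent.
result=$(awk '$9 ~ /^4[0-9][0-9]$/ { split($0, part, "\""); print part[6] }' "$log")
printf '%s\n' "$result"
rm -f "$log"
```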
125. Apache logfile (combined)
Awk one liner compared to PHP:
$ awk -F' ' '{ totals[$9] += $10; } END { for (i in totals) { printf "%d : %d bytes\n", i, totals[i]; } }' apache.log
<?php
$hash = array();
foreach (file('./apache.log') as $line) {
    // Fields 9 and 10 of the log line: status code and bytes sent
    list( , , , , , , , , $status, $bytes) = explode(' ', $line);
    if (!isset($hash[$status])) {
        $hash[$status] = 0;
    }
    $hash[$status] += (int) $bytes; // "-" (no body sent) counts as 0
}
print_r($hash);
Not a whole lot different, but already more
complex and this was just a simple example...
credits to @RichardJ #pfz channel
126. Apache logfile (combined)
Sed one liner compared to PHP:
sed '/^.o/d' file
<?php
$stdin = fopen("php://stdin", "r");
while (($line = fgets($stdin)) !== false) {
    // Skip lines whose second character is an 'o'
    if (preg_match("/^.o/", $line)) continue;
    print $line;
}
?>
Much more work....
credits to @RichardJ #pfz channel
127. Practical uses for a (PHP) developer
• Parse PHP error logs, syslog files, Apache’s
HTTP access logs.
• Convert files you get from your
customers, who always assume you can do
magic with a gazillion GBs of (unsorted)
data (and now you can).
131. In conclusion: sed & awk
• are powerful for simple one-liners but can
also be used for complex programs
• integrate perfectly with other (unix) tools
like uniq, sort, cut, find, grep, cat, etc...
• are a great way to automate complex and/or
repetitive (editing) tasks
134. In conclusion
• Look outside your
comfort zone for other
(better) tools.
• Can you think of
examples where you
would use Sed or Awk
(instead of PHP)?
http://files.sharenator.com/slender_loris_Worlds_strangest_looking_animals-s300x451-2279-580.jpg
136. Read more on Sed & Awk
Sed:
http://www.gnu.org/software/sed/manual/html_node/index.html
http://www.grymoire.com/Unix/Sed.html
http://www.panix.com/~elflord/unix/sed.html
http://www.linuxtopia.org/online_books/linux_tool_guides/the_sed_faq/index.html
Awk:
http://www.gnu.org/software/gawk/
http://www.grymoire.com/Unix/Awk.html
137. Thank you for your
attention!
Don’t forget to rate
my talk on joind.in
http://joind.in/2489
http://farm5.static.flickr.com/4078/4790219776_2fe3c9af95_b.jpg