SlideShare uma empresa Scribd logo
1 de 53
How to hack
 with pack
and unpack


     1
crash course
standard uses
synergies within perl
horrific abuses




          2
crash course




     3
what’s pack?

• like sprintf()
• only for bytes, not for presentation
• template rules very complex
• DWIM: packs empty strings for
  missing arguments



                   4
what’s unpack?

• like sscanf() for bytes
• (not really: we’ll come back to it)
• (mostly) identical template rules to
  pack
• dies if it runs out of input bytes


                    5
my $fourbytes = pack ‘L’, 12;
  my $twelve = unpack ‘L’, $fourbytes;




• perldoc -f pack
• perldoc -f unpack
• perldoc perlpacktut


                         6
standard uses



someone else’s bytes
fixed-width parsing




          7
someone else’s bytes




         8
FFI
XS
C


 9
6
10
6
SIX BYTES ON THAT
   LAST SLIDE!



        10
but seriously...


• struct alignment issues will ruin your
  day
• change the XS and/or C if you can
• Convert::Binary::C to the rescue!


                   11
syscall



• I have never had to do this...
• perldoc -f syscall



                       12
“close to the metal”


• network protocols
• binary file formats
• bytes are language neutral


                  13
fixed-width parsing




        14
• no sscanf() in perl
• substr or regexes...
• unpack is a bit nicer (not much)




                   15
example: contrived pie
   chess       8
   pecan       7
   shaker lemon4
   shoo fly    10


   $pie           = substr $_, 0, 12;
   $deliciousness = substr $_, 12;

   ($pie, $deliciousness) = m/(.{12})(.*)/;

   ($pie, $deliciousness) = unpack 'A12 A*', $_;




• not quite identical...
                         16
synergies


vec()
lvalue substr()
use bytes




        17
vec()
         0000001
        00000011
         0000001
        00000011
         0000001
         0000001
         0000001
         0000001



  18
• vec(): treat a scalar as an arbitrary
  length bit vector

• (you’re not using numbers, are you?)
• pack and unpack ‘b’ template is
  perfect for working with the vector as
  a whole

• convert vectors to and from from
  strings “011100” or lists (0,1,1,1,0,0)

• count bits with unpack checksum
• perldoc -f vec
                      19
example: one million
        bits!
  ## create a 125,001 byte vector
  my $bit_vector = '';
  (vec $bit_vector, 1_000_000, 1) = 1;

  ## stringify: “00000...1”
  my $bits = unpack 'b*', $bit_vector;

  ## listify: (0,0,0,...,1)
  my @bits = split //, unpack 'b*', $bit_vector;

  ## how many bits are on?
  my $on_bits = unpack '%32b*', $bit_vector;



• the 1000001st through 1000008th
 bits are free!
                       20
lvalue substr()




       21
• (or 4-argument substr)
• magic: no realloc iff replacement
  length == original length

• sprintf also might work, depending...



                   22
example: Sys::Mmap
   mmap($shared, 4, PROT_READ|PROT_WRITE,
     MAP_SHARED, $filehandle) or die $!;

   $shared = meaning_of_life();

   munmap($shared);


• 7.5 million years’ work down the tubes!
   mmap($shared, 4, PROT_READ|PROT_WRITE,
     MAP_SHARED, $filehandle) or die $!;

   (substr $shared, 0, 4) =
     pack ‘L’, meaning_of_life();

   munmap($shared);


                          23
use bytes




    24
use bytes
• binary data + DWIM + unicode
• ouch!
• pragma to the rescue: “No matter
  what you think might be in this PV, do
  not cleverly switch to character
  semantics when I’m not looking.”
• pack/unpack themselves don’t care,
  it’s things like length and substr
                   25
eat a snack




please come back


       26
horrific abuses


think like a C programmer
serialization tricks
lazy perlification




             27
think like a C
programmer




      28
typedef struct TWO_THINGS {
     char a;
     char b;
 } two_things;

 two_things things;

 two_things lots_of_things[1000];


• where is things.a? things.
• where is things.b? *(&things + 1).
• where is lots_of_things[2].b?
  lots_of_things + (2 *
  sizeof(two_things)) + 1.

• where is the point? next slide.
                        29
Readonly my $FORMAT => ‘cc’;

  my $things         = pack $FORMAT;
  my $lots_of_things = pack “($FORMAT)1000”;




• where is $things.a? unpack ‘cx’,
  $things;
• where is $things.b? unpack ‘xc’,
  $things;
• where is $lots_of_things[2].b? unpack
  ‘(xx)2xc’, $lots_of_things


                         30
• bytes, bytes, bytes on the brain
• byte offsets a natural way of thinking
  about working with data

• “language neutral” is just a cute way
  of saying “C”




                   31
• “strong typing” the roundabout way
• unpack() == C cast: “I, programmer,
  assure you, language, that these bytes
  contain precisely data of this type,
  and I will live with the consequences if
  I’m wrong.”




                   32
example: SEGV!

  my $bar = unpack 'P', ‘asdf’;




• god, I miss pointers sometimes
• (but not right now)



                         33
No pointers in Perl



         34
No pointers in Perl



         34
but...

• we are not writing C
• because down that road lies madness
• still, its siren song is hard to resist...



                     35
serialization tricks




         36
space efficiency

• Storable: general-purpose
• what does that mean?
• if you’re thinking like a C
  programmer, maybe you can do
  better...


                    37
example: array of shorts
  @shorts = map {int((rand 256)-128)} (1..10000);

  ## 20,000 bytes: 2 bytes per element
  $packed = pack 's*', @shorts;

  ## 20,016 bytes: 2 bytes per element
  $stored = Storable::freeze(@shorts);

  ## harmlessly examine contents of @shorts...
  print quot;$_nquot; for @shorts;

  ## roughly 46,000 bytes: ???
  $stored = Storable::freeze(@shorts);


• Extra credit: deserialize just
  $shorts[2113]...
                         38
fixed width


• depending on what you’re serializing
• interesting properties
• more in a bit


                   39
keyless hashes


• when a hash is really a struct/record
• thinking like a C programmer again!
• serialize bags of them without bags of
  redundant copies of their keys



                   40
idiom
## shape of the “structure” and format are
## passed or encoded separately
Readonly my $TEMPLATE => ‘VVC';
Readonly my @FIELDS   => qw(thing1 thing2 kite);

## get the bytes
my $bytes = get_from_somewhere();

## unpack via hash slice FTW!
my %thing;
@thing{@FIELDS} = unpack $TEMPLATE, $bytes;




                       41
example: keyless hash
my @records = map {
    { thing1 => int rand 4294967296,
      thing2 => int rand 4294967296,
      kite   => int rand 255, } } (1 .. 10000);

## 90,000 bytes: 9 bytes per record
my $packed = pack quot;($TEMPLATE)*quot;,
    map { @{$_}{@FIELDS} } @records;

## roughly 544,000 bytes: 54 bytes per record
my $stored = Storable::freeze(@records);




                       42
lazy perlification




        43
• for transient bytes e.g. from key-value
  storage
• for sparse algorithms e.g. binary
  search
• otherwise, don’t do this!
• or at least, don’t blame me


                   44
example: filtering

• problem scale: 100k x 20k x 100
• idea 1: regular expressions!
• idea 2: binary search, of course!
• idea 3: binary search + lazy
  perlification


                   45
serializing




     46
deserializing




      47
searching




    48
lazy binary search
pack('Ca*', $size,
    pack(“(Z$size)*”, @sorted_haystack));




$size   = unpack('C', ${$frozen_haystack_ref});
$format = ‘Z’ . $size;




...
$element = unpack('x' . ($size * $mid + 1)
    . $format, ${$frozen_haystack_ref});
$cmp = $element cmp $needle;
...


                       49
summary


• bytes, bytes, bytes
• “Premature optimization is the root of
  all evil.” -- Donald Knuth




                   50
?

j.david.lowe@gmail.com
twitter.com/j_david_lowe
dlowe-wfh.blogspot.com
slideshare



             51

Mais conteúdo relacionado

Mais procurados

Introdução ao Perl 6
Introdução ao Perl 6Introdução ao Perl 6
Introdução ao Perl 6garux
 
Exhibition of Atrocity
Exhibition of AtrocityExhibition of Atrocity
Exhibition of AtrocityMichael Pirnat
 
Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...
Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...
Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...Raman Kannan
 
Top 10 php classic traps confoo
Top 10 php classic traps confooTop 10 php classic traps confoo
Top 10 php classic traps confooDamien Seguy
 
Taking Inspiration From The Functional World
Taking Inspiration From The Functional WorldTaking Inspiration From The Functional World
Taking Inspiration From The Functional WorldPiotr Solnica
 
Php radomize
Php radomizePhp radomize
Php radomizedo_aki
 
我在豆瓣使用Emacs
我在豆瓣使用Emacs我在豆瓣使用Emacs
我在豆瓣使用Emacs董 伟明
 
A Few of My Favorite (Python) Things
A Few of My Favorite (Python) ThingsA Few of My Favorite (Python) Things
A Few of My Favorite (Python) ThingsMichael Pirnat
 
M11 bagging loo cv
M11 bagging loo cvM11 bagging loo cv
M11 bagging loo cvRaman Kannan
 
Pim Elshoff "Technically DDD"
Pim Elshoff "Technically DDD"Pim Elshoff "Technically DDD"
Pim Elshoff "Technically DDD"Fwdays
 
PHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP LimogesPHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP Limoges✅ William Pinaud
 
"How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur...
"How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur..."How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur...
"How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur...Fwdays
 

Mais procurados (20)

php string part 3
php string part 3php string part 3
php string part 3
 
Introdução ao Perl 6
Introdução ao Perl 6Introdução ao Perl 6
Introdução ao Perl 6
 
Exhibition of Atrocity
Exhibition of AtrocityExhibition of Atrocity
Exhibition of Atrocity
 
Duralexsedregex
DuralexsedregexDuralexsedregex
Duralexsedregex
 
Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...
Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...
Chapter 2: R tutorial Handbook for Data Science and Machine Learning Practiti...
 
Top 10 php classic traps confoo
Top 10 php classic traps confooTop 10 php classic traps confoo
Top 10 php classic traps confoo
 
Taking Inspiration From The Functional World
Taking Inspiration From The Functional WorldTaking Inspiration From The Functional World
Taking Inspiration From The Functional World
 
Php radomize
Php radomizePhp radomize
Php radomize
 
我在豆瓣使用Emacs
我在豆瓣使用Emacs我在豆瓣使用Emacs
我在豆瓣使用Emacs
 
A Few of My Favorite (Python) Things
A Few of My Favorite (Python) ThingsA Few of My Favorite (Python) Things
A Few of My Favorite (Python) Things
 
M11 bagging loo cv
M11 bagging loo cvM11 bagging loo cv
M11 bagging loo cv
 
Pim Elshoff "Technically DDD"
Pim Elshoff "Technically DDD"Pim Elshoff "Technically DDD"
Pim Elshoff "Technically DDD"
 
Format String Exploitation
Format String ExploitationFormat String Exploitation
Format String Exploitation
 
Perl saved a lady.
Perl saved a lady.Perl saved a lady.
Perl saved a lady.
 
PHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP LimogesPHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP Limoges
 
Your code is not a string
Your code is not a stringYour code is not a string
Your code is not a string
 
"How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur...
"How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur..."How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur...
"How was it to switch from beautiful Perl to horrible JavaScript", Viktor Tur...
 
Introduction to Groovy
Introduction to GroovyIntroduction to Groovy
Introduction to Groovy
 
R57.Php
R57.PhpR57.Php
R57.Php
 
C99.php
C99.phpC99.php
C99.php
 

Semelhante a how to hack with pack and unpack

A Re-Introduction to JavaScript
A Re-Introduction to JavaScriptA Re-Introduction to JavaScript
A Re-Introduction to JavaScriptSimon Willison
 
Perl training-in-navi mumbai
Perl training-in-navi mumbaiPerl training-in-navi mumbai
Perl training-in-navi mumbaivibrantuser
 
Introduction to Perl
Introduction to PerlIntroduction to Perl
Introduction to PerlSway Wang
 
Ruby Topic Maps Tutorial (2007-10-10)
Ruby Topic Maps Tutorial (2007-10-10)Ruby Topic Maps Tutorial (2007-10-10)
Ruby Topic Maps Tutorial (2007-10-10)Benjamin Bock
 
Good Evils In Perl
Good Evils In PerlGood Evils In Perl
Good Evils In PerlKang-min Liu
 
Tokyo APAC Groundbreakers tour - The Complete Java Developer
Tokyo APAC Groundbreakers tour - The Complete Java DeveloperTokyo APAC Groundbreakers tour - The Complete Java Developer
Tokyo APAC Groundbreakers tour - The Complete Java DeveloperConnor McDonald
 
Barely Legal Xxx Perl Presentation
Barely Legal Xxx Perl PresentationBarely Legal Xxx Perl Presentation
Barely Legal Xxx Perl PresentationAttila Balazs
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceJesse Vincent
 
Perl at SkyCon'12
Perl at SkyCon'12Perl at SkyCon'12
Perl at SkyCon'12Tim Bunce
 
Rubish- A Quixotic Shell
Rubish- A Quixotic ShellRubish- A Quixotic Shell
Rubish- A Quixotic Shellguest3464d2
 
Is Haskell an acceptable Perl?
Is Haskell an acceptable Perl?Is Haskell an acceptable Perl?
Is Haskell an acceptable Perl?osfameron
 
PHP Machinist Presentation
PHP Machinist PresentationPHP Machinist Presentation
PHP Machinist PresentationAdam Englander
 
Rapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and RailsRapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and Railselliando dias
 
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...DevGAMM Conference
 
Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceJesse Vincent
 

Semelhante a how to hack with pack and unpack (20)

A Re-Introduction to JavaScript
A Re-Introduction to JavaScriptA Re-Introduction to JavaScript
A Re-Introduction to JavaScript
 
Perl training-in-navi mumbai
Perl training-in-navi mumbaiPerl training-in-navi mumbai
Perl training-in-navi mumbai
 
Introduction to Perl
Introduction to PerlIntroduction to Perl
Introduction to Perl
 
Short Introduction To "perl -d"
Short Introduction To "perl -d"Short Introduction To "perl -d"
Short Introduction To "perl -d"
 
Scala Sjug 09
Scala Sjug 09Scala Sjug 09
Scala Sjug 09
 
PHP Tips & Tricks
PHP Tips & TricksPHP Tips & Tricks
PHP Tips & Tricks
 
Ruby Topic Maps Tutorial (2007-10-10)
Ruby Topic Maps Tutorial (2007-10-10)Ruby Topic Maps Tutorial (2007-10-10)
Ruby Topic Maps Tutorial (2007-10-10)
 
Good Evils In Perl
Good Evils In PerlGood Evils In Perl
Good Evils In Perl
 
Tokyo APAC Groundbreakers tour - The Complete Java Developer
Tokyo APAC Groundbreakers tour - The Complete Java DeveloperTokyo APAC Groundbreakers tour - The Complete Java Developer
Tokyo APAC Groundbreakers tour - The Complete Java Developer
 
Barely Legal Xxx Perl Presentation
Barely Legal Xxx Perl PresentationBarely Legal Xxx Perl Presentation
Barely Legal Xxx Perl Presentation
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
 
Perl at SkyCon'12
Perl at SkyCon'12Perl at SkyCon'12
Perl at SkyCon'12
 
Rubish- A Quixotic Shell
Rubish- A Quixotic ShellRubish- A Quixotic Shell
Rubish- A Quixotic Shell
 
Is Haskell an acceptable Perl?
Is Haskell an acceptable Perl?Is Haskell an acceptable Perl?
Is Haskell an acceptable Perl?
 
PHP Machinist Presentation
PHP Machinist PresentationPHP Machinist Presentation
PHP Machinist Presentation
 
Rapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and RailsRapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and Rails
 
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
 
Ruby 1.9
Ruby 1.9Ruby 1.9
Ruby 1.9
 
PHP and MySQL
PHP and MySQLPHP and MySQL
PHP and MySQL
 
Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
 

Último

Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)Samir Dash
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAnitaRaj43
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 

Último (20)

Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

how to hack with pack and unpack

  • 1. How to hack with pack and unpack 1
  • 2. crash course standard uses synergies within perl horrific abuses 2
  • 4. what’s pack? • like sprintf() • only for bytes, not for presentation • template rules very complex • DWIM: packs empty strings for missing arguments 4
  • 5. what’s unpack? • like sscanf() for bytes • (not really: we’ll come back to it) • (mostly) identical template rules to pack • dies if it runs out of input bytes 5
  • 6. my $fourbytes = pack ‘L’, 12; my $twelve = unpack ‘L’, $fourbytes; • perldoc -f pack • perldoc -f unpack • perldoc perlpacktut 6
  • 7. standard uses someone else’s bytes fixed-width parsing 7
  • 10. 6 10
  • 11. 6 SIX BYTES ON THAT LAST SLIDE! 10
  • 12. but seriously... • struct alignment issues will ruin your day • change the XS and/or C if you can • Convert::Binary::C to the rescue! 11
  • 13. syscall • I have never had to do this... • perldoc -f syscall 12
  • 14. “close to the metal” • network protocols • binary file formats • bytes are language neutral 13
  • 16. • no sscanf() in perl • substr or regexes... • unpack is a bit nicer (not much) 15
  • 17. example: contrived pie chess 8 pecan 7 shaker lemon4 shoo fly 10 $pie = substr $_, 0, 12; $deliciousness = substr $_, 12; ($pie, $deliciousness) = m/(.{12})(.*)/; ($pie, $deliciousness) = unpack 'A12 A*', $_; • not quite identical... 16
  • 19. vec() 0000001 00000011 0000001 00000011 0000001 0000001 0000001 0000001 18
  • 20. • vec(): treat a scalar as an arbitrary length bit vector • (you’re not using numbers, are you?) • pack and unpack ‘b’ template is perfect for working with the vector as a whole • convert vectors to and from from strings “011100” or lists (0,1,1,1,0,0) • count bits with unpack checksum • perldoc -f vec 19
  • 21. example: one million bits! ## create a 125,001 byte vector my $bit_vector = ''; (vec $bit_vector, 1_000_000, 1) = 1; ## stringify: “00000...1” my $bits = unpack 'b*', $bit_vector; ## listify: (0,0,0,...,1) my @bits = split //, unpack 'b*', $bit_vector; ## how many bits are on? my $on_bits = unpack '%32b*', $bit_vector; • the 1000001st through 1000008th bits are free! 20
  • 23. • (or 4-argument substr) • magic: no realloc iff replacement length == original length • sprintf also might work, depending... 22
  • 24. example: Sys::Mmap mmap($shared, 4, PROT_READ|PROT_WRITE, MAP_SHARED, $filehandle) or die $!; $shared = meaning_of_life(); munmap($shared); • 7.5 million years’ work down the tubes! mmap($shared, 4, PROT_READ|PROT_WRITE, MAP_SHARED, $filehandle) or die $!; (substr $shared, 0, 4) = pack ‘L’, meaning_of_life(); munmap($shared); 23
  • 25. use bytes 24
  • 26. use bytes • binary data + DWIM + unicode • ouch! • pragma to the rescue: “No matter what you think might be in this PV, do not cleverly switch to character semantics when I’m not looking.” • pack/unpack themselves don’t care, it’s things like length and substr 25
  • 27. eat a snack please come back 26
  • 28. horrific abuses think like a C programmer serialization tricks lazy perlification 27
  • 29. think like a C programmer 28
  • 30. typedef struct TWO_THINGS { char a; char b; } two_things; two_things things; two_things lots_of_things[1000]; • where is things.a? things. • where is things.b? *(&things + 1). • where is lots_of_things[2].b? lots_of_things + (2 * sizeof(two_things)) + 1. • where is the point? next slide. 29
  • 31. Readonly my $FORMAT => ‘cc’; my $things = pack $FORMAT; my $lots_of_things = pack “($FORMAT)1000”; • where is $things.a? unpack ‘cx’, $things; • where is $things.b? unpack ‘xc’, $things; • where is $lots_of_things[2].b? unpack ‘(xx)2xc’, $lots_of_things 30
  • 32. • bytes, bytes, bytes on the brain • byte offsets a natural way of thinking about working with data • “language neutral” is just a cute way of saying “C” 31
  • 33. • “strong typing” the roundabout way • unpack() == C cast: “I, programmer, assure you, language, that these bytes contain precisely data of this type, and I will live with the consequences if I’m wrong.” 32
  • 34. example: SEGV! my $bar = unpack 'P', ‘asdf’; • god, I miss pointers sometimes • (but not right now) 33
  • 35. No pointers in Perl 34
  • 36. No pointers in Perl 34
  • 37. but... • we are not writing C • because down that road lies madness • still, its siren song is hard to resist... 35
  • 39. space efficiency • Storable: general-purpose • what does that mean? • if you’re thinking like a C programmer, maybe you can do better... 37
  • 40. example: array of shorts @shorts = map {int((rand 256)-128)} (1..10000); ## 20,000 bytes: 2 bytes per element $packed = pack 's*', @shorts; ## 20,016 bytes: 2 bytes per element $stored = Storable::freeze(@shorts); ## harmlessly examine contents of @shorts... print quot;$_nquot; for @shorts; ## roughly 46,000 bytes: ??? $stored = Storable::freeze(@shorts); • Extra credit: deserialize just $shorts[2113]... 38
  • 41. fixed width • depending on what you’re serializing • interesting properties • more in a bit 39
  • 42. keyless hashes • when a hash is really a struct/record • thinking like a C programmer again! • serialize bags of them without bags of redundant copies of their keys 40
  • 43. idiom ## shape of the “structure” and format are ## passed or encoded separately Readonly my $TEMPLATE => ‘VVC'; Readonly my @FIELDS => qw(thing1 thing2 kite); ## get the bytes my $bytes = get_from_somewhere(); ## unpack via hash slice FTW! my %thing; @thing{@FIELDS} = unpack $TEMPLATE, $bytes; 41
  • 44. example: keyless hash my @records = map { { thing1 => int rand 4294967296, thing2 => int rand 4294967296, kite => int rand 255, } } (1 .. 10000); ## 90,000 bytes: 9 bytes per record my $packed = pack quot;($TEMPLATE)*quot;, map { @{$_}{@FIELDS} } @records; ## roughly 544,000 bytes: 54 bytes per record my $stored = Storable::freeze(@records); 42
  • 46. • for transient bytes e.g. from key-value storage • for sparse algorithms e.g. binary search • otherwise, don’t do this! • or at least, don’t blame me 44
  • 47. example: filtering • problem scale: 100k x 20k x 100 • idea 1: regular expressions! • idea 2: binary search, of course! • idea 3: binary search + lazy perlification 45
  • 50. searching 48
  • 51. lazy binary search pack('Ca*', $size, pack(“(Z$size)*”, @sorted_haystack)); $size = unpack('C', ${$frozen_haystack_ref}); $format = ‘Z’ . $size; ... $element = unpack('x' . ($size * $mid + 1) . $format, ${$frozen_haystack_ref}); $cmp = $element cmp $needle; ... 49
  • 52. summary • bytes, bytes, bytes • “Premature optimization is the root of all evil.” -- Donald Knuth 50

Notas do Editor

  1. talk about: ‘A’, ‘A12’, ‘A*’ and the meaningless whitespace...