Piggy Pack

Hiding things out in the open

See that image above?

It's at this URL: https://i.imgur.com/9bEZg.png.

Looks like a typical image, right? What if I said there was a pop song encoded within the pixels of the image (not in any meta-comment field, but the actual pixels)? All 3 minutes and 32.7 seconds of the pop song, too: the original recording, not a MIDI file!

"Proof!", you scoff? Let's hope this isn't snake oil:

#!/usr/bin/env ruby
name='9bEZg'
`curl https://i.imgur.com/#{name}.png | convert - -compress none #{name}.ppm`
ppm = File.open("#{name}.ppm")
3.times { ppm.readline }
enc = ppm.read.split(' ').map { | x | x.to_i }
bytes = 0.upto(enc.length / 3 - 1).map { | x |
    (enc.shift & 7) << 5\
  | (enc.shift & 3) << 3\
  | (enc.shift & 7)
}.pack('C*')
name = bytes.slice!(0, bytes.slice!(0, 1).unpack('C')[0])
File.open("/tmp/#{name}", "wb") { | out |
  # write only the payload; the block closes (and flushes) the file before it is opened
  out << bytes[0, bytes.slice!(0, 8).unpack('q')[0]]
}
`open /tmp/#{name}`

Yes, run it now, you skeptical naysayer, I'll wait (and cross my fingers). Oh and make sure you have

You should even be able to do it in a one-liner:

curl https://raw.githubusercontent.com/kristopolous/PiggyPack/master/piggyunpack.rb | ruby

[1] iTunes works on a Mac. Just look in /tmp/ (Cmd+Shift+G in Finder) for the file with a .3gp extension.

Did that work?

Ohh, I really hope so! Feel free to follow it on the GitHub page. I'm working on an easy encoder so that you can play too.

How it Works

The basic idea is that you can take the lower, less significant bits of the color values, discard them, and then replace them with arbitrary data.

Although we all now enjoy displays with at least 8 bits per RGB channel (24 bits per pixel), it was probably within your lifetime that you had a 16-bit display, which actually wasn't that bad and displayed things nearly adequately.

In the 16-bit world, RGB was usually split into 5 bits for red, 6 for green, and 5 for blue (your eye is most sensitive to green, then red, then blue).

So that's what was done here: I took the most significant 5 bits of red, 6 of green, and 5 of blue, and kept them intact. The lower bits of each RGB triplet then form 8 bits (the 3 remaining bits from red + 2 from green + 3 from blue), or one byte of the audio file. This gives a pretty easy and direct mapping, so the math and decoding are straightforward.
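The mapping for a single pixel can be sketched as follows. This is a minimal illustration, not the project's code: the helper names `embed_byte` and `extract_byte` are made up for this example, while the masks and shifts mirror the 3 + 2 + 3 split described above.

```ruby
# Hypothetical helpers illustrating the 3 + 2 + 3 bit split.

# Hide one payload byte in the low bits of an (r, g, b) pixel.
def embed_byte(r, g, b, byte)
  [
    (r & 0xF8) | ((byte >> 5) & 0x7), # keep top 5 bits of red, store payload bits 7..5
    (g & 0xFC) | ((byte >> 3) & 0x3), # keep top 6 bits of green, store payload bits 4..3
    (b & 0xF8) | (byte & 0x7)         # keep top 5 bits of blue, store payload bits 2..0
  ]
end

# Recover the payload byte from the low bits of a pixel.
def extract_byte(r, g, b)
  ((r & 0x7) << 5) | ((g & 0x3) << 3) | (b & 0x7)
end

pixel = embed_byte(200, 117, 54, 0xA7)
puts pixel.inspect
puts extract_byte(*pixel).to_s(16)
```

Every channel keeps its high bits, so the pixel stays visually close to the original while still carrying a full byte.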

Note: As a few have pointed out, this is a known use case of steganography, with the Wikipedia page describing something really close to what I did. Well shucks, due diligence bites me again.

The Format

Since we are overlaying on an image, it can be assumed that our payload won't match the X * Y total pixel resolution of the image exactly (the payload size could be a prime number, for instance). Because of this, a small format was developed.

The header is

  1. 1-byte number specifying the length of the file name
  2. The filename itself
  3. 8-byte LSB-first (little-endian) number specifying the length of the payload
  4. The payload itself

After the bits are extracted from the overlay image, the above system goes to work and generates the file hidden beneath.
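As a rough sketch of that layout, the container could be built and parsed like this. The helper names `pack_container` and `unpack_container` are hypothetical, and the sketch pins the length field to little-endian with `'Q<'`, whereas the scripts in this post use the native-endian `'q'` directive; that choice is an assumption of this example.

```ruby
# Hypothetical helpers for the container layout:
# [1-byte name length][name][8-byte little-endian payload length][payload]

def pack_container(filename, payload)
  name = filename.b # binary copies, so sizes are counted in bytes
  data = payload.b
  raise ArgumentError, "file name longer than 255 bytes" if name.bytesize > 255
  [name.bytesize].pack('C') + name + [data.bytesize].pack('Q<') + data
end

def unpack_container(blob)
  name_len = blob.unpack1('C')
  name = blob.byteslice(1, name_len)
  payload_len = blob.byteslice(1 + name_len, 8).unpack1('Q<')
  payload = blob.byteslice(1 + name_len + 8, payload_len)
  [name, payload] # anything after the payload is padding and is ignored
end

blob = pack_container("song.3gp", "not really audio")
name, payload = unpack_container(blob + "\x00" * 16) # simulate leftover pixels
puts name
puts payload
```

The explicit payload length is what lets the decoder stop early and ignore the low bits of whatever pixels were never overwritten.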

The Encoder

This is much more of a mess, and the explanation is left as an exercise for the reader.

#!/usr/bin/env ruby
filename = "audio.3gp"
overlay_name = "base.ppm"
out = File.open("binary.pgm", "w")
# binary mode, so that bytes.length counts bytes rather than characters
handle = File.open(filename, "rb")
overlay_handle = File.open(overlay_name, "r")

3.times do out << overlay_handle.readline + "\n"; end
bytes = handle.read

# The length of the file name (1B)
# the filename itself
# The length of the file (8B)
# the file itself
numbers = [
  filename.length,
  filename.unpack("C*"),
  [bytes.length].pack('q*').unpack('C*'),
  bytes.unpack("C*")
].flatten

len = bytes.length

offset = 0
overlay_handle.each_line { | line |
  line = line.split(' ').map{ | x | x.to_i }

  0.upto(line.length / 3 - 1) { | x |
    if offset < numbers.length
      byte = numbers[offset]
      pixel = x * 3
      # xxx0 0000 Red gets the top 3 bits of the byte in its low 3 bits
      line[pixel] = line[pixel] & 0xF8 | ((byte >> 5) & 0x7)
      # 000x x000 Green gets the next 2
      line[pixel + 1] = line[pixel + 1] & 0xFC | ((byte >> 3) & 0x3)
      # 0000 0xxx Blue gets lsb 3
      line[pixel + 2] = line[pixel + 2] & 0xF8 | (byte & 0x7)

      offset += 1
    end
  }
  out << line.join(' ') + "\n"
}
out << "\n"
out.close
`convert binary.pgm binary.png`

Future Work

I packed 8 bits per pixel because it was computationally easy and compact to express in code. However, I guess one could play with colorspaces (since we are much more sensitive to the luminance of a color than to its hue). Also, right now 1/3 of the image data above (ignoring the header) is the audio; I am wondering how high that can go while still keeping the change nearly imperceptible.
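For playing with that trade-off, the arithmetic can be sketched as below. Both helper names are hypothetical, and the sketch assumes 8 bits per channel and ignores the few header bytes of the container.

```ruby
# Hypothetical helpers: how much payload fits for a given bit budget.

# Whole bytes of payload a width x height image can carry.
def capacity_bytes(width, height, bits_r: 3, bits_g: 2, bits_b: 3)
  width * height * (bits_r + bits_g + bits_b) / 8
end

# Share of the 24 bits per pixel that is given over to the payload.
def payload_fraction(bits_r: 3, bits_g: 2, bits_b: 3)
  Rational(bits_r + bits_g + bits_b, 24)
end

puts capacity_bytes(1024, 768)                            # current 3 + 2 + 3 scheme
puts payload_fraction                                     # current share
puts payload_fraction(bits_r: 4, bits_g: 4, bits_b: 4)    # a more aggressive split
```

With the current split, each pixel carries exactly one byte, so capacity in bytes equals the pixel count.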

If you take the original cat image and then put the modified one over it as a "Difference" layer in GIMP, you'll see a black image.

Now if you invert the colors, because brights are more perceptible than darks, it still looks pretty white (try looking at your monitor from an angle to see that something is there). The delta isn't noticeable, even when you are looking only at it.

If you run auto-equalize on it, you'll get something that looks more like a file, with a distinct header and payload, but the point I think is that there is certainly more wiggle room to pack.
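The faintness of the difference image follows from the bit budget. A small sketch, assuming 8-bit channels and the 3 + 2 + 3 split used here, computes the largest change any channel can undergo:

```ruby
# Low bits overwritten per channel in the scheme described in this post.
LOW_BITS = { red: 3, green: 2, blue: 3 }

# Overwriting n low bits can move a value by at most 2^n - 1.
max_delta = LOW_BITS.transform_values { |bits| (1 << bits) - 1 }

max_delta.each do |channel, delta|
  puts format("%-5s at most %d of 255 (%.1f%%)", channel, delta, 100.0 * delta / 255)
end
```

No channel can move by more than 7 steps out of 255, which is why the raw difference looks black until it is equalized.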

Detecting Widespread Steganography

After the revelation (to me) that this is called "digital steganography", a friend pointed out the quote: "For this reason, digital pictures (which contain large amounts of data) are used to hide messages on the Internet and on other communication media. It is not clear how commonly this is actually done."

This led to the proposal: "You should write a program to apply statistical tests to least significant bits and see if they match expected randomness distributions. You could even profile the output of various digicam models." Wow, talk about rabbit holes. I may do so. How very interesting.
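One possible starting point for such a test, offered only as a sketch: a chi-square statistic comparing the distribution of the low bits against a uniform one. The helper name `low_bit_chi_square` is made up, and what counts as "expected" for an untouched camera image is exactly the open question, so the sketch only computes the number and draws no conclusion from it.

```ruby
# Hypothetical helper: chi-square statistic of the low `bits` bits of a list
# of channel samples, measured against a uniform distribution.
def low_bit_chi_square(samples, bits)
  buckets = 1 << bits
  counts = Array.new(buckets, 0)
  samples.each { |v| counts[v & (buckets - 1)] += 1 }
  expected = samples.length.to_f / buckets
  counts.sum { |c| (c - expected)**2 / expected }
end

uniform = (0..255).to_a * 4     # every low-bit pattern equally often
flat    = Array.new(1024, 128)  # a flat gray area: one pattern only

puts low_bit_chi_square(uniform, 3)
puts low_bit_chi_square(flat, 3)
```

A statistic near zero means the low bits look uniformly distributed; a large one means they are concentrated on a few patterns. Comparing such numbers across known-clean and known-modified images would be the next step.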

About

The author is Chris McKenzie, a programmer dedicated to truth, no matter how crazy it gets. Check out his projects on GitHub.