Well, hi everybody. I'm Micah Scott. I'm an artist and hardware hacker. I've been posting
a bunch on Twitter lately about kind of a weird new project I'm working on, and a lot
of people have been curious about it. Let me show you something. This, this machine.
You can buy it on Amazon. $80. High-tech, brushless motor. Three different lasers.
Photodiode amplifier, stepper motor, solenoid, and it can carve patterns onto these plastic
disks. Resolution of like 150 nanometers or so. Like, this machine is pretty neat. It's
sort of a desktop manufacturing device, but like, the things it manufactures are so tiny
we don't usually think of it as manufacturing. So anyway, it's this device I have. It's a
common Blu-ray burner. Well, so what am I doing with this? Well, I'm hacking it. I'm
trying to kind of get inside it, change the patterns that it burns on the disks. You know,
I've seen people take the individual parts out of these burners. Some researchers built
this really cool microscopic tweezer by modifying a DVD burner, and then they used the lens
to kind of focus a laser beam just right so they could hold cells in place while they
worked on them. They just bought like the optical pickup assembly, like just this front
part off of eBay, and then developed their own really simple electronics for controlling
it. But there are control electronics in here that are pretty great. So far, I know that there are at least
three CPUs on here. They process data at, you know, dozens of megabytes a second. These disks,
they hold 25 gigabytes on just one layer. The processor in here is actually not very
fast, but it's got a lot of interesting little bits of hardware that all talk to each other
to make this job work. What I'm really interested in doing is, you know, how can we hack the
hardware we already have on here to kind of do our own work and control these devices
to make our own things? And so, so Micah, what am I doing with this thing? I'm kind of a
hardware hacker, right? Like, I've been doing this for a long time, but I'm also an artist
and I've been really focusing on, you know, what can I do that's going to really open
up, you know, some kind of dialogue or some kind of, you know, make some kind of human
connection? And so, you know, this project has that side too. Cheap open source holograms
for digital graffiti. And I want to use this as a way to kind of open up dialogue across
boundaries that people might ordinarily, you know, kind of shy away from. And I want to
do that by kind of taking advantage of this fact that we're really attracted to shiny
new technology. So what if I can make something that seems really shiny and new and something
that's never really been done before, but do it for free and put it out there as open
source and make the parts cheap enough that I can just, you know, have like a cart set
up in an alley in the city and anybody can just, you know, write a pattern or, you know,
type a word on a tablet. And for, you know, a dollar or $2 worth of parts, I could give
them a kit like graffiti that's non-destructive and communicates a message. And so it's really
crazy. I have no idea if this is going to work. If it doesn't work, maybe I just get
a really awesome, fun machine that breaks itself and burns up disks, which would also
be kind of neat. So, you know, that's kind of my pie-in-the-sky end goal. So I've got these
robots and they're pretty neat. And normally when you plug the robots into a computer,
they show up as a USB storage device. So this is like the same kind of device as, you know,
a USB thumb drive or hard drive. It's this kind of umbrella USB standard that translates
USB into SCSI, which is the kind of like old school low level storage protocol that all
your disks know how to speak. And that protocol normally says things like, you know, turn
on the disk, read block number four, read block number 50,000. And it turns out that
there are also some commands that do things that aren't in the specifications, things
like upgrade the firmware. I had a suspicion that I would be able to find some commands
on a drive like this and kind of get inside it and start to figure out what makes it work
for that eventual goal of making my own firmware for it. This all started as kind of like a
really crazy idea that I had just before leaving a conference overseas. And so I had, you know,
I had some kind of late night googling sessions trying to find, you know, what's the most
popular CD burner, you know, how hackable does it look? And I found out that not only
is there this, you know, this one is made by Samsung. It's like the number one best seller
right now. It's like $80. Not only was there a really good popular burner that had some
really good specs, with hardly any googling at all, I was able to locate not one, but
two firmware updaters. So that was great. So that's a starting point, right? You know,
you don't really know anything about this thing other than inputs and outputs. You can
find specifications about what the disks that it reads look like and what goes over the
USB cable. You don't know what goes on in here. So it's a black box problem. You know,
even if I can't modify the firmware updates, if I can understand how the firmware updates
work, then that gives me a lot of information about what goes on inside these drives. This
actually all started on the transatlantic flights back from the conference. You know,
I figured out this drive was a good candidate and found these firmware updaters. I really
like visualizing files. Sometimes I'll actually just open up a binary in Photoshop. There's
a cute little trick you can do with the PGM file format, this old school portable graymap.
And since the header is just text, you can just on the command line create a PGM file
with your binary and give it whatever shape you want. So this is what the firmware looks
like in Photoshop. This is pretty much my first starting point. I just kind of arbitrarily
chose the width. This is really just a one dimensional chunk of bytes. The first thing
you can see, just even scrolling through this in Photoshop, is that there are these areas that
seem really different. And if I zoom out, you can see these pretty clearly up here. These
white bands are areas that are all 255 or all binary ones. And those indicate flash
memory that's been erased. So anything that's all white is stuff that's been erased and
not programmed. What's interesting is you can tell that when they assemble this image, they chose
to break it up into these specific pieces. First of all, you've got these two small pieces
up here, which seem to kind of go together. I happen to know that this boundary right
here at the kind of end of this kind of larger blank area is a really round number in binary.
This is exactly 64 kilobytes. In order to update a system safely, you have to have some part
of the system that kind of stays there permanently and doesn't get replaced when you do updates.
That portion is referred to as the bootloader. Early in the reverse engineering process, it's
a really important thing to identify if I want to understand how this firmware update
process works so that I can kind of modify it and put my own firmware updates in there.
I need to understand how the bootloader works enough that I can give it a firmware update
that looks correct. Ultimately, it's the bootloader's decision whether or not to transfer control
into the firmware that I've given it.
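The PGM trick and the erased-flash bands described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the default width, and the minimum run length are made up for the example and are not taken from the talk or from the CoasterMelt repository.

```python
def to_pgm(data, width=1024):
    """Wrap raw bytes in a binary PGM (P5) header so an image editor can open them."""
    height = -(-len(data) // width)                    # ceiling division
    padded = data + bytes(width * height - len(data))  # pad the last row with zeros
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + padded


def erased_runs(data, min_len=256):
    """Yield (start, end) offsets of 0xFF runs at least min_len bytes long.

    Long runs of 0xFF are what erased, unprogrammed flash looks like,
    so they tend to mark the gaps between separately assembled pieces.
    """
    start = None
    for i, b in enumerate(data):
        if b == 0xFF:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                yield (start, i)
            start = None
    if start is not None and len(data) - start >= min_len:
        yield (start, len(data))
```

Writing `to_pgm(firmware_bytes)` to a `.pgm` file gives the grayscale view, with the width chosen arbitrarily as in the talk, and `list(erased_runs(firmware_bytes))` lists candidate boundaries such as the one at 64 kilobytes.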
Down here, you have the, this is the like actual firmware. And so one thing I noticed
is you see these patches here where there are all these repeating kind of finger patterns.
Those are aligned words. I recognize those from ARM instructions
where they're always very rigidly aligned. Then down here, this area is much lower contrast
and doesn't kind of stand out the same way as this stuff. Like these I think might be
like lookup tables and then scroll down some more and then there's all this stuff. This is definitely
not code or maybe it's like weird code like jump tables or something. But then we see
another one of these breaks, all ones. Somebody intentionally made a division there and it
looks completely different. It's lower contrast. You see these little, little kind of finger
patterns, the word aligned bits, but they're not aligned vertically. These are instructions
for a different processor. These blocks are actually two firmware images that I think
are for this other DSP processor and one firmware image for this other weird little processor.
It's actually an 8051, this really old school 8-bit processor that is actually still really
common for USB devices. It can be really low power and you know, you don't have to license
them. There's this program IDA, the interactive disassembler. If these firmware images are
just big complicated sets of instructions, binary files that you don't really understand
at all, you can start kind of making margin notes on them. So you can say, oh, this looks
familiar or you know, I kind of want to remember what's going on here. So you can make a note
there, you can give it a label and then it just mercilessly cross references everything.
So you get these little understandings that start building up to bigger understandings.
You start with things you know. Here's what a file like this looks like in IDA when we first
open it up. We just have to start from a raw image, which really just means this is a
file that goes into memory somewhere. That's pretty much all I know.
So what can we do for analysis? Here's our first problem. We actually have multiple processors. And
so it turns out that, you know, the way I actually have to do this is I have to isolate
the firmware images for each separate processor and disassemble them all separately in IDA.
Since the ARM processor is kind of running the show, we'll start with that. There's a bunch of options.
Where does this go in memory? Um, zero, good guess. There we go. It's really not much better
than a hex editor at this point. We've got a bunch of data and they're all bytes. We
don't really know what they are. We know that there is a bunch of code in here. So we could
just kind of, you know, based on Photoshop, guess, and point it right
about here. You have to tell it the value of this T register, whether it should be using
32 or 16 bit instructions. So this little CODE16 directive means from here on, it's
going to assume things are 16 bit. This isn't the right place for that to be, but I need
to start somewhere. Whoa, I've got something, the middle of a function somewhere. Anytime
there's code, it can assume that the thing after it is also code or the thing that it's
jumping to if it's a branch. So at this point, it actually knows enough to have some fragments
of code that are all connected together. But you know, it's not, this isn't an entire function
even. It's just a bunch of fragments of code. But what I do know is I know where the end
to that function is, which means I know where it can start disassembling something new.
But the thing is, the next function isn't going to start right after the previous function.
The instructions are a fixed size. You often need to refer to some data. That data might
be the same size as your instruction. So you can't include the data in the instruction.
You have to just kind of put it nearby, in a literal pool or a constant pool: constants, values
that don't change, kept together in a pool and put somewhere out of the way, which is
usually after your function returns. You know, this could be another function, but it probably
isn't. These could be instructions. It's kind of hard to tell. So I'm just going to pick
an address here that's, you know, some ways away and an even number. Try interpreting that
as code. And it just said, command MakeCode failed. So that says, you know, this is probably
actually just like a big lookup table. So it's a mess. I don't know how long, how
big this is. So I'm just going to go down here and pick another number and try to convert
that to code. And that looks like code. That's great. Okay. And then that's a branch, which
means that when the processor hits this, it's going to go somewhere else. And then the stuff
down here may or may not be code because IDA hasn't seen any instructions telling the
processor to go there. So this could be the end of the function. This could be just another
piece of a function that we haven't really totally figured out yet. So, you know, the
first step in this process is actually just kind of going through here. And, you know,
partly by trial and error, partly kind of by intuition, kind of figure out what's code
and what's data. But this actually starts going really fast once you find a starting
point. It's like, it's like if you can find a loose thread to pull on, then they just
kind of start unraveling. Oh, okay. So here's, this is actually a literal pool. Here's a
bunch of numbers that aren't instructions, but they're actually just words. DCD is a
directive in the assembly language that means this is just a word of data. It's just four
bytes of data that's not an instruction. IDA has noticed that other instructions refer
to these. So the really cool thing about IDA is this cross-referencing. If I'm interested
in this value, like let's say this refers, like this actually, I know that this number
looks like a memory address. So I'm going to tell IDA that this is a memory offset. And
IDA got kind of mad about that because I haven't told IDA that that memory exists
yet. So like these are actually RAM addresses. And this is like a memory-mapped I/O address.
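To make the literal pool idea concrete, here is a rough sketch of how a 16-bit Thumb PC-relative load finds its word of data. The instruction encoding is the standard Thumb one; the load address, the tiny hand-built image, the made-up RAM address, and the little-endian byte order are assumptions for the example, not details from the drive's firmware.

```python
import struct


def decode_ldr_literal(insn, addr):
    """Decode a 16-bit Thumb `LDR Rd, [PC, #imm]`.

    Returns (rd, literal_address), or None if insn is some other instruction.
    """
    if (insn & 0xF800) != 0x4800:    # top five bits 01001 mark the PC-relative load
        return None
    rd = (insn >> 8) & 0x7           # destination register R0..R7
    offset = (insn & 0xFF) * 4       # the 8-bit immediate counts words
    pc = (addr + 4) & ~3             # PC reads as instruction address + 4, word aligned
    return rd, pc + offset


def read_literal(image, base, literal_addr):
    """Fetch the 32-bit word the load refers to (little-endian is an assumption)."""
    return struct.unpack_from("<I", image, literal_addr - base)[0]


# Tiny hand-built example: one load, one return, then the pool after the function.
BASE = 0x100                         # hypothetical load address
image = (
    struct.pack("<H", 0x4801)        # LDR R0, [PC, #4]
    + struct.pack("<H", 0x4770)      # BX LR, end of the function
    + bytes(4)                       # padding
    + struct.pack("<I", 0x20001000)  # literal pool entry (made-up RAM address)
)
rd, lit = decode_ldr_literal(0x4801, BASE)
value = read_literal(image, BASE, lit)
```

Here the load at 0x100 refers to the word at 0x108, which is the kind of cross reference IDA records when it marks a DCD entry as used by an instruction.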
Errors are actually really useful in reverse engineering. It tells you that something's
wrong. You kind of start with this assumption that the thing already works. And if there's
an error, then the error is in your own understanding. It's actually really nice to add your own
errors to things. You know, here, here is where the problem might have started. And
you can start to figure out where that flaw in your understanding is. Oh man, this looks
like a big function. Subroutine. It's identified that this, in fact, is a function. And if
I go to graph mode, oh man, it's traced the control flow for that whole function. I spent
that whole transatlantic flight just disassembling the firmware and annotating it and adding
my own notes and trying to figure it out. So in that firmware, I know I can talk to
it over USB. And that's really, that's really the best kind of foothold I've got for getting
into the system. And once I'm inside, I really want to know how to communicate back with
the computer. I want to know how its USB interface works. So after running the firmware updater
a couple times inside a virtual machine with USB logging turned on, I got some log files
that I could then put into the tool that I wrote a couple of lifetimes ago to analyze
data like this. Here the operating system is asking for USB descriptors. Strings show
up pretty easily. They've actually formatted the serial number incorrectly. This is all
the Windows operating system. It's just initializing this like a normal SCSI device. And then you
can see the updater app takes exclusive control over the device. So Windows actually stops
doing its usual polling. And now the updater has control over it. And then the actual update
starts. So there's some commands that kind of prepare the update. Actually, these commands
just kind of check again to make sure the device is the right version. And then these
packets are actually the contents of the update. These bars indicate when a transfer starts
on the timeline and then when the transfer ends. The OUT packets are from the computer
to the drive as it's updating the firmware. And they complete really quickly. But then
you see these long pauses. And this is actually the drive kind of not getting back to the
computer when it asked for a status update. These IN transfers are the computer kind of
asking, "The transfer I just sent you, how's it going?" And usually these complete really
quickly. But this one was long and drawn out. And this is actually where the device is
programming its firmware. It drops everything. It doesn't even respond to the computer. It waits
for the flash memory to program. And this is the flasher tool that is now part of the
CoasterMelt Git repository. And so if you just run it with no arguments it'll tell you what
the disk is. This runs a standard SCSI inquiry. Which is just kind of, hello, how are you?
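As a rough illustration of that inquiry, the sketch below builds the 6-byte INQUIRY command block and picks the identification strings out of a standard 36-byte reply. The field offsets follow the standard SCSI layout; the sample reply is fabricated for the example, with a placeholder vendor and product string, and only the TS01 revision comes from the talk. How the command actually reaches the drive (the USB mass storage wrapper) is left out.

```python
import struct

INQUIRY_OPCODE = 0x12


def build_inquiry_cdb(alloc_len=36):
    """Build the 6-byte INQUIRY command descriptor block."""
    # opcode, EVPD=0, page code=0, allocation length (big-endian 16-bit), control=0
    return struct.pack(">BBBHB", INQUIRY_OPCODE, 0, 0, alloc_len, 0)


def parse_inquiry(resp):
    """Pull the identification strings out of a standard INQUIRY reply."""
    if len(resp) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")
    return {
        "device_type": resp[0] & 0x1F,                      # 0x05 is a CD/DVD class device
        "vendor": resp[8:16].decode("ascii").strip(),
        "product": resp[16:32].decode("ascii").strip(),
        "revision": resp[32:36].decode("ascii").strip(),
    }


# Fabricated reply for illustration; vendor and product are placeholders.
sample = (
    bytes([0x05, 0x80, 0x00, 0x02, 31, 0, 0, 0])
    + b"EXAMPLE "
    + b"BD-RE DRIVE".ljust(16)
    + b"TS01"
)
info = parse_inquiry(sample)
```

The four-character revision field is the part that matters for this project, since it tells you which firmware the drive is currently running.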
This includes the revision code. So you can see it has this TS01 firmware. So you can
use this tool to install any firmware image. It'll start out by just sending the device
some of these vendor specific undocumented commands that say, hey, I'm about to give
you a firmware image. And the device is like, hey cool, let me write that into my flash
memory. That's actually really common. Most devices have some kind of firmware that they
run and some kind of memory that they store it into. Most devices have a command you can
use to update that memory. The operating system doesn't really know about these commands.
They're just kind of between the application and the device. So you don't really need administrator
privileges to write over the flash memory on your USB drive. You just need a USB device
connection of some sort. You know, I'm not even really talking to the device at a really
low level. I'm sending it these SCSI commands, which are supposed to be for storage. But
you can use some of the magic SCSI commands to write to flash memory. Here I've just updated
the firmware on the drive. There's no validation happening at this point. The file always just
gets flashed into memory, but it will never overwrite that first 64 kilobytes. So it'll
never overwrite the bootloader. And the bootloader will somehow check the rest of that image to
see if it looks good before it transfers control into it. The first thing I tried was to just
make small modifications to the stock firmware images. Most of the changes I made resulted
in the bootloader refusing to boot the firmware. Instead of going into the firmware (TS01
is their revision name for it), it would give its version number as boot, indicating that
it's just stuck in that bootloader that only knows how to get new firmware. And by disassembling
the bootloader, I can see the SCSI commands that it accepts, and they're very limited.
It has some kind of stubs where it knows how to say, you know, I don't do that when you
ask it some things. So I started to experiment with this tool to see what I could change and
what I couldn't. And then started to kind of cross reference that with the disassembly
of the bootloader to try to figure out where these checks were being made. So this is the
very end of the firmware image. It looks like the kind of thing where if you wanted to kind
of like put a stamp on your firmware image and say like, this is what's inside, this
is where you put it. This first part is a copy of the same identifying string that is sent
back to the computer when it sends that "hi, how are you" SCSI inquiry command. Some other
stuff I don't know what it means, but it doesn't really seem to mean much. TS, PSST, some of
these Fs and As, I don't know what those are. But then these two bytes at the very end.
One thing that's really, really great about having multiple firmware updates that you
can find on the internet is that you can diff them. Even if you don't really know what's
going on inside, that gives you some idea of what they have in common. This first part
is pretty much the same, which indicates they didn't really change the bootloader, which
you would expect. At the very end, there are these two bytes that are different in these
two firmware images, and that's the only part of this little suffix that's different. I
was able to cross-reference that back with the disassembly of the bootloader and find
where it was checking those. Over a region of memory, it adds up all the bytes and keeps the low
two bytes of that sum. If there's an unintentional change in the firmware, this will usually
catch it. But it's a very weak check, and it's not cryptographic at all. If you know
how the check is computed, you can always recompute it. I don't want to make a change where, if
it does let me boot the image, I end up getting stuck. I want to make a change
that gives me some indication that I've made a change without breaking the firmware. I
started looking for things I could easily change, like identifying strings. I tried
to change the version number of the firmware. Nope, still in the bootloader. So I spent
a lot more time in IDA. So this is address Hex 10400, which is kind of right after that
first 64 kilobytes. So near the beginning of what would be kind of the replaceable part
of the firmware. And you can see up here there's some code. Somebody intentionally put these
at specific locations in memory. So this could be part of kind of the API that the bootloader
and the replaceable kind of application firmware have between each other. It has some things
in it that look like memory addresses. It has some things that look like random numbers.
And some of the things that look like random numbers are different between these two versions.
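Both of these observations, diffing two images to see what changed and recomputing the additive check at the end of the image, can be sketched as follows. This is a guess at the shape of such a tool, not the real patcher: the talk only says the bootloader adds up the bytes of a region and keeps the low two bytes, so the region covered, the byte order of the stored value, and the function names here are all assumptions.

```python
BOOTLOADER_SIZE = 0x10000  # the first 64 kilobytes, which updates never overwrite


def diff_offsets(a, b):
    """Return the offsets at which two equally sized images differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def sum16(data):
    """Add up all the bytes and keep the low 16 bits of the total."""
    return sum(data) & 0xFFFF


def fix_checksum(image, start=BOOTLOADER_SIZE):
    """Recompute the two trailing checksum bytes after editing an image.

    Assumptions: the sum covers everything from `start` up to the last two
    bytes, and the result is stored big-endian in those last two bytes.
    """
    body = image[:-2]
    return body + sum16(body[start:]).to_bytes(2, "big")
```

Because the check is a plain sum with no secret involved, anyone who changes a byte can recompute it, which is why it only protects against accidental corruption.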
That looks interesting. That's the kind of thing that I would be looking for if I was
looking for like another kind of hash. These individual checks are each 128 bits. You know,
I had a lot of guesses about which registers, you know, you put the key into, whether it was
hashing the entire image or hashing little segments. Before I'd gone to the extreme effort of making
a really thorough brute forcing tool to find out how the crypto hardware worked. You know,
I was taking a walk or taking a shower or something. And I realized that the signature
table itself isn't actually protected by any of the signatures. So if the patcher tool
that updates the checksum at the end of the image also just changes the length of the
signature table to zero, then it's fine. So at this point, I can install my own firmware
images and patch them so the bootloader will successfully jump into them. Now we need a target. Now
that we can change the code, I need to change something that will give me kind of a foothold
to explore the system some more. But it'll still be safe. It's still something that doesn't
happen automatically on boot, something that I can kind of disable if I need to. The data
is actually coming from the ARM processor itself and not from a DMA device directly.
The SCSI command I could find that met these requirements was this get performance data
command. I would run experiments on it to try to figure out what parts of it were relevant.
My strategy for dealing with this mess was to give myself a way of editing the code and
then use that editing tool to just kind of start removing everything I could while still
preserving that response back to the computer. So when I did this, I found a small set of
hardware registers that I could use in the right sequence to write replies back. Unfortunately,
it doesn't seem to be a general purpose reply kind of register. There's a lot of state in
the SCSI command processing that's really shared between the ARM processor and that
8051 8-bit processor. I can't really kind of completely repurpose a command only from
the ARM side as far as I can tell. I've written a really small backdoor patch that I can put
in firmware that replaces the SCSI command with a new SCSI command I've created that
knows how to read and write memory, call functions. I'm about to install the patched firmware
on the drive using this make flash command in the backdoor directory. Now we're inside:
read and write memory, execute code. I went a little bit crazy with writing debug tools.
I wrote this CM shell thing that's part of my little CoasterMelt project, an IPython
command line that kind of bridges Python and assembly language and C++ into this kind of
weird interactive debugger where I can just read memory. That little underscore is just
short for this scratchpad area, which is just some RAM that nobody seemed to be using. You
know, you could disassemble some code, the interrupt vector table, but the really fancy
stuff happens when this tool actually compiles little snippets of code and then runs them
from here. I have a console here that's sort of a standard out where you can send messages
back and it works just by having a little ring buffer in some unused memory where you
write messages there and then this console polls that buffer. And I can compile C code.
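The ring buffer console can be modeled in a few lines. This is a host-side simulation only, assuming a simple layout with one write counter and one read counter; the real buffer lives in the drive's RAM and its actual layout is not described in the talk, so the class and method names are hypothetical.

```python
class RingConsole:
    """Toy model of a console built on a ring buffer in shared memory."""

    def __init__(self, size=64):
        self.buf = bytearray(size)
        self.wr = 0  # total bytes written, advanced by the firmware side
        self.rd = 0  # total bytes consumed, advanced by the host side

    def write(self, msg):
        """Firmware side: append bytes, overwriting the oldest data when full."""
        for b in msg:
            self.buf[self.wr % len(self.buf)] = b
            self.wr += 1

    def poll(self):
        """Host side: return whatever arrived since the last poll."""
        if self.wr - self.rd > len(self.buf):
            # The writer lapped the reader, so skip to the oldest surviving byte.
            self.rd = self.wr - len(self.buf)
        out = bytes(self.buf[i % len(self.buf)] for i in range(self.rd, self.wr))
        self.rd = self.wr
        return out
```

The writer never blocks, which suits code running inside someone else's firmware; the cost is that a slow reader silently loses the oldest output.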
There's this eval C command in the debugger, compiles it into a little tiny self-contained
program, sends it to the drive, and then runs it. You know, for example, if I just evaluate
five, that's a really simple one, but I just compiled a little C program that just
returns five. You know, I could read memory. So I know we already have so many different
ways of reading memory, but you know, if I just wanted to read some memory, you know,
there's some byte stuff, that's fine. There's a variant of this that automatically opens
up the console so you can get results. This will just evaluate this print line, Hello
World, and open up a console automatically. So this just compiled this little C program,
sent it to the drive, and got its results. You know, so I've been keeping some notes
on recipes for this shell. There's some fun stuff you can do. So I want alternate ways
of getting input and output in here. This USB thing is great, but it means that I can't
really completely take control over the ARM CPU. I still need it to get the results back.
If I could just kind of like, forget about USB for a little while and use some other
communication channel, then I can use that to just kind of completely take control over
the ARM processor, poke directly at the hardware, poke directly at the other CPUs on the system.
Then I don't have to also contend with, you know, whatever else the ARM firmware is doing
inside its main loop and in its interrupt handlers. You know, there aren't a lot of
IOs on here, at least not a lot of easy ones. If you look inside here, you know, there's
a bunch of motors, lasers, amplifiers, a bunch of great hardware, but that's all really
complicated. I want like an LED. The easiest IO we have on here is actually on the front
panel. We've got one button and one LED, and that's great. That's exactly what I want.
One bit in and one bit out. I started making a lot of tools to try to figure out where
all this hardware was and figure out the flow of how the firmware worked. I found these
little movable memory regions. The firmware itself is in flash memory. It actually has
some encrypted functions that it will copy into RAM and then decrypt and then move that
entire block of RAM back over where the function normally was so that then it can call it.
If I want to kind of trap the CPU when it gets to a particular point and have it run
some of my code and then go back to what it was doing, I can do that now. This is an
address near the top of the main loop, and I'm going to tell it to go ahead and reset
the target before starting so that I have a clean slate and attach a console. Cool.
So it's resetting the disk. It takes a few seconds. It actually takes this thing a few
seconds to boot. Great, and now it's hooked. If I use the longer version of this hook command,
I can write a little block of C code. This will read from a register that seems to have
the button GPIOs in it. It's casting this number to a pointer because this is actually
a memory address, and then dereferencing that pointer to get some buttons. So if that bit's
zero, then I'm going to run the default hook, which is the kind of built-in function that
displays this stuff. It compiled that little block of code. If there were any errors in
it, it would have already told me about it. Now that's installed and it's running, and
the console's spinning, you can see this indicates that it's active and the device is alive,
but no output. And if I hold down the button, as long as I'm holding it down, I get output.
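The logic of that hook can be modeled on the host side like this. The hook in the talk is a small block of C compiled and run on the drive; this Python version only mirrors its control flow, and the register address and bit position are placeholders, since the real ones are not given.

```python
BUTTON_REG = 0x40000000  # hypothetical address of the register holding the button GPIOs
BUTTON_BIT = 1 << 0      # hypothetical bit for the eject button


def make_hook(read32, default_hook):
    """Build a hook that only runs the default hook while the button bit reads zero."""
    def hook():
        if (read32(BUTTON_REG) & BUTTON_BIT) == 0:
            return default_hook()  # button held: produce the usual trace output
        return None                # button released: stay silent
    return hook


# Simulated memory standing in for the drive's address space.
memory = {BUTTON_REG: BUTTON_BIT}
hook = make_hook(lambda addr: memory[addr], lambda: "trace")
```

Gating the output on a physical button is what turns the console into something like a triggered oscilloscope: the hook runs on every pass, but only reports while the button is down.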
So it's like a little oscilloscope. But if we can have some way of taking this ARM processor
and kind of putting it in like a little bubble and just looking at all the stuff that goes
in and out of that bubble, then we can get a much better idea of how the system as a
whole works. Emulators are actually just a really great reverse engineering tool. This
is something that, you know, reverse engineers working on video game consoles have done for
a long time, is if you don't understand how a platform works, you try to run a game. And
that really quickly tells you how to update your understanding of the platform so
that you can run the game correctly. I don't want to build an emulator for this entire
drive because that just involves a whole bunch of stuff. But if I can pretend to be the
ARM CPU, then I know how to program it to talk to all the other pieces. So I wanted
to create an emulated ARM CPU running on my desktop computer that could still communicate
with the rest of the hardware peripherals on one of these drives. If I'm going to completely
take over the ARM CPU, I can't rely on this USB connection anymore. I need my own way
of talking to the debugger. Okay, so that brings me to my current state of the art in
terms of where this project is at. Using a small hardware modification to give the device
a serial port so that I can then kind of switch the back door from the USB interface to the
serial interface, and then use that serial interface to perform all of the I.O. on behalf
of an emulated ARM CPU running on the debug host, in this case the Python shell. This
guy right here, MT-1939, which is the system on a chip that just kind of does pretty much
everything for this. It does tracking waveforms for the lens, it talks to the photodiode amplifier.
I haven't found so much as a serial bus that communicates between it and the other chips.
The other chips seem to only really do kind of high power heavy lifting. The one on the
right happens to also control the LED. So this pin I'm tapping on it is actually the
3.3 volt LED control signal. So that's the serial data going out through the LED. And
then for the eject button, the flat cable is kind of annoying to solder to. So instead I grabbed
it from a via over here by the processor. Your favorite 3.3 volt serial adapter here.
And I also have it connected to a logic analyzer so I can debug the timing. So right now I'm
attached to the device over SCSI. I can read memory. I can, you know, run code. Now I'm
going to use this BitBang command, which attaches to a new debug back door over this serial
port connection. So a bunch of stuff is going to happen. This just installed a hook on the
main loop, which delays the whole process so that the SCSI command the debugger sent can have
a chance to finish before the whole system hangs. Then we're inside of this loop that
turns off interrupts and just keeps the ARM processor stuck there, waiting for data
over this kind of BitBang serial port. Our current debug interface is this BitBang device.
And so we can read memory. We can read more memory, but it's a lot slower. Here's the
logic analyzer. Let's see what that actually looks like. This is the serial data from the
computer to the eject button. And here's the serial data from the LED back to the computer.
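A bit-banged serial line like this one boils down to framing each byte as a sequence of line levels. The sketch below assumes ordinary 8N1 framing (one start bit, eight data bits sent least significant bit first, one stop bit) and one sample per bit; the talk does not state the framing, the baud rate, or the packet format, so none of those are reproduced here.

```python
def uart_encode(data):
    """Turn bytes into a list of line levels, one per bit time (8N1 assumed)."""
    bits = []
    for byte in data:
        bits.append(0)                                  # start bit pulls the line low
        bits.extend((byte >> i) & 1 for i in range(8))  # data bits, LSB first
        bits.append(1)                                  # stop bit returns the line high
    return bits


def uart_decode(bits):
    """Recover bytes from line levels, skipping idle-high time between frames."""
    out = bytearray()
    i = 0
    while i < len(bits):
        if bits[i] == 1:                                # line is idle
            i += 1
            continue
        frame = bits[i:i + 10]
        if len(frame) < 10 or frame[9] != 1:
            raise ValueError("framing error")
        out.append(sum(bit << n for n, bit in enumerate(frame[1:9])))
        i += 10
    return bytes(out)
```

On the drive side the same framing is produced by toggling the LED pin and sampling the button pin with carefully timed loops, which is why a logic analyzer is useful for checking the timing.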
And so, you know, I send these little commands that have a bunch of padding and some bytes
and then I get a response with a bunch of data and then checksums. So that's cool. I've
got a debugger now. And this doesn't need the USB interface at all. If I can run the
boot process inside an emulator, then I can figure out how to set up these other CPUs.
And I know how to get from a completely uninitialized system to a system that can at least talk
to the outside world. If I'm going to make a standalone firmware for this, that's a good
starting point. There's this sim command. First step, initialize simulation state, program counter
zero. And then it starts single stepping. It tells me all the instructions it runs.
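The idea of running the ARM core in an emulator on the host while the real hardware answers its I/O can be sketched as a memory bus that keeps RAM locally and forwards everything else. This is a simplified model under assumptions: the single address split between RAM and hardware, the class name, and the callback signatures are invented for illustration, and in practice the remote callbacks would go through the serial debug connection.

```python
class TracingBus:
    """Memory bus for an emulated CPU: local RAM plus forwarded hardware registers."""

    def __init__(self, ram_size, io_base, remote_read, remote_write):
        self.ram = bytearray(ram_size)
        self.io_base = io_base            # hypothetical split between RAM and hardware
        self.remote_read = remote_read    # callable(addr) -> 32-bit value
        self.remote_write = remote_write  # callable(addr, value)
        self.trace = []                   # every load and store, in order

    def load32(self, addr):
        if addr >= self.io_base:
            value = self.remote_read(addr) & 0xFFFFFFFF   # ask the real hardware
        else:
            value = int.from_bytes(self.ram[addr:addr + 4], "little")
        self.trace.append(("LD", addr, value))
        return value

    def store32(self, addr, value):
        value &= 0xFFFFFFFF
        if addr >= self.io_base:
            self.remote_write(addr, value)                # poke the real hardware
        else:
            self.ram[addr:addr + 4] = value.to_bytes(4, "little")
        self.trace.append(("ST", addr, value))
```

The trace list is the interesting output: it is the record of everything that crosses the boundary of the bubble around the CPU.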
And here you can see it giving me a stream of all the loads and stores the processor
makes. It's very repetitive. Loops, clearing memory, waiting on hardware. And I have some
ideas of where to start as far as talking to the USB controller. But the DSP is still
a huge mystery. Like, I don't even know what instruction set it is yet. And I'm only really
guessing about what it is based on context. There's still a lot to do. And at this point,
it's kind of a neat, you know, it's like this point and click adventure game. So if you
ever played like Monkey Island or King's Quest as a kid, it's kind of like, like that except
it's a very big world. And, you know, once I get to the end, instead of just winning
the game, hopefully I make holographic graffiti that everyone can have. Thanks for watching
everyone. And, you know, if you like it, let me know and I'll try to do some more of these
and keep you guys posted. Bye.
