Come to think of it, I should have written a shorter biography.
So thank you for still being here this late in the afternoon.
I'm going to give you an entertaining and interesting talk.
So I'm going to start just to engage my audience.
I'm going to start in a little bit unorthodox way.
I want to show you what's in my bag.
So this is my conference survival kit.
So among other things I have an HDMI to VGA adapter;
I forgot the VGA cable, but the DVI cable is there, and a network cable too.
So we all carry these kind of things
because when we get to do the presentation,
we better be able to plug in our laptop into the projector.
And I have a colleague who carries a much bigger
ziplock of stuff with him.
There hasn't been a single time
when I traveled together with him
that somebody didn't ask for something from his ziplock.
Now come to think of it, in the year 2016,
there's got to be a better way to do that.
And if you read the title of my talk,
I think you are kind of starting to guess where I'm going.
So another thing that I also want to observe is all the machines
that we carry around with us.
So here is my Dell.
It's a PC.
I like Linux, so it's running Linux.
But I also have Windows in case somebody
emails me PowerPoint stuff.
There's one guy here with an Apple.
So we do have our preferences, what kind of hardware
we like, what operating systems we like.
And we think we care about it.
But that caring is misplaced.
We actually shouldn't care about all this junk down there.
What we care about are the IO devices
with which we interact and applications.
As long as I have the proper IO devices with which
I can interact, and as long as the computing system,
whatever it is, can deliver my application at performance,
I don't care what's inside the box.
Try an experiment.
Get a non-technical person, put them at the desktop
in your office, and ask them to point to the computer.
What are they going to point at?
They're going to point at the keyboard, mouse, and monitor.
They won't even be aware of the existence of that ugly box
sitting under the table.
So that's another sign that what we really care
about are the devices we interact with and the applications.
And in these days, when we talk about everything cloud,
here is the system that I would like to have.
At first this is going to sound philosophical.
But in fact, I'm going to gradually come to something
real.
I'm going to go all the way down to the implementation
details of the system, of the experimental system
that we built that does exactly what I'm going to describe to you.
And I'm going to show you some videos of what
you can do with the system.
So here is what I really want. When I walk into this room,
there are a bunch of devices.
Here is the projector.
Here's the microphone.
There's this gentleman with a camera.
He has headphones on.
But he's probably monitoring the quality of sound.
But what if I want to use those headphones just
to listen to some music?
So I would have to rewire them, put them somewhere else,
go to Pandora or whatever music service.
So I don't want to do that.
I want to walk into the room.
I want to pull out my cell phone and identify what devices
are around here and which of them are willing to serve me.
And then I want to punch in a couple of easy user-interface
commands, and I want my application
to be served to the IO devices that I currently care about.
And the key here is the network.
First, I want each device to become a first-class network
resource: an IP-addressable resource with a native network
interface that can send and receive an IP stream carrying
the media that is native to that device.
So in the case of this headset, it's a bi-directional device,
so it's an audio signal in both directions,
with whatever compression you want.
For monitor, of course, it's obvious it's video.
Here, it's some kind of an encoded events that
represent my keystrokes.
But if I can transfer that over the network,
then, aside from the latency problem (which we can take care of
by proper strategic placement of computing resources
and by properly solving the context migration problem),
I can have a system in which I simply
come in, snap my fingers, and have whatever computing I
need, whether it's from my office, from my home,
whether it's a presentation for this conference,
or whether it's a game.
I can have it wherever I am, given the availability
of the devices that I interact with.
So the idea is that all of the computing,
except the native IO media, is now in the cloud.
You probably need some kind of more distributed cloud
architecture; actually, not probably, you definitely need it.
And the only thing that's in the hands of the user
are the IO devices, and the network is the long wire.
So now, if I, for example, bring up
my desktop in that kind of way, and I need to show somebody
else what's on my desktop, or I need to swap screens
with somebody at another desk,
it's just a matter of redirecting the network flows.
So in the case of this presentation,
I would just walk in, pull out my cell phone,
find that there is a projector available,
punch in a couple of commands, and somewhere in the cloud,
the PowerPoint would run, or a PDF viewer,
and it would stream this presentation
to the projector.
And I could even use my phone to become my clicker
by faking out the page-up and page-down buttons.
In fact, that's one of the demonstrations
that we did many times, for many years,
whenever some important people visited the labs.
And I actually find that demonstration already a bit
boring myself, because it doesn't really
show the full capabilities of the system.
But when management goes around and shows it off,
they actually appreciate presentations.
But that's the idea.
So if I have such a system, then if I'm at my desk
and I walk into the meeting room, I can make my desktop,
my office, follow me.
If I marry that with location services, which
is what we also did (we actually deployed a bunch of beacons
across the building), then as I walk from one office
into another, all I have to do
is have my phone in my pocket.
I sit down and, boom, my desktop shows up
at the first available device that is in that room.
And then I go home, and I have exactly the same thing,
and I continue to work.
So before we go there, let me now show you
a couple of videos so that you can see it live.
This is not just a philosophical vision.
This is the system that really has been implemented and works.
And I'm going to show you, oopsie, it already started.
All right, there we go.
So I'm going to be stopping and starting this video.
And I'm going to talk you through it.
So initially what we have, this is
what one possible control interface may look like.
So this represents my computing resource.
And this is actually a capture of a real system.
That's what it really looks like in reality.
The interface right now is a web browser,
but there's no reason why this cannot be an app on a phone.
It's just a simple matter of programming.
And right now, the state of this system
is that I have two computing resources that
are allocated somewhere in the cloud.
And each one has three devices associated with it.
So one is the monitor, one is the mouse,
and one is the keyboard; and the same again for the other.
Now, if you're starting to paint a picture
in your mind that says, oh, he just built a fancy VNC: no.
This is much different from the traditional thin-client model.
Every device is a completely independently manageable
network entity.
It's not that I have one thin client that looks
like a 2016 version of the VT220; each device
is a separately managed, independently managed entity,
and I can move them around freely.
I can compose all kinds of systems.
And you're going to see in the video
some really interesting use cases.
So that's the state of my system.
And here, let's play this video, and this
is what I see on the screen.
So here are those two.
I put them next to each other.
And here are the keyboard and the two mice that are over there.
And these are now, of course, legacy devices.
So these are real cable devices.
So the way I make them network-attached
is that I hook up a Raspberry Pi board,
and I just ask you to believe me that one day in the future
everything is going to shrink enough, become small enough,
to be integrated into the device itself.
We actually already have that capability for monitors;
it's called a Smart TV.
So right now, for all these devices,
I use the Raspberry Pi and the adapter.
But for all practical purposes, you
can consider that it's actually a device integrated
with the network interface.
And the only universal common denominator is the IP flow.
So that's the state of the system.
So on each desktop, glxgears, the Hello World of graphics.
And these desktops are streamed from the cloud
to these two monitors.
And let's see what happens.
So the video is going to be flipping
between the management interface and what happens here.
But it's not fake, it's real.
So here, I move the monitor representation
in the control interface.
And I think you can guess what's going to happen.
And basically, now I have the bottom computing resource
streaming its desktop to two monitors.
And the other one is still there,
but I can't see it because it's not streamed.
Next, we move the other monitor to the first resource.
And I got the effect of basically swapping two monitors.
Now, if you're following this, let me ask you.
So if I sat at the keyboard, what would happen?
Yes, exactly, because I didn't move the keyboards
and I didn't move the mice.
So let's keep going.
So now I'm going to return the system
into the previous state.
Now think of it: you can actually build a broadcast system
that way, where I stream to multiple sites.
Or I can have a presentation system implemented that way,
because the moment I replace the application on it,
instead of desktop, I put a PowerPoint,
well, then it's the presentation system.
Or if I put a game, then it's a remote gaming system.
There are actually a number of startups
that were doing remote gaming.
Now, here is another use case.
So here, this is now going through the entire process
of starting the resource in the cloud.
And this time, I'm actually emulating
a traditional thin client.
So I'm going to start the client program here.
And it's going to connect.
Basically, this is going to be a piece of software
on a traditional machine.
And it has created three ephemeral devices.
And now the desktop is coming up.
But the system is flexible enough to allow
for all kinds of media.
So we saw desktops streamed to traditional monitors,
and here one streamed to a software client.
And it is built such that it actually
uses the acceleration feature of the underlying GPU.
So you can actually run some really fancy graphics.
And it's optimized, heavily optimized,
for doing this very efficiently.
In fact, as I start moving this,
you see the desktop effects are working very smoothly.
I challenge everyone to do that
with VNC or with Citrix.
So here is another interesting thing;
I'm going to stop the video here.
I'm now running two instances,
and if you pay close attention,
this one is in the web browser.
As far as the desktop is concerned,
both are just monitor connections.
The desktop actually can't tell
that this is something streamed from the cloud.
But this one is being streamed using WebRTC,
and this one using traditional H.264, okay?
So with WebRTC, I can display it in the browser.
And again, I'm going to demonstrate
that effects are still working fine.
I did a couple of cuts in the video
to make it shorter.
But now, with one keyboard and mouse,
I'm controlling both, because it's the same desktop,
just overlapped.
And now here is another thing.
Out there in another room, viewed through the webcam,
there's this little netbook,
a 2007-model, really crappy machine.
It has been reconfigured to run the screen emulator software
that is compatible with the protocol we use to stream.
And that's in the other room,
that's why you see it over the webcam.
And I'm gonna use the control interface
to attach that one.
So even though it's in another room,
let's see what's gonna happen.
I should have done a cut here.
So over here in my directory of devices
that are known to the system,
this screen is known, but only the screen.
Okay, so screen and keyboard are separately managed.
So I'm looking for it, I'm finding it
and dragging it over to the computing resource
and it's right there.
And that's the same desktop
that is also being streamed to this other monitor,
which I believe is still the WebRTC one, yes.
So it's a little hard to see because of the glare,
but when I move the window,
you're gonna see that they are being moved as well.
All right, there was a cut here.
So what I'm doing now over WebRTC
is starting the game OpenArena,
for those who are into Linux graphics and stuff.
And the game is very much playable.
And it's all because the underlying infrastructure
is so well optimized to run at performance.
I'm gonna go into the implementation details;
everything is about latency.
WebRTC actually gives me a little bit of extra latency,
primarily because of the browser,
but my native Raspberry Pi client is optimized
to do it within a frame.
So there you have a truly local experience.
Now this one is really sexy.
So here, on a tablet, I am emulating a screen.
And now I just brought in another tablet, okay?
So now something interesting is gonna happen.
So I believe that some of you or most of you
are or have been at some point programmers.
And I'm gonna challenge you to tell me
how much programming it takes to do this.
Pretty cool, right?
All right, the answer is it takes no programming.
Now why?
Why?
Because when the infrastructure is created
such that the network flows are perceived by the guest system
the same as physical IO devices, it just works.
How did I implement that?
I bound two screens that just so happen
to be emulated with tablets.
And what does the typical desktop do
when you connect two monitors to it?
It starts driving them and you can configure a desktop
to be a wide screen across two monitors.
So that's what I did.
Well, I cheated.
I created the wide screen across two monitors
and declared that one tablet is the first monitor.
The other tablet is the second monitor.
I brought them in.
So that was a traditional desktop operation
moving across the wide desktop that just so happens
to be on two monitors.
And those two monitors just so happen to be network flows
that are being streamed to tablets.
And it's a very kind of a neat application
that just comes for free.
It just works.
So let's go back to the presentation.
So now that you have seen what I can do with the system,
so let's go and quickly discuss how we built it, okay?
And I'm gonna go a little bit
into implementation details.
Personally, I'm a hacker in the first place,
and only after that an engineer and a scientist.
So I always appreciate implementation details.
So here is the high-level picture.
At the center of the system,
we have something called the round controller,
which is a traditional database
with a REST interface for controlling it.
And basically we control it from any control device.
That's where all of the non-reconstructible state
of the system is.
It tells you where the devices are,
what their location is, what kind of devices they are,
what protocols the devices speak.
It also tells you where the servers are
that can run the applications, and so on, okay?
So everybody needs to talk to it.
A device talks to it because it needs to report its status,
its location, and its willingness to serve.
Servers are also both controlled by the round controller
and report their status to it.
And any control operations are done through the REST
interface from a control device,
which can be a web browser or the phone in your pocket.
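To make that concrete, here is a minimal sketch of what one bind operation through such a REST interface could look like. The controller address, endpoint paths, and JSON fields are my own illustration, not the actual interface; only the pattern, querying the device directory and then binding a device to a computing resource, comes from the talk.

```python
# Hypothetical control client for a round-controller-style REST interface.
# Endpoints and fields are invented for illustration.
import requests

CONTROLLER = "http://controller.example:8080"  # assumed address

# Ask the controller which devices are known in this room,
# their status, and the media protocols they speak.
devices = requests.get(f"{CONTROLLER}/devices",
                       params={"room": "conf-42"}).json()

# Pick the first available monitor in the room.
monitor = next(d for d in devices
               if d["type"] == "monitor" and d["available"])

# Bind it to a computing resource; the controller establishes the flow.
resp = requests.post(f"{CONTROLLER}/resources/desktop-1/bindings",
                     json={"device_id": monitor["id"],
                           "media": "video/h264"})
resp.raise_for_status()
print("flow established:", resp.json())
```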
On the application server,
we capture the application in an LXC container.
And there's no reason why it cannot be a virtual machine;
we actually did bring up a virtual machine
inside such a container.
So when it's Linux, we simply bring up the whole file system
in the container.
If you want to stream Windows,
then we bring up a virtual machine inside that container.
And then the media is streamed directly
from the application server, okay?
Now for the interesting part.
This much, all of you can picture:
it's a traditional web-frontend database system.
Many of them have been built;
to some extent, everything today is that.
The interesting part is the application server,
because that involved a lot of hacking,
especially at a low level.
What we want to achieve is one property
that is called application transparency.
Application transparency is basically the level
at which you can conceal the fact
that your device is remote.
So think of it, if I can bring up the desktop
that has no idea that everything is being streamed,
that all the streaming infrastructure is underneath
that desktop completely hidden
and everything is perceived as local devices,
then every single application
without even touching it just works.
You don't have to recompile it,
you don't have to reconfigure it,
you don't even have to relink it.
In many cloud systems, if you want to have remote access
to it, you have to almost always rewrite the application
if it was originally written for a local machine.
Here, everything that works in a local machine
also works in here.
So this behaves like a true PC with true connectors
that just so happen to be network flows.
Okay, so underneath in the kernel,
lots of device drivers had to be modified
such that their media, their content, can be delivered
out of the container into the host address space,
where we actually run something called the mediator,
which is our piece of software
that is responsible for streaming.
So if it's a monitor, then it's a video streamer.
If it's a keyboard,
then it's basically interpreter of the keyboard events.
Now, another thing we had to do, and this
is getting really low-level
into the internals of the Linux kernel.
So when I plug in a monitor into this computer,
it senses it and the desktop gets some events
that are telling it that a new monitor arrived
and the desktop does the right thing.
It activates, it creates the frame buffer on it
and it starts rendering information on it.
That magic in Linux is called udev.
We had to rework it to be aware
of the container namespaces,
so that when a device arrives,
and that device is actually a network flow,
we can intervene on the arrival
and make it look like the hot plug
of a real device in the LXC container.
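For illustration, here is a conceptual sketch of that hot-plug step, using the stock lxc-device helper rather than the reworked udev described in the talk; the container and device names are made up.

```python
# Conceptual sketch: expose a host device node (here backed by a network
# flow) inside a running LXC container, so the guest sees a hot plug.
import subprocess

def hotplug(container: str, host_dev: str) -> None:
    # lxc-device creates the node in the container and adjusts the
    # cgroup device permissions; the modified udev in the talk goes
    # further and makes the arrival look native to the guest.
    subprocess.run(["lxc-device", "-n", container, "add", host_dev],
                   check=True)

# e.g. an input device that the mediator created for an incoming flow:
hotplug("tenant-desktop", "/dev/input/event5")
```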
So here is how the display works.
In the GPU driver, we made a modification
that lets us fake out the existence of multiple GPU nodes.
And although it's the same GPU,
we make each node exclusive to a container.
We give it to the container,
and the container thinks it's a single-GPU machine,
but across containers the GPU can be shared.
If I have a pool of multiple GPUs,
then I can allocate them anywhere I want.
And then we created a whole infrastructure
(that was, by the way,
part of my earlier talk today at this conference)
that very efficiently pulls the frame buffer,
delivers it to user space,
and makes it look like a video device
to the host user space.
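As a toy stand-in for that consumer, here is what reading one of the faked-out video devices could look like from host user space; the real mediator is optimized native code, and the device path is an example.

```python
# Grab one frame of a container's desktop from its faked-out V4L2 device.
import cv2

cap = cv2.VideoCapture("/dev/video0", cv2.CAP_V4L2)  # example device
if not cap.isOpened():
    raise SystemExit("no such video device")

ok, frame = cap.read()  # one frame-buffer snapshot
if ok:
    print("got frame:", frame.shape)  # e.g. (1080, 1920, 3)
cap.release()
```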
Yes.
How many video devices can you have in user space?
Yeah, excellent question.
I'm gonna come to that; let me just finish the whole picture.
Then in user space, there's the software encoder.
Now, because each frame buffer is visible
as /dev/video0, /dev/video1, and so on,
the first limitation is that this takes up host resources.
Depending on what the capabilities of my machine are,
this is going to eat up some of them.
We did a very good job of making sure
that this is almost zero latency with almost no copying.
So moving the frame buffer takes almost no resources
other than bus bandwidth,
and upstream bus bandwidth is almost free for a GPU,
because normally everything goes downstream to the GPU.
So the second part of the limitation comes
from the bus bandwidth.
And the third limitation comes from the Linux kernel itself:
it cannot know more than 32 connectors,
simply because it uses a 32-bit mask
to annotate connectors.
So the limit per GPU is 32.
But if you are running the software video encoder,
then most likely that is going to be your limiting resource.
We typically ran about 16 to 20,
and that's where we saw the system get saturated.
Is that correct?
Oh, you want me to list them again?
So one limitation is the upstream bus bandwidth,
because you're pulling the frame buffer.
There you can do the bandwidth calculation:
my frame buffer is this big, at 30 frames per second;
for my bus bandwidth, look up the PCI Express 2.0 or 3.0
specification, account for overhead,
and you're going to get the bandwidth limitation.
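Here is that back-of-the-envelope calculation with assumed numbers: a 1080p, 32-bit frame buffer pulled at 30 frames per second, against roughly the usable per-direction rate of a PCIe 2.0 x16 link.

```python
# Rough upstream-bandwidth budget for pulling frame buffers.
frame_bytes = 1920 * 1080 * 4        # one 32-bit 1080p frame: ~8.3 MB
per_stream = frame_bytes * 30        # ~249 MB/s per desktop at 30 fps
pcie2_x16 = 8 * 10**9                # ~8 GB/s per direction, pre-overhead

print(per_stream / 1e6, "MB/s per stream")
print(pcie2_x16 // per_stream, "streams before the upstream bus saturates")
# About 32 streams in theory; in practice the software encoder runs out
# of CPU first, matching the 16 to 20 containers mentioned above.
```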
The second limitation depends on how many cores
your machine has, what processing power,
and what the memory limitations are;
that's how many of these encoders you can instantiate.
And the third limitation is the number of these virtual,
faked-out connectors that you can instantiate per GPU.
There's a hard limit in the Linux kernel
of 32 per GPU; to change that, we would have to rework
a lot in the guts of Linux.
So the problem, yeah.
Aside from the limit of 32 per GPU,
if I have one GPU unit on this machine
and one unit on that machine,
can I program them in the same way
that I would if they were on one machine?
Oh, that's a totally different problem.
That's distributed programming;
no, here you're hosting one application on one machine.
I mean, if you can solve that,
Amazon is going to want to hire you right there
and pay you a lot of money.
So, yes.
How many cores per machine?
Do you have one GPU per core,
or what is the ratio?
Okay, so I'll give you a typical machine that we use.
In a 1U box, we can populate two GPU cards,
where each card is a double-slot card
with two GPU instances in it,
like one of the higher-end gaming GPUs.
And the machine has two physical processors.
Each processor has six cores,
and each core is hyperthreaded 2x,
so in total we get 24 hardware threads, a NUMA architecture,
and the two GPU cards.
On that, we've been easily running
about 10 or 15 containers,
and the machine chugged along fine.
So that's the level of sharing.
Now, with this kind of system,
we know very well how to share the display resources
of the GPU and how to stream them out.
And we can also fake out many other devices.
So here is an example of how we do it with a keyboard.
For the GPU, we had to do lots of hacking down
at the low level in the kernel.
For the keyboard, it was actually easy,
because there is already a kernel module called uinput,
which allows you to inject events
from user space on one end,
and those events emerge as an event device on the other end,
like a kind of specialized pipe.
For that, we just had to write the user-space application
that packetizes these events.
And then we had to allocate this device
in such a way that the events are delivered to the container
that belongs to the tenant, while being injected
under our control from the privileged side, the host side.
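Here is the same uinput mechanism driven from Python through the python-evdev bindings; the real mediator is native code, so this only illustrates the kernel interface, and the device name is made up. It needs access to /dev/uinput.

```python
# Inject keyboard events through uinput; they emerge as /dev/input/eventN,
# which can then be handed to the tenant's container.
from evdev import UInput, ecodes as e

ui = UInput(name="net-keyboard")  # create a virtual input device

def press(key):
    # One key press/release pair, as if depacketized from the network.
    ui.write(e.EV_KEY, key, 1)    # key down
    ui.write(e.EV_KEY, key, 0)    # key up
    ui.syn()                      # flush the event batch to the kernel

press(e.KEY_PAGEDOWN)             # e.g. the phone-as-clicker trick
ui.close()
```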
We could go on and on for various devices,
but in the interest of time, I'm gonna skip that.
We do have audio implemented.
We have GPU support implemented.
We have keyboard and mouse, the various event devices.
But the general architecture is that you have
this thing called the mediator in user space
and, underneath in the kernel, basically a generalized pipe.
Those who are into Unix know what the Unix pipe is;
this is a generalized version of the pipe.
Yes.
I have a question about the encoder.
When you send it out through the software encoder,
what can you set it to?
It's not very high.
This is 30 frames per second,
but we can easily encode 60 frames per second.
So what's the encoding rate?
Oh, you mean what is the bandwidth, the typical bandwidth?
For 720p, we were down at about five megabits per second.
For 1080p, I think it was wiggling
between 10 and 20.
It highly depends on the content.
If you have a still desktop,
it really goes down to almost nothing.
So basically, you're assuming that
the display device is actually able to decode this?
Yes.
And on the Raspberry Pi, that's free.
The Raspberry Pi has this beautiful Broadcom chip,
for which they give you the OpenMAX library,
and you have a full-blown Linux kernel.
So it's a, not completely open device,
because the actual implementation
of the OpenMAX library that Broadcom gives you is closed,
but you write an application on it: you get the packet,
you extract the payload,
you make an OpenMAX call, and boom, it's on the screen.
I'm trivializing it,
but it was really easy to implement.
It didn't take much.
So let's talk about streaming;
I didn't plan to talk about that,
but here is how we stream it.
First, streaming has to be zero-latency,
because this is a highly interactive system.
So no B-frames; that's off the table. That's first.
Second, everything is over UDP,
so there are no retransmissions.
What you do when you have an error is,
to prevent error propagation,
you start referencing your future frames
to the last known good received frame, okay?
And then we can play with the I-frame to P-frame ratio
to manipulate the bandwidth,
and on top of it, we also throw in FEC.
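As a rough stand-in for that encoder configuration, here is an ffmpeg invocation mirroring the constraints just described: no B-frames, UDP transport with no retransmissions, and a tunable I-frame interval. The actual streamer is custom software; the device, address, and bitrate here are examples, and FEC would be layered on top separately.

```python
# Launch a zero-latency H.264 encode of a faked-out frame buffer over UDP.
import subprocess

subprocess.run([
    "ffmpeg",
    "-f", "v4l2", "-framerate", "30", "-i", "/dev/video0",
    "-c:v", "libx264",
    "-tune", "zerolatency",   # no lookahead or buffering delay
    "-bf", "0",               # no B-frames: never reference future frames
    "-g", "30",               # I-frame every 30 frames: the I/P ratio knob
    "-b:v", "5M",             # ~5 Mbit/s, the 720p figure quoted earlier
    "-f", "mpegts", "udp://10.0.0.2:5004",
], check=True)
```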
With that, just a couple of months ago,
there was a demo here in Sunnyvale at our location,
which is originally a Nokia location
that has now become part of Bell Labs.
They brought in some, again, very important people,
both from outside and from senior leadership,
and we actually streamed from Murray Hill,
where we had a pool of servers, across our corporate network
into Sunnyvale, and it was nice and smooth.
I go home and over VPN,
which I cannot really brag about,
I still get quite a nice video.
Even in my lab, I even have an LTE base station
and I stream it over the LTE base station.
I cannot do it to many clients, but it does work.
And I can feel the extra latency,
and it's really sensitive to how you configure the scheduler,
but it works.
So we did stream it across lots of various media,
and it actually works fine.
The real big problem is the latency,
because this is meant for highly interactive applications.
So if you're gonna play a game on that
and you have a 200 millisecond latency, it's game over.
All right, so let's move on.
So here are some pictures.
This is a desktop.
Over here is the little Raspberry Pi,
in that mess of wires typical of a lab environment.
There's this little screen, which is streaming 720p,
and the same desktop is being streamed to a software client.
We have a whole variety of clients that we wrote.
And here is another desktop streamed to this old netbook.
One demo that we have: we put these netbooks together,
we make the desktops visually identical,
and then we run an application that I wrote,
just for fun, that is calculating digits of Pi
and actually animating each digit,
so that just for the fun of it
we stress the GPU a little bit.
And you can actually watch how the desktops keep up,
because everything is running in the cloud
and everything arriving here is just video.
All this machine needs to do is decode video;
it doesn't do any computation.
And as video streaming methods advance,
that's gonna be more and more efficient.
So, theoretically, this never expires.
This laptop is gonna be too old;
I'm gonna give it to my kids and buy a new one in two years.
But this is a 2007 machine, and we can still run it
and make it look like it's a modern machine.
So one thing that this system brings you,
when you separate IO devices from the computing
in that way, is that it prolongs the life of the devices
that you actually have to buy and own.
And here is that WebRTC client.
This was actually just for the fun of it:
on the same machine where I was streaming this,
I was putting together this presentation.
So that's the PowerPoint of my presentation
that I'm putting together,
and that's the desktop being streamed over WebRTC, yes.
You slipped something in there, I just wanna make sure
I understand it.
You said this thing is meant primarily for cases
where latency is important, because you wanna
use it to play video games,
but you can use this thing for lots of stuff.
Yes, I can use it, yes, so.
What did upper management get excited about?
Yeah.
The question is, what did upper management get excited about
when they saw this?
Do they wanna use this thing primarily
as a big video player, or what was it?
Yeah, they got excited about streaming PowerPoints.
Okay.
That's typically what you can expect from a manager, so.
I'm thinking about the editing you were doing there;
is that running on the server?
No, that's actually running on my laptop,
and what's running on the server is this desktop.
Yeah.
But in fact, one thing that I could do
is run it on the server,
and then on the server I could run the client
and register the client, and then you get this big
mirror-in-the-mirror effect.
Yeah.
The thing I mentioned before,
I wanna make sure it can actually be done.
The thing that I got excited about
when I saw this is: if you wanna do something like
running computational servers,
deep learning stuff or whatever,
having the arbitrary ability to cross-connect GPUs
with the central process that coordinates
all those things.
Okay, that's a completely different problem.
I'm glad that the audience has asked so many questions,
but I forgot to say one important thing.
What we learned from this system
is how to deal with these virtual display resources.
So this is all about display.
For any kind of IO device,
there is a whole other dimension,
which is that when these desktops are throwing
rendering commands at the GPU,
you have to have something
that ensures some quality of service
and makes sure that the GPU is shareable.
And GPU sharing today is a very immature problem,
and there are not too many solutions.
The focus here was on using the GPU for rendering;
this particular application uses the GPU for rendering.
There's no reason why you cannot use it for computation,
but in that case, this whole pipeline here doesn't matter.
You don't need it anymore.
Right. Okay.
So your point is, for this application
that I'm interested in,
you don't really need a lot of this stuff you have here,
because you're trying to make the displays work.
Yes, I'm trying to make displays scalable.
And here is an example,
another demo, from when we had a visit two years ago
from a bunch of high school kids.
So how do you make this interesting for high school kids?
We were walking through the corridors,
and they saw a desktop following us:
as we appeared, boom, it showed up on the screen,
because I had the phone in my pocket
and the beacons are across the hallway.
So they got excited about that.
But then I gathered the students in a room
and asked: what is the most important thing
for the phone, the most important application?
They were calling out: calling your mom, whatever.
I said, well, selfies.
And then we started taking selfies together.
And the moment we took selfies,
they were appearing on the screen.
And I said, think of it.
Maybe Best Buy could monetize this a little bit.
You go to Best Buy and you have all these TVs around you;
at 10 cents per selfie, show the selfie.
I bet teenagers would go crazy about that.
What about that projector?
I mean, could you somehow drive that projector the same way?
Absolutely.
That's actually the application that the management around us
got most excited about.
We named it the zero-touch conference room,
because we don't touch the cable.
You walk into the room with your phone in your pocket.
You sit down and you navigate through a Dropbox
or a file system or whatever,
and you just drag the file
into the container that is running on your server,
which is pre-configured with scripts
to recognize an incoming file and start up
the proper presentation viewer.
And then you bind the projector as a video stream.
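A toy version of such a drop-folder script could look like this: poll a directory inside the container, and when a file lands, start a matching viewer. The paths and viewer commands are assumptions, not the actual scripts; the projector is already bound as a video stream by the step above.

```python
# Watch a drop folder and launch a viewer for each new presentation file.
import subprocess, time
from pathlib import Path

INBOX = Path("/srv/session/inbox")              # assumed drop folder
VIEWERS = {".pdf": ["evince", "--fullscreen"],  # assumed viewer commands
           ".pptx": ["libreoffice", "--show"]}

seen = set(INBOX.iterdir())
while True:
    for f in sorted(set(INBOX.iterdir()) - seen):
        cmd = VIEWERS.get(f.suffix.lower())
        if cmd:
            # The viewer renders on the container's desktop, which is
            # streamed to whatever projector is currently bound.
            subprocess.Popen(cmd + [str(f)])
        seen.add(f)
    time.sleep(1)
```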
And then it can be switched between different people
who are sitting in front of it.
Yes, and not only that, but also remote people.
Now, what does a remote person joining translate to?
It translates to adding another projector
to the same session.
So on top of this system, we also
implemented an abstraction called a session,
which completely automates that process.
So we have that.
And that actually has been demonstrated publicly.
The reason I'm not so excited about that
is that, as a technical person,
I found taking every millisecond out
of my streaming pipeline more exciting
than figuring out why a particular script is not
properly parsing the URL that represents my file.
So yes, so you could actually clean up the conference room
completely.
So I think you could have all your presentations
somewhere up in the cloud.
You walk into the conference room, punch, punch, punch,
and you start doing your presentation.
And the phone becomes just a keyboard with page-up and
page-down buttons, right?
And in the same way, different people in a meeting
could share their desktops so they can show them.
That's absolutely there.
Yes.
So that's about it.
I have a couple of backup slides,
because some people ask me:
why can't I do this with Apple TV or Chromecast?
I have all the answers for this, but none of you
challenged me, so I'm going to skip that part.
Also, we're running out of time.
So, the two important properties of the system.
First is application transparency.
What gave us the power to so easily port any application,
games, PowerPoint, you name it, is that the system is
built such that you don't have to adapt anything.
You just put the application in the container, and
everything else just works.
The second property, which really departs from the
traditional thin-client model, is that you
don't have a thin client like you would have today,
with bundled keyboard, mouse, and video.
Everything is a separately manageable entity.
So it's really a superset of what you can do today with
traditional thin clients.
So it's really a super set of what you can do today with
traditional thin clients.
And the fundamental operation on which this is based is all
the device binding.
So we bind the device to the computing resource.
And that is context-sensitive.
So depending on the type of the device, different media flows
get established.
And then the difficulty in building such an infrastructure
is to manage all these flows.
So that's it from me.
So thank you for asking so many questions.
That actually tells me that the talk was worth something.
I have another question.
I'm going to come visit you at your location.
I've got my laptop.
And I want to display my laptop on your system.
How do I tell the system to add me and whatever
resource I'm going to provide?
So there are a couple of ways.
Here is one way that works now.
You walk into the room.
And the room is actually going to recognize that you are in
there.
We actually have some optical, some video analytics,
attached, which basically recognize when people come
into the room, step up to the table, and stay there.
welcome to the room.
And there are multiple ways you can show, you can deliver the
presentation to the room.
One way is you can email it to the room.
So you get an ephemeral email address.
And you send your presentation to that address.
And the presentation shows on the screen.
And the system emails you back and says, click on this
URL to get controls.
So you click on the URL and now you get control.
So you don't beam it from your laptop.
But what you do, you provide your content and the
infrastructure is streamed from the cloud to the device.
That question leads me to the exact difference.
Why is this not Chromecast?
Or why is this not Apple TV?
The way Apple TV works is the way you just described.
There is a little box installed in the room; you come in
with your Mac, you bind to it, and then it's streamed
from your laptop.
If your laptop runs out of battery, the whole
presentation is gone.
Or if somebody calls you on the phone and you walk out of the
room and take your device with you, it's gone.
And it has to be on the same LAN.
You have to connect to our network.
This way, you are actually delivering the content to
the cloud, and the cloud does this for you.
The control is done through the web interface, which is
delivered to you through that URL; over it, you can
actually emulate the keystrokes you care about.
And there are also other ways we can do it.
You can also upload the file, or, yeah.
OK.
Any other questions?
Yes.
It is, yes, so if you want to, OK.
So watching YouTube basically means that the YouTube client
is in the container in the cloud.
It is decoding the video and putting it on the display,
and now our encoder is re-encoding it.
That is a perfectly valid concern: why are we doing this
twice, when YouTube has done such a good job of managing
the delivery of that stored video?
Everything depends on where the resources are placed and how
accessible your resources are.
And the reason why you have to jump through hoops to make
YouTube work smooth everywhere in the world is because
everything sits in a limited number of data centers behind
many hops of the internet.
So this system assumes that you have actually strategically
placed resources, and that actually leads us to a whole
field which is becoming very hot.
It's called mobile edge computing, where the
cloud is pushed very close to the user,
basically at the last access hop.
So this is the system that sort of presumes that
distributed architecture.
So at that point, what you are really managing is a very
limited number of hops.
And we had to get that working with our streamer anyway,
because if we don't, then none of the applications work.
Yes.
So it's not just the availability of distributed computing;
YouTube encoded the video once, somewhere, and now people
are watching it without it being re-encoded, because
YouTube's pipeline already did that in the back.
Yes, well, but YouTube is optimized for one
application, which is stored video.
This system is meant for real-time content, where it just
so happens that you are now watching YouTube on it.
So if you build a system for real-time generated content,
you have to solve the real-time encoding problem.
But I mean, all the studies show that today
50, 60, 70, 80 percent of the traffic is this
stored video.
So it's not a niche problem.
Yes, I understand that, but that's the present state of
the world.
The world is currently dominated by cat videos.
If one day we actually move away from cat videos, and
media becomes something that we really interact with in
real time, then a system like this has much more to offer.
A sports video, a live sport event, is basically real-time
content, but not interactive real-time.
If you have a couple of seconds of delay, you're really
not going to notice, unless you have two videos side by side.
Yes, and this is the even more difficult problem, because
here you have that interaction loop.
With a sports video, when you are just consuming the
stream, you don't have the interaction loop.
That's the difference.
So that's amazing.
So this is, in terms of when we start classifying the
problems, I think this is the super set of all.
If you solve this one, and it's difficult to solve, and then
for many applications, I fully agree.
It's an overkill for many applications.
But ask yourself, is the applications that we are
currently using on the internet, is the infrastructure
a consequence of applications, or is the application
consequence of the available infrastructure?
So if we have an infrastructure that can do more, are new
applications going to come out?
And I believe it ought to.
Any other questions?
Let's thank the speaker.
Thank you.
