Alright, welcome everybody. This is my talk about GitHub's CSS performance. What a beautiful
conference, right? We're in Hawaii. Everybody love that. First of all, thank you for coming
to my talk. I know I'm always sort of astounded when people want to hear what I have to say,
so bear with me. So this talk, quick overview, this talk is about a real situation we had
at GitHub. Earlier this year we started to see some unique performance problems on the
website. They kind of come back to hit us and I'll sort of describe what we were seeing
and some of the solutions we came up with and some of the tools we used to measure our
progress. But first, really quick about me. I'm Jon Rohan, that's me, on the set of
White Moon Candidate. I don't know, staring at something and that's the ocean behind me.
I'm a design engineer at GitHub. I say that because we're kind of loose with titles. Predominantly
I've been a front-end engineer and I love design and have designed also. So I'm really
kind of a developer who designs. I got a BS in computer science. I've been struggling
with CSS for eight years. Around that time, IE6 was the new browser and IE5 was kind of
on its way out. Firefox was like shaking stuff up. I'm also a GitHubber. I've been working
at GitHub improving github.com for actually exactly one year. Today's my anniversary. Everyone gets
an Octocat statue. We won a dodgeball tournament for charity and I was holding the statue again.
So GitHub's performance problems. At GitHub, we started witnessing some really, really
slow pages. And really what I'm talking about is some scary slow pages. But what I'm talking
about is diff pages. The diff pages are file pages where we show line by line what's changed
in this blob or what has changed in this pull request. They contain a line for every piece of
content that was added and every piece that was deleted. And you can imagine some of them get really
huge. Here's an example one I'm going to talk about today. It wasn't even one of our worst
file pages but it was a good middle ground. It's 63 files changed with around 6,000 additions.
It's been called a medium size diff. That's around approximately 9,000 lines across all
files. And 80% of the page load time was just recalculating some of the styles after the
initial load. And that was sort of not acceptable. Here's the profiler in WebKit and the offending
style which is pretty outrageous. 28.16 seconds to recalculate. I usually can't wait that
long to re-heat food. To look at file differences is just crazy.
So let me back up and talk really quick about what causes style recalculations. Usually your
site will recalculate a style after load if you do something like manipulating the DOM
which is adding and removing elements using JavaScript. Hiding stuff with display: none
and visibility: hidden will cause a style recalculation. That's pretty common because a lot
of people want menus and stuff like that. CSS animations will recalculate your styles.
And user actions like the browser resizing if you have responsive web design or people
scrolling or changing the font size will all do these things. And this can be bad,
which it was in our case, and may cause browser death. And you'll get one of these, which did
happen on some of our worst diff pages. We started by examining our CSS code, right?
We found some rules that would lead us to faster load times and I just kind of wanted
to briefly go over them with you that we kind of like use. They're not like set in stone
but they're, you know, we need to try to do this, right? If we can. The first one is
unnecessary tag identifiers. Don't over-qualify IDs and classes. So like, here's an example.
If you have ul#navigation and ul.menu, the ID is really like specific. So, you know, there's
only going to be one of them on the page. On .menu you can probably do without the tag too.
So drop them. You'll get some like small performance gains across your pages with that.
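
As a minimal sketch of that rule (not something shown in the talk), the function below strips a tag name that is glued directly onto an ID or class. The function name and the regular expression are illustrative assumptions, and it only handles simple selectors: attribute selectors and quoted strings are not understood.

```python
import re

# A tag name sitting directly in front of "#" or ".", e.g. the "ul" in "ul#navigation".
# The lookbehind stops us from cutting into the middle of a longer identifier.
_OVERQUALIFIED = re.compile(r"(?<![\w#.-])[a-zA-Z][a-zA-Z0-9]*(?=[#.])")


def drop_tag_qualifiers(selector: str) -> str:
    """Remove tag names that directly qualify an ID or a class (hypothetical helper)."""
    return _OVERQUALIFIED.sub("", selector)


print(drop_tag_qualifiers("ul#navigation"))
print(drop_tag_qualifiers("ul.menu"))
```

Dropping the tag also lowers the selector's specificity and lets it match other element types that carry the same class, so a rewrite like this is worth checking against the rest of the stylesheet before it ships.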
No ancestors. I wanted it so badly to call this no parents because you say that at work.
No ancestors is technically better. What I'm talking about here is most modern browsers,
the way they parse the style and they like find the corresponding style to the elements
is they start on the right and they look on the page for all the TDs. So they'll find
this TD and they'll match all the TDs on your page and it could be, you know, tens of thousands
and then they'll look at all of those that have an ancestor of TR and so forth all the way
up. So this is just an example of like one of the worst things you could do if you have
like a lot of DOM because you have tag selectors which are like a lot of stuff and you have
like multiple tags like this. Generic selectors can be bad too. Like these are like the most
common use people do with these is resets which is kind of tough because you want to
like, you know, I totally advocate using resets and having like this is kind of like, oh man,
you're matching everything, right? So just be more specific with like within your resets.
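
The right-to-left matching described a moment ago can be sketched roughly as follows. This is a toy model for descendant selectors made only of tag names, not how any browser is actually implemented; `Node` and `matches` are names invented for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    tag: str
    parent: Optional["Node"] = None


def matches(node: Node, selector: List[str]) -> bool:
    """Right-to-left match of a descendant selector such as ["table", "tr", "td"]."""
    *ancestors, key = selector
    if node.tag != key:  # the rightmost part is tested first, on every candidate
        return False
    current = node.parent
    for wanted in reversed(ancestors):
        # Walk up the tree until the next required ancestor shows up.
        while current is not None and current.tag != wanted:
            current = current.parent
        if current is None:
            return False
        current = current.parent
    return True


table = Node("table")
row = Node("tr", table)
cell = Node("td", row)
print(matches(cell, ["table", "tr", "td"]))
```

Every element whose tag equals the rightmost part pays for the walk up its ancestors, which is why a generic tag at the right edge gets expensive on a page with tens of thousands of cells.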
Like I want these tag elements to be this certain style, right? To all zero it out. But
remember don't use ancestors if you're using the tag. No unqualified selectors. This is
sort of the same idea as the last one but I wanted to show you in a different way. We
did something like this when we launched some new icons and it ended up like some of the
pages, especially the Diff page where we have like a lot of lines of the same icon. It took
a hit because the fuzzy matching in the class name and the generic like nature of it. No
chaining too. We tried this too and we had to like revert, like redo it. What I mean
is so if you have a class name or if you have three classes, right, small private icon
and they all are chained together to mean something, what ends up happening is you will
overload a key in the class hash and instead what you should do is be more specific and
like actually write out like what that whole thing is. This is pretty interesting and I
wasn't the one that discovered it but I want to show you. When you do chain because I can't
be like never chained because that's a pretty awesome thing to do. When you do chain, we
have some root states that apply to the elements. I don't know if this is a WebKit bug but here's
a little C++ knowledge of how the WebKit indexes stuff. While we were diagnosing the
speed problems, we found interesting results with what class or what ID or what tag WebKit
will use to index the style. In this case, say you have two classes, WebKit will
match it and put it in the class rules under the hash key .foo. If you swap them, it
will use .bar. This is pretty useful to know because if your .foo is like a class
that matches 10,000 elements, then you may end up having like an overload in that index
and the search on that index will take longer because you have that index in there but like
just simply flipping them will be indexed in a new hash key and then it will be quicker
to look up. This is kind of interesting. The rule is indexed here actually under the
tag rules even though it's got the class theme and this can be bad because all your
DOM is indexed there. You swap it and it ends up being indexed under that error. That's
why I'm wondering if that's a bug or we tried submitting or we were thinking about submitting
a patch to WebKit but it should know which one is the most efficient. Here's another
one: .bar#foo. It's actually indexing that in the class rules under .bar even
though #foo is the more specific style, or it's the quickest lookup that you can have
is with an ID rule. If you swap them, it will actually index them under the ID rules. Put
the id first if you chain it with the class but this is really overqualified and you don't
even need it. We started tracking how many selectors we actually have in our whole file
across the site in June. It's kind of been a roller coaster but I just wanted to show
you how. Over the months, it's hard because at GitHub we ship features all the time and
a lot of our features are staff only for a long, long time. Some of them have been there
for nine months where we're just trying to get the interaction perfect right before we
ship it. At the same time, it's got to live alongside new stuff and what we end up doing
is having more CSS for a sliding period of time and killing all the really bad CSS and
moving on in the future which makes things simpler. That stuff is relative. If you don't
have a lot of DOM on your page, it doesn't make a difference. This is what we were seeing
because we had a lot of stuff. Most websites are fine. Everything I just mentioned was
CSS improvements but that's not really the end of the story. It's the marriage of these
fixes with the HTML that really made the difference for us. Let me talk about some of the HTML
overload. GitHub has some very unique and like pages that are unlike most consumer websites
out there. We live off of generated user content but not just small snippets of tweets or here's
a photo that I can display really easily. We have coders changing files that can be hundreds
of lines long. We don't want to cut up what they are looking at when they are looking
at the Diff. We don't want to get in the way of the workflow. How much HTML do we actually
have? Let's look at that medium Diff that I talked about earlier just as a busy example.
First I'm going to show you a typical Diff line in HTML that we have on our site. The
whole file table is a table layout, the file changes. We have two columns for line numbers
for the line that it was before and the line that it is now and then the changed line for
another column. This includes this little add bubble element which we use to let the user
click on the line and add comments inline. This average diff has around 9,000 lines. So there
are about 9,000 of those table rows. In that particular diff I ran this in the console
just to see what was all inside just the content of that table. A count of 50,000 things
which is like wow. Let's get rid of the HTML. Let's lazy load it or something. We tried
that. We wanted to see maybe the server was taking long or something so we cut out all
the CSS and how long does that take to load. That was 15 milliseconds. Let's go back. What
can we do now? Great. We got this HTML code and we apply our CSSes and we get really
slow pages. Let's reduce the amount of matched HTML on the page. Here's where you start.
The profiler, you can profile a page in any state like after it loads or you can run it
and do some user actions and see what kind of things happen. What it's actually telling
us is here are all our classes and CSS selectors and here's how many elements on the right
that they matched and how long that took out of the total style like calculation.
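
To get a feel for the kind of numbers such a profile reports, here is a small sketch that counts how many elements each tag and each class name would match. The markup, the class names, and `MatchCounter` are made up for the illustration; this is not GitHub's real diff HTML and it measures counts only, not time.

```python
from collections import Counter
from html.parser import HTMLParser


class MatchCounter(HTMLParser):
    """Counts how many elements each tag and each class name would match."""

    def __init__(self) -> None:
        super().__init__()
        self.total = 0
        self.by_tag: Counter = Counter()
        self.by_class: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        self.total += 1
        self.by_tag[tag] += 1
        for name, value in attrs:
            if name == "class" and value:
                self.by_class.update(value.split())


def count_matches(html: str) -> MatchCounter:
    counter = MatchCounter()
    counter.feed(html)
    counter.close()
    return counter


# Hypothetical diff row: two line-number cells plus the changed line with its bubble.
ROW = (
    '<tr><td class="line-number">1</td><td class="line-number">1</td>'
    '<td class="diff-line"><b class="add-bubble"></b>+ code</td></tr>'
)
stats = count_matches("<table>" + ROW * 9000 + "</table>")
print(stats.total, stats.by_tag.most_common(3), stats.by_class.most_common(3))
```

Even this stripped-down row shape multiplies out to tens of thousands of elements at 9,000 rows, so any selector keyed on a common tag or class has a lot of candidates to test.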
So now I want to show you what we did and how each change improved the page. First
of all, think simple. What I want you to do is come up with creative ways to simplify
your HTML. How can you do the same stuff with less markup? First thing we did was remove
any unnecessary HTML. We like to say, just kill it with fire. When a forest burns down,
the fire is over and there's room for new growth and new life. Here again is our diff
line. Looking closely, we're like, why do we have this div here? I think it was used
for some sort of block layout or something. We rewrote the CSS slightly and removed that.
And that is definitely extra HTML we didn't need and it ended up dropping 6,000 unnecessary
divs that we didn't need. So we were like, wow, okay. Feeling good. Feeling lighter already.
So here's another thing we started looking at. Remove A for line numbers. This is kind
of a controversial, like, well, maybe now it's not accessible, but we'd rather be fast
than accessible. So here on each line number code, we want people to see what was changed
and we'll say, okay, let me click through and then see the whole file at this specific
line. So what we did was we dropped the A's and we wrote a little event handlers to just
actually move the click target to the TDs. So that's what we were left with. And that
was a huge improvement. It actually helped speed by like 37%. B's. And a really strange
change. We found that we could squeeze just a tiny bit more out of the page. So the add
bubble was matching a lot of stuff. It was just a div tag. You know, it's pretty easy
just to throw a div tag in there and be like, okay, here's a million. To compound the problem,
every time you like hovered down the page, the bubble would actually hide and show on
the left side. So, you know, because we don't want like thousands of bubbles just to be
able to click on. So what we did was we actually changed the tag to B. And what works about
that was we never used B's anywhere. Like, it's sort of deprecated and, you know, it's
not really declared anywhere else in our CSS style. So putting it there meant that it wasn't
matching anything in the CSS. And that was a smaller improvement, but it was 3.5%. So
all these tricks that I talked about, did they help from the CSS rules and the HTML
files? Well, certainly. Here's our page load time over the last 12 months. And this was
when we first started seeing the problems. From February to April, we're sort of like
looking at the problems. We're like, okay, how can we fix this? And then when we sort
of like started really figuring stuff out and then just dropped back down. So we're
developers by nature. We want to make and use things to make our lives easier. So here's,
I just want to talk about some shortcuts we use to, like, help us diagnose all this and
make coding easier. Here's something I worked on and wrote. It's not unlike Bootstrap, but
it's more of a gem package that we have. It's just the absolute bare bone stuff we have
for GitHub so we can easily transfer it around to different GitHub properties, like Gist
or anything else that we launch. We want to have similar button styles and similar input
styles and things like that. But the point for bringing that up was just refactor your
CSS into reusable components. So when you do that, you move bad habits into containers
that when you figure out better ways of doing stuff, you can easily apply them everywhere.
We also use Sass, call it sweet CSS. It's like a shorthand CSS that you can use to do more
complex things, but I also call it a selector CSS because it's pretty good at just creating
really long, like, ancestor selectors like I was describing earlier that's kind of shitty
on performance. So it's both powerful and dangerous. I still like it anyways. We just
come up with other rules to try to keep from that. I wanted to show you guys just an example
of a Sass block. Here's what it looks like in Sass. With all these various states going on
and all these ampersands and things, you end up with super long selectors that have multiple
chains and things can get quite crazy. So we came up with a rule just don't go farther
than three levels when you're writing Sass. If you have to go farther than that, you probably
should just rethink how you're trying to tackle that style problem. So, wow, that was easy,
right? Everything's amazing, though. Well, no. Yes and no. They're better, but we're
still working every day to make them even, even better, right? But there's still like
the very extreme 1% of, like, diffs that are just even worse than the one I showed you
that we're going to have to rethink how we design it. But it's also, it's a large code
base and GitHub has grown a lot in the past, like, two years. It started two years, or
two years ago, it was like 10 people working in coffee shops, and now we're at, like, 130.
It's growing pretty fast, but we still work very fast, even though we're big. I did a
quick, like, git-fu to, like, figure out who actually has touched the CSS. So here's
everyone that has, like, done some kind of change on the CSS. That's 40 people. And here's
actually the people who would be considered CSS developers. So the truth is, like, only
these kind of, we could be like, well, okay, only these people should be doing these changes,
right? And, you know, they know what to do right and don't touch it. But that sucks.
Restraint sucks, right? Restricting developers. We don't tell our employees what they can
and can't develop. And it's not in GitHub's DNA as a company, just the way it was born.
So what we do is we operate more like open source. If somebody who was in that list that,
like, sort of knows what they're doing with CSS feels uncomfortable, they make the change
anyway, and they tag one of us and we come in and say, yeah, dude, you're doing great
or, yeah, that sucks. You should write it with less selectors or flip this one and you'll
get a little bit of a performance. So now I just wanted to talk about some of the tools
we use to, like, figure out, like, what we were using to figure out all these performance
problems and how we diagnosed them and what, basically, good stuff. This is really cool.
One of our JavaScript expert guys wrote this, Josh Peek. Dude is awesome. It's a tool where
you actually send CSS selectors to it and it will analyze them and give you back a score on,
like, how fast this could be and, like, what category it's matched under. Here's the open
source page for it. On there, he just says, think of it like SQL EXPLAIN for CSS selectors.
I told him I was going to tell everybody to use this and he was like, oh, shit. So go
and make it better. He even created a little, like, GitHub page demo where you can just
throw in a little thing and it will tell you a category. I'm not sure what's best for
a city means. He explains it on there. And it's a node package, too, so you can just,
like, put in the output, do that. Here's some of the WebKit DevTools that I like to use
and that we use to sort of look at everything. The profiler was great for finding the greatest
offenders, like I said, as it shows all the matched elements on the page. Here it is again.
The timeline. The timeline is useful for when you're experiencing the problems. You can
actually run it and see exactly what is triggering each, like, recalculating style. And in Chrome
or in Canary, you can actually click through to the source files now, which is pretty awesome.
Audits. I like Audits a lot. So far in my GitHub career, I have actually removed more
code from the code base than I have added. And the way I do that is I actually find a
lot of unused CSS that we still have just kind of, like, lazing around. You know, maybe
Chris on a hacking spree the first week was, like, throwing some CSS in a file and we just
kind of forgot about it. But so in DevMode on GitHub, we don't compress the file. We
leave them all out, right? And what you can do is if you write your code in, like, really
small sections of, like, okay, here's CSS just for this page, then later you can come
back and be on that page and actually know, like, what you can remove. So here's an example
for the dashboard, which is, like, when you're logged in and you go to github.com, this is
the page you see. I ran it really quick in my DevMode and I can see in the dashboard
that the CSS is 44% unused. And it actually lists out the selectors. So some of these
are, like, states that, like, they're just not triggered. But, like, I can quickly search
through and find things that might be really good for just lopping off. So that's pretty
awesome. Graphite. Graphite, I just wanted to talk about it real quick. It's how we collect
and display all that data. Like, I showed you those two graphs of, like, page load times
and CSS selectors. I'm no expert on Graphite, but I'm going to attempt to talk about it
real quick. From the website, Graphite is a highly scalable, real-time graphing system.
As a user, you write an application that collects blah, blah, blah. I think you get the idea.
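
As a rough illustration of what "an application that collects" data might send, the sketch below formats one datapoint the way Graphite's plaintext listener accepts it: metric path, value, and Unix timestamp on a single line. The metric name, host, and helper names are assumptions for the example, not GitHub's actual setup.

```python
import socket
import time
from typing import Optional


def carbon_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one datapoint as "<path> <value> <timestamp>" followed by a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one formatted line to a plaintext listener (2003 is the usual default port)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# "css.selectors.count" is a hypothetical metric name.
print(carbon_line("css.selectors.count", 7000, 1350000000), end="")
```

Calling `send_metric` needs a reachable Graphite instance, so the example only prints the formatted line.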
So graphite has a back end called carbon that they wrote, which is basically just a process
like it's an API that just accepts all the data you throw at it and processes it. And
they also wrote their own specialized DB, they call Whisper, which is supposed to be
really fast for what they do. But the reason I want to bring it up is because GitHub wrote
a front end to it. We have something called, we call it the graph store. And what it does
is it actually saves graphing queries into, like, really easy, like, understandable, like,
line. So we can be like, like, you know, we don't have to look up all these data values
and then just come and, like, check on how the CSS selectors are being, like, browser
errors. And that allows us to put graphs everywhere. We have monitors, like, all over the office
and we just display graphs so we know when, like, user signups drop or, like, pull requests
are failing or something like that. And we live in the chat room a lot. So we actually
programmed our chat bot to pull back any graphs that we want to see, like, right away. So,
like, you know, maybe we hear about some problem, like, okay, Hubot, you know, what's going
on with user signups? What's going on, you know, with this or that, right? And that's
where the real, like, awesome part of saving the query, because we don't have to remember
the, like, give me an integral sum of these elements between these dates and just be like,
give me that real quick. All right, well, I guess I'm pretty boring, but here's the summary
of my talk, right? So if you weren't paying attention, because, I mean, like, this is
like too distracting or something, then you can pay attention. So, simplify your CSS.
Just use best judgment, but, like, you know, the easier you make things on yourself,
the faster your page will be. Try to minimize HTML DOM matches. So just, like,
you know, like I said, with the B example, like, try and look at what is matching most
and then see what you can, like, kind of move off into other areas. Refactor and reuse CSS.
And communicate with your teammates. There's no reason to be upset when they make mistakes
and, you know, they shouldn't be upset at you. So you just be cool. Everybody be cool.
Graph and monitor everything, because it's, like, awesome. I love, like, just watching
that stuff and making it look good. Oh, that's my talk. Hopefully it wasn't, like, five minutes.
Thank you for listening. I've already posted the talk on Speaker Deck, so you can go and
look at it. Good question.
Well, I noticed that, at some point, if a designer to make the decision to reduce the
size of the commit page display, so it's a maximum of, like, 50 characters.
Oh, yeah.
It's smaller. We haven't thought about that. The main motivation behind the commit messages
is just to adhere to the, like, we want to let people know when they're reaching that
Git limit, which, like, drops part of your sentence off into the more part, you know,
which ends up looking weird. Yeah. I don't know. There's a lot of discussion around whether,
like, we should just, like, simplify how we display it or, like, try to make it as, like,
clean as possible, you know. So we're still thinking about it.
Yeah.
So changing the selector, do you mention something about how the values are formed?
Not a lot, but I assume mostly, I'm pretty sure most of the, like, modern browsers do
a lot of the same stuff, you know. It really depends on the engine.
I mostly, I was worried, you know, when I get a lot of questions about this, but I mostly
geared this talk around WebKit because we have close to 70% of our users use WebKit.
And then, like, the rest of it is Firefox and then IE. So, you know, we try to cater,
like, you guys are developers. You always use the best stuff, right? So we try to use
what we see our users use and fix that.
So you're really doing tasks to change.
It wasn't really, like, you know, there wasn't any, like, huge spike between three and four.
Three was just sort of, like, I don't know, the golden number. We're just, like, you know,
okay, we can't let it get crazy. And we kind of figured we can do most stuff with three
indentation levels that we need to, you know. Occasionally, you'll come across, like, like,
oh, I need this, like, really quick, real hover thing with a color change or something.
You know, you might be in four, but we try to keep it going like that.
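
A rule of thumb like this is easy to check mechanically. The following is a naive sketch that counts braces in SCSS source and reports every block opened deeper than the limit. It is an assumption-laden illustration, not a tool mentioned in the talk: it does not understand comments, strings, or `#{...}` interpolation, and it counts `@media` and mixin blocks as levels too.

```python
from typing import List, Tuple

MAX_DEPTH = 3  # the "three levels" rule of thumb from the talk


def too_deep(scss: str, limit: int = MAX_DEPTH) -> List[Tuple[int, int]]:
    """Return (line_number, depth) for every block opened deeper than `limit`."""
    offenders = []
    depth = 0
    for number, line in enumerate(scss.splitlines(), start=1):
        for char in line:
            if char == "{":
                depth += 1
                if depth > limit:
                    offenders.append((number, depth))
            elif char == "}":
                depth = max(depth - 1, 0)
    return offenders


# Hypothetical stylesheet; the class names are invented for the example.
SAMPLE = """\
.file {
  .diff-line {
    .line-number {
      a {
        &:hover { color: blue; }
      }
    }
  }
}
"""
print(too_deep(SAMPLE))
```

Each reported line is a place where the compiled selector will carry four or more parts, which is the long ancestor chain the talk warns about.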
So, before you notice the problem, is, you know, you use CSS to provide a good quality
development process?
I sort of knew about them, but, like, we didn't, like, you know, use them religiously to, like,
make sure stuff that was going on was matching all that stuff. But they came in really useful
once we were like, well, what's going on? Oh, here's some tools that let us, like, see
exactly what's going on there.
Yeah, we do use them a lot more now. It would be cool if we could somehow figure out a way
to, like, graph that, huh? To take that data and shove it into Graphite and, you know,
get it wherever we want it, right?
So, instead of putting it all in the same directory, instead of putting it all in the same directory,
you can just put it in the same directory.
Like, having an ID as an ancestor and then, like, finding children of that.
Oh, yeah.
I mostly, we'll mostly do classes. And the difference between just an ID and a class is
very small, you know? So, the benefit with using the classes is you can use something
multiple times across the page, you know? I mean, in theory, we could have, like, ID
about page and then select those inside there, right?
Right. If you're, if you're, if you have the ancestor chain ID, class, class, class,
it really doesn't make a difference if it's class, class, class, class, or just ID, class,
class, class, class. Okay. Anyone else? Okay.
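
One way to picture that answer, using the indexing behaviour described earlier in the talk: only the rightmost part of the selector decides which hash bucket a rule lands in, so swapping an ancestor from a class to an ID leaves the bucket unchanged. The function below is a toy model of that description, not WebKit's actual code, and it assumes the parts of the selector are separated by spaces.

```python
import re
from typing import Tuple

_SIMPLE = re.compile(r"([#.]?)([\w-]+)")


def bucket_for(selector: str) -> Tuple[str, str]:
    """Return (bucket, key) for a selector, following the behaviour the talk describes.

    Only the rightmost compound selector is considered, and within it the
    first simple selector as written decides the bucket.
    """
    rightmost = selector.split()[-1]
    match = _SIMPLE.match(rightmost)
    if match is None:
        return ("universal", "*")
    prefix, name = match.groups()
    kind = {"#": "id", ".": "class", "": "tag"}[prefix]
    return (kind, name)


print(bucket_for("#page .a .b .c"))
print(bucket_for(".wrap .a .b .c"))
print(bucket_for(".bar#foo"), bucket_for("#foo.bar"))
```

In this model both ancestor chains end up under the same class key, while the order of a chained class and ID changes the bucket, which matches the two observations made in the talk.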
Oh, yeah. You want to see a picture of it, or?
Yeah. I think it's part of each, each build, I think. We have, like, a grep.
And it seems they all sort of put it together at the same time.
Yeah. We're, right, right. We just have some, like, quick scripts that, that after they've
been compiled and sassed, we look at the gzip value.
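
A quick script of that kind could look something like the sketch below, which takes already compiled CSS and reports a rough selector count plus the gzipped size. This is a guess at the shape of such a script, not the one GitHub runs; the counting is approximate, since commas inside functional pseudo-classes would be counted as separators.

```python
import gzip
import re

_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_RULE = re.compile(r"([^{}]+)\{[^{}]*\}")


def css_stats(css: str) -> dict:
    """Rough selector count and gzipped size for a compiled (flat) stylesheet."""
    stripped = _COMMENT.sub("", css)
    selectors = 0
    for prelude in _RULE.findall(stripped):
        prelude = prelude.strip()
        if prelude.startswith("@"):
            continue  # skip at-rules such as @font-face
        selectors += len([s for s in prelude.split(",") if s.strip()])
    raw = css.encode("utf-8")
    return {
        "selectors": selectors,
        "bytes": len(raw),
        "gzip_bytes": len(gzip.compress(raw)),
    }


# Hypothetical compiled output.
CSS = """/* nav */
#navigation, .menu li { margin: 0 }
.diff-line { color: red }
@font-face { font-family: x }
"""
print(css_stats(CSS))
```

Run on every build, numbers like these are the sort of thing that can be pushed to a graphing system and watched over time.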
I think, I think what happened was it was, like, you know, one ops dev was, like, this
is awesome and ran, or started running a server and had a web page that looked really crappy.
And then, you know, we have, all our designers are coders also, so they would use it and
they'd be like, this looks like shit. Let me, let me design it. You know, so it sort of
works in that way where, like, if something wants to go, are we, like, some guys want
something to go out and they don't have the skill set to do the other side of it, then
we'll kind of, like, push it and make it annoying for the other side.
It's to where you're like, okay, like, I want to spend time to fix this. We don't tell anybody,
like, hey, go fix the graph store, you know.
That, as a designer, you probably have a number of developers that you don't fit in that sense.
They're all pretty decent. Some of them are really awful at writing what you know and
stuff, but, you know, they, they, you know, people admit it and they know their weaknesses
and, but we still encourage people to work on stuff, you know.
I have the freedom to go try coding on libgit2 if I want, but I just don't really, like,
have a desire to, so I don't really push myself there. But I could totally just be like, hey,
I want to, like, code C++, right? And they would totally look at my pull request, right,
and help me code better.
All right. Thanks for coming, guys.
Thank you.
