It’s 2015 and drawing text is still hard (WebGL, ThreeJS)

Preface (TL; DR):

  • After reading this blog post you should know about some techniques and performance optimizations for rendering text in WebGL/OpenGL ES. Signed Distance Fields (SDF) + Mesh Merging + Texture Mapping are the answer and are the correct set of compromises in terms of performance and quality *for us*. Our “String Texture” method was the most performant way to render text that we found, as long as an application didn’t have “canvas zooming” needs.
  • Here are links to all the demos in this post.
  • We have open sourced our rendering engine Cartogram. It provides a nice abstraction on top of ThreeJS, and a lot of tools in a world where “mesh merging” is the default. Feel free to play around with it and use it in production. Docs, more optimizations and interface changes are coming soon!

Dear text; ughhhhh… You’d think in 2015 there’d be nothing interesting left to do when it comes to rendering text. It turns out browsers and application platforms like iOS and Android shield you from much of the pain you’d normally experience in the text rendering department. WebGL, in contrast, has no native way to render text, and while ThreeJS (and other libraries) provide means for creating text, they don’t provide solutions for rendering lots of text. With a recent release here at Eventbrite, we created a new rendering system for our venue maps using WebGL and ThreeJS. This rendering system solves the text problem in a couple of ways. So let’s talk about how we developed those solutions.

Failure #1: ThreeJS TextGeometry

ThreeJS provides a text geometry out of the box. This works out pretty well when you have just a few pieces of text on screen. Rendering just the word “hello” is easy and quick, yet it generates almost 6000 vertices and 2000 faces. This sounds impressive, but why so many vertices? The text geometry acts much like a model loader. ThreeJS’ TextGeometry reads a font file and determines where to position all the vertices based on the font and the input string. This technique is really cool and powerful!

The requirements of our Venue Map Application indicate that we need approximately 1000 strings on screen simultaneously in some cases. In our tests, generating 300 strings takes far too long, and in another trial, 1000 strings would simply crash the browser. This technique just doesn’t work for us.

It takes so long to render because of the number of ThreeJS materials we are creating. Every string in this simple system causes 1 new Material to be created. Every Material causes 1 new draw call. Every draw call, on each frame, causes communication between the CPU and the GPU. Communication between the CPU and GPU is typically what causes the biggest performance bottlenecks in WebGL applications. To make this application faster we need to limit what the CPU needs to do, and offload more work to the GPU. This leads to our next strategy.

Failure #2: Merged Mesh TextGeometries

So what can we do? Well… we can merge meshes. Luckily, ThreeJS provides a solution for this.

In our example application, we create a piece of text, merge it using ThreeJS’ built-in `.merge()` function and provide a material index. 100 strings were generated and merged together fairly quickly. We create just 1 shared material, and thus there is only 1 draw call.
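To make the idea of merging concrete, here is a minimal sketch of what a merge does conceptually: copy every vertex into one shared buffer and rebase the indices. It uses plain typed arrays rather than ThreeJS’ own `.merge()`, and the `{ positions, indices }` input shape and the `mergeGeometries` name are assumptions made up for this illustration.

```javascript
// Merge many small geometries into one vertex/index buffer so they can be
// drawn with a single material and a single draw call.
// Each input is { positions: number[], indices: number[] } with 3 numbers
// per vertex (x, y, z). This shape is an assumption for the sketch.
function mergeGeometries(geometries) {
  let vertexCount = 0;
  let indexCount = 0;
  for (const g of geometries) {
    vertexCount += g.positions.length / 3;
    indexCount += g.indices.length;
  }

  const positions = new Float32Array(vertexCount * 3);
  const indices = new Uint32Array(indexCount);

  let vertexOffset = 0; // vertices written so far
  let indexOffset = 0;  // indices written so far
  for (const g of geometries) {
    positions.set(g.positions, vertexOffset * 3);
    for (let i = 0; i < g.indices.length; i++) {
      // Rebase each index so it points at the vertex's new position.
      indices[indexOffset + i] = g.indices[i] + vertexOffset;
    }
    vertexOffset += g.positions.length / 3;
    indexOffset += g.indices.length;
  }
  return { positions, indices };
}

// Two triangles, each with its own three vertices.
const triangleA = { positions: [0, 0, 0, 1, 0, 0, 0, 1, 0], indices: [0, 1, 2] };
const triangleB = { positions: [2, 0, 0, 3, 0, 0, 2, 1, 0], indices: [0, 1, 2] };
const merged = mergeGeometries([triangleA, triangleB]);
console.log(Array.from(merged.indices)); // indices 0 through 5
```

Every vertex has to be visited and copied, so the work grows with the total vertex count of everything being merged.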

Let’s scale this up a bit: How does this technique handle 300 strings? Not so well, it turns out. Refresh the page a few times. You’ll see a slight delay before the text appears. When we crank this example up to 1000 strings, the application doesn’t crash, but it takes ten or twenty seconds to render. This is not good!

At this point, if you measure which operation takes the longest, you will see issues stemming from the “merge” function. Merging just takes a long time. It turns out the more vertices you have, the longer it takes to merge them all together. I guess the only answer is to have fewer vertices… but how?!

Failure #3: String Textures

If we can’t represent a string as an unbounded set of vertices, let’s bound how many vertices we can have. At the very least we can likely make the vertex count more predictable. A common technique in this scenario is to create a texture representing the more complicated object, and paste it on top of our geometry. The way to do this in WebGL with dynamic text is to create a texture using HTML5 canvas then write strings to that texture.

This technique worked fairly well. Now we have 6 vertices and 1 face per string. The problem, however, is that we are creating 1 texture per string, which means we have 1 draw call per texture. This is just as bad as our first method.
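As a rough illustration of where the fixed vertex count comes from, here is a sketch of one string’s geometry, assuming a non-indexed quad drawn as two triangles with the string’s texture stretched across it. The `buildStringQuad` name and its parameters are hypothetical, and the canvas step that produces the texture is left out.

```javascript
// Build a non-indexed quad (two triangles, six vertices) for one string.
// width/height are the size of the label in world units. The whole
// texture (UV 0..1) is stretched across the quad.
function buildStringQuad(x, y, width, height) {
  const left = x;
  const right = x + width;
  const bottom = y;
  const top = y + height;
  return {
    // x, y per vertex: first triangle, then second triangle
    positions: [
      left, bottom, right, bottom, right, top,
      left, bottom, right, top, left, top,
    ],
    // u, v per vertex, in the same order as positions
    uvs: [
      0, 0, 1, 0, 1, 1,
      0, 0, 1, 1, 0, 1,
    ],
  };
}

const quad = buildStringQuad(10, 20, 64, 16);
console.log(quad.positions.length / 2); // 6 vertices, however long the string is
```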

Partial Failure #4: Merged String Textures + Merged Geometries

What we need now is a combined approach from our previous solutions. Namely, we need a solution that will both limit the number of draw calls and vertex count. This is where UV mapping comes into play.

UV mapping allows developers to select portions of a texture to layer on top of their geometries. With this technique in mind we can create 1 “super texture”, typically known as a sprite sheet, that represents all our text. This requires laying out all of your text side by side, incrementing a pointer as you write text to the sprite sheet, and keeping a lookup map of where each string is located to reference later. Combining this technique with our mesh merging technique works out really well. In our cartogram string texture example we render a staggering 4000 strings in nearly the same amount of time as our previous examples took to render 300 strings.
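The pointer-and-lookup-map bookkeeping described above can be sketched roughly like this. It is not Cartogram’s actual code: `packStrings`, the fixed row height, and the `measureWidth` callback (standing in for something like canvas text measurement) are all assumptions for the example.

```javascript
// Lay strings out left to right, wrapping to a new row when a string would
// overflow the sheet. Returns a lookup map from string to its pixel rect and
// its normalized UV rect. `measureWidth` is a hypothetical callback that
// returns the pixel width of a string.
function packStrings(strings, sheetWidth, sheetHeight, rowHeight, measureWidth) {
  const lookup = new Map();
  let penX = 0; // the "pointer": where the next string will be written
  let penY = 0;
  for (const text of strings) {
    if (lookup.has(text)) continue; // identical strings share one slot
    const width = Math.ceil(measureWidth(text));
    if (width > sheetWidth) throw new Error(`"${text}" is wider than the sheet`);
    if (penX + width > sheetWidth) {
      penX = 0;
      penY += rowHeight;
    }
    if (penY + rowHeight > sheetHeight) throw new Error("sprite sheet is full");
    lookup.set(text, {
      x: penX, y: penY, width, height: rowHeight,
      // UVs in 0..1. v is flipped here on the assumption that image rows run
      // top to bottom while texture v runs bottom to top. Whether that flip
      // is needed depends on how the texture is uploaded.
      u0: penX / sheetWidth,
      u1: (penX + width) / sheetWidth,
      v0: 1 - (penY + rowHeight) / sheetHeight,
      v1: 1 - penY / sheetHeight,
    });
    penX += width;
  }
  return lookup;
}

const fakeMeasure = (text) => text.length * 8; // pretend every character is 8px wide
const sheet = packStrings(["A1", "A2", "Row 10"], 64, 32, 16, fakeMeasure);
console.log(sheet.get("Row 10")); // wrapped onto the second row
```

Each string’s quad then uses its `u0`/`u1`/`v0`/`v1` values instead of the full 0..1 range, so every string can share one texture and one material.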

Now for a new challenge. On non-Retina devices, our text looks crunchy and, professionally speaking, “I hate it and think it’s stupid!” This is mainly caused by not being able to select a specific size for our text, so when we zoom the text loses quality. I spent maybe 2 weeks across several months tweaking settings, but no matter what I did our text still looked terrible. “Should we turn anti-aliasing on or off?”, “Should we make the text bolder?”, “ugh we don’t want it to be bold”, “ok fine it’s bold”, “dammit it still looks stupid!” “Where do we go from here???” We threw all our tricks at the problem and solved the performance issues, but we were left with quality issues.

Solution #5: Success! SDF Fonts + Glyph Texture and Compromises

Actually our UX designer was more frustrated with the text than we were. While browsing online he stumbled across various articles about signed distance fields (SDF). We were skeptical, and knew SDF would take some work to integrate.

So what is SDF? In short, we create a sprite sheet with every character we care about from our font, and encode some extra data about smoothing (each pixel’s distance to the nearest glyph edge) in place of the alpha channel of the sprite sheet. We then use this data in our shader to render a crisp, smooth character at any zoom level. This technique works pretty well. It is slower, but the trade-offs are worthwhile. In our example here we render 1000 strings. Our Cartogram repo implements an SDF font renderer. In recent months, a number of packages have popped up on npm to assist with SDF-related things as well.
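To give a feel for what the shader does with that distance value, here is a CPU-side sketch of the usual SDF edge test. It is not our actual shader: it assumes the texture stores distance in 0..1 with 0.5 sitting on the glyph outline, and `smoothingForScale` is just one common heuristic for shrinking the soft edge as glyphs get bigger on screen (its `spread` parameter is assumed to be the spread used when the SDF texture was generated).

```javascript
// GLSL-style smoothstep, reproduced on the CPU for illustration.
function smoothstep(edge0, edge1, x) {
  const t = Math.min(Math.max((x - edge0) / (edge1 - edge0), 0), 1);
  return t * t * (3 - 2 * t);
}

// distance: value sampled from the SDF texture, in 0..1, where 0.5 is
// assumed to lie exactly on the glyph outline.
// smoothing: half-width of the soft edge, in the same 0..1 units.
function sdfAlpha(distance, smoothing) {
  return smoothstep(0.5 - smoothing, 0.5 + smoothing, distance);
}

// Hypothetical mapping from on-screen scale to smoothing: the larger the
// glyph is drawn, the narrower the soft band needs to be.
function smoothingForScale(scale, spread) {
  return 0.25 / (spread * scale);
}

const smoothing = smoothingForScale(2, 4); // 0.03125
console.log(sdfAlpha(0.4, smoothing)); // 0: outside the outline, fully transparent
console.log(sdfAlpha(0.5, smoothing)); // 0.5: on the outline
console.log(sdfAlpha(0.6, smoothing)); // 1: inside the glyph, fully opaque
```

Because the edge is recomputed per pixel from a distance rather than read from a pre-rasterized bitmap, the outline stays sharp when the quad is scaled up.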

We had some prep work we had to do before we could use this technique:

  1. First we had to generate SDF textures for 2 different font faces. We did this using Hiero from libgdx (a cross-platform game engine).
  2. Then we converted the output from Hiero to JSON.
  3. Next we created some magic to map smoothing to zoom distance.
  4. Lastly, for each string we iterated through all the characters, created geometries for them, mapped the UV values from the texture and merged them into the rest of our scene.
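The last step can be sketched as follows. The glyph table below is made up for illustration and uses BMFont-style field names (`x`, `y`, `width`, `height`, `xoffset`, `yoffset`, `xadvance`); the real JSON converted from Hiero may be shaped differently, and `layoutString` is a hypothetical helper, not part of Cartogram.

```javascript
// Hypothetical glyph table in a BMFont-like shape (pixel units):
// x, y, width, height locate the glyph in the texture; xoffset/yoffset
// position it relative to the pen; xadvance moves the pen afterwards.
const font = {
  textureWidth: 256,
  textureHeight: 256,
  chars: {
    H: { x: 0, y: 0, width: 20, height: 24, xoffset: 1, yoffset: 0, xadvance: 22 },
    i: { x: 20, y: 0, width: 6, height: 24, xoffset: 1, yoffset: 0, xadvance: 8 },
  },
};

// Produce one textured rectangle per character. Each rectangle would then be
// turned into vertices and merged into the rest of the scene geometry.
function layoutString(text, font, scale) {
  const quads = [];
  let penX = 0;
  for (const ch of text) {
    const glyph = font.chars[ch];
    if (!glyph) return null; // caller decides how to handle missing glyphs
    quads.push({
      x: (penX + glyph.xoffset) * scale,
      y: glyph.yoffset * scale,
      width: glyph.width * scale,
      height: glyph.height * scale,
      // Normalized texture coordinates of this glyph within the sheet.
      u0: glyph.x / font.textureWidth,
      u1: (glyph.x + glyph.width) / font.textureWidth,
      v0: glyph.y / font.textureHeight,
      v1: (glyph.y + glyph.height) / font.textureHeight,
    });
    penX += glyph.xadvance;
  }
  return quads;
}

const hiQuads = layoutString("Hi", font, 0.5);
console.log(hiQuads.length, hiQuads[1].x); // 2 11.5
```

Unlike the string texture approach, the vertex count now grows with the number of characters, which is part of why this method is slower.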

Whew, now we have beautiful text at every resolution and zoom level!

It isn’t all sunshine and daisies though:

  1. SDF works great when you have a font that supports all the characters you need. The manual nature of SDF means there is no font stack. If the user gives us some fancy unicode character that isn’t in our main font, we can’t render it with SDF. The solution for us was to fall back to the previous method of rendering text. It does look worse, but your program will definitely not crash and the text will appear as it should.
  2. Kerning… yeah, we need it. We don’t have a solution for this yet. Maybe some other SDF font generator has a way to pull out kerning information from a font. If someone finds something, we’d love to see it. Until then enjoy your slightly non-monospaced text.
  3. None of our solutions currently account for multiline text. While we don’t think it would be too hard to account for multi-line text, we would be adding another level of complexity. We’ve been using tooltips layered on top of our scene for this purpose.
  4. This solution is slower than our previous one. We just need to accept that and render less text at any given moment.
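Two of these caveats are easy to sketch in code. The first function is a guess at the kind of check that decides between the SDF path and the string texture fallback. The second shows how kerning could slot into layout if pair adjustments were available. We don’t have that data, so the `kerning` map and its “previous character + current character” key format are purely hypothetical.

```javascript
// Caveat 1: choose a renderer per string. A string containing any character
// the SDF font lacks would go to the string-texture path instead.
function canRenderWithSdf(text, glyphs) {
  for (const ch of text) {
    if (!glyphs.has(ch)) return false;
  }
  return true;
}

// Caveat 2: with kerning pairs, the pen is nudged by the pair adjustment
// before each glyph is placed. Pairs without an entry get no adjustment.
function measureWithKerning(text, advances, kerning) {
  let penX = 0;
  let prev = null;
  for (const ch of text) {
    if (prev !== null) penX += kerning.get(prev + ch) || 0;
    penX += advances.get(ch) || 0;
    prev = ch;
  }
  return penX;
}

const advances = new Map([["A", 14], ["V", 14]]); // made-up advance widths
const kerning = new Map([["AV", -2]]);            // made-up pair adjustment

console.log(canRenderWithSdf("AV", advances));  // true
console.log(canRenderWithSdf("A★", advances));  // false, so use the fallback
console.log(measureWithKerning("AV", advances, kerning)); // 26
console.log(measureWithKerning("VA", advances, kerning)); // 28
```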

For now we are happy with the sum of these approaches. It would, however, be ideal if text were handled automatically in WebGL, or handled more optimally in ThreeJS. The OpenGL ES communities must have the same issues, as illustrated by LibGDX solving some of the same problems. Wouldn’t it make sense to handle text at a much lower level if everyone is building tools to solve it on their respective platforms?

Has anyone found out how to solve the issues listed above with SDF fonts? If you want to vent about the state of OpenGL text, or have ideas, then shoot a message over to me: @parrissays!