Sunday, September 25, 2011

GLSL

Most people have already seen GLSL in action in several Minecraft mods, ranging from bump- and parallax mapping to bending the world into one giant acid trip. Other games, such as Mythruna, use it out of the box for effects like HDR bloom.

Personally, I prefer Hexahedra's current look, with the simple textures and the blocky lighting. But there are several other interesting things possible that have nothing to do with aesthetics.

One of them is the use of texture arrays. An old technique to speed up rendering is to use a texture atlas: a large texture that combines several smaller ones. The drawback is that such textures cannot be tiled, so every cube has to be drawn separately. Minecraft uses (used?) a texture atlas that is exactly one tile wide, so the textures can at least be tiled horizontally. Strips of cube faces can be merged together, which cuts back on the amount of work the GPU needs to do.


A texture array is a kind of 3-D texture. The individual textures are arranged in a stack. Each slice can now be tiled both horizontally and vertically, so we can merge even more faces. The gains are the most obvious for open water, where 256 cubes with the same texture and lighting can be merged into a single, large square.
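To get a feel for the difference, here is a minimal Python sketch that counts quads for a 16x16 patch of identical faces, once with horizontal strips only (the one-tile-wide atlas case) and once with full rectangles (the texture array case). The function names and the grid representation are made up for illustration; this is not Hexahedra's actual mesher.

```python
def merge_strips(grid):
    """Merge horizontally adjacent equal cells into (row, col, width, value) strips."""
    strips = []
    for r, row in enumerate(grid):
        c = 0
        while c < len(row):
            start = c
            while c < len(row) and row[c] == row[start]:
                c += 1
            strips.append((r, start, c - start, row[start]))
    return strips


def merge_rects(grid):
    """Greedy merge: grow each unused cell to the right, then downward."""
    rows, cols = len(grid), len(grid[0])
    used = [[False] * cols for _ in range(rows)]
    rects = []
    for r in range(rows):
        for c in range(cols):
            if used[r][c]:
                continue
            value = grid[r][c]
            w = 1
            while c + w < cols and not used[r][c + w] and grid[r][c + w] == value:
                w += 1
            h = 1
            while r + h < rows and all(
                not used[r + h][c + i] and grid[r + h][c + i] == value
                for i in range(w)
            ):
                h += 1
            for dr in range(h):
                for dc in range(w):
                    used[r + dr][c + dc] = True
            rects.append((r, c, w, h, value))
    return rects


water = [["water"] * 16 for _ in range(16)]
print(len(merge_strips(water)))  # 16 strips, one per row
print(len(merge_rects(water)))   # 1 large square
```

The greedy pass is not guaranteed to find the smallest possible number of rectangles on mixed terrain, but it is simple and handles the open-water case perfectly.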

And in one fell swoop, it also solves all problems with textures bleeding into each other at the edges because of mipmapping and rounding errors.

There's another advantage. Every corner used to be stored as three shorts (x, y, z) for the position, two floats (u, v) for the coordinates in the texture atlas, and three bytes (r, g, b) for the lighting. That's 17 bytes per corner, or 68 bytes per face.

Such restrictions no longer exist in GLSL. We can put the coordinates in bytes. (I'd put them in nibbles, but the range is 0-16, not 0-15.) Now that we're using texture arrays, we need a short to pick a slice, and two nibbles for the u,v coordinates. Three more nibbles for the lighting: ambient, sunlight, and artificial light. The actual light colors are grabbed from uniforms, global values we need to set only once. That rounds up to 8 bytes per corner, so we're down to 32 bytes per face. Nice!
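The byte counts are easy to double-check with Python's struct module. This is only a sketch: the field order and the way the nibbles are packed into bytes are assumptions, and pack_vertex is a hypothetical helper, not the engine's real vertex format.

```python
import struct

# Old layout: x, y, z as shorts; u, v as floats; r, g, b as bytes.
OLD_VERTEX = struct.Struct("<3h2f3B")
# New layout (assumed packing): x, y, z as bytes; slice as a short;
# u and v sharing one byte; three light nibbles in a 16-bit field.
NEW_VERTEX = struct.Struct("<3BHBH")


def pack_vertex(x, y, z, tex_slice, u, v, ambient, sun, artificial):
    """Pack one corner into 8 bytes. Positions are 0-16, nibbles are 0-15."""
    if not all(0 <= p <= 16 for p in (x, y, z)):
        raise ValueError("position out of range")
    if not all(0 <= n <= 15 for n in (u, v, ambient, sun, artificial)):
        raise ValueError("nibble out of range")
    uv = (u << 4) | v
    light = (ambient << 8) | (sun << 4) | artificial
    return NEW_VERTEX.pack(x, y, z, tex_slice, uv, light)


print(OLD_VERTEX.size * 4)  # 68 bytes per face
print(NEW_VERTEX.size * 4)  # 32 bytes per face
```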

Hmm... The u,v coordinates now move in lockstep with the vertex positions. It turns out that by storing a normal, we can go from world coordinates to texture coordinates by a simple multiplication. We only have six possible normals, so that's three bits. And we're down to 28 bytes per face. Even better, we can now use the normals for the ambient lighting as well. Dusk and dawn are going to look spectacular if we can give the west and east faces, respectively, a red or pink hue.



I look at the top of the screen where it says "20 FPS". Frown. Revert to previous shaders. 60 FPS. The fuck?

Turns out a simple lookup in a table with six 2x3 matrices is painfully slow in GLSL. Saving those 4 bytes per face might not be worth it by itself, but it would be neat to have anisotropic ambient light. Perhaps a lookup in a small 1D texture would solve this? I'll have to spend some more time on this.
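For reference, the normal-to-texture-coordinate trick looks roughly like this when written out on the CPU in Python. The order of the six normals and the choice of which axis becomes u or v are assumptions made for this sketch; a real version would also need to flip signs so textures aren't mirrored on opposite faces.

```python
# One 2x3 matrix per face normal. Row 0 selects u, row 1 selects v.
UV_MATRICES = [
    ((0, 1, 0), (0, 0, 1)),  # 0: +x face, u = y, v = z
    ((0, 1, 0), (0, 0, 1)),  # 1: -x face
    ((1, 0, 0), (0, 0, 1)),  # 2: +y face, u = x, v = z
    ((1, 0, 0), (0, 0, 1)),  # 3: -y face
    ((1, 0, 0), (0, 1, 0)),  # 4: +z face, u = x, v = y
    ((1, 0, 0), (0, 1, 0)),  # 5: -z face
]


def texcoord(normal_index, position):
    """Multiply the 2x3 matrix for this normal with the (x, y, z) position."""
    row_u, row_v = UV_MATRICES[normal_index]
    u = sum(a * p for a, p in zip(row_u, position))
    v = sum(a * p for a, p in zip(row_v, position))
    return u, v


print(texcoord(4, (3, 5, 7)))  # (3, 5)
print(texcoord(0, (3, 5, 7)))  # (5, 7)
```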

Tuesday, July 26, 2011

New test data

There has been a lot of activity recently around Cubic Chunks, a Minecraft mod that raises the height limit to 4096. Some of the old yMod maps have been dusted off and converted to this new format. Since it is based on McRegion, it isn't difficult to import into Hexa.

One of those maps is ModernCraft. I just wanted to show some screenshots, because it looks awesome.




Tuesday, July 19, 2011

Occlusion culling

The best way to speed things up is to send, load, and draw as little as possible. Hexahedra now uses hardware occlusion culling at the client side to determine which chunks to fetch over the network and show to the player.

Hardware occlusion checking is done by drawing the scene first, and then pretending to draw a box (or any convenient primitive) around whatever you want to cull. The GPU then tells you if this box would have been visible or not, had it actually been drawn. If it is visible, you can then proceed to load and draw the actual million-triangle object as usual.


How not to do it

It would be nice if we could check if the 16x16x16 bounding box of a chunk is visible, before we grab it from the server, decompress it, make a mesh, and push it to the graphics card.

We run into a couple of problems right away. First, to make sure the check is useful, the chunks that could potentially occlude it must be drawn first. For a chunk somewhere straight in front of the player, there's only one potential occluder. (For example, the chunk at {2,0,0} is only covered by the one at {1,0,0}.) Most chunks will have 2 or 3 potential occluders, though. And to determine whether to draw them or not, we'll have to check their occluders first...

Technical complications aside (chunks cannot be turned into meshes until their six neighbors are loaded, both the GPU queries and network queries are asynchronous, etc.) it is easy to see this is tricky to get working. And it doesn't work that well either. Suppose we have a chunk A, and three neighboring chunks B, C, and D that are A's potential occluders. B and C are visible, but they're on the edge of the screen. D is just off the edge of the screen, so we can't do a visibility check just yet. But until it becomes visible, we can't issue the query for A. (Well, we could,  but that would result in way too many queries and loads of false positives.)  However, if B and C don't cover A completely, we will see an ugly gap until the player turns his head far enough for D to enter the screen.

So that's where I lost a lot of time last month. :P


Better occlusion checking

The solution is not to check the bounding boxes, but the faces that touch the chunks that are already visible. Here's an illustration of how it is implemented right now:


Step 1: the player is standing in this chunk, which is of course visible to the player. Six queries are issued, one for every face of this chunk. (In the illustration you see only four.) The queries are shown as lines. The dotted lines are outside the player's view, so we can't resolve them yet. The solid lines, however, will be answered by the GPU soon.

Step 2: The GPU has told us both faces are visible, so we fetch the corresponding chunks from the server, and draw them. For the new visible chunks, we issue new visibility queries. There are now four queries underway, and four are on hold.

Step 3: Some of the queries of chunk 1 have returned, and two faces are invisible. We mark the chunks as occluded, and we don't load them or issue any further queries for them.

Step 4: More data comes back, but in the meantime the player turns his head to the right. The face on the right enters the screen, and the query becomes active.

Step 5: The face is visible, and chunk 2 is loaded and drawn. Three new queries are sent to the GPU, including the one for the face that borders the occluded chunk above it.

Step 6: That face is visible, so chunk 3 is now marked visible as well. The other two faces were occluded.
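
Stripped of all the asynchronous bookkeeping, the steps above boil down to a flood fill through visible faces. Here is a minimal Python sketch, assuming a face_visible(chunk, direction) callback that stands in for the GPU query and answers immediately. The names, the max_distance cutoff, and the toy world are all hypothetical, and queries for off-screen faces are not put on hold here.

```python
from collections import deque

DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_visible_chunks(start, face_visible, max_distance):
    """Spread outward from the player's chunk through faces that pass the query."""
    visible = {start}
    occluded = set()
    pending = deque([start])
    while pending:
        chunk = pending.popleft()
        for d in DIRECTIONS:
            neighbor = tuple(c + s for c, s in zip(chunk, d))
            if neighbor in visible:
                continue
            if max(abs(n - s) for n, s in zip(neighbor, start)) > max_distance:
                continue
            if face_visible(chunk, d):
                # A chunk marked occluded earlier can still become visible
                # through another face, as in steps 5 and 6.
                visible.add(neighbor)
                occluded.discard(neighbor)
                pending.append(neighbor)
            else:
                occluded.add(neighbor)
    return visible, occluded


# Toy world: every chunk directly around the player is solid rock,
# so nothing can be seen through any of its faces.
opaque = {
    (x, y, z)
    for x in (-1, 0, 1) for y in (-1, 0, 1) for z in (-1, 0, 1)
    if (x, y, z) != (0, 0, 0)
}


def face_visible(chunk, direction):
    return chunk not in opaque


visible, occluded = find_visible_chunks((0, 0, 0), face_visible, max_distance=2)
print(len(visible))  # 7: the start chunk plus its six face neighbors
```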


Coming up

It appears there is still a problem where I get false negatives back. This is annoying, because the terrain shows a hole at that point. Here's a screenshot with one such hiccup. The queries are drawn in green, for debugging purposes.



Usually a shake of the mouse solves it, and the algorithm chugs along as if nothing happened, but it's really annoying and ugly.  Also, I need to integrate the heightmap, so I can skip air chunks. (As you can see, there's a lot of unnecessary checking going on up in the sky. This will speed things up tremendously.)

Once that's taken care of, I hope to be able to post some videos showing all this in action, and also some statistics on just how much memory and how many triangles this saves under realistic conditions.

Tuesday, May 24, 2011

Scripting

Luabind is awesome.

I've spent some time with this library, and it's a really amazing C++ wrapper for Lua. Binding functions, classes, even callbacks and anonymous functions, it's a breeze.

Materials were never supposed to be hardcoded, but now the framework finally allows you to specify your own in a Lua script:

define_material(1, { name = "dirt", texture = { 1 }, hardness = 1.5 })
define_material(2, { name = "grass", texture =  { 1, 0, 1 } })
define_material(3, { name = "water", texture =  { 239 }, transparency = 0.5, viscosity = 0.2 })

...And so on. My first idea was to read this from the world database, or an XML file, but nothing is as convenient and flexible as this.

(The "texture" property is an array with 1 to 6 indices to the texture atlas. If only one value is given, all faces of the cube will use that texture. In the case of grass, texture '1' is used for the sides, '0' for the top, and '1' again for the bottom. An array with 6 values would specify different textures for every face.)
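
The expansion rule for the "texture" property can be sketched in a few lines of Python. The face order (four sides, then top, then bottom) is an assumption, and since the text only describes the 1-, 3- and 6-value cases, this hypothetical helper rejects anything else.

```python
def expand_textures(texture):
    """Expand a 1-, 3- or 6-element list into six per-face texture indices.

    Assumed face order: four sides, then top, then bottom.
    """
    if len(texture) == 1:
        return list(texture) * 6
    if len(texture) == 3:
        side, top, bottom = texture
        return [side] * 4 + [top, bottom]
    if len(texture) == 6:
        return list(texture)
    raise ValueError("expected 1, 3 or 6 texture indices")


print(expand_textures([1, 0, 1]))  # grass: [1, 1, 1, 1, 0, 1]
```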

Strictly speaking, this isn't real scripting yet, just configuration. So let's take a look at events:

function launch (player)
    player:change_speed(0,0,10)
    player:shout("WHEE!")
end

on_approach(40, 40, 0, 3, 6, launch)

On_approach calls a function whenever a player gets near (40, 40, 0), within 3 blocks distance. (The coordinates are given relative to the center of the world). Any player entering this area will be launched into the air, by adding 10 m/s to their vertical speed. This event won't fire again until the player is further away than 6 blocks.
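
The two radii give the trigger some hysteresis, so it doesn't fire over and over while a player hovers around the edge. A minimal Python sketch of that logic follows; the class name, the per-player bookkeeping, and the choice of "less than or equal" for the inner radius are assumptions, not the engine's actual code.

```python
import math


class ApproachTrigger:
    """Fire once when a player comes within `inner`; re-arm beyond `outer`."""

    def __init__(self, x, y, z, inner, outer, callback):
        self.center = (x, y, z)
        self.inner = inner
        self.outer = outer
        self.callback = callback
        self.armed = {}  # player id -> False after firing, until they leave

    def update(self, player_id, position):
        distance = math.dist(self.center, position)
        if distance <= self.inner and self.armed.get(player_id, True):
            self.armed[player_id] = False
            self.callback(player_id)
        elif distance > self.outer:
            self.armed[player_id] = True


launched = []
trigger = ApproachTrigger(40, 40, 0, 3, 6, launched.append)
trigger.update("alice", (41, 40, 0))  # inside 3 blocks: fires
trigger.update("alice", (44, 40, 0))  # between 3 and 6: nothing happens
trigger.update("alice", (41, 40, 0))  # back inside, but not re-armed yet
trigger.update("alice", (47, 40, 0))  # beyond 6 blocks: re-armed
trigger.update("alice", (40, 40, 0))  # fires again
print(launched)  # ['alice', 'alice']
```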

Similar triggers could be added to materials, items, and mobs. This, for example...

define_material(2, { on_touch = launch })

would be really, really annoying. But it would not overwrite anything from the first definition, it would only add a trigger.

Next thing I'll need to do is look at security. It's very important that the scripts are properly sandboxed. Also, I should study Garry's Mod, to get an idea of how to properly design such an API.


Edit: I'm jotting down some ideas here.

Saturday, May 21, 2011

More screenies

So, people wanted to see more screenshots! Let's take a look at the first two steps of the terrain generator:


The first step uses 2-D Perlin noise to generate a height map. Nothing fancy.


The second step mi-... WHAT.

*fixes bug*


The second step mixes in 3-D Perlin noise in some places. (This whole "in some places" bit will be fleshed out later into a nice, flexible biome system.) The general shape of the terrain is still the same as in the first picture, though.

3-D Perlin noise is, simply put, a function that takes an x,y,z position as its input, and returns a value between -1.0 and 1.0. It is a bit like measuring the temperature at a given point. For two positions that are right next to each other, the returned values will be almost the same. But two points far apart will return completely unrelated values. This value is multiplied by some factor, 30 in this case.

Once we have this value, we subtract the height of that point relative to the surface that was generated in step 1. If the result is positive, a block of grass or dirt is placed. If not, it's air. The end result is some weird craggy terrain, and some floating boulders.
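
In Python, the test for a single block could be sketched like this. Note that fake_noise is only a smooth stand-in so the example runs on its own; it is not Perlin noise. The function names are made up, and z is assumed to be the vertical axis.

```python
import math


def fake_noise(x, y, z):
    """Smooth placeholder with values in [-1, 1]. NOT real Perlin noise."""
    return math.sin(x * 0.13) * math.cos(y * 0.17) * math.sin(z * 0.11 + 1.0)


def is_solid(x, y, z, surface_height, amplitude=30.0, noise=fake_noise):
    """Scaled noise minus the height above the step-1 surface; positive means solid."""
    relative_height = z - surface_height  # positive above the surface
    return noise(x, y, z) * amplitude - relative_height > 0


def flat(x, y, z):
    # With the noise switched off, we get the plain height map back.
    return 0.0


print(is_solid(0, 0, 5, surface_height=10, noise=flat))   # True: below the surface
print(is_solid(0, 0, 15, surface_height=10, noise=flat))  # False: above it
```

With a factor of 30, a noise value close to 1.0 can put solid blocks up to almost 30 blocks above the height map, which is where the floating boulders come from. Negative values carve air out of the ground in the same way.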

The same valley, seen from the other side:



And some steeper terrain, while we're at it.



Thursday, May 19, 2011

Bigass dragon invades Broville

The Minecraft world reader used to be a storage module. So instead of the usual SQLite database, the game would read and write all terrain data to a Minecraft save game. I have upgraded it to be a proper world generator, so I can combine it with other generators (and go back to the much faster SQLite for storage).


This screenshot shows the game running with three generators: infinite flatland, Broville, and the Binvox reader. Now that the basic framework is in place, I can move on to creating more interesting generators, such as a tree planter, a cave digger, rivers, etc. ("Interesting" in the sense that they use the output of the previous modules. The ones shown here just superimpose their data, much like the layers in Photoshop.)
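
The layering idea can be sketched in Python as a chain of functions, each of which receives whatever the previous layers produced. Everything here is hypothetical (the block ids, the two toy generators, and the per-block interface); the real generators work on whole chunks.

```python
AIR, DIRT, STONE = 0, 1, 2


def flatland(x, y, z, current):
    """Base layer: dirt everywhere below z = 0."""
    return DIRT if z < 0 else current


def single_tower(x, y, z, current):
    """Overlay: a five-block stone tower at the origin, drawn over anything below."""
    if (x, y) == (0, 0) and 0 <= z < 5:
        return STONE
    return current


def generate(x, y, z, generators):
    """Run the generators in order, like layers stacked on top of each other."""
    block = AIR
    for generator in generators:
        block = generator(x, y, z, current=block)
    return block


layers = [flatland, single_tower]
print(generate(0, 0, -1, layers))  # 1: dirt from the base layer
print(generate(0, 0, 3, layers))   # 2: stone from the tower
print(generate(1, 0, 3, layers))   # 0: still air
```

Passing the current block along is what would let a tree planter or cave digger react to the terrain underneath, instead of blindly overwriting it.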

A few major changes under the hood: the world is now 4,294,967,296 blocks in every direction, including height. The old limitation of 65,536 blocks still applies to some terrain generators, though.  

A lot of the code now uses Boost threadpool, a not-yet-official library for dividing up work across multiple threads. It works great on my 4 cores, although the code for feeding the results back to OpenGL is still a bit ugly. The result runs very smoothly, but it's glitchy as well. Ergh, concurrency bugs... *headdesk*

Edit:
More screenies, yay.





I just noticed in that last screenie that one of the Minecraft chunks is missing. The shadow inside that square hole looks great though. ;P

Had a couple of deadlocks and spontaneous suicides in the client while flying around. Holy shit this needs some serious debugging. Also, a little noclip screenshot to show that there are still caves underneath all this:


Saturday, April 9, 2011

Back on track

Alright, I'm back from my long trip across Mexico, and development has resumed!

I've ditched Irrlicht in favor of SFML and straight-up OpenGL. Most importantly, this will help me to understand how 3-D works closer to the metal. But it also gives me more possibilities to get better performance and a smaller memory footprint. I'm currently trying to wrap my brain around CHC++; it looks quite promising.

Right now, the code is at the same point as it was back in early January, so instead of the usual screenshots, here's how the lightmap is constructed. First, there's the blueish light from the sky dome:


In the second pass, the directional sunlight is added to that: