Friday, April 29, 2011

Working on some more performance optimizations.

I've been spending the last couple weeks working on performance optimizations.

The area I've been looking at, mainly, is where I store what cubes are visible in a chunk.  I'm currently using a Dictionary<Vector3,bool> with the Vector3 being a position of the cube and the bool as just a throw-away variable.  It's the smallest possible type as it only takes up 1 byte.  The reason I'm using a Dictionary is I want to be able to do a lookup without having to search through the whole array.  Dictionaries are indexed and have an O(1) key search, but they also use a good amount of overhead.  I decided to test different types of dictionaries and see how much space they take.  I created a trio of nested for loops to insert 32,768 objects in the Dictionary (the size of a full chunk).  By changing the types, I was able to see the amount of bytes each Dictionary took.  Here are my results:


18,568 dictionary<string,bool>
15,444 dictionary<vector3,bool>
15,424 dictionary<vector3,int/int16>
13,620 dictionary<int,bool>
13,612 dictionary<int16,bool>
13,608 dictionary<int,int>
13,588 dictionary<int16,int16>

I knew the <string,bool> would be high, but I tested it anyway for good measure.  My current solution is 2nd highest (<vector3,bool>) and the lowest solution simply isn't feasible (<int16,int16>).  What is feasible, though, is the <int,int> or <int,bool> solutions.  The only problem is that it'd have to be a 1-way conversion.

When I instantiate a chunk, I go through every possible cube position x,y,z and check if a cube is supposed to be there.  If it is, I put it into an Dictionary<Vector3,bool> called allCubesVisible.  After that's all done, I go to the next step which is detect which cube faces should be displayed.  I use a foreach(Vector3 location in allCubesVisible.Keys) to go through the dictionary and build the faces.  If a cube is visible, I add it to a master Dictionary<Vector3,bool> called allCubesallCubes only has cubes with visible faces.  From here on out, allCubesVisible becomes a lookup table and nothing else (it's cheaper to check if a key exists in a dictionary than it is to run the Perlin Noise algorithm each time).

So my thinking is that once I'm done figuring out the initial set of visible cubes, I can convert my allCubesVisible dictionary from <vector3,bool> to <int,int>.  Since I won't need to go through the allCubesVisible through a foreach anymore, I can convert it without looking back.  I found a formula online that I can use:   

index = x + (y * chunkSize) + (z * chunkSize * chunkSize);

[edit:  that formula doesn't work.  It generates duplicates.  Time to figure out a new one.]

I don't think there's a way to go back to a Vector3 position from that, but it'd still give me a unique index for each position that I can use as a lookup.

So anyway, I'm toying around with a lot of ideas.  One of the concepts I toyed with is destroying the allCubesVisible Dictionary and just running the Perlin Noise function check each time.  This resulted in a drastically lower memory profile, but chunks went from taking 100-200ms to be built to 4-500ms.  It let me build a lot more cubes though and I got several nice screenshots from it.

Neat chain of islands off the coast.



Mountain range out in the distance.

Wish I had a skybox so the in-game screenshot didn't look so bad..  maybe I'll take a break from optimization and dip my feet into .02a to add a skybox.