While working on a new demo video for the 0.7 release, I encountered speed problems. I was doing tests with 4,000 actors, hoping to move up to 10,000, and the calculation times were approaching 30 minutes per 12 frames. One of the points of BlenderPeople is to be able to do your initial simulations quickly enough so that you can rerun them with different parameters if you don't like how they turned out. But waiting an hour for 1 second of motion is way too much.
So, I've spent the last week really analyzing the code and the amount of time BP spends on each actor. After that, I went to work optimizing. Here's what I've done:
- Faster routine for finding the nearest actor, cutting this search time by 50%. A typical round evaluates this twice (once for opponents and once for allies).
- Removed some cruft from the functions that stick the Actors to the ground. There was some unnecessary branching, and other inefficiencies left over from the days when BP searched the whole ground mesh without the aid of a quadtree. This also got rid of some dumb errors where Actors couldn't find their place on the ground.
- Made a local cache of the ground quadtree. The majority of evaluation time for each Actor was taken up with sticking it to the ground and determining the surrounding terrain colors for pathfinding, both of which rely on the ground tree. The in-memory Python implementation of the ground tree works so much faster that I couldn't measure it. Python's timing tools, which I was using to do all of this, were showing times of 0.0 seconds for in-memory operations (too small for the timer to register) vs. 0.039 seconds (or thereabouts) for the database lookups to do the same thing.
- psyco. Someone suggested that I check it out a while ago, and I have. It's now being used in BlenderPeople, but it's one of those things that if it's there, it helps, but if it's not, it won't hurt. So, if you have psyco installed (piece of cake, really), you'll see a nice speedup. If not, no biggie, and you won't even know.
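
To give a feel for the last two items, here's a rough sketch. It is not the actual BlenderPeople code: `GroundCache`, `fake_database_lookup` and `average_seconds` are names made up for this example, a plain dictionary keyed on grid cells stands in for the real quadtree, and the 1 ms delay is an invented placeholder for a database round trip. The `try`/`except` at the top is the "helps if it's there, doesn't hurt if it's not" pattern for psyco.

```python
import time

# Optional accelerator: use psyco if it is installed, carry on silently if not.
try:
    import psyco
    psyco.full()
except ImportError:
    pass


class GroundCache:
    """In-memory cache in front of a slow ground lookup (hypothetical names)."""

    def __init__(self, slow_lookup, cell_size=1.0):
        self.slow_lookup = slow_lookup  # e.g. a function that queries the database
        self.cell_size = cell_size
        self.cells = {}                 # grid cell -> cached ground height
        self.misses = 0                 # how many times the slow path was used

    def height_at(self, x, y):
        # Map the position to a grid cell; only the first visit pays for the lookup.
        key = (int(x // self.cell_size), int(y // self.cell_size))
        if key not in self.cells:
            self.misses += 1
            self.cells[key] = self.slow_lookup(key)
        return self.cells[key]


def fake_database_lookup(key):
    """Stand-in for a database round trip; the delay is invented for the demo."""
    time.sleep(0.001)
    return 0.1 * key[0] + 0.2 * key[1]


def average_seconds(func, repeats):
    """Average many calls so that very fast operations don't just read as 0.0."""
    start = time.time()
    for _ in range(repeats):
        func()
    return (time.time() - start) / repeats


cache = GroundCache(fake_database_lookup)
cache.height_at(3.5, 7.2)  # first call goes through the slow lookup
print(average_seconds(lambda: cache.height_at(3.5, 7.2), 100000))
```

Timing a single cached call tends to come back as 0.0, which is what I ran into, so the sketch averages over many calls instead. Python's `timeit` module does the same job with less hand-rolling.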
So, overall, I've cut processing time way down. It's tolerable on my single Athlon XP 1800. Tests have shown decent speed increases when running the MySQL database on a separate machine. I've been looking at purchasing an Athlon 64 machine, and it would be great to do some benchmarks on it, especially if I could offload the MySQL work to my "old" XP 1800. That would rock. Of course, an ideal setup would be a dual proc machine, with Blender running on one proc and MySQL bound to the other. I wonder if an Athlon 64 networked to an XP 1800 would be faster than a single MP 2900 machine running two processors? If anyone wants to buy them for me, I'll be happy to let you know how it turns out.