Tuesday, 31 March 2009

Some Collisions, Multi-Threading & New Lighting

Got a few things done this week. The first was a basic collision detection setup.

The collision system had to be fast, so the most basic starting point was to use spheres for the collision checks. The chosen system was somewhat inspired by Nvidia's "Pearls" method from the Nalu demo and the theory behind the Adaptive Wisp Tree. In basic terms, collision spheres are arranged along the key hairs according to how many spheres are intended in total. When these spheres collide, repulsive forces to keep them apart are added to the physics calculation. On top of this, for speed, each hair is checked for collisions only against the hairs nearest to it, which is done by applying a 3x3 filter over an array. The problem that still remains with the collisions, though, is that hairs only react to collisions rather than trying to prevent them. This leads to a lack of equilibrium if the collision spheres are in contact in the hair's rest position.
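As a rough illustration of the repulsive-force idea (my own sketch with made-up names, not the project's actual code), overlapping spheres push apart along the line between their centres, with the force growing with the amount of overlap:

```cpp
#include <cassert>
#include <cmath>

struct Vec3   { float x, y, z; };
struct Sphere { float x, y, z, r; };   // centre and radius of a collision sphere

// If the two spheres overlap, return a repulsive force on sphere a pushing it
// away from sphere b; return zero force if they are not touching.
Vec3 RepulsionForce(const Sphere& a, const Sphere& b, float fStrength)
{
    Vec3 vDelta = { a.x - b.x, a.y - b.y, a.z - b.z };
    float fDist = std::sqrt(vDelta.x*vDelta.x + vDelta.y*vDelta.y + vDelta.z*vDelta.z);
    float fPenetration = (a.r + b.r) - fDist;      // how deeply they overlap
    if (fPenetration <= 0.0f || fDist == 0.0f)
        return Vec3{ 0.0f, 0.0f, 0.0f };           // not in contact
    float fScale = fStrength * fPenetration / fDist; // force grows with overlap
    return Vec3{ vDelta.x * fScale, vDelta.y * fScale, vDelta.z * fScale };
}
```

This force would then be added into the physics accumulator for the sphere's section of hair.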

As the application was beginning to struggle for speed, it seemed about time to implement a large optimisation. This was quite simply to move the physics and related function calls to a separate thread, leaving only the rendering in the main program. As the only shared resource between these tasks is the vertex buffer, which is only used in one function, a single mutex wrapped around these sections seems to suffice. The performance gain was quite substantial. Obviously this optimisation would not help on single-core CPUs.
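A minimal sketch of the split, using modern standard-library threading rather than whatever the project actually used (all names here are hypothetical): physics runs on a worker thread, rendering stays on the main thread, and one mutex guards the only shared resource.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::vector<float> g_vertexData(64 * 3, 0.0f);  // the shared vertex data
std::mutex         g_vertexMutex;               // single mutex guarding it

void PhysicsStep()
{
    // ... integrate forces, constraints and collisions into local state ...
    std::lock_guard<std::mutex> lock(g_vertexMutex);
    g_vertexData[0] += 1.0f;   // stand-in for writing updated positions
}

// Run the physics loop on a worker thread while the "render thread" (the
// caller) reads the shared data under the same mutex.
float RunFrames(int iSteps)
{
    std::thread physics([iSteps] {
        for (int i = 0; i < iSteps; ++i)
            PhysicsStep();
    });
    physics.join();            // in the real app this loop runs continuously
    std::lock_guard<std::mutex> lock(g_vertexMutex);
    return g_vertexData[0];    // e.g. Map the buffer while physics waits
}
```

Because only the buffer-update sections are locked, the two threads spend most of their time running in parallel, which matches the substantial gain described above.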

Another improvement is the implementation of the Kajiya and Kay lighting model. This lighting model works on tangents, treating each hair as an infinitely thin cylinder. The end result is a much more distributed specular highlight and better lighting in general.
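For reference, the standard Kajiya-Kay terms look something like the following (a CPU-side sketch of the published model, not the project's actual shader): the diffuse term is the sine of the angle between tangent and light, and the specular term raises a tangent-based analogue of the Phong dot product to a power.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Diffuse: sin of the angle between hair tangent T and light direction L
// (both assumed normalised).
float KajiyaKayDiffuse(const Vec3& T, const Vec3& L)
{
    float fTL = Dot(T, L);
    return std::sqrt(std::max(0.0f, 1.0f - fTL * fTL));
}

// Specular: (cos(T,L)*cos(T,V) + sin(T,L)*sin(T,V))^power, with V the view
// direction. This spreads the highlight along the hair strand.
float KajiyaKaySpecular(const Vec3& T, const Vec3& L, const Vec3& V, float fPower)
{
    float fTL = Dot(T, L), fTV = Dot(T, V);
    float fSinTL = std::sqrt(std::max(0.0f, 1.0f - fTL * fTL));
    float fSinTV = std::sqrt(std::max(0.0f, 1.0f - fTV * fTV));
    return std::pow(std::max(0.0f, fTL * fTV + fSinTL * fSinTV), fPower);
}
```

Because the terms depend only on the tangent rather than a per-pixel normal, the highlight stretches along the strand instead of forming a tight Phong-style dot.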

Thursday, 26 March 2009

Transparency

Finally starting to get things moving in the area of effects for the hair. It may not look like much progress, but the difficulty in implementing it was probably down to the early z-test in the graphics pipeline. What seemed to be happening was akin to the problems before Alpha to Coverage (basically a method for multi-sampling pixels in terms of depth to get the final blended colour) was introduced: the colours of most back pixels were not being alpha blended into the final result. After looking around and seeing some hints, it seemed that even with Alpha to Coverage implemented it remains wise to implement a pre-z pass as well, to prevent geometry being removed by the early z-test.

Because back objects must be drawn, they had to be ordered in back-to-front order. The sort method selected was std::sort, as it is unlikely that I could write a faster algorithm in the time available. The initial pre-z pass was too slow at first, so slow you could see the objects being rearranged. This was mainly because it was rearranging whole classes, which of course are large sections of memory. A slight rethink and rearrange later had the classes stored as pointers, which makes rearranging with a compare function handed to std::sort almost as fast as sorting a list of integers.
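The pointer-sorting idea can be sketched like this (hypothetical class and field names; the real objects obviously carry far more state):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct HairPatch { float fDepthFromCamera; /* vertex data, materials, etc. */ };

// Compare function handed to std::sort: larger depth = further away = drawn first.
bool FurtherFromCamera(const HairPatch* a, const HairPatch* b)
{
    return a->fDepthFromCamera > b->fDepthFromCamera;
}

// Only the pointers move during the sort, never the large objects themselves.
void SortBackToFront(std::vector<HairPatch*>& drawList)
{
    std::sort(drawList.begin(), drawList.end(), FurtherFromCamera);
}

// Small helper for demonstration: returns the depths in draw order.
std::vector<float> DepthsInDrawOrder(const std::vector<float>& depths)
{
    std::vector<HairPatch> patches;
    for (float d : depths) patches.push_back(HairPatch{ d });
    std::vector<HairPatch*> drawList;
    for (auto& p : patches) drawList.push_back(&p);
    SortBackToFront(drawList);
    std::vector<float> result;
    for (auto* p : drawList) result.push_back(p->fDepthFromCamera);
    return result;
}
```

Swapping two pointers is a handful of bytes regardless of how big the class is, which is why the pointer version sorts about as fast as a list of integers.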

Thursday, 19 March 2009

LoD, Interpolated Hairs and Length Constraints

A large step in making the application more viable for use in an actual game is Level of Detail filtering. As the hairs are built from polygonal strips, a relatively simple algorithm can be implemented using an index buffer. Strips are currently built of 64 vertices, 2x8 for each section of the cubic splines. This number is also handy in that it is a power of 2, which allows recursive halving of the number of vertices making up a strip, i.e. 64, 32, 16, 8. This means that instead of creating new vertex data for each detail level, only different sets of indices are needed to render with the same data in the vertex buffer. These index sets iterate through every vertex pair, every second pair, every fourth pair and every eighth pair. While the actual implementation is not as smooth as the theory suggests, the addition has become critical to performance with the addition of a geometry shader.
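A sketch of the index-set idea, assuming vertices are laid out as left/right pairs down the strip (my own reconstruction, not the project's code): each detail level keeps every 1st, 2nd, 4th or 8th pair, while the vertex buffer itself never changes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build a triangle-strip index set over uPairCount vertex pairs, keeping one
// pair in every uStride (1, 2, 4 or 8 for the four detail levels).
std::vector<uint32_t> BuildStripIndices(uint32_t uPairCount, uint32_t uStride)
{
    std::vector<uint32_t> indices;
    for (uint32_t uPair = 0; uPair < uPairCount; uPair += uStride)
    {
        indices.push_back(uPair * 2);        // left vertex of the pair
        indices.push_back(uPair * 2 + 1);    // right vertex of the pair
    }
    return indices;
}
```

So a 64-vertex strip (32 pairs) yields index sets of 64, 32, 16 and 8 entries as the stride doubles, all rendering from the same vertex buffer.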


The interpolated hairs are added to the application using a geometry shader. Geometry shaders are a relatively new addition to the shader architecture, but their most significant feature is the ability to produce more primitives than they take in. The basis for duplicating the hairs in the geometry shader is simple:
  1. Take in three vertices
  2. Add those vertices to the current triangle stream
  3. Restart the strip
  4. Add the same vertices again, slightly displaced
  5. Repeat steps 3 and 4 as needed
The main problem to note when using the geometry shader is that things can quickly become expensive, as all these strips must still pass through the pixel shader after being created.
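The steps above can be mimicked on the CPU as follows (the real version lives in an HLSL geometry shader using `RestartStrip`; everything here is a hypothetical stand-in):

```cpp
#include <cassert>
#include <vector>

struct Vertex { float x, y, z; };

// Emit iCopies displaced copies of one input triangle. Each inner vector is
// one strip; starting a new vector plays the role of RestartStrip in HLSL.
std::vector<std::vector<Vertex>> DuplicateTriangle(const Vertex& v0, const Vertex& v1,
                                                   const Vertex& v2, int iCopies,
                                                   float fSpacing)
{
    const Vertex input[3] = { v0, v1, v2 };   // step 1: take in three vertices
    std::vector<std::vector<Vertex>> strips;
    for (int iCopy = 0; iCopy < iCopies; ++iCopy)
    {
        float fOffset = fSpacing * iCopy;     // displacement for this copy
        std::vector<Vertex> strip;
        for (int i = 0; i < 3; ++i)           // steps 2 & 4: emit the vertices
            strip.push_back(Vertex{ input[i].x + fOffset, input[i].y, input[i].z });
        strips.push_back(strip);              // step 3: restart the strip
    }
    return strips;
}
```

The cost warning above falls out of this directly: four copies means four times the triangles reaching the rasteriser and pixel shader.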


Length constraints proved elusive at first, and the problems caused by slight inconsistencies came in an interesting range. One of the first mistakes in the implementation was using the old vertex data instead of the new, which for a while had vertices shooting off into infinity. A good deal of head scratching, debugging and class rearranging solved this problem only to present another. The next problem with the constraints is probably best described as imploding/exploding hair, as the very rigid constraints and the mass spring system seemed to have conflicting aims. At this point it should be noted that the method of constraint went something like this:

vMoving = vFixed + Normalize(vMoving - vFixed) * fIntendedDistance;

Where vMoving is the vertex to be moved and vFixed is static. It turns out this method is only viable for so much of the constraints. To prevent the imploding/exploding effect, most of the points of the hair must be constrained using a slightly different approach:

vector3 vDelta = v2 - v1;
float fDistance = vDelta.x*vDelta.x + vDelta.y*vDelta.y + vDelta.z*vDelta.z;
fDistance = sqrt(fDistance);
vDelta *= 0.5f * ((fIntendedDistance - fDistance) / fDistance);
v1 -= vDelta;
v2 += vDelta;


In all it now looks something like this:

Tuesday, 10 March 2009

Progress Report

Application Progress

Generally close to the intended timescale at current, but still slightly behind where things were hoped to be.

Framework

Most of the framework is complete, though further changes may be made, especially when optimisation becomes the main focus. The original framework was based on DirectX 9 experience, but changes had to be made for DirectX 10 to make work easier and generally improve the tidiness and accessibility of the objects in the application.

B-Spline generation and rendering proved fairly easy, though the maths may need rechecking. These formed the basis for the keyhairs, which are then rendered later on.

Rendering

Keyhairs are currently drawn as polygon strips based on the previously mentioned B-Splines. Béziers could probably have been used just as easily, but the higher level of continuity keeps the hair constrained to a fairly smooth degree.

Basic Phong shading gives the hairs ambient, diffuse and specular lighting. Implementing it has also given an awareness of the DirectX 10 and HLSL systems.

Physics

Implemented Verlet integration to handle the hair physics. Updates are constrained to a 10 millisecond frequency, both because variable time steps were causing errors and to reduce the performance impact. The mass spring system has also been implemented in the hair physics, based on the simple calculations of Hooke's law. Spring strength is set by dividing the base strength by a quadratic equation (based on Henrik Halen's work) whose parameters depend on the distance from the “root” of the hair.
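The spring-strength idea can be sketched as follows. The quadratic's coefficients here are made-up placeholders, not Halen's actual values, and the function names are my own:

```cpp
#include <cassert>

// Spring constant = base strength divided by a quadratic in the distance
// from the hair root, so springs get weaker further down the strand.
float SpringStrength(float fBaseStrength, float fDistanceFromRoot)
{
    const float fA = 1.0f, fB = 0.5f, fC = 1.0f;   // hypothetical coefficients
    return fBaseStrength /
           (fA * fDistanceFromRoot * fDistanceFromRoot + fB * fDistanceFromRoot + fC);
}

// Hooke's law: restoring force proportional to displacement from rest length.
float SpringForce(float fStrength, float fCurrentLength, float fRestLength)
{
    return -fStrength * (fCurrentLength - fRestLength);
}
```

At the root the strength is simply the base strength (the quadratic evaluates to its constant term), and it falls off smoothly along the strand.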

Documentation Progress

Probably lagging a little in keeping up with documentation. Definite need to work on the Blog and Dissertation more frequently.

Dissertation

Started writing the introduction sections and bullet-pointing the methods and implementation sections. The main introduction is mostly based on the proposal introduction, which is already written and covers most of the important information. Approx 1000 words at current.

Estimated Progress

Click the chart to view at better quality.



Friday, 6 March 2009

Verlet Integration and the Start of Hair Physics

I finally have a good start on the physics for the hair; the Verlet integration was fairly simple to integrate into the application. The first force I applied, and the only force that will be almost constant, was gravity, and the result was infinitely falling hair, which was amusing. There were some odd issues, though, to do with variable time stepping. I narrowed it down to the time in the equation for drag/air resistance:

m_vecForceAccumulator[iMember] -=
m_fDrag * (vPosition - m_vecOldPositions[iMember]) / APPROX_DELTA_TIME;

and found, as has been previously noted, that fixed time stepping prevents the problem occurring in the first place. There was another serious problem with the fixed time stepping too, due to the need to recalculate normals and re-Map the vertex buffer to update all the vertex information; in short, the frame rate dropped from approx 2000fps to 50fps. Locking APPROX_DELTA_TIME at 10 milliseconds (0.01 seconds), only running the physics when APPROX_DELTA_TIME has passed, and only doing the normals and vertex buffer work if the physics has been recalculated in the current loop seems to have solved some problems.
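The fixed-step gate described above can be sketched with an accumulator (a standard pattern; the function and structure here are my own illustration, not the project's code):

```cpp
#include <cassert>
#include <vector>

const float APPROX_DELTA_TIME = 0.01f;   // fixed 10 ms physics step

// Given a sequence of variable frame times, count how many fixed physics
// steps run in total. A frame that triggers at least one step is also the
// only kind of frame that rebuilds normals and re-Maps the vertex buffer.
int TotalPhysicsSteps(const std::vector<float>& frameTimes)
{
    float fAccumulator = 0.0f;
    int iSteps = 0;
    for (float fFrameTime : frameTimes)
    {
        fAccumulator += fFrameTime;
        while (fAccumulator >= APPROX_DELTA_TIME)
        {
            fAccumulator -= APPROX_DELTA_TIME;
            ++iSteps;    // one fixed 10 ms physics update
        }
    }
    return iSteps;
}
```

At 2000fps most frames accumulate no full step, so the expensive normal/vertex-buffer work is skipped on those frames entirely, which is where the recovered performance comes from.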

Besides adding the Verlet integration and decreasing the number of simulated hair strands (a.k.a. the keyhairs that will do the work for the rest of the hair), I also added the mass spring model, mainly because infinitely falling objects aren't much use to anyone. This is a simple system based on each strip's starting position: it suspends the hairs from the starting position using springs rather than letting them fall. Also important is that the elasticity of the springs decreases the further they are from the "root" of the hair.

No Gravity...
With Gravity