Frenchdog’s Weblog

Entries categorized as ‘houdini’

ICE vs VOP, a performance comparison.

September 12, 2009 · 13 Comments

Since ICE was released last year, I have often read comparisons between this new Softimage tool and Houdini.
At first glance, ICE looks like Houdini VOPs: a network system to manipulate low-level data (attributes).
Most of the time, the claim is that ICE is faster and easier than VOPs.
It is hard for me to argue about whether ICE is easier. Very often, we find an application easier simply because we have spent more time in it.
In this article I will focus on the performance side. I will try to be as neutral as I can, so you won’t know which one is my favorite ;) .

But does ICE really look like VOPs?

ICE is a node-based system to manipulate data on points. Each ICE node takes some inputs, does some operations and then outputs the results.
The set of all those ICE nodes forms an ICEtree, a Softimage operator.
An ICE node uses a SIMD architecture and is multithreaded.
SIMD means single instruction, multiple data. So if you need to add a vector A (a single instruction) to all your point positions (multiple data),
it will execute much faster than processing the same addition in a loop, one point after the other.
Multithreaded means that for a given task (like the vector A addition), several threads will be used at the same time (concurrently).
Here is a concrete example from the Softimage SDK doc.
This way, for each node in an ICEtree, multithreading is used whenever it is possible. That’s the important point to understand: ICE runs several small multithreaded programs (ICE nodes) instead of one big multithreaded program (the ICEtree operator).
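
To make the "several small multithreaded programs" idea more concrete, here is a minimal Python sketch. It is not Softimage code: `run_node`, `add_vector` and `scale` are names I made up for illustration, and the chunking strategy is only an assumption about how such a node could split its work. Each "node" splits the point array into chunks, processes the chunks on a thread pool, and only then hands the result to the next node.

```python
from concurrent.futures import ThreadPoolExecutor

def run_node(op, points, workers=4):
    """Run one hypothetical 'node': process chunks of points on separate threads."""
    size = max(1, (len(points) + workers - 1) // workers)
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the chunk order, so points come back in their original order
        results = list(pool.map(lambda chunk: [op(p) for p in chunk], chunks))
    return [p for chunk in results for p in chunk]

def add_vector(a):
    """Build an operation that adds the vector a to a point."""
    return lambda p: (p[0] + a[0], p[1] + a[1], p[2] + a[2])

def scale(s):
    """Build an operation that multiplies a point by the scalar s."""
    return lambda p: (p[0] * s, p[1] * s, p[2] * s)

points = [(float(i), 0.0, 0.0) for i in range(10)]
# Two nodes evaluated one after the other, each one parallel internally.
moved = run_node(add_vector((0.0, 1.0, 0.0)), points)
scaled = run_node(scale(2.0), moved)
```

Note that CPython threads will not actually speed up this kind of pure-Python math; the sketch only shows the structure (one parallel pass per node), not the performance.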

VOPs are a little bit different. A VOP node is just a representation of a VEX function. VEX is a language designed for writing custom nodes (and Mantra shaders),
and it also supports SIMD and multithreading. A set of VOPs gives a VOP operator (something like an ICEtree operator). This VOP network will “just” build some VEX code (you can read this code by right-clicking on the VOP output node). This is a big difference compared to the ICE architecture. The first time a VOP network is called, Houdini pre-optimizes the compiled VEX code and culls any instructions which don’t contribute to the final data. It also converts “non-varying” variables to constants. This way the VEX code is very optimized and can be even faster than a custom C++ Houdini operator!
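
The culling step can be pictured with a small Python sketch. This is only an illustration of the general idea (dead code elimination), not Houdini's actual optimizer: the instruction format and the function name `cull_dead_code` are my own assumptions, and each variable is assumed to be assigned only once.

```python
def cull_dead_code(program, outputs):
    """Keep only the instructions that contribute to the requested outputs.

    program is a list of (target, op, args) tuples in execution order.
    """
    needed = set(outputs)
    kept = []
    # Walk backwards: an instruction is useful only if something needs its target.
    for target, op, args in reversed(program):
        if target in needed:
            kept.append((target, op, args))
            needed.update(args)
    kept.reverse()
    return kept

program = [
    ("n", "normalize", ("P",)),
    ("unused", "cross", ("P", "N")),  # never reaches the output, so it is culled
    ("d", "dot", ("n", "N")),
    ("P_out", "mul", ("P", "d")),
]
optimized = cull_dead_code(program, outputs=["P_out"])
```

Here the `cross` instruction disappears from `optimized` because nothing downstream reads `unused`.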

We can clearly see now that ICE is a “compiled programs by nodes” system, whereas VOPs are a “source code by nodes” system.
It looks to me like multithreading can be much better optimized with the ICE system, but to be sure, nothing beats a comparison test!

The comparison

It is not so obvious how to compare those two systems. Simple math operations are rather easy to compare, as we are able to build very similar graphs in the two
applications. But very often, ICE/VOPs need to get some data from other objects in the scene. Those geometry queries are done by factory nodes, and
it is hard to really know how they work under the hood. I could stick to comparing pure mathematical operations, but I think it is much more interesting
to compare scenes based on common production scenarios too. For the timing I will use the Houdini Performance Monitor and the ICE Performance Timer (set to Time all threads) in order to get only the ICEtree operator and VOP operator evaluation time.
Times are displayed below each ICE/VOP graph picture.

And now the empirical tests!

Test A : vectorOperations_simple.scn/hip

The positions of some particles are modified using dot and cross products between their normalized position and a normalized varying vector.
This is a pure math operation comparison.
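
For readers who want to see the kind of math involved, here is a minimal Python sketch of the building blocks. The `displace` function is hypothetical: the exact combination of dot and cross products used in the test scenes is not spelled out here, so treat it as one possible example rather than the actual graph.

```python
import math

def normalize(v):
    """Return v scaled to unit length (or a zero vector if v has no length)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v) if length > 0.0 else (0.0, 0.0, 0.0)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def displace(p, v):
    """Hypothetical per-point operation: offset p along cross(n, w), scaled by dot(n, w)."""
    n, w = normalize(p), normalize(v)
    c = cross(n, w)
    s = dot(n, w)
    return tuple(p[i] + s * c[i] for i in range(3))

moved = displace((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

In both ICE and VOPs, an operation like `displace` would be evaluated once per particle, which is why this test mostly measures raw vector throughput.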

vectorOperation_ogl

10000000 particles

vectorOperation_ICEtree

ICE : 1820 ms

vectorOperation_VOPs

VEX : 3270 ms

ICE is almost twice as fast at vector processing (1820 ms versus 3270 ms)!
But let’s see the other test scenes…

————————————————————————————————————————————————————————-

Test B : vectorOperations_basicTapper.scn/hip

In this test, we scale down the X and Z components of each point position along the Y axis of a box (200 x 200 x 200 points).
A Get Bounding Box function is called to remap the Y value of each point between zero and one.
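
The taper can be sketched in a few lines of Python. The function name `taper` and the choice of a linear falloff (full width at the bottom, zero width at the top) are my assumptions; the scenes may use a different profile.

```python
def taper(points):
    """Scale X and Z of each point by a factor driven by its Y remapped to [0, 1]."""
    if not points:
        return []
    # Bounding box along Y, equivalent to asking for the min and max of the set.
    ys = [p[1] for p in points]
    y_min, y_max = min(ys), max(ys)
    height = y_max - y_min
    result = []
    for x, y, z in points:
        t = (y - y_min) / height if height > 0.0 else 0.0
        s = 1.0 - t  # assumption: linear taper, full width at the bottom
        result.append((x * s, y, z * s))
    return result

tapered = taper([(2.0, 0.0, 2.0), (2.0, 5.0, 2.0), (2.0, 10.0, 2.0)])
```

The bounding box is computed once for the whole set, then the per-point work is a simple remap and multiply.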

vectorOperation_tapper_ogl

vectorOperation_tapper_ICEtree

ICE : 29.6 ms

vectorOperation_tapper_VOPs

VEX : 67.3 ms

The only difference is that ICE uses the “Get Minimum in Set” and “Get Maximum in Set” nodes to return a scalar value, while Houdini uses the “getbbox” function
to directly return the min and max corners of the box. In this test ICE is still about twice as fast.

————————————————————————————————————————————————————————-

Test C : pcloudQuery.scn/hip
This test compares the “Get Closest Points” ICE node to the set of VOPs “pcopen”, “pciterate” and “pcimport”.
It is very similar to the pcBulletHoles file from the Houdini Exchange.
A rather dense mesh (40000 pts) is displaced when some particles (400000 in the test) are very close to its surface.
The search radius is set to 0.1 (for a grid of 4.2 by 4.2 units), and the maximum number of points is set to 10.

pcloud_query_ICEtree

ICE : 2741 ms

pcloud_query_VOPs_a

VEX : 1009 ms

This time Houdini is more than twice as fast! Let’s try to understand why.

Checking the timing of each ICEtree node, it is clear that the bottleneck comes from the Get Closest Points node, as it takes 95% of the total processing time.
From this node, you get point locator data. I think it is a Softimage-only concept. This kind of object (programmatically speaking) represents a location on the surface of a geometry and computes the linear variation of each attribute along this surface. This is a very useful feature, but for accessing point clouds it looks like it is not so fast. I don’t think that the locator is trying to interpolate between points (as there is no surface in a point cloud), but the fact is that it gets the point data more slowly than Houdini does.

In Houdini, we use the pcopen function to get a bunch of points (a handle) in a radius around a position, and the pcimport function to get attributes from the points in the handle. Those specialized functions are really fast.
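
What such a query returns can be shown with a brute-force Python sketch. This is not how pcopen or Get Closest Points work internally (a real implementation presumably relies on a spatial acceleration structure); `closest_points` is a made-up name, and the sketch only mirrors the two parameters used in the test, the search radius and the maximum number of points.

```python
import math

def closest_points(points, center, radius, max_points):
    """Return the indices of up to max_points points within radius of center, nearest first."""
    hits = []
    for i, p in enumerate(points):
        d = math.dist(p, center)
        if d <= radius:
            hits.append((d, i))
    hits.sort()
    return [i for _, i in hits[:max_points]]

cloud = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.2, 0.0, 0.0), (0.01, 0.0, 0.0)]
found = closest_points(cloud, center=(0.0, 0.0, 0.0), radius=0.1, max_points=10)
nearest_two = closest_points(cloud, center=(0.0, 0.0, 0.0), radius=0.1, max_points=2)
```

Once you have the indices, reading an attribute for each of them is the equivalent of the pcimport step.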

If ICE is slower in this test, it is because of the Get Closest Points implementation, not because of the ICE graph architecture.

————————————————————————————————————————————————————————-

Test D : raycast_reflexionWmap.scn/hip

A camera ray is computed and set for each grid point. Then, to test some space transformation operations, I’m using the point reference frame (X and Z axes tangent to the surface at the point and Y axis parallel to the point normal) to compute a reflection ray and then find the intersection with the elephant. The test uses the “Raycast” ICE node and the “Intersect” VOP node.

reflection_wmap_OGL

reflection_wmap_ICEtree

ICE : 32247 ms using space transformation to get the reflection vector

specular_reflexion_VOPs

VEX : 975 ms using space transformation to get the reflection vector

————————————————————————————————————————————————————————-

The difference is huge, in favor of Houdini. This time, the bottleneck is in the Get > self.PointReferenceFrame operation. I was rather surprised because “PointReferenceFrame” is a factory attribute in Softimage (in Houdini I built it from scratch).

specular_reflexion_ICEAxisAngle

ICE : 402 ms to compute reflection using axis and angle

So I rebuilt the test with another reflection vector function, using the normal as the axis to rotate the camera ray by 180 degrees.
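
The axis and angle trick can be checked with a short Python sketch. It assumes the rotated vector is the one pointing from the surface back to the camera (the text does not say which orientation the graphs use) and compares the result with the classic mirror formula r = i - 2 (i·n) n. The function names are mine.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(v, axis, angle):
    """Rodrigues rotation of v around a unit-length axis, angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    k = cross(axis, v)
    d = dot(axis, v)
    return tuple(v[i] * c + k[i] * s + axis[i] * d * (1.0 - c) for i in range(3))

def reflect(incident, normal):
    """Classic mirror reflection of an incident direction about a unit normal."""
    d = dot(incident, normal)
    return tuple(incident[i] - 2.0 * d * normal[i] for i in range(3))

normal = (0.0, 1.0, 0.0)
incident = (1.0, -1.0, 0.0)              # ray travelling from the camera to the surface
to_camera = tuple(-c for c in incident)  # same ray, seen from the surface
by_rotation = rotate(to_camera, normal, math.pi)
by_formula = reflect(incident, normal)
```

Both approaches give the same direction (up to floating point error), but the rotation version needs no reference frame, only the normal.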

specular_reflexion_VOPsAxisAngle

VEX : 850 ms to compute reflection using axis and angle

This time ICE is twice as fast as Houdini!
On a side note, this setup could be handy to quickly fine-tune the position of a reflection
on the surface of an object.

————————————————————————————————————————————————————————-

Test E : pcloudQuery_flowAroundBunny.scn/hip

This is a particle simulation test. 200,000 particles are emitted per second (frame rate: 29.97).
Each particle looks for the closest point on the bunny surface. From this closest point, some vector operations are done to get the “flow around surface” behaviour.
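
As a rough idea of what those vector operations can look like, here is a Python sketch of one common approach: removing the part of the velocity that points into the surface, so the particle slides along it. This is an assumption on my part, not necessarily what the ICE and VOP graphs do, and `slide_along_surface` is a made-up name.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def slide_along_surface(velocity, normal):
    """Remove the velocity component going into the surface (normal must be unit length)."""
    into_surface = dot(velocity, normal)
    if into_surface >= 0.0:
        return velocity  # already moving away from the surface, leave it alone
    return tuple(velocity[i] - into_surface * normal[i] for i in range(3))

# A particle falling diagonally onto a surface whose closest-point normal is +Y.
new_velocity = slide_along_surface((1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
```

The normal used here would be the one read from the closest point on the bunny.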

This time I chose to split the Houdini setup in two parts in order to use the “Attribute Transfer” POP node. This node is often used in Houdini when you need to get some interpolated data on the object surface, and unfortunately I don’t think we can use a similar operation inside a VOP graph.
So in Houdini, I first get the attributes (P and N) of the closest bunny points with the “Attribute Transfer” node and then use a VOP to edit the particle velocity and get the flow effect.

bunny_flow_ogl

bunny_flow_ICEtree

ICE : 12 seconds (to go to frame 30).

bunny_flow_VOPs

Houdini : 59 seconds (to go to frame 30) !

————————————————————————————————————————————————————————-

In this scenario, ICE is much faster! In Houdini the bottleneck is precisely the “Attribute Transfer” POP node.
So let’s give it another try in Houdini using the pcopen, pciterate and pcimport functions. The graph takes much longer to set up than in ICE (where you only need one node…), and it is not exactly equivalent, as it doesn’t return interpolated attributes like the ICE locator object does on a geometry surface ;) .

bunny_flow_VOPs2

VEX : 9 seconds (to go to frame 30)

————————————————————————————————————————————————————————-

Again, for the specific case of point-cloud-only queries, Houdini is faster than ICE. I also tested replacing the Closest Location node with a Closest Points node in ICE, but it was twice as slow…

————————————————————————————————————————————————————————-

Conclusion :

The original idea for this comparison came from a discussion with Thiago Costa about ICE and Houdini speed. Thiago explained to me that we can think of ICE like a GPU engine. I think it is an interesting analogy. ICE is a specialized application for vector processing. Each time you need to do this kind of job and nothing else, it will be very fast.
VOPs are maybe less optimized for vector processing; however, for heavy particle queries, the Houdini point cloud functions are also very specialized and seem to win in this particular field (no pun intended ;) ).

The goal of this article was not to prove that one application is faster than the other, but to show that it is not so simple to compare both systems, as timing results depend on what you need to do and how you can do it.
Furthermore, Houdini and Softimage are both rather huge applications and so can’t be reduced to just their VOPs or ICE systems.

I hope you enjoyed this (rather long) flight.

Cheers !

Guillaume Laforge


Categories: dev · houdini · xsi

Houdini Renderset V1

January 11, 2009

A Renderset tools update :

I’m now using the Render menu to execute the Python scripts. Here is a quick demo.

I uploaded those scripts to the Houdini Exchange.

Copy the script and otls folders into one of your $HOME paths, for example. If you want to call those Python scripts from the Render menu, also copy the “MainMenuCommon” file into your $HOME path.
If you want to use them from the shelf, you can copy the code from “new_renderset.py” into the script tab of a new tool.
Do the same for “add_to_renderset.py”.

The otl file is a simple Python Object Operator. By default (when you create it) this object is just a subnet, so you can add objects inside it without using the HDA mechanism (“allow editing of contents”). I use this object to find all the renderset nodes, as their type is “renderset”. The only little problem is that I can’t create a custom icon for this HDA, as that turns it into a “real” HDA.

Categories: houdini

Houdini RenderSet tool

January 4, 2009

Thanks to the cold winter here in France, I spent a large part of this weekend in front of two screens. 

The first screen was in a theatre, to see Igor.
I must confess that I didn’t really want to go and see it, but my daughter found the right words, I guess. I enjoyed it, maybe more than WALL-E or Madagascar 2. Igor is a very good movie! I like those moments when you go to see a movie without reading any reviews or seeing any teasers and get a good surprise :) .

The second screen was my laptop.
I’m learning HOM (Houdini Object Model) in my spare time and I decided to translate my Houdini render pass system into a shelf tool. I changed the name to RenderSet, just because it is not exactly the same as the Softimage RenderPass :) . It is at an early stage at the moment. For now I’m using the Houdini shelf tool just to run some Python code. So, no HDA (Houdini Digital Asset) to install and no need to “allow editing of contents” of some nodes anymore. The tool workflow looks promising to me. Now I need to implement a serious user interface (I don’t like the pop-up system too much). I should also add some RenderSet presets (matte, shadow, occlusion, etc.), and some other little ideas too…

You can click on the image to see a little demo:

ui1

Categories: houdini

fin flow

July 19, 2008 · 4 Comments

When I was a child I wanted to be a fluid dynamics engineer.

Categories: houdini

Houdini Passe system V1

July 15, 2008 · 1 Comment

I found time to do some little tests with the digital asset from this post, so I think I can share it now.
In the object context, tab > digital asset > guil pass will give you the operator. To use it you must “allow editing of its contents”, otherwise the rendered passes will be empty.

WARNING: The digital asset was created with Houdini Apprentice edition. I am only sharing it to show a way to manage pass rendering in Houdini. If you put this digital asset in your production hip file, once saved, it will become a non-commercial file!!!

Categories: houdini

A Houdini passes system

June 29, 2008 · 2 Comments

Switching regularly between Houdini and XSI, I often miss the Softimage passes system. Houdini gives us several options to render the same animation with different “render settings” in one hip file, but nothing as fast and easy to manage as the XSI passes system.

As this little diagram illustrates (made with dial, a simple but very useful free tool), it looks like it is possible to build an HDA to mimic the Softimage system in Houdini (if we consider that an object node with an “object merge” SOP looks like a partition).

The HDA interface is there to allow the user to quickly set up those “object merge/partitions” with a material and some rendering options (only Phantom for the moment, to mimic the primary ray toggle we could add using an override on an XSI partition).

The trick is to limit the scope of the mantra ROP used by this HDA to only the “object merge/partitions” from this HDA. The way I found is to use a null named “out_passe” below the object merge and an unconnected null named “not_passe” used for render and display.
Then I use a pre-render script like this one:
“opset -r on /obj/`opname("../..")`/partition_0/out_passe”.
and a post-render script like this:
“opset -r on /obj/`opname("../..")`/partition_0/not_passe”.

There may be a simpler way to limit the scope of a ROP to only one object node, but I was too lazy to investigate further :-p.

Here is a little demo of an early version of this HDA.

For now each pass has its own renderer, like in XSI before V6. A global/local system for the render options would be possible, but it is so easy to link parameters and build your own rendering panel in Houdini that I don’t want to add such a feature.

To finish, here is a picture of a famous French point break. It has nothing to do with the pass system, but I just don’t want you to call the police!

Categories: houdini