Announcing pbjAS – An ActionScript 3.0 Pixel Bender Shader Library

One of the major new features in Flash Player 10 is Pixel Bender. As its name suggests, the primary purpose of Pixel Bender is to allow you to easily manipulate pixels inside a Flash application. A great demo of this is the FotoBooth application, which applies different filters to webcam input. While tweaking pixels is the primary purpose of Pixel Bender, it can also be used as a multi-threaded number crunching machine. You can pass it some numbers, have it perform some mathematical operations on those numbers, and then return the result. This opens up some very interesting opportunities to get outside of the normal single-threaded nature of ActionScript in Flash Player.

Pixel Bender number crunching machines can be used as filters for things on screen or used for just generic number crunching. To create one of these number crunching machines, known as a Shader, you would usually use the Pixel Bender Toolkit. The Pixel Bender Toolkit runs on Mac and Windows and allows you to write a Shader in the Pixel Bender shader language. Shaders can then be exported for use in Flash Player.

To run a Shader in a Flash application (whether it’s built with Flex or something else) the compiled Shader (a .pbj file) needs to be loaded and then run as either a ShaderFilter or a ShaderJob. A ShaderJob can run either synchronously or asynchronously. If run asynchronously then it’s non-blocking and the ActionScript code continues to run at the same time as the Shader. Here is an example of how to run a ShaderFilter:

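    // Embed the compiled Shader byte code in the SWF.
    [Embed(source="VerySimpleFilter.pbj", mimeType="application/octet-stream")]
    var TestShaderClass:Class;

    var testShader:Shader = new Shader(new TestShaderClass());

    // "i" is any DisplayObject, such as an image that is already on screen.
    var shaderFilter:ShaderFilter = new ShaderFilter(testShader);
    i.filters = [shaderFilter];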

Here is how you run a ShaderJob:

    [Embed(source="VerySimpleFilter.pbj", mimeType="application/octet-stream")]
    var TestShaderClass:Class;

    var testShader:Shader = new Shader(new TestShaderClass());

    // The job writes its output into this ByteArray.
    var result:ByteArray = new ByteArray();

    // width and height describe the size of the data set to process.
    var shaderJob:ShaderJob = new ShaderJob(testShader, result, width, height);
    shaderJob.start();
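
Both examples above embed the .pbj file at compile time. The byte code can also be fetched at runtime. The following is a minimal sketch, assuming the compiled Shader is served next to the SWF; the helper name `loadShader`, its callback, and the file location are hypothetical.

```actionscript
import flash.display.Shader;
import flash.events.Event;
import flash.net.URLLoader;
import flash.net.URLLoaderDataFormat;
import flash.net.URLRequest;
import flash.utils.ByteArray;

// Hypothetical helper: loads a compiled .pbj file and hands a Shader to onReady.
function loadShader(url:String, onReady:Function):void {
    var loader:URLLoader = new URLLoader();
    // The byte code must arrive as raw bytes, not as text.
    loader.dataFormat = URLLoaderDataFormat.BINARY;
    loader.addEventListener(Event.COMPLETE, function(event:Event):void {
        onReady(new Shader(loader.data as ByteArray));
    });
    loader.load(new URLRequest(url));
}

// Assumed location of the compiled Shader.
loadShader("VerySimpleFilter.pbj", function(shader:Shader):void {
    trace("Shader loaded");
});
```

Error handling (for example an IOErrorEvent listener) is left out to keep the sketch short.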

This is very cool stuff and there are a ton of amazing possible uses of Pixel Bender. But I wasn’t happy with having to use the Pixel Bender Toolkit to create Shaders. Compiled Pixel Bender Shaders are just byte code that tells the Pixel Bender virtual machine what to do. So why can’t they be created at runtime in ActionScript? Well, now they can!

The pbjAS library is an AS3 library for creating Pixel Bender Shaders in ActionScript. When I began researching it I discovered that Tinic Uro (a Flash Player engineer) had created C++ applications to assemble and disassemble Pixel Bender byte code. This would have been a great starting place but I don’t know C or C++ and at the time knew nothing about byte code. Luckily I found that Haxe already had a Pixel Bender Shader library and since the Haxe language is similar to ActionScript this was a great starting place for pbjAS.

I began writing pbjAS by porting the Haxe code to ActionScript. The languages are pretty similar but I needed to rework a few things. For instance Haxe has Enums while AS3 does not; Haxe can do a switch statement on a type while AS3 can’t. As I began porting the code I also started writing unit tests. There are two major parts of the library: the assembler, which creates Pixel Bender Shaders, and the disassembler, which converts Pixel Bender Shaders into their AS3 representations. Both parts are now working as confirmed by a simple unit test that takes a prebuilt Shader, disassembles it, and then reassembles it into the byte code that matches the prebuilt Shader.
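
To illustrate the kind of rework involved, here is a sketch of how a Haxe-style enum and a switch on type might be approximated in AS3. The class name `ChannelName` and the `describe` method are hypothetical and are not taken from the pbjAS source.

```actionscript
package {
    // Hypothetical stand-in for a Haxe enum: a closed set of constant instances.
    public final class ChannelName {
        public static const R:ChannelName = new ChannelName("R");
        public static const G:ChannelName = new ChannelName("G");
        public static const B:ChannelName = new ChannelName("B");
        public static const A:ChannelName = new ChannelName("A");

        private var _label:String;

        public function ChannelName(label:String) {
            _label = label;
        }

        public function toString():String {
            return _label;
        }

        // Haxe can switch on a value's type; in AS3 a chain of "is" checks
        // does the same job.
        public static function describe(value:Object):String {
            if (value is ChannelName) {
                return "channel " + value;
            }
            if (value is Number) {
                return "number " + value;
            }
            return "unknown";
        }
    }
}
```

With this class, `ChannelName.describe(ChannelName.G)` returns "channel G" and `ChannelName.describe(3)` returns "number 3". Nothing stops other code from calling the public constructor, so this is a convention rather than a true enum.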

Assembling a Shader with pbjAS requires creating a PBJ object, which contains information about the Shader parameters and operations. The operations are defined in an assembly-like manner. This may seem strange for those familiar with the Pixel Bender Toolkit because it uses a higher level language. The assembly-like language is very similar to the way the byte code is actually organized. While the Pixel Bender assembly language is pretty straightforward, it is not the ideal language for building number crunching machines. So phase two of the pbjAS project is to wrap the assembly language with a higher level language – perhaps MathML.

The steps to create a Pixel Bender Shader with pbjAS are:

1) Get the pbjAS library and include it in a project’s library path

2) Create a PBJ object and set some metadata about the Shader:

      var myPBJ:PBJ = new PBJ();
      myPBJ.version = 1;
      myPBJ.name = "SingleMulFilter";

3) Add input and output parameters (of type PBJParam) to the PBJ:

      myPBJ.parameters = [
        new PBJParam("num1", new Parameter(PBJType.TFloat, false, new RFloat(0, [PBJChannel.R]))),
        new PBJParam("num2", new Parameter(PBJType.TFloat, false, new RFloat(1, [PBJChannel.R]))),
        new PBJParam("product", new Parameter(PBJType.TFloat, true, new RFloat(2, [PBJChannel.R])))];

Each PBJParam has a name, a Parameter, and optional metadata. In this example the first parameter’s name is num1 and its Parameter is of type TFloat. There are four types of floats: TFloat (a single float), TFloat2 (two floats), TFloat3 (three floats), and TFloat4 (four floats). Each parameter also needs to specify whether it is an output parameter. In the example above the first two parameters are inputs while the last one is an output.

The third argument in Parameter’s constructor is the register for the parameter. A register is basically an identifier for the parameter: it specifies an index and information about how to access the parameter. The first parameter has a register index of zero. As with parameter types, there are four register types: RFloat, RFloat2, RFloat3, and RFloat4. Every register has four possible channels, which correspond to the four floats typically used in pixel manipulation: Red, Green, Blue, and Alpha. Every register other than RFloat4 (which uses all four channels) must specify which channel(s) to use. In the example above each register uses only one channel, so each is an RFloat, and the channel it uses is the Red channel, which is declared as “PBJChannel.R”.

4) Next add the operations to perform on the input:

      myPBJ.code = [
        new OpMul(new RFloat(0, [PBJChannel.R]), new RFloat(1, [PBJChannel.R])),
        new OpMov(new RFloat(2, [PBJChannel.R]), new RFloat(0, [PBJChannel.R]))];

There are numerous operations that can be performed on the specified data. The full list of operations is in the pbjas.ops package and in the Pixel Bender Reference (currently only found in the Pixel Bender Toolkit). In this example the first operation multiplies register 0’s Red channel by register 1’s Red channel. The result always goes into the first specified register, in this case register 0. The next operation moves the value in register 0’s Red channel to register 2’s Red channel, which is the output parameter.
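
The effect of these two operations can be modelled in plain AS3. This is only an illustration of the register semantics described above, not Pixel Bender code: each slot of the Vector stands in for the Red channel of one register, and the input values 6 and 7 are made up.

```actionscript
// Plain AS3 model of the two operations; no Pixel Bender involved.
// Slot 0 is num1, slot 1 is num2, slot 2 is the product (output).
var registers:Vector.<Number> = new <Number>[6, 7, 0];

// OpMul(r0, r1): the first operand is also the destination.
registers[0] = registers[0] * registers[1];

// OpMov(r2, r0): copy the product into the output register.
registers[2] = registers[0];

trace(registers[2]); // 42
```

Note that the multiply overwrites register 0, so the original value of num1 is no longer available afterwards.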

5) The byte code for the Shader now needs to be created:

      var assembledPBJByteArray:ByteArray = PBJAssembler.assemble(myPBJ);

6) Then a new Shader is created from the byte code:

      var testShader:Shader = new Shader(assembledPBJByteArray);

7) Any input parameters can now be specified on the Shader:

      testShader.data.num1.value = [Math.round(Math.random() * 1000)];
      testShader.data.num2.value = [Math.round(Math.random() * 1000)];

Notice that the values are specified by the names given to the input parameters. Also the values must be set in an Array because depending on the parameter type there can be between one and four values for each input parameter.
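
When it is unclear which names a Shader exposes, its data property can be enumerated. The sketch below uses only the Flash Player API; the helper name `describeShader` is hypothetical.

```actionscript
import flash.display.Shader;
import flash.display.ShaderInput;
import flash.display.ShaderParameter;

// Hypothetical helper: lists what a Shader exposes through its data property.
function describeShader(shader:Shader):void {
    for (var key:String in shader.data) {
        var entry:Object = shader.data[key];
        if (entry is ShaderParameter) {
            // Parameters carry a type string and an Array of values.
            var parameter:ShaderParameter = entry as ShaderParameter;
            trace("parameter", key, "type:", parameter.type, "value:", parameter.value);
        } else if (entry is ShaderInput) {
            // Inputs (Textures) report how many channels they expect.
            trace("input", key, "channels:", (entry as ShaderInput).channels);
        } else {
            // Anything else is Shader metadata, such as its name.
            trace("metadata", key, "=", entry);
        }
    }
}
```

Calling `describeShader(testShader)` before setting the values is a quick way to check that the names given in the PBJParam list made it into the assembled Shader.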

8) Create a Vector to hold the result:

      var result:Vector.<Number> = new Vector.<Number>();

Shaders can have results as either Vector, ByteArray, or BitmapData. Shaders can also have input parameters as type Texture, which can be passed to the Shader as any of those three types.

9) Create a ShaderJob, which will allow the Shader to be run:

      var shaderJob:ShaderJob = new ShaderJob(testShader, result, 1, 1);

The third parameter is the width of the input and the fourth is the height. Together they determine how many times the Shader runs; in this case it runs just once. If a Vector of length twenty were used as an input to the Shader then the width multiplied by the height must equal twenty. Pixel Bender will take advantage of multiple CPUs if they exist, but for that to happen the height of the ShaderJob must be more than one. So it’s best to give a ShaderJob a height of at least two if it will be run on a data set with more than one item. The length of the output always equals the length of the input. However, Textures in Pixel Bender can have from one to four channels. If a Texture with four channels is backed by a Vector then the number of items in the Vector must be a multiple of four, and the width multiplied by the height would be one fourth the length of the Vector.
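
The sizing rules above can be sketched as a small helper. This is an illustration only: the name `jobSize` is hypothetical, and it ignores any upper limit on width or height.

```actionscript
import flash.geom.Point;

// Hypothetical helper: picks ShaderJob dimensions for a Vector input.
// Returns a Point whose x is the width and y is the height, or null when
// the Vector length is not a positive multiple of the channel count.
function jobSize(vectorLength:int, channels:int):Point {
    if (vectorLength <= 0 || vectorLength % channels != 0) {
        return null;
    }
    var items:int = vectorLength / channels;
    // Use the smallest height of at least two that divides the item count,
    // so that width * height matches the input exactly.
    for (var height:int = 2; height * height <= items; height++) {
        if (items % height == 0) {
            return new Point(items / height, height);
        }
    }
    // No divisor was found, so the item count is 1 or prime: use one column.
    return new Point(1, items);
}

trace(jobSize(20, 1)); // (x=10, y=2)
trace(jobSize(80, 4)); // (x=10, y=2)
```

A Vector of twenty single-channel items and a Vector of eighty values read as a four-channel Texture both describe twenty items, so both end up as a 10 by 2 job.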

10) Start the Shader in synchronous mode:

      shaderJob.start(true);

The optional parameter to the start method on the shaderJob tells it whether to run synchronously or asynchronously. In this case it’s set to true meaning it should run synchronously. If it’s run asynchronously then an event listener needs to be registered on the shaderJob’s complete event. Asynchronous ShaderJobs will not block the UI.
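
For comparison, here is a minimal sketch of the asynchronous variant. It assumes a Shader with a single-float output like the one built above; the helper name `runAsynchronously` and its callback are hypothetical.

```actionscript
import flash.display.Shader;
import flash.display.ShaderJob;
import flash.events.ShaderEvent;

// Hypothetical helper: runs a one-item Shader without blocking and reports the result.
function runAsynchronously(shader:Shader, onResult:Function):void {
    var result:Vector.<Number> = new Vector.<Number>();
    var job:ShaderJob = new ShaderJob(shader, result, 1, 1);
    job.addEventListener(ShaderEvent.COMPLETE, function(event:ShaderEvent):void {
        // event.vector holds the output for a job whose target is a Vector.
        onResult(event.vector[0]);
    });
    // With no argument, start() returns immediately and the job runs in the background.
    job.start();
}
```

It could be used as `runAsynchronously(testShader, function(product:Number):void { trace(product); });`.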

You can see the full source code for this example in pbjAS’s TestShaderJob unit test.

Demos

The first demo creates a simple multiply Shader that is then applied to an image. The Shader is run whenever the slider value changes.

(view source)

This second demo compares calculating a bunch of square roots in AS3 with doing the same in Pixel Bender. Pixel Bender does well but most of its time is actually still being spent in AS3 code execution moving data around. Since it’s a really simple Pixel Bender filter this isn’t a great example of just how fast Pixel Bender is but it still beats out AS3 for very large data sets. Also in this demo I’m using a small library I started playing with that will slice up the input into multiple ShaderJobs. The reason I do this is that the maximum theoretical input size for a Pixel Bender Shader is 16,777,216 items. However due to a bug in Flash Player I consistently get crashes on data sets larger than about 2,000,000 items. So with very large data sets it’s nice to have some automatic slicing. Also the maximum height or width of a Shader is 8,192. So each Shader’s height and width need to be calculated to avoid hitting that limit. This stuff is in the MathPBJ project. Here is the sqrt demo:

(view source)
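
The slicing and sizing described above can be sketched as follows. This is not the MathPBJ code; the helper names are hypothetical and the two constants simply restate the limits mentioned in the text.

```actionscript
import flash.geom.Point;

const MAX_ITEMS_PER_JOB:int = 2000000; // slice size that avoided the crashes
const MAX_DIMENSION:int = 8192;        // largest allowed width or height

// Hypothetical helper: returns one Point per slice, with x as the start index
// (inclusive) and y as the end index (exclusive).
function sliceRanges(totalItems:int):Vector.<Point> {
    var ranges:Vector.<Point> = new Vector.<Point>();
    for (var start:int = 0; start < totalItems; start += MAX_ITEMS_PER_JOB) {
        ranges.push(new Point(start, Math.min(start + MAX_ITEMS_PER_JOB, totalItems)));
    }
    return ranges;
}

// Hypothetical helper: finds a width (x) and height (y) within the limit whose
// product equals the item count, or returns null when no such pair exists.
function sliceDimensions(items:int):Point {
    // Any smaller height would force the width above the limit.
    var minHeight:int = Math.max(1, Math.ceil(items / MAX_DIMENSION));
    for (var height:int = minHeight; height <= MAX_DIMENSION; height++) {
        if (items % height == 0) {
            return new Point(items / height, height);
        }
    }
    return null;
}

trace(sliceRanges(5000000).length); // 3
trace(sliceDimensions(2000000));    // (x=8000, y=250)
```

When `sliceDimensions` returns null (for example for a large prime item count), one option is to pad the slice with unused items until a valid pair exists and then discard the padded results.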

The next demo shows that a more complex Shader can crunch numbers while not locking the UI:

(view source)

Finally this last demo isn’t very exciting. It’s just the unit tests for the pbjAS library. But if you want to better understand how to use the library the best place to learn is by looking at the unit tests.

(view source)

Resources

Now that you know what pbjAS is and how to use it here are some other resources you might need to get started:

Future Plans

This is really just a 0.1 release of the pbjAS library. So there is still more to do. One of the major things I’d like to do is to wrap pbjAS with a higher level language, perhaps MathML. That would make using Pixel Bender for number crunching transparent and easy. Before that though I need to get 100% unit test coverage on the existing library. I’d love any help on these items. All the code is in GitHub awaiting your contributions! So please let me know what you think and feel free to contribute!

  • jdk

    AS3 is faster than Pixel Bender for every size of data set on my system. Presumably any system with a supported video card will be faster in Pixel Bender or is there some other qualification?

    Thanks for sharing this library; the Mac/Win software requirement had kept me from trying this, until now.

  • http://www.paulschoneveld.com/ Paul Schoneveld

    This looks awesome! I’ve recently been doing some experimental projects and I was just looking for some new ways to get more power out of flash. This might just be the thing I need!

    I’m going to see what I can do with pbjAS and I’ll let you know if I get some good results.

  • http://www.jamesward.com James Ward

    Hi jdk,

    Today Pixel Bender runs solely on the CPU not the GPU. Do you have a dual-core / dual-cpu machine?

    -James

  • rumori

    Great work, guess we’re on the same wavelength with this.
    I’m also working on a node based IDE.

    http://rumori.sourcebinder.org/?page_id=7

  • Gustavs

    What was your motivation to port the library manually? The haXe implementation can generate AS3 code so it should be possible to get about the same thing quickly.

    Nonetheless, this is a good opportunity for AS-only users to use the library too, I suppose.

  • http://www.jamesward.com James Ward

    Hi Gustavs,

    I debated that but decided it would be easier and provide me more flexibility to just port the code. I had also considered using Alchemy on Tinic’s libraries but decided not to do that for the same reason.

    -James

  • http://www.mxml.it Giorgio Natili

    Hi James,

    As we discussed on the phone, it’s a good solution, and it’s the same one I worked out with Franco Ponticelli (who is closely involved with HaXe) and that we had planned to put in place…
    The great part here is the HaXe library, and everybody owes a lot of thanks to the HaXe community guys who have played with bytes for us…

    By the way, your library helped me a lot to complete the solution Franco and I have planned and coded. The main differences between ours and yours are the use of Enumerations (simulated in AS3 but native in HaXe) and some bugs we are fixing…

    :)

    If you think it could help I can show you a preview. In the end our code will be Open Source too…

  • http://leichtgewicht.at Martin Heidegger

    Very nice, except that the “Compare Sqrt”-demo crashed my browser… gotta digg it

  • http://www.jamesward.com James Ward

    Hi Martin,

    I slice the Sqrt job into blocks of 2,000,000 items. Maybe that is too large for your machine. How much RAM do you have?

    -James

  • Lars

    Very interesting article, thank you!
    We are planning to use PixelBender in our financial Flex application to introduce some more math-related stuff.
    I was wondering whether there is a way to further speed up calculations by running them in parallel?
    For example, we want to calculate 500 option prices using the Black-Scholes formula.
    http://en.wikipedia.org/wiki/Black-Scholes

    These 500 options change basically every second during market hours, so using several shaders in parallel would be great. I have searched the Net for such an approach, but can’t find any examples.

    Any ideas on how to go about this?

    Cheers,
    Lars

    • Hernan

      Lars, we successfully migrated Black-Scholes formulas from AS3 to PixelBender in our financial Flex application and it’s working great. Please feel free to contact me if you need more details.

  • Andrew Mullins

    Lovely (except for the crash).

    “Calculate Sqrts” crash :
    ff 3.0.5
    fp 10.0.22.87

    Intel Q6600 w/ 8GB RAM on Win Vista Ultimate

    Tested without fault on Macbook w/ 4GB RAM using ff 3.0.8.

  • http://www.jamesward.com James Ward

    Hi Lars,

    You can kick off multiple asynchronous ShaderJobs at the same time.

    -James

  • http://www.bmo-design.de BMo

    very cool, thx, great work.

  • Carlos

    Hello, James,

    Pretty interesting technology! Hey, I just LOVED the wallpaper in the demo (here: http://www.adobe.com/products/flex/media/flexapp ). Is this a personal picture or something from the web? Any way I can get a hold of it? Thanks!

  • http://www.jamesward.com James Ward

    Hi Carlos,

    The picture is the cover for First Steps in Flex and was taken by a friend of mine while we were in the Dolomites in Italy.

    -James

  • Carlos

    Awesome! Those are superb pictures! Thank you, Sir!

  • http://www.jamesward.com James Ward

    If you upgrade to the latest Flash Player (10,0,32,18 ) then the crashes should go away since FP-1845 has been fixed.

  • http://www.hibernum.com lvdeluxe

    Hi James,

    I’m trying to use your library for matrix calculations but I can’t get any result for now (I’ve been playing with it for 2 days…)

    Basically, I’d like to multiply 2 matrices, but as the pbjAS output cannot be a TFloat4x4, I cannot get the result directly.

    So I thought using a TFloat4x4 as src1 and a TFloat4x4 as src2, then manipulating their values and putting them in a Float4, would do the job but, obviously, it doesn’t…

    Here are my parameters :

    new PBJParam("_OutCoord", new Parameter(PBJType.TFloat2, false, new RFloat(0, [PBJChannel.R,PBJChannel.G]))),
    new PBJParam("src1", new Parameter(PBJType.TFloat4x4, false, new RFloat(1))),
    new PBJParam("src2", new Parameter(PBJType.TFloat4x4, false, new RFloat(5))),
    new PBJParam("output", new Parameter(PBJType.TFloat4, true, new RFloat(9, [PBJChannel.R,PBJChannel.G,PBJChannel.B,PBJChannel.A])))

    src1 and src2 width and height are 1 (with 16 values for each)
    output has a width of 4 because i want to make it a matrix after the pbjas calculations (TFloat4 * 4 = 16 values for my 4X4 matrix)

    But every opcode i tried with these values is crashing the player, for ex:

    new OpSampleLinear(new RFloat(9,[PBJChannel.R,PBJChannel.G,PBJChannel.B,PBJChannel.A]),new RFloat(0,[PBJChannel.R, PBJChannel.G]), 1)

    Here I was expecting to get the values of the first row of src1 in my output but the player crashes without a warning….there is something i dont understand here but no idea what it can be

    Could you please give me a tip on how to use 4X4 matrices with your lib ?

    Thanks in advance and congrats on this awesome tool. I had some incredible performance improvements with it, and I’ll let you know when I finish a project that uses pbjAS.

  • http://www.jamesward.com James Ward

    Hi lvdeluxe,

    I haven’t tested the TFloat4x4 at all. So it’s possible that it’s broken. I’ll try to create a unit test for it first. Once we can verify that TFloat4x4 works at all then we can see how to make it do the matrix multiply.

    Sorry this stuff is not very well tested. I didn’t think anyone was using it – but I’m glad to see that you are!

    Feel free to email me so we can correspond more easily: jaward at adobe dot com

    -James

  • http://tweetmasher.com eco_bach

    James
    I’m guessing the square root demo might contain a bug. With the latest debug version 10,1,53,7 of fp, selecting the PB option gives inconsistent results, and sometimes hangs the browser. Selecting the AS3 option always works.

  • http://www.jamesward.com James Ward

    @eco_bach I’m not seeing any problems with 10,1,53,20 on Linux. Can you test with the latest FP 10.1 RC?

  • Manuel

    Hi, I have a problem: how can I build an item search against a database in Flex 4?
    I can display the fields of a database, but how do I search for a specific piece of data from a database in a data grid?

  • http://con.cept.me Chris

    Hmm interesting but AS 3 seems 400% faster on every value :p

    • http://www.jamesward.com James Ward

      Which test? And which version of Flash Player?

      • http://masputih.com Anggie Bratadinata

        I guess Chris was talking about the sqrt test. AS3 is 3 – 4 times faster on my Dell laptop too. AS3 finishes in less than 1 ms on 20 million numbers while pixel bender takes more than 3 secs.

        FP version:10.1.82.76

        In case you’re interested, here’s my machine’s spec:
        Win 7 32-bit. 4GB RAM. Core2 Duo. ATI Radeon Mobility 3450HD.

        Cool stuff, though. I’m going to use your library in my current project. Thanks for sharing!

        • http://www.jamesward.com James Ward

          Interesting. I guess AS3 code execution has gotten a lot faster in Flash Player 10.1! That is great!

          • Antti Hahl

            I’m not sure about that ;) Pixel Bender is here about 40% faster. Time for 20 million was on average 2000ms vs. 2800ms for AS3.

            This was with the latest Flash Player 10.1, 64-bit Windows 7 and Intel Q6600.

            Also tested the 64-bit preview of Flash but there was not much difference in results.

  • Clark

    10.1.82.76

    Intel i7.

    The pixel bender square roots are 4-5 times faster than AS3 for me.

  • Janther

    Hi,
    does this library only work with numerical operations or could i perform some other stuff like a multithreaded object instantiation?

    • http://www.jamesward.com James Ward

      It is just for mathematical operations.

  • Brian

    nice stuff.. can that shader be used in conjunction with a zoom function as well?

    • http://www.jamesward.com James Ward

      I think so. Which zoom function are you referring to?

  • http://riaoo.com Y.Boy

    Hi, Ward!
    I found pbjAS through your blog. It’s a good thing for sure. But do you have an idea for how to compile a .pbk into a .pbj using ActionScript 3, rather than the Pixel Bender Toolkit or pbutil.exe?

    Thanks and hope you reply!
    My email: y_boy126com

  • http://twitter.com/internetsurfing James Jackson

    Why not use Haxe and the HaXe Shader Language? It allows you to generate Pixel Bender byte code using strict typing in the same language you code your application in… check it out: http://haxe.org/manual/hxsl

  • Kyang

    Hi,
    Could you give me a hand translating the code below into pbjAS:

    {
        input image4 src;
        parameter float x_shift;
        parameter float y_shift;
        output pixel4 dst;

        void evaluatePixel()
        {
            float2 out_new = float2(outCoord().x+x_shift, outCoord().y+y_shift);
            dst = sampleNearest(src, out_new);
        }
    }

    Thanks a million.


