Sunday, August 23, 2015

Contravariance and luminance to add safety to uniforms

It’s been a few days I haven’t posted about luminance. I’m on holidays, thus I can’t be as involved in the development of the graphics framework as I’m used to on a daily basis. Although I’ve been producing less in the past few days, I’ve been actively thinking about something very important: uniform.

What people usually do

Uniforms are a way to pass data to shaders. I won’t talk about uniform blocks nor uniform buffers – I’ll make a dedicated post for that purpose. The common OpenGL uniform flow is the following:

  1. you ask OpenGL to retrieve the location of a GLSL uniform through the function glGetUniformLocation, or you can use an explicit location if you want to handle the semantics on your own ;
  2. you use that location, the identifier of your shader program and send the actual values with the proper glProgramUniform.

You typically don’t retrieve the location each time you need to send values to the GPU – you only retrieve them once, while initializing.

The first thing to make uniforms more elegant and safer is to provide a typeclass to provide a shared interface. Instead of using several functions for each type of uniform – glProgramUniform1i for Int32, glProgramUniform1f for Float and so on – we can just provide a function that will call the right OpenGL function for the type:

class Uniform a where
  sendUniform :: GLuint -> GLint -> a -> IO ()

instance Uniform Int32 where
  sendUniform = glProgramUniform1i

instance Uniform Float where
  sendUniform = glProgramUniform1f

-- and so on…

That’s the first step, and I think everyone should do that. However, that way of doing has several drawbacks:

  • it still relies on side-effects; that is, we can call sendUniform pretty much everywhere ;
  • imagine we have a shader program that requires several uniforms to be passed each time we draw something; what happens if we forget to call a sendUniform? If we haven’t sent the uniform yet, we might have an undefined behavior. If we already have, we will override all future draws with that value, which is very wrong… ;
  • with that way of representing uniforms, we have a very imperative interface; we can have a more composable and pure approach than that, hence enabling us to gain in power and flexibility.

What luminance used to do

In my luminance package, I used to represent uniforms as values.

newtype U a = U { runU :: a -> IO () }

We can then alter the Uniform typeclass to make it simpler:

class Uniform a where
  toU :: GLuint -> GLint -> U a

instance Uniform Int32 where
  toU prog l = U $ glProgramUniform1i prog l

instance Uniform Float where
  toU prog l = U $ glProgramUniform1f prog l

We also have a pure interface now. I used to provide another type, Uniformed, to be able to send uniforms without exposing IO, and an operator to accumulate uniforms settings, (@=):

newtype Uniformed a = Uniformed { runUniformed :: IO a } deriving (Applicative,Functor,Monad)

(@=) :: U a -> a -> Uniformed ()
U f @= a = Uniformed $ f a

Pretty simple.

The new uniform interface

The problem with that is that we still have the completion problem and the side-effects, because we just wrap them without adding anything special – Uniformed is isomorphic to IO. We have no way to create a type and ensure that all uniforms have been sent down to the GPU…

Contravariance to save us!

If you’re an advanced Haskell programmer, you might have noticed something very interesting about our U type. It’s contravariant in its argument. What’s cool about that is that we could then create new uniform types – new U – by contramapping over those types! That means we can enrich the scope of the hardcoded Uniform instances, because the single way we have to get a U is to use Uniform.toU. With contravariance, we can – in theory – extend those types to all types.

Sounds handy eh? First thing first, contravariant functor. A contravariant functor is a functor that flips the direction of the morphism:

class Contravariant f where
  contramap :: (a -> b) -> f b -> f a
  (>$) :: b -> f b -> f a

contramap is the contravariant version of fmap and (>$) is the contravariant version of (<$). If you’re not used to contravariance or if it’s the first time you see such a type signature, it might seem confusing or even magic. Well, that’s the mathematic magic in the place! But you’ll see just below that there’s no magic no trick in the implementation.

Because U is contravariant in its argument, we can define a Contravariant instance:

instance Contravariant U where
  contramap f u = U $ runU u . f

As you can see, nothing tricky here. We just apply the (a -> b) function on the input of the resulting U a so that we can pass it to u, and we just runU the whole thing.

A few friends of mine – not Haskeller though – told me things like “That’s just theory bullshit, no one needs to know what a contravariant thingy stuff is!”. Well, here’s an example:

newtype Color = Color {
    colorName :: String
  , colorValue :: (Float,Float,Float,Float)

Even though we have an instance of Uniform for (Float,Float,Float,Float), there will never be an instance of Uniform for Color, so we can’t have a U Color… Or can we?

uColor = contramap colorValue float4U

The type of uColor is… U Color! That works because contravariance enabled us to adapt the Color structure so that we end up on (Float,Float,Float,Float). The contravariance property is then a very great ally in such situations!

More contravariance

We can even dig in deeper! Something cool would be to do the same thing, but for several fields. Imagine a mouse:

data Mouse = Mouse {
    mouseX :: Float
  , mouseY :: Float

We’d like to find a cool way to have U Mouse, so that we can send the mouse cursor to shaders. We’d like to contramap over mouseX and mouseY. A bit like with Functor + Applicative:

getMouseX :: IO Float
getMouseY :: IO Float

getMouse :: IO Mouse
getMouse = Mouse <$> getMouseX <*> getMouseY

We could have the same thing for contravariance… And guess what. That exists, and that’s called divisible contravariant functors! A Divisible contravariant functor is the exact contravariant version of Applicative!

class (Contravariant f) => Divisible f where
  divide :: (a -> (b,c)) -> f b -> f c -> f a
  conquer :: f a

divide is the contravariant version of (<*>) and conquer is the contravariant version of pure. You know that pure’s type is a -> f a, which is isomorphic to (() -> a) -> f a. Take the contravariant version of (() -> a) -> f a, you end up with (a -> ()) -> f a. (a -> ()) is isomorphic to (), so we can simplify the whole thing to f a. Here you have conquer. Thank you to Edward Kmett for helping me understand that!

Let’s see how we can implement Divisible for U!

instance Divisible U where
  divide f p q = U $ \a -> do
    let (b,c) = f a
    runU p b
    runU q c
  conquer = U . const $ pure ()

And now let’s use it to get a U Mouse!

let uMouse = divide (\(Mouse mx my) -> (mx,my)) mouseXU mouseYU

And here we have uMouse :: U Mouse! As you can see, if you have several uniforms – for each fields of the type, you can divide your type and map all fields to the uniforms by applying several times divide.

The current implementation is almost the one shown here. There’s also a Decidable instance, but I won’t talk about that for now.

The cool thing about that is that I can lose the Uniformed monadic type and rely only on U. Thanks to the Divisible typeclass, we have completion, and we can’t override future uniforms then!

I hope you’ve learnt something cool and useful through this. Keep in mind that category abstractions are powerful and are useful in some contexts.

Keep hacking around, keep being curious. A Haskeller never stops learning! And that’s what so cool about Haskell! Keep the vibe, and see you another luminance post soon!

Sunday, August 16, 2015

Never forget your git stashes again!

It’s been a while I’m experiencing issues with git stash. If you don’t know that command yet, git stash is used to move all the changes living in your staging area into a special place: the stash.

The stash is a temporary area working like a stack. You can push changes onto it via git stash or git stash save; you can pop changes from top with git stash pop. You can also apply a very specific part of the stack with git stash apply <stash id>. Finally you can get the list of all the stashes with git stash list.

We often use the git stash command to stash changes in order to make the working directory clear again so that we can apply a patch, pull some changes, change branch, and so on. For those purposes, the stash is pretty great.

However, I often forget about my stashes – I know I’m not the only one. Sometimes, I stash something and go to cook something or just go out, and when I’m back again, I might have forgotten about what I had stashed, especially if it was a very small change.

My current prompt for my shell, zsh, is in two parts. I set the PS1 environnment variable to set the regular prompt, and the RPROMPT environnment variable to set a reversed prompt, starting from the right of the terminal. My reversed prompt just performs a git command to check whether we’re actually in a git project, and get the current branch. Simple, but nice.

I came up to the realization that I could use the exact same idea to know whether I have stashed changes so that I never forget them! Here’s a screenshot to explain that:

As you can see, my prompt now shows me how many stashed changes there are around!

The code

I share the code I wrote with you. Feel free to use it, modify it and share it as well!

# …

function gitPrompt() {
  # git current branch
  currentBranch=`git rev-parse --abbrev-ref HEAD 2> /dev/null`
  if (($? == 0))
    echo -n "%F{green}$currentBranch%f"

  # git stash
  stashNb=`git stash list 2> /dev/null | wc -l`
  if [ "$stashNb" != "0" ]
    echo -n " %F{blue}($stashNb)%f"

  echo ''

PS1="%F{red}%n%F{cyan}@%F{magenta}%M %F{cyan}%~ %F{yellow}%% %f"

# …

Have fun!

Tuesday, August 11, 2015

Luminance – what was that alignment stuff already?

Yesterday, I released a new article about how I implement vertex arrays in luminance. In that article, I told you that the memory was packed with alignment set to 1.

Well, I’ve changed my mind. Some people pointed out that the good thing to do for most GPU is to align on 32-bit. That is, 4 bytes. The alignment should be 4 bytes, then, not 1.

There might be an issue with that. If you store a structure with attributes which sizes are not a multiple of 4 bytes, it’s likely you need to add padding.

However, I just reviewed my code, and found this:

instance (GPU a,KnownNat n,Storable a) => Vertex (V n a) where
instance (Vertex a,Vertex b) => Vertex (a :. b) where

Those are the single instances for Vertex. That means you can only use V and (:.) to build up vertices. Look at the V instance. You’ll find a GPU typeclass constraint. Let’s look at its definition and instances:

class GPU a where
  glType :: Proxy a -> GLenum

instance GPU Float where
  glType _ = GL_FLOAT

instance GPU Int32 where
  glType _ = GL_INT

instance GPU Word32 where
  glType _ = GL_UNSIGNED_INT

Woah. How did I forget that?! Let me translate those information to you. That means we can only have 32-bit vertex component! So the memory inside vertex buffers will always be aligned on 4 bytes! No need to worry about padding then!

The first implication is the fact you won’t be able to use Word16, for instance. You’ll need to stick to the three types that have a GPU instance.

Note: that doesn’t prevent us from adding Double later on, because a Double is a 64-bit type, which is a multiple of 4 bytes!

That’s all I have for today. I’m working on something very exciting linked to render batching. I’ll talk about that when it’s cooked. ;)

Keep the vibe; keep building awesome things, and as always, thank you for reading me!

Monday, August 10, 2015

Luminance – Vertex Arrays

I’ve been up working on vertex arrays in my work-in-progress graphics framework, luminance, for several days. I’m a bit slow, because I’ve been through a very hard breakup and have been struggling to recover and focus. But here I am!

So, what’s new?

OpenGL allows programmers to send vertices to the GPU through what is called a vertex array. Vertex specification is performed through several functions, operating on several objects. You need, for instance, a vertex buffer object, an index buffer object and a vertex array object. The vertex buffer stores the vertices data.



For instance, you could imagine a teapot as a set of vertices. Those vertices have several attributes. We could use, for instance, a position, a normal and a bone index. The vertex buffer would be responsible of storing those positions, normals and bone indices. There’re two ways to store them:

  1. interleaved arrays ;
  2. deinterleaved arrays.

I’ll explain those later on. The index buffer stores integral numbers – mainly set to unsigned int – that index the vertices, so that we can connect them and create lines, triangles or more complex shapes.

Finally, the vertex array object is a state object that stores links to the two buffers and makes a connection between pointers in the buffer and attribute indices. Once everything is set up, we might only use the vertex array object. The exception is when we need to change the geometry of an object. We need to access the vertex buffer and the index buffer and upload new data. However, for now, that feature is disabled so that the buffers are not exposed to the programmer. If people think that feature should be implemented, I’ll create specialized code for that very purpose.

Interleaved and deinterleaved arrays

Interleaved arrays might be the most simple to picture, because you use such arrays every day when programming. Let’s imagine you have the following type in Haskell:

data Vertex = Vertex {
    vertPos    :: X
  , vertNor    :: Y
  , vertBoneID :: Z
  } deriving (Eq,Show)

Now, the teapot would have several vertices. Approximately, let’s state the teapot has five vertices – yeah, ugly teapot. We can represent such vertices in an interleaved array by simply recording them in a list or an array:



As you can see, the attributes are interleaved in memory, and the whole pattern is cycling. That’s the common way to represent an array of struct in a lot of languages, and it’s very natural for a machine to do things like that.

The deinterleaved version is:



As you can see, with deinterleaved arrays, all attributes are extracted and grouped. If you want to access the third vertex, you need to read the third X, the third Y and the third Z.

Both the methods have advantages and drawbacks. The cool thing about deinterleaved arrays is that we can copy huge regions of typed memory at once whilst we cannot with interleaved arrays. However, interleaved arrays store continuous structures, so writing and reading a structure back might be faster.

An important point to keep in mind: because we plan to pass those arrays to OpenGL, there’s no alignment restriction on the structure. That is, everything is packed, and we’ll have to pass extra information to OpenGL to tell it how to advance in memory to correctly build vertices back.

Generalized tuple

I think I haven’t told you yet. I have a cool type in luminance: the (:.) type. No, you don’t have to know how to pronounce that. I like to call it the gtuple type, because it’s a generalized tuple. You can encode (a,b), (a,b,c) and all kind of tuples with (:.). You can even encode single-typed infinite tuple! – a very special kind of list, indeed.

data a :. b = a :. b

infixr 6 :.

-- a :. b is isomorphic to (a,b)
-- a :. b :. c is isomorphic to (a,b,c)

newtype Fix f = Fix (f (Fix f)) -- from Control.Monad.Fix
type Inf a = Fix ((:.) a) -- infinite tuple!

Pretty simple, but way more powerful than the regular, monomorphic tuples. As you can see, (:.) is a right-associative. That means that a :. b :. c = a :. (b :. c).

That type will be heavily used in luminance, thus you should get your fet wet with it. There’s actually nothing much to know about it. It’s a Functor. I might add other features to it later on.

The Storable trick

The cool thing about (:.) is that we can provide a Storable instance for packed memory, as OpenGL requires it. Currently, the Storable instance is implemented like this:

instance (Storable a,Storable b) => Storable (a :. b) where
  sizeOf (a :. b) = sizeOf a + sizeOf b
  alignment _ = 1 -- packed data
  peek p = do
    a <- peek $ castPtr p
    b <- peek . castPtr $ p `plusPtr` sizeOf (undefined :: a)
    pure $ a :. b
  poke p (a :. b) = do
    poke (castPtr p) a
    poke (castPtr $ p `plusPtr` sizeOf (undefined :: a)) b

As you can see, the alignment is set to 1 to express the fact the memory is packed. The peek and poke functions use the size of the head of the tuple to advance the pointer so that we effectively write the whole tuple in packed memory.

Then, let’s rewrite our Vertex type in terms of (:.) to see how it’s going on:

type Vertex = X :. Y :. Z

If X, Y and Z are in Storable, we can directly poke one of our Vertex into a luminance buffer! That is, directly into the GPU buffer!

Keep in mind that the Storable instance implements packed-memory uploads and reads, and won’t work with special kinds of buffers, like shader storage ones, which require specific memory alignment. To cover them, I’ll create specific typeclasses instances. No worries.

Creating a vertex array

Creating a vertex array is done through the function createVertexArray. I might change the name of that object – it’s ugly, right? Maybe Shape, or something cooler!

createVertexArray :: (Foldable f,MonadIO m,MonadResource m,Storable v,Traversable t,Vertex v)
                  => t v
                  -> f Word32
                  -> m VertexArray

As you can see, the type signature is highly polymorphic. t and f represent foldable structures storing the vertices and the indices. And that’s all. Nothing else to feed the function with! As you can see, there’s a typeclass constraint on v, the inner vertex type, Vertex. That constraint ensures the vertex type is representable on the OpenGL side and has a known vertex format.

Disclaimer: the Traversable constraint might be relaxed to be Foldable very soon.

Once tested, I’ll move all that code from the unstable branch to the master branch so that you guys can test it. :)

About OpenGL…

I eventually came to the realization that I needed to inform you about the OpenGL prerequisites. Because I want the framework to be as modern and well-designed as possible, you’ll need… OpenGL 4.5. The latest version, indeed. You might also need an extension, ARB_bindless_texture. That would enable the framework to pass textures to shader in a very stateless way, which is our objective!

I’ll let you know what I decide about that. I don’t want to use an extension that is not implemented almost everywhere.

What’s next?

Well, tests! I need to be sure everything is correctly done on the GPU side, especially the vertex format specification. I’m pretty confident though.

Once the vertex arrays are tested, I’ll start defining a render interface as stateless as I can. As always, I’ll keep you informed!

Saturday, August 01, 2015

Luminance – framebuffers and textures

I’m happily suprised that so many Haskell people follow luminance! First thing first, let’s tell you about how it grows.

Well, pretty quickly! There’s – yet – no method to make actual renders, because I’m still working on how to implement some stuff (I’ll detail that below), but it’s going toward the right direction!


Something that is almost done is the framebuffer part. The main idea of framebuffers – in OpenGL – is supporting offscreen renders, so that we can render to several framebuffers and combine them in several fancy ways. Framebuffers are often bound textures, used to pass the rendered information around, especially to shaders, or to get the pixels through texture reads CPU-side.

The thing is… OpenGL’s framebuffers are tedious. You can have incomplete framebuffers if you don’t attach textures with the right format, or to the wrong attachment point. That’s why the framebuffer layer of luminance is there to solve that.

In luminance, a Framebuffer rw c d is a framebuffer with two formats. A color format, c, and a depth format, d. If c = (), then no color will be recorded. If d = (), then no depth will be recorded. That enables the use of color-only or depth-only renders, which are often optimized by GPU. It also includes a rw type variable, which has the same role as for Buffer. That is, you can have read-only, write-only or read-write framebuffers.

And of course, all those features – having a write-only depth-only framebuffer for instance – are set through… types! And that’s what is so cool about how things are handled in luminance. You just tell it what you want, and it’ll create the required state and manage it for you GPU-side.


The format types are used to know which textures to create and how to attach them internally. The textures are hidden from the interface so that you can’t mess with them. I still need to find a way to provide some kind of access to the information they hold, in order to use them in shaders for instance. I’d love to provide some kind of monoidal properties between framebuffers – to mimick gloss Monoid instance for its Picture type, basically.

You can create textures, of course, by using the createTexture w h mipmaps function. w is the width, h the height of the texture. mipmaps is the number of mipmaps you want for the texture.

You can then upload texels to the texture through several functions. The basic form is uploadWhole tex autolvl texels. It takes a texture tex and the texels to upload to the whole texture region. It’s your responsibility to ensure that you pass the correct number of texels. The texels are represented with a polymorphic type. You’re not bound to any kind of textures. You can pass a list of texels, a Vector of texels, or whatever you want, as long as it’s Foldable.

It’s also possible to fill the whole texture with a single value. In OpenGL slang, such an operation is often called clearing – clearing a buffer, clearing a texture, clearing the back buffer, and so on. You can do that with fillWhole.

There’re two over functions to work with subparts of textures, but it’s not interesting for the purpose of that blog entry.

Pixel format

The cool thing is the fact I’ve unified pixel formats. Textures and framebuffers share the same pixel format type (Format t c). Currently, they’re all phantom types, but I might unify them further and use DataKinds to promote them to the type-level. A format has two type variables, t and c.

t is the underlying type. Currently, it can be either Int32, Word32 or Float. I might add support for Double as well later on.

c is the channel type. There’re basically five channel types:

  • CR r, a red channel ;
  • CRG r g, red and green channels ;
  • CRGB r g b, red, green and blue channels ;
  • CRGBA r g b a, red, green, blue and alpha channels ;
  • CDepth d, a depth channel (special case of CR; for depths only).

The type variables r, g, b, a and d represent channel sizes. There’re currently three kind of channel sizes:

  • C8, for 8-bit ;
  • C16, for 16-bit ;
  • C32, for 32-bit.

Then, Format Float (CR C32) is a red channel, 32-bit float – the OpenGL equivalent is R32F. Format Word32 (CRGB C8 C8 C16) is a RGB channel with red and green 8-bit unsigned integer channels and the blue one is a 16-bit unsigned integer channel.

Of course, if a pixel format doesn’t exist on the OpenGL part, you won’t be able to use it. Typeclasses are there to enforce the fact pixel format can be represented on the OpenGL side.

Next steps

Currently, I’m working hard on how to represent vertex formats. That’s not a trivial task, because we can send vertices to OpenGL as interleaved – or not – arrays. I’m trying to design something elegant and safe, and I’ll keep you informed when I finally get something. I’ll need to find an interface for the actual render command, and I should be able to release something we can actually use!

By the way, some people already tried it (Git HEAD), and that’s amazing! I’ve created the unstable branch so that I can push unstable things, and keep the master branch as clean as possible.

Keep the vibe, and have fun hacking around!