Thursday, September 14, 2017

State of luminance

I’ve been a bit surprised for a few days now. Some rustaceans posted two links about my spectra and luminance crates on reddit – links here and here. I didn’t really expect that: the code is public, on Github, but I don’t communicate about them that much.

However, I saw interested people, and I think it’s the right time to write a blog post about some design choices. I’ll start off with luminance and I’ll speak about spectra in another blog post because I truly think spectra starts to become very interesting (I have my own shading language bundled up within, which is basically GLSL on steroids).

luminance and how it copes with fast and as-stateless-as-possible graphics

Origins

luminance is a library that I wrote, historically, in Haskell. You can find the package here if you’re interested – careful though, it’s dying and the Rust version has completely drifted path with it. Nevertheless, when I ported the library to Rust, I imported the same “way of programming” that I had·have in Haskell – besides the allocation scheme; remember, I’m a demoscener, I do care a lot about performance, CPU usage, cache friendliness and runtime size. So the Rust luminance crate was made to be hybrid: it has the cool functional side that I imported from my Haskell codebase, and the runtime performance I wanted that I had when I wrote my two first 64k in C++11. I had to remove and work around some features that only Haskell could provide, such as higher-kinded types, type families, functional dependencies, GADTs and a few other things such as existential quantification (trait objects saved me here, even though I don’t use them that much in luminance now).

I have to admit, I dithered a lot about the scope of luminance — both in Haskell and Rust. At first, I thought that it’d be great to have a “base” crate, hosting common and abstracted code, and “backend” crates, implementing the abstract interface. That would enable me to have several backends – OpenGL, Vulkan, Metal, a software implementation, something for Amiigaaaaaaa, etc. Though, time has passed, and now, I think it’s:

Overkill.
A waste of framework.

The idea is that if you need to be very low-level on the graphics stack of your application, you’re likely to know what you are doing. And then, your needs will be very precise and well-defined. You might want very specific piece of code to be available, related to a very specific technology. That’s the reason why abstracting over very low-level code is not a good path to me: you need to expose as most as posssible the low-level interface. That’s the goal of luminance: exposing OpenGL’s interface in a stateless, bindless and typesafe way, with no or as minimal as possible runtime overhead.

Today

Today, luminance is almost stable – it still receives massive architecture redesign from time to time, but it’ll hit the 1.0.0 release soon. As discussed with kvark lately, luminance is not about the same scope as gfx’s one. The goals of luminance are:

To be a typesafe, stateless and bindless OpenGL framework.
To provide a friendly experience and expose as much as possible all of the OpenGL features.
To be very lightweight (the target is to be able to use it without std nor core).

To achieve that, luminance is written with several aspects in mind:

Allocation must be explicitely stated by the user: we must avoid as much as possible to allocate things in luminance since it might become both a bottleneck and an issue to the lightweight aspect.
Performance is a first priority; safety comes second. If you have a feature that can be either exclusively performant or safe, it must then be performant. Most of the current code is, for our joy, both performant and safe. However, some invariants are left around the place and you might shoot your own feet. This is an issue and some reworking must be done (along with tagging some functions and traits unsafe).
No concept of backends will ever end up in luminance. If it’s decided to switch to Vulkan, the whole luminance API will and must be impacted, so that people can use Vulkan the best possible way.
A bit like the first point, the code must be written in a way that the generated binary is as small as possible. Generics are not forbidden – they’re actually recommended – but things like crate dependencies are likely to be forbidden (exception for the gl dependency, of course).
Windowing must not be addressed by luminance. This is crucial. As a demoscener, if I want to write a 64k with luminance, I must be able to use a library over X11 or the Windows API to setup the OpenGL context myself, set the OpenGL pointers myself, etc. This is not the typical usecase – who cares besides demosceners?! – but it’s still a good advantage since you end up with loose coupling for free.

The new luminance

luminance has received more attention lately, and I think it’s a good thing to talk about how to use it. I’ll add examples on github and its docs.rs online documentation.

I’m going to do that like a tutorial. It’s easier to read and you can test the code in the same time. Let’s render a triangle!

Note: keep in mind that you need a nightly compiler to compile luminance.

Getting your feet wet

I’ll do everything from scratch with you. I’ll work in /tmp:

$ cd /tmp

First thing first, let’s setup a lumitest Rust binary project:

$ cargo init --bin lumitest
  Created binary (application) project
$ cd lumitest

Let’s edit our Cargo.toml to use luminance. We’ll need two crates:

The luminance crate.
A way to open the OpenGL context; we’ll use GLFW, so the luminance-glfw crate.

At the time of writing, corresponding versions are luminance-0.23.0 and luminance-glfw-0.3.2.

Have the following [dependencies] section

[dependencies]
luminance = "0.23.0"
luminance-glfw = "0.3.2"

$ cargo check

Everything should be fine at this point. Now, let’s step in in writing some code.

extern crate luminance;
extern crate luminance_glfw;

use luminance_glfw::{Device, WindowDim, WindowOpt};

const SCREEN_WIDTH: u32 = 960;
const SCREEN_HEIGHT: u32 = 540;

fn main() {
  let rdev = Device::new(WindowDim::Windowed(SCREEN_WIDTH, SCREEN_HEIGHT), "lumitest", WindowOpt::default());
}

The main function creates a Device that is responsible in holding the windowing stuff for us.

Let’s go on:

match rdev {
  Err(e) => {
    eprintln!("{:#?}", e);
    ::std::process::exit(1);
  }

  Ok(mut dev) => {
    println!("let’s go!");
  }
}

This block will catch any Device errors and will print them to stderr if there’s any.

Let’s write the main loop:

'app: loop {
  for (_, ev) in dev.events() { // the pair is an interface mistake; it’ll be removed
    match ev {
      WindowEvent::Close | WindowEvent::Key(Key::Escape, _, Action::Release, _) => break 'app,
      _ => ()
    }
  }
}

This loop runs forever and will exit if you hit the escape key or quit the application.

Setting up the resources

Now, the most interesting thing: rendering the actual triangle! You will need a few things:

type Position = [f32; 2];
type RGB = [f32; 3];
type Vertex = (Position, RGB);

const TRIANGLE_VERTS: [Vertex; 3] = [
  ([-0.5, -0.5], [0.8, 0.5, 0.5]), // red bottom leftmost
  ([-0., 0.5], [0.5, 0.8, 0.5]), // green top
  ([0.5, -0.5], [0.5, 0.5, 0.8]) // blue bottom rightmost
];

Position, Color and Vertex define what a vertex is. In our case, we use a 2D position and a RGB color.

You have a lot of choices here to define the type of your vertices. In theory, you can choose any type you want. However, it must implement the Vertex trait. Have a look at the implementors that already exist for a faster start off!

Important: do not confuse between [f32; 2] and (f32, f32). The former is a single 2D vertex component. The latter is two 1D components. It’ll make a huge difference when writing shaders.

The TRIANGLE_VERTS is a constant array with three vertices defined in it: the three vertices of our triangle. Let’s pass those vertices to the GPU with the Tess type:

// at the top location
use luminance::tess::{Mode, Tess, TessVertices};

// just above the main loop
let triangle = Tess::new(Mode::Triangle, TessVertices::Fill(&TRIANGLE_VERTS), None);

This will pass the TRIANGLE_VERTS vertices to the GPU. You’re given back a triangle object. The Mode is a hint object that states how vertices must be connected to each other. TessVertices lets you slice your vertices – this is typically enjoyable when you use a mapped buffer that contains a dynamic number of vertices.

We’ll need a shader to render that triangle. First, we’ll place its source code in data:

$ mkdir data

Paste this in data/vs.glsl:

layout (location = 0) in vec2 co;
layout (location = 1) in vec3 color;

out vec3 v_color;

void main() {
  gl_Position = vec4(co, 0., 1.);
  v_color = color;
}

Paste this in data/fs.glsl:

in vec3 v_color;

out vec4 frag;

void main() {
  frag = vec4(v_color, 1.);
}

And add this to your main.rs:

const SHADER_VS: &str = include_str!("../data/vs.glsl");
const SHADER_FS: &str = include_str!("../data/fs.glsl");

Note: this is not a typical workflow. If you’re interested in shaders, have a look at how I do it in spectra. That is, hot reloading it via SPSL (Spectra Shading Language), which enables to write GLSL modules and compose them in a single file but just writing functions. The functional programming style!

Same thing as for the tessellation, we need to pass the source to the GPU’s compiler to end up with a shader object:

// add this at the top of your main.rs
use luminance::shader::program::Program;

// below declaring triangle
let (shader, warnings) = Program::<Vertex, (), ()>::from_strings(None, SHADER_VS, None, SHADER_FS).unwrap();

for warning in &warnings {
  eprintln!("{:#?}", warning);
}

Finally, we need to tell luminance in which framebuffer we want to make the render. It’s simple: to the default framebuffer, which ends up to be… your screen’s back buffer! This is done this way with luminance:

use luminance::framebuffer::Framebuffer;

let screen = Framebuffer::default([SCREEN_WIDTH, SCREEN_HEIGHT]);

And we’re done for the resources. Let’s step in the actual render now.

The actual render: the pipeline

luminance’s approach to render is somewhat not very intuitive yet very simple and very efficient: the render pipeline is explicitly defined by the programmer in Rust, on the fly. That means that you must express the actual state the GPU must have for the whole pipeline. Because of the nature of the pipeline, which is an AST (Abstract Syntax Tree), you can batch sub-parts of the pipeline (we call such parts nodes) and you end up with minimal GPU state switches. The theory is as following:

At top most, you have the pipeline function that introduces the concept of shading things to a framebuffer.
Nested, you find the concept of a shader gate. That is, an object linked to its parent (pipeline) and that gives you the concept of shading things with a shader.
- Nested, you find the concept of rendering things. That is, can set on such nodes GPU states, such as whether you want a depth test, blending, etc.
- Nested, you find the concept of a tessellation gate, enabling you to render actual Tess objects.

That deep nesting enables you to batch your objects on a very fine granularity. Also, notice that the functions are not about slices of Tess or hashmaps of Program. The allocation scheme is completely ignorant about how the data is traversed, which is good: you decide. If you need to borrow things on the fly in a shading gate, you can.

Let’s get things started:

use luminance::pipeline::{entry, pipeline};

entry(|_| {
  pipeline(&screen, [0., 0., 0., 1.], |shd_gate| {
    shd_gate.shade(&shader, |rdr_gate, _| {
      rdr_gate.render(None, true, |tess_gate| {
        let t = &triangle;
        tess_gate.render(t.into());
      });
    });
  });
});

We just need a final thing now: since we render to the back buffer of the screen, if we want to see anything appear, we need to swap the buffer chain so that the back buffer become the front buffer and the front buffer become the back buffer. This is done by wrapping our render code in the Device::draw function:

dev.draw(|| {
  entry(|_| {
    pipeline(&screen, [0., 0., 0., 1.], |shd_gate| {
      shd_gate.shade(&shader, |rdr_gate, _| {
        rdr_gate.render(None, true, |tess_gate| {
          let t = &triangle;
          tess_gate.render(t.into());
        });
      });
    });
  });
});

You should see this:

As you can see, the code is pretty straightforward. Let’s get deeper, and let’s kick some time in!

use std::time::Instant;

// before the main loop
let t_start = Instant::now();
// in your main loop
let t_dur = t_start.elapsed();
let t = (t_dur.as_secs() as f64 + t_dur.subsec_nanos() as f64 * 1e-9) as f32;

We have the time. Now, we need to pass it down to the GPU (i.e. the shader). luminance handles that kind of things with two concepts:

Uniforms.
Buffers.

Uniforms are a good match when you want to send data to a specific shader, like a value that customizes the behavior of a shading algorithm.

Because buffers are shared, you can use buffers to share data between shader, leveraging the need to pass the data to all shader by hand – you only pass the index to the buffer that contains the data.

We won’t conver the buffers for this time.

Because of type safety, luminance requires you to state which types the uniforms the shader contains are. We only need the time, so let’s get this done:

// you need to alter this import
use luminance::shader::program::{Program, ProgramError, Uniform, UniformBuilder, UniformInterface, UniformWarning};

struct TimeUniform(Uniform<f32>);

impl UniformInterface for TimeUniform {
  fn uniform_interface(builder: UniformBuilder) -> Result<(Self, Vec<UniformWarning>), ProgramError> {
    // this will fail if the "t" variable is not used in the shader
    //let t = builder.ask("t").map_err(ProgramError::UniformWarning)?;

    // I rather like this one: we just forward up the warning and use the special unbound uniform
    match builder.ask("t") {
      Ok(t) => Ok((TimeUniform(t), Vec::new())),
      Err(e) => Ok((TimeUniform(builder.unbound()), vec![e]))
    }
  }
}

The UniformBuilder::unbound is a simple function that gives you any uniform you want: the resulting uniform object will just do nothing when you pass values in. It’s a way to say “— Okay, I don’t use that in the shader yet, but don’t fail, it’s not really an error”. Handy.

And now, all the magic: how do we access that uniform value? It’s simple: via types! Have you noticed the type of our Program? For the record:

let (shader, warnings) = Program::<Vertex, (), ()>::from_strings(None, SHADER_VS, None, SHADER_FS).unwrap();

See the type is parametered with three type variables:

The first one – here Vertex, our own type – is for the input of the shader program.
The second one is for the output of the shader program. It’s currently not used at all by luminance by is reserved, as it will be used later for enforcing even further type safety.
The third and latter is for the uniform interface.

You guessed it: we need to change the third parameter from () to TimeUniform:

let (shader, warnings) = Program::<Vertex, (), TimeUniform>::from_strings(None, SHADER_VS, None, SHADER_FS).unwrap();

And that’s all. Whenever you shade with a ShaderGate, the type of the shader object is being inspected, and you’re handed with the uniform interface:

shd_gate.shade(&shader, |rdr_gate, uniforms| {
  uniforms.0.update(t);

  rdr_gate.render(None, true, |tess_gate| {
    let t = &triangle;
    tess_gate.render(t.into());
  });
});

Now, change your fragment shader to this:

in vec3 v_color;

out vec4 frag;

uniform float t;

void main() {
  frag = vec4(v_color * vec3(cos(t * .25), sin(t + 1.), cos(1.25 * t)), 1.);
}

And enjoy the result! Here’s the gist that contains the whole main.rs.

Sunday, July 30, 2017

Rust GLSL crate

It’s been almost two months that I’ve been working on my glsl crate. This crate exposes a GLSL450 compiler that enables you to parse GLSL-formatted sources into memory in the form of an AST. Currently, that AST is everything you get from the parsing process – you’re free to do whatever you want with it. In the next days, I’ll write a GLSL writer (so that I can check that I can parse GLSL to GLSL…). I’d love to see contributions from Vulkan people to write a SPIR-V backend, though!

Just for the record, the initial goal I had in mind was to parse a subset of GLSL for my spectra demoscene framework. I’ve been planning to write my own GLSL-based shading language (with modules, composability, etc. etc.) and hence I need to be able to parse GLSL sources. Because I wanted to share my effort, I decided to create a dedicated project and here we are with a GLSL crate.

Currently, you can successfully parse a GLSL450-formatted source (or part of it, I expose all the intermediary parsers as well as it’s required by my needs to create another shading language over GLSL). See for instance this shading snippet parsed to an AST.

However, because the project is still very young, there a lot of features that are missing:

I followed the official GLSL450 specifications, which is great, but the types are not very intuitive – some refactoring must be done here;
I use nom as a parser library. It’s not perfect but it does its job very well. However, error reporting is pretty absent right now (if you have an error in your source, you’re basically left with Error(SomeVariant) flag, which is completely unusable;
I wrote about 110 unit tests to ensure the parsing process is correct. However, there’s for now zero semantic check. Those are lacking.
About the semantic checks, the crate is also missing a semantic analysis. I don’t really know what to do here, because it’s a lot of work (like, ensure that the assigned value to a float value is a real float and not a boolean or ensure that a function that must returns a vec3 returns a vec3 and not something else, etc.). This is not a trivial task and because this is already done by the OpenGL drivers, I won’t provide this feature yet. However, to me, it’d be a great value.

If you’re curious, you can start using the crate with glsl = "0.2.1" in your Cargo.toml. You probably want to use the translation_unit parser, which is the most external parser (it parses an actual shader). If you’re looking for something similar to my need and want to parse subparts of GLSL, feel free to dig in the documentation, and especially the parsers module, that exports all the parsers available to parse GLSL’s parts.

Either way, you have to pass a bytes slice. If your source’s type is String or &str, you can use the str::as_bytes() function to feed the input of the parsers.

Not all parsers are exported, only the most important ones. You will not find an octal parser, for instance, while it’s defined and used in the crate internally.

Call to contribution

If you think it’s worth it, I’m looking for people who would like to:

write a GLSL to SPIR-V writer: it’s an easy task, because you just have to write a function of the form AST -> Result<SPIRV, _>, for being pragmatic;
test the crate and provide feedback about any error you’d find! Please do not open an issue if you have an error in your source and find “the error from the parsers is not useful”, because it’s already well established this is a problem and I’m doing my best to solve it;
any kind of contributions you think could be interesting for the crate.

Happy coding, and keep the vibe!

Sunday, July 23, 2017

On programming workflows

For the last month, I had technical talks with plenty of programmers from all around the globe – thank you IRC! Something interesting that often showed up in the discussion was the actual workflow we have when writing code. Some people are use to IDE and won’t change their tools for anything else. Some others use very basic text editors (I even know someone using BBEdit for his job work! crazy, isn’t it?). I think it’s always a good thing to discuss that kind of topic, because it might give you more hindsight on your own workflow, improve it, share it or just show your nice color scheme.

I’ll be talking about:

My editor

I’ve tried a lot of editors in the last ten years. I spent a year using emacs but eventually discovered – erm, learned! – vim and completely fell in love with the modal editor. It was something like a bit less than ten years ago. I then tried a lot of other editors (because of curiosity) but failed to find something that would help be better than vim to write code. I don’t want to start an editor war; here’s just my very personal point of view on editing. The concept of modes in vim enables me to use a very few keystrokes to perform what I want to do (moving, commands, etc.) and I feel very productive that way.

A year ago, a friend advised me to switch to neovim, which I have. My editor of the time is then neovim, but it’s so close to vim that I tend to use (neo)vim. :)

I don’t use any other editing tool. I even use neovim for taking notes while in a meeting or when I need to format something in Markdown. I just use it everywhere.

My neovim setup

I use several plugins:

Plugin 'VundleVim/Vundle.vim'
Plugin 'ctrlpvim/ctrlp.vim'
Plugin 'gruvbox'
Plugin 'neovimhaskell/haskell-vim'
Plugin 'itchyny/lightline.vim'
Plugin 'rust-lang/rust.vim'
Plugin 'jacoborus/tender.vim'
Plugin 'airblade/vim-gitgutter'
Plugin 'tikhomirov/vim-glsl'
Plugin 'plasticboy/vim-markdown'
Plugin 'cespare/vim-toml'
Plugin 'mzlogin/vim-markdown-toc'
Plugin 'ElmCast/elm-vim'
Plugin 'raichoo/purescript-vim'
Plugin 'easymotion'
Plugin 'scrooloose/nerdtree'
Plugin 'ryanoasis/vim-devicons'
Plugin 'tiagofumo/vim-nerdtree-syntax-highlight'
Plugin 'mhinz/vim-startify'
Plugin 'Xuyuanp/nerdtree-git-plugin'
Plugin 'tpope/vim-fugitive'
Plugin 'MattesGroeger/vim-bookmarks'
Plugin 'luochen1990/rainbow'

Vundle

A package manager. It just takes the list you read above and clones / keeps updated the git repositories of the given vim packages. It’s a must have. There’re alternatives like Pathogen but Vundle is very simple to setup and you don’t have to care about the file system: it cares for you.

ctrlp

This is one is a must-have. It gives you a fuzzy search buffer that traverse files, MRU, tags, bookmarks, etc.

I mapped the file search to the , f keys combination and the tag fuzzy search to , t.

gruvbox

The colorscheme I use. I don’t put an image here since you can find several examples online.

This colorscheme also exists for a lot of other applications, among terminals and window managers.

haskell-vim

Because I write a lot of Haskell, I need that plugin for language hilighting and linters… mostly.

lightline

The Lightline vim statusline. A popular alternative is Powerline for instance. I like Lightline because it’s lightweight and has everything I want.

rust.vim

Same as for Haskell: I wrote a lot of Rust code so I need the language support in vim.

tender

A colorscheme for statusline.

vim-gitgutter

I highly recommend this one. It gives you diff as icons in the symbols list (behind the line numbers).

vim-glsl

GLSL support in vim.

vim-markdown

Markdown support in vim.

vim-toml

TOML support in vim.

TOML is used in Cargo.toml, the configuration file for Rust projects.

vim-markdown-toc

You’re gonna love this one. It enables you to insert a table of contents wherever you want in a Markdown document. As soon as you save the buffer, the plugin will automatically refresh the TOC for you, keeping it up-to-date. A must have if you want table of contents in your specifications or RFC documents.

elm-vim

Elm support in vim.

purescript-vim

Purescript support in vim.

easymotion

A must have. As soon as you hit the corresponding keys, it will replace all words in your visible buffer by a set of letters (typically one or two), enabling you to just press the associated characters to jump to that word. This is the vim motion on steroids.

nerdtree

A file browser that appears at the left part of the current buffer you’re in. I use it to discover files in a file tree I don’t know – I use it very often at work when I work on a legacy project, for instance.

vim-devicons

A neat package that adds icons to vim and several plugins (nerdtree, ctrlp, etc.). It’s not a must-have but I find it gorgeous so… just install it! :)

vim-nerdtree-syntax-highlight

Add more formats and files support to nerdtree.

vim-startify

A cute plugin that will turn the start page of vim into a MRU page, enabling you to jump to the given file by just pressing its associated number.

It also features a cow that gives you fortune catchphrases. Me gusta.

nerdtree-git-plugin

Git support in nerdtree, it adds markers in front of file that have changes, diff, that were added, etc.

vim-fugitive

A good Git integration package to vim. I use it mostly for its Gdiff diff tooling directly in vim, even though I like using the command line directly for that. The best feature of that plugin is the integrated blaming function, giving you the author of each line directly in a read-only vim buffer.

vim-bookmarks

Marks on steroids.

rainbow

This little plugins is very neats as it allows me to add colors to matching symbols so that I can easily see where I am.

Workflow in Rust

I’m a rustacean. I do a lot of Rust on my spare time. My typical workflow is the following:

I edit code in neovim
Depending on the project (whether I have a robust unit tests base or not or whether I’m writing a library or an app), I use several cargo commands:

for a library project, I split my screen in two and have a cargo watch -x test running; this command is a file watcher that will run all the tests suites in the project as soon as a file changes;
for an app project, I split my screen in two and have a cargo watch – similar to cargo watch -x check running; this command is a file watcher that will proceed through the compilation of the binary but it will not compile it down to an actual binary; it’s just a check, it doesn’t produce anything you can run. You can manually run that command with cargo check. See more here;
for other use cases, I tend to run cargo check by hand and run cargo build to test the actualy binary

Because I’m a unix lover, I couldn’t work correctly without a terminal, so I use neovim in a terminal and switch from the console to the editor with the keybinding C-z and the jobs and fg commands. I do that especially to create directories, issue git commands, deploy things, etc. It’s especially ultra cool to run a rg search or even a find / fd. I sometimes do that directly in a neovim buffer – with the :r! command – when I know I need to edit things from the results. Bonus: refactoring with tr, sed and find -exec is ultra neat.

Workflow in Haskell

The workflow is almost exactly the same besides the fact I use the stack build --fast --file-watch command to have a file watcher.

Haskell’s stack doesn’t currently the awesome check command Rust’s cargo has. Duh :(

I also have a similar workflow for other languages I work in, like Elm, even though I use standard unix tools for the file watching process.

Git workflow

Aaaah… git. What could I do without it? Pretty much nothing. git is such an important piece of software and brings such an important project philosophy and behaviors to adopt.

What I find very cool in git – besides the tool itself – is everything around it. For instance, on GitHub, you have the concept of Pull Request (PR) – Merge Request in GitLab (MR). Associated with a good set of options, like disallowing people to push on the master branch, hooks, forcing people to address any comments on their MR, this allows for better code reviews and overall a way better quality assurance of the produced code than you could have without using all this infrastructure. Add a good set of DevOps for deployment and production relatide issues and you have a working team that has no excuse to produce bad code!

Some git candies I love working with

My neovim fugitive plugin allows me to open a special buffer with the :Gblame command that gives me a git blame annotation of the file. This might be trivial but it’s very useful, especially at work when I have my colleagues next me – it’s always better to directly ask them about something than guessing.

Another one that I love is :Gdiff, that gives you a git diff of the modification you’re about to stage. I often directly to a git diff in my terminal, but I also like how this plugin nails it as well. Very pratictal!

General note on workflow

It’s always funny to actually witness difference in workflows, especially at work. People who use mostly IDEs are completely overwhelmed by my workflow. I was completely astonished at work that some people hadn’t even heard of sed before – they even made a Google search! I’m a supporter of the philosophy that one should use the tool they feel comfortable with and that there’s no “ultimate” tool for everyone. However, for my very own person, I really can’t stand IDEs, with all the buttons and required clicks you have to perform all over the screen. I really think it’s a waste of time, while using a modal editor like neovim with a bépo keyboard layout (French dvorak) and going back and forth to the terminal is just incredibly simple, yet powerful.

I had a pretty good experience with Atom, a modern editor. But when I’ve been told it’s written with web technologies, the fact it’s slow as fck as soon as you start having your own tooling (extensions), its pretty bad behavior and incomprensible “Hey, I do whatever the fck I want and I’ll just reset your precious keybindings!” or all the weird bugs – some of my extensions won’t just work if I have an empty pane open, wtf?!… well, I was completely bolstered that GUI interfaces, at least for coding and being productive, are cleary not for me. With my current setup, my hands never move from the keyboard – my thrists are completely static. With all the candies like easymotion, ctrlp, etc. etc. I just can’t find any other setups faster and comfier than this one.

There’s even an extra bonus to my setup: because I use mostly unix tools and neovim, it’s pretty straigth-forward to remote-edit something via ssh, because everything happens in a terminal. That’s not something you can do easily with Atom, Sublime Text or any other editors / IDEs – and you even pay for that shit! No offence!

However, there’s a caveat: because pretty much everything I do is linked to my terminal, the user experience mostly relies on the performance of the terminal. Using a bad terminal will result in an overall pretty bad experience, should it be editing, compiling, git or ssh. That’s why I keep lurking at new terminal emulaters – alacritty seems very promising, yet it’s still too buggy and lacks too many features to be production-ready to me – but it’s written in Rust and is GPU-accelerated, hells yeah!

Conclusion

Whoever you are, whatever you do, whomever you work with, I think the most important thing about workflow is to find the one that fits your needs the most. I have a profound, deep digust for proprietary and closed software like Sublime Text and IDEs that use GUIs while keyboard shortcut are just better. To me, the problem is about the learning curve and actually wanting to pass it – because yes, learning (neo)vim in details and mastering all its nits is not something you’ll do in two weeks; it might take months or years, but it’s worth it. However, as I said, if you just feel good with your IDE, I will not try to convert you to a modal editor or a unix-based workflow… because you wouldn’t be as productive as you already are.

Keep the vibe!

Thursday, April 20, 2017

Postmortem #1 – Revision 2017

On the weekend of 14th – 17th of April 2017, I was attending for the forth time the easter demoparty Revision 2017. This demoparty is the biggest so far in the world and gathers around a thousand people coming from around the world. If you’re a demomaker, a demoscene passionated or curious about it, that’s the party to go. It hosts plenty of competitions, among photos, modern graphics, oldschool graphics, games, size-limited demos (what we call intros), demos, tracked and streamed music, wild, live compo, etc. It’s massive.

So, as always, once a year, I attend Revision. But this year, it was a bit different for me. Revision is very impressive and most of the “big demogroups” release their productions they’ve been working on for months or even years. I tend to think “If I release something here, I’ll just be kind of muted by all those massive productions.” Well, less than two weeks before Revision 2017, I was contacted by another demogroup. They asked me to write an invitro – a kind of intro or demo acting as a communication production to invite people to go to another party. In my case, I was proposed to make the Outline 2017 invitro. Outline was the first party I attended years back then, so I immediately accepted and started to work on something. That was something like 12 days before the Revision deadline.

I have to tell. It was a challenge. All productions I wrote before was made in about a month and a half and featured less content than the Outline Invitation. I had to write a lot of code from scratch. A lot. But it was also a very nice project to test my demoscene framework, written in Rust – you can find spectra here or here

An hour before hitting the deadline, the beam team told me their Ubuntu compo machine died and that it would be neat if I could port the demo to Windows. I rushed like a fool to make a port – I even forked and modified my OpenAL dependency! – and I did it in 35 minutes. I’m still a bit surprised yet proud that I made it through!

Anyway, this post is not about bragging. It’s about hindsight. It’s a post-mortem. I did that for Céleri Rémoulade as I was the only one working on it – music, gfx, direction and obviously the Rust code. I want to draw a list of what went wrong and what went right. In the first time, for me. So that I have enhancement axis for the next set of demos I’ll make. And for sharing those thoughts so that people can have a sneak peek into the internals of what I do mostly – I do a lot of things! :D – as a hobby on my spare time.

You can find the link to the production here (there’re Youtube links if you need).

What went wrong

Sooooooooo… What went wrong. Well, a lot of things! spectra was designed to build demo productions in the first place, and it’s pretty good at it. But something that I have to enhance is the user interaction. Here’s a list of what went wrong in a concrete way.

Hot-reloading went wiiiiiiiiiiild²

With that version of spectra, I added the possibility to hot-reload almost everything I use as a resource: shaders, textures, meshes, objects, cameras, animation splines, etc. I edit the file, and as soon as I save it, it gets hot-reloaded in realtime, without having to interact with the program (for curious ones, I use the straight-forward notify crate crate for registering callbacks to handle file system changes). This is very great and it saves a lot of time – Rust compilation is slow, and that’s a lesson I’ve learned from Céleri Rémoulade: keeping closing the program, making a change, compiling and then running is a waste of time.

So what’s the issue with that? Well, the main problem is the fact that in order to implement hot-reloading, I wanted performance and something very simple. So I decided to use shared mutable smart states. As a Haskeller, I kind of offended myself there – laughter! Yeah, in the Haskell world, we try hard to avoid using shared states – IORef – because it’s not referentially transparent and reasoning about it is difficult. However, I tend to strongly think that in some very specific cases, you need such side-effects. I’m balanced but I think it’s the way to go.

Well, in Rust, shared mutable state is implemented via two types: Rc/Arc and Cell/RefCell.

The former is a runtime implementation of the Rust borrowing rules and enables you to share a value. The borrowing rules are not enforced at compile-time anymore but dynamically checked. It’s great because in some cases, you can’t know how long your values will be borrowed for or live. It’s also dangerous because you have to pay extra attention to how you borrow your data – since it’s checked at runtime, you can literally crash your program if you’re not extra careful.

Rc means ref counted and Arc means atomic-ref counted. The former is for values that stay on the same and single thread; the latter is for sharing between threads.

Cell/RefCell are very interesting types that provide internal mutation. By default, Rust gives you external mutation. That is, if you have a value and its address, can mutate what you have at that address. On the other hand, internal mutation is introduced by the Cell and RefCell types. Those types enable you to mutate the content of an object stored at a given address without having the exterior mutation property. It’s a bit technical and related to Rust, but it’s often used to mutate the content of a value via a function taking an immutable value. Imagine an immutable value that only holds a pointer. Exterior mutation would give you the power to change what this pointer points to. Interior mutation would give you the power to change the object pointed by this pointer.

Cell only accepts values that can be copied bit-wise and RefCell works with references.

Now, if you combine both – Rc<RefCell<_>>, you end up with a single-thread shareable – Rc<_> – mutable – RefCell<_> – value. If you have a value of type Rc<RefCell<u32>> for instance, that means you can clone that integer and store it everywhere in the same thread, and at any time, borrow it and inspect and/or mutate it. All copies of the value will observe the change. It’s a bit like C++’s shared_ptr, but it’s safer – thank you Rust!

So what went wrong with that? Well, the borrow part. Because Rust is about safety, you still need to tell it how you want to borrow at runtime. This is done with the [RefCell::borrow()](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.borrow] and RefCell::borrow_mut() functions. Those functions return special objects that borrow the ref as long as it lives. Then, when it goes out of scope, it releases the borrow.

So any time you want to use an object that is hot-reloadable with my framework, you have to call one of the borrow functions presented above. You end up with a lot of borrows, and you have to keep in mind that you can litterally crash your program if you violate the borrowing rules. This is a nasty issue. For instance, consider:

let cool_object = …; // Rc<RefCell<Object>>, for instance
let cool_object_ref = cool_object.borrow_mut();
// mutate cool_object

just_read(&cool_object.borrow()); // borrowing rule violated here because a mut borrow is in scope

As you can see, it’s pretty simple to fuck up the program if you don’t pay extra attention to what you’re doing with your borrow. To solve the problem above, you’d need a smaller scope for the mutable borrow:

let cool_object = …; // Rc<RefCell<Object>>, for instance

{
  let cool_object_ref = cool_object.borrow_mut();
  // mutate cool_object
}

just_read(&cool_object.borrow()); // borrowing rule violated here because a mut borrow is in scope

So far, I haven’t really spent time trying to fix that, but that’s something I have to figure out.

Resources declaration in code

This is a bit tricky. As a programmer, I’m used to write algorithms and smart architectures to transform data and resolve problems. I’m given inputs and I provide the outputs – the solutions. However, a demoscene production is special: you don’t have inputs. You create artsy audiovisual outputs from nothing but time. So you don’t really write code to solve a problem. You write code to create something that will be shown on screen or in a headphone. This aspect of demo coding has an impact on the style and the way you code. Especially in crunchtime. I have to say, I was pretty bad on that part with that demo. To me, code should only be about transformations – that’s why I love Haskell so much. But my code is clearly not.

If you know the let keyword in Rust, well, imagine hundreds and hundreds of lines starting with let in a single function. That’s most of my demo. In rush time, I had to declare a lot of things so that I can use them and transform them. I’m not really happy with that, because those were data only. Something like:

let outline_emblem = cache.get::<Model>("Outline-Logo-final.obj", ()).unwrap();
let hedra_01 = cache.get::<Model>("DSR_OTLINV_Hedra01_Hi.obj", ()).unwrap();
let hedra_02 = cache.get::<Model>("DSR_OTLINV_Hedra02_Hi.obj", ()).unwrap();
let hedra_04 = cache.get::<Model>("DSR_OTLINV_Hedra04_Hi.obj", ()).unwrap();
let hedra_04b = cache.get::<Model>("DSR_OTLINV_Hedra04b_Hi.obj", ()).unwrap();
let twist_01 = cache.get::<Object>("DSR_OTLINV_Twist01_Hi.json", ()).unwrap();

It’s not that bad. As you can see, spectra features a resource cache that provides several candies – hot-reloading, resource dependency resolving and resource caching. However, having to declare those resources directly in the code is a nasty boilerplate to me. If you want to add a new object in the demo, you have to turn it off, add the Rust line, re-compile the whole thing, then run it once again. It breaks the advantage of having hot-reloading and it pollutes the rest of the code, making it harder to spot the actual transformations going on.

This is even worse with the way I handle texts. It’s all &'static str declared in a specific file called script.rs with the same insane load of let. Then I rasterize them in a function and use them in a very specific way regarding the time they appear. Not fun.

Still not enough data-driven

As said above, the cache is a great help and enables some data-driven development, but that’s not enough. The main.rs file is more than 600 lines long and 500 lines are just declarations of of clips (editing) and are all very alike. I intentionally didn’t use the runtime version of the timeline – but it’s already implemented – because I was editing a lot of code at that moment, but that’s not a good excuse. And the timeline is just a small part of it (the cuts are something like 10 lines long) and it annoyed me at the very last part of the development, when I was synchronizing the demo with the soundtrack.

I think the real problem is that the clips are way too abstract to be a really helpful abstraction. Clips are just lambdas that consume time and output a node. This also has implication (you cannot borrow something for the node in your clip because of borrowing rules ; duh!).

Animation edition

Most of the varying things you can see in my demos are driven by animation curves – splines. The bare concept is very interesting: an animation contains control points that you know have a specific value at a given time. Values in between are interpolated using an interpolation mode that can change at each control points if needed. So, I use splines to animate pretty much everything: camera movements, objects rotations, color masking, flickering, fade in / fade out effects, etc.

Because I wanted to be able to edit the animation in a comfy way – lesson learned from Céleri Rémoulade, splines can be edited in realtime because they’re hot-reloadable. They live in JSON files so that you just have to edit the JSON objects in each file and as soon as you save it, the animation changes. I have to say, this was very ace to do. I’m so happy having coded such a feature.

However, it’s JSON. It’s already a thing. Though, I hit a serious problem when editing the orientations data. In spectra, an orientation is encoded with a unit quaternion. This is a 4-floating number – hypercomplex. Editing those numbers in a plain JSON file is… challenging! I think I really need some kind of animation editor to edit the spline.

Video capture

The Youtube capture was made directly in the demo. At the end of each frame, I dump the frame into a .png image (with a name including the number of the frame). Then I simply use ffmpeg to build the video.

Even though this is not very important, I had to add some code into the production code of the demo and I think I could just refactor that into spectra. I’m talking about three or four lines of code. Not a big mess.

Compositing

This is will appear as both pros. and cons. Compositing, in spectra, is implemented via the concept of nodes. A node is just an algebraic data structure that contains something that can be connected to another thing to compose a render. For instance, you get find nodes of type render, color, texture, fullscreen effects and composite – the latter is used to mix nodes between them.

Using the nodes, you can build a tree. And the cool thing is that I implemented the most common operators from std::ops. I can then apply a simple color mask to a render by doing something like

render_node * RGBA::new(r, g, b, a).into()

This is extremely user-friendly and helped me a lot to tweak the render (the actual ASTs are more complex than that and react to time, but the idea is similar). However, there’s a problem. In the actual implementation, the composite node is not smart enough: it blends two nodes by rendering them into a separate framebuffer (hence two framebuffers), then sample via a fullscreen quad the left framebuffer and then the right one – and apply the appropriate blending.

I’m not sure about performance here, but I feel like this is the wrong way to go – bandwidth! I need to profile.

On a general note: data vs. transformations

My philosphy is that code should be about transformation, not data. That’s why I love Haskell. However, in the demoscene world, it’s very common to move data directly into functions – think of all those fancy shaders you see everywhere, especially on shadertoy. As soon as I see a data constant in my code, I think “Wait; isn’t there a way to remove that from the function and have access to it as an input?”

This is important, and that’s the direction I’ll take from now on for the future versions of my frameworks.

What went right!

A lot as well!

Hot reloading

Hot reloading was the thing I needed. A hot-reload everything. I even hot-reload the tessellation of the objects (.obj), so that I can change the shape / resolution of a native object and I don’t have to relaunch the application. I saved a lot of precious time thanks to that feature.

Live edition in JSON

I had that idea pretty quickly as well. A lot of objects – among splines – live in JSON files. You edit the file, save it and tada: the object has changed in the application – hot reloading! The JSON was especially neat to handle splines of positions, colors and masks – it went pretty bad and wrong with orientations, but I already told you that.

Compositing

As said before, compositing was also a win, because I lifted the concept up to the Rust AST, enabling me to express interesting rendering pipeline just by using operators like *, + and some combinators of my own (like over).

Editing

Editing was done with a cascade of types and objects:

a Timeline holds several Tracks and Overlaps;
a Track holds several Cuts;
a Cut holds information about a Clip: when the cut starts and ends in the clip and when such a cut should be placed in the track;
a Clip contains code defining a part of the scene (Rust code, can’t live in JSON for obvious reasons;
an Overlap is a special object used to fold several nodes if several cuts are triggered at the same time; it’s used for transitions mostly;
alternatively, a TimelineManifest can be used to live-edit all of this (the JSON for the cuts has a string reference for the clip, and a map to actual code must be provided when folding the manifest into an object of type Timeline).

I think such a system is very neat and helped me a lot to remove naive conditions (like timed if-else if-else if-else if…-else nightmare). With that system, there’s only one test per frame to determine which cuts must be rendered (well, actually, one per track), and it’s all declarative. Kudos.

Resources loading was insanely fast

I thought I’d need some kind of loading bars, but everything loaded so quickly that I decided it’d be a wast of time. Even though I might end up modifying the way resources are loaded, I’m pretty happy with it.

Conclusion

Writing that demo in such a short period of time – I have a job, a social life, other things I enjoy, etc. – was such a challenge! But it was also the perfect project to stress-test my framework. I saw a lot of issues while building my demo with spectra and a lot of “woah, that’s actually pretty great doing this that way!”. I’ll have to enhance a few things, but I’ll always do that with a demo as a work-in-progress because I target pragmatism. As I released spectra on crates.io, I might end up writing another blog entry about it sooner or later, and even speak about it at a Rust meeting!

Keep the vibe!

Tuesday, February 07, 2017

Lifetimes limits – self borrowing and dropchecker

Lately, I’ve been playing around with alto in my demoscene framework. This crate in the replacement of openal-rs as openal-rs has been deprecated because unsound. It’s a wrapper over OpenAL, which enables you to play 3D sounds and gives you several physical properties and effects you can apply.

The problem

Just to let you fully understand the problem, let me introduce a few principles from alto. As a wrapper over OpenAL, it exposes quite the same interface, but adds several safe-related types. In order to use the API, you need three objects:

an Alto object, which represents the API object (it holds dynamic library handles, function pointers, etc. ; we don’t need to know about that)
a Device object, a regular device (a sound card, for example)
a Context object, used to create audio resources, handle the audio context, etc.

There are well-defined relationships between those objects that state about their lifetimes. An Alto object must outlive the Device and the Device must outlive the Context. Basically:

let alto = Alto::load_default(None).unwrap(); // bring the default OpenAL implementation in
let dev = alto.open(None).unwrap(); // open the default device
let ctx = dev.new_context(None).unwrap(); // create a default context with no OpenAL extension

As you can see here, the lifetimes are not violated, because alto outlives dev which outlives ctx. Let’s dig in the type and function signatures to get the lifetimes right (documentation here).

fn Alto::open<'s, S: Into<Option<&'s CStr>>>(&self, spec: S) -> AltoResult<Device>

The S type is just a convenient type to select a specific implementation. We need the default one, so just pass None. However, have a look at the result. AltoResult<Device>. I told you about lifetime relationships. This one might be tricky, but you always have to wonder “is there an elided lifetime here?”. Look at the Device type:

pub struct Device<'a> { /* fields omitted */ }

Yep! So, what’s the lifetime of the Device in AltoResult<Device>? Well, that’s simple: the lifetime elision rule in action is one of the simplest:

If there are multiple input lifetime positions, but one of them is &self or &mut self, the lifetime of self is assigned to all elided output lifetimes. (source)

So let’s rewrite the Alto::open function to make it clearer:

fn Alto::open<'a, 's, S: Into<Option<&'s CStr>>>(&'a self, spec: S) -> AltoResult<Device<'a>> // exact same thing as above

So, what you can see here is that the Device must be valid for the same lifetime as the reference we pass in. Which means that Device cannot outlive the reference. Hence, it cannot outlive the Alto object.

impl<'a> Device<'a> {
  // …
  fn new_context<A: Into<Option<ContextAttrs>>>(&self, attrs: A) -> AltoResult<Context>
  // …
}

That looks a bit similar. Let’s have a look at Context:

pub struct Context<'d> { /* fields omitted */ }

Yep, same thing! Let’s rewrite the whole thing:

impl<'a> Device<'a> {
  // …
  fn new_context<'b, A: Into<Option<ContextAttrs>>>(&'b self, attrs: A) -> AltoResult<Context<'b>>
  // …
}

Plus, keep in mind that self is actually Device<'a>. The first argument of this function then awaits a &'b Device<'a> object!

rustc is smart enough to automatically insert the 'a: 'b lifetime bound here – i.e. the 'a lifetime outlives 'b. Which makes sense: the reference will die before the Device<'a> is dropped.

Ok, ok. So, what’s the problem then?!

The (real) problem

The snippet of code above about how to create the three objects is straight-forward (though we don’t take into account errors, but that’s another topic). However, in my demoscene framework, I really don’t want people to use that kind of types. The framework should be completely agnostic about which technology or API is used internally. For my purposes, I just need a single type with a few methods to work with.

Something like that:

struct Audio = {}

impl Audio {
  pub fn new<P>(track_path: P) -> Result<Self> where P: AsRef<Path> {}

  pub fn toggle(&mut self) -> bool {}

  pub fn playback_cursor(&self) -> f32 {}

  pub fn set_playback_cursor(&self, t: f32) {}
}

impl Drop for Audio {
  fn drop(&mut self) {
    // stop the music if playing; do additional audio cleanup
  }
}

This is a very simple interface, yet I don’t need more. Audio::set_playback_cursor is cool when I debug my demos in realtime by clicking a time panel to quickly jump to a part of the music. Audio::toggle() enables me to pause the demo to inspect an effect in the demo. Etc.

However, how can I implement Audio::new?

The (current) limits of borrowing

The problem kicks in as we need to wrap the three types – Alto, Device and Context – as the fields of Audio:

struct Audio<'a> {
  alto: Alto,
  dev: Device<'a>,
  context: Context<'a>
}

We have a problem if we do this. Even though the type is correct, we cannot correctly implement Audio::new. Let’s try:

impl<'a> Audio<'a> {
  pub fn new<P>(_: P) -> Result<Self> where P: AsRef<Path> {
    let alto = Alto::load_default(None).unwrap();
    let dev = alto.open(None).unwrap();
    let ctx = dev.new_context(None).unwrap();

    Ok(Audio {
      alto: alto,
      dev: dev,
      ctx: ctx
    })
  }
}

As you can see, that cannot work:

error: `alto` does not live long enough
  --> /tmp/alto/src/main.rs:14:15
   |
14 |     let dev = alto.open(None).unwrap();
   |               ^^^^ does not live long enough
...
22 |   }
   |   - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the body at 12:19...
  --> /tmp/alto/src/main.rs:12:20
   |
12 |   fn new() -> Self {
   |                    ^

error: `dev` does not live long enough
  --> /tmp/alto/src/main.rs:15:15
   |
15 |     let ctx = dev.new_context(None).unwrap();
   |               ^^^ does not live long enough
...
22 |   }
   |   - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the body at 12:19...
  --> /tmp/alto/src/main.rs:12:20
   |
12 |   fn new() -> Self {
   |                    ^

error: aborting due to 2 previous errors

What’s going on here? Well, we’re hitting a problem called the problem of self-borrowing. Look at the first two lines of our implementation of Audio::new:

let alto = Alto::load_default(None).unwrap();
let dev = alto.open(None).unwrap();

As you can see, the call to Alto::open borrows alto – via a &Alto reference. And of course, you cannot move a value that is borrowed – that would invalidate all the references pointing to it. We also have another problem: imagine we could do that. All those types implement Drop. Because they basically all have the same lifetime, there’s no way to know which one borrows information from whom. The dropchecker has no way to know that. It will then refuse to code creating objects of this type, because dropping might be unsafe in that case.

What can we do about it?

Currently, this problem is linked to the fact that the lifetime system is a bit too restrictive and doesn’t allow for self-borrowing. Plus, you also have the dropchecker issue to figure out. Even though we were able to bring in alto and device altogether, how do you handle context? The dropchecker doesn’t know which one must be dropped first – there’s no obvious link at this stage between alto and all the others anymore, because that link was made with a reference to alto that died – we’re moving out of the scope of the Audio::new function.

That’s a bit tough. The current solution I implemented to fix the issue is ok–ish, but I dislike it because it adds a significant performance overhead: I just moved the initialization code in a thread that stays awake until the Audio object dies, and I use a synchronized channel to communicate with the objects in that thread. That works because the thread provides us with a stack, that is the support of lifetimes – think of scopes.

Another solution would be to move that initialization code in a function that would accept a closure – your application. Once everything is initialized, the closure is called with a few callbacks to toggle / set the cursor of the object living “behind” on the stack. I don’t like that solution because it modifies the main design – having an Audio object was the goal.