Sergey Potapov

A word from rustacean, rubist and linuxoid.

Rust 2018

With this post, I would like to give my feedback to the Rust community as part of A Call for Community Blogposts.

This article is structured in the following way:

  • A little bit of my background
  • Things that I like in Rust.
  • Things that I miss or things that can be improved.

My background

For the last 10 years, my main programming language has been Ruby and my main working area has been web development. I started hacking on Rust about 1.5 years ago. As you may guess, it’s not very typical for web developers to jump to systems programming languages, so my perception is quite different from that of the majority, who come from a C++ or Java background.

I’ve decided to learn Rust by doing: I look for gaps in the ecosystem and pick small libraries that seem interesting to me and realistic for one person to implement and maintain. You can find them on GitHub.

Apart from that, I’ve implemented a little framework to develop and test trading strategies and an arbitrage bot for cryptocurrencies. Those are quick’n’dirty projects, where I’ve just tried to prototype ideas.

What I found awesome about Rust?

Here is just a short list of things:

  • Type safety
  • Package management with cargo
  • Pattern matching
  • Meaningful and helpful error messages (I could see how they improved over the last 1.5 years)
  • Doc tests
  • Dead code warnings
  • Community and the way Mozilla organizes the work
  • Performance
  • Language syntax and its expressiveness
  • Ability to write low-level as well as high-level code
  • Usage of ! to indicate the magic behind macros

Every point here deserves its own discussion but in this post, I’d like to focus on the stuff that can be improved.

Improvement points

Here is just a list of things that I sometimes miss in Rust.

TryFrom and TryInto traits

Quite often I need to convert one type into another where the conversion may fail. The idiomatic way to do this would be the TryFrom and TryInto traits, but they are not stable yet. They are just like From and Into, but return a Result. I hope they will be stabilized soon.
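For readers on a newer toolchain: these traits were stabilized later, in Rust 1.34. Below is a minimal sketch of a fallible conversion, assuming a hypothetical Percent type that I made up purely for illustration (it is not taken from any of my libraries).

```rust
use std::convert::TryFrom;

/// Hypothetical type for illustration: a percentage in the range 0..=100.
#[derive(Debug, PartialEq)]
struct Percent(u8);

impl TryFrom<i32> for Percent {
    // A plain String keeps the sketch short; a real library would
    // probably define a dedicated error type.
    type Error = String;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        if (0..=100).contains(&value) {
            Ok(Percent(value as u8))
        } else {
            Err(format!("{} is out of range 0..=100", value))
        }
    }
}

fn main() {
    // A value inside the range converts successfully.
    println!("{:?}", Percent::try_from(42_i32));
    // A value outside the range yields an Err instead of panicking.
    println!("{:?}", Percent::try_from(101_i32));
}
```

Implementing TryFrom also provides TryInto in the opposite direction through the blanket implementation in the standard library.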

Parametrization of generic types with constants

I don’t know whether there is an RFC for this already, but it would be nice to be able to pass constants such as a usize (and maybe some other types) to generic types. Let’s say I want to implement a structure that calculates a moving average of length SIZE. The pseudo-code may look like this:

struct MovingAverage<SIZE: usize> {
    array: [f64; SIZE]
}

let ma: MovingAverage<10> = MovingAverage::new();

Note: for this particular case one can work around the limitation by parameterizing MovingAverage with an array type. For example:

struct MovingAverage<T> {
    array: T
}

let ma: MovingAverage<[f64; 10]> = MovingAverage::new();
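For comparison, here is how the same idea can be written with const generics, which were stabilized later (Rust 1.51). This is only a sketch: the field names, the push method and the averaging logic are my own assumptions for illustration, and SIZE is assumed to be greater than zero.

```rust
/// Moving average over a fixed window; SIZE is a compile-time constant.
/// Assumption: SIZE > 0 (a zero-sized window would panic in `push`).
struct MovingAverage<const SIZE: usize> {
    window: [f64; SIZE],
    next: usize,   // index of the slot that will be overwritten next
    filled: usize, // number of slots that hold real samples
}

impl<const SIZE: usize> MovingAverage<SIZE> {
    fn new() -> Self {
        MovingAverage { window: [0.0; SIZE], next: 0, filled: 0 }
    }

    /// Stores a sample and returns the average of the samples seen so far,
    /// limited to the last SIZE of them.
    fn push(&mut self, value: f64) -> f64 {
        self.window[self.next] = value;
        self.next = (self.next + 1) % SIZE;
        if self.filled < SIZE {
            self.filled += 1;
        }
        // Until the window is full, only the first `filled` slots are valid.
        let sum: f64 = self.window[..self.filled].iter().sum();
        sum / self.filled as f64
    }
}

fn main() {
    let mut ma: MovingAverage<3> = MovingAverage::new();
    for sample in [1.0, 2.0, 3.0, 4.0].iter() {
        println!("{}", ma.push(*sample));
    }
}
```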

Shared trait bounds

Sometimes when I deal with generic types and there are too many trait bounds, the code gets monstrous, and the worst is that I need to duplicate it.

Consider the following example:

impl<A: X, B: Y, C: Z> Foo<A, B, C> {
    ...
}

impl<A: X, B: Y, C: Z> Bar<A, B, C> {
    ...
}

impl<B: Y, C: Z> Baz<B, C> {
    ...
}

It would be cool to be able to define the trait bounds only once for implementation of all structures, like in the following pseudo-code:

scope<A: X, B: Y, C: Z> {
    impl Foo<A, B, C> {
        ...
    }

    impl Bar<A, B, C> {
        ...
    }

    impl Baz<B, C> {
        ...
    }
}
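Until something like this exists, one possible workaround is to bundle the type parameters into a single trait with associated types, so that the bounds are written only once. The sketch below is an assumption-laden illustration rather than a recommendation: X, Y and Z are placeholder traits with made-up methods, and Params, MyParams, Ax, By and Cz are hypothetical names.

```rust
use std::marker::PhantomData;

// Placeholder traits standing in for X, Y and Z from the example above.
trait X { fn x() -> &'static str; }
trait Y { fn y() -> &'static str; }
trait Z { fn z() -> &'static str; }

/// Hypothetical bundle: the bounds on A, B and C are declared once, here.
trait Params {
    type A: X;
    type B: Y;
    type C: Z;
}

struct Foo<P: Params>(PhantomData<P>);
struct Baz<P: Params>(PhantomData<P>);

// Each impl now repeats only `P: Params` instead of all three bounds.
impl<P: Params> Foo<P> {
    fn describe() -> String {
        format!(
            "Foo<{}, {}, {}>",
            <P::A as X>::x(),
            <P::B as Y>::y(),
            <P::C as Z>::z()
        )
    }
}

impl<P: Params> Baz<P> {
    fn describe() -> String {
        format!("Baz<{}, {}>", <P::B as Y>::y(), <P::C as Z>::z())
    }
}

// Concrete types used only to demonstrate the pattern.
struct Ax;
struct By;
struct Cz;
impl X for Ax { fn x() -> &'static str { "Ax" } }
impl Y for By { fn y() -> &'static str { "By" } }
impl Z for Cz { fn z() -> &'static str { "Cz" } }

struct MyParams;
impl Params for MyParams {
    type A = Ax;
    type B = By;
    type C = Cz;
}

fn main() {
    println!("{}", Foo::<MyParams>::describe());
    println!("{}", Baz::<MyParams>::describe());
}
```

The trade-off is that callers have to define a marker type such as MyParams for every combination of parameters, so this only pays off when the same set of bounds is repeated many times.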

Crazy generic types are hard to read in error messages

If one happens to work with a big chain of iterators or futures, they could see error messages with huge dreadful generic types.

I’ll take one relatively simple example from reddit to illustrate what I mean:

--> examples\echo_client_server.rs:51:22
   |
51 |                     .boxed()
   |                      ^^^^^ within `futures::AndThen<Box<futures::Future<Error=std::io::Error, Item=Box<line::Client>> + std::marker::Send>, futures::Map<Box<futures::Future<Error=std::io::Error, Item=std::string::String>>, [closure@examples\echo_client_server.rs:46:34: 49:30 client:_]>, [closure@examples\echo_client_server.rs:44:32: 50:22 i:_]>`, the trait `std::marker::Send` is not implemented for `futures::Future<Error=std::io::Error, Item=std::string::String>`

It’s quite hard to understand the data type at first glance. I prefer to manually reformat such complex types into a readable multi-line representation:

futures::AndThen<
    Box<
        futures::Future<
            Error=std::io::Error,
            Item=Box<line::Client>
        >
        + std::marker::Send
    >,
    futures::Map<
        Box<
            futures::Future<
                Error=std::io::Error,
                Item=std::string::String
            >
        >,
        [closure@examples\echo_client_server.rs:46:34: 49:30 client:_]
    >,
    [closure@examples\echo_client_server.rs:44:32: 50:22 i:_]
>

But it would be nice if rustc could emit similar error messages for me.
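In the meantime, the manual reformatting can be roughly automated. Here is a minimal sketch of a hypothetical helper, pretty_type (my own name, not part of rustc or any crate), that breaks a type string on angle brackets and commas. It is deliberately naive: for example, it would mangle types containing -> and it puts every generic argument on its own line, so the result is similar to my hand-formatted version but not identical.

```rust
/// Naive pretty-printer for generic type names taken from error messages.
/// Opens a new indented line after `<` and `,`, and dedents before `>`.
fn pretty_type(ty: &str) -> String {
    const INDENT: &str = "    ";
    let mut out = String::new();
    let mut depth = 0usize;

    for ch in ty.chars() {
        match ch {
            '<' => {
                depth += 1;
                out.push('<');
                out.push('\n');
                out.push_str(&INDENT.repeat(depth));
            }
            '>' => {
                // saturating_sub guards against unbalanced input.
                depth = depth.saturating_sub(1);
                out.push('\n');
                out.push_str(&INDENT.repeat(depth));
                out.push('>');
            }
            ',' => {
                out.push(',');
                out.push('\n');
                out.push_str(&INDENT.repeat(depth));
            }
            // Skip spaces that directly follow a line break, an indent
            // or another space (e.g. the space after a comma).
            ' ' if out.ends_with(' ') || out.ends_with('\n') => {}
            _ => out.push(ch),
        }
    }
    out
}

fn main() {
    let ty = "futures::Map<Box<futures::Future<Error=std::io::Error, Item=std::string::String>>, F>";
    println!("{}", pretty_type(ty));
}
```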

Large source files

I’ve noticed that many popular Rust libraries, including the standard library, contain large source files (> 1000 LOC). It’s probably a matter of taste, but I would prefer to keep things more granular: one entity (structure + functions) per file. In my opinion, a code base organized like this is easier to read and maintain.

More Rust in production

It would be pleasant to see more production usage of Rust in 2018 and more new job positions for Rust developers opened =)

How to Run Rust Tests Automatically

If, like me, you prefer a TDD approach to writing Rust code, you need fast feedback from your tests.

TDD cycle

Running cargo test after every change to the code base quickly becomes a chore, but you can automate it with the cargo-watch plugin:

cargo watch -x test

Now cargo test will run automatically on every change in the source files.

However, this can still be inconvenient, because you need to switch between your text editor and a terminal to see the result.

To overcome this problem we can use desktop notifications, so I’ve decided to create another small cargo plugin, cargo-testify.

You can install it with the following command:

cargo install cargo-testify

And then run it within your Rust project:

cargo testify

It detects changes in source files, runs the tests, and shows friendly desktop notifications, just like in the GIF below:

cargo-testify

Announcing Crystalium Organization

Hello, dear Crystal community!

I discovered Crystal in December 2015. It was so exciting that I wanted to learn it! And the best way to learn a programming language is to use it. So I started looking for what was missing in the Crystal ecosystem and what would be interesting for me to implement. That’s how I ended up with 6-8 Crystal projects in my GitHub account that seem to be used by other people.

Since then my life and my interests have changed. My focus has moved to different things I’d like to learn, and I see that I don’t have enough time to properly maintain the projects I started. Some of them have had open issues and pull requests for a pretty long time. I thought I would address them in a few days, or in a few weeks. But months passed and it has not happened. So I have to apologize.

I recall the book “The Cathedral and the Bazaar” by Eric S. Raymond, who said that if one cannot maintain an open source project, it should be transferred to another maintainer if possible.

So I’ve decided to create the Crystalium GitHub organization, where I moved some of my projects:

  • icr - Interactive console for Crystal programming language
  • cossack - HTTP client with middleware support
  • jwt - JSON web tokens implemented in Crystal
  • kiwi - unified interface for key-value stores
  • leveldb - Crystal bindings for LevelDB
  • bloom_filter - bloom filter implementation

If you’d like to become a member of the organization and maintain some of these projects in collaboration with others please let me know by sending an email to blake131313 at gmail.

If you are a designer and you’d like to create a logo you are welcome as well.

Thanks!

UPDATE:

I got an invitation to move the projects to crystal-community, which makes sense to me. I believe it’s a better option than creating another separate organization.

Exposing a Rust Library to C

Intro

Recently I’ve ported the whatlang library to C (whatlang-ffi) and I’d like to share some of the experience.

DISCLAIMER: I am not a professional C/C++ developer, so it means:

  • I will describe some things that may look very obvious.
  • The outcome probably will not be 100% idiomatic C code.
  • If you know how some things can be done better, please let me know by writing a comment.

Hello from Rust example

First, let’s make a minimal C program that calls Rust.

cargo new whatlang-ffi
cd whatlang-ffi
mkdir examples

Add this to Cargo.toml:

[lib]
name = "whatlang"
crate-type = ["staticlib", "cdylib"]

It tells cargo that we want to build both a static library (.a) and a C-compatible dynamic library (a .so file on Linux).

In src/lib.rs we implement a small function that prints a message to stdout:

#[no_mangle]
pub extern "C" fn print_hello_from_rust() {
    println!("Hello from Rust");
}

Introduction to Rust Whatlang Library and Natural Language Identification Algorithms

I’d like to announce a new Rust library, whatlang. Its purpose is to detect the natural language of a given text. Let me show you a quick example:

extern crate whatlang;

use whatlang::detect;

fn main() {
    // A sentence in German
    let text = "Das ist einfach Deutsch.";

    // Detect the language and unwrap the information
    let info = detect(&text).unwrap();

    // Print an ISO 639-3 language code (e.g. "eng", "rus", "deu", etc)
    println!("Detected language: {}", info.lang().to_code());

    // Print a script (e.g. "Latin", "Cyrillic", "Arabic", etc)
    println!("Script: {:?}", info.script());

    // Can we rely on this information?
    println!("Is reliable: {}", info.is_reliable());
}

The output:

Detected language: deu
Script: Latin
Is reliable: true

NLP, Toki Pona and Ruby. Part 2: Language Detector

Previous articles:

In the first article we created a simple tokenizer, today we’re going to create a language detector to identify Toki Pona text among other texts.

First I want to say that there are at least a few good libraries for detecting natural languages in Ruby:

But those are for mainstream languages: French, English, German… We want Toki Pona! Also, since we are focused on Toki Pona only, we can get much more precise results.

NLP, Toki Pona and Ruby: Part 1

Intro

During the last few years, I spent a lot of time learning foreign languages like Esperanto, Spanish and German. After a while, I came up with the idea that I could apply this knowledge in computer science.

When I decided this, I was completely new to Computational Linguistics (CL) and Natural Language Processing (NLP). However, after reading a number of articles I got some basic ideas.

What I am gonna do

To dive into CL/NLP I’ve decided to implement a Toki Pona -> English translator from scratch. It will be interesting to see which issues I face and how I solve them. It will take me through a number of stages of language processing:

  • Lexical analysis
  • Language detection (I want to distinguish Toki Pona from other languages)
  • Morphological analysis (actually will be skipped because of simplicity of Toki Pona)
  • Syntax analysis
  • Word translation
  • Syntax tree conversion
  • Generation of final translation with respect to English grammar.

Anyway, this list is not strict, and probably it will be modified in the future.

What I am not gonna do

There are many tools and libraries that already exist in Ruby for NLP. I am not gonna use any of them here, nor cover them in the articles. If you need something like that, please take a look at ruby-nlp. It’s a document that gathers a variety of NLP tools implemented in Ruby.

Lazy Object Pattern in Ruby

A few days ago my colleague Arthur Shagall, while reviewing my code, suggested that I use the Lazy Object pattern to postpone some calculations that ran at load time. I hadn’t heard of the pattern before, and even googling it didn’t give me much information. So I have decided to write this article to cover the topic.

Intention

Lazy Object allows you to postpone some calculation until the moment when the actual result of the calculation is used. That may help you to speed up booting of the application.

Implementation

It is pretty simple. We create a proxy object that takes a calculation block as its property and executes it on the first method call.

class LazyObject < ::BasicObject
  def initialize(&callable)
    @callable = callable
  end

  def __target_object__
    @__target_object__ ||= @callable.call
  end

  def method_missing(method_name, *args, &block)
    __target_object__.send(method_name, *args, &block)
  end
end

Usage example 1

A constant assignment like this:

SQUARES = Array.new(10) { |i| i ** 2 }

Could be converted to this one:

SQUARES = LazyObject.new { Array.new(10) { |i| i ** 2 } }

So now if you want to use SQUARES it still behaves like an array:

SQUARES.class  # => Array
SQUARES.size   # => 10
SQUARES        # => [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Usage example 2

Let’s say you have the models State and Address in your Rails application. What you want to do is validate the inclusion of address.state in the states.

You can just hard-code the list of states:

class Address < ::ActiveRecord::Base
  STATES = ["AL", "AK", "AZ", "AR", "CA", "CO"]   # and so on

  validates :state, inclusion: { in: STATES }
end

But it does not reflect changes in the DB in any way.

Alternatively, you can fetch the values from the DB:

STATES = State.all.map(&:code)

It looks better, but there are 2 possible pitfalls:

  • It increases load time (1 more SQL query)
  • It may cause real trouble if STATES is initialized before the State model is seeded. In this case STATES will be empty.

So that is the situation where Lazy Object is useful:

STATES = LazyObject.new { State.all.map(&:code) }

Ruby gem

If you prefer to have it as a Ruby gem, please take a look at rubygems.org/gems/lazy_object.

Thanks for reading!

Ignore Files With Git Locally

Sometimes it’s necessary to ignore some files in a repository only locally. For Rails developers it’s often the ./config/database.yml file, since every developer has their own database configuration.

With git this is easy to achieve: we can instruct git not to track changes in certain files:

git update-index --assume-unchanged ./config/database.yml

Next time we type git status the changes in ./config/database.yml won’t be shown.

If you need to track that file again, just do:

git update-index --no-assume-unchanged ./config/database.yml

(pay attention to the --no- prefix).

Thanks!

How to Compare Audio in Ruby

Or how to implement a sound_like RSpec matcher

The problem I’m trying to solve in this article is the comparison of two audio files. We’ll figure out how to verify that they sound similar.

I was developing an application that deals with audio processing, and I had to write a test verifying that the output audio file matches one from the fixtures. So I decided to compare the audio binaries like this:

expect(File.read('outcome.mp3')).to eq File.read('fixture.mp3')

And it worked!

But soon my colleagues let me know I had broken the build. It turned out that outcome.mp3 generated on their MacBooks didn’t match fixture.mp3 generated on my Linux laptop, even though both sounded exactly the same. Probably we had different codecs. So I had to come up with a better idea.