Introduction

Lockjaw is a fully static, compile-time dependency injection framework for Rust inspired by Dagger.

Why use a dependency injection framework

The main purpose of dependency injection is to separate the concerns of creating an object and using an object. In larger projects creating an object soon becomes a complicated process, since creating an object often requires other objects that need to be created first (its dependencies). Once the object dependency graph grows deep and wide, adding a new edge or node becomes a painful endeavor. For example, you may need to modify the signatures of a dozen methods and interfaces just to pass the new object all the way to the site where it is actually going to be used. This often ends up with anti-patterns like god objects and global state holding objects so the path can be shortened.

The dependency injection technique alleviates this problem by making each object receive only the objects it needs. The object no longer needs to know what is required to create those objects, nor does it have to change when its indirect dependencies change. Responsibility for creating objects is moved to factories.

A dependency injection framework further manages this by automating the creation of the factories. Users simply specify what each object needs, which implementation to use for an interface, etc., and the objects can be created.

Why use Lockjaw as the dependency injection framework

Lockjaw is inspired by Dagger, which is a mature dependency injection framework for Java. Lockjaw has feature parity with Dagger (except producers, which may not be too useful in Rust with async/await available).

Main features:

  • Compile time dependency resolution
    • Lockjaw makes sure all dependencies are fulfilled at compile time. The code will fail to compile if a dependency is missing, if there are duplicated bindings for the same type, or if the dependency graph has cycles. These problems never surface as runtime errors, which are harder to detect.
  • Relatively readable diagnostic messages.
    • When a dependency is missing Lockjaw tries to tell you why it is even in the dependency graph, and where the dependency cycle is.
  • Cross-crate injection
    • Lockjaw is designed to be used across crates. Clients are able to inject bindings provided by libraries if they also use Lockjaw.
  • Minimal generated code surface
    • While procedural macros are utilized heavily by Lockjaw, it avoids directly modifying the code the attribute macros are placed on. Only a few generated methods are visible to the user. This is especially important since most Rust IDEs today do not understand the output of procedural macros; a few extra type hints on let expressions are enough to make autocomplete functional.
  • Optional binding, Multibinding, and generated components for plugin systems
    • Lockjaw allows inversion of control between library crates and their users. A library is able to define hooks for clients that depend on the library to inject. This is especially useful for testing multiple clients of a library in isolation.

Why NOT use Lockjaw as the dependency injection framework

While the generated code is generally OK-ish, abhorrent techniques are used to generate it under the confinement of the current Rust proc-macro system, and they are extremely fragile.

Please read the "Before Using Lockjaw" chapter before using Lockjaw. Lockjaw currently cannot be recommended for serious work in good conscience. YOU HAVE BEEN WARNED.

Before Using Lockjaw

Before using Lockjaw you should carefully consider whether it is really for you. Having compile time dependency injection is wonderful, but the current Rust proc-macro system does not make it easy, and there are a lot of trade-offs.

Robustness

Lockjaw is experimental in its nature, with the main goal of reaching feature parity with Dagger. It tries to answer:

  • What would dependency injection look like in Rust?
    • Is it even possible?
    • Is it going to be useful?
    • How should lifetimes and borrows be handled?
  • What are the hurdles when trying to implement a dependency injection framework in Rust?

Currently Lockjaw focuses on "can" instead of "should". If there is any hacky undocumented compiler behavior we can abuse to implement a feature, it will be used. While we try to make it bug-free and generate safe code in the current version of Rust, there is no guarantee it will continue to work in the future.

See the caveats section for all the horrible hacks used.

Future efforts might be made to get the language features Lockjaw needs implemented in Rust, but that is a very long road.

Maintenance

Lockjaw is quite a complicated project, but it is built by a developer with a short attention span and other priorities, as a hobby project, while also trying to learn Rust. Do not expect continuous support, and especially consider that a newer Rust can break Lockjaw at any moment.

Irreversibility

Dependency injection frameworks are very invasive and will change how code is written. If you decide to remove Lockjaw in the future, you may need to rewrite the whole project.

Only use Lockjaw if...

  • You also like to live dangerously.
  • You are experimenting with things.
  • You are working on a small project, and it won't be maintained in the future.
  • Someone else will be maintaining the project in the future, and you are a horrible individual.
  • You love dependency injection so much you are willing to patch bugs yourself, and freeze the Rust version if it breaks Lockjaw.

Glossary

Binding

A recipe to build an instance of a type. It includes dependencies (other types that need to be prepared before the instance can be created), and a way to transform the dependencies into a new instance, either by a user-supplied method or by Lockjaw internal generation.

Module

In all Lockjaw documentation, "module" refers to a dependency injection module. When referring to Rust modules, mod will be used instead.

Changes

0.3

  • Dependency gathering is now done through the build script instead of proc_macro.
    • This removes the need for cross macro communication as all info is now readily available when proc_macro runs.
    • Sources are parsed twice, once in the build script and once by the compiler. However, this is already required to perform path resolution.
    • The proc_macro now only generates code, using the result from the build script. The proc macro no longer needs to resolve global path names.
    • Path resolution/dependency gathering now uses syn to parse the source instead of tree-sitter-rust, to be more consistent with how proc_macro parses the code.
  • The lockjaw::prologue!(path/to/src) is removed.
    • The build script is able to directly infer the source path.
  • No longer need to specify root/test in lockjaw::epilogue

Project Setup

Cargo

Add lockjaw to the [dependencies] and [build-dependencies] sections of your Cargo.toml:

[dependencies]
lockjaw = "*"

[build-dependencies]
lockjaw = "*"

The proc_macro and runtime library are packaged into the same crate, so this is the only target you need. While the proc_macro library is heavy, Rust should be able to optimize it away in the resulting binary. The runtime is pretty light, and the generated code is supposed to be a zero-cost abstraction.

Build script

Lockjaw also needs some environment setup, and requires a build script. Add build.rs next to Cargo.toml, and call lockjaw::build_script() in main() inside it:

// https://github.com/azureblaze/lockjaw/tree/main/userguide/projects/setup/build.rs
fn main() {
    lockjaw::build_script();
}

The build script is required at the 'root' of your project, including binaries and any sub-crate with tests. Lockjaw will ask you to add it if this step is missing.

The build script scans through all sources under the crate and its dependencies to locate any bindings that should be a part of the dependency graph. This is required as path resolution cannot be done in a proc_macro.

Epilogue macro

You also must call the lockjaw::epilogue!() macro in the root of your root crate (lib.rs or main.rs).

// https://github.com/azureblaze/lockjaw/tree/main/userguide/projects/setup/src/main.rs
lockjaw::epilogue!();


Injecting Objects

Lockjaw can create objects for you, but you need to let Lockjaw know how to create each object and what is needed to create it. These recipes for object creation are called bindings. Each binding forms a node in a dependency graph, with edges to the other bindings it depends on.

The simplest binding is the constructor injection binding.

Constructor injection

A binding can be created for a struct by using a constructor method, which is a static method associated with the struct that returns a new instance. The field constructor (Foo {a : 0, ...}) is not directly used by Lockjaw, since methods are more expressive when a non-injected field needs a default value or when transformations are needed on the input.

A struct can be made injectable by marking a struct impl block with the #[injectable] attribute, and then marking the constructor method as #[inject].

struct Foo {}

#[injectable]
impl Foo {
    #[inject]
    pub fn new() -> Foo {
        Foo {}
    }
}

Now Lockjaw understands that when trying to create Foo, it should call Foo::new().

Note that since it is an associated method, constructor injection only works on a type you own (one whose implementation you can actually change). For foreign types, such as those from imported crates, a different method will be discussed in the providing objects chapter.

Constructor dependencies

To create an object, many objects of other types may be needed. These are called dependencies. In this example, Bar depends on having an instance of Foo in order to be created.

In constructor injection, dependencies are listed as the constructor's parameters. Lockjaw will try to use available bindings to create all arguments and pass them to the constructor. If a parameter does not have a binding, Lockjaw will fail compilation with a missing binding error.

Note that since we are not asking Lockjaw to actually create the object yet, binding validation won't be performed. Lockjaw is assuming there are some bindings elsewhere that it does not know about yet.

struct Bar {
    foo: Foo,
    i: i32,
}

#[injectable]
impl Bar {
    #[inject]
    pub fn new(foo: Foo) -> Bar {
        Bar { foo, i: 42 }
    }
}

If the struct has other fields that can be initialized without injection, like i, they can be directly assigned. If the object needs a runtime value (for example, if i needs to be assigned by the caller with a user input), then factories will be needed, which will be discussed later.

Manual injection

For a moment let's forget about Lockjaw, and try to do dependency injection manually. With the binding information we have we can write a factory that can create the objects we just defined:

struct Factory {}

impl Factory {
    pub fn create_foo(&self) -> Foo {
        Foo::new()
    }

    pub fn create_bar(&self) -> Bar {
        Bar::new(self.create_foo())
    }
}

#[test]
fn test() {
    let factory = Factory {};

    let _foo = factory.create_foo();
    let _bar = factory.create_bar();
}

Note that there is one method for each binding, only taking &self and returning the binding type. Inside the method it calls the constructor method we just marked, and calls other binding methods to generate the argument.

The factory is an object instead of just methods, since it might need to carry state in the future (for example, returning a reference to a shared object owned by the factory).
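To illustrate why the factory being an object matters, here is a minimal hand-written sketch of a factory that owns a shared object and hands out references to it. The names Config, Greeting, and StatefulFactory are hypothetical and exist only for this example; none of this is generated by Lockjaw.

```rust
/// Hypothetical shared object owned by the factory.
pub struct Config {
    pub name: String,
}

/// Hypothetical consumer that borrows the shared object.
pub struct Greeting<'a> {
    config: &'a Config,
}

impl<'a> Greeting<'a> {
    pub fn new(config: &'a Config) -> Greeting<'a> {
        Greeting { config }
    }

    pub fn text(&self) -> String {
        format!("hello, {}", self.config.name)
    }
}

/// A hand-written factory that carries state.
pub struct StatefulFactory {
    config: Config,
}

impl StatefulFactory {
    pub fn new(name: &str) -> StatefulFactory {
        StatefulFactory {
            config: Config { name: name.to_owned() },
        }
    }

    /// Returns a reference to the single `Config` owned by the factory.
    pub fn config(&self) -> &Config {
        &self.config
    }

    /// Creates a new `Greeting` on each call, wired to the shared `Config`.
    pub fn create_greeting(&self) -> Greeting<'_> {
        Greeting::new(self.config())
    }
}
```

Every Greeting created by the same factory borrows the same Config, which free functions without a self parameter could not offer.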

Writing the factory by hand gets complicated and boring fast. In the next chapter we will ask Lockjaw to generate it.


Requesting objects

In the last chapter we manually wrote an object factory:

struct Factory {}

impl Factory {
    pub fn create_foo(&self) -> Foo {
        Foo::new()
    }

    pub fn create_bar(&self) -> Bar {
        Bar::new(self.create_foo())
    }
}

#[test]
fn test() {
    let factory = Factory {};

    let _foo = factory.create_foo();
    let _bar = factory.create_bar();
}

Now we will ask Lockjaw to automatically generate it. A factory generating objects using a certain dependency graph is called a Component in Dagger terminology.

Defining the component

#[component]
trait MyComponent {
    fn create_foo(&self) -> Foo;

    fn create_bar(&self) -> Bar;
}

Compared to the manual factory there are not a lot of changes. Since Lockjaw cannot generate the implementation immediately, we use a trait instead of a struct, as the interface has to be abstract. The trait is then annotated with the #[component] attribute that instructs Lockjaw to generate the implementation. Lockjaw is able to identify all #[inject] bindings automatically.

For every type we wish to be able to directly create from the component, a method returning the type should be added to the trait. Like the manual factory, the method should take &self (and nothing else) so it may further use the component to create the dependencies it needs. The name of the method does not really matter, but since you are going to call it later you'd probably want something sensible.

Note that if a type is not directly needed, the trait does not need a method for it. For example, while Bar needs Foo, if we are never going to use Foo in main() we can delete the create_foo() method. The trait methods do not affect Lockjaw internal generation; they only declare what needs to be publicly provided.

Creating and using the component

Since create_foo(&self) is a method, we need to create the component to be able to call it. Lockjaw generates a static build() method on the trait that can be used to create the component.

#[test]
fn test() {
    let component: Box<dyn MyComponent> = <dyn MyComponent>::build();

    let _foo = component.create_foo();
    let _bar = component.create_bar();
}

build() returns Box<dyn COMPONENT>. Most IDEs today do not understand symbols generated by procedural macros, so you might want to manually hint the type in the let expression.

Once the component is created, the trait methods can be used to create the objects.
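To make the shape of the component more concrete, the following sketch writes by hand roughly what a generated implementation could look like. It is an approximation for illustration only, not Lockjaw's actual output: MyComponentImpl is a hypothetical name, and the attributes are omitted so the code compiles without Lockjaw.

```rust
pub struct Foo {}

impl Foo {
    pub fn new() -> Foo {
        Foo {}
    }
}

pub struct Bar {
    pub foo: Foo,
    pub i: i32,
}

impl Bar {
    pub fn new(foo: Foo) -> Bar {
        Bar { foo, i: 42 }
    }
}

/// The user-visible interface, as in the chapter (attribute omitted).
pub trait MyComponent {
    fn create_foo(&self) -> Foo;
    fn create_bar(&self) -> Bar;
}

/// Hypothetical stand-in for the implementation a macro would generate.
struct MyComponentImpl {}

impl MyComponent for MyComponentImpl {
    fn create_foo(&self) -> Foo {
        Foo::new()
    }

    fn create_bar(&self) -> Bar {
        // Dependencies are resolved by calling other binding methods.
        Bar::new(self.create_foo())
    }
}

/// An inherent method on the trait object type, which is what makes
/// the `<dyn MyComponent>::build()` call syntax resolve.
impl dyn MyComponent {
    pub fn build() -> Box<dyn MyComponent> {
        Box::new(MyComponentImpl {})
    }
}
```

The caller only ever names the trait, so the concrete implementing type can stay hidden.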


Providing Objects

In this chapter we will discuss another way to create bindings.

Limitations of constructor injection bindings

Constructor injections are designed to be the only way to create an instance of the type they bind. It is even considered bad practice to call the constructor manually. Hence, constructor injections have some limitations:

  • Constructor injections can only be done on owned types (they can only be defined by the mod that defines the type itself).
    • If you don't own the type you should not say something is the only way to create it.
  • Can only create concrete types
    • Sometimes you may want to bind traits and swap the implementation, maybe at runtime.

Modules

Obviously Lockjaw is not going to ask the world to use it or the user to rewrite everything they use with it, so it gives other ways to bind types. Since these bindings are no longer the "one true way to create things", and different bindings for the same type may be needed within the same program, the user needs to be able to select which bindings to use in each dependency graph.

In Lockjaw, these elective bindings are defined in modules, and the component can choose which modules to install, which imports their bindings. Note that in Lockjaw documentation modules always refer to dependency injection modules, and mod will be used to refer to Rust modules.

To declare a module with Lockjaw the #[module] attribute should be used to mark the impl block of a struct.

struct MyModule {}

#[module]
impl MyModule {
  ...
}

The impl block will contain the binding definitions.

For now the modules should be static (without fields). Modules with fields will be discussed in the builder modules chapter.

#[provides] bindings

The #[provides] binding annotates a method that returns the type.

    #[provides]
    pub fn provide_i32() -> i32 {
        42
    }

Like #[inject], the #[provides] method can also request other bindings from the dependency graph, and produce the target value with it.

    #[provides]
    pub fn provides_string(i: i32) -> String {
        format!("{}", i)
    }

Installing modules

#[module] on its own is just a collection of bindings and does not do anything. It must be installed in a #[component] to join the dependency graph. This is done by listing the module type in the modules metadata of the component.

#[component(modules: [MyModule])]
trait MyComponent {
    fn i32(&self) -> i32;

    fn string(&self) -> String;
}

A lot of Lockjaw attribute macros also take metadata arguments, which are comma-separated key : value pairs in parentheses. The values are usually string literals, integers, types, arrays of values (foo : [value 1, value 2]), or more metadata (foo : { key : value }). In this case modules takes an array of types (each of which is a #[module]).

Providing traits is a bit more complicated and will be discussed later.
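As a rough mental model, installing a module means the component's methods end up calling the module's #[provides] methods, with their parameters filled in from other bindings. The hand-written sketch below shows that wiring without Lockjaw; ManualComponent is a hypothetical name, and the real generated code may differ.

```rust
pub struct MyModule {}

impl MyModule {
    /// Binding for `i32` with no dependencies.
    pub fn provide_i32() -> i32 {
        42
    }

    /// Binding for `String` that depends on the `i32` binding.
    pub fn provide_string(i: i32) -> String {
        format!("{}", i)
    }
}

/// Hypothetical hand-written equivalent of a component with `MyModule` installed.
pub struct ManualComponent {}

impl ManualComponent {
    pub fn i32(&self) -> i32 {
        MyModule::provide_i32()
    }

    pub fn string(&self) -> String {
        // The `i32` argument is resolved from the graph before the call.
        MyModule::provide_string(self.i32())
    }
}
```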


Builder Modules

In the previous chapter the modules had no fields, so Lockjaw could easily create instances of them (in fact no instances are created, since Lockjaw can just call the static methods). However, sometimes we want to be able to affect the values that are bound at runtime, and hence need to be able to change what #[provides] does using fields in the module.

struct MyModule {
    i: i32,
}

#[module]
impl MyModule {
    #[provides]
    pub fn provide_i32(&self) -> i32 {
        self.i
    }
}

Since the struct now has fields, Lockjaw can no longer create it automatically, so the user must manually pass in the modules when creating the component.

Implementation note: While Lockjaw can also try to use Default or some other mechanisms, usages like this implies the module has mutable state and generally is a bad idea.

Using builder modules

Instead of passing the runtime modules to the component one by one, they are collected in a single struct annotated by the #[builder_modules] attribute.

#[builder_modules]
struct MyBuilderModules {
    my_module: MyModule,
}

Every field in the struct should be a module. Using a struct makes sure each module will be required by the compiler/IDE while exposing the least amount of generated code (which is harder for users and IDEs to understand; it is better to spell everything out in visible code).

The #[builder_modules] can then be installed in the component using the builder_modules metadata.

#[component(builder_modules: MyBuilderModules)]
trait MyComponent {
    fn i32(&self) -> i32;
}

The component only accepts one #[builder_modules], which is likely to be specifically tailored for the component. The modules themselves can be shared.

The builder_modules metadata can be used at the same time with the modules metadata. modules should be preferred whenever possible as they are easier to use.

Creating components with builder modules

If the builder_modules metadata is specified, the #[builder_modules] struct will become the parameter for the build() method of the component which the user must pass.

#[test]
fn test() {
    let my_module = MyModule { i: 42 };
    let my_builder_modules = MyBuilderModules { my_module };
    let component: Box<dyn MyComponent> = <dyn MyComponent>::build(my_builder_modules);
    assert_eq!(component.i32(), 42);

    let other_component: Box<dyn MyComponent> = <dyn MyComponent>::build(MyBuilderModules {
        my_module: MyModule { i: 123 },
    });
    assert_eq!(other_component.i32(), 123);
}
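For comparison, the same flow can be sketched by hand without Lockjaw: the component stores the builder modules struct and forwards each request to the module instance. ManualComponent is a hypothetical name used only here, and the generated implementation may be structured differently.

```rust
pub struct MyModule {
    pub i: i32,
}

impl MyModule {
    /// Binding whose value depends on the module's runtime state.
    pub fn provide_i32(&self) -> i32 {
        self.i
    }
}

/// Collects every module that must be supplied at build time.
pub struct MyBuilderModules {
    pub my_module: MyModule,
}

/// Hypothetical hand-written component that owns the supplied modules.
pub struct ManualComponent {
    modules: MyBuilderModules,
}

impl ManualComponent {
    pub fn build(modules: MyBuilderModules) -> ManualComponent {
        ManualComponent { modules }
    }

    pub fn i32(&self) -> i32 {
        self.modules.my_module.provide_i32()
    }
}
```

Because MyBuilderModules is an ordinary struct, forgetting a module is a plain missing field error from the compiler.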


Binding traits

#[provides] trait

A trait can be provided using the #[provides] binding.

    #[provides]
    pub fn provide_i32_maker(impl_: I32MakerImpl) -> Box<dyn I32Maker> {
        Box::new(impl_)
    }

However, Lockjaw is particular about how a trait object is requested from the dependency graph. The concrete implementation of the trait may contain references to the component, but ideally this is not something the consumer of the trait should care about, so Lockjaw enforces that any trait object it provides must not outlive the component. The worst case, 'ComponentLifetime, is always assumed, so consumers don't have to change when an implementation actually starts borrowing from the component.

The Box returned by the component must be bound by the component's lifetime (the same as self).

#[component(modules: [MyModule])]
trait ProvideComponent {
    fn i32_maker(&'_ self) -> Box<dyn I32Maker + '_>;
}
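The reason for the + '_ bound can be seen in plain Rust. In the sketch below, written without Lockjaw, the implementation borrows a field from a hand-written component, so the returned trait object cannot be allowed to outlive it. BorrowingMaker, ManualComponent, and the make() method are hypothetical names for this example.

```rust
pub trait I32Maker {
    fn make(&self) -> i32;
}

/// Hypothetical implementation that borrows state from the component.
struct BorrowingMaker<'a> {
    base: &'a i32,
}

impl I32Maker for BorrowingMaker<'_> {
    fn make(&self) -> i32 {
        *self.base + 1
    }
}

/// Hypothetical hand-written component owning the borrowed state.
pub struct ManualComponent {
    base: i32,
}

impl ManualComponent {
    pub fn new(base: i32) -> ManualComponent {
        ManualComponent { base }
    }

    /// `+ '_` ties the trait object to the borrow of `self`.
    /// A plain `Box<dyn I32Maker>` would default to `'static`
    /// and this body would be rejected by the compiler.
    pub fn i32_maker(&self) -> Box<dyn I32Maker + '_> {
        Box::new(BorrowingMaker { base: &self.base })
    }
}
```

A consumer written against Box<dyn I32Maker + '_> keeps working whether or not the implementation borrows from the component.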

#[binds] trait

While #[provides] kind of works, binding an implementation to a trait interface is a common operation so Lockjaw has the #[binds] attribute to make them easier to use.

For an interface and an implementation:

pub trait Logger {
    fn log(&self, msg: &str);
}

struct StdoutLogger;

#[injectable]
impl StdoutLogger {
    #[inject]
    pub fn new() -> StdoutLogger {
        StdoutLogger
    }
}

impl Logger for StdoutLogger {
    fn log(&self, msg: &str) {
        println!("{}", msg);
    }
}

#[binds] can be used to create a binding that says "when the Logger interface is needed, use StdoutLogger as the actual implementation":

    #[binds]
    pub fn bind_stdout_logger(_impl: StdoutLogger) -> Cl<dyn Logger> {}

The method body must be empty, as Lockjaw will replace it.

The Cl in the return type means Component lifetimed, which is a wrapper around a type that forces it not to outlive the component. Having this wrapper makes it easier for the compiler to deduce the lifetime.

With the binding defined, the Logger can now be used by other types without caring about the actual implementation.

pub struct Greeter<'component> {
    logger: Cl<'component, dyn Logger>,
}

#[injectable]
impl Greeter<'_> {
    #[inject]
    pub fn new(logger: Cl<dyn Logger>) -> Greeter {
        Greeter { logger }
    }

    pub fn greet(&self) {
        self.logger.log("helloworld!");
    }
}

Note that Logger still has to be injected as Cl<dyn Logger>, and Greeter is also bound by the lifetime of the component.

Unit testing with dependency injection

StdoutLogger writes its output straight to the console, so it is hard to verify that Greeter actually sends the correct thing. While we could give StdoutLogger special APIs to remember what it logs and give tests access to them, having test code in prod is generally bad practice.

Instead we can use dependency injection to replace the environment Greeter runs in. We can create a TestLogger that writes the logs to memory so they can be read later, bind it to the Logger with a module, and install the module in a test component that has all the test bindings. We are then able to test Greeter without adding test code to Greeter itself:

#[cfg(test)]
pub mod testing {
    use crate::{Greeter, Logger};
    use lockjaw::{component, injectable, module, Cl};

    static mut MESSAGES: Vec<String> = Vec::new();

    pub struct TestLogger;

    #[injectable]
    impl TestLogger {
        #[inject]
        pub fn new() -> TestLogger {
            TestLogger
        }

        pub fn get_messages(&self) -> Vec<String> {
            unsafe { MESSAGES.clone() }
        }
    }

    impl Logger for TestLogger {
        fn log(&self, msg: &str) {
            unsafe { MESSAGES.push(msg.to_owned()) }
        }
    }

    pub struct TestModule;

    #[module]
    impl TestModule {
        #[binds]
        pub fn bind_test_logger(_impl: TestLogger) -> Cl<dyn Logger> {}
    }

    #[component(modules: [TestModule])]
    pub trait TestComponent {
        fn greeter(&self) -> Greeter;
        fn test_logger(&self) -> TestLogger;
    }

    #[test]
    fn test() {
        let component: Box<dyn TestComponent> = <dyn TestComponent>::build();

        component.greeter().greet();

        assert_eq!(
            component.test_logger().get_messages(),
            vec!["helloworld!".to_owned()]
        );
    }
}

Generally, a library should also provide a test implementation and a module that binds the test implementation. The consumer of the library can then test by installing the test module instead of the real module. This allows test infrastructure to be shared easily. Some kind of test scaffolding can also be created to auto-generate the component and inject objects into tests, but that is out of scope for Lockjaw itself.

Note that in the TestLogger MESSAGES is a static mutable, since log() and get_messages() are going to be called on different instances of TestLogger. This is unsafe and bad, so in the next chapter we will discuss how to handle this by forcing a single instance of TestLogger to be shared among everything that uses it.


Scoped Bindings

By default, every time a dependency needs to be satisfied, Lockjaw creates a new instance and moves it into the field or method parameter that needs it. This is not always desired, since an object may be used to carry some common state, and we want every type that depends on it to get a reference to a single instance instead (singletons).

In the last chapter, we had to use a mutable static state to store the messages TestLogger logs, since we need to read the same messages later but a different instance of TestLogger will be created.

To share a single instance, the scope metadata can be specified on an #[injectable] or #[provides], passing a component's path. This means there is only one instance of the type for objects created by the same instance of the component (they are not global singletons; you can still have multiple instances if you have multiple components).

    #[derive(Default)]
    pub struct TestLogger {
        messages: RefCell<Vec<String>>,
    }

    #[injectable(scope: TestComponent)]
    impl TestLogger {
        #[inject]
        pub fn new() -> TestLogger {
            TestLogger::default()
        }

        pub fn get_messages(&self) -> Vec<String> {
            self.messages.borrow().clone()
        }
    }

    impl Logger for TestLogger {
        fn log(&self, msg: &str) {
            self.messages.borrow_mut().push(msg.to_owned())
        }
    }

Other types can depend on a scoped type as a reference (&T) or Cl<T>:

    #[component(modules: [TestModule])]
    pub trait TestComponent {
        fn greeter(&self) -> Greeter;
        fn test_logger(&self) -> Cl<TestLogger>;
    }

    #[test]
    fn test() {
        let component: Box<dyn TestComponent> = <dyn TestComponent>::build();

        component.greeter().greet();

        assert_eq!(
            component.test_logger().get_messages(),
            vec!["helloworld!".to_owned()]
        );
    }

#[binds], however, has to explicitly ask for &TestLogger:

        #[binds]
        pub fn bind_test_logger(_impl: &TestLogger) -> Cl<dyn Logger> {}

Note that Greeter hasn't changed at all. Cl<T> allows a type to decouple itself from whether the type depended on is scoped or not. It may be an owned instance or a shared instance, but the type does not care as it will not try to move it.

Lifetime

Scoped objects are owned by the component and have the same lifetime as it.

Handling mutability

In most uses a scoped type probably needs to be mutable to be useful. However, we cannot request it as &mut T, since multiple objects will certainly try to request it. Scoped types must implement interior mutability themselves and expose an immutable interface. In the example, TestLogger uses a RefCell so the messages can be mutated even when the TestLogger itself is immutable.

Sometimes it might be easier to wrap the whole type in a memory container like a RefCell or RwLock. The container metadata can be used on an #[injectable] to bind the type as &CONTAINER<T> instead of &T.
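The interior mutability pattern can be sketched in plain Rust, independent of Lockjaw. Below, two consumers hold shared references to one instance and both mutate it through an immutable interface. Counter, Consumer, and demo are hypothetical names for illustration.

```rust
use std::cell::RefCell;

/// A shared type that hides its mutable state behind `RefCell`.
pub struct Counter {
    count: RefCell<u32>,
}

impl Counter {
    pub fn new() -> Counter {
        Counter { count: RefCell::new(0) }
    }

    /// Takes `&self`, yet still mutates the inner value.
    pub fn increment(&self) {
        *self.count.borrow_mut() += 1;
    }

    pub fn get(&self) -> u32 {
        *self.count.borrow()
    }
}

/// A consumer that only ever sees a shared reference.
pub struct Consumer<'a> {
    counter: &'a Counter,
}

impl<'a> Consumer<'a> {
    pub fn new(counter: &'a Counter) -> Consumer<'a> {
        Consumer { counter }
    }

    pub fn use_it(&self) {
        self.counter.increment();
    }
}

/// Two consumers share one `Counter`; both increments are observed.
pub fn demo() -> u32 {
    let counter = Counter::new();
    let a = Consumer::new(&counter);
    let b = Consumer::new(&counter);
    a.use_it();
    b.use_it();
    counter.get()
}
```

RefCell is single-threaded; a type shared across threads would need something like Mutex or RwLock instead.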

Qualifiers

Sometimes users may need multiple shared instances of the same type for different purposes. For example, the application may need several different ThreadPool instances:

  • A thread pool for UI with a single thread, since OpenGL does not like concurrency.
  • A thread pool for general purpose CPU bound tasks, with a thread for each physical core.
  • A thread pool for IO bound tasks, with a small number of threads. It is IO bound, so more threads won't make it faster.
  • A thread pool for other blocking tasks with unbounded threads. Most of them will be idling anyway.

They all share the same type/interface, but have different internal characteristics. Trying to bind them all will cause duplicated binding failures. Furthermore, the user needs to be able to specify which one they want.

New type idiom

In most cases, the new type idiom is preferred. With new types, Lockjaw sees them as unrelated, so they won't cause duplicated binding issues. The compiler also provides strong compile-time type checks to ensure the correct one is used (an API that does IO work can explicitly ask for the IoThreadPool).

However, the new type idiom does not work for bindings whose types are generated by Lockjaw, such as providers, optional bindings, and multibinding containers. It may also cause a lot of wrapping/unwrapping when working with third party libraries.
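A minimal sketch of the new type idiom in plain Rust follows. ThreadPool here is a hypothetical placeholder struct, not a real thread pool, and UiThreadPool and run_io_task are likewise invented for the example.

```rust
/// Hypothetical placeholder for a real thread pool type.
pub struct ThreadPool {
    pub threads: usize,
}

/// Distinct wrapper types around the same underlying type.
pub struct UiThreadPool(pub ThreadPool);
pub struct IoThreadPool(pub ThreadPool);

/// An API that does IO work asks for the IO pool explicitly.
pub fn run_io_task(pool: &IoThreadPool) -> usize {
    // Unwrap the new type to reach the underlying pool.
    pool.0.threads
}
```

Passing a &UiThreadPool to run_io_task is a compile error, and to a binding graph the two wrappers are unrelated types, so each can have its own binding.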

Qualifiers

Qualifiers are hints for Lockjaw to give a type different tags, so they can be bound separately.

A qualifier must be declared first using the #[qualifier] attribute macro.

#[qualifier]
pub struct Q1;

The struct body does not matter and probably should be empty.

Once the qualifier is declared, it can then be used in the #[qualified] attribute on a method in a #[module], marking the return type as qualified. Lockjaw will treat the bindings as distinct types.

#[module]
impl MyModule {
    #[provides]
    pub fn provide_string() -> String {
        "string".to_owned()
    }

    #[provides]
    #[qualified(Q1)]
    pub fn provide_q1_string() -> String {
        "q1_string".to_owned()
    }

    #[provides]
    #[qualified(Q2)]
    pub fn provide_q2_string() -> String {
        "q2_string".to_owned()
    }
}

Qualified types can be injected using the #[qualified] attribute on the parameter or on the component method:

#[component(modules: [MyModule])]
pub trait MyComponent {
    fn string(&self) -> String;
    #[qualified(Q1)]
    fn q1_string(&self) -> String;
    #[qualified(Q2)]
    fn q2_string(&self) -> String;
}

    #[inject]
    pub fn new(#[qualified(Q)] s: String) -> Foo {
        Foo { s }
    }

Provider<T>

Normally when injecting a dependency, one instance of the dependency is created before creating the depending object. This may not be ideal since the depending object might:

  • Want multiple instances of the object, for example, to populate an array.
  • Have a cyclic dependency at runtime.

For every binding T, Lockjaw also automatically creates a binding to Provider<T>, which creates a new instance of T every time get() is called.

Since a Provider needs to use the component to create the instance, its lifetime is bound by the component.
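Conceptually, a provider behaves like a stored closure that builds a fresh value on each call. The sketch below is a hypothetical stand-in written in plain Rust to show that behavior; SimpleProvider is not Lockjaw's Provider type, and its constructor is an assumption made for the example.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a provider: wraps a closure that
/// builds a new `T` on every call. The lifetime `'a` plays the
/// role of the component lifetime the closure may borrow from.
pub struct SimpleProvider<'a, T> {
    create: Box<dyn Fn() -> T + 'a>,
}

impl<'a, T> SimpleProvider<'a, T> {
    pub fn new(create: impl Fn() -> T + 'a) -> SimpleProvider<'a, T> {
        SimpleProvider { create: Box::new(create) }
    }

    /// Creates a new instance each time it is called.
    pub fn get(&self) -> T {
        (self.create)()
    }
}

/// Each `get()` runs the closure again, so the ids differ.
pub fn demo() -> Vec<u32> {
    let next_id = Cell::new(0u32);
    let provider = SimpleProvider::new(|| {
        let id = next_id.get();
        next_id.set(id + 1);
        id
    });
    vec![provider.get(), provider.get(), provider.get()]
}
```

The borrow of next_id inside the closure is why the provider carries a lifetime, mirroring how a Provider is bound by its component.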

Creating multiple instances

Provider<T> can be used to create instances on request.

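struct Foo<'component> {
    bar_provider: Provider<'component, Bar>,
    bars: Vec<Bar>,
}

#[injectable]
impl<'component> Foo<'component> {
    #[inject]
    pub fn new(bar_provider: Provider<'component, Bar>) -> Foo<'component> {
        // Create the initial instances before the provider is moved into the struct.
        let bars = vec![bar_provider.get(), bar_provider.get(), bar_provider.get()];
        Foo { bar_provider, bars }
    }

    pub fn add_more_bar(&mut self) {
        self.bars.push(self.bar_provider.get())
    }
}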

Bypassing runtime cyclic dependency

Since regular dependencies must be created before instantiating an object, cyclic dependencies will result in a recursive stack overflow when the constructor is called. Lockjaw will detect this situation and refuse to compile your project.

However sometimes the dependency is only used at runtime, not at object construction. This is especially common when singleton classes need to refer to each other. By using Provider<T> the cycle can be broken.

struct Foo<'component> {
    bar_provider: Provider<'component, Box<Bar<'component>>>,
}

#[injectable]
impl<'component> Foo<'component> {
    #[inject]
    pub fn new(bar: Provider<'component, Box<Bar<'component>>>) -> Foo<'component> {
        Self { bar_provider: bar }
    }

    pub fn create_bar(&self) -> Box<Bar<'component>> {
        self.bar_provider.get()
    }
}

struct Bar<'component> {
    foo: Box<Foo<'component>>,
}

#[injectable]
impl<'component> Bar<'component> {
    #[inject]
    pub fn new(foo: Box<Foo<'component>>) -> Bar<'component> {
        Self { foo }
    }
}

In this example, instantiating a Bar instantiates a Foo, but Foo won't create a Bar until Foo.create_bar() is called, hence creating either won't trigger a stack overflow.

Calling Provider.get() while the objects are still being constructed (for example, inside new()) may still lead to a stack overflow, and Lockjaw cannot check this for you.

Lazy

Lazy<T> is a wrapper around a Provider<T>, which creates the object once and caches the result. The object will only be created when get() is called, and subsequent invocations return a reference to the same object.
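The create-once-then-cache behavior can be sketched in plain Rust with std::cell::OnceCell. This is only an illustration of the concept under the assumption of single-threaded use. SketchLazy is a hypothetical type, not Lockjaw's Lazy.

```rust
use std::cell::OnceCell;

// Hypothetical sketch type; not Lockjaw's Lazy.
pub struct SketchLazy<'a, T> {
    // Plays the role of the wrapped provider.
    create: Box<dyn Fn() -> T + 'a>,
    // Holds the value once it has been created.
    value: OnceCell<T>,
}

impl<'a, T> SketchLazy<'a, T> {
    pub fn new(create: impl Fn() -> T + 'a) -> Self {
        SketchLazy { create: Box::new(create), value: OnceCell::new() }
    }

    // Creates the value on the first call, then keeps returning a reference
    // to the cached value.
    pub fn get(&self) -> &T {
        self.value.get_or_init(|| (self.create)())
    }
}
```

Nothing is created until get() is first called, so an expensive object that is never used is never built.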

Examples

https://github.com/azureblaze/lockjaw/blob/main/tests/lazy.rs

Factory

Sometimes an object needs not only injected values but also runtime values to be created, such as constructor parameters that depend on user input.

This can be handled by writing a factory that injects bindings as a Provider, and combines them with the runtime value to create the object. For example,

pub struct Foo {
    pub i: i32, // runtime
    pub s: String, // injected
}

pub struct FooFactory<'component>{
    s_provider: Provider<'component, String>
}

#[injectable]
impl FooFactory<'_> {
    #[inject]
    pub fn new(s_provider: Provider<String>) -> FooFactory {
        FooFactory { s_provider }
    }

    pub fn create(&self, i: i32) -> Foo {
        Foo { i, s: self.s_provider.get() }
    }
}

This is a lot of boilerplate, and can be automated by using #[factory] instead of #[inject]:

#[injectable]
impl Foo {
    #[factory]
    fn create(#[runtime] i: i32, s: String) -> Self {
        Self { i, s }
    }
}

Runtime parameters need to be marked with the #[runtime] attribute.

FooFactory will be created by Lockjaw, with a method with the same name as the marked method taking only runtime parameters.

    let foo = component.foo_factory().create(42);

Factory traits

The factory can also be instructed to implement a trait by using the implementing metadata.

pub trait FooCreator {
    fn create(&self, i: i32) -> Foo;
}

#[injectable]
impl Foo {
    #[factory(implementing: FooCreator)]
    fn create(#[runtime] i: i32, s: String) -> Self {
        Self { i, s }
    }
}

The method name and runtime signature must match the trait method the factory should override.

This is especially useful for binding the factory to a trait:

    #[binds]
    pub fn bind_foo_creator(impl_: FooFactory) -> Cl<dyn FooCreator> {}
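To illustrate why binding the factory to a trait helps, here is a plain-Rust sketch without any Lockjaw attributes. FooFactory is written by hand here, whereas Lockjaw would generate it. The describe function and FixedFooCreator are hypothetical names. The consumer depends only on FooCreator, so a test can substitute a fake creator.

```rust
pub struct Foo {
    pub i: i32,
    pub s: String,
}

pub trait FooCreator {
    fn create(&self, i: i32) -> Foo;
}

// Hand-written stand-in for the factory Lockjaw would generate.
pub struct FooFactory {
    pub s: String,
}

impl FooCreator for FooFactory {
    fn create(&self, i: i32) -> Foo {
        Foo { i, s: self.s.clone() }
    }
}

// Hypothetical consumer that only knows about the trait.
pub fn describe(creator: &dyn FooCreator, i: i32) -> String {
    let foo = creator.create(i);
    format!("{}:{}", foo.s, foo.i)
}

// Hypothetical fake that a test could bind instead of FooFactory.
pub struct FixedFooCreator;

impl FooCreator for FixedFooCreator {
    fn create(&self, i: i32) -> Foo {
        Foo { i, s: "fake".to_owned() }
    }
}
```

With the #[binds] method above, code that asks for Cl<dyn FooCreator> is in the same position as describe: it never names the concrete factory.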

Examples

https://github.com/azureblaze/lockjaw/blob/main/tests/injectable_factory.rs

https://github.com/azureblaze/lockjaw/blob/main/tests/injectable_factory_implementing.rs

Optional bindings

Sometimes a binding might be optional, for example when it is behind a cargo feature or provided by an optional library. It is useful to allow such bindings to be missing from the dependency graph, and to detect whether such a binding exists.

In Lockjaw a binding can be declared as optional by using the #[binds_option_of] method attribute in a #[module]:

    #[binds_option_of]
    pub fn binds_option_of_string() -> String {}

The #[binds_option_of] method should take no parameters and return the type T to bind as Option<T>. This does not actually bind T itself. Option<T> can then be requested like any other binding, for example from a component:

    fn option_string(&self) -> Option<String>;

If T is actually bound somewhere else, injecting Option<T> will result in Some(T). Otherwise it will be None.
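On the consuming side, the optional binding is handled like any other Option. The sketch below is plain Rust with a hypothetical Greeter type. With Lockjaw the constructor would presumably be the #[inject] constructor receiving Option<String>, but no Lockjaw attribute is used here so the sketch stays self-contained.

```rust
// Hypothetical consumer of an optional binding.
pub struct Greeter {
    name: Option<String>,
}

impl Greeter {
    // Receives Some(..) when String is bound elsewhere, None otherwise.
    pub fn new(name: Option<String>) -> Self {
        Greeter { name }
    }

    pub fn greet(&self) -> String {
        match &self.name {
            Some(name) => format!("hello {}", name),
            // Fall back to a default when the optional binding is absent.
            None => "hello stranger".to_owned(),
        }
    }
}
```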

Multibinding

Multibinding is a special type of binding in Lockjaw that allows duplicated bindings. Instead of enforcing one binding per type, multibindings gather the bindings into a collection, allowing "everything implementing a type" to be injected. This is especially useful to build a plugin system where an unspecified number of implementations can be handled.

Multibindings come in two flavors: a Vec<T> binding that simply collects everything, and a HashMap<K,V> binding where key collisions are checked at compile time.

Vec<T> multibindings

A #[provides] or #[binds] binding can also be marked with the #[into_vec] attribute, which means instead of directly binding to T, the binding should be collected into a Vec<T>.

With the bindings:

    #[provides]
    #[into_vec]
    pub fn provide_string1() -> String {
        "string1".to_owned()
    }

    #[provides]
    #[into_vec]
    pub fn provide_string2() -> String {
        "string2".to_owned()
    }

Vec<String> can be injected with the values ["string1", "string2"]. This works across all modules that are installed.

#[into_vec] can also be #[qualified]:

    #[provides]
    #[qualified(Q)]
    #[into_vec]
    pub fn provide_q_string1() -> String {
        "q_string1".to_owned()
    }

This results in #[qualified(Q)] Vec<String>. Note that the container is qualified instead of the content.

#[into_vec] also works with #[binds]:

    #[binds]
    #[into_vec]
    pub fn bind_bar(impl_: crate::Bar) -> Cl<dyn crate::Foo> {}

    #[binds]
    #[into_vec]
    pub fn bind_baz(impl_: crate::Baz) -> Cl<dyn crate::Foo> {}

This allows Vec<Cl<dyn Foo>> to be injected. This is a common way to implement event callbacks.
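The event callback pattern can be sketched in plain Rust as follows. Box<dyn Listener> stands in for Cl<dyn Foo>. Listener, Recorder, and Dispatcher are hypothetical names, and the collection is assembled by hand where Lockjaw would assemble it from the #[into_vec] bindings.

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub trait Listener {
    fn on_event(&self, event: &str);
}

// A listener that records what it saw into a shared log.
pub struct Recorder {
    pub prefix: &'static str,
    pub log: Rc<RefCell<Vec<String>>>,
}

impl Listener for Recorder {
    fn on_event(&self, event: &str) {
        self.log.borrow_mut().push(format!("{}:{}", self.prefix, event));
    }
}

// The dispatcher receives every bound listener as one collection and does
// not need to know how many there are or where they were declared.
pub struct Dispatcher {
    listeners: Vec<Box<dyn Listener>>,
}

impl Dispatcher {
    pub fn new(listeners: Vec<Box<dyn Listener>>) -> Self {
        Dispatcher { listeners }
    }

    // Forwards the event to every listener in the collection.
    pub fn dispatch(&self, event: &str) {
        for listener in &self.listeners {
            listener.on_event(event);
        }
    }
}
```

Adding a new listener then only requires a new binding; the dispatcher itself does not change.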

Providing multiple items

A method marked with #[elements_into_vec] can return Vec<T>, which will get merged with other #[into_vec] and #[elements_into_vec].

    #[provides]
    #[elements_into_vec]
    pub fn provide_strings() -> Vec<String> {
        vec!["string3".to_owned(), "string4".to_owned()]
    }

This allows multiple bindings to be provided at once. It also allows a binding method to decide not to provide anything at runtime, by returning an empty Vec.
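Conceptually, the merged collection behaves like the hand-written function below. This is a plain-Rust sketch, not the code Lockjaw generates. The enabled flag is a hypothetical stand-in for whatever runtime condition a binding method might consult, and the element order shown is an artifact of the sketch rather than a documented guarantee.

```rust
// Stand-ins for #[into_vec] binding methods: each contributes one element.
fn provide_string1() -> String {
    "string1".to_owned()
}

fn provide_string2() -> String {
    "string2".to_owned()
}

// Stand-in for an #[elements_into_vec] binding method: it contributes zero
// or more elements, decided at runtime.
fn provide_strings(enabled: bool) -> Vec<String> {
    if enabled {
        vec!["string3".to_owned(), "string4".to_owned()]
    } else {
        Vec::new()
    }
}

// Hypothetical hand-written equivalent of the merged Vec<String> binding.
pub fn collect_strings(enabled: bool) -> Vec<String> {
    let mut result = vec![provide_string1(), provide_string2()];
    result.extend(provide_strings(enabled));
    result
}
```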

Duplication behaviors

Lockjaw's #[into_vec] strays from Dagger's @IntoSet, as Hash and Eq are not universally implemented in Rust. This means the Vec may contain duplicated values.

However, duplicated modules are not allowed by Lockjaw, so each binding method will be called at most once when generating the Vec.

If deduplication is needed, you can add another provider that does the conversion:

use std::collections::HashSet;

#[provides]
pub fn vec_string_to_set_string(v: Vec<String>) -> HashSet<String> {
    // Collecting into a HashSet drops duplicated values.
    v.into_iter().collect()
}

Examples

https://github.com/azureblaze/lockjaw/blob/main/tests/module_provides_into_vec.rs

HashMap<K,V> multibinding

Similar to Vec<T> multibinding, a #[provides] or #[binds] binding can also be marked with the #[into_map] attribute, which collects the key-value pairs into a HashMap.

While a map can be created by multibinding Vec<(K,V)> or some other entry generating mechanisms, the HashMap<K,V> multibinding has additional compile time checks to make sure there are no key collisions.

Map keys

The value type is specified by the binding method's return type, but the key type and value need to be specified by metadata in the #[into_map] attribute.

string_key

string_key specifies that the map key is a String. The value must be a string literal; Lockjaw is unable to resolve more complex compile-time constants.

This example binds to HashMap<String,String>:

    #[provides]
    #[into_map(string_key: "1")]
    pub fn provide_string1() -> String {
        "string1".to_owned()
    }

    #[provides]
    #[into_map(string_key: "2")]
    pub fn provide_string2() -> String {
        "string2".to_owned()
    }

i32_key

i32_key specifies that the map key is an i32. The value must be an i32 literal; Lockjaw is unable to resolve more complex compile-time constants.

This example binds to HashMap<i32,String>:

    #[provides]
    #[into_map(i32_key: 1)]
    pub fn provide_i32_string1() -> String {
        "string1".to_owned()
    }

    #[provides]
    #[into_map(i32_key: 2)]
    pub fn provide_i32_string2() -> String {
        "string2".to_owned()
    }

Other types are not implemented. i32 ought to be enough for everyone.

enum_key

enum_key specifies that the map key is an enum. Since the enum is going to be used as the map key, it must satisfy the same constraints HashMap imposes, which means implementing Eq and Hash. It must also be a simple enum with no fields so Lockjaw knows how to compare the values at compile time (meaning comparing the name is enough).

#[derive(Eq, PartialEq, Hash)]
pub enum E {
    Foo,
    Bar,
}

This example binds to HashMap<E,String>:

    #[provides]
    #[into_map(enum_key: E::Foo)]
    pub fn provide_enum_string1() -> String {
        "string1".to_owned()
    }

    #[provides]
    #[into_map(enum_key: Bar)]
    pub fn provide_enum_string2() -> String {
        "string2".to_owned()
    }

Lockjaw is able to infer the enum type (E) if the value is imported (use E::Bar), but the code may be more readable if the type is explicitly spelled out, especially since most IDEs today cannot properly inspect tokens inside the metadata.

Qualifiers

#[into_map] can also be #[qualified]:

    #[provides]
    #[qualified(Q)]
    #[into_map(string_key: "1")]
    pub fn provide_q_string1() -> String {
        "q_string1".to_owned()
    }

This results in #[qualified(Q)] HashMap<String, String>. Note that the container is qualified instead of the content.

Dynamic map entries

All bindings in #[into_map] must be resolved at compile time. There is no map equivalent of #[elements_into_vec], such as an #[elements_into_map].

However, dynamic map entries can be achieved by rebinding Vec<(K,V)> into a HashMap<K,V>.
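A minimal sketch of such a rebinding, for String keys and values, is shown below. In a #[module] the function would carry #[provides] and take the multibound Vec as its parameter. It is left as a plain function with the hypothetical name vec_to_map so the sketch compiles without Lockjaw.

```rust
use std::collections::HashMap;

// Converts the collected entries into a map.
pub fn vec_to_map(entries: Vec<(String, String)>) -> HashMap<String, String> {
    // When two entries share a key, the later one overwrites the earlier one.
    // There is no compile-time collision check on this path.
    entries.into_iter().collect()
}
```

The trade-off is that key collisions are no longer detected at compile time, so entries gathered this way silently replace each other.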

Examples

https://github.com/azureblaze/lockjaw/blob/main/tests/module_provides_into_map.rs

Empty multibindings

The binding for Vec<T> and HashMap<K,V> is automatically generated when #[into_vec] or #[into_map] is encountered. However, when no such binding exists and something depends on the collection, Lockjaw cannot tell whether it should provide an empty collection because a multibinding is intended, or whether the user forgot to bind the collection.

This usually happens when a library defines a multibinding for events, etc., but does not bind anything to it itself, and clients aren't forced to use the event.

In such a case, a #[multibinds] method that returns the collection type should be declared to let Lockjaw know that a multibinding collection is intended but may be empty.

#[module]
impl MyModule {
    #[multibinds]
    fn vec_string() -> Vec<String> {}

    #[multibinds]
    #[qualified(Q)]
    fn q_vec_string() -> Vec<String> {}

    #[multibinds]
    fn map_string_string() -> HashMap<String, String> {}

    #[multibinds]
    #[qualified(Q)]
    fn q_map_string_string() -> HashMap<String, String> {}
}

#[multibinds] also serves as documentation on the #[module] specifying it is expecting the multibinding.

Subcomponents

Scopes force an object to be a singleton within the component, so everyone can interact with the same instance. However, there are times when we need multiple sets of such instances. For example, we might be implementing a web server, where every session has many resources shared within that session. We want the sessions to be independent, so we need to create a component for each session:

impl Server {
    pub fn new_session(url: Url) -> Session {
        Session {
            component: <dyn SessionComponent>::build({url,...})
        }
    }
}

However, there are also some resources that belong to the whole server and are shared by all sessions, for example, an IO-bound thread pool. We have to rebind these in every SessionComponent:

impl Server {
    #[inject]
    pub fn new(#[qualified(IoBound)] io_threadpool: &ThreadPool) { ... }

    pub fn new_session(&self, url: Url) -> Session {
        Session {
            component: <dyn SessionComponent>::build({
                url,
                self.io_threadpool,
                ...
            })
        }
    }
}

Managing these will soon get ugly.

Instead, we can create a #[subcomponent], which can have its own scoped bindings and modules, but also has access to the bindings in its parent component.

Using a #[subcomponent] is almost identical to using a regular #[component], except that the #[subcomponent] has to be installed in a parent component, and the way an instance is created differs.

Installing a #[subcomponent]

A #[subcomponent] is installed by first listing it in a #[module] using the subcomponents metadata:

#[module(subcomponents: [MySubcomponent])]
impl ParentComponentModule {}

Then install the #[module] in a parent component. The parent component can be either a regular component or another subcomponent.

Creating an instance of the subcomponent

A #[module] with the subcomponents: [FooSubcomponent] metadata creates a hidden binding of Cl<FooSubcomponentBuilder>, which can be injected to create new instances of FooSubcomponent by calling build(). build() also takes the corresponding #[builder_modules] if the subcomponent is defined with the builder_modules metadata.

build() can be called multiple times to create independent subcomponents, with the parent being shared.

Lifetime

The lifetime of the subcomponent is bound by its parent.
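The relationship between a parent and its subcomponents can be sketched in plain Rust as a struct that borrows its parent. This is only a conceptual model of the web server example, not what Lockjaw generates. ServerComponent, SessionComponent, ThreadPool, and build_session are hypothetical names, with build_session playing the role of the subcomponent builder's build().

```rust
// Hypothetical stand-ins; Lockjaw generates the real components and builders.
pub struct ThreadPool {
    pub name: String,
}

// Parent component: owns the resources shared by every session.
pub struct ServerComponent {
    io_threadpool: ThreadPool,
}

// Subcomponent: has its own bindings and borrows the parent for the rest.
// The 'parent lifetime means a session cannot outlive the server component.
pub struct SessionComponent<'parent> {
    parent: &'parent ServerComponent,
    url: String,
}

impl ServerComponent {
    pub fn new() -> Self {
        ServerComponent { io_threadpool: ThreadPool { name: "io".to_owned() } }
    }

    // Can be called many times to create independent sessions.
    pub fn build_session(&self, url: &str) -> SessionComponent<'_> {
        SessionComponent { parent: self, url: url.to_owned() }
    }
}

impl<'parent> SessionComponent<'parent> {
    // A binding that belongs to this session only.
    pub fn url(&self) -> &str {
        &self.url
    }

    // Resolved from the parent, so every session sees the same instance.
    pub fn io_threadpool(&self) -> &'parent ThreadPool {
        &self.parent.io_threadpool
    }
}
```

Each session has its own url, while all sessions observe the same thread pool without it being passed in again for every session.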

Examples

https://github.com/azureblaze/lockjaw/blob/main/tests/sub_component.rs

Defined components

One of the issues with using #[component] and #[subcomponent] is that modules still have to be listed, which means anything using the component will depend on everything. The component is also generated in the crate, so other crates depending on it are not able to expand the dependency graph, which makes multibindings less useful. Additionally, unit tests often need a different set of modules, so the whole component has to be redefined.

In a large project there may be tens or even hundreds of modules, and this will become very difficult to manage.

Instead of #[component] and #[subcomponent], Lockjaw also provides #[define_component] and #[define_subcomponent], which automatically collect modules from the entire build dependency tree, so they no longer need to be manually installed.

Automatically installing modules

A #[module] can be automatically installed in a component by using the install_in metadata. The metadata takes a path to a #[define_component] trait. Alternatively, it can also be a path to Singleton, which means it should be installed in every #[define_component] but not #[define_subcomponent].

Such modules cannot have fields.

#[module(install_in: MyComponent)]
impl MyModule {
    #[provides]
    pub fn provide_string() -> String {
        "string".to_owned()
    }
}

#[define_component]
pub trait MyComponent {
    fn string(&self) -> String;
}

Entry points

Ideally a component should only be used at the program's entry point, and the rest of the program should use dependency injection instead of passing the component around. However, sometimes callbacks will be called from a non-injected context, and the user will need to reach back into the component.

These kinds of usage will cause the requesting methods in a component to bloat, and add redundant dependencies or cycle issues to everyone that uses the component.

With #[define_component], #[entry_point] can be used.

An #[entry_point] has binding requesting methods just like a component. The install_in metadata needs to be used to install the #[entry_point] in a component. Once installed, the

<dyn FooEntryPoint>::get(component : &dyn FooComponent) -> &dyn FooEntryPoint

method can be used to cast the opaque component into the entry point, and access the dependency graph.

#[entry_point(install_in: MyComponent)]
pub trait MyEntryPoint {
    fn i(&self) -> i32;
}

#[define_component]
pub trait MyComponent {}

#[test]
pub fn main() {
    let component: Box<dyn MyComponent> = <dyn MyComponent>::new();

    assert_eq!(<dyn MyEntryPoint>::get(component.as_ref()).i(), 42)
}

Testing with #[define_component]

While compiling tests, Lockjaw gathers install_in modules only from the [dev-dependencies] section of Cargo.toml instead of the regular [dependencies], even though [dev-dependencies] inherits [dependencies]. This is because tests often have modules that conflict with prod code. Any prod modules that need to be used in tests have to be listed again in the [dev-dependencies] section.

Caveats

This section discusses the creative solutions Lockjaw uses to achieve its goal. They are abhorrent engineering practices that abuse undocumented behaviors of Rust, and are the main reasons you should not use Lockjaw in any serious project.

They are documented so maybe someone can come up with a better solution, or Rust can provide new language features to make Lockjaw usable.

Path resolution

Lockjaw needs to know the fully qualified path of a type, so types can be compared against each other.

Bypassing visibility

A lot of symbols should be private to the module/crate, but also need to give an exclusive bypass to Lockjaw, so they can be used by a component generated elsewhere, possibly in a different crate.

Late implementation generation

Rust only allows impl blocks in the same mod the struct is in. However, some implementations have to be generated at the mod root or in a different crate, where the information is more complete.

Path resolution

Lockjaw needs to know the fully qualified path of a type, so types can be compared against each other.

In Rust, all a proc_macro can see is tokens, which is too early to resolve the type path. When a Foo identifier is encountered, it is difficult for the macro to understand whether it is a type declared in the local module, or a type from somewhere else brought in by a use declaration. Rust doesn't even tell the macro what the local module is.

Base mod of the file

The first problem is that the proc_macro doesn't even know where the source being compiled is located. The file!() and module_path!() macros would be a perfect solution to this, but eager macro expansion is required for a proc_macro to be able to utilize them.

proc_macro2::Span::source_file() also exists, but it is a nightly feature and requires procmacro2_semver_exempt, which is contagious.

In the Lockjaw build script this is resolved by looking at the crate's manifest and using CARGO_CRATE_NAME/CARGO_PKG_NAME along with cargo metadata to figure out the exact path to the source. Path info is only needed during dependency gathering, which is no longer done with proc_macro, and during component generation, which is always ::crate:: and never referenced by other parts of the code.

mod structure and use declarations

A file can still contain nested mods, each importing more symbols with use declarations. For a given token, Lockjaw needs to know which mod it is in, and what symbols are brought into that scope. This requires parsing the whole file, so we can record the span of each mod and the use declarations inside it.

syn::parse_file() sounds like a good fit for this; however, the tokens it produces do not record proper spans, so we cannot use it to find the position of a mod.

Lockjaw handles this by parsing the whole file in the build script so it knows which mod it is in.

Bypassing visibility

Visibility control works a bit strangely with dependency injection. When a type is private, only contexts that have visibility should be able to inject it, but the dependency injection framework should still be able to construct it, even if the generated code is in some random mod. Currently, Rust only allows a visibility bypass to be granted to a mod that is a parent of the current mod.

Lockjaw handles this with the #[component_visible] attribute macro. The macro modifies the struct declaration so it is declared as public with a hidden name, and then aliases the original name with the original visibility. Internally Lockjaw uses the hidden name. Everything is actually public.

This type of hack is hard to perform on a mod, so every mod that has a binding must be visible to the crate root (using pub(crate)). Lockjaw then reexports the hidden type as public at the crate root.
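The rename-and-alias trick can be sketched by hand as follows. This only approximates what the expansion might look like. The hidden name LockjawHiddenGear, the widgets mod, and build_gear are all hypothetical, and the real macro output may differ.

```rust
pub(crate) mod widgets {
    // Hypothetical expansion: the struct the user wrote as a private `Gear`
    // is declared public under a hidden name...
    #[doc(hidden)]
    pub struct LockjawHiddenGear {
        teeth: u32,
    }

    // ...and the original name becomes an alias with the original (private)
    // visibility.
    type Gear = LockjawHiddenGear;

    // User code inside the mod keeps using the original name.
    impl Gear {
        pub fn new() -> Self {
            Gear { teeth: 12 }
        }

        pub fn teeth(&self) -> u32 {
            self.teeth
        }
    }
}

// Stand-in for generated code living outside `widgets`: it cannot name
// `widgets::Gear`, but it can reach the type through the hidden name.
pub(crate) fn build_gear() -> widgets::LockjawHiddenGear {
    widgets::LockjawHiddenGear::new()
}
```

Code outside widgets that tries to write widgets::Gear is still rejected by the compiler, which preserves the appearance of the original visibility even though the underlying type is public.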

Late implementation generation

Rust requires the impl block to appear in the same mod as the type it implements. Items cannot be added to a mod later either. However, with Lockjaw the information needed to generate the implementation may not be available at that time, especially for component builders. Furthermore, the implementation may not even be possible in the same crate for #[define_component].

Component builders should be associated methods of the component, or at least freestanding functions in the same mod. Otherwise, it will be hard for users to locate them.

Lockjaw handles this by implementing such methods as calls to an extern method, which will be generated later. While this works, if the user forgets to call the code generation macro, a cryptic linker error about a missing symbol will appear.

In addition, some late implementation methods need a unique name since they might clash with other components. A unique name cannot be generated because the only thing we know in a proc_macro is the local component name, and a component with the same name might exist under a different mod/crate. Instead, the generated code declares a static mut *const () address under the same mod as the component, which it will transmute when it needs to call the implementation. The late generated implementation will set this address to the real implementation in the component's builder (the implementation knows the full path of the address). This is a constant assignment and hopefully the compiler can optimize it away.
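The address slot technique can be sketched in a single file as follows. This is a simplified, single-threaded illustration and not Lockjaw's generated code. MY_COMPONENT_BUILD_ADDR, my_component_build_impl, and install_late_implementation are hypothetical names, and how and when the slot is filled is simplified to an explicit function call.

```rust
pub struct MyComponent {
    pub value: i32,
}

// Emitted next to the component: a slot for the address of the late
// implementation. It starts out null because the implementation is not
// known yet.
static mut MY_COMPONENT_BUILD_ADDR: *const () = std::ptr::null();

impl MyComponent {
    // The user-facing associated function forwards through the slot.
    pub fn build() -> MyComponent {
        // SAFETY: single-threaded sketch; the slot is read by value.
        let addr = unsafe { MY_COMPONENT_BUILD_ADDR };
        assert!(!addr.is_null(), "late implementation was never installed");
        // SAFETY: the only value ever stored is a `fn() -> MyComponent`.
        let build_impl: fn() -> MyComponent = unsafe { std::mem::transmute(addr) };
        build_impl()
    }
}

// Generated later, where the full dependency graph is known.
fn my_component_build_impl() -> MyComponent {
    MyComponent { value: 42 }
}

// Hypothetical hook that stores the real implementation in the slot.
pub fn install_late_implementation() {
    // SAFETY: single-threaded sketch; nothing reads the slot concurrently.
    unsafe {
        MY_COMPONENT_BUILD_ADDR = my_component_build_impl as *const ();
    }
}
```

The slot lives at a path derived from the component's own mod, so two components with the same local name in different mods each get their own slot and no globally unique symbol name is required.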

Code of Conduct

Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

Our Standards

Examples of behavior that contributes to creating a positive environment include:

  • Using welcoming and inclusive language
  • Being respectful of differing viewpoints and experiences
  • Gracefully accepting constructive criticism
  • Focusing on what is best for the community
  • Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

  • The use of sexualized language or imagery and unwelcome sexual attention or advances
  • Trolling, insulting/derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others' private information, such as a physical or electronic address, without explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

Scope

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

This Code of Conduct also applies outside the project spaces when the Project Steward has a reasonable belief that an individual's behavior may have a negative impact on the project or its community.

Conflict Resolution

We do not believe that all conflict is bad; healthy debate and disagreement often yield positive results. However, it is never okay to be disrespectful or to engage in behavior that violates the project’s code of conduct.

If you see someone violating the code of conduct, you are encouraged to address the behavior directly with those involved. Many issues can be resolved quickly and easily, and this gives people more control over the outcome of their dispute. If you are unable to resolve the matter for any reason, or if the behavior is threatening or harassing, report it. We are dedicated to providing an environment where participants feel welcome and safe.

Reports should be directed to azureblaze, the Project Steward(s) for Lockjaw. It is the Project Steward’s duty to receive and address reported violations of the code of conduct. They will then work with a committee consisting of representatives from the Open Source Programs Office and the Google Open Source Strategy team. If for any reason you are uncomfortable reaching out to the Project Steward, please email opensource@google.com.

We will investigate every complaint, but you may not receive a direct response. We will use our discretion in determining when and how to follow up on reported incidents, which may range from not taking action to permanent expulsion from the project and project-sponsored spaces. We will notify the accused of the report and provide them an opportunity to discuss it before any action is taken. The identity of the reporter will be omitted from the details of the report supplied to the accused. In potentially harmful situations, such as ongoing harassment or threats to anyone's safety, we may take action without notice.

Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

How to Contribute

The Lockjaw project is still very experimental, and may not have enough bandwidth to accept contributions. However, if you still wish to proceed, there are a few small guidelines you need to follow.

Contributor License Agreement

Contributions to this project must be accompanied by a Contributor License Agreement. You (or your employer) retain the copyright to your contribution; this simply gives us permission to use and redistribute your contributions as part of the project. Head over to https://cla.developers.google.com/ to see your current agreements on file or to sign a new one.

You generally only need to submit a CLA once, so if you've already submitted one (even if it was for a different project), you probably don't need to do it again.

Code reviews

All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose. Consult GitHub Help for more information on using pull requests.

Community Guidelines

This project follows Google's Open Source Community Guidelines.