Building reliable and performant distributed programs that span cloud machines and devices is a challenging endeavor, but one that more and more developers are required to tackle. Foremost among the challenges is effectively handling restart, reconnection, and recovery to a valid state. This is where AMBROSIA (Actor-Model-Based Reliable Object System for Internet Applications), a new open source project from Microsoft Research, can help.
Rather than placing the burden on application developers to build fault-tolerance into their systems from scratch, AMBROSIA provides a general-purpose distributed programming platform that automatically handles failure and lets the developer focus on the core logic of their application.
We call this property of AMBROSIA “virtual resiliency”. Virtual resiliency is achieved by running your application code in an AMBROSIA immortal (see diagram below), which handles checkpointing, logs all communications going into and out of your application, and writes them to storage. This allows the immortal to recover automatically from a failure by replaying all activity from the most recent checkpoint of the application, and to reconnect seamlessly to other AMBROSIA services. Debugging also becomes easier, because you can reproduce failure conditions by stepping through the logged activity with a debugger.
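To make the checkpoint-and-replay idea concrete, here is a minimal sketch in Python. It is not AMBROSIA's API or implementation: the names `ImmortalSketch`, `Counter`, `deliver`, `take_checkpoint`, and `recover` are hypothetical, an in-memory list stands in for durable storage, and the application is assumed to be deterministic so that replaying the same messages rebuilds the same state.

```python
import copy


class Counter:
    """Hypothetical deterministic application: keeps a running total."""

    def __init__(self):
        self.total = 0

    def handle(self, message):
        self.total += message
        return self.total


class ImmortalSketch:
    """Illustrative wrapper (an assumption, not AMBROSIA's real design)."""

    def __init__(self, app):
        self.app = app
        self.log = []  # stands in for a durable message log
        # A checkpoint is (position in the log, snapshot of application state).
        self.checkpoint = (0, copy.deepcopy(app))

    def deliver(self, message):
        # Record the incoming message before the application sees it.
        self.log.append(message)
        return self.app.handle(message)

    def take_checkpoint(self):
        self.checkpoint = (len(self.log), copy.deepcopy(self.app))

    def recover(self):
        # Restore the latest snapshot, then replay everything logged after it.
        position, snapshot = self.checkpoint
        self.app = copy.deepcopy(snapshot)
        for message in self.log[position:]:
            self.app.handle(message)
        return self.app
```

In this sketch, losing the in-memory application object after a checkpoint is harmless: `recover()` restores the snapshot and replays only the messages logged since then, so the rebuilt state matches the state before the failure.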
Furthermore, this approach provides exactly-once semantics for messages between distributed components running as AMBROSIA immortals, and it does this in a performant and cost-effective manner. Though this type of implementation might look expensive at first glance, we’ve found it to be consistently cheaper to run than other commonly used microservices architectures.
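One common way to obtain exactly-once processing is to combine a logged, resendable outbox with sequence-number deduplication on the receiving side. The sketch below illustrates that general technique in Python. It is an assumption for illustration, not a description of AMBROSIA's actual protocol: `Sender`, `Receiver`, `send`, `resend_all`, and `receive` are hypothetical names, and messages are assumed to arrive in order.

```python
class Receiver:
    """Hypothetical receiver that ignores messages it has already processed."""

    def __init__(self):
        self.last_seen = 0   # highest sequence number processed so far
        self.processed = []

    def receive(self, seq, payload):
        if seq <= self.last_seen:
            return False     # duplicate from a resend: drop it
        self.last_seen = seq
        self.processed.append(payload)
        return True


class Sender:
    """Hypothetical sender that logs every outgoing message for resending."""

    def __init__(self):
        self.next_seq = 1
        self.outbox = []     # stands in for a durable log of outgoing messages

    def send(self, receiver, payload):
        entry = (self.next_seq, payload)
        self.next_seq += 1
        self.outbox.append(entry)
        receiver.receive(*entry)

    def resend_all(self, receiver):
        # After a reconnect, replay the whole outbox; the receiver deduplicates.
        for seq, payload in self.outbox:
            receiver.receive(seq, payload)
```

Resending after a reconnect guarantees that no message is lost, and the sequence-number check guarantees that no message is applied twice; together they give the exactly-once effect in this simplified setting.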
We are excited to launch AMBROSIA as an open source project that can be run on both Windows and Linux. The first release includes support for C#, with planned expansion across other languages.
Questions? Let us know in the comments.