Monday, October 24, 2022

Using Node.js AsyncLocalStorage to store tracing information

Usually when building Node.js services you want to be able to include in your logs some identifier (a trace ID or user ID) that is shared by all the logs of the same request or event handler.  That way, when you need to debug a failed request, you can easily filter all the logs belonging to it.

This is very easy to implement in other execution environments thanks to the concept of thread-local variables, where you can store some information for the duration of a task in a storage that is specific to the thread executing that task.  For example, in Java you can use the ThreadLocal class to store information that will then be available for the rest of the execution of that thread.

    private static ThreadLocal<RequestContext> context = new ThreadLocal<>();

    context.set(new RequestContext(requestId));

This is not directly applicable to Node.js because all the tasks of a process are executed asynchronously in the same thread, but there is similar functionality provided by the AsyncLocalStorage class.  In this case the context is not per thread but per asynchronous call chain, so two flows of execution will have different chains and different values for the variables stored.

Let's look at it with an example using the express Node.js framework.


The first thing you want to do is create a new context for each request and associate it with the current asyncLocalStorage.  This can be done with an express middleware to make it transparent for all the requests:

import crypto from 'node:crypto';
import { AsyncLocalStorage } from 'node:async_hooks';

const asyncLocalStorage = new AsyncLocalStorage();

app.use((_req, res, next) => {
  const context = { traceId: crypto.randomUUID(), begin: Date.now() };
  asyncLocalStorage.run(context, async () => {
    next();
  });
});

In this example we are associating a new UUID traceId and the begin time with every request.

Next you want your logger to add the information from the context when generating a log.  In this example we are using the winston library for logging and a custom formatter to add the context data:

const addLoggingContext = winston.format(info => {
  const context = asyncLocalStorage.getStore() as any;
  return context ? { ...info, ...context, duration: Date.now() - context.begin } : info;
});

logger.add(new winston.transports.Console({
  format: winston.format.combine(
    addLoggingContext(),
    winston.format.simple(),
  ),
}));

With that in place we can add some logs.  For example, we can add some basic ones in the middleware and some others in the request processing:

app.use((_req, res, next) => {
  res.on('finish', () => {
    logger.info('End request');
  });
  const context = { traceId: crypto.randomUUID(), begin: Date.now() };
  asyncLocalStorage.run(context, async () => {
    logger.info('Begin request');
    next();
  });
});

app.get('/', async (_req, res) => {
  logger.info('Some interesting log');
  res.sendStatus(200);
});

And if we make some requests now to our HTTP server we can see the expected output:

info: Begin request {"begin":1666640047930,"duration":0,"traceId":"ff3fcf6c-c7e5-4a70-a3a3-143bff9b2993"}
info: Some interesting log {"begin":1666640047930,"duration":1,"traceId":"ff3fcf6c-c7e5-4a70-a3a3-143bff9b2993"}
info: Begin request {"begin":1666640047933,"duration":0,"traceId":"4f2b4471-8d0f-481c-90bb-72bd33fe4ab2"}
info: Some interesting log {"begin":1666640047933,"duration":1,"traceId":"4f2b4471-8d0f-481c-90bb-72bd33fe4ab2"}
info: End request {"begin":1666640047930,"duration":1005,"traceId":"ff3fcf6c-c7e5-4a70-a3a3-143bff9b2993"}
info: End request {"begin":1666640047933,"duration":1004,"traceId":"4f2b4471-8d0f-481c-90bb-72bd33fe4ab2"}

So we have achieved what we wanted (to have a unique traceId shared by all the logs of each request) in a transparent way (using an express middleware) taking advantage of the Node.js capabilities included in the node:async_hooks package.

Note: You probably also want those tracing identifiers to be preserved when forwarding requests to other services but that's outside of the scope of this post.


Monday, June 4, 2018

Video bubbles UI using Electronjs



After today's release of Houseparty's Mac app, showing a new approach to the video conversations UI based on bubbles, I was wondering whether it would be possible to build a similar user experience using the Electron framework and web technologies.


It was easy to find that Electron has support for transparent and frameless windows so I decided to give it a try and figure out if it would be technically possible to build something similar to that.

To build the app I used the electron quick-start and only edited two files:

HTML File

First I modified the HTML file to add video capturing from the camera using the standard getUserMedia API and to show the local video stream in 3 <video> elements.  To make rounded bubbles you can use standard CSS attributes:
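
A minimal sketch of what that HTML could look like (the exact markup, sizes and styles are just assumptions):

<style>
  body { margin: 0; background: transparent; -webkit-app-region: drag; }
  video { width: 150px; height: 150px; border-radius: 50%; object-fit: cover; }
</style>
<!-- Three bubbles, all showing the same local camera stream -->
<video autoplay></video>
<video autoplay></video>
<video autoplay></video>
<script>
  navigator.mediaDevices.getUserMedia({ video: true }).then(stream => {
    document.querySelectorAll('video').forEach(video => {
      video.srcObject = stream;
    });
  });
</script>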


Note the style "-webkit-app-region: drag" to make the window draggable from anywhere.

main.js File

You have to update the main.js file to create the main window as a frameless and transparent window:

const mainWindow = new BrowserWindow({ frame: false, transparent: true });

With those 2 tiny changes I was able to have something similar to the bubbles video experience created by Houseparty.



So I successfully built 0.01 % of the new Houseparty Mac app using Electron!  Enjoy it!




Thursday, May 3, 2018

Two-level hashes in Redis using LUA and MsgPack

Redis hashes are a very powerful data structure allowing you to store key-value properties associated with a given Redis key.  For example, you could store for each user all their devices and the last time each of them was online:

UserId1
   DeviceId1=1525038228
   DeviceId2=1525038128
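
In plain Redis commands that structure maps directly to a hash and could be created like this:

HSET UserId1 DeviceId1 1525038228
HSET UserId1 DeviceId2 1525038128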

But at some point maybe you need to store something more than the last time each device was online: maybe a name, the last time it went offline or a status (online/offline/away/busy...).

So basically we want to store nested hashes with a structure similar to this one:

UserId1
   DeviceId1
     LastOnline=1525038228
     LastOffline=1525028228
     Status=away
   DeviceId2
     LastOnline=1525038128
     LastOffline=1525028128
     Status=busy

Unfortunately for us this is not a structure supported out of the box in Redis, so we need to flatten it a little bit and use something like Json values to store all that information:

UserId1
   DeviceId1={"LastOnline":1525038228, "LastOffline":1525028228, "Status": "away"}
   DeviceId2={"LastOnline": 1525038128, "LastOffline": 1525028128, "Status": "busy"}

With this approach the problem is solved, but when we need to update one of those values (f.e. LastOnline of DeviceId2) we need to do an HGET plus an HSET to Redis.  This is problematic because it is:
  • Slower, as it requires two round trips to complete the operation
  • More complex, as you need to use the WATCH command to run both commands as a transaction and avoid race conditions (see the sketch after this list)
  • Less efficient, because you need to receive and send the whole Json value over the network
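
To illustrate that complexity, this is a rough sketch (using the redis-py client; the key and field names are just examples) of what that Json read-modify-write with WATCH could look like:

import json
import redis

r = redis.Redis()

def update_device_field(user_id, device_id, field, value):
    # Read-modify-write the Json value inside a WATCH/MULTI transaction,
    # retrying if the hash was modified concurrently.
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch(user_id)
                raw = pipe.hget(user_id, device_id)
                data = json.loads(raw) if raw else {}
                data[field] = value
                pipe.multi()
                pipe.hset(user_id, device_id, json.dumps(data))
                pipe.execute()
                return
            except redis.WatchError:
                continue  # somebody else updated the hash, retry

update_device_field('UserId1', 'DeviceId2', 'LastOnline', 1525038228)
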
Fortunately there are two features of Redis that combined can give us something very similar to what we need.

The first feature is the ability to execute Lua scripts as part of a Redis command, and the second is the set of standard Lua modules included in recent Redis versions that allow serializing data in Json or MsgPack format.

In this python example you can see the Lua scripts to write and read any property in these nested hashes:
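
This is a rough sketch of what that example could look like (using the redis-py client; the key and field names follow the hgetall output shown below and are just examples):

import redis

r = redis.Redis()

# Lua script that updates one nested field: HGET the MsgPack value,
# unpack it, set the field and pack + HSET it back.
SET_NESTED = """
local raw = redis.call('HGET', KEYS[1], ARGV[1])
local obj = raw and cmsgpack.unpack(raw) or {}
obj[ARGV[2]] = tonumber(ARGV[3]) or ARGV[3]
redis.call('HSET', KEYS[1], ARGV[1], cmsgpack.pack(obj))
return 'OK'
"""

# Lua script that returns all the nested fields of one device as a flat
# [key1, value1, key2, value2, ...] list.
GET_NESTED = """
local raw = redis.call('HGET', KEYS[1], ARGV[1])
if not raw then return {} end
local obj = cmsgpack.unpack(raw)
local result = {}
for k, v in pairs(obj) do
  table.insert(result, k)
  table.insert(result, v)
end
return result
"""

set_nested = r.register_script(SET_NESTED)
get_nested = r.register_script(GET_NESTED)

set_nested(keys=['user_id'], args=['device1', 'last_online', 1525038228])
print(get_nested(keys=['user_id'], args=['device1']))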


The first Lua script updates a nested field.  To do that it gets the field value with HGET, deserializes it with 'cmsgpack.unpack', updates the field, serializes it again with 'cmsgpack.pack' and stores it back with HSET.
The second Lua script returns all the nested fields.  To do that it gets the value with HGET, deserializes it with 'cmsgpack.unpack' and converts it to a list so that it can be sent in a Redis response.

Disclaimer: I haven't used Lua much in the last 10 years, so this can probably be made simpler/cleaner.

Size

MessagePack serialization is more compact than Json.  If you check the value stored in Redis after running the previous script you get this:

127.0.0.1:6379> hgetall user_id
1) "device1"
2) "\x81\xablast_online\xceZ\xeb\x0f\x13"

We could make it even smaller with shorter key names (f.e. "on" instead of "LastOnline").

Performance

I didn't have time to do a detailed performance test, but just to check that nothing was terribly wrong I tried setting and getting one of those nested values 10,000 times in a loop against a local server and checked the time it took:
  Option 1: Lua/MessagePack: 2.15 secs
  Option 2: Raw Redis commands storing the nested hash as Json and using a transaction (HGET + HSET) to update a subfield: 2.42 secs


The same idea can be implemented with custom Redis modules instead of Lua scripts.  For example, that is what the ReJSON module does.  That is probably a little bit faster than the Lua script approach, but there are many cases where you cannot install custom Redis modules (f.e. when using managed Redis instances in AWS).

Wednesday, November 22, 2017

Playing with Redis Geo structures

One of the things I had never used in Redis was the set of commands provided to store and access geolocated data.  Since version 3.2 Redis includes six new, very simple commands that can be used to store tags associated with coordinates and to calculate distances between those tags or find the tags around some specific coordinates.

Let's try to build a simple prototype to see how it works.   Imagine that you have a service where users can rate anything (people, places, restaurants...)* and at some point you want to show in the UI of your app what things other people around you are rating right now. 

Almost every database right now has support for storing this type of geographically located information, but for use cases like this one, where you want very fast access to ephemeral information, an in-memory database can be a very good choice.

Storing data

The GEOADD command in Redis is the one you have to use to insert a new tag associated with a specific position (longitude, latitude).

The structure supported in Redis for geolocated data has an insertion and query time that is O(log(N)), where N is the number of items in the set, so you probably don't want to have all the data in the same set but partition it by country or some other grouping that makes sense for your use case.  In our example we could try partitioning it per city.

So every time somebody rates something identified by a tag we will do this insert in Redis:
GEOADD city longitude latitude tag
For example:
GEOADD sanfrancisco -122 37 goldengate
Redis stores this geographical information internally in a Sorted Set, so we can use any of the sorted set commands to manipulate or retrieve the list of items stored:
ZRANGE sanfrancisco 0 -1
1) "goldengate"

Retrieving data

There are two commands that you can use to make geographical queries on the stored data depending on your use case: GEORADIUS, to find the tags around some given coordinates, and GEORADIUSBYMEMBER, to find the tags around another tag already stored in the set.


For our use case, every time somebody opens the app we will retrieve all the tags around their position sorted by distance.

GEORADIUS sanfrancisco -122.2 37.1 50 km ASC
1) "goldengate"

How does it work internally

As we mentioned before everything is stored inside Redis in the existing Sorted Set structures.

The way those zsets are leveraged is by using a score based on the latitude and longitude.  Basically, by generating the zset scores interleaving the bits of the latitude and longitude of each entry, you can later make queries to retrieve all the tags in a specific geographical square as a range of those scores.

That way, with 9 ranges you can cover all the areas around a specific point.  And those ranges can be of any size, so you can make queries using different radii just by trimming bits at the end of the score.

This technique is called geohashing and it makes these geo commands very easy to implement on top of sorted sets.
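
This is not Redis's actual implementation, but a rough Python sketch of the interleaving idea (the 26 bits per coordinate is an assumption based on the 52-bit scores Redis documents for its geohashes):

def geo_score(lon, lat, bits=26):
    # Quantize each coordinate to `bits` bits over its full range.
    lon_q = min(int((lon + 180.0) / 360.0 * (1 << bits)), (1 << bits) - 1)
    lat_q = min(int((lat + 90.0) / 180.0 * (1 << bits)), (1 << bits) - 1)
    score = 0
    for i in range(bits):
        # Interleave: longitude bits in even positions, latitude bits in odd ones.
        score |= ((lon_q >> i) & 1) << (2 * i)
        score |= ((lat_q >> i) & 1) << (2 * i + 1)
    return score

# Points that are close to each other fall in the same cell and share the
# high-order bits of the score, so a geographical square becomes a range
# of scores that can be queried on the sorted set.
print(geo_score(-122.0, 37.0))
print(geo_score(-122.0001, 37.0001))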

Hope this is useful for other people implementing similar services; the truth is I never stop being amazed by Redis...

* Disclaimer: I built that service with some friends, you can see it in http://www.pleason.com/

Monday, May 8, 2017

How to (not) reuse code between Android and iOS

Most of the mobile applications we build these days have to work on two different platforms (Android and iOS).  Each of these platforms has its own frameworks, tools and programming languages, so usually you end up building two completely separate applications, many times even built by separate teams.

[Note: If you are using some cross-platform development environment like react-native or Xamarin or building a Web/Hybrid app you are "lucky" and this post doesn't apply to you :)]

Unless you are working on a very simple app, at some point you will realize that there are some parts of the application that you are implementing twice because you need them on both platforms (for example some business logic or the code to make requests to the HTTP server APIs).

Based on the capabilities of Android and iOS you have basically two options:
Option 1: Implement everything twice using the official language and libraries of each platform (f.e. implement the access to HTTP APIs using Swift and URLSession in the iOS app and using Java and Volley in the Android app)
Option 2: Implement the reusable code in C++ and compile it into the iOS app (creating an Objective-C++ wrapper) and use it in the Android app (creating a JNI wrapper).

These are some possible advantages of Option 1:
  • Code is usually easier to read and maintain when written in modern languages (for example Swift vs C++).
  • Native integration: when using an Android library to make HTTP requests it will probably be integrated with the system proxy configuration and validate the SSL certificates with the system CAs by default.
  • No plumbing/boring code to write to provide access to the C++ library from the application (for example with JNI).  This can be partially mitigated using frameworks like SWIG to autogenerate the wrappers but it is still boring and usually problematic.
  • Simpler to debug because there is a single layer instead of having to make calls across layers with different technologies (for example with JNI).
  • Faster and simpler build process because fewer libraries/tools are involved (for example no NDK required)
These are some possible advantages of Option 2:
  • No duplicated code to develop and maintain.
  • Avoids inconsistencies in naming, algorithms and protocol implementations because everything is implemented in a single place.
  • Performance can be better, although this is not an issue in most cases.
As we can see there are important pros and cons to both options, so let's try another approach...  Let's check what other popular mobile libraries are doing.

I put some of those libraries in a diagram across two axes: Y for the size/complexity of the library and X for the number of platforms to support.  Another relevant variable could be how important performance optimisation is, but I didn't want to make a 3D diagram :)

In the diagram, blue marks libraries using Option 1 and green marks libraries using Option 2.
[Apology: I picked some popular libraries I have used in the past; the lines of code and number of platforms are just estimations, I didn't really count them]

As we can see, most of the popular libraries are using Option 1, reimplementing the library twice, once for Android and once for iOS.  On the other hand, some big libraries related to real time communications or databases are using Option 2, implementing the core in C++ and exposing it with wrappers to Java and Objective-C applications.

Conclusion

The right solution probably depends on the type of project and the team building it, but in my opinion in many (or most) cases it is less effort to develop and maintain two simple implementations than to write and maintain a single, more complex implementation plus the wrappers for the different platforms.  In addition, you can (and should) mitigate the issues of Option 1 by making use of tools to autogenerate code when possible, for example using protocol buffers/gRPC for the client-server communication or Swagger to generate clients for REST APIs.

I'm very interested in knowing your opinion on this topic.  What do you think?  What are you doing right now in your projects?

Wednesday, April 19, 2017

Multiplatform Travis Projects (Android, iOS, Linux in the same build)

Using travis to build and test your code is usually a piece of cake and highly recommended, but last week I tried to use travis for a not so conventional project and it ended up being more challenging than expected.

The project was a C library with Java and Swift wrappers and my goal was to generate Android, iOS and Linux versions of that library using Travis.  The main problem with my plan was that you have to define the "language" of the project in your .travis.yml file and in my case... should it be an android, objective-c or cpp project?

It would be great if travis supported multilanguage projects [1] or multiple yaml files per project [2], but apparently none of that is going to happen in the short term.

Linux
I decided to build the Linux part using docker to make sure I can use the same environment locally, in travis and in production.

iOS
Given that the only way to build an iOS project is using OSX images and that there is no docker support in travis for OSX, I had to use the multiple operating systems capability in travis [3].

Android
This ended up being the most challenging part.  Android projects require a lot of packages (tools, sdks, ndks, gradle...) so I decided to use docker also for this, to make sure I had the same environment locally and in travis.  There were some docker images for this and I took many ideas from them, but I decided to generate my own [4].

To avoid a too crazy .travis.yml file I put all the steps to install prerequisites and to launch the build process in shell scripts (2 scripts per platform).  That simplifies the travis configuration and also lets me reuse the steps if I eventually want to build locally or in jenkins.  My project folder looks like this:

    /scripts/ios
       before_install.sh
       build.sh
    /scripts/android
       before_install.sh
       build.sh
    /scripts/linux
       before_install.sh
       build.sh

The most interesting scripts (if any) are the android and ios ones.

    #!/bin/bash
    # iOS before_install step
    echo "no additional requirements needed"

    #!/bin/bash
    # iOS build step
    xcodebuild build -workspace ./project.xcworkspace -scheme 'MyLibrary' -destination 'platform=iOS Simulator,name=iPhone 6,OS=10.3'

    #!/bin/bash
    # Android before_install step: pull the docker image with the Android toolchain
    docker pull ggarber/android-dev

    #!/bin/bash
    # Android build step: run the gradle build inside the docker container
    docker run --rm -it --volume=$(pwd):/opt/workspace --workdir=/opt/workspace/samples/android ggarber/android-dev gradle build


With that structure and those scripts the resulting .travis.yml file is very simple:

language: cpp

sudo: required
dist: xenial

os:
  - linux
  - osx

osx_image: xcode8.3

services:
  - docker

before_install:
  - if [[ "$TRAVIS_OS_NAME" != "osx" ]]; then ./scripts/linux/before_install.sh  ; fi
  - if [[ "$TRAVIS_OS_NAME" != "osx" ]]; then ./scripts/android/before_install.sh ; fi
  - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then ./scripts/ios/before_install.sh     ; fi

script:
  - if [[ "$TRAVIS_OS_NAME" != "osx" ]]; then ./scripts/linux/script.sh  ; fi
  - if [[ "$TRAVIS_OS_NAME" != "osx" ]]; then ./scripts/android/script.sh ; fi
  - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then ./scripts/ios/script.sh     ; fi

This is working fine although the build process is a little bit slow so these are some ideas to explore to try to improve it in the future:
  • Linux and Android builds could run in parallel (see the sketch after this list).
  • Android docker images are very big (not only mine but all the ones I found).  According to docker hub it is a 2 GB compressed image.  There are probably ways to strip this down.
  • I'm not caching the android packages being downloaded during the build process inside the docker container.
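
For the first idea, this is an untested sketch of how a build matrix could split the three targets into parallel jobs (the TARGET variable is a made-up name, not something Travis defines):

matrix:
  include:
    - os: linux
      env: TARGET=linux
    - os: linux
      env: TARGET=android
    - os: osx
      osx_image: xcode8.3
      env: TARGET=ios

before_install:
  - ./scripts/$TARGET/before_install.sh

script:
  - ./scripts/$TARGET/script.sh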

[1] https://github.com/travis-ci/travis-ci/issues/4090
[2] https://github.com/travis-ci/travis-ci/issues/3540
[3] https://docs.travis-ci.com/user/multi-os/
[4] https://github.com/ggarber/docker-android-dev

Monday, February 6, 2017

Using Kafka as the backbone for your microservices architecture

Disclaimer: I only use the word microservices here to get your attention.  Otherwise I would say your platform, your infrastructure or your services.

In many cases, when your application and/or your team start growing, the only way to maintain a fast development and deployment pace is to split the application and the teams into smaller units.  In the case of teams/people that creates some interesting and not necessarily easier-to-solve challenges, but this post is focused on the problems and complexity created in the software/architecture part.

When you split your solution in many components there are at least two problems to solve:
  • How to pass the information from one component to another (f.e. how do you notify all the sub-components when a user signs up so that you send him notifications, start billing him, generate recommendations...)
  • How to maintain the consistency of all the partially overlapped data stored in the different components (f.e. how do you remove all the user data from all the sub-components when the user decides to drop out from your service)

Inter component communication

At a very high level there are two communication models that are needed in most of the architectures:
  • Synchronous request/response communications.  This has its own challenges and I recommend using gRPC and some best practices around load balancing, service discovery, circuit breakers... (find my slides for TEFCON 2016) but it is usually a well understood model.
  • Asynchronous event based communications where a component generates an event and one or many components receive it and implement some logic in response to that event.
The elegant way to solve this second requirement is having in the middle a bus or a queue (depending on the reliability guarantees required for the use case) where producers send events and consumers can read those events from it.    There are many solutions to implement this pattern but when you have to handle heterogeneous consumers (that consume events at different rates or with different guarantees) or you have a massive amount of events or consumers the solution is not so obvious.
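
As a rough illustration of that pattern, this is a hypothetical sketch using the kafka-python client (the topic, group and event names are made up) of one component publishing a "user signed up" event and another component reacting to it:

import json
from kafka import KafkaProducer, KafkaConsumer

# Producer side: the signup service publishes an event to the unified log.
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)
producer.send('user-events', {'type': 'user_signed_up', 'user_id': '1234'})
producer.flush()

# Consumer side: each interested component (billing, notifications...) reads
# the same topic with its own consumer group, at its own pace.
consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers='localhost:9092',
    group_id='billing-service',
    value_deserializer=lambda v: json.loads(v.decode('utf-8')),
)
for message in consumer:
    print('billing reacting to', message.value)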

Data consistency

The biggest problem to solve in pure microservices architectures is probably how to ensure data consistency.   Once you split your application in different modules with data that is not completely independent (at the very least they all have the information about the same users) you have to figure out how to maintain that information in sync.

Obviously you have to try to keep these dependencies and duplicated data as small as possible, but usually you at least have to solve the problem of having the same users created in all of them.

To solve it you need a way to sync data changes between components when that data is duplicated and needs to be updated somewhere else.  So basically you need a way to replicate data that ensures its eventual consistency.

The Unified Log solution

If you look at those two problems they can be reduced to a single one: to have a real-time and reliable unified log that you can use to distribute events among different components with different needs and capabilities.  That's exactly the problem that LinkedIn had and what they built Kafka to solve.  The post "The Log: What every software engineer should know about real-time data's unifying abstraction" is a highly recommended read.

Kafka decouples the producers from the consumers, including the ability to have slow consumers without affecting the rest of the consumers.  Kafka does that and at the same time supports very high event rates (it is common to have hundreds of thousands of events per second) with very low latencies (<20 msecs easily).  All of this while still being a very simple solution and providing some advanced features like organizing events in topics, preserving the ordering of the events or handling consumer groups.

Those Kafka characteristics make it suitable to support most of the inter-component communication use cases, including event distribution, log processing and data replication/synchronization.  All with a single simple solution, by modeling all these communications as an infinite list of ordered events accessible to multiple consumers through a centralized unified log.

This post was about Kafka but all (or most) of it is equally applicable to Amazon's Kafka clone, Kinesis.

You can follow me on Twitter if you are interested in Software and Real Time Communications.