Upcoming Vert.x Http Proxy updates

Introduction

vertx-http-proxy is a component in the Vert.x ecosystem often used as an HTTP reverse proxy. Coming from Google Summer of Code 2024, some new features are implemented under this project to make its usage easier and eliminate boilerplate code required for common scenarios. In this blog post, I introduce some features that are about to be included in the official release and how to use them, followed by reflections on the GSoC experience.

The updates mentioned in this post are not yet available from the maven repository. The actual results are subject to the final provided API.

Interceptors

During HTTP exchanges, developers can modify incoming requests and responses with interceptors. This update provides a quick way to create different types of interceptors. For example, In the previous version, if you wanted to remove certain fields from the header, you had to create the interceptor like this:

proxy.addInterceptor(new ProxyInterceptor() {
    @Override
    public Future<ProxyResponse> handleProxyRequest(ProxyContext context) {
    context.request().headers().remove("header-to-remove");
    return context.sendRequest();
    }
});

Now it’s possible to create an interceptor to filter headers with a single method:

proxy.addInterceptor(
    HeadersInterceptor.filterRequestHeaders(Set.of("header-to-remove")));

We can now also use the BodyInterceptor to conveniently modify the body of requests or responses. This interceptor supports various types of data by default, meaning that you can treat the body directly as either bytes, strings, or json types for transformation by using BodyTransformer. For example, to create an interceptor that removes the id field from each item in the response json array:

BodyInterceptor.modifyResponseBody(BodyTransformer.transformJsonArray(arr -> {
    arr.forEach(json -> ((JsonObject) json).remove("id"));
    return arr;
}))

Here’s a list of all the interceptors implemented:

  • HeadersInterceptor: Used to modify the headers of HTTP requests and responses;
  • PathInterceptor: used to modify the URI of the request;
  • QueryInterceptor: used to modify the query parameter of the request;
  • BodyInterceptor: used to modify the body of the request and response.

For modification details, see the PR here.

Additionally, it is now possible to apply interceptors to WebSocket handshake packets. This means that you can now modify the WebSocket request URL using an interceptor. Considering that in most cases you might not want to apply all interceptor transformations during the WebSocket handshake (which could lead to connection failures), interceptors are disabled for WebSocket by default. You can enable this behavior by overriding the allowApplyToWebSocket method. For interceptors that have already been created, you can also achieve this by wrapping them with WebSocketInterceptor:

ProxyInterceptor addPrefix = PathInterceptor.addPrefix("/api");
addPrefix = WebSocketInterceptor.allow(addPrefix); // now `addPrefix` can be applied to WebSocket	

For the WebSocket interceptor, see the changes in this PR.

Cache Support

The proxy supports caching responses. In previous versions, caching was only allowed for responses with the Cache-Control header marked as public. This update improves the caching mechanism and meets most of the requirements of RFC9111. The specific changes are as follows:

  • Implemented all cache request and response directives from RFC9111 (except no-transform);
  • By default, all responses are now allowed to be cached unless explicitly prohibited in the Cache-Control header;
  • More standard revalidation mechanism: expired caches will not be immediately deleted but will first be validated to determine if they can still be used;
  • Now revalidation supports both ETag and Last-Modified validators;
  • Fixed the issue where calling unsafe request methods would not invalidate the cache;
  • Fixed the issue with wrong caching responses when including the Vary header;
  • Implemented default heuristic freshness.

A Cache SPI is also currently provided to support custom different Cache implementations (e.g. on heap or Redis).

See code changes in for HTTP cache support in this PR and Cache SPI update in this PR.

Contextual Data

ProxyContext is used as a context for sharing state between different proxy components. In the latest version, it can be more generally shared between interceptors and the originSelector. Since interceptors now support WebSocket, the context can also be used during the WebSocket handshake.

The new version of the context also allows an additional map to be passed in as a contextual data holder when handling requests. This method enables the injection of custom context content at runtime. For example, to register a proxy as a handler for an HTTP server with custom context, you can using the following code:

server.requestHandler(request -> {
    Map<String, Object> customContext = new HashMap<>(); // inject any additional data here
    proxy.handle(request, customContext);
});

It is also planned for vertx-web-proxy to set the RoutingContext bound to the request as custom proxy context.

See this PR for the contextual data changes.

GSoC Experience & Learnings

The most interesting (and hard) of this project was the design process. Most of my work involved adding new features to an existing codebase, which required a lot of API design. The challenge was ensuring that the API was easy to understand and use while avoiding any disruptive changes to the original code, which significantly limited the final feasible design options.

For example, there was an issue regarding making the Interceptor support WebSocket. In the original code, WebSocket packets and regular HTTP packets were handled separately, so only the branch processing HTTP packets applied the interceptors. This was reasonable because the handling methods for the two were entirely different. However, the user’s request to implement this feature was also reasonable, as the need to modify the WebSocket path exists. Balancing these two requirements took a lot of time during the “finding a feasible solution” phase.

Even after the feature was implemented, I had to consider user experience. If the interceptor were to handle all types of requests by default, users would need to add a condition to all regular interceptors to skip WebSocket traffic, as most request transformations shouldn’t apply to them. Therefore, I have to think about a solution where the interceptor doesn’t affect WebSocket by default, which also took considerable time.

In fact, the effective time spent writing code was only one day or even half a day, while the design time was one to two times that. So, if I had to summarize the most important lessons:

  • Design process is more important than coding; it’s best to explicitly allocate time for it during planning.
  • It’s better to seek feedback after completing the design immediately rather than jumping straight into coding, as this can easily lead to unexpected outcomes.

And finally, I would like to thank my mentor, Thomas Segismont, for his helpful advice and support throughout the project.

Posted on 23 August 2024
in news
5 min read

Related posts