Translation¶
beng-proxy knows two ways to locate the resource a request URI points to:
via an external translation server
static translation
The latter is only for debugging. The URI path is appended to the
document root (/var/www by default). For security (by
obscurity) reasons, beng-proxy has no code for generating
directory listings. If the request has a trailing slash,
beng-proxy looks for a file named index or
index.html and serves it. Without the trailing slash,
beng-proxy refuses to handle the request.
The translation server should be the default on production servers. It is a daemon on the same physical machine which does all the translation work for us. beng-proxy connects to a Unix socket to contact this translation server.
A request may consist of several micro commands. The request is
initialized with the BEGIN command, which is followed by any
number of commands which provide parameters. After all parameters have
been transferred, the client sends the END command, and waits for
the server’s response.
The client can send any number of requests over the socket until one side closes the connection.
Example conversation¶
client sends
BEGIN“\x03”client sends
REMOTE_HOST“192.168.1.77:1234”client sends
HOST“www.example.com”client sends
URI“/foo/index.html”client sends
ENDserver sends
BEGIN“\x01”server sends
PATH“/var/www/foo/index.html”server sends
CONTENT_TYPE“text/html; charset=utf8”server sends
PROCESSserver sends
END
Command packets¶
The protocol is binary and uses host byte order. A command packet may look like this in pseudo C:
struct beng_proxy_translate_packet {
uint16_t length;
uint16_t command;
char payload[length];
};
The length only refers to the payload. The maximum supported payload
size is 65535 bytes.
Most parameters are ASCII strings; in this case, the payload contains just the raw string, without terminating zero.
Request¶
BEGIN: Begins the request. The payload is a 8-bit unsigned integer specifying the protocol version. The protocol version described here is 3.END: Finishes the request.
LISTENER_TAG: Thetagof the listener (as specified in thelistenerconfiguration section) that accepted the connection.REMOTE_HOST: the client’s address or host name and the port number (as string) (This packet optional and is only submitted if requested viaWANT, see page )HOST: theHostHTTP request headerURI: the raw URI from the HTTP request (without the query string)QUERY_STRING: the query string from request URI, without the question mark (This packet optional and is only submitted if requested viaWANT, see page )SESSION: a session identifier generated by the translation server, see section SessionsREALM_SESSION: LikeSESSION, but realm-local. UnlikeSESSION, it is only sent under certain conditions (e.g. inTOKEN_AUTHrequests), because the realm is only known after the regular translation response has been applied already.PARAM: a parameter passed by the browserUSER_AGENT: theUser-Agentrequest header sent by the client (not in the widget registry) (This packet optional and is only submitted if requested viaWANT, see page )USER: the user name currently logged in usingAUTH; see page (This packet optional and is only submitted if requested viaWANT, see page )LANGUAGE: theAccept-Languagerequest header sent by the client (not in the widget registry) (This packet optional and is only submitted if requested viaWANT, see page )AUTHORIZATION: theAuthorizationrequest header sent by the client (see RFC 2617); only for HTTP-level Authentication.CONTENT_TYPE_LOOKUP: Look up theContent-Typeof a file name suffix. See Content-Type Lookup for a detailed description.SUFFIX: The file name suffix without the dot forCONTENT_TYPE_LOOKUP. See Content-Type Lookup for a detailed description.ERROR_DOCUMENT: a resource has failed, and the translation server is asked to provide the location of the error document. This is followed by the packetsURIandSTATUS. See Error documents for a detailed description.PROBE_PATH_SUFFIXES: Result ofPROBE_PATH_SUFFIXES. This is an echo of thePROBE_PATH_SUFFIXESfrom the previous translation response. If a file with one of the given suffixes exists, thenPROBE_SUFFIXspecifies the first existing suffix. If noPROBE_SUFFIXfollows, then no file was found.PATH_EXISTS: This is an echo ofPATH_EXISTSfrom the previous translation response, accompanied bySTATUSdescribing whether the given file exists.FILE_NOT_FOUND: The specified file does not exist. The translation server is asked to provide an alternate translation. This is an echo of theFILE_NOT_FOUNDfrom the previous translation response.ENOTDIR: The specified file does not exist, but a portion of the path points to a regular file. This is an echo of theENOTDIRpacket from the previous translation response. The given URI has been shortened: the last slash and what follows has been moved toPATH_INFO. This may be repeated until the regular file has been found.DIRECTORY_INDEX: The specified file is a directory. The translation server is asked to provide an alternate translation. This is an echo of theDIRECTORY_INDEXfrom the previous translation response.
WANT: causes beng-proxy to submit the same translation request again, with this packet echoed plus the requested packets. The payload is an array of 16-bit integers with requested packet ids. The following packets are allowed/supported here:LISTENER_TAG,REMOTE_HOST,USER_AGENT,USER,LANGUAGE,ARGS,QUERY_STRINGWANT_FULL_URI: causes beng-proxy to submit the same translation request again, with this packet appended (its payload is opaque to beng-proxy), and with the full request URI (including semicolon-arguments and the follow-up suffix, but excluding the query string).INTERNAL_REDIRECT: causes beng-proxy to submit the same translation request again, with this packet appended (its payload is opaque to beng-proxy). However, instead of the original request URI, beng-proxy uses the one from this responses’sURIorEXPAND_URIpacket.CHECK: causes beng-proxy to submit the same translation request again, with this packet appended (its payload is opaque to beng-proxy). The current response is remembered, to be used when the second response contains thePREVIOUSpacket. This can be used to implement authentication (see Authentication).CHECK_HEADER: theCHECKrequest shall contain the specified request header. Payload is the header name (lower case). For theCHECKrequest, the payload is the header name and the value separated by a colon; if no such request header exists, the value is empty.AUTH: Indicates that authentication is necessary (see The AUTH packet).READ_FILE: This is a repeated translation in reply to a translation response with aREAD_FILEpacket. The payload is the file contents or empty if the file does not exist (or if there was another problem reading the file). This packet is implicitly on “vary”.
Response¶
BEGIN: Begins the response. The payload is a 8-bit unsigned integer specifying the protocol version. The initial protocol version is 0.END: Finishes the response.URI: the “real” raw URI from the HTTP request (without the query string); this is used to override the URI, e.g. when beng-proxy is behind another proxy which modifies the URIEXPAND_URI: OverrideURIwith the given value (after expanding).HOST: the host name for generating absolute URLs; default is theHostHTTP request headerSCHEME: the scheme for generating absolute URLs; default ishttp. This packet is useful if beng-proxy is behindstunnelALLOW_REMOTE_NETWORK: Allow only clients with addresses in the specified network; all other addresses get a “403 Forbidden” response. The payload is astruct sockaddr_inorstruct sockaddr_in6plus one byte specifying the prefix length (in bits). This packet may be sent more than once.UNTRUSTED: sets the “untrusted” host name for this request: only untrusted widgets matching this host name are allowed. Trusted widgets are rejected.STATUS: HTTP status code, encoded asuint16_t; this parameter is usually not usedHTTP: load the resource from a remote HTTP server (see HTTP proxying). Payload is an absolute URI starting withhttp://orhttps://.HTTP2: force HTTP/2 for the precedingHTTPpacket. No payload.CERTIFICATE: Use the named client certificate for the outbound SSL connection (see CERTIFICATE).PIPE: a local program which reads input from stdin and prints the modified resource on stdout (see Pipe filters).LHTTP_PATH: a local path which is executed as HTTP serverLHTTP_URI: the request URI forLHTTP_PATHEXPAND_LHTTP_URI: the regular expression rule forLHTTP_URILHTTP_HOST: the “Host” request header forLHTTP_PATHCONCURRENCY: a 16 bit integer specifying the maximum number of concurrent requests to this server (FastCGI, LHTTP and Multi-WAS only)PARALLELISM: a 16 bit integer specifying the maximum number of parallel child processes of this kind (FastCGI, WAS, Multi-WAS, LHTTP)DISPOSABLE: Mark the child process as “disposable”, which may give it a very short idle timeout (or none at all). To be used for processes that will likely only be used once.NON_BLOCKING: If present, make the socket passed to a child process non-blocking (LHTTP only currently). This is needed by NodeJS 0.12.CGI: a local path which is executed as CGI script (see CGI, FastCGI, WAS and Pipe)FASTCGI: a local path which is executed as FastCGI script (see CGI, FastCGI, WAS and Pipe)WAS: a local path which is executed as WAS application (see CGI, FastCGI, WAS and Pipe). May be followed byCONCURRENCYto enable Multi-WAS mode.REDIRECT: another alternative toPATH: redirect the HTTP client to this URL;STATUSmust be set to one of the HTTP 3xx codesEXPAND_REDIRECT: OverrideREDIRECTwith the given value (after expanding); see Response.REDIRECT_QUERY_STRING: Append the query string to the givenREDIRECTURL.REDIRECT_FULL_URI: Use the full request URI path (including semicolon-arguments and the follow-up suffix, but excluding the query string) for expandingREDIRECT. This packet must be preceded byBASE,EASY_BASEandREDIRECT. It makes sense to combine it withREDIRECT_QUERY_STRING.
HTTPS_ONLY: Allow this request to be handled only on encrypted connections (HTTPS with SSL/TLS). If the connection is encrypted, then this is a no-op. If it is not encrypted, the server generates a permanent redirect tohttps://. The payload may contain a 16 bit integer specifying the port number (zero means default port).BOUNCE: Redirects the browser with a303 See Otherstatus to this URI, and appends the current absolute URI (form-encoded). This is useful to redirect to another server, which will need to redirect back to the original URI.MESSAGE: Generate a response with the given body (text/plainand US-ASCII).TINY_IMAGE: Generate a response with a tiny (one-pixel GIF) image.EXPAND_PATH: Override thePATHwith the given value (applicable to static files, CGI, FastCGI, WAS,HTTP). Backslash references are expanded to the value of the match group ofREGEX. In the presence of this packet, the URI suffix after the base will not be appended to other paths. The translation server is responsible for ensuring that the resulting path cannot point to files that are not supposed to be published. beng-proxy disallows/../sequences in the URI tail string, but it may nonetheless be possible for an attacker to break out if the regular expression and the expansion string are phrased improperly. (Since version 2.0.5)LISTENER_TAG: override theLISTENER_TAG. All following translation requests will feature the new listener tag.SITE: optional identification or name of the site this resource belongs toEXPAND_SITE: provide a cache expansion for the precedingSITESESSION_SITE: Set aSITEfor all requests in the current session. This packet with an empty payload can be used to clear the session’sSITEvalue.RATE_LIMIT_SITE_REQUESTS: limit the rate of requests to this site. Payload is two 32-bit floats describing the rate and burst for the underlying token bucket. Requests that fail the token bucket get a “429 Too Many Requests” response.RATE_LIMIT_SITE_TRAFFIC: limit the traffic rate of requests to this site. Payload is two 32-bit floats describing the rate [bytes per second] and burst [bytes] for the underlying token bucket. Requests that fail the token bucket get a “429 Too Many Requests” response.DOCUMENT_ROOT: base directory of the site; may also be passed after aCGIcommand, to set the document root only for this CGIFILTER: the next resource address (HTTP,CGI) will denote an output filter, see section Filters for details.CHAIN: similar toFILTER, but the translation server is asked again after the current response has been generated. See section Chains for details.
CACHE_TAG: Mark a cache item with this tag (an opaque string). This can be used to flush/invalidate groups of cache items in one control command. The following parts of the response can be tagged:After
FILTER: for filter cache items, to be used with FLUSH_FILTER_CACHE.After a HTTP resource address (e.g.
HTTP,FASTCGI,WAS): for HTTP cache items, to be used with FLUSH_HTTP_CACHE.Prior to any of the above: for the whole translation response (i.e. the translation cache item), to be used with TCACHE_INVALIDATE.
REVEAL_USER: If present afterFILTER, then the filter will seeX-CM4all-BENG-Useras an additional request header (if a user is logged in).FILTER_4XX: Enable filtering of client errors (status 4xx). Without this flag, only successful responses (2xx) are filtered. Only useful when at least oneFILTERwas specified.PROCESS: enables the beng-proxy processor, see section The Beng Template LanguagePROCESS_TEXT: enables the beng-proxy text processor (Since version 1.3.2)PROCESS_CSS: enables the beng-proxy CSS processorDOMAIN: the domain name for partitioned framesSESSION: a session identifier generated by the translation server, see section SessionsRECOVER_SESSION: A token to be stored in a browser cookie which can later be used by the translation server to recover the current session. In particular, it will be sent back to the translation server in a Token Authentication request.
ATTACH_SESSION: Attach to an existing session (or mark this session to be attached by others with the same identifier). The payload is a non-empty unique identifier for sessions to be attached/merged. This value can also be used to discard the session using the DISCARD_SESSION control packet.USER: the user name associated with this session
REALM: a realm name for this session. An existing session matches only if its realm matches the current request’s realm; on mismatch, a new session with the same public id is created for this realm. If this packet is not specified in the translation response, then the “Host” request header is used.REALM_FROM_AUTH_BASE: Copy theAUTHorAUTH_FILEcontents toREALM(i.e. withoutAPPEND_AUTH).TRANSPARENT: Transparent proxy: forward URI path segment params to the request handler instead of using them. This disables legacy handling of these params (which was used to control widget rendering).LANGUAGE: overrides theAccept-Languagerequest header for this sessionDISCARD_SESSION: discard the current browser sessionDISCARD_REALM_SESSION: Like ``DISCARD_SESSION`, but discard only the part of the session that is specific to the current realm (see t_realm).SECURE_COOKIE: Set the “secure” flag on the session cookie.SESSION_COOKIE_SAME_SITE: Set the “SameSite” attribute on the (realm) session cookie. Valid payloads arestrict,laxandnone(all lower case).CHDIR: change the working directory (after namespace setup).HOME: home directory of the account this site belongs to; will be mounted in the jail; defaults toDOCUMENT_ROOTEXPAND_HOME: Expansion forHOME.ADDRESS: after eachHTTPpacket, there must be one or moreADDRESSpackets which specify the resolved addresses. The payload of each is astruct sockaddr.STICKY: Make the resource address “sticky”, i.e. attempt to forward all requests of a session to the same worker.VIEW: starts a new view; the body of the packet is the name of the view (ASCII letters, digits, underscore, dash only). Each view can have different address/processor/filter settings. The first view (the one before the firstVIEWpacket) is the default and has no name.MAX_AGE: a 32 bit unsigned integer specifying the number of seconds the preceding piece of information is valid without having to revalidate. A value of 0 specifies that beng-proxy should not remember this value at all. Without this packet, the maximum age is not limited. Currently, this is only supported for the following packets:BEGIN(refers to the whole translate response)USER
VARY: similar to the HTTPVaryresponse header; the payload contains an array of translation request commands which this response depends upon.The following request packets are currently supported:
PARAM,SESSION,LISTENER_TAG,REMOTE_HOST,HOST,LANGUAGE,USER_AGENT,QUERY_STRING,USER,INTERNAL_REDIRECT,ENOTDIR.The following request packets are on “vary” implicitly:
WIDGET_TYPE,CONTENT_TYPE_LOOKUP,URI,STATUS,CHECK,WANT_FULL_URI,PROBE_PATH_SUFFIXES,PROBE_SUFFIX,PATH_EXISTS,FILE_NOT_FOUND,DIRECTORY_INDEX,WANT.INVALIDATE: Invalidates existing translation cache items which depend on some of the request values. The payload has the same format asVARY. Additionally, theURIcommand is supported, to invalidate all items pointing to the request URI, andSITEto invalidate all items with the given site name.If you specify more than one command, all must match. If you list a command which was not specified in the request (or a command which is not supported here), nothing will be deleted.
Example:
INVALIDATEonSESSIONinvalidates all cache items for the current session.REQUEST_HEADER_FORWARD: See Forwarding HTTP HeadersRESPONSE_HEADER_FORWARD: See Forwarding HTTP HeadersWWW_AUTHENTICATE: theWWW-Authenticateresponse header sent to the client (see RFC 2617). Currently, this is never cached. This exact behavior is subject to change in the future, and will be cacheable.AUTHENTICATION_INFO: theAuthentication-Inforesponse header sent to the client (see RFC 2617).HEADER: A custom HTTP response header sent to the client. Name and value are separated by a colon (without any whitespace). This will not override existing headers. It is not allowed to set hop-by-hop headers (RFC 2616 13.5.1) this way. This packet shall only be a last resort, when there is no other way to set a required response header.EXPAND_HEADER: Same asHEADER, but expand the value.REQUEST_HEADER: A custom HTTP request header for the backend server. Name and value are separated by a colon (without any whitespace). This will override existing headers. It is not allowed to set hop-by-hop headers (RFC 2616 13.5.1) this way.EXPAND_REQUEST_HEADER: Same asREQUEST_HEADER, but expand the value.CONTENT_TYPE_LOOKUP: Indicates that the translation server is willing to look upContent-Typeby file name suffix. See Content-Type Lookup for a detailed description.ERROR_DOCUMENT: Indicates that the translation server is willing to provide a custom error document. See Error documents for a detailed description.PROBE_PATH_SUFFIXES: Check if theTEST_PATH(orEXPAND_TEST_PATH) plus one of the suffixes fromPROBE_SUFFIXexists (regular files only). beng-proxy will send another translation request, echoing this packet and echoing thePROBE_SUFFIXthat was found. This packet must be followed by at least twoPROBE_SUFFIXpackets.PATH_EXISTS: Check if the givenPATHexists; the translation shall be repeated, echoing this packet accompanied by aSTATUSpacket describing whether the given file exists (200 or 404).FILE_NOT_FOUND: Indicates that the translation server would like to provide an alternate translation when the specified file does not exist. beng-proxy will repeat the translation request with this packet echoed. This is supported by the following address types:PATH,CGI,FASTCGI,WAS,LHTTP_PATH.ENOTDIR: Indicates that the translation server would like to provide an alternate translation when the specified file does not exist, but a portion of the path points to a regular file.DIRECTORY_INDEX: Indicates that the translation server would like to provide an alternate translation when the specified file is a directory. beng-proxy will repeat the translation request with this packet echoed.DIRECTORY_INDEX_SLASH: IfDIRECTORY_INDEXapplies but the request URI path does not end with a slash, automatically send a redirect appending the slash.TEST_PATH: Test the specified file. If this packet is not present, then the path from the resource address is used (PATH,CGI,FASTCGI,LHTTP_PATH). Affects the packetsFILE_NOT_FOUND,DIRECTORY_INDEX,ENOTDIR.EXPAND_TEST_PATH: Override theTEST_PATHwith the given value. Backslash references are expanded to the value of the match group ofREGEX. (Since version 4.0.34)COOKIE_DOMAIN: Set the session cookie’s “Domain” attribute.COOKIE_HOST: Override the cookie host name. This host name is used for storing and looking up cookies in the jar. It is especially useful for protocols that don’t have a host name, such as CGI.EXPAND_COOKIE_HOST: Expansion forCOOKIE_HOST.COOKIE_PATH: Override the cookie’sPathattribute. This is sent to the client when beng-proxy generates a new session cookie. Be careful with overlapping locations that create conflicting cookies.VALIDATE_MTIME: A cached response is valid only if the file specified in this packet is not modified. The first 8 bytes is the mtime (seconds since UNIX epoch), the rest is the absolute path to a regular file (symlinks not supported). The translation fails when the file does not exist or is inaccessible. The special value 0 matches only when the file does not exist; as soon as the file appears, the cached response will be discarded.READ_FILE: Asks beng-proxy to read the specified (small) file and submit another translation request with the file contents in anotherREAD_FILEpacket.EXPAND_READ_FILE: Expansion forREAD_FILE.
DEFER: Defer the request to the next translation server.PREVIOUS: Tells beng-proxy to use the resource address of the previous translation response. Only allowed if the request contains aCHECKorAUTHpacket.UNCACHED: Disable the HTTP cache for the given resource address.IGNORE_NO_CACHE: Ignore theCache-Control:no-cacherequest header, i.e. don’t allow the client to circumvent the HTTP cache.EAGER_CACHE: Enable caching for the given resource address, even if it is not declared to be cacheable.DISCARD_QUERY_STRING: Discard the query string from the request URI. This can be combined withEAGER_CACHEto prevent cache-busting with random query strings.NO_QUERY_STRING: No query string is allowed/supported on this request URI. The webserver is allowed to reject requests with a query string.AUTO_FLUSH_CACHE: All (successful) modifying requests (POST,PUT…) flush the HTTP cache of the specifiedCACHE_TAG.GENERATOR: A short symbolic identifier (alphanumeric, underscore, dash) for the entity that generates the HTTP response (according to the rest of this translation response). If non-empty, then this will set theGENERATORattribute in access log datagrams. Without this packet, the value of theX-CM4all-Generatorresponse header is used.
To send a standard error page, the translation server sends a response
containing only the STATUS parameter with the desired HTTP status.
Sending a packet twice is regarded an error. It cannot be used to override a previous value.
Caching¶
Almost all translation responses must be cacheable. The following response packets allow reusing cache items for different requests:
LIKE_HOST: Repeat the translation, but with the specifiedHOSTvalue (which can be an artificial name, even one which is not RFC-valid). This allows sharing the translation cache between different hosts. It can be combined withBASEandREGEXto share only a part of the URI location space.BASE: Defines a realm in the URI space. The payload specifies the URI prefix (of the original request URI, ending with a slash) which contains this realm. All resources in this realm can be addressed by beng-proxy with a trivial pattern: append the relative URI (within the realm) to the resource address (e.g. thePATH,HTTPorPATH_INFOvalue).The address in this response applies to request URI, not the base URI (to allow backwards compatibility with translation clients which do not support this packet).
Example: in the request,
URIis/foo/bar/index.html; in the response,PATHis/var/www/foo/bar/index.htmlandBASEis/foo/. The beng-proxy translation cache now knows: if a request on/foo/test.pngis received, it can serve/var/www/foo/test.pngwithout querying the translation server.UNSAFE_BASE: Modifier forBASE: omit the security checks. This allows/../to be part of the remaining URI, possibly allowing clients to break out of the given directory.EASY_BASE: ModifierBASEwhich aims to simplify its usage: the resource address given in the response refers to theBASE, not to the actual request URI. It is important to include the trailing slash which is part ofBASEin the resource address (e.g.BASE=”/foo/”,PATH=”/var/www/foo/”). beng-proxy applies the URI suffix before handling the HTTP request.REGEX: Reuse a cached response only if the requestURImatches the specified regular expression (Perl compatible, anchored). This works only when a BASE was specified. (Since version 1.3.2)INVERSE_REGEX: Don’t apply the cached response if the requestURImatches the specified regular expression (Perl compatible, anchored). (Since version 1.3.2)REGEX_TAIL: ApplyREGEXandINVERSE_REGEXto the URI suffix followingBASEinstead of the whole request URI. (Since version 4.0.21)REGEX_RAW: By default, URI paths are normalized when expanding a cached translation response (i.e. mutliple consecutive slashes are compressed to one and occurrences of/./are compressed to/). This option disables the URI path normalization.REGEX_UNESCAPE: Unescape the URI forREGEX.INVERSE_REGEX_UNESCAPE: Unescape the URI forINVERSE_REGEX.REGEX_ON_HOST_URI: Prepend theHostheader to the string used withREGEXandINVERSE_REGEX.REGEX_ON_USER_URI: Prepend the user name (fromUSER) and a ’@’ to the string used withREGEXandINVERSE_REGEX.LAYOUT: The translation server gives an overview of the URI layout. Its payload is a non-empty opaque value which is mirrored in the next request.This packet is followed by one or more
URI/BASE/REGEXpackets specifying exact URI matches, URI bases or regular expressions which shall not share cache items. The first matching base/regex specfies where translation cache items will be stored; all URIs without a match have their own cache.This way, cacheable URI bases can be constructed easily without excessively complex
INVERSE_REGEXpackets.Example for a response after a request to
/.cm4all/foo:BASE=/LAYOUT=[opaque]URI=/robots.txtBASE=/.cm4all/private/BASE=/.cm4all/BASE=/.well-known/REGEX=\.php$
Here, the whole host is separated into three bases (the three which are specified, and everything else). Responses don’t need
INVERSE_REGEXto exclude the specified bases.The following request will mirror the
LAYOUTpacket and the matchingURI/BASE/ ``REGEX` packet:URI=/.cm4all/fooLAYOUT=[opaque]BASE=/.cm4all/
The server recognizes that this is a follow-up request, and responds:
BASE=/.cm4all/EASY_BASEPATH=/var/www/cm4all/
This response can be cached and reused for everything below
/.cm4all/, except for URIs below/.cm4all/private/.If
LAYOUTis followed byREGEX_TAIL, then all regular expressions (and other URI comparisons) are matched against the tail of the URI after the givenBASE. ExampleLAYOUTresponse:BASE=/foo/LAYOUT=[opaque]REGEX_TAILURI=hello.txtBASE=bar/REGEX=\.php$
In the follow-up request, these are mirrored; for example, after a request to
/foo/hello.txt, the next translation request looks like this:URI=/foo/hello.txtLAYOUT=[opaque]URI=hello.txt
Note how there are now two
URIpackets: the first one is the actual request URI and the second one mirrors the matchingLAYOUTitem.As a shortcut for implementing CORS, a layout item may be followed by
ACCESS_CONTROL_ALLOW_ALL. All matchingOPTIONSrequests will then lead to an empty response withAccess-Control-Allow-{Origin,Methods,Headers}: *. Use this for API endpoints with unrestricted script access to avoid roundtrips to the actual API process.
Static files¶
See Static files for an explanation of static file resources.
The response packet PATH declares a static file that will be
served. The following packets are available:
PATH: Absolute path of the local file to be served.EXPAND_PATH: Override the path with the given value (after expanding); see Response.APPEND_PATH: Append this string to thePATH(after applyingBASEorEXPAND_PATH).AUTO_BROTLI_PATH: Build the precompressed Brotli path by appending.brto thePATH.GZIPPED: Absolute path of a precompressed version of the file. The file is compressed withgzip. May follow thePATHpacket.AUTO_GZIPPED: Build the precompressed path by appending “.gz” to thePATH. UnlikeGZIPPED, this is compatible withBASE.AUTO_GZIP: Compress the response on-the-fly if the client accepts thegzipencoding. This consumes a lot of CPU and should only be used for dynamic responses which can be compressed well.AUTO_BROTLI: Compress the response on-the-fly if the client accepts thebrencoding. This consumes a lot of CPU and should only be used for dynamic responses which can be compressed well.AUTO_COMPRESS_ONLY_TEXT: applyAUTO_GZIPandAUTO_BROTLIonly to text responses.CONTENT_TYPE: MIME type of the file (optional)EXPIRES_RELATIVE: Generate anExpiresresponse header. The payload is a 32 bit integer specifying the number of seconds from now.EXPIRES_RELATIVE_WITH_QUERY: LikeEXPIRES_RELATIVE, but this value is only used if there is a non-empty query string. This is useful for serving static files which are usually referenced with a version number in the query string.BENEATH: Absolute path of a directory that thePATHshall not escape, not even using symlinks. This is implemented using theRESOLVE_BENEATHflag of Linux’sopenat2()system call.
Proxying requests¶
When proxying HTTP requests with the a HTTP packet,
beng-proxy forwards the request to the specified location
(with headers filtered as described in Forwarding HTTP Headers), including
the HTTP method and the request body. There is one exception: if
PROCESS is enabled and a widget is focused (see Focus), the
other HTTP server receives a GET request without a body, because
the focused widget is going to receive the request body.
If the filter URL starts with a slash, beng-proxy assumes it is the absolute path to a Unix socket.
CGI, FastCGI, WAS and Pipe¶
The protocols CGI, FastCGI and WAS can be used to generate or filter resources (see CGI and FastCGI and WAS). A “pipe” can be used as a filter (see Pipe filters). The following packets are used to choose the protocol:
CGI: a local path which is executed as CGI scriptFASTCGI: a local path which is executed as FastCGI script. To connect to an existing FastCGI server, specify one or moreADDRESSpackets.WAS: a local path which is executed as WAS applicationPIPE: a local program which reads input from stdin and prints the modified resource on stdout
The following packets can be used to specify more details:
EXPAND_PATH: Override the executable path with the given value (after expanding); see Response.APPEND: appends an argument to the command lineEXPAND_APPEND: provide a cache expansion for the precedingAPPENDPAIR: adds a FastCGI/WAS parameter in the formKEY=VALUE.EXPAND_PAIR: provide a cache expansion for the precedingPAIRSETENV: adds an environment variable for CGI, FastCGI, WAS or LHTTP in the formKEY=VALUE.EXPAND_SETENV: provide a cache expansion for the precedingSETENVPATH_INFO: optional URI substring which was left after finding the fileEXPAND_PATH_INFO: Override thePATH_INFOwith the given value. Backslash references are expanded to the value of the match group ofREGEX. In the presence of this packet, the URI suffix after the base will not be appended to other paths. (Since version 2.0.4)DOCUMENT_ROOT: set the document root passed to this CGI processEXPAND_DOCUMENT_ROOT: Override theDOCUMENT_ROOTwith the given value. Backslash references are expanded to the value of the match group ofREGEX. (Since version 6.0)INTERPRETER: run a CGI script with the specified interpreter: invokes the specified interpreter with the mapped file path added as a command-line argument. This can be used to run Perl scripts without setting the “execute” bit.ACTION: run the specified CGI program instead of the mapped file. This program reads the mapped file path fromSCRIPT_FILENAMEand loads this script. This is modeled after the Apache directiveAction, and implements a protocol understood by PHP and COMA.SCRIPT_NAME: theSCRIPT_NAMEenvironment variable for a CGIEXPAND_SCRIPT_NAME: Override theSCRIPT_NAMEwith the given value. Backslash references are expanded to the value of the match group ofREGEX. (Since version 4.0.33)AUTO_BASE: Auto-calculate theBASEfromPATH_INFO(only CGI, FastCGI and WAS)REQUEST_URI_VERBATIM: Pass the CGI parameterREQUEST_URIverbatim instead of building it fromSCRIPT_NAME,PATH_INFOandQUERY_STRING. (Since version 16.29)
See Resource Limits for how to configure resource limits and Namespaces for how to configure namespaces.
Local HTTP¶
|l|X|
APPEND: appends an argument to the command lineEXPAND_APPEND: provide a cache expansion for the preceding
APPENDSee Resource Limits for how to configure resource limits and Namespaces for how to configure namespaces.
Forwarding HTTP Headers¶
There are two translation packets which control which HTTP headers are going to be forwarded:
REQUEST_HEADER_FORWARD: this packet specifies which request headers are forwarded to the request handler. The payload is a list of group/mode pairs (struct beng_header_forward_packet).RESPONSE_HEADER_FORWARD: same asREQUEST_HEADER_FORWARD, but applies to response headers forwarded to the client.
Group is one of:
IDENTITY: headersVia,X-Forwarded-For,X-CM4all-GeneratorCAPABILITIES:Server,User-Agent,Accept-*COOKIE:Cookie[2],Set-Cookie[2]FORWARD: forward information about the original request/response that would usually not be visible. If set toMANGLE, thenHostis translated toX-Forwarded-Host.CORS: forward CORS request/response headersSECURE: forward “secure” request/response headers such asX-CM4all-BENG-UserSSL: forward information about the SSL connection, i.e.X-CM4all-HTTPS(set toonif the request was received on a SSL/TLS connection, see SSL/TLS),X-CM4all-BENG-Peer-SubjectandX-CM4all-BENG-Peer-Issuer-Subject(see Client Certificates)TRANSFORMATION: forward headers that affect the transformation (i.e.X-CM4all-View)LINK: forward headers that contain links, such asLocation,Content-LocationandReferer; if set toMANGLE, then beng-proxy attempts to rewrite theLocationURI relative to itselfAUTH: forward HTTP authentication headers (e.g. basic/digest auth), such asWWW-Authenticate,Authentication-Infoandauthorization; if set toMANGLE, then beng-proxy allows the translation server to handle HTTP authentication. The default isNOfor request headers andMANGLEfor response headers.MANGLEon the request header settings generates anAutorizationrequest header containingbearer USER, whereUSERis the current user as specified by theUSERtranslation response packet. This can be used for servers which do not understand theX-CM4all-BENG-Userrequest header (from header groupSECURE).OTHER: other end-to-end headers not explicitly mentioned hereALL: all of the above except forSECURE,SSLandAUTH
Mode is one of:
NO: don’t forward the headersYES: forward the headersMANGLE: beng-proxy processes the headersBOTH: both beng-proxy and the backend server process the headers (special case for cookie headers, which is a combination ofYESandMANGLE)
beng-proxy’s session management is only active when
COOKIE is MANGLE (which is the default) or BOTH. The
behavior of the COOKIE setting on widgets is undefined.
Resource Limits¶
The packet RLIMITS specifies Linux resource limits for child
processes. Its payload is a string, a sequence of resource limit codes
and their respective limit values. The following resource limits are
supported:
t(CPU): CPU time limit in seconds.f(FSIZE): The maximum size of files that the process may create.d(DATA): The maximum size of the process’s data segment.s(STACK): The maximum size of the process stack, in bytes.c(CORE): Maximum size of core file.m(RSS): The limit of the process’s resident set, in pages.u(NPROC): The maximum number of processes that can be created for the real user ID.n(NOFILE): The maximum file descriptor number that can be opened by this process.l(MEMLOCK): The maximum number of bytes of memory that may be locked into RAM.v(AS): The maximum size of the process’s virtual memory (address space) in bytes.i(SIGPENDING): The maximum number of signals that may be queued.q(MSGQUEUE): The maximum number of bytes that can be allocated for POSIX message queues.e(NICE): A ceiling to which the process’s nice value can be raised.r(RTPRIO): Ceiling on the real-time priority that may be set for this process.
The letter in the first column is the code for the payload, to be followed by ’!’ (for “unlimited”) or the numeric limit value (with optional prefix “K”, “M” or “G” for “kibi”, “mebi”, “gibi”).
The limits are applied to both “soft” and “hard” by default. The code
S changes all following specifications to “soft” only, and H
does the same for “hard”.
Example:
c!Sv1Gn256Hn512
Explanation:
c!unlimited core file size (both soft and hard)S: the following will be soft limitsv1G: limit address space to \(1 GiB\) (soft; the hard limit is unchanged)n256: maximum 256 file descriptors (soft)H: the following will be hard limitsn512: maximum 256 file descriptors (hard)
Namespaces¶
Child processes such as FastCGI programs can run in separate Linux namespaces to improve separation from the rest of the server. That requires a fairly new Linux kernel.
Articles on http://lwn.net/ on Linux namespaces:
User Namespaces¶
The translation packet USER_NAMESPACE launches the process in a
new user namespace. This creates a new mapping for user ids inside
this namespace. More importantly, this gives the process a full set of
capabilities. This is a precondition for some of the other namespaces.
Requires Linux 3.8 or newer.
PID Namespaces¶
The translation packet PID_NAMESPACE launches the process in a new
PID namespace. This creates a new mapping for process ids inside this
namespace. Only processes in this namespace are visible and only these
can be killed.
The translation packet PID_NAMESPACE_NAME reassociates the process
with an existing PID namespace, selected by its name (in the payload).
This requires the cm4all-spawn daemon, which manages PID
namespaces.
By default, other processes are actually still visible through
/proc. For complete PID namespace support, one would need to
mount a new proc filesystem connected to the new namespace.
Requires Linux 3.8 or newer.
Cgroup Namespaces¶
The translation packet CGROOUP_NAMESPACE launches the process in a
new cgroup namespace.
Requires Linux 4.6 or newer.
Network Namespaces¶
The translation packet NETWORK_NAMESPACE launches the process in a
new network namespace. Without further configuration, this leaves the
process without access to the network, because there is no network
device in the new namespace.
The packet NETWORK_NAMESPACE_NAME instead reassociates the process
with an existing network namespace configured with ip netns.
Requires Linux 2.6.29 or newer.
Mount Namespaces¶
A mount namespace makes the VFS mount table private to the new process. This namespace is created implicitly by the packets described in this section.
PIVOT_ROOTworks like thechrootcommand; its payload specifies the directory which will be the new root. All other mounts will be removed from the namespace. The new root must contain a top-level directory calledmnt. It will be mounted read-only and with optionnosuid.CHROOTis plain oldchroot(). Can be combined withPIVOT_ROOT; and unlike that command, it does not need a top-levelmntdirectory.MOUNT_ROOT_TMPFScreates an empty read-onlytmpfsas the filesystem root. All required mountpoints will be created, but the filesystem will contain nothing else.TMPFS_DIRS_READABLE: Make all directories created in tmpfs (MOUNT_ROOT_TMPFS,MOUNT_EMPTY) readable. By default, such directories are only “executable”, but not “readable”.MOUNT_PROCmounts a new read-only instance of theprocfilesystem.MOUNT_DEVmounts a minimalistic/dev.MOUNT_HOMEbind-mounts the home directory (specified byHOME) to the given directory within thePIVOT_ROOT. It will be mounted with optionnosuid.MOUNT_TMP_TMPFSmounts a newtmpfson/tmp. This is private to the namespace and is deleted when the process exits. The payload may specify additionaltmpfsmount options such assize=64M.By default, code execution from this filesystem is disabled via
MS_NOEXEC. A follow-upMOUNT_TMP_TMPFS_EXECpacket disables this behavior, i.e. allows executing code from thistmpfs.MOUNT_TMPFSmounts a new (user-writable)tmpfson the given path. This is private to the namespace and is deleted when the process exits.MOUNT_NAMED_TMPFSmounts a new (user-writable)tmpfson the given path that can be shared across processes. The payload is the name of the tmpfs source directory and the target directory (absolute path within the new root), separated by a null byte. Thetmpfswill be deleted if it is not used for a certain amount of time.MOUNT_EMPTYmounts a new (read-only)tmpfson the given path. Inside this filesystem, mount points will be created automatically. Other than that, it can be used to hide parts of an existing filesystem.BIND_MOUNTmounts arbitrary directories from the old root into the new root. The payload is the source directory and the target directory (absolute path within the new root), separated by a null byte. The new mount will have the optionsro,noexec,nosuid,nodev.The source directory is an absolute path on the host. If it is prefixed with
container:, it is relative to the new mount namespace, i.e. the container. The prefixhost:is the same as no prefix.This (and all variants of this packet) may be followed by an empty
OPTIONALpacket: if the source directory does not exist, this directive is ignored silently.EXPAND_BIND_MOUNTis the same asBIND_MOUNT, but the source directory is expanded usingREGEXresults.BIND_MOUNT_RWandEXPAND_BIND_MOUNT_RWdo the same, just in writable mode (mount optionrw).BIND_MOUNT_EXECandEXPAND_BIND_MOUNT_EXEComit thenoexecoption.BIND_MOUNT_RW_EXECmakes the mount both writable and executable.BIND_MOUNT_FILEmounts a (read-only, non-executable) regular file onto an existing regular file. The payload is the source path (absolute within the old root) and the target path (absolute within the new root), separated by a null byte.BIND_MOUNT_FILE_EXEComits thenoexecoption.MOUNT_LISTEN_STREAMcreates a stream listener socket and mounts it at the specified path into the container. Once the first process connects to this socket, beng-proxy sends a request to the translation server echoing just this packet; its response may contain one of:STATUS: an error condition.EXECUTE: a process to be spawned which starts with the listener socket on stdin.ACCEPT_HTTP: create a transient HTTP listener which receives HTTP requests from the child process; aLISTENER_TAGpacket may be present which will be echoed on all translation requests for this listener. IfSTATS_TAGis present, it will be used instead ofLISTENER_TAGfor Prometheus metrics.
The payload is the socket path inside the new mount namespace. After the socket path, a null byte may follow with opaque data which is ignored by beng-proxy, but which may be evaluated by the translation server.
WRITE_FILEwrite a small text file in a mount namespace. Payload is the absolute path and the file contents separated by a null byte. The file can either be written to atmpfsthat was already mounted, or bind-mounted over an existing read-only file.SYMLINK: Create a symlink. Payload is target and linkpath separated by a null byte.PIVOT_ROOTdepends on user namespaces.MOUNT_PROC,MOUNT_HOMEandMOUNT_TMP_TMPFSdepend onPIVOT_ROOT, user namespaces and PID namespaces.
UTS Namespaces¶
A UTS namespace allows manipulating the host name reported by the
kernel. UTS_NAMESPACE creates the namespace; its payload is the new
host name.
Namespaces Summary¶
The following example describes part of a translation packets that attempts to execute a child process as securely as possible:
USER_NAMESPACE
PID_NAMESPACE
NETWORK_NAMESPACE
PIVOT_ROOT "/var/lib/lxc/wheezy/rootfs"
HOME "/var/www/foo"
MOUNT_HOME "/home/www"
The child process cannot see or kill processes processes other than the
ones that were started by itself. It cannot access the network. It lives
in another filesystem namespace. It can access the directory
/var/www/foo at /home/www. The proc filesystem is not
mounted.
Cgroups¶
Control cgroups (“cgroups”) are a Linux kernel feature for grouping processes. They are useful in many ways, such as assigning/accounting resources (CPU, memory, network bandwidth, …).
beng-proxy can use cgroups only when launched with
systemd.
CGROUP specifies a cgroup name for the new child process. It
is a name below beng-proxy’s own cgroup assigned by
systemd. All controllers managed by systemd are enabled.
CGROUP_SET set a cgroup attribute. Payload is in the form
controller.name=value, e.g. cpu.shares=42.
CGROUP_XATTR set an extended attribute on the cgroup directory.
Payload is in the form namespace.name=value,
e.g. user.account_id=42.
Other Child Process Options¶
UID_GIDspecifies (effective) uid and gid (and supplementary groups) for the child process. Payload is an array of 32 bit integers. All selected users and groups must be explicitly allowed with theuserandgroupsettings in thespawnconfiguration. The default is to run child processes with the same unprivileged credentials as beng-proxy itself (or the one specified with--spawn-user).MAPPED_UID_GIDis likeUID_GID, but these are the numbers visible inside the user namespace. Currently, only the uid is implemented, therefore the payload must be a 32-bit integer.REAL_UID_GIDspecifies the real uid and gid for the child process. Payload is either one or two 32 bit integers. Defaults to theUID_GIDvalue.This feature works only if https://lore.kernel.org/linux-security-module/20250306082615.174777-1-max.kellermann@ionos.com/ is applied. Without it, the kernel will revert the euid on
execve().MAPPED_REAL_UID_GIDadds user namespace mappings forREAL_UID_GID. Currently, only the uid is implemented, therefore the payload must be a 32-bit integer.CAP_SYS_RESOURCEgrants the new child process the CAP_SYS_RESOURCE capability, allowing it to ignore filesystem quotas. It is not possible to use it together with user namespaces (USER_NAMESPACE).NO_NEW_PRIVSpermanently disables new privileges for the child process. That is,setuidandsetgidbits are ignored on executed programs. It is recommended to set this flag on all processes by default, unless there are strong reasons against it.FORBID_USER_NSforbids the child process to create new user namespaces and thus gaining a full set of capabilities. This is useful because there have been lots of namespace-related vulnerabilities in the kernel.FORBID_MULTICASTforbids the child process to add multicast group memberships. This is useful because it disallows snooping on the host’s multicast traffic.FORBID_BINDmakesbind()andlisten()returnEACCES.ALLOW_PTRACEallows the child process to use theptrace()and similar system calls which are disallowed by default.STDERR_PATHspecifies an absolute path that will be created. The child’s error messages will be appended there.STDERR_NULLredirects standard error to/dev/nullinstead.STDERR_PONDenables thechild_error_loggerwhen it was disabled withis_default="no"(see Child Error Logger).CHILD_TAGspecifies a “tag” string for the child process. This can be used to address groups of child processes (e.g. for FADE_CHILDREN). A child process may have more than one tag.
Filters¶
The translation server can tell beng-proxy to apply a
filter to the resource by sending the FILTER command. It is
followed by a packet specifying the filter server (HTTP, CGI,
FASTCGI, PIPE).
A filter server is a HTTP server. beng-proxy sends the original resource with a POST request and expects the filtered resource as response.
If the filter returns status 200 OK or 204 No Content, then
the previous status code is used instead.
It is important that a filter is completely stateless. Running the same filter twice on the same source must always render the same result, at any time.
There may be more than one filter; the order of the PROCESS and
FILTER packets is important.
According to the HTTP specification, POST requests are not
cached. To gain the necessary performance, beng-proxy
caches filter results, extending the HTTP specification. This is
limited to resources which have an ETag response header, because
beng-proxy uses the ETag internally to address cache
items.
Chains¶
Chained request handlers behave similar to FILTER: the current
handler’s response is passed to the next handler as POST request.
But unlike FILTER, beng-proxy waits for the current
handler to generate the response, and only then asks the translation
server for further instructions. This is useful in situations where
one handler prepares something which the translation server needs to
decide about the next stage.
To enable chaining, the translation sends a response specifying the
request handler plus a CHAIN packet with opaque payload. Once
that request handler has generated the response, beng-proxy
sends another translation request containing a copy of the CHAIN
packet and a STATUS packet. Additionally, the CHAIN_HEADER
may contain the value of the X-CM4all-Chain response header, if
one exists in the current HTTP response.
Now the translation server generates another request handler, or
BREAK_CHAIN to send the pending response to the browser as-is.
Example:
request 1:
URI "/chain/"
HOST "example.com"
...
response 1:
HTTP "http://foo/bar/"
CHAIN "42"
request 2:
CHAIN "42"
CHAIN_HEADER "xyz"
STATUS "200"
response 2:
WAS "/the/filter/program"
If the response packet CHAIN is followed by an empty
TRANSPARENT_CHAIN packet, the chain handler will only see a
GET request without a body, and the original request method/body
will be sent to the following request handler. In that case, the
chain handler’s response body will be ignored.
Sessions¶
beng-proxy lets the translation server manage a “session” variable, which may be empty, or contain an opaque string. It is up to the translation server to manage its contents. With every translation request, beng-proxy sends its contents unless it is empty (in which case it omits this parameter). With every response, the translation server may provide a new value (which may be empty).
Additionally, the REALM_SESSION packet may contain a value that is
specific to the session realm. It is only sent to the translation
server in TOKEN_AUTH requests.
External Session Manager¶
Sometimes, the translation server involves an external entity in its
session management, for example to handle authentication. The
translation server can then ask beng-proxy to handle
refreshes by sending a GET to a specified HTTP server.
The packet EXTERNAL_SESSION_MANAGER contains the HTTP URL, and
must be followed by one or more ADDRESS packets (just like the
HTTP packet). After that, the packet
EXTERNAL_SESSION_KEEPALIVE may contain a 16 bit integer specifying
the refresh interval in seconds.
The refresh is performed only while handling a request for this session.
Example:
EXTERNAL_SESSION_MANAGER=http://foo/session/42
ADDRESS=192.168.1.100:80
EXTERNAL_SESSION_KEEPALIVE=300
This example sends a GET request every 5 minutes to
http://foo/session/42 on IP address 192.168.1.100.
Content-Type Lookup¶
The presence of CONTENT_TYPE_LOOKUP in a translation response
indicates that the translation server is willing to look up
Content-Type by file name suffix. It will disable the normal
lookup via extended attributes.
When a HTTP request for a static file is
handled, beng-proxy will check if the file name has a
“suffix” (short alphanumeric name after a dot). If will ask the
translation server for a Content-Type for this suffix. This
translation request contains the packets CONTENT_TYPE_LOOKUP
(echoing the server’s packet) and SUFFIX (containing the non-empty
suffix without the dot).
Example conversation:
client sends
BEGIN“\x03”client sends
CONTENT_TYPE_LOOKUP“foo”client sends
SUFFIX“png”client sends
ENDserver sends
BEGIN“\x03”server sends
CONTENT_TYPE“image/png”server sends
END
If the suffix is unknown, the translation server may omit the
CONTENT_TYPE packet and only reply with BEGIN and END.
AUTO_GZIPPED and AUTO_BROTLI_PATH may be specified if this
file type is likely to have a precompressed file in the same
directory.
Additionally, the translation server may specify transformations
(PROCESS or FILTER) for all files of this type. They will be
applied before other transformations from the original translation
response.
Error documents¶
Errors from remote servers are forwarded to the client. If no error document is available, beng-proxy generates a simple one.
The translation server indicates that it is willing to override the
error document by sending an empty ERROR_DOCUMENT packet in the
translation response. As soon as an error occurs (response status
400..599), beng-proxy sends another translation request,
consisting of ERROR_DOCUMENT, URI and STATUS. The payload
of ERROR_DOCUMENT is opaque to beng-proxy, and will be
echoed.
The translation server responds with a pointer to another resource which shall be used as the error document. If the translation response is empty, or if the error document itself fails, beng-proxy forwards the original error document (or generates one). The error document cannot be filtered or processed.
CSRF Protection¶
To help applications fix cross-site request forgery vulnerabilities,
beng-proxy implements the X-CM4all-CSRF-Token header.
This feature needs to be enabled explicitly with the following
packets:
REQUIRE_CSRF_TOKENrequires a valid token request header for modifying requests (POST,PUTetc.). This option is not only supported for regular HTTP requests, but also for widgets (for modifying requests to widgets).This requirement only applies to requests with a session cookie. Requests without a session are assumed to be harmless, because there is no authenticated identity associated with it.
SEND_CSRF_TOKENadds a valid token header to successful responses. This option is not supported for widgets.
Covert cross-site requests don’t have this header (with a valid value)
will be denied with status 403 Forbidden, effectively avoiding
this kind of vulnerability.
Clients can obtain a token by inspecting the response header of a
request to a location with SEND_CSRF_TOKEN enabled. They may then
use this token in subsequent modifying requests to
REQUIRE_CSRF_TOKEN locations.
This token is specific to the session and expires after a while (currently an hour). It can be reused until it expires.
Since this is implemented as a header, this cannot be used for plain
HTML FORM requests. If the client is a browser, it is necessary
to use the XMLHttpRequest or Fetch API which allows sending
custom headers.
Widget registry¶
The translation server provides access to the widget database, where all widget servers are registered. A widget request can use the following packets:
WIDGET_TYPE: the name of the widget type
The translation server’s response consists of these packets:
STATUS: in case of a lookup error, this packet provides the HTTP status codePATH,CGI,HTTP: choose one of these packets: a static widget (local file path), a local CGI script, or a HTTP serverPROCESS: enable the BENG processorUNTRUSTED: sets the externally visible host name for requests which are proxied to this widget. This marks the widget as “untrusted” and disallows any other way of embedding it. This is useful for widget code whose JavaScript must not be executed in the same context as another widget.UNTRUSTED_PREFIX: same asUNTRUSTED, but is a prefix for the request host name. This widget can only be used when the request’sUNTRUSTEDpacket begins with this prefix. Example:UNTRUSTED_PREFIX="foo"matches a request withUNTRUSTED="foo.example.com", but notUNTRUSTED="foobar.example.com".UNTRUSTED_SITE_SUFFIX: similar toUNTRUSTED_PREFIX, but matches the suffix instead of the prefix. When generating untrusted URIs, the site name is prepended. During verification, the request’sUNTRUSTEDvalue must exactly match this scheme.UNTRUSTED_RAW_SITE_SUFFIX: LikeUNTRUSTED_SITE_SUFFIX, but do not insert a dot.DIRECT_ADDRESSING: Enable “direct” URI addressing for this widget. It is used when the widget is requested in a “frame”. It is a simpler scheme that is more natural; relative links can be built without URI rewriting and without the special beng-proxy encoding. In some cases, the processor can therefore be disabled, reducing overhead.STATEFUL: Remember the state of this widget, i.e. path info and query string. It is remembered forGETrequests to the widget when it is focused and the XML processor is enabled.POSTrequests do not update the state because thePOSTURI may not be valid in a follow-upGETrequest. AJAX requests on the other hand should not update the state, and they do not because they usually do not use the XML processor, which is only useful for generating the initial HTML page, and not for incremental (AJAX) updates.WIDGET_INFO: Send the request headersX-CM4all-Widget-Id,X-CM4all-Widget-TypeandX-CM4all-Widget-Prefixto the widget server. (Since version 1.3.2)
LOCAL_URI: The URI of the “local” location of a widget class. This may refer to a location that serves static resources. It is used by the processor for rewriting URIs beginning with@/(see Static Widget Resources). The payload must end with a slash. beng-proxy does not process this URI. It is going to be evaluated by the browser, and may be absolute. For example, it may refer to a dedicated resource server.DUMP_HEADERS: Enable header dumps for the widget: on a HTTP request, the request and response headers will be logged. Only for debugging purposes.PEEK: Mark this request as a “peek” request, which means the server shall generate the translation response, but shall not account it (e.g. shall not mark a ticket as “consumed”).
Login translation¶
To support interactive login, the translation server can implement this protocol. It translates a user name to information on how to launch the user’s processes.
The request contains the following packets:
LOGIN: Marks this request as a “login” request. No payload.SERVICE: Payload specifies the service that wants to log in. Examples for well-known service names:ssh: Secure Shell. The response describes how to execute commands in a SSH sesion channel.sftp: SSH File Transfer Protocol, i.e. SSH subsystemsftp.rsync: rsync over SSH. This request is sent by Lukko when it sees arsync --servercommand. The response contains anEXECUTEpacket with a path to a statically linkedrsyncexecutable that will be executed usingexecveat().
LISTENER_TAG: A string which specifies the listener this login was accepted on; this is optional and its configuration is specific to the translation client.USER: Contains the user name specified by the client.PASSWORD: If this packet is present, then the client asks to verify a password (clear-text in the payload). A password mismatch must result in a negative reply.
If the user does not exist, the translation server shall respond with
STATUS=404.
A successful response must contain at least HOME and UID_GID:
HOME: Path of the user’s home directory.SHELL: An absolute path specifying the user’s shell.UID_GID: Specify uid and gid (and supplementary groups) for the child process. Payload is an array of 32 bit integers.TOKEN: A token to be matched by the OpenSSH configuration file.NO_PASSWOORD: If present, then theLOGINrequest can be approved without a password. This can happen when the username is a secret token. An optional payload may describe a service-specific limitation, e.g.sftpto limitLOGIN/SERVICE=sshtoSERVICE=sftp.AUTHORIZED_KEYS: The contents of an OpenSSHauthorized_keysfile.NO_HOME_AUTHORIZED_KEYS: If present, then~/.ssh/authorized_keysis not used.SERVICE: Begin a new partition of the response for the specified service. The translation server can do this to send an individual response for all supported services in a single response. This is useful if the request wasSERVICE=sshwhen the client (i.e. the SSH server, i.e. Lukko) doesn’t yet know whether the SSH client will open a shell or a SFTP session. Returning all possible services eliminates further translation requests: the translation server promises that these are the only allowed services (in the context of theSERVICEspecified in the request) and all other services shall be denied.
Cron translation¶
This sub-protocol can tell the cron job execution layer of
Workshop how to spawn a child process.
The request contains the following packets:
CRON: Marks this request as a “cron” request. The payload is the name of thecronsection in Workshop’s configuration file, or none if none was specified there.URI: If the job refers to a URN instead of a command, then this packet is present and contains the URN. A successful response must specify the program to be executed inEXECUTEwith command-line arguments inAPPENDpackets.USER: The account id owning the job.PARAM: An opaque string from the cron job table. Its contents are specific to the translation server. Its contents should be considered user input, and should not be trusted. Optional.
If the account does not exist, the translation server shall respond with
STATUS=404.
If no STATUS packet is present, the request is assumed to be
successful.
A successful response must contain at least HOME and UID_GID:
HOME: Path of the user’s home directory.UID_GID: Specify uid and gid (and supplementary groups) for the child process. Payload is an array of 32 bit integers.
Additional packets may configure resource limits (Resource Limits, Namespaces) and so on (Other Child Process Options).
The client may assume that all responses may be cached indefinitely.
Execute Translation¶
This sub-protocol is used to query how to spawn a process which was requested to be executed.
The request contains the following packets:
EXECUTE: Marks this request as an “execute” request. The payload is a token describing which process shall be executed. This token was provided by an unprivileged process and should not be trusted.PARAM: An opaque parameter with more details about the process. This parameter was provided by an unprivileged process and should not be trusted.SERVICE: Payload specifies the service that wants to execute the process, e.g.workshop.LISTENER_TAG: A tag which was set in the client’s configuration file.PLAN: If this request was triggered by a Workshop plan, then this is its name.
A successful response contains at least EXECUTE with the path of
the program to be spawned, plus the usual process parameters.
A failed response contains STATUS and optionally MESSAGE.
HOME: Path of the user’s home directory.UID_GID: Specify uid and gid (and supplementary groups) for the child process. Payload is an array of 32 bit integers.
Pool translation¶
This sub-protocol is used beng-lb. It allows the translation server to choose a pool which shall handle a specific HTTP request.
The request contains the following packets:
POOL: Marks this request as a “pool request. The payload is the name of thetranslation_handlersection inlb.conf.HOST: theHostHTTP request header
The response contains the following packets:
POOL: The name of the pool (orbranchorlua_handler…) which shall handle the HTTP request.CANONICAL_HOST: A string which shall be used instead of theHostrequest header for the “host” sticky mode.SITE: Optional identification or name of the site this resource belongs to. It has no meaning for beng-lb, and is only used forTCACHE_INVALIDATE.STATUS: Can be used instead ofPOOLto generate a brief error response.REDIRECT: Can be used instead ofPOOLto generate a redirect response (303 See Otherwith the specifiedLocationheader value). Can be combined withSTATUSto select a different status code.HTTPS_ONLY: See page .MESSAGE: Can be used instead ofPOOLto generate atext/plainresponse. Can be combined withSTATUSandREDIRECT.VARY: See page .ARCH: Prefer this CPU architecture for the selected pool member. Payload can beamd64orarm64. If no member with a matching architecture exists, the behavior is unspecified; the request may fail or be forwarded to a server with a mismatching architecture. (This is implemented forrendezvous_hashingonly.)
The client may assume that all responses may be cached indefinitely.