Performance is a key feature
Low latency, high throughput and extreme scalability are imprinted into the Orc Tbricks DNA.
The principal tenet of the Orc Tbricks system architecture, and what we believe is one of its most important strengths, is: "Do the right thing, in the right place, at the right time." The internal protocols and data flows have been designed with a server-based co-located system in mind, with the goal of minimizing the machine resources wasted on unnecessary work. This includes extensive use of source-side filtering for data streams whenever possible.
The implementation uses highly efficient data structures and algorithms, with the best possible time-complexity characteristics, all the way down to O(1) for performance-sensitive operations, like source-side filtering of data streams. We have minimized or outright eliminated all possible system calls, database accesses, context switches, mutex locks and other synchronization primitives in latency-sensitive critical paths.
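As a rough illustration of source-side filtering with constant-time lookup, the sketch below keeps a hash map from instrument identifier to interested subscribers on the publishing side, so an update nobody has asked for is dropped before it is ever sent. The class name `SourceSideFilter` and its methods are hypothetical and are not part of the Orc Tbricks API; the real implementation is not described in this document.

```python
from collections import defaultdict


class SourceSideFilter:
    """Hypothetical publisher-side filter (illustration only)."""

    def __init__(self):
        # instrument id -> set of subscriber callbacks
        self._subscribers = defaultdict(set)

    def subscribe(self, instrument_id, callback):
        self._subscribers[instrument_id].add(callback)

    def unsubscribe(self, instrument_id, callback):
        subs = self._subscribers.get(instrument_id)
        if subs is not None:
            subs.discard(callback)
            if not subs:
                # Drop empty entries so the map only holds live interest.
                del self._subscribers[instrument_id]

    def publish(self, instrument_id, update):
        """Forward the update to interested subscribers; return the delivery count."""
        # Hash lookup is O(1) on average, independent of the number of instruments.
        subs = self._subscribers.get(instrument_id)
        if not subs:
            return 0  # nobody is interested: the update is dropped at the source
        for callback in subs:
            callback(instrument_id, update)
        return len(subs)
```

In this sketch the cost of deciding whether an update is wanted does not grow with the number of instruments or subscriptions, which is the property the O(1) claim above refers to.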
Sophisticated caching schemes are used where proven beneficial, for example for numerically intensive calculated instrument values such as options pricing. Platform-specific performance optimizations are also applied where beneficial: for example, adopting the scalable jemalloc memory allocator and tuning file system, kernel, TCP/IP and NIC driver settings for minimum latency, including kernel bypass using Solarflare® OpenOnload®. Orc Tbricks also has built-in support for creating processor sets and easily assigning components to them.
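To make the caching idea concrete, here is a minimal sketch that memoizes a calculated instrument value by its inputs. The textbook Black-Scholes call formula is used purely as a stand-in for a numerically intensive calculation; the function names and the cache size are assumptions, and nothing here reflects the pricing models or caching schemes actually used in Orc Tbricks.

```python
import math
from functools import lru_cache


def _norm_cdf(x):
    # Standard normal cumulative distribution function via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


@lru_cache(maxsize=4096)  # cache size is an arbitrary, illustrative choice
def call_price(spot, strike, rate, vol, years):
    """Black-Scholes price of a European call, cached by its input tuple."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * _norm_cdf(d1) - strike * math.exp(-rate * years) * _norm_cdf(d2)
```

A repeated call with identical inputs is answered from the cache rather than recomputed; `call_price.cache_info()` reports the hit and miss counts.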
All performance-critical services can be run in multiple instances for true horizontal scalability, and transparent multiplexing for market data and trading is built right in. All services have been heavily optimized to perform their designated task quickly and robustly, and all apps are built with native development tools for no-compromise performance.
For excellent vertical scalability, the Orc Tbricks services have been carefully multithreaded to ensure they can use all available processor cores. Multiple services running on the same machine will additionally benefit automatically from the multiprocessing provided by the operating system. To ensure efficient use of threading resources, Grand Central Dispatch has been integrated and is used throughout the system, allowing for lock-free operation of critical sections under load.
Services in Orc Tbricks typically run as separate processes using shared memory for interprocess communication. Orc Tbricks inherently supports the fusing of latency-sensitive services into a single process using our Speedcore® technology. This allows for mimicking the deployment of a typical in-house application, while retaining a clear architectural separation of services.
Services can easily be moved into or out of a Speedcore®.
This innovative approach allows you to carefully control how services are deployed to ensure the best possible performance. The benefits of running in a Speedcore® configuration are the removal of the interprocess communication overhead between the services running in the Speedcore®, as well as an improved CPU cache hit rate when dedicated CPU resources are assigned to the Speedcore® using processor sets.
Orc Tbricks includes a blazingly fast embedded transactional database, Oracle Berkeley DB, which vastly outperforms conventional SQL databases. The embedded database resides in the same address space as the service, so there is zero IPC overhead for communicating with a database server. The fact that each service has access to its own private storage also allows for highly parallelized I/O across the system. Oracle Berkeley DB is consistently used for all storage in the system and requires virtually no configuration.
Bandwidth and interprocess communication
When performing interprocess communication on the same host, Orc Tbricks uses shared memory transport for the best possible latency and throughput. For services running on different machines, TCP/IP is used to allow for the source-side filtering and throttling of data streams.
All interprocess communication in the system uses an efficient binary encoded protocol, which is additionally compressed for traffic sent across the WAN. Partial message updates are fully supported and are used consistently throughout the system, so that only the actual delta changes are sent over the wire rather than full business objects each time. The extensive use of source-side filtering also avoids superfluous data transfers and removes unnecessary thread wake-ups. Trading apps can simply react when something of note has happened, avoiding repeated, inefficient 'should I do something?' checks.
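The idea of partial message updates can be sketched as follows: the sender transmits only the fields that changed since the previous state, and the receiver merges them into its copy. The helper names `make_delta` and `apply_delta` and the field names are hypothetical, the objects are assumed to keep a fixed set of fields, and the actual Orc Tbricks wire format is binary and not shown in this document.

```python
_MISSING = object()  # sentinel for fields absent from the previous state


def make_delta(previous, current):
    """Return only the fields of `current` whose values differ from `previous`."""
    return {
        key: value
        for key, value in current.items()
        if previous.get(key, _MISSING) != value
    }


def apply_delta(state, delta):
    """Return a new state with the changed fields merged in."""
    return {**state, **delta}


# Example: only the changed "ask" field would travel over the wire.
previous = {"bid": 10.0, "ask": 10.2, "volume": 500}
current = {"bid": 10.0, "ask": 10.3, "volume": 500}
delta = make_delta(previous, current)
rebuilt = apply_delta(previous, delta)
```

Here `delta` holds a single field, while `rebuilt` equals the full `current` object on the receiving side.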
The Orc Tbricks front-end is carefully implemented to use a minimum amount of bandwidth, as only the exact information that you see on screen is transferred. A typical front-end only uses 200 Kbit/s on average, with a full-fidelity truly responsive user experience. This removes the need for using remote display solutions such as Citrix, which additionally do not solve the problem of connecting a single unified front-end to a fully distributed system which is running in multiple geographical locations.
It is also possible to further improve performance by dampening quickly oscillating data streams using throttling conditions. This is beneficial when you aren’t interested in, say, market data updates unless they deviate by more than a certain amount from the last update you received, or when you don’t need updates more often than a predefined maximum frequency. For instance, it’s possible to set up a throttling condition that limits updates to at most one every X milliseconds, or that only sends an update for a currency rate when the bid or ask has changed by more than Y% since the last update received.
Such throttling conditions provide an additional performance boost, as trading strategies don’t have to react to smaller price movements, while specifying a maximum update frequency still ensures that an up-to-date value is received periodically.
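The two kinds of throttling condition described above can be sketched as small stateful checks applied at the source before an update is sent. The class names, the `should_send` method and the explicit millisecond timestamps are assumptions made for illustration; they do not represent the Orc Tbricks configuration syntax or API, and prices are assumed to be positive.

```python
class RateThrottle:
    """Allow at most one update every `min_interval_ms` milliseconds."""

    def __init__(self, min_interval_ms):
        self.min_interval_ms = min_interval_ms
        self._last_sent_ms = None

    def should_send(self, now_ms):
        if (self._last_sent_ms is None
                or now_ms - self._last_sent_ms >= self.min_interval_ms):
            self._last_sent_ms = now_ms
            return True
        return False


class DeviationThrottle:
    """Send only when bid or ask moved more than `threshold_pct` percent
    relative to the last update that was actually sent."""

    def __init__(self, threshold_pct):
        self.threshold_pct = threshold_pct
        self._last_sent = None  # (bid, ask) of the last forwarded update

    def should_send(self, bid, ask):
        if self._last_sent is None:
            self._last_sent = (bid, ask)
            return True
        last_bid, last_ask = self._last_sent
        moved_pct = 100.0 * max(abs(bid - last_bid) / last_bid,
                                abs(ask - last_ask) / last_ask)
        if moved_pct > self.threshold_pct:
            self._last_sent = (bid, ask)
            return True
        return False
```

Note that the deviation is measured against the last update sent rather than the last tick seen, so a slow drift made of many small moves is still reported once it accumulates past the threshold.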
The use of source-side filtering together with data stream throttling is a powerful combination that allows trading strategies, as well as internal Orc Tbricks services, to eliminate unnecessary updates that would otherwise waste processing power.