Executive Summary

We benchmarked MetaFFI against gRPC and dedicated native packages (JNI, CPython API, JPype, Jep, ctypes) across all 6 language pairs, with 11 scenarios each: 6 pairs × 11 scenarios × 3 approaches = 198 benchmarks total, all passed.

Key finding: MetaFFI is 6-14x faster than gRPC across all language pairs, while requiring only a single programming language to write the host code.

Cross-Pair Latency Summary

[Chart: cross-pair performance summary]

Detailed Numbers

Language Pair      MetaFFI (ns)   Dedicated (ns)     gRPC (ns)   MetaFFI vs gRPC
Go → Java          19,769         2,867 (JNI)        191,936     9.7x faster
Go → Python3       51,093         17,134 (CPython)   434,129     8.5x faster
Java → Go          31,092         5,394 (JNI)        185,655     6.0x faster
Java → Python3     60,619         31,029 (Jep)       457,936     7.6x faster
Python3 → Go       24,687         6,083 (ctypes)     355,969     14.4x faster
Python3 → Java     29,946         5,289 (JPype)      364,973     12.2x faster

Values are averages across 11 benchmark scenarios per pair (void call, primitives, strings, arrays, objects, callbacks, dynamic typing, 10k arrays).
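The speedup column follows directly from the raw latencies; a quick sanity check of the ratios, using the mean values from the table above:

```python
# Mean latencies (ns) from the table above: (MetaFFI, gRPC) per language pair.
latencies = {
    "Go -> Java":      (19_769, 191_936),
    "Go -> Python3":   (51_093, 434_129),
    "Java -> Go":      (31_092, 185_655),
    "Java -> Python3": (60_619, 457_936),
    "Python3 -> Go":   (24_687, 355_969),
    "Python3 -> Java": (29_946, 364_973),
}

for pair, (metaffi_ns, grpc_ns) in latencies.items():
    # Speedup = gRPC latency divided by MetaFFI latency for the same pair.
    print(f"{pair}: {grpc_ns / metaffi_ns:.1f}x faster")
```

Rounded to one decimal place, these reproduce the 6.0x-14.4x range quoted in the summary.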

Why Faster than gRPC?

MetaFFI and gRPC solve the same problem — calling code in another language — but they take fundamentally different approaches:

                 MetaFFI                       gRPC
Communication    In-process function calls     Network socket (loopback)
Data transfer    Shared memory via CDTs        Serialization (protobuf)
Infrastructure   None                          Server process + client stub
Latency          Tens of microseconds          Hundreds of microseconds

gRPC was designed for distributed systems communication. Using it for in-process cross-language calls adds unnecessary overhead from serialization, network stack traversal, and server management.
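The transport alone accounts for much of this gap. The following self-contained sketch (it uses neither MetaFFI nor gRPC) isolates that one cost, comparing a direct in-process call against a round-trip over a loopback TCP socket; absolute numbers will vary by machine, but the socket path is typically orders of magnitude slower:

```python
import socket
import threading
import time

def echo_server(srv):
    # Accept one connection and echo everything back.
    conn, _ = srv.accept()
    with conn:
        while data := conn.recv(64):
            conn.sendall(data)

def noop(x):
    return x

N = 2_000

# In-process call: no serialization, no sockets.
start = time.perf_counter_ns()
for _ in range(N):
    noop(b"ping")
in_process_ns = (time.perf_counter_ns() - start) / N

# Loopback round-trip: the same tiny payload over a local TCP socket.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=echo_server, args=(srv,), daemon=True).start()
cli = socket.create_connection(srv.getsockname())
cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # avoid Nagle delays
start = time.perf_counter_ns()
for _ in range(N):
    cli.sendall(b"ping")
    cli.recv(64)
loopback_ns = (time.perf_counter_ns() - start) / N
cli.close()

print(f"in-process: {in_process_ns:.0f} ns/call, loopback: {loopback_ns:.0f} ns/call")
```

gRPC adds protobuf serialization and HTTP/2 framing on top of the socket round-trip, so its real per-call cost is higher still.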

Code Complexity Comparison

MetaFFI requires significantly less code and fewer languages to achieve the same cross-language integration:

[Chart: code complexity comparison]

Metric                          MetaFFI   gRPC   Dedicated (native)
Avg benchmark SLOC              445       584    1,532
Languages required              1         3      2
Avg max cyclomatic complexity   11        21     18

With MetaFFI, you write host code in one language only. gRPC requires the host language, plus protobuf definitions, plus a service implementation in the guest language. Dedicated packages (JNI, CPython API, etc.) require the host language plus low-level C/C++ bridge code.

MetaFFI vs Dedicated Packages

MetaFFI is not the fastest option — dedicated native packages like JNI, CPython API, and JPype are faster because they are hand-tuned, low-level bridges with no abstraction layer.

However, dedicated packages:

  • Require writing C/C++ bridge code or learning complex APIs (JNI, CPython)
  • Are specific to a single language pair
  • Have steep learning curves and fragile build configurations

MetaFFI trades some performance for a dramatically simpler developer experience: load a module, call a function. For most applications, the microsecond-level overhead is negligible compared to the actual work being done.
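For contrast with "load a module, call a function", here is the dedicated route at its absolute simplest: a ctypes call into the C math library. This is a minimal sketch, not one of the benchmarked bridges; the library name is a Linux/glibc assumption, and a real Python3 → Go bridge additionally requires compiling the Go code into a C shared library before ctypes can load it:

```python
import ctypes
import ctypes.util

# Locate and load the C math library. The name is platform-dependent
# ("libm.so.6" on glibc Linux; on Windows it lives in the CRT).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Unlike a dynamic bridge, ctypes needs explicit per-function type
# declarations; getting these wrong corrupts the call silently.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # → 1.0
```

Every exported function needs this kind of signature declaration by hand, which is where the SLOC and complexity overhead of the dedicated approach accumulates.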

Benchmark Scenarios

Each language pair was tested across 11 scenarios:

Scenario           Description
void_call          Empty call (no parameters, no return); measures pure overhead
pass_primitive     Pass and return a single integer
pass_string        Pass and return a string
pass_array         Pass and return a small array
pass_object        Create an object and call methods on it
object_method      Call a method on an existing object instance
callback           Pass a host function to guest code and have it called back
dynamic_type       Pass/return values using any/interface{} types
pass_10k_array     Pass a 10,000-element array
return_10k_array   Return a 10,000-element array
dynamic_10k        Pass a 10,000-element array as a dynamic type

Methodology

  • Each benchmark runs configurable warmup and measurement iterations (default: 100 warmup, 1000 measured)
  • Outliers removed using IQR method (1.5x interquartile range)
  • Statistics computed: mean, median, standard deviation, min, max, p95, p99
  • All benchmarks run on the same machine in sequence
  • Environment: Windows 11, Go 1.23, OpenJDK 22, Python 3.11
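The warmup, IQR-filtering, and statistics steps above can be sketched as follows. This is a simplified reimplementation of the described methodology, not the actual benchmark harness:

```python
import statistics
import time

def iqr_filter(samples, k=1.5):
    """Drop samples outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if lo <= s <= hi]

def percentile(samples, p):
    """Nearest-rank percentile over sorted samples."""
    s = sorted(samples)
    idx = min(len(s) - 1, max(0, round(p / 100 * len(s)) - 1))
    return s[idx]

def summarize(samples):
    kept = iqr_filter(samples)
    return {
        "mean": statistics.mean(kept),
        "median": statistics.median(kept),
        "stdev": statistics.stdev(kept),
        "min": min(kept),
        "max": max(kept),
        "p95": percentile(kept, 95),
        "p99": percentile(kept, 99),
    }

def benchmark(fn, warmup=100, iterations=1000):
    for _ in range(warmup):  # warm caches/JITs before measuring
        fn()
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - t0)
    return summarize(samples)
```

Filtering before computing the mean matters here: a single GC pause or scheduler preemption can otherwise skew the average of an otherwise tight nanosecond-scale distribution.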

For full details, see the MetaFFI paper.