# Pure-Python tradeoffs
The premise of this driver is zero native code in the call stack. No CSDK, no JDBC, no C extension we maintain ourselves. Just `socket`, `struct`, `decimal`, and the rest of the standard library.
This page is about what that costs and what it pays for.
## What pure-Python costs

The honest accounting:
| Cost | Magnitude | Mitigation |
|---|---|---|
| Per-row decode overhead | ~2.0 µs/row vs IfxPy’s ~1.1 µs/row | Phases 37–38 codec inlining brought us from 4 µs to 2 µs. |
| Per-PDU parser overhead | ~5–10 µs vs C’s ~1 µs | Phase 39 buffered reader removed the worst of it (the read-side wrapper cost). |
| GIL contention on multi-threaded decode | Threads serialize through codec hot loops | Pool gives one connection per thread; codec releases GIL during I/O. |
| Memory per connection | ~50–500 KB (Phase 39 buffer) | Pool keeps it bounded; freed on connection close. |
The order-of-magnitude intuition: pure-Python is ~2× slower than C-bound for codec-bound workloads (large analytical fetches), and competitive or faster for I/O-bound workloads (transactional, bulk-insert, FastAPI request-response).
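That intuition is easy to check with back-of-envelope arithmetic. The per-row figures below come from the table above; the round-trip latency is an assumed typical LAN value, and the single-round-trip model is a deliberate simplification:

```python
# Back-of-envelope: when does the ~2 µs/row codec gap matter?
# Per-row figures are from the table above; the round-trip latency
# is an assumed ~1 ms LAN value, not a measured one.

PURE_PY_US_PER_ROW = 2.0   # this driver, post Phase 37-38
C_BOUND_US_PER_ROW = 1.1   # IfxPy-style C codec
ROUND_TRIP_US = 1000.0     # assumed network round trip

def fetch_cost_us(rows: int, per_row_us: float) -> float:
    """Total client-side cost: one round trip plus per-row decode."""
    return ROUND_TRIP_US + rows * per_row_us

# Transactional query, 50 rows: the codec gap is 45 µs, ~4% of the total.
small_gap = fetch_cost_us(50, PURE_PY_US_PER_ROW) - fetch_cost_us(50, C_BOUND_US_PER_ROW)

# Analytical fetch, 1M rows: the codec gap is ~0.9 s and dominates.
big_gap = fetch_cost_us(1_000_000, PURE_PY_US_PER_ROW) - fetch_cost_us(1_000_000, C_BOUND_US_PER_ROW)

print(f"50-row gap: {small_gap:.0f} µs; 1M-row gap: {big_gap / 1e6:.1f} s")
# → 50-row gap: 45 µs; 1M-row gap: 0.9 s
```

The crossover is entirely workload-shaped: below a few thousand rows per query, the network dwarfs the codec.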
## What pure-Python pays for

The benefits are mostly deployment, not performance:
- 50 KB wheel — installable in a slim Docker image without a build toolchain.
- No `libcrypt.so.1` — works on Arch, Fedora 35+, RHEL 9, and any modern Linux.
- Python 3.10–3.14 — no minor-version-specific C extension breakage. We’ve shipped on the day each new Python released.
- Type annotations everywhere — `py.typed` flag, full coverage in mypy / pyright.
- Auditable codepaths — every byte that enters or leaves a socket goes through Python code you can read. No “the C extension does it” excuses.
- Async without `run_in_executor` — native `async def` API, FastAPI-compatible.
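To make the async point concrete: the driver's actual API is not reproduced here, but the mechanism a native `async def` API builds on is asyncio streams, with no thread pool in sight. The sketch below runs a toy length-prefixed echo server in-process; the 4-byte framing is illustrative, not the Informix wire protocol:

```python
import asyncio
import struct

# Toy demonstration of socket-level async I/O with asyncio streams.
# No run_in_executor, no thread pool: reads and writes suspend the
# coroutine, not a thread. The framing here is illustrative only.

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Echo server: read one length-prefixed frame, send it back.
    (length,) = struct.unpack(">I", await reader.readexactly(4))
    payload = await reader.readexactly(length)
    writer.write(struct.pack(">I", length) + payload)
    await writer.drain()
    writer.close()

async def round_trip(payload: bytes) -> bytes:
    # Start an in-process server on an ephemeral port, then do one
    # framed request/response cycle against it.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(struct.pack(">I", len(payload)) + payload)
    await writer.drain()
    (length,) = struct.unpack(">I", await reader.readexactly(4))
    result = await reader.readexactly(length)
    writer.close()
    server.close()
    await server.wait_closed()
    return result

print(asyncio.run(round_trip(b"select 1")))  # → b'select 1'
```

Because every byte moves through coroutines like these, the driver slots into a FastAPI event loop without blocking it.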
For most real workloads, deployment friction matters more than 2 µs/row. The 92 MB OneDB tarball, the four `LD_LIBRARY_PATH` entries, the absent `libcrypt.so.1` — those costs are paid every time you deploy. The 2 µs/row codec gap is paid once per row, and only if your workload is read-heavy enough for it to dominate.
## Where the ceiling sits

For codec-bound workloads, the ceiling we’ve hit is around 2 µs/row for tabular data. The breakdown:
- `struct.unpack_from(format, buf, offset)` per field: ~80 ns
- `bytes` → `str` decoding (varchar): ~150 ns
- Tuple construction: ~100 ns
- Cursor / ResultSet bookkeeping: ~50 ns
Five fields × ~250 ns/field + ~250 ns overhead = ~1.5 µs. We’re at ~2.0 µs, which means ~30% overhead remains. That’s the gap between “we’ve inlined everything that’s reasonable” and “the C version still wins.”
Strategies for closing further:
- Cython / mypyc compilation. Could shave 30–50% off the codec hot loop. Would compromise the “pure Python” claim — there’d be a build step.
- Bytecode optimization via `exec()`-codegen (the Phase 38 approach). Marginal further wins; we’ve already extracted most of what’s available.
- Numpy-backed bulk decode for homogeneous columns. Promising for analytical workloads. ~5× speedup possible for `SELECT col FROM huge_table` over the current per-row approach. Probably Phase 41+.
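The numpy idea is the most concrete of the three, so here is a minimal sketch of it. It assumes a contiguous little-endian int32 column buffer; the real wire format may interleave fields differently, and nothing below is shipped code:

```python
import struct
import numpy as np

# Sketch of numpy-backed bulk decode for a homogeneous column:
# reinterpret the whole column buffer in one call instead of one
# unpack_from per value. Contiguous little-endian int32 layout is
# an assumption, not the actual wire format.

raw = b"".join(struct.pack("<i", v) for v in range(1000))

# Current codec shape: one unpack_from per value, pure Python loop.
per_row = [struct.unpack_from("<i", raw, i * 4)[0] for i in range(1000)]

# Bulk shape: a single frombuffer call decodes the entire column
# with zero Python-level per-value work.
bulk = np.frombuffer(raw, dtype="<i4")

assert per_row == bulk.tolist()
```

The catch is visible in the assumptions: the win is confined to columns that are homogeneous and fixed-width, which is why it targets analytical `SELECT col` shapes rather than the general cursor path.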
For I/O-bound workloads we’re already at the ceiling. The buffered reader closed the I/O gap; further wins are at the kernel level (e.g. `recvmsg` for vectored reads), which is moot since the kernel already isn’t the bottleneck.
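For readers who want the shape of that buffered-reader win, here is a minimal sketch of the idea. The class and method names are illustrative, not the driver's actual Phase 39 internals:

```python
import io

# Minimal sketch of a buffered-reader design: pull a large chunk from
# the transport once, then serve many small protocol reads from memory.
# Names are illustrative, not the driver's actual internals.

class BufferedReader:
    def __init__(self, raw, chunk=64 * 1024):
        self._raw = raw        # anything with .read(n): socket file, BytesIO
        self._chunk = chunk
        self._buf = b""
        self._pos = 0

    def read_exact(self, n: int) -> bytes:
        # Refill from the transport only when the buffer runs dry, so a
        # 4-byte header read usually costs no read() call at all.
        while len(self._buf) - self._pos < n:
            more = self._raw.read(self._chunk)
            if not more:
                raise EOFError("transport closed mid-PDU")
            self._buf = self._buf[self._pos:] + more
            self._pos = 0
        out = self._buf[self._pos:self._pos + n]
        self._pos += n
        return out

# Fake transport: three toy length-prefixed PDUs in one buffer.
reader = BufferedReader(io.BytesIO(b"\x00\x04PDU!" * 3))
print(reader.read_exact(2), reader.read_exact(4))  # → b'\x00\x04' b'PDU!'
```

The point of the pattern is amortization: one 64 KB refill absorbs thousands of tiny header and field reads that would otherwise each touch the socket.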
## The honest summary

Pure-Python costs us ~5–15% on bulk-fetch workloads and zero (or favorable) on everything else. The deployment, async, and modern-Python wins are large and don’t depend on workload.
If the codec gap matters for your case — analytical reporting against a wide table, pulling millions of rows in a single SELECT — IfxPy is probably the right tool today. If you’re doing transactional or bulk-load work, FastAPI services, or any deployment where IBM’s C SDK is friction, informix-driver is the right tool.
The driver chose the goal — first pure-socket Informix driver in any language — over the local optimum. Phase 37 onward is a sustained effort to make that choice cost as little as possible.