Port native malloc allocation profiling from async-profiler#398
Open
CI Test Results
Run: #22770980642 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled
Summary: Total: 32 | Passed: 32 | Failed: 0
Updated: 2026-03-06 16:22:22 UTC
8fc0e0d to
9fc994d
Compare
- Add MallocTracer engine (GOT/PLT patching for malloc/calloc/realloc/free/posix_memalign/aligned_alloc)
- Add BCI_NATIVE_MALLOC, MallocEvent, T_MALLOC/T_FREE JFR types
- Add nativemem/nofree arguments
- Add profiler.Malloc and profiler.Free JFR event metadata
- Add recordEventOnly() for stack-trace-less free events
- Add NativememProfilerTest and NofreeNativememProfilerTest
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9fc994d to
5141dd8
Compare
jbachorik
commented
Mar 4, 2026
jbachorik
commented
Mar 4, 2026
jbachorik
commented
Mar 4, 2026
jbachorik
commented
Mar 4, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jbachorik
commented
Mar 4, 2026
jbachorik
commented
Mar 4, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
a938cd3 to
f70abcb
Compare
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Integration Tests
❌ 19 passed, 21 failed out of 40 configurations
Failure Details
- glibc-arm64-hotspot-jdk17 (Profiler-only)
- glibc-arm64-hotspot-jdk21 (Profiler-only)
- glibc-arm64-hotspot-jdk25 (Profiler-only)
- glibc-arm64-openj9-jdk21 (Profiler-only)
- glibc-arm64-openj9-jdk25 (Profiler-only)
- glibc-arm64-openj9-jdk8 (Profiler-only)
- glibc-x64-hotspot-jdk11 (Profiler-only)
- glibc-x64-hotspot-jdk25 (Profiler-only)
- glibc-x64-hotspot-jdk8 (Profiler-only)
- musl-arm64-hotspot-jdk11 (Profiler-only)
- musl-arm64-hotspot-jdk17 (Profiler-only)
- musl-arm64-hotspot-jdk21 (Profiler-only)
- musl-arm64-hotspot-jdk25 (Profiler-only)
- musl-arm64-hotspot-jdk8 (Profiler-only)
- musl-arm64-openj9-jdk17 (Profiler-only)
- musl-arm64-openj9-jdk21 (Profiler-only)
- musl-arm64-openj9-jdk25 (Profiler-only)
- musl-arm64-openj9-jdk8 (Profiler-only)
- musl-x64-hotspot-jdk17 (Profiler-only)
- musl-x64-hotspot-jdk21 (Profiler-only)
- musl-x64-openj9-jdk8 (Profiler-only)
What does this PR do?:
Ports the native malloc allocation profiler from async-profiler and integrates it with the Datadog JFR pipeline. When enabled via `nativemem=<interval>`, the profiler intercepts `malloc`, `calloc`, `realloc`, `posix_memalign`, and `aligned_alloc` across all loaded native libraries using GOT patching, and emits `profiler.Malloc` JFR events with Java stack traces. The `free` function is hooked so that calls forward correctly through the GOT, but free events are not recorded: because mallocs are sampled, most frees would match nothing, and the immense event volume with no stack traces provides no actionable insight.

Changes:
- `mallocTracer.cpp/h`: ported from async-profiler; GOT-patching hooks, Poisson byte-interval sampling with PID rate-limiting, nested-malloc detection for musl compatibility
- `flightRecorder.cpp/h`: `recordMallocSample()` for `profiler.Malloc` JFR events with profiling context (spanId, localRootSpanId, contextAttributes); fix `putVar32` → `putVar64` for 64-bit trace IDs
- `jfrMetadata.cpp/h`: new `profiler.Malloc` (`T_MALLOC`) event type definition with weight and context fields
- `profiler.cpp/h`: `BCI_NATIVE_MALLOC` path in `recordSample`, `dlopen_hook` patching of newly loaded libraries, `CSTACK_VM` promotion when VMStructs available
- `arguments.cpp/h`: `nativemem=<bytes>` argument parsing
- `codeCache.cpp/h`: `im_posix_memalign` / `im_aligned_alloc` import IDs
- `event.h`: `MallocEvent` struct with weight field
- `doc/architecture/NativeMemoryProfiling.md`: architecture document

Motivation:
Native heap allocations (malloc/free) are a significant source of memory pressure and latency in JVM applications that rely on JNI, off-heap buffers, or native libraries. This feature gives users visibility into native allocation patterns alongside existing JVM heap profiling.
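
For readers unfamiliar with the GOT-patching approach listed under Changes, the hook-and-forward pattern can be modelled in a few lines. This is only a simplified sketch: an ordinary function-pointer variable stands in for a real GOT entry, and every name in it (`g_slot`, `g_original`, `malloc_hook`, `patch_slot`, `library_allocate`) is hypothetical rather than taken from `mallocTracer.cpp`. It does not parse ELF images or touch a real GOT.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>

using MallocFn = void* (*)(std::size_t);

// The allocator the slot points at before patching.
static void* real_malloc(std::size_t size) { return std::malloc(size); }

// Hypothetical stand-in for one GOT entry: a writable pointer through which
// a "library" reaches malloc. A real entry lives in the library's ELF image.
static MallocFn g_slot = &real_malloc;

// Saved original target, so the hook can forward the call.
static MallocFn g_original = nullptr;

// Bytes seen by the hook; a real profiler would sample and record instead.
static std::atomic<std::size_t> g_observed_bytes{0};

// Replacement entry point: forward first, then observe the allocation.
static void* malloc_hook(std::size_t size) {
    assert(g_original != nullptr);
    void* p = g_original(size);
    if (p != nullptr) {
        g_observed_bytes.fetch_add(size, std::memory_order_relaxed);
    }
    return p;
}

// Redirect the slot to the hook. Idempotent, so calling it again (for
// example after another library is loaded) does not lose the original.
static void patch_slot() {
    if (g_slot == &malloc_hook) return;
    g_original = g_slot;
    g_slot = &malloc_hook;
}

// Models code inside a native library calling malloc through its GOT.
static void* library_allocate(std::size_t size) { return g_slot(size); }
```

After `patch_slot()`, a call to `library_allocate(64)` goes through `malloc_hook`, which forwards to the saved original and adds 64 to `g_observed_bytes`. The idempotence check mirrors the need to patch libraries again when new ones are loaded, without overwriting the saved original pointer.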
Additional Notes:
Upstream source: `mallocTracer.cpp` and `mallocTracer.h` are a port of the equivalent files from async-profiler. The porting involved:
- … (`recordSample`) instead of async-profiler's own serialisation
- … `patchLibraries` loop to use Datadog's `CodeCache`/`UnloadProtection` API

Stack walking: Native malloc events have no signal context (`ucontext == NULL`). `CSTACK_VM` (HotSpot VMStructs + `JavaFrameAnchor`) is the only mode that can produce meaningful Java stack traces in this situation. `CSTACK_DEFAULT` is the initial default; at profiler start it is promoted to `CSTACK_VM` when VMStructs are available. On JVMs where VMStructs are unavailable the profiler stays at `CSTACK_DEFAULT`.

Sampling: Uses Poisson-interval sampling (`shouldSample()`) with a lock-free CAS loop. A PID controller (`updateConfiguration()`) periodically adjusts the interval to maintain ~100 samples/second. Each sample carries a statistical weight reflecting the Poisson sampling probability.

No free event tracking: Free calls are hooked (to forward through the GOT correctly) but not recorded. With Poisson sampling on mallocs, most frees correspond to unsampled allocations and would produce meaningless events. The volume of free calls with no stack traces provides no actionable insight.
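
The sampling scheme described under Sampling can be illustrated with a minimal sketch. It is not the code from `mallocTracer.cpp`: the class names `PoissonSampler` and `IntervalController`, the fixed RNG seed, the weight formula 1 / (1 - exp(-size / mean)), and the controller gains are all assumptions chosen for illustration. The sketch draws a byte threshold from an exponential distribution, consumes it with a CAS loop, and adjusts the mean interval with a textbook PID step.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Hypothetical sampler: samples an allocation when the running byte
// countdown reaches zero, then draws a fresh exponential threshold.
class PoissonSampler {
public:
    explicit PoissonSampler(double mean_interval_bytes)
        : mean_(mean_interval_bytes), remaining_(0) {
        assert(mean_interval_bytes > 0.0);
        remaining_.store(drawThreshold(), std::memory_order_relaxed);
    }

    // Lock-free: concurrent callers race on one atomic countdown.
    bool shouldSample(uint64_t size) {
        int64_t prev = remaining_.load(std::memory_order_relaxed);
        for (;;) {
            const int64_t left = prev - static_cast<int64_t>(size);
            const bool hit = left <= 0;
            const int64_t next = hit ? drawThreshold() : left;
            if (remaining_.compare_exchange_weak(prev, next,
                                                 std::memory_order_relaxed)) {
                return hit;
            }
            // CAS failed: prev now holds the latest value, so retry.
        }
    }

    // Assumed weight: inverse of the probability that an allocation of
    // this size is sampled. Only meaningful for size > 0.
    double weight(uint64_t size) const {
        return -1.0 / std::expm1(-static_cast<double>(size) / mean_);
    }

private:
    int64_t drawThreshold() const {
        // Fixed seed only to keep the sketch reproducible.
        thread_local std::mt19937_64 rng(12345);
        std::exponential_distribution<double> dist(1.0 / mean_);
        return std::max<int64_t>(1, static_cast<int64_t>(std::ceil(dist(rng))));
    }

    double mean_;
    std::atomic<int64_t> remaining_;
};

// Hypothetical PID step: too many samples per second widens the interval,
// too few narrows it. Gains are illustrative, not tuned values.
class IntervalController {
public:
    IntervalController(double target_rate, double kp, double ki, double kd)
        : target_(target_rate), kp_(kp), ki_(ki), kd_(kd) {}

    double adjust(double interval, double observed_rate, double dt_seconds) {
        const double error = observed_rate - target_;
        integral_ += error * dt_seconds;
        const double derivative = (error - prev_error_) / dt_seconds;
        prev_error_ = error;
        const double signal = kp_ * error + ki_ * integral_ + kd_ * derivative;
        return std::max(1.0, interval + signal);
    }

private:
    double target_, kp_, ki_, kd_;
    double integral_ = 0.0;
    double prev_error_ = 0.0;
};
```

Under these assumptions, summing `weight(size)` over the sampled allocations estimates the total number of allocations, and summing `weight(size) * size` estimates the total bytes. Because the exponential distribution is memoryless, discarding the overshoot when a threshold is crossed keeps the per-allocation sampling probability at 1 - exp(-size / mean).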
Reentrancy: Allocations made by the profiler itself during recording (stack walking, JFR buffer writes) re-enter the hooks. This is a deliberate design trade-off (no TLS guard) documented in the source: it does not cause infinite recursion but may produce minor over-counting.
How to test the change?:
Automated integration tests covering malloc sampling:
Tests pass for both `cstack=vm` and `cstack=vmx` variants of:
- `NativememProfilerTest#shouldRecordMallocSamples`

For Datadog employees:
- … credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.

Unsure? Have a question? Request a review!