Cross-Frontend Attribution Engine
Technical Methodology for Forensic-Level DEX Trade Attribution
ClearTrace solves this with a Cross-Frontend Attribution Engine that uses four advanced
on-chain forensics vectors to identify hidden frontends, institutional routing, and wallet-native swap
extensions. This methodology is built on Dune Analytics SQL queries analyzing the
dex.trades and ethereum.traces tables.
The Four Attribution Vectors
1. Calldata Suffix Trapping
What it catches: Hidden mobile apps, wallet extensions, and private portfolio interfaces
Many modular frontends and wallet-native swap extensions (Phantom, MetaMask Swap, private portfolio
apps) append a static hex suffix to the end of the execution calldata (tx.data).
Because this suffix sits outside the standard ABI function parameters, contract decoders ignore
it—but it remains fully visible in raw trace data.
Implementation Strategy:
- Inspect the raw
datafield of transactions interacting with main routers (Uniswap UniversalRouter, 1inch Router, etc.) - Filter for rows where
length(data)exceeds the standard expected bytecode length for that function signature - Use regex string parsing on the last 4-8 bytes of the binary payload to isolate repeating hex fingerprints
- Group these unknown hex footprints by
tx_fromand volume
The Alpha: You quickly find clusters of millions of dollars in volume originating from specific unidentified mobile apps or private interfaces that you can now uniquely attribute.
SELECT
RIGHT(tx.data, 16) AS calldata_suffix,
COUNT(DISTINCT tx.tx_hash) AS tx_count,
SUM(tx.amount_usd) AS total_volume_usd
FROM dex.trades tx
WHERE LENGTH(tx.data) > expected_length
GROUP BY 1
HAVING SUM(tx.amount_usd) > 1000000
ORDER BY total_volume_usd DESC;
2. Proxy-Router Hunting (tx_to vs project_contract_address Mismatch)
What it catches: Custom enterprise frontends and institutional smart wallets
When a user swaps on a standard interface, tx_to is often the public router. But hidden
frontends or institutional desks frequently route transactions through their own proprietary,
un-decoded proxy contracts before hitting the actual DEX pool. In dex.trades,
the project_contract_address will show up as the Uniswap Pool, but the intermediate
layer is masked.
Implementation Strategy:
- Perform a LEFT JOIN between
dex.tradesandethereum.tracesontx_hash - Look for instances where
tx_to != project_contract_addressANDtx_tois NOT a known aggregator - Aggregate these unknown intermediate
tx_toaddresses by volume and chain
SELECT
t.blockchain,
t.tx_to AS hidden_proxy_contract,
COUNT(DISTINCT t.tx_hash) AS tx_count,
SUM(t.amount_usd) AS total_masked_volume
FROM dex.trades t
LEFT JOIN dex_aggregator.trades a
ON t.tx_hash = a.tx_hash AND t.blockchain = a.blockchain
WHERE a.tx_hash IS NULL -- Exclude known aggregators
AND t.tx_to != t.project_contract_address
GROUP BY 1, 2
HAVING SUM(t.amount_usd) > 100000
ORDER BY total_masked_volume DESC;
Result: This isolates the exact contract addresses of custom enterprise frontends and institutional smart wallets actively bypassing public tracking.
3. Multi-Hop Origin Trace (Behavioral Sequence Analysis)
What it catches: Interface churn patterns and user loyalty metrics
To prove to a protocol that their users are switching to other interfaces, you need to track sequential transaction habits. If a wallet swaps on Uniswap, where did they interact 5 minutes before? This reveals absolute churn vectors.
Implementation Strategy:
- Create a window function pipeline over
dex.tradesgrouped bytx_from(the EOA) - Order by
block_time - Use
LAG()andLEAD()to map immediate interface hops
WITH ordered_swaps AS (
SELECT
tx_from,
block_time,
project AS current_frontend,
amount_usd,
LAG(project, 1) OVER (PARTITION BY tx_from ORDER BY block_time) AS previous_frontend,
LAG(block_time, 1) OVER (PARTITION BY tx_from ORDER BY block_time) AS previous_swap_time
FROM dex.trades
WHERE block_time >= NOW() - INTERVAL '30' DAY
)
SELECT
current_frontend,
previous_frontend,
COUNT(*) AS transition_count,
AVG(DATE_DIFF('minute', previous_swap_time, block_time)) AS avg_time_between_hops
FROM ordered_swaps
WHERE previous_frontend IS NOT NULL
AND current_frontend != previous_frontend
GROUP BY 1, 2
ORDER BY transition_count DESC;
Insight: This lets you tell CowSwap: "When users leave your interface, 68% of them execute a trade on Uniswap within 14 minutes."
4. Fee-Recipient Attribution
What it catches: Anonymous frontends monetizing through hidden fees
Many "hidden" frontends monetize by baking a 10-50 basis point fee directly into swap parameters, routing that cut to a custom multisig wallet during transaction execution. Even if the interface is completely anonymous, the Fee-Recipient Address is a permanent, static, on-chain identifier.
Implementation Strategy:
- Look up sub-traces inside
ethereum.tracesor transfer events within the sametx_hashas the swap - Isolate secondary Transfer events or native asset transfers (
value > 0) that route small percentages of trade volume to an independent EOA or multisig - Cluster these fee-collection wallets
Result: Attribute volume to anonymous interfaces by their fee collection patterns, even when all other identifiers are masked.
Data Sources
- Dune Analytics:
dex.trades,dex_aggregator.trades,ethereum.traces - Chains Supported: Ethereum (primary), with Arbitrum, Base, BSC in development
- Update Frequency: Every 4 hours via Dune API
- Historical Depth: 30-90 days rolling window
Execution Quality Metrics
Once trades are attributed to their true frontend, we compute:
- VWAP Slippage: Difference between expected price (volume-weighted average) and actual execution price
- MEV Toxicity Score: Percentage of trades affected by sandwich attacks, frontrunning, or backrunning
- Revert Rate: Percentage of transactions that fail on-chain
- Execution Quality Score (0-100): Composite metric where 100 = flawless execution.
Calculated as
100 - toxicity_score
Why This Matters
For Protocols: Understand which frontends drive your volume and which provide the best user experience.
For Users: Verify execution quality claims with hard on-chain data instead of trusting marketing.
For Researchers: Access forensic-level attribution data that no other analytics platform provides.
Limitations & Future Work
- Chain Coverage: Currently Ethereum-focused; expanding to L2s and Solana
- Calldata Suffix Whitelist: Building a database of known multi-call patterns to reduce false positives
- Real-Time Detection: Current 4-hour refresh; working toward sub-minute latency with WebSocket feeds
- Flashbots Integration: Planning to integrate Flashbots Protect API for authoritative MEV data
Citations & Further Reading
Last Updated: June 2026
Contact: For research partnerships or methodology questions, visit cleartracedata.com