The Threshold Promise and Its Frequent Betrayal

Threshold signature schemes (TSS) and multi-party computation (MPC) for key management are sold on a simple, seductive promise: eliminate the single point of failure without forcing users or protocols to manage cumbersome multisig ceremonies. The private key never exists in one place. Signing requires cooperation from a threshold of participants. The on-chain artifact is a standard signature.

In theory, this is strictly superior to both single-key custody and naive multisig for many DeFi, treasury, and coordination use cases. In practice, the last several years have produced a steady stream of high-profile failures where the threshold property held on paper but the system lost funds anyway. The most recent and instructive case is the May 2026 THORChain Asgard vault drain of roughly $10.7M, executed against a long-running, heavily forked GG20-based TSS implementation.

The failure is rarely in the underlying hardness assumptions (discrete log, factoring). It is almost always in the gap between the protocol as analyzed and the protocol as implemented, operated, and embedded in an adversarial, permissionless, or high-churn environment.

This piece examines that gap through two of the most relevant families of constructions today: the Paillier-heavy ECDSA schemes descended from GG18/GG20, and the more recent, Schnorr-native FROST family. It draws on primary incident reports, implementation analyses, and recent cryptographic results (particularly the 2025 line of work on adaptive security for key-unique threshold signatures) to extract concrete lessons for teams building or integrating these systems.

GG20, Paillier, and the Recurring Proof Problem

GG20 (Gennaro-Goldfeder 2020) and its predecessors are interactive threshold ECDSA protocols that rely on Paillier homomorphic encryption to convert multiplicative shares into additive shares during signing (the MtA and MtAwc subprotocols). This design choice buys compatibility with ECDSA but introduces a heavy proof burden: every party must prove that its Paillier key is well-formed (a product of two large primes of appropriate size) and that the values it encrypts lie in the ranges the protocol expects.
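To make the MtA step concrete, here is a minimal sketch of a multiplicative-to-additive conversion over a toy Paillier instance. It is an illustration under loud assumptions, not GG20 itself: the primes are small hard-coded Mersenne primes, the "curve order" Q is a toy stand-in, both parties are simulated in one function, and every zero-knowledge proof is omitted. All function names are hypothetical.

```python
import math
import secrets

Q = 2**31 - 1  # toy stand-in for the curve order (assumption; far too small for real use)

def paillier_keygen(p, q):
    """Toy Paillier key from two given primes, using generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    assert math.gcd(lam, n) == 1
    return n, (lam, pow(lam, -1, n))

def paillier_encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def paillier_decrypt(n, sk, c):
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def mta(a, b):
    """Alice holds a, Bob holds b. Returns (alpha, beta) with alpha + beta = a*b mod Q."""
    # Alice's key. Real deployments need large random primes plus well-formedness proofs.
    n, sk = paillier_keygen(2**61 - 1, 2**89 - 1)
    n2 = n * n
    c_a = paillier_encrypt(n, a)                 # Alice -> Bob: Enc(a)
    beta_prime = secrets.randbelow(Q**3)         # Bob's random mask, small enough to avoid wraparound mod n
    # Bob uses the additive homomorphism: Enc(a)^b * Enc(beta') = Enc(a*b + beta')
    c_b = pow(c_a, b, n2) * paillier_encrypt(n, beta_prime) % n2
    beta = -beta_prime % Q                       # Bob's additive share
    alpha = paillier_decrypt(n, sk, c_b) % Q     # Alice's additive share
    return alpha, beta

a, b = secrets.randbelow(Q), secrets.randbelow(Q)
alpha, beta = mta(a, b)
shares_are_consistent = (alpha + beta) % Q == a * b % Q
```

The sketch only works because both parties behave: a and b are in range, the mask cannot wrap around n, and n really is a product of two primes. Those are the conditions the omitted proofs are meant to enforce against a party that does not behave.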

In the abstract protocol, these are "standard" zero-knowledge proofs. In real implementations, they have repeatedly been the point of catastrophic failure.

The THORChain 2026 incident is the clearest recent demonstration. The protocol used a fork of bnb-chain/tss-lib (an old version tracing back to roughly 2020) for its Asgard vaults. The critical missing pieces were proper verification of the Paillier modulus well-formedness proofs (MOD and FAC proofs) and related consistency checks during key setup and signing. A single malicious or compromised node that joined the active signing set via the protocol's churn mechanism was able to supply malformed material. Over a period of days, this enabled progressive leakage of share information sufficient to reconstruct the full vault key. One participant in the committee was enough.

This was not a novel attack class. Fireblocks researchers had previously demonstrated a related "Zero Proof" vulnerability in BitGo's GG18/GG20-derived implementation (disclosed 2023), where missing mandatory zero-knowledge proofs in later protocol stages allowed full key recovery in under a minute from a single signature. Verichains' TSSHOCK work catalogued families of related issues in tss-lib forks, including ambiguous encodings, incomplete MtA proofs, and Schnorr proof-of-knowledge gaps. Kudelski's 2019 audit of tss-lib had flagged certain interface and validation problems as low-severity; several later proved to be the vectors in real drains.

The pattern is consistent: the elliptic curve math and the high-level simulation-based security proof are not the problem. The problem is that the "boring" proof verifications, the ones that keep the Paillier layer from becoming a leakage oracle, are easy to under-implement, easy to skip in forks, and extremely dangerous when omitted. Once a single malicious party is inside the committee, many of these schemes degrade gracefully in theory but catastrophically in code.
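As a small illustration of what "fail closed" can look like at the edge of the Paillier layer, the sketch below runs cheap local checks on a received modulus before any proof is examined. This is an assumption-laden precaution, not a replacement for the MOD and FAC proofs: passing it says nothing about whether the modulus is a product of exactly two large primes. The function name, the default bit length, and the trial-division bound are hypothetical choices.

```python
import math

def small_primes(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

SMALL_PRIMES = small_primes(1 << 16)

def precheck_paillier_modulus(n, min_bits=2048):
    """Return a list of objections to n. An empty list means only that no cheap
    objection was found; the zero-knowledge proofs must still be verified."""
    problems = []
    if n.bit_length() < min_bits:
        problems.append("modulus shorter than required bit length")
    root = math.isqrt(n)
    if root * root == n:
        problems.append("modulus is a perfect square")
    for p in SMALL_PRIMES:
        if n % p == 0:
            problems.append(f"modulus has small prime factor {p}")
            break
    return problems

def accept_key_setup(n, proofs_verified, min_bits=2048):
    """Fail closed: reject unless the prechecks pass AND the caller has verified every required proof."""
    return proofs_verified and not precheck_paillier_modulus(n, min_bits)
```

The design point is the shape of `accept_key_setup`: acceptance requires an explicit positive result from proof verification, so a fork that silently drops a verifier rejects everything instead of accepting everything.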

FROST: Elegance with Sharp Edges

FROST (Komlo & Goldberg, 2020; now RFC 9591) represents the current state of the art for practical threshold Schnorr signatures. It achieves low round complexity (two rounds in the interactive case, or one round after preprocessing), supports arbitrary thresholds, produces standard Schnorr signatures, and includes a clean mechanism for identifiable aborts: each signer produces a response that can be individually verified against its public verification share using the ordinary Schnorr equation.

The binding technique (each participant derives a per-signing scalar ρ from a hash that includes the exact set of participants, the message, and the nonce commitments) elegantly defeats the concurrent forgery attacks that plagued earlier naive threshold Schnorr constructions. The preprocessing phase allows participants to generate and commit to many single-use nonce pairs offline, enabling efficient signing later.
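The mechanics described above (nonce commitments, binding factors, per-share verification, aggregation) can be sketched in a few dozen lines. The following is a toy model, not RFC 9591: it uses a tiny multiplicative Schnorr group instead of an elliptic curve, a trusted dealer instead of a DKG, and an ad hoc hash encoding. Every name and parameter in it is an assumption made for illustration.

```python
import hashlib
import secrets

# Toy Schnorr group (illustrative assumption, insecure size): G has prime order Q modulo P.
P, Q, G = 2039, 1019, 4

def H(*parts):
    """Hash arbitrary values to a scalar mod Q (ad hoc encoding, illustration only)."""
    data = "|".join(repr(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def dealer_keygen(t, ids):
    """Trusted-dealer Shamir sharing: returns group key, secret shares, verification shares."""
    coeffs = [secrets.randbelow(Q) for _ in range(t)]
    shares = {i: sum(c * pow(i, k, Q) for k, c in enumerate(coeffs)) % Q for i in ids}
    return pow(G, coeffs[0], P), shares, {i: pow(G, s, P) for i, s in shares.items()}

def lagrange(i, signers):
    """Lagrange coefficient of signer i at x = 0 for the given signer set."""
    num, den = 1, 1
    for j in signers:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, -1, Q) % Q

def commit():
    """Round 1: one single-use nonce pair (d, e) and its public commitments (D, E)."""
    d, e = secrets.randbelow(Q), secrets.randbelow(Q)
    return (d, e), (pow(G, d, P), pow(G, e, P))

def session_values(msg, commits, group_pk):
    """Everything a party derives from its own view of the commitment list."""
    blist = sorted((j, D, E) for j, (D, E) in commits.items())
    rho = {j: H("rho", j, msg, blist) for j, _, _ in blist}      # binding factors
    R_i = {j: D * pow(E, rho[j], P) % P for j, D, E in blist}    # per-signer nonce
    R = 1
    for value in R_i.values():
        R = R * value % P
    return rho, R_i, R, H("chal", R, group_pk, msg)

def sign_share(i, share, nonce, msg, commits, group_pk):
    """Round 2: signer i's response z_i = d + e*rho_i + lambda_i * s_i * c."""
    rho, _, _, c = session_values(msg, commits, group_pk)
    d, e = nonce
    return (d + e * rho[i] + lagrange(i, list(commits)) * share * c) % Q

def verify_share(i, z_i, Y_i, msg, commits, group_pk):
    """Per-share Schnorr check: g^z_i == R_i * Y_i^(c * lambda_i)."""
    _, R_i, _, c = session_values(msg, commits, group_pk)
    lam = lagrange(i, list(commits))
    return pow(G, z_i, P) == R_i[i] * pow(Y_i, c * lam % Q, P) % P

def verify(group_pk, msg, R, z):
    """Ordinary single-party Schnorr verification of the aggregate signature."""
    return pow(G, z, P) == R * pow(group_pk, H("chal", R, group_pk, msg), P) % P

# 2-of-3 example with signers 1 and 3.
group_pk, shares, vshares = dealer_keygen(2, [1, 2, 3])
msg = b"example message"
nonces, commits = {}, {}
for i in (1, 3):
    nonces[i], commits[i] = commit()
z = {i: sign_share(i, shares[i], nonces[i], msg, commits, group_pk) for i in commits}
shares_ok = all(verify_share(i, z[i], vshares[i], msg, commits, group_pk) for i in commits)
_, _, R, _ = session_values(msg, commits, group_pk)
signature = (R, sum(z.values()) % Q)
signature_ok = verify(group_pk, msg, *signature)
```

Note that `session_values` is recomputed from whatever commitment list the caller holds. Signer and verifier only agree on the binding factors, R, and the challenge if they hold the same list.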

ROAST (Ruffing et al., CCS 2022) is a lightweight wrapper that adds robustness (guaranteed output delivery whenever t honest signers exist) and better asynchrony properties by running multiple concurrent FROST instances and taking the first to succeed. It is one of the cleaner ways to get both unforgeability and liveness without falling back to older, heavier robust threshold schemes that impose stricter honest-majority requirements or per-signing-session DKGs.

Yet FROST is not a panacea, and its design choices create their own classes of sharp edges in real deployments.

Decentralized FROST and the Identifiability Trap

FROST's identifiable abort property is one of its most practically valuable features in semi-trusted or federated settings. When a participant sends an invalid response, others can detect exactly who misbehaved using the per-share Schnorr check and exclude them.

This property assumes a consistent view of the binding list (the set of nonce commitments chosen for the signing session). In a model with a semi-trusted aggregator or reliable broadcast, this is manageable. In a fully decentralized "everyone is their own aggregator" setting with only best-effort broadcast, it collapses.

A malicious participant can send inconsistent nonce commitments to different honest parties. The honest parties then compute different binding lists, different aggregate nonces R, different challenges, and therefore different responses. When they attempt to verify each other's shares, honest responses fail the check from the perspective of parties with a different view. The result is that honest participants appear to have misbehaved. The attacker can selectively frame victims, trigger mass exclusion, or simply cause the protocol to abort with no clear culprit.

This is not a break of the unforgeability proof. It is a break of the operational assumption that identifiable abort will actually identify the right party in a decentralized execution. The RFC itself warns about split views in the all-participants-as-aggregators case. Published analyses (including CertiK's work on decentralized FROST) have made the attack explicit and practical.

The common mitigation (an extra consistency round in which participants cross-check the nonce commitments they received) adds latency and rounds, partially undoing one of FROST's main advantages. In large or high-latency settings, it becomes painful.
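A minimal sketch of such a consistency round, assuming each participant hashes a canonical, length-prefixed encoding of the commitment list it received and exchanges the digest before producing any signature share. The encoding, the 4-byte participant identifiers, and the function names are illustrative assumptions, not part of RFC 9591.

```python
import hashlib

def _lp(data: bytes) -> bytes:
    """Length-prefix a byte string so that concatenated fields cannot be ambiguous."""
    return len(data).to_bytes(4, "big") + data

def view_digest(session_id: bytes, msg: bytes, commitments: dict) -> bytes:
    """Digest of one participant's view: {participant_id: (hiding, binding)} as bytes."""
    h = hashlib.sha256()
    h.update(_lp(session_id))
    h.update(_lp(msg))
    for pid in sorted(commitments):
        hiding, binding = commitments[pid]
        h.update(pid.to_bytes(4, "big") + _lp(hiding) + _lp(binding))
    return h.digest()

def views_agree(my_digest: bytes, received_digests: dict) -> bool:
    """Proceed to signing only if every participant reported the same digest."""
    return all(d == my_digest for d in received_digests.values())

# Participant 2 equivocates: it sends different commitments to participants 1 and 3.
view_of_1 = {1: (b"D1", b"E1"), 2: (b"D2", b"E2"), 3: (b"D3", b"E3")}
view_of_3 = {1: (b"D1", b"E1"), 2: (b"D2-other", b"E2"), 3: (b"D3", b"E3")}
digest_1 = view_digest(b"session-7", b"tx bytes", view_of_1)
digest_3 = view_digest(b"session-7", b"tx bytes", view_of_3)
safe_to_sign = views_agree(digest_1, {3: digest_3})
```

On its own this detects a split view but does not say who caused it. Attribution would need something more, for example commitments signed by their sender so that two conflicting signed commitments serve as evidence of equivocation; that extension is an assumption here and is not shown.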

Adaptive Security and the Cost of Key-Uniqueness

A significant 2025 line of work (Ciampi, Crites, Komlo, Maller et al., eprint 2025/943 and companion papers) has clarified fundamental limitations on proving adaptive security for "key-unique" threshold signature schemes.

Key-uniqueness means that the threshold public key has a unique correspondence to the secret key material, and there is an efficient way to verify that a set of public verification shares is consistent with the group public key. This property is extremely convenient: it enables the simple per-share Schnorr verification that gives FROST its identifiable abort behavior, and it makes the scheme drop-in compatible with ordinary single-party Schnorr verification.

The recent results show that this same convenience makes full adaptive security (where the adversary can choose corruptions dynamically based on the protocol transcript, up to t-1 parties) difficult or impossible to prove for FROST-like schemes under standard assumptions and natural classes of reductions. There is even a companion "plausible attack" paper that reduces the problem (in the AGM/GGM) to the hardness of a specific linear algebra search problem over the public verification shares (a bounded-distance decoding problem on a Reed-Solomon-like code). The hardness of that problem in the relevant parameter regimes remains open, but the reduction itself is a strong cautionary signal.

In short: the features that make FROST elegant and practical (key-uniqueness + public shares + simple identifiable abort) are exactly the features that separate its provable security from the stronger adaptive notions that some standards bodies and high-security applications care about.

For many real deployments this may not matter. Static security (adversary chooses corruptions in advance) plus competent operational controls is often sufficient. But for long-lived, high-value protocol keys in adversarial environments, the gap is real and worth modeling explicitly.

The Full-Stack Reality

The recurring lesson across both GG20-class and FROST deployments is that the security of a threshold signature system is only loosely coupled to the elegance of its core protocol.

What actually determines whether funds are lost:

Whether every required proof is implemented, verified, and kept up to date across forks and dependencies.
Whether the participation model (churn, bonding, Sybil resistance) matches the threat model assumed by the protocol.
Whether there is reliable, attributable abort and a way to act on it before leakage accumulates over days.
Whether long-lived keys have rotation and proactive refresh mechanisms that are actually exercised.
Whether the surrounding system (monitoring of signing behavior, circuit breakers, recovery paths) can contain a failure in the MPC layer.
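The proactive refresh item in the list above can be illustrated with plain Shamir shares: each party deals a random polynomial with constant term zero, and every share is updated by the sum of the evaluations it receives. The sketch below is a simplified model under stated assumptions: all parties are simulated in one process, the field is a fixed Mersenne prime, and there is no verifiable secret sharing, so a malicious dealer could corrupt shares undetected. Function names are hypothetical.

```python
import secrets

Q = 2**127 - 1  # prime field modulus (illustrative assumption)

def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, mod Q, by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc

def share(secret, t, ids):
    """Shamir-share a secret with threshold t among the given participant ids."""
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t - 1)]
    return {i: poly_eval(coeffs, i) for i in ids}

def reconstruct(shares):
    """Lagrange-interpolate the shared polynomial at x = 0."""
    total = 0
    for i, s_i in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = num * j % Q
                den = den * (j - i) % Q
        total = (total + s_i * num * pow(den, -1, Q)) % Q
    return total

def refresh(shares, t):
    """Each party deals a random degree-(t-1) polynomial with zero constant term;
    adding those evaluations re-randomizes the shares without changing the secret."""
    new_shares = dict(shares)
    for _dealer in shares:
        zero_poly = [0] + [secrets.randbelow(Q) for _ in range(t - 1)]
        for i in new_shares:
            new_shares[i] = (new_shares[i] + poly_eval(zero_poly, i)) % Q
    return new_shares

secret = 123456789
old = share(secret, 2, [1, 2, 3])
new = refresh(old, 2)
recovered = reconstruct({1: new[1], 3: new[3]})   # any 2 refreshed shares recover the secret
mixed = reconstruct({1: old[1], 2: new[2]})       # shares from different epochs do not combine
```

The point of the exercise is the last line: a share leaked before a refresh is useless together with a share leaked after it, which is why refresh only helps if it is actually run on a schedule shorter than the time an attacker needs to accumulate a threshold of shares.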

THORChain's other vaults were protected not because the TSS was perfect, but because solvency oracles and operational intervention limited the blast radius. Defense-in-depth and fast recovery mattered more than the perfection of the threshold primitive on the day it failed.

Commercial custodians that have shipped MPC at scale (Fireblocks, BitGo, Zengo and others) have generally not relied on the raw primitive alone. They layer additional controls: TEEs, stricter participation policies, continuous adversarial testing of the full stack, and rapid patching processes. The open, permissionless, or semi-permissionless DeFi setting makes many of those controls harder to apply.

Practical Evaluation Framework

When evaluating a threshold signature or MPC system for a DeFi protocol, treasury, or coordination layer, the relevant questions are rarely "Is FROST better than GG20?" They are:

What is the actual participation model, and does it allow an adversary to get inside a signing committee with non-negligible probability?
Which proofs are we actually verifying in our specific code and dependency versions, and when were they last audited against the current state of known attacks on this family?
Do we have identifiable abort that works in our execution environment (decentralized broadcast, unreliable participants, etc.)?
What is our key rotation and refresh story for long-lived keys?
What happens the day after one participant in the committee is compromised or malicious? Do we detect it fast enough? Can we recover without a full governance emergency?

Teams that answer these questions honestly before deployment are the ones that survive the next class of "one malicious co-signer suffices" incidents.

Closing

Threshold cryptography remains one of the most powerful tools available for removing single points of failure in on-chain systems. FROST in particular is a beautiful piece of engineering that has advanced the state of practical Schnorr threshold signing significantly.

But the primitive does not carry its own operational security. The mathematics does not protect against incomplete proof verification in a 2020-era library fork, against churn mechanisms that let adversaries join signing committees, or against the subtle gap between static proofs and adaptive reality for key-unique schemes.

For a studio building DeFi protocols, coordination systems, or treasury infrastructure, the real competence is not in picking the prettiest paper. It is in internalizing the long list of ways these systems have actually failed in production and designing the full stack (cryptography, participation rules, monitoring, and recovery) accordingly.

The threshold is necessary. It is nowhere near sufficient.

Beyond the Threshold: FROST, ECDSA TSS, and the Full-Stack Reality of MPC in DeFi and Distributed Systems