Crypto Exchange Security

Centralised exchanges are the highest-value targets in the Web3 ecosystem. Yet the security failures that have led to the largest losses, from Mt. Gox to Bybit, have not primarily been cryptographic. They have been operational: weak access controls, insufficient separation of duties, absent approval workflows, and governance structures that concentrated too much authority in too few hands.

The pattern is consistent across every major exchange breach. The cryptography held. The wallets were secure by design. What failed were the processes surrounding them: who had access, who could authorise transfers, what monitoring was in place, and whether anyone was watching for the warning signs that preceded the loss. This is the central challenge of crypto exchange security, and it is addressed not by better cryptography but by better operations.

This framework covers the operational controls that distinguish secure exchanges from vulnerable ones. It is written for security directors, CISOs, exchange founders, and institutional investors conducting due diligence on exchange counterparties. The controls described here are not aspirational. They are baseline requirements for any exchange that holds customer funds at institutional scale.

The Operational Security Challenge for Centralised Exchanges

Centralised exchanges occupy a uniquely challenging security position. They combine the high-value custody responsibilities of a traditional financial institution with the always-online operational requirements of a technology platform, the pseudonymous transaction environment of a blockchain network, and the regulatory complexity of a multi-jurisdictional financial services business.

Always-Online Infrastructure

Unlike a bank, which settles through an overnight clearing cycle, a centralised exchange never closes. Withdrawal requests arrive at 03:00 on a public holiday. API connections process trades continuously. This permanent operational state means that the attack surface is always active, and the security controls that protect it must function reliably at all times. Any gap in monitoring, lapse in access control enforcement, or deviation from approval procedures can be exploited the moment it appears.

High-Value Hot Wallets

Exchanges must maintain some proportion of customer funds in hot wallets to service withdrawal demand without delay. Hot wallets, by definition, are connected to internet-accessible infrastructure. They represent the most accessible pool of liquid cryptocurrency on any given exchange and therefore the highest-priority target for both external attackers and malicious insiders. The size of the hot wallet balance, and the controls governing access to it, directly determine the maximum loss in any single incident.

Large Staff with Varying Access Levels

A mid-sized exchange may have dozens of engineers, operations staff, compliance personnel, customer support agents, and finance team members, all of whom require access to different internal systems. Managing this access estate consistently, enforcing the principle of least privilege at scale, and ensuring that access rights are reviewed and revoked promptly when roles change requires mature identity governance processes that many exchanges have not built.

24/7 Operations Pressure

Operational pressure to service customers without interruption creates constant tension with security controls. Approval workflows slow down operations. Multi-signature requirements for large withdrawals add latency. Monitoring alerts generate operational overhead. In exchanges where the security function lacks the authority to enforce controls against operational objections, these pressures systematically erode the control environment over time.

Hot Wallet and Cold Wallet Architecture

The foundational security control for any custodial exchange is the proportion of customer funds held in cold storage versus hot wallets. Cold storage, where private keys are held on hardware that has never been connected to the internet and is stored in physically secured locations, is the primary protection against remote compromise.

The Standard Tiering Model

Industry best practice establishes a clear tiering model for exchange custody:

Cold storage (minimum 95%): The vast majority of customer assets should be held in cold storage wallets where private keys are generated and stored in air-gapped hardware, with access requiring a physical signing ceremony involving multiple authorised keyholders. Cold storage funds are not accessible remotely under any circumstances. Coinbase publicly reports holding approximately 97% of customer assets in cold storage, which is the standard that institutional-grade exchanges should target.

Warm wallets (1-3%): Warm wallets are an intermediate tier: hardware wallets or HSM-backed wallets that are occasionally connected to authorised infrastructure for replenishment but are not persistently online. They provide a faster replenishment path for hot wallets without the full attack surface of an always-online hot wallet.

Hot wallets (2-5%): Hot wallets fund routine withdrawal operations. Their balance should be sized to cover normal peak withdrawal demand, with a defined replenishment process when balances fall below the operational minimum. Because cold storage has a 95% floor, the warm and hot tiers together cannot exceed 5% of customer assets, so the upper ends of the two ranges are alternatives and cannot both apply at once. The replenishment process itself must be protected: a predictable schedule for moving funds from cold to hot storage creates an operational pattern that sophisticated attackers will observe and attempt to exploit.
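As a rough illustration, the tiering ratios could be checked automatically along the following lines. This is a minimal Python sketch: the function name, the tier labels, and the default thresholds (taken from the 95% and 5% figures in this section) are assumptions for illustration, not a standard or a vendor API.

```python
from decimal import Decimal


def check_custody_ratios(balances, min_cold=Decimal("0.95"), max_hot=Decimal("0.05")):
    """Return (shares, breaches) for a dict mapping tier name to balance.

    All balances are assumed to be expressed in the same unit.
    Tier names "cold", "warm" and "hot" are illustrative labels.
    """
    total = sum(balances.values(), Decimal("0"))
    if total <= 0:
        raise ValueError("total customer assets must be positive")

    # Share of total customer assets held in each tier.
    shares = {tier: amount / total for tier, amount in balances.items()}

    breaches = []
    if shares.get("cold", Decimal("0")) < min_cold:
        breaches.append("cold storage below minimum share")
    if shares.get("hot", Decimal("0")) > max_hot:
        breaches.append("hot wallet above maximum share")
    return shares, breaches


# Example: cold storage has drifted down to 93% of customer assets.
balances = {"cold": Decimal("9300"), "warm": Decimal("200"), "hot": Decimal("500")}
shares, breaches = check_custody_ratios(balances)
print(breaches)
```

A check like this is only useful if it runs continuously against reconciled balances and a breach triggers a sweep from hot to cold storage, which is an operational decision outside the scope of the sketch.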

The Cost of Poor Ratio Discipline

The Binance 2019 breach, in which approximately $40 million in Bitcoin was lost from the exchange's hot wallet, is a case study in the consequences of inadequate ratio enforcement combined with insufficient operational controls. The attackers accumulated compromised user credentials, including API keys, over a period of months through phishing and other techniques. Once they held enough, they withdrew roughly 7,000 BTC from the hot wallet in a single transaction. The loss was bounded by the hot wallet balance at the time. Had the hot wallet security controls and ratio discipline been stronger, the loss would have been smaller. Exchanges that allow hot wallet balances to grow beyond operational requirements, whether due to convenience, poor monitoring, or the absence of a defined ratio policy, are accepting unnecessary risk.

Staff Access Control and Privileged Access Management

Access control is the single most consequential operational security control for an exchange. Who can access which systems, under what conditions, with what level of oversight determines both the insider threat surface and the impact of any credential compromise.

Principle of Least Privilege at Exchange Scale

Every member of exchange staff should have access only to the systems and data required to perform their specific role. This principle is simple to state and difficult to enforce consistently across a large, fast-growing organisation. The common failure mode is access privilege accumulation: staff members who move between roles retain access rights from previous positions, or access rights are granted on an ad-hoc basis for specific tasks and never revoked.

Enforcing least privilege requires a formal access request and approval process, regular access reviews (at minimum quarterly for privileged access), automated revocation when roles change or staff leave, and privileged access management tooling that provides auditability of all privileged sessions. The privileged access management controls appropriate for crypto firms apply directly to exchange operations, with particular emphasis on access to wallet infrastructure, internal trading systems, and customer data.
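The review and revocation steps described above can be sketched as a simple periodic job. The following Python example is illustrative only: the AccessGrant record, the 90-day interval (standing in for "quarterly"), and the idea of comparing each grant against the holder's current role are assumptions, not a description of any particular privileged access management product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=90)  # assumed stand-in for a quarterly review


@dataclass(frozen=True)
class AccessGrant:
    user: str
    system: str
    granted_for_role: str  # the role that justified the grant
    last_reviewed: date


def grants_to_act_on(grants, current_roles, today, review_interval=REVIEW_INTERVAL):
    """Split grants into those to revoke and those overdue for review.

    current_roles maps user -> current role; users who have left are absent.
    """
    revoke, review = [], []
    for grant in grants:
        if current_roles.get(grant.user) != grant.granted_for_role:
            # Role changed or user departed: the justification no longer holds.
            revoke.append(grant)
        elif today - grant.last_reviewed > review_interval:
            review.append(grant)
    return revoke, review


grants = [
    AccessGrant("ana", "wallet-admin", "treasury", date(2025, 1, 1)),
    AccessGrant("ben", "trading-engine", "trading", date(2024, 9, 1)),
    AccessGrant("cy", "support-console", "support", date(2025, 1, 10)),
]
current_roles = {"ana": "compliance", "ben": "trading", "cy": "support"}
revoke, review = grants_to_act_on(grants, current_roles, today=date(2025, 2, 1))
```

In this example ana's wallet access is flagged for revocation because she has moved to compliance, and ben's grant is flagged as overdue for review.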

Role-Based Access Control for Exchange Functions

Exchange operations divide into functionally distinct domains, each of which should have its own access tier:

Trading operations: Access to the trading engine, order book management, and market-making systems. Should not include access to wallet infrastructure or customer funds.

Financial and treasury operations: Access to wallet management systems, withdrawal processing queues, and treasury reporting. Should not include access to the trading engine or customer account management.

Technical and infrastructure operations: Access to servers, databases, and deployment pipelines. Should be segmented so that infrastructure engineers cannot access production wallet signing infrastructure without a separate authorisation process.

Customer support: Access to customer account data sufficient to resolve support queries. Should not include the ability to initiate or approve withdrawals.

Compliance and risk: Read access to transaction data and customer records for monitoring purposes. No ability to modify account states or initiate transactions.
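The access tiers above can be expressed as a deny-by-default permission map. The sketch below is a minimal Python illustration; the role names mirror the five domains listed, while the permission strings are hypothetical labels invented for the example.

```python
from enum import Enum


class Role(Enum):
    TRADING = "trading"
    TREASURY = "treasury"
    INFRASTRUCTURE = "infrastructure"
    SUPPORT = "support"
    COMPLIANCE = "compliance"


# Hypothetical permission labels; anything not listed is denied.
PERMISSIONS = {
    Role.TRADING: {"trading_engine:operate", "order_book:manage"},
    Role.TREASURY: {"wallets:manage", "withdrawals:process", "treasury:report"},
    Role.INFRASTRUCTURE: {"servers:admin", "databases:admin", "pipelines:deploy"},
    Role.SUPPORT: {"customer_accounts:read"},
    Role.COMPLIANCE: {"transactions:read", "customer_records:read"},
}


def is_allowed(role, permission):
    """Deny by default: a permission is granted only if explicitly listed."""
    return permission in PERMISSIONS.get(role, frozenset())


print(is_allowed(Role.TREASURY, "withdrawals:process"))
print(is_allowed(Role.SUPPORT, "withdrawals:process"))
```

Note that wallet signing does not appear in any role's set. Under the model described above it would sit behind a separate authorisation process, not a standing permission.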

Separation of Duties in Exchange Operations

Separation of duties is the control that most directly prevents large-scale insider fraud. The principle is that no single individual should be able to complete a high-risk operation from initiation to execution without independent authorisation. In exchange operations, this principle has several specific applications.

No single person should be able to propose, authorise, and execute a significant withdrawal. Where this control is absent, the exchange is one compromised account or one rogue employee away from catastrophic loss.

The separation of duties framework for exchanges requires at minimum:

Withdrawal Authorisation

The person who initiates a withdrawal request must be different from the person who approves it. For large withdrawals above a defined threshold, a second independent approval should be required. This dual-control requirement must be technically enforced in the withdrawal system, not merely a policy that relies on staff compliance.
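A minimal sketch of what technical enforcement of dual control might look like follows. The class, the method names, and the threshold value are hypothetical; the point is only that the system itself rejects self-approval and counts distinct approvers.

```python
from dataclasses import dataclass, field
from decimal import Decimal

# Hypothetical threshold above which a second independent approval is required.
LARGE_WITHDRAWAL_THRESHOLD = Decimal("100000")


@dataclass
class WithdrawalRequest:
    initiator: str
    amount: Decimal
    approvers: set = field(default_factory=set)

    def approve(self, approver):
        # The initiator can never approve their own request.
        if approver == self.initiator:
            raise PermissionError("initiator cannot approve their own request")
        # A set means the same approver cannot be counted twice.
        self.approvers.add(approver)

    def required_approvals(self):
        return 2 if self.amount > LARGE_WITHDRAWAL_THRESHOLD else 1

    def can_execute(self):
        return len(self.approvers) >= self.required_approvals()


request = WithdrawalRequest(initiator="alice", amount=Decimal("250000"))
request.approve("bob")
print(request.can_execute())  # one approval is not enough above the threshold
request.approve("carol")
print(request.can_execute())
```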

Treasury Function Separation

The treasury function, which manages the allocation of funds between cold storage, warm wallets, and hot wallets, must be operationally separate from the technical signing function, which executes the cryptographic transactions. A treasury officer can propose a cold-to-hot replenishment; a separate signing team executes it. Neither function should be able to act unilaterally.

Independent Reconciliation

The team responsible for transaction reconciliation, verifying that the exchange's on-chain balances match its internal ledger, must be independent of both the treasury and the signing functions. Where reconciliation is performed by the same team responsible for executing transactions, discrepancies that indicate fraud or error can be concealed. Independent reconciliation is one of the most basic controls that the Mt. Gox exchange famously lacked for years before its eventual collapse.
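At its core, the reconciliation check compares two independently sourced sets of totals. The Python sketch below assumes both are already available as per-asset totals; how the on-chain figures are gathered, and the function and parameter names, are assumptions for illustration.

```python
from decimal import Decimal


def reconcile(ledger_totals, on_chain_totals, tolerance=Decimal("0")):
    """Return {asset: on_chain - ledger} for each asset outside the tolerance.

    A negative value means the chain holds less than the ledger claims.
    """
    discrepancies = {}
    for asset in sorted(set(ledger_totals) | set(on_chain_totals)):
        ledger = ledger_totals.get(asset, Decimal("0"))
        on_chain = on_chain_totals.get(asset, Decimal("0"))
        difference = on_chain - ledger
        if abs(difference) > tolerance:
            discrepancies[asset] = difference
    return discrepancies


ledger = {"BTC": Decimal("1000"), "ETH": Decimal("5000")}
on_chain = {"BTC": Decimal("950"), "ETH": Decimal("5000")}
print(reconcile(ledger, on_chain))
```

The arithmetic is trivial. The control comes from who runs it and where the inputs come from: the on-chain totals must be obtained independently of the teams whose work is being checked.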

Multi-Signature Governance for Exchange Wallets

Multi-signature wallet governance is the technical implementation of separation of duties for cryptocurrency custody. But the technical capability is only part of the story. The operational policy that governs multi-sig use determines whether the control actually protects against the relevant threats.

Quorum Design for Cold Storage

Cold storage multi-signature configurations for institutional exchanges typically require a 3-of-5 or 4-of-7 quorum of independent keyholders. The specific quorum should be designed to achieve two objectives: preventing a single point of compromise from enabling unauthorised access, and ensuring the exchange can maintain operational continuity even if a subset of keyholders is temporarily unavailable. A 2-of-3 configuration is insufficient for institutional-grade cold storage: a well-resourced attacker needs to compromise only two keyholders, which is not materially harder than compromising one, and the exchange can tolerate only one keyholder being unavailable.
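The two objectives of quorum design can be made concrete with a little arithmetic: in an m-of-n scheme an attacker must compromise m keyholders, and the exchange can keep signing with up to n minus m keyholders unavailable. The Python sketch below illustrates this; the function names are hypothetical, and the quorum check counts identities only, where a real wallet would verify cryptographic signatures.

```python
def quorum_properties(m, n):
    """Summarise the security and availability margins of an m-of-n quorum."""
    if not 1 <= m <= n:
        raise ValueError("quorum must satisfy 1 <= m <= n")
    return {
        "keys_to_compromise": m,         # keyholders an attacker must control
        "tolerated_unavailable": n - m,  # keyholders who can be absent
    }


def quorum_met(signers, authorised_keyholders, m):
    """True if at least m distinct authorised keyholders have signed."""
    return len(set(signers) & set(authorised_keyholders)) >= m


for m, n in [(2, 3), (3, 5), (4, 7)]:
    print(f"{m}-of-{n}:", quorum_properties(m, n))
```

Moving from 2-of-3 to 3-of-5 raises both margins at once, which is why the larger quorums are the usual institutional choice.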

Key Distribution and Storage

Multi-signature keys must be held by geographically distributed keyholders, stored in hardware security modules (HSMs) or hardware wallets that are physically secured, and never replicated or transmitted digitally. Each keyholder's identity and role must be formally documented, and the list of keyholders must be subject to regular review and update procedures. Key ceremonies, occasions on which cold storage keys are used to sign transactions, must be conducted under a documented protocol with multiple witnesses and a complete audit record.

The Operational Policy Layer

The multi-sig configuration is the technical control. The operational policy layer defines: who is eligible to hold signing authority, what approval process must be completed before a signing ceremony is initiated, what documentation is required for a cold storage withdrawal request, what minimum time delay applies between a withdrawal request and its execution, and what independent verification is required before signing. Without this policy layer, the technical multi-sig control can be bypassed by deceiving the keyholders about what they are signing, with no cryptographic attack required, which is the vector exploited in the Bybit hack of February 2025.
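The policy layer can be thought of as a set of preconditions that must all hold before a signing ceremony starts. The following Python sketch is a hypothetical illustration: the record fields mirror the items listed above, and the 24-hour delay is an assumed value, not a recommendation from this framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MIN_DELAY = timedelta(hours=24)  # assumed minimum delay between request and signing


@dataclass(frozen=True)
class ColdWithdrawalRequest:
    requested_at: datetime
    approvals_complete: bool
    documentation_attached: bool
    independently_verified: bool


def ceremony_blockers(request, now, min_delay=MIN_DELAY):
    """Return the reasons a signing ceremony may not start yet (empty if none)."""
    blockers = []
    if not request.approvals_complete:
        blockers.append("approval process incomplete")
    if not request.documentation_attached:
        blockers.append("required documentation missing")
    if not request.independently_verified:
        blockers.append("independent verification missing")
    if now - request.requested_at < min_delay:
        blockers.append("minimum time delay has not elapsed")
    return blockers


requested = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
request = ColdWithdrawalRequest(requested, True, True, True)
print(ceremony_blockers(request, now=requested + timedelta(hours=6)))
```

Returning every outstanding blocker, not just the first, gives the ceremony coordinator a complete picture of what still needs to happen.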

Security Operations and Monitoring

Exchange-specific monitoring requirements go beyond the standard security operations centre function. The monitoring programme must cover both the security of exchange infrastructure and the integrity of exchange operations.

Withdrawal Pattern Monitoring

Every withdrawal processed by the exchange should be assessed against baseline patterns: normal withdrawal volumes for the relevant asset and time period, typical withdrawal sizes for the withdrawing account, and expected destination address types. Anomalies, including unusually large individual withdrawals, bursts of withdrawals across multiple accounts to the same destination, off-hours large-value withdrawals, and withdrawals to addresses with no prior relationship to the exchange, should trigger automated alerts and require manual review before processing above defined thresholds.
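A rule-based version of these checks might look like the sketch below. It is deliberately simplistic: the Withdrawal record, the multiple used to define "unusually large", the business-hours window, and the three-account threshold are all hypothetical parameters that a real programme would derive from its own baseline data.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Withdrawal:
    account: str
    amount: Decimal
    destination: str
    hour_utc: int


def anomaly_flags(withdrawal, typical_amount, known_destinations,
                  large_multiple=Decimal("10"), business_hours=range(8, 20)):
    """Flag a single withdrawal against the account's own baseline."""
    flags = []
    if withdrawal.amount > typical_amount * large_multiple:
        flags.append("unusually_large")
    if withdrawal.destination not in known_destinations:
        flags.append("new_destination")
    if "unusually_large" in flags and withdrawal.hour_utc not in business_hours:
        flags.append("off_hours_large")
    return flags


def shared_destinations(withdrawals, min_accounts=3):
    """Destinations receiving funds from at least min_accounts distinct accounts."""
    accounts_by_destination = defaultdict(set)
    for w in withdrawals:
        accounts_by_destination[w.destination].add(w.account)
    return {d for d, accounts in accounts_by_destination.items()
            if len(accounts) >= min_accounts}


suspicious = Withdrawal("acct-1", Decimal("500"), "addr-new", hour_utc=3)
print(anomaly_flags(suspicious, typical_amount=Decimal("10"),
                    known_destinations={"addr-old"}))
```

Any non-empty result would route the withdrawal to manual review above the defined thresholds, as described above.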

API Key Usage Monitoring

Exchanges that provide API access for algorithmic traders must monitor API key usage for signs of compromise: authentication from unexpected IP addresses, unusual trading patterns that deviate from the key's historical behaviour, attempts to access endpoints outside the key's normal scope, and API keys that suddenly begin placing large withdrawal requests after a period of trading-only activity. The Binance 2019 attack relied heavily on previously compromised API keys. An alert on that API activity pattern could have provided an opportunity to halt the withdrawal before the full loss occurred.
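The indicators listed above can be checked against a per-key history profile. The Python sketch below is an illustration under stated assumptions: the ApiKeyProfile structure and the endpoint paths such as "/withdraw" and "/orders" are hypothetical and do not correspond to any particular exchange's API.

```python
from dataclasses import dataclass, field


@dataclass
class ApiKeyProfile:
    known_ips: set = field(default_factory=set)       # IPs seen historically
    used_endpoints: set = field(default_factory=set)  # endpoints seen historically


def api_key_alerts(profile, ip, endpoint):
    """Compare one API request against the key's historical behaviour."""
    alerts = []
    if profile.known_ips and ip not in profile.known_ips:
        alerts.append("unexpected_ip")
    if profile.used_endpoints and endpoint not in profile.used_endpoints:
        alerts.append("endpoint_outside_history")
        withdrew_before = any(e.startswith("/withdraw") for e in profile.used_endpoints)
        if endpoint.startswith("/withdraw") and not withdrew_before:
            # A trading-only key has started requesting withdrawals.
            alerts.append("first_withdrawal_after_trading_only")
    return alerts


profile = ApiKeyProfile(known_ips={"203.0.113.10"},
                        used_endpoints={"/orders", "/balances"})
print(api_key_alerts(profile, "198.51.100.7", "/withdraw"))
```

A key with no history produces no alerts in this sketch, so a real system would need a separate policy for newly issued keys.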

Employee Behaviour Analytics

Monitoring for insider threat requires visibility into employee behaviour on internal systems: unusual access times, access to systems outside the employee's normal scope, large data exports, attempts to access wallet infrastructure from unauthorised devices, and patterns that suggest enumeration of high-value accounts or wallet balances. This monitoring must be conducted within the constraints of applicable employment law and data protection regulations, with appropriate transparency to employees about the nature and scope of monitoring.

Velocity Anomaly Detection

Velocity controls limit the rate at which withdrawals can be processed, regardless of individual transaction legitimacy. A sudden spike in aggregate withdrawal volume, even if each individual transaction appears legitimate, warrants automatic investigation and may trigger a temporary hold on further processing pending manual review. Velocity controls are a last-resort defence that can contain losses when earlier detection mechanisms fail.
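One common way to implement a velocity control is a sliding-window limit on aggregate volume. The sketch below is a minimal Python example of that technique; the class name, the one-hour window, and the volume cap are assumptions, and timestamps are plain seconds to keep the example self-contained.

```python
from collections import deque
from decimal import Decimal


class VelocityLimiter:
    """Cap the aggregate withdrawal volume processed within a sliding window."""

    def __init__(self, window_seconds, max_volume):
        self.window_seconds = window_seconds
        self.max_volume = max_volume
        self._events = deque()  # (timestamp, amount) of accepted withdrawals
        self._total = Decimal("0")

    def allow(self, timestamp, amount):
        # Forget withdrawals that have aged out of the window.
        while self._events and self._events[0][0] <= timestamp - self.window_seconds:
            _, expired = self._events.popleft()
            self._total -= expired
        if self._total + amount > self.max_volume:
            return False  # hold for manual review instead of processing
        self._events.append((timestamp, amount))
        self._total += amount
        return True


limiter = VelocityLimiter(window_seconds=3600, max_volume=Decimal("100"))
print(limiter.allow(0, Decimal("60")))   # accepted
print(limiter.allow(10, Decimal("50")))  # would exceed the cap within the hour
```

The limiter judges only the aggregate, so it can hold a burst even when every individual withdrawal looks legitimate, which is the property described above.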

Exchange Security Failures and Operational Lessons

The history of exchange security failures provides a consistent lesson: the losses were not caused by cryptographic breakthroughs. They were caused by the absence or circumvention of the operational controls described in this article.

Mt. Gox: The Archetype of Operational Failure

Mt. Gox, which handled approximately 70% of global Bitcoin trading at its peak, lost approximately 850,000 Bitcoin (valued at roughly $450 million at the time of its 2014 collapse). The post-mortem identified multiple cascading operational failures: the exchange had no proper cold storage discipline, with the vast majority of customer Bitcoin held in hot wallets accessible to the exchange's systems. Its internal reconciliation was absent for extended periods, meaning that the actual on-chain balance was lower than the internally recorded balance for years before the collapse. The exchange had been losing Bitcoin to a combination of theft and operational mismanagement for an extended period without its management detecting the discrepancy. Mt. Gox is the template for what happens when an exchange has no meaningful operational security controls whatsoever.

Binance 2019: Access Control and Phishing

Binance lost approximately $40 million in Bitcoin in May 2019. The attackers spent months accumulating compromised user credentials, including API keys and two-factor authentication codes, through a combination of phishing campaigns, malware, and other attack vectors. When they executed the withdrawal, the requests carried valid credentials, so the authentication controls did not stop them. The attack withdrew roughly 7,000 BTC from the hot wallet in a single transaction. The operational lesson is that several controls are needed together: phishing resistance controls for staff, API credential security, velocity controls that could have flagged the sudden large withdrawal, and hot wallet ratio discipline that would have bounded the loss.

FTX: Governance Collapse Enabling Fraud

FTX's 2022 collapse, which resulted in approximately $8 billion in customer losses, was the consequence of the complete absence of operational controls rather than their circumvention. The exchange had no independent treasury function, no separation of duties for asset management, no meaningful board oversight, and no reconciliation process that could detect the systematic misappropriation of customer funds. The governance failures that enabled the fraud were not subtle. They were the absence of every basic institutional control that a regulated financial institution would be required to maintain. FTX is the clearest illustration available that operational security controls for exchanges are not compliance overhead; they are the mechanism that makes fraud operationally difficult.

Bybit 2025: Social Engineering at Scale

The February 2025 Bybit breach, in which approximately $1.5 billion in Ethereum was stolen, represents the current state of the art in sophisticated exchange attacks. The Lazarus Group, the North Korean state-sponsored threat actor responsible, compromised the signing process through a combination of social engineering and manipulation of the transaction display in the signing interface. Keyholders authorised a transaction they believed to be routine while actually approving one that handed the contents of an Ethereum cold wallet to attacker-controlled addresses. The detailed technical and operational analysis is covered in the full post on the Bybit hack. The operational lesson is that multi-signature controls require verification procedures that prevent keyholders from signing on the strength of the interface display alone: each transaction must be independently verified through multiple channels before any signing ceremony is conducted.

Regulatory and Compliance Requirements for Exchange Security

The regulatory environment for centralised exchange security has matured substantially in recent years, particularly in the European Union. Compliance with applicable regulations is not a substitute for genuine security, but it provides a minimum baseline and creates legal accountability for security failures.

MiCA Requirements

The Markets in Crypto-Assets Regulation (MiCA) requires Crypto-Asset Service Providers (CASPs) operating in the EU to meet specific operational security standards. MiCA compliance requirements for exchanges include: segregation of client assets from the exchange's own assets, with records sufficient to identify each customer's holdings at any time; prudential safeguards in the form of own funds or an insurance policy; business continuity and disaster recovery planning; and robust IT security and operational risk management. MiCA also requires CASPs to have governance arrangements that include clear lines of responsibility, effective controls, and adequate reporting to senior management.

DORA Requirements

The Digital Operational Resilience Act (DORA) applies to financial entities including crypto-asset service providers within scope. DORA compliance requirements impose a comprehensive ICT risk management framework covering: ICT risk assessment and treatment, ICT-related incident classification and reporting, digital operational resilience testing (including threat-led penetration testing for significant entities), ICT third-party risk management, and arrangements for sharing cyber threat information between financial entities. DORA's ICT incident reporting obligations are particularly relevant for exchange security: major ICT-related incidents must be reported to competent authorities within defined timeframes, creating a legal obligation to detect and classify incidents promptly.

Insurance and Proof of Reserves

Beyond regulatory compliance, institutional clients and sophisticated retail users increasingly require exchanges to demonstrate adequate insurance coverage and regular proof-of-reserves attestations. Proof of reserves, where an exchange uses cryptographic commitments to demonstrate that its on-chain holdings are sufficient to cover its customer liabilities, is the closest on-chain analogue to an audit. It does not prove the absence of fraud, but each attestation provides a point-in-time check on whether the exchange's stated reserves match its actual holdings. Exchanges should have proof-of-reserves procedures embedded in their operational security programme, not treated as a standalone marketing exercise.
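To illustrate what a cryptographic commitment to customer liabilities can look like, the following Python sketch builds a plain Merkle tree over customer balances and lets one customer verify that their balance is included. It is a simplified teaching example under stated assumptions: the function names and leaf encoding are invented here, balances are integers in the asset's smallest unit, odd levels are padded by duplicating the last node, and production schemes typically use Merkle sum trees or similar constructions to also bind the total.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(account_id: str, balance: int) -> bytes:
    # Prefixes keep leaf hashes distinct from internal node hashes.
    return _hash(b"leaf:" + f"{account_id}:{balance}".encode("utf-8"))


def _next_level(level):
    if len(level) % 2 == 1:
        level = level + [level[-1]]  # pad odd levels by repeating the last node
    return [_hash(b"node:" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def inclusion_proof(leaves, index):
    """Sibling hashes from leaf to root, each tagged with whether it sits on the left."""
    proof = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root):
    current = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = _hash(b"node:" + pair)
    return current == root


def covers_liabilities(reserves: int, liabilities: dict) -> bool:
    return reserves >= sum(liabilities.values())


liabilities = {"alice": 50, "bob": 30, "carol": 20}
leaves = [leaf_hash(a, b) for a, b in sorted(liabilities.items())]
root = merkle_root(leaves)
proof = inclusion_proof(leaves, 2)  # carol is the third leaf in sorted order
print(verify_inclusion(leaf_hash("carol", 20), proof, root))
```

The exchange publishes the root, each customer checks their own leaf against it, and an independent party compares verifiable on-chain reserves with the committed liabilities.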

Employee Security for Exchange Staff

The human element of exchange security is where the most sophisticated attacks are increasingly focused. State-sponsored threat actors including Lazarus Group specifically target exchange employees through social engineering, knowing that a compromised insider with privileged access is more valuable than any technical vulnerability.

Background Vetting

All exchange staff with access to privileged systems, wallet infrastructure, or significant customer data should undergo formal background vetting appropriate to the sensitivity of their role. For signing keyholders, treasury staff, and senior technical roles, enhanced vetting including employment history verification, financial background checks, and reference checks is warranted. Vetting should be repeated periodically, not only at hiring, because the circumstances of existing staff members change over time.

Phishing Resistance for Privileged Staff

Exchange staff in privileged roles are high-value phishing targets. The organisation must deploy phishing-resistant multi-factor authentication (hardware security keys rather than SMS or TOTP where technically feasible), provide regular targeted security awareness training that reflects the actual tactics used against exchange employees, and operate a simulated phishing programme to identify staff who need additional support. Staff in signing roles and treasury functions should be specifically briefed on the social engineering tactics used in recent exchange compromises, including the specific techniques used in the Bybit attack.

Device Management and Clear Desk Policies

Privileged operations, including any interaction with wallet signing infrastructure, must be conducted only on managed, hardened devices that are enrolled in the organisation's mobile device management programme. Personal devices must never be used for privileged operations. Clear desk policies, prohibiting the storage or display of credentials, seed phrases, or sensitive operational documentation in physical or digital form outside of approved secure storage, must be enforced and monitored.

Handling Staff Departure from Privileged Roles

When a staff member leaves a privileged role, whether through resignation, termination, or role change, access revocation must be immediate and comprehensive. For signing keyholders, departure triggers a formal key ceremony to revoke the departing keyholder's key and add their replacement. The temptation to delay this process because of operational inconvenience is the path to a compromised access estate. Immediate, comprehensive off-boarding for privileged staff is a non-negotiable requirement.

Building an Exchange Security Programme: The PPT Framework

Security4Web3 structures exchange security programme design around the People, Process, Technology framework because it forces security directors to address all three dimensions rather than defaulting to technology procurement as the primary response to a security challenge. The majority of exchange failures have been in the People and Process dimensions. Technology investments that are not grounded in strong process and supported by appropriately trained and accountable people will not deliver their intended security outcomes.

People

The people dimension of exchange security covers the full lifecycle of security-relevant personnel management:

Hiring: Security roles require candidates with relevant credentials and verifiable track records. Do not hire for privileged roles without proper vetting.

Role definition: Every role that interacts with sensitive systems must have a clear, documented definition of access rights and responsibilities.

Training: All staff must receive security awareness training appropriate to their role. Privileged staff must receive enhanced, role-specific training.

Accountability: Security responsibilities must be assigned to named individuals with reporting lines that give them the authority to enforce controls without being overruled by operational convenience.

Culture: The organisation's leadership must treat operational security as a genuine operational priority, not as a compliance checkbox.

Process

The process dimension covers the approval workflows, escalation paths, incident response procedures, and regular security reviews that make the control framework operational. Approval workflows for withdrawals must be documented, technically enforced, and regularly tested. Escalation paths for security incidents must be defined and rehearsed through tabletop exercises. Incident response procedures must be specific enough to be actionable under pressure: a generic incident response plan that requires security professionals to improvise under a live attack is not a plan. Regular security reviews must include access rights reviews, penetration testing, red team exercises, and review of the control environment against the evolving threat landscape.

Technology

The technology dimension covers the specific tools and systems that implement the People and Process decisions. These include hardware security modules (HSMs) for key management, multi-signature wallet governance platforms, identity and access management systems with privileged access management capability, security information and event management (SIEM) systems for monitoring and alerting, endpoint detection and response (EDR) on all managed devices, and phishing-resistant MFA for all privileged access. Each technology choice must be assessed against the specific threat model and the operational constraints of the exchange. Technology that cannot be operationally maintained, that creates excessive friction for legitimate operations, or that is deployed without trained operators will degrade the security posture over time instead of improving it.

A mature exchange security programme is one in which all three dimensions are aligned: the right people, following the right processes, supported by appropriate technology. Any programme that is strong in one dimension but weak in another has a gap that a sophisticated attacker will identify and exploit.

Frequently Asked Questions

What are the main security risks for a centralised crypto exchange?

The primary security risks for a centralised exchange are: hot wallet compromise through phishing, malware, or API key theft; insider threats and insider-enabled fraud; inadequate separation of duties allowing a single actor to initiate and authorise large withdrawals; social engineering attacks targeting privileged staff; weak access controls on internal systems; and insufficient monitoring that allows anomalous withdrawal activity to go undetected. The largest exchange losses in history have been caused by operational failures, not cryptographic vulnerabilities.

How much of exchange funds should be held in cold storage?

Industry best practice is to hold a minimum of 95% of customer funds in cold storage, with no more than 5% available in hot or warm wallets for operational liquidity. Leading exchanges such as Coinbase publicly report holding approximately 97% of assets in cold storage. The specific ratio should be determined by the exchange's actual withdrawal demand patterns, with the hot wallet allocation sized to cover normal peak demand without requiring cold storage access for routine operations.

What is multi-signature governance for a crypto exchange?

Multi-signature governance for a crypto exchange means that cold storage withdrawals require cryptographic signatures from multiple independent keyholders before a transaction can be authorised. A typical institutional configuration requires 3-of-5 or 4-of-7 signatures. The operational policy sitting on top of the technical multi-sig defines who holds keys, where keys are stored, what approval process must be completed before a signing ceremony is conducted, and what time delays apply between a withdrawal request and execution.

How do exchanges prevent insider theft?

Exchanges prevent insider theft through a combination of controls: separation of duties so that no single person can initiate and authorise a withdrawal; multi-signature requirements for all significant transfers; role-based access control that limits each staff member's access to systems relevant to their function; comprehensive audit logging of all privileged actions; regular access reviews; background vetting for staff in sensitive roles; and behavioural monitoring that flags unusual access patterns or transaction activity. The FTX collapse demonstrated the catastrophic consequences of eliminating these controls entirely.

What regulations apply to crypto exchange security?

In the European Union, exchanges operating as Crypto-Asset Service Providers (CASPs) under MiCA are required to maintain robust operational security controls, including segregation of client assets, business continuity planning, and IT security requirements. The Digital Operational Resilience Act (DORA) applies to financial entities including crypto-asset service providers, imposing ICT risk management, incident reporting, operational resilience testing, and third-party risk management obligations. Outside the EU, exchanges must comply with the security-related requirements of their local regulatory frameworks, which increasingly reference international standards and assurance frameworks such as ISO/IEC 27001 and SOC 2.
