Video bitrate explained for streaming and surveillance: what bitrate is, CBR vs VBR vs CRF, recommended ranges for broadcast and for security cameras, how AI analytics changes the requirement, and the storage and bandwidth math at fleet scale.
Video bitrate is the number that decides whether a camera fleet delivers usable forensic detail, breaks the storage budget, exceeds the network the cameras run on, or kills the AI analytics that the buyer paid for. Getting it right is one of the most consequential technical decisions in a surveillance deployment, and yet it is consistently the decision most buyers and integrators make through inherited defaults rather than deliberate calculation.
Most articles about video bitrate are written for broadcasters and game streamers, which is understandable because that is where the keyword traffic lives. The framing in those articles is genuinely useful for those audiences. It is also wrong, or at least incomplete, for surveillance buyers. Surveillance has different latency floors, different bandwidth profiles, different storage economics, different multi-camera concurrency requirements, and a real AI inference requirement that broadcast streaming does not share.
This guide explains what video bitrate actually is, how it interacts with resolution and codec choice, the bitrate ranges that consistently work for streaming and for surveillance, why the surveillance bitrate decision differs from the broadcast bitrate decision, how AI analytics changes the calculation, and how modern video management software handles bitrate as a per-camera per-workload decision rather than a fleet-wide default.
Video bitrate is the rate at which video data is encoded, transmitted, and stored, measured in bits per second. The unit you almost always see is kilobits per second (kbps) or megabits per second (Mbps). A 4 Mbps stream produces 4 megabits (500 kilobytes) of video data every second. A 25 Mbps stream produces about 3.1 megabytes per second.
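The unit conversion above is simple enough to sketch in a few lines. This is a minimal illustration that assumes decimal units (1 megabit = 1,000,000 bits, 1 byte = 8 bits); the function name is just a label chosen for this example.

```python
def bitrate_to_bytes_per_second(mbps: float) -> float:
    """Convert megabits per second to bytes per second (decimal units)."""
    bits_per_second = mbps * 1_000_000
    return bits_per_second / 8


for mbps in (4, 25):
    bps = bitrate_to_bytes_per_second(mbps)
    # Show the same rate in kilobytes and megabytes per second.
    print(f"{mbps} Mbps = {bps / 1000:.0f} kB/s = {bps / 1_000_000:.3f} MB/s")
```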
Bitrate is the bridge between three things that buyers care about: image quality, storage cost, and bandwidth cost. Higher bitrate generally means higher image quality, larger storage footprint, and higher bandwidth consumption. Lower bitrate generally means lower quality, smaller storage footprint, and lower bandwidth. The relationship is not linear, but the direction is consistent.
Bitrate is not the same as resolution. Resolution is the number of pixels per frame. Bitrate is how many bits are used to describe those pixels over time. The same 1080p resolution can be encoded at 1 Mbps (visibly compressed) or 10 Mbps (visually transparent), with dramatically different storage and bandwidth consequences. The same 4K resolution can be encoded at 8 Mbps (heavily compressed, often unacceptable for forensic detail) or 25 Mbps (high quality). Resolution sets the frame size. Bitrate sets the quality of what is inside the frame.
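One rough way to see how resolution and bitrate interact is to compute the average number of encoded bits available per pixel per frame. The sketch below assumes 30 frames per second for all four examples, which is an assumption made here for illustration; the figure it produces is only a coarse indicator, since perceived quality also depends on codec and scene content.

```python
def bits_per_pixel(mbps: float, width: int, height: int, fps: float) -> float:
    """Average encoded bits spent on each pixel of each frame."""
    pixels_per_second = width * height * fps
    return (mbps * 1_000_000) / pixels_per_second


# The four examples from the paragraph above, at an assumed 30 fps.
examples = [
    ("1080p at 1 Mbps", 1, 1920, 1080),
    ("1080p at 10 Mbps", 10, 1920, 1080),
    ("4K at 8 Mbps", 8, 3840, 2160),
    ("4K at 25 Mbps", 25, 3840, 2160),
]
for label, mbps, width, height in examples:
    print(f"{label}: {bits_per_pixel(mbps, width, height, 30):.3f} bits per pixel")
```

Note how 4K at 8 Mbps leaves each pixel with far fewer bits than 1080p at 10 Mbps, which is why a higher resolution alone does not guarantee more usable detail.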
Bitrate is not the same as codec choice either. The codec (H.264, H.265, AV1) determines how efficiently the bits are used. H.265 at 4 Mbps typically delivers visual quality similar to H.264 at 6 to 8 Mbps. The bitrate is the number. The codec is the efficiency.
Encoders implement bitrate in three primary modes, and the choice of mode affects how the bitrate behaves in practice.
Constant Bitrate (CBR) encodes every second of video at the same target bitrate, regardless of how much motion or detail the scene contains. CBR produces predictable storage and bandwidth consumption, which is operationally useful for surveillance and live streaming where the network and storage budgets are known. The cost of CBR is that low-motion scenes (a static hallway at 3 AM) consume the same storage as high-motion scenes (a busy intersection during rush hour), which means CBR usually wastes bits on the low-motion scenes.
Variable Bitrate (VBR) lets the encoder spend more bits on complex scenes and fewer bits on simple scenes, while targeting an average bitrate across the stream. VBR delivers better quality per bit than CBR on average, with the cost being unpredictable instantaneous bandwidth and storage consumption. VBR is the default for most modern surveillance deployments and most modern streaming workflows because the quality-per-bit advantage is meaningful at scale.
Constant Rate Factor (CRF) encodes for a target visual quality level rather than a target bitrate, letting the actual bitrate float to whatever level achieves the quality target. CRF produces the best quality-per-bit ratio but the least predictable storage and bandwidth profile, which makes it useful for archival and post-production work but less useful for real-time streaming and live surveillance recording where bandwidth predictability matters.
For surveillance deployments, the pattern that consistently works is VBR with a defined maximum bitrate cap. The cap protects the network and storage budget. The variable encoding inside the cap delivers the quality-per-bit advantage. CBR remains useful in specific scenarios: deployments where network shaping requires a predictable instantaneous bitrate, and deployments where the per-camera, per-day storage profile has to be exactly predictable.
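Cameras normally apply a capped-VBR setting in their own firmware, but the same idea can be illustrated with a software encoder. The sketch below builds an ffmpeg command line using the standard `-b:v` (target average), `-maxrate` (cap), and `-bufsize` (rate-control buffer) options. The file names, the choice of libx264, and the two-second buffer are assumptions made for this example, not settings taken from any particular camera or VMS.

```python
import shlex


def capped_vbr_args(src: str, dst: str, target_kbps: int, cap_kbps: int,
                    codec: str = "libx264") -> list[str]:
    """Build an ffmpeg argument list for VBR with a maximum bitrate cap."""
    if cap_kbps < target_kbps:
        raise ValueError("cap must be at least the target average bitrate")
    # Assumption: size the rate-control buffer at about two seconds of video at the cap.
    bufsize_kbps = 2 * cap_kbps
    return [
        "ffmpeg", "-i", src,
        "-c:v", codec,
        "-b:v", f"{target_kbps}k",       # average bitrate the encoder aims for
        "-maxrate", f"{cap_kbps}k",      # ceiling that protects network and storage
        "-bufsize", f"{bufsize_kbps}k",  # window over which the cap is enforced
        "-an",                           # drop audio to keep the example video-only
        dst,
    ]


# Hypothetical file names; prints the command instead of running it.
print(shlex.join(capped_vbr_args("camera01.mp4", "camera01_capped.mp4", 4000, 6000)))
```

A smaller buffer enforces the cap more strictly at the cost of quality during sudden motion, so the buffer size is worth tuning against the network's tolerance for short bursts.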
Live streaming and broadcast bitrate guidance is well-established in the OBS, YouTube, Twitch, and conferencing ecosystems. The recommendations vary slightly by platform but converge around the following ranges for H.264 encoding at 30 frames per second.
480p (854x480) typically runs 1.0 to 2.5 Mbps. 720p (1280x720) typically runs 2.5 to 5.0 Mbps. 1080p (1920x1080) typically runs 4.5 to 9.0 Mbps for general streaming and 6.0 to 12.0 Mbps for higher motion content like fast-paced gaming. 1440p (2560x1440) typically runs 9.0 to 18.0 Mbps. 4K (3840x2160) typically runs 18.0 to 25.0 Mbps for streaming and meaningfully higher for higher-quality broadcast applications.
H.265 typically achieves equivalent visual quality at 50 to 70 percent of the H.264 bitrate. AV1 achieves equivalent visual quality at roughly 60 to 80 percent of H.265 bitrate on supported encoders, with hardware availability still uneven in mid-2026.
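As a rough sketch, the codec ratios above can be applied to any H.264 range to estimate the equivalent H.265 and AV1 ranges. The pairing used here (the smallest factor applied to the low end, the largest to the high end) is an assumption chosen to give the widest plausible range.

```python
def scale_range(low: float, high: float,
                min_factor: float, max_factor: float) -> tuple[float, float]:
    """Scale a bitrate range (Mbps) by a range of codec efficiency factors."""
    return (low * min_factor, high * max_factor)


h264_1080p = (4.5, 9.0)                     # general 1080p streaming, from the text
h265_1080p = scale_range(*h264_1080p, 0.5, 0.7)
av1_1080p = scale_range(*h265_1080p, 0.6, 0.8)

for name, (low, high) in [("H.264", h264_1080p),
                          ("H.265", h265_1080p),
                          ("AV1", av1_1080p)]:
    print(f"{name}: {low:.2f} to {high:.2f} Mbps")
```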
Adaptive bitrate streaming (ABR) protocols including HLS, LL-HLS, and MPEG-DASH publish multiple bitrate renditions of the same content (commonly called a bitrate ladder), and the player switches between renditions based on the viewer's available bandwidth. The ladder typically spans 5 to 8 renditions from low-bitrate mobile-friendly tiers to high-bitrate full-quality tiers, with the encoder generating each rendition at a specific bitrate target.
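The core of the rendition switch can be sketched as picking the highest rung of the ladder that fits inside the measured bandwidth. The ladder values and the 80 percent headroom factor below are hypothetical, and real players typically also weigh buffer level and throughput history, so treat this as a simplified illustration rather than how any specific player works.

```python
def pick_rendition(ladder_kbps: list[int], available_kbps: float,
                   headroom: float = 0.8) -> int:
    """Return the highest rendition bitrate that fits in the usable bandwidth."""
    budget = available_kbps * headroom       # leave room for throughput jitter
    fitting = [rate for rate in ladder_kbps if rate <= budget]
    # If nothing fits, fall back to the lowest rung rather than stalling.
    return max(fitting) if fitting else min(ladder_kbps)


# Hypothetical six-rung ladder in kbps.
ladder = [400, 800, 1500, 3000, 4500, 6000]
for bandwidth in (300, 2500, 5000, 20000):
    print(f"{bandwidth} kbps available -> {pick_rendition(ladder, bandwidth)} kbps rendition")
```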
These ranges are designed for one-stream-at-a-time consumer playback over consumer-grade internet. They are useful as a baseline reference. They are not directly applicable to surveillance because surveillance does not have the same workflow profile.
A surveillance deployment records many cameras concurrently and continuously, often for 14 to 90 days of retention, with bandwidth and storage costs that compound across the entire fleet. A 100-camera deployment recording at roughly 10 Mbps per camera, near the upper end of the 1080p broadcast guidance, would consume 1 Gbps of aggregate recording bandwidth and produce roughly 11 terabytes of footage per day, which exceeds the budget of nearly every deployment.
Surveillance bitrate is also a forensic decision. The bitrate that produces operationally usable footage depends on the workflow the camera is serving. A camera doing general situational awareness can tolerate aggressive compression. A camera doing face recognition or license plate capture needs bitrate that preserves the pixel detail required for the identification. Cameras doing AI analytics need bitrate that does not degrade the model's input below the inference threshold.
Surveillance bitrate is multi-camera concurrent. The bitrate per camera multiplies across the fleet, against both the recording bandwidth at the site and the live-viewing bandwidth across operators. A 4K camera at 12 Mbps consumed by one viewer is 12 Mbps of bandwidth. The same camera consumed by five operators concurrently is 60 Mbps if the VMS does not transcode or proxy. Surveillance bitrate decisions therefore have to account for the multiplication that does not exist in single-stream broadcast.
For these reasons, surveillance bitrate guidance is meaningfully different from broadcast bitrate guidance, and the guidance below reflects production-deployment patterns rather than the broadcast equivalents.
The bitrate ranges that consistently work for surveillance, separated by resolution and codec at 25 frames per second with VBR encoding and moderate scene motion, run approximately as follows.
For H.264 encoding, HD (720p) typically runs 1.5 to 3.5 Mbps for general surveillance and 3.0 to 5.0 Mbps for AI analytics workloads. Full HD (1080p) typically runs 3.0 to 6.0 Mbps for general surveillance, 4.0 to 8.0 Mbps for face recognition workloads, and 6.0 to 10.0 Mbps for license plate capture workloads. 4K (UHD) typically runs 8.0 to 15.0 Mbps for general surveillance, 12.0 to 20.0 Mbps for AI workloads at full resolution, and higher for license plate capture at distance.
For H.265 encoding, the same scenes typically run at 50 to 70 percent of the H.264 bitrate. HD at 0.8 to 2.0 Mbps for general work. Full HD at 1.5 to 4.0 Mbps for general work and 3.0 to 6.0 Mbps for AI workloads. 4K at 4.5 to 10.0 Mbps for general work and 8.0 to 14.0 Mbps for AI workloads.
For AV1 encoding, the bitrate runs roughly 60 to 80 percent of H.265 on supported encoders. AV1 is meaningful in 2026 deployments where the hardware decoder generation supports it, particularly for archive storage tiers where the codec efficiency reduces long-tail cost.
These ranges assume moderate scene motion. High-motion scenes (busy retail floors, transit hubs, traffic intersections) typically need 30 to 50 percent more bitrate than the moderate-motion baseline. Low-motion scenes (overnight hallway monitoring, server room cameras) can frequently encode 30 to 50 percent below the baseline without operational quality loss.
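The motion adjustment can be sketched as a multiplier range applied to a moderate-motion baseline. The factor table below simply encodes the 30 to 50 percent figures from the paragraph above; the names are chosen for this example.

```python
# Multipliers relative to the moderate-motion baseline.
MOTION_FACTORS = {
    "low": (0.5, 0.7),       # 30 to 50 percent below baseline
    "moderate": (1.0, 1.0),
    "high": (1.3, 1.5),      # 30 to 50 percent above baseline
}


def adjust_for_motion(baseline_mbps: float, motion: str) -> tuple[float, float]:
    """Return the (low, high) bitrate range in Mbps for a given motion level."""
    low_factor, high_factor = MOTION_FACTORS[motion]
    return (baseline_mbps * low_factor, baseline_mbps * high_factor)


for level in ("low", "moderate", "high"):
    low, high = adjust_for_motion(4.0, level)
    print(f"{level} motion: {low:.1f} to {high:.1f} Mbps")
```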
The bitrate range required for AI surveillance analytics is consistently higher than the bitrate range required for human-only surveillance, and the difference is one of the most common gaps in production deployments.
AI face recognition accuracy degrades sharply below a threshold bitrate that preserves the pixel detail the model needs on the face region. The exact threshold varies by model and resolution, but the operational rule is that face recognition typically needs at least 4 to 6 Mbps at 1080p and 8 to 12 Mbps at 4K with H.265 to deliver production-grade accuracy on faces at typical doorway distances.
AI license plate recognition (ANPR/ALPR) has even sharper bitrate sensitivity, particularly at distance. The model needs pixel-level detail across the plate, and aggressive compression that smears letters into illegible blocks degrades plate accuracy fast. Plate-capture cameras typically need higher bitrate than general-purpose cameras at the same resolution.
AI object detection (person, vehicle, package) is meaningfully more bitrate-tolerant than face recognition or ANPR. Full HD at 3 to 4 Mbps with H.265 typically delivers reliable object detection. The bitrate range that delivers face recognition is more than sufficient for object detection.
AI pose estimation and unsafe behavior detection sit between face recognition and object detection on bitrate sensitivity. The model needs enough pixel detail to resolve body joints, which is meaningfully more than object detection requires but less than face recognition requires at the same scene distance.
The implication is direct. Buyers specifying AI analytics should specify the bitrate per camera against the AI workload the camera serves, not against a fleet-wide bitrate default. Cameras serving face recognition and ANPR should run at the higher end of the surveillance bitrate range. Cameras serving general AI analytics can run at the middle of the range. Cameras serving general situational awareness without AI can run at the lower end.
For deployments where AI analytics is part of the operational workflow, the bitrate ranges that consistently support production-grade AI accuracy run approximately as follows, assuming H.265 encoding at 25 fps with VBR.
For face recognition at typical doorway distance, Full HD at 4 to 6 Mbps or 4K at 8 to 12 Mbps. Cameras with the subject closer to the camera can run at the lower end; cameras at longer distances need the upper end.
For license plate recognition at vehicle entry distance, 4K at 10 to 16 Mbps with the right lens, or specialized ANPR cameras with even higher effective pixel density on the plate region.
For pose estimation and unsafe behavior detection, Full HD at 3 to 5 Mbps for typical scene distances. 4K at 6 to 10 Mbps for wide scenes where the worker is farther from the camera.
For object detection, person tracking, and motion detection, Full HD at 2.5 to 4 Mbps with H.265 is typically sufficient.
For crowd density and counting, Full HD at 3 to 5 Mbps with H.265 supports density estimation reliably. 4K at higher bitrate is required if individual-person tracking through the crowd is part of the workflow.
These ranges are baseline guidance. Real deployments should pilot the AI workload on real plant or site footage before locking in the per-camera bitrate, because lighting, weather, scene motion, and camera placement all affect what the model actually needs.
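The per-workload guidance above lends itself to a small lookup table that a deployment plan could start from. The sketch below encodes only the H.265, 25 fps, VBR figures stated in this section; the workload keys and function name are hypothetical labels, and a missing entry means the section gives no figure for that combination.

```python
# (workload, resolution) -> (low Mbps, high Mbps), H.265 at 25 fps with VBR.
AI_BITRATE_RANGES = {
    ("face_recognition", "1080p"): (4.0, 6.0),
    ("face_recognition", "4k"): (8.0, 12.0),
    ("license_plate", "4k"): (10.0, 16.0),
    ("pose_estimation", "1080p"): (3.0, 5.0),
    ("pose_estimation", "4k"): (6.0, 10.0),
    ("object_detection", "1080p"): (2.5, 4.0),
    ("crowd_counting", "1080p"): (3.0, 5.0),
}


def recommended_range(workload: str, resolution: str):
    """Return the baseline (low, high) range in Mbps, or None if not covered."""
    return AI_BITRATE_RANGES.get((workload, resolution))


for key in [("face_recognition", "1080p"), ("license_plate", "4k"),
            ("license_plate", "1080p")]:
    print(key, "->", recommended_range(*key))
```

These values are starting points for a pilot, not final settings, for the reasons given above.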
Bitrate translates into storage at a known rate. A camera recording at 4 Mbps continuously produces 4 megabits per second of footage, which is 30 megabytes per minute, 1.8 gigabytes per hour, and 43 gigabytes per day. Multiply by retention days to get storage required per camera. Multiply by camera count to get total fleet storage.
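The same arithmetic as a short sketch, assuming continuous recording and decimal units (1 GB = 1,000,000,000 bytes). The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def storage_gb_per_day(mbps: float) -> float:
    """Storage produced by one camera recording continuously for a day."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_DAY / 1e9


def camera_storage_tb(mbps: float, retention_days: int) -> float:
    """Storage one camera needs to hold its full retention window."""
    return storage_gb_per_day(mbps) * retention_days / 1000


print(f"4 Mbps: {storage_gb_per_day(4):.1f} GB per day")
print(f"4 Mbps, 30 days: {camera_storage_tb(4, 30):.2f} TB per camera")
```

A useful shortcut falls out of this: every sustained 1 Mbps costs 10.8 GB per camera per day.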
Representative storage profiles per camera per day, at 25 fps with H.265 encoding, run approximately as follows (decimal units, at 10.8 GB per day for each sustained Mbps). HD at 2 Mbps consumes about 22 GB per day. Full HD at 4 Mbps consumes about 43 GB per day. Full HD at 6 Mbps for AI workloads consumes about 65 GB per day. 4K at 10 Mbps consumes about 108 GB per day. 4K at 16 Mbps for AI license plate capture consumes about 173 GB per day.
For a 100-camera mixed fleet (50 cameras at Full HD 4 Mbps, 30 cameras at Full HD 6 Mbps for AI, 20 cameras at 4K 10 Mbps), the aggregate recording bitrate is 580 Mbps and the daily storage consumption is approximately 6.3 terabytes. At 30 days retention, that is about 188 terabytes of storage capacity. At 90 days retention, that is about 564 terabytes.
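The mixed-fleet calculation can be reproduced with a short sketch. It uses exact decimal arithmetic (10.8 GB per day for each sustained Mbps rather than a round 10) and assumes continuous recording with no filesystem or RAID overhead, so real capacity planning would add margin on top.

```python
# 1 Mbps sustained for a day, in decimal gigabytes.
GB_PER_DAY_PER_MBPS = 1_000_000 / 8 * 86_400 / 1e9

# (camera count, Mbps per camera) for the mixed fleet described above.
FLEET = [(50, 4.0), (30, 6.0), (20, 10.0)]


def fleet_mbps(fleet) -> float:
    """Aggregate recording bitrate across the fleet."""
    return sum(count * mbps for count, mbps in fleet)


def fleet_storage_tb(fleet, retention_days: int) -> float:
    """Total storage needed to hold the retention window, in decimal TB."""
    return fleet_mbps(fleet) * GB_PER_DAY_PER_MBPS * retention_days / 1000


print(f"Aggregate bitrate: {fleet_mbps(FLEET):.0f} Mbps")
for days in (1, 30, 90):
    print(f"{days} day(s): {fleet_storage_tb(FLEET, days):.1f} TB")
```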
These numbers compound quickly. The architectural decisions that materially reduce storage cost are event-triggered recording (record continuously at lower bitrate, switch to higher bitrate when motion or AI events fire), modern codec efficiency (H.265 over H.264, AV1 where supported), and tiered retention (recent days at full bitrate, older days at lower bitrate or downsampled).
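To get a feel for what event-triggered recording saves, the effective bitrate can be modeled as a time-weighted average of the idle and event bitrates. The numbers in the example (1.5 Mbps idle, 6 Mbps during events, events active 10 percent of the time) are hypothetical and should be replaced with measurements from the actual site.

```python
def effective_mbps(base_mbps: float, event_mbps: float, event_fraction: float) -> float:
    """Time-weighted average bitrate for event-triggered recording."""
    if not 0.0 <= event_fraction <= 1.0:
        raise ValueError("event_fraction must be between 0 and 1")
    return base_mbps * (1 - event_fraction) + event_mbps * event_fraction


def saving_vs_continuous(base_mbps: float, event_mbps: float, event_fraction: float) -> float:
    """Fractional storage saving versus recording at event_mbps all the time."""
    return 1 - effective_mbps(base_mbps, event_mbps, event_fraction) / event_mbps


average = effective_mbps(1.5, 6.0, 0.10)
saving = saving_vs_continuous(1.5, 6.0, 0.10)
print(f"Effective bitrate: {average:.2f} Mbps, saving {saving:.1%} versus continuous 6 Mbps")
```

The saving shrinks quickly as the event fraction grows, so busy scenes benefit much less than quiet ones.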
Bitrate is bandwidth in real time, both for camera-to-VMS recording bandwidth and for VMS-to-operator viewing bandwidth.
A 4K camera at 12 Mbps consumes 12 Mbps of recording bandwidth from the camera to the VMS continuously. A 100-camera fleet of 4K-at-12-Mbps cameras consumes 1.2 Gbps of aggregate recording bandwidth at the VMS, which exceeds the symmetric uplink of nearly every commodity site internet connection. Deployments with that fleet profile need local recording at the site, with selective replication of important time ranges to centralized storage.
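A quick feasibility check for streaming a fleet off-site can be sketched as a comparison between aggregate recording bitrate and usable uplink. The 1 Gbps uplink and the 70 percent utilization ceiling below are hypothetical planning values, not figures from any specific connection.

```python
def aggregate_mbps(camera_count: int, mbps_per_camera: float) -> float:
    """Total recording bandwidth the fleet sends to the recorder."""
    return camera_count * mbps_per_camera


def fits_uplink(camera_count: int, mbps_per_camera: float,
                uplink_mbps: float, max_utilization: float = 0.7) -> bool:
    """True if the fleet fits within the usable share of the uplink."""
    return aggregate_mbps(camera_count, mbps_per_camera) <= uplink_mbps * max_utilization


total = aggregate_mbps(100, 12)
print(f"Aggregate: {total / 1000:.1f} Gbps")
print("Fits a 1 Gbps uplink:", fits_uplink(100, 12, uplink_mbps=1000))
```

When the check fails, as it does here, the recording has to stay local to the site.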
Live viewing bandwidth multiplies across operators. A single operator viewing one 4K 12 Mbps stream consumes 12 Mbps. Five operators viewing the same stream concurrently consume 60 Mbps if the VMS does not transcode or proxy. The architectural pattern that handles this scale is for the VMS to maintain a lower-bitrate proxy stream for routine live viewing and switch to the full-quality stream on demand when an operator needs forensic detail on a specific camera.
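The effect of a proxy stream on viewing bandwidth can be sketched as follows. The 1.5 Mbps proxy bitrate and the single operator on the full-quality stream are illustrative assumptions.

```python
from typing import Optional


def live_view_mbps(viewers: int, full_mbps: float,
                   proxy_mbps: Optional[float] = None,
                   forensic_viewers: int = 0) -> float:
    """Bandwidth for concurrent viewers of one camera.

    Without a proxy, every viewer pulls the full stream. With a proxy, only
    the viewers who requested forensic detail pull the full stream.
    """
    if proxy_mbps is None:
        return viewers * full_mbps
    routine_viewers = viewers - forensic_viewers
    return routine_viewers * proxy_mbps + forensic_viewers * full_mbps


print("No proxy:", live_view_mbps(5, 12.0), "Mbps")
print("Proxy, one forensic viewer:",
      live_view_mbps(5, 12.0, proxy_mbps=1.5, forensic_viewers=1), "Mbps")
```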
For remote operators, mobile field teams, and multi-site distribution, the bandwidth math gets harder. The bandwidth profile of mobile and remote workflows almost always requires the VMS to deliver a lower-bitrate proxy rather than the full-bitrate recording stream.
The bitrate decisions that surveillance actually requires are per-camera, per-workload, and adaptable to network and viewer conditions. Legacy DVR and NVR appliances frequently handle bitrate as a fleet-wide default. Modern video management software handles bitrate as a managed configuration across the fleet, with workflow-aware proxy generation and transcoding.
Visylix is built around the bitrate realities of multi-camera AI surveillance deployments. The platform supports per-camera bitrate configuration with separate settings for recording, live primary, and live proxy streams, which means a camera can record at 8 Mbps for forensic quality while delivering a 1.5 Mbps proxy to live operators, with the VMS serving the recording-quality stream on demand.
Visylix supports H.264, H.265, and increasingly AV1 codecs natively, with server-side transcoding that lets the same camera stream record in one codec, live-view in another, and replicate to a remote site in a third. Event-triggered recording is supported across the fleet, with motion detection, AI detection events, and operator-defined triggers driving recording behavior. Tiered retention reduces long-tail storage cost by transitioning older footage to lower-bitrate storage automatically.
The 12 self-learning AI models in Visylix are tuned to operate on a range of source bitrates, but the bitrate that produces production-grade AI accuracy is part of the deployment design rather than left to defaults. For deployments where face recognition, ANPR, or pose estimation is part of the workflow, Visylix recommends per-camera bitrate matched to the AI workload at the working distance.
For Indian deployments, Visylix operates natively in 13 Indian languages, supports INR pricing through Razorpay, and runs on customer infrastructure rather than foreign cloud, which matters for data sovereignty considerations under the DPDP Act and for predictable bandwidth economics inside Indian site networks.
If you are designing a new surveillance deployment, hitting the storage or bandwidth wall that comes from running a high bitrate across an entire fleet, or specifying AI analytics that need bitrate matched to the model's accuracy threshold, the Visylix team would welcome a conversation about how to design the bitrate profile for both operational fit and AI accuracy. Reach us at https://visylix.com/contact.
Video bitrate is the rate at which video data is encoded, transmitted, and stored, measured in bits per second (typically Mbps). Bitrate, resolution, and codec choice together determine image quality, storage cost, and bandwidth cost. For live streaming and broadcast, 1080p typically runs 4.5 to 9 Mbps with H.264 and 50 to 70 percent of that with H.265. For surveillance, the workflow profile is different: many cameras concurrent, long retention, AI inference requirement, and multi-operator viewing. Surveillance bitrate for general work is typically lower than broadcast bitrate at the same resolution, and AI workloads consistently need more bitrate than human-only surveillance. AI face recognition typically needs 4 to 6 Mbps at 1080p with H.265, ANPR needs more, pose estimation needs slightly less, and object detection is comparatively bitrate-tolerant. The storage and bandwidth math compounds across the fleet, and the architectural decisions that materially help are VBR encoding, event-triggered recording, modern codec choice, tiered retention, and a VMS that maintains separate recording and proxy streams with on-demand switching to the full-bitrate forensic stream. Choosing bitrate per camera per workload is the pattern that consistently delivers usable AI accuracy without breaking the storage or bandwidth budget.
Video bitrate is the rate at which video data is encoded, transmitted, and stored, measured in bits per second (typically kbps or Mbps). A higher bitrate generally means higher image quality, larger storage footprint, and higher bandwidth consumption. Bitrate is the bridge between image quality, storage cost, and bandwidth cost. Bitrate is not the same as resolution (number of pixels per frame) or codec choice (efficiency of the encoding), but it interacts with both.
Good streaming bitrate depends on resolution and codec. For 1080p at 30 fps with H.264, 4.5 to 9 Mbps is typical for general streaming, with higher motion content (fast-paced gaming) requiring 6 to 12 Mbps. For 4K at 30 fps with H.264, 18 to 25 Mbps is typical. H.265 achieves equivalent visual quality at 50 to 70 percent of the H.264 bitrate, and AV1 achieves equivalent quality at roughly 60 to 80 percent of H.265 on supported encoders. YouTube and Twitch publish slightly different recommended ranges for their platforms, but the ranges above are operationally close to the consensus.
For 1080p general streaming with H.264 at 30 fps, 4.5 to 6 Mbps is typically sufficient for static or low-motion content, with 6 to 9 Mbps for general content and 9 to 12 Mbps for high-motion content like fast-paced gaming. For 1080p surveillance with H.265 at 25 fps, 1.5 to 4 Mbps is typical for general work and 4 to 6 Mbps for AI face recognition workloads.
Surveillance camera bitrate depends on the resolution, the workflow, and the codec. For Full HD (1080p) at 25 fps with H.265, general surveillance typically runs 1.5 to 4 Mbps and AI face recognition typically runs 4 to 6 Mbps. License plate capture at Full HD typically runs 6 to 10 Mbps with H.264. For 4K at 25 fps with H.265, general surveillance runs 4.5 to 10 Mbps, AI workloads run 8 to 14 Mbps, and ANPR at distance runs higher. These ranges assume VBR encoding and moderate scene motion. High-motion scenes need 30 to 50 percent more bitrate.
Bitrate affects video quality directly: higher bitrate generally produces higher quality, lower bitrate produces visible compression artifacts (blocking, smearing, banding, loss of fine detail). The relationship is not linear. The first jumps in bitrate (from very low to moderate) produce dramatic quality improvements. The later jumps (from already-high to higher) produce diminishing returns. Codec choice changes the relationship: H.265 at a given bitrate produces visibly better quality than H.264 at the same bitrate, because H.265 is a more efficient codec.
Adaptive bitrate streaming (ABR) is a streaming approach in which the encoder produces multiple bitrate renditions of the same content (typically 5 to 8 renditions spanning low-bitrate mobile tiers to high-bitrate full-quality tiers) and the player switches between renditions in real time based on the viewer's available bandwidth. HLS, LL-HLS, and MPEG-DASH are the primary adaptive bitrate protocols. ABR is the standard pattern for consumer-facing video streaming because it accommodates variable internet conditions automatically.
Resolution is the number of pixels per frame (1920x1080 for Full HD, 3840x2160 for 4K). Bitrate is how many bits are used to encode those pixels over time, measured in Mbps. The same 1080p resolution can be encoded at 1 Mbps (visibly compressed) or 10 Mbps (visually transparent). The same 4K resolution can be encoded at 8 Mbps (heavily compressed) or 25 Mbps (high quality). Resolution sets the frame size. Bitrate sets the quality of what is inside the frame. Both matter.
CBR (Constant Bitrate) encodes every second at the same target bitrate, regardless of scene complexity. CBR produces predictable bandwidth and storage but wastes bits on low-motion scenes. VBR (Variable Bitrate) spends more bits on complex scenes and fewer on simple scenes, while targeting an average bitrate. VBR delivers better quality per bit at the cost of unpredictable instantaneous bitrate. CRF (Constant Rate Factor) encodes for a target visual quality and lets the bitrate float to whatever level achieves that target. CRF delivers the best quality per bit but the least predictable bandwidth profile. For surveillance, VBR with a maximum bitrate cap is the pattern that consistently works.
AI analytics accuracy is meaningfully sensitive to bitrate. Face recognition typically needs at least 4 to 6 Mbps at 1080p with H.265 to deliver production-grade accuracy. License plate recognition is even more bitrate-sensitive and typically needs higher bitrate at the same resolution. Object detection, person tracking, and motion detection are comparatively bitrate-tolerant. Pose estimation sits between. Specifying bitrate too low for the AI workload produces models that work in demo conditions and fail in production. The right pattern is to specify bitrate per camera against the AI workload the camera serves, not against a fleet-wide default.
A camera recording continuously at 4 Mbps with H.265 produces about 43 GB of footage per day. A camera at 8 Mbps produces about 86 GB per day. A camera at 16 Mbps produces about 173 GB per day. Multiplied across a fleet and across retention days, storage consumption compounds quickly. A 100-camera mixed fleet at 30-day retention typically requires 150 to 400 TB of storage depending on bitrate profile. Event-triggered recording, modern codec efficiency (H.265 over H.264, AV1 where supported), and tiered retention can collectively reduce storage requirements by 40 to 70 percent versus continuous high-bitrate recording.