Screen Capture API: Complete Guide to Browser-Based Screen Sharing

Learn how to capture display content directly in the browser with the Screen Capture API. Build video conferencing, screen recording, and collaborative tools with modern web capabilities.

What Is the Screen Capture API?

The Screen Capture API is a web platform feature that enables web applications to request access to a user's display content, whether that content comes from an entire monitor, a specific application window, or a browser tab. When a user grants permission, the API returns a MediaStream that can be used for real-time streaming via WebRTC, recording with the MediaRecorder API, or any other purpose that requires access to visual screen content.

Unlike traditional media capture that typically involves a single video or audio source, screen capture must handle multiple potential sources including entire monitors, individual windows, and browser tabs, each with different characteristics and privacy implications. The API's primary entry point is the getDisplayMedia() method on the navigator.mediaDevices object, which prompts the user to select what content to share and grants permission to capture it.

This user-mediated selection process is fundamental to the API's design, ensuring that users maintain control over what content is shared with web applications. The selection dialog presents users with options to share their entire screen, a specific window, or a browser tab. Users can stop sharing at any time, and some browsers also let them switch to a different surface during the capture session.

For teams building web applications that require screen sharing capabilities, understanding this API opens possibilities for video conferencing tools, collaborative editing platforms, and remote support software.

Key Capabilities

The Screen Capture API enables powerful applications across many domains:

Video Conferencing

Share presentations, documents, and applications during video meetings

Screen Recording

Record tutorials and demonstrations directly in the browser

Remote Support

Implement screen sharing for customer assistance

Collaboration Tools

Share dynamic content between participants in real-time

Getting Started with getDisplayMedia()

The getDisplayMedia() method serves as the entry point to the Screen Capture API. It must be called from a secure context (HTTPS, or localhost during development) and in response to a user gesture such as a click, which triggers the browser's selection prompt. The method accepts an optional configuration object and returns a Promise that resolves to a MediaStream containing the captured screen content.

Basic Implementation

The getDisplayMedia() method follows a straightforward async pattern that integrates well with modern JavaScript development practices. The basic syntax involves calling the method on navigator.mediaDevices, optionally passing a configuration object that specifies requirements for the returned stream. When the user selects content and grants permission, the Promise resolves to a MediaStream that can be used for real-time streaming via WebRTC, recording with the MediaRecorder API, or display in a video element.

const stream = await navigator.mediaDevices.getDisplayMedia({
 video: true,
 audio: false
});

const videoTrack = stream.getVideoTracks()[0];
console.log('Capture started:', videoTrack.label);

Configuration Options

The configuration object supports several parameters that control the capture behavior. The video parameter can be set to true to request video capture (which is the default) or to a MediaTrackConstraints object for more specific requirements. The audio parameter, when set to true, requests that the capture include system audio, though this depends on browser support and user selection.

Additional options like preferCurrentTab suggest the current tab as the default selection in the permission dialog. The selfBrowserSurface option controls whether the current tab appears in the selection list, preventing the "infinite hall of mirrors" effect. The surfaceSwitching option enables controls that allow users to change what is being shared without restarting the capture session.
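
The options described above can be combined in a single object. The following is a minimal sketch, not a definitive configuration: buildCaptureOptions and its two flags are hypothetical names introduced for illustration, and browsers that do not recognize an option simply ignore it. The helper keeps preferCurrentTab and selfBrowserSurface consistent, since asking for the current tab while excluding it makes no sense.

```javascript
// Hypothetical helper: builds an options object for getDisplayMedia().
function buildCaptureOptions({ withAudio = false, allowSelfCapture = false } = {}) {
  return {
    video: true,
    audio: withAudio,
    // Suggest the current tab only when capturing it is actually wanted.
    preferCurrentTab: allowSelfCapture,
    // 'exclude' hides the current tab from the picker (no hall of mirrors).
    selfBrowserSurface: allowSelfCapture ? 'include' : 'exclude',
    // 'include' asks the browser to offer a control for switching surfaces.
    surfaceSwitching: 'include',
  };
}

// Usage in the browser, inside a click handler:
// const stream = await navigator.mediaDevices.getDisplayMedia(buildCaptureOptions());
```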

Handling Stream Lifecycle

When getDisplayMedia() succeeds, it returns a MediaStream containing at least one video track representing the captured screen content. The stream may also include an audio track if audio capture was requested and the user selected content that includes audio. It is crucial to handle the track's ended event, as screen capture sessions can end in several ways: users may stop sharing through the browser UI or close the captured window or tab, and the browser or operating system may end the capture on its own. Applications must handle these scenarios gracefully, cleaning up resources and updating the user interface appropriately.

videoTrack.onended = () => {
  console.log('Capture ended by user');
  // Clean up resources and update UI; this also stops any audio track
  stream.getTracks().forEach(track => track.stop());
};

Failing to handle ended events can result in zombie streams and memory leaks, particularly in long-running applications. Always implement proper cleanup when capture ends.

Developers working on real-time web applications should build robust error handling to ensure smooth user experiences when screen capture sessions end unexpectedly.
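
Error handling starts with the rejection from getDisplayMedia() itself. The sketch below maps the error names that getDisplayMedia() is documented to reject with onto user-facing messages. describeCaptureError and the message wording are assumptions made for illustration, not part of any API.

```javascript
// Hypothetical helper: turns a getDisplayMedia() rejection into a UI message.
function describeCaptureError(error) {
  switch (error?.name) {
    case 'NotAllowedError':
      return 'Screen sharing was cancelled or blocked.';
    case 'NotFoundError':
      return 'No screen, window, or tab is available to share.';
    case 'NotReadableError':
      return 'The selected source could not be captured.';
    case 'InvalidStateError':
      return 'Screen sharing must be started from a click or key press.';
    default:
      return 'Screen sharing failed unexpectedly.';
  }
}

// Usage in the browser:
// try {
//   const stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
// } catch (error) {
//   showToast(describeCaptureError(error)); // showToast is a placeholder
// }
```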

Recording Screen Content with MediaRecorder

Combining the Screen Capture API with the MediaRecorder API enables browser-based screen recording functionality. This powerful combination allows applications to capture screen content, encode it in real-time, and save the result to the user's device. The pattern is straightforward but requires careful attention to stream management, data handling, and user experience considerations.

Recording Implementation

The recording process begins by obtaining a screen capture stream and passing it to a MediaRecorder instance. The MediaRecorder can be configured with specific mime types and will fire dataavailable events containing chunks of encoded video data. These chunks can be assembled into a complete recording and saved when the recording stops. The dataavailable event fires according to the interval specified in the start() method or whenever the recorder decides to emit data.

let mediaRecorder;
let recordedChunks = [];

async function startRecording() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: { width: { ideal: 1920 }, height: { ideal: 1080 } },
    audio: true
  });

  mediaRecorder = new MediaRecorder(stream, {
    mimeType: 'video/webm;codecs=vp9'
  });

  // Collect each non-empty chunk of encoded video
  mediaRecorder.ondataavailable = (event) => {
    if (event.data.size > 0) {
      recordedChunks.push(event.data);
    }
  };

  mediaRecorder.onstop = exportVideo;
  // Emit a chunk roughly every 1000 ms
  mediaRecorder.start(1000);
}
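
The recorder above hard-codes VP9 in a WebM container, which not every browser can encode. A hedged alternative is to probe a list of candidates with MediaRecorder.isTypeSupported() and use the first match. In this sketch, pickMimeType and the candidate list are illustrative assumptions; the probe function is passed in so the helper can be exercised outside a browser.

```javascript
// Illustrative preference order; adjust to the formats your app can handle.
const PREFERRED_TYPES = [
  'video/webm;codecs=vp9',
  'video/webm;codecs=vp8',
  'video/webm',
  'video/mp4',
];

// Returns the first supported type, or '' to let the browser pick its default.
function pickMimeType(isTypeSupported, candidates = PREFERRED_TYPES) {
  return candidates.find((type) => isTypeSupported(type)) ?? '';
}

// Usage in the browser:
// const mimeType = pickMimeType((type) => MediaRecorder.isTypeSupported(type));
// const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```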

Saving Recordings

Chromium-based browsers support the File System Access API, which provides a clean way to save files to the user's device. This API enables applications to propose a filename and location while giving users control over where the file is saved. The export process creates a Blob from the recorded chunks, then uses the File System Access API to write the data to the file the user picked. Note that the chunks stay in memory until the recording is saved, so very long recordings can consume a lot of memory.

async function exportVideo() {
  const blob = new Blob(recordedChunks, { type: 'video/webm' });

  // Let the user choose where to save (File System Access API)
  const handle = await window.showSaveFilePicker({
    suggestedName: 'screen-recording.webm',
    types: [{
      description: 'WebM video',
      accept: { 'video/webm': ['.webm'] }
    }]
  });

  const writable = await handle.createWritable();
  await writable.write(blob);
  await writable.close();

  // Release the chunks once the file has been written
  recordedChunks = [];
}
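
Because showSaveFilePicker() is not available in every browser, a fallback to a plain download link is worth considering. The following sketch assumes a hypothetical saveRecording helper; the win parameter stands in for the browser's window object so the logic can be tested with a stub. It returns a string describing which path was taken.

```javascript
// Hypothetical helper: saves a Blob with the file picker when available,
// otherwise triggers a regular download through a temporary link.
async function saveRecording(blob, suggestedName, win = globalThis) {
  if (typeof win.showSaveFilePicker === 'function') {
    try {
      const handle = await win.showSaveFilePicker({ suggestedName });
      const writable = await handle.createWritable();
      await writable.write(blob);
      await writable.close();
      return 'picker';
    } catch (error) {
      // The user dismissed the picker; treat it as a normal outcome.
      if (error.name === 'AbortError') return 'cancelled';
      throw error;
    }
  }

  const url = win.URL.createObjectURL(blob);
  const link = win.document.createElement('a');
  link.href = url;
  link.download = suggestedName;
  link.click();
  // Release the object URL after the download has been handed off.
  setTimeout(() => win.URL.revokeObjectURL(url), 0);
  return 'download';
}
```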

Complete Recording Flow

Putting it all together, a complete recording implementation handles the full lifecycle from starting capture through stopping and saving. The flow involves requesting the display media, setting up the MediaRecorder with appropriate encoding options, accumulating recorded chunks, and finally exporting the combined video to the user's file system. This client-side approach has significant advantages for privacy-sensitive applications, as recording data never leaves the user's device.

Screen Recording Implementation

let mediaRecorder;
let recordedChunks = [];

async function startRecording() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: { width: { ideal: 1920 }, height: { ideal: 1080 } },
    audio: true
  });

  mediaRecorder = new MediaRecorder(stream, {
    mimeType: 'video/webm;codecs=vp9'
  });

  mediaRecorder.ondataavailable = (event) => {
    if (event.data.size > 0) {
      recordedChunks.push(event.data);
    }
  };

  mediaRecorder.onstop = exportVideo;
  mediaRecorder.start(1000);
}

async function stopRecording() {
  mediaRecorder.stop();
}

Advanced Screen Capture APIs

Beyond basic screen capture, modern browsers provide additional APIs that offer more granular control over what content is captured and how it can be manipulated. These advanced APIs address specific use cases like capturing specific DOM elements, cropping captured regions, and controlling the captured surface during a session.

Element Capture and Region Capture

Element Capture restricts the captured region to a specific DOM element and its descendants, ignoring all other screen content. This is useful for applications that want to share only a particular component without exposing surrounding content. The API uses RestrictionTarget to associate a DOM element with the capture, and the track's restrictTo() method applies it. Region Capture provides similar functionality but operates at the geometric level: it uses CropTarget and the track's cropTo() method to crop the captured video to the region where a specified element is rendered. Unlike Element Capture, Region Capture captures the actual screen pixels in that region, including any overlaid content. Both APIs apply only when the application is capturing its own tab.
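
As a rough illustration of Region Capture as implemented in Chromium-based browsers, the sketch below crops a self-capture video track to one element using CropTarget.fromElement() and track.cropTo(). cropTrackToElement is a hypothetical helper; it returns false instead of throwing when the browser lacks the API, and it assumes the track comes from capturing the current tab.

```javascript
// Hypothetical helper: crops a tab-capture track to the area of one element.
// CropTargetImpl defaults to the browser global and can be stubbed in tests.
async function cropTrackToElement(track, element, CropTargetImpl = globalThis.CropTarget) {
  if (!CropTargetImpl || typeof track.cropTo !== 'function') {
    return false; // Region Capture is not available in this browser
  }
  const cropTarget = await CropTargetImpl.fromElement(element);
  await track.cropTo(cropTarget);
  return true;
}

// Usage in the browser, after capturing the current tab:
// const [track] = stream.getVideoTracks();
// await cropTrackToElement(track, document.querySelector('#shared-area'));
```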

Captured Surface Control API

The Captured Surface Control API enables applications to control the captured surface during a capture session. This includes capabilities like zooming and scrolling the captured content. These controls are transmitted to the captured application, affecting what is shown in the capture stream. The API is controlled through the CaptureController object, which can be passed to getDisplayMedia() and used to manipulate the capture session.

const controller = new CaptureController();
const stream = await navigator.mediaDevices.getDisplayMedia({
 controller
});

const [videoTrack] = stream.getVideoTracks();
await controller.setZoomLevel(150);

The setZoomLevel(), increaseZoomLevel(), and decreaseZoomLevel() methods control zoom, while wheel events can be forwarded to the captured surface for scrolling. The API is experimental and limited to Chromium-based browsers, and its method names have changed between versions, so check that a method exists on the controller before calling it. These capabilities enable interactive demonstrations and remote support scenarios where the controller needs to navigate the captured content. Notably, Captured Surface Control requires additional permission grants beyond the basic screen capture permission, providing an additional layer of protection against malicious use.

AI-powered automation tools can leverage these advanced capture capabilities to create intelligent video documentation and automated testing workflows.

Security Considerations

Screen capture involves significant privacy implications. Always obtain explicit user consent, handle sensitive content carefully, and implement proper cleanup when capture ends. Users must maintain control over what content is shared at all times.

Security Best Practices

The Screen Capture API has significant privacy and security implications that developers must address carefully. Screen content may include sensitive information, and the API must balance powerful capabilities with appropriate user protection.

User Consent Model

The getDisplayMedia() method requires explicit user consent before any capture can begin. The browser displays a prompt that lets the user select what to share: their entire screen, a specific window, or a browser tab. Applications can suggest preferences through configuration options, but these are only hints, and the user remains in complete control of the final selection. Screen capture permission is not persisted: every call to getDisplayMedia() prompts again, and the user can end an active capture at any time through the browser's sharing controls.

Protecting Sensitive Content

Applications should be cautious about what content is displayed when screen capture is active. Sensitive information like passwords, personal data, or confidential content should be obscured or hidden during capture sessions. One effective approach is to implement visual indicators that show when capture is active and provide controls to pause capture or hide sensitive content. Some applications implement a "presenter mode" that shows a simplified view during screen sharing. Applications should provide clear instructions and controls that help users share only what they intend to share.
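
One simple way to offer a pause control is to toggle the enabled flag on the captured tracks: a disabled video track delivers black frames and a disabled audio track delivers silence, without ending the session or prompting again. setCapturePaused below is a hypothetical helper sketching that idea.

```javascript
// Hypothetical helper: pauses or resumes a capture without stopping it.
// Disabled tracks keep the session alive but send black frames / silence.
function setCapturePaused(stream, paused) {
  stream.getTracks().forEach((track) => {
    track.enabled = !paused;
  });
  return paused;
}

// Usage in the browser:
// pauseButton.onclick = () => setCapturePaused(stream, true);
// resumeButton.onclick = () => setCapturePaused(stream, false);
```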

Preventing Unintended Capture

The Screen Capture API includes several mechanisms to prevent unintended capture scenarios. The selfBrowserSurface option can be used to exclude the current tab from capture options, preventing the "infinite hall of mirrors" effect. The surfaceSwitching option enables controls that allow users to change what is being shared without restarting the capture session. Capture from within iframes is governed by Permissions Policy: a cross-origin iframe can call getDisplayMedia() only if the embedding page grants it the display-capture permission, for example with allow="display-capture" on the iframe element.

Building secure web applications with screen capture requires careful attention to these security considerations and privacy best practices.

Performance Optimization

Screen capture can be resource-intensive, particularly for high-resolution displays or extended recording sessions. Optimizing performance requires attention to stream configuration, encoding settings, and resource management.

Stream Configuration

The configuration options passed to getDisplayMedia() significantly impact performance. Higher resolution and frame rate settings require more processing power and network bandwidth. Applications should request only the quality they actually need, rather than defaulting to maximum quality. For many use cases, 1080p at 30fps provides adequate quality while maintaining good performance. Using ideal rather than exact constraints allows the browser to find a good balance between quality and performance.

const stream = await navigator.mediaDevices.getDisplayMedia({
 video: {
 width: { ideal: 1920 },
 height: { ideal: 1080 },
 frameRate: { ideal: 30 }
 }
});
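
Because ideal constraints are only a request, it can help to check what the browser actually delivered. The sketch below reads the track's getSettings() result; describeCaptureSettings is a hypothetical helper, and some browsers may omit individual fields such as displaySurface.

```javascript
// Hypothetical helper: summarizes what the browser actually delivered.
function describeCaptureSettings(track) {
  const { width, height, frameRate, displaySurface } = track.getSettings();
  return {
    resolution: `${width}x${height}`,
    // frameRate may be fractional or missing depending on the browser
    frameRate: frameRate ? Math.round(frameRate) : undefined,
    displaySurface,
  };
}

// Usage in the browser:
// const [track] = stream.getVideoTracks();
// console.log(describeCaptureSettings(track));
```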

Resource Management

Screen capture sessions consume system resources that must be properly managed. Tracks should be explicitly stopped when no longer needed, and MediaRecorder instances should be stopped before they are discarded. Failing to release resources can lead to memory leaks and degraded system performance, particularly for long-running applications. Applications should also consider implementing inactivity detection that automatically stops capture after a period of no user interaction.

// Assumes stream and mediaRecorder are in scope
function stopCapture() {
  // Stop the recorder first so it can flush its final chunk
  if (mediaRecorder && mediaRecorder.state !== 'inactive') {
    mediaRecorder.stop();
  }

  // Then release every captured track, video and audio alike
  stream.getTracks().forEach(track => track.stop());
}
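
The inactivity detection mentioned above can be sketched with a resettable timer. createInactivityTimer is a hypothetical helper, and the ten-minute timeout and the choice of events in the usage note are assumptions to tune per application.

```javascript
// Hypothetical helper: calls onIdle once if reset() is not called
// again within timeoutMs. cancel() disarms the timer.
function createInactivityTimer(onIdle, timeoutMs) {
  let timerId = null;

  const cancel = () => {
    if (timerId !== null) {
      clearTimeout(timerId);
      timerId = null;
    }
  };

  const reset = () => {
    cancel();
    timerId = setTimeout(() => {
      timerId = null;
      onIdle();
    }, timeoutMs);
  };

  reset(); // arm the timer immediately
  return { reset, cancel };
}

// Usage in the browser:
// const idle = createInactivityTimer(stopCapture, 10 * 60 * 1000);
// ['pointermove', 'keydown'].forEach((type) => window.addEventListener(type, idle.reset));
```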

Display Change Handling

Screen configurations can change during capture, such as when users connect or disconnect monitors, change resolution, or switch between displays. Applications should handle these changes gracefully, either by adjusting to the new configuration or by ending the capture session appropriately. MediaStreamTrack does not expose a widely supported event for these changes, so applications typically listen for the resize event on the video element that displays the stream, or periodically compare the values returned by the track's getSettings() method.
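
One portable way to notice a resolution change is to compare the track's getSettings() values over time. The sketch below returns a check function that can be called from a timer or from a video element's resize handler; createSizeWatcher and the one-second interval in the usage note are illustrative assumptions.

```javascript
// Hypothetical helper: returns a check() function that reports whether the
// captured size changed since the previous call, invoking onChange if so.
function createSizeWatcher(track, onChange) {
  let { width, height } = track.getSettings();

  return function check() {
    const next = track.getSettings();
    if (next.width === width && next.height === height) {
      return false;
    }
    ({ width, height } = next);
    onChange({ width, height });
    return true;
  };
}

// Usage in the browser:
// const check = createSizeWatcher(videoTrack, ({ width, height }) => resizePreview(width, height));
// const id = setInterval(check, 1000);
// videoTrack.addEventListener('ended', () => clearInterval(id));
```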

Browser Support

Chrome: 72+
Firefox: 66+
Safari: 13+
Edge: 79+

Browser Compatibility

The Screen Capture API has broad but not universal browser support. Understanding browser compatibility is essential for building applications that work across different platforms. Modern browsers support the core API, but some advanced features remain experimental or browser-specific.

Core API Support

The getDisplayMedia() method is supported in Chrome 72+, Firefox 66+, Safari 13+, and Edge 79+. These versions were all released in 2019 or early 2020, so the API is available to most desktop users; support in mobile browsers is far more limited. Chrome and Edge, which are both Chromium-based, typically have the most complete implementation of Screen Capture API features. Firefox has strong support but may differ in some options and behaviors. Safari's support has improved significantly but may lag behind in experimental features.

function supportsScreenCapture() {
  // Feature-detect only. Calling getDisplayMedia() here would open the
  // picker and start a real capture just to test for support.
  return typeof navigator.mediaDevices?.getDisplayMedia === 'function';
}

Audio Capture Considerations

Audio capture through getDisplayMedia() has more limited support than video capture. System audio capture is supported in Chrome and Edge but may have restrictions or be unavailable in other browsers. Applications should implement feature detection for audio capture and provide alternatives when audio is not available. When audio capture is not supported, applications can fall back to alternative approaches such as microphone-only audio or no audio at all.
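
There is no reliable way to know in advance whether system audio will be shared, so in practice the check happens after capture starts by inspecting the stream's audio tracks. The sketch below falls back to the microphone when none are present. ensureAudio is a hypothetical helper, and the mediaDevices parameter is injected only so the logic can be tested with stubs.

```javascript
// Hypothetical helper: reports which audio source the stream ends up with.
// Falls back to the microphone, then to no audio at all.
async function ensureAudio(displayStream, mediaDevices = globalThis.navigator?.mediaDevices) {
  if (displayStream.getAudioTracks().length > 0) {
    return 'system';
  }
  try {
    const micStream = await mediaDevices.getUserMedia({ audio: true });
    micStream.getAudioTracks().forEach((track) => displayStream.addTrack(track));
    return 'microphone';
  } catch {
    // Microphone denied or unavailable: continue with video only.
    return 'none';
  }
}

// Usage in the browser:
// const stream = await navigator.mediaDevices.getDisplayMedia({ video: true, audio: true });
// const audioSource = await ensureAudio(stream);
```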

Integration with Modern Web Frameworks

Modern web frameworks like Next.js provide powerful abstractions for building complex applications, and the Screen Capture API can be integrated effectively within these frameworks. Understanding how to combine native browser APIs with framework patterns helps developers build robust screen capture functionality.

React Hook Pattern

In React-based applications, screen capture functionality can be encapsulated in custom hooks that manage the stream lifecycle. This approach provides clean separation of concerns and integrates well with React's state management model. The hook exposes methods for starting and stopping capture while managing the underlying stream and track objects. The cleanup function ensures that capture is stopped when the component unmounts, preventing resource leaks.

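import { useCallback, useEffect, useRef, useState } from 'react';

function useScreenCapture() {
  const [stream, setStream] = useState(null);
  const [isCapturing, setIsCapturing] = useState(false);
  // Keep the live stream in a ref so cleanup never sees a stale value
  const streamRef = useRef(null);

  const stopCapture = useCallback(() => {
    streamRef.current?.getTracks().forEach(track => track.stop());
    streamRef.current = null;
    setStream(null);
    setIsCapturing(false);
  }, []);

  const startCapture = useCallback(async (options) => {
    const newStream = await navigator.mediaDevices.getDisplayMedia(options);
    // Reset state when sharing is ended from the browser UI
    newStream.getVideoTracks()[0].addEventListener('ended', stopCapture);
    streamRef.current = newStream;
    setStream(newStream);
    setIsCapturing(true);
    return newStream;
  }, [stopCapture]);

  // stopCapture is stable, so this cleanup runs only on unmount
  useEffect(() => stopCapture, [stopCapture]);

  return { stream, isCapturing, startCapture, stopCapture };
}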

State Management

Screen capture introduces asynchronous operations and complex state that must be integrated with application state management. The capture stream, recording state, and the outcome of each permission prompt all represent state that must be tracked and managed appropriately. Recording state, including the recorded chunks, MediaRecorder instance, and recording status, requires careful management to ensure smooth operation. Because screen capture permission is requested on every call and is not persisted, there is little to check proactively. Instead, catch the NotAllowedError that getDisplayMedia() rejects with when the user declines, and listen for the track's ended event to detect when sharing stops during a session.
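
The state described above can be modeled as a small state machine. The reducer below is one possible shape, compatible with React's useReducer or any store that accepts reducers. The status names and action types are assumptions chosen for illustration.

```javascript
// Illustrative capture state machine: idle -> requesting -> capturing <-> recording.
const initialCaptureState = { status: 'idle', error: null };

function captureReducer(state, action) {
  switch (action.type) {
    case 'request': // user clicked "share screen"
      return state.status === 'idle' ? { status: 'requesting', error: null } : state;
    case 'granted': // getDisplayMedia() resolved
      return state.status === 'requesting' ? { status: 'capturing', error: null } : state;
    case 'denied': // getDisplayMedia() rejected
      return state.status === 'requesting' ? { status: 'idle', error: action.error } : state;
    case 'startRecording':
      return state.status === 'capturing' ? { status: 'recording', error: null } : state;
    case 'stopRecording':
      return state.status === 'recording' ? { status: 'capturing', error: null } : state;
    case 'ended': // track ended or the app stopped the capture
      return { status: 'idle', error: null };
    default:
      return state;
  }
}
```

Ignoring actions that do not apply to the current status keeps late or duplicate events, such as an ended event arriving after a manual stop, from corrupting the state.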

Frequently Asked Questions

Is the Screen Capture API available in insecure contexts?

No, getDisplayMedia() requires a secure context to function: HTTPS in production, or localhost during development. This security requirement protects users from potential misuse of screen capture capabilities.

Can I capture audio along with screen video?

Yes, by setting audio: true in the options. However, system audio capture support varies by browser, and users must explicitly allow audio sharing in the permission prompt.

How do I save the recorded video to a file?

Use the MediaRecorder API to capture the stream, collect the data chunks, and combine them into a Blob. Save the Blob with the File System Access API where it is available, or offer it through a regular download link in browsers that lack it.

Can I capture just one browser tab instead of the whole screen?

Yes, users can select a specific browser tab in the permission dialog. The preferCurrentTab option can suggest the current tab as the default selection.

How do I handle when the user stops sharing?

Listen for the ended event on the video track, for example by assigning a handler to videoTrack.onended. It fires when the user stops sharing through the browser UI or takes other actions that end the capture session.
