Link checker

Periodically checks for broken links in content by extracting hyperlinks and evaluating HTTP response codes, displaying results under Administration > Reports > Broken links.

linkchecker
17,436 sites
78
drupal.org

Install

Drupal 11, 10 v2.1.0
composer require 'drupal/linkchecker:^2.1'
Drupal 9 v2.0.1
composer require 'drupal/linkchecker:^2.0'

Overview

The Link checker module provides comprehensive broken link detection and management for Drupal sites. It extracts links from configured content fields when entities are saved and periodically checks these links by sending HTTP requests to remote servers and evaluating response codes.

The module supports extraction from various HTML elements including hyperlinks, images, audio, video, iframes, and embedded content. It can automatically repair permanently moved links (301 redirects) and unpublish content containing broken links (404 errors) after a configurable threshold of failed checks.
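To make the extraction step concrete, here is a minimal Python sketch of pulling URLs out of the tags named in the feature list. It is not the module's code (the module is PHP); the names `LinkExtractor`, `extract_links`, and `URL_ATTRIBUTES`, and the tag-to-attribute mapping, are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tag -> attribute that carries the URL. The tag set follows the feature
# list; the attribute chosen for each tag is an assumption of this sketch.
URL_ATTRIBUTES = {
    "a": "href",
    "area": "href",
    "audio": "src",
    "video": "src",
    "img": "src",
    "iframe": "src",
    "embed": "src",
    "object": "data",
}


class LinkExtractor(HTMLParser):
    """Collects URLs from the tags that are enabled for extraction."""

    def __init__(self, enabled_tags):
        super().__init__()
        self.enabled_tags = set(enabled_tags)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in self.enabled_tags:
            return
        wanted = URL_ATTRIBUTES.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def extract_links(html, enabled_tags=("a",)):
    """Return the URLs found in html, in document order."""
    parser = LinkExtractor(enabled_tags)
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling `extract_links(body, ("a", "img"))` would return both hyperlink and image URLs, while the default only looks at `<a>` tags, which matches the conservative starting configuration suggested in the tips.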

All broken links are displayed in a centralized report accessible at Administration > Reports > Broken links, where administrators can review link status, error messages, failure counts, and related content. The module integrates with Drupal's cron system for background processing and provides Drush commands for command-line operations.
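As a rough illustration of how response codes could be sorted into the outcomes this overview mentions (working, permanently moved, not found), consider the following Python sketch. The function name and the category labels are hypothetical and are not taken from the module.

```python
from http import HTTPStatus


def classify(code):
    """Map an HTTP status code to a coarse link state (labels are illustrative)."""
    if 200 <= code < 300:
        return "ok"
    if code == HTTPStatus.MOVED_PERMANENTLY:  # 301: candidate for auto-repair
        return "moved_permanently"
    if code == HTTPStatus.NOT_FOUND:  # 404: candidate for unpublishing
        return "not_found"
    if 300 <= code < 400:
        return "redirect"
    return "error"
```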

Features

  • Extracts links from text fields, link fields, and various HTML tags (a, area, audio, video, img, iframe, embed, object) when content is saved
  • Periodically checks link status using HTTP HEAD/GET requests with configurable intervals (1-90 days)
  • Supports concurrent HTTP connections (2-128 simultaneous) with per-domain limits to prevent server overload
  • Provides a broken links report view at /admin/reports/broken-links with filtering and pagination
  • Automatically repairs 301-redirected links by updating URLs in content after a configurable failure threshold
  • Automatically unpublishes content containing links that return 404 after a configurable failure threshold
  • Configurable User-Agent header for HTTP requests, to handle sites that block the default Drupal user agent
  • Field-level configuration allowing selective scanning per content type field
  • Supports both internal and external link checking with configurable URL type filtering
  • Dispatches events for customizing HTTP request headers during link checking
  • Provides Drush commands for analyzing content and checking links via command line
  • Tracks link check history with failure counts, last check timestamps, and error messages
  • Supports migration of settings from Drupal 6/7 to current version
  • URL blacklist for excluding specific domains from checking (e.g., the reserved example.com domains)
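The combination of a global connection limit and a per-domain limit from the list above can be sketched with two levels of semaphores. This is a Python illustration under assumptions, not the module's PHP implementation: `check_all`, `check_one`, and the default of 8 total connections are hypothetical, while the per-domain value of 2 follows the default mentioned in the tips.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit


async def check_all(urls, check_one, max_total=8, max_per_domain=2):
    """Run check_one(url) for every URL under a global and a per-host cap."""
    total = asyncio.Semaphore(max_total)
    per_domain = defaultdict(lambda: asyncio.Semaphore(max_per_domain))

    async def guarded(url):
        host = urlsplit(url).hostname or ""
        # Take the per-host slot first so URLs waiting on a busy host
        # do not occupy global slots while they wait.
        async with per_domain[host], total:
            return url, await check_one(url)

    pairs = await asyncio.gather(*(guarded(u) for u in urls))
    return dict(pairs)
```

Here `check_one` stands for any coroutine that performs the actual request and returns a status code; keeping it as a parameter lets the limiting logic be exercised without network access.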

Use Cases

Content Quality Audit

Run periodic audits of your site content to identify broken external links that may have become unavailable. Configure the module to scan all text fields and link fields, set an appropriate check interval (e.g., weekly), and regularly review the Broken links report to maintain content quality.

Automated Link Repair

Enable automatic repair of permanently moved links by setting 'Update permanently moved links' to 'After three failed checks'. This ensures that when external sites provide proper 301 redirects, your content is automatically updated to use the new URLs without manual intervention.
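The threshold behaviour described here can be pictured with a small Python sketch. It is only an approximation: `repair_moved_link` is a hypothetical helper, and it assumes the URL appears as a double-quoted attribute value, which the real module does not necessarily require.

```python
def repair_moved_link(text, old_url, new_url, fail_count, threshold=3):
    """Return text with old_url replaced once the threshold has been reached.

    Below the threshold, or without a redirect target, the text is returned
    unchanged so that a temporary misconfiguration does not rewrite content.
    """
    if fail_count < threshold or not new_url:
        return text
    # Match the quoted attribute value so that longer URLs sharing the same
    # prefix are left alone (assumes double-quoted attributes).
    return text.replace('"%s"' % old_url, '"%s"' % new_url)
```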

Content Unpublishing for Quality Control

Configure automatic unpublishing for content with persistent broken links by setting 'Unpublish content on file not found error' to a threshold like 'After three file not found errors'. This prevents users from seeing content with dead links while you review and fix the issues.
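A failure counter with a threshold might look like the following Python sketch. The `LinkStatus` record and both functions are hypothetical, and resetting the counter after a successful check is an assumption of this sketch rather than documented module behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkStatus:
    url: str
    fail_count: int = 0
    last_code: Optional[int] = None


def record_check(status, code):
    """Update the record with the result of one check."""
    status.last_code = code
    if code == 404:
        status.fail_count += 1
    elif 200 <= code < 300:
        # Assumption: a successful check clears earlier failures.
        status.fail_count = 0
    return status


def should_unpublish(status, threshold=3):
    """True once the link has returned 404 at least `threshold` times in a row."""
    return status.last_code == 404 and status.fail_count >= threshold
```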

Image and Media Link Verification

Enable extraction from <img>, <audio>, and <video> tags to verify that embedded media files are still accessible. This is particularly useful for sites with user-generated content or content imported from external sources.

SEO Maintenance

Use the broken links report to identify and fix dead links that can negatively impact SEO. Many broken links degrade the experience for visitors and crawlers alike, so regular monitoring helps protect search rankings.

Migration Validation

After migrating content from another system, use 'Clear link data and analyze content for links' to extract all links and run 'drush linkchecker:check' to immediately verify that all migrated links are functional.

Tips

  • Start with a conservative configuration - enable only <a> tag extraction initially and expand as needed
  • Use 'After three failed checks' for auto-repair and auto-unpublish to avoid acting on temporary outages
  • Per-domain connection limits are set to 2 by default to avoid overwhelming external servers
  • The reserved documentation domains (example.com, example.net, example.org) are always preserved in the URL blacklist per RFC 2606
  • Enable new revisions in content type settings before enabling auto-repair to maintain edit history
  • Run 'drush linkchecker:analyze' after upgrading the module to ensure all links are properly indexed
  • For large sites, consider running link checks via Drush cron during off-peak hours to minimize impact
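To illustrate how a blacklist that always contains the RFC 2606 reserved domains could be applied before checking, here is a short Python sketch. It matches on host names and treats subdomains as excluded too; both choices, along with the names `RESERVED_DOMAINS` and `is_excluded`, are assumptions of this sketch and may differ from how the module matches blacklist entries.

```python
from urllib.parse import urlsplit

# Reserved documentation domains named in the tip above.
RESERVED_DOMAINS = {"example.com", "example.net", "example.org"}


def is_excluded(url, extra_domains=()):
    """True if the URL's host is a blacklisted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    blocked = RESERVED_DOMAINS | {d.lower() for d in extra_domains}
    return any(host == d or host.endswith("." + d) for d in blocked)
```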

Technical Details

Admin Pages 3
Link checker /admin/config/content/linkchecker

Configure link extraction, checking behavior, and error handling settings for the Link checker module. This page allows administrators to control which HTML tags are scanned, set check intervals, configure concurrent connections, and define automated actions for broken links.

Broken links report /admin/reports/broken-links

View and manage all extracted links with their HTTP status codes, error messages, and failure counts. Filter by status code, link type, or content. Click through to view or edit related content.

Edit link /admin/config/content/linkcheckerlink/{linkcheckerlink}/edit

Edit settings for individual link entities. Allows changing the request method and enabling/disabling link checking for specific URLs.

Permissions 4
Access broken links report

Allows users to access the global broken links report showing all broken links across the site.

Access own broken links report

Allows users to access their user-specific broken links report showing only links in their own content.

Administer Link checker

Allows users to administer Link checker settings including extraction options, check intervals, and error handling. This is a restricted permission.

Edit link settings

Allows users to edit individual broken link settings such as request method and check status.

Hooks 5
hook_entity_insert

Triggered when an entity is created. Link checker extracts links from configured fields and creates linkcheckerlink entities.

hook_entity_update

Triggered when an entity is updated. Link checker re-extracts links and updates/creates linkcheckerlink entities, removing orphaned links.

hook_entity_delete

Triggered when an entity is deleted. Link checker removes associated linkcheckerlink entities and cleans up queue entries.

hook_cron

Called during cron runs. Link checker processes unindexed entities for link extraction and queues links for HTTP checking.

hook_form_field_config_form_alter

Alters field configuration forms to add Link checker settings. Adds 'Scan broken links' checkbox and extractor selection.

Drush Commands 3
drush linkchecker:analyze

Reanalyzes content for links by extracting URLs from all configured fields. Recommended after module upgrade or configuration changes.

drush linkchecker:check

Processes queued links and checks their HTTP status. Links are checked based on configured intervals.

drush linkchecker:clear

Clears all link data and reanalyzes content. WARNING: Custom link settings are deleted.

Troubleshooting 6
Links are reported as broken but work in browser

Some servers block the default Drupal User-Agent. Try changing the User-Agent setting to a browser user agent like Firefox or Edge.
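To reproduce such a check outside Drupal and see whether the User-Agent makes a difference, a request can be built by hand. The Python sketch below only constructs the request; `build_check_request` is a hypothetical helper and the user agent string in the usage note is just an example value.

```python
import urllib.request


def build_check_request(url, user_agent, method="HEAD"):
    """Build (but do not send) a link-check request with a custom User-Agent."""
    return urllib.request.Request(
        url,
        method=method,
        headers={"User-Agent": user_agent},
    )
```

Passing the result to `urllib.request.urlopen(req, timeout=10)` would send it. Comparing the outcome for a browser-like string such as `"Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"` against the default one can help confirm that the remote server is filtering by User-Agent.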

Internal links all reported as broken

Ensure cron is called with the correct public site URL, not localhost. Configure the Base path setting or pass the --uri parameter to Drush commands.
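The effect of a wrong base URL is easy to see when a relative link is resolved. The Python sketch below uses the standard `urljoin`; the helper name `absolute_url` and the sample URLs are illustrative only.

```python
from urllib.parse import urljoin


def absolute_url(link, base_url):
    """Resolve a link found in content against the site's base URL."""
    return urljoin(base_url, link)
```

With a base of `https://www.example.org/`, the internal link `/node/1` resolves to `https://www.example.org/node/1`. With `http://localhost/` it resolves to `http://localhost/node/1`, which is likely to fail wherever localhost does not serve the site. Links that are already absolute are unaffected.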

No links are being extracted

Verify that 'Scan broken links' is enabled in field settings under each content type. Check that the appropriate HTML tags are enabled in Link extraction settings.

Links not being checked during cron

Links are checked based on the configured interval. New links are queued first. Check Recent log messages for linkchecker activity. You can force a check with 'drush linkchecker:check'.
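The selection rule described here (new links first, then links whose last check is older than the interval) can be sketched in Python as follows. The function name, the tuple layout, and the 7-day default are assumptions for illustration; the module allows intervals between 1 and 90 days.

```python
from datetime import datetime, timedelta


def links_due(links, now, interval_days=7):
    """Return URLs that should be checked, never-checked links first.

    links is an iterable of (url, last_checked) pairs, where last_checked
    is a datetime or None for links that have not been checked yet.
    """
    links = list(links)
    cutoff = now - timedelta(days=interval_days)
    never = [url for url, last in links if last is None]
    # Oldest checks first among the links whose interval has elapsed.
    stale = sorted(
        (last, url) for url, last in links if last is not None and last <= cutoff
    )
    return never + [url for _, url in stale]
```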

Auto-repair creating incorrect URLs

The 301 repair trusts the redirect location provided by the remote server. If sites provide incorrect redirects, disable the auto-repair feature and manually update links.

High server load during link checking

Reduce 'Number of simultaneous connections' to a lower value (e.g., 2 or 4) to decrease concurrent HTTP requests and server resource usage.

Security Notes 5
  • The 'Administer Link checker' permission is marked as restricted - grant only to trusted administrators
  • The impersonate account setting should use a user with appropriate permissions; consider the security implications of automatic content modifications made under that account
  • Be cautious with the auto-repair feature, as it trusts 301 redirects from external sites - a malicious redirect could inject unwanted URLs
  • URLs in the blacklist are still extracted but not checked - they remain visible in content
  • The module sends HTTP requests to external URLs which could potentially be used to trigger actions on remote servers if your content contains specially crafted URLs