Migrate Source CSV

Provides a CSV source plugin for the Drupal Migrate API, enabling migrations from CSV files.

migrate_source_csv
30,725 sites
drupal.org

Install

Drupal 9 v8.x-3.7
composer require 'drupal/migrate_source_csv:8.x-3.7'
Drupal 8 v8.x-3.4
composer require 'drupal/migrate_source_csv:8.x-3.4'

Overview

Migrate Source CSV is a contributed module that provides a source plugin for Drupal's Migrate API, allowing CSV files to be used as data sources for migrations. The module uses the league/csv PHP library (namespace League\Csv) to parse CSV files.

The source plugin supports custom delimiter, enclosure, and escape characters. It handles files with or without a header row and supports both single-column and composite primary keys. It can also add a record number to each row for tracking its position in the source file.

The module is useful when data is exported from legacy systems or spreadsheet applications in CSV format. It works within Drupal's migration framework and can be used with Drush and the Migrate Plus module for migration management.

Features

  • CSV source plugin for Drupal Migrate API that reads data from CSV files
  • Support for custom CSV delimiter, enclosure, and escape characters (one character each)
  • Configurable header row position with support for files without headers
  • Field definition override allowing manual specification of column names and labels
  • Composite primary key support using multiple columns as unique identifiers
  • Optional automatic record numbering for tracking row positions in source files
  • File stream support enabling reading from various file sources
  • Integration with the League\Csv library (league/csv) for CSV parsing
  • Empty source support using /dev/null for migration_lookup scenarios

Use Cases

Migrating user data from a spreadsheet export

When migrating users from a legacy system that exports data as CSV, configure the csv source plugin with the path to your exported user file. List the column that uniquely identifies each row (such as email or user ID) under the ids configuration. Map CSV columns to Drupal user fields in the process section of your migration.
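A minimal sketch of such a migration follows. The migration id, file path, and column names (username, email) are assumptions chosen for illustration; adjust them to match your export.

```yaml
# Hypothetical user migration; id, path, and column names are placeholders.
id: legacy_users
label: 'Legacy users from CSV'
source:
  plugin: csv
  # Absolute path to the exported file (assumed location).
  path: /var/data/exports/users.csv
  # The first record (offset 0) holds the column names.
  header_offset: 0
  # Column that uniquely identifies each row.
  ids:
    - email
process:
  # destination field: source column
  name: username
  mail: email
  status:
    plugin: default_value
    default_value: 1
destination:
  plugin: entity:user
```

Using email as the id assumes every row has a distinct, non-empty email address; if that does not hold for your data, a user ID column is the safer choice.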

Importing product catalog from external system

For e-commerce sites, use the csv source to import product data exported from inventory management systems. Configure custom delimiters if the source uses pipe-delimited or tab-delimited formats. Use the fields configuration to provide human-readable labels for documentation.
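As a rough sketch, a pipe-delimited product export could be described as below. The path and column names are hypothetical, and the example assumes the names under fields are listed in the same order as the columns in the file.

```yaml
# Hypothetical source section for a pipe-delimited product export.
source:
  plugin: csv
  path: /var/data/exports/products.csv
  # Must be exactly one character.
  delimiter: '|'
  header_offset: 0
  ids:
    - sku
  # Overrides the header record with explicit names and readable labels.
  fields:
    -
      name: sku
      label: 'Stock keeping unit'
    -
      name: title
      label: 'Product title'
    -
      name: price
      label: 'Unit price'
```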

Content migration with composite keys

When source data uses multiple columns as a composite key (e.g., language code + content ID), configure multiple values in the ids array. The plugin will use all specified columns together to uniquely identify each source record.
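For illustration, a source section with a two-column key might look like this. The path and the column names language and content_id are assumptions.

```yaml
# Hypothetical source section with a composite key.
source:
  plugin: csv
  path: /var/data/exports/translations.csv
  header_offset: 0
  # Both columns together identify a row, e.g. (en, 42) and (fr, 42)
  # are treated as two distinct source records.
  ids:
    - language
    - content_id
```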

Processing files without headers

For CSV files that lack a header row, set header_offset to null and use the fields configuration to define column names. Each field definition requires at minimum a 'name' property that will be used as the column identifier in your process mappings.
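A short sketch of a headerless configuration is shown here. The path and field names are placeholders, and the names are assumed to be listed in column order.

```yaml
# Hypothetical source section for a file with no header row.
source:
  plugin: csv
  path: /var/data/exports/countries.csv
  # null means no record is treated as a header.
  header_offset: null
  ids:
    - id
  # 'name' is required; 'label' is optional documentation.
  fields:
    -
      name: id
      label: 'Country ID'
    -
      name: country
      label: 'Country name'
```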

Extending for custom row processing

Create a custom source plugin that extends the CSV class to add pre-processing logic. Override the initializeIterator() method to transform, filter, or yield additional rows. The test module csv_source_yield_test demonstrates this pattern.

Using record numbers as identifiers

When the source CSV lacks a natural unique identifier, enable create_record_number to generate sequential row numbers. The numbers are stored in the field named by record_number_field (record_number by default). Include that field in your ids configuration to use the row position as the migration identifier.
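One possible configuration is sketched below. The path is a placeholder; record_number_field is set explicitly here for clarity, although the plugin is documented to default to record_number.

```yaml
# Hypothetical source section that keys rows by their position in the file.
source:
  plugin: csv
  path: /var/data/exports/comments.csv
  header_offset: 0
  # Adds an incrementing value to every record.
  create_record_number: true
  # Name of the generated field.
  record_number_field: record_number
  ids:
    - record_number
```

Because the identifier is the row position, reordering or deleting rows in the file between runs changes which row each id refers to, so this approach suits files that do not change after export.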

Tips

  • Use absolute paths for CSV files in production, or use hook_migration_plugins_alter() to construct paths dynamically based on module location.
  • For large CSV files, the League\Csv library reads records through iterators rather than loading the whole file into memory, but consider splitting very large files for easier debugging.
  • Test migrations with a small subset of data first using the --limit option in drush migrate:import.
  • Use the migrate_plus module for additional features like migration groups, configuration entities, and the migration UI.
  • When debugging, the fields() method output shows how the source plugin interprets column names - check this if mappings seem incorrect.
  • The /dev/null path is specially handled to provide an empty source, useful for migration_lookup operations that need a source but don't actually migrate data.
  • Consider using file stream wrappers (e.g., public://, private://) for CSV files stored within Drupal's file system.

Technical Details

Hooks
hook_migration_plugins_alter

Can be used to alter migration plugin definitions, such as modifying the source path for CSV migrations. The test module demonstrates this pattern to set absolute paths for CSV source files.

Troubleshooting
Error: You must declare the "path" to the source CSV file in your source settings.

Ensure your migration YAML includes a 'path' key under the source configuration pointing to an absolute path or valid file stream URI for your CSV file.

Error: You must declare "ids" as a unique array of fields in your source settings.

Add an 'ids' array to your source configuration containing the column name(s) that uniquely identify each row. Example: ids: [id] for a single key, or ids: [language, content_id] for a composite key.

Error: delimiter/enclosure/escape must be a single character

Ensure these configuration options are exactly one character. Multi-character delimiters are not supported by the underlying CSV library.

Error: File "path" was not found.

Verify the file path is correct and the file exists. Use absolute paths or Drupal stream wrappers. Check file permissions.

Error: File "path" could not be opened.

Check file permissions and ensure that the user running the migration (the web server user, or the command-line user when running Drush) has read access to the CSV file.

Column names not matching expected values

If using a header row, verify header_offset is set correctly (0 for first row). If header_offset is wrong, column names will be data values instead of headers. Use fields configuration to override column names if needed.

Data appearing in wrong columns

Verify the delimiter configuration matches your file format. Common issues include using comma delimiter for tab-separated files or vice versa.
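For a tab-separated file, the delimiter can be written as in the sketch below (the path and id column are placeholders). Note the double quotes: in YAML, "\t" is parsed as a single tab character, whereas '\t' in single quotes is a backslash followed by a letter t, which is two characters and would trigger the single-character error described above.

```yaml
# Hypothetical source section for a tab-separated file.
source:
  plugin: csv
  path: /var/data/exports/inventory.tsv
  # Double-quoted so YAML interprets \t as one tab character.
  delimiter: "\t"
  header_offset: 0
  ids:
    - id
```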

Security Notes
  • Ensure CSV source files are not accessible via web URLs if they contain sensitive data. Store files outside the web root or use private:// stream wrapper.
  • Validate and sanitize data during the process phase of migrations, especially for user-provided CSV files.
  • Be cautious with the path configuration - it accepts file paths and streams, so restrict who can create or modify migration configurations in production.