Bulk-creating WooCommerce products from a CSV or Google Sheet with a script comes down to three things: read the source rows, resolve each one to a product by its SKU, and create it through the WooCommerce CRUD (not by poking meta). The key move is making it idempotent by SKU: before creating, call wc_get_product_id_by_sku($sku); if the SKU already exists, update that product, and if it does not, create a new one. That single check is what lets you re-run the same file as many times as you like without ever duplicating a product.
WooCommerce ships a perfectly good CSV importer in wp-admin, and for a flat catalog of simple products it is the right tool. You reach for a script when the importer cannot express what you need: idempotent re-runs keyed on SKU, computed fields (a price derived from a cost column, a SKU built from two others), conditional category logic, custom meta the importer does not map, or a feed you want a cron to ingest unattended. This article builds that script as a WP-CLI command, dry-run by default, on top of the pull-model service-account reader the rest of this cluster uses.
When the core CSV importer is not enough
The built-in importer (Products, Import) is genuinely good. It handles variations, image URLs, and most standard columns, and for a one-off load of a clean catalog you should just use it. The reasons to drop down to a script are specific:
- Idempotent re-runs. The importer can update existing products by ID or SKU, but it is a manual, click-through flow. A script keyed on SKU can run on a schedule and converge: new rows create, changed rows update, unchanged rows are skipped.
- Computed fields. A retail price that is
cost * 1.4rounded to.99, a SKU assembled from a brand code and a model number, a name templated from several columns. The importer maps columns one to one; it does not compute. - Conditional category logic. "If the type column is
hoodie, put it under Apparel > Hoodies and tag itwinter." Branching like that belongs in code. - Custom meta and integrations. A supplier ID, an ERP key, an ACF field. You can sometimes shoehorn meta into the importer, but writing it explicitly through the CRUD is clearer and survives the next column rename.
- A live feed. When the source is a supplier sheet that changes weekly, you want a cron to read it and reconcile, not a person to remember to export and upload.
If none of those apply, use the importer and move on. If one or more do, read on.
The source: a sheet or a published CSV
The input is one row per product. The join key (the column that makes the run idempotent) is sku; the rest are the fields to set.
| sku | name | price | category | description |
|---|---|---|---|---|
| TE-HOODIE-001 | TechEarl Hoodie | 49.99 | Apparel > Hoodies | Heavyweight cotton hoodie. |
| TE-MUG-002 | TechEarl Mug | 14.99 | Drinkware | 11oz ceramic mug. |
| TE-CAP-003 | TechEarl Cap | 24.99 | Apparel > Caps | Six-panel cotton cap. |
How the script reads that source is a decision in itself, and I covered both ends of it in reading a Google Sheet in PHP: CSV vs the API. The short version: if the sheet can be public, "Publish to the web" as CSV and fgetcsv it in three lines; if it must stay private, read it with the Sheets API as a service account (the te_sheet_token JWT helper from that article mints the access token). The creation logic below does not care which one you use; it just needs an array of associative rows. I will show a plain CSV reader since it is the lowest-friction starting point.
/** Read a CSV into an array of associative rows keyed by the header. */
function te_read_csv( $path ) {
$fh = fopen( $path, 'r' );
$head = fgetcsv( $fh );
$rows = array();
while ( ( $line = fgetcsv( $fh ) ) !== false ) {
$rows[] = array_combine( $head, $line );
}
fclose( $fh );
return $rows;
}One safety note before the values go anywhere. A spreadsheet cell that starts with =, +, -, or @ is a formula, and the danger is on the way back out: if this catalog data is ever re-exported to CSV and reopened in Excel or Sheets, a value like =HYPERLINK("http://evil","click") or =cmd|... executes in the viewer's spreadsheet. This is CSV (formula) injection. The product name and description here flow into WooCommerce, which escapes them on output, so the storefront is fine, but any field you might round-trip back into a sheet should be neutralised by prefixing a leading apostrophe or stripping a leading =+-@ before re-export. Sanitize where the data leaves your control, not just where it enters.
Create through the CRUD, idempotent by SKU
This is the heart of it. Never write _price or any product meta key directly: WooCommerce keeps several derived values in sync (the _price meta is a synced copy of _regular_price and _sale_price, and writing it by hand desyncs them and skips the runtime caches the CRUD clears for you). Go through the object. WC_Product_Simple and its setters have been the supported path since WooCommerce 3.0 (April 2017), so this is safe on any modern store.
The idempotency check is one line at the top: resolve the SKU to an existing product ID, and branch.
/**
* Create or update one product from a source row. Returns ['action' => ..., 'id' => ...].
* Idempotent: the same SKU updates the same product, never duplicates.
*/
function te_upsert_product( array $row ) {
$sku = trim( $row['sku'] ?? '' );
$name = trim( $row['name'] ?? '' );
// Validate the required fields before touching anything.
if ( '' === $sku || '' === $name ) {
return array( 'action' => 'SKIP (missing sku/name)', 'id' => 0, 'sku' => $sku );
}
$existing_id = wc_get_product_id_by_sku( $sku );
$product = $existing_id ? wc_get_product( $existing_id ) : new WC_Product_Simple();
$action = $existing_id ? 'update' : 'create';
$product->set_name( $name );
$product->set_sku( $sku );
$product->set_regular_price( (string) $row['price'] ); // string, e.g. '49.99'
$product->set_description( $row['description'] ?? '' );
$product->set_status( 'publish' );
// Categories: resolve (and create) the terms, then assign by ID.
if ( ! empty( $row['category'] ) ) {
$product->set_category_ids( te_resolve_category_ids( $row['category'] ) );
}
// A custom meta field (supplier ID, ERP key, anything). Written via the CRUD.
if ( ! empty( $row['supplier_id'] ) ) {
$product->update_meta_data( '_te_supplier_id', $row['supplier_id'] );
}
$id = $product->save();
return array( 'action' => $action, 'id' => $id, 'sku' => $sku );
}A few things worth calling out. Prices are passed as strings ('49.99'), which is what the setters expect and what avoids float-formatting surprises. set_status('publish') is explicit so a freshly created product is live; drop it to 'draft' if you would rather stage and review. And every write goes through save(), which is the single call that persists the object, updates the lookup tables, and busts the caches.
Categories: resolve or create the terms
The sheet has a human-readable category path like Apparel > Hoodies. WooCommerce wants term IDs. The resolver walks the path, creating any term that does not exist yet, and returns the ID of the leaf (with its ancestors as parents). I use wp_set_object_terms semantics under the hood, but since the work is inside the CRUD I hand the IDs to set_category_ids so the product object stays the source of truth.
/** Turn "Apparel > Hoodies" into the leaf category term ID, creating terms as needed. */
function te_resolve_category_ids( $path ) {
$parent = 0;
$leaf = 0;
foreach ( array_map( 'trim', explode( '>', $path ) ) as $name ) {
if ( '' === $name ) { continue; }
$term = term_exists( $name, 'product_cat', $parent );
if ( ! $term ) {
$term = wp_insert_term( $name, 'product_cat', array( 'parent' => $parent ) );
}
if ( is_wp_error( $term ) ) { break; }
$parent = $leaf = (int) $term['term_id'];
}
return $leaf ? array( $leaf ) : array();
}If you would rather assign categories on a product that already exists outside the CRUD flow, wp_set_object_terms($product_id, $term_ids, 'product_cat') does the same job directly against the taxonomy. Inside te_upsert_product the set_category_ids route is cleaner because it batches into the one save().
Attributes: build them as objects
Global or custom attributes (Size, Color) are not strings on the product, they are WC_Product_Attribute objects. Build one per attribute, set its options, mark it visible on the product page, and hand the array to set_attributes. This example treats them as custom (per-product) attributes, which is the common case for a flat import; promoting them to global attribute taxonomies is a separate decision.
/** Build a custom (non-taxonomy) product attribute from a name and a list of values. */
function te_build_attribute( $name, array $values ) {
$attr = new WC_Product_Attribute();
$attr->set_name( $name ); // e.g. 'Size'
$attr->set_options( $values ); // e.g. ['S', 'M', 'L']
$attr->set_visible( true ); // show in the Additional Information tab
$attr->set_variation( false ); // simple product, not a variable one
return $attr;
}
// Inside te_upsert_product, before save(), if the row carries a 'sizes' column:
// $product->set_attributes( array( te_build_attribute( 'Size', explode( '|', $row['sizes'] ) ) ) );set_variation(false) matters here: these attributes describe a simple product, they do not drive variations. If you need variable products with real variations, that is WC_Product_Variable plus WC_Product_Variation children, which is enough extra surface to be its own article.
The WP-CLI command, dry-run by default
Wrapping the upsert in a WP-CLI command gives you the bulk-script harness the rest of this cluster leans on: a clean place to pass arguments, a table of what changed, and the discipline I insist on everywhere, change-only and dry-run by default, which I laid out in safe batch updates to custom fields. Nothing writes unless you pass --live. Back up the database before the first live run.
<?php
/**
* Plugin Name: TE Product Importer
* Plugin URI: https://techearl.com/woocommerce-bulk-create-products-from-csv
* Description: WP-CLI command that creates or updates WooCommerce products from a CSV or sheet, idempotent by SKU.
* Author: Ishan Karunaratne
* Author URI: https://techearl.com
*/
defined( 'ABSPATH' ) || exit;
if ( ! defined( 'WP_CLI' ) || ! WP_CLI ) { return; }
class TE_Product_Import_Command {
/**
* Create or update products from a CSV. Idempotent by SKU.
*
* ## OPTIONS
* --file=<path> : Path to the CSV (sku,name,price,category,description).
* [--live] : Actually write. Without it, dry run.
*/
public function import( $args, $assoc ) {
if ( ! function_exists( 'wc_get_product_id_by_sku' ) ) {
WP_CLI::error( 'WooCommerce is not active.' );
}
$rows = te_read_csv( $assoc['file'] );
$live = isset( $assoc['live'] );
$report = array();
foreach ( $rows as $row ) {
$sku = trim( $row['sku'] ?? '' );
$name = trim( $row['name'] ?? '' );
if ( '' === $sku || '' === $name ) {
$report[] = array( 'sku' => $sku, 'action' => 'SKIP (missing sku/name)', 'id' => '' );
continue;
}
$exists = wc_get_product_id_by_sku( $sku );
$action = $exists ? 'update' : 'create';
if ( $live ) {
$result = te_upsert_product( $row );
$report[] = array( 'sku' => $sku, 'action' => $result['action'], 'id' => $result['id'] );
} else {
$report[] = array( 'sku' => $sku, 'action' => "WOULD {$action}", 'id' => $exists ?: '' );
}
}
WP_CLI\Utils\format_items( 'table', $report, array( 'sku', 'action', 'id' ) );
$verb = $live ? 'applied' : 'previewed (dry run)';
WP_CLI::success( sprintf( '%d rows %s.', count( $report ), $verb ) );
}
}
WP_CLI::add_command( 'te-products', 'TE_Product_Import_Command' );The dry run resolves each SKU and reports whether it would create or update, without writing a thing. That preview is the whole safety story: you see exactly which rows are new and which already exist before any product is touched.
Run it
Dry run first, every time:
wp te-products import --file=products.csv
The table is keyed on SKU. New SKUs show WOULD create, SKUs that already exist show WOULD update with the existing product ID, and rows missing a required field show SKIP. When the preview looks right, add --live:
wp te-products import --file=products.csv --liveRe-run the same file immediately and every row flips to update rather than create, because each SKU now resolves to a real product. That is idempotency in practice: the file is a desired-state document, and the command converges the catalog to it without duplicating.
Putting it on a schedule
Once it is idempotent, scheduling is trivial and safe. A nightly cron that runs wp te-products import --file=/path/to/feed.csv --live will create new products as the feed adds them and update existing ones as prices or descriptions change, with no duplicates and no human in the loop. If the feed is a private Google Sheet rather than a file on disk, swap te_read_csv for the service-account reader and the rest is unchanged. This is the same hands-off, pull-on-a-schedule shape the cluster uses for custom fields; products are just a richer payload that has to route through the CRUD.
For the inverse job, taking a catalog that already exists and pushing price or stock edits back from a sheet, see the bulk-update WooCommerce products from a sheet sibling. Creation and update share the same SKU-resolution and CRUD discipline; they differ mainly in whether a missing SKU is an error (update) or a new product (create).
Sources
Authoritative references this article was fact-checked against.
- WC_Product_Simple - WooCommerce Code Referencewoocommerce.github.io
- WC_Product_Attribute - WooCommerce Code Referencewoocommerce.github.io
- wc_get_product_id_by_sku() - WooCommerce Code Referencewoocommerce.github.io
- wp_set_object_terms() - WordPress Developer Referencedeveloper.wordpress.org
- WP_CLI class - WordPress Developer Referencedeveloper.wordpress.org





