The external developer on the ERP integration for a client site sent a message saying he couldn’t push the product sync to production. The database was a mess of encodings: latin1, utf8mb3, and utf8mb4 mixed across tables. On top of that, wp-config.php had DB_CHARSET = 'latin1', a setting that belongs to the MySQL 5.x era and has no business being in a production site in 2026.
A quick query against information_schema.tables confirmed three groups of tables with different collations. Core WordPress and WooCommerce tables were on utf8mb4_unicode_520_ci, so the actual product and order content was fine. The problem was with a handful of plugin tables created with old MySQL defaults, sitting on latin1_swedish_ci.
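A query along these lines produces that grouping. It is a sketch, not necessarily the exact statement used here: the schema name wordpress is a placeholder, the function name is made up for illustration, and the mysql client is assumed to pick up credentials from ~/.my.cnf.

```shell
# Hypothetical schema name; replace with the real database.
DB_NAME="wordpress"

# Print a query that counts base tables per collation in one schema.
# Views are excluded because their table_collation is NULL.
collation_report_sql() {
  cat <<SQL
SELECT table_collation, COUNT(*) AS table_count
FROM information_schema.tables
WHERE table_schema = '${DB_NAME}'
  AND table_type = 'BASE TABLE'
GROUP BY table_collation
ORDER BY table_count DESC;
SQL
}

# Usage, against a staging clone:
#   collation_report_sql | mysql --table
```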
The obvious fix that wasn’t
The logical next step: ALTER TABLE ... CONVERT TO CHARACTER SET utf8mb4. I cloned the production database to a staging environment, ran the statements for the 13 problematic tables, and verified the result: zero tables left with a non-utf8mb4 collation. Then I opened the site navigation in the browser and saw that not everything was OK. Košarica (the Croatian word for cart) had become KoÅ¡Arica. Račun (invoice) was now RaÄÅUn.
The reason is worth understanding, because the same trap appears in any system that has been running consistently wrong for years. The text in those tables had been written through a latin1 connection into latin1 columns, so the UTF-8 bytes PHP sent were stored untouched, each byte labelled as a separate latin1 character. It displayed correctly because the entire stack (database, PHP connection, browser) was consistently using the wrong encoding. CONVERT TO CHARACTER SET trusts the label: it treats every stored byte as a real latin1 character and transcodes it to utf8mb4. The two bytes that made up š became Å and ¡, now stored as genuine utf8mb4 characters. Broken by design, works by accident. The ALTER just broke the accident.
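The effect can be reproduced outside MySQL. The sketch below uses iconv as a stand-in for the server's transcoding (an assumption for illustration; MySQL's latin1 is close to, but not identical to, ISO-8859-1): it takes the UTF-8 bytes of Košarica, treats each byte as a latin1 character, and re-encodes the result as UTF-8.

```shell
# "Košarica" as UTF-8 bytes; š is 0xC5 0xA1, written in octal for portability.
original=$(printf 'Ko\305\241arica')

# Read every byte as one latin1 character and re-encode it as UTF-8.
# This mirrors converting a latin1 column whose bytes were UTF-8 all along.
mangled=$(printf '%s' "$original" | iconv -f ISO-8859-1 -t UTF-8)
printf '%s\n' "$mangled"    # prints: KoÅ¡arica

# Going back through latin1 recovers the original bytes.
restored=$(printf '%s' "$mangled" | iconv -f UTF-8 -t ISO-8859-1)
printf '%s\n' "$restored"   # prints: Košarica
```

The round trip in the last step is the same idea the latin1 export relies on: reading through the "wrong" character set hands back the bytes that were originally written.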
What doesn’t work
The standard ALTER TABLE approach only works when the stored bytes really are in the character set the column declares. In this case the columns were declared latin1 but the bytes in them were UTF-8, and the two wrongs had been cancelling each other out. Running ALTER in that situation doesn’t fix the encoding; it just surfaces the underlying problem.
The dump/reimport method
What works is to export the data in a way that extracts the raw bytes as they are, then re-import them with the right encoding declared. Concretely:
- Export with --default-character-set=latin1. This tells mysqldump to pull the bytes without reinterpreting them.
- Verify the dump header contains SET NAMES latin1, confirmation the export went through correctly.
- Run sed on the dump file to replace charset declarations: latin1 to utf8mb4, utf8mb3 to utf8mb4.
- Import with --default-character-set=utf8mb4. MySQL now reads those same bytes and interprets them correctly.
- Update wp-config.php: set DB_CHARSET to utf8mb4 and add DB_COLLATE with utf8mb4_unicode_ci.
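The export, header check, rewrite, and import steps above might look like this as a shell sketch. The database name, file names, and function names are placeholders, credentials are assumed to come from ~/.my.cnf, and the sed patterns are an assumption about how the declarations appear in a typical mysqldump file, not the exact expressions used on this site. They are deliberately narrower than a blanket latin1 replacement so that row data containing the word latin1 is left alone.

```shell
DB_NAME="wordpress"            # hypothetical database name
DUMP="dump_latin1.sql"         # hypothetical file names
FIXED="dump_utf8mb4.sql"

# Rewrite charset and collation declarations in a dump file, printing the
# result to stdout. Only text that follows SET NAMES, CHARSET=,
# CHARACTER SET or COLLATE is touched.
rewrite_charsets() {
  sed -E \
    -e 's/(SET NAMES |CHARSET=|CHARACTER SET )(latin1|utf8mb3)/\1utf8mb4/g' \
    -e 's/(COLLATE[= ])latin1_swedish_ci/\1utf8mb4_unicode_ci/g' \
    -e 's/(COLLATE[= ])utf8mb3_/\1utf8mb4_/g' \
    "$1"
}

# Run the whole procedure; each step must succeed before the next starts.
convert_database() {
  mysqldump --default-character-set=latin1 "$DB_NAME" > "$DUMP" &&
    head -n 20 "$DUMP" | grep -q 'SET NAMES latin1' &&
    rewrite_charsets "$DUMP" > "$FIXED" &&
    mysql --default-character-set=utf8mb4 "$DB_NAME" < "$FIXED"
}

# On a staging clone only, after taking a backup:
#   convert_database
# The wp-config.php change is still a manual edit afterwards.
```

Diffing the two dump files before importing is a cheap way to confirm that only declarations changed and no row data was rewritten.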
After that, the navigation was clean, all characters rendered correctly, and the ERP developer got the green light to proceed with the sync.
The actual lesson
When you inherit a WordPress site and find DB_CHARSET = 'latin1' in wp-config.php, try ALTER TABLE if you like. Maybe you’ll have better luck than I did. If not, the dump/reimport method is there and it works. At least it did work for me.
Either way, don’t rush it. Make a backup first. Clone to a staging site, run through the whole procedure there, verify everything looks right, and only then touch production. None of this is particularly complicated, but it’s the kind of work where skipping steps costs you an afternoon you didn’t plan on spending.