When I worked on being able to update my database with the latest datafiles from IMDb I used this strategy:
for every data field that wasn't re-importable (movie ratings, imdb ids, movie reviews, et cetera) dump data to file end for Build new database import saved data production to archive temporary to production
Everything was scripted, and it was all fine... except that "import saved data" was starting to take upwards of 25 minutes. MySQL was having to reparse all of the data, do a index match, build a record set, do the update, et cetera, once for each movie (at around 1.5 seconds per (420 movies currently)) and actor (a lot quicker at the unit level, but 19000 of them added up)
So I moved my data (movie ratings and reviews) to a user namespace (database name is the owning user's username) and the IMDb IDs to an archive database.
Why is this a win? Time to import my data: 0 time units. Time to import movie ids: 0.11s (total). Time to import person ids: 1.97s (total).
So now my only problem is that the JMDB import that I use to then build my format opens a transaction handle but doesn't close it... so lately I've been running out of hard drive space on the mysql partition because the import is building up a multi gigabyte transaction log that it never commits.