High-performance Core Data import

There is no silver bullet when it comes to importing large sets of data into a complex Core Data graph. But it's not impossible to do. This is a common-sense approach that has worked very well in my apps for several years now.

Let’s say you are writing an iOS client for Spotify’s API, specifically the search endpoint. Here you can get multiple entities, like Artist, Album, Track and Playlist, and all of them are inter-related in that same JSON.
You might get a particular artist instance as a top-level result and/or inside album.artists or even inside track.album.artists. It’s quite possible that the very same artist appears in all of those places.

When searching repeatedly for various terms and parsing/importing each received JSON, it would be a disaster to insert the same artist as multiple records into the Core Data store. An easy way to prevent this is to set id as a unique attribute. That path leads to lots of merge conflicts when attempting to save into the store, and then to figuring out how to resolve them automatically (better UX but rather complicated in some edge cases) or bothering your customers to choose (way worse UX, you should almost never do this).
It’s not a good path.

Another way to prevent this is to check if an artist with a particular id already exists and, if so, update its other properties instead of inserting a new record. Naturally, such a check should be done for each JSON object. This is what Apple calls upsert in the SwiftData videos, as it’s essentially an insert/update combined into one operation.

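The least performant way is to do this separately for each record you encounter: fetch the one record matching the id, update it if it exists or insert a new one if it doesn't, move on to the next JSON object, and save when you are done.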
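As a minimal sketch of that per-record scenario, here is roughly what it looks like in Swift. The ArtistJSON struct and the function name are made up for illustration, and the code assumes your model has an Artist entity with string attributes id and name. It uses key-value coding so it does not depend on a generated NSManagedObject subclass.

```swift
import CoreData

// Hypothetical shape of one decoded artist from the search JSON.
struct ArtistJSON {
    let id: String
    let name: String
}

// Naive upsert: one fetch per JSON object, one save at the end.
// Assumption: the model has an "Artist" entity with string attributes "id" and "name".
// Call this on the context's own queue (e.g. inside context.perform { ... }).
func importOneByOne(_ items: [ArtistJSON], into context: NSManagedObjectContext) throws {
    for item in items {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Artist")
        request.predicate = NSPredicate(format: "id == %@", item.id)
        request.fetchLimit = 1

        // Every iteration executes its own fetch: 1000 items means 1000 fetches.
        let existing = try context.fetch(request).first
        let artist = existing
            ?? NSEntityDescription.insertNewObject(forEntityName: "Artist", into: context)

        // Update (or fill in) the properties from the JSON.
        artist.setValue(item.id, forKey: "id")
        artist.setValue(item.name, forKey: "name")
    }
    try context.save()
}
```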

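This will perform as many Core Data fetches as you have objects to import. Inherently it is super slow: for 1000 JSON objects you will do 1000 fetches of one object each and then one more operation to save. iPhones/iPads have very fast SSD chips but they are still much, much slower than their memory chips. Thus the right way to do such an import is to process the JSON in batches: collect the ids of all the objects in the batch, fetch the already stored records for those ids in one go, do the update-or-insert matching in memory, and save once at the end.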
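A rough sketch of the batched import for a single entity could look like the following. As before, ArtistJSON and the function name are hypothetical, and the code assumes an Artist entity with string attributes id and name. Treat it as an illustration of the idea rather than a drop-in implementation; with several inter-related entities you would repeat the same pattern once per entity.

```swift
import CoreData

// Hypothetical shape of one decoded artist from the search JSON.
struct ArtistJSON {
    let id: String
    let name: String
}

// Batched upsert: one fetch for the whole batch, one save at the end.
// Assumption: the model has an "Artist" entity with string attributes "id" and "name".
// Call this on the context's own queue (e.g. inside context.perform { ... }).
func importInBatch(_ items: [ArtistJSON], into context: NSManagedObjectContext) throws {
    // 1. Collect the distinct ids present in this batch.
    let ids = Array(Set(items.map { $0.id }))

    // 2. A single fetch for every stored artist whose id is in the batch.
    let request = NSFetchRequest<NSManagedObject>(entityName: "Artist")
    request.predicate = NSPredicate(format: "id IN %@", ids as NSArray)
    let existing = try context.fetch(request)

    // 3. Index the results by id so all further lookups happen in memory.
    var byID: [String: NSManagedObject] = [:]
    for object in existing {
        if let id = object.value(forKey: "id") as? String {
            byID[id] = object
        }
    }

    // 4. Update the records that exist, insert the ones that don't.
    for item in items {
        let artist = byID[item.id]
            ?? NSEntityDescription.insertNewObject(forEntityName: "Artist", into: context)
        artist.setValue(item.id, forKey: "id")
        artist.setValue(item.name, forKey: "name")
        // Remember new inserts too, so an id repeated within the batch reuses the record.
        byID[item.id] = artist
    }

    // 5. One save for the whole batch.
    try context.save()
}
```

The dictionary in step 3 is what replaces the per-object fetches: the store is consulted once, and everything after that is a lookup in memory.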

This will result in just a handful of fetches and one save. The more objects you process in a batch, the more performance benefit you get with this approach.

This is The Way!™