Update
I have resolved and removed the distracting error. Please read the entire post and feel free to leave comments if any questions remain.
Background
I am attempting to write relatively large files (video) to disk on iOS using Swift 2.0, GCD, and a completion handler, and I would like to know whether there is a more efficient way to do this. The write must not block the main (UI) thread, must support completion logic, and should finish as quickly as possible. My custom objects have an NSData property, so I am currently experimenting with an extension on NSData. An alternative solution might, for example, use NSFileHandle or NSStream coupled with some form of thread-safe behavior, provided it gives much faster throughput than the NSData writeToURL function on which the current solution is based.
What's wrong with NSData Anyway?
Please note the following discussion taken from the NSData Class Reference (Saving Data). I do perform writes to my temp directory; however, the main reason I am having an issue is that I can see a noticeable lag in the UI when dealing with large files. This lag occurs precisely because the NSData write methods are synchronous (and the Apple docs note that atomic writes can cause performance issues on "large" files, roughly anything over 1 MB). So when dealing with large files, one is at the mercy of whatever internal mechanism is at work within the NSData methods.
I did some more digging and found this info from Apple: "This method is ideal for converting data:// URLs to NSData objects, and can also be used for reading short files synchronously. If you need to read potentially large files, use inputStreamWithURL: to open a stream, then read the file a piece at a time." (NSData Class Reference, Objective-C, +dataWithContentsOfURL:). This seems to imply that I could try using streams to write the file out on a background thread if moving the writeToURL call to a background thread (as suggested by @jtbandes) is not sufficient.
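As a rough illustration of the "piece at a time" reading that the quoted documentation recommends, here is a minimal sketch. It is written against current Swift and Foundation names (InputStream, Data) rather than the Swift 2.0 spellings (NSInputStream, NSData). The function readInChunks, its chunkSize default, and its handler parameter are hypothetical names invented for this example; they are not part of any Apple API.

```swift
import Foundation

/// Hypothetical helper: reads the file at `url` in pieces of at most
/// `chunkSize` bytes and hands each piece to `handler`, so the whole file
/// is never held in memory at once.
/// Returns the total number of bytes read, or nil if the stream fails.
func readInChunks(from url: URL, chunkSize: Int = 64 * 1024,
                  handler: (Data) -> Void) -> Int? {
    guard chunkSize > 0, let stream = InputStream(url: url) else { return nil }
    stream.open()
    defer { stream.close() }

    var buffer = [UInt8](repeating: 0, count: chunkSize)
    var total = 0
    while true {
        // read(_:maxLength:) returns 0 at end of file and a negative value on error.
        let count = stream.read(&buffer, maxLength: chunkSize)
        if count < 0 { return nil }
        if count == 0 { return total }
        handler(Data(buffer[0..<count]))
        total += count
    }
}
```

The call blocks the calling thread until the file has been consumed, so in an app it would need to run on a background queue.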
The NSData class and its subclasses provide methods to quickly and
easily save their contents to disk. To minimize the risk of data loss,
these methods provide the option of saving the data atomically. Atomic
writes guarantee that the data is either saved in its entirety, or it
fails completely. The atomic write begins by writing the data to a
temporary file. If this write succeeds, then the method moves the
temporary file to its final location.
While atomic write operations minimize the risk of data loss due to
corrupt or partially-written files, they may not be appropriate when
writing to a temporary directory, the user’s home directory or other
publicly accessible directories. Any time you work with a publicly
accessible file, you should treat that file as an untrusted and
potentially dangerous resource. An attacker may compromise or corrupt
these files. The attacker can also replace the files with hard or
symbolic links, causing your write operations to overwrite or corrupt
other system resources.
Avoid using the writeToURL:atomically: method (and the related
methods) when working inside a publicly accessible directory. Instead
initialize an NSFileHandle object with an existing file descriptor and
use the NSFileHandle methods to securely write the file.
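One way to read the advice above is sketched here, under stated assumptions. The file is opened with the POSIX open call so that the write refuses to follow a symbolic link or reuse an existing file, and the resulting descriptor is wrapped in a FileHandle (NSFileHandle in Swift 2.0). The name secureWrite, the chunk size, and the choice of open flags are my own assumptions, not something the documentation prescribes.

```swift
import Foundation

/// Hypothetical helper: writes `data` to `path` in fixed-size chunks through
/// a FileHandle built on a descriptor opened by this function.
/// O_EXCL makes open fail if anything already exists at `path`, and
/// O_NOFOLLOW makes it fail if the last path component is a symbolic link.
/// Returns false if the file could not be opened.
func secureWrite(_ data: Data, toPath path: String, chunkSize: Int = 64 * 1024) -> Bool {
    guard chunkSize > 0 else { return false }
    let fd = open(path, O_WRONLY | O_CREAT | O_EXCL | O_NOFOLLOW, 0o600)
    guard fd >= 0 else { return false }

    let handle = FileHandle(fileDescriptor: fd, closeOnDealloc: true)
    defer { handle.closeFile() }

    var offset = data.startIndex
    while offset < data.endIndex {
        let end = min(offset + chunkSize, data.endIndex)
        // Note: write(_:) raises an exception rather than returning an error
        // if the underlying write fails (for example, when the disk is full).
        handle.write(data.subdata(in: offset..<end))
        offset = end
    }
    handle.synchronizeFile() // flush the written bytes to disk
    return true
}
```

Like writeToURL, this call is synchronous, so it does not remove the need to run the write off the main thread.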
Other Alternatives
One article on Concurrent Programming at objc.io provides interesting options on "Advanced: File I/O in the Background". Some of the options involve use of an InputStream as well. Apple also has some older references to reading and writing files asynchronously. I am posting this question in anticipation of Swift alternatives.
Example of an appropriate answer
Here is an example of an appropriate answer that might satisfy this type of question. (Taken from the Stream Programming Guide, Writing To Output Streams.)
Using an NSOutputStream instance to write to an output stream requires several steps:
- Create and initialize an instance of NSOutputStream with a repository for the written data. Also set a delegate.
- Schedule the stream object on a run loop and open the stream.
- Handle the events that the stream object reports to its delegate.
- If the stream object has written data to memory, obtain the data by requesting the NSStreamDataWrittenToMemoryStreamKey property.
- When there is no more data to write, dispose of the stream object.
I am looking for the most efficient algorithm for writing extremely large
files to disk on iOS. Swift APIs are preferred, but an answer in C or
Objective-C would suffice; I can translate the algorithm into the
appropriate Swift constructs.
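For comparison, here is a minimal sketch of a stream-based write with completion logic. It deliberately swaps the delegate and run loop steps quoted above for a simpler technique: blocking writes to an OutputStream (NSOutputStream in Swift 2.0) performed on a background dispatch queue. The name streamWrite, the chunk size, and the quality-of-service class are assumptions made for this example, and I have not measured whether it is faster than writeToURL.

```swift
import Foundation

/// Hypothetical helper: streams `data` to `url` in chunks on a background
/// queue, then reports success or failure through `completion`.
/// The completion handler runs on the background queue, not the main queue.
func streamWrite(_ data: Data, to url: URL, chunkSize: Int = 64 * 1024,
                 completion: @escaping (Bool) -> Void) {
    DispatchQueue.global(qos: .utility).async {
        guard chunkSize > 0, let stream = OutputStream(url: url, append: false) else {
            completion(false)
            return
        }
        stream.open()

        var succeeded = true
        data.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Void in
            // An empty Data has no base address; there is nothing to write.
            guard let base = raw.bindMemory(to: UInt8.self).baseAddress else { return }
            var offset = 0
            while offset < raw.count {
                let length = min(chunkSize, raw.count - offset)
                // write(_:maxLength:) returns the number of bytes accepted;
                // zero or a negative value is treated as a failure here.
                let written = stream.write(base + offset, maxLength: length)
                if written <= 0 { succeeded = false; return }
                offset += written
            }
        }

        // Close before calling back so dependent work sees a finished file.
        stream.close()
        completion(succeeded)
    }
}
```

In an app, the completion handler would typically dispatch back to the main queue before touching any UI.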
Nota Bene
I understand the informational error below. It is included for completeness. This
question is asking whether there is a better algorithm for writing large
files to disk with a guaranteed dependency sequence (e.g. NSOperation
dependencies). If there is, please provide enough information (a description
or sample) for me to reconstruct the pertinent Swift 2.0 compatible code.
Please advise if I am missing any information that would help answer the question.
Note on the extension
I've added a completion handler to the base writeToURL to ensure that
no unintended resource sharing occurs. My dependent tasks that use the file
should never face a race condition.
extension NSData {
    // Writes the data to a named file in the temporary directory on a
    // background queue. The completion handler runs on that background queue.
    func writeToURL(named: String, completion: (result: Bool, url: NSURL?) -> Void) {
        let filePath = NSTemporaryDirectory() + named
        let tmpURL = NSURL(fileURLWithPath: filePath)
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
            // self is captured strongly so the data outlives the write
            // (force-unwrapping a weak reference here could crash).
            // Write atomically, then confirm that the file exists.
            let written = self.writeToURL(tmpURL, atomically: true)
                && NSFileManager.defaultManager().fileExistsAtPath(filePath)
            // Always call back, including when the write itself fails.
            completion(result: written, url: tmpURL)
        }
    }
}
This method is used to process the custom object's data from a controller using:
var items = [AnyObject]()
if let video = myCustomClass.data {
    // video is of type NSData
    video.writeToURL("shared.mp4") { (result, url) -> Void in
        if let url = url where result {
            // The completion handler may run on a background queue,
            // and UIKit must only be used from the main queue.
            dispatch_async(dispatch_get_main_queue()) {
                items.append(url)
                let sharedActivityView = UIActivityViewController(activityItems: items, applicationActivities: nil)
                self.presentViewController(sharedActivityView, animated: true, completion: nil)
            }
        }
    }
}
Conclusion
The Apple Docs on Core Data Performance provide some good advice on dealing with memory pressure and managing BLOBs. This is really one heck of an article with a lot of clues to behavior and how to moderate the issue of large files within your app. Now although it is specific to Core Data and not files, the warning on atomic writing does tell me that I ought to implement methods that write atomically with great care.
With large files, the only safe way to manage writing seems to be adding in a completion handler (to the write method) and showing an activity view on the main thread. Whether one does that with a stream or by modifying an existing API to add completion logic is up to the reader. I've done both in the past and am in the midst of testing for best performance.
Until then, I'm changing the solution to remove all binary data properties from Core Data and replace them with strings that hold asset URLs on disk. I am also leveraging the built-in functionality of the Assets Library and PHAsset to grab and store all related asset URLs. When or if I need to copy any assets, I will use standard API methods (the export methods on PHAsset/Assets Library) with completion handlers to notify the user of the finished state on the main thread.
(Really useful snippets from the Core Data Performance article)
Reducing Memory Overhead
It is sometimes the case that you want to use managed objects on a
temporary basis, for example to calculate an average value for a
particular attribute. This causes your object graph, and memory
consumption, to grow. You can reduce the memory overhead by
re-faulting individual managed objects that you no longer need, or you
can reset a managed object context to clear an entire object graph.
You can also use patterns that apply to Cocoa programming in general.
You can re-fault an individual managed object using
NSManagedObjectContext’s refreshObject:mergeChanges: method. This has
the effect of clearing its in-memory property values thereby reducing
its memory overhead. (Note that this is not the same as setting the
property values to nil—the values will be retrieved on demand if the
fault is fired—see Faulting and Uniquing.)
When you create a fetch request you can set includesPropertyValues to NO to reduce memory overhead by avoiding creation of objects to represent the property values. You should typically only do so, however, if you are sure that either you will not need the actual property data or you already have the information in the row cache, otherwise you will incur multiple trips to the persistent store.
You can use the reset method of NSManagedObjectContext to remove all managed objects associated with a context and "start over" as if you'd just created it. Note that any managed object associated with that context will be invalidated, and so you will need to discard any references to and re-fetch any objects associated with that context in which you are still interested. If you iterate over a lot of objects, you may need to use local autorelease pool blocks to ensure temporary objects are deallocated as soon as possible.
If you do not intend to use Core Data’s undo functionality,
you can reduce your application's resource requirements by setting the
context’s undo manager to nil. This may be especially beneficial for
background worker threads, as well as for large import or batch
operations.
Fina