Don’t Over React! Render Binary Data with Class.
This post first appeared on the Big Nerd Ranch blog.
Sooner or later, your React web app will probably accept file uploads—perhaps to change out a user’s avatar or share images on a social site.
In modern browsers, the story for working with binary data is downright impressive thanks to objects like File, Blob and ArrayBuffer. You can even store large complex binary data directly in the browser with IndexedDB!
But working with binary data in a sandboxed tab is different from how a backend or native desktop app handles it. If you read a 5 MB image into a String, you may well crash the tab. Read in 10 images simultaneously and you may crash the browser!
Luckily, JavaScript exposes natively implemented APIs to handle chunks of binary data. With some creativity, you can have the user’s browser pull its own weight, like resizing images on the front-end before upload. But before you create your own React-powered Hipstergram, it’s important to understand the performance implications of binary data in a web app.
Recap: File Objects and Blobs
The browser can’t directly access the file system for security reasons, but users can drop files into the browser with drag-and-drop.
Here’s a barebones React component that accepts a file, like an image:
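// Minimal drop target: logs the first dropped file
let Drop = () =>
  <div
    onDragOver={e => e.preventDefault()}
    onDrop={e => {
      // Stop the browser from navigating to the dropped file
      e.preventDefault()
      let file = e.dataTransfer.files[0]
      console.log(file)
    }}
  >
    ...
  </div>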
Once the user drags and drops an image onto this <Drop> component, they probably expect to see a thumbnail-sized preview in the browser. The browser can read the file contents into a few formats, like a String or ArrayBuffer, but each image could be 5 MB; drop 10 in the browser and you have 50 MB of strings in memory!
So instead of directly returning a String or ArrayBuffer, the browser returns a Blob object. A Blob is essentially a pointer to a data source: it could point to a file on disk, an ArrayBuffer, streaming data, etc. Specifically, the array-like e.dataTransfer.files list holds one or more File objects, which are Blobs with some extra metadata, like the source file’s name.
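A rough illustration of the "pointer" idea, assuming an environment with a global Blob constructor (modern browsers, or Node.js 18+). The sample contents are made up for the demo:

```javascript
// A Blob built from in-memory parts; a dropped File exposes the same API.
const blob = new Blob(['hello ', 'world'], { type: 'text/plain' })

// Metadata is available without reading the contents into a string.
const size = blob.size // 11 (bytes)
const type = blob.type // 'text/plain'

// slice() returns another Blob describing a byte range of the original.
const firstWord = blob.slice(0, 5, 'text/plain')
```

Passing blob or firstWord around only passes a handle; the bytes are handed to JavaScript when you explicitly ask for them.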
To display the image in the DOM, e.g. with an <img /> tag, you can ask the browser for an ephemeral URL to the Blob object. This URL stays valid until the page is unloaded or you release it with URL.revokeObjectURL():
...
let file = e.dataTransfer.files[0]
let url = URL.createObjectURL(file)
console.log(url)
// => "blob:http://localhost:3000/266c0711-76dd-4a24-af1f-46a8014204ff"
You can use a blob: URL wherever you would put any other URL (like http://localhost:3000/images/logo.png) and it just works!
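Here is a small sketch of the full life cycle of such a URL, assuming an environment that provides Blob, URL.createObjectURL() and URL.revokeObjectURL() (browsers and recent Node.js versions). The SVG contents are a stand-in for a real dropped image:

```javascript
const blob = new Blob(
  ['<svg xmlns="http://www.w3.org/2000/svg"/>'],
  { type: 'image/svg+xml' }
)

// Register the blob and get back a URL string that refers to it.
const url = URL.createObjectURL(blob)
const isBlobUrl = url.startsWith('blob:')

// When the preview is no longer needed, release the registration.
// After this call the URL no longer resolves to the blob.
URL.revokeObjectURL(url)
```

The URL is just a string, so it can go into an src attribute, a CSS url(), or anywhere else a regular URL is accepted.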
The Trouble with “Just Rerender”
How do you use blob: URLs in React? Here’s a simple React app that accepts a dropped image and renders it on screen:
import React, { Component } from 'react'

class App extends Component {
  state = { file: null }

  onDrag = event => {
    event.preventDefault()
  }

  onDrop = event => {
    event.preventDefault()
    let file = event.dataTransfer.files[0]
    this.setState({ file })
  }

  render() {
    let { file } = this.state
    // A new blob: URL is created on every render
    let url = file && URL.createObjectURL(file)
    return (
      <div onDragOver={this.onDrag} onDrop={this.onDrop}>
        <p>Drop an image!</p>
        <img src={url} />
      </div>
    )
  }
}
The App component starts without any file; when an image file is dropped onto the <div> element, it updates the state and rerenders with a Blob URL. Easy peasy!
But what happens if this component’s props or state changes? Let’s add a counter that changes 10 times a second:
  class App extends Component {
-   state = { file: null }
+   state = { file: null, counter: 0 }
+   refresh = () => {
+     this.setState(({ counter }) => ({ counter: counter + 1 }))
+   }
+   componentDidMount() {
+     this.timer = setInterval(this.refresh, 100)
+   }
+   componentWillUnmount() {
+     clearInterval(this.timer)
+   }
    onDrag = event => {
      event.preventDefault()
    }
    onDrop = event => {
      event.preventDefault()
      let file = event.dataTransfer.files[0]
      this.setState({ file })
    }
    render() {
      let { file } = this.state
      let url = file && URL.createObjectURL(file)
      return (
        <div onDragOver={this.onDrag} onDrop={this.onDrop}>
          <p>Drop an image!</p>
          <img src={url} />
        </div>
      )
    }
  }
This forces React to rerender the <App> component 10 times a second. That’s fine since React is designed to handle this well, but there’s a problem: the blob: URL changes on every rerender! We can confirm this from the Sources panel in Chrome.
It seems the inline call to URL.createObjectURL() creates tons of extra blob: URLs that never get cleaned up: we’re leaking memory! Changing the URL on every rerender also changes the DOM, so the image sometimes flickers because the browser’s caching mechanism doesn’t know that the old and new blob: URLs point to the same image.
At a rerender rate of just 10 times a second, CPU usage explodes to an entire core and memory usage balloons. Eventually garbage collection will catch up, but at the cost of even more CPU usage.
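The effect is easy to reproduce outside React. This sketch simulates three renders that each call URL.createObjectURL() inline on the same blob; it assumes an environment with Blob and URL.createObjectURL() available, and the blob contents are made up:

```javascript
const blob = new Blob(['same bytes'], { type: 'text/plain' })

// Simulate three renders that each create a URL inline.
const urlsPerRender = [1, 2, 3].map(() => URL.createObjectURL(blob))

// Every call registers a new, distinct URL for the same blob.
const distinctCount = new Set(urlsPerRender).size

// Clean up the demo registrations.
urlsPerRender.forEach(url => URL.revokeObjectURL(url))
```

Since the strings differ, React sees a changed src attribute on every render and updates the DOM each time.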
Solution #1: Memoize in Class Component
For our trivial example, we can introduce an easy fix: just create the Blob URL once and store it in the <App> component’s state:
  class App extends Component {
-   state = { file: null, counter: 0 }
+   state = { url: '', counter: 0 }
    ...
    onDrop = event => {
      event.preventDefault()
      let file = event.dataTransfer.files[0]
-     this.setState({ file })
+     this.setState({ url: URL.createObjectURL(file) })
    }
    render() {
-     let { file } = this.state
-     let url = file && URL.createObjectURL(file)
+     let { url } = this.state
      return (
        ...
      )
    }
  }
That totally works, but only if you plan to do nothing else with the data. After the file is dropped, you will likely need to pass the original Blob object around to other React components, perhaps to store it in IndexedDB or upload it with FormData.
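As a hedged illustration of why the original Blob still matters, here is a sketch of preparing an upload with FormData. The toFormData() helper, the 'avatar' field name and the filename are all hypothetical, and the sketch assumes an environment with global Blob and FormData:

```javascript
// Hypothetical helper: package a dropped blob for a multipart upload.
const toFormData = (blob, filename) => {
  const form = new FormData()
  // The third argument sets the filename sent along with the part.
  form.append('avatar', blob, filename)
  return form
}

const blob = new Blob(['fake image bytes'], { type: 'image/png' })
const form = toFormData(blob, 'avatar.png')
const part = form.get('avatar')

// In a browser you might then send it with something like
// fetch(uploadUrl, { method: 'POST', body: form }),
// where uploadUrl is whatever endpoint your backend exposes.
```

None of this is possible if the component state only holds the blob: URL string, which is why the next solutions keep passing the Blob itself around.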
Solution #2: It’s Just an Object, Add a Property!
What if we just passed around the immutable Blob object, but added a url property to it with the memoized Blob URL?
  class App extends Component {
    ...
    render() {
      let { file } = this.state
-     let url = file && URL.createObjectURL(file)
+     let url = file && blobUrl(file)
      return (
        ...
      )
    }
  }

  let blobUrl = blob => {
    if (!blob.url) {
      blob.url = URL.createObjectURL(blob)
    }
    return blob.url
  }
That one change brings down CPU usage to near zero! But… we violated a design principle by modifying an object, the Blob object, from an API that we don’t own.
Solution #3: Global Variable
What if we passed around the Blob object, but instead of modifying it, we stored the generated Blob URL in a big lookup table that only the blobUrl() function can access?
Sounds like a global variable, right?
let hash = file => `${file.name}:${file.type}:${file.size}`

let urls = {}

let blobUrl = blob => {
  let key = hash(blob)
  if (!urls[key]) {
    urls[key] = URL.createObjectURL(blob)
  }
  return urls[key]
}
It’s a great idea, but difficult to execute because the keys in a Plain Ol’ JavaScript Object must be strings (or symbols), so we can only make a best effort at creating a collision-resistant key per Blob object.
While this will likely work for File objects, it won’t do for Blob objects: they don't have a .name property, so the likelihood of a key collision would be much higher.
The only real way to create a unique hash per Blob object is to tag each Blob object with a unique ID, but then we’re back to modifying the Blob object. However, we’re on the right track.
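To make the collision risk concrete, here is a small sketch that reuses the hash() function from above on two made-up Blob objects with the same type and byte length. It assumes a global Blob constructor:

```javascript
// Same key function as in the lookup-table attempt.
const hash = file => `${file.name}:${file.type}:${file.size}`

// Two different blobs that happen to share a type and a size.
const a = new Blob(['hello'], { type: 'text/plain' })
const b = new Blob(['world'], { type: 'text/plain' })

// A plain Blob has no name, so both keys are "undefined:text/plain:5".
const keyA = hash(a)
const collides = keyA === hash(b)
```

With a colliding key, blobUrl(b) would return the URL that was created for a, and the wrong content would be displayed.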
Solution #4: ES2015 Maps
We need a map type that accepts objects as keys. The POJO won’t do that, but the Map datatype introduced in ES2015 will! Each object has a unique identity because it has its own pointer (place in memory). The Map datatype compares object keys by that identity, so entries for distinct objects are guaranteed to be collision-free!
let urls = new Map()

let blobUrl = blob => {
  if (urls.has(blob)) {
    return urls.get(blob)
  } else {
    let url = URL.createObjectURL(blob)
    urls.set(blob, url)
    return url
  }
}
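To see the identity-based keys in action, here is a small sketch that uses placeholder strings in place of real blob: URLs. The blobs are made up and a global Blob constructor is assumed:

```javascript
const table = new Map()

const a = new Blob(['hello'], { type: 'text/plain' })
const b = new Blob(['world'], { type: 'text/plain' })

// Placeholder strings stand in for real blob: URLs.
table.set(a, 'url-for-a')
table.set(b, 'url-for-b')

const entryCount = table.size // 2: one entry per object
const lookupA = table.get(a) // 'url-for-a'

// A look-alike blob is a different object, so it is a different key.
const lookAlike = new Blob(['hello'], { type: 'text/plain' })
const lookupLookAlike = table.get(lookAlike) // undefined
```

Same type, same size, even the same contents: none of that matters, because only the object itself is compared.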
Boom! But we introduced a subtle problem: we’re leaking memory.
That’s right! In JavaScript we normally don’t manually manage memory, but that doesn’t “free” you from thinking about memory management!
JavaScript engines employ several strategies and heuristics for efficient garbage collection (like mark-and-sweep and generational garbage collection), but we can assume that objects are garbage collected once they are no longer “reachable.”
The urls variable lives at module scope, so it is reachable during the app’s entire lifetime. All keys and values in a Map stick around until removed. So unless we explicitly delete entries from the Map, the Blob objects and blob: URLs will always be reachable, and they’ll never be garbage collected. We’re leaking memory!
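One way to keep a plain Map from leaking is to pair blobUrl() with an explicit release step. The sketch below assumes a hypothetical releaseBlobUrl() helper, not part of the code above, that the app would call whenever a preview is removed. It also revokes the URL, since a blob: URL stays registered with the browser until it is revoked or the page unloads:

```javascript
const urls = new Map()

const blobUrl = blob => {
  if (!urls.has(blob)) urls.set(blob, URL.createObjectURL(blob))
  return urls.get(blob)
}

// Hypothetical companion to blobUrl(): call it when a blob is no longer shown.
const releaseBlobUrl = blob => {
  const url = urls.get(blob)
  if (url === undefined) return false
  URL.revokeObjectURL(url) // release the browser's registration
  return urls.delete(blob) // drop the Map's strong reference to the blob
}

const blob = new Blob(['preview'], { type: 'text/plain' })
const first = blobUrl(blob)
const second = blobUrl(blob) // memoized: same string as first
const sizeBefore = urls.size
const released = releaseBlobUrl(blob)
const sizeAfter = urls.size
```

The catch is that every caller has to remember to release, which is exactly the kind of manual bookkeeping we would rather avoid.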
Solution #5: ES2015 WeakMaps
What if we had a Map datatype that doesn’t prevent the keys from being garbage collected, and automatically deletes the key-value pair once the object becomes unreachable?
That’s precisely what a WeakMap does! It allows us to associate data with an object, but without modifying the original object. A WeakMap behaves like weak references do in Swift and Objective-C. Think of it as a noncommittal friend: “If no one needs you, neither do I.”
- let urls = new Map()
+ let urls = new WeakMap()

  let blobUrl = blob => {
    if (urls.has(blob)) {
      return urls.get(blob)
    } else {
      let url = URL.createObjectURL(blob)
      urls.set(blob, url)
      return url
    }
  }
WeakMaps are a great way for third-party libraries to “tag” external objects without modifying them. They’re especially useful for adding application-wide memoization.
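The same pattern generalizes beyond Blob URLs. As a rough sketch, a hypothetical memoizeByObject() helper could wrap any one-argument function whose argument is an object; the summarize() function and the sample blob are made up for the demo:

```javascript
// Hypothetical helper: memoize a one-argument function by object identity.
const memoizeByObject = fn => {
  const cache = new WeakMap()
  return obj => {
    if (!cache.has(obj)) cache.set(obj, fn(obj))
    return cache.get(obj)
  }
}

let calls = 0
const summarize = memoizeByObject(blob => {
  calls += 1
  return `${blob.type} (${blob.size} bytes)`
})

const blob = new Blob(['hello'], { type: 'text/plain' })
const first = summarize(blob) // computed
const second = summarize(blob) // served from the WeakMap
```

Because the cache is a WeakMap, it never keeps obj alive on its own; note that WeakMap keys must be objects, so this helper is not suitable for strings or numbers.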
Here’s the final solution for performant, flicker-free Blob previews:
import React, { Component } from 'react'

// Memoized blob: URLs, keyed by Blob identity
let urls = new WeakMap()

let blobUrl = blob => {
  if (urls.has(blob)) {
    return urls.get(blob)
  } else {
    let url = URL.createObjectURL(blob)
    urls.set(blob, url)
    return url
  }
}

class App extends Component {
  state = { file: null, counter: 0 }

  refresh = () => {
    this.setState(({ counter }) => ({ counter: counter + 1 }))
  }

  componentDidMount() {
    this.timer = setInterval(this.refresh, 100)
  }

  componentWillUnmount() {
    clearInterval(this.timer)
  }

  onDrag = event => {
    event.preventDefault()
  }

  onDrop = event => {
    event.preventDefault()
    let file = event.dataTransfer.files[0]
    this.setState({ file })
  }

  render() {
    let { file } = this.state
    // Same file object, same URL on every render
    let url = file && blobUrl(file)
    return (
      <div onDragOver={this.onDrag} onDrop={this.onDrop}>
        <p>Drop an image!</p>
        <img src={url} />
      </div>
    )
  }
}
To reuse blob: URLs throughout your React application, just extract blobUrl() to its own utility file and invoke it directly from any component’s render() method! Or better yet, use stateless functional components.
Wrap-Up
JavaScript is well-equipped to deal efficiently with large chunks of memory, but you have to determine the best way to represent them. When possible, it’s best to use Blob URLs to keep them outside the JavaScript VM’s memory. Objects reachable from global variables will never be garbage collected, but WeakMaps are a great way to attach data to objects without keeping them alive.
ES2015 data structures like WeakMaps and ES2017 async functions highlight just how dedicated the JavaScript language is to high-performance modern application development!