Don’t Over React! Render Binary Data with Class.

React December 5, 2017

This post first appeared on the Big Nerd Ranch blog.

Sooner or later, your React web app will probably accept file uploads—perhaps to change out a user’s avatar or share images on a social site.

In modern browsers, the story for working with binary data is downright impressive thanks to objects like File, Blob and ArrayBuffer. You can even store large complex binary data directly in the browser with IndexedDB!

But working with binary data in a sandboxed tab is different from how a backend or native desktop app handles it. If you read a 5 MB image into a String, you will probably crash the tab. Read in 10 images simultaneously and you may crash the browser!

Luckily, JavaScript exposes natively implemented APIs to handle chunks of binary data. With some creativity, you can have the user’s browser pull its own weight, like resizing images on the front-end before upload. But before you create your own React-powered Hipstergram, it’s important to understand the performance implications of binary data in a web app.

Recap: File Objects and Blobs

The browser can’t directly access the file system for security reasons, but users can drop files into the browser with drag-and-drop.

Here’s a barebones React component that accepts a file, like an image:

let Drop = () =>
  <div onDragOver={e => e.preventDefault()}
       onDrop={e => {
         e.preventDefault()
         let file = e.dataTransfer.files[0]
         console.log(file)
       } }
  >
    ...
  </div>

Once the user drags and drops an image onto this <Drop> component, they probably expect to see a thumbnail-sized preview in the browser. The browser can read the file contents into a few formats, like a String or ArrayBuffer, but each image could be 5 MB; drop 10 in the browser and you have 50 MB of strings in memory!

So instead of directly returning a String or ArrayBuffer, the browser returns a Blob object. A Blob is essentially a pointer to a data source: it could point to a file on disk, an ArrayBuffer, streaming data, etc. Specifically, e.dataTransfer.files is an array-like FileList that holds one or more File objects, which are Blobs with some extra metadata, like the source file’s name.
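To make the "pointer to a data source" idea concrete, here is a minimal sketch that builds a Blob by hand instead of receiving one from a drop event. It assumes an environment with a global Blob (a modern browser, or Node 18+); the sample contents are made up for illustration.

```javascript
// A Blob is an opaque handle to bytes; creating or slicing one does not
// decode the contents into a JavaScript string.
const blob = new Blob(['hello, ', 'binary world'], { type: 'text/plain' })

// Size and MIME type are available without reading the contents.
const blobInfo = { size: blob.size, type: blob.type }

// slice() returns a new Blob that refers to a byte range of the original.
const firstFive = blob.slice(0, 5, 'text/plain')

console.log(blobInfo.size, blobInfo.type, firstFive.size)
```

A File dropped by the user can be inspected the same way, with name and lastModified available on top of size and type.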

To display the image in the DOM, e.g. with an <img /> tag, you can ask the browser for an ephemeral URL to the Blob object. This URL stays valid only until the page is unloaded or the URL is explicitly revoked with URL.revokeObjectURL():

...
let file = e.dataTransfer.files[0]
let url = URL.createObjectURL(file)
console.log(url)
// => "blob:http://localhost:3000/266c0711-76dd-4a24-af1f-46a8014204ff"

You can use a blob: URL wherever you would put any other URL—like http://localhost:3000/images/logo.png—and it just works!
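A blob: URL also has a lifecycle worth keeping in mind. The sketch below is not part of the original example: it uses a hand-built Blob as a stand-in for a dropped file and shows the create, use, revoke sequence. It assumes an environment where URL.createObjectURL() accepts a Blob (browsers, or Node 18+).

```javascript
// Hypothetical stand-in for a dropped file.
const image = new Blob(['fake image bytes'], { type: 'image/png' })

// Register the Blob and get back a short string URL that refers to it.
const previewUrl = URL.createObjectURL(image)
console.log(previewUrl.startsWith('blob:')) // true

// ...use previewUrl as an <img> src while the preview is visible...

// When the preview is no longer needed, release the registration.
URL.revokeObjectURL(previewUrl)
```

The string itself is tiny; what it keeps registered is the underlying Blob, which is why forgetting about these URLs can add up.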

The Trouble with “Just Rerender”

How do you use blob: URLs in React? Here’s a simple React app that accepts a dropped image and renders it on screen:

class App extends Component {
  state = { file: null }

  onDrag = event => {
    event.preventDefault()
  }

  onDrop = event => {
    event.preventDefault()
    let file = event.dataTransfer.files[0]
    this.setState({ file })
  }

  render() {
    let { file } = this.state
    let url = file && URL.createObjectURL(file)

    return (
      <div onDragOver={this.onDrag} onDrop={this.onDrop}>
        <p>Drop an image!</p>
        <img src={url} />
      </div>
    )
  }
}

The App component starts without any file; when an image file is dropped onto the <div> element, it updates the state and rerenders with a Blob URL. Easy peasy!

But what happens if this component’s props or state change? Let’s add a counter that changes 10 times a second:

 class App extends Component {
-  state = { file: null }
+  state = { file: null, counter: 0 }

+  refresh = () => {
+    this.setState(({ counter }) => ({ counter: counter + 1 }))
+  }

+  componentDidMount() {
+    this.timer = setInterval(this.refresh, 100)
+  }

+  componentWillUnmount() {
+    clearInterval(this.timer)
+  }

   onDrag = event => {
     event.preventDefault()
   }

   onDrop = event => {
     event.preventDefault()
     let file = event.dataTransfer.files[0]
     this.setState({ file })
   }

   render() {
     let { file } = this.state
     let url = file && URL.createObjectURL(file)

     return (
       <div onDragOver={this.onDrag} onDrop={this.onDrop}>
         <p>Drop an image!</p>
         <img src={url} />
       </div>
     )
   }
 }

This forces React to rerender the <App> component 10 times a second. That’s fine since React is designed to handle this well, but there’s a problem: the blob: URL changes on every rerender! We can confirm this from the Sources panel in Chrome:

[Screenshot: a long list of duplicate blob: URLs]

It seems the inline call to URL.createObjectURL() creates tons of extra blob: URLs that never get cleaned up: we’re leaking memory! Changing the URL on every single rerender also causes the DOM to change, so sometimes the image will flicker since the browser’s caching mechanism doesn’t know the old and new blob: URLs point to the same image.

[Screenshot: high CPU usage]

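At a rerender rate of just 10 times a second, CPU usage explodes to an entire core and memory usage balloons. Garbage collection will eventually catch up with the short-lived objects, but at the cost of even more CPU usage, and the registered blob: URLs themselves stick around until they are revoked or the page unloads.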
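The duplication is easy to reproduce outside of React. This minimal sketch simulates three rerenders with a plain loop and a hand-built Blob standing in for the dropped file; both are assumptions made for illustration, not part of the app above.

```javascript
// Hypothetical stand-in for the dropped file held in component state.
const droppedFile = new Blob(['fake image bytes'], { type: 'image/png' })

// Simulate three rerenders that each call URL.createObjectURL() inline.
const urlsPerRender = []
for (let render = 0; render < 3; render++) {
  urlsPerRender.push(URL.createObjectURL(droppedFile))
}

// Every call registered a separate URL for the same Blob.
const distinctCount = new Set(urlsPerRender).size
console.log(distinctCount) // 3

// Clean up the registrations this sketch created.
urlsPerRender.forEach(url => URL.revokeObjectURL(url))
```

At 10 rerenders a second, the same pattern registers 600 URLs a minute for a single image.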

Solution #1: Memoize in Class Component

For our trivial example, we can introduce an easy fix: just create the Blob URL once and store it in the <App> component’s state:

 class App extends Component {
-  state = { file: null, counter: 0 }
+  state = { url: '', counter: 0 }

   ...

   onDrop = event => {
     event.preventDefault()
     let file = event.dataTransfer.files[0]
-    this.setState({ file })
+    this.setState({ url: URL.createObjectURL(file) })
   }

   render() {
-    let { file } = this.state
-    let url = file && URL.createObjectURL(file)
+    let { url } = this.state

     return (
       ...
     )
   }
 }

That totally works, but only if you plan to do nothing else with the data. After the file is dropped, you will likely need to pass the original Blob object around to other React components, perhaps to store it in IndexedDB or upload it with FormData.

Solution #2: It’s Just an Object, Add a Property!

What if we just passed around the immutable Blob object, but added a url property to it with the memoized Blob URL?

 class App extends Component {
   ...

   render() {
     let { file } = this.state
-    let url = file && URL.createObjectURL(file)
+    let url = file && blobUrl(file)

     return (
       ...
     )
   }
 }

let blobUrl = blob => {
  if (!blob.url) {
    blob.url = URL.createObjectURL(blob)
  }
  return blob.url
}

That one change brings down CPU usage to near zero! But… we violated a design principle by modifying an object—the Blob object—from an API that we don’t own.

Solution #3: Global Variable

What if we passed around the Blob object, but instead of modifying it, we stored the generated Blob URL in a big lookup table that only the blobUrl() function can access?

Sounds like a global variable, right?

let hash = file => `${file.name}:${file.type}:${file.size}`

let urls = {}
let blobUrl = blob => {
  let key = hash(blob)
  if (!urls[key]) {
    urls[key] = URL.createObjectURL(blob)
  }
  return urls[key]
}

It’s a great idea, but difficult to execute because the keys in a Plain Ol’ JavaScript Object must be strings (or Symbols): an object used as a key is simply coerced to a string. So we can only make a best effort at deriving a collision-resistant string key from each Blob object.

While this will likely work for File objects, it won’t do for plain Blob objects: they don’t have a .name property, so the likelihood of a key collision would be much higher.
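Both failure modes can be shown in a few lines. The sketch below uses two made-up Blobs (cat and dog are hypothetical names) that share a type and a byte length, first as object keys directly and then through the same hash() function as above.

```javascript
// Two different Blobs that happen to share a type and a byte length.
const cat = new Blob(['meow!'], { type: 'text/plain' })
const dog = new Blob(['woof!'], { type: 'text/plain' })

// Using the objects themselves as keys: both are coerced to the same string.
const table = {}
table[cat] = 'url-for-cat'
table[dog] = 'url-for-dog'
console.log(Object.keys(table).length) // 1
console.log(table[cat])                // 'url-for-dog'

// Same best-effort key as above: name, type, and size.
const hash = file => `${file.name}:${file.type}:${file.size}`
console.log(hash(cat))               // 'undefined:text/plain:5'
console.log(hash(cat) === hash(dog)) // true
```

In either case the second Blob would silently receive the first Blob’s URL, or overwrite it.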

The only real way to create a unique hash per Blob object is to tag each Blob object with a unique ID, but then we’re back to modifying the Blob object. However, we’re on the right track.

Solution #4: ES2015 Maps

We need a map type that accepts objects as keys. The POJO won’t do that, but the Map datatype introduced in ES2015 will! Each object has a unique identity (think of it as its own place in memory), and a Map looks up object keys by that identity rather than by their contents, so entries are guaranteed to be collision-free!

let urls = new Map()

let blobUrl = blob => {
  if (urls.has(blob)) {
    return urls.get(blob)
  } else {
    let url = URL.createObjectURL(blob)
    urls.set(blob, url)
    return url
  }
}

Boom! But we introduced a subtle problem: we’re leaking memory.

That’s right! In JavaScript we normally don’t manually manage memory, but that doesn’t “free” you from thinking about memory management!

JavaScript engines employ several strategies and heuristics for efficient garbage collection (like mark-and-sweep and generational garbage collection), but we can assume that objects become eligible for garbage collection once they are no longer “reachable.”

The urls variable is declared at the top level, so it stays in scope and reachable during the app’s entire lifetime. All keys and values in a Map stick around until they are explicitly removed. So unless we delete entries from the Map ourselves, the Blob objects and blob: URLs will always be reachable, and they’ll never be garbage collected. We’re leaking memory!
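For comparison, this is roughly what manual bookkeeping with a plain Map would look like. It is a sketch, not a recommendation: releaseBlobUrl() is a hypothetical companion function that every caller would have to remember to invoke.

```javascript
const urls = new Map()

const blobUrl = blob => {
  if (!urls.has(blob)) urls.set(blob, URL.createObjectURL(blob))
  return urls.get(blob)
}

// Hypothetical companion that callers must remember to invoke.
const releaseBlobUrl = blob => {
  if (!urls.has(blob)) return false
  URL.revokeObjectURL(urls.get(blob))
  return urls.delete(blob)
}

const photo = new Blob(['fake image bytes'], { type: 'image/png' })
blobUrl(photo)
console.log(urls.size) // 1, and it stays 1 until someone releases it
releaseBlobUrl(photo)
console.log(urls.size) // 0
```

It works, but it pushes memory management onto every component that shows a preview, which is exactly the chore we were hoping to avoid.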

Solution #5: ES2015 WeakMaps

What if we had a Map datatype that doesn’t prevent the keys from being garbage collected, and automatically deletes the key-value pair once the object becomes unreachable?

That’s precisely what a WeakMap does! It allows us to associate data with an object, but without modifying the original object. A WeakMap behaves like weak references do in Swift and Objective-C. Think of them as a noncommittal friend: “If no one needs you, neither do I.”

-let urls = new Map()
+let urls = new WeakMap()

 let blobUrl = blob => {
   if (urls.has(blob)) {
     return urls.get(blob)
   } else {
     let url = URL.createObjectURL(blob)
     urls.set(blob, url)
     return url
   }
 }

WeakMaps are a great way for third-party libraries to “tag” external objects without modifying them. They’re especially useful for adding application-wide memoization.
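The same pattern generalizes beyond Blob URLs. Here is a minimal sketch of a reusable helper; memoizeByObject() is a hypothetical name, and the version of blobUrl() built on it is just one way to express the function above.

```javascript
// Hypothetical helper: memoize a one-argument function per object argument.
const memoizeByObject = fn => {
  const cache = new WeakMap()
  return object => {
    if (!cache.has(object)) cache.set(object, fn(object))
    return cache.get(object)
  }
}

// blobUrl() rebuilt on top of the helper.
const blobUrl = memoizeByObject(blob => URL.createObjectURL(blob))

const photo = new Blob(['fake image bytes'], { type: 'image/png' })
console.log(blobUrl(photo) === blobUrl(photo)) // true: created once, then reused

// A different Blob object gets its own entry, even with identical contents.
const copy = new Blob(['fake image bytes'], { type: 'image/png' })
console.log(blobUrl(photo) === blobUrl(copy)) // false
```

Two caveats: a WeakMap rejects primitive keys like strings and numbers, so the helper only suits functions that take an object, and a WeakMap cannot be iterated or counted, so there is no way to list what is currently cached.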

Here’s the final solution for performant, flicker-free Blob previews:

let urls = new WeakMap()

let blobUrl = blob => {
  if (urls.has(blob)) {
    return urls.get(blob)
  } else {
    let url = URL.createObjectURL(blob)
    urls.set(blob, url)
    return url
  }
}

class App extends Component {
  state = { file: null, counter: 0 }

  refresh = () => {
    this.setState(({ counter }) => ({ counter: counter + 1 }))
  }

  componentDidMount() {
    this.timer = setInterval(this.refresh, 100)
  }

  componentWillUnmount() {
    clearInterval(this.timer)
  }

  onDrag = event => {
    event.preventDefault()
  }

  onDrop = event => {
    event.preventDefault()
    let file = event.dataTransfer.files[0]
    this.setState({ file })
  }

  render() {
    let { file } = this.state
    let url = file && blobUrl(file)

    return (
      <div onDragOver={this.onDrag} onDrop={this.onDrop}>
        <p>Drop an image!</p>
        <img src={url} />
      </div>
    )
  }
}

To reuse blob: URLs throughout your React application, just extract blobUrl() to its own utility file and invoke it directly from any component’s render() method! Or better yet, use stateless functional components.

Wrap-Up

JavaScript is well-equipped to deal efficiently with large chunks of memory, but you have to determine the best way to represent them. When possible, it’s best to use Blob URLs to keep them outside the JavaScript VM’s memory. Objects reachable from global variables will never be garbage collected, but WeakMaps are a great way to attach data to objects without keeping those objects alive.

ES2015 data structures like WeakMaps and ES2017 async functions highlight just how dedicated the JavaScript language is to high-performance modern application development!

Jonathan Lee Martin

Jonathan is an educator, writer and international speaker. He guides developers — from career switchers to senior developers at Fortune 100 companies — through their journey into web development.