A look at Trello: adopting GraphQL and Apollo in a legacy application

Trello is currently undergoing a big technological shift on the frontend, and an important part of this shift has been our approach to incrementally adopting GraphQL. We've used a client-side GraphQL schema (in a creative way) to quickly unlock the benefits of GraphQL. If you're considering moving to GraphQL in your product, then this would be a great place to start before investing time and energy into a server-side schema.

For the past ten years, the development teams at Trello have been writing features in Backbone and CoffeeScript backed by a solid REST API and WebSockets. This was bleeding edge when we first started working in this architecture, but as the codebase has grown, we have begun to reach the limits of its capabilities. For example, it became very difficult at times to understand how a piece of data ended up in our client-side cache, and whether or not it was stale. Not to mention, our client-side cache was entirely proprietary (built on top of Backbone models). Over the past two years, we've been working on modernizing our architecture by moving to React and TypeScript, and as part of that we began to explore the idea of adopting GraphQL (and Apollo).

We spent a lot of time considering how we'd implement GraphQL before making any code changes, and we developed a good understanding of what our end-state architecture would look like. In practice, though, we didn't know how we'd move from where we were into this new world. With tens of millions of users and 250,000 lines of code, it was critical that we could move towards that architecture incrementally, whilst still delivering new features and keeping regression risk low. We realized that one of the areas where we'd struggle would be GraphQL itself: we now had many developers with extensive React experience, but almost no one who'd written a production-level GraphQL schema. We needed to begin to understand what impact our schema decisions would have, and how we could consume the schema from React, without locking ourselves into a publicly supported GraphQL API.

Why did we choose to adopt GraphQL?

There are some very sizable benefits to adopting a GraphQL schema in your frontend, even if the implementation itself is backed by REST. The first is knowing the shape of the data being requested ahead of time. In REST, you can have an API endpoint that has existed for many years (let's use /1/board/{boardId} as an example) that has grown organically over time to return more and more data. It becomes increasingly difficult to say, with confidence, the subset of that data that is actually required by the frontend feature making that request. In Trello, as we started to convert some of our most used REST requests to their equivalent GraphQL queries, it became abundantly clear that we were over-fetching a lot of data but it was very difficult to tell which data wasn't actually required by the UI.

The second is type safety. GraphQL schemas are strictly typed, so it becomes trivial to generate static types for the response of any given query (in Trello we are using graphql-code-generator). When combined with an editor like VSCode, these static types give both an excellent developer experience and a much higher level of confidence when requesting remote data. It can also eliminate an entire class of "contract based" testing that ensures your API is delivering the data that is promised by its specification, as the type generation for queries will fail (at build time instead of runtime) if the schema has changed in a way that prevents it from being able to satisfy a query. This type safety greatly simplifies a huge range of complex issues when integrating between a frontend and an API, as your app won't even compile if any of its queries can't be satisfied by your GraphQL schema.
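As a rough illustration of what that buys you, the sketch below hand-writes types similar in spirit to what a generator might emit for a small query. The query, the type names and the describeCard helper are all hypothetical, and real generator output will look different. The point is only that code consuming the result is checked against the query's shape at compile time.

```typescript
// Hypothetical query the types below correspond to:
//
//   query CardDescription($cardId: ID!) {
//     card(id: $cardId) {
//       description
//     }
//   }

// Hand-written approximation of generated types (not actual generator output).
interface CardDescriptionQueryVariables {
  cardId: string;
}

interface CardDescriptionQuery {
  // Nullable, because a GraphQL field is nullable unless the schema marks it with `!`.
  card: { description: string | null } | null;
}

// A consumer of the query result. Accessing a field that the query did not
// request (for example `data.card.name`) would be a compile-time error.
function describeCard(data: CardDescriptionQuery): string {
  if (data.card === null || data.card.description === null) {
    return '(no description)';
  }
  return data.card.description;
}

const variables: CardDescriptionQueryVariables = { cardId: 'abc123' };
const result: CardDescriptionQuery = { card: { description: 'Write the blog post' } };
console.log(variables.cardId, describeCard(result));
```

If the schema later dropped or renamed description, regenerating the types would make describeCard fail to compile, which is the build-time failure described above.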

Why was Apollo the obvious choice for Trello?

Choosing a frontend framework to interact with GraphQL could be an article of its own. Our family at Atlassian had a lot of experience with moving to GraphQL incrementally, with Apollo as the framework of choice, so it made sense for us to leverage this knowledge and make the same choice. Obviously, if there had been a compelling downside to Apollo we would have considered the alternatives (Relay, Amplify, urql), but it had already been used heavily in production with no significant drawbacks.

Apollo can be thought of as the "glue" that holds our components and our GraphQL queries together. It comes with some pretty big benefits out of the box that make it an excellent choice for managing your remote data.

The biggest benefit by far is declarative data fetching. In your typical REST application (backed by something like Redux or MobX), components are responsible for imperatively requesting data when they are mounted, or when some interaction occurs in the UI. This commonly leads to the following situation:

[Figure: declarative data fetching example]

In this scenario we have two separate components backed by the same data. The question that immediately follows is: which component is responsible for triggering the fetch from your REST API? You can end up with many of your components containing complicated logic in their lifecycle methods like this:

❌ This is bad

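import React from 'react';

class CardDescription extends React.Component {
  componentDidMount() {
    // `dispatch` is assumed to be injected as a prop (e.g. by a store connector)
    const { card, cardId, dispatch } = this.props;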

    // If we don't have a card loaded in our cache, fetch the full card
    if (!card) {
      dispatch(loadFullCard(cardId));
      return;
    }

    // If we do have a card in our cache, but no description, we want to load it
    if (!card.description) {
      dispatch(loadCardDescription(cardId));
      return;
    }
  }

  render() {
    const { card, isLoading } = this.props;
    if (isLoading) {
      return null;
    }
    return <span>{card.description}</span>;
  }
}

This is particularly troublesome in Trello, as much of the app is still written in Backbone and CoffeeScript, but we want to safely be able to write new, isolated, "leaf" components with confidence that they will (if necessary) fetch the data they require for rendering. Luckily, Apollo does just this. Each component specifies its query, containing only the data it needs for rendering and Apollo manages making the requests and caching the data for you.

✅ This is much better

import { gql, useQuery } from '@apollo/client';

const CARD_DESCRIPTION_QUERY = gql`
  query CardDescription($cardId: ID!) {
    card(id: $cardId) {
      description
    }
  }
`;

const CardDescription = ({ cardId }) => {
  // useQuery takes the query document first, then an options object
  const { data, loading } = useQuery(CARD_DESCRIPTION_QUERY, {
    variables: { cardId },
  });
  if (loading) {
    return null;
  }

  return <span>{data.card.description}</span>;
};

In this example, all the potentially complicated logic of checking whether you have the data required to render your component (and whether a fetch might be required) is managed entirely by Apollo. The end result is that we can write a new component that specifies its data requirements, and mount it anywhere in the app with confidence that it will render correctly, even if the data wasn't present when the component was mounted. Not to mention, we also don't need to maintain all the code required for managing a cache anymore.

We now had a good picture of what we wanted our components to look like, and a vision of how this could be solved with Apollo and GraphQL, but how could we achieve this without investing in a complete re-write of our REST API?

Using a client-side schema to incrementally adopt GraphQL and Apollo

Wrapping an existing REST API server to support GraphQL can be a pretty significant undertaking. What if there are costly refactors that need to be made (around authentication for example)? What if the application you work on is split across multiple teams? What if there are multiple upstream services required by the frontend that need to be consolidated behind a single GraphQL API? The good news is that you don't have to wait!

There are many advantages to adopting a server-side GraphQL solution, including defense against over-fetching and a more explorable API for third party consumers. However, many of the advantages in the frontend can still be enjoyed using an existing REST API (or any remote data source). How? By wrapping the API with a client-side GraphQL-based library (like apollo-client). In fact, this approach can actually be beneficial when compared to diving straight into a server-side solution for a few reasons:

✅ Starting with a client-side solution results in a faster developer loop, where schema changes are applied, consumed and contract-tested all in the same PR.

✅ The GraphQL schema can more easily be built incrementally based on the requirements of the frontend, rather than trying to convert your entire REST API into a GraphQL-based solution in one go.

✅ A client-side schema gives a great "starting point" should you eventually decide to take the plunge and invest in a real GraphQL server.

There are a few ways to go about implementing a client-side GraphQL approach, the two most popular being apollo-link-rest and local resolvers.

apollo-link-rest allows you to construct a client-side schema, using the @rest directive to provide information about how each query should be executed via REST, e.g.:

query MyBoards {
  boards @rest(type: "Board", path: "/1/board/") {
    name
  }
}

This is the "lowest touch" solution to getting up and running with Apollo. It's a great way to start reaping the benefits of Apollo's caching and declarative data-fetching. However, there are a few drawbacks to this approach.

The first is that the structure of your REST API will end up directly impacting the structure of your GraphQL schema. This might be okay in some cases, but often this can result in making compromises to your schema that wouldn't have manifested in a server-side schema. We ideally want our schema to be directly portable to a server-side solution at some point, so allowing these compromises in the short-term can end up coming back to bite us in the future.

Secondly, apollo-link-rest is a "leaky" abstraction, in the sense that your queries are aware that there is no GraphQL server involved: the @rest directive, as in boards @rest(type: "Board", path: "/1/board/"), appears in the query itself.

Again, this makes it more difficult to untangle should you plan on making the jump to a server-side solution in the future. In an ideal world, our components (and queries) wouldn't have to change at all if we were to adopt a GraphQL server.

Lastly, with "nested resources", some queries can get very expensive. Take the following query:

query MyBoards {
  boards {
    name
    lists {
      name
      cards {
        name
        dueDate
      }
    }
    members {
      username
      email
    }
  }
}

Using apollo-link-rest for a query like this would end up "fanning out" and potentially making many individual REST API requests, which can obviously be quite detrimental to performance (remember that this is client to server, not server to server). Thankfully there exists a solution that gives us an escape hatch for these issues.
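To put a rough number on that fan-out, here is a small sketch that counts requests for the MyBoards query above. It assumes a hypothetical REST API in which every nested field needs its own request per parent; the countFanOutRequests helper and the BoardSummary shape are made up for illustration, and the real count depends on which endpoints are available.

```typescript
// Each entry describes one board by the number of lists it contains.
interface BoardSummary {
  listCount: number;
}

// Assumed request pattern for the MyBoards query when every nested field
// is resolved by its own REST call:
//   1 request for the collection of boards
//   + per board: 1 request for its lists and 1 for its members
//   + per list: 1 request for its cards
function countFanOutRequests(boards: BoardSummary[]): number {
  let total = 1;
  for (const board of boards) {
    total += 2 + board.listCount;
  }
  return total;
}

// Ten boards with five lists each.
const tenBoards: BoardSummary[] = [];
for (let i = 0; i < 10; i++) {
  tenBoards.push({ listCount: 5 });
}
console.log(countFanOutRequests(tenBoards)); // 1 + 10 * (2 + 5) = 71
```

Under these assumptions a single query for a modest account already costs dozens of round trips from the browser.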

Achieving greater flexibility with local resolvers

Apollo Client now provides tools for managing client-side state out of the box. Typically, this solution is used for storing/accessing local UI state using Apollo, but it also works extremely well for querying a REST API. The same mechanisms that exist in a GraphQL server (a schema paired with resolvers) are used for managing this data, so the end result is something that much more closely resembles a server-side GraphQL solution.

Effectively, this approach boils down to:

  1. Writing a GraphQL schema (the same way you would if it existed on the server)
  2. Writing local resolvers that fetch the requested data from your REST API
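
For the first step, a client-side schema might look something like the sketch below. This is not Trello's actual schema: the type names, the BoardFilter enum and the nullability choices are assumptions inferred from the example queries in this article. It is kept as a plain string in GraphQL's schema definition language so that it could later be moved to a server largely unchanged.

```typescript
// Hypothetical client-side schema, inferred from the queries in this article.
const typeDefs = `
  enum BoardFilter {
    open
    closed
    all
  }

  type List {
    id: ID!
    name: String
  }

  type Board {
    id: ID!
    name: String
    lists: [List!]
  }

  type Organization {
    id: ID!
    name: String
    displayName: String
  }

  type Member {
    id: ID!
    fullName: String
    boards(filter: BoardFilter): [Board!]
    organizations: [Organization!]
  }

  type Query {
    member(id: ID!): Member
  }
`;

console.log(typeDefs.trim());
```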

Now your components are entirely shielded from the fact that you are wrapping a REST API on the client, with queries looking like this:

query {
  member(id: "me") @client {
    id
    fullName
    boards(filter: open) {
      name
      lists {
        name
      }
    }
    organizations {
      name
      displayName
    }
  }
}

Which, apart from the @client directive, are exactly what they would look like with a GraphQL server. The query is then satisfied using your local resolvers, which might look something like this:

// Each resolver is async because it awaits a REST request.
const resolvers = {
  Query: {
    member: async (_, args) => {
      const results = await fetch(`/1/member/${args.id}`);
      const member = await results.json();
      return member;
    },
  },
  Member: {
    // The first argument is the parent object returned by Query.member
    boards: async (member, args) => {
      const results = await fetch(`/1/member/${member.id}/boards`);
      const boards = await results.json();
      return boards;
    },
    organizations: async (member, args) => {
      const results = await fetch(`/1/member/${member.id}/organizations`);
      const organizations = await results.json();
      return organizations;
    },
  },
  Board: {
    lists: async (board, args) => {
      const results = await fetch(`/1/boards/${board.id}/lists`);
      const lists = await results.json();
      return lists;
    },
  },
};

This approach is far more flexible than apollo-link-rest, and allows you to encapsulate all knowledge of the REST API within your resolvers. Additionally, if your server implementation is based on JavaScript, this can become a great starting point for a Node-based GraphQL server. But what about the performance issues we mentioned earlier, related to the "fanning out" of requests when querying nested resources?

Trello's Solution

Luckily for us, Trello's REST API actually supports many of the features that make a GraphQL endpoint so appealing, namely:

  1. Field narrowing (e.g. https://trello.com/1/members/me?fields=username,email)
  2. Nested resource expansion (e.g. https://trello.com/1/members/me?boards=open&board_fields=name)

A request to a REST API that supports field narrowing and nested resource expansion has a lot in common with a GraphQL query. With this in mind, we were able to write a "GraphQL query → REST API URL" translation layer that turns a query like this:

query {
  member(id: "me") @client {
    id
    fullName
    boards(filter: open) {
      name
      lists {
        name
      }
    }
    organizations {
      name
      displayName
    }
  }
}

Into a single REST API request like this:

https://trello.com/1/member/me?fields=id,fullName&boards=open&board_fields=name&board_lists=all&board_list_fields=name&organizations=all&organization_fields=name,displayName
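
To make the target of that translation concrete, here is a minimal sketch of how a flat set of query parameters could be serialized into a URL of this shape. The buildRestUrl helper and the QueryParams type are hypothetical names for illustration, not part of Trello's code; the parameter names are taken from the URL above.

```typescript
// Query parameters, where list-valued parameters are joined with commas.
type QueryParams = { [key: string]: string | string[] };

function buildRestUrl(baseUrl: string, params: QueryParams): string {
  const pairs = Object.keys(params).map((key) => {
    const value = params[key];
    const values = Array.isArray(value) ? value : [value];
    // Encode each value on its own so the separating commas stay readable.
    const encoded = values.map((v) => encodeURIComponent(v)).join(',');
    return `${encodeURIComponent(key)}=${encoded}`;
  });
  return pairs.length > 0 ? `${baseUrl}?${pairs.join('&')}` : baseUrl;
}

const url = buildRestUrl('https://trello.com/1/member/me', {
  fields: ['id', 'fullName'],
  boards: 'open',
  board_fields: ['name'],
  board_lists: 'all',
  board_list_fields: ['name'],
  organizations: 'all',
  organization_fields: ['name', 'displayName'],
});

console.log(url);
```

With these inputs the helper reproduces the request URL shown above, so the remaining problem is deriving the parameter object from the GraphQL query.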

Translating a GraphQL query into a REST API request

We achieve this by using our own generic local resolver to traverse the GraphQL query and generate a single REST URL with all of the required query parameters. A single data structure specifies the query parameters required to "expand" the nested resources, and to narrow the requested fields down to only what is required by the GraphQL query. Here's a small snippet of the data structure that drives this traversal (including the configuration that makes the above query possible):

/**
 * This data structure represents valid 'chains' of nested resources according
 * to the Trello API.
 * See https://developers.trello.com/reference#understanding-nested-resources
 *
 * @name
 * Represents the node's name as it would appear in a graphql query. For
 * example:
 *
 * board {
 *   cards {
 *     checklists
 *   }
 * }
 *
 * Would be expected to match a 'path' down this tree according to the `name`
 * property.
 *
 * @nodeToQueryParams
 * A function which is called when parsing a graphql query into query params
 * for a REST API request. It's given a FieldNode and expected to return all
 * the necessary query params to satisfy the data for that given node
 *
 * @nestedResources
 * Recursive property used to define the 'tree' of nested resources according to
 * the above.
 *
 */
const VALID_NESTED_RESOURCES: NestedResource[] = [
  {
    name: 'member',
    nodeToQueryParams: (node) => ({
      fields: getChildFieldNames(node),
    }),
    nestedResources: [
      {
        name: 'organizations',
        nodeToQueryParams: (node) => ({
          organizations: getArgument(node, 'filter') || 'all',
          organization_fields: getChildFieldNames(node),
        }),
      },
      {
        name: 'boards',
        nodeToQueryParams: (node) => ({
          boards: getArgument(node, 'filter') || 'all',
          board_fields: getChildFieldNames(node),
        }),
        nestedResources: [
          {
            name: 'lists',
            nodeToQueryParams: (node) => ({
              board_lists: getArgument(node, 'filter') || 'all',
              board_list_fields: getChildFieldNames(node),
            }),
          },
        ],
      },
    ],
  },
];

The important part here is nodeToQueryParams. This is a function that is given a node of the query (this represents a portion of the GraphQL query), and returns all of the query parameters required to expand that nested resource. We have a few utilities (getArgument and getChildFieldNames) that make extracting the information from the GraphQL query node trivial.
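
To give a feel for how such a traversal could work, here is a self-contained sketch. It is not Trello's implementation: SimpleFieldNode is a simplified stand-in for a GraphQL FieldNode, queryToParams is a hypothetical name, and the bodies of getArgument and getChildFieldNames are guesses at their behavior based on the example URL (in particular, that only child fields without their own selections count as field names).

```typescript
// Simplified stand-in for a GraphQL FieldNode (an assumption for this sketch).
interface SimpleFieldNode {
  name: string;
  args?: { [name: string]: string };
  children?: SimpleFieldNode[];
}

type QueryParams = { [key: string]: string | string[] };

interface NestedResource {
  name: string;
  nodeToQueryParams: (node: SimpleFieldNode) => QueryParams;
  nestedResources?: NestedResource[];
}

// Returns the value of a named argument, or undefined if it is absent.
function getArgument(node: SimpleFieldNode, name: string): string | undefined {
  return node.args ? node.args[name] : undefined;
}

// Returns the names of child fields that have no selections of their own.
// Assumption: children with selections are nested resources, which are
// handled by their own nodeToQueryParams instead.
function getChildFieldNames(node: SimpleFieldNode): string[] {
  return (node.children || [])
    .filter((child) => !child.children || child.children.length === 0)
    .map((child) => child.name);
}

// Walks the query and the resource tree together, merging query params.
function queryToParams(node: SimpleFieldNode, resources: NestedResource[]): QueryParams {
  const resource = resources.find((candidate) => candidate.name === node.name);
  if (!resource) {
    return {}; // plain field, already covered by the parent's field list
  }
  let params: QueryParams = { ...resource.nodeToQueryParams(node) };
  for (const child of node.children || []) {
    params = { ...params, ...queryToParams(child, resource.nestedResources || []) };
  }
  return params;
}

// Trimmed-down configuration in the same shape as the snippet above.
const resources: NestedResource[] = [
  {
    name: 'member',
    nodeToQueryParams: (node) => ({ fields: getChildFieldNames(node) }),
    nestedResources: [
      {
        name: 'boards',
        nodeToQueryParams: (node) => ({
          boards: getArgument(node, 'filter') || 'all',
          board_fields: getChildFieldNames(node),
        }),
      },
    ],
  },
];

// Equivalent of: member(id: "me") { id fullName boards(filter: open) { name } }
const query: SimpleFieldNode = {
  name: 'member',
  args: { id: 'me' },
  children: [
    { name: 'id' },
    { name: 'fullName' },
    { name: 'boards', args: { filter: 'open' }, children: [{ name: 'name' }] },
  ],
};

console.log(queryToParams(query, resources));
```

Keeping the API-specific knowledge in the configuration means the traversal itself stays generic: supporting another nested resource is a matter of adding an entry to the tree rather than writing a new resolver.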

This diagram shows, at a high level, how a single GraphQL query flows from the component to the server:

[Figure: client-side query flow]

This solution has allowed us to move forward with most of the benefits of a "real" GraphQL server, without needing to invest the time or resources to build it out.

✅ Declarative data fetching in our UI

✅ Type safety and "automatic" contract testing of the schema and its queries

✅ Protection from over-fetching data (using field-narrowing in our REST API)

✅ Clear transition path for migrating our client-side schema to the server

✅ An environment that makes it easy to learn GraphQL without the need to support third-party consumers

This also comes with some drawbacks, but they are minor in comparison (and disappear once you move to a server-side GraphQL schema):

❌ Cannot be consumed by other clients (e.g. Mobile, PowerUp developers)

❌ Performance implications of client to server network latency

❌ The page weight implications of additional client libraries and schema information

In the six months since we started down the path of GraphQL adoption, we have reached roughly 80 percent coverage of our REST API with our schema, and shipped multiple new features to production. Overall the feedback from Trello developers has been overwhelmingly positive about both the developer experience and the simplicity of the resulting React components. Our vision for the next 18 months is that the client-side schema will have stabilized and we'll actively move the schema to the server for the benefit of other consumers.

We'd love to hear your feedback on our approach, and whether you've tried something similar, or had success with an alternative approach!