Nick Lockwood
Written by Nick Lockwood
Published 2016-10-12

SwiftFormat (Part 1 of 3)

SwiftFormat solves the tabs vs spaces argument forever! OK, maybe not, but it might help Swift developers to get along. So what is SwiftFormat? Why did I make it? And how does it work?

SwiftFormat is a Swift application for swiftly formatting Swift code, but don’t be put off if you don’t know Swift. If you’re curious about programming languages and how they are parsed, or designing a command-line tool that interacts with your development workflow, this article is for you.

What is SwiftFormat?

SwiftFormat (available here on GitHub) is an open-source command-line application that accepts any valid Swift program as input, then outputs the same program after normalizing whitespace and indentation, stripping trailing semicolons, and tweaking other punctuation in places where it won’t affect the logic.

It takes its style cues from Apple’s own Swift documentation and code examples. The rules are mostly hard-coded, but I’ve added configuration options for cases where the defaults may be contentious.

SwiftFormat doesn’t usually add or remove line-breaks, because programmers often decide to put arguments on one line or split them over several for aesthetic or legibility reasons that a formatter shouldn’t try to second-guess.

In a few cases it does intervene though, for example, Allman-style braces are converted to K&R style, and sequences of multiple blank lines are collapsed into one.

swift_image1

These transformations produce clean, consistent code formatting, regardless of the input. It can’t fix bad logic (at least, not yet), but making it readable is a good first step.

Why format Swift?

After joining Schibsted, I encountered a familiar issue with the codebase: code written by different developers follows different styles of indenting and spacing, and a clash of styles in a project can result in unproductive arguments and endless bikeshedding.

Code style is mostly a cosmetic issue – the compiler doesn’t care – but it can lead to problems such as noise in commits caused by whitespace changes, which makes reviews more difficult and leads to merge conflicts.

In the past, I’ve enforced project style guidelines by code review, or by manually reformatting other developers’ code. This is a terrible waste of human resources, and can lead to bad feeling if people fail to correct style issues, or don’t appreciate others “fixing” their code.

These issues can be solved by using an automated formatter. It eliminates the human effort and depersonalizes the format decisions, so nobody feels like their code is being criticised.

Many programming languages have standard tools for beautifying code, sometimes provided as a secondary function of code linters, which find potential bugs or portability issues. But I only found a handful of such utilities for Swift, most quite limited and either incomplete or seemingly abandoned. Here are a couple of the better ones:

  • SwiftLint – primarily a linter, but with a basic re-formatting function built in
  • Swimat – a fairly advanced formatter, implemented from scratch in Objective-C

I wasn’t sure if the scarceness of such tools was a reflection of the newness of the language or the difficulty of the problem, but regardless, writing a code formatter from scratch seemed like a very time-consuming and ambitious task.

So that’s what I decided to do!

How does it work?

To format code, you must first understand it. Finding and replacing strings can be used for basic transformations, but anything nontrivial requires some understanding of the code’s meaning, and that requires a parser.

Parsing a programming language is an interesting problem in theory, but rarely put it into practice (unless your day-job is working on developer tools). Consequently, it’s not in most programmers’ comfort zone.

I’ve always been fascinated with the implementation of programming languages, and the process of parsing (which has remained largely unchanged since the 1960s).

Parsing is typically performed in two distinct phases:

  • Lexical analysis (or “lexing”) – breaking the text into tokens, such as “operator”, “string”, “identifier”, etc.
  • Semantic analysis – grouping the tokens into semantically meaningful structures, such as loops, functions, classes, etc.

Tokens form a simple list of type/value pairs, whereas semantic structures form a hierarchical tree, called an Abstract Syntax Tree (AST). For example, a class contains a number of functions, which in turn contain a number of statements, which contain a number of expressions, which contain operators and operands, which may be individual tokens, or may themselves be more complex structures, such as inline function definitions or object literals.

Here are some of the tokens and semantic structures in a simple fibonacci function:

tokens-vs-ast-nodes@2x

To implement SwiftFormat I needed a lexer to convert Swift code to tokens for formatting, and some form of semantic analyzer to determine higher-level structures (required for code indenting and other context-specific rules).

Reinventing the Wheel

There is already a perfectly good parser for Swift – the one used by the Swift compiler itself. Apple makes this available via a code library and system service called SourceKit. This was the first place I looked for inspiration.

SourceKit is written in C++, and has a fairly arcane interface. Fortunately, a nice wrapper library called SourceKitten exposes SourceKit’s functionality in a more friendly way – it even has a built-in formatting function.

The built-in formatter does little more than indent the code, and I wanted to implement a much more sophisticated set of rules, requiring direct access to the parsed source code.

SourceKitten can export either the raw tokens or AST as a JSON file, which sounds ideal, but in both cases some of the data was missing, or would require considerable work to extract. Here is a sample of the token output from the fibonacci function:

Although comments are included, whitespace and operators are not, so additional parsing of the text between the listed tokens is required.

I tried exporting the AST, but again it was missing critical elements. Here is what was produced from the same source code (slightly abridged for space reasons):

High-level structures (functions, parameters, statements, comments) are captured, but again, no individual operators and keywords, nor the whitespace between them.

This makes sense, because for many purposes (e.g. static analysis, linting, compiling) this metadata is irrelevant. You would want the parser to strip it out as early as possible. But SwiftFormat needs all of that information, so it can put it back, intact after cleaning up the code.

I could have designed SwiftFormat as an ideal formatter, to take complete control of all formatting and ignore what’s already in the file, but this would require it to be perfect in order to be useful; if it encountered a structure it had no rule for, it would have to abort formatting, since it wouldn’t know what to do with it. To make my task manageable, I wanted to add gradual support for new rules and structures, with the fallback behaviour being to do nothing and leave the original code alone.

Both the tokens and AST are helpfully annotated with character offsets, so I could have tried to create the missing data by cross-referencing both the token and AST output with the original source code (parsing the missing areas myself if necessary), but this was starting to sound like more work than parsing the source myself.

I gave up on using SourceKit as the front-end for my formatter, and started to write a parser from scratch in Swift.

In retrospect this may have been premature – SwiftLint uses SourceKitten, so they must have found workarounds for these issues – but I’m reluctant to rely on huge, complex dependencies that don’t seem to do exactly what I want, especially when I have a pretty good idea how to do it myself.

(Also, a bonus of not relying on SourceKit is not having to worry about this:)

sourcekit-crash@2x

The Low-Down on Top-Down

So, how does one write a parser?

You might think that parsing source code is a problem ideally suited to regular expressions, but regular expressions work best when matching simple, non-recursive patterns, and programming languages are highly recursive in nature. Regular expressions can have a role to play, but we’ll get to that later.

The standard approach to parsing programming languages is to use something called a recursive descent parser, which is a top-down solution involving mutually recursive functions.

In recursive descent parsing, lexical and semantic analysis are typically performed concurrently, with a loop that repeatedly reads the next token, then guesses which AST node it is part of, and dispatches to an appropriate AST node parsing function based on the current context.

For simple languages, the AST node can be determined from the current and previous tokens, without needing to peek ahead at future tokens to disambiguate. For example, here is how a JSON parser might determine what high-level structure it’s looking at:

More complex languages, such as C, often require the parser to pre-fetch one or more tokens in order to disambiguate an expression (the number of pre-fetched tokens required by a parser is referred to as the lookahead value). For example:

Swift’s syntax rules are much more complex than most languages, and often require several tokens of lookahead to disambiguate. A typical such case in Swift would be when trying to distinguish between a “less than” operator (<) and the start of a generic parameter list (e.g. <T, U>).

Given the following tokens, it’s impossible to tell if this is a comparison between the variables Foo and Bar, or if we are creating an instance of the class Foo with a generic argument Bar:

In some cases, it’s not possible to determine this until we see the token after the closing chevron:

Generic parameter lists can be arbitrarily long, so we cannot know many tokens ahead we might need to look (i.e. how many to consume before we begin parsing). The alternative to lookahead is backtracking, where instead of peeking at future tokens, the parser makes an educated guess about the structure it is looking at based only on the current token. If it later finds that it guessed wrong, it rolls back that decision, and tries an alternative.

If the language is highly ambiguous, and the parser frequently guesses wrong, backtracking can be very inefficient. But such situations are rare in Swift, and backtracking is the approach used by Apple’s own SourceKit parser.

That’s enough about the theory. In part 2 we’ll take at look at how all of this is actually implemented in Swift.

Written by Nick Lockwood
Published 2016-10-12