Nick Lockwood
Written by Nick Lockwood
Published 2016-10-12

SwiftFormat (Part 3 of 3)

In part 2 I discussed in detail how SwiftFormat’s parser and formatting rules are implemented. Swift is a very complex language, and although its syntax is fairly regular, there are a lot of edge cases. So how is it possible to write and maintain the rules that handle all of these cases without introducing bugs?

If you’ve done much software development, I suspect you know where this is going…

Note: This is the third-and-final part of a series about SwiftFormat, a formatting tool for Swift source code. The first part can be found here.

Fail Fast

I’ve read many breathless endorsements of Test-Driven Development (TDD), but I am not what you might call a “TDD fanboy”. Like a lot of iOS engineers, I regard unit tests as a necessary evil that I resent having to write; they slow me down, suck the fun out of programming, and get in the way of the serious business of writing apps!

I began SwiftFormat the way I begin most projects: writing a basic implementation of how I thought it should work, and then iterating on it – providing more complex sample input and fixing bugs or refactoring as needed.

That approach lasted about a week before it became too complicated, and fixes for new bugs started re-introducing old ones.

Then I saw the light. For each new bug I would first write a simple test that replicated the problem, and then write the fix for it. The fixes were eventually superseded by sweeping refactors, but the tests remained, and so I could be confident I wouldn’t re-break anything I had already fixed.

After writing tests for (and fixing) all the known bugs, I resumed adding new format rules. But this time, instead of ploughing straight in, I wrote an empty function for each rule, then wrote a test to verify the output was what it should be. And, of course, the test would fail at first, so next I filled in the function so the test passed.

And so, without any conscious decision to do so, I started developing the whole project using TDD.

Tests became my to-do list. Every bug I saw, I wouldn’t fix right away, but instead I’d write a quick test and go back to what I was working on. Every new feature I thought of, I’d write a test and go back to the refactor I was working on.

Everyone knows that developers hate context-switching. A colleague of mine described it as like causing a cache-miss in a CPU – jump away from the thing you were doing for too long and all of the context gets flushed from your short-term memory. When you eventually come back to it, you have to page it all back in from your subconscious.

Writing failing tests helps with this. Instead of fixing problems when you notice them, or implementing features as soon as you think of them, write a quick test and come back to it later.

After the initial release of SwiftFormat, bug reports started coming in from users, and each time, the first thing I did was write a test. This let me process each bug immediately and let the reporter know if it was reproducible long before I actually had time to fix it. It’s like inbox zero for bug reports!

So am I a convert? Will I use TDD for every project?

Probably not. It lends itself incredibly well to a utility like SwiftFormat, where the problem domain is immensely complicated, and every test can be reduced to a simple input and output string. For a GUI app? I’m still not convinced it’s worth testing everything this way.
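To make the “simple input and output string” idea concrete, here is a minimal sketch of such a test. The rule shown, trimTrailingWhitespace, is a hypothetical stand-in written for this example and is not SwiftFormat’s actual implementation; a real test suite would more likely use XCTest than a bare assert.

```swift
// Hypothetical rule, for illustration only: strips trailing spaces and tabs from every line.
func trimTrailingWhitespace(_ source: String) -> String {
    return source
        .split(separator: "\n", omittingEmptySubsequences: false)
        .map { line -> String in
            var line = line
            // Drop characters from the end while they are spaces or tabs.
            while let last = line.last, last == " " || last == "\t" {
                line.removeLast()
            }
            return String(line)
        }
        .joined(separator: "\n")
}

// The test itself is nothing more than an input string and the expected output string.
let input = "let foo = 1  \nlet bar = 2\t\n"
let expected = "let foo = 1\nlet bar = 2\n"
assert(trimTrailingWhitespace(input) == expected)
```

Writing the test first, against an empty version of the function, gives the failing test described earlier; filling in the body makes it pass.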

That said, I didn’t write tests for the command-line frontend of SwiftFormat. All that does is take a file path, load it and then pass the contents to the formatter, which does all the heavy lifting. What could go wrong?

Needless to say, that came back to bite me.

Command & Conquer

When building SwiftFormat, one aspect I didn’t really consider beforehand was the user interface. I assumed it would be a command-line application, but didn’t really think about what that would involve.

As an iOS engineer, I’m used to building touch-based interfaces with a GUI of some description. With a command-line app like SwiftFormat, the “user interface” is a single array of strings, passed to the application in CommandLine.arguments.

The first string in this array (or rather, zeroth, since array indices start at zero) is always the path to the program itself. I erroneously thought this was the working directory at first, but that’s not the case.

The remaining strings represent the space-delimited arguments passed by the user when the application is loaded from the terminal.

You can typically ignore the zeroth argument, unless your app has external dependencies or assets it needs to load (which SwiftFormat doesn’t). The first argument in my initial implementation of SwiftFormat (i.e. the first one actually passed by the user in the terminal) was the path to the Swift file being formatted.

That sounds straightforward, but even this simple interface has a couple of challenges:

  1. In command-line tools, file paths are generally assumed to be relative to the working directory, so we need to find out what that is.
  2. Command-line arguments are just provided as a flat list of strings. No distinction is made between argument names and values.

The first problem can be solved by using the FileManager to get the working (aka “current”) directory, and some URL functions to combine it with the path. The solution is very specific to Apple’s standard libraries, but if you’re interested, here it is:
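A minimal sketch of that approach, which may differ from SwiftFormat’s exact code: expandPath is a hypothetical helper name, and the sketch assumes that any path beginning with a slash is already absolute (it does not handle a leading tilde).

```swift
import Foundation

// Hypothetical helper: resolves a path against a base directory unless it is already absolute.
func expandPath(_ path: String, relativeTo directory: String) -> URL {
    if path.hasPrefix("/") {
        // Already absolute, so the working directory is irrelevant.
        return URL(fileURLWithPath: path).standardizedFileURL
    }
    let base = URL(fileURLWithPath: directory, isDirectory: true)
    return base.appendingPathComponent(path).standardizedFileURL
}

// FileManager reports the working (current) directory of the running process.
let workingDirectory = FileManager.default.currentDirectoryPath
let fileURL = expandPath("Sources/main.swift", relativeTo: workingDirectory)
print(fileURL.path)
```

Passing the directory in as a parameter, rather than reading it inside the helper, keeps the function easy to test.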

The second problem is harder. If you really only need to support one argument, you can simply treat the first argument as the path and leave it at that. But a different approach is required if you intend to support additional arguments later (as I did).

Capture the Flag

Once SwiftFormat reached a certain level of sophistication, it needed some configuration options. The first one? You guessed it – tabs or spaces!

Command-line options, sometimes called keywords or flags, are typically indicated with a leading hyphen followed by a single letter (e.g. -o for output file). I searched for guidelines for flag naming or usage, but there are no hard-and-fast conventions. The closest thing to a standard is the POSIX Utility Argument Syntax specification (which just confirmed what I already knew about hyphen prefixes for flags).

I also found this helpful note about the GNU approach of using a double-hyphen prefix for long-form arguments (e.g. --help, or --version):

“The GNU style uses option keywords (rather than keyword letters) preceded by two hyphens. It evolved years later when some of the rather elaborate GNU utilities began to run out of single-letter option keys (this constituted a patch for the symptom, not a cure for the underlying disease). It remains popular because GNU options are easier to read than the alphabet soup of older styles. GNU-style options cannot be ganged together without separating whitespace. An option argument (if any) can be separated by either whitespace or a single “=” (equal sign) character.”

I decided to follow this practice, making each argument available in both long form (--indent) and short form (-i). In accordance with convention, I would also display the list of available flags when the --help (or -h) flag is passed.

To implement this, I wrote a preprocessor function that takes the flat array of argument strings and, for each argument, applies the following logic:

  • If the previous value was quoted or escaped, the argument is appended to that value.
  • If it begins with -- it is treated as a long-form option and matched against all known arguments.
  • If it begins with - it is treated as a single-letter option and matched as a prefix against all known arguments (an error is reported if there’s more than one match).
  • If it does not begin with - it’s assumed to be a value. If the previous argument was an option name it’s assigned as the value for that option; otherwise it is treated as an anonymous value and assigned a numeric index.

The processed arguments are then returned as a dictionary of name/value pairs which can then be handled more conveniently by the application logic. The implementation looks like this:
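A simplified sketch of the logic described above, not SwiftFormat’s actual source. The ArgumentError type and the knownNames parameter label are assumptions made for this example, and the sketch leaves out the quoted/escaped case (the shell normally resolves quoting before the arguments reach the program).

```swift
// Hypothetical error type for this sketch.
enum ArgumentError: Error {
    case unknownOption(String)
    case ambiguousOption(String)
}

// Turns a flat argument list into name/value pairs.
// Anonymous values are keyed by their numeric index ("0", "1", ...).
func preprocessArguments(_ args: [String], knownNames: [String]) throws -> [String: String] {
    var result = [String: String]()
    var anonymousCount = 0
    var pendingName: String? // option name still waiting for a value

    for arg in args {
        if arg.hasPrefix("--") {
            // Long-form option: must match a known name exactly.
            let name = String(arg.dropFirst(2))
            guard knownNames.contains(name) else {
                throw ArgumentError.unknownOption(arg)
            }
            result[name] = ""
            pendingName = name
        } else if arg.hasPrefix("-") {
            // Short option: matched as a prefix against all known names.
            let flag = String(arg.dropFirst())
            let matches = knownNames.filter { $0.hasPrefix(flag) }
            guard let name = matches.first else {
                throw ArgumentError.unknownOption(arg)
            }
            guard matches.count == 1 else {
                throw ArgumentError.ambiguousOption(arg)
            }
            result[name] = ""
            pendingName = name
        } else if let name = pendingName {
            // Value for the option that came immediately before it.
            result[name] = arg
            pendingName = nil
        } else {
            // Anonymous value, e.g. the program path or a file path.
            result[String(anonymousCount)] = arg
            anonymousCount += 1
        }
    }
    return result
}
```

With this sketch, the program path from CommandLine.arguments ends up under the key "0" and the first anonymous argument typed by the user under "1", while an option passed without a value (such as --help) maps to an empty string.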

My original implementation of the command-line argument preprocessor was simpler (and didn’t handle all the edge cases). It worked fine with my own files so, being test-averse, I didn’t write any formal tests for it. When I did eventually get around to writing a couple of tests for it, guess what? I found a bunch of bugs.

Pipe Dream

One early request from SwiftFormat users was support for processing source code from STDIN and writing it to STDOUT, as an alternative to accepting a file and modifying it.

This would allow SwiftFormat to be used as part of a chain of commands, joined by UNIX pipes. A trivial example usage might be to use the cat command to load a source file, then pipe it through SwiftFormat as follows:

This would load the some-file.swift file, format it (using tabs, not spaces), then output it to STDOUT to appear in the terminal (unless piped into another app for further processing).

Reading from STDIN is typically done line-by-line in Swift by calling the readLine() function repeatedly in a loop until it returns nil. The code looks like this:
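A minimal sketch of such a loop. The wrapper function readAllLines and its nextLine parameter are hypothetical, added here only so that the loop can be exercised without a real STDIN; the standard library’s readLine(strippingNewline:) is used by default.

```swift
// Hypothetical wrapper: accumulates lines until the line source returns nil (end of input).
func readAllLines(using nextLine: () -> String? = { readLine(strippingNewline: false) }) -> String {
    var source = ""
    while let line = nextLine() {
        // strippingNewline: false keeps the line endings, so the text is preserved as typed.
        source += line
    }
    return source
}
```

Calling readAllLines() with no arguments reads from STDIN until it is closed.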

But this poses a design problem: when you pipe text into a command-line app from cat (or any other command), a loop like the one above will consume all available input and then exit. But if you run the app directly without piping in any input, readLine() will pause and wait for user input, and the loop will not terminate on its own. Pressing Return will simply run another iteration, and the only ways to escape the loop are to send an end-of-file with “Ctrl-D” or to kill the whole app with “Ctrl-C”.

I designed the SwiftFormat command-line app so that if launched without any arguments it would display the help page and then quit. But to support UNIX pipes I would need it to call readLine() instead.

There seemed to be no way to tell if it had been launched directly or via a pipe; how could I support both use cases?

The (rather hacky) solution was to call readLine() on a background thread with a timeout. If the app doesn’t receive any input from STDIN within a fraction of a second, it shows the help and then it quits. If input is received, the app assumes it’s been piped and keeps reading lines until it gets nil.
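One way to sketch this idea uses a dispatch queue and a semaphore; this is an illustration of the technique and not necessarily how SwiftFormat does it. The names readFirstLine and LineBox, and the timeout value in the usage note, are assumptions made for this example.

```swift
import Foundation

// Small reference-type box so the background closure can hand its result back.
final class LineBox: @unchecked Sendable {
    var value: String?
}

// Hypothetical helper: returns the first line of input, or nil if none arrives
// before the timeout (or if the input is already at end-of-file).
func readFirstLine(
    timeout: TimeInterval,
    reader: @escaping @Sendable () -> String? = { readLine(strippingNewline: false) }
) -> String? {
    let box = LineBox()
    let semaphore = DispatchSemaphore(value: 0)
    DispatchQueue.global().async {
        // This call may block indefinitely if nothing is piped in.
        box.value = reader()
        semaphore.signal()
    }
    if semaphore.wait(timeout: .now() + timeout) == .timedOut {
        return nil // no input in time: the caller can show the help and quit
    }
    return box.value
}
```

A caller might use readFirstLine(timeout: 0.1) and, on a non-nil result, carry on reading the remaining lines in an ordinary loop. The blocked background read is simply abandoned when the process exits.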

It would never previously have occurred to me that a simple, noninteractive command-line application might need to be multithreaded!

Automated Automagic

A command-line utility to format code is all very well, but running it manually soon gets tedious, and it still wastes a developer’s time.

Ideally, it should run automatically every time the code is modified. There are a few possible execution points that make sense:

  • Every time the code is edited
  • Every time the code is saved
  • Every time the code is compiled
  • Every time the code is checked in

Reformatting code every time it is edited is possible, but requires integration with your code editor, which for Swift would mean writing an Xcode plugin. Apple has a history of making breaking changes to the Xcode plugin API (in fact, they just broke it again in Xcode 8), so this option wasn’t very appealing. It’s also not clear that it’s a good idea – reformatting code as it’s being typed might cause confusion.

The code can be reformatted every time it is saved by running a process such as Watchman to check when files are updated. I wanted to avoid additional dependencies, so I ruled this out as a solution (though it might work well for other users).

Reformatting code at compile time can be done by adding a Run Script Build Phase in Xcode that runs SwiftFormat each time the app is compiled. This involves committing both the script and the SwiftFormat binary to the repository.

Reformatting code every time it’s checked in can be done with a git pre-commit hook, or with a step that runs after the commit on the CI server. Pre-commit hooks are a good option, but aren’t checked in with the source code, so must be set up individually by each developer on their machine. Running the formatter on the CI server means you must be very confident that it won’t break anything in the code, because if it does you can’t fix it locally.

I opted for the Run Script Build Phase. It has the benefit of ensuring that all developers on the project use the same version, while still letting them disable it locally if something goes wrong.

The script is run immediately before the code is compiled, ensuring that the code you are compiling is the same code you see. This does introduce an unexpected complication though: I originally assumed SwiftFormat would only ever see valid programs, but when the script runs before the code is compiled there’s no guarantee of that.

Developers sometimes press “build” just to test if their code is correct, and if they’ve missed a brace somewhere SwiftFormat has the potential to scramble the source code, get stuck in a loop, or crash due to “impossible” code structure.

For this reason, I added some basic validation to the tokenizer; it will abort if it encounters mismatched braces, or an unexpected end of file. This prevents SwiftFormat attempting to format code that has any serious syntax errors. Better to fail early than do the wrong thing.
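As a rough illustration of this kind of check, the sketch below walks the raw characters of a source string and tracks the closing delimiter it expects next. It is a simplification: SwiftFormat’s validation works on tokens inside its tokenizer, whereas this version would be confused by brackets inside string literals or comments. The names validateScopes and ValidationError are hypothetical.

```swift
// Hypothetical error type for this sketch.
enum ValidationError: Error {
    case unexpectedClosing(Character)
    case unexpectedEndOfFile
}

// Throws if brackets are mismatched or if the source ends with a scope still open.
func validateScopes(in source: String) throws {
    let closerFor: [Character: Character] = ["(": ")", "[": "]", "{": "}"]
    let closers = Set(closerFor.values)
    var expectedClosers: [Character] = [] // stack of closers we are waiting for

    for character in source {
        if let closer = closerFor[character] {
            expectedClosers.append(closer)
        } else if closers.contains(character) {
            // A closer must match the most recently opened scope.
            guard expectedClosers.popLast() == character else {
                throw ValidationError.unexpectedClosing(character)
            }
        }
    }
    guard expectedClosers.isEmpty else {
        throw ValidationError.unexpectedEndOfFile
    }
}
```

If the function throws, the formatter can leave the file untouched and report the problem instead of attempting to format it.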

Lessons Learned

Lesson 1: Don’t reinvent the wheel

Just kidding! Writing my own parser was one of the best experiences in the project. I learned a lot about Swift and about parsing in general; and while I encountered many bugs, I never felt anything could not be fixed (thanks in part to lesson 2 below).

I don’t necessarily advocate reinventing the wheel for the sake of it, but I believe that you should have complete control over the core functionality of your app. If you are building an image processing app, don’t use a third-party image library; if you are building a code formatter, don’t use a third-party parser.

Users don’t care whether a bug is your fault or the fault of some library author – if a piece of functionality is critical to your app then you should either write it yourself or know it so well that you could have written it (and can fix it when it goes wrong).

Lesson 2: Test all the things

Writing tests isn’t much fun, but working on fully tested code is so much more enjoyable than the alternative. I’m proud of myself for realizing this soon enough in SwiftFormat’s development to ensure that code coverage of the tokenizer and rules is close to 100%. Despite that, I failed to make the logical leap of applying it to other areas like the command-line app itself.

UI in a traditional GUI app can be hard to test. The trick is to move as much of the UI logic out of the UI flow – and into pure functions – as possible. Pure functions are easy to reason about and write tests for. If you’re writing a command-line app (as I was) there’s no excuse, because the whole damn app is a pure function!

Lesson 3: Eat your own dogfood

More than once, I found bugs in SwiftFormat that the unit tests had missed by running it on our internal codebase at Schibsted. Automated tests are fantastic, but they can only test the scenarios you’ve thought of. Swift is a complex language and I continued to find new, valid code patterns I’d never seen, long after the first release.

Lesson 4: Publish early

Second only in importance to dogfooding is beta testing. The time from starting development to the first release of SwiftFormat was about three weeks, but I made more improvements in the week post-release than in the week prior, thanks to user feedback.

The concept of shipping a Minimum Viable Product (MVP) has drawn a lot of criticism in recent years (thanks to a lot of dubious decisions by product companies about what qualifies as “viable”) but there’s the kernel of a good idea in there; you just need to manage expectations.

If you think your app sucks, it probably does. But if you think it’s working well, get it in front of people as soon as possible, because they will find bugs and usability issues you never considered.

Make it clear this is not a final product, but get it out there. I released SwiftFormat as version 0.1 (not 1.0) and now, more than a dozen releases later, it still hasn’t quite reached the feature set envisioned for the first release. But if I’d waited until it did before shipping it, I’d never have received the critical feedback needed to get it to where it is now.

Conclusion

That concludes this three-part series on SwiftFormat. I hope you’ve found it interesting and that you’ll be inspired to build your own tools to improve your workflow.

If you’d like to contribute to SwiftFormat, or if you find bugs or have feature suggestions, please open an issue (or even better, a pull request) on the SwiftFormat GitHub page.
