Alok Menghrajani
Previously: security engineer at Square, co-author of HackLang, put the 's' in https at Facebook. Maker of CTFs.
This blog does not use any tracking cookies and does not serve any ads. Enjoy your anonymity; I have no idea who you are, where you came from, and where you are headed to. Let's dream of an Internet from times past.
Home | Contact me | Github | RSS feed | Consulting services | Tools & games
I recently stumbled upon Designing a Delightful Functional Programming Language and it reminded me I should write about the story behind Hack, a programming language I co-designed when working at Facebook.
Facebook was originally built on a LAMP stack: Linux servers, Apache web servers, Mysql databases and PHP as the programming language (a standard technology decision in 2004). Around 2010, driven by performance needs, the company developed a compiler (hphpc, later evolved to hhvm) which compiled PHP and also served as a web server.
Over time, engineers added various features to PHP such as xhp (a templating engine better suited for building web applications), traits, generators, etc. We had clearly forked away from Zend and were not afraid to experiment with ideas / language extensions.
At some point, I was working on some security sensitive code (a crypto library) and I wanted a validation tool. I talked to Yoann, who implemented a linter dubbed strictmode. The linter analyzed the body of functions and enforced the use of specific PHP coding patterns and could catch things like unused variables (it turned out that some of our runtime type coercions saved us by making some bugs unexploitable in practice).
Other engineers wanted to use strictmode and requested more features (types being one of them). We felt that type annotations (and a type checker) would catch at development time bugs which would otherwise trigger at deploy time or randomly in the middle of the night.
A simple linter was however not going to scale, we needed to rethink our approach and thus Hack, a gradual type system for PHP was born. The language added parameter and return type annotations to PHP. It also supported generics, provided a safer way to work with arrays (which can be arrays or maps) and null values.
Julien and I worked on Hack full time; it was a great partnership: he was an OCaml coder with deep language design knowledge and I was a security engineer with my fair share of experience dealing with PHP’s pitfalls and shortcomings.
Initially, our peers were skeptical that we needed a type checker. They associated type checking with productivity pain (https://xkcd.com/303/). We were however able to convince them to give us a chance by having a reversible experiment (we wouldn’t change the runtime very much and if things didn’t work out, we could remove our type annotations and revert to vanilla PHP). We also told people that we would leverage type annotations to deliver an improved developer experience (better IDE/refactoring tools). Finally, there was the potential performance gains sometime down the road.
Given these concerns, we built Hack as tool which lived outside the runtime (outside HHVM) similar to a linter in form but somewhat different in essence.
The first step was to modify our various tooling to recognize the type annotations but ignore them (Java does something similar for generics, a process called type erasure). It's not the ideal way to handle types, but it's an easy starting point. We modified a whole bunch of parsers (hhvm, code coloring tools, code review tools, pfff's, etc).
We decided to build a tool which would type check arguments and return types but perform type inference on local variables. We didn’t want to perform global type inference because type annotations serve a useful documentation purpose; engineers were describing types in comments anyways, might as well keep them in a structured form.
Not having some form of type inference was also out of the question, as it would involve more significant changes to our parsers and would potentially hurt productivity. We wanted to reduce the amount of code and comments engineers were writing, not increase it (not having a syntax for declaring the type of local variables did cause some other challenges).
Our type checker ended up being a service which observed the filesystem and any time a file changed, we incrementally type checked the file, as well as the transitive closure of dependencies (the tool keeps a dependency graph for the entire codebase, the graph itself is modifying incrementally). The type checker itself simply asks the service what is the current state of the codebase. The result was a pair of tools which could handle a giant codebase and answer most queries in under a second!
For the type inference part, we spent a lot of time providing detailed error messages. The type checker builds a "proof" for each inferred type which it later reveals when a type error is found.
1 let f1 (): int = 2 42 3 4 let f2 (s: string): unit = 5 Printf.printf "%s" s 6 7 let () = 8 let x = f1() in 9 f2(x) line 8, characters 4-7: Error: This expression has type int but an expression was expected of type string
1 <?hh 2 function f1(): int { 3 return 42; 4 } 5 function f2(string $s): void { 6 echo $s; 7 } 8 9 $x=f1(); 10 f2($x); line 10, characters 4-5: Invalid argument line 5, characters 13-18: This is a string line 2, characters 16-18: It is incompatible with an int
As we got feedback from engineers using Hack, we tried to recognize and give special treatment to common coding patterns and idioms. The Hack team grew and merged with the HHVM team, allowing us to make more interesting runtime changes. We hired a technical writer who did an amazing job documenting our work.
March 2014, we open sourced Hack
When I left Facebook in 2014, Hack had gained massive adoption within Facebook, without impacting engineer’s productivity. Other “PHP shops” started considering adopting it. The same technology was used to build Flow, a gradual type system for JavaScript.
Some links
- Taking PHP Seriously, Strange Loop talk by Keith Adams
- CUFP 2013, talk by Julien
- Hack: a new programming language for HHVM
- Facebook introduces Hack: a new programming language for HHVM, reddit discussion