Nailing the take-home exam

Cracking the Mutt Data tech interview

Posted by Joaquin Torré, Javier Mermet

on August 10, 2022 · 10 mins read

At Mutt, we don’t do live-coding or whiteboard interviews. We especially don’t believe in leetcode-type interviews. They generally miss the mark and only serve to filter out people who haven’t taken a data structures class in the past two years or haven’t memorised Cracking the Coding Interview in the past month. Most importantly, they don’t test the candidate’s knowledge of the subject at hand.

Imagine coming to an interview for a TensorFlow position and getting asked about FizzBuzz. You might as well ask them to do jumping jacks! Arguably, by asking algorithmic questions, you’re just evaluating whether the candidate has studied for those kinds of questions. This can be a proxy for knowing if the candidate is good (i.e. they can study abstract concepts and apply them effectively), but why not directly quiz them on topics related to the job?

In this blog post, we’ll explain why we don’t follow the usual tech screening process and give you some recommendations for solving a take-home programming problem.

Why we do take-home exams

The benefits of interviewing with leetcode are clear […]. First of all, they’re standard. Most candidates are used to them, so there’s no overhead in explaining things. They’re also easy to implement.

As a company, you could simply choose a couple of exercises off the internet […], change the setting a little bit, and use them in either live coding or a timed async session; or better yet, use one of the many online platforms which offer automated grading.

There’s little overhead for the interviewer as well. They usually need to memorise a few problems and prepare a small set of standard follow-up questions. In many cases, the ease of following the script conflicts with the interviewee’s mental processes. A great interviewer will empathise with the interviewee and try to understand their process, letting them follow their ideas instead of forcing them through their known script.

Whiteboard interviews are a fast and cheap way to screen candidates. In comparison, creating custom exams for each job description and different seniorities takes much more work and requires careful calibration. So why do we opt for the latter?

There are three main reasons why we prefer take-home exams. The first is that they are truly asynchronous, which is a core value of how we work at Mutt. There are platforms for leetcode-like interviews that are async in the sense that they don’t require an interviewer present at the moment, but usually once the exam is started a 2-hour countdown starts, so it isn’t truly asynchronous for the interviewee. Our take-home exam does have a countdown, but it is generally longer than a day [1], and the timer starts whenever you want. If you want to start at 3AM, that’s OK for us! Giving more than a full day to solve it means you can carry on your usual activities and you don’t have to be glued to your desk. Do you want to walk your dog […] while you think about the best way to solve a problem, or how to improve some code? Go for it! Not being extremely time-bound and not having your webcam’s menacing LED staring at you means you can think things through, so we’re not penalising slow thinkers or people with stage fright.

Our second reason for preferring take-home exams is that they are better for us as well. We can actually assess the quality of the candidate’s code, how they structure the repo and the functionality, and so on. This also helps us understand their seniority better. Sometimes a problem can be solved by both junior and senior candidates, but the quality of the code of the senior’s solution is much better. For example, it can be more maintainable, easier to read, be less error-prone, have better modularization, etc.

Our third reason for not doing algorithmic-type exercises is that by straying away from traditional problems, we can focus on the things you’d be doing if you join us at Mutt. This works both ways: the solutions we grade are relevant to the job description, and you get a better understanding of what the role is about. Not only that, but you wouldn’t believe the number of times we’ve had feedback from candidates saying that they enjoyed the exam and that they learned from it! Since we generally include some suggestions of libraries and best practices in the exam statement, candidates might learn something they didn’t know before or get the chance to solve something in a different way.

Our recommendations for solving a take-home exam

Mutt’s interview system for tech positions generally consists of a take-home exam, a tech interview, and a cultural interview. In this post we’ll be focusing on the first part. We’ve graded 300+ exams, and we’ve seen everything […]. We want to give you some tips on do’s and don’ts when attempting to solve Mutt’s – or generally any – take-home coding exams.

The take-home exam is designed to mimic things you’ll be doing on a regular basis at your job. We’ll not only evaluate whether your solution answers the exam question; we’ll also judge how you write your code and how clear and maintainable it is.

It’s not about edge-cases and cosmic rays

There are no trick questions in our exams or extreme edge cases we want you to discover and handle. We won’t be testing whether your code supports the passage of radioactive cows.

The hundred-percent surefire way to send code in without any bugs is to send [no code at all](https://github.com/kelseyhightower/nocode).

In most cases, handling the happy path is fine - just make sure to handle the most common exceptions. For example, if you’re doing a request to an external service, be prepared to handle an error response. For all other uncommon edge cases, it’s perfectly fine to add a comment with a brief description of the scenario and how you would solve it.
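As a rough illustration of that advice, here is a minimal sketch using only the standard library. The function name `fetch_json` and its behaviour of returning `None` on failure are our own assumptions for the example, not something the exam requires:

```python
import json
import urllib.error
import urllib.request


def fetch_json(url: str, timeout: float = 5.0):
    """Return the decoded JSON body found at `url`, or None if the request fails.

    `fetch_json` is a hypothetical helper written for this example.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # The service answered, but with an error status (4xx or 5xx).
        print(f"Service returned HTTP {error.code} for {url}")
    except urllib.error.URLError as error:
        # The service could not be reached at all.
        print(f"Could not reach {url}: {error.reason}")
    except json.JSONDecodeError:
        # The service answered with a body that is not valid JSON.
        print(f"Response from {url} is not valid JSON")
    # Uncommon edge case, deliberately not handled: a timeout while reading
    # the body. We would retry a few times with exponential backoff.
    return None
```

The happy path and the most common failures are covered, and the remaining scenario is documented in a comment instead of being implemented.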

Your code: it’s all about location, location, location.

Technically, we could all program everything in one single file, but that would be hellish to maintain [2]. Many people get this wrong when solving a take-home exam, probably because it’s “such a small exercise it probably doesn’t matter”. It’s true that the project is small, but good practices shouldn’t be applied only when they are absolutely necessary; they should always be followed. The key here is modularization. Don’t put your CLI and database classes in the same file. Try to separate things into different files and modules so that the organisation makes sense and things can be reused later if necessary.

While we’re discussing code location, it’s also important to organise the repo’s directory structure. Documentation, code and tool configuration shouldn’t all be dumped into a single main folder. Many open source projects are good examples of how to organise these. As an example, you can take a look at our very own open source project, Muttlib.

See, even Moss gets it.

There are automated tools to help you do this. Cookiecutter is a utility that bootstraps a package with basic structure and tool configuration. A good template is Claudio Jolowicz’s Hypermodern Python Cookiecutter. Among other things, it is centered around Poetry, a tool we love.

The good thing about Poetry is that it makes Python environments almost a hundred-percent reproducible [3]. It not only pins dependencies down to the hash, it also manages the virtual environment by itself. That’s very important when submitting an exam: there’s no worse feeling than receiving a reply hours later saying “I can’t run your code!”

Let you know that I know that you know

I guess this has happened to you too: you grab some codebase, look at it, and unless you follow everything through, you have a hard time understanding what it does and how it does what it does.

So, if the objective of the challenge is for us to get a grasp of your skills, let us know that you know. So you’ll know that we know that you know. Convey information to the unsuspecting reviewer. Use descriptive variable and function names, organize your code and write docstrings.

Be like Phoebe.

This last bit can’t be stressed enough. If you don’t write docstrings, it takes way longer to understand how you structured your solution. If you write docstrings, we can take a glance at everything and understand the overall intended design. Then, whatever might not work or be missing can be treated as an implementation issue. Later, we dig into the code to evaluate best practices.
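To make this concrete, here is a small sketch of what we mean by descriptive names plus a docstring. The function and its domain (order totals) are made up for the example:

```python
from statistics import mean


def average_order_value(order_totals: list[float]) -> float:
    """Return the mean value of the given order totals.

    Args:
        order_totals: Total amount of each order, all in the same currency.

    Returns:
        The arithmetic mean of the totals, or 0.0 if there are no orders.
    """
    if not order_totals:
        return 0.0
    return mean(order_totals)
```

A reviewer can tell what the function is for, what it expects and how it treats an empty input without reading the body. Compare that with an undocumented `def avg(l)`.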

Snake charming

Python might not be your main language, or the one you are the most experienced with. But you sure are comfortable with some other language, right? Programming languages are just tools, and most of the ecosystems of each programming language share the same concepts: testing tools, linters, prettifiers/formatters, static-type checking (for dynamically typed languages) and so on.

You might not know what the tool to do X in Python might be, but you know what X is. Say you are a JavaScript developer, and you use Prettier as a code formatter. You might not know Black, but you can search for a Python code formatter.

Bring on the best practices you know!

We love C99 too - but maybe don’t use argv

If you’re tasked with creating a command line interface, please don’t process commands by reading sys.argv directly. Doing so means that you’ll have to parse things yourself, and basically re-invent the wheel like we’re programming in C99 in the Pentium 4 days. How do you infer types, handle default values, handle argument ordering, or handle positional vs keyword arguments?

There’s even a Python built-in library for this: argparse, available since Python 3.2 […]. The docs do a good job of covering most of the common use cases. However, there are also third-party alternatives, like Click - from the same people that did Flask and Jinja - and Typer - from the FastAPI authors.
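As a minimal sketch of what argparse takes care of for you (type conversion, default values, and positional versus optional arguments in any order), consider a hypothetical `greet` command. The command name and its options are assumptions made up for this example:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the parser for a hypothetical `greet` command."""
    parser = argparse.ArgumentParser(prog="greet", description="Greet someone.")
    # Positional argument: required, identified by its position.
    parser.add_argument("name", help="who to greet")
    # Optional argument: converted to int, with a default value.
    parser.add_argument("--times", type=int, default=1, help="number of greetings")
    # Boolean flag: False unless it is present on the command line.
    parser.add_argument("--shout", action="store_true", help="use upper case")
    return parser


def main(argv=None) -> None:
    """Parse `argv` (or sys.argv[1:] when it is None) and print the greeting."""
    args = build_parser().parse_args(argv)
    greeting = f"Hello {args.name}"
    if args.shout:
        greeting = greeting.upper()
    for _ in range(args.times):
        print(greeting)


# Passing the arguments explicitly keeps the example deterministic.
main(["Ada", "--times", "2", "--shout"])  # prints "HELLO ADA" twice
```

In a real script you would call `main()` without arguments so that argparse reads `sys.argv` itself. You also get `--help` output and error messages for invalid input for free.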

Just to show you how good Typer is, we’ll copy below an example similar to the one in the docs:

import typer

app = typer.Typer()

@app.command()
def hello(name: str):
    typer.echo(f"Hello {name}")

@app.command()
def goodbye(name: str, formal: bool = False):
    if formal:
        typer.echo(f"Goodbye Ms. {name}. Have a good day.")
    else:
        typer.echo(f"Bye {name}!")

if __name__ == "__main__":
    app()

If you’ve ever programmed in Flask or FastAPI, this kind of syntax might seem familiar. It’s trying to make a command line interface a bit more similar to programming a REST API, which is very sensible.

And just with these lines, we already have two commands, each with their own positional and keyword arguments, and we have tied each to their respective function. If we were using argparse, we’d have to add subparsers and tie their behaviour by hand:

import argparse

def hello(args):
    print(f"Hello {args.name}")

def goodbye(args):
    if args.formal:
        print(f"Goodbye Ms. {args.name}. Have a good day.")
    else:
        print(f"Bye {args.name}!")

if __name__ == "__main__":
    # create the top-level parser
    parser = argparse.ArgumentParser()
    # required=True (Python 3.7+) makes argparse report a missing command
    # instead of failing later with a missing `args.func`
    subparsers = parser.add_subparsers(dest="command", required=True)

    # parser for `hello`
    parser_hello = subparsers.add_parser("hello")
    parser_hello.add_argument("name", type=str)
    parser_hello.set_defaults(func=hello)

    # parser for `goodbye`
    parser_goodbye = subparsers.add_parser("goodbye")
    parser_goodbye.add_argument("name", type=str)
    parser_goodbye.add_argument("--formal", action="store_true")
    parser_goodbye.set_defaults(func=goodbye)

    # parse the arguments and dispatch to the selected command
    args = parser.parse_args()
    args.func(args)

There’s a clear winner here in length, readability, extensibility and maintainability.

If you’re short on time: dontworryaboutit

Seriously, relax. We don’t expect you to solve everything and do it perfectly. That’s also part of the day-to-day work: sometimes things come up and maybe we can’t complete what we wrote in the daily […]. We also don’t want you to spend too many hours on the exam, so if you see that you’ve put in considerable work and there are still a couple of things to do, then it’s alright to call it a day.

In fact, we usually check the commit history and try to get some input on how much time you’ve dedicated to the challenge. It helps us calibrate the exam to our ideal length, and it also helps us understand whether you couldn’t do more because you were constrained on time or decided to put a big chunk of hours in, and grade accordingly.

Our suggestion is to try to demonstrate knowledge in the different tasks, but don’t submit half-finished work or code that doesn’t run. If there is a part that you didn’t manage to solve in the time frame, then explaining how you would solve it is perfectly fine as well. Remember: on the other side of the challenge, there is someone trying to assess your skills. Help them understand what you know and how you work.

One usual criticism of these kinds of processes is that they take a toll on the candidate, more so if they are doing several interview processes at once. That’s why we highlight that we don’t expect you to finish everything or pull an all-nighter. Treat it as any other time-bound project. This challenge will help you understand if your expectations for the role will be met, and it lets us see how you handle operational tasks.

Most importantly, try to have fun! Think of it as a challenge that happens to be related to the job you are applying for. The tasks included in the exam are very similar to those we do in our daily operations and should serve as an overview of what it would be like to work with us :)

A final word

The decision to use this format is not one we’ve made lightly. Like any approach, it has some advantages, but also some drawbacks. Taking them into account, we still believe this is the process that best reflects our culture and values.

The hiring process is something we’re constantly trying to improve and tune. We love to hear feedback on how to improve the take-home exams and what we should do differently. It’s that same spirit of trying to do better that motivated us to steer away from traditional leetcode interviews. It takes time and effort, but we’re convinced that it pays off tenfold in the hiring process and that candidates appreciate it.

References and footnotes

  1. We’ve had discussions about what the best time length is for candidates. On one hand, too short a time makes it less async - a big no-no for us. On the other, too long a window, like a week, makes it more likely that candidates will ping their peers for suggestions on how to solve something, or even ask them for a code review. Since our exam was designed (and re-designed) not to be too long, we settled on around one to two days.

  2. Though apparently you can make $65K in revenue out of it. 

  3. Since some Python packages have external dependencies, some code may not be exactly reproducible in one environment or another. For this, you’d have to make a Dockerfile, which would be a big plus in the exam. 

