8 Reasons Why Manual Testing is Still Important

The growing adoption of test automation has unjustly framed manual testing as an archaic and unnecessary practice. After watching an automation suite swiftly execute an entire library of test cases, it can be easy to get tunnel vision about the benefits of automation. However, the value of manually executing your tests cannot be overstated; here are a few reasons why manual testing is as relevant as ever.

Tape 1: Cycle Times

There’s no way around it; initial automation requires an increased investment in both time and resources. You are setting up a foundation to continually benefit from in your future testing endeavors. However, in some cases, automation will not be the ideal solution for your testing. Attempting to introduce automation close to the end of your testing cycle would be a wasted effort; the time you take to set up (and the sudden resource shift) means you’ll be nearing your release date before you can start running reliable core automated tests. During that same timeframe, you could be focusing your testing resources on manual execution. Because manual testers spend the majority of their time on test case validation, the end result is more coverage within your test cycle.

Tape 2: Even Your Automation Has Errors

Like any piece of code, your automation will contain errors (and fail). An error-filled automation script may be misinterpreted as failed functionality in your tested application, or (even worse) the script may interpret an error as correct functionality. Manually testing your core, critical-path functionality ensures that your test case is passing from a user perspective, with no room for misinterpretation.

Tape 3: UI Validations

The advent of automated testing platforms for Responsive and UI testing has provided a much appreciated convenience. However, it should be a boost to your UI testing efforts, not a crutch. These programs validate your test cases by checking element distance, image placement, and alignment of elements in relation to each other. Because of this, there are more than a dozen ways that something such as alignment between a menu and logo can be misinterpreted; a manual tester would immediately be able to catch something that looked “off”, and fail the test case.

Tape 4: Un-Automatable Scenarios

Some scenarios are simply not feasible to automate; they are either genuinely impossible due to technological limitations and the complexity of the scenario, or the resource cost of automating them greatly outweighs the cost of a simple manual test. Case in point: we recently had a customer who needed to test the manual tap-and-pay function of their mobile wallet app. Developing a way to automate this scenario is not worth it when compared to manually testing it with your device.

Tape 5: (Short-Term) Cost

Over time, automation leads to cost savings, faster execution, and continuous testing. In the immediate short term, however, there is an investment cost (and a learning curve for the unfamiliar) that can be a situational disadvantage. The cost of setting up and running your initial automation framework can range anywhere from 5-15x the cost of your manual testing endeavors. And as discussed earlier, implementing automation while crunched for time towards the end of a test cycle will not allow you to enjoy automation’s full potential. Choosing to conduct manual testing at this stage provides an immediate, tangible result from your testing resources.
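To make the short-term trade-off concrete, here is a rough break-even sketch. It treats the one-off automation setup cost as a multiple of one manual test cycle (the 5-15x range above). The `automated_run_ratio` parameter, the cost of one automated cycle relative to one manual cycle, is a hypothetical value chosen for illustration, not a measured figure.

```python
import math


def break_even_cycles(setup_multiple, automated_run_ratio=0.1):
    """Return the first test cycle at which automation is strictly cheaper.

    setup_multiple: one-off automation cost, in units of one manual cycle.
    automated_run_ratio: assumed cost of one automated cycle relative to
        one manual cycle (hypothetical, for illustration only).
    """
    if not 0 <= automated_run_ratio < 1:
        raise ValueError("automated runs must be cheaper than manual runs")
    # Manual cost after n cycles:    n
    # Automated cost after n cycles: setup_multiple + n * automated_run_ratio
    # Automation wins once n > setup_multiple / (1 - automated_run_ratio).
    return math.floor(setup_multiple / (1 - automated_run_ratio)) + 1


for multiple in (5, 15):
    print(multiple, break_even_cycles(multiple))
```

Under these assumptions the investment only pays off after several full cycles, which is why a late-cycle switch to automation rarely helps the current release.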

Tape 6: Exploratory Testing

Exploratory testing describes the process of freely testing the application for the purpose of finding defects and subsequently designing new test cases. Defects found through exploratory testing are often the results of testing complex scenarios that would not have been addressed through your predefined test cases. Having a foundation of core, repeatable tests automated will free up time to designate resources towards exploratory testing.

Tape 7: Skills

While the end result of automation is ease, setting up the framework and developing the scripts are no easy tasks. An effective automator has a foundation of programming skills, as well as an inherent understanding of test design. These skills are learned over years of experience in both QA and development, and acquiring somebody with these specific skill sets (especially on short notice) is not a simple process. On the other hand, the majority of manual test cases are simple to execute and can easily be taught: follow the steps in the test case, and validate that your actual results are consistent with the expected results.

Tape 8: Agile

In the context of Agile testing, automation is of great benefit. Having a library of tests that can be executed reliably and quickly truly helps with test completion and coverage during a tight sprint. By that same token, manual testing is a quick way to execute any test cases that are not yet automated. There may be no time to build automation for new features introduced in the current build, making manual testing the best option for test completion.

In conclusion, the need for increased test coverage across an ever-increasing range of software and devices has made test automation more important than ever. As automation continues to grow, it can be easy to forget about the wide spectrum of benefits manual testing still has to offer. Appreciating the value of both approaches will make for a well-rounded testing experience.

Looking under the hood of the Eventbrite data pipeline!

Eventbrite’s mission is to bring the world together through live experiences. To achieve this goal, Eventbrite relies on data-driven decisions at every level. In this post, we explore Eventbrite’s treasure trove of data and how we leverage it to push the company forward. We also take a closer look at some of the data challenges we’ve faced and how we’re pushing forward with new improvements to address these challenges!

The Data Engineering team at Eventbrite ingests data from a number of different sources into a central repository.  This repository is the foundation for Eventbrite’s analytics platform. It empowers Eventbrite’s engineers, analysts, and data scientists to create data-driven solutions such as predictive analysis, algorithmic newsletters, tagging/clustering algorithms, high-value customer identification, and search/recommendations.

The Situation: Degrading performance and increasing cost of existing Data Warehouse running on Hadoop infrastructure (CDH5)

We use MySQL as our main production datastore, and it supported most of our data analytics/reporting until a few years ago. Then we implemented a Cloudera CDH ecosystem, starting with CDH3 and upgrading to CDH5 when it was released. Prior to phasing in our CDH environment, our main OLTP (online transaction processing) databases were also powering our reporting needs.

Our production MySQL topology consists of a Primary-Secondary setup, with the majority of the read-only traffic directed to the MySQL secondary instances. For many years, we met our reporting requirements by leveraging the read-only MySQL instances but it came at a steep price due to contention and performance issues caused by long-running SQL queries.

As a result, we moved much of our reporting to our new CDH environment, and we designed a new set of transformation tables to simplify data access for engineers, analysts, and business users. It’s served us well as the backbone for our Data Warehouse efforts, but the time had come to take the next step as we’ve faced a number of challenges:


Our CDH5 cluster lives on Reserved Instances, and all of the data in the cluster is housed on local solid state drives.  As a result, the cluster is expensive to maintain.

A Reserved Instance is a reservation of resources for an agreed-upon period of time. Unlike on-demand, when you purchase an RI (Reserved Instance), you commit to paying for all the hours of the 1-year or 3-year term. The end result is a lower hourly rate, but the long-term costs can really add up.
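The trade-off can be sketched with a little arithmetic. The rates below are hypothetical placeholders, not actual AWS prices; the only facts taken from the text are that a reservation is billed for every hour of the term while on-demand capacity is billed only for the hours used.

```python
def cheaper_option(on_demand_rate, reserved_rate, utilization):
    """Compare the effective hourly cost of on-demand vs. reserved capacity.

    utilization: fraction of the term's hours the instance is actually
        needed (0.0 to 1.0).
    All rates are hypothetical values for illustration.
    """
    on_demand_cost = on_demand_rate * utilization  # pay only while running
    reserved_cost = reserved_rate                  # pay for the whole term
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


def break_even_utilization(on_demand_rate, reserved_rate):
    """Utilization above which the reservation becomes the cheaper choice."""
    return reserved_rate / on_demand_rate


print(cheaper_option(1.0, 0.6, utilization=0.9))
print(cheaper_option(1.0, 0.6, utilization=0.3))
print(break_even_utilization(1.0, 0.6))
```

With these made-up numbers, a cluster that sits idle most of the time costs more on a reservation than it would on demand, which is the situation the text describes.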


We have a large collection of uncurated data, and we had not transformed the data into a single source-of-truth about our business. As a result, core business metrics (such as organizer, consumer, and event data) were reported differently in different places in the organization, and attributes such as currency, location and timezone were reported differently across business units.


Most jobs were scheduled via Oozie, there was little effective monitoring in place, and there was no method to track or enforce dependencies between coordinators. In addition, other analytics jobs that utilize Salesforce and MySQL data were scheduled through a local Windows machine that was prone to errors and regularly failed without warning or notification.


All ETL processing and ad-hoc queries executed on the same CDH5 cluster. Each process had its own load profile, so the cluster was configured to fit an aggregate of those loads. The end result was that jobs frequently conflicted with each other and competed for resources.

Our workload required burst capacity to support experimental development, ad-hoc queries, and routine ingestion scripts. In an ideal setup, we would scale up and scale down computing resources without any interruptions or data loss.


For MySQL ingestion, we used a home-grown wrapper called Sqoozie to integrate with our MySQL databases. Sqoozie combines Apache Sqoop (a command-line application for transferring data between relational databases and Hadoop) and Apache Oozie, a Hadoop workflow scheduler. It allows for writing MySQL tables directly to Hive tables. While this approach worked for smaller datasets, it became prohibitive as our data grew. Unfortunately, it was set up as a full ingestion of all tables each day and typically took most of a day to finish, putting high load on the shared resource cluster for an extended period of time.

For web analytics ingestion, we used a proprietary tool called Blammo-Kafka that pulled the web logs directly from Kafka daily and dumped them to Hive tables partitioned by day.

For Salesforce ingestion, we used the Salesforce Bulk API to ingest all objects daily and overwrite the previous day’s ingestion.

The Solution: EMR, Presto, Hive, and Luigi to the rescue!

In the past year, we’ve invested heavily in building a shiny new “data-foundry” ecosystem to alleviate many of the pain points from our previous CDH environment. It is the result of many whiteboard sessions, sleepless nights, and walks around the block at Eventbrite’s offices at our SOMA location in San Francisco and Cummins Station in Nashville.

We focused not only on improving stability and cost, but also on designing a new set of transformation tables that would become the canonical source-of-truth at the company level. This involved meeting with key stakeholders to understand business metrics and exploring new technologies. The following diagram depicts sample output from some of our working sessions. As you can tell, it was a tedious process.

The end result was the implementation of a new “data-foundry” infrastructure. The following diagram shows a general layout:

EMR (Elastic MapReduce) Clusters

Ingestion and ETL jobs run on daily and hourly scheduled EMR clusters with access to most Hadoop tools. Amazon’s EMR is a managed cluster platform that simplifies running big data frameworks such as Hadoop, Spark, Presto, and other applications in the Apache/Hadoop stack.

The EMR/S3 solution decouples storage from compute. You only pay for compute when you use it (high utilization). Multiple EMR clusters can access the data (S3, Hive Metastore), and interactive workloads (Hive, Presto, Spark) can be launched via on-demand clusters.

We’ve seen some benefits with Amazon EMR:

Intelligent resizing

  • Incrementally scale up (add nodes to EMR cluster) based on available capacity
  • Wait for work to complete before resizing down (removing nodes from EMR cluster)
  • Can scale core nodes and HDFS as well as task nodes

Cost Savings

By moving to EMR and S3, we’ve been able to considerably cut costs. With S3 we pay only for the storage that we use, not for total capacity. And with EMR, we’re able to take advantage of “on-demand” pricing, paying low hourly rates for clusters only when we need the capacity. Also, we’ve reduced the cost even further by purchasing Reserved Instances and bidding on Spot instances.

  • Use Amazon EC2 spot instances to save > 80%
  • Use Amazon EC2 Reserved Instances for steady workloads

Reliability/Improved Operational Support

Amazon EMR monitors nodes in each cluster and automatically terminates and replaces an instance if there is a failure. Plus the new environment has been built from scratch, is configured via Terraform, and uses automated Ansible templates.

Job Scheduling

We use Luigi to orchestrate our Python jobs. Luigi enables us to easily define task workflows without having to know much about other workflows. It is an open source Python framework created by Spotify for managing data processing jobs, and it is really good at dependency management, which makes it a perfect tool for coalescing dependent data sources.
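The core idea of dependency management can be sketched without Luigi itself. The snippet below is not Luigi's API (in Luigi, each task declares its upstream tasks through a `requires()` method); it is a minimal standard-library illustration, with hypothetical task names, of how declared dependencies determine a safe execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks mapped to the set of tasks they depend on.
dependencies = {
    "ingest_mysql": set(),
    "ingest_salesforce": set(),
    "build_dimension_tables": {"ingest_mysql", "ingest_salesforce"},
    "export_to_redis": {"build_dimension_tables"},
}


def run_order(deps):
    """Return one valid order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

Each task only names what it needs directly; the scheduler works out the rest, which is what lets workflows be defined without knowing much about each other.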

Centralized Hive Metastore

We have a centralized Hive metastore that saves all the structure information of the various tables, columns, and partitions for our Hive metadata. We chose Hive for most of our Hadoop jobs primarily because the SQL interface is simple. It is much cleaner than listing files in a directory to determine what output exists, and is also much faster and more consistent because it’s backed by MySQL/RDS. This is particularly important since we rely on S3, which is slow at listing files and is prone to “eventual” consistency issues.


We continue to ingest production data from MySQL tables on a daily basis using Apache Sqoop, but in the “data-foundry” ecosystem we ingest the tables incrementally using “changed” columns to allow for quicker updates.
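As a rough illustration of the incremental approach, the sketch below pulls only rows whose `changed` value is newer than the last ingested high-water mark. It uses an in-memory SQLite database as a stand-in for MySQL, and the table, column values, and function name are hypothetical; it is not the team's actual Sqoop job.

```python
import sqlite3


def fetch_changed_rows(conn, table, last_watermark):
    """Return rows whose `changed` value is newer than the last ingested one."""
    # NOTE: `table` is interpolated directly, which is acceptable for a
    # sketch with trusted names; validate it in real code.
    query = (
        "SELECT id, name, changed FROM {} "
        "WHERE changed > ? ORDER BY changed".format(table)
    )
    return conn.execute(query, (last_watermark,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, name TEXT, changed TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "Concert", "2018-01-01 10:00:00"),
        (2, "Meetup", "2018-01-02 09:30:00"),
        (3, "Workshop", "2018-01-03 14:15:00"),
    ],
)

# Only rows modified after the previous run's high-water mark are pulled.
previous_watermark = "2018-01-01 23:59:59"
rows = fetch_changed_rows(conn, "events", previous_watermark)

# The newest `changed` value becomes the watermark for the next run.
new_watermark = rows[-1][2] if rows else previous_watermark
print(rows)
```

Compared with a full daily reload, the amount of data moved scales with how much changed rather than with the size of the table.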

We ingest web analytics data by using Pinterest Secor to dump data from Kafka to S3. We then process it from that S3 path using Spark, both hourly and daily. Hourly, we ingest the latest data for each web analytics table since the last time it was ingested and write it to Hive tables partitioned by day and hour. Daily, we also ingest the web analytics data to day-partitioned Hive tables.

We ingest Salesforce data using a combination of the Salesforce REST and Bulk APIs, with custom, internally built Python clients for both. Tables are ingested through Spark using the API that makes the most sense based on the size of the data. Also, where available, we use primary key chunking in the Bulk API to optimize ingestion of large tables.

In addition to the ingestion processes that bring us to feature parity with the old CDH5 infrastructure, we also ingest data from a few other sources, including Google Analytics and several other 3rd party services.

We ingest Google Analytics data three times a day for the current day and once for the previous day based on SLAs provided by Google. We use Spark in addition to Google’s BigQuery and Cloud Storage clients to ingest Google Analytics data for our mobile app, organizer app, and web app to Hive tables partitioned by day.


By separating analytics processing from visualization and queries, we’ve been able to explore more tooling options. Both Presto and Superset have proven to be useful.  

Presto is a distributed SQL query engine optimized for ad-hoc analysis. It supports the ANSI SQL standard, including complex queries, aggregations, and joins. Presto can run on multiple data sources, including Amazon S3. We’re using Presto with EC2 Auto Scaling Groups to dynamically scale based on usage patterns.

Presto’s execution framework is fundamentally different from that of Hive/MapReduce. It has a custom query and execution engine where the stages of execution are pipelined, similar to a directed acyclic graph (DAG), and all processing occurs in memory to reduce disk I/O. This pipelined execution model can run multiple stages in parallel, and it streams data from one stage to another as the data becomes available. This reduces end-to-end latency, and we’ve found Presto to be quite snappy for ad-hoc data exploration over large datasets.
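A loose analogy for pipelined execution is a chain of Python generators, where each stage hands rows downstream as soon as they are available instead of writing a complete intermediate result first. This is a single-process toy with hypothetical data and stage names, not a description of Presto's actual implementation.

```python
def scan(rows):
    """Stage 1: emit rows one at a time instead of materializing a table."""
    for row in rows:
        yield row


def filter_stage(rows, min_tickets):
    """Stage 2: pass matching rows downstream as soon as they arrive."""
    for row in rows:
        if row["tickets"] >= min_tickets:
            yield row


def aggregate(rows):
    """Stage 3: fold the stream into a running total held in memory."""
    total = 0
    for row in rows:
        total += row["tickets"]
    return total


# Hypothetical sample data.
events = [
    {"event": "a", "tickets": 120},
    {"event": "b", "tickets": 15},
    {"event": "c", "tickets": 300},
]

total = aggregate(filter_stage(scan(events), min_tickets=100))
print(total)  # 420
```

No stage waits for the previous one to finish, and nothing is written to disk between stages, which is the property that keeps end-to-end latency low.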

An additional benefit is that Facebook and the open-source community are actively developing Presto, which has no vendor lock-in because it speaks ANSI-SQL.

Superset is a data exploration and visualization tool that was open sourced by Airbnb. It allows for fast and flexible data access and comes complete with a rich SQL IDE, which is used heavily by Eventbrite’s business analysts.


We’ve introduced a new set of staging tables in our data warehouse that transform the raw data into dimension tables aligned specifically to meet business requirements.  These new tables enable analytics, data science, and reporting. The goal is to create a single “source-of-truth” for company metrics and company business concepts.

Data Exports

The Data Engineering team has developed a set of exporter jobs in Python to push data to targets such as Redis, Elasticsearch, Amazon S3, or MySQL. This allows us to cache the results of queries to power reports, so that the data is available to everyone, whenever it is needed.

What next?

We’re looking for new ways to decrease our ingestion times from MySQL using stream processing with products such as Maxwell (http://maxwells-daemon.io/), which has been well-documented by Zendesk. Maxwell reads MySQL binlogs and writes row updates to Kafka as JSON. We’re also using SparkSQL and are excited to use Apache Spark more broadly, especially Spark Streaming.

We have a ton of enhancement requests to extend our Data Warehouse tables to meet the growing needs of the business and to provide better ways of visualizing the data via new Tableau dashboards.

As the Eventbrite family continues to grow with the acquisitions of Ticketscript, Ticketfly, and Ticketea, we continue to explore ways to migrate/combine data sources. This includes ingesting data from sources new to us, such as Amazon Redshift and Amazon Dynamo.

It’s fun times here at Eventbrite!

Special thanks to Eventbrite’s Data Engineering team (Brandon Hamric, Alex Meyer, Will Gaggioli, Beck Cronin-Dixon, Jasper Groot, Jeremy Bakker, and Paul Edwards) for their contributions to this blog post. This team rocks!

Be the change

Since this is our first post on our blog in Spanish (this article is a translation of Ser el Cambio), we wanted to start off with a bang by featuring one of the most challenging projects we’re facing as a company. Though our offices might not be new, our engineering team is constantly growing, and one of our primary resolutions this year is that our team continue to grow from life experiences, different cultures, and, above all, achieve a balance in terms of gender representation.

Our goal is clear… More women in engineering! However, when we sat down to try and figure out how we would achieve this goal, we found that it was a whole lot more than just looking for women to submit their resumes. We discovered that the industry offers little-to-no support for women, leaving them little room to grow and practically no voice when it comes time to make important decisions.

Faced with all of this, we realized that we didn’t want to be a business that just simply tried to put more women to work. In order to tackle this issue more head-on, we developed a working group that we chose to call #ada-lovelace in honor of the great scientist and role model who served as the face for the representation of women in computer science. The idea of this group is to dream up and create a working environment in which all women feel safe, represented, and fully integrated into our business.

The group is made up of men and women who understand that women’s presence and opinions are necessary and important within our company and, above all, within every level of the field of engineering. During most of 2017 and the first part of 2018, this group has taken on the following tasks:

  • Create a space where current and future mothers within our company can feed their babies in total privacy without the fear of being observed.
  • Promote and sponsor groups dedicated to educating women in science and technology, such as Django Girls and Agile Woman.
  • Create confidential working groups within the company where concerns over day-to-day work and workplace issues can be shared in a respectful environment.
  • Ensure that interview panels have women on them for positions of all levels and roles. The idea behind this is that the hiring process shouldn’t be segregated by gender; rather, all interviews should be focused on job-relevant abilities.
  • Creation of the first “seedlings of engineers” within our company, which will be a school for up and coming professionals that may or may not already have professional experience. Within this group, we will strive for a certain percentage representation of women in order to promote their entry into the workforce.

As part of this project, we held interviews with women who are part of our engineering team in order to find out what had made them consider Eventbrite as a possible workplace (Interview in Spanish):

We are fully aware that this is an ongoing effort, and we must continue to make progress in order to solve this problem, which affects not only us as a company, but also the industry and society as a whole. It’s a challenge in and of itself to identify the problems that cause the gender gap to form within the realm of education and training, which then extend to the workplace and the positions that women one day might find themselves in.

You might ask yourselves, Can they pull this off? We’re on the way. Like all big changes, we need time to generate results and see the overarching benefits of this work; however, we’ve begun with the first step in the right direction.

“Never doubt that a small group of thoughtful, committed citizens can change the world; indeed, it’s the only thing that ever has.”

– Margaret Mead

Written in Spanish by Natalia Cortese

Translated into English and published by Melisa Piccinetti

Translation reviewed by Sebastian Torres

Doctor Python: Or How I Learned to Stop Worrying and Love ES6

Have you learned ES6 yet? Oof. When people started asking me that, I’d feel a sense of pressure that I was missing out on something. What was this “ECMA” people kept talking about? I was worried.

But Python helped me learn ES6. Weird, right? Turns out a lot of ES6 syntax overlaps with that of Python, which I learned at Eventbrite. Much of the syntax is shared between the two languages, so they kind of go hand in hand. Kinda.

Without further ado, let’s talk about these two buddies.


Block Scope

When I first started learning JavaScript (back in “ancient” ES5 days), I assumed several things created scope. I thought that conditionals created scope and was quickly told that I was wrong.

“NO. Only functions create scope in JavaScript!”

So when I found out that with ES6, we now have block scope, I was like, “WAT”.

A massive inflatable rubber ducky floating in front of a pier and building.

With the addition of const and let to ES6, block scope! Wow! I felt like I’d predicted the future.

function simpleExample(value) {
  if (value) {
    var varValue = value;
    let letValue = value;
    console.log(varValue, letValue); // value value
  }

  // varValue is available even though it was defined
  // in if-block because it was "hoisted" to function scope
  console.log(varValue); // value

  // letValue is a ReferenceError because
  // it was defined within the if-block
  console.log(letValue); // Uncaught ReferenceError: letValue is not defined
}

What else creates scope in JavaScript, ES6, and Python? And what kind of scope do they use? Check out the following table:

                | JavaScript                                               | Python
Scope           | Lexical                                                  | Lexical
Namespace       | Functions, Classes [ES6!], Modules [ES6!], Blocks [ES6!] | Functions, Classes, Modules
New Identifiers | Variables, Functions                                     | Variables, Functions, Classes
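For comparison, here is a small Python sketch of the same idea. Since blocks are not among the things that create scope in Python, a name assigned inside an if-block belongs to the enclosing function. The function names are made up for illustration.

```python
def simple_example(value):
    if value:
        inner = value  # assigned inside the if-block
    # Python has no block scope: `inner` belongs to the function's scope.
    # It is bound here only because the if-block actually ran.
    return inner


def unassigned_example(flag=False):
    if flag:
        inner = "assigned"
    try:
        return inner
    except UnboundLocalError:
        # The name is local to the function but was never bound.
        return "unbound"


print(simple_example("value"))  # value
print(unassigned_example())     # unbound
```

This behaves much like `var` in JavaScript: the name is visible throughout the function, whether or not the block that assigns it has run.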

Template Literals

I like to think of template literals as Mad Libs. Did you have them as a child? Sentences were missing words, and you could write anything you wanted into those spaces. You only had to conform to the specified word type: noun, pronoun, verb, adjective, exclamation.

Mad Libs that read "mothers sit around burmping. Last summer, my little brother fell in a/an hairdo and got poison palmtree all over his butt. My family is going to Winsconsin, and I will.."

Similarly, template literals are string literals that allow embedded expressions. They were originally called “template strings” in prior editions of the ES2015 specification.

Yup, these already exist in Python. I had actually learned about literal string interpolation in Python, which made it that much easier for me to understand in ES6. They are great because you no longer need the ridiculous concatenation found in older versions of JavaScript.

// JavaScript (ES6)
let exclamation = 'Whoa!';
let sentence = `They are really similar to Python.`;

console.log(`Template Literals: ${exclamation} ${sentence}`);
// Template Literals: Whoa! They are really similar to Python.

# Python
print('.format(): {} {}'.format('Yup.', 'Quite!'))
# .format(): Yup. Quite!
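Python's closest match to template literals is the f-string, available from Python 3.6 onward (an assumption about your interpreter version). A minimal sketch, reusing the same variable names for illustration:

```python
exclamation = 'Whoa!'
sentence = 'They are really similar to template literals.'

# f-strings evaluate the expressions inside the braces at runtime.
message = f'f-string: {exclamation} {sentence}'
print(message)
# f-string: Whoa! They are really similar to template literals.

# Any expression works, not just names.
count = f'{2 + 3} examples'
print(count)
# 5 examples
```

As with `${...}` in JavaScript, the braces can hold any expression, so there is no need for positional placeholders.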


Default Parameters

Yup. Python’s got ‘em too. Default parameters set a default for function parameters. This is most effective for avoiding bugs that pop up with missing arguments.

// JavaScript (ES6)
function nom(food="ice cream") {
  console.log(`Time to eat ${food}`);
}

nom(); // Time to eat ice cream

# Python
def nom(food="ice cream"):
  print('Time to eat {}'.format(food))

nom() # Time to eat ice cream

Rest Parameters & *args

Rest parameter syntax allows us to represent an indefinite number of arguments as an array. In Python, they’re called *args, which again, I’d already learned! Are you sensing a pattern here?

Check out how each of the languages bundles parameters up in neat little packages:

// JavaScript (ES6)
function joke(question, ...phrases) {
  console.log(question);
  for (let i = 0; i < phrases.length; i++) {
    console.log(phrases[i]);
  }
}

let es6Joke = "Why does JS single out one parameter?"
joke(es6Joke, "Because it doesn't", 'really like', 'all the REST of them!');

// Why does JS single out one parameter?
// Because it doesn't
// really like
// all the REST of them!

# Python
def pirate_joke(question, *args):
  print(question)
  for arg in args:
    print(arg)

python_joke = "What's a Pyrate's favorite parameter?"

pirate_joke(python_joke, "*args!", "*arrgs!", "*arrrgs!")

# What's a Pyrate's favorite parameter?
# *args!
# *arrgs!
# *arrrgs!



Classes

Oh boy, we’re gonna talk about prototypal inheritance now! ES6 classes are actually syntactic sugar and based on the prototype chain found in ES5 and previous iterations of JavaScript. So, what we can do with ES6 classes is not much different from what we do with ES5 prototypes.

Python has classes built in, allowing for quick and easy Object Oriented Programming (Python is down with OOP.). I always found the prototype chain extremely confusing in JavaScript, but looking at Python and ES6 classes side by side really hit home for me.

Let’s take a look at these ES6 “classes” based on the prototype chain:

// JavaScript (ES6)
class Mammal {
  constructor() {
    this.neocortex = true;
  }
}

class Cat extends Mammal {
  constructor(name, years) {
    super(); // must be called before `this` is used in a derived class
    this.name = name;
    this.years = years;
  }

  eat(food) {
    console.log('nom ' + food);
  }
}

let fryCat = new Cat('Fry', 7);

# Python
class Mammal(object):
  neo_cortex = True

class Cat(Mammal):
  def __init__(self, name, years):
    self.name = name
    self.years = years

  def eat(self, food):
    print('nom %s' % (food))

fry_cat = Cat('Fry', 7)

A big difference between ES6 Classes and ES5 Prototypes: you can inherit more easily with classes than with the prototype chain. This is very similar to Python’s structure. Neato!

So there you have it. Five quick examples of Doctor Python helping me stop worrying and love ES6. It’s been many months now, and my ES6 usage is now pretty explosive.

Screen capture of Major Kong riding on top of a bomb falling from a plane in the film, Doctor Strangelove.

Mother May I?

Important announcement about updates to API V3 and evolving permissions at Eventbrite

If you want to skip straight to the content on the changes that will impact our API developers please visit our Google Group and read the message pinned to the top.

Permissions have been an ever growing challenge at Eventbrite as we have grown over the years. With scale, permissioning has become difficult because of the storage requirements, speed, and latency. Imagine a feature where you need to check the permissions for ten users of an account and 100 of the account’s events. Now take into consideration that each individual event can have multiple permissions associated with it.  You can start to get an idea of both the storage requirements and the speed considerations. Even if each permissions check is very fast, executing all of them serially will become slow.
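To get a feel for the fan-out, the sketch below enumerates every check in that scenario and runs them on a thread pool rather than one after another. The permission names, the `has_permission` stub, and the pool size are all hypothetical; this illustrates the scale of the problem, not how Eventbrite's permission service works.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Sizes taken from the scenario above; names are made up.
users = [f"user-{i}" for i in range(10)]
events = [f"event-{i}" for i in range(100)]
permissions = ["view", "edit", "refund"]  # assumed permission names


def has_permission(user, event, permission):
    """Stand-in for a real permission lookup (e.g. a network or DB call)."""
    return permission == "view"


def check_all(pool_size=8):
    """Run every (user, event, permission) check on a thread pool."""
    checks = list(product(users, events, permissions))
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(lambda args: has_permission(*args), checks))
    return dict(zip(checks, results))


results = check_all()
print(len(results))  # 3000
```

Even with only three permissions per event, ten users and 100 events already mean 3,000 lookups, so any per-check latency multiplies quickly if the checks run serially.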


Britecharts v2.0 Released

Britecharts, Eventbrite’s D3-based charting library, has grown with additional charts contributed by the community. It is now a mature library, but it still lacks some charts used in today’s standard DataViz suites. We want to add these charts, and that means we will experience some growing pains. We wondered: how could we make that process easier?

Packaging and Releasing Private Python Code (Pt.1)

When dealing with a large Python code base managed by multiple teams, you often find that you need to be able to package and release this code independently. Most best-practices guides for releasing Python packages focus on public packages, and do not cover complex dependencies. In this post I’ll focus on how we, at Eventbrite, release our internal Python packages and avoid dependency hell while doing so. This first part will cover defining packages and their dependencies, while the second part will cover building and distributing Python wheels internally.


Introducing Britecharts: Eventbrite’s Reusable Charting Library Based on D3

The usual workflow when developing interactive data visualizations with D3.js is based on the significant number of examples that the D3 community provides. They are broad and useful, but they are still not ideal. Most of the time, they require a lot of effort to integrate into your code and to make them production-ready.

In a previous series of posts about Leveling Up D3, I talked about a different way of building D3.js charts, using the Reusable API, building our components via TDD and improving them with events and refactorings. Following those ideas, and with the help of Eventbrite’s design team, we have been working on our chart library, and now we want to share it with you. It’s called Britecharts.


5 Good Practices I Follow When I Code Using Git

Nowadays, using Git is almost a rule, and of course tools like GitHub, GitLab, and Bitbucket are almost a standard.

To me, the size of the project I am coding really doesn’t matter. It could be for my current job, a freelance gig, or one of my personal apps: I always use Git.

I think that habit is like a cane to walk the road to perfection. That’s why I do not just use Git but also always follow some best practices that I have learned.


  1. Use branches for features, AB tests, fixes, etc.
  2. Commit often.
  3. Use clear commit messages.
  4. Always use pull requests.
  5. Keep master releasable.