All the way back in 2015 when Drew and I were first bouncing around the ideas that eventually led to dbt, the very first problem I was obsessed with was the complete lack of community in the data analyst ecosystem. Software engineers had Hacker News and Stack Overflow and so many great conferences and communities tiny and huge organized around projects and methodologies and demographic groups. The social capital within software engineering is massive.
It’s communities that are responsible for the unbelievably fast rate of innovation in the way that software is written. The migration from waterfall to XP to Agile, the migration from bare metal to virtualization to cloud to containerization — each of these implies changes in the way that software is built. To take advantage of these shifts, practitioners need to be constantly evolving, constantly learning, and the field needs to go through this process together.
Communities act as the primary transmission vector for these (and so many other) new software engineering practices. Communities determine what’s exciting and how it gets used, what new practitioners learn, and where investment dollars flow. And in turn, they give participants professional development, a sense of identity, and a way to give back.
Compared to this rich social fabric, analysts in 2015 had nothing.
I remember trying to figure out how to build MRR models in SQL. I knew I needed a table with one record per customer per month, but the thing I couldn’t figure out was how to handle the situation where a customer didn’t pay for one month and then reactivated. How could I generate a row if there was actually no data for that month?
It turns out that this is a well-understood problem, and the answer is to join on a months table. But I had never had to do that before and it just didn’t occur to me. Stack Overflow? Nothing. Google results? Useless.
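For anyone who hasn’t seen the pattern, here is a minimal sketch of the months-table join, run through Python’s built-in sqlite3 module so it works anywhere. The table and column names (months, payments, customer_id, amount) are made up for illustration. Bounding each customer between their first and last payment month is an assumption too; a real model might keep generating rows up to the current month instead.

```python
import sqlite3

# In-memory database with hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE months (month TEXT PRIMARY KEY);
INSERT INTO months VALUES
  ('2015-01-01'), ('2015-02-01'), ('2015-03-01'), ('2015-04-01');

CREATE TABLE payments (customer_id INTEGER, month TEXT, amount INTEGER);
-- Customer 1 pays in January and February, skips March, reactivates in April.
INSERT INTO payments VALUES
  (1, '2015-01-01', 100), (1, '2015-02-01', 100), (1, '2015-04-01', 100),
  (2, '2015-02-01', 50), (2, '2015-03-01', 50);
""")

MRR_QUERY = """
WITH customer_span AS (
    -- First and last month in which each customer paid anything.
    SELECT customer_id, MIN(month) AS first_month, MAX(month) AS last_month
    FROM payments
    GROUP BY customer_id
)
SELECT
    s.customer_id,
    m.month,
    COALESCE(SUM(p.amount), 0) AS mrr
FROM customer_span AS s
-- The months table supplies a row for every month in the span,
-- whether or not a payment exists for it.
JOIN months AS m
  ON m.month BETWEEN s.first_month AND s.last_month
-- The left join keeps the months with no payment; COALESCE turns them into 0.
LEFT JOIN payments AS p
  ON p.customer_id = s.customer_id AND p.month = m.month
GROUP BY s.customer_id, m.month
ORDER BY s.customer_id, m.month
"""

mrr_rows = conn.execute(MRR_QUERY).fetchall()
for row in mrr_rows:
    print(row)
```

Customer 1 gets a March row with an MRR of 0 even though the payments table has nothing for that month, which is exactly the row that can’t be produced from the payment data alone.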
If you were doing this work back in 2015 you know this experience. Your only sources of knowledge were your coworkers and your personal network. There were tons of new tools (the entire modern data stack was brand new) but nowhere for analysts to learn from each other. There was no consensus on best practices. Analysts had no sense of shared identity, limited ability to grow their skillsets and networks, and certainly no way to give back to their peers.
As a result, “data analyst” was kind of a shit job. The standard career advice was to migrate to data science or data engineering as quickly as possible.
This is the environment into which the dbt community was born.
As frustrated as I was with this, I don’t really understand how groups of people work. I would’ve loved to have some brilliant idea about how to create a community full of like-minded analysts who wanted to support each other and push the field forwards, but I literally didn’t know what the first step towards doing that could possibly be. So I shrugged and moved on. I actually had good ideas about how to build this data transformation product I had been thinking about and figured I’d get to work on that.
We started dbt Slack before we started the company, all the way back in April of 2016. We didn’t think of it as a community, we thought of it as a public support and feedback forum, a way for us to do cheap customer development for dbt. At first it was crickets. I still remember when our first users randomly showed up—who’s that and how did they hear about us!?
Very quickly, we had a very special group on our hands. Because there weren’t very many of us, we all felt like we knew each other. And because we were all engaged in doing the same kind of work, we didn’t just talk about dbt, we talked about the entire ecosystem. Questions about Stitch and Fivetran, Mode and Looker, Redshift and Snowflake were all asked and answered every single day. So many arguments were had about SQL clients, about star schemas. Inside jokes developed, including my favorite “Have you tried switching to Snowflake?” response to every single question in #redshift. Cultural norms were established: threading, non-gendered pronouns, and many more. Meetups were organized and much fun was had.
Eventually, the community became self-aware. New members would post thank-yous and you-are-awesomes, saying how much the support of the community had helped them. And this kind of positive feedback created a magical virtuous cycle—everyone likes to pay it forwards.
As of this writing, dbt Slack just passed 4,000 members. It’s no longer a small group, but it still has the same essential qualities that it always had. Members are helpful, with many questions leading to threads with dozens of responses. Members are friendly, enthusiastic, and shockingly thorough.
All we meant to do was to support some early dbt adopters, but somehow all of us together built something much more.
It’s time this community had an event to meet in person. That’s what Coalesce is: the one event every year where the global dbt community gathers together in one location.
We’ll have a lot more to share about how this event is taking shape in the coming weeks and months. For now, if you’re willing to trust us, go ahead and book your early-bird ticket.
If analytics is a subfield of software engineering, then analysts need communities every bit as vibrant as those in software engineering.
Thanks for reading. I can’t wait to see you there.