November 17, 2013

The Online Dating Email: Personalized vs Generic

The following is an excerpt from my new book, A Hacker's Guide to Online Dating: How to Train Your Computer to Get You Dates. In it I discuss all my tips and tricks learned over years of online dating, how to optimize your online dating strategy, and how to automate the whole process.

So you want to know if your online dating emails need to be personalized? You want to know whether you need to read her profile to craft an eloquent essay about all the things you have in common, or if you can get away with sending a generic, cut-and-paste email to everyone. It’s an important question. Crafting that perfect personalized email takes a lot of time. And if you consider the fact that even with a perfect email, the likelihood of a response is fairly low, you’re going to have to send somewhere between 5 and 20 emails to get one response. All that time writing emails adds up fast.

If you do some research online, you’ll probably see a lot of articles written by women whining about how generic emails are such a turn-off, how unromantic they are, and how no one wants to respond to them, yada, yada, yada. Some articles are less emotional, but also conclude that generic emails are less likely to get a response, usually using a mixture of opinion and anecdote. But those people are missing the point. They’re answering the wrong question. The question isn't, Do people prefer personalized emails to generic ones? The question isn't even, Are people more likely to respond to personalized emails than generic ones? No, the relevant question is much more specific: Is the increase in response likelihood gained by a personalized email over a generic email worth the time it takes to write it? The answer to that question is an unequivocal no. Here’s how I know.
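To make that question concrete, here is a rough back-of-the-envelope sketch in Python. The response rates and writing times are hypothetical numbers picked only for illustration (they are not results from the experiment), and the function names are my own. The idea is to compare expected responses per hour of effort rather than response rate per email.

```python
def responses_per_hour(response_rate, minutes_per_email):
    """Expected number of responses earned per hour spent writing emails."""
    emails_per_hour = 60.0 / minutes_per_email
    return response_rate * emails_per_hour


def break_even_rate(generic_rate, generic_minutes, personalized_minutes):
    """Response rate a personalized email needs to match the generic payoff."""
    return generic_rate * personalized_minutes / generic_minutes


# Hypothetical inputs, for illustration only.
GENERIC_RATE = 0.08          # assumed: 1 response per ~12 generic emails
PERSONALIZED_RATE = 0.12     # assumed: personalization helps somewhat
GENERIC_MINUTES = 1.0        # assumed: paste and send
PERSONALIZED_MINUTES = 15.0  # assumed: read the profile, write the essay

generic = responses_per_hour(GENERIC_RATE, GENERIC_MINUTES)
personalized = responses_per_hour(PERSONALIZED_RATE, PERSONALIZED_MINUTES)
needed = break_even_rate(GENERIC_RATE, GENERIC_MINUTES, PERSONALIZED_MINUTES)

print(f"generic:      {generic:.2f} responses per hour")
print(f"personalized: {personalized:.2f} responses per hour")
print(f"break-even personalized response rate: {needed:.2f}")
```

With these made-up inputs, the personalized email would need a response rate above 100 percent just to break even, which is the sense in which the time cost can swamp any gain in response likelihood.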

The experiment

In an effort to optimize the online dating process, I set out to find which factors determine whether or not a woman will respond to my email. To do this, I needed some data. So I conducted a little Match.com field experiment. With the help of some open source software, I set up an algorithm to randomly email women in my area and record the vital stats of each: things like Age, Height, Location, Ethnicity, self-reported Body Type, and whether or not they were listed as “new” according to whatever criteria Match uses. Out of curiosity, I also recorded the “percent match” to see if the algorithm used by Match had any effect on whether a woman would respond to one of my emails. I wanted to somehow track attractiveness, so I began by subjectively rating each profile myself, but it became too time-consuming, so I gave that up.

The results were interesting to say the least. Take a look at the table below. You can read all about the nitty gritty statistical details of the analysis in the sidebar if you’re so inclined, but first let’s just look at the results. The table shows which variables have the biggest impact on whether or not a woman is going to respond to an initiation email.

What doesn't matter

Note what’s conspicuously missing. The “percent match” produced by Match.com’s algorithm had no significance. I thought that height might have an impact, but that had no discernible effect either. Many online dating advice articles recommend sending the initial email on certain days of the week, but the timing didn't have any effect in my experiment. Neither did the factor that I call “delay,” which is the amount of time after a member joined the site before I sent the initial contact email. After the initial new member period of a week or two, it didn't matter how much longer I waited.

What does matter

What did matter? Well, basically, all the items in the table above, except the email text. If you’re not familiar with statistical analysis, let me just briefly explain the numbers in the table. The “P Value” estimates how significant the analysis results are for a certain variable. One way to look at the P Value is to think of it as roughly the likelihood of seeing an effect at least this strong by pure coincidence if the variable actually made no difference. So a lower P Value means the results are stronger. A general rule of thumb is that any variable with a P Value below 0.05 or so is considered “statistically significant.” So in the table above, Age and location (the Nearby variable) are very strong indicators of email response rate. Recent profile activity and the profile being a “new” member also are fairly strong indicators. The email text is not significant at all according to the P Value.
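If you want a feel for how a P Value behaves, here is a small Python sketch. It uses a simple two-proportion z-test, which is a swapped-in, simpler technique than the probit analysis I actually ran, and the counts are hypothetical numbers chosen for illustration, not my data.

```python
import math


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for a difference between two response rates."""
    rate_a = hits_a / n_a
    rate_b = hits_b / n_b
    # Pooled rate under the assumption that both groups respond equally often.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail area of the standard normal: P(|Z| > |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: (responses, emails sent) for each group.
z_nearby, p_nearby = two_proportion_p_value(30, 200, 20, 400)  # nearby vs. far
z_email, p_email = two_proportion_p_value(26, 300, 24, 300)    # email A vs. B

print(f"nearby vs. far:   z = {z_nearby:.2f}, p = {p_nearby:.5f}")
print(f"email A vs. B:    z = {z_email:.2f}, p = {p_email:.2f}")
```

In this made-up example the location gap is far too large to blame on chance, while the gap between the two email texts is well within what coincidence alone would produce.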

The Z-Score Delta is essentially an indirect measure of how each variable affects the probability of an email response. This is easier to understand if you look at the chart below, which shows how the estimated email response rate changes with each variable. All else being equal, the likelihood of a response increases with the age of the emailee. This makes sense to me, since age seems to be a good proxy for general desirability. Men prefer younger women, and those women get more emails, making them less likely to respond to yours.

Now look at how the line shifts when we start considering the other variables. There’s a huge jump for nearby profiles, and a huge decrease for inactive profiles. There’s also a fairly big drop for “new” profiles. But take a look at the email curve. Even if you ignore the fact that the P Value is so large and you pretend the analysis results are valid, the impact of the email is still tiny. The difference between the baseline curve and the email curve is almost imperceptible, and it’s dwarfed by the other variables that have a meaningful impact.
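Here is a minimal Python sketch of how a shift on the z-score scale turns into a change in response probability. It assumes the Z-Score Delta is simply added to a baseline z-score before applying the standard normal curve, which is how a probit model works. The baseline and delta values are hypothetical, not the numbers from my table.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def response_probability(baseline_z, *deltas):
    """Probit-style probability: shift the z-score, then apply the normal CDF."""
    return normal_cdf(baseline_z + sum(deltas))


# Hypothetical values, for illustration only.
BASELINE_Z = -1.4     # assumed baseline z-score for a typical profile
NEARBY_DELTA = 0.5    # assumed large shift for a nearby profile
EMAIL_DELTA = 0.02    # assumed tiny shift for a different email text

baseline = response_probability(BASELINE_Z)
with_nearby = response_probability(BASELINE_Z, NEARBY_DELTA)
with_email = response_probability(BASELINE_Z, EMAIL_DELTA)

print(f"baseline:     {baseline:.3f}")
print(f"nearby:       {with_nearby:.3f}")
print(f"email change: {with_email:.3f}")
```

A big delta moves the estimated response rate by a lot, while a delta close to zero leaves the curve almost exactly where it started.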

Bottom line, the email just doesn't matter. Let me say that again in all caps so there’s no confusion: THE EMAIL JUST DOESN'T MATTER!

Now you might still be tempted to ignore these results, because hey, I’m just one guy. True, these are the results for me in my specific situation. But consider this: My results are supported by Plenty of Fish, who probably has access to a little more data than me. Here’s a note that Plenty of Fish displays to help you with your profile:

We've found that women READ your profile BEFORE they open your email. If your profile description sucks it doesn't matter how good your email is. Make sure your description contains "Hopes and aspirations", "Hobbies/interests in general", "Musical Tastes". These are all conversation starters and will show a woman you have something in common with her. If your profile description is blank or super short you are 9 times more likely to get "unread deleted". Edit your description section.

The key here is that women read the profile before they read your email! She’s likely to have already decided whether or not she’ll respond before she even opens your email.

The Guts of My Analysis

For those who are statistically inclined, here’s what I did. Using a combination of AutoIt (open source automation software) and R (open source statistical analysis software), I set up an algorithm to scrape profile search results from Match.com and automatically email random profiles in my area over the course of several weeks. I tracked key variables like age, height, percent match, city, ethnicity and whether or not I got a response. All told, I had close to 600 pieces of data. Using R I conducted single-variable probit analysis for each variable with respect to the email response likelihood and found which ones showed any significant correlation (as measured by the P Value). Then, using only the significant variables, I conducted a multiple-variable probit analysis, the results of which are reported here.
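For a sense of what a single-variable probit analysis involves, here is a rough Python sketch. It is not the R code I used. It fits one predictor by Fisher scoring using only the standard library, and it runs on synthetic data generated inside the script (600 made-up profiles with an assumed age effect), so every number in it is hypothetical.

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_probit(xs, ys, iterations=25):
    """Fit P(y=1) = Phi(b0 + b1*x) by Fisher scoring; return (b0, b1, se_b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # score: gradient of the log-likelihood
        i00 = i01 = i11 = 0.0  # entries of the 2x2 Fisher information matrix
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x
            p = min(max(normal_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = normal_pdf(eta)
            r = d * (y - p) / (p * (1.0 - p))
            w = d * d / (p * (1.0 - p))
            g0 += r
            g1 += r * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Update step: beta += inverse(information) * score.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    # Standard error of the slope from the last information matrix computed.
    se_b1 = math.sqrt(i00 / det)
    return b0, b1, se_b1


# Synthetic data: ages 25 to 45, with an assumed positive effect of age.
rng = random.Random(0)
ages = [25.0 + 20.0 * rng.random() for _ in range(600)]
mean_age = sum(ages) / len(ages)
centered = [a - mean_age for a in ages]  # centering keeps the fit stable
responses = [1 if rng.random() < normal_cdf(-1.25 + 0.05 * c) else 0
             for c in centered]

b0, b1, se_b1 = fit_probit(centered, responses)
z = b1 / se_b1
p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided P Value for the slope
print(f"slope = {b1:.4f}, std err = {se_b1:.4f}, z = {z:.2f}, p = {p_value:.4g}")
```

Repeating a fit like this for each recorded variable, and keeping only the ones with a small P Value, mirrors the screening step described above. The multiple-variable model needs a general matrix solver, which is where a package like R earns its keep.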

If you’d like to conduct your own analysis, or just automate your emails, I've provided the code in the appendix to my book.