Coding, How-To, New York City, Random

How a Copywriter Becomes a Software Engineer

When I graduated college, I had a degree in Philosophy and a desire to work as a writer — specifically as a journalist.

“So much liberal, so much arts.”

This eventually became, “so much for liberal arts!”

And for a while, writing was precisely the work that I did. I moved to NYC and started my career as a creative (copywriter and strategist) at a digital ad agency. I was given a cheap laptop and a seat without much leg room and was put right to work — thank God the work didn’t involve me moving because my legs were asleep from the lack of space (not true, unless I was in outer space, or something).

But before I started that job as a copywriter, I had this crazy tech company idea: Newspanion — “it’s like Pandora, but for news.” Back in 2010/2011, there weren’t a lot of Pandoras for news, so it was a novel idea, which is no longer the case. My business partner, Andrew, was a guy I had met some months earlier in line for a bar (apparently my early-20s self thought it was acceptable to wait in lines to access alcohol).

Together, he and I wanted to start a tech company…

and we had pretty much zero tech skills.

What does that mean? Well, we weren’t going to be technologically engineering anything, unless you count engineering e-mails to try to find a technical cofounder.

So here I was, a copywriter/strategist moonlighting as a tech entrepreneur — writing copy by day and business plans by night. Collectively, Andrew and I did a lot of research, wrote a lot of business stuff, sent a lot of business e-mails, and tried our damnedest to find a technical cofounder.

But it never worked out; something always went wrong.

It was around my 5th or 6th month at the agency when I decided that I would try to build a prototype for our startup so we could pitch investors. I didn’t know where to start.

I hadn’t written any HTML since GeoCities in 1997.

Hell, I didn’t even know if HTML would do it. I did some research, and everyone was always talking about this Ruby on Rails thing. That sounded like it would do the trick because it sounded fancy enough. But some people said I should learn Ruby before Rails (and I was like, “Where’s the train platform again? Is that near the jewel excavation site?”).

I took their advice and bought a book on Ruby. And the next thing I noticed was that everyone using Rails seemed to be using Macs. In fact, lots of software engineers were on Macs, and I had a rickety-ass, 7-year-old Toshiba that looked a lot like something you could use as an ad-hoc baseball plate (if you were actually caught in a base-less-baseball-game pinch). This laptop looked like a stove, or an ironing board, or a model UFO. Actually, not a model UFO, that would look cool; this thing was so uncool.

toshiba laptop

My Toshiba laptop and netbook. Remember netbooks?!

Coming to terms with these facts, I bought a 13-inch MacBook Pro, which I’ve been using as a reliable (and expensive) paperweight recently. When I unboxed that puppy though, it was like coding Christmas. Ruby was already installed! Things were going to be so great!

… And pretty much every day for a few months, I went to work, wrote some ads and copy.

Then I went home and wrote some really, really bad code.

Actually, a lot of the time I didn’t even write it, I copy-pasta’d it. Those early days were filled with plenty of soup and copy pasta. I might as well have called myself a sous-do chef!

I really had no idea what I was doing. I didn’t have a good framework for understanding the material. My studies in philosophy helped me through these Boeing-747-turbulent times by teaching me to ask good questions (I also had plenty of others… caution: article contains bad words!):

  • “What is the problem I am trying to solve here?”
  • “What is this doing? I want to understand this.”
  • “If this works and that works, what happens when I remove this? Does it still work?”

Good ol’ deductive reasoning. I never thought I’d be deducing bug origins (of type software or creepy-crawly) when I was sitting reading Nicomachean Ethics, but gosh was I wrong.

I just kept plugging away, until one day,

I had a really shitty-spaghetti prototype (Bon Appétit!)

of our web application.

But hey, it worked, and now we could at least show people our idea. Long story short: we showed and showed, but just never could get the traction we were looking for, and the startup died.

Meanwhile though, back in real-life world, I was still a copywriter and strategist, writing ads that you probably didn’t even notice while surfing the interwebs. About 12 months into that job, I decided I’d try a hackathon. At the event, I luckily sat next to an engineering wizard, Toby (basically the Harry F’ing Potter of coding), and we won. With the taste of victory so fresh, I went to another hackathon and won that one too. I will always hold that we won because of our teams; my teammates defined excellence. At these hackathons I provided product/creative contributions, in addition to writing mangled HTML/CSS/JavaScript.

It was about this time that I was feeling pretty awesome about my new skillz. I mean, I could actually pay the billz now!

Fast forward a little bit to 2012. I quit my job as a copywriter and joined AppNexus, that global ad tech company you may have heard of. I started at AppNexus as a technical account manager (TAM), since there was no way I could really be hired as a software engineer (well, maybe, if I only did front-end). At tech companies, there are lots of people more technical than you when you are a self-taught web developer. And so I was a TAM, TAMing away to the beat of my own drum. TAMing was great because it taught me that I had enormous gaps in my computin’ knowledge. I didn’t know how to use curl or write SQL statements when I started at AppNexus. But most important of all, I learned that I didn’t really know how to figure things out on my own yet — I mean the type of figuring out that doesn’t include Stack Overflow, but instead includes RTFMing.

Just RTFM!!


As a TAM, I found myself doing all sorts of repeated tasks in Excel, which I didn’t like much, so I took up scripting and Python.

I’ll tell you what, my first Python scripts were so bad

that someone probably should have written a script…

to rewrite all of mine.

Sometimes I wondered if I should have stuck to writing movie scripts.

With time, though, I got faster and better, and was on to the next challenge: Data Analysis with Python. I moved into a new role as an analyst. Being a data analyst is cool because you get to decide on which tools you want to use to conduct your analysis, and then you also get to decide how you want to display the data. I chose Python for everything I did, just to get better. And better I got. (I also would often do things the hard way just to get the most learning out of the task.)

In August of 2015 I moved into a Software Engineering role,

approximately 1540 days after my initial efforts

to learn how to build software back in May 2011.

Granted, I took plenty of breaks in between and probably could have accelerated the process by years by going for an internship, or more school (… ew). But I didn’t go that route, because it’s about the journey, not how you get there! Wait a minute. It’s how you got there, not the journey. Uh, scrap the platitudes… Did someone say journey?

What I meant to say is: Don’t stop believing.

Standard
Coding

How to Retrieve and Analyze Your iOS Messages with Python, Pandas and NLTK

I’m one of those people that keeps every text message I send or receive — I never delete them. Meet a girl at a bar, text her the next day and never hear back from her? I keep that. Weird wrong-number texts? I keep those too. Ex-girlfriend texts? Definitely keepers.

I had 65,378 messages on my phone at the time of writing this post.

I’m not a digital hoarder or anything, but I primarily do this because I like the idea of being able to search through the past. But, digital hoarder or not, collecting anything takes up some sort of space, and when I found that my text messages were taking up 4GB of space on my phone, I decided it was time to back them up. It was at that point that I realized I could also probably analyze them.

As it turns out, you can do this, and I’ll tell you how. For this project, I used Python/Pandas/NLTK for the analysis and an IPython Notebook to render the datasets. I’ve also uploaded the code to GitHub, which you can view here.

An overview of the steps to make this happen:

  1. Sync/back up your iPhone because the messages need to be stored on your computer.
  2. Load the SQLite file and retrieve all messages
    • You can follow the directions for retrieving the right file here.
  3. Analyze those messages (I used Pandas)!!

Let’s get into some details.

You need to sync and back up your phone’s contents to your computer. There’s a great post on how to do this here. In case you want to skip that read: you’re ultimately after a single file that contains the text messages; copy it into your working directory.

You can find the file with this bash command:

$ find / -name 3d0d7e5fb2ce288813306e4d4636395e047a3d28

Now, loading the SQLite file — you can actually see what’s in this file via the command line:

$ sqlite3 3d0d7e5fb2ce288813306e4d4636395e047a3d28

Then you can check out the available tables:

sqlite> .tables
_SqliteDatabaseProperties chat_message_join
attachment handle
chat message
chat_handle_join message_attachment_join

From here, the main tables I found useful were “message” and “handle.” The former contains all of your text messages, and the latter contains all of the senders/recipients. I only wrote code around the messages table, primarily because I could never figure out how to make a join between message and handle, but that was probably something trivial that I overlooked. Please tell me how you did it, if you did!
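For the curious, a join along these lines may do the trick — this is a sketch, assuming `message.handle_id` references `handle`’s `ROWID`, which is worth verifying against your own backup:

```python
import sqlite3


def fetch_messages_with_senders(db_path, limit=5):
    # Join each message to its sender/recipient via the handle table.
    # Assumption: message.handle_id points at handle.ROWID.
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT message.text, handle.id "
        "FROM message "
        "JOIN handle ON message.handle_id = handle.ROWID "
        "LIMIT ?", (limit,)).fetchall()
    conn.close()
    return rows
```

Calling `fetch_messages_with_senders('3d0d7e5fb2ce288813306e4d4636395e047a3d28')` should return (text, phone number/email) pairs, if the assumption holds.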

Continuing on, the message table has lots of columns in it, and I chose to select from the following:

['guid', 'service', 'text', 'date', 'date_delivered', 
'handle_id', 'type', 'is_read','is_sent', 'is_delivered',
'item_type', 'group_title']

The key field is “text,” which stores the content of the message, emojis included! (A cool thing is that your emojis will show up if you try to plot them in something like an IPython notebook. You could run an entire analysis on emoji usage…)

My analysis, however, ultimately breaks down into two pieces:

  1. Analyzing the content of the “text” field (excluding emojis).
  2. Analyzing the messages themselves (for example, total text messages, or sent vs. received).

For #1, I wrote code that:

  • Classifies all words and assigns a part of speech to them, then checks the counts of each part of speech.
    • You should get a table looking like this.

  • Counts the number of times each word appears in the dataset, and gives an overview of the dataset.
  • Excludes boring words, like prepositions, and words that are < 2 characters.
  • Classifies all words as is_bad=1 or 0. I did this by using a .txt file full of bad words, found here.
  • Plots usage of bad words
    • I’d love to show you my plot, but let’s just assume I never swear…
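The counting and filtering bullets boil down to something like this standard-library sketch (the stop-word set here is a stand-in for whatever you consider boring; the part-of-speech step uses NLTK’s `pos_tag`, which I’ll leave to the repo):

```python
import re
from collections import Counter

# Illustrative stop-word list -- swap in your own "boring words."
BORING = {'the', 'a', 'an', 'of', 'to', 'in', 'on', 'at', 'for', 'and'}


def word_counts(texts):
    # Tokenize each message, drop short and boring words, tally the rest.
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", (text or '').lower()):
            if len(word) >= 2 and word not in BORING:
                counts[word] += 1
    return counts
```

Feeding it the “text” column gives you the word-frequency table; `counts.most_common(20)` gets you the usual suspects.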

For #2, the code allows you to:

  • Plot the number of text messages received each day (check out the spike on your birthday or during holidays). You can see my data below has a huge gap (that’s when my phone was replaced and not backed up for many months). My timestamp conversions are also apparently incorrect, but I haven’t looked into it.
    • The timestamp conversion is off, so someone can fix that… we’re not in 2016 yet… Are we??

  • Count the number of sent versus received messages.
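About that timestamp quirk: it likely comes down to the epoch. The `date` column in the message table appears to count seconds from Apple’s reference date of 2001-01-01 rather than the Unix epoch of 1970-01-01, so treating it as a Unix timestamp shifts every date back about 31 years. A plausible fix, under that assumption:

```python
from datetime import datetime, timedelta

# Apple's reference date (Core Data "absolute time"), not the Unix epoch.
APPLE_EPOCH = datetime(2001, 1, 1)


def apple_timestamp_to_datetime(raw_seconds):
    # Convert a raw message.date value to a normal datetime.
    return APPLE_EPOCH + timedelta(seconds=raw_seconds)


print(apple_timestamp_to_datetime(441763200))  # 2015-01-01 00:00:00
```

(Newer iOS versions reportedly store nanoseconds in this column, in which case you’d divide by 1e9 first.)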

Anyway, I hope you can get some use out of this, and instead of blabbing on about the code here, I’ll just let you read it and use it on your own. Please check out my git repo, and please reach out to me with questions, comments, etc.

Coding, How-To

How to Create Geo HeatMaps with Pandas Dataframes and Google Maps JavaScript API V3

Get excited because we’re going to make a heatmap with Python Pandas and Google Maps JavaScript API V3. I’m assuming the audience has plenty of previous knowledge in Python, Pandas, and some HTML/CSS/JavaScript. Let’s begin with the DataFrame.

The DataFrame

First, you’re going to need a dataframe of “addresses” (can be a physical address, or even just a country name, like USA) that you eventually want to plot. (For the sake of simplicity, I’ll try to refer to the “address” as the “geo” for the rest of this document.) Second, since you are planning on using a heatmap, you’re going to want some sort of number that represents the weighted value of that row in comparison to other rows.

Let’s say your DataFrame looked like this:

grouped_country_df = main_df.groupby('country')\
                            .agg({'pink_kitten': lambda x: len(x.unique())})\
                            .sort('pink_kitten', ascending=False)
print grouped_country_df
geo_name          count_of_pink_kittens
USA                                3430
Spain                               577
United Kingdom                      352
Israel                              292
Austria                             196
Argentina                           151
India                               133
Singapore                            66

Now you have a list of geos and some values to use as the weight when later creating the heatmap. But to plot these points, you’re going to need some lat and long coordinates.

Getting Lat Long Coordinates from Google Maps API

If you have a list of geos or “addresses,” you can use Geocoding to convert those geos into lat/long coordinates. From Google: “Geocoding is the process of converting addresses (like “1600 Amphitheatre Parkway, Mountain View, CA”) into geographic coordinates (like latitude 37.423021 and longitude -122.083739), which you can use to place markers on a map, or position the map.”

To use this Google Maps service, you need to have a Google Maps API key. To get a key, you can follow the directions here. When you sign up for an API key, you should select “Server Side Key,” since we will be running a Python script server-side to access the Google Maps API.

Once you have your api_key, you can work on getting geocoded results for all of your geos. You can do this with the following code:

import requests
# set your google maps api key here.
google_maps_api_key = ''

# get the list of countries from our DataFrame.
countries = grouped_country_df.index
for country in countries:
    # make request to google_maps api and store as json. pass in the geo name to the address 
    # query string parameter.
    url = 'https://maps.googleapis.com/maps/api/geocode/json?address={}&key={}'\
         .format(country, google_maps_api_key)
    r = requests.get(url).json()

    # Get lat and long from response. "location" contains the geocoded lat/long value.
    # For normal address lookups, this field is typically the most important.
    # https://developers.google.com/maps/documentation/geocoding/#JSON

    lat = r['results'][0]['geometry']['location']['lat']
    lng = r['results'][0]['geometry']['location']['lng']

This only gets you so far, since you still need to do something with those latitude and longitude coordinates. We have a few options here:

  1. If you are building a web application, you can pass those values into an HTML template as variables and they will end up getting plotted via JavaScript.
  2. We can print out the JavaScript and later paste it into our HTML file within script tags.
  3. Other approaches that I’m not going to talk about.

For the sake of time, I’m going to show #2, which lends itself to a one-off analysis. You’d probably want to go with some dynamic templating approach, like #1, if you are going to pull and plot the same data repeatedly.

Add the following code to your for-loop from above, right underneath:

lng = r['results'][0]['geometry']['location']['lng']

# set the country weight for later. by getting the value for each index in the dataframe
# as it loops through.
country_weight = int(grouped_country_df.ix[country])
 
# print out the Javascript that we will be copy-pasting into our HTML file
print '{location: new google.maps.LatLng(%s, %s), weight: %s},' % (lat, lng, country_weight)

After running your script, copy the output, which should look like this:

{location: new google.maps.LatLng(37.09024, -95.712891), weight: 3430},
{location: new google.maps.LatLng(40.463667, -3.74922), weight: 577},
{location: new google.maps.LatLng(55.378051, -3.435973), weight: 352},
{location: new google.maps.LatLng(31.046051, 34.851612), weight: 292},
{location: new google.maps.LatLng(47.516231, 14.550072), weight: 196},
{location: new google.maps.LatLng(-38.416097, -63.616672), weight: 151},
{location: new google.maps.LatLng(20.593684, 78.96288), weight: 133},
{location: new google.maps.LatLng(1.352083, 103.819836), weight: 66},

You’re going to use these values in the next step.

Creating an HTML File that Contains JavaScript for Plotting Your Lat/Long Points

You need to create an HTML file that contains some script tags within it. I am simply going to paste my code below with annotations. If you copy the location strings from above, you will be able to paste them directly into this HTML file under the “heatmapData” array (defined below in the code).

<!DOCTYPE html>
<html>
  <head>
    <title>Simple Map</title>
    <meta name="viewport" content="initial-scale=1.0, user-scalable=no">
    <meta charset="utf-8">
    <style>
      html, body, #map-canvas {
        height: 100%;
        margin: 0px;
        padding: 0px
      }
    </style>
    <!-- Load the Google Maps API with the visualization library,
         which provides HeatmapLayer. Append &key=YOUR_API_KEY if your
         usage requires one. -->
    <script src="https://maps.googleapis.com/maps/api/js?v=3.exp&libraries=visualization"></script>

    <script>
    function initialize() {
      var heatmapData = [
        {location: new google.maps.LatLng(37.09024, -95.712891), weight: 3430},
        {location: new google.maps.LatLng(40.463667, -3.74922), weight: 577},
        {location: new google.maps.LatLng(55.378051, -3.435973), weight: 352},
        {location: new google.maps.LatLng(31.046051, 34.851612), weight: 292},
        {location: new google.maps.LatLng(47.516231, 14.550072), weight: 196},
        {location: new google.maps.LatLng(-38.416097, -63.616672), weight: 151},
        {location: new google.maps.LatLng(20.593684, 78.96288), weight: 133},
        {location: new google.maps.LatLng(1.352083, 103.819836), weight: 66},
      ];
       
      // Add some custom styles to your google map. This can be a pain. 
        // http://gmaps-samples-v3.googlecode.com/svn/trunk/styledmaps/wizard/index.html
      var styles = [ 
        {
          "featureType": "administrative",
          "stylers": [
            { "visibility": "off" }
          ]
        },
        {
          "featureType": "road",
          "stylers": [ 
            { "visibility": "off"}
          ]
        },
        {
          "featureType": "landscape",
          "elementType": "geometry.fill",
          "stylers": [
            { "color": "#ffffff" },
            { "visibility": "on" }
          ]
        },
      ];
      // create a point on the map for the Atlantic Ocean, 
      // which will later be used for centering the map.
      var atlanticOcean = new google.maps.LatLng(24.7674044, -38.2680446);
      // Create the styled map object.
      var styledMap = new google.maps.StyledMapType(styles, {name:"Styled Map"});
      // create the base map object. put it in the map-canvas id, defined in HTML below.
      map = new google.maps.Map(document.getElementById('map-canvas'), {
        center: atlanticOcean, // set the starting center point as the atlantic ocean
        zoom: 3, // set the starting zoom 
        mapTypeControlOptions: {
          mapTypeIds: [ google.maps.MapTypeId.ROADMAP, 'map_style'] // give the map a type.
        }, 
      });
       
      // Create the heatmap object.
      var heatmap = new google.maps.visualization.HeatmapLayer({
        data: heatmapData, // pass in your heatmap data to plot in this layer.
        opacity: 1, 
        dissipating: false, // on zoom, do you want dissipation?
      });
      heatmap.setMap(map); // apply the heatmap to the base map object.
      map.mapTypes.set('map_style', styledMap); // apply the styles to your base map.
      map.setMapTypeId('map_style'); 
       
      // Add a custom Legend to Your Map
        // https://developers.google.com/maps/tutorials/customizing/adding-a-legend
      var legend = document.getElementById('legend');
      map.controls[google.maps.ControlPosition.RIGHT_BOTTOM]
         .push(document.getElementById('legend'));
       
      // This is hard-coded for the countries I knew existed in the set.
      var country_list = ['USA','Spain','United_Kingdom','Israel',
                          'Austria','Argentina','India','Singapore'];
       
      // for each country in the country list, append it to the Legend div.

      for (i = 0; i < country_list.length; i++) {
          var div = document.createElement('div');
          div.innerHTML = '<p>' + country_list[i] + '</p>'
          legend.appendChild(div);
      } 
    }

     google.maps.event.addDomListener(window, 'load', initialize);

</script>
</head>

<body>
    <div id="legend" style="background-color:grey;padding:10px;">
    <strong>Countries Mapped</strong>
    </div>

    <div id="map-canvas"></div>
    </body>
</html>

Open the HTML file in your browser, and you should see something like this.

google maps heatmap

Et voilà!

Coding

Creating and Using SlugFields for URLs with Django

I haven’t written a post centered around coding in a long while, but I’ve recently been learning Django via the Tango with Django tutorial and I got hung up on SlugFields. My problem was that I had never heard of a slugfield, and thus, really had no idea how they would end up being implemented. Slugs? Yes. Fields? Yes. Metal Slug? Yes (great game series)! But SlugFields? No. Nope. Dunno what those are.

SlugField from Django:

“Slug is a newspaper term. A slug is a short label for something, containing only letters, numbers, underscores or hyphens. They’re generally used in URLs.”

Related notes from StackOverflow.

Still not getting it? Well, if you are creating a model with a SlugField, and you use the slugify function, then that field will expect and hold down-cased, space-replaced values. The reason you want a down-cased, space-replaced value is to be able to use that tidy field value in a URL later on. Here is the regex validation that occurs on the SlugField:

slug_re = re.compile(r'^[-a-zA-Z0-9_]+$')
validate_slug = RegexValidator(slug_re, _("Enter a valid 'slug' consisting of letters, numbers, underscores or hyphens."), 'invalid')

In the Tango with Django tutorial, the code for creating your model is given as follows:

from django.db import models
from django.template.defaultfilters import slugify

class Category(models.Model):
    name = models.CharField(max_length=128, unique=True)
    views = models.IntegerField(default=0)
    likes = models.IntegerField(default=0)
    slug = models.SlugField()

    def save(self, *args, **kwargs):
        self.slug = slugify(self.name)
        super(Category, self).save(*args, **kwargs)

In this case, any time a Category is saved, we are setting the slug attribute to the name of the category by calling the slugify function and passing in self.name:

self.slug = slugify(self.name)

This is the slugify docstring:

Definition:  slugify(*args, **kwargs)
Docstring:
Converts to lowercase, removes non-word characters (alphanumerics and
underscores) and converts spaces to hyphens. Also strips leading and
trailing whitespace.
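To see what that docstring means in practice, here’s a rough standalone equivalent — an illustration only, since Django’s real implementation also handles things like unicode normalization:

```python
import re


def slugify_sketch(value):
    # Roughly what Django's slugify does: drop characters that aren't
    # word characters, whitespace, or hyphens; trim and lowercase; then
    # collapse runs of whitespace/hyphens into single hyphens.
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[-\s]+', '-', value)


print(slugify_sketch('Other Frameworks'))  # other-frameworks
```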

After you’ve made the changes to your model and performed your migrations, you can see how this will look when a new category is created in your category table:

mysql> select * from rango_category;
+----+------------------+-------+-------+------------------+
| id | name             | views | likes | slug             |
+----+------------------+-------+-------+------------------+
|  4 | Python           |   128 |    64 | python           |
|  5 | Django           |    64 |    32 | django           |
|  6 | Other Frameworks |    32 |    16 | other-frameworks |
+----+------------------+-------+-------+------------------+
3 rows in set (0.00 sec)

From here, you would use these slugs to create pretty URLs by either using the slug itself, or using a combination of the slug/id fields.
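As a sketch of how that URL piece fits together — the pattern and names below assume the tutorial’s rango app, and the view body is abbreviated, so treat it as illustrative rather than the tutorial’s exact code:

```python
# urls.py -- route a slug straight to a view.
from django.conf.urls import url
from rango import views

urlpatterns = [
    url(r'^category/(?P<category_name_slug>[\w\-]+)/$',
        views.category, name='category'),
]

# views.py -- look the Category back up by its slug field.
def category(request, category_name_slug):
    category = Category.objects.get(slug=category_name_slug)
    # ... render a template with this category's details ...
```

So a category named “Other Frameworks” ends up reachable at /rango/category/other-frameworks/.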
