
Hackbright Week 6: Sweet, Terrifying Freedom

By Aimee Morgan (Engineering Fellow, Hackbright Academy – summer 2013 class)

Last week was the much-anticipated start of full-time work on our individual projects. Although there were a few moments of terror on Monday morning (“where do I start?”), it was mostly a very good week. I exceeded my own expectations in terms of what I got done and had a lot of fun doing it. I feel a little weird about saying that – I know that it’s still early on and I haven’t had time to get sick of my project yet.

I expect that at some point over the next few weeks the honeymoon period will end and I’ll hit a wall (or five). But for now, things are good. My general stress level is exponentially lower now that I’m free to work at my own pace and decide on my own tasks for any given day.

Some accomplishments so far:

1. Developed a new data model for the NYPL menu collection dataset.

2. Loaded all of the data into a PostgreSQL database. Since I was reconfiguring the data model, this wasn’t as simple as importing the CSV (comma-separated values) files I got from NYPL – one of those files was split between two database tables, and a third table was cobbled together out of columns from two different files.

3. Also, in the “valuable life skills” department: learned how to dump a backup of my database, then reimport it. (More on this later.)

4. Set up an ORM using SQLAlchemy and wrote some basic methods for retrieving information from the database.

5. Decided for sure that I will use Flask for my web app and deploy on Heroku. Read a lot of Heroku documentation, which is surprisingly good. Impressed with Heroku so far and I think it will work well for my beginner-level deployment needs.

6. Worked through large chunks of the Flask mega-tutorial – we used Flask for several exercises earlier this month but just barely scratched the surface.

7. Read a whole lot about natural language processing (mostly this) and took a lot of notes on how I might use NLP techniques on my data.

8. Started work on a Python script to normalize / de-duplicate the database table containing information on restaurants. Since the restaurants table includes a column that serves as a foreign key in the menus table (so that menus are linked to the restaurants that issued them), I can’t just go in there and delete rows without updating the corresponding information in the menus table to point to the authoritative version.

9. Started compiling lexicons to use in the data processing functions (for example: if a dish contains one or more words that appear in this particular list, it is a dish that contains meat).
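
The de-duplication constraint in item 8 (repoint the menus first, then delete the duplicate restaurant) can be sketched roughly as follows. This is a minimal illustration, not the actual project script: it uses Python's built-in sqlite3 in place of PostgreSQL so it runs on its own, and the table names, column names, sample rows, and the normalize_name rule are all assumptions made up for the example.

```python
import sqlite3


def normalize_name(name):
    """Hypothetical matching rule: ignore case, punctuation, and extra spaces."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def merge_duplicate_restaurants(conn):
    """Keep the lowest-id row for each normalized name; return how many rows were merged."""
    rows = conn.execute("SELECT id, name FROM restaurants ORDER BY id").fetchall()
    authoritative = {}  # normalized name -> id of the row that is kept
    merged = 0
    for restaurant_id, name in rows:
        key = normalize_name(name)
        if key not in authoritative:
            authoritative[key] = restaurant_id
            continue
        keep_id = authoritative[key]
        # Repoint the foreign key first so no menu refers to a deleted row.
        conn.execute(
            "UPDATE menus SET restaurant_id = ? WHERE restaurant_id = ?",
            (keep_id, restaurant_id),
        )
        conn.execute("DELETE FROM restaurants WHERE id = ?", (restaurant_id,))
        merged += 1
    conn.commit()
    return merged


def build_demo_db():
    """Create a tiny in-memory database with made-up rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE restaurants (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE menus ("
        "id INTEGER PRIMARY KEY, "
        "restaurant_id INTEGER NOT NULL REFERENCES restaurants(id), "
        "title TEXT)"
    )
    conn.executemany(
        "INSERT INTO restaurants VALUES (?, ?)",
        [(1, "Example Hotel"), (2, "EXAMPLE  HOTEL."), (3, "Sample Cafe")],
    )
    conn.executemany(
        "INSERT INTO menus VALUES (?, ?, ?)",
        [(1, 1, "Dinner"), (2, 2, "Lunch"), (3, 3, "Breakfast")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(merge_duplicate_restaurants(demo), "duplicate row(s) merged")
```

Because foreign key enforcement is switched on, deleting a restaurant that a menu still points to would raise an error, which is why the update comes before the delete. Real restaurant names would likely need a fuzzier matching rule than this one.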

There was one major frustration this week, but not one that was directly related to my project: yesterday morning I had to wipe out my hard drive and reinstall everything from scratch. Let this be a lesson to you: if you choose the lazy method of setting up a Windows 7 / Ubuntu dual boot system (in which your hard drive is not actually repartitioned and your Linux install lives as one giant 40 GB file in a Windows directory), it will eventually come back to haunt you. By which I mean, Windows will spontaneously eat your Linux install. And when that happens, not even a spouse with expert-level Linux skills will be able to help you.

Since I am a well-trained Hackbright student, I’ve been pushing all my project work to GitHub, so nothing was lost. (If anyone is interested, my GitHub is at https://github.com/aimeemorgan; almost everything I’ve done this summer is there.) This is where the “learning to restore a Postgres database from a dump” came in.

This disaster prompted me to finally part ways with Windows and go Ubuntu-only, so I suppose it’s a net positive. I’d been keeping Windows 7 around because there is proprietary software for my camera that I prefer to any of the Linux alternatives, but my husband just built a Windows box for gaming so I can use that when I need it.

My concerns for the coming weeks:

1. Natural language processing is a huge timesuck. And I mean that in the best possible way; I find it 100% fascinating. If I don’t watch myself, I’ll spend all of the next four weeks working on that and end up with no user interface whatsoever.

2. Another huge timesuck: playing with database queries. For example, I was fascinated to discover that the word “local” appears in descriptions of menu items only 78 times, while “imported” appears 2833 times — no doubt because the dataset is dominated by menus that predate the locavore movement.
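
A word count like the “local” versus “imported” comparison above could be run along these lines. This is only a sketch under stated assumptions: the dishes table, its description column, and the sample rows are invented for illustration, sqlite3 stands in for PostgreSQL, and counting is done as whole-word, case-insensitive matches (so “locally” does not count as “local”), which may not be how the original numbers were produced.

```python
import re
import sqlite3


def count_word(conn, word):
    """Count whole-word, case-insensitive occurrences of `word` in dish descriptions."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    total = 0
    for (description,) in conn.execute("SELECT description FROM dishes"):
        total += len(pattern.findall(description or ""))
    return total


def build_sample_dishes():
    """Create an in-memory table of made-up dish descriptions (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dishes (id INTEGER PRIMARY KEY, description TEXT)")
    conn.executemany(
        "INSERT INTO dishes (description) VALUES (?)",
        [
            ("Imported Swiss cheese",),
            ("Local oysters on the half shell",),
            ("Imported sardines with imported olives",),
            ("Locally grown asparagus",),
            (None,),  # missing descriptions are skipped rather than crashing
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    sample = build_sample_dishes()
    for word in ("local", "imported"):
        print(word, count_word(sample, word))
```

Doing the matching in Python keeps the word-boundary rule explicit; a SQL LIKE '%local%' filter would be faster on a large table but would also sweep in words such as “locally”.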

Looking forward to another week of hacking on this project, although lots of other Hackbright events will keep me away from the keyboard. We’ve got field trips to SurveyMonkey, Facebook, and Google on the calendar. And tomorrow morning is a workshop on negotiating salaries. I’m so glad Hackbright has chosen to incorporate negotiation skills into the curriculum – let’s just say that a career spent in academic libraries has not prepared me well for that kind of thing.

This post was originally posted at Aimee Morgan’s blog.

Hackbright Week 5 (AKA: The Week of So Many Feelings)

(Alternate subtitle: “The Week of ‘Surely No One Will Notice that I Wore the Same Pair of Jeans Multiple Times This Week.'”)

By Aimee Morgan (Engineering Fellow, Hackbright Academy – summer 2013 class)

I’m not going to lie, week 5 was tough. As I’ve probably said before, it’s not that any one thing was overwhelming from a technical perspective – just the overall experience of learning new things for five weeks straight with little time to reflect on what we’ve just covered before moving on to the next thing. Add to that the energy and focus that pair programming requires, even when you’ve got a great partner (as I had this week; not that any of my partners have been slackers…).

I also suspect that a lot of us are putting tremendous pressure on ourselves to cover additional material outside of class – Coursera algorithms class, I am looking at you – and to attend every networking event.

It’s hard not to feel that I should be spending every possible moment on something code-related, even when I know that way madness (or burnout) lies. Which is not to say that I have been spending all of my time on code, just that I’ve been dealing with massive guilt for not doing so.

Oh, and I was sick for a good part of the week.

Much of week 5 was dedicated to building a movie-ratings web application that incorporated most of what we’d learned so far (Python, SQL, Flask, HTML) as well as introducing some machine learning concepts. But most of us decided on final projects this week, so there was a LOT of impatience to start working independently on those.

Other random happenings this week:

  • On Monday, I gave a tech talk on two-factor authentication. Whee.
  • On Wednesday, many of us went to Tesla’s headquarters for a Girl Geek Dinner. I got to ride in a Model S, which is a car that my husband and I have been lusting after for months (despite having zero justification for the purchase of a luxury sedan). Super smooth ride, great acceleration. A+++.
  • Late in the week we had a lecture on jQuery, which I’m going to have to revisit on my own – I was brain dead and none of it took. None of it.


It’s amazing to think that Hackbright is half over. I’m excited but anxious about the coming week – even though pair programming was sometimes frustrating, I’ve gotten used to the routine and the structure. From this point forward, we have a lot more freedom. I hope that I can settle quickly into a new (and productive) routine – there are still so many details to figure out for my final project, not to mention the actual writing of code.

One thing that our instructors have warned us about is the danger of getting sucked into the research and planning phase of our projects, at the expense of the coding and building phase. As a research-loving former librarian/archivist, I know I’m at risk for this. Note to self: you don’t have to know ALL THE THINGS right now. These 10 weeks are just the start of what (I hope) will be a career-long process of learning new things and acquiring new skills.

This post was originally posted at Aimee Morgan’s blog.