Once again I’m looking for an amazingly bright Ph.D. student to work with me over the course of the summer. The position is open to Ph.D. students from any university and at any point of their studies, and I can nearly guarantee it’s going to be an awesome experience.
The primary task will be applying machine learning techniques (lexical analysis, network extraction, predictive analytics) to the usage data from a large piece of commercial software. With a little bit of luck the software will be instrumented by this point in time, so you’ll just need to slice and dice the data and find awesome stuff. The goal, of course, is to publish an amazing paper that provides great insight into how users actually use this type of software and provides guidance to architects and developers of such systems.
A loose list of desirable skills:
- Java: Most of our tools are written in Java. It took me a while to get used to this, but Java has some nice advantages for developing code to run in an enterprise. Here at IBM we really love it and most of our software, including the tool we’re looking at, is built in Java.
- Software Engineering Processes: Domain expertise in understanding the relationships between the different levels of stakeholders in a software project is immensely helpful and will make it a lot easier to tease great nuggets out of the data.
- Machine Learning: We use various types of machine learning, with both Java libraries and some R, to understand the data. On the Java side, knowledge of text analysis packages such as OpenNLP is helpful.
- Statistics: I love R. If you love R it helps out.
- Visualizations: I’m big on making great visualizations to show off our findings. If you’re a ninja with ggplot or d3 then you probably qualify.
Of course, there’s a variety of other skills that are helpful too. The intern absolutely must be self-motivated and able to find answers to questions on their own. This isn’t an unsupervised position, but I travel a lot and am frequently out of the office, which limits my ability to provide direct daily supervision. As a result, excellent communication skills are also helpful: you should know how to ask questions over email in a way that is succinct while still giving other people enough information to answer the question. If you’ve got a great profile on StackOverflow you’re probably already there.
There are some great advantages to spending a summer working with me at IBM TJ Watson Research in Yorktown Heights, NY. First, you’ll be working with some of the smartest people in the world at a facility that has an amazing legacy. IBM Research was the genesis of DRAM, the processors in all major video game consoles, Watson (the Jeopardy!-playing computer), LASIK, and thousands of other things. We make the world awesome.
Second, our interns come from around the world and are generally smarter than we are. You know that feeling you get when you go to a conference? You’re always excited about new ideas and feel like you could go home and churn out your thesis in a week. Imagine that feeling for an entire summer! I had a blast when I interned here and met some incredible young researchers who I’m still friends with.
Third, we’re just outside of New York in scenic Westchester County, NY. I took the train into the city every Friday, Saturday, and Sunday when I interned here. It was the perfect combination of excitement from New York City and a setting where you can really get work done. You may be saying, “Isn’t New York really expensive?” You’re entirely right. Don’t worry, we pay enough that it’s totally worth your time.