I believe that every graduate student should try to do at least two internships in industry. It is a great experience. Below you can find a list I compiled by aggregating information from some of the companies I am in touch with as part of our GraphLab project. This list is an academic resource: I am not involved in any of the companies below. I also got some angry comments about one company or another being missing, but this is a personal list. I will be happy to add more companies, provided they are doing some interesting research.
Openings in the US - summer 2012
Note: Openings for summer 2014 are here.
Srinivasan Soundar from Bosch Research sent me the following: The Bosch Research and Technology Center, with labs in Palo Alto, CA, Pittsburgh, PA, and Cambridge, MA focuses on innovative research and development for the next generation of Bosch products. The data mining team is developing advanced statistical and machine learning methods for application to patient health and electronic medical records. We are looking for highly qualified, motivated, and innovative individuals to join our team. Internships are expected to be at least 10-12 weeks long during the summer months. Previous internships in our group have led to successful publications and/or patents. Topics include Latent Variable Models, Unsupervised Clustering, Privacy Preserving Data Mining and Association Rule Mining.
This is what I got from Grant Ingersoll, a well-known Mahout contributor: Lucid Imagination, the leading commercial company for Apache Lucene and Solr, is looking for interns to work on building next generation search, analytics and machine learning technologies based on Apache Solr, Mahout, Hadoop and other cutting edge capabilities. This internship will be practically focused on working on real problems in search and machine learning as they relate to Lucid products and technologies as well as open source. Interested students should send their resume/profile, course work and evidence of open source activity (GitHub account, ASF patches, etc.) to firstname.lastname@example.org. Note: position requires eligibility to work in the US.
At the NIPS big learning workshop I had the pleasure of meeting Vaclav Petricek, who is a senior researcher of matching at eHarmony. eHarmony is an online dating startup with around 33M users around the world, based in Santa Monica, California.
The first time I heard about eHarmony was in John Langford's talk on Vowpal Wabbit at the same workshop. John mentioned that, out of the many companies that are using his software, he is most proud that Vowpal Wabbit is being used by eHarmony, thus promoting love in the world.
Here is an excerpt from their website, which I was not aware of:
"Nearly 5% of all marriages in the U.S. are created by eHarmony. That’s 271 marriages per day."
This is absolutely amazing!
So if you would like to promote love, and you are a graduate student at a top US university in an area related to machine learning, you are welcome to apply here for an internship. A relevant previous internship and involvement in an open source project are a plus. And tell them I sent you!
Ron Bekkerman, a senior researcher at LinkedIn, is looking for interns for the coming summer.
With hundreds of millions of users, there is a practically infinite amount of data and exciting new applications to explore.
RocketFuel is a company specializing in display advertising. I got the following from Abhinav Gupta, Founder and VP Engineering:
We’re hiring interns to work on machine learning/optimization problems as well as our core platform (ad-serving, bidding, modeling and data infrastructure) built using a mix of proprietary and open-source technologies. We’re looking for those excited about working on tough problems related to scalable/reliable/available algorithms, machine learning, data mining and optimization. We are building a platform to do automatic targeting and optimization of ads. Our pitch to advertisers is very simple: if you can measure metrics of success of your campaign, we can optimize. We buy most of our inventory through real time auctions on exchanges such as Google Doubleclick. We’re integrated with real time exchanges processing requests at 100k qps. We have over 1PB of data and growing fast.
You can apply to RocketFuel here.
excellent talk at the NIPS big learning workshop where he identified some of the coming challenges in large scale machine learning. And here is what I got from him:
We're hiring data science interns to work on developing new (and not necessarily MapReduce-based) optimization and model fitting algorithms that can be used on data stored in a Hadoop cluster. Specifically, we're interested in ways to more closely integrate open-source projects like Spark, GraphLab, and modifications to MapReduce (such as AllReduce) with the rest of the components of CDH in order to optimize every step of the model building process, from feature extraction to model deployment to evaluation. At Cloudera, the work you do doesn't just impact our company, it impacts the entire Hadoop community. If that sounds like fun, and you are a graduate student at a top US university in CS/math/operations research, email me your resume at firstname.lastname@example.org.
An additional opening at Cloudera is with Josh Patterson: building ML/NLP tools on Hadoop, HBase, and OpenNLP. Email him at email@example.com
Shon Burton is the founder of Wildcog, a company specializing in placing technical folks at top Bay Area companies. Currently they are working with Twitter, Tumblr, Palantir, and Yahoo!. And guess what? They are looking for interns! You are welcome to email Shon at: email@example.com
A dream come true for any big data lover. Who can have more data than Walmart, ranked no. 1 in the Fortune 500 list? Patrick Harrington is looking for both interns and big data engineers:
@WalmartLabs is seeking outstanding engineers and scientists to build our next generation multi-dimensional targeting system to help revolutionize eCommerce. This targeting system aggregates a variety of user based signals, e.g., click stream, social, web, geo-location, etc., and outputs a portfolio of relevant products on a user specific basis. As a senior engineer, you will be joining a team devoted to increasing the percent of sales attributable to targeting via developing a portfolio of diverse data-driven algorithms and the underlying batch-oriented and real-time systems.
For more details about this opening, contact Patrick Harrington at:
And here is a note I got from Mike Spreitzer from IBM. He asks not to forget that IBM is very interested in big data, as the whole "smarter planet" thing is about big data. IBM has internships in both product divisions and in Research.
Additional internship positions are available in the data mining and business analytics department at IBM. And here is what I got from Priya Nagpurkar, a research staff member:
Data mining for business analytics is one of the primary areas of focus in our department this year. More specifically, our focus is on systems support (software and hardware) for high performance analytics, with the goal of designing next generation systems. Potential topics include performance analysis for hardware-software co-design, acceleration (e.g. GPU), and optimization of storage systems. For more details contact Priya.
Other internship openings can be found using IBM's general job search.
Well, as a former IBMer I have a soft spot for IBM. So it definitely gets a place in my list!
One particular domain of interest is large graph-data analysis.
We are developing a DSL that simplifies implementing such algorithms, and we are interested in all aspects, from applications all the way down to the hardware architecture. If you are interested in a great internship program in the SF Bay Area, contact firstname.lastname@example.org
Everyone who uses Mahout (and there are thousands of users, if not more) knows Ted Dunning. He knows the answer to any question ever asked in the area of applied machine learning. After forming several successful startups, Ted has a new initiative for improving Hadoop infrastructure. He is looking for interns. His email is: email@example.com
Knewton is revolutionizing the practice of education with the world’s most powerful adaptive learning engine. We are a recognized leader in the education and technology space by the World Economic Forum in Davos, and one of the top 25 best places to work by Crain’s New York Business. We're looking for Machine Learning interns with the know-how to help build an innovative online education system that adapts to each individual student. Interns will join a world-class team of data scientists and engineers who are pushing the boundaries of machine learning in both scalability and complexity. You'll get to work with a mountain of data and an exciting array of projects. If you have a passion for building scalable systems that analyze huge data sets and have coursework in machine learning, statistics, and advanced mathematics, get in touch with us here.
Udi Weinsberg from Technicolor brought to my attention that Technicolor is also looking for interns. Technicolor's Palo Alto research lab studies personalized computing, data privacy and recommendation systems. You can apply here.
Openings in Europe
Nuria Olivier spoke at our big learning workshop at NIPS about research done at Telefonica Research. You can take a look at the slides here. In a nutshell, once you have mobile phone call data combined with geographical data, you can arrive at very interesting observations.
My avid reader alter0de sent me a link to internships at the Xerox research center in Europe: http://www.xrce.xerox.com/About-XRCE/Internships. Thanks!