Hey, I'm Bowen! I am a research scientist on OpenAI's Robotics Team. I am interested in environments that allow for unbounded learning, multi-agent reinforcement learning, and generalization to unseen environments (e.g. simulation to reality). So far I've worked on unsupervised multi-agent reinforcement learning, state estimation from vision, and most recently attention-based network architectures for reinforcement learning policies. I have an M.Eng. in Electrical Engineering and Computer Science with a focus in AI from MIT, and also a B.S. in EECS and Physics from MIT. During my master's I worked on neural network architecture search, a subfield at the intersection of meta-modeling and hyperparameter optimization.


I finished my M.Eng. at MIT! My master's thesis, Towards Practical Neural Network Meta-Modeling, has been awarded the Second Place Charles and Jennifer Johnson Computer Science MEng Thesis Award from MIT’s EECS department.

I’ve accepted a full-time role on OpenAI’s robotics team as a research scientist!

I’ve joined OpenAI’s robotics team as a research scientist intern!

I submitted my thesis to complete my M.Eng. degree at MIT! Find it here.

I gave a presentation for the Boston Machine Learning meetup on practical CNN meta-modeling. Slides here.

We’ve finally released the MetaQNN Code! Find it here.

I gave a presentation at Google Research - Cambridge on practical CNN meta-modeling. Slides here.

I gave a presentation for the MIT Vision Group on CNN meta-modeling. Slides here.


Towards Practical Neural Network Meta-Modeling

Bowen Baker

MIT Master of Engineering Thesis

This thesis largely expounds the work presented in [Designing Neural Network Architectures Using Reinforcement Learning] and in [Practical Neural Network Performance Prediction for Early Stopping]. We present all the material described in these papers, as well as some updated results. Notably, after re-analyzing the MetaQNN models, we found that MetaQNN was actually able to achieve 4.7% error on CIFAR-10, a new record for models with only standard convolution and pooling layers. We also present some brief work on visualizing varying architectures and an improved algorithm for speeding up Hyperband.

Accelerating Neural Architecture Search using Performance Prediction

Bowen Baker*, Otkrist Gupta*, Ramesh Raskar, and Nikhil Naik

NIPS Meta Learning Workshop 2017

Methods for neural network hyperparameter optimization and meta-modeling are computationally expensive due to the need to train a large number of model configurations. In this paper, we show that standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validation performance data. We empirically show that our performance prediction models are much more effective than prominent Bayesian counterparts, are simpler to implement, and are faster to train. Our models can predict final performance in both visual classification and language modeling domains, are effective for predicting performance of drastically varying model architectures, and can even generalize between model classes. Using these prediction models, we also propose an early stopping method for hyperparameter optimization and meta-modeling, which obtains a speedup of up to 6x in both hyperparameter optimization and meta-modeling. Finally, we empirically show that our early stopping method can be seamlessly incorporated into both reinforcement learning-based architecture selection algorithms and bandit-based search methods. Through extensive experimentation, we empirically show our performance prediction models and early stopping algorithm are state-of-the-art in terms of prediction accuracy and speedup achieved while still identifying the optimal model configurations.
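As a rough illustration of the idea in this abstract, here is a minimal sketch of predicting final performance from a partial learning curve and using the prediction for early stopping. It is not the paper's implementation: ordinary least squares stands in for the regression model, the curve features, the `margin` parameter, and all function names are hypothetical, and the learning curves are synthetic.

```python
import numpy as np

def curve_features(partial_curve):
    # Hypothetical features from the observed validation accuracies:
    # bias term, last value, best value, mean slope, most recent slope.
    c = np.asarray(partial_curve, dtype=float)
    d = np.diff(c)
    return np.array([1.0, c[-1], c.max(), d.mean(), d[-1]])

def fit_predictor(partial_curves, final_accs):
    # Ordinary least squares from curve features to final accuracy.
    X = np.stack([curve_features(c) for c in partial_curves])
    y = np.asarray(final_accs, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict_final(w, partial_curve):
    return float(curve_features(partial_curve) @ w)

def should_stop(w, partial_curve, best_so_far, margin=0.02):
    # Stop training a configuration whose predicted final accuracy,
    # even with a safety margin, falls below the best seen so far.
    return predict_final(w, partial_curve) + margin < best_so_far

def synthetic_curve(final_acc, rate, epochs, rng):
    # Assumed toy learning curve: saturating exponential plus noise.
    t = np.arange(1, epochs + 1)
    return final_acc * (1 - np.exp(-rate * t)) + rng.normal(0, 0.005, size=epochs)

def demo(seed=0, n_train=50, observed=10, total=50):
    rng = np.random.default_rng(seed)
    partial_curves, finals = [], []
    for _ in range(n_train):
        curve = synthetic_curve(rng.uniform(0.5, 0.9), rng.uniform(0.1, 0.3), total, rng)
        partial_curves.append(curve[:observed])  # only early epochs are observed
        finals.append(curve[-1])                 # target: accuracy at the last epoch
    w = fit_predictor(partial_curves, finals)
    new_curve = synthetic_curve(0.7, 0.2, total, rng)[:observed]
    return w, new_curve

if __name__ == "__main__":
    w, new_curve = demo()
    print("predicted final accuracy:", predict_final(w, new_curve))
    print("stop early vs. best of 0.85?", should_stop(w, new_curve, best_so_far=0.85))
```

In a real search, the training set for the predictor would come from configurations that have already been trained to completion, and the features would also include architecture and hyperparameter information as the abstract describes.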

Designing Neural Network Architectures Using Reinforcement Learning

Bowen Baker, Otkrist Gupta, Nikhil Naik, and Ramesh Raskar

International Conference on Learning Representations, 2017

At present, designing convolutional neural network (CNN) architectures requires both human expertise and labor. New architectures are handcrafted by careful experimentation or modified from a handful of existing networks. We introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learning agent is trained to sequentially choose CNN layers using Q-learning with an ϵ-greedy exploration strategy and experience replay. The agent explores a large but finite space of possible architectures and iteratively discovers designs with improved performance on the learning task. On image classification benchmarks, the agent-designed networks (consisting of only standard convolution, pooling, and fully-connected layers) beat existing networks designed with the same layer types and are competitive against the state-of-the-art methods that use more complex layer types. We also outperform existing meta-modeling approaches for network design on image classification tasks.
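To make the mechanics concrete, here is a toy sketch of choosing layers sequentially with tabular Q-learning, ϵ-greedy exploration, and experience replay. It is not the released MetaQNN code: the layer vocabulary, depth limit, ϵ schedule, learning rate, and especially `toy_reward` (a stand-in for training a network and measuring validation accuracy) are all assumptions made for illustration.

```python
import random
from collections import defaultdict

LAYERS = ["conv3", "conv5", "pool", "fc"]  # hypothetical layer vocabulary
MAX_DEPTH = 3                              # hypothetical depth limit
TERMINATE = "end"

def actions(state):
    # A state is the tuple of layers chosen so far.
    if len(state) >= MAX_DEPTH:
        return [TERMINATE]
    return LAYERS + [TERMINATE]

def toy_reward(arch):
    # Stand-in for validation accuracy; a real run would train the network.
    score = 0.5 + 0.1 * arch.count("conv3") + 0.05 * arch.count("pool")
    if arch and arch[-1] == "fc":
        score += 0.1
    return min(score, 1.0)

def sample_architecture(q, epsilon, rng):
    # Build one architecture layer by layer with an epsilon-greedy policy.
    state, transitions = (), []
    while True:
        acts = actions(state)
        if rng.random() < epsilon:
            a = rng.choice(acts)
        else:
            a = max(acts, key=lambda x: q[(state, x)])
        if a == TERMINATE:
            transitions.append((state, a, None))
            return state, transitions
        next_state = state + (a,)
        transitions.append((state, a, next_state))
        state = next_state

def q_update(q, transitions, reward, lr=0.1, gamma=1.0):
    # The reward arrives only at termination; earlier steps bootstrap
    # from the best Q-value of the next state.
    for state, a, next_state in reversed(transitions):
        if next_state is None:
            target = reward
        else:
            target = gamma * max(q[(next_state, b)] for b in actions(next_state))
        q[(state, a)] += lr * (target - q[(state, a)])

def train(episodes=500, replay_batch=8, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    replay = []  # stores (transitions, reward) for every sampled architecture
    for ep in range(episodes):
        epsilon = max(0.1, 1.0 - ep / (0.8 * episodes))  # anneal exploration
        arch, transitions = sample_architecture(q, epsilon, rng)
        replay.append((transitions, toy_reward(arch)))
        for past_transitions, r in rng.sample(replay, min(len(replay), replay_batch)):
            q_update(q, past_transitions, r)
    best, _ = sample_architecture(q, 0.0, rng)  # greedy rollout
    return best, q

if __name__ == "__main__":
    best, _ = train()
    print("greedy architecture:", best, "toy reward:", toy_reward(best))
```

The replay buffer lets each sampled architecture contribute to many Q-value updates, which matters when every reward is as expensive as a full training run.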

Determining the resolution limits of electron-beam lithography: direct measurement of the point-spread function

Vitor R. Manfrinato, Jianguo Wen, Lihua Zhang, Yujia Yang, Richard G. Hobbs, Bowen Baker, Dong Su, Dmitri Zakharov, Nestor J. Zaluzec, Dean J. Miller, Eric A. Stach, and Karl K. Berggren

Nano Letters 14, no. 8 (2014): 4406-4412.

One challenge existing since the invention of electron-beam lithography (EBL) is understanding the exposure mechanisms that limit the resolution of EBL. To overcome this challenge, we need to understand the spatial distribution of energy density deposited in the resist, that is, the point-spread function (PSF). During EBL exposure, the processes of electron scattering, phonon, photon, plasmon, and electron emission in the resist are combined, which complicates the analysis of the EBL PSF. Here, we show the measurement of delocalized energy transfer in EBL exposure by using chromatic aberration-corrected energy-filtered transmission electron microscopy (EFTEM) at the sub-10 nm scale. We have defined the role of spot size, electron scattering, secondary electrons, and volume plasmons in the lithographic PSF by performing EFTEM, momentum-resolved electron energy loss spectroscopy (EELS), sub-10 nm EBL, and Monte Carlo simulations. We expect that these results will enable alternative ways to improve the resolution limit of EBL. Furthermore, our approach to study the resolution limits of EBL may be applied to other lithographic techniques where electrons also play a key role in resist exposure, such as ion-beam-, X-ray-, and extreme-ultraviolet lithography.



I am a co-founder at Perch. We are an early-stage weight room analytics startup and went through the MIT delta v accelerator in summer 2016. I work on machine vision, rep tracking algorithms, and most other aspects of the product back-end.


I was a Data Science Intern at Quora in the summer of 2015. I worked on identifying and fixing categorically misused topics, improving automated topic labeling, and exploring topic geometries. I also helped in creating metric dashboards, responding to company data inquiries, and fixing bugs in data logging.


I was a Data Science Intern at AgilOne during my 2014 summer break. I created a framework for validating customer data before running the machine learning models. On top of this, I built a deployment framework that would automatically select features to use and initialize models for new customers. I also did some minor work on the product front end.


Kinect 2-Chain

The Kinect 2-Chain was a project I worked on for HackMIT 2015. The goal of the project was to aid the visually impaired in navigation. We used a Kinect 2 to map the space in front of the user and send stereo audio signals with varying pitch to indicate the direction and distance of obstacles. We also used a deep learning API so that the user could request that a description of the scene in front of them be read aloud. We took 2nd place overall and also won the Microsoft prize; some news coverage can be found here.

MIT Robotics Team

I co-founded the MIT Robotics Team in late 2013. I led the software team for 2 years, during which we placed 2nd in the 2014 NASA RASC-AL ROBO-OPS Competition and competed in the 2015 NASA Sample Return Centennial Challenge.